You will need to install the required libraries listed in `requirements-localhost.txt`, preferably in a fresh (isolated) virtual environment (such as Anaconda, venv or virtualenv). Install the libraries with:
pip install -r requirements-localhost.txt
The project and shared libraries must be on the Python path; when using the code, start a command shell at the project root and add it to the Python path by entering (on Unix):
export PYTHONPATH=`pwd`
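For example, a complete first-time setup on Unix using venv might look like the following (the environment name `.venv` is illustrative):

```
python3 -m venv .venv                        # create an isolated environment
source .venv/bin/activate                    # activate it
pip install -r requirements-localhost.txt    # install the required libraries
export PYTHONPATH=`pwd`                      # make the project and shared libraries importable
```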
Alternatively, on Windows, you would use set PYTHONPATH=<your current folder>, where you manually enter the current folder path.
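For example, in a Command Prompt opened at the project root, the `%cd%` variable expands to the current folder and avoids typing the path by hand:

```
set PYTHONPATH=%cd%
```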
NOTE: this requirement will eventually be removed once an installation script is created.
At present, the NE Travel Data is not supported on a stand-alone machine (only in the GCP implementation). Also, database integration is not supported (CSV files are output per day, which can then be ingested into a database as required). Finally, time series analysis in R is not supported either (this code is still linked to GCP BigQuery; refer to `cloud/vm` for the R source code).
This script enables a set of images to be processed over a given date range. It is useful for running the time series on pre-existing data: for example, if you started image acquisition before the model was ready, or if you wish to re-process imagery with an alternative model or alternative settings. It works in the same way as `scripts/localhost/process_scheduled.py`, only processing a date range rather than a single sample.
Command line options are:
- `--start-date`: starting date for when images will be processed
- `--end-date`: end date for when images will be processed
- `--config-folder`: folder where configuration data is stored (default: `localhost/config`)
- `--download-folder`: folder where image data is downloaded (default: `localhost/data`)
- `--counts-path`: folder where image counts are stored (default: `localhost/counts`)
- `--log-level`: level of detail to report in logs (default: `INFO`)
- `--help`: detailed help on each option, with default arguments listed
Launch the script by defining a start date and end date (inclusive) for the images to be processed; the script will detect image sources from the `localhost/config/analyse` folder. So, for example, to process all selected cameras between the 1st and 2nd of June 2020, run:
python3 scripts/localhost/batch_process_images.py --start-date="20200601" --end-date="20200602"
Progress bars will be presented showing the dates being iterated over, along with the current image supplier and camera. This will take approximately 30s per camera image, as it is processing 144 images per selected day (images are captured every 10 minutes, so 6 per hour; hence 24 x 6 = 144 images).
Analysis is recorded in CSV files, generating one CSV file per day in the folder named after the current model (defined in `scripts/localhost/config/analyse-configuration.json`). At present the single detection model used is called `NewcastleV0`, supplied courtesy of Newcastle University's Urban Observatory from their GitHub repository.
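If you wish to re-process a date range with alternative settings (as mentioned above), the documented options can be combined; the folder names below are purely illustrative:

```
python3 scripts/localhost/batch_process_images.py --start-date="20200601" --end-date="20200602" \
    --config-folder="localhost/config-alt" --counts-path="localhost/counts-alt"
```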
This script downloads the set of camera images to local disc; it should be called every 10 minutes to accumulate imagery to process.
Command line options are:
- `--config-folder`: folder where configuration data is stored (default: `localhost/config`)
- `--download-folder`: folder where image data is downloaded (default: `localhost/data`)
- `--maximum-download-attempts`: maximum number of download attempts per image (default: 5)
- `--log-level`: level of detail to report in logs (default: `INFO`)
- `--help`: detailed help on each option, with default arguments listed
The script can be launched without arguments, as the defaults will be sufficient; this will search the `localhost/config/ingest` folder for image sources, and the corresponding images will then be downloaded into the `localhost/data` folder.
For example, run:
python3 scripts/localhost/download_files.py
Progress bars will be presented showing the cameras being iterated over; note that the execution time will depend on your internet connection. The script downloads items serially (one at a time), so it does not maximise download bandwidth.
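Since the script is intended to run every 10 minutes, one way to automate it on Unix is a crontab entry. This is only a sketch: it assumes the repository is checked out at /home/user/project and that `python3` on cron's PATH has the required libraries installed (point it at your virtual environment's interpreter otherwise):

```
# Download the latest camera images every 10 minutes, from the project root
*/10 * * * * cd /home/user/project && PYTHONPATH=/home/user/project python3 scripts/localhost/download_files.py
```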
This script analyses images taken 20 minutes ago, and should be called every 10 minutes to "chase" the downloaded images. Note that it processes 20 minutes in arrears because it requires images before and after the analysed image (in time) to detect static objects.
Command line options are:
- `--config-folder`: folder where configuration data is stored (default: `localhost/config`)
- `--download-folder`: folder where image data is downloaded (default: `localhost/data`)
- `--counts-path`: folder where image counts are stored (default: `localhost/counts`)
- `--log-level`: level of detail to report in logs (default: `INFO`)
- `--help`: detailed help on each option, with default arguments listed
Launch the script with the defaults; it will detect image sources from the `localhost/config/analyse` folder. So, for example, to process all selected cameras on the current date:
python3 scripts/localhost/process_scheduled.py
Progress bars will be presented showing the cameras being iterated over. There will be a delay as the model is loaded, then the processing will start. It will take ~1s per image, depending on local hardware (the model uses TensorFlow, so it will take advantage of multiple cores or, if installed, a GPU).
Analysis is recorded in CSV files, generating one CSV file per day in the folder named after the current model (defined in `scripts/localhost/config/analyse-configuration.json`). At present the single detection model used is called `NewcastleV0`, supplied courtesy of Newcastle University's Urban Observatory from their GitHub repository.
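As with the download script, this script is designed to run every 10 minutes; a similar (illustrative) crontab entry could be used, again assuming a checkout at /home/user/project and a python3 interpreter with the required libraries:

```
# Analyse the images downloaded 20 minutes ago, every 10 minutes
*/10 * * * * cd /home/user/project && PYTHONPATH=/home/user/project python3 scripts/localhost/process_scheduled.py
```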
This script removes images older than a specified number of days from the download folder.
Once images are analysed, they do not need to be retained (unless you are modifying the model or filtering) and can be removed. To allow recovery from system issues, images are retained on a rolling deletion basis; the default is to remove images older than 28 days (4 weeks).
Command line options are:
- `--maximum-number-of-days`: maximum number of days an image is retained before it is deleted (the date folder is used to determine when it was created, so if an image was downloaded today but its folder indicates 60 days ago, running the script with fewer than 60 days specified will remove the image)
- `--download-folder`: folder where image data was downloaded, and from where it will be removed (default: `localhost/data`)
- `--log-level`: level of detail to report in logs (default: `INFO`)
- `--help`: detailed help on each option, with default arguments listed
The script uses the default data download location, so to remove all images older than 28 days just use the defaults:
python3 scripts/localhost/remove_old_images.py
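To use a shorter retention period, supply the documented option explicitly; for example, with an illustrative 14-day limit:

```
python3 scripts/localhost/remove_old_images.py --maximum-number-of-days=14
```

As with the other scripts, this can be scheduled (for example, once a day via cron) so old imagery is removed automatically.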
This script updates the list of cameras available to ingest (but does not amend the list to be analysed).
Command line options are:
- `--config-folder`: folder where configuration data is stored (default: `localhost/config`)
- `--log-level`: level of detail to report in logs (default: `INFO`)
- `--help`: detailed help on each option, with default arguments listed
Launch the script with the defaults; it will store image sources in the `localhost/config/ingest` folder. So, for example, to refresh the list of available camera sources:
python3 scripts/localhost/update_sources.py
Each image provider is listed as it is examined for camera sources. At present only Transport for London (TfL) is supported, but this can be expanded by the user as required.