This repository trains a convolutional-recurrent neural network to count persons in videos. The algorithm is trained on the PCDS dataset to count persons passing a bus entrance.
The architecture of the best algorithm trained so far can be seen in the following figure:
Training, logging, and further usage of the person counting algorithm are optimized for the Google Colaboratory environment. After mounting this folder in Google Drive, the provided notebooks can be used directly to train the algorithms; logging and the saving of weights are then handled in the corresponding Google Drive folder.
A Google Colab notebook for training the algorithm is provided in the colab_notebooks folder under the name trainer_counting. The hyperparameter space for tuning the architecture and learning parameters is defined in /bin/cnn_regression.py. Results are logged automatically if you mount a Google Drive before starting the training session. TensorBoard logging is activated, so you can follow the live performance of your models; to get the results on your local machine, enable automatic synchronization between Google Drive and your machine.
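As a minimal sketch of how such a Colab session could be prepared before running the training notebook (the log directory path is only an assumption, adapt it to your own Drive layout):

```python
# Colab setup sketch: mount Google Drive and start TensorBoard in the notebook.
from google.colab import drive

# Mount Google Drive so weights and logs are persisted automatically.
drive.mount('/content/drive')

# Hypothetical log directory; use the folder the training notebook writes to.
log_dir = '/content/drive/MyDrive/person_counting/logs'

# Launch TensorBoard inside the notebook to watch training live.
%load_ext tensorboard
%tensorboard --logdir $log_dir
```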
Detection frames are the inputs to the person counting algorithm. You can create these .npy files for every video in a specified folder with the person detection repository: all x- and y-coordinates of person centers detected by a specified RetinaNet model are saved to a .npy file, which you can then load into a NumPy array. An example detection frame over the x-, y-, and time coordinates is shown in the figure below.
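A minimal sketch of loading and inspecting such a detection frame file (the path is a placeholder; the exact array layout depends on the person detection repository):

```python
import numpy as np

# Hypothetical path: one .npy file is written per video by the detection step.
detections = np.load('detections/example_video.npy', allow_pickle=True)

# The array holds the x/y coordinates of detected person centers over time,
# so it can be inspected or plotted directly.
print(detections.shape)
```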
The inference folder contains an example of how inference is done on an entire video. The current example no longer uses a fine-tuned person detector, in order to avoid version conflicts with the RetinaNet implementation that was used to fine-tune it. The person detections are therefore of lower quality, and since the person counter was trained on the RetinaNet detections, the overall results are worse than those achieved with the RetinaNet model.
Inspired by the 3D input data (x-, y-, and time coordinates from the video) shown in the image below, a 3D convolution kernel would be an interesting approach to explore.
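As an illustration only, a 3D convolution over stacked detection frames could look like the following Keras sketch; the framework choice, input dimensions, and layer sizes are assumptions, not part of this repository:

```python
import tensorflow as tf

# Sketch: 3D convolution over a stack of detection frames
# with shape (time, y, x, channels); all sizes are placeholders.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(32, 64, 64, 1)),   # 32 frames of 64x64 detection maps
    tf.keras.layers.Conv3D(16, kernel_size=(3, 3, 3), activation='relu'),
    tf.keras.layers.MaxPooling3D(pool_size=(2, 2, 2)),
    tf.keras.layers.GlobalAveragePooling3D(),
    tf.keras.layers.Dense(1),                        # regression output: person count
])
model.summary()
```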
This work was published as a paper; please consider citing it when using parts of this repository:
@INPROCEEDINGS{9742924,
author={Baumann, Daniel and Sommer, Martin and Schrempp, Yannick and Sax, Eric},
booktitle={2022 International Conference on Connected Vehicle and Expo (ICCVE)},
title={Use of Deep Learning Methods for People Counting in Public Transport},
year={2022},
pages={1-6},
doi={10.1109/ICCVE52871.2022.9742924}}