
[FEATURE]: Batching .extract_faces #1434

Open
galthran-wq opened this issue Feb 12, 2025 · 3 comments
Labels
enhancement New feature or request

Comments

@galthran-wq (Contributor) commented Feb 12, 2025

Description

Related:
#1433
#1101

I think it would be a good idea to support batching input images for detection.
Many of the currently supported detectors can natively perform batched inference (to name a few: YOLO, retinaface, MTCNN).
The hypothesis is that batching would improve performance significantly; this is partially validated by serengil/retinaface#116.
Since retinaface is not going to support batching, we could at least have it for YOLO, MTCNN, and some others.

@serengil what do you think?
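
For illustration, a rough sketch of what the batched call could look like -- the list-input form is the proposal here, not the current API; `img_path` and `detector_backend` mirror the existing single-image `DeepFace.extract_faces` signature, and the file names are placeholders:

```python
from deepface import DeepFace

# Current behaviour: one image per call.
faces = DeepFace.extract_faces(img_path="img1.jpg", detector_backend="yolov8")

# Proposed behaviour (sketch): pass a list of images in one call, so detectors
# with native batch support (YOLO, MTCNN, ...) can run a single batched forward pass.
batched_faces = DeepFace.extract_faces(
    img_path=["img1.jpg", "img2.jpg", "img3.jpg"],  # hypothetical list input
    detector_backend="yolov8",
)
```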

Additional Info

No response

@NatLee (Contributor) commented Feb 17, 2025

Hello, I have a question regarding .extract_faces. If it supports batch input, would the subsequent models, such as emotion or age detection, also need to be adjusted based on the output of .extract_faces?

For instance, if I have three input images, each will produce its own face detection results. How should the subsequent analysis associate the detected faces with these three images? Perhaps keeping track of the input order for each face could be a solution?

@galthran-wq (Contributor, Author) commented

Hi. I do not think they would have to be adjusted, because the function is supposed to work exactly the same for a single image input, and all the subsequent models currently use it with a single image.

Now, if we were to adjust the usage in subsequent models to batched inputs, then you indeed raise a good point about how to recover which detection results correspond to which images in the input. I think the solution is this: the output type of .extract_faces should be Union[List[Dict[str, Any]], List[List[Dict[str, Any]]]]. That is, it stays List[Dict[str, Any]] for a single image input (just like before), and becomes List[List[Dict[str, Any]]] if a list of images is passed -- for each input image, a list of detected faces.

I'm planning to incorporate this change in the PR.
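
A minimal sketch of how that return-type split could look, assuming a hypothetical detect_single helper that stands in for the existing single-image path (the dict keys mirror the current extract_faces output):

```python
from typing import Any, Dict, List, Union

FaceResult = Dict[str, Any]  # one detection: "face", "facial_area", "confidence", ...

def extract_faces(
    img_path: Union[str, List[str]],
) -> Union[List[FaceResult], List[List[FaceResult]]]:
    if isinstance(img_path, list):
        # Batched input: the outer list always has one entry per input image,
        # each entry being the (possibly empty) list of faces found in that image.
        return [detect_single(path) for path in img_path]
    # Single input: flat list of detections, exactly as before.
    return detect_single(img_path)

def detect_single(path: str) -> List[FaceResult]:
    """Hypothetical stand-in for the existing single-image detection path."""
    ...
```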

@serengil (Owner) commented

@galthran-wq so, if you feed 3 images to the input as a list or numpy array and the 2nd one has 5 faces, then the function will still return a list of 3. The second item in the response will have 5 items. I like that design.
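
For instance, a hypothetical consumer of that shape (the file names and face counts are only for illustration, and the list-input call is still the proposed form):

```python
from deepface import DeepFace

# 3 input images, the 2nd of which contains 5 faces (proposed batched form).
results = DeepFace.extract_faces(img_path=["a.jpg", "b.jpg", "c.jpg"])

assert len(results) == 3     # always one entry per input image
assert len(results[1]) == 5  # all 5 detections for the 2nd image live in its own inner list
for per_image in results:
    for face in per_image:
        print(face["facial_area"], face["confidence"])
```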
