Investigate merging attention's saliency features for "smart" crop #295
Hello, might https://github.com/lovell/attention provide what you're looking for?
Interesting! So is the idea that the return values from attention could drive the crop? How would you use the x/y coordinates returned by attention.point in sharp? Ideally sharp would handle this internally, so for example something like a single resize-and-crop call.
You've got the idea. It's quite experimental and the performance could probably be improved. I might consider merging some of the features of attention into sharp itself.
Would love to see this in sharp!
+1
Hi @lovell, just wondering if there was any update on this. I'm about to start using attention in my image processing workflow unless there's an integration on the near horizon? Could I help at all (guessing it's beyond a simple PR though)?
@homerjam This is still planned but with nothing implemented yet. It'd be great to learn which features of attention you're planning to use.
Cool, well not to worry, I will press on with attention for now. First up I'm going to be using it to find focal points for cropping thumbnails.
A focal point to generate thumbnails is enough for me. I guess that's the most in-demand method.
Actually I have some integration of smartcrop with sharp. I'll probably release it together with the next release. :)
@jwagner 👍
Commit 2034efc adds an initial implementation of this on a feature branch. Here's an example of how you might use it to generate auto-cropped 200px square thumbnails using Streams:

```javascript
var transformer = sharp().resize(200, 200).crop(sharp.strategy.entropy);
readableStream.pipe(transformer).pipe(writableStream);
```

Feedback very much welcome.
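For non-stream workflows, my understanding is that the same strategy can also be used with files or Buffers via the usual sharp input/output methods; a minimal sketch (file names are placeholders):

```javascript
var sharp = require('sharp');

// Generate a 200px square thumbnail from a file, cropping with the entropy strategy.
sharp('input.jpg')
  .resize(200, 200)
  .crop(sharp.strategy.entropy)
  .toFile('thumbnail.jpg', function (err, info) {
    if (err) throw err;
    console.log(info);
  });
```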
One question: do I understand correctly that the scale can vary? It selects a region with the requested width/height ratio, crops it, and scales down to the exact size. Correct?
@puzrin The image is resized so that at least one dimension is correct, then the edges of the remaining dimension are repeatedly cropped until it too is correct. (I've added this feature in such a way that, in the future, we could also use it to auto-extract.)
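To make the "repeatedly crop the least interesting edge" idea concrete, here is a rough, self-contained illustration (this is not sharp's actual implementation; the helper names and the 8-pixel band size are arbitrary choices for the sketch):

```javascript
// Shannon entropy of an array of 0-255 grayscale values.
function shannonEntropy(values) {
  var hist = new Array(256).fill(0);
  values.forEach(function (v) { hist[v]++; });
  return hist.reduce(function (entropy, count) {
    if (count === 0) return entropy;
    var p = count / values.length;
    return entropy - p * Math.log2(p);
  }, 0);
}

// Reduce a grayscale image (2D array of rows) to targetHeight by repeatedly
// discarding whichever edge band (top or bottom) has the lower entropy.
function cropRowsByEntropy(image, targetHeight, band) {
  band = band || 8;
  while (image.length > targetHeight) {
    var step = Math.min(band, image.length - targetHeight);
    var top = [].concat.apply([], image.slice(0, step));
    var bottom = [].concat.apply([], image.slice(image.length - step));
    image = shannonEntropy(top) < shannonEntropy(bottom)
      ? image.slice(step)                       // top edge is less interesting
      : image.slice(0, image.length - step);    // bottom edge is less interesting
  }
  return image;
}
```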
Got it. Probably I don't understand how the API was intended to be used. Let me describe my task: given a source image and a target thumbnail size, find the most interesting region and crop/scale it to exactly that size.
It would be nice to have a simple call for that. I expect this to be the most in-demand use case.
@puzrin My thinking here is that we can add further "strategies" more suited to the use case you describe. These might be things like "skin tones", "edges", "contrast" etc., similar to the approach used in attention. The initial entropy-based strategy is more about removing the least valuable edges rather than keeping the most valuable/salient regions. I'll try to make the docs clearer; thanks for the feedback!
Thanks for the explanation. After thinking a bit, the fuzzy edge cut will probably be enough for my needs.
Yes, I like the trim-boring-edges strategy; it seems like a simple, reliable way to cut an image down that should need little training. Most photos will not have a very small detail that you want to cut out. It must be much more common to just want to handle off-centre compositions automatically.
It looks like something in libgobject is causing this.
I'll investigate. |
@jcupitt Should hist_entropy.c#L83 be vips_log( t[0], &t[1], NULL ) instead of vips_log( t[0], &t[1], 1.0 / sum, 0, NULL )?
Oops, yes, looks like a copy-paste error.
there was a copy-paste error in the call to vips_log(), thanks Lovell see lovell/sharp#295
I fixed it in 8.2 and master, and added a test for it. Thanks for spotting the dumbness @lovell!
@jcupitt Fantastic, thank you.
The release of libvips v8.2.3 with the vips_hist_entropy fix brings this a step closer.
I'm looking for something similar enough that I didn't think I should make a new issue: trimming whitespace around an image. I've implemented it before using the vips Ruby bindings, but sharp doesn't expose the vips methods I would need. Ruby implementation below. Note that this implementation assumes that the image is already RGB(A), and would need more smarts to handle other color spaces.

```ruby
def trim(img)
  alpha = nil
  # Remove the alpha channel, if there is one, as it breaks mask creation
  if img.bands == 4
    alpha = img.extract_band(3)
    img = img.extract_band(0, 3)
  end
  # Mask of pixels darker than the near-white threshold
  mask = img.less(240)
  columns, rows = mask.project
  left = columns.profile_h.min
  right = columns.x_size - columns.fliphor.profile_h.min
  top = rows.profile_v.min
  bottom = rows.y_size - rows.flipver.profile_v.min
  # Put the alpha channel back in, if it had one
  img = img.bandjoin(alpha.clip2fmt(img.band_fmt)) if alpha
  img = img.extract_area(left, top, right - left, bottom - top)
  img
end
```
@calebshay This discussion is more about strategies for dealing with cropping-when-resizing. What you describe sounds like automated image extraction, so feel free to create a new feature request for it. (I see the possibility of combining the two approaches in one pipeline, e.g. extract non-whitespace then resize+crop using entropy.)
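That combined pipeline might look roughly like the following sketch, assuming the non-whitespace bounding box has already been computed elsewhere (for example by logic like the Ruby trim() above); the box values here are placeholders:

```javascript
var sharp = require('sharp');

// Bounding box of the non-whitespace area, computed by some other step.
var box = { left: 30, top: 20, width: 540, height: 700 };

sharp('scan.png')
  .extract(box)                         // keep only the non-whitespace area
  .resize(200, 200)
  .crop(sharp.strategy.entropy)         // then trim the least interesting edges
  .toFile('thumbnail.png');
```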
The entropy-based cropping strategy is in v0.14.0, now available via npm; thanks for all the comments and help here. I'm going to leave this task open to track further additions/improvements from attention.
It seems to work strangely. I've tried to create cropped thumbnails for images with a clear left focus and a clear right focus. Those are detected well by smartcrop.js, but not by sharp 0.14 (with the new crop param).
@lovell Is the previous explanation clear enough or should I provide more info? I used this demo to compare results: https://29a.ch/sandbox/2014/smartcrop/examples/testbed.html
Glad to know. My test case is cropping a 4:3 ratio image to 170×150 pixels (downscale plus cutting the left and right sides a bit). Your link has at least one image (with focus on the left) that is good for checking the algorithm. It should cut such images from one side only.
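For anyone wanting to reproduce this, a minimal version of that test case might look like the sketch below (file names are placeholders; the result can be compared against the smartcrop.js testbed linked above):

```javascript
var sharp = require('sharp');

// Crop a 4:3 source with an off-centre subject down to 170x150 using the
// entropy strategy from v0.14.0.
sharp('left-focus.jpg')
  .resize(170, 150)
  .crop(sharp.strategy.entropy)
  .toFile('left-focus-170x150.jpg');
```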
@lovell Do you have any estimates/priorities for revisiting the smartcrop feature?
A little update on integrating smartcrop with sharp: I have released smartcrop 1.0 along with smartcrop-sharp now. It's not super efficient right now, as the image needs to be decoded twice (once for smartcrop, once for operating on it with sharp), but in practice it works quite well. :)
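My understanding of the smartcrop-sharp flow, based on the smartcrop.js API (double-check the module's README, as details may differ), is roughly: ask smartcrop for the best crop region, then use sharp to extract and resize it.

```javascript
var sharp = require('sharp');
var smartcrop = require('smartcrop-sharp');

// Ask smartcrop for a 200x200 crop suggestion, then apply it with sharp.
smartcrop.crop('input.jpg', { width: 200, height: 200 }).then(function (result) {
  var crop = result.topCrop;
  return sharp('input.jpg')
    .extract({ left: crop.x, top: crop.y, width: crop.width, height: crop.height })
    .resize(200, 200)
    .toFile('thumbnail.jpg');
});
```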
Oooooh lovely @jwagner, thanks!
Update/teaser: The following graph shows image count (y-axis) against % error (x-axis) for the existing entropy-based crop strategy (dark blue) vs the attention-based strategy (green). Closer to the origin is closer to the "ground truth" and therefore better, so the attention-based approach is the relative winner in terms of accuracy. (The MSRA Salient Object Database image set B was used as the source of "ground truth".) The attention-based strategy is currently ~50% faster than entropy, typically adding <50ms to processing time, but work continues to fine-tune both accuracy and performance.
That's a fantastic graph Lovell! Very nice work. I should look at your attention crop code.
The attention branch adds experimental support for a crop "strategy" based on a slightly modified+simplified version of the original logic in the attention module:

```javascript
sharp(input).resize(200, 200).crop(sharp.strategy.attention)...
```
Commit 18b9991 adds this to the master branch, ready for inclusion in v0.16.1.
@lovell What would be the difference between using the original attention module and this new built-in strategy? In layman's terms, what's the difference between the entropy- and attention-based strategies?
Not sure this qualifies as layman's terms, but here's a bit of an explanation...
sharp will be getting an updated+improved version of the focal-point logic from attention, made available via sharp.strategy.attention.

entropy ranks regions based on their vips_hist_entropy value, or "which bit of the image has the most energy?"

attention converts image regions to the LAB and LCH colourspaces and generates 3 masks:
- luminance frequency
- colour saturation
- presence of skin tones

...then adds them together and finds the maximum value to rank regions.
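To compare the two strategies on the same source image, something like the following sketch works with the v0.16.1 API described above (file names are placeholders):

```javascript
var sharp = require('sharp');

// Produce one 200x200 thumbnail per crop strategy for side-by-side comparison.
['entropy', 'attention'].forEach(function (strategy) {
  sharp('input.jpg')
    .resize(200, 200)
    .crop(sharp.strategy[strategy])
    .toFile('thumbnail-' + strategy + '.jpg');
});
```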
v0.16.1 is now available via npm. Thanks everyone for your help with this!
Often when images are shrunk to generate thumbnails through resizing or cropping, the resulting image doesn't look very good simply because of the content and the dimensions of the original image. If there were a way to generate "smart" thumbnails based on the content of the image, it would allow for much better thumbnails. For example, http://29a.ch/sandbox/2014/smartcrop/examples/testsuite.html.
There's a JS library that implements this, https://github.com/jwagner/smartcrop.js. Would it be possible to offer similar functionality in sharp?