This issue was moved to a discussion.
You can continue the conversation there. Go to discussion →
Inconsistent mapping of volume PCA clusters to UMAP plot between analyze_landscape and analyze_landscape_full
#414
Comments
Actually, I suspect I must have made a mistake somewhere...I'll update this when I find out what I did wrong. Update: However, to refer to the 999 volumes I filtered using This made me realize I had some misconceptions about what
So, a few questions:
Hey, I'm making this a Discussion as there isn't an outright issue being described here, although as I've mentioned in e.g. #413 we are working on some overhauls of the I will get back to you soon about your other points, but in the meantime, it would be helpful if you took a look at some of our older material for the landscape analyses, the former describing the sketching used in
Hi!
I am currently using cryoDRGN v3.3.3 and have been working with the analyze_landscape command. From my understanding, this command performs PCA on the 1000 volumes from the kmeans1000 directory and plots each particle cluster onto the UMAP plot generated by the analyze command in the filtering notebook.
After filtering out one outlier, I reran the command on the remaining 999 volumes, which produced the following histograms and plots. As shown, the clusters from the volume PCA align well with the clusters in the UMAP plot:

However, when I applied the same clustering analysis in the Jupyter notebook generated by the analyze_landscape_full command, I observed a completely different mapping of the same 999 volumes onto the UMAP plot. In this case, the clusters identified by the PCA appear to be randomly dispersed:




I am currently attempting to write my own code in the analyze_landscape_full notebook to plot the clusters onto the UMAP plot manually. Any insights into why the behavior of this notebook differs from the analyze_landscape command would be greatly appreciated!
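For reference, a minimal sketch of this kind of manual plotting, not the notebook's own code. It assumes three inputs have already been loaded: the k-means center index of every particle, the original k-means indices of the volumes kept after filtering, and the cluster label of each kept volume. All function and variable names here are hypothetical. The lookup is built from the original volume indices rather than from positions in the filtered list, since removing one of the 1000 volumes shifts every later position by one.

```python
def map_volume_clusters_to_particles(particle_kmeans_labels, kept_volume_ids,
                                     volume_cluster_labels, missing=-1):
    """Give each particle the cluster label of the sketched volume it belongs to.

    particle_kmeans_labels: k-means center index of every particle (0..K-1)
    kept_volume_ids: ORIGINAL k-means indices of the volumes kept after filtering
    volume_cluster_labels: cluster label of each kept volume, in the same order
    missing: label given to particles whose volume was filtered out
    """
    if len(kept_volume_ids) != len(volume_cluster_labels):
        raise ValueError("need exactly one cluster label per kept volume")
    # Key the lookup by original volume index, not by position in the filtered list.
    lookup = dict(zip(kept_volume_ids, volume_cluster_labels))
    return [lookup.get(k, missing) for k in particle_kmeans_labels]


def plot_clusters_on_umap(umap_xy, particle_clusters,
                          out_png="umap_by_volume_cluster.png"):
    """Scatter the UMAP embedding, colored by the per-particle cluster label."""
    import matplotlib.pyplot as plt  # imported here so the mapping works without it

    xs = [point[0] for point in umap_xy]
    ys = [point[1] for point in umap_xy]
    fig, ax = plt.subplots(figsize=(5, 5))
    points = ax.scatter(xs, ys, c=particle_clusters, s=1, cmap="tab10")
    ax.set_xlabel("UMAP1")
    ax.set_ylabel("UMAP2")
    fig.colorbar(points, ax=ax, label="volume cluster")
    fig.savefig(out_png, dpi=200)
    plt.close(fig)


# Toy data: four k-means centers, volume 2 was filtered out as an outlier.
toy_particle_labels = [0, 1, 2, 3, 2, 0]
toy_kept_ids = [0, 1, 3]
toy_volume_clusters = [5, 5, 7]
toy_result = map_volume_clusters_to_particles(
    toy_particle_labels, toy_kept_ids, toy_volume_clusters
)
```

With real data, the per-particle labels returned by the first function would be passed to plot_clusters_on_umap together with the (N, 2) UMAP coordinates. Particles labeled with the missing value can be dropped or drawn in a neutral color first.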
I'm also not very clear on what exactly analyze_landscape_full does. The state_particle_counts.png histogram generated by the analyze_landscape command already seems to cluster the entire dataset of ~1 million particles. Why does the analyze_landscape_full command generate additional training volumes, and why does it use a neural net to assign all of the particles to clusters?
Best,
cbeck