Scarcity on dependabot compatibility scores #4407

Closed
hrz6976 opened this issue Nov 13, 2021 · 7 comments
Labels
service 💁 Relates to Dependabot features GitHub provides

Comments

@hrz6976

hrz6976 commented Nov 13, 2021

Hi there :) I don't know if this is the right place to start a discussion, but I had no luck with GitHub support.

I'm an open-source software researcher analyzing Dependabot PRs. Nearly all (~99%) of the PRs in our dataset have their compatibility score labelled "unknown", which seems counter-intuitive to me. Since npm package updates are the most common case in our Dependabot PR dataset (120503/186697, 64.5%), I ran a quick validation on npm packages:

  • sample the 20 most depended-upon npm packages (lodash, react-dom, vue, axios, etc.) (data source)

  • fetch all major/minor/patch versions released after 2020/01/01 from the npm registry

  • fetch the compatibility score for each version pair (i.e., package P may update from version V1 to version V2)

Here's the code snippet used to extract the compatibility score from the SVG badge (full code and data here):

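import requests
from typing import Optional
from xml.dom import minidom

def get_dependabot_compatibility_score_ver(package_manager, dependency_name, oldver, newver) -> Optional[str]:
    # The badge endpoint returns an SVG; passing params lets requests URL-encode scoped package names.
    url = "https://dependabot-badges.githubapp.com/badges/compatibility_score"
    params = {"dependency-name": dependency_name, "package-manager": package_manager,
              "previous-version": oldver, "new-version": newver}
    res = requests.get(url, params=params, timeout=30)
    doc = minidom.parseString(res.text)
    # The first <text> element that is not the "compatibility" label holds the score (e.g. "unknown").
    for ele in doc.getElementsByTagName("text"):
        label = ele.firstChild.nodeValue if ele.firstChild else None
        if label and label != "compatibility":
            return label
    return None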
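
For reference, the version-pair step from the list above could look roughly like the sketch below. It assumes every (V1, V2) pair with V1 lower than V2 is checked, and that versions are plain major.minor.patch strings with no pre-release tags; the helper names `parse_version` and `version_pairs` are made up for illustration and are not taken from the linked code.

```python
from itertools import combinations

def parse_version(version: str) -> tuple[int, int, int]:
    """Parse a plain 'major.minor.patch' string (pre-release tags are not handled)."""
    major, minor, patch = version.split(".")
    return int(major), int(minor), int(patch)

def version_pairs(versions: list[str]) -> list[tuple[str, str]]:
    """Return every (old, new) pair in which old is strictly lower than new."""
    # Sort numerically so that "1.10.0" comes after "1.9.0", then pair each
    # version with every later one.
    ordered = sorted(set(versions), key=parse_version)
    return list(combinations(ordered, 2))

# Illustrative input only; the real list would come from the npm registry.
for old, new in version_pairs(["4.17.21", "4.17.19", "4.17.20"]):
    print(old, "->", new)
```

Each resulting pair would then be passed to the badge lookup above as (oldver, newver).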

The results were sparse: 1604 of 1629 version pairs (98.5%) are labelled "unknown". This raises a few questions:

  • Is this an artifact of our methodology, or are compatibility scores really this scarce?

  • How is the score calculated? Is Dependabot filtering out certain projects, or leaving the score "unknown" until a certain number of projects have merged the update PR?

  • What could cause such a sparse distribution? (I'm not very familiar with testing, so I have no idea.)

Any ideas are welcome!

@jurre added this to Dependabot Nov 26, 2021
@jurre added the "service 💁 Relates to Dependabot features GitHub provides" label Nov 26, 2021
@brrygrdn
Contributor

Hi @12f23eddde, this is something we've noticed internally recently and are tracking. I don't have anything to share in terms of improving it at the moment, but I'm fairly certain it is a problem on our side and not your methodology.

@tonydehnke

Same issue here, and if you click on the icon/graphic, it no longer loads the page that shows you the previous scores.
image

@brombaut

brombaut commented Mar 8, 2022

Dependabot requires at least 5 candidate updates for a valid compatibility score badge to be shown (at least that was the case back in the summer of 2021 - see #4001).

Dependabot's site (which was taken down after Dependabot was acquired by GitHub) also used to explain that Dependabot only includes results from dependency-update PRs in repositories that have a CI pipeline configured (e.g., GitHub Actions or TravisCI). Also, a PR didn't have to be merged to count towards the compatibility score (e.g., if a PR for a dependency update failed the client's CI pipeline and the client closed the PR without merging, that dependency update would still count towards the compatibility score).

Of course, this is only what it used to say on the dependabot.com website. Things might have changed since then.
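
The rules described above can be sketched roughly as follows. This is only an illustration, not Dependabot's actual implementation: the `CandidateUpdate` type, its fields, and the `compatibility_score` function are hypothetical names, and the sketch assumes the score is the percentage of counted updates whose CI passed, which this thread does not spell out.

```python
from dataclasses import dataclass
from typing import Optional

# Threshold mentioned in this thread (see #4001); not verified against the code.
MIN_CANDIDATE_UPDATES = 5

@dataclass
class CandidateUpdate:
    has_ci: bool     # the repository has a CI pipeline that ran on the PR
    ci_passed: bool  # outcome of that CI run
    merged: bool     # kept only to show that merge status is ignored

def compatibility_score(updates: list[CandidateUpdate]) -> Optional[float]:
    """Return the percentage of CI-passing updates, or None for "unknown"."""
    # Only PRs with CI configured count, whether or not they were merged.
    counted = [u for u in updates if u.has_ci]
    if len(counted) < MIN_CANDIDATE_UPDATES:
        return None  # too few candidate updates, so the badge shows "unknown"
    passed = sum(1 for u in counted if u.ci_passed)
    return 100.0 * passed / len(counted)

# Four passing PRs plus one that failed CI and was closed without merging.
sample = [CandidateUpdate(True, True, True) for _ in range(4)]
sample.append(CandidateUpdate(True, False, False))
print(compatibility_score(sample))
```

Under these assumptions, a version pair with fewer than five CI-backed PRs stays "unknown" no matter how the PRs turned out, which would be consistent with the sparse results reported at the top of the thread.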

@tonglil

tonglil commented Sep 20, 2022

I don't think this feature is functioning properly as of late.
Never seen anything other than unknown.
This was useful, now just broken?

@jeffwidman
Member

This is still on our radar. Many thanks to @Nishnha, who recently tweaked the DB query used to calculate the compatibility score. Our metrics showed a noticeable reduction in the number of Unknown results.

However, the metrics also show that we still return Unknown more often than we'd like, so we're still tracking improving it further when we have more time down the road.

@jeffwidman
Member

@Nishnha made some further improvements here after @malcolmtaylor noticed the database wasn't using an index like we expected. Since then, we've seen a massive reduction in the number of query timeouts for these badges. 🎉

That won't solve all cases of missing / unknown badges because, as noted above, we do require a minimum number of candidate PRs before we show the badge (I think it's 5 but haven't double-checked the code)... but it should help with a bunch of them.

I'm going to close this for now, but if you happen to notice a PR bumping a popular lib where you expected we'd have enough candidate PRs to generate a badge but still see unknown, please feel free to file an issue and we can double-check what's going on.

@Drowze

Drowze commented Nov 29, 2023

I feel there's still an issue here 🤔
In our case, we're using Dependabot across all of our Ruby repositories, and we're still getting "Unknown" compatibility scores on pretty much all of our dependency upgrades (with rare exceptions). For example:

Screenshot 2023-11-29 at 11 31 56

Screenshot 2023-11-29 at 11 32 10
