-
Notifications
You must be signed in to change notification settings - Fork 696
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Evaluate and implement improvements to code analysis #6111
Comments
Based on interesting criteria for the project I've gathered a comparison between the tools and some insights of possible next steps. Comparison of tools |
Having researched both Pyre/Pysa and Semgrep here is a small list of pros and cons of each of them:
Semgrep:
CodeQL would bound our CI to GitHub and seems overall less feature rich. In both cases, instrumenting the code and writing meaningful rules would require to go over most of the code and a lot of manual work at least the first time but would have the benefit of giving us come kind of automated security and regression checking in CI after the initial effort. I would say that the better example and support availability of Semgrep could make the initial effort easier. Seems also like that in case of abandonment of Pyre/Pysa by Facebook, there would not be a big enough community to maintain the tool. |
(@L3th3 is looking into this and may open a draft PR this sprint, moving into "in development".) |
I encountered quite a few problems with pyre/pysa. While it works fine on smaller projects and the sample cases that they provide, I had a very hard time getting it to work on any real world projects such as securedrop. For future reference:
Overall, while the taint analysis is a very interesting feature, in practice the tool needs to mature before we can really make use of it. In particular, there needs to be some significant improvement on the models they offer, because while we can make our own models, we would still need to rely on the models they provide, which are simply insufficient right now. |
Closing as a research task, semgrep support was added in #6479 |
(Based on an issue originally filed by @emkll internally.)
We should consider improving and tweaking our static code analysis to identify common code quality/security issues. We are currently using Bandit, but Pyre and Semgrep also provide taint analysis, to ensure untrusted or user-specified input does not make its way to sensitive functions without being sanitized.
Furthermore, we should add rules for existing issues we've uncovered historically, to prevent regressions. The following is a non-exhaustive list of static code analysis tools
Bandit
We are currently using Bandit, but the default rules provided by the tool may be insufficient. We can consider creating custom rules for Bandit, as we identify new issues or to verify.
Pyre
We have done several small spikes to investigate the use of Pyre in the context of the SecureDrop Client: freedomofpress/securedrop-client#385
Semgrep
We use Semgrep in the SecureDrop Client as of freedomofpress/securedrop-client#1226 and and in
securedrop-export
as of freedomofpress/securedrop-export#67, but we do not use it in this repo yet.CodeQL
This solution offered by GitHub is similar to Semgrep, but is only available through GitHub Actions, and as such cannot be run locally.
The text was updated successfully, but these errors were encountered: