-
Notifications
You must be signed in to change notification settings - Fork 97
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow mirrors that support both HTTP and HTTPS #186
Open
elboulangero
wants to merge
15
commits into
etix:master
Choose a base branch
from
elboulangero:github/http+https
base: master
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Closed
This needs a manual rebase. |
In this commit, all the code block that filter mirrors (ie. separate them in two groups, mirrors candidates for the redirect, and mirrors excluded) is moved to a separate function, for the purpose of adding test cases for it. The test cases added don't cover every possible situation, far from that, but at least it's a start.
Instead of modifying mlist in place, we can just create a new slice to hold the mirrors that are accepted. Then we don't have to move elements within an existing slice, making the code simpler. Gbp-Pq: Name prep-selection-Simplify-Filter-a-bit.patch
In this piece of the code we compare distances twice, and IMO it's easier to read if we put the mirror distance on the same side of the operator for both comparisons.
Define and use a new helper `HasAnyPrefix`, to make some lines of code more concise. At the moment, it's used only twice, but we're going to use it more in the upcoming commits, so it's going to be useful.
Those functions are trivial for now, but in the upcoming commits we will allow a mirror to support both HTTP and HTTPS. The logic to know if it's up will be more complex, given that a mirror can be up over HTTP but down over HTTPS, for example. Similarily, a status string will be a bit more than just "up" or "down", eg. when the mirror is up for HTTP and down for HTTPS.
With this change, the actual health check implementation (ie. the http request) is moved to a new function called `healthCheckDo`. The original `HealthCheck` function does only two things now: 1) pick the file that will be used to perform the health check and 2) call `healthCheckDo`. The purpose of this split is to allow, in the upcoming commits, to perform 2 health checks instead of one. That will be used for mirrors that support both HTTP and HTTPS, we will need to check that requests work for both protocols.
This commit adds the protocol (HTTP or HTTPS) to the health-check logs. This in preparation for the next commits, where a mirror can support both HTTP and HTTPS, and health checks will be done twice: over HTTP and HTTPS. In this case, logging that a mirror is "up" or "down" won't be enough, we'll need to know the protocol concerned as well. 1) Changes to the `mirrorbits logs` command Let's take a look at how `mirrorbits log MIRROR` looks like before this commit: ``` 2025-01-08 08:54:42 UTC: Mirror is down: Unreachable 2025-01-08 09:11:12 UTC: Mirror is up ``` And after this commit: ``` 2025-01-08 08:55:02 UTC: HTTPS mirror is down: Unreachable 2025-01-08 09:11:12 UTC: HTTPS mirror is up ``` If you considered `mirrorbits log` as a stable format (ie. maybe you have scripts that parse its output), then take this commit as a breaking change. You'll need to adjust your scripts accordingly. 2) Changes to the mirrorbits stderr logs Similarily, the mirrorbits logs (that go to stderr by default) are updated as such. Before this commit: ``` 2025/01/09 15:35:53.121 UTC kali.darklab.sh Down! Status: 503 2025/01/10 03:48:56.864 UTC repo.jing.rocks Up! (509ms) ``` After this commit: ``` 2025/01/09 15:35:53.121 UTC kali.darklab.sh HTTP Down! Status: 503 2025/01/10 03:48:56.864 UTC repo.jing.rocks HTTPS Up! (509ms) ``` 3) How to review this commit File `mirrors/logs.go` is simple to review, there is a new field `proto` in the struct, and updates to support this new field. The field `proto` is of type `Protocol`, this is a new type introduced in `mirrors/mirrors.go`. Finally, in order to actually add this field to the log, we need to update the signatures of `MarkMirrorUp`, `MarkMirrorDown` and `SetMirrorState` (`mirrors/mirrors.go`), then the caller `daemon/monitor.go` can make use of it.
This is in preparation to support relative URLs in the `HttpURL` field. These are trivial code changes, aimed at making the diff more readable in the upcoming commit that will add support for relative URLs.
This is in preparation to support relative URLs. This commit adds a new field `AbsoluteURL` in the `Mirror` struct (`mirrors/mirrors.go`). This field is derived from the field `HttpURL`, and is guaranteed to be a absolute URL (ie. to have a scheme). At the moment, it's equal to `HttpURL` (since `HttpURL` always have a scheme), but that will change in the next commits. As can be seen (or rather, guessed) in the Mirror struct, this field is among the fields that are set during mirror selection, so only the code that comes after that can make use of it, and it is not saved in the redis db. In practice, this field is set in two places: `http/selection.go` and also in `http/http.go` (in the code that handles fallbacks mirrors).
Before this commit, the value for HttpURL must start with either `http://` or `https://`, for example `http://mirror.com/foobar/`. With this commit, we also allow "relative URLs", ie. URLs without a scheme, meaning that `mirror.com/foobar/` is now a valid value, for mirrors and also for fallback mirrors (defined in mirrorbits.conf). In case of relative URL, the protocol is assumed to be HTTP, but that's only temporary. In upcoming commits, relative URL will mean that the mirror supports both HTTP and HTTPS. This change adds some complexity: all the code that uses `HttpURL` must now check if the URL has a scheme, and deal with that. There's no one-size-fits-all helper for that (even though for now it looks like that, but that's because relative URL default to HTTP now, and that's only temporary). Additionally, codes that runs after the selection process, and that requires a absolute URL, must now use `AbsoluteURL` instead of `HttpURL` (that's `templates/mirrorlist.html` and `http/pagerenderer.go`).
So far, the field ExcludeReason had two uses. First, it was set during health-check when the mirror was found to be down (hence commited to redis db), and it was later set during mirror selection when the mirror was excluded (in this case, the value set was never set in the redis db, it was just discarded after the request was served). So these are two different things that were conflated in one field, and that worked, but now that we're about to support both HTTP and HTTPS for mirrors, that won't cut it. So, this commit does two things: 1) Rename ExcludeReason to DownReason - this is the field that is set by the health check 2) Add a new ExcludeReason field, which is to be set by the mirror selection, and used by the code that comes after that and wants to know why the mirror was excluded Note that, given that the redis key is also renamed, there needs to be a database migration - that will come in a later commit.
In previous commit "Support relative URL in the mirror HttpURL" we started to accept mirror URLs without a scheme, eg. mirror.com/foobar/, but in this case the scheme was HTTP by default, so it was not very interesting... With this commit, we change this behaviour: mirrors with a relative URL now support both HTTP and HTTPS. It means that the health check is done twice (for HTTP and for HTTPS, cf. `daemon/monitor.go`). We duplicate the fields `Up, DownReason` to become `HttpUp, HttpsUp, HttpDownReason, HttpsDownReason` in the Mirror struct `mirrors/mirrors.go`). And finally rework the selection logic to redirect appropriately (`http/selection.go`). Note that, given that some redis keys are renamed, there needs to be a database migration - that will come in a later commit.
Refer to previous commit for details. Note that a good part of the code was taken from database/v1/version1.go, it wasn't written from scratch.
As there was changes in rpc.proto in previous commits
Mirrorbits doens't respect the incoming protocol *strictly*, instead the behavior is as such: - HTTPS request: HTTPS redirect - HTTP request: HTTP or HTTPS redirect So it doesn't try to respect the HTTP protocol, and is happy to redirect HTTP requests to HTTPS. Which makes sense if mirrorbits is meant to serve files downloaded from a web browser, where HTTPS is the norm. However for Linux distro, for a package repository, it's important to respect the incoming protocol. There are minimal systems that don't have HTTPS support out of the box. For Debian (and Debian-like), the standard Docker image doesn't have CA certificates pre-installed, and apt is configured to use HTTP repositories. The first command to run is `apt update`, and if ever that gets redirected to HTTPS, apt fails. With this commit, we can now change this behavior, and ask mirrorbits to respect the protocol strictly. As a bonus point, having it in mirrobits.conf allows to document a rather obscure (but important) aspect of mirrorbits behavior. Note that, by default, the setting is true, so the default behavior of mirrorbits is still the same: redirect HTTP to HTTPS is allowed.
46e71ac
to
7700eb2
Compare
Rebased |
Open
I've reviewed patches 1 to 8 so far. |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This pull request is a complete rework of #156 and replaces it.
short story
This PR brings a new feature to mirrorbits: users can now define mirrors that support both HTTP and HTTPS, by using a relative URL (ie. URL without scheme) for the field
HttpURL
. Additionally, a new settingAllowHTTPToHTTPSRedirects
(default: true) allow users to control this behavior that was hardcoded in mirrorbits before.Bonus point: in the process of reworking the mirror selection, this PR added serious unit testing to this critical function!
long story
So far, mirrorbits would only accept URLs that start with either "http://" or "https://", meaning that it's not possible to express that a mirror supports both HTTP and HTTPS. Which was Ok in a sense, due to another detail: mirrorbits happily redirects HTTP requests to HTTPS mirrors. I suspect the initial use-case for mirrorbits is to serve downloads from a web page, and redirecting HTTP to HTTPS without asking is standard practice for the web.
However, for Linux distros (at least, based on Debian), this behaviour doesn't work well. The decision to use HTTP or HTTPS belongs to the user, and the redirector must respect it. There are minimal systems that don't have HTTPS support out of the box. For Debian (and Debian-like), the standard Docker image doesn't have CA certificates pre-installed, and apt is configured to use HTTP repositories. The first command to run is
apt update
, and if ever that gets redirected to HTTPS, apt fails.Therefore, this PR makes mirrorbits much more useful for distro mirrors, by doing mostly two things:
mirror.example.com/foo/
in place ofhttp://mirror.example.com/foo/
) for the fieldHttpURL
of a mirror, therefore declaring that a mirror supports both HTTP and HTTPS.AllowHTTPToHTTPSRedirects
tofalse
.under the hood
When a mirror has a relative URL, it means that mirrorbits will perform 2 health-checks (one over HTTP and one over HTTPS) instead of one. It also means that, internally, mirrorbits now has 2 fields to keep track of the status of a mirror:
HttpUp
andHttpsUp
, and also to different fieldHttpDownReason
andHttpsDownReason
to describe why the mirror is down.Which also means that this pull request brings: a rebuild of the generated file
rpc.pb.go
and a database upgrade (v1 -> v2
).visible changes for users
For the command
mirrorbits list
, column STATE, and when a mirror supports both HTTP and HTTPS, a mirror can now beup/down
(if it's up over HTTP but down over HTTPS) ordown/up
(HTTP down, HTTPS up).For the command
mirrorbits logs
, now mirrorbits reports the protocol, meaning that a lineMirror is up
is nowHTTPS mirror is up
, for example. Same with the mirrorbits stdout logs, the linerepo.jing.rocks Up! (509ms)
becomesrepo.jing.rocks HTTPS Up! (509ms)
.The HTML templates have been updated in this PR, users who don't use the stock template will need to update their copies.
Database upgrade (should take only a few seconds), meaning NO ROLL BACK POSSIBLE AFTER UPDATING!
quick guide to review
The 2 initial commits actually belong to #185, and can be dropped if ever #185 is merged.
Commits b4bc62d to 88a07e7 add unit testing to mirror selection (and could be merged in independently of this PR).
Commits 825278f to 493d8b9 are fairly trivial, and just prepare the code for changes to come.
Commits cf647db to afbd77d are where the big changes are done.
Finally comes the 3 last commits: db upgrade, regen of rpc.pb.go, and the addition of the setting
AllowHTTPToHTTPSRedirects
which is needed to make the whole PR really useful.to finish
The first iteration of this work at #156 has been running in production for more than a year now. This new PR has been deployed in a staging instance for a few weeks, and is now in production at eg. https://http.kali.org/README?mirrorlist.
While #156 was functional, it was a bit rushed out and not ready, the approach was clumsy. I believe this PR is a much better attempt.
This PR will close #157
Thanks!