Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Changes to how we bring in publicsuffix #1479

Closed
jsha opened this issue Feb 10, 2016 · 23 comments
Closed

Changes to how we bring in publicsuffix #1479

jsha opened this issue Feb 10, 2016 · 23 comments

Comments

@jsha
Copy link
Contributor

jsha commented Feb 10, 2016

Right now, when the publicsuffix list is updated, we wait for golang.org/x/net to update the copy baked into the publicsuffix package, and then we merge that change into our fork of the net repo (we have a fork because we have a local patch, and then we updated the vendorized copy of our fork in boulder. This is all rather complicated and time consuming. This issue is to discuss ways to improve the process.

@jsha
Copy link
Contributor Author

jsha commented Feb 10, 2016

@bradfitz said:

I'm wondering if we should add a different mode to Go's publicsuffix package where instead of baking in the data structure, it instead regularly polls or otherwise subscribes to the master list and keeps it all in memory (in a larger format, not the 10 minutes of CPU "crushed" form). Then boulder would always be up-to-date with what's live in the public suffix list, without needing to wait for Go.

This is appealing, though I am somewhat nervous about relying on an automated network fetch for this, given that it affects which domains we are willing to issue to.

I would be more comfortable with an approach where the publicsuffix package could load the current version of the list from a file (in the original PSL format). We would then maintain a copy of the PSL in the Boulder repo, updated periodically from the original and deployed with our built packages.

/cc @bradfitz and @weppos

@bradfitz
Copy link

That's fine with me too.

(Btw, you may want to edit your second comment in this issue to make it clear that I wrote that first quoted part)

/cc @nigeltao

@weppos
Copy link
Contributor

weppos commented Feb 10, 2016

I think that this issue is somehow connected with an issue I recently had with the current publicsuffix package in the PSL itself.

I started to create a test suite and a linter for the PSL. I'm writing the linter in Go, and I wanted to use Go also to test the list, so that I could run all the tests in a single Travis run. However, I could not use the Go package because I needed the ability to load an arbitrary list from file or other source.

That's why I ended up (at the moment) to use my Ruby library to load the PSL file and test it.

My Ruby library works in a way really close to what @bradfitz suggested. It loads and arbitrary PSL in memory (the source can be anything) into a List object, then you call the methods on the specific instance of the List.

A default list is packaged in the gem itself and it is used unless otherwise specified.

I may work on something similar in Go, if you feel that it would make sense. It should not take too much long to get a simple prototype ready for some early comments. What do you think? Does the approach seem reasonable to you?

@jsha
Copy link
Contributor Author

jsha commented Feb 10, 2016

That sounds good to me. See also my linked review at https://go-review.googlesource.com/#/c/16712/. We treat ICANN TLDs differently than private ones, so need a special ICANNTLD method.

@weppos
Copy link
Contributor

weppos commented Feb 15, 2016

@jsha I'll work on something this week and I'll ping you when done. I've just finished working on some other internal tools for the PSL that absorbed a major part of the time I dedicate to OSS.

I'd definitely love to have a feedback from @bradfitz and @nigeltao as well, as they indeed have a larger experience than me on Go and I'll have a lot to learn from them.

@jsha
Copy link
Contributor Author

jsha commented Feb 27, 2016

Update: After review on https://go-review.googlesource.com/#/c/16712/, I realized Boulder can make do without ICANNTLD, so we'll be snapping that indirection (#1518).

Right now I'm leaning towards checking in a copy of public_suffix_list.dat to the Boulder repo, and generating table.go at build time. Right now gen.go takes 20 minutes to crush everything. There's a -crush=false flag that should be faster, but without crushing, the concatenated DNS labels become too big for the number of bits publicsuffix wants to use to represent them.

So I spent the evening hacking on gen.go to make it faster. I've got a branch that runs in 2-4 seconds and produces a list crushed as small as the current gen.go: https://github.com/golang/net/compare/master...jsha:faster-gen?expand=1. It's pretty messy still: I've got to clean it up so it makes more sense, and vet it for any late-night stupid mistakes.

@jsha
Copy link
Contributor Author

jsha commented Feb 29, 2016

@jsha jsha added this to the Sprint 2016-03-01 milestone Mar 1, 2016
@jsha jsha self-assigned this Mar 3, 2016
@weppos
Copy link
Contributor

weppos commented Mar 12, 2016

@jsha I see you made some progress. However, as promised, I took the time to write a different implementation of the publicsuffix package that brings to Go some of the feature I built in other packages.

You can find it here https://github.com/weppos/publicsuffix-go

The package is young, but it passes all the official PSL tests therefore it can be considered quite stable. I also created a x/net/publicsuffix drop-in adapter. Simply replace the inclusion of the x/net library with the adapter, and you have a zero-config replacement.

This could work in most cases, however if you want to take advantage of the real flexibility of the package, you should use the github.com/weppos/publicsuffix-go/publicsuffix package directly.

There are a few features you may find particularly useful:

  • Ability to create multiple lists: e.g. you can have a standard list, and one tailored to specific requirements (such as private domains validations)
  • Ability to ignore private domains at runtime (given a list), or for an entire list (by creating a list skipping the private domains). This is useful if you need to perform some IANA TLD validation
  • Ability to merge/load other lists. This feature may be specifically useful to Let's Encrypt to deal with the rate-limit issue for DDNS providers. Instead of relying on the Public Suffix and/or forking the list (and introduce the effort to deal with mainstream updates) you can simply maintain a separate custom list of whitelisted suffixes (either defined in a PSL-like file, or in a string, or in a Go code). Then, you can load that list and merge it in the default packaged list, or in another custom list.

This package also gives you more advanced access to the rule that determined a specific result, if needed.

One final note: the library currently uses a sequential scan to detect the matching rules. This is not the most efficient scan, but my primary focus in the first phase was on compatibility and compliance, rather than performance. Performances should not be an issue, unless you need to call parse thousands of strings per second. The library was designed to hide the internals, I plan to work on refactoring the lookup using an index (as I did in the Ruby library) that will solve the sequential scan performance issue.

Let me know what do you think!

@nigeltao
Copy link

@weppos neat! I dropped some code review comments, mostly on Go style, on the various github commits.

weppos added a commit to weppos/publicsuffix-go that referenced this issue Mar 13, 2016
Thanks Nigel for the review, I really appreciate it! 🙇

See letsencrypt/boulder#1479 (comment)
@weppos
Copy link
Contributor

weppos commented Mar 13, 2016

Thanks @nigeltao, I incorporated all your feedbacks! 👍 The only one I have to work on is the concurrency protection for the lazy-initialization of the default list.

@jsha
Copy link
Contributor Author

jsha commented Mar 14, 2016

This is excellent, thanks @weppos! I will take a closer look this week, but it sounds like this will probably meet our needs, in a more convenient way than my approach was going to.

@mjtko
Copy link

mjtko commented Jun 8, 2016

Hi,

After some reading around the subject I saw from letsencrypt/net#12 (comment) that work was being planned in the near future to integrate publicsuffix-go and resolve this issue. Is it ongoing somewhere and/or can I do anything to help out?

Thanks!

@weppos
Copy link
Contributor

weppos commented Jun 8, 2016

I talked with @jsha and I hope to get a PR ready for this week for review and comments.

@mjtko
Copy link

mjtko commented Jun 8, 2016

Excellent. Thanks for the update!

@Scratch-net
Copy link

Sorry to break in, but will this resolve the "detail": "Error creating new cert :: Too many certificates already issued for: ddns.net" Error?

@kaefert
Copy link

kaefert commented Jun 13, 2016

@Scratch-net yes. The rate-limit for all the domains listed here: publicsuffix/list@469ec07 should be lifted once letsencrypt merges a version of the publicsuffixlist that includes this pull request (which was merged 2016-05-25)

Two possible solutions - either this new way of including the publicsuffixlist gets implemented (publicsuffix-go) that is talked about here - or this pullrequest gets merged: letsencrypt/net#12

@Scratch-net
Copy link

@kaefert Thanks, will wait for the fix and stick to dedyn.io for now

weppos added a commit to weppos/boulder that referenced this issue Jun 24, 2016
jsha pushed a commit that referenced this issue Jun 30, 2016
This PR replaces the `x/net/publicsuffix` package with `weppos/publicsuffix-go`.

The conversations that leaded to this decision are #1479 and #1374. To summarize the discussion, the main issue with `x/net/publicsuffix` is that the package compiles the list into the Go source code and doesn't provide a way to easily pull updates (e.g. by re-parsing the original PSL) unless the entire package is recompiled.

The PSL update frequency is almost daily, which makes very hard to recompile the official Golang package to stay up-to-date with all the changes. Moreover, Golang maintainers expressed some concerns about rebuilding and committing changes with a frequency that would keep the package in sync with the original PSL. See #1374 (comment)

`weppos/publicsuffix-go` contains a compiled version of the list that is updated weekly (or more frequently). Moreover, the package can read and parse a PSL from a String or a File which will effectively decouple the Boulder source code with the list itself. The main benefit is that it will be possible to update the definition by simply downloading the latest list and restarting the application (assuming the list is persisted in memory).
@andig
Copy link

andig commented Jul 11, 2016

@Scratch-net yes. The rate-limit for all the domains listed here: publicsuffix/list@469ec07 should be lifted once letsencrypt merges a version of the publicsuffixlist that includes this pull request (which was merged 2016-05-25)

Checking in here. What is missing to make letsencrypt merge this version? Is there anybody who has the task on his radar?

This is a blocker for loads of ddns users out there...

@mjtko
Copy link

mjtko commented Jul 11, 2016

@andig: I spent a few minutes investigating the progress of this earlier today, and it looks like #1969 has been merged. It should go live with the next release of Boulder, though I can't work out when that's scheduled. The last release of boulder on June 29 just missed this update, and there hasn't been one since.

I guess we need to keep an eye on the status history page for when this change finally goes live!

@cpu
Copy link
Contributor

cpu commented Jul 11, 2016

@andig @mjtko You're correct #1969 has been merged to the staging branch.

We do our production releases on Thursdays. Normally this changed would have been pulled into the production branch & deployed last Thursday but we had to cancel the deploy for unrelated reasons. If nothing changes this should reach production sometime this Thursday (the 14th).

Thanks for your patience!

@jsha jsha removed their assignment Jul 12, 2016
@andig
Copy link

andig commented Jul 14, 2016

Boulder is being updated right now, will retest asap.

@Scratch-net
Copy link

I've already got ddns.com cert, so it's definitely working! Thanks!

@jsha
Copy link
Contributor Author

jsha commented Sep 20, 2016

Fixed by using @weppos's publicsuffix library. Thanks!

@jsha jsha closed this as completed Sep 20, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

10 participants