Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

acme-selfsigned-ca.service fails #147348

Closed
pinpox opened this issue Nov 25, 2021 · 11 comments · Fixed by #147784 or #147834
Closed

acme-selfsigned-ca.service fails #147348

pinpox opened this issue Nov 25, 2021 · 11 comments · Fixed by #147784 or #147834
Labels
0.kind: bug Something is broken 6.topic: nixos Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS

Comments

@pinpox
Copy link
Member

pinpox commented Nov 25, 2021

Describe the bug

The sytemd service acme-selfsigned-ca.service created by services.nginx.virtualHosts.<name>.enableACME failes even though certificates for the configured virtual hosts are correctly retrieved from Lets encrypt.

I have been trying to figure out what is causing the problem but the logs are not very helpful:

root@porree:~/ > systemctl status acme-selfsigned-ca.service
× acme-selfsigned-ca.service - Generate self-signed certificate authority
     Loaded: loaded (/etc/systemd/system/acme-selfsigned-ca.service; linked; vendor preset: enabled)
     Active: failed (Result: start-limit-hit) since Wed 2021-11-24 15:16:21 CET; 20h ago
  Condition: start condition failed at Thu 2021-11-25 06:26:19 CET; 5h 0min ago
             └─ ConditionPathExists=!/var/lib/acme/.minica/key.pem was not met
    Process: 28079 ExecStart=/nix/store/dfr0wc8dpnds6zlrvgn5j98aj6rafd6a-unit-script-acme-selfsigned-ca-start/bin/acme-selfsigned-ca-start (code=exited, status=0/SUCCESS)
   Main PID: 28079 (code=exited, status=0/SUCCESS)
         IP: 0B in, 0B out
        CPU: 238ms

Nov 24 15:18:58 porree systemd[1]: Failed to start Generate self-signed certificate authority.
Nov 24 15:19:07 porree systemd[1]: acme-selfsigned-ca.service: Start request repeated too quickly.
Nov 24 15:19:07 porree systemd[1]: acme-selfsigned-ca.service: Failed with result 'start-limit-hit'.
Nov 24 15:19:07 porree systemd[1]: Failed to start Generate self-signed certificate authority.
Nov 24 15:19:14 porree systemd[1]: Condition check resulted in Generate self-signed certificate authority being skipped.
Nov 24 15:22:28 porree systemd[1]: Condition check resulted in Generate self-signed certificate authority being skipped.
Nov 25 00:13:17 porree systemd[1]: Condition check resulted in Generate self-signed certificate authority being skipped.
journalctl -eu acme-selfsigned-ca.service
Nov 24 15:18:58 porree systemd[1]: Failed to start Generate self-signed certificate authority.
Nov 24 15:18:58 porree systemd[1]: acme-selfsigned-ca.service: Start request repeated too quickly.
Nov 24 15:18:58 porree systemd[1]: acme-selfsigned-ca.service: Failed with result 'start-limit-hit'.
Nov 24 15:18:58 porree systemd[1]: Failed to start Generate self-signed certificate authority.
Nov 24 15:19:07 porree systemd[1]: acme-selfsigned-ca.service: Start request repeated too quickly.
Nov 24 15:19:07 porree systemd[1]: acme-selfsigned-ca.service: Failed with result 'start-limit-hit'.
Nov 24 15:19:07 porree systemd[1]: Failed to start Generate self-signed certificate authority.
Nov 24 15:19:14 porree systemd[1]: Condition check resulted in Generate self-signed certificate authority being skipped.
Nov 24 15:22:28 porree systemd[1]: Condition check resulted in Generate self-signed certificate authority being skipped.
Nov 25 00:13:17 porree systemd[1]: Condition check resulted in Generate self-signed certificate authority being skipped.
Nov 25 00:35:37 porree systemd[1]: Condition check resulted in Generate self-signed certificate authority being skipped.
Nov 25 03:03:28 porree systemd[1]: Condition check resulted in Generate self-signed certificate authority being skipped.
Nov 25 06:26:19 porree systemd[1]: Condition check resulted in Generate self-signed certificate authority being skipped.

Let me know what additional information I can provide to find the cause of this.

Steps To Reproduce

Steps to reproduce the behavior:

  1. Enable a nginx virtual host with services.nginx.virtualHosts.<name>.enableACME = true

Expected behavior

Service not failing

Screenshots

If applicable, add screenshots to help explain your problem.

Additional context

Add any other context about the problem here.

Notify maintainers

@aanderse @andrew-d @arianvp @Emily @flokli @m1cr0man

Metadata

Please run nix-shell -p nix-info --run "nix-info -m" and paste the result.

[user@system:~]$ nix-shell -p nix-info --run "nix-info -m"
 - system: `"x86_64-linux"`
 - host os: `Linux 5.10.79, NixOS, 21.11 (Porcupine)`
 - multi-user?: `yes`
 - sandbox: `yes`
 - version: `nix-env (Nix) 2.4`
 - channels(pinpox): `"home-manager"`
 - nixpkgs: `/nix/store/wkavyf1xaqkpw8f74wig2h18f3mlr862-source````

Maintainer information:
```yaml
# a list of nixpkgs attributes affected by the problem
attribute:
  - acme
# a list of nixos modules affected by the problem
module:
  - acme
  - nginx
@pinpox pinpox added the 0.kind: bug Something is broken label Nov 25, 2021
@Mic92
Copy link
Member

Mic92 commented Nov 25, 2021

I think the issue is └─ ConditionPathExists=!/var/lib/acme/.minica/key.pem was not met. This service should have been not started in the first place. Still for some reasons systemd tries to start it.

@Mic92
Copy link
Member

Mic92 commented Nov 25, 2021

That's my workaround: Mic92@b787367

@m1cr0man
Copy link
Contributor

No, that's not it. The conditionPathExists would only block it from running and produces this status:

○ acme-selfsigned-ca.service - Generate self-signed certificate authority
     Loaded: loaded (/etc/systemd/system/acme-selfsigned-ca.service; linked;>
     Active: inactive (dead)
  Condition: start condition failed at Wed 2021-11-24 20:40:30 GMT; 14h ago
             └─ ConditionPathExists=!/var/lib/acme/.minica/key.pem was not met

The real question here is why it's in condition start-limit-hit. I just created a loop on my own system to restart it 100 times in a loop and it dis not happen, as expected. I'm on systemd 249.4 (slightly behind?) and I would hedge a guess this is an upstream issue at the moment. That or you have a globally configured start limit? I'm still skeptical start limit takes precedence over start condition failed

As a test can you try this (typing on mobile please autocorrect typos 😉 )

systemctl reset-failed acme-selfsigned-ca
for i in {1..50}; do systemctl start acme-selfsigned-ca; done
systemctl status acme-selfsigned-ca

@m1cr0man
Copy link
Contributor

Another possibility is were not seeing an error further up in the service logs. It's possible on your first run the script actually failed, and with many certs it would be attempted many times. Can you take a quick scan further back up the logs see if there's anything obvious?

@pinpox
Copy link
Member Author

pinpox commented Nov 25, 2021

As a test can you try this (typing on mobile please autocorrect typos )

Sure:

root@porree:~/ > systemctl list-units --state failed
  UNIT                       LOAD   ACTIVE SUB    DESCRIPTION
● acme-selfsigned-ca.service loaded failed failed Generate self-signed certificate authority

LOAD   = Reflects whether the unit definition was properly loaded.
ACTIVE = The high-level unit activation state, i.e. generalization of SUB.
SUB    = The low-level unit activation state, values depend on unit type.
1 loaded units listed.
root@porree:~/ > systemctl reset-failed acme-selfsigned-ca

root@porree:~/ > systemctl list-units --state failed
  UNIT LOAD ACTIVE SUB DESCRIPTION
0 loaded units listed.
root@porree:~/ > for i in {1..50}; do systemctl start acme-selfsigned-ca; done

Job for acme-selfsigned-ca.service failed.
See "systemctl status acme-selfsigned-ca.service" and "journalctl -xeu acme-selfsigned-ca.service" for details.
Job for acme-selfsigned-ca.service failed.
See "systemctl status acme-selfsigned-ca.service" and "journalctl -xeu acme-selfsigned-ca.service" for details.
Job for acme-selfsigned-ca.service failed.
See "systemctl status acme-selfsigned-ca.service" and "journalctl -xeu acme-selfsigned-ca.service" for details.
Job for acme-selfsigned-ca.service failed.
See "systemctl status acme-selfsigned-ca.service" and "journalctl -xeu acme-selfsigned-ca.service" for details.
Job for acme-selfsigned-ca.service failed.
See "systemctl status acme-selfsigned-ca.service" and "journalctl -xeu acme-selfsigned-ca.service" for details.
Job for acme-selfsigned-ca.service failed.
See "systemctl status acme-selfsigned-ca.service" and "journalctl -xeu acme-selfsigned-ca.service" for details.
Job for acme-selfsigned-ca.service failed.
See "systemctl status acme-selfsigned-ca.service" and "journalctl -xeu acme-selfsigned-ca.service" for details.
Job for acme-selfsigned-ca.service failed.
See "systemctl status acme-selfsigned-ca.service" and "journalctl -xeu acme-selfsigned-ca.service" for details.
Job for acme-selfsigned-ca.service failed.
See "systemctl status acme-selfsigned-ca.service" and "journalctl -xeu acme-selfsigned-ca.service" for details.
Job for acme-selfsigned-ca.service failed.
See "systemctl status acme-selfsigned-ca.service" and "journalctl -xeu acme-selfsigned-ca.service" for details.
Job for acme-selfsigned-ca.service failed.
See "systemctl status acme-selfsigned-ca.service" and "journalctl -xeu acme-selfsigned-ca.service" for details.
Job for acme-selfsigned-ca.service failed.
See "systemctl status acme-selfsigned-ca.service" and "journalctl -xeu acme-selfsigned-ca.service" for details.
Job for acme-selfsigned-ca.service failed.
See "systemctl status acme-selfsigned-ca.service" and "journalctl -xeu acme-selfsigned-ca.service" for details.
Job for acme-selfsigned-ca.service failed.
See "systemctl status acme-selfsigned-ca.service" and "journalctl -xeu acme-selfsigned-ca.service" for details.
Job for acme-selfsigned-ca.service failed.
See "systemctl status acme-selfsigned-ca.service" and "journalctl -xeu acme-selfsigned-ca.service" for details.
Job for acme-selfsigned-ca.service failed.
See "systemctl status acme-selfsigned-ca.service" and "journalctl -xeu acme-selfsigned-ca.service" for details.
Job for acme-selfsigned-ca.service failed.
See "systemctl status acme-selfsigned-ca.service" and "journalctl -xeu acme-selfsigned-ca.service" for details.
Job for acme-selfsigned-ca.service failed.
See "systemctl status acme-selfsigned-ca.service" and "journalctl -xeu acme-selfsigned-ca.service" for details.
Job for acme-selfsigned-ca.service failed.
See "systemctl status acme-selfsigned-ca.service" and "journalctl -xeu acme-selfsigned-ca.service" for details.
Job for acme-selfsigned-ca.service failed.
See "systemctl status acme-selfsigned-ca.service" and "journalctl -xeu acme-selfsigned-ca.service" for details.
Job for acme-selfsigned-ca.service failed.
See "systemctl status acme-selfsigned-ca.service" and "journalctl -xeu acme-selfsigned-ca.service" for details.
Job for acme-selfsigned-ca.service failed.
See "systemctl status acme-selfsigned-ca.service" and "journalctl -xeu acme-selfsigned-ca.service" for details.
Job for acme-selfsigned-ca.service failed.
See "systemctl status acme-selfsigned-ca.service" and "journalctl -xeu acme-selfsigned-ca.service" for details.
Job for acme-selfsigned-ca.service failed.
See "systemctl status acme-selfsigned-ca.service" and "journalctl -xeu acme-selfsigned-ca.service" for details.
Job for acme-selfsigned-ca.service failed.
See "systemctl status acme-selfsigned-ca.service" and "journalctl -xeu acme-selfsigned-ca.service" for details.
Job for acme-selfsigned-ca.service failed.
See "systemctl status acme-selfsigned-ca.service" and "journalctl -xeu acme-selfsigned-ca.service" for details.
Job for acme-selfsigned-ca.service failed.
See "systemctl status acme-selfsigned-ca.service" and "journalctl -xeu acme-selfsigned-ca.service" for details.
Job for acme-selfsigned-ca.service failed.
See "systemctl status acme-selfsigned-ca.service" and "journalctl -xeu acme-selfsigned-ca.service" for details.
Job for acme-selfsigned-ca.service failed.
See "systemctl status acme-selfsigned-ca.service" and "journalctl -xeu acme-selfsigned-ca.service" for details.
Job for acme-selfsigned-ca.service failed.
See "systemctl status acme-selfsigned-ca.service" and "journalctl -xeu acme-selfsigned-ca.service" for details.
Job for acme-selfsigned-ca.service failed.
See "systemctl status acme-selfsigned-ca.service" and "journalctl -xeu acme-selfsigned-ca.service" for details.
Job for acme-selfsigned-ca.service failed.
See "systemctl status acme-selfsigned-ca.service" and "journalctl -xeu acme-selfsigned-ca.service" for details.
Job for acme-selfsigned-ca.service failed.
See "systemctl status acme-selfsigned-ca.service" and "journalctl -xeu acme-selfsigned-ca.service" for details.
Job for acme-selfsigned-ca.service failed.
See "systemctl status acme-selfsigned-ca.service" and "journalctl -xeu acme-selfsigned-ca.service" for details.
Job for acme-selfsigned-ca.service failed.
See "systemctl status acme-selfsigned-ca.service" and "journalctl -xeu acme-selfsigned-ca.service" for details.
Job for acme-selfsigned-ca.service failed.
See "systemctl status acme-selfsigned-ca.service" and "journalctl -xeu acme-selfsigned-ca.service" for details.
Job for acme-selfsigned-ca.service failed.
See "systemctl status acme-selfsigned-ca.service" and "journalctl -xeu acme-selfsigned-ca.service" for details.
Job for acme-selfsigned-ca.service failed.
See "systemctl status acme-selfsigned-ca.service" and "journalctl -xeu acme-selfsigned-ca.service" for details.
Job for acme-selfsigned-ca.service failed.
See "systemctl status acme-selfsigned-ca.service" and "journalctl -xeu acme-selfsigned-ca.service" for details.
Job for acme-selfsigned-ca.service failed.
See "systemctl status acme-selfsigned-ca.service" and "journalctl -xeu acme-selfsigned-ca.service" for details.
Job for acme-selfsigned-ca.service failed.
See "systemctl status acme-selfsigned-ca.service" and "journalctl -xeu acme-selfsigned-ca.service" for details.
Job for acme-selfsigned-ca.service failed.
See "systemctl status acme-selfsigned-ca.service" and "journalctl -xeu acme-selfsigned-ca.service" for details.
Job for acme-selfsigned-ca.service failed.
See "systemctl status acme-selfsigned-ca.service" and "journalctl -xeu acme-selfsigned-ca.service" for details.
Job for acme-selfsigned-ca.service failed.
See "systemctl status acme-selfsigned-ca.service" and "journalctl -xeu acme-selfsigned-ca.service" for details.
Job for acme-selfsigned-ca.service failed.
See "systemctl status acme-selfsigned-ca.service" and "journalctl -xeu acme-selfsigned-ca.service" for details.
root@porree:~/ > systemctl status acme-selfsigned-ca

× acme-selfsigned-ca.service - Generate self-signed certificate authority
     Loaded: loaded (/etc/systemd/system/acme-selfsigned-ca.service; linked; vendor preset: enabled)
     Active: failed (Result: start-limit-hit) since Wed 2021-11-24 15:16:21 CET; 21h ago
  Condition: start condition failed at Thu 2021-11-25 12:34:51 CET; 14s ago
             └─ ConditionPathExists=!/var/lib/acme/.minica/key.pem was not met
    Process: 28079 ExecStart=/nix/store/dfr0wc8dpnds6zlrvgn5j98aj6rafd6a-unit-script-acme-selfsigned-ca-start/bin/acme-selfsigned-ca-start (code=exited, status=0/SUCCESS)
   Main PID: 28079 (code=exited, status=0/SUCCESS)
         IP: 0B in, 0B out
        CPU: 238ms

Nov 25 12:34:52 porree systemd[1]: Failed to start Generate self-signed certificate authority.
Nov 25 12:34:52 porree systemd[1]: acme-selfsigned-ca.service: Start request repeated too quickly.
Nov 25 12:34:52 porree systemd[1]: acme-selfsigned-ca.service: Failed with result 'start-limit-hit'.
Nov 25 12:34:52 porree systemd[1]: Failed to start Generate self-signed certificate authority.
Nov 25 12:34:52 porree systemd[1]: acme-selfsigned-ca.service: Start request repeated too quickly.
Nov 25 12:34:52 porree systemd[1]: acme-selfsigned-ca.service: Failed with result 'start-limit-hit'.
Nov 25 12:34:52 porree systemd[1]: Failed to start Generate self-signed certificate authority.
Nov 25 12:34:52 porree systemd[1]: acme-selfsigned-ca.service: Start request repeated too quickly.
Nov 25 12:34:52 porree systemd[1]: acme-selfsigned-ca.service: Failed with result 'start-limit-hit'.
Nov 25 12:34:52 porree systemd[1]: Failed to start Generate self-signed certificate authority.

The start seems to fail on all tries.

Can you take a quick scan further back up the logs see if there's anything obvious?

This is the full log as shown by journalctl -u acme-selfsigned-ca.service
https://gist.github.com/pinpox/d79ab04cd0448596973533d6c6dcede8

@m1cr0man
Copy link
Contributor

Thanks for running that test + the logs. It's super weird, it does indeed look like start-limit-hit is being triggered despite systemd clearly skipping based on the condition. I'm going to update my system later and try this myself. What is your nixos-version?

@pinpox
Copy link
Member Author

pinpox commented Nov 25, 2021

What is your nixos-version?

root@porree:~/ > nixos-version
21.11.20211117.715f634 (Porcupine)

Thanks for looking into it. Let me know if you need any additional info!

@veprbl veprbl added the 6.topic: nixos Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS label Nov 25, 2021
@ncfavier
Copy link
Member

ncfavier commented Nov 26, 2021

I'm also running into this on nixos-unstable 263ef4c. Logs look similar to the ones posted.

@m1cr0man
Copy link
Contributor

I tried making a modified version of the ACME test suite whilst on 715f634 to try and provoke this issue. What I do notice is that on nixos-rebuild it does seem to get invoked 3 times in rapid succession, but I'm struggling to get it to fire 5+ times and thus I haven't reproduced this issue yet.

Mic92's workaround is fine but ConditionPathExists does + has worked fine for a very long time, and I haven't yet figured out why you would end up with a start limit error on the service. Will continue to look into it over the weekend, but posting here in hope someone has some bright ideas 😅

@m1cr0man
Copy link
Contributor

Wait..

What if we just set the StartLimitIntervalSec to zero and ignore this non-issue? This is a lighter solution than moving the logic to bash since it would avoid spinning up an interpreter + script just to no-op. Anyone got objections to this? Infact - hold your answers for a PR :)

Mic92 pushed a commit to Mic92/nixpkgs that referenced this issue Nov 29, 2021
Closes NixOS#147348

I was able to reproduce this intermittently in the
test suite during the tests for HTTPd. Adding
StartLimitIntervalSec=0 to disable rate limiting
for these services works fine. I added it anywhere
there was a ConditionPathExists.
github-actions bot pushed a commit that referenced this issue Nov 29, 2021
Closes #147348

I was able to reproduce this intermittently in the
test suite during the tests for HTTPd. Adding
StartLimitIntervalSec=0 to disable rate limiting
for these services works fine. I added it anywhere
there was a ConditionPathExists.

(cherry picked from commit be952ab)
@pinpox
Copy link
Member Author

pinpox commented Dec 1, 2021

Can confirm this works now. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
0.kind: bug Something is broken 6.topic: nixos Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS
Projects
None yet
5 participants