Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Preliminary steps to save the CI infrastructure #39009

Open
wants to merge 38 commits into
base: develop
Choose a base branch
from

Conversation

kwankyu
Copy link
Collaborator

@kwankyu kwankyu commented Nov 20, 2024

To improve the situation with the CI infrastructure, this PR:

  • added comments untangling obscure code in CI-related files, for those poor guys who ever attempt to read the files for whatever reasons.

  • while doing the cosmetic changes, a bug (about -uninstall targets) was found build/make/Makefile.in, which is fixed here.

    to test, do

    $ ./configure --enable-dot2tex | grep dot2tex
    $ make build | grep dot2tex
    $ ./configure --disable-dot2tex | grep dot2tex
    $ make build | grep dot2tex
    
  • fixed some jobs in the CI-linux workflow that fail because of duplicate artifact names.

  • removed ubuntu-lunar, ubuntu-mantic, conda-forge-python3.11, ubuntu-bionic-gcc_8-i386, debian-bullseye-i386 from the list of the default systems that CI runs for. This is how to properly modify the list:

    • first edit tox.ini (find DEFAULT_SYSTEM_FACTORS)
    • run tox -e update_docker_platforms
    • commit the changes
  • removed old versions of linuxmint and added new versions.

  • "optional" and "experimental" jobs now run upon "standard" docker images, instead of "maximal" ones, to avoid "out of runner space" error.

  • renamed "Reusable workflow for Docker-based portability CI" to "Workflow for Linux portability CI" for short name and made it runnable through github interface to facilitate testing specific platform by adding "workflow-dispatch" calling docker.yml.

    test: https://github.com/kwankyu/sage/actions/workflows/docker.yml

  • added helpful comments and updated the developer doc

  • reimplemented .ci/write-dockerfile.sh so that simplified Dockerfile is generated for present and future stability

  • turned off failing jobs in "CI Linux incremental"

  • removed seemingly useless subprojects/factory directory to eliminate certain git warnings.

  • turned off "standard-sitepackegs" and "standard-constraints_pkgs-norequirements" jobs as they fail on (almost) all platforms.

test CI run: https://github.com/kwankyu/sage/actions/runs/13570424080
compare with the status quo: https://github.com/sagemath/sage/actions/runs/13465411463

test CI with a PR: kwankyu#82

The main objective of this PR is to solve issues with the workflow "CI Linux" such that a failure on a platform reveals solely some problem of sage built on the platform, but not a problem of the CI infrastructure. After this PR, hopefully, each of failing platforms should be tackled individually. If a platform fails, perhaps we should

  1. decide first whether to support the platform or not.
  2. if the platform is supported, open a github issue for it.
  3. if the platform is not supported, then remove it from the "master list of supported linux platforms" in tox.ini.
  4. if a supported platform constantly fails but no PR for the issue is present, then we may turn it off (by commenting it out) until fixed.

Only decent platforms according to the CI results should be listed in https://github.com/sagemath/sage/wiki/Sage-10.6-Release-Tour#availability-and-installation-help.

The following diagram shows how packages are installed for each of CI jobs:

                _prereq | standard package | optional package | experimental package
-------------------------------------------------------------------------------------
"minimal"        SSSSSS | ---------------- |
"standard"       SSSSSS | SSSSSSSSSSS----- |
"maximal"        SSSSSS | SSSSSSSSSSS----- | SSSS------------ | 
"optional"       SSSSSS | SSSSSSSSSSS----- | ---------------- | 
"experimental"   SSSSSS | SSSSSSSSSSS----- |                  | ------------------ |

where "S" represents system package and dash "-" represents Sage package. Hence

  • In the test results of "minimal" job, we can examine how well standard sage packages behave with sage.
  • In the test results of "standard" job, we can examine how well standard system packages behave with sage.
  • In the test results of "maximal" job, we can examine how well optional system packages behave with sage.
  • In the test results of "optional" job, we can examine how well optional sage packages behave with sage.
  • In the test results of "experimental" job, we can examine how well experimental sage packages behave with sage.

📝 Checklist

  • The title is concise and informative.
  • The description explains in detail what this PR is about.
  • I have linked a relevant issue or discussion.
  • I have created tests covering the changes.
  • I have updated the documentation and checked the documentation preview.

⌛ Dependencies

@kwankyu kwankyu changed the title Add comments untangling complicated code Add comments untangling obscure code in a few build-related files Nov 20, 2024
Copy link

github-actions bot commented Nov 20, 2024

Documentation preview for this PR (built with commit a501144; changes) is ready! 🎉
This preview will update shortly after each push to this PR.

@kwankyu kwankyu force-pushed the p/add-comments-to-scripts branch from ff6bdcd to 7da5efe Compare November 20, 2024 13:47
@kwankyu kwankyu changed the title Add comments untangling obscure code in a few build-related files Add comments untangling obscure code in a few CI-related files Nov 21, 2024
@kwankyu kwankyu force-pushed the p/add-comments-to-scripts branch from db666ee to 6df7a6a Compare February 13, 2025 08:35
@kwankyu kwankyu changed the title Add comments untangling obscure code in a few CI-related files Add comments untangling obscure code in CI-related files Feb 13, 2025
@kwankyu kwankyu changed the title Add comments untangling obscure code in CI-related files Preliminary steps to save the CI infrastructure Feb 13, 2025
@kwankyu kwankyu marked this pull request as ready for review February 13, 2025 10:12
@kwankyu kwankyu mentioned this pull request Feb 13, 2025
5 tasks
@dimpase dimpase added the disputed PR is waiting for community vote, see https://groups.google.com/g/sage-devel/c/IgBYUJl33SQ label Feb 13, 2025
@dimpase
Copy link
Member

dimpase commented Feb 13, 2025

before proceeding it's good to decide whether we rather go with the other PR

@kwankyu kwankyu marked this pull request as draft February 15, 2025 00:39
@kwankyu kwankyu mentioned this pull request Feb 27, 2025
5 tasks
@tobiasdiez
Copy link
Contributor

tobiasdiez commented Feb 27, 2025

As I've said before

"optional" and "experimental" jobs now run upon "standard" docker images, instead of "maximal" ones

is not fine with me. The optional job should test that installing all optional packages from the system manager and then building the missing ones with sage-the-distro works. If you base it on "standard" then you are testing that building all optional packages from sage-the-distro works. This is akin to introducing a new "minimal" but now for optional packages.
Instead I would recommend to analyze why the "maximum" image is so much larger than "standard" - probably it is installing a bunch of nonsense that is actually not needed.

removed ubuntu-lunar, ubuntu-mantic

Not sure if you really want to remove them completely. Their failures look like CI bugs so they should be temporarily disabled only for CI, not completely removed.

That's the only remarks I can make about this PR based on reading the PR description, since the code diff is unreviewable for me (sorry but I cannot keep track of what changes are related to one of the 10 different points done in this PR - would be way easier to just have one PR per bullet point.)

Both my other PR and this one here seem to be quite independent and only have minor merge conflicts. My proposal for proceeding would be:

  • @kwankyu extracts the changes of this PR that are fixing some of the 'standard' runs to a new PR, that is merged first
  • I update my other PR to no longer remove these fixed systems, that PR is merged then
  • @kwankyu updates this PR (possibly splitting it) with the remaining changes

What do you think?

@kwankyu
Copy link
Collaborator Author

kwankyu commented Feb 27, 2025

As I've said before

"optional" and "experimental" jobs now run upon "standard" docker images, instead of "maximal" ones

is not fine with me. The optional job should test that installing all optional packages from the system manager and then building the missing ones with sage-the-distro works.

This is now tested with "maximal" (previous "maximal-pre"). See the diagram in the PR description.

By the way, it seem that many optional sage packages do not have equivalent systems packages on many platforms.

@kwankyu
Copy link
Collaborator Author

kwankyu commented Feb 27, 2025

removed ubuntu-lunar, ubuntu-mantic

Not sure if you really want to remove them completely. Their failures look like CI bugs so they should be temporarily disabled only for CI, not completely removed.

They are not CI bugs. See https://github.com/sagemath/sage/actions/runs/13465411463/job/37630064699 (look for #6 [with-system-packages 2/4])

Ubuntu repos for those releases do not seem to exist.

@kwankyu
Copy link
Collaborator Author

kwankyu commented Feb 27, 2025

  • @kwankyu extracts ...
    What do you think?

Not possible with me because I object to your changes (removing CI jobs). The possible way forward, with least friction, is

  • merge this PR first.
  • observe how CI runs with the next release.
  • you rebase your PR onto the next release, removing CI jobs and making changes as you want.
  • Dima and others evaluate merits of your PR. I would object.
  • If your PR gets approved by voting, then your PR will be merged in the next release.

@user202729
Copy link
Contributor

This is now tested with "maximal" (previous "maximal-pre"). See the diagram in the PR description.

Is it? Looking again at the code

https://github.com/sagemath/sage/pull/39009/files#diff-6af0b0fccc3cf263e0c4d190262250468de88454afb5dbee155414ad812536b9R167

it seems that you're just renaming maximal-pre to maximal (or is it not? Then what does the -pre mean?), then change optional to run on top of standard?

On the other hand, basing it from standard does sound like semantically change it. The reason might be that some optional package is incompatible with some package preinstalled by maximal instead (so Sage rebuild it and there end up being two copies), but it would probably take a long time to debug.

@kwankyu
Copy link
Collaborator Author

kwankyu commented Feb 27, 2025

This is now tested with "maximal" (previous "maximal-pre"). See the diagram in the PR description.

Is it? ...
it seems that you're just renaming maximal-pre to maximal (or is it not? Then what does the -pre mean?), then change optional to run on top of standard?

Yes, "maximal" = "maximal-pre"

I don't know the origin of "-pre". Perhaps because the old "optional" is based on "maximal-pre"?

On the other hand, basing it from standard does sound like semantically change it.

Yes.

The reason might be that some optional package is incompatible with some package preinstalled by maximal instead (so Sage rebuild it and there end up being two copies), but it would probably take a long time to debug.

I think the old "optional" and "experimental" are:

                    _prereq | standard package | optional package | experimental package
-------------------------------------------------------------------------------------
"minimal"            SSSSSS | ---------------- |
"standard"           SSSSSS | SSSSSSSSSSS----- |
"maximal-pre"        SSSSSS | SSSSSSSSSSS----- | SSSS------------ | 
old "optional"       SSSSSS | SSSSSSSSSSS----- | ~~~~------------ | 
old "experimental"   SSSSSS | SSSSSSSSSSS----- | SSSS------------ | ------------------ |

where "~" means sage package in addition to equivalent system packages preinstalled.

The diagram also explains why old "optional" and "experimental" builds are bigger than the new "optional" and "experimental" based on "standard".

@kwankyu
Copy link
Collaborator Author

kwankyu commented Feb 27, 2025

it seems that you're just renaming maximal-pre to maximal (or is it not? Then what does the -pre mean?), then change optional to run on top of standard?

Yes, "maximal" = "maximal-pre"

I don't know the origin of "-pre". Perhaps because the old "optional" is based on "maximal-pre"?

Now I know: "maximal-pre" because it builds up to "with-targets-pre", which means "a full installation of all non-Python packages (SAGE_LOCAL)".

Looking at changes, now I see that "maximal" is not the same with "maximal-pre". The "maximal" defined in the present PR is "standard" + optional packages. See also the commit a501144.

@kwankyu
Copy link
Collaborator Author

kwankyu commented Feb 27, 2025

I fixed a couple of minor bugs after Dima's positive review.

@tobiasdiez
Copy link
Contributor

I fixed a couple of minor bugs after Dima's positive review.

Then please don't set it to positive review if you continued working on it...

also #39009 (comment) seems to be still not addressed (the semantic change of maximal > standard for optional in particular)

@kwankyu kwankyu added disputed PR is waiting for community vote, see https://groups.google.com/g/sage-devel/c/IgBYUJl33SQ s: needs review and removed s: needs work labels Feb 27, 2025
@kwankyu
Copy link
Collaborator Author

kwankyu commented Feb 27, 2025

I responded to your comments. I don't have more things to work on.

Now let Dima and @user202729 decide how to proceed.

@kwankyu
Copy link
Collaborator Author

kwankyu commented Feb 27, 2025

Counting:

+1: Dima, and me (the author)
-1: @tobiasdiez

@tobiasdiez
Copy link
Contributor

Why do you think any of the following actions are okay?

  • Setting a PR to positive review without having all changes being reviewed
  • Setting a PR to positive review that breaks a major part of the system ("default" run)
  • Setting a PR to positive review before the relevant CI have even finished
  • Setting a PR to positive review while simultaneously marking it as 'disputed,' despite previously enforcing a rule that disputed PRs must remain so for at least a week before re-evaluation.

I regret that our discussion about the CI started off on the wrong foot, and I recognize that we may never fully agree on certain aspects (particularly around the definition of 'minimal'). However, I would appreciate it if we could maintain a level of professionalism that reflects the seriousness of the topic at hand.

@kwankyu
Copy link
Collaborator Author

kwankyu commented Feb 28, 2025

  • Setting a PR to positive review without having all changes being reviewed

Do you believe that Dima is positive to your PR after reviewing all changes, but he is positive to this PR without reviewing all changes?

  • Setting a PR to positive review that breaks a major part of the system ("default" run)

Did you look into the run to see why the "default" run failed? It failed because of an issue of the latest sage release. So that is a normal result.

  • Setting a PR to positive review before the relevant CI have even finished

Because the last changes I made are minor and I checked that they work with the partial run. See below.

  • Setting a PR to positive review while simultaneously marking it as 'disputed,' despite previously enforcing a rule that disputed PRs must remain so for at least a week before re-evaluation.

Yes, I broke the rule to give some time to Dima and @user202729 to decide how to proceed, instead of leaving it to @vbraun. Dima at least does not seem to be okay with your PR merged first. If you can wait, let's remove "positive labels" from both PRs. OK?

... I would appreciate it if we could maintain a level of professionalism that reflects the seriousness of the topic at hand.

I agree.

@tobiasdiez
Copy link
Contributor

  • Setting a PR to positive review without having all changes being reviewed

Do you believe that Dima is positive to your PR after reviewing all changes, but he is positive to this PR without reviewing all changes?

You pushed 4 commits after his review.

  • Setting a PR to positive review that breaks a major part of the system ("default" run)

Did you look into the run to see why the "default" run failed? It failed because of an issue of the latest sage release. So that is a normal result.

How do you then make sure that your changes don't introduce other issues in "default". The normal procedure is to then to include the other PR that fixes the issues in "develop" as a dependency (or to wait until that other PR is merged).

Yes, I broke the rule to give some time to Dima and @user202729 to decide how to proceed, instead of leaving it to @vbraun. Dima at least does not seem to be okay with your PR merged first. If you can wait, let's remove "positive labels" from both PRs. OK?

Sorry but I fail to see how setting a PR to positive review gives time for others to express their opinion.

... I would appreciate it if we could maintain a level of professionalism that reflects the seriousness of the topic at hand.

I agree.

Awesome! Could you then please start by describing why you think it's a good idea to expand the CI tests to cover the case where all optional packages are installed via sage-the-distro? Previously, maximum-pre was only building the image with the system packages that was then used by optional + experimental (that's the reason for the name 'pre'). With your changes, maximum now corresponds to the old optional. So you are essentially introducing another CI run and make experimental last even longer. Who do you think will profit from this?

@dimpase
Copy link
Member

dimpase commented Feb 28, 2025

let's keep experimental packages for another PR.
They are a huge mess, and I don't think testing them often automatically makes much sense.

And, indeed, currently there is no positive review for this PR from me. I still think it's not really the right direction to go.

@kwankyu
Copy link
Collaborator Author

kwankyu commented Mar 1, 2025

  • Setting a PR to positive review that breaks a major part of the system ("default" run)

Did you look into the run to see why the "default" run failed? It failed because of an issue of the latest sage release. So that is a normal result.

How do you then make sure that your changes don't introduce other issues in "default".

Because "default" also fails without this PR for the very same reason. See https://github.com/sagemath/sage/actions/runs/13465411463/job/37630049652

The normal procedure is to then to include the other PR that fixes the issues in "develop" as a dependency (or to wait until that other PR is merged).

This PR, which is about CI, is orthogonal with the build system (whether it is broken or not). The job of CI is to report whether the build system is broken or not.

The normal procedure is to include other PR as a dependency if that is necessary for this PR branch to work. That does not apply here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
disputed PR is waiting for community vote, see https://groups.google.com/g/sage-devel/c/IgBYUJl33SQ p: critical / 2
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants