Parameterize reindex job in GitLab with catalog to reindex #5336

Closed
14 tasks done
achave11-ucsc opened this issue Jun 21, 2023 · 2 comments
Labels
+ [priority] High debt [type] A defect incurring continued engineering cost demo [process] To be demonstrated at the end of the sprint demoed [process] Successfully demonstrated to team infra [subject] Project infrastructure like CI/CD, build and deployment scripts orange [process] Done by the Azul team

Comments

@achave11-ucsc
Member

achave11-ucsc commented Jun 21, 2023

We frequently need to reindex a single catalog only. The only way to do this currently is to reindex locally by invoking the respective script with the --catalogs parameter.

GitLab recently added the ability to pass variables to jobs when running them manually. I believe these variables are exposed to the job as environment variables. It would be good to constrain which variables an operator can pass, but I am not sure that's possible.

Instead of threading an explicit argument through the job definition and the Makefile to the script, we should define a new environment variable that overrides the default of the --catalogs parameter. Sketch:

Index: environment.py
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/environment.py b/environment.py
--- a/environment.py	(revision f1330a3fe741206250beed992953d7f7e66f851e)
+++ b/environment.py	(date 1687470442872)
@@ -130,6 +130,10 @@
         #
         'AZUL_CATALOGS': None,
 
+        # The name of a catalog to perform reindex or other operational tasks on.
+        #
+        'azul_current_catalog': None,
+
         # The Account ID number for AWS
         'AZUL_AWS_ACCOUNT_ID': None,
 
Index: src/azul/__init__.py
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/src/azul/__init__.py b/src/azul/__init__.py
--- a/src/azul/__init__.py	(revision f1330a3fe741206250beed992953d7f7e66f851e)
+++ b/src/azul/__init__.py	(date 1687470353639)
@@ -890,6 +890,10 @@
     def default_catalog(self) -> CatalogName:
         return first(self.catalogs)
 
+    @property
+    def current_catalog(self) -> Optional[str]:
+        return self.environ.get('azul_current_catalog')
+
     def it_catalog_for(self, catalog: CatalogName) -> Optional[CatalogName]:
         it_catalog = self.catalogs[catalog].it_catalog
         assert it_catalog in self.integration_test_catalogs, it_catalog
Index: scripts/update_subgraph_counts.py
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/scripts/update_subgraph_counts.py b/scripts/update_subgraph_counts.py
--- a/scripts/update_subgraph_counts.py	(revision f1330a3fe741206250beed992953d7f7e66f851e)
+++ b/scripts/update_subgraph_counts.py	(date 1687470540063)
@@ -54,7 +54,7 @@
 def main(args: list[str]):
     parser = argparse.ArgumentParser(description=__doc__,
                                      formatter_class=AzulArgumentHelpFormatter)
-
+    # TODO: same here
     parser.add_argument('--catalogs',
                         nargs='+',
                         metavar='NAME',
Index: scripts/reindex.py
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/scripts/reindex.py b/scripts/reindex.py
--- a/scripts/reindex.py	(revision f1330a3fe741206250beed992953d7f7e66f851e)
+++ b/scripts/reindex.py	(date 1687470266789)
@@ -63,8 +63,13 @@
                     nargs='+',
                     metavar='NAME',
                     default=[
-                        c for c in config.catalogs
-                        if c not in config.integration_test_catalogs
+                        catalog.name
+                        for catalog in config.catalogs.values()
+                        if not catalog.is_integration_test_catalog
+                    ]
+                    if config.current_catalog is None else
+                    [
+                        config.catalogs[config.current_catalog].name
                     ],
                     choices=config.catalogs,
                     help='The names of the catalogs to reindex.')
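
For readers without the Azul code base at hand, the idea in the patch above can be tried in isolation. This is only a minimal sketch under stated assumptions: the `CATALOGS` mapping and its catalog names are hypothetical stand-ins for `config.catalogs`, and `default_catalogs` and `parse_args` are illustrative helpers, not functions that exist in Azul. Only the `azul_current_catalog` variable name and the `--catalogs` option come from the sketch above. Because argparse does not check a default against `choices`, the helper rejects an unknown catalog name explicitly.

```python
import argparse
import os
from typing import Mapping, Optional, Sequence

# Hypothetical stand-in for config.catalogs: maps a catalog name to whether
# it is an integration test catalog. The names are made up for illustration.
CATALOGS = {'alpha': False, 'beta': False, 'alpha-it': True}


def default_catalogs(environ: Mapping[str, str]) -> list[str]:
    """
    Compute the default for --catalogs. If `azul_current_catalog` is set,
    it narrows the default to that one catalog; otherwise the default is
    every catalog that is not an integration test catalog.
    """
    current: Optional[str] = environ.get('azul_current_catalog')
    if current is None:
        return [name for name, is_it in CATALOGS.items() if not is_it]
    # argparse does not validate defaults against `choices`, so check here.
    if current not in CATALOGS:
        raise ValueError(f'Unknown catalog {current!r}')
    return [current]


def parse_args(argv: Sequence[str],
               environ: Mapping[str, str] = os.environ
               ) -> argparse.Namespace:
    parser = argparse.ArgumentParser()
    parser.add_argument('--catalogs',
                        nargs='+',
                        metavar='NAME',
                        default=default_catalogs(environ),
                        choices=list(CATALOGS),
                        help='The names of the catalogs to reindex.')
    return parser.parse_args(argv)
```

With this arrangement, a manually started job only needs the variable set. An explicit `--catalogs` on the command line still takes precedence, because the environment variable changes the default and nothing else.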
  • Security design review completed; the Resolution of this issue does not
    • … affect authentication; for example:
      • OAuth 2.0 with the application (API or Swagger UI)
      • Authentication of developers with Google Cloud APIs
      • Authentication of developers with AWS APIs
      • Authentication with a GitLab instance in the system
      • Password and 2FA authentication with GitHub
      • API access token authentication with GitHub
      • Authentication with
    • … affect the permissions of internal users like access to
      • Cloud resources on AWS and GCP
      • GitLab repositories, projects and groups, administration
      • an EC2 instance via SSH
      • GitHub issues, pull requests, commits, commit statuses, wikis, repositories, organizations
    • … affect the permissions of external users like access to
      • TDR snapshots
    • … affect permissions of service or bot accounts
      • Cloud resources on AWS and GCP
    • … affect audit logging in the system, like
      • adding, removing or changing a log message that represents an auditable event
      • changing the routing of log messages through the system
    • … affect monitoring of the system
    • … introduce a new software dependency like
      • Python packages on PYPI
      • Command-line utilities
      • Docker images
      • Terraform providers
    • … add an interface that exposes sensitive or confidential data at the security boundary
    • … affect the encryption of data at rest
    • … require persistence of sensitive or confidential data that might require encryption at rest
    • … require unencrypted transmission of data within the security boundary
    • … affect the network security layer; for example by
      • modifying, adding or removing firewall rules
      • modifying, adding or removing security groups
      • changing or adding a port a service, proxy or load balancer listens on
  • Documentation on any unchecked boxes is provided in comments below
@achave11-ucsc achave11-ucsc added the orange [process] Done by the Azul team label Jun 21, 2023
@hannes-ucsc hannes-ucsc added enh debt [type] A defect incurring continued engineering cost infra [subject] Project infrastructure like CI/CD, build and deployment scripts - [priority] Medium labels Jun 22, 2023
@hannes-ucsc hannes-ucsc removed their assignment Jun 22, 2023
@achave11-ucsc achave11-ucsc self-assigned this Jan 19, 2024
@achave11-ucsc
Member Author

Test this by having the operator use the new functionality (index a single catalog via env variable) to reindex a single sandbox catalog.

@hannes-ucsc
Member

hannes-ucsc commented Mar 2, 2024

For demo, reindex a small catalog in a main deployment.

@hannes-ucsc hannes-ucsc added the demo [process] To be demonstrated at the end of the sprint label Mar 2, 2024
@achave11-ucsc achave11-ucsc added the demoed [process] Successfully demonstrated to team label Mar 12, 2024