git_pillar should permit automatic mapping of environment to branch a-la gitfs #32245

Closed
tkwilliams opened this issue Mar 30, 2016 · 24 comments

@tkwilliams
Contributor

Gitfs (for the salt side of things) can be configured to dynamically map git branches to environments, which is stunningly useful for scenarios where development takes place in numerous, transient "feature branches", which are then merged into "well-known" branches such as dev, qa and prd as part of the code promotion process.

git_pillar, contrariwise, must be configured in the master conf with explicit, exhaustive mappings of environments to branches. Since, at least in our case, good data/code separation has led to about 90% of changes taking place in pillar rather than state files, we're finding the actual USEFULNESS of dynamic branch mapping to be almost completely nullified by the lack of such functionality in pillar.

We've tried a large assortment of workarounds (stupid Jinja tricks, strange combinations of git_pillar_* config options, and so on), but the absence of any way to do anything but statically map predefined environments to branches seems a hard limit on our workflow.

If the above is not detailed enough I can certainly give some examples of our desired workflow, but I think the basic usefulness of such a feature is pretty evident :)

Thanks!

@jfindlay jfindlay added Feature new functionality including changes to functionality and code refactors, etc. Platform Relates to OS, containers, platform-based utilities like FS, system based apps Pillar labels Mar 30, 2016
@jfindlay
Contributor

@tkwilliams, thanks for reporting. This seems like a useful feature to me.

@jfindlay
Contributor

One thing to be careful about with such a feature is that enabling implicit behavior for pillar may reduce clarity about how pillars are assigned to minions, and could thereby expose secret information to the wrong system or party.

@tkwilliams
Contributor Author

Hrrm. I can see that concern... On the other hand, I can see the same exposures within the current design. Say the environments "dev", "qa", and "prd" are defined via the current syntax in the master's pillar section. If the minion is in "dev" but lies and requests an environment of "qa", it will either receive the data requested (if the top file allows this for some strange reason) or it will not. Now inject dynamic mapping into this: rather than static mappings in the master's pillar section, we simply say "map whatever environment is requested to its equivalent branch name" (which is just as deterministic as the current method, if you think about it), and you have the identical situation. That is, the top file determines what the minion can actually see. This is merely a convenience "short-hand" to compress a very very large exhaustive list of default mappings into one line. As discussed in something I asked on the mailing list, the implementation could be done in various ways:

ext_pillar:
  - git:
    - git@bitbucket.org:redbookplatform/infra-salt.git

and have it say to itself "hey, he didn't provide an environment in the definition, maybe he wants to map ALL available branches to their environments...".

Or, being a bit more kludgy, perhaps some magic value:

ext_pillar:
  - git:
    - * git@bitbucket.org:redbookplatform/infra-salt.git

or maybe a new git_pillar option along the lines of

git_pillar_auto_glob_env: True

as examples. Certainly you'd not want to make this new behaviour the default without a big flashing neon sign above it first, outlining the attendant potential risks.
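
For contrast, the exhaustive static syntax this shorthand would compress requires one entry per environment/branch pair, something along these lines (repo URL reused from above purely for illustration), with each transient feature branch needing its own entry plus a master config update and restart:

ext_pillar:
  - git:
    - dev git@bitbucket.org:redbookplatform/infra-salt.git
    - qa git@bitbucket.org:redbookplatform/infra-salt.git
    - prd git@bitbucket.org:redbookplatform/infra-salt.git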

@tkwilliams
Contributor Author

Earlier discussion on the subject in salt-users:
https://groups.google.com/forum/#!topic/salt-users/OXHpBuf28QU

@jfindlay jfindlay added this to the Approved milestone Apr 6, 2016
@peter-slovak
Contributor

+1 👍

@tkwilliams
Contributor Author

@cachedout @rallytime @jfindlay I don't suppose anyone has had a chance to look into this, have they?

@rallytime
Contributor

@terminalmage has done quite a bit of work on git pillar. I think he'd be the best person to respond here.

@terminalmage
Contributor

I don't see this ever being implemented, to be honest. git_pillar and gitfs aren't the same thing, they have fundamental differences. With gitfs, individual files are requested, which lends itself easily to mapping an environment to a branch, tag or SHA. We never actually check out a branch, we just fetch from the repo and then pull files out of the fetched objects. With git_pillar, each defined repo must be checked out, so that the SLS files within it exist locally and can be processed similar to states.

Not only that, but every defined git_pillar remote is evaluated when pillar data is compiled. If you assume that every branch/tag/whatever is mapped, you're essentially asking the master to evaluate git_pillar for all branches, for each minion, every time its pillar data is refreshed. This would be an absolute performance nightmare.

@tkwilliams
Contributor Author

@terminalmage @rallytime

Vis-a-vis the fragment "If you assume that every branch/tag/whatever is mapped": this is exactly the OPPOSITE of what I suggested. I specifically mention that a client would (explicitly or implicitly) request a SINGLE branch/environment, which is already the case with the current code - this would simply change the semantics around WHICH branch is compiled when the client requests it. As I mentioned in a comment above:

This is merely a convenience "short-hand" to compress a very very large exhaustive list of default mappings into one line. 

The "exhaustive list" approach is technically DO-ABLE with the current code, but of course having a non-dynamic-need-to-update-pillar-config-and-restart-salt-every-time-you-add-a-branch code is a non-starter to any reasonable person :)

It's all good :) I did an end-run around the whole problem back when I wrote my makostack module, which provided the functionality I was actually hoping for here in pillar.

Feel free to close this as wontfix (or your equiv).

Though based on discussions on the forum, this is NOT a feature only I was hoping for from stock pillar, and I know of several (otherwise reasonable) workflows which are utterly impossible if pillar doesn't provide some more flexible semantics for controlling from whence its data comes.

As it is, the workflow I needed (which, again, based on feedback from others, is a common and reasonable one) literally forced me to write a custom ext pillar simply to allow the "merging with overrides" behavior required to permit us (as the shared infra team) to define the vast majority of another team's pillar defs (those bits involving infra we manage, or portions with defaults we don't want them fiddling with), while allowing them to have their own git repo providing data for the stuff they are to manage, and letting it all merge nicely into a single pillar tree -- which our sls files then use to manage ALL of our AWS infrastructure. It's working swell, but it would have been even sweller if pillar were mebbe slightly more flexible -- at least enough to not more-or-less PRESCRIBE the permitted data modelling, and the workflows attendant thereto...

And IIRC, @ryan-lane was also very interested in this functionality, likely due to yaml/data merging issues he was having WRT multiple teams and merging/overriding values from different layers.

@terminalmage
Contributor

@tkwilliams have you looked at using the __env__ branch placeholder? https://docs.saltstack.com/en/latest/ref/pillar/all/salt.pillar.git_pillar.html#configuring-git-pillar-for-salt-releases-2015-8-0-and-later

IIRC there is still a race condition issue though, since different minions requesting pillar data from different pillarenvs simultaneously would result in the repo contents changing. This needs to be remedied. Possible solutions for this would include either having one git_pillar cachedir per worker thread, or implementing locking. The problem with locking though is that if you have many minions requesting pillar data at the same time from different environments, then that has the potential to slow down pillar compilation significantly. Yet another potential fix would ditch checking out altogether in favor of copying all files from a requested environment to a pillarenv-specific directory, and cleaning these files when updates for a given repo are fetched. Of course, that solution also brings with it the potential for a race condition, as we potentially change files in the middle of another minion's pillar compilation.
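
For reference, the __env__ placeholder goes where a branch name would normally go and is substituted with the minion's effective pillarenv (or saltenv) when pillar data is compiled -- something like this, reusing the repo URL from earlier in the thread:

ext_pillar:
  - git:
    - __env__ git@bitbucket.org:redbookplatform/infra-salt.git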

@tkwilliams
Contributor Author

@terminalmage Wouldn't the current code have the same race conditions? E.g. if a static map sends minion1 to one pillarenv (git branch), while minion2 gets a different branch, and they both happen to run at the same time, is this not going to cause the same issues, or am I misunderstanding things?

@terminalmage
Contributor

The idea to work around the race condition would be to have different roots per environment. It's on the tentative list of features for the Fluorine release cycle.

@tkwilliams
Contributor Author

WARNING -- TL;DR -- WARNING

@terminalmage Y'know, it's just now occurred to me that I've never actually given the CONTEXT for the original request or how it arose in the first place - there's a good chance a concrete example would at least add clarity and mebbe cut through the vagaries I may have introduced by talking high-concept...

Soooo, one of "the other guys" (puppet (booo) to be specific) is our currently deployed CM - I inherited it when I started, with a firm commitment I'd be allowed to replace it with salt "eventually". The low-hanging fruit was orchestration FIRST, since puppet has none of that to replace anyway, and what little existed walking in the door was ugly shell scripts and the like. Now, orchestration is largely done, and I'm about to jump into the "replace all the puppet CM with salt, FINALLY" phase. Hooray and hallelujah!

That said, one thing puppet DOES offer (or provide the hooks for anyway) is a SUPER AWESOME workflow, wherein you can check out various branches of puppet/hiera data to /etc/puppet/environment/<branchname>, and then run puppet agent, from the client system, with puppet agent -t --environment <branchname> and it basically requests that branch from the master as its "catalog", and then runs puppet FROM THAT BRANCH - of course applying any cool differences you may have IN that branch along the way.

This leads to an insanely simple development / code promotion workflow:

  1. There are (in our case) four long-lived branches - dev, qa, stg, prd - which map to "environments" in puppet, which then map to AWS accounts with that environment's VPCs inside.
  2. The master branch in git is replaced with prd, which is also what our production hosts run.
  3. ALL work is done in feature branches, and all feature branches are spun from prd
    -- split a feature branch (e.g. infra666) from prd
    -- make your changes
    -- git push your feature branch to remote (git push origin -u infra666)
  4. Test the code changes
    -- check out (via a handy tool called r10k) your FB on the puppet master
    -- run puppet agent -t --environment infra666 [--noop]
  5. When code is deemed mature, you then
    -- merge the FB to dev, and test
    -- if OK, merge to qa and either test yourself, or have the QA team test, depending on what's been changed
    -- etc, up to prd, at which point it becomes canonical and any new feature branches include it
    -- note that, when any of [dev, qa, stg, prd] are updated, the puppet master for that VPC automatically pulls it, then all clients update to the new code at next run (half-hourly in our case).
  6. Feature branch can be deleted

This means that dev is a superset of QA, which is a superset of stg, which is a superset of prd, and all are always consistent and compatible, since all feature branches come from prd, and are ALWAYS merged completely into all the environments, in order. Even canceled feature branches (say it got to dev and then we decided we didn't want it) get merged up through prd - generally a git revert or equiv is done on the FB, and then it's promoted as usual, as a no-op change.
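
Condensed into commands, the flow above looks roughly like this (branch name infra666 reused from the example above; the r10k/puppet invocations are as described, not copied verbatim from our tooling):

git checkout prd
git checkout -b infra666                  # split a feature branch from prd
# ...make changes, commit...
git push -u origin infra666               # publish the feature branch
# on the puppet master:  r10k deploy environment infra666
# on a client:           puppet agent -t --environment infra666 --noop
git checkout dev && git merge infra666    # then promote: dev -> qa -> stg -> prd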

This has been working phenomenally for us for several years, and supports multiple team members as well as multiple teams all updating the code at the same time. We've cross-trained the rest of our org in using it and it has become a key part of our organizational gestalt. In talking to others on the mailing list, and at cons, I find many companies have arrived at something similar over time, via a variety of starting positions.

But all of this is predicated on the ability to run ON THE CLIENT the equivalent of puppet agent -t --environment <branch> and have it use both the appropriate CODE, in sync with the appropriate DATA. And because these feature branches are transient (we often create, work, and dispose of several a day), hard-coding each into the config on the salt-master and restarting the daemon for each is simply not feasible.
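
In salt terms, what I'm hoping for on the minion side is presumably something like the following (assuming both saltenv and pillarenv would honor the branch name):

salt-call state.apply saltenv=infra666 pillarenv=infra666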

The current salt (well, specifically PILLAR) behavior does not permit this. I've tried a huge (to be honest, staggering) number of config variations and hopeful workarounds, and my inescapable conclusion at the end was that if I needed it, I'd have to write my own. Whence makostack.

I still really hate the fact that it can't be done in stock pillar though - it FEELS like it should be do-able, and several times I came REEEEEALY close to getting it working, only to hit some designed-in limitation which prevented it.

It SEEMS like it could be getting there though -- "different roots per environment" sure SOUNDS like what would be needed to at least allow the above workflow to be possible...

@terminalmage
Contributor

terminalmage commented Mar 16, 2018

Yeah, the main hurdle right now is that there isn't a great way of doing this using the Python git bindings Salt supports. I would love to be able to leverage git worktree support (git 2.5+) using the Python bindings; it's the natural use-case for this and I'd rather not reinvent the wheel here.

GitPython could do worktrees since it has the ability to run arbitrary git CLI commands, but pygit2 is a different story. Worktree support was added to libgit2 semi-recently in 0.26.0, but pygit2 has yet to add worktree support (I've opened libgit2/pygit2#777 upstream for this). One possible compromise for git_pillar using pygit2 would be to use the git CLI to create new worktrees, and then just use pygit2 to attach to them as though they were repositories. I haven't checked to see how pygit2 performs when you attach to a worktree and not the main checkout for a repo, but I would hope it would still work properly.
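
A rough sketch of that compromise (paths and branch names here are purely illustrative, and the pygit2 attach step is untested, as noted):

# Sketch only: create a worktree via the git CLI (GitPython shells out for us),
# then attach to it with pygit2 as though it were a normal checkout.
import git  # GitPython

cachedir = "/var/cache/salt/master/git_pillar/example"         # illustrative path
worktree = "/var/cache/salt/master/git_pillar/worktrees/dev"   # illustrative path

repo = git.Repo(cachedir)
repo.git.worktree("add", worktree, "dev")  # arbitrary git subcommands via repo.git

# import pygit2
# pygit2.Repository(worktree)  # untested: attach to the worktree like a repo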

@terminalmage
Contributor

@tkwilliams Good news, the code for managing worktrees using pygit2 is already sitting in a pending pull request. It was just awaiting a unit test, so I've written one and am working to get that PR across the finish line.

@terminalmage terminalmage self-assigned this Mar 16, 2018
@tkwilliams
Contributor Author

Holy cow! That's some sweet news. I'll be excited to play with it when it's ready.

Thanks for following up / through with this :) Let me know how I can help test when it's ready...
t.

@terminalmage
Contributor

@tkwilliams Sure thing! I have an open PR to get worktree support for pygit2 finished: libgit2/pygit2#779

@terminalmage
Contributor

@tkwilliams Assuming this feature gets into pygit2 0.26.4, that would be the minimum supported pygit2 version. GitPython would be able to support worktrees sooner, as it has ways to run arbitrary git commands, so any version of GitPython would work. As far as I know, git 2.5+ would be required since that is when worktree support was added, so that rules out older LTS releases like Ubuntu 14.04 and even CentOS 7, unless a newer git were compiled and installed on those releases.

What distros/versions are used in your infrastructure?

@tkwilliams
Contributor Author

@terminalmage At the moment we are exclusively AWS Linux (AMI version 2016.09, but the next expedite ticket in my queue is actually to update to 2017.09 so we should be at least that much more current soon).

AMI 2016.09 has git 2.7.4 though, so that shouldn't be an issue. Pygit2 is at 0.20.3, so I'd likely roll a custom RPM for that (unless you guys put it into your salt deps repo). GitPython is at 2.0.8 but, as you say, that's largely version-irrelevant anyway...

@terminalmage
Contributor

terminalmage commented Mar 19, 2018

OK. Newer pygit2 may be a problem, because versions around 0.21.x switched to using python-cffi, which needs to be built against libffi. The problem, though, is that the cffi version pygit2 requires needs a newer libffi, and libffi is one of those things you can't upgrade without building newer versions of everything that is built against it.

Isn't AWS Linux based off of RHEL/CentOS 7? I know that CentOS 7 has a new enough libffi to support newer pygit2, since EPEL has pygit2 0.24.2.

Keep in mind that pygit2 should always be kept at the same release cycle as libgit2, so upgrading pygit2 to 0.26.4 would involve also building a 0.26.x release of libgit2: http://www.pygit2.org/install.html#version-numbers

Also, you'll need cmake to build libgit2, and you'll want libssh2-devel installed if you want to build libgit2 with ssh support. The cmake output will show whether it is building with ssh support, if I remember correctly.
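
A rough sketch of what that build might look like on an EL-ish box (version numbers and paths illustrative; pick whichever libgit2/pygit2 pairing you actually target):

# build deps (libssh2-devel gets you ssh support; the cmake output will confirm)
yum install -y gcc cmake libffi-devel libssh2-devel python-devel

# build a libgit2 release matching the pygit2 series you want (0.26.x here)
curl -LO https://github.com/libgit2/libgit2/archive/v0.26.3.tar.gz
tar xzf v0.26.3.tar.gz && cd libgit2-0.26.3
mkdir build && cd build && cmake .. && make && make install
ldconfig   # or set LIBGIT2/LDFLAGS if installed to a non-standard prefix

# then the matching pygit2 (pip here; roll it into an RPM as needed)
pip install pygit2==0.26.3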

@sathieu
Contributor

sathieu commented Nov 8, 2018

Apart from the race condition, there is also a bug with __env__ pillar inclusion handling. Proposed fix in #50417.

@sathieu
Contributor

sathieu commented Nov 29, 2018

It looks like the __env__ mapping is solved by #50417. Should we close this bug, then?

The race condition is reported as #39420.

@sathieu
Contributor

sathieu commented Dec 5, 2018

See PR #50768 for the all_saltenvs parameter. IMO, the only missing piece now is fixing the race condition (#39420). Also, the config is not consistent with gitfs.

@stale

stale bot commented Jan 9, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

If this issue is closed prematurely, please leave a comment and we will gladly reopen the issue.

@stale stale bot added the stale label Jan 9, 2020
@stale stale bot closed this as completed Jan 16, 2020