git_pillar should permit automatic mapping of environment to branch a-la gitfs #32245
Comments
@tkwilliams, thanks for reporting. This seems like a useful feature to me.
One thing to be careful about with such a feature: enabling implicit behavior for pillar may reduce clarity about how pillars are assigned to minions, and could thereby expose secret information to the wrong system or party.
Hrrm. I can see that concern... On the other hand, I can see the same exposures within the current design. Say the environments "dev", "qa", and "prd" are defined via the current syntax in the master's pillar section. If the minion is in "dev" but lies and requests an environment of "qa", it will either receive the data requested (if the top file allows this for some strange reason) or it will not. Now inject dynamic mapping into this: rather than static mappings in the master's pillar section, we simply say "map whatever environment is requested to its equivalent branch name" (which is just as deterministic as the current method, if you think about it), and you have the identical situation. That is, the top file determines what the minion can actually see. This is merely a convenience "short-hand" to compress a very, very large exhaustive list of default mappings into one line.

As discussed in something I asked on the mailing list, the implementation could be done in various ways.
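For example, you could simply leave the environment/branch off the remote definition entirely (the snippet below is a strawman to illustrate the idea, not working syntax):

```yaml
ext_pillar:
  - git:
    # no branch or environment specified at all (strawman)
    - https://gitserver.example.com/pillar.git
```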
and have it say to itself "hey, he didn't provide an environment in the definition, maybe he wants to map ALL available branches to their environments...". Or, being a bit more kludgy, perhaps some magic value:
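say, a placeholder token in place of the branch (the `__env__` spelling below is just a strawman, meant to stand for "whatever environment the minion requested"):

```yaml
ext_pillar:
  - git:
    # strawman "magic value" in the branch position, standing for the requested environment
    - __env__ https://gitserver.example.com/pillar.git
```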
or maybe a new git_pillar option, along the lines of the following
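(the option name here is a pure strawman, invented just to illustrate the idea, not an existing setting):

```yaml
# hypothetical master-config switch enabling dynamic branch-to-environment mapping
git_pillar_map_all_branches: True
```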
as examples. Certainly you'd not want to make this new behaviour the default without a big flashing neon sign above it first, outlining the attendant potential risks.
Earlier discussion on the subject in salt-users:
+1 👍
@cachedout @rallytime @jfindlay I don't suppose anyone has had a chance to look into this, have they?
@terminalmage has done quite a bit of work on git_pillar. I think he'd be the best person to respond here.
I don't see this ever being implemented, to be honest. git_pillar and gitfs aren't the same thing; they have fundamental differences.

With gitfs, individual files are requested, which lends itself easily to mapping an environment to a branch, tag, or SHA. We never actually check out a branch; we just fetch from the repo and then pull files out of the fetched objects. With git_pillar, each defined repo must be checked out, so that the SLS files within it exist locally and can be processed similarly to states. Not only that, but every defined git_pillar remote is evaluated when pillar data is compiled.

If you assume that every branch/tag/whatever is mapped, you're essentially asking the master to evaluate git_pillar for all branches, for each minion, every time its pillar data is refreshed. This would be an absolute performance nightmare.
Vis-à-vis the fragment quoted above:

The "exhaustive list" approach is technically DO-ABLE with the current code, but of course a non-dynamic, need-to-update-pillar-config-and-restart-salt-every-time-you-add-a-branch setup is a non-starter to any reasonable person :) It's all good :) I did an end-run around the whole problem back when I wrote my makostack module, which provided the functionality I was actually hoping for here in pillar. Feel free to close this as wontfix (or your equivalent).

Though, based on discussions on the forum, this is NOT just a feature I was hoping for from stock pillar, and I know of several (otherwise reasonable) workflows which are utterly impossible if pillar doesn't provide some more flexible semantics for controlling from whence its data comes. As it is, the workflow I needed (which, again based on feedback from others, is a common and reasonable one) literally forced me to write a custom ext pillar simply to allow the "merging with overrides" behavior required to permit us (as the shared infra team) to define the vast majority of another team's pillar defs (those bits involving infra we manage, or portions with defaults we don't want them fiddling with), while allowing them to have their own git repo providing data for the stuff they are to manage, and letting it all merge nicely into a single pillar tree -- which our sls files then use to manage ALL of our AWS infrastructure.

It's working swell, but it would have been even sweller if pillar were mebbe slightly more flexible -- at least enough to not more-or-less PRESCRIBE the permitted data modelling, and the workflows attendant thereto... And IIRC, @ryan-lane was also very interested in this functionality, likely due to yaml/data merging issues he was having WRT multiple teams and merging/overriding values from different layers.
@tkwilliams have you looked at using the `pillarenv` option? IIRC there is still a race condition issue though, since different minions requesting pillar data from different pillarenvs simultaneously would result in the repo contents changing. This needs to be remedied.

Possible solutions for this would include either having one git_pillar cachedir per worker thread, or implementing locking. The problem with locking, though, is that if you have many minions requesting pillar data at the same time from different environments, that has the potential to slow down pillar compilation significantly. Yet another potential fix would be to ditch checking out altogether in favor of copying all files from a requested environment to a pillarenv-specific directory, and cleaning those files up when updates for a given repo are fetched. Of course, that solution also brings with it the potential for a race condition, as we potentially change files in the middle of another minion's pillar compilation.
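For reference, the minion-side knob in question looks roughly like this (a sketch; the environment value is a placeholder, and the two options are alternatives):

```yaml
# minion config: pin this minion's pillar environment to a specific branch/env
pillarenv: qa

# or, alternatively, have pillarenv follow whatever saltenv is in effect
pillarenv_from_saltenv: True
```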
@terminalmage Wouldn't the current code have the same race conditions? E.g. if a static map sends minion1 to one pillarenv (git branch), while minion2 gets a different branch, and they both happen to run at the same time, is this not going to cause the same issues, or am I misunderstanding things?
The idea to work around the race condition would be to have different roots per environment. It's on the tentative list of features for the Fluorine release cycle.
WARNING -- TL;DR -- WARNING

@terminalmage Y'know, it's just now occurred to me that I've never actually given the CONTEXT for the original request or how it arose in the first place - there's a good chance a concrete example would at least add clarity and mebbe cut through the vagaries I may have introduced by talking high-concept...

Soooo, one of "the other guys" (puppet (booo) to be specific) is our currently deployed CM - I inherited it when I started, with a firm commitment I'd be allowed to replace it with salt "eventually". The low-hanging fruit was orchestration FIRST, since puppet has none of that to replace anyway, and what little existed walking in the door was ugly shell scripts and the like. Now orchestration is largely done, and I'm about to jump into the "replace all the puppet CM with salt, FINALLY" phase. Hooray and hallelujah!

That said, one thing puppet DOES offer (or provide the hooks for anyway) is a SUPER AWESOME workflow, wherein you can check out the various branches of puppet/hiera data into per-branch environment directories, so that every git branch is its own environment. This leads to an insanely simple development / code promotion workflow:

- cut a feature branch from prd
- merge it into dev and test it there
- promote the same change through qa and then stg, testing along the way
- finally merge it into prd

This means that dev is a superset of QA, which is a superset of stg, which is a superset of prd, and all are always consistent and compatible, since all feature branches come from prd, and are ALWAYS merged completely into all the environments, in order. Even canceled feature branches (say it got to dev and then we decided we didn't want it) get merged up through prd - generally via a merge that discards the unwanted changes - so the branch histories stay in sync.

This has been working phenomenally for us for several years, and supports multiple team members as well as multiple teams all updating the code at the same time. We've cross-trained the rest of our org in using it, and it has become a key part of our organizational gestalt. In talking to others on the mailing list, and at cons, I find many companies have arrived at something similar over time, via a variety of starting positions.

But all of this is predicated on the ability to run ON THE CLIENT the equivalent of an agent run pointed at an arbitrarily chosen environment (i.e. branch). The current salt (well, specifically PILLAR) behavior does not permit this. I've tried a huge (to be honest, staggering) number of config variations and hopeful workarounds, and my inescapable conclusion at the end was that I had to write my own ext pillar if I needed it - whence makostack.

I still really hate the fact that it can't be done in stock pillar though - it FEELS like it should be do-able, and several times I came REEEEEALY close to getting it working, only to hit some designed-in limitation which prevented it. It SEEMS like it could be getting there though -- "different roots per environment" sure SOUNDS like what would be needed to at least allow the above workflow to be possible...
Yeah, the main hurdle right now is that there isn't a great way of doing this using the Python bindings for libgit2. GitPython could do worktrees, since it has the ability to run arbitrary git CLI commands, but pygit2 is a different story. Worktree support was added to libgit2 semi-recently in 0.26.0, but pygit2 has yet to add worktree support (I've opened libgit2/pygit2#777 upstream for this).

One possible compromise for git_pillar using pygit2 would be to use the git CLI to create new worktrees, and then just use pygit2 to attach to them as though they were repositories. I haven't checked to see how pygit2 performs when you attach to a worktree rather than the main checkout for a repo, but I would hope it would still work properly.
@tkwilliams Good news, the code for managing worktrees using pygit2 is already sitting in a pending pull request. It was just awaiting a unit test, so I've written one and am working to get that PR across the finish line.
Holy cow! That's some sweet news. I'll be excited to play with it when it's ready. Thanks for following up / through with this :) Let me know how I can help test when it's ready...
@tkwilliams Sure thing! I have an open PR to get worktree support for pygit2 finished: libgit2/pygit2#779
@tkwilliams Assuming this feature gets into pygit2 0.26.4, that would be the minimum supported version for pygit2. GitPython would be able to support worktrees sooner, as it has ways to allow arbitrary git commands to be run, so any version of GitPython would work. As far as I know, git 2.5+ would be required since that is when worktree support was added, so that rules out older LTS releases like Ubuntu 14.04 and even CentOS 7, unless newer git were compiled and installed on those releases. What distros/versions are used in your infrastructure?
@terminalmage At the moment we are exclusively AWS Linux (AMI version 2016.09, but the next expedite ticket in my queue is actually to update to 2017.09, so we should be at least that much more current soon). AMI 2016.09 has git 2.7.4 though, so that shouldn't be an issue. Pygit2 is at 0.20.3, so I'd likely roll a custom RPM for that (unless you guys put it into your salt deps repo). GitPython is at 2.0.8, but as you say, the version is largely irrelevant anyway...
OK. Newer pygit2 may be a problem, because versions around 0.21.x switched to using python-cffi, which needs to be built against libffi. The problem, though, is that the cffi version pygit2 requires needs a newer libffi, and libffi is one of those things that you can't upgrade without rebuilding everything that is built against it. Isn't AWS Linux based off of RHEL/CentOS 7? I know that CentOS 7 has new enough libffi to support newer pygit2, since EPEL has pygit2 0.24.2. Keep in mind that pygit2 should always be kept at the same release cycle as libgit2, so upgrading pygit2 to 0.26.4 would involve also building a 0.26.x release of libgit2: http://www.pygit2.org/install.html#version-numbers Also, you'll need
Apart from the race, there is also a bug with
On the road to saltstack#32245.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. If this issue is closed prematurely, please leave a comment and we will gladly reopen the issue.
Gitfs (for the salt side of things) can be configured to dynamically map git branches to environments, which is stunningly useful for scenarios where development takes place in numerous, transient "feature branches", which are then merged into "well-known" branches such as dev, qa, and prd as part of the code promotion process.
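For example, with gitfs a single remote definition is enough to have every branch and tag exposed as a matching environment (the URL below is a placeholder):

```yaml
# master config: each branch/tag in this repo becomes a saltenv of the same name
# (master maps to base by default)
gitfs_remotes:
  - https://gitserver.example.com/salt-states.git
```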
git_pillar, contrariwise, must be configured in the master conf with explicit, exhaustive mappings of environments to branches. Since, at least in our case, good data/code separation has led to about 90% of changes taking place in pillar rather than state files, we're finding the actual USEFULNESS of dynamic branch mapping to be almost completely nullified by the lack of such functionality in pillar.
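The explicit mapping ends up looking something like this (repo URL and branch names are placeholders), and every new feature branch means another line and another master config change:

```yaml
# master config: each environment/branch has to be spelled out individually
ext_pillar:
  - git:
    - dev https://gitserver.example.com/pillar.git
    - qa https://gitserver.example.com/pillar.git
    - prd https://gitserver.example.com/pillar.git
```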
We've tried a large assortment of things to work around this (stupid Jinja tricks, strange combinations of git_pillar_* config options, and so on), but the inability to do anything other than statically map predefined environments to branches seems to be a hard limit on our workflow.
If the above is not detailed enough I can certainly give some examples of our desired workflow, but I think the basic usefulness of such a feature is pretty evident :)
Thanks!