
Present Perfect


Puppet pains

Filed under: sysadmin — Thomas @ 2:53 pm

2012-3-27

The jury is still out on puppet as far as I’m concerned.

On the one hand, of course I relish that feeling of ultimate power you are promised over all those machines… I appreciate the incremental improvements it lets you make, and the feeling it gives you that anything will be possible.

But sometimes, it is just so painful to deal with. Agent runs are incredibly slow. It really shouldn’t take over a minute for a simple configuration with four machines. Also, does it really need to be eating 400 MB of RAM while it does so? And when running with the default included web server (is that webrick?), I have to restart my puppetmaster for every single run, because there is this one duplicate definition error that I can’t figure out, which simply goes away when you restart but comes back after an agent run:
err: Could not retrieve catalog from remote server: Error 400 on SERVER: Duplicate definition: Class[Firewall::Drop] is already defined; cannot redefine at /etc/puppet/environments/testing/modules/manifests/firewall/drop.pp:19 on node esp

And sometimes it’s just painfully silly. I just spent two hours trying to figure out why my production machine couldn’t complete its puppet run.

All it was telling me was
Could not evaluate: 'test' is not executable

After a lot of googling, I stumbled on this ticket. And indeed, I had a file called ‘test’ in my /root directory.

I couldn’t agree with the reporter more:

I find it incredibly un-pragmatic to have policies fail to run whenever someone creates a file in root which matches the name of an executable I am running.
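
The post does not show the failing resource, so the following is only a guess at the shape of the problem: an exec resource whose check uses an unqualified command name, which Puppet resolved against a stray file named test in the working directory. A minimal sketch of the defensive version, with a hypothetical resource title and marker path, fully qualifies every command:

```puppet
# Hypothetical resource: the title and the /tmp/marker path are illustrative.
# Fully qualified commands are not looked up relative to the working
# directory, so a stray ./test file cannot shadow the real binary.
exec { 'create-marker':
  command => '/bin/touch /tmp/marker',
  unless  => '/usr/bin/test -f /tmp/marker',  # binary location assumed; verify on your system
}
```

The binary locations are assumptions and vary between distributions.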

11 Comments »

  1. Yes, there is still a lot that can be improved about puppet. But it’s incredibly useful. I have a couple of large deployments (90 and 150 under puppet management).

    A few things that will make your life easier:

    a) run the puppet client via cron. Do not run it as a daemon, it leaks memory (at least, it used to in the 0.24 days).

    b) run the puppetmaster under passenger. That works really well. Forget about webrick, it’s not suitable for a production environment.

    c) run the latest version

    Comment by ward — 2012-3-27 @ 3:09 pm
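
Suggestion (a) could be expressed in Puppet itself with the built-in cron resource type. This is a sketch under assumptions: the resource title, the /usr/bin/puppet path, and the half-hourly schedule are illustrative and not taken from the comment.

```puppet
# Run the agent once from cron instead of keeping a daemon resident.
# Title, binary path, and schedule are assumptions for illustration.
cron { 'puppet-agent-run':
  command => '/usr/bin/puppet agent --onetime --no-daemonize',
  user    => 'root',
  minute  => [0, 30],
}
```

With many clients, it is worth staggering the minute per host so that not every agent contacts the master at the same moment.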

  2. I’d suggest you fix the “multiple declaration” issue properly, rather than bodging around it. Without seeing any of your stuff, it’s almost certainly going to be the result of an import or an include, perhaps a circular one.

    But yeah, I get a similar feeling with puppet; it’s good, but not quite good *enough*. Having said that, it’s better than what we’ve been trying before. Incidentally, what have you been trying before?

    Comment by Jon — 2012-3-27 @ 4:57 pm
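
One common way to hit this class of error, offered as an illustration and not as a diagnosis of the manifests discussed here, is declaring the same class more than once with the resource-like syntax. The wrapper class names below are hypothetical; only firewall::drop comes from the error message in the post.

```puppet
# Hypothetical wrapper classes; firewall::drop is assumed to be defined elsewhere.
class webserver {
  include firewall::drop   # include can be repeated safely
}

class database {
  include firewall::drop   # a second include of the same class is fine
}

# By contrast, a resource-like declaration may appear only once per catalog.
# If both of the following were evaluated for the same node, compilation
# would fail with a duplicate definition error:
#
#   class { 'firewall::drop': }
#   class { 'firewall::drop': }
```

Searching the manifests for every place the class is declared, and switching the repeats to include, is usually the quickest check.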

  3. This is a very similar set of things we’ve struggled with in fedora infrastructure. In many cases the very puppet recipe language makes a number of things more convoluted than I think they need to be.

    We’re debating the efficacy of continuing with it.

    The problem is, of course, overcoming inertia.
    -sv

    Comment by Seth Vidal — 2012-3-27 @ 10:45 pm

  4. Take a look at salt[1]. We like to think we do things a bit differently and simpler. We also have a pretty rocking community[2] on IRC, via the mailing list, or even on github. If you want to chat about it sometime, you can find me on IRC gimpnet #sysadmin or freenode #salt as SEJeff

    [1] http://saltstack.org
    [2] http://readthedocs.org/docs/salt/en/latest/topics/community.html#community

    Comment by Jeff Schroeder — 2012-3-27 @ 11:24 pm

  5. Hi Jeff,

    thanks for commenting. I had taken a look at salt (and also asked my sysadmin team at Flumotion to take a look at it). I didn’t choose it because it didn’t seem like it would support SELinux nicely in a way that would let me add specific exceptions or policies, though I may be wrong?

    Comment by Thomas — 2012-3-28 @ 1:39 pm

  6. I actually thought that my move to passenger fixed the issue entirely, and chalked it up to puppet having an internal bug. But now after a while it’s back in certain cases, although different than before. Regardless, I think it doesn’t make any sense that sometimes it happens, sometimes it doesn’t, and it never happens after a restart. That kind of behaviour is definitely a bug in puppet, even if whatever it is I’m doing wrong should not be allowed. Allow always or never, but not depending on the state of the master.

    Before, I was using a custom tool called savon which managed machines as a bunch of layers being projected on top of each other, managing file contents, ownership, permissions, and selinux context. I preferred the ‘change it until it works then commit’ model, but it’s definitely not declarative.

    Comment by Thomas — 2012-3-28 @ 1:42 pm

  7. a) am running it by hand at the moment. My use case is different though.
    b) I switched to using passenger, but I don’t find it particularly faster or better so far. Setup was pretty complicated too – a single letter typo cost me a few hours to track down because of the insanely unuseful error messages. I’m sure it’s faster for 100 machines, but for my 4-machine use case it’s the same speed, and it’s just too damn slow.
    c) I’m sticking with 2.6 for now since that’s what people have readily packaged for centos/rhel 6.

    Comment by Thomas — 2012-3-28 @ 1:43 pm

  8. Thomas, what exactly do you mean about specific SELinux policies? We are very apt to add support for features our users want. Is there any chance you could email me the features you’re looking for exactly so we can add them to our list of things to properly support and test? Writing a SELinux module would not be a lot of work.

    Comment by Jeff Schroeder — 2012-3-28 @ 1:58 pm

  9. forget puppet, use cfengine. Way way faster (it is not even funny, there is even a LISA paper on it showing it is 30x faster), rock solid, widely adopted (facebook, amd, intel, just to name a few little companies using it), well supported.

    We apply our policies every 5 minutes, no sweat on the hosts running them. No memory leaks. For the record, our hosts do not exist just to run cfengine, they have other goals and cfengine just very efficiently facilitates that.

    If you are just beginning to evaluate a config management system, and those things are bothering you, do not just go with the flow because the ‘cool’ kids use puppet. Look at what you or your company need(s), and what the best tool is. I have been in several teams facing this decision and *every time* after serious evaluation of puppet, chef and cfengine, the choice was cfengine.

    Let the flame wars begin :-)

    Comment by oxtan — 2012-3-29 @ 9:31 pm

  10. For four machines, you shouldn’t need to switch from webrick or sqlite.

    Version 2.7 of puppet provides deterministic catalog compilation. Either you’ll always get that “duplicate definition” error or you’ll never get it. Try upgrading to 2.7.12 using the appropriate yum repository at http://yum.puppetlabs.com/el/.

    Don’t hesitate to ask any questions you have over at http://groups.google.com/group/puppet-users !

    Comment by Brian — 2012-3-30 @ 6:34 pm

  11. What I have right now for Puppet is a manifest that looks at a file of copied audit messages, and creates and installs an selinux policy to allow those. That manifest was made by a colleague of mine, and that’s a good way to specifically only allow things that you know you need, without a blanket disabling of selinux.

    When the project is launched, I will probably go back and see if it makes sense to support actually writing and installing the policies in the usual way.

    Comment by Thomas — 2012-4-2 @ 10:47 am
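
The approach described in this last comment might look roughly like the following. This is not the colleague's manifest: the policy module name, the Puppet module name in the source URL, the file locations, and the binary paths are all assumptions. Only the audit2allow -M and -i options and semodule -i are standard tool usage.

```puppet
# Directory holding the copied audit messages and the generated policy.
file { '/etc/selinux/local':
  ensure => directory,
}

# Audit messages collected by hand; changing this file triggers a rebuild.
file { '/etc/selinux/local/audit-messages.log':
  source  => 'puppet:///modules/selinux_local/audit-messages.log',
  require => File['/etc/selinux/local'],
  notify  => Exec['build-local-policy'],
}

# audit2allow -M writes localpolicy.te and localpolicy.pp into the cwd.
exec { 'build-local-policy':
  command     => '/usr/bin/audit2allow -M localpolicy -i /etc/selinux/local/audit-messages.log',
  cwd         => '/etc/selinux/local',
  refreshonly => true,
  notify      => Exec['install-local-policy'],
}

# Load the compiled module into the running policy.
exec { 'install-local-policy':
  command     => '/usr/sbin/semodule -i /etc/selinux/local/localpolicy.pp',
  refreshonly => true,
}
```

Note that the .pp file here is an SELinux policy package, which only shares its extension with Puppet manifests. Because both exec resources are refreshonly, nothing is rebuilt unless the audit message file changes.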
