Introducing Puppet Exec[‘again’]

Puppet is missing a number of much-needed features. That’s the bad news. The good news is that I’ve been able to write some of these as modules that don’t need to change the Puppet core! This is an article about one of these features.

Posit: It’s not possible to apply all of your Puppet manifests in a single run.

I believe that this holds true for the current implementation of Puppet. Most manifests can, do and should apply completely in a single run. If your Puppet run takes more than one run to converge, then chances are that you’re doing something wrong.

(For the sake of this article, convergence means that everything has been applied cleanly, and that a subsequent Puppet run wouldn’t have any work to do.)

There are some advanced corner cases, where this is not possible. In these situations, you will either have to wait for the next Puppet run (by default it will run every 30 minutes) or keep running Puppet manually until your configuration has converged. Neither of these situations are acceptable because:

  • Waiting 30 minutes while your machines are idle is (mostly) a waste of time.
  • Doing manual work to set up your automation kind of defeats the purpose.
'Are you stealing those LCDs?' 'Yeah, but I'm doing it while my code compiles.'

Waiting 30 minutes while your machines are idle is (mostly) a waste of time. Okay, maybe it’s not entirely a waste of time :)

So what’s the solution?

Introducing: Puppet Exec[‘again’] !

Exec[‘again’] is a feature which I’ve added to my Puppet-Common module.

What does it do?

Each Puppet run, your code can decide if it thinks there is more work to do, or if the host is not in a converged state. If so, it will tell Exec[‘again’].

What does Exec[‘again’] do?

Exec[‘again’] will fork a process off from the running puppet process. It will wait until that parent process has finished, and then it will spawn (technically: execvpe) a new puppet process to run puppet again. The module is smart enough to inspect the parent puppet process, and it knows how to run the child puppet. Once the new child puppet process is running, you won’t see any leftover process id from the parent Exec[‘again’] tool.

How do I tell it to run?

It’s quite simple, all you have to do is import my puppet module, and then notify the magic Exec[‘again’] type that my class defines. Example:

include common::again

$some_str = 'ttboj is awesome'
# you can notify from any type that can generate a notification!
# typically, using exec is the most common, but is not required!
file { '/tmp/foo':
    content => "${some_str}\n",
    notify => Exec['again'], # notify puppet!
}

How do I decide if I need to run again?

This depends on your module, and isn’t always a trivial thing to figure out. In one case, I had to build a finite state machine in puppet to help decide whether this was necessary or not. In some cases, the solution might be simpler. In all cases, this is an advanced technique, so you’ll probably already have a good idea about how to figure this out if you need this type of technique.

Can I introduce a minimum delay before the next run happens?

Yes, absolutely. This is particularly useful if you are building a distributed system, and you want to give other hosts a chance to export resources before each successive run. Example:

include common::again

# when notified, this will run puppet again, delta sec after it ends!
common::again::delta { 'some-name':
    delta => 120, # 2 minutes (pick your own value)
}

# to run the above Exec['again'] you can use:
exec { '/bin/true':
    onlyif => '/bin/false', # TODO: some condition
    notify => Common::Again::Delta['some-name'],
}

Can you show me a real-world example of this module?

Have a look at the Puppet-Gluster module. This module was one of the reasons that I wrote the Exec[‘again’] functionality.

Are there any caveats?

Maybe! It’s possible to cause a fast “infinite loop”, where Puppet gets run unnecessarily. This could effectively DDOS your puppetmaster if left unchecked, so please use with caution! Keep in mind that puppet typically runs in an infinite loop already, except with a 30 minute interval.

Help, it won’t stop!

Either your code has become sentient, and has decided it wants to enable kerberos or you’ve got a bug in your Puppet manifests. If you fix the bug, things should eventually go back to normal. To kill the process that’s re-spawning puppet, look for it in your process tree. Example:

[root@server ~]# ps auxww | grep again[.py]
root 4079 0.0 0.7 132700 3768 ? S 18:26 0:00 /usr/bin/python /var/lib/puppet/tmp/common/again/again.py --delta 120
[root@server ~]# killall again.py
[root@server ~]# echo $?
0
[root@server ~]# ps auxww | grep again[.py]
[root@server ~]# killall again.py
again.py: no process killed
[root@server ~]#

Does this work with puppet running as a service or with puppet agent –test?

Yes.

How was the spawn/exec logic implemented?

The spawn/exec logic was implemented as a standalone python program that gets copied to your local system, and does all the heavy lifting. Please have a look and let me know if you can find any bugs!

Conclusion

I hope you enjoyed this addition to your toolbox. Please remember to use it with care. If you have a legitimate use for it, please let me know so that I can better understand your use case!

Happy hacking,

James

 

4 thoughts on “Introducing Puppet Exec[‘again’]

  1. I just make sure my kickstart bootstrap script runs puppet 3 times in a row and boom – convergence as you say.

    • This won’t work for certain scenarios– for Puppet-Gluster, you need each host to run, then after they’ve all each run, subsequent runs are necessary, and so on.

      For a single host which doesn’t depend on others, this might work, but at best it’s an awkward hack.

      Can you post your manifests to see if I can get it down to a single run?

  2. This is a good way to solve situations when a puppet module legitimately needs to run multiple times for it to complete correctly. This is a self dispatching model! I’ve used this before for celery [0] tasks that know that they need to run again, but cannot resolve the issue on its own and are not sure exactly when it should check again.

    One small improvement may be to have the backoff be chosen at runtime so that it’s not deterministic. If a lot of hosts are using the same backoff then the system could experience a stamped. For instance TCP and other network protocols rely heavily on a backoff not being deterministic for exactly this reason.

    [0]: http://celery.readthedocs.org/en/latest/

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s