A revisionist history of configuration management

I’ve got a brand new core feature in mgmt called send/recv which I plan to show you shortly, but first I’d like to start with some background.

History

This is my historical perspective on and interpretation of the last twenty years in configuration management. It’s likely inaccurate and slightly revisionist, but it should be correct enough to tell the design story that I want to share.

Sometime after people started to realize that writing bash scripts wasn’t a safe, scalable, or reusable way to automate systems, CFEngine burst onto the scene with the first real solution to this problem. I think it was mostly all quite sane, but it wasn’t a tool which let us build autonomous systems, so people eventually looked elsewhere.

Later on, a new tool called Puppet appeared, and advertised itself as a “CFEngine killer”. It was written in a flashy new language called Ruby, and started attracting a community. I think it had some great ideas, and in particular, the idea of a safe declarative language was a core principle of the design.

I first got into configuration management around this time. My first exposure was to Puppet version 0.24, IIRC. Two major events followed.

  1. Puppet (the company, previously named “Reductive Labs”) needed to run a business (rightly so!) and turned their GPL licensed project into an ALv2 licensed one. This opened the door to an open-core business model, and I think was ultimately a detriment to the Puppet community.
  2. Some felt that the Puppet DSL (language) was too restrictive, and that this was what prevented them from building autonomous systems. They eventually started a project called Chef which let you write your automation using straight Ruby code. It never did lead them to build autonomous systems.

At this point, as people began to feel that the complexity (in particular around multi-machine environments) was starting to get too high, a flashy new orchestrator called Ansible appeared. While I like to put centralized orchestrators in a different category than configuration management, it sits in the same problem space so we’ll include it here.

Ansible tried to solve the complexity and multi-machine issue by determining the plan of action in advance, and then applying those changes remotely over SSH. Instead of a brand new “language”, they ended up with a fancy YAML syntax which has been loved by many and disliked by others. You also couldn’t really exchange host-local information between hosts at runtime, but this was a more advanced requirement anyway. They never did end up building reactive, autonomous systems, but this might not have been a goal.

Sometime later, container technology had a renaissance. The popular variant that caused a stir was called Docker. This dominant form was one in which you used a bash script wrapped in some syntactic sugar (a “Dockerfile”) to build your container images. Many believed (although incorrectly) that container technology would be a replacement for this configuration management scene. The solution was to build these blobs with shell scripts, and to mix-in the mostly useless concept of image layering.

They seem to have taken the renaissance too literally, and when they revived container technology, they also brought back using the shell as a build primitive. While I am certainly a fan and user of bash, and I do appreciate the nostalgia, it isn’t the safe, scalable design that I was looking for.

Docker is definitely in a different category than configuration management, and in fact, I think the two are actually complementary. Even though I prefer systemd-nspawn, we’ll mention Docker here so that I can publicly discredit the notion that it sits in or replaces this problem space.

While in some respects they got much closer to being able to build autonomous systems, you had to rewrite your software to be able to fit into this model, and even then, there are many shortcomings that still haven’t been resolved.

Analysis

On the path to autonomous systems, there is certainly a lot of trial and error. I don’t pretend to think that I have solved this problem, but I think I’ll get pretty close.

  • Where CFEngine chose the C language for portability, it lacked safety, and
  • Where Puppet chose a declarative language for safety, it lacked power, and
  • Where Chef chose raw code for power, it lacked simplicity, and
  • Where Ansible chose an orchestrator for simplicity, it lacked distribution, and
  • Where Docker chose multiple instances for distribution, it lacked coordination.

I believe that instead the answer to all of these is still ahead. When discussing power, I think the main mistake was the lack of a sufficiently advanced resource primitive. The event based engine in mgmt is intended to be the main aspect of this solution, but not the whole story. For another piece of this story, I invented something I’m calling send/recv.

Send/Recv

I’d like to go into this today, but I think I’m going to split this discussion into a separate blog post. Expect something here within a week!

If you hate the suspense, become a contributor and be involved in these discussions! We’re hanging out in #mgmtconfig on Freenode. I also hold occasional videoconferences with code contributors where we talk about the future.

Thanks

I learned a tremendous amount from all of these earlier tools and communities, and even though I am working on a next generation tool, I would never be where I am now if it wasn’t for all of those who came before me. I’m even trying to borrow ideas where it is appropriate to do so! I welcome all of those communities into the mgmt circle, and I hope that their users all continue to positively influence the design of mgmt.

Happy hacking,

James

Hybrid management of FreeIPA types with Puppet

(Note: this hybrid management technique is being demonstrated in the puppet-ipa module for FreeIPA, but the idea could be used for other modules and scenarios too. See below for some use cases…)

The error message that puppet hackers are probably most familiar with is:

Error: Duplicate declaration: Thing[/foo/bar] is already declared in file /tmp/baz.pp:2; 
cannot redeclare at /tmp/baz.pp:4 on node computer.example.com

Typically this means that there is either a bug in your code, or someone has defined something more than once. As annoying as this might be, a compile error happens for a reason: puppet detected a problem, and it is giving you a chance to fix it, without first running code that could otherwise leave your machine in an undefined state.

The fundamental problem

The fundamental problem is that two or more contradictory declarative definitions might be impossible to resolve properly. For example, assume the following code:

package { 'awesome':
    ensure => present,
}

package { 'awesome':
    ensure => absent,
}

Since the above are contradictory, they can’t be reconciled, and a compiler error occurs. If they were identical, or if they would produce the same effect, then it wouldn’t be an issue; however, this is not directly allowed due to a flaw in the design of puppet core. (There is an ensure_resource workaround, to be used very cautiously!)

FreeIPA types

The puppet-ipa module exposes a bunch of different types that map to FreeIPA objects. The most common are users, hosts, and services. If you run a dedicated puppet shop, then puppet can be your interface to manage FreeIPA, and life will go on as usual. The caveat is that FreeIPA provides a stunning web-ui, and a powerful cli, and it would be a shame to ignore both of these.

The FreeIPA webui is gorgeous. It even gets better in the new 4.0 release.

Hybrid management

As the title divulges, my puppet-ipa module actually allows hybrid management of the FreeIPA types. This means that puppet can be used in conjunction with the web-ui and the cli to create/modify/delete FreeIPA types. This took a lot of extra thought and engineering to make possible, but I think it was worth the work. This feature is optional, but if you do want to use it, you’ll need to let puppet know of your intentions. Here’s how…

Type excludes

In order to tell puppet to leave certain types alone, the main ipa::server class has type_excludes. Here is an excerpt from that code:

# special
# NOTE: host_excludes is matched with bash regexp matching in: [[ =~ ]]
# if the string regexp passed contains quotes, string matching is done:
# $string='"hostname.example.com"' vs: $regexp='hostname.example.com' !
# obviously, each pattern in the array is tried, and any match will do.
# invalid expressions might cause breakage! use this at your own risk!!
# remember that you are matching against the fqdn's, which have dots...
# a value of true, will automatically add the * character to match all.
$host_excludes = [],       # never purge these host excludes...
$service_excludes = [],    # never purge these service excludes...
$user_excludes = [],       # never purge these user excludes...

Each of these excludes lets you specify a pattern (or an array of patterns) which will be matched against each defined type, and which, if matched, will ensure that your type is not removed if the puppet definition for it is undefined.

Currently these type_excludes support pattern matching in bash regexp syntax. If there is a strong demand for regexp matching in either python or ruby syntax, then I will add it. In addition, other types of exclusions could be added. If you’d like to exclude based on some type’s value, creation time, or some other property, these could be investigated. The important thing is to understand your use case, so that I know what is both useful and necessary.

Here is an example of some host_excludes:

class { '::ipa::server':
    host_excludes => [
        "'foo-42.example.com'",                  # exact string match
        '"foo-bar.example.com"',                 # exact string match
        "^[a-z0-9-]*\\-foo\\.example\\.com$",    # *-foo.example.com or:
        "^[[:alpha:]]{1}[[:alnum:]-]*\\-foo\\.example\\.com$",
        "^foo\\-[0-9]{1,}\\.example\\.com"       # foo-<\d>.example.com
    ],
}

This example and others are listed in the examples/ folder.
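
If the quoting rule feels surprising, here is a minimal bash sketch of the underlying [[ =~ ]] behaviour. It is not code from puppet-ipa: the is_excluded helper, the sample hostnames, and the slightly simplified patterns are all made up for illustration.

```shell
#!/usr/bin/env bash
# Illustration only: is_excluded is a hypothetical helper, not puppet-ipa code.

# Return 0 if the fqdn matches any of the given patterns, 1 otherwise.
is_excluded() {
    local fqdn="$1"; shift
    local pattern
    for pattern in "$@"; do
        # unquoted right-hand side: bash treats it as an extended regexp
        if [[ $fqdn =~ $pattern ]]; then
            return 0
        fi
    done
    return 1
}

patterns=(
    '^[a-z0-9-]*-foo\.example\.com$'    # *-foo.example.com
    '^foo-[0-9]{1,}\.example\.com$'     # foo-<\d>.example.com
)

for host in web-foo.example.com foo-42.example.com bar.example.com; do
    if is_excluded "$host" "${patterns[@]}"; then
        echo "$host: excluded"
    else
        echo "$host: managed"
    fi
done

# Quoting the right-hand side switches bash to literal string matching,
# so the dots below no longer match arbitrary characters.
[[ fooXexampleYcom =~ foo.example.com ]] && echo 'regexp: match'
[[ fooXexampleYcom =~ 'foo.example.com' ]] || echo 'string: no match'
```

Keep in mind that bash itself treats a quoted right-hand side as literal text that may appear anywhere in the string, which is one more reason to test your patterns before relying on them.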

Type modification

Each type in puppet-ipa has a $modify parameter. The significance of this is quite simple: if this value is set to false, then puppet will not be able to modify the type. (It will be able to remove the type if it becomes undefined, which is what the type_excludes mentioned above is used for.)

This $modify parameter is particularly useful if you’d like to define your types with puppet, but allow them to be modified afterwards by either the web-ui or the cli. If you change a user’s phone number, and this parameter is false, then it will not be reverted by puppet. The usefulness of this field is that it allows you to define the type, so that if it is removed manually in the FreeIPA directory, then puppet will notice its absence, and re-create it with the defaults you originally defined.

Here is an example user definition that is using $modify:

ipa::server::user { 'arthur@EXAMPLE.COM':
    first => 'Arthur',
    last => 'Guyton',
    jobtitle => 'Physiologist',
    orgunit => 'Research',
    #modify => true, # optional, since true is the default
}

In true puppet style, the $modify parameter defaults to true. One thing to keep in mind: if you decide to update the puppet definition, then the type will get updated, which could potentially overwrite any manual change you made.

Type watching

Type watching is the strict form of type modification. As with type modification, each type has a $watch parameter. This also defaults to true. When this parameter is true, each puppet run will compare the parameters defined in puppet with what is set on the FreeIPA server. If they are different, then puppet will run a modify command so that harmony is restored. This is particularly useful for ensuring that the policy that you’ve defined for certain types in puppet definitions is respected.

Here’s an example:

ipa::server::host { 'nfs':    # NOTE: adding .${domain} is a good idea....
    domain => 'example.com',
    macaddress => "00:11:22:33:44:55",
    random => true,        # set a one time password randomly
    locality => 'Montreal, Canada',
    location => 'Room 641A',
    platform => 'Supermicro',
    osstring => 'RHEL 6.6 x86_64',
    comment => 'Simple NFSv4 Server',
    watch => true,    # read and understand the docs well
}

If someone were to change one of these parameters, puppet would revert it. This detection happens through an elaborate difference engine. This was mentioned briefly in an earlier article, and is probably worth looking at if you’re interested in python and function decorators.

Keep in mind that it logically follows that you must be able to $modify to be able to $watch. If you forget and make this mistake, puppet-ipa will report the error. You can, however, have different values of $modify and $watch per individual type.
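
To make the interplay between $modify and $watch concrete, here is a rough sketch of the decision that a puppet run has to make for a single type, written in bash for brevity. The decide function and its four true/false arguments are invented for this example and are not part of puppet-ipa. The sketch also ignores what happens when the puppet definition itself is edited.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the per-run decision; none of these names come
# from puppet-ipa itself.
# usage: decide <exists> <modify> <watch> <drifted>
# each argument is the string "true" or "false"
decide() {
    local exists="$1" modify="$2" watch="$3" drifted="$4"

    # $watch without $modify is contradictory, so report it as an error
    if [[ $watch == true && $modify != true ]]; then
        echo 'error'
        return 1
    fi
    # a missing object is always re-created from the puppet definition
    if [[ $exists != true ]]; then
        echo 'create'
        return 0
    fi
    # watched objects that differ from the definition are reverted
    if [[ $watch == true && $drifted == true ]]; then
        echo 'modify'
        return 0
    fi
    # otherwise, manual changes are left alone
    echo 'nothing'
}

decide true false false true      # edited by hand, $modify false: nothing
decide false false false false    # removed by hand: create
decide true true true true        # watched type edited by hand: modify
```

The ordering of the checks is a design choice in this sketch: the contradictory combination is rejected first, so that a mistake is reported even when the object happens to be in sync.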

Use cases

With this hybrid management feature, a bunch of new use cases are now possible! Here are a few ideas:

  • Manage users, hosts, and services that your infrastructure requires, with puppet, but manage non-critical types manually.
  • Manage FreeIPA servers with puppet, but let HR manage user entries with the web-ui.
  • Manage new additions with puppet, but exclude historical entries from management while gradually migrating this data into puppet/hiera as time permits.
  • Use the cli without fear that puppet will revert your work.
  • Use puppet to ensure that certain types are present, but manage their data manually.
  • Exclude your development subdomain or namespace from puppet management.
  • Assert policy over a select set of types, but manage everything else by web-ui and cli.

Testing with Vagrant

You might want to test this all out. It’s all pretty automatic if you’ve followed along with my earlier vagrant work and my puppet-gluster work. You don’t have to use vagrant, but it’s all integrated for you in case that saves you time! The short summary is:

$ git clone --recursive https://github.com/purpleidea/puppet-ipa
$ cd puppet-ipa/vagrant/
$ vs
$ # edit puppet-ipa.yaml (although it's not necessary)
$ # edit puppet/manifests/site.pp (optionally, to add any types)
$ vup ipa1 # it can take a while to download freeipa rpm's
$ vp ipa1 # let the keepalived vip settle
$ vp ipa1 # once settled, ipa-server-install should run
$ vfwd ipa1 80:80 443:443 # if you didn't port forward before...
# echo '127.0.0.1   ipa1.example.com ipa1' >> /etc/hosts
$ firefox https://ipa1.example.com/ # accept self-sign https cert

Conclusion

Sorry that I didn’t write this article sooner. This feature has been baked in for a while now, but I simply forgot to blog about it! Since puppet-ipa is getting quite mature, it might be time for me to create some more formal documentation. Until then,

Happy hacking,

James