Oh-My-Vagrant “Mainstream” mode and COPR RPM’s

Making Oh-My-Vagrant (OMV) more developer accessible and easy to install (from a distribution package like RPM) has always been a goal, but was previously never a priority. This is all sorted out now. In this article, I’ll explain how “mainstream” mode works, and how the RPM work was done. (I promise this will be somewhat interesting!)


If you haven’t read any of the previous articles about Oh-My-Vagrant, I’d recommend you start there. Many of the articles include screencasts, and combined with the examples/ folder, this is probably the best way to learn OMV, because the documentation could use some love.


OMV is now easily installable on Fedora 22 via COPR. It probably works on other distros and versions, but I haven’t tested all of those combinations. This is a colossal improvement from when I first posted about this publicly in 2013. There is still one annoying bug that I occasionally hit. Let me know if you can reproduce.

Install from COPR:

james@computer:~$ sudo dnf copr enable purpleidea/oh-my-vagrant

You are about to enable a Copr repository. Please note that this
repository is not part of the main Fedora distribution, and quality
may vary.

The Fedora Project does not exercise any power over the contents of
this repository beyond the rules outlined in the Copr FAQ at
, and
packages are not held to any quality or security level.

Please do not file bug reports about these packages in Fedora
Bugzilla. In case of problems, contact the owner of this repository.

Do you want to continue? [y/N]: y
Repository successfully enabled.
james@computer:~$ sudo dnf install oh-my-vagrant
Last metadata expiration check performed 0:05:08 ago on Tue Jul  7 22:58:45 2015.
Dependencies resolved.
 Package           Arch     Version            Repository                  Size
 oh-my-vagrant     noarch   0.0.7-1            purpleidea-oh-my-vagrant   270 k
 vagrant           noarch   1.7.2-7.fc22       updates                    428 k
 vagrant-libvirt   noarch   0.0.26-2.fc22      fedora                      57 k

Transaction Summary
Install  3 Packages

Total download size: 755 k
Installed size: 2.5 M
Is this ok [y/N]: n
Operation aborted.
james@computer:~$ sudo dnf install -y oh-my-vagrant
Last metadata expiration check performed 0:05:19 ago on Tue Jul  7 22:58:45 2015.
Dependencies resolved.
 Package           Arch     Version            Repository                  Size
 oh-my-vagrant     noarch   0.0.7-1            purpleidea-oh-my-vagrant   270 k
 vagrant           noarch   1.7.2-7.fc22       updates                    428 k
 vagrant-libvirt   noarch   0.0.26-2.fc22      fedora                      57 k

Transaction Summary
Install  3 Packages

Total download size: 755 k
Installed size: 2.5 M
Downloading Packages:
(1/3): vagrant-1.7.2-7.fc22.noarch.rpm          626 kB/s | 428 kB     00:00    
(2/3): vagrant-libvirt-0.0.26-2.fc22.noarch.rpm  70 kB/s |  57 kB     00:00    
(3/3): oh-my-vagrant-0.0.7-1.noarch.rpm         243 kB/s | 270 kB     00:01    
Total                                           246 kB/s | 755 kB     00:03     
Running transaction check
Transaction check succeeded.
Running transaction test
Transaction test succeeded.
Running transaction
  Installing  : vagrant-1.7.2-7.fc22.noarch                                 1/3 
  Installing  : vagrant-libvirt-0.0.26-2.fc22.noarch                        2/3 
  Installing  : oh-my-vagrant-0.0.7-1.noarch                                3/3 
  Verifying   : oh-my-vagrant-0.0.7-1.noarch                                1/3 
  Verifying   : vagrant-libvirt-0.0.26-2.fc22.noarch                        2/3 
  Verifying   : vagrant-1.7.2-7.fc22.noarch                                 3/3 

  oh-my-vagrant.noarch 0.0.7-1                vagrant.noarch 1.7.2-7.fc22       
  vagrant-libvirt.noarch 0.0.26-2.fc22       


If you’d like to avoid typing passwords over and over again when using vagrant, you can add yourself to the vagrant group. Most people do this. The downside is that it could allow your user account to gain root privileges. Since most developers work in a single-user environment, it’s not a big issue. This is necessary because vagrant uses the qemu:///system connection instead of qemu:///session. If you can help fix this, please hack on it.

james@computer:~$ groups
james wheel docker
james@computer:~$ sudo usermod -aG vagrant james
# you'll need to logout/login for this change to take effect...

Lastly, there is a user session plugin addition that is required. Installation is automatic the first time you create a new OMV project. Let’s do that and see how it works!

james@computer:~$ mkdir /tmp/omvtest
james@computer:~$ cd !$
cd /tmp/omvtest
james@computer:/tmp/omvtest$ which omv
james@computer:/tmp/omvtest$ omv init
Oh-My-Vagrant needs to install a modified vagrant-hostmanager plugin.
Is this ok [y/N]: y
Cloning into 'vagrant-hostmanager'...
remote: Counting objects: 801, done.
remote: Total 801 (delta 0), reused 0 (delta 0), pack-reused 801
Receiving objects: 100% (801/801), 132.22 KiB | 0 bytes/s, done.
Resolving deltas: 100% (467/467), done.
Checking connectivity... done.
Branch feat/oh-my-vagrant set up to track remote branch feat/oh-my-vagrant from origin.
Switched to a new branch 'feat/oh-my-vagrant'
sending incremental file list

sent 20,560 bytes  received 286 bytes  41,692.00 bytes/sec
total size is 19,533  speedup is 0.94
Patched successfully!
Current machine states:

omv1                      not created (libvirt)

The Libvirt domain is not created. Run `vagrant up` to create it.
james@computer:/tmp/omvtest$ ls
ansible/  docker/  kubernetes/  omv.yaml  puppet/  shell/

You can see that the plugin installation worked perfectly, and that OMV created a few files and folders.

More usage:

You can hide that generated mess in a subfolder if you prefer:

james@computer:/tmp/omvtest$ mkdir /tmp/omvtest2
james@computer:/tmp/omvtest$ cd !$
cd /tmp/omvtest2
james@computer:/tmp/omvtest2$ omv init mess
Current machine states:

omv1                      not created (libvirt)

The Libvirt domain is not created. Run `vagrant up` to create it.
james@computer:/tmp/omvtest2$ ls
mess/  omv.yaml@
james@computer:/tmp/omvtest2$ ls -lAh
total 0
drwxrwxr-x. 7 james 160 Jul  7 23:26 mess/
lrwxrwxrwx. 1 james  13 Jul  7 23:26 omv.yaml -> mess/omv.yaml
drwxrwxr-x. 3 james  60 Jul  7 23:26 .vagrant/
james@computer:/tmp/omvtest2$ tree
├── mess
│   ├── ansible
│   │   └── modules
│   ├── docker
│   ├── kubernetes
│   │   ├── applications
│   │   └── templates
│   ├── omv.yaml
│   ├── puppet
│   │   └── modules
│   └── shell
└── omv.yaml -> mess/omv.yaml

10 directories, 2 files

As you can see, all the mess is wrapped up in a single folder. This could even be named .omv if you prefer, and should all be committed inside of your project. Now that we’re installed, let’s get hacking!

Mainstream mode:

Mainstream mode further hides the ruby/Vagrantfile aspect of a Vagrant project and extends OMV so that you can define your entire project via the omv.yaml file, without the rest of the OMV project cluttering up your development tree. This makes it possible to have your project use OMV by only committing that one yaml file into the project repo.

The main difference is that you now control everything with the new omv command line tool. It’s essentially a smart wrapper around the vagrant command, so any command you used to use vagrant for, you can now substitute in omv. It also saves typing four extra characters!

As it turns out (and by no accident) the omv tool works exactly like the vagrant tool. For example:

james@computer:/tmp/omvtest2$ omv status
Current machine states:

omv1                      not created (libvirt)

The Libvirt domain is not created. Run `vagrant up` to create it.
james@computer:/tmp/omvtest2$ omv up
Bringing machine 'omv1' up with 'libvirt' provider...
==> omv1: Box 'centos-7.1' could not be found. Attempting to find and install...
    omv1: Box Provider: libvirt
    omv1: Box Version: >= 0
==> omv1: Adding box 'centos-7.1' (v0) for provider: libvirt
    omv1: Downloading: https://dl.fedoraproject.org/pub/alt/purpleidea/vagrant/centos-7.1/centos-7.1.box
james@computer:/tmp/omvtest2$ omv destroy
Unlocking shell provisioning for: omv1...
==> omv1: Domain is not created. Please run `vagrant up` first.


The existing tools you know and love, like vlog, vsftp, vscreen, vcssh, vfwd, vansible, have all been modified to work with OMV mainstream mode as well. The same goes for common aliases such as vs, vp, vup, vdestroy, vrsync, and the useful (but occasionally dangerous) vrm-rf. Have a look at the above links on my blog and the source to see what these do. If it’s not clear enough, let me know!

All of these are now packaged up in the oh-my-vagrant COPR and are installed automatically into /etc/profile.d/oh-my-vagrant.sh for your convenience. Since they’re part of the OMV project, you’ll get updates when new functions or bug fixes are made.

The plumbing:

Mainstream mode is possible because of an idea rbarlow had. He gets full credit for the idea, in particular for teaching me about VAGRANT_CWD which is what makes it all work. I rejected his 6 line prototype, but loved the idea, and since he was busy making juice, I got bored one day and hacked on a full implementation.

james@computer:~/code/oh-my-vagrant$ git diff --stat 853073431d227cbb0ba56aaf4fedd721904de9a8 aa764ae79d69475b87f293c43af4f20fd7d1d000
 DOCUMENTATION.md    | 18 +++++++++++++++
 bin/omv.sh          | 50 +++++++++++++++++++++++++++++++++++++++++
 vagrant/Vagrantfile | 65 ++++++++++++++++++++++++++++++++++-------------------
 3 files changed, 110 insertions(+), 23 deletions(-)

It turned out it was a little longer, but I artificially inflated this by including some quick doc patches. What does it actually do differently? It sets VAGRANT_CWD and VAGRANT_DOTFILE_PATH so that the vagrant command looks in a different directory for the Vagrantfile and .vagrant/ directories. That way, all the plumbing is hidden and part of the RPM.
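
To make the mechanism concrete, here is a minimal Python sketch of a wrapper in the spirit of omv. It is not the real bin/omv.sh: the install path in OMV_VAGRANT_DIR and the function names are assumptions made up for illustration. Only the two environment variables come from the description above.

```python
import os
import subprocess

# Hypothetical location of the packaged Vagrantfile; the real RPM may differ.
OMV_VAGRANT_DIR = "/usr/share/oh-my-vagrant/vagrant"


def omv_env(project_dir, base_env=None):
    """Return an environment that points vagrant at the hidden plumbing."""
    env = dict(os.environ if base_env is None else base_env)
    # Directory in which vagrant looks for the Vagrantfile.
    env["VAGRANT_CWD"] = OMV_VAGRANT_DIR
    # Keep the per-project state (.vagrant/) next to the project's omv.yaml.
    env["VAGRANT_DOTFILE_PATH"] = os.path.join(project_dir, ".vagrant")
    return env


def run_vagrant(args, project_dir=None):
    """Pass all arguments straight through to vagrant with the modified env."""
    project_dir = project_dir or os.getcwd()
    return subprocess.call(["vagrant"] + list(args), env=omv_env(project_dir))
```

With something like this, run_vagrant(["status"]) would behave like vagrant status, except that the Vagrantfile is read from the packaged directory while the .vagrant/ state stays in your project folder.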

Making the RPM:

The RPM’s happened because stefw made me feel bad about not having them. He was right to do so. In any case, RPM packaging still scares me. I think repetitive work scares me even more. That’s why I automate as much as I can. So after a lot of brain loss, I finally made you an RPM so that you could easily install it. Here’s how it went:

I started by adding the magic so that my Makefile could build an RPM.

This made it so I can easily run make srpm to get a new RPM or SRPM.

Then I added COPR integration, so a make copr automatically kicks off a new COPR build. This was the interesting part. You’ll need a Fedora account for this to work. Once you’re logged in, if you go to https://copr.fedoraproject.org/api you’ll be able to download a snippet to put in your ~/.config/copr file. Lastly, the work happens in copr-build.py where the python copr library does the heavy lifting.


# for initial setup, browse to: https://copr.fedoraproject.org/api/
# and it will have a ~/.config/copr config that you can download.
# happy hacking!

import os
import sys
import copr

COPR = 'oh-my-vagrant'
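if len(sys.argv) != 2:
    print("Usage: %s <srpm url>" % sys.argv[0])
    sys.exit(1)

url = sys.argv[1]

client = copr.CoprClient.create_from_file_config(os.path.expanduser("~/.config/copr"))

result = client.create_new_build(COPR, [url])
if result.output != "ok":
    # the rest of this check was cut off in the listing; failing loudly
    # here is an assumption about what it did
    sys.exit("COPR build request failed: %s" % result.output)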

A build looks like this:

james@computer:~/code/oh-my-vagrant$ git tag 0.0.8 # set a new tag
james@computer:~/code/oh-my-vagrant$ make copr 
Running templater...
Running git archive...
Running git archive submodules...
Running rpmbuild -bs...
Wrote: /home/james/code/oh-my-vagrant/rpmbuild/SRPMS/oh-my-vagrant-0.0.8-1.src.rpm
Running SRPMS sha256sum...
Running SRPMS gpg...

You need a passphrase to unlock the secret key for
user: "James Shubin (Third PGP key.) <james@shubin.ca>"
4096-bit RSA key, ID 24090D66, created 2012-05-09

gpg: WARNING: The GNOME keyring manager hijacked the GnuPG agent.
gpg: WARNING: GnuPG will not work properly - please configure that tool to not interfere with the GnuPG system!
Running SRPMS upload...
sending incremental file list

sent 8,583 bytes  received 2,184 bytes  4,306.80 bytes/sec
total size is 1,456,741  speedup is 135.30
Build was added to oh-my-vagrant.

A few minutes later, the COPR build page should look like this:

A screenshot of the Oh-My-Vagrant COPR build page for people who like to look at pretty pictures instead of just terminal output.

There was a bunch of additional fixing and polishing required to get this as seamless as possible for you. Have a look at the git commits and you’ll get an idea of all the work that was done, and you’ll probably even learn about some new features I haven’t blogged about yet. It was exhausting!

As a result of all this, you can download fresh builds easily. Visit the COPR page to see how things are cooking:


I’ll try to keep this pumping out releases regularly. If I lag behind, please holler at me. In any case, please let me know if you appreciate this work. Comment, tweeter, or contact me!

Happy Hacking,


Introducing: Silent counter

You might want to write code that can tell how many iterations have passed since some action occurred. Alternatively, you might want to know if it’s the first time a machine has run Puppet. To do these types of things, you might wish to have a monotonically increasing counter in your Puppet manifest. Since one did not exist, I set out to build one!

The code:

If you just want to try the code, and skip the ramble, you can include common::counter into your manifest. The entire class is part of my puppet-common module:

git clone https://github.com/purpleidea/puppet-common


Usage notes are hardly even necessary. Here is how the code is commonly used:

include ::common::counter    # that's it!

# NOTE: we only see the notify message. no other exec/change is shown!
notify { 'counter':
        message => "Value is: ${::common_counter_simple}",
}

Now use the fact anywhere you need a counter!

Increasing a variable:

At the heart of any counter, you’ll need to have a value store, a way to read the value, and a way to increment the value. For simplicity, the value store we’ll use will be a file on disk. This file is located by default at ${vardir}/common/counter/simple. To read the value, we use a puppet fact. The fact has a key name of $::common_counter_simple. To increment the value, a simple python script is used.
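
For illustration, here is a minimal sketch of what such an increment script could look like. It is not the increment.py that ships in puppet-common; the function name and the write-then-rename approach are my assumptions here.

```python
import os
import tempfile


def increment(path):
    """Add one to the integer stored in `path` and return the new value.

    A missing or empty file counts as zero.
    """
    try:
        with open(path) as f:
            value = int(f.read().strip() or 0)
    except FileNotFoundError:
        value = 0
    value += 1
    # Write to a temporary file and rename it into place, so that a crash
    # never leaves a half-written value behind.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        f.write("%d\n" % value)
    os.replace(tmp, path)
    return value
```

A script built around this function would take the file path as its only argument and exit with status zero on success, so that the caller can tell whether the increment worked.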


To cause an increment of the value on each puppet run, an exec would have to be used. The downside of this is that this causes noise in your puppet logs, even if nothing else is happening! This noise is unwanted, so we work around this with the following code:

exec { 'counter':
        # echo an error message and be false, if the incrementing fails
        command => '/bin/echo "common::counter failed" && /bin/false',
        unless => "${vardir}/counter/increment.py '${vardir}/counter/simple'",
        logoutput => on_failure,
}

As you can see, we cause the run to happen in the silent “unless” part of the exec, and we don’t actually allow any exec to occur unless there is an error running the increment.py!

Complex example:

If you want to do something more complicated using this technique, you might want to write something like this:

$max = 8
exec { "/bin/echo this is run #: ${::common_counter_simple}":
        logoutput => on_failure,
        onlyif => [
                "/usr/bin/test ${::common_counter_simple} -lt ${max}",
                "/usr/bin/test ${::common_counter_simple} -gt 0",
        ],
        #notify => ...,    # do some action
}

Side effects:

Make sure not to use the counter fact in a $name or somewhere that would cause frequent unwanted catalog changes, as it could cause a lot of changes each run.

Module pairings:

You might want to pair this module with my “Finite State Machine” concept, or my Exec[‘again’] module. If you come up with some cool use cases, please let me know!

Future work:

If you’d like to extend this work, two features come to mind:

  1. Individual named counters. If for some reason you want more than one counter, named counters could be built.
  2. Reset functionality. If you’d like to reset a counter to zero (or to some other value) then there should be a special type you could create which causes this to happen.

If you’d like to work on either of these features, please let me know, or send me a patch!

Happy hacking!


The switch as an ordinary GNU/Linux server

The fact that we manage the switches in our data centres differently than any other server is patently absurd, but we do so because we want to harness the power of a tiny bit of silicon which happens to be able to dramatically speed up the switching bandwidth.


beware of proprietary silicon, it’s absurd!

That tiny bit of silicon is known as an ASIC, or an application specific integrated circuit, and one particularly well performing ASIC (which is present in many commercially available switches) is called the Trident.

None of this should impact the end-user management experience, however, because the big switch companies and chip makers believe that there is some special differentiation in their IP, they’ve ensured that the stacks and software surrounding the hardware is highly proprietary and difficult to replace. This also lets them create and sell bundled products and features that you don’t want, but which you can’t get elsewhere.

This is still true today! System and network engineers know too well the hassles of dealing with the different proprietary switch operating systems and interfaces. Why not standardize on the well-known interface that every GNU/Linux server uses?

We’re talking about iptables of course! (Although nftables would be an acceptable standard too!) This way we could have a common interface for all the networked devices in our server room.

I’ve been able to work around this limitation in the past, by using Linux to do my routing in software, and by building the routers out of COTS 2U GNU/Linux boxes. The trouble with this approach, is that they’re bigger, louder, more expensive, consume more power, and don’t have the port density that a 48 port 1U switch does.

a 48 port, 1U switch

It turns out that there is a company which is actually trying to build this mythical box. It is not perfect, but I think they are on the right track. What follows are my opinions of what they’ve done right, what’s wrong, and what I’d like to see in the future.

Who are they?

They are Cumulus Networks, and I recently got to meet one of their very talented engineers, Leslie Carr, who gave me a demo and discussed the product with me. I also recently attended a talk that she gave on this very same subject. She gave me a rocket turtle. (Yes, this now makes me biased!)

my rocket turtle, the cumulus networks mascot

What are they doing?

You buy an existing switch from your favourite vendor. You then throw out (flash over) the included software, and instead, pay them a yearly licensing fee to use the “Cumulus” GNU/Linux. It comes as an OS image, based off of Debian.

How does it talk to the ASIC?

The OS image comes with a daemon called switchd that transfers the kernel iptables rules down into the ASIC. I don’t know the specifics on how this works exactly because:

  1. Switchd is proprietary. Apparently this is because of a scary NDA they had to sign, but it’s still unfortunate, and it is impeding my hacking.
  2. I’m not an expert on talking to ASIC’s. I’m guessing that unless you’ve signed the NDA’s, and you’re behind the Trident paywall, then it’s tough to become one!

Problems with packaging:

The OS is only distributed as a single image. This is an unfortunate mistake! It should be available from the upstream project with switchd (and any other add-ons) as individual .deb packages. That way, I know I’m getting a stock OS which is preferably even built and signed by the Debian team! That way I could use the same infrastructure for my servers to keep all my servers up to date.

Problems with OS security:

Unfortunately the OS doesn’t benefit from any of the standard OS security enhancements like SELinux. I’d prefer running a more advanced distro like RHEL or CentOS that have these things out of the box, but if Cumulus will continue using Debian, then they must include some more advanced security measures. I didn’t find AppArmor or grsecurity in use either. It did seem to have all the important bash security updates:

cumulus@switch1$ bash --version
GNU bash, version 4.2.37(1)-release (powerpc-unknown-linux-gnu)
Copyright (C) 2011 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later 

This is free software; you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Use of udev:

This switch does seem to support and use udev, although not being a udev expert I can’t comment on if it’s done properly or not. I’d be interested to hear from the pros. Here’s what I found:

cumulus@switch1$ cd /etc/udev/
cumulus@switch1$ tree
`-- rules.d
    |-- 10-cumulus.rules
    |-- 60-bridge-network-interface.rules -> /dev/null
    |-- 75-persistent-net-generator.rules -> /dev/null
    `-- 80-networking.rules -> /dev/null

1 directory, 4 files
cumulus@switch1$ cat rules.d/10-cumulus.rules | tail -n 7
# udev rules specific to Cumulus Linux

# Rule called when the linux-user-bde driver is loaded
ACTION=="add" SUBSYSTEM=="module" DEVPATH=="/module/linux_user_bde" ENV{DEVICE_NAME}="linux-user-bde" ENV{DEVICE_TYPE}="c" ENV{DEVICE_MINOR}="0" RUN="/usr/lib/cumulus/udev-module"

# Quanta LY8 uses RTC1
KERNEL=="rtc1", PROGRAM="/usr/bin/platform-detect", RESULT=="quanta,ly8_rangeley", SYMLINK+="rtc"

Other things:

There seems to be a number of extra things running on the switch. Here’s what I mean:

cumulus@switch1$ ps auxwww 
root         1  0.0  0.0   2516   860 ?        Ss   Nov05   0:02 init [3]  
root         2  0.0  0.0      0     0 ?        S    Nov05   0:00 [kthreadd]
root         3  0.0  0.0      0     0 ?        S    Nov05   0:05 [ksoftirqd/0]
root         5  0.0  0.0      0     0 ?        S    Nov05   0:00 [kworker/u:0]
root         6  0.0  0.0      0     0 ?        S    Nov05   0:00 [migration/0]
root         7  0.0  0.0      0     0 ?        S<   Nov05   0:00 [cpuset]
root         8  0.0  0.0      0     0 ?        S<   Nov05   0:00 [khelper]
root         9  0.0  0.0      0     0 ?        S<   Nov05   0:00 [netns]
root        10  0.0  0.0      0     0 ?        S    Nov05   0:02 [sync_supers]
root        11  0.0  0.0      0     0 ?        S    Nov05   0:00 [bdi-default]
root        12  0.0  0.0      0     0 ?        S<   Nov05   0:00 [kblockd]
root        13  0.0  0.0      0     0 ?        S<   Nov05   0:00 [ata_sff]
root        14  0.0  0.0      0     0 ?        S    Nov05   0:00 [khubd]
root        15  0.0  0.0      0     0 ?        S<   Nov05   0:00 [rpciod]
root        17  0.0  0.0      0     0 ?        S    Nov05   0:00 [khungtaskd]
root        18  0.0  0.0      0     0 ?        S    Nov05   0:00 [kswapd0]
root        19  0.0  0.0      0     0 ?        S    Nov05   0:00 [fsnotify_mark]
root        20  0.0  0.0      0     0 ?        S<   Nov05   0:00 [nfsiod]
root        21  0.0  0.0      0     0 ?        S<   Nov05   0:00 [crypto]
root        34  0.0  0.0      0     0 ?        S    Nov05   0:00 [scsi_eh_0]
root        36  0.0  0.0      0     0 ?        S    Nov05   0:00 [kworker/u:2]
root        41  0.0  0.0      0     0 ?        S    Nov05   0:00 [mtdblock0]
root        42  0.0  0.0      0     0 ?        S    Nov05   0:00 [mtdblock1]
root        43  0.0  0.0      0     0 ?        S    Nov05   0:00 [mtdblock2]
root        44  0.0  0.0      0     0 ?        S    Nov05   0:00 [mtdblock3]
root        49  0.0  0.0      0     0 ?        S    Nov05   0:00 [mtdblock4]
root       362  0.0  0.0      0     0 ?        S    Nov05   0:54 [hwmon0]
root       363  0.0  0.0      0     0 ?        S    Nov05   1:01 [hwmon1]
root       398  0.0  0.0      0     0 ?        S    Nov05   0:04 [flush-8:0]
root       862  0.0  0.1  28680  1768 ?        Sl   Nov05   0:03 /usr/sbin/rsyslogd -c4
root      1041  0.0  0.2   5108  2420 ?        Ss   Nov05   0:00 /sbin/dhclient -pf /run/dhclient.eth0.pid -lf /var/lib/dhcp/dhclient.eth0.leases eth0
root      1096  0.0  0.1   3344  1552 ?        S    Nov05   0:16 /bin/bash /usr/bin/arp_refresh
root      1188  0.0  1.1  15456 10676 ?        S    Nov05   1:36 /usr/bin/python /usr/sbin/ledmgrd
root      1218  0.2  1.1  15468 10708 ?        S    Nov05   5:41 /usr/bin/python /usr/sbin/pwmd
root      1248  1.4  1.1  15480 10728 ?        S    Nov05  39:45 /usr/bin/python /usr/sbin/smond
root      1289  0.0  0.0  13832   964 ?        SNov05   0:00 /sbin/auditd
root      1291  0.0  0.0  10456   852 ?        S
root     12776  0.0  0.0      0     0 ?        S    04:31   0:01 [kworker/0:0]
root     13606  0.0  0.0      0     0 ?        S    05:05   0:00 [kworker/0:2]
root     13892  0.0  0.0      0     0 ?        S    05:13   0:00 [kworker/0:1]
root     13999  0.0  0.0   2028   512 ?        S    05:16   0:00 sleep 30
cumulus  14016  0.0  0.1   3324  1128 pts/0    R+   05:17   0:00 ps auxwww
root     30713 15.6  2.4  69324 24176 ?        Ssl  Nov05 304:16 /usr/sbin/switchd -d
root     30952  0.0  0.3  11196  3500 ?        Ss   Nov05   0:00 sshd: cumulus [priv]
cumulus  30954  0.0  0.1  11196  1684 ?        S    Nov05   0:00 sshd: cumulus@pts/0
cumulus  30955  0.0  0.2   4548  2888 pts/0    Ss   Nov05   0:01 -bash

In particular, I’m referring to ledmgrd, pwmd, smond and others. I don’t doubt these things are necessary and useful, in fact, they’re written in python and should be easy to reverse if anyone is interested, but if they’re a useful part of a switch capable operating system, I hope that they grow proper upstream projects and get appropriate documentation, licensing, and packaging too!

Switch ports are network devices:

Hacking on the device couldn’t feel more native. Anyone remember how to enumerate the switch ports on IOS? … Who cares! Try the standard iproute2 tools on a Cumulus box:

cumulus@switch1$ ip a s swp42
44: swp42: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 500
    link/ether 08:9e:01:f8:96:74 brd ff:ff:ff:ff:ff:ff
cumulus@switch1$ ip a | grep swp | wc -l
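
Because the ports are ordinary network devices, the same count is easy to script. Here is a small Python sketch that counts swp interfaces in text captured from ip -o link, which prints one interface per line. The function name and the sample text are mine, not something from Cumulus.

```python
import re


def count_switch_ports(ip_output, prefix="swp"):
    """Count interfaces whose name starts with `prefix` in `ip -o link` output."""
    count = 0
    for line in ip_output.splitlines():
        # Each one-line record starts with "<index>: <name>: <flags> ..."
        match = re.match(r"\d+:\s+([^:@\s]+)", line)
        if match and match.group(1).startswith(prefix):
            count += 1
    return count


# Made-up sample records, shaped like the output shown above.
sample = (
    "1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN\n"
    "2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP\n"
    "44: swp42: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 500\n"
    "45: swp43: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 500\n"
)
print(count_switch_ports(sample))
```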

What about ifup/ifdown?

This one is a bit different:

cumulus@switch1$ file /sbin/ifup
/sbin/ifup: symbolic link to `/sbin/ifupdown'
cumulus@switch1$ file /sbin/ifupdown 
/sbin/ifupdown: Python script, ASCII text executable

The Cumulus team encountered issues with the traditional ifup/ifdown tools found in a stock distro, so they replaced them with shiny python versions:


I hope that this project either gets into the upstream distro, or that some upstream writes tools that address the limitations in the various messes of shell scripts. I’m optimistic about networkd being the solution here, but until that’s fully baked, the Cumulus team has built a nice workaround. Additionally, until the Debian team finalizes on the proper technical decision to use SystemD, it has a bleak future.


All the kernel hackers out there will want to know what’s under the hood:

cumulus@switch1$ uname -a
Linux leaf1 3.2.46-1+deb7u1+cl2.2+1 #3.2.46-1+deb7u1+cl2.2+1 SMP Thu Oct 16 14:28:31 PDT 2014 ppc powerpc GNU/Linux

Because this is an embedded chip found in a 1U box, and not a Xeon processor, it’s noticeably slower than a traditional server. This is of course (non-sarcastically) exactly what I want. For admin tasks, it has plenty of power, and this trade-off means it has lower power consumption and heat production than a stock server. Debugging some puppet code that I was running took longer than normal on this box, but I was eventually able to get the job done. This is another reason why this box needs to act like more of an upstream distro: if it did, I’d be able to have a faster machine as my dev box!

Other tools:

Other stock tools like ethtool, and brctl, work out of the box. Bonding, vlan’s and every other feature I tested seems to work the same way you expect from a GNU/Linux system.

Puppet and automation:

Readers of my blog will know that I manage my servers with Puppet. Instead of having the puppet agent connect over an API to the switch, you can directly install and run puppet on this Cumulus Linux machine! Some users might quickly jump to using the firewall module as the solution for consistent management, but a level two user will know that a higher level wrapper around shorewall is the better approach. This is all possible with this switch and seems to work great! The downside was that I had to manually add repositories to get the shorewall packages because it is not a stock distro :(

Why not SDN?

SDN or software-defined networking, is a fantastic and interesting technology. Unfortunately, it’s a parallel problem to what I’m describing in this article, and not a solution to help you work around the lack of a good GNU+Linux switch. If programming the ASIC’s wasn’t an NDA requiring activity, I bet we’d see a lot more innovative networking companies and technologies pop up.


This product isn’t quite baked yet for me to want to use it in production, but it’s so tantalizingly close that it’s strongly worth considering. I hope that they react positively to my suggestions and create an even more native, upstream environment. Once this is done, it will be my go to, for all my switching!


Thanks very much to the Cumulus team for showing me their software, and giving me access to demo it on some live switches. I didn’t test performance, but I have no doubt that it competes with the market average. Prove me right by trying it out yourself!

Thanks for listening, and Happy hacking!


PS: Special thanks to David Caplan for the great networking discussions we had!

Hacking out an Openshift app

I had an itch to scratch, and I wanted to get a bit more familiar with Openshift. I had used it in the past, but it was time to have another go. The app and the code are now available. Feel free to check out:


This is a simple app that takes the URL of a markdown file on GitHub, and outputs a pandoc converted PDF. I wanted to use pandoc specifically, because it produces PDF’s that were beautifully created with LaTeX. To embed a link in your upstream documentation that points to a PDF, just append the file’s URL to this app’s url, under a /pdf/ path. For example:


will send you to a PDF of the puppet-gluster documentation. This will make it easier to accept questions as FAQ patches, without needing to have the git embedded binary PDF be constantly updated.
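
If you generate these links in a script, the rule is simple enough to sketch in a few lines of Python. The app host and the markdown URL below are placeholders, not the real ones, and pdf_link is a name I made up for this example.

```python
def pdf_link(app_url, markdown_url):
    """Append a markdown file's URL to the app's URL under the /pdf/ path."""
    return app_url.rstrip("/") + "/pdf/" + markdown_url


# Both URLs are hypothetical placeholders.
link = pdf_link(
    "https://pdfdoc.example.com/",
    "https://github.com/example/project/blob/master/README.md",
)
print(link)
```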

If you want to hear more about what I did, read on…

The setup:

Start by getting a free Openshift account. You’ll also want to install the client tools. Nothing is worse than having to interact with your app via a web interface. Hackers use terminals. Luckily, the Openshift team knows this, and they’ve created a great command line tool called rhc to make it all possible.

I started by following their instructions:

$ sudo yum install rubygem-rhc
$ sudo gem update rhc

Unfortunately, this left me with a problem:

$ rhc
/usr/share/rubygems/rubygems/dependency.rb:298:in `to_specs': Could not find 'rhc' (>= 0) among 37 total gem(s) (Gem::LoadError)
    from /usr/share/rubygems/rubygems/dependency.rb:309:in `to_spec'
    from /usr/share/rubygems/rubygems/core_ext/kernel_gem.rb:47:in `gem'
    from /usr/local/bin/rhc:22:in `<main>'

I solved this by running:

$ gem install rhc

This makes the rhc installed for my user take precedence over the system one. Then run:

$ rhc setup

and the rhc client will take you through some setup steps such as uploading your public ssh key to the Openshift infrastructure. The beauty of this tool is that it will work with the Red Hat hosted infrastructure, or you can use it with your own infrastructure if you want to host your own Openshift servers. This alone means you’ll never get locked in to a third-party provider’s terms or pricing.

Create a new app:

To get a fresh python 3.3 app going, you can run:

$ rhc create-app <appname> python-3.3

From this point on, it’s fairly straightforward, and you can now hack your way through the app in python. To push a new version of your app into production, it’s just a git commit away:

$ git add -p && git commit -m 'Awesome new commit...' && git push && rhc tail

Creating a new app from existing code:

If you want to push a new app from an existing code base, it’s as easy as:

$ rhc create-app awesomesauce python-3.3 --from-code https://github.com/purpleidea/pdfdoc
Application Options
Domain:      purpleidea
Cartridges:  python-3.3
Source Code: https://github.com/purpleidea/pdfdoc
Gear Size:   default
Scaling:     no

Creating application 'awesomesauce' ... done

Waiting for your DNS name to be available ... done

Cloning into 'awesomesauce'...
The authenticity of host 'awesomesauce-purpleidea.rhcloud.com (' can't be established.
RSA key fingerprint is 00:11:22:33:44:55:66:77:88:99:aa:bb:cc:dd:ee:ff.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'awesomesauce-purpleidea.rhcloud.com,' (RSA) to the list of known hosts.

Your application 'awesomesauce' is now available.

  URL:        http://awesomesauce-purpleidea.rhcloud.com/
  SSH to:     00112233445566778899aabb@awesomesauce-purpleidea.rhcloud.com
  Git remote: ssh://00112233445566778899aabb@awesomesauce-purpleidea.rhcloud.com/~/git/awesomesauce.git/
  Cloned to:  /home/james/code/awesomesauce

Run 'rhc show-app awesomesauce' for more details about your app.

In my case, my app also needs some binaries installed. I haven’t yet automated this process, but I think it can be done by creating a custom cartridge. Help to do this would be appreciated!

Updating your app:

In the case of an app that I already deployed with this method, updating it from the upstream source is quite easy. You just pull down any relevant commits, and then push them up to your app’s git repo:

$ git pull upstream master 
From https://github.com/purpleidea/pdfdoc
 * branch            master     -> FETCH_HEAD
Updating 5ac5577..bdf9601
 wsgi.py | 2 --
 1 file changed, 2 deletions(-)
$ git push origin master 
Counting objects: 7, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (3/3), done.
Writing objects: 100% (3/3), 312 bytes | 0 bytes/s, done.
Total 3 (delta 2), reused 0 (delta 0)
remote: Stopping Python 3.3 cartridge
remote: Waiting for stop to finish
remote: Waiting for stop to finish
remote: Building git ref 'master', commit bdf9601
remote: Activating virtenv
remote: Checking for pip dependency listed in requirements.txt file..
remote: You must give at least one requirement to install (see "pip help install")
remote: Running setup.py script..
remote: running develop
remote: running egg_info
remote: creating pdfdoc.egg-info
remote: writing pdfdoc.egg-info/PKG-INFO
remote: writing dependency_links to pdfdoc.egg-info/dependency_links.txt
remote: writing top-level names to pdfdoc.egg-info/top_level.txt
remote: writing manifest file 'pdfdoc.egg-info/SOURCES.txt'
remote: reading manifest file 'pdfdoc.egg-info/SOURCES.txt'
remote: writing manifest file 'pdfdoc.egg-info/SOURCES.txt'
remote: running build_ext
remote: Creating /var/lib/openshift/00112233445566778899aabb/app-root/runtime/dependencies/python/virtenv/venv/lib/python3.3/site-packages/pdfdoc.egg-link (link to .)
remote: pdfdoc 0.0.1 is already the active version in easy-install.pth
remote: Installed /var/lib/openshift/00112233445566778899aabb/app-root/runtime/repo
remote: Processing dependencies for pdfdoc==0.0.1
remote: Finished processing dependencies for pdfdoc==0.0.1
remote: Preparing build for deployment
remote: Deployment id is 9c2ee03c
remote: Activating deployment
remote: Starting Python 3.3 cartridge (Apache+mod_wsgi)
remote: Application directory "/" selected as DocumentRoot
remote: Application "wsgi.py" selected as default WSGI entry point
remote: -------------------------
remote: Git Post-Receive Result: success
remote: Activation status: success
remote: Deployment completed with status: success
To ssh://00112233445566778899aabb@awesomesauce-purpleidea.rhcloud.com/~/git/awesomesauce.git/
   5ac5577..bdf9601  master -> master

Final thoughts:

I hope this helped you get going with Openshift. Feel free to send me patches!

Happy hacking!


Hybrid management of FreeIPA types with Puppet

(Note: this hybrid management technique is being demonstrated in the puppet-ipa module for FreeIPA, but the idea could be used for other modules and scenarios too. See below for some use cases…)

The error message that puppet hackers are probably most familiar with is:

Error: Duplicate declaration: Thing[/foo/bar] is already declared in file /tmp/baz.pp:2; 
cannot redeclare at /tmp/baz.pp:4 on node computer.example.com

Typically this means that there is either a bug in your code, or someone has defined something more than once. As annoying as this might be, a compile error happens for a reason: puppet detected a problem, and it is giving you a chance to fix it, without first running code that could otherwise leave your machine in an undefined state.

The fundamental problem

The fundamental problem is that two or more contradictory declarative definitions might not be able to be properly resolved. For example, assume the following code:

package { 'awesome':
    ensure => present,
}

package { 'awesome':
    ensure => absent,
}

Since the above are contradictory, they can’t be reconciled, and a compiler error occurs. If they were identical, or if they would produce the same effect, then it wouldn’t be an issue, however this is not directly allowed due to a flaw in the design of puppet core. (There is an ensure_resource workaround, to be used very cautiously!)
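To make the failure mode concrete, here is a toy model in Python of a catalog that refuses a second declaration of the same resource. It is only a sketch: the Catalog class and its declare method are made up for illustration and are not how puppet core is implemented. Like puppet, it rejects the redeclaration even when the parameters would have been identical.

```python
class DuplicateDeclaration(Exception):
    """Raised when the same (type, title) pair is declared twice."""


class Catalog:
    """Hypothetical stand-in for a compiled catalog of resources."""

    def __init__(self):
        self.resources = {}

    def declare(self, kind, title, **params):
        key = (kind, title)
        if key in self.resources:
            # No attempt is made to merge or compare the two definitions.
            raise DuplicateDeclaration(
                f"{kind.capitalize()}[{title}] is already declared"
            )
        self.resources[key] = params


catalog = Catalog()
catalog.declare("package", "awesome", ensure="present")
try:
    catalog.declare("package", "awesome", ensure="absent")
except DuplicateDeclaration as err:
    print(f"Error: Duplicate declaration: {err}")
```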

FreeIPA types

The puppet-ipa module exposes a bunch of different types that map to FreeIPA objects. The most common are users, hosts, and services. If you run a dedicated puppet shop, then puppet can be your interface to manage FreeIPA, and life will go on as usual. The caveat is that FreeIPA provides a stunning web-ui, and a powerful cli, and it would be a shame to ignore both of these.

The FreeIPA webui is gorgeous. It even gets better in the new 4.0 release.

Hybrid management

As the title divulges, my puppet-ipa module actually allows hybrid management of the FreeIPA types. This means that puppet can be used in conjunction with the web-ui and the cli to create/modify/delete FreeIPA types. This took a lot of extra thought and engineering to make possible, but I think it was worth the work. This feature is optional, but if you do want to use it, you’ll need to let puppet know of your intentions. Here’s how…

Type excludes

In order to tell puppet to leave certain types alone, the main ipa::server class has type_excludes. Here is an excerpt from that code:

# special
# NOTE: host_excludes is matched with bash regexp matching in: [[ =~ ]]
# if the string regexp passed contains quotes, string matching is done:
# $string='"hostname.example.com"' vs: $regexp='hostname.example.com' !
# obviously, each pattern in the array is tried, and any match will do.
# invalid expressions might cause breakage! use this at your own risk!!
# remember that you are matching against the fqdn's, which have dots...
# a value of true, will automatically add the * character to match all.
$host_excludes = [],       # never purge these host excludes...
$service_excludes = [],    # never purge these service excludes...
$user_excludes = [],       # never purge these user excludes...

Each of these excludes lets you specify a pattern (or an array of patterns) which will be matched against each defined type, and which, if matched, will ensure that your type is not removed if the puppet definition for it is undefined.

Currently these type_excludes support pattern matching in bash regexp syntax. If there is a strong demand for regexp matching in either python or ruby syntax, then I will add it. In addition, other types of exclusions could be added. If you’d like to exclude based on some type’s value, creation time, or some other property, these could be investigated. The important thing is to understand your use case, so that I know what is both useful and necessary.

Here is an example of some host_excludes:

class { '::ipa::server':
    host_excludes => [
        "'foo-42.example.com'",                  # exact string match
        '"foo-bar.example.com"',                 # exact string match
        "^[a-z0-9-]*\\-foo\\.example\\.com$",    # *-foo.example.com or:
        "^foo\\-[0-9]{1,}\\.example\\.com"       # foo-<\d>.example.com
    ],
}

This example and others are listed in the examples/ folder.
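If it helps to reason about which hosts a list of excludes will protect, the matching rules can be approximated in Python. This is a sketch under a few assumptions: the real matching is done by bash’s [[ =~ ]], whose regexp dialect differs from Python’s re module in places; quoted patterns are compared for equality here, following the “exact string match” comments above; and the is_excluded helper is a made-up name, not part of the module.

```python
import re


def is_excluded(name, patterns):
    """Return True if name matches any exclude pattern.

    A value of True excludes everything. A pattern wrapped in matching
    quotes is treated as a literal string; anything else is a regexp.
    """
    if patterns is True:
        return True
    for pattern in patterns:
        quoted = (
            len(pattern) >= 2
            and pattern[0] == pattern[-1]
            and pattern[0] in "'\""
        )
        if quoted:
            if name == pattern[1:-1]:
                return True
        elif re.search(pattern, name):
            return True
    return False


host_excludes = [
    "'foo-42.example.com'",             # literal string
    '"foo-bar.example.com"',            # literal string
    r"^[a-z0-9-]*-foo\.example\.com$",  # *-foo.example.com
    r"^foo-[0-9]{1,}\.example\.com",    # foo-<digits>.example.com
]

for host in ("foo-bar.example.com", "web-foo.example.com", "bar.example.com"):
    print(host, is_excluded(host, host_excludes))
```

A host that is excluded would be left in place even when no puppet definition exists for it; everything else would be eligible for purging.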

Type modification

Each type in puppet has a $modify parameter. The significance of this is quite simple: if this value is set to false, then puppet will not be able to modify the type. (It will be able to remove the type if it becomes undefined, which is what the type_excludes mentioned above is used for.)

This $modify parameter is particularly useful if you’d like to define your types with puppet, but allow them to be modified afterwards by either the web-ui or the cli. If you change a user’s phone number, and this parameter is false, then it will not be reverted by puppet. The usefulness of this field is that it allows you to define the type, so that if it is removed manually in the FreeIPA directory, then puppet will notice its absence, and re-create it with the defaults you originally defined.

Here is an example user definition that is using $modify:

ipa::server::user { 'arthur@EXAMPLE.COM':
    first => 'Arthur',
    last => 'Guyton',
    jobtitle => 'Physiologist',
    orgunit => 'Research',
    #modify => true, # optional, since true is the default
}

By default, in true puppet style, the $modify parameter defaults to true. One thing to keep in mind: if you decide to update the puppet definition, then the type will get updated, which could potentially overwrite any manual change you made.

Type watching

Type watching is the strict form of type modification. As with type modification, each type has a $watch parameter. This also defaults to true. When this parameter is true, each puppet run will compare the parameters defined in puppet with what is set on the FreeIPA server. If they are different, then puppet will run a modify command so that harmony is restored. This is particularly useful for ensuring that the policy that you’ve defined for certain types in puppet definitions is respected.

Here’s an example:

ipa::server::host { 'nfs':    # NOTE: adding .${domain} is a good idea....
    domain => 'example.com',
    macaddress => "00:11:22:33:44:55",
    random => true,        # set a one time password randomly
    locality => 'Montreal, Canada',
    location => 'Room 641A',
    platform => 'Supermicro',
    osstring => 'RHEL 6.6 x86_64',
    comment => 'Simple NFSv4 Server',
    watch => true,    # read and understand the docs well
}

If someone were to change one of these parameters, puppet would revert it. This detection happens through an elaborate difference engine. This was mentioned briefly in an earlier article, and is probably worth looking at if you’re interested in python and function decorators.

Keep in mind that it logically follows that you must be able to $modify to be able to $watch. If you forget and make this mistake, puppet-ipa will report the error. You can, however, have different values of $modify and $watch per individual type.
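The interaction between $modify and $watch can be summarised as a small decision function. The Python below is a sketch of the rules as described in this article, not the module’s actual code: plan_action and its arguments are hypothetical names, and tracking the previously applied definition (last_applied) is an assumption about how a changed puppet definition might be detected.

```python
def plan_action(desired, actual, last_applied=None, modify=True, watch=True):
    """Decide what one run should do for a single object.

    desired:      parameters defined in the manifest
    actual:       parameters currently on the server, or None if absent
    last_applied: the definition used by the previous run, if known
    """
    if watch and not modify:
        raise ValueError("watch requires modify")
    if actual is None:
        return "create"  # absent objects are always (re-)created
    # Only the parameters that are defined in the manifest are compared.
    drifted = any(actual.get(k) != v for k, v in desired.items())
    if not drifted:
        return "none"
    if watch:
        return "modify"  # strict mode: revert any difference
    definition_changed = last_applied is not None and desired != last_applied
    if modify and definition_changed:
        return "modify"  # the manifest itself was updated
    return "none"        # manual change on the server is left alone


defined = {"first": "Arthur", "last": "Guyton"}
on_server = {"first": "Arthur", "last": "Guyton-Smith"}  # edited by hand

print(plan_action(defined, None, modify=False, watch=False))
print(plan_action(defined, on_server, modify=False, watch=False))
print(plan_action(defined, on_server))
```

The three calls cover a missing object, a manual edit that is tolerated, and a manual edit that the default settings would revert.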

Use cases

With this hybrid management feature, a bunch of new use cases are now possible! Here are a few ideas:

  • Manage users, hosts, and services that your infrastructure requires, with puppet, but manage non-critical types manually.
  • Manage FreeIPA servers with puppet, but let HR manage user entries with the web-ui.
  • Manage new additions with puppet, but exclude historical entries from management while gradually migrating this data into puppet/hiera as time permits.
  • Use the cli without fear that puppet will revert your work.
  • Use puppet to ensure that certain types are present, but manage their data manually.
  • Exclude your development subdomain or namespace from puppet management.
  • Assert policy over a select set of types, but manage everything else by web-ui and cli.

Testing with Vagrant

You might want to test this all out. It’s all pretty automatic if you’ve followed along with my earlier vagrant work and my puppet-gluster work. You don’t have to use vagrant, but it’s all integrated for you in case that saves you time! The short summary is:

$ git clone --recursive https://github.com/purpleidea/puppet-ipa
$ cd puppet-ipa/vagrant/
$ vs
$ # edit puppet-ipa.yaml (although it's not necessary)
$ # edit puppet/manifests/site.pp (optionally, to add any types)
$ vup ipa1 # it can take a while to download freeipa rpm's
$ vp ipa1 # let the keepalived vip settle
$ vp ipa1 # once settled, ipa-server-install should run
$ vfwd ipa1 80:80 443:443 # if you didn't port forward before...
# echo '   ipa1.example.com ipa1' >> /etc/hosts
$ firefox https://ipa1.example.com/ # accept self-sign https cert


Sorry that I didn’t write this article sooner. This feature has been baked in for a while now, but I simply forgot to blog about it! Since puppet-ipa is getting quite mature, it might be time for me to create some more formal documentation. Until then,

Happy hacking,



Introducing Puppet Exec[‘again’]

Puppet is missing a number of much-needed features. That’s the bad news. The good news is that I’ve been able to write some of these as modules that don’t need to change the Puppet core! This is an article about one of these features.

Posit: It’s not possible to apply all of your Puppet manifests in a single run.

I believe that this holds true for the current implementation of Puppet. Most manifests can, do and should apply completely in a single run. If your manifest takes more than one run to converge, then chances are that you’re doing something wrong.

(For the sake of this article, convergence means that everything has been applied cleanly, and that a subsequent Puppet run wouldn’t have any work to do.)

There are some advanced corner cases where this is not possible. In these situations, you will either have to wait for the next Puppet run (by default it will run every 30 minutes) or keep running Puppet manually until your configuration has converged. Neither of these situations is acceptable because:

  • Waiting 30 minutes while your machines are idle is (mostly) a waste of time.
  • Doing manual work to set up your automation kind of defeats the purpose.
'Are you stealing those LCDs?' 'Yeah, but I'm doing it while my code compiles.'

Waiting 30 minutes while your machines are idle is (mostly) a waste of time. Okay, maybe it’s not entirely a waste of time :)

So what’s the solution?

Introducing: Puppet Exec[‘again’] !

Exec[‘again’] is a feature which I’ve added to my Puppet-Common module.

What does it do?

Each Puppet run, your code can decide if it thinks there is more work to do, or if the host is not in a converged state. If so, it will tell Exec[‘again’].

What does Exec[‘again’] do?

Exec[‘again’] will fork a process off from the running puppet process. It will wait until that parent process has finished, and then it will spawn (technically: execvpe) a new puppet process to run puppet again. The module is smart enough to inspect the parent puppet process, and it knows how to run the child puppet. Once the new child puppet process is running, you won’t see any leftover process id from the parent Exec[‘again’] tool.

How do I tell it to run?

It’s quite simple: all you have to do is import my puppet module, and then notify the magic Exec[‘again’] type that my class defines. Example:

include common::again

$some_str = 'ttboj is awesome'
# you can notify from any type that can generate a notification!
# typically, using exec is the most common, but is not required!
file { '/tmp/foo':
    content => "${some_str}\n",
    notify => Exec['again'], # notify puppet!
}

How do I decide if I need to run again?

This depends on your module, and isn’t always a trivial thing to figure out. In one case, I had to build a finite state machine in puppet to help decide whether this was necessary or not. In some cases, the solution might be simpler. In all cases, this is an advanced technique, so you’ll probably already have a good idea about how to figure this out if you need this type of technique.

Can I introduce a minimum delay before the next run happens?

Yes, absolutely. This is particularly useful if you are building a distributed system, and you want to give other hosts a chance to export resources before each successive run. Example:

include common::again

# when notified, this will run puppet again, delta sec after it ends!
common::again::delta { 'some-name':
    delta => 120, # 2 minutes (pick your own value)
}

# to run the above Exec['again'] you can use:
exec { '/bin/true':
    onlyif => '/bin/false', # TODO: some condition
    notify => Common::Again::Delta['some-name'],
}

Can you show me a real-world example of this module?

Have a look at the Puppet-Gluster module. This module was one of the reasons that I wrote the Exec[‘again’] functionality.

Are there any caveats?

Maybe! It’s possible to cause a fast “infinite loop”, where Puppet gets run unnecessarily. This could effectively DDoS your puppetmaster if left unchecked, so please use with caution! Keep in mind that puppet typically runs in an infinite loop already, except with a 30 minute interval.

Help, it won’t stop!

Either your code has become sentient, and has decided it wants to enable kerberos or you’ve got a bug in your Puppet manifests. If you fix the bug, things should eventually go back to normal. To kill the process that’s re-spawning puppet, look for it in your process tree. Example:

[root@server ~]# ps auxww | grep again[.py]
root 4079 0.0 0.7 132700 3768 ? S 18:26 0:00 /usr/bin/python /var/lib/puppet/tmp/common/again/again.py --delta 120
[root@server ~]# killall again.py
[root@server ~]# echo $?
[root@server ~]# ps auxww | grep again[.py]
[root@server ~]# killall again.py
again.py: no process killed
[root@server ~]#

Does this work with puppet running as a service or with puppet agent --test?


How was the spawn/exec logic implemented?

The spawn/exec logic was implemented as a standalone python program that gets copied to your local system, and does all the heavy lifting. Please have a look and let me know if you can find any bugs!
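For readers who want the general shape of the fork, wait, delay and exec sequence without reading the real tool, here is a minimal Python sketch. It is not the code from again.py: the function names are invented, the command to re-run is passed in explicitly rather than discovered by inspecting the parent process, and it assumes a POSIX system.

```python
import os
import time


def wait_for_exit(pid, poll=0.5, timeout=None):
    """Wait until no process with this pid exists.

    Returns True once the process is gone, or False if timeout seconds
    pass first.
    """
    deadline = None if timeout is None else time.monotonic() + timeout
    while True:
        try:
            os.kill(pid, 0)  # signal 0 only checks for existence
        except ProcessLookupError:
            return True
        except PermissionError:
            pass  # the process exists, but belongs to another user
        if deadline is not None and time.monotonic() >= deadline:
            return False
        time.sleep(poll)


def run_again(argv, delta=0):
    """Fork; the child waits for this process to end, then execs argv."""
    parent = os.getpid()
    if os.fork() != 0:
        return  # parent: carry on and finish the current run
    os.setsid()  # child: detach from the parent's session
    wait_for_exit(parent)
    time.sleep(delta)  # optional minimum delay before the next run
    os.execvpe(argv[0], argv, os.environ)  # replaces the child process
```

Calling run_again with an argument list and a delta of 120 from a long-running process would start that command two minutes after the process exits. Because the child replaces itself with exec, no extra helper process is left behind once the new run starts.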


I hope you enjoyed this addition to your toolbox. Please remember to use it with care. If you have a legitimate use for it, please let me know so that I can better understand your use case!

Happy hacking,



a puppet-ipa user type and a new difference engine

A simple hack to add a user type to my puppet-ipa module turned out to cause quite a stir. I’ve just pushed these changes out for your testing:

3 files changed, 1401 insertions(+), 215 deletions(-)

You should now have a highly capable user type, along with some quick examples.

I’ve also done a rewrite of the difference engine, so that it is cleaner and more robust. It now uses function decorators and individual function comparators to help wrangle the data into easily comparable forms. This should make adding future types easier, and less error prone. If you’re not comfortable with ruby, that’s okay, because it’s written in python!
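To give a flavour of the decorator-plus-comparator idea, here is a small Python sketch. It is an illustration of the general pattern only: the registry, the attribute names and the normalisation rules are all made up for this example and are not taken from the puppet-ipa source.

```python
COMPARATORS = {}


def comparator(key):
    """Register the decorated function as the normaliser for one key."""
    def register(func):
        COMPARATORS[key] = func
        return func
    return register


@comparator("mail")
def _mail(value):
    # Treat a single address and a one-element list alike, ignoring case.
    values = value if isinstance(value, list) else [value]
    return sorted(v.lower() for v in values)


@comparator("telephonenumber")
def _phone(value):
    # Compare phone numbers by their digits only.
    return "".join(ch for ch in str(value) if ch.isdigit())


def diff(desired, actual):
    """Return the desired values that differ from what the server has."""
    changes = {}
    for key, want in desired.items():
        normalise = COMPARATORS.get(key, lambda v: v)
        have = actual.get(key)
        if have is None or normalise(want) != normalise(have):
            changes[key] = want
    return changes


desired = {"mail": "Arthur@Example.com", "telephonenumber": "555-0100",
           "title": "Physiologist"}
actual = {"mail": ["arthur@example.com"], "telephonenumber": "5550100",
          "title": "Researcher"}
print(diff(desired, actual))
```

Adding support for a new attribute then only means writing one more decorated function, which is the kind of extensibility the rewrite is aiming for.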

Have a look at the commit message, and please test this code and let me know how it goes.

Happy hacking,


PS: This update also adds server configuration globals management which you may find useful. Not all keys are supported, but all the framework and placeholders have been added.