Sharing dev environments with Oh-My-Vagrant

With Oh-My-Vagrant (omv) you can set up a dev environment in seconds. (Read the omv introduction if you’ve never used it before!) Since everything is defined in a single omv.yaml file, it is easy to share your cluster prototype with a friend! The one missing feature was associating code with this config file. This is now possible! Let me show you how it works…

In the omv.yaml file there is an extern variable. It is a list of each external repository which you’d like to include. Each element in this list is a hash of key value pairs. Currently four are supported: type, system, repository, directory.

An example will help you visualize this:

---
:domain: example.com
:network: 192.168.123.0/24
:image: fedora-21
:extern:
- type: git
  system: ansible
  repository: https://github.com/eparis/kubernetes-ansible
  directory: kubernetes
:reallyrm: true

In this example, we list one external repository. It is of type git, it is intended for use with the ansible integration provided by omv, the repository is hosted by eparis, and we’ll store this in a local directory called kubernetes.

We currently only support git repositories, but patches for other systems are welcome. A few different “systems” are supported, including puppet, docker and kubernetes. They each integrate with omv, and, as a result can pull code and modules into the appropriate places. Any repository path that is valid is acceptable (including local file paths) and lastly, the directory you choose is entirely up to you!

The most important part that I need to mention is the reallyrm variable. If this is set to true, and you remove a git repository from the list, omv will make sure that it removes it! Since some users might not expect this behaviour, it defaults to false, and shouldn’t bite you! If you’re not comfortable with it, don’t use it! I find it incredibly helpful.

Here’s a small screencast to show you some examples of this in action:

oh-my-vagrant-extern-screencast.ogv

I hope you enjoyed this. Please share and enjoy, and I’ll be back soon to explain some more of the features! Documentation patches are appreciated!

Happy hacking,

James

Fancy git aliases and git cherryfetch

Here are two quick git tricks that I’ve added to my toolbox lately…

I wanted to create a git alias that takes in argv from the command, but in the middle of the command. Here’s the hack that I came up with for the [alias] section of my ~/.gitconfig:

[alias]

    # cherryfetch fetches a repo ($1) / branch ($2) and applies it rebased!
    # the && true at the end eats up the appended args
    cherryfetch = !git fetch "$1" "$2" && git cherry-pick HEAD..FETCH_HEAD && true

Explanation:

The git alias is called “cherryfetch“. It starts with an exclamation point (!) which means it’s a command to execute. It uses $1 and $2 as arguments, which thankfully exist in this environment! Because running a git alias causes the arguments to get appended to the command, the && true part exists at the end to receive those commands so that they are silently ignored! This works because the command:

$ true ignored command line arguments

Does absolutely nothing but return success. Better naming suggestions for my alias are welcome!

What does this particular alias do?

Too often I receive a patch which has not been rebased against *my* current git master. This happens because the patch author started hacking on the feature against current git master, but git master grew new commits before the patch was done. The best solution is to ask the author to rebase against git master and resubmit, but when I’m feeling helpful, I can now use git cherryfetch to do the same thing! When their patch is unrelated to my changes, this works perfectly.

Example:

$ git cherryfetch https://github.com/example_user/puppet-gluster.git feat/branch_name
From https://github.com/example_user/puppet-gluster
* branch            feat/branch_name -> FETCH_HEAD
[master abcdef01] Pi day is awesome, but Tau day is better! purpleidea/puppet-gluster#feat/branch_name
Author: Example User <user@example.com>
Date: Tue Mar 14 09:26:53 2015 -0400
4 file changed, 42 insertions(+), 13 deletions(-)

Manual learning:

If you were to do the same thing manually, this operation is the equivalent of checking out a branch at the same commit this patch applies to, pulling the patch down (causing a fast-forward) running a git rebase master, switching to master and merging your newly rebased branch in.

My version is faster, isn’t it?

Happy hacking!

James

Update: I’ve updated the alias so that it works with N commits. Thanks to David Schmitt for pointing it out, and Felix Frank for finding a nicer solution!

Building RHEL Vagrant Boxes with Vagrant-Builder

Vagrant is a great tool for development, but Red Hat Enterprise Linux (RHEL) customers have typically been left out, because it has been impossible to get RHEL boxes! It would be extremely elegant if hackers could quickly test and prototype their code on the same OS as they’re running in production.

Secondly, when hacking on projects that have a long initial setup phase (eg: a long rpm install) it would be excellent if hackers could roll their own modified base boxes, so that certain common operations could be re-factored out into the base image.

This all changes today.

Please continue reading if you’d like to know how :)

Subscriptions:

In order to use RHEL, you first need a subscription. If you don’t already have one, go sign up… I’ll wait. You do have to pay money, but in return, you’re funding my salary (and many others) so that we can build you lots of great hacks.

Prerequisites:

I’ll be working through this whole process on a Fedora 21 laptop. It should probably work on different OS versions and flavours, but I haven’t tested it. Please test, and let me know your results! You’ll also need virt-install and virt-builder installed:

$ sudo yum install -y /usr/bin/virt-{install,builder}

Step one:

Login to https://access.redhat.com/ and check that you have a valid subscription available. This should look like this:

A view of my available subscriptions.

A view of my available subscriptions.

If everything looks good, you’ll need to download an ISO image of RHEL. First head to the downloads section and find the RHEL product:

A view of my available product downloads.

A view of my available product downloads.

In the RHEL download section, you’ll find a number of variants. You want the RHEL 7.0 Binary DVD:

A view of the available RHEL downloads.

A view of the available RHEL downloads.

After it has finished downloading, verify the SHA-256 hash is correct, and continue to step two!

$ sha256sum rhel-server-7.0-x86_64-dvd.iso
85a9fedc2bf0fc825cc7817056aa00b3ea87d7e111e0cf8de77d3ba643f8646c  rhel-server-7.0-x86_64-dvd.iso

Step two:

Grab a copy of vagrant-builder:

$ git clone https://github.com/purpleidea/vagrant-builder
Cloning into 'vagrant-builder'...
[...]
Checking connectivity... done.

I’m pleased to announce that it now has some documentation! (Patches are welcome to improve it!)

Since we’re going to use it to build RHEL images, you’ll need to put your subscription manager credentials in ~/.vagrant-builder/auth.sh:

$ cat ~/.vagrant-builder/auth.sh
# these values are used by vagrant-builder
USERNAME='purpleidea@redhat.com' # replace with your access.redhat.com username
PASSWORD='hunter2'               # replace with your access.redhat.com password

This is a simple shell script that gets sourced, so you could instead replace the static values with a script that calls out to the GNOME Keyring. This is left as an exercise to the reader.

To build the image, we’ll be working in the v7/ directory. This directory supports common OS families and versions that have high commonality, and this includes Fedora 20, Fedora 21, CentOS 7.0, and RHEL 7.0.

Put the downloaded RHEL ISO in the iso/ directory. To allow qemu to see this file, you’ll need to add some acl’s:

$ sudo -s # do this as root
$ cd /home/
$ getfacl james # james is my home directory
# file: james
# owner: james
# group: james
user::rwx
group::---
other::---
$ setfacl -m u:qemu:r-x james # this is the important line
$ getfacl james
# file: james
# owner: james
# group: james
user::rwx
user:qemu:r-x
group::---
mask::r-x
other::---

If you have an unusually small /tmp directory, it might also be an issue. You’ll need at least 6GiB free, but a bit extra is a good idea. Check your free space first:

$ df -h /tmp
Filesystem Size Used Avail Use% Mounted on
tmpfs 1.9G 1.3M 1.9G 1% /tmp

Let’s increase this a bit:

$ sudo mount -o remount,size=8G /tmp
$ df -h /tmp
Filesystem Size Used Avail Use% Mounted on
tmpfs 8.0G 1.3M 8.0G 1% /tmp

You’re now ready to build an image…

Step three:

In the versions/ directory, you’ll see that I have provided a rhel-7.0-iso.sh script. You’ll need to run it from its parent directory. This will take a while, and will cause two sudo prompts, which are required for virt-install. One downside to this process is that your https://access.redhat.com/ password will be briefly shown in the virt-builder output. Patches to fix this are welcome!

$ pwd
/home/james/code/vagrant-builder/v7
$ time ./versions/rhel-7.0-iso.sh
[...]
real    38m49.777s
user    13m20.910s
sys     1m13.832s
$ echo $?
0

With any luck, this should eventually complete successfully. This uses your cpu’s virtualization instructions, so if they’re not enabled, this will be a lot slower. It also uses the network, which in North America, means you’re in for a wait. Lastly, the xz compression utility will use a bunch of cpu building the virt-builder base image. On my laptop, this whole process took about 30 minutes. The above run was done without an SSD and took a bit longer.

The good news is that most of hard work is now done and won’t need to be repeated! If you want to see the fruits of your CPU labour, have a look in: ~/tmp/builder/rhel-7.0-iso/.

$ ls -lAhGs
total 4.1G
1.7G -rw-r--r--. 1 james 1.7G Feb 23 18:48 box.img
1.7G -rw-r--r--. 1 james  41G Feb 23 18:48 builder.img
 12K -rw-rw-r--. 1 james  10K Feb 23 18:11 docker.tar
4.0K -rw-rw-r--. 1 james  388 Feb 23 18:39 index
4.0K -rw-rw-r--. 1 james   64 Feb 23 18:11 metadata.json
652M -rw-rw-r--. 1 james 652M Feb 23 18:50 rhel-7.0-iso.box
200M -rw-r--r--. 1 james 200M Feb 23 18:28 rhel-7.0.xz

As you can see, we’ve produced a bunch of files. The rhel-7.0-iso.box is your RHEL 7.0 vagrant base box! Congratulations!

Step four:

If you haven’t ever installed vagrant, you’ll pleased to know that as of last week, vagrant and vagrant-libvirt RPM’s have hit Fedora 21! I started trying to convince the RPM wizards about a year ago, and we finally have something that is quite usable! Hopefully we’ll iterate on any packaging bugs, and keep this great work going! There are now only three things you need to do to get a working vagrant-libvirt setup on Fedora 21:

  1. $ yum install -y vagrant-libvirt
  2. Source this .bashrc add-on from: https://gist.github.com/purpleidea/8071962
  3. Add a vagrant.pkla file as mentioned here

Now that we’re now in well-known vagrant territory. Adding the box into vagrant is a simple:

$ vagrant box add rhel-7.0-iso.box --name rhel-7.0

Using the box effectively:

Having a base box is great, but having to manage subscription-manager manually isn’t fun in a DevOps environment. Enter Oh-My-Vagrant (omv). You can use omv to automatically register and unregister boxes! Edit the omv.yaml file so that the image variable refers to the base box you just built, enter your https://access.redhat.com/ username and password, and vagrant up away!

$ cat omv.yaml 
---
:domain: example.com
:network: 192.168.123.0/24
:image: rhel-7.0
:boxurlprefix: ''
:sync: rsync
:folder: ''
:extern: []
:puppet: false
:classes: []
:docker: false
:cachier: false
:vms: []
:namespace: omv
:count: 2
:username: 'purpleidea@redhat.com'
:password: 'hunter2'
:poolid: true
:repos: []
$ vs
Current machine states:

omv1                      not created (libvirt)
omv2                      not created (libvirt)

This environment represents multiple VMs. The VMs are all listed
above with their current state. For more information about a specific
VM, run `vagrant status NAME`.

You might want to set repos to be:

['rhel-7-server-rpms', 'rhel-7-server-extras-rpms']

but it depends on what subscriptions you want or have available. If you’d like to store your credentials in an external file, you can do so like this:

$ cat ~/.config/oh-my-vagrant/auth.yaml
---
:username: purpleidea@redhat.com
:password: hunter2

Here’s an actual run to see the subscription-manager action:

$ vup omv1
[...]
==> omv1: The system has been registered with ID: 00112233-4455-6677-8899-aabbccddeeff
==> omv1: Installed Product Current Status:
==> omv1: Product Name: Red Hat Enterprise Linux Server
==> omv1: Status:       Subscribed
$ # the above lines shows that your machine has been registered
$ vscreen root@omv1
[root@omv1 ~]# echo thanks purpleidea!
thanks purpleidea!
[root@omv1 ~]# exit

Make sure to unregister when you are permanently done with a machine, otherwise your subscriptions will be left idle. This happens automatically on vagrant destroy when using Oh-My-Vagrant:

$ vdestroy omv1 # make sure to unregister when you are done
Unlocking shell provisioning for: omv1...
Running 'subscription-manager unregister' on: omv1...
Connection to 192.168.121.26 closed.
System has been unregistered.
==> omv1: Removing domain...

Idempotence:

One interesting aspect of this build process, is that it’s mostly idempotent. It’s able to do this, because it uses GNU Make to ensure that only out of date steps or missing targets are run. As a result, if the build process fails part way through, you’ll only have to repeat the failed steps! This speeds up debugging and iterative development enormously!

To prove this to you, here is what a second run looks like (after the first successful run):

$ time ./versions/rhel-7.0-iso.sh 

real    0m0.029s
user    0m0.013s
sys    0m0.017s

As you can see it completes almost instantly.

Variants:

To build a variant of the base image that we just built, create a versions/*.sh file, and modify the variables to add your changes in. If you start with a copy of the ~/tmp/builder/${VERSION}-${POSTFIX} folder, then you shouldn’t need to repeat the initial steps. Hint: btrfs is excellent at reflinking data, so you don’t unnecessarily store multiple copies!

Plumbing Pipeline:

What actually happens behind the scenes? Most of the magic happens in the Makefile. The relevant series of transforms is as follows:

  1. virt-install: install from iso
  2. virt-sysprep: remove unneeded junk
  3. virt-sparsify: make sparse
  4. xz –best: compress into builder format
  5. virt-builder: use builder to bake vagrant box
  6. qemu-img convert: convert to correct format
  7. tar -cvz: tar up into vagrant box format

There are some intermediate dependency steps that I didn’t mention, so feel free to explore the source.

Future work:

  • Some of the above steps in the pipeline are actually bundled under the same target. It’s not a huge issue, but it could be changed if someone feels strongly about it.
  • Virt-builder can’t run docker commands during build. This would be very useful for pre-populating images with docker containers.
  • Oh-My-Vagrant, needs to have its DNS management switched to use vagrant-hostmanager instead of puppet resource commands.

Disclaimers:

While I expect you’ll love using these RHEL base boxes with Vagrant, the above builder methodology is currently not officially supported, and I can’t guarantee that the RHEL vagrant dev environments will be either. I’m putting this out there for the early (DevOps) adopters who want to hack on this and who didn’t want to invent their own build tool chain. If you do have issues, please leave a comment here, or submit a vagrant-builder issue.

Thanks:

Special thanks to Richard WM Jones and Pino Toscano for their great work on virt-builder that this is based on. Additional thanks to Randy Barlow for encouraging me to work on this. Thanks to Red Hat for continuing to pay my salary :)

Subscriptions?

If I’ve convinced you that you want some RHEL subscriptions, please go have a look, and please let Red Hat know that you appreciated this post and my work.

Happy Hacking!

James

UPDATE: I’ve tested that this process also works with the new RHEL 7.1 release!

Introducing: Silent counter

You might want to write code that can tell how many iterations have passed since some action occurred. Alternatively, you might want to know if it’s the first time a machine has run Puppet. To do these types of things, you might wish to have a monotonically increasing counter in your Puppet manifest. Since one did not exist, I set out to build one!

The code:

If you just want to try the code, and skip the ramble, you can include common::counter into your manifest. The entire class is part of my puppet-common module:

git clone https://github.com/purpleidea/puppet-common

Usage:

Usage notes are hardly even necessary. Here is how the code is commonly used:


include ::common::counter    # that's it!

# NOTE: we only see the notify message. no other exec/change is shown!
notify { 'counter':
        message => "Value is: ${::common_counter_simple}",
}

Now use the fact anywhere you need a counter!

Increasing a variable:

At the heart of any counter, you’ll need to have a value store, a way to read the value, and a way to increment the value. For simplicity, the value store we’ll use will be a file on disk. This file is located by default at ${vardir}/common/counter/simple. To read the value, we use a puppet fact. The fact has a key name of $::common_counter_simple. To increment the value, a simple python script is used.

Noise:

To cause an increment of the value on each puppet run, an exec would have to be used. The downside of this is that this causes noise in your puppet logs, even if nothing else is happening! This noise is unwanted, so we work around this with the following code:


exec { 'counter':
        # echo an error message and be false, if the incrementing fails
        command => '/bin/echo "common::counter failed" && /bin/false',
        unless => "${vardir}/counter/increment.py '${vardir}/counter/simple'",
        logoutput => on_failure,
}

As you can see, we cause the run to happen in the silent “unless” part of the exec, and we don’t actually allow any exec to occur unless there is an error running the increment.py!

Complex example:

If you want to do something more complicated using this technique, you might want to write something like this:


$max = 8
exec { "/bin/echo this is run #: ${::common_counter_simple}":
        logoutput => on_failure,
        onlyif => [
                "/usr/bin/test ${::common_counter_simple} -lt ${max}",
                "/usr/bin/test ${::common_counter_simple} -gt 0",
        ],
    #notify => ...,    # do some action
}

Side effects:

Make sure not to use the counter fact in a $name or somewhere that would cause frequent unwanted catalog changes, as it could cause a lot of changes each run.

Module pairings:

You might want to pair this module with my “Finite State Machine” concept, or my Exec[‘again’] module. If you come up with some cool use cases, please let me know!

Future work:

If you’d like to extend this work, two features come to mind:

  1. Individual named counters. If for some reason you want more than one counter, named counters could be built.
  2. Reset functionality. If you’d like to reset a counter to zero (or to some other value) then there should be a special type you could create which causes this to happen.

If you’d like to work on either of these features, please let me know, or send me a patch!

Happy hacking!

James

Replying to mailing lists with Evolution

I use the Evolution mail client. It does have a few annoying bugs, but it has a plethora of great features too! Hopefully this post will inspire you to help hack on this piece of software and fix the bugs!

Mailing list etiquette:

When replying to mailing lists, it’s typically very friendly to include the email address of the person you’re replying to in the to or cc fields along with the mailing list address. This lets that person know that someone has answered their question. In some cases, if they’re not subscribed to that mailing list, (if you don’t do this), then they might not see your reply at all.

To enable this feature, there is a check box inside of the Evolution mail preferences. It is labelled: “Ignore Reply-To: for mailing lists“.

You can find this option in the Evolution "Composer Preferences" tab, under the "Replies and Forwards" heading.

You can find this option in the Evolution “Composer Preferences” tab, under the “Replies and Forwards” heading.

This works, because by default, most mailing lists set the “Reply-To:” address to be that of the mailing list. In this case, when you click “Group Reply” (“Reply to all”) in your MUA, then that field will be ignored, and the correct recipients will be selected in your composer window.

If instead you simply click “Reply”, then you will be prompted to choose the kind of reply you’d like to send.

evolution-send-private-reply

Doesn’t this annoy users?

No, this actually gives the recipients more choice! If they’d prefer not to see your reply in their inbox, they can set up a filter so that mail that includes the mailing list address goes to a special folder. If they prefer to see your reply in their inbox, then they can configure their filters so that mail that comes exclusively from the mailing list address goes to a specific folder.

Instead of choosing the "contains" (in_array) operator, you could have chosen "is" (equals).

Instead of choosing the “contains” (in_array) operator, you could have chosen “is” (equals).

Won’t this cause duplicate messages being sent to the user?

Again, that’s up to the user. Most mailing lists allow you to configure this setting.

In this particular example, it is a very low-volume list, therefore I don't filter messages into a separate folder, they go to my inbox, so there's no need to get two copies.

This particular example is of a very low-volume list, therefore I don’t filter messages into a separate folder; they go to my inbox, so there’s no need to get two copies.

Ultimately, Evolution is a great MUA, which has the best message composer available. Some might prefer mutt+vim, but for my money, I’ll stick with Evolution for now.

Happy hacking,

James

PS: If you hack on Evolution, and write a good feature that I like, or fix a bug that affects me, I’m happy to feature you on this blog and tweet about you as soon as your code hits git master! </free promotion for good hackers!>

Captive web portals are considered harmful

Recently, when I tried to access http://slashdot.org/ in Firefox, I would see my browser title bar flash briefly to “AT&T GUI”, and then I would get redirected to: http://slashdot.org/cgi-bin/redirect.ha which returns slashdot’s custom error 404 page! What is going on? (Read on for answer…)

  • Did slashdot mess up their mod_rewrite config?
    (Nope, works fine in a different browser…)
  • Did my HTTPS everywhere extension go crazy?
    (Nope, still broken when disabled…)
  • Are my HTTP requests being MITM-ed?
    (Yes, probably by the NSA, but they wouldn’t make this kind of mistake…)
  • Is my computer p0wned?
    (I use GNU/Linux, so probably not…)

A keyword search will show you that others are also affected by this, except that the base domain (slashdot.org) is usually different… One thing that all the links I viewed have in common: none of them seem to know what’s happening.

Some background:

Recently, I used my laptop with a public WIFI access point. The router behind these access points usually performs a MITM redirection on your HTTP traffic to send you to a captive web portal which you’ll need to use before being authorized to route out to the public internet.

After connecting to the wireless SSID, whichever site you visit next will get replaced with the portal. This typically can’t be an HTTPS url, because they aren’t easily MITM-ed without causing a certificate error.

On my Firefox new tab page, the only non-HTTPS site that I visit is http://slashdot.org/ and as a result, I’ll click this link when I know I’m expecting a portal… (Seriously slashdot, wtf!)

What’s happening?

When I visited http://slashdot.org/ on public WIFI, the captive portal web page got permanently cached in my browser, and now every time I attempt to visit slashdot, I actually get the cached, MITM-ed, portal version.

How to fix this?

Actually it’s very simple: just clear your browser cache. You don’t need to delete your cookies or your history. Choose the “Clear Now” button in Firefox. Example:

clear-cache

Whose fault is this?

  • The AT&T portal programmers for allowing a portal page to be cached.
  • Any website that doesn’t require HTTPS (and lets themselves get MITM-ed).
  • Firefox for not protecting against this (other browsers are affected too!)
  • Public WIFI services for using captive portals (just free the internet already!)

Is there any good news?

It’s easily fixed, and there didn’t seem to be any malicious code in the cached web portal redirector. It turned out to only include a META refresh. Phew :)

Hope this provides an authoritative answer for everyone who is experiencing this problem!

Happy hacking!

James

The switch as an ordinary GNU/Linux server

The fact that we manage the switches in our data centres differently than any other server is patently absurd, but we do so because we want to harness the power of a tiny bit of silicon which happens to be able to dramatically speed up the switching bandwidth.

absurd

beware of proprietary silicon, it’s absurd!

That tiny bit of silicon is known as an ASIC, or an application specific integrated circuit, and one particularly well performing ASIC (which is present in many commercially available switches) is called the Trident.

None of this should impact the end-user management experience, however, because the big switch companies and chip makers believe that there is some special differentiation in their IP, they’ve ensured that the stacks and software surrounding the hardware is highly proprietary and difficult to replace. This also lets them create and sell bundled products and features that you don’t want, but which you can’t get elsewhere.

This is still true today! System and network engineers know too well the hassles of dealing with the different proprietary switch operating systems and interfaces. Why not standardize on the well-known interface that every GNU/Linux server uses.

We’re talking about iptables of course! (Although nftables would be an acceptable standard too!) This way we could have a common interface for all the networked devices in our server room.

I’ve been able to work around this limitation in the past, by using Linux to do my routing in software, and by building the routers out of COTS 2U GNU/Linux boxes. The trouble with this approach, is that they’re bigger, louder, more expensive, consume more power, and don’t have the port density that a 48 port 1U switch does.

a 48 port, 1U switch

a 48 port, 1U switch

It turns out that there is a company which is actually trying to build this mythical box. It is not perfect, but I think they are on the right track. What follows are my opinions of what they’ve done right, what’s wrong, and what I’d like to see in the future.

Who are they?

They are Cumulus Networks, and I recently got to meet, demo and discuss with one of their very talented engineers, Leslie Carr. I recently attended a talk that she gave on this very same subject. She gave me a rocket turtle. (Yes, this now makes me biased!)

my rocket turtle, the cumulus networks mascot

my rocket turtle, the cumulus networks mascot

What are they doing?

You buy an existing switch from your favourite vendor. You then throw out (flash over) the included software, and instead, pay them a yearly licensing fee to use the “Cumulus” GNU/Linux. It comes as an OS image, based off of Debian.

How does it talk to the ASIC?

The OS image comes with a daemon called switchd that transfers the kernel iptables rules down into the ASIC. I don’t know the specifics on how this works exactly because:

  1. Switchd is proprietary. Apparently this is because of a scary NDA they had to sign, but it’s still unfortunate, and it is impeding my hacking.
  2. I’m not an expert on talking to ASIC’s. I’m guessing that unless you’ve signed the NDA’s, and you’re behind the Trident paywall, then it’s tough to become one!

Problems with packaging:

The OS is only distributed as a single image. This is an unfortunate mistake! It should be available from the upstream project with switchd (and any other add-ons) as individual .deb packages. That way, I know I’m getting a stock OS which is preferably even built and signed by the Debian team! That way I could use the same infrastructure for my servers to keep all my servers up to date.

Problems with OS security:

Unfortunately the OS doesn’t benefit from any of the standard OS security enhancements like SELinux. I’d prefer running a more advanced distro like RHEL or CentOS that have these things out of the box, but if Cumulus will continue using Debian, then they must include some more advanced security measures. I didn’t find AppArmor or grsecurity in use either. It did seem to have all the important bash security updates:

cumulus@switch1$ bash --version
GNU bash, version 4.2.37(1)-release (powerpc-unknown-linux-gnu)
Copyright (C) 2011 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later 

This is free software; you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Use of udev:

This switch does seem to support and use udev, although not being a udev expert I can’t comment on if it’s done properly or not. I’d be interested to hear from the pros. Here’s what I found:

cumulus@switch1$ cd /etc/udev/
cumulus@switch1$ tree
.
`-- rules.d
    |-- 10-cumulus.rules
    |-- 60-bridge-network-interface.rules -> /dev/null
    |-- 75-persistent-net-generator.rules -> /dev/null
    `-- 80-networking.rules -> /dev/null

1 directory, 4 files
cumulus@switch1$ cat rules.d/10-cumulus.rules | tail -n 7
# udev rules specific to Cumulus Linux

# Rule called when the linux-user-bde driver is loaded
ACTION=="add" SUBSYSTEM=="module" DEVPATH=="/module/linux_user_bde" ENV{DEVICE_NAME}="linux-user-bde" ENV{DEVICE_TYPE}="c" ENV{DEVICE_MINOR}="0" RUN="/usr/lib/cumulus/udev-module"

# Quanta LY8 uses RTC1
KERNEL=="rtc1", PROGRAM="/usr/bin/platform-detect", RESULT=="quanta,ly8_rangeley", SYMLINK+="rtc"

Other things:

There seems to be a number of extra things running on the switch. Here’s what I mean:

cumulus@switch1$ ps auxwww 
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.0   2516   860 ?        Ss   Nov05   0:02 init [3]  
root         2  0.0  0.0      0     0 ?        S    Nov05   0:00 [kthreadd]
root         3  0.0  0.0      0     0 ?        S    Nov05   0:05 [ksoftirqd/0]
root         5  0.0  0.0      0     0 ?        S    Nov05   0:00 [kworker/u:0]
root         6  0.0  0.0      0     0 ?        S    Nov05   0:00 [migration/0]
root         7  0.0  0.0      0     0 ?        S<   Nov05   0:00 [cpuset]
root         8  0.0  0.0      0     0 ?        S<   Nov05   0:00 [khelper]
root         9  0.0  0.0      0     0 ?        S<   Nov05   0:00 [netns]
root        10  0.0  0.0      0     0 ?        S    Nov05   0:02 [sync_supers]
root        11  0.0  0.0      0     0 ?        S    Nov05   0:00 [bdi-default]
root        12  0.0  0.0      0     0 ?        S<   Nov05   0:00 [kblockd]
root        13  0.0  0.0      0     0 ?        S<   Nov05   0:00 [ata_sff]
root        14  0.0  0.0      0     0 ?        S    Nov05   0:00 [khubd]
root        15  0.0  0.0      0     0 ?        S<   Nov05   0:00 [rpciod]
root        17  0.0  0.0      0     0 ?        S    Nov05   0:00 [khungtaskd]
root        18  0.0  0.0      0     0 ?        S    Nov05   0:00 [kswapd0]
root        19  0.0  0.0      0     0 ?        S    Nov05   0:00 [fsnotify_mark]
root        20  0.0  0.0      0     0 ?        S<   Nov05   0:00 [nfsiod]
root        21  0.0  0.0      0     0 ?        S<   Nov05   0:00 [crypto]
root        34  0.0  0.0      0     0 ?        S    Nov05   0:00 [scsi_eh_0]
root        36  0.0  0.0      0     0 ?        S    Nov05   0:00 [kworker/u:2]
root        41  0.0  0.0      0     0 ?        S    Nov05   0:00 [mtdblock0]
root        42  0.0  0.0      0     0 ?        S    Nov05   0:00 [mtdblock1]
root        43  0.0  0.0      0     0 ?        S    Nov05   0:00 [mtdblock2]
root        44  0.0  0.0      0     0 ?        S    Nov05   0:00 [mtdblock3]
root        49  0.0  0.0      0     0 ?        S    Nov05   0:00 [mtdblock4]
root       362  0.0  0.0      0     0 ?        S    Nov05   0:54 [hwmon0]
root       363  0.0  0.0      0     0 ?        S    Nov05   1:01 [hwmon1]
root       398  0.0  0.0      0     0 ?        S    Nov05   0:04 [flush-8:0]
root       862  0.0  0.1  28680  1768 ?        Sl   Nov05   0:03 /usr/sbin/rsyslogd -c4
root      1041  0.0  0.2   5108  2420 ?        Ss   Nov05   0:00 /sbin/dhclient -pf /run/dhclient.eth0.pid -lf /var/lib/dhcp/dhclient.eth0.leases eth0
root      1096  0.0  0.1   3344  1552 ?        S    Nov05   0:16 /bin/bash /usr/bin/arp_refresh
root      1188  0.0  1.1  15456 10676 ?        S    Nov05   1:36 /usr/bin/python /usr/sbin/ledmgrd
root      1218  0.2  1.1  15468 10708 ?        S    Nov05   5:41 /usr/bin/python /usr/sbin/pwmd
root      1248  1.4  1.1  15480 10728 ?        S    Nov05  39:45 /usr/bin/python /usr/sbin/smond
root      1289  0.0  0.0  13832   964 ?        SNov05   0:00 /sbin/auditd
root      1291  0.0  0.0  10456   852 ?        S
root     12776  0.0  0.0      0     0 ?        S    04:31   0:01 [kworker/0:0]
root     13606  0.0  0.0      0     0 ?        S    05:05   0:00 [kworker/0:2]
root     13892  0.0  0.0      0     0 ?        S    05:13   0:00 [kworker/0:1]
root     13999  0.0  0.0   2028   512 ?        S    05:16   0:00 sleep 30
cumulus  14016  0.0  0.1   3324  1128 pts/0    R+   05:17   0:00 ps auxwww
root     30713 15.6  2.4  69324 24176 ?        Ssl  Nov05 304:16 /usr/sbin/switchd -d
root     30952  0.0  0.3  11196  3500 ?        Ss   Nov05   0:00 sshd: cumulus [priv]
cumulus  30954  0.0  0.1  11196  1684 ?        S    Nov05   0:00 sshd: cumulus@pts/0
cumulus  30955  0.0  0.2   4548  2888 pts/0    Ss   Nov05   0:01 -bash

In particular, I’m referring to ledmgrd, pwmd, smond and others. I don’t doubt these things are necessary and useful, in fact, they’re written in python and should be easy to reverse if anyone is interested, but if they’re a useful part of a switch capable operating system, I hope that they grow proper upstream projects and get appropriate documentation, licensing, and packaging too!

Switch ports are network devices:

Hacking on the device couldn’t feel more native. Anyone remember how to enumerate the switch ports on IOS? … Who cares! Try the standard iproute2 tools on a Cumulus box:

cumulus@switch1$ ip a s swp42
44: swp42: <broadcast,multicast> mtu 1500 qdisc noop state DOWN qlen 500
    link/ether 08:9e:01:f8:96:74 brd ff:ff:ff:ff:ff:ff
cumulus@switch1$ ip a | grep swp | wc -l
52

What about ifup/ifdown?

This one is a bit different:

cumulus@switch1$ file /sbin/ifup
/sbin/ifup: symbolic link to `/sbin/ifupdown'
cumulus@switch1$ file /sbin/ifupdown 
/sbin/ifupdown: Python script, ASCII text executable

The Cumulus team encountered issues with the traditional ifup/ifdown tools found in a stock distro. So they replaced them, with shiny python versions:

https://github.com/CumulusNetworks/ifupdown2

I hope that this project either gets into the upstream distro, or that some upstream writes tools that address the limitations in the various messes of shell scripts. I’m optimistic about networkd being the solution here, but until that’s fully baked, the Cumulus team has built a nice workaround. Additionally, until the Debian team finalizes on the proper technical decision to use SystemD, it has a bleak future.

Kernel:

All the kernel hackers out there will want to know what’s under the hood:

cumulus@switch1$ uname -a
Linux leaf1 3.2.46-1+deb7u1+cl2.2+1 #3.2.46-1+deb7u1+cl2.2+1 SMP Thu Oct 16 14:28:31 PDT 2014 ppc powerpc GNU/Linux

Because this is an embedded chip found in a 1U box, and not an Xeon processor, it’s noticeably slower than a traditional server. This is of course (non-sarcastically) exactly what I want. For admin tasks, it has plenty of power, and this trade-off means it has lower power consumption and heat production than a stock server. While debugging some puppet code that I was running takes longer than normal on this box, I was eventually able to get the job done. This is another reason why this box needs to act like more of an upstream distro — if it did, I’d be able to have a faster machine as my dev box!

Other tools:

Other stock tools like ethtool, and brctl, work out of the box. Bonding, vlan’s and every other feature I tested seems to work the same way you expect from a GNU/Linux system.

Puppet and automation:

Readers of my blog will know that I manage my servers with Puppet. Instead of having the puppet agent connect over an API to the switch, you can directly install and run puppet on this Cumulus Linux machine! Some users might quickly jump to using the firewall module as the solution for consistent management, but a level two user will know that a higher level wrapper around shorewall is the better approach. This is all possible with this switch and seems to work great! The downside was that I had to manually add repositories to get the shorewall packages because it is not a stock distro :(

Why not SDN?

SDN or software-defined networking, is a fantastic and interesting technology. Unfortunately, it’s a parallel problem to what I’m describing in this article, and not a solution to help you work around the lack of a good GNU+Linux switch. If programming the ASIC’s wasn’t an NDA requiring activity, I bet we’d see a lot more innovative networking companies and technologies pop up.

Future?

This product isn’t quite baked yet for me to want to use it in production, but it’s so tantalizingly close that it’s strongly worth considering. I hope that they react positively to my suggestions and create an even more native, upstream environment. Once this is done, it will be my go to, for all my switching!

Thanks:

Thanks very much to the Cumulus team for showing me their software, and giving me access to demo it on some live switches. I didn’t test performance, but I have no doubt that it competes with the market average. Prove me right by trying it out yourself!

Thanks for listening, and Happy hacking!

James

PS: Special thanks to David Caplan for the great networking discussions we had!