Building RHEL Vagrant Boxes with Vagrant-Builder

Vagrant is a great tool for development, but Red Hat Enterprise Linux (RHEL) customers have typically been left out, because it has been impossible to get RHEL boxes! It would be extremely elegant if hackers could quickly test and prototype their code on the same OS as they’re running in production.

Secondly, when hacking on projects that have a long initial setup phase (e.g. a long rpm install), it would be excellent if hackers could roll their own modified base boxes, so that certain common operations could be refactored out into the base image.

This all changes today.

Please continue reading if you’d like to know how :)

Subscriptions:

In order to use RHEL, you first need a subscription. If you don’t already have one, go sign up… I’ll wait. You do have to pay money, but in return, you’re funding my salary (and many others) so that we can build you lots of great hacks.

Prerequisites:

I’ll be working through this whole process on a Fedora 21 laptop. It should probably work on different OS versions and flavours, but I haven’t tested it. Please test, and let me know your results! You’ll also need virt-install and virt-builder installed:

$ sudo yum install -y /usr/bin/virt-{install,builder}

Step one:

Log in to https://access.redhat.com/ and check that you have a valid subscription available. It should look like this:

A view of my available subscriptions.

If everything looks good, you’ll need to download an ISO image of RHEL. First head to the downloads section and find the RHEL product:

A view of my available product downloads.

In the RHEL download section, you’ll find a number of variants. You want the RHEL 7.0 Binary DVD:

A view of the available RHEL downloads.

After it has finished downloading, verify the SHA-256 hash is correct, and continue to step two!

$ sha256sum rhel-server-7.0-x86_64-dvd.iso
85a9fedc2bf0fc825cc7817056aa00b3ea87d7e111e0cf8de77d3ba643f8646c  rhel-server-7.0-x86_64-dvd.iso
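
If you’d rather script the check than compare hashes by eye, here’s a minimal sketch. The function name verify_sha256 is made up for illustration; the actual comparison is done by sha256sum’s own check mode.

```shell
# verify_sha256 FILE EXPECTED_HASH
# Succeeds if FILE's SHA-256 matches EXPECTED_HASH, fails otherwise.
# (hypothetical helper name; sha256sum -c does the real work)
verify_sha256() {
    local file="$1" expected="$2"
    # sha256sum -c reads "HASH  FILENAME" lines (note the two spaces);
    # --status suppresses the OK/FAILED output and only sets the exit code
    printf '%s  %s\n' "$expected" "$file" | sha256sum -c --status -
}

# usage with the values from this post:
# verify_sha256 rhel-server-7.0-x86_64-dvd.iso \
#     85a9fedc2bf0fc825cc7817056aa00b3ea87d7e111e0cf8de77d3ba643f8646c \
#     && echo 'ISO looks good' || echo 'hash mismatch, download it again'
```

This way a bad download stops your script instead of relying on you spotting a single wrong character.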

Step two:

Grab a copy of vagrant-builder:

$ git clone https://github.com/purpleidea/vagrant-builder
Cloning into 'vagrant-builder'...
[...]
Checking connectivity... done.

I’m pleased to announce that it now has some documentation! (Patches are welcome to improve it!)

Since we’re going to use it to build RHEL images, you’ll need to put your subscription manager credentials in ~/.vagrant-builder/auth.sh:

$ cat ~/.vagrant-builder/auth.sh
# these values are used by vagrant-builder
USERNAME='purpleidea@redhat.com' # replace with your access.redhat.com username
PASSWORD='hunter2'               # replace with your access.redhat.com password

This is a simple shell script that gets sourced, so you could instead replace the static values with a script that calls out to the GNOME Keyring. This is left as an exercise to the reader.
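
For those who want a head start on that exercise, here is one possible sketch. It assumes you have libsecret’s secret-tool installed and have already stored the password under the attributes service and username; those attribute names and the keyring_password helper are my own choices for illustration, not something vagrant-builder requires.

```shell
# hypothetical ~/.vagrant-builder/auth.sh that avoids a plain text password
USERNAME='purpleidea@redhat.com' # replace with your access.redhat.com username

# keyring_password SERVICE USER: print the secret stored under these attributes
# (the attribute names "service" and "username" are assumptions)
keyring_password() {
    secret-tool lookup service "$1" username "$2"
}

if ! PASSWORD="$(keyring_password access.redhat.com "$USERNAME")"; then
    echo 'auth.sh: could not read the password from the keyring' >&2
    PASSWORD=''
fi
```

The secret would need to be stored once beforehand, for example with: secret-tool store --label='RHEL subscription' service access.redhat.com username purpleidea@redhat.com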

To build the image, we’ll be working in the v7/ directory. This directory supports common OS families and versions that have high commonality, and this includes Fedora 20, Fedora 21, CentOS 7.0, and RHEL 7.0.

Put the downloaded RHEL ISO in the iso/ directory. To allow qemu to see this file, you’ll need to add some ACLs:

$ sudo -s # do this as root
$ cd /home/
$ getfacl james # james is my home directory
# file: james
# owner: james
# group: james
user::rwx
group::---
other::---
$ setfacl -m u:qemu:r-x james # this is the important line
$ getfacl james
# file: james
# owner: james
# group: james
user::rwx
user:qemu:r-x
group::---
mask::r-x
other::---

If you have an unusually small /tmp directory, it might also be an issue. You’ll need at least 6GiB free, but a bit extra is a good idea. Check your free space first:

$ df -h /tmp
Filesystem Size Used Avail Use% Mounted on
tmpfs 1.9G 1.3M 1.9G 1% /tmp

Let’s increase this a bit:

$ sudo mount -o remount,size=8G /tmp
$ df -h /tmp
Filesystem Size Used Avail Use% Mounted on
tmpfs 8.0G 1.3M 8.0G 1% /tmp
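
If you’d like a script to make that call for you, a small check along these lines might do. The has_free_gib name is hypothetical, and the 6 in the usage line is just the minimum mentioned above.

```shell
# has_free_gib DIR GIB: succeed if DIR's filesystem has at least GIB GiB available
has_free_gib() {
    local dir="$1" need="$2" avail_kib
    # -P forces one line per filesystem, -k reports sizes in 1024-byte blocks;
    # the fourth column of the second line is the available space
    avail_kib="$(df -Pk "$dir" | awk 'NR==2 { print $4 }')"
    [ "$avail_kib" -ge $(( need * 1024 * 1024 )) ]
}

# usage:
# has_free_gib /tmp 6 || sudo mount -o remount,size=8G /tmp
```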

You’re now ready to build an image…

Step three:

In the versions/ directory, you’ll see that I have provided a rhel-7.0-iso.sh script. You’ll need to run it from its parent directory. This will take a while, and will cause two sudo prompts, which are required for virt-install. One downside to this process is that your https://access.redhat.com/ password will be briefly shown in the virt-builder output. Patches to fix this are welcome!

$ pwd
/home/james/code/vagrant-builder/v7
$ time ./versions/rhel-7.0-iso.sh
[...]
real    38m49.777s
user    13m20.910s
sys     1m13.832s
$ echo $?
0

With any luck, this should eventually complete successfully. This uses your CPU’s virtualization instructions, so if they’re not enabled, this will be a lot slower. It also uses the network, which, in North America, means you’re in for a wait. Lastly, the xz compression utility will use a bunch of CPU building the virt-builder base image. On my laptop, this whole process took about 30 minutes. The above run was done without an SSD and took a bit longer.

The good news is that most of the hard work is now done and won’t need to be repeated! If you want to see the fruits of your CPU labour, have a look in: ~/tmp/builder/rhel-7.0-iso/.

$ ls -lAhGs
total 4.1G
1.7G -rw-r--r--. 1 james 1.7G Feb 23 18:48 box.img
1.7G -rw-r--r--. 1 james  41G Feb 23 18:48 builder.img
 12K -rw-rw-r--. 1 james  10K Feb 23 18:11 docker.tar
4.0K -rw-rw-r--. 1 james  388 Feb 23 18:39 index
4.0K -rw-rw-r--. 1 james   64 Feb 23 18:11 metadata.json
652M -rw-rw-r--. 1 james 652M Feb 23 18:50 rhel-7.0-iso.box
200M -rw-r--r--. 1 james 200M Feb 23 18:28 rhel-7.0.xz

As you can see, we’ve produced a bunch of files. The rhel-7.0-iso.box is your RHEL 7.0 vagrant base box! Congratulations!

Step four:

If you haven’t ever installed vagrant, you’ll be pleased to know that as of last week, vagrant and vagrant-libvirt RPMs have hit Fedora 21! I started trying to convince the RPM wizards about a year ago, and we finally have something that is quite usable! Hopefully we’ll iterate on any packaging bugs, and keep this great work going! There are now only three things you need to do to get a working vagrant-libvirt setup on Fedora 21:

  1. $ yum install -y vagrant-libvirt
  2. Source this .bashrc add-on from: https://gist.github.com/purpleidea/8071962
  3. Add a vagrant.pkla file as mentioned here

We’re now in well-known vagrant territory. Adding the box into vagrant is simple:

$ vagrant box add rhel-7.0-iso.box --name rhel-7.0

Using the box effectively:

Having a base box is great, but having to manage subscription-manager manually isn’t fun in a DevOps environment. Enter Oh-My-Vagrant (omv). You can use omv to automatically register and unregister boxes! Edit the omv.yaml file so that the image variable refers to the base box you just built, enter your https://access.redhat.com/ username and password, and vagrant up away!

$ cat omv.yaml 
---
:domain: example.com
:network: 192.168.123.0/24
:image: rhel-7.0
:boxurlprefix: ''
:sync: rsync
:folder: ''
:extern: []
:puppet: false
:classes: []
:docker: false
:cachier: false
:vms: []
:namespace: omv
:count: 2
:username: 'purpleidea@redhat.com'
:password: 'hunter2'
:poolid: true
:repos: []
$ vs
Current machine states:

omv1                      not created (libvirt)
omv2                      not created (libvirt)

This environment represents multiple VMs. The VMs are all listed
above with their current state. For more information about a specific
VM, run `vagrant status NAME`.

You might want to set repos to be:

['rhel-7-server-rpms', 'rhel-7-server-extras-rpms']

but it depends on what subscriptions you want or have available. If you’d like to store your credentials in an external file, you can do so like this:

$ cat ~/.config/oh-my-vagrant/auth.yaml
---
:username: purpleidea@redhat.com
:password: hunter2

Here’s an actual run to see the subscription-manager action:

$ vup omv1
[...]
==> omv1: The system has been registered with ID: 00112233-4455-6677-8899-aabbccddeeff
==> omv1: Installed Product Current Status:
==> omv1: Product Name: Red Hat Enterprise Linux Server
==> omv1: Status:       Subscribed
$ # the above lines show that your machine has been registered
$ vscreen root@omv1
[root@omv1 ~]# echo thanks purpleidea!
thanks purpleidea!
[root@omv1 ~]# exit

Make sure to unregister when you are permanently done with a machine, otherwise your subscriptions will be left idle. This happens automatically on vagrant destroy when using Oh-My-Vagrant:

$ vdestroy omv1 # make sure to unregister when you are done
Unlocking shell provisioning for: omv1...
Running 'subscription-manager unregister' on: omv1...
Connection to 192.168.121.26 closed.
System has been unregistered.
==> omv1: Removing domain...

Idempotence:

One interesting aspect of this build process is that it’s mostly idempotent. It’s able to do this because it uses GNU Make to ensure that only out of date steps or missing targets are run. As a result, if the build process fails part way through, you’ll only have to repeat the failed steps! This speeds up debugging and iterative development enormously!

To prove this to you, here is what a second run looks like (after the first successful run):

$ time ./versions/rhel-7.0-iso.sh 

real    0m0.029s
user    0m0.013s
sys    0m0.017s

As you can see it completes almost instantly.
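
The rule that Make applies is simple: a target gets rebuilt when it’s missing, or when it’s older than one of its prerequisites. Here is that rule as a small shell sketch. The needs_rebuild helper is hypothetical and is not taken from the vagrant-builder Makefile; it only illustrates why the second run has nothing to do.

```shell
# needs_rebuild TARGET SOURCE...: succeed if TARGET is missing,
# or if any SOURCE has a newer modification time than TARGET
needs_rebuild() {
    local target="$1" src
    shift
    [ -e "$target" ] || return 0
    for src in "$@"; do
        [ "$src" -nt "$target" ] && return 0
    done
    return 1
}

# usage (file names are from the listing earlier in this post):
# needs_rebuild rhel-7.0-iso.box box.img metadata.json && echo 'box is stale'
```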

Variants:

To build a variant of the base image that we just built, create a versions/*.sh file, and modify the variables to add your changes in. If you start with a copy of the ~/tmp/builder/${VERSION}-${POSTFIX} folder, then you shouldn’t need to repeat the initial steps. Hint: btrfs is excellent at reflinking data, so you don’t unnecessarily store multiple copies!
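
As a rough sketch of that hint, GNU cp can be asked to reflink when the filesystem supports it and to fall back to a normal copy when it doesn’t. The copy_build_dir name and the destination folder in the usage line are made up for illustration.

```shell
# copy_build_dir SRC DST: copy a finished build folder to a new one.
# --reflink=auto shares data blocks on filesystems such as btrfs,
# and quietly falls back to a regular copy everywhere else.
# DST should not exist yet, otherwise cp copies SRC *into* it.
copy_build_dir() {
    cp -a --reflink=auto "$1" "$2"
}

# usage:
# copy_build_dir ~/tmp/builder/rhel-7.0-iso ~/tmp/builder/rhel-7.0-myvariant
```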

Plumbing Pipeline:

What actually happens behind the scenes? Most of the magic happens in the Makefile. The relevant series of transforms is as follows:

  1. virt-install: install from iso
  2. virt-sysprep: remove unneeded junk
  3. virt-sparsify: make sparse
  4. xz --best: compress into builder format
  5. virt-builder: use builder to bake vagrant box
  6. qemu-img convert: convert to correct format
  7. tar -cvz: tar up into vagrant box format

There are some intermediate dependency steps that I didn’t mention, so feel free to explore the source.

Future work:

  • Some of the above steps in the pipeline are actually bundled under the same target. It’s not a huge issue, but it could be changed if someone feels strongly about it.
  • Virt-builder can’t run docker commands during build. This would be very useful for pre-populating images with docker containers.
  • Oh-My-Vagrant needs to have its DNS management switched to use vagrant-hostmanager instead of puppet resource commands.

Disclaimers:

While I expect you’ll love using these RHEL base boxes with Vagrant, the above builder methodology is currently not officially supported, and I can’t guarantee that the RHEL vagrant dev environments will be either. I’m putting this out there for the early (DevOps) adopters who want to hack on this and who didn’t want to invent their own build tool chain. If you do have issues, please leave a comment here, or submit a vagrant-builder issue.

Thanks:

Special thanks to Richard WM Jones and Pino Toscano for their great work on virt-builder that this is based on. Additional thanks to Randy Barlow for encouraging me to work on this. Thanks to Red Hat for continuing to pay my salary :)

Subscriptions?

If I’ve convinced you that you want some RHEL subscriptions, please go have a look, and please let Red Hat know that you appreciated this post and my work.

Happy Hacking!

James

Introducing: Silent counter

You might want to write code that can tell how many iterations have passed since some action occurred. Alternatively, you might want to know if it’s the first time a machine has run Puppet. To do these types of things, you might wish to have a monotonically increasing counter in your Puppet manifest. Since one did not exist, I set out to build one!

The code:

If you just want to try the code, and skip the ramble, you can include common::counter into your manifest. The entire class is part of my puppet-common module:

git clone https://github.com/purpleidea/puppet-common

Usage:

Usage notes are hardly even necessary. Here is how the code is commonly used:


include ::common::counter    # that's it!

# NOTE: we only see the notify message. no other exec/change is shown!
notify { 'counter':
        message => "Value is: ${::common_counter_simple}",
}

Now use the fact anywhere you need a counter!

Increasing a variable:

At the heart of any counter, you’ll need to have a value store, a way to read the value, and a way to increment the value. For simplicity, the value store we’ll use will be a file on disk. This file is located by default at ${vardir}/common/counter/simple. To read the value, we use a puppet fact. The fact has a key name of $::common_counter_simple. To increment the value, a simple python script is used.
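
To make the increment step concrete, here is a shell stand-in for it. The module itself uses a python script, so treat this as a sketch of the idea rather than the real increment.py; the increment_counter name and the validation rules are my own assumptions.

```shell
# increment_counter FILE: add one to the integer stored in FILE.
# A missing file counts as 0. Fails without touching FILE if the
# stored value is not a plain non-negative integer.
increment_counter() {
    local file="$1" value=0 tmp
    [ -f "$file" ] && value="$(cat "$file")"
    case "$value" in
        ''|*[!0-9]*|0?*) return 1 ;;    # empty, non-digits, or leading zeros
    esac
    # write to a temporary file in the same directory, then rename it,
    # so a reader never sees a half-written value
    tmp="$(mktemp "${file}.XXXXXX")" || return 1
    printf '%s\n' "$(( value + 1 ))" > "$tmp" && mv "$tmp" "$file"
}
```

The important property is the exit code: success means the value went up by one, and anything else signals a problem that the caller can act on.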

Noise:

To cause an increment of the value on each puppet run, an exec would have to be used. The downside of this is that this causes noise in your puppet logs, even if nothing else is happening! This noise is unwanted, so we work around this with the following code:


exec { 'counter':
        # echo an error message and be false, if the incrementing fails
        command => '/bin/echo "common::counter failed" && /bin/false',
        unless => "${vardir}/counter/increment.py '${vardir}/counter/simple'",
        logoutput => on_failure,
}

As you can see, we cause the run to happen in the silent “unless” part of the exec, and we don’t actually allow any exec to occur unless there is an error running the increment.py!

Complex example:

If you want to do something more complicated using this technique, you might want to write something like this:


$max = 8
exec { "/bin/echo this is run #: ${::common_counter_simple}":
        logoutput => on_failure,
        onlyif => [
                "/usr/bin/test ${::common_counter_simple} -lt ${max}",
                "/usr/bin/test ${::common_counter_simple} -gt 0",
        ],
        #notify => ...,    # do some action
}

Side effects:

Make sure not to use the counter fact in a $name or somewhere that would cause frequent unwanted catalog changes, as it could cause a lot of changes each run.

Module pairings:

You might want to pair this module with my “Finite State Machine” concept, or my Exec[‘again’] module. If you come up with some cool use cases, please let me know!

Future work:

If you’d like to extend this work, two features come to mind:

  1. Individual named counters. If for some reason you want more than one counter, named counters could be built.
  2. Reset functionality. If you’d like to reset a counter to zero (or to some other value) then there should be a special type you could create which causes this to happen.

If you’d like to work on either of these features, please let me know, or send me a patch!

Happy hacking!

James

Replying to mailing lists with Evolution

I use the Evolution mail client. It does have a few annoying bugs, but it has a plethora of great features too! Hopefully this post will inspire you to help hack on this piece of software and fix the bugs!

Mailing list etiquette:

When replying to mailing lists, it’s typically very friendly to include the email address of the person you’re replying to in the to or cc fields along with the mailing list address. This lets that person know that someone has answered their question. If they’re not subscribed to that mailing list and you don’t do this, then they might not see your reply at all.

To enable this feature, there is a check box inside of the Evolution mail preferences. It is labelled: “Ignore Reply-To: for mailing lists”.

You can find this option in the Evolution “Composer Preferences” tab, under the “Replies and Forwards” heading.

This works, because by default, most mailing lists set the “Reply-To:” address to be that of the mailing list. In this case, when you click “Group Reply” (“Reply to all”) in your MUA, then that field will be ignored, and the correct recipients will be selected in your composer window.

If instead you simply click “Reply”, then you will be prompted to choose the kind of reply you’d like to send.

Doesn’t this annoy users?

No, this actually gives the recipients more choice! If they’d prefer not to see your reply in their inbox, they can set up a filter so that mail that includes the mailing list address goes to a special folder. If they prefer to see your reply in their inbox, then they can configure their filters so that mail that comes exclusively from the mailing list address goes to a specific folder.

Instead of choosing the “contains” (in_array) operator, you could have chosen “is” (equals).

Won’t this cause duplicate messages being sent to the user?

Again, that’s up to the user. Most mailing lists allow you to configure this setting.

This particular example is of a very low-volume list, therefore I don’t filter messages into a separate folder; they go to my inbox, so there’s no need to get two copies.

Ultimately, Evolution is a great MUA, which has the best message composer available. Some might prefer mutt+vim, but for my money, I’ll stick with Evolution for now.

Happy hacking,

James

PS: If you hack on Evolution, and write a good feature that I like, or fix a bug that affects me, I’m happy to feature you on this blog and tweet about you as soon as your code hits git master! </free promotion for good hackers!>

Captive web portals are considered harmful

Recently, when I tried to access http://slashdot.org/ in Firefox, I would see my browser title bar flash briefly to “AT&T GUI”, and then I would get redirected to: http://slashdot.org/cgi-bin/redirect.ha which returns slashdot’s custom error 404 page! What is going on? (Read on for answer…)

  • Did slashdot mess up their mod_rewrite config?
    (Nope, works fine in a different browser…)
  • Did my HTTPS everywhere extension go crazy?
    (Nope, still broken when disabled…)
  • Are my HTTP requests being MITM-ed?
    (Yes, probably by the NSA, but they wouldn’t make this kind of mistake…)
  • Is my computer p0wned?
    (I use GNU/Linux, so probably not…)

A keyword search will show you that others are also affected by this, except that the base domain (slashdot.org) is usually different… One thing that all the links I viewed have in common: none of them seem to know what’s happening.

Some background:

Recently, I used my laptop with a public WIFI access point. The router behind these access points usually performs a MITM redirection on your HTTP traffic to send you to a captive web portal which you’ll need to use before being authorized to route out to the public internet.

After connecting to the wireless SSID, whichever site you visit next will get replaced with the portal. This typically can’t be an HTTPS url, because they aren’t easily MITM-ed without causing a certificate error.

On my Firefox new tab page, the only non-HTTPS site that I visit is http://slashdot.org/ and as a result, I’ll click this link when I know I’m expecting a portal… (Seriously slashdot, wtf!)

What’s happening?

When I visited http://slashdot.org/ on public WIFI, the captive portal web page got permanently cached in my browser, and now every time I attempt to visit slashdot, I actually get the cached, MITM-ed, portal version.

How to fix this?

Actually it’s very simple: just clear your browser cache. You don’t need to delete your cookies or your history. Choose the “Clear Now” button in Firefox.

Whose fault is this?

  • The AT&T portal programmers for allowing a portal page to be cached.
  • Any website that doesn’t require HTTPS (and lets themselves get MITM-ed).
  • Firefox for not protecting against this (other browsers are affected too!)
  • Public WIFI services for using captive portals (just free the internet already!)
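
On the first point in that list: a portal can ask browsers not to keep its pages by sending a Cache-Control header with no-store or no-cache. Here is a rough sketch for checking whether a set of response headers does that. The discourages_caching name is made up, and the example.com URL in the usage line is only a placeholder.

```shell
# discourages_caching: read HTTP response headers on stdin; succeed if a
# Cache-Control header contains no-store or no-cache
discourages_caching() {
    # strip the carriage returns that HTTP line endings carry,
    # then match the header name and directives case-insensitively
    tr -d '\r' | grep -i '^cache-control:' | grep -Eiq 'no-(store|cache)'
}

# usage (needs network access):
# curl -sI http://example.com/ | discourages_caching || echo 'may be cached'
```

This only looks at one header, so it’s a quick sanity check and not a full model of how browsers decide what to cache.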

Is there any good news?

It’s easily fixed, and there didn’t seem to be any malicious code in the cached web portal redirector. It turned out to only include a META refresh. Phew :)

Hope this provides an authoritative answer for everyone who is experiencing this problem!

Happy hacking!

James

The switch as an ordinary GNU/Linux server

The fact that we manage the switches in our data centres differently than any other server is patently absurd, but we do so because we want to harness the power of a tiny bit of silicon which happens to be able to dramatically speed up the switching bandwidth.

beware of proprietary silicon, it’s absurd!

That tiny bit of silicon is known as an ASIC, or an application-specific integrated circuit, and one particularly well-performing ASIC (which is present in many commercially available switches) is called the Trident.

None of this should impact the end-user management experience. However, because the big switch companies and chip makers believe that there is some special differentiation in their IP, they’ve ensured that the stacks and software surrounding the hardware are highly proprietary and difficult to replace. This also lets them create and sell bundled products and features that you don’t want, but which you can’t get elsewhere.

This is still true today! System and network engineers know too well the hassles of dealing with the different proprietary switch operating systems and interfaces. Why not standardize on the well-known interface that every GNU/Linux server uses?

We’re talking about iptables of course! (Although nftables would be an acceptable standard too!) This way we could have a common interface for all the networked devices in our server room.

I’ve been able to work around this limitation in the past, by using Linux to do my routing in software, and by building the routers out of COTS 2U GNU/Linux boxes. The trouble with this approach is that they’re bigger, louder, more expensive, consume more power, and don’t have the port density that a 48 port 1U switch does.

a 48 port, 1U switch

It turns out that there is a company which is actually trying to build this mythical box. It is not perfect, but I think they are on the right track. What follows are my opinions of what they’ve done right, what’s wrong, and what I’d like to see in the future.

Who are they?

They are Cumulus Networks, and I recently got to meet one of their very talented engineers, Leslie Carr, for a demo and a discussion. I also recently attended a talk that she gave on this very same subject. She gave me a rocket turtle. (Yes, this now makes me biased!)

my rocket turtle, the cumulus networks mascot

What are they doing?

You buy an existing switch from your favourite vendor. You then throw out (flash over) the included software, and instead, pay them a yearly licensing fee to use the “Cumulus” GNU/Linux. It comes as an OS image, based off of Debian.

How does it talk to the ASIC?

The OS image comes with a daemon called switchd that transfers the kernel iptables rules down into the ASIC. I don’t know the specifics on how this works exactly because:

  1. Switchd is proprietary. Apparently this is because of a scary NDA they had to sign, but it’s still unfortunate, and it is impeding my hacking.
  2. I’m not an expert on talking to ASICs. I’m guessing that unless you’ve signed the NDAs, and you’re behind the Trident paywall, then it’s tough to become one!

Problems with packaging:

The OS is only distributed as a single image. This is an unfortunate mistake! It should be available from the upstream project with switchd (and any other add-ons) as individual .deb packages. That way, I know I’m getting a stock OS which is preferably even built and signed by the Debian team! That way I could use the same infrastructure for my servers to keep all my servers up to date.

Problems with OS security:

Unfortunately the OS doesn’t benefit from any of the standard OS security enhancements like SELinux. I’d prefer running a more advanced distro like RHEL or CentOS that have these things out of the box, but if Cumulus will continue using Debian, then they must include some more advanced security measures. I didn’t find AppArmor or grsecurity in use either. It did seem to have all the important bash security updates:

cumulus@switch1$ bash --version
GNU bash, version 4.2.37(1)-release (powerpc-unknown-linux-gnu)
Copyright (C) 2011 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>

This is free software; you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Use of udev:

This switch does seem to support and use udev, although not being a udev expert I can’t comment on if it’s done properly or not. I’d be interested to hear from the pros. Here’s what I found:

cumulus@switch1$ cd /etc/udev/
cumulus@switch1$ tree
.
`-- rules.d
    |-- 10-cumulus.rules
    |-- 60-bridge-network-interface.rules -> /dev/null
    |-- 75-persistent-net-generator.rules -> /dev/null
    `-- 80-networking.rules -> /dev/null

1 directory, 4 files
cumulus@switch1$ cat rules.d/10-cumulus.rules | tail -n 7
# udev rules specific to Cumulus Linux

# Rule called when the linux-user-bde driver is loaded
ACTION=="add" SUBSYSTEM=="module" DEVPATH=="/module/linux_user_bde" ENV{DEVICE_NAME}="linux-user-bde" ENV{DEVICE_TYPE}="c" ENV{DEVICE_MINOR}="0" RUN="/usr/lib/cumulus/udev-module"

# Quanta LY8 uses RTC1
KERNEL=="rtc1", PROGRAM="/usr/bin/platform-detect", RESULT=="quanta,ly8_rangeley", SYMLINK+="rtc"

Other things:

There seem to be a number of extra things running on the switch. Here’s what I mean:

cumulus@switch1$ ps auxwww 
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.0   2516   860 ?        Ss   Nov05   0:02 init [3]  
root         2  0.0  0.0      0     0 ?        S    Nov05   0:00 [kthreadd]
root         3  0.0  0.0      0     0 ?        S    Nov05   0:05 [ksoftirqd/0]
root         5  0.0  0.0      0     0 ?        S    Nov05   0:00 [kworker/u:0]
root         6  0.0  0.0      0     0 ?        S    Nov05   0:00 [migration/0]
root         7  0.0  0.0      0     0 ?        S<   Nov05   0:00 [cpuset]
root         8  0.0  0.0      0     0 ?        S<   Nov05   0:00 [khelper]
root         9  0.0  0.0      0     0 ?        S<   Nov05   0:00 [netns]
root        10  0.0  0.0      0     0 ?        S    Nov05   0:02 [sync_supers]
root        11  0.0  0.0      0     0 ?        S    Nov05   0:00 [bdi-default]
root        12  0.0  0.0      0     0 ?        S<   Nov05   0:00 [kblockd]
root        13  0.0  0.0      0     0 ?        S<   Nov05   0:00 [ata_sff]
root        14  0.0  0.0      0     0 ?        S    Nov05   0:00 [khubd]
root        15  0.0  0.0      0     0 ?        S<   Nov05   0:00 [rpciod]
root        17  0.0  0.0      0     0 ?        S    Nov05   0:00 [khungtaskd]
root        18  0.0  0.0      0     0 ?        S    Nov05   0:00 [kswapd0]
root        19  0.0  0.0      0     0 ?        S    Nov05   0:00 [fsnotify_mark]
root        20  0.0  0.0      0     0 ?        S<   Nov05   0:00 [nfsiod]
root        21  0.0  0.0      0     0 ?        S<   Nov05   0:00 [crypto]
root        34  0.0  0.0      0     0 ?        S    Nov05   0:00 [scsi_eh_0]
root        36  0.0  0.0      0     0 ?        S    Nov05   0:00 [kworker/u:2]
root        41  0.0  0.0      0     0 ?        S    Nov05   0:00 [mtdblock0]
root        42  0.0  0.0      0     0 ?        S    Nov05   0:00 [mtdblock1]
root        43  0.0  0.0      0     0 ?        S    Nov05   0:00 [mtdblock2]
root        44  0.0  0.0      0     0 ?        S    Nov05   0:00 [mtdblock3]
root        49  0.0  0.0      0     0 ?        S    Nov05   0:00 [mtdblock4]
root       362  0.0  0.0      0     0 ?        S    Nov05   0:54 [hwmon0]
root       363  0.0  0.0      0     0 ?        S    Nov05   1:01 [hwmon1]
root       398  0.0  0.0      0     0 ?        S    Nov05   0:04 [flush-8:0]
root       862  0.0  0.1  28680  1768 ?        Sl   Nov05   0:03 /usr/sbin/rsyslogd -c4
root      1041  0.0  0.2   5108  2420 ?        Ss   Nov05   0:00 /sbin/dhclient -pf /run/dhclient.eth0.pid -lf /var/lib/dhcp/dhclient.eth0.leases eth0
root      1096  0.0  0.1   3344  1552 ?        S    Nov05   0:16 /bin/bash /usr/bin/arp_refresh
root      1188  0.0  1.1  15456 10676 ?        S    Nov05   1:36 /usr/bin/python /usr/sbin/ledmgrd
root      1218  0.2  1.1  15468 10708 ?        S    Nov05   5:41 /usr/bin/python /usr/sbin/pwmd
root      1248  1.4  1.1  15480 10728 ?        S    Nov05  39:45 /usr/bin/python /usr/sbin/smond
root      1289  0.0  0.0  13832   964 ?        S<sl Nov05   0:00 /sbin/auditd
root      1291  0.0  0.0  10456   852 ?        S
[...]
root     12776  0.0  0.0      0     0 ?        S    04:31   0:01 [kworker/0:0]
root     13606  0.0  0.0      0     0 ?        S    05:05   0:00 [kworker/0:2]
root     13892  0.0  0.0      0     0 ?        S    05:13   0:00 [kworker/0:1]
root     13999  0.0  0.0   2028   512 ?        S    05:16   0:00 sleep 30
cumulus  14016  0.0  0.1   3324  1128 pts/0    R+   05:17   0:00 ps auxwww
root     30713 15.6  2.4  69324 24176 ?        Ssl  Nov05 304:16 /usr/sbin/switchd -d
root     30952  0.0  0.3  11196  3500 ?        Ss   Nov05   0:00 sshd: cumulus [priv]
cumulus  30954  0.0  0.1  11196  1684 ?        S    Nov05   0:00 sshd: cumulus@pts/0
cumulus  30955  0.0  0.2   4548  2888 pts/0    Ss   Nov05   0:01 -bash

In particular, I’m referring to ledmgrd, pwmd, smond and others. I don’t doubt these things are necessary and useful, in fact, they’re written in python and should be easy to reverse if anyone is interested, but if they’re a useful part of a switch capable operating system, I hope that they grow proper upstream projects and get appropriate documentation, licensing, and packaging too!

Switch ports are network devices:

Hacking on the device couldn’t feel more native. Anyone remember how to enumerate the switch ports on IOS? … Who cares! Try the standard iproute2 tools on a Cumulus box:

cumulus@switch1$ ip a s swp42
44: swp42: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 500
    link/ether 08:9e:01:f8:96:74 brd ff:ff:ff:ff:ff:ff
cumulus@switch1$ ip a | grep swp | wc -l
52
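
If you want the port names themselves and not just a count, something along these lines could work. It’s a sketch that assumes the front panel ports are all named swpN, as in the output above; list_switch_ports is a hypothetical helper name.

```shell
# list_switch_ports: read `ip -o link show` output on stdin and print the
# names of interfaces that look like swpN
list_switch_ports() {
    # field 2 looks like "swp42:"; strip the colon and any "@parent" suffix,
    # then keep exact swpN names only (this skips things like swp1.100)
    awk '{ name = $2; sub(/[:@].*$/, "", name); if (name ~ /^swp[0-9]+$/) print name }'
}

# usage on a switch:
# ip -o link show | list_switch_ports | wc -l
```

Because the match is anchored to the interface name, it won’t count other lines that merely mention swp somewhere.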

What about ifup/ifdown?

This one is a bit different:

cumulus@switch1$ file /sbin/ifup
/sbin/ifup: symbolic link to `/sbin/ifupdown'
cumulus@switch1$ file /sbin/ifupdown 
/sbin/ifupdown: Python script, ASCII text executable

The Cumulus team encountered issues with the traditional ifup/ifdown tools found in a stock distro, so they replaced them with shiny python versions:

https://github.com/CumulusNetworks/ifupdown2

I hope that this project either gets into the upstream distro, or that some upstream writes tools that address the limitations in the various messes of shell scripts. I’m optimistic about networkd being the solution here, but until that’s fully baked, the Cumulus team has built a nice workaround. Additionally, until the Debian team finalizes the proper technical decision to use systemd, it has a bleak future.

Kernel:

All the kernel hackers out there will want to know what’s under the hood:

cumulus@switch1$ uname -a
Linux leaf1 3.2.46-1+deb7u1+cl2.2+1 #3.2.46-1+deb7u1+cl2.2+1 SMP Thu Oct 16 14:28:31 PDT 2014 ppc powerpc GNU/Linux

Because this is an embedded chip found in a 1U box, and not a Xeon processor, it’s noticeably slower than a traditional server. This is of course (non-sarcastically) exactly what I want. For admin tasks, it has plenty of power, and this trade-off means it has lower power consumption and heat production than a stock server. While debugging some puppet code that I was running takes longer than normal on this box, I was eventually able to get the job done. This is another reason why this box needs to act like more of an upstream distro: if it did, I’d be able to have a faster machine as my dev box!

Other tools:

Other stock tools like ethtool and brctl work out of the box. Bonding, VLANs and every other feature I tested seem to work the same way you’d expect from a GNU/Linux system.

Puppet and automation:

Readers of my blog will know that I manage my servers with Puppet. Instead of having the puppet agent connect over an API to the switch, you can directly install and run puppet on this Cumulus Linux machine! Some users might quickly jump to using the firewall module as the solution for consistent management, but a level two user will know that a higher level wrapper around shorewall is the better approach. This is all possible with this switch and seems to work great! The downside was that I had to manually add repositories to get the shorewall packages because it is not a stock distro :(

Why not SDN?

SDN, or software-defined networking, is a fantastic and interesting technology. Unfortunately, it’s a parallel problem to what I’m describing in this article, and not a solution to help you work around the lack of a good GNU+Linux switch. If programming the ASICs weren’t an NDA-requiring activity, I bet we’d see a lot more innovative networking companies and technologies pop up.

Future?

This product isn’t quite baked enough yet for me to want to use it in production, but it’s so tantalizingly close that it’s strongly worth considering. I hope that they react positively to my suggestions and create an even more native, upstream environment. Once this is done, it will be my go-to for all my switching!

Thanks:

Thanks very much to the Cumulus team for showing me their software, and giving me access to demo it on some live switches. I didn’t test performance, but I have no doubt that it competes with the market average. Prove me right by trying it out yourself!

Thanks for listening, and Happy hacking!

James

PS: Special thanks to David Caplan for the great networking discussions we had!

Testing Evolution’s git master and GNOME continuous

I’ve wanted a feature in Evolution for a while. It was formally requested in 2002, and it just recently got fixed in git master. I only started publicly groaning about this missing feature in 2013, and mcrha finally patched it. I tested the feature and found a small bug, mcrha patched that too, and I finally re-tested it. Now I’m blogging about this process so that you can get involved too!

Why Evolution?

  • Evolution supports GPG (Geary doesn’t, Gmail doesn’t)
  • Evolution has a beautiful composer (Gmail’s sucks, just try to reply inline)
  • Evolution is Open Source and Free Software (Gmail is proprietary)
  • Evolution integrates with GNOME (Gmail doesn’t)
  • Evolution has lots of fancy, mature features (Geary doesn’t)
  • Evolution cares about your privacy (Gmail doesn’t)

The feature:

I’d like to be able to select a bunch of messages and click an archive action to move them to a specific folder. Gmail popularized this idea in 2004, two years after it was proposed for Evolution. It has finally landed.

In your account editor, you can select the “Archive Folder” that you want messages moved to:

evolution-account-archive-folder

This will let you have a different folder set per account.

Archive like Gmail does:

If you use Evolution with a Gmail account, and you want the same functionality as the Gmail archive button, you can accomplish this by setting the Evolution archive folder to point to the Gmail “All Mail” folder, which will cause the Evolution archive action to behave as Gmail’s does.

To use this functionality (with or without Gmail), simply select the messages you want to move, and click the “Archive…” button:

evolution-context-menu-archive

This is also available via the “Message” menu. You can also activate it with the Control-Alt-a shortcut. For more information, please read the description from mcrha.

GNOME Continuous:

Once the feature was patched in git master, I wanted to try it out right away! The easiest way for me to do this was to use the GNOME Continuous project that walters started. This GNOME project automatically kicks off integration builds of multiple git master trees for the most common GNOME applications.

If you follow the GNOME Continuous instructions, it is fairly easy to download an image, and then import it with virt-manager or Boxes. Once it had booted up, I logged into the machine, and was able to test Evolution’s git master.

Digging deep into the app:

If you want to tweak the app for debugging purposes, it is quite easy to do this with GtkInspector. Launch it with Control-Shift-i or Control-Shift-d, and you’ll soon be poking around the app’s internals. You can change the properties you want in real-time, and then you’ll know which corresponding changes in the upstream source are necessary.

Finding a bug and re-testing:

I did find one small bug with the Evolution patches. I’m glad I found it now, instead of having to wait six months for a new Fedora version. The maintainer fixed it quickly, and all that was left to do was to re-test the new git master. To do this, I updated my GNOME Continuous image.

  1. Click on Control-Alt-F2 from the virt-manager “Send Key” menu.
  2. Log in as root (no password)
  3. Set the password to something by running the passwd command.
  4. Click on Control-Alt-F1 to return to your GNOME session.
  5. Open a terminal and run: pkexec bash.
  6. Enter your root password.
  7. Run ostree admin upgrade.
  8. Once it has finished downloading the updates, reboot the vm.

You’ll now be able to test the newest git master. Please note that it takes a bit of time for it to build, so it is not instant, but it’s pretty quick.

Taking screenshots:

I took a few screenshots from inside the VM to show to you in this blog post. Extracting them was a bit tricky because I couldn’t get SSHD running. To get at them, I installed the guestfs browser on my host OS. It was very straightforward to use it to read the VM image, browse to the ~/Pictures/ directory, and then download the images to my host. Thanks rwmjones!

Conclusion:

Hopefully this will motivate you to contribute to GNOME early and often! There are lots of great tools available, and lots of applications that need some love.

Happy Hacking,

James

Hacking out an Openshift app

I had an itch to scratch, and I wanted to get a bit more familiar with Openshift. I had used it in the past, but it was time to have another go. The app and the code are now available. Feel free to check out:

https://pdfdoc-purpleidea.rhcloud.com/

This is a simple app that takes the URL of a markdown file on GitHub, and outputs a pandoc converted PDF. I wanted to use pandoc specifically, because it produces PDF’s that were beautifully created with LaTeX. To embed a link in your upstream documentation that points to a PDF, just append the file’s URL to this app’s url, under a /pdf/ path. For example:

https://pdfdoc-purpleidea.rhcloud.com/pdf/https://github.com/purpleidea/puppet-gluster/blob/master/DOCUMENTATION.md

will send you to a PDF of the puppet-gluster documentation. This will make it easier to accept questions as FAQ patches, without needing to have the git embedded binary PDF be constantly updated.
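If you generate these links from a script (say, when building a README), the URL scheme described above is easy to wrap. This is only a sketch of the "append the file's URL under /pdf/" convention; the function names pdf_link and source_url are made up for illustration and are not part of the app.

```python
# Base URL of the hosted app, taken from this post.
APP_URL = "https://pdfdoc-purpleidea.rhcloud.com"


def pdf_link(markdown_url, app_url=APP_URL):
    """Return the app URL that serves a PDF of the given markdown file."""
    return app_url.rstrip("/") + "/pdf/" + markdown_url


def source_url(pdf_url, app_url=APP_URL):
    """Inverse of pdf_link(): recover the markdown URL from a PDF link."""
    prefix = app_url.rstrip("/") + "/pdf/"
    if not pdf_url.startswith(prefix):
        raise ValueError("not a PDF link for this app: %r" % pdf_url)
    return pdf_url[len(prefix):]


doc = "https://github.com/purpleidea/puppet-gluster/blob/master/DOCUMENTATION.md"
print(pdf_link(doc))
```

Running this prints the same link as the example above.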

If you want to hear more about what I did, read on…

The setup:

Start by getting a free Openshift account. You’ll also want to install the client tools. Nothing is worse than having to interact with your app via a web interface. Hackers use terminals. Luckily, the Openshift team knows this, and they’ve created a great command line tool called rhc to make it all possible.

I started by following their instructions:

$ sudo yum install rubygem-rhc
$ sudo gem update rhc

Unfortunately, this left me with a problem:

$ rhc
/usr/share/rubygems/rubygems/dependency.rb:298:in `to_specs': Could not find 'rhc' (>= 0) among 37 total gem(s) (Gem::LoadError)
    from /usr/share/rubygems/rubygems/dependency.rb:309:in `to_spec'
    from /usr/share/rubygems/rubygems/core_ext/kernel_gem.rb:47:in `gem'
    from /usr/local/bin/rhc:22:in `<main>'

I solved this by running:

$ gem install rhc

This makes my user’s rhc take precedence over the system one. Then run:

$ rhc setup

and the rhc client will take you through some setup steps such as uploading your public ssh key to the Openshift infrastructure. The beauty of this tool is that it will work with the Red Hat hosted infrastructure, or you can use it with your own infrastructure if you want to host your own Openshift servers. This alone means you’ll never get locked in to a third-party provider’s terms or pricing.

Create a new app:

To get a fresh Python 3.3 app going, you can run:

$ rhc create-app <appname> python-3.3

From this point on, it’s fairly straightforward, and you can now hack your way through the app in Python. To push a new version of your app into production, it’s just a git commit away:

$ git add -p && git commit -m 'Awesome new commit...' && git push && rhc tail
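For a sense of what "hacking in Python" means here: the deployment log later in this post shows wsgi.py being picked as the WSGI entry point, and mod_wsgi looks for a callable named application by default. The following is a minimal sketch of such a file. It is a generic WSGI hello-world, not the pdfdoc code, and the response text is made up.

```python
def application(environ, start_response):
    """Minimal WSGI callable; `application` is the name mod_wsgi expects."""
    # PATH_INFO holds the request path, e.g. "/" or "/pdf/...".
    path = environ.get("PATH_INFO", "/")
    body = ("Hello from %s\n" % path).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    # WSGI responses are iterables of bytes.
    return [body]
```

To poke at it locally before pushing, the standard library's wsgiref.simple_server.make_server can serve the same callable.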

Creating a new app from existing code:

If you want to push a new app from an existing code base, it’s as easy as:

$ rhc create-app awesomesauce python-3.3 --from-code https://github.com/purpleidea/pdfdoc
Application Options
-------------------
Domain:      purpleidea
Cartridges:  python-3.3
Source Code: https://github.com/purpleidea/pdfdoc
Gear Size:   default
Scaling:     no

Creating application 'awesomesauce' ... done


Waiting for your DNS name to be available ... done

Cloning into 'awesomesauce'...
The authenticity of host 'awesomesauce-purpleidea.rhcloud.com (203.0.113.13)' can't be established.
RSA key fingerprint is 00:11:22:33:44:55:66:77:88:99:aa:bb:cc:dd:ee:ff.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'awesomesauce-purpleidea.rhcloud.com,203.0.113.13' (RSA) to the list of known hosts.

Your application 'awesomesauce' is now available.

  URL:        http://awesomesauce-purpleidea.rhcloud.com/
  SSH to:     00112233445566778899aabb@awesomesauce-purpleidea.rhcloud.com
  Git remote: ssh://00112233445566778899aabb@awesomesauce-purpleidea.rhcloud.com/~/git/awesomesauce.git/
  Cloned to:  /home/james/code/awesomesauce

Run 'rhc show-app awesomesauce' for more details about your app.

In my case, my app also needs some binaries installed. I haven’t yet automated this process, but I think it can be done by creating a custom cartridge. Help to do this would be appreciated!
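Until that's automated, an app can at least fail loudly when a binary is missing. Here is a small sketch using shutil.which (available since Python 3.3). The list of binaries is an assumption: pandoc is named in this post, and pdflatex is guessed as the LaTeX engine pandoc would call; the helper name missing_binaries is hypothetical.

```python
import shutil

# Binaries the app is assumed to need. pandoc comes from the post; pdflatex
# is an assumption about which LaTeX engine pandoc would use.
REQUIRED = ("pandoc", "pdflatex")


def missing_binaries(names=REQUIRED):
    """Return the subset of `names` that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


missing = missing_binaries()
if missing:
    print("missing binaries: " + ", ".join(missing))
else:
    print("all required binaries found")
```

Calling this at startup turns a confusing conversion failure into an obvious message in the rhc tail output.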

Updating your app:

In the case of an app that I already deployed with this method, updating it from the upstream source is quite easy. You just pull down any relevant commits, and then push them up to your app’s git repo:

$ git pull upstream master 
From https://github.com/purpleidea/pdfdoc
 * branch            master     -> FETCH_HEAD
Updating 5ac5577..bdf9601
Fast-forward
 wsgi.py | 2 --
 1 file changed, 2 deletions(-)
$ git push origin master 
Counting objects: 7, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (3/3), done.
Writing objects: 100% (3/3), 312 bytes | 0 bytes/s, done.
Total 3 (delta 2), reused 0 (delta 0)
remote: Stopping Python 3.3 cartridge
remote: Waiting for stop to finish
remote: Waiting for stop to finish
remote: Building git ref 'master', commit bdf9601
remote: Activating virtenv
remote: Checking for pip dependency listed in requirements.txt file..
remote: You must give at least one requirement to install (see "pip help install")
remote: Running setup.py script..
remote: running develop
remote: running egg_info
remote: creating pdfdoc.egg-info
remote: writing pdfdoc.egg-info/PKG-INFO
remote: writing dependency_links to pdfdoc.egg-info/dependency_links.txt
remote: writing top-level names to pdfdoc.egg-info/top_level.txt
remote: writing manifest file 'pdfdoc.egg-info/SOURCES.txt'
remote: reading manifest file 'pdfdoc.egg-info/SOURCES.txt'
remote: writing manifest file 'pdfdoc.egg-info/SOURCES.txt'
remote: running build_ext
remote: Creating /var/lib/openshift/00112233445566778899aabb/app-root/runtime/dependencies/python/virtenv/venv/lib/python3.3/site-packages/pdfdoc.egg-link (link to .)
remote: pdfdoc 0.0.1 is already the active version in easy-install.pth
remote: 
remote: Installed /var/lib/openshift/00112233445566778899aabb/app-root/runtime/repo
remote: Processing dependencies for pdfdoc==0.0.1
remote: Finished processing dependencies for pdfdoc==0.0.1
remote: Preparing build for deployment
remote: Deployment id is 9c2ee03c
remote: Activating deployment
remote: Starting Python 3.3 cartridge (Apache+mod_wsgi)
remote: Application directory "/" selected as DocumentRoot
remote: Application "wsgi.py" selected as default WSGI entry point
remote: -------------------------
remote: Git Post-Receive Result: success
remote: Activation status: success
remote: Deployment completed with status: success
To ssh://00112233445566778899aabb@awesomesauce-purpleidea.rhcloud.com/~/git/awesomesauce.git/
   5ac5577..bdf9601  master -> master
$

Final thoughts:

I hope this helped you get going with Openshift. Feel free to send me patches!

Happy hacking!

James