Automatic clustering in mgmt

In mgmt, deploying and managing your clustered config management infrastructure needs to be as automatic as the infrastructure you’re using mgmt to manage. With mgmt, instead of a centralized data store, we function as a distributed system, built on top of etcd and the raft protocol.

In this article, I’ll cover how this feature works.

Foreword:

Mgmt is a next generation configuration management project. If you haven’t heard of it yet, or you don’t remember why we use a distributed database, start by reading the previous articles:

Embedded etcd:

Since mgmt and etcd are both written in golang, the etcd code can be built into the same binary as mgmt. As a result, etcd can be managed directly from within mgmt. Unfortunately, there’s currently no recommended API to do this, but I’ve tried to get such a feature upstream to avoid code duplication in mgmt. If you can help out here, I’d really appreciate it! In the meantime, I’ve had to copy+paste the necessary portions into mgmt.

Clustering mechanics:

You can deploy an automatically clustered mgmt cluster by following these three steps:

1) If no mgmt servers exist you can start one up by running mgmt normally:

./mgmt run --file examples/graph0.yaml

2) To add any subsequent mgmt server, run mgmt normally, but point it at any number of existing mgmt servers with the --seeds command:

./mgmt run --file examples/graph0.yaml --seeds <ip address:port>

3) Profit!

We internally implement a clustering algorithm which does the hard-working of building and managing the etcd cluster for you, so that you don’t have to. If you’re interested, keep reading to find out how it works!

Clustering algorithm:

The clustering algorithm works as follows:

If you aren’t given any seeds, then assume you are the first etcd server (peer) and start-up. If you are given a seeds argument, then connect to that peer to join the cluster as a client. If you’d like to be promoted to a server, then you can “volunteer” by setting a special key in the cluster data store.

The existing cluster of peers will decide if they want additional peers, and if so, they can “nominate” someone from the pool of volunteers. If you have been nominated, you can start-up an etcd peer and peer with the rest of the cluster. Similarly, the cluster can decide to un-nominate a peer, and if you’ve been un-nominated, then you should shutdown your etcd server.

All cluster decisions are made by consensus using the raft algorithm. In practice this means that the elected cluster leader looks at the state of the system, and makes the necessary nomination changes.

Lastly, if you don’t want to be a peer any more, you can revoke your volunteer message, which will be seen by the cluster, and if you were running a server, you should receive an un-nominate message in response, which will let you shutdown cleanly.

Disclaimer:

It’s probably worth mentioning that the current implementation has a few issues, and at least one race. The goal is to have it polished up by the time etcd v3 is released, but it’s perfectly usable for testing and experimentation today! If you don’t want to automatically cluster, you can always use the --no-server flag, and point mgmt at a manually managed mgmt cluster using the --seeds flag.

Testing:

Testing this feature on a single machine makes development and experimentation easier, so as a result, there are a few flags which make this possible.

--hostname <hostname>
With this flag, you can force your mgmt client to pretend it is running on a host with the above mentioned name. You can use this to specify --hostname h1, or --hostname h2, and so on; one for each mgmt agent you want to run on the same machine.

--server-urls <ip:port>
With this flag you can specify which IP address and port the etcd server will listen on for peer requests. By default this will use 127.0.0.1:2379, but when running multiple mgmt agents on the same machine you’ll need to specify this manually to avoid collisions. You can specify as many IP address and port pairs as you’d like by separating them with commas or semicolons. The --peer-urls flag is an alias which does the same thing.

--client-urls <ip:port>
This flag specifies which IP address and port the etcd server will listen on for client connections. It defaults to 127.0.0.1:2380, but you’ll occasionally want to specify this manually for the same reasons as mentioned above. You can specify as many IP address and port pairs as you’d like by separating them with commas or semicolons. This is the address that will be used by the --seeds flag when joining an existing cluster.

Elastic clustering:

In the future, you’ll be able to specify a much more elaborate method to decide how many hosts should be promoted into peers, and which hosts should be nominated or un-nominated when growing or shrinking the cluster.

At the moment, we do the grow or shrink operation when the current peer count does not match the requested cluster size. This value has a default of 5, and can even be changed dynamically. To do so you can run:

ETCDCTL_API=3 etcdctl --endpoints 127.0.0.1:2379 put /_mgmt/idealClusterSize 3

You can also set it at start-up by using the --ideal-cluster-size flag.

Example:

Here’s a real example if you want to dive in. Try running the following four commands in separate terminals:

./mgmt run --file examples/etcd1a.yaml --hostname h1 --ideal-cluster-size 3
./mgmt run --file examples/etcd1b.yaml --hostname h2 --seeds http://127.0.0.1:2379 --client-urls http://127.0.0.1:2381 --server-urls http://127.0.0.1:2382
./mgmt run --file examples/etcd1c.yaml --hostname h3 --seeds http://127.0.0.1:2379 --client-urls http://127.0.0.1:2383 --server-urls http://127.0.0.1:2384
./mgmt run --file examples/etcd1d.yaml --hostname h4 --seeds http://127.0.0.1:2379 --client-urls http://127.0.0.1:2385 --server-urls http://127.0.0.1:2386

Once you’ve done this, you should have a three host cluster! Check this by running any of these commands:

ETCDCTL_API=3 etcdctl --endpoints 127.0.0.1:2379 member list
ETCDCTL_API=3 etcdctl --endpoints 127.0.0.1:2381 member list
ETCDCTL_API=3 etcdctl --endpoints 127.0.0.1:2383 member list

Note that you’ll need a v3 beta version of the etcdctl command which you can get by running ./build in the etcd git repo.

To grow your cluster, try increasing the desired cluster size to five:

ETCDCTL_API=3 etcdctl --endpoints 127.0.0.1:2381 put /_mgmt/idealClusterSize 5

You should see the last host start-up an etcd server. If you reduce the idealClusterSize, you’ll see servers shutdown! You’re responsible if you destroy the cluster by setting it too low! You can then try growing your cluster again, but unfortunately due to a bug, hosts can’t be re-used yet, and you’ll get a “bind: address already in use” error. We hope to have this fixed shortly!

Security:

Unfortunately no authentication security or transport security has been implemented yet. We have a great design, but are busy working on other parts of the project at the moment. If you’d like to help out here, please let us know!

Future work:

There’s still a lot of work to do, to improve this feature. The biggest challenge has been getting a reasonable embedded server API upstream. It’s not clear whether this patch can be made to work or if something different will need to be written, but at least one other project looks like it could benefit from this as well.

Video:

A recording from the recent Berlin CoreOSFest 2016 has been published! I demoed these recent features, but one interesting note is that I am actually presenting an earlier version of the code which used the etcd V2 API. I’ve since ported the code to V3, but it is functionally similar. It’s probably worth mentioning, that I found the V3 API to be more difficult, but also more correct and powerful. I think it is a net improvement to the project.

Community:

I can’t end this blog post without mentioning some of the great stuff that’s been happening in the mgmt community! In particular, Felix has written some great code to run existing Puppet code on mgmt. Check out his work!

Upcoming speaking:

I’ve got some upcoming speaking in Hong Kong at HKOSCon16 and in Cape Town at DebConf16 about the project. Please ping me if you’ll be in one of these cities and would like to hack on mgmt or just chat about the project. I’m happy to give some impromptu demos if you ask!

Thanks for reading!

Happy Hacking,

James

PS: We now have a community run twitter account. Check us out!

Upcoming speaking In Hong Kong and South Africa

I’m thrilled to tell you that I’ll be speaking about mgmt in Hong Kong and South Africa. It will be my first time to both countries and my first time to Asia and Africa!

In Hong Kong I’ll be speaking at HKOSCon2016.

In South Africa I’ll be speaking at DebConf16.

I’m looking forward to meeting with many of the hard-working Debian hackers, and collaborating with them to build and promote excellent Free Software. The mgmt project considers both Fedora and Debian to be first class platforms, and parity is a primary design goal.

I’ll be presenting and demoing many of the new features in mgmt. If you haven’t heard about the project, please read some of the posts about it…

You’re also welcome to come join our IRC channel! It’s #mgmtconfig on Freenode.

Many special thanks to both conference organizers as well as Red Hat and the OSAS department for sponsoring my travel! Without it, visiting these places and interacting with new continents of hackers would be impossible.

Happy Hacking,

James

PS: We now have a community run twitter account. Check us out!

One hour hacks: Remote LUKS over SSH

I have a GNU/Linux server which I mount a few LUKS encrypted drives on. I only ever interact with the server over SSH, and I never want to keep the LUKS credentials on the remote server. I don’t have anything especially sensitive on the drives, but I think it’s a good security practice to encrypt it all, if only to add noise into the system and for solidarity with those who harbour much more sensitive data.

This means that every time the server reboots or whenever I want to mount the drives, I have to log in and go through the series of luksOpen and mount commands before I can access the data. This turned out to be a bit laborious, so I wrote a quick script to automate it! I also made sure that it was idempotent.

I decided to share it because I couldn’t find anything similar, and I was annoyed that I had to write this in the first place. Hopefully it saves you some anguish. It also contains a clever little bash hack that I am proud to have in my script.

Here’s the script. You’ll need to fill in the map of mount folder names to drive UUID’s, and you’ll want to set your server hostname and FQDN to match your environment of course. It will prompt you for your root password to mount, and the LUKS password when needed.

Example of mounting:

james@computer:~$ rluks.sh 
Running on: myserver...
[sudo] password for james: 
Mount/Unmount [m/u] ? m
Mounting...
music: mkdir ✓
LUKS Password: 
music: luksOpen ✓
music: mount ✓
files: mkdir ✓
files: luksOpen ✓
files: mount ✓
photos: mkdir ✓
photos: luksOpen ✓
photos: mount ✓
Done!
Connection to server.example.com closed.

Example of unmounting:

james@computer:~$ rluks.sh 
Running on: myserver...
[sudo] password for james: 
Sorry, try again.
[sudo] password for james: 
Mount/Unmount [m/u] ? u
Unmounting...
music: umount ✓
music: luksClose ✓
music: rmdir ✓
files: umount ✓
files: luksClose ✓
files: rmdir ✓
photos: umount ✓
photos: luksClose ✓
photos: rmdir ✓
Done!
Connection to server.example.com closed.
james@computer:~$

It’s worth mentioning that there are many improvements that could be made to this script. If you’ve got patches, send them my way. After all, this is only a: one hour hack.

Happy hacking,

James

PS: One day this sort of thing might be possible in mgmt. Let me know if you want to help work on it!

Automatic grouping in mgmt

In this post, I’ll tell you about the recently released “automatic grouping” or “AutoGroup” feature in mgmt, a next generation configuration management prototype. If you aren’t already familiar with mgmt, I’d recommend you start by reading the introductory post, and the second post. There’s also an introductory video.

Resources in a graph

Most configuration management systems use something called a directed acyclic graph, or DAG. This is a fancy way of saying that it is a bunch of circles (vertices) which are connected with arrows (edges). The arrows must be connected to exactly two vertices, and you’re only allowed to move along each arrow in one direction (directed). Lastly, if you start at any vertex in the graph, you must never be able to return to where you started by following the arrows (acyclic). If you can, the graph is not fit for our purpose.

A DAG from Wikipedia

An example DAG from Wikipedia

The graphs in configuration management systems usually represent the dependency relationships (edges) between the resources (vertices) which is important because you might want to declare that you want a certain package installed before you start a service. To represent the kind of work that you want to do, different kinds of resources exist which you can use to specify that work.

Each of the vertices in a graph represents a unique resource, and each is backed by an individual software routine or “program” which can check the state of the resource, and apply the correct state if needed. This makes each resource idempotent. If we have many individual programs, this might turn out to be a lot of work to do to get our graph into the desired state!

Resource grouping

It turns out that some resources have a fixed overhead to starting up and running. If we can group resources together so that they share this fixed overhead, then our graph might converge faster. This is exactly what we do in mgmt!

Take for example, a simple graph such as the following:

Simple DAG showing three pkg, two file, and one svc resource

Simple DAG showing one svc, two file, and three pkg resources…

We can logically group the three pkg resources together and redraw the graph so that it now looks like this:

DAG with the three pkg resources now grouped into one.

DAG with the three pkg resources now grouped into one! Overlapping vertices mean that they act as if they’re one vertex instead of three!

This all happens automatically of course! It is very important that the new graph is a faithful, logical representation of the original graph, so that the specified dependency relationships are preserved. What this represents, is that when multiple resources are grouped (shown by overlapping vertices in the graph) they run together as a single unit. This is the practical difference between running:

$ dnf install -y powertop
$ dnf install -y sl
$ dnf install -y cowsay

if not grouped, and:

$ dnf install -y powertop sl cowsay

when grouped. If you try this out you’ll see that the second scenario is much faster, and on my laptop about three times faster! This is because of fixed overhead such as cache updates, and the dnf dep solver that each process runs.

This grouping means mgmt uses this faster second scenario instead of the slower first scenario that all the current generation tools do. It’s also important to note that different resources can implement the grouping feature to optimize for different things besides performance. More on that later…

The algorithm

I’m not an algorithmist by training, so it took me some fiddling to come up with an appropriate solution. I’ve implemented it along with an extensive testing framework and a series of test cases, which it passes of course! If we ever find a graph that does not get grouped correctly, then we can iterate on the algorithm and add it as a new test case.

The algorithm turns out to be relatively simple. I first noticed that vertices which had a relationship between them must not get grouped, because that would undermine the precedence ordering of the vertices! This property is called reachability. I then attempt to group every vertex to every other vertex that has no reachability or reverse reachability to it!

The hard part turned out to be getting all the plumbing surrounding the algorithm correct, and in particular the actual vertex merging algorithm, so that “discarded edges” are reattached in the correct places. I also took a bit of extra time to implement the algorithm as a struct which satisfies an “AutoGrouper” interface. This way, if you’d like to implement a different algorithm, it’s easy to drop in your replacement. I’m fairly certain that a more optimal version of my algorithm is possible for anyone wishing to do the analysis.

A quick note on nomenclature: I’ve actually decided to call this grouping and not merging, because we actually preserve the unique data of each resource so that they can be taken apart and mixed differently when (and if) there is a change in the compiled graph. This makes graph changeovers very cheap in mgmt, because we don’t have to re-evaluate anything which remains constant between graphs. Merging would imply a permanent reduction and loss of unique identity.

Parallelism and user choice

It’s worth noting two important points:

  1. Auto grouping of resources usually decreases the parallelism of a graph.
  2. The user might not want a particular resource to get grouped!

You might remember that one of the novel properties of mgmt, is that it executes the graph in parallel whenever possible. Although the grouping of resources actually removes some of this parallelism, certain resources such as the pkg resource already have an innate constraint on sequential behaviour, namely: the package manager lock. Since these tools can’t operate in parallel, and since each execution has a fixed overhead, it’s almost always beneficial to group pkg resources together.

Grouping is also not mandatory, so while it is a sensible default, you can disable grouping per resource with a simple meta parameter.

Lastly, it’s also worth mentioning that grouping doesn’t “magically” happen without some effort. The underlying resource needs to know how to optimize, watch, check and apply multiple resources simultaneously for it to support the feature. At the moment, only the pkg resource can do any grouping, and even then, there could always be some room for improvement. It’s also not optimal (or even logical) to group certain types of resources, so those will never be able to do any grouping. We also don’t group together resources of different kinds, although mgmt could support this if a valid use case is ever found.

File grouping

As I mentioned, only the pkg resource supports grouping at this time. The file resource demonstrates a different use case for resource grouping. Suppose you want to monitor 10000 files in a particular directory, but they are specified individually. This would require far too many inotify watches than a normal system usually has, so the grouping algorithm could group them into a single resource, which then uses a recursive watcher such as fanotify to reduce the watcher count by a factor of 10000. Unfortunately neither the file resource grouping, nor the fanotify support for this exist at the moment. If you’d like to implement either of these, please let me know!

If you can think of another resource kind that you’d like to write, or in particular, if you know of one which would work well with resource grouping, please contact me!

Science!

I wouldn’t be a very good scientist (I’m actually a Physiologist by training) if I didn’t include some data and a demonstration that this all actually works, and improves performance! What follows will be a good deal of information, so skim through the parts you don’t care about.

Science <3 data

Science <3 data

I decided to test the following four scenarios:

  1. single package, package check, package already installed
  2. single package, package install, package not installed
  3. three packages, package check, packages already installed
  4. three packages, package install, packages not installed

These are the situations you’d encounter when running your tool of choice to install one or more packages, and finding them either already present, or in need of installation. I timed each test, which ends when the tool tells us that our system has converged.

Each test is performed multiple times, and the average is taken, but only after we’ve run the tool at least twice so that the caches are warm.

We chose small packages so that the fixed overhead delays due to bandwidth and latencies are minimal, and so that our data is more representative of the underlying tool.

The single package tests use the powertop package, and the three package tests use powertop, sl, and cowsay. All tests were performed on an up-to-date Fedora 23 laptop, with an SSD. If you haven’t tried sl and cowsay, do give them a go!

The four tools tested were:

  1. puppet
  2. mgmt
  3. pkcon
  4. dnf

The last two are package manager front ends so that it’s more obvious how expensive something is expected to cost, and so that you can discern what amount of overhead is expected, and what puppet or mgmt is causing you. Here are a few representative runs:

mgmt installation of powertop:

$ time sudo ./mgmt run --file examples/pkg1.yaml --converged-timeout=0
21:04:18 main.go:63: This is: mgmt, version: 0.0.3-1-g6f3ac4b
21:04:18 main.go:64: Main: Start: 1459299858287120473
21:04:18 main.go:190: Main: Running...
21:04:18 main.go:113: Etcd: Starting...
21:04:18 main.go:117: Main: Waiting...
21:04:18 etcd.go:113: Etcd: Watching...
21:04:18 etcd.go:113: Etcd: Watching...
21:04:18 configwatch.go:54: Watching: examples/pkg1.yaml
21:04:20 config.go:272: Compile: Adding AutoEdges...
21:04:20 config.go:533: Compile: Grouping: Algorithm: nonReachabilityGrouper...
21:04:20 main.go:171: Graph: Vertices(1), Edges(0)
21:04:20 main.go:174: Graphviz: No filename given!
21:04:20 pgraph.go:764: State: graphStateNil -> graphStateStarting
21:04:20 pgraph.go:825: State: graphStateStarting -> graphStateStarted
21:04:20 main.go:117: Main: Waiting...
21:04:20 pkg.go:245: Pkg[powertop]: CheckApply(true)
21:04:20 pkg.go:303: Pkg[powertop]: Apply
21:04:20 pkg.go:317: Pkg[powertop]: Set: installed...
21:04:25 packagekit.go:399: PackageKit: Woops: Signal.Path: /8442_beabdaea
21:04:25 packagekit.go:399: PackageKit: Woops: Signal.Path: /8443_acbadcbd
21:04:31 pkg.go:335: Pkg[powertop]: Set: installed success!
21:04:31 main.go:79: Converged for 0 seconds, exiting!
21:04:31 main.go:55: Interrupted by exit signal
21:04:31 pgraph.go:796: Pkg[powertop]: Exited
21:04:31 main.go:203: Goodbye!

real    0m13.320s
user    0m0.023s
sys    0m0.021s

puppet installation of powertop:

$ time sudo puppet apply pkg.pp 
Notice: Compiled catalog for computer.example.com in environment production in 0.69 seconds
Notice: /Stage[main]/Main/Package[powertop]/ensure: created
Notice: Applied catalog in 10.13 seconds

real    0m18.254s
user    0m9.211s
sys    0m2.074s

dnf installation of powertop:

$ time sudo dnf install -y powertop
Last metadata expiration check: 1:22:03 ago on Tue Mar 29 20:04:29 2016.
Dependencies resolved.
==========================================================================
 Package          Arch           Version            Repository       Size
==========================================================================
Installing:
 powertop         x86_64         2.8-1.fc23         updates         228 k

Transaction Summary
==========================================================================
Install  1 Package

Total download size: 228 k
Installed size: 576 k
Downloading Packages:
powertop-2.8-1.fc23.x86_64.rpm            212 kB/s | 228 kB     00:01    
--------------------------------------------------------------------------
Total                                     125 kB/s | 228 kB     00:01     
Running transaction check
Transaction check succeeded.
Running transaction test
Transaction test succeeded.
Running transaction
  Installing  : powertop-2.8-1.fc23.x86_64                            1/1 
  Verifying   : powertop-2.8-1.fc23.x86_64                            1/1 

Installed:
  powertop.x86_64 2.8-1.fc23                                              

Complete!

real    0m10.406s
user    0m4.954s
sys    0m0.836s

puppet installation of powertop, sl and cowsay:

$ time sudo puppet apply pkg3.pp 
Notice: Compiled catalog for computer.example.com in environment production in 0.68 seconds
Notice: /Stage[main]/Main/Package[powertop]/ensure: created
Notice: /Stage[main]/Main/Package[sl]/ensure: created
Notice: /Stage[main]/Main/Package[cowsay]/ensure: created
Notice: Applied catalog in 33.02 seconds

real    0m41.229s
user    0m19.085s
sys    0m4.046s

pkcon installation of powertop, sl and cowsay:

$ time sudo pkcon install powertop sl cowsay
Resolving                     [=========================]         
Starting                      [=========================]         
Testing changes               [=========================]         
Finished                      [=========================]         
Installing                    [=========================]         
Querying                      [=========================]         
Downloading packages          [=========================]         
Testing changes               [=========================]         
Installing packages           [=========================]         
Finished                      [=========================]         

real    0m14.755s
user    0m0.060s
sys    0m0.025s

and finally, mgmt installation of powertop, sl and cowsay with autogrouping:

$ time sudo ./mgmt run --file examples/autogroup2.yaml --converged-timeout=0
21:16:00 main.go:63: This is: mgmt, version: 0.0.3-1-g6f3ac4b
21:16:00 main.go:64: Main: Start: 1459300560994114252
21:16:00 main.go:190: Main: Running...
21:16:00 main.go:113: Etcd: Starting...
21:16:00 main.go:117: Main: Waiting...
21:16:00 etcd.go:113: Etcd: Watching...
21:16:00 etcd.go:113: Etcd: Watching...
21:16:00 configwatch.go:54: Watching: examples/autogroup2.yaml
21:16:03 config.go:272: Compile: Adding AutoEdges...
21:16:03 config.go:533: Compile: Grouping: Algorithm: nonReachabilityGrouper...
21:16:03 config.go:533: Compile: Grouping: Success for: Pkg[powertop] into Pkg[cowsay]
21:16:03 config.go:533: Compile: Grouping: Success for: Pkg[sl] into Pkg[cowsay]
21:16:03 main.go:171: Graph: Vertices(1), Edges(0)
21:16:03 main.go:174: Graphviz: No filename given!
21:16:03 pgraph.go:764: State: graphStateNil -> graphStateStarting
21:16:03 pgraph.go:825: State: graphStateStarting -> graphStateStarted
21:16:03 main.go:117: Main: Waiting...
21:16:03 pkg.go:245: Pkg[autogroup:(cowsay,powertop,sl)]: CheckApply(true)
21:16:03 pkg.go:303: Pkg[autogroup:(cowsay,powertop,sl)]: Apply
21:16:03 pkg.go:317: Pkg[autogroup:(cowsay,powertop,sl)]: Set: installed...
21:16:08 packagekit.go:399: PackageKit: Woops: Signal.Path: /8547_cbeaddda
21:16:08 packagekit.go:399: PackageKit: Woops: Signal.Path: /8548_bcaadbce
21:16:16 pkg.go:335: Pkg[autogroup:(cowsay,powertop,sl)]: Set: installed success!
21:16:16 main.go:79: Converged for 0 seconds, exiting!
21:16:16 main.go:55: Interrupted by exit signal
21:16:16 pgraph.go:796: Pkg[cowsay]: Exited
21:16:16 main.go:203: Goodbye!

real    0m15.621s
user    0m0.040s
sys    0m0.038s

Results and analysis

My hard work seems to have paid off, because we do indeed see a noticeable improvement from grouping package resources. The data shows that even in the single package comparison cases, mgmt has very little overhead, which is demonstrated by seeing that the mgmt run times are very similar to the times it takes to run the package managers manually.

In the three package scenario, performance is approximately 2.39 times faster than puppet for installation. Checking was about 12 times faster! These ratios are expected to grow with a larger number of resources.

Sweet graph...

Bigger bars is worse… Puppet is in Red, mgmt is in blue.

The four groups at the bottom along the x axis correspond to the four scenarios I tested, 1, 2 and 3 corresponding to each run of that scenario, with the average of the three listed there too.

Versions

The test wouldn’t be complete if we didn’t tell you which specific version of each tool that we used. Let’s time those as well! ;)

puppet:

$ time puppet --version 
4.2.1

real    0m0.659s
user    0m0.525s
sys    0m0.064s

mgmt:

$ time ./mgmt --version
mgmt version 0.0.3-1-g6f3ac4b

real    0m0.007s
user    0m0.006s
sys    0m0.002s

pkcon:

$ time pkcon --version
1.0.11

real    0m0.013s
user    0m0.006s
sys    0m0.005s

dnf:

$ time dnf --version
1.1.7
  Installed: dnf-0:1.1.7-2.fc23.noarch at 2016-03-17 13:37
  Built    : Fedora Project at 2016-03-09 16:45

  Installed: rpm-0:4.13.0-0.rc1.12.fc23.x86_64 at 2016-03-03 09:39
  Built    : Fedora Project at 2016-02-29 09:53

real    0m0.438s
user    0m0.379s
sys    0m0.036s

Yep, puppet even takes the longest to tell us what version it is. Now I’m just teasing…

Methodology

It might have been more useful to time the removal of packages instead so that we further reduce the variability of internet bandwidth and latency, although since most configuration management is used to install packages (rather than remove), we figured this would be more appropriate and easy to understand. You’re welcome to conduct your own study and share the results!

Additionally, for fun, I also looked at puppet runs where three individual resources were used instead of a single resource with the title being an array of all three packages, and found no significant difference in the results. Indeed puppet runs dnf three separate times in either scenario:

$ ps auxww | grep dnf
root     12118 27.0  1.4 417060 110864 ?       Ds   21:57   0:03 /usr/bin/python3 /usr/bin/dnf -d 0 -e 0 -y install powertop
$ ps auxww | grep dnf
root     12713 32.7  2.0 475204 159840 ?       Rs   21:57   0:02 /usr/bin/python3 /usr/bin/dnf -d 0 -e 0 -y install sl
$ ps auxww | grep dnf
root     13126  0.0  0.7 275324 55608 ?        Rs   21:57   0:00 /usr/bin/python3 /usr/bin/dnf -d 0 -e 0 -y install cowsay

Data

If you’d like to download the raw data as a text formatted table, and the terminal output from each type of run, I’ve posted it here.

Conclusion

I hope that you enjoyed this feature and analysis, and that you’ll help contribute to making it better. Come join our IRC channel and say hello! Thanks to those who reviewed my article and pointed out some good places for improvements!

Happy Hacking,

James

 

Automatic edges in mgmt

It’s been two months since I announced mgmt, and now it’s time to continue the story by telling you more about the design of what’s now in git master. Before I get into those details, let me quickly recap what’s happened since then.

Mgmt community recap:

Okay, time to tell you what’s new in mgmt!

Types vs. resources:

Configuration management systems have the concept of primitives for the basic building blocks of functionality. The well-known ones are “package”, “file”, “service” and “exec” or “execute”. Chef calls these “resources”, while puppet (misleadingly) calls them “types”.

I made the mistake of calling the primitives in mgmt, “types”, no doubt due to my extensive background in puppet, however this overloads the word because it usually refers to programming types, so I’ve decided not to use it any more to refer to these primitives. The Chef folks got this right! From now on they’ll be referred to as “resources” or “res” when abbreviated.

Mgmt now has: “noop“, “file“, “svc“, “exec“, and now: “pkg“…

The package (pkg) resource:

The most obvious change to mgmt, is that it now has a package resource. I’ve named it “pkg” because we hackers prefer to keep strings short. Creating a pkg resource is difficult for two reasons:

  1. Mgmt is event based
  2. Mgmt should support many package managers

Event based systems would involve an inotify watch on the rpmdb (or a similar watch on /var/lib/dpkg/), and the logic to respond to it. This engineering problem boils down to being able to support the entire matrix of possible GNU/Linux packaging systems. Yuck! Additionally, it would be particularly unfriendly if we primarily supported RPM and DNF based systems, but left the DPKG and APT backend out as “an exercise for the community”.

Therefore, we solve both of these problems by basing the pkg resource on the excellent PackageKit project! PackageKit provides the events we need, and more importantly, it supports many backends already! If there’s ever a new backend that needs to be added, you can add it upstream in PackageKit, and everyone (including mgmt) will benefit from your work.

As a result, I’m proud to announce that both Debian and Fedora, (and many other friends) all got a working pkg resource on day one. Here’s a small demo:

Run mgmt:

root@debian:~/mgmt# time ./mgmt run --file examples/pkg1.yaml 
18:58:25 main.go:65: This is: mgmt, version: 0.0.2-41-g963f025
[snip]
18:18:44 pkg.go:208: Pkg[powertop]: CheckApply(true)
18:18:44 pkg.go:259: Pkg[powertop]: Apply
18:18:44 pkg.go:266: Pkg[powertop]: Set: installed...
18:18:52 pkg.go:284: Pkg[powertop]: Set: installed success!

The “powertop” package will install… Now while mgmt is still running, remove powertop:

root@debian:~# pkcon remove powertop
Resolving                     [=========================]         
Testing changes               [=========================]         
Finished                      [=========================]         
Removing                      [=========================]         
Loading cache                 [=========================]         
Running                       [=========================]         
Removing packages             [=========================]         
Committing changes            [=========================]         
Finished                      [=========================]         
root@debian:~# which powertop
/usr/sbin/powertop

It gets installed right back! Similarly, you can do it like this:

root@debian:~# apt-get -y remove powertop
Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following packages were automatically installed and are no longer required:
  libnl-3-200 libnl-genl-3-200
Use 'apt-get autoremove' to remove them.
The following packages will be REMOVED:
  powertop
0 upgraded, 0 newly installed, 1 to remove and 80 not upgraded.
After this operation, 542 kB disk space will be freed.
(Reading database ... 47528 files and directories currently installed.)
Removing powertop (2.6.1-1) ...
Processing triggers for man-db (2.7.0.2-5) ...
root@debian:~# which powertop
/usr/sbin/powertop

And it will also get installed right back! Try it yourself to see it happen “live”! Similar behaviour can be seen on Fedora and other distros.

As a quite aside. If you’re a C hacker, and you would like to help with the upstream PackageKit project, they would surely love your contributions, and in particular, we here working on mgmt would especially like it if you worked on any of the open issues that we’ve uncovered. In order from increasing to decreasing severity, they are: #118 (please help!), #117 (needs love), #110 (needs testing), and #116 (would be nice to have). If you’d like to test mgmt on your favourite distro, and report and fix any issues, that would be helpful too!

Automatic edges:

Since we’re now hooked into the pkg resource, there’s no reason we can’t use that wealth of knowledge to make mgmt more powerful. For example, the PackageKit API can give us the list of files that a certain package would install. Since any file resource would obviously want to get “applied” after the package is installed, we use this information to automatically generate the relationships or “edges” in the graph. This means that module authors don’t have to waste time manually adding or updating the “require” relationships in their modules!

For example, the /etc/drbd.conf file, will want to require the drbd-utils package to be installed first. With traditional config management systems, without this dependency chain, after one run, your system will not be in a converged state, and would require another run. With mgmt, since it is event based, it would converge, except it might run in a sub-optimal order. That’s one reason why we add this dependency for you automatically.

This is represented via what mgmt calls the “AutoEdges” API. (If you can think of a better name, please tell me now!) It’s also worth noting that this isn’t entirely a novel idea. Puppet has a concept of “autorequires”, which is used for some of their resources, but doesn’t work with packages. I’m particularly proud of my design, because in my opinion, I think the API and mechanism in mgmt are much more powerful and logical.

Here’s a small demo:

james@fedora:~$ ./mgmt run --file examples/autoedges3.yaml 
20:00:38 main.go:65: This is: mgmt, version: 0.0.2-42-gbfe6192
[snip]
20:00:38 configwatch.go:54: Watching: examples/autoedges3.yaml
20:00:38 config.go:248: Compile: Adding AutoEdges...
20:00:38 config.go:313: Compile: Adding AutoEdge: Pkg[drbd-utils] -> Svc[drbd]
20:00:38 config.go:313: Compile: Adding AutoEdge: Pkg[drbd-utils] -> File[file1]
20:00:38 config.go:313: Compile: Adding AutoEdge: Pkg[drbd-utils] -> File[file2]
20:00:38 main.go:149: Graph: Vertices(4), Edges(3)
[snip]

Here we define four resources: pkg (drbd-utils), svc (drbd), and two files (/etc/drbd.conf and /etc/drbd.d/), both of which happen to be listed inside the RPM package! The AutoEdge magic works out these dependencies for us by examining the package data, and as you can see, adds the three edges. Unfortunately, there is no elegant way that I know of to add an automatic relationship between the svc and any of these files at this time. Suggestions welcome.

Finally, we also use the same interface to make sure that a parent directory gets created before any managed file that is a child of it.

Automatic edges internals:

How does it work? Each resource has a method to generate a “unique id” for that resource. This happens in the “GetUUIDs” method. Additionally, each resource has an “AutoEdges” method which, unsurprisingly, generates an “AutoEdge” object (struct). When the compiler is generating the graph and adding edges, it calls two methods on this AutoEdge object:

  1. Next()
  2. Test(…)

The Next() method produces a list of possible matching edges for that resource. Whichever edges match are added to the graph, and the results of each match is fed into the Test(…) function. This information is used to tell the resource whether there are more potential matches available or not. The process iterates until Test returns false, which means that there are no other available matches.

This ensures that a file: /var/lib/foo/bar/baz/ will first seek a dependency on /var/lib/foo/bar/, but be able to fall back to /var/lib/ if that’s the first file resource available. This way, we avoid adding more resource dependencies than necessary, which would diminish the amount of parallelism possible, while running the graph.

Lastly, it’s also worth noting that users can choose to disable AutoEdges per resource if they so desire. If you’ve got an idea for a clever automatic edge, please contact me, send a patch, or leave the information in the comments!

Contributing:

Good ideas and designs help, but contributors is what will make the project. All sorts of help is appreciated, and you can join in even if you’re not an “expert”. I’ll try and tag “easy” or “first time” patches with the “mgmtlove” tag. Feel free to work on other issues too, or suggest something that you think will help! Want to add more interesting resources! Anyone want to write a libvirt resource? How about a network resource? Use your imagination!

Lastly, thanks to both Richard and Matthias Klumpp in particular from the PackageKit project for their help so far, and to everyone else who has contributed in some way.

That’s all for today. I’ve got more exciting things coming. Please contribute!

Happy Hacking!

James

Introducing: git tpush

On today’s issue of “one hour hacks”, I’ll show you how you can stop your git drive-by’s to git master from breaking your CI tests… Let’s continue!

The problem:

Sometimes I’ve got a shitty one-line patch that I want to push to git master. I’m usually right, and everything tests out fine, but usually isn’t always, and then I look silly while I frantically try to fix git master on a project that I maintain. Let’s use tools to hide our human flaws!

How you’re really supposed to do it:

Good hackers know that you’re supposed to:

  • checkout a new feature branch for your patch
    • checkout -b feat/foo
  • write the patch and commit it
    • git add -p && git commit -m 'my foo'
  • push that branch up to origin
    • git push origin feat/foo
  • wait for the tests to succeed
    • hub ci-status && sleep 1m && ...
  • merge and push to git master
    • git checkout master && git merge feat/foo && git push
  • delete your local branch
    • git branch -d feat/foo
  • delete the remote branch
    • git push origin :feat/foo
  • mourn all the lost time it took to push this patch
    • cat && drink && sleep 8h

If it happens to fail, you have to remember what the safe git reset command is, and then you have to start all over!

How do we make this easy?

I write tools, so I sat down and wrote a cheap little bash script to do it for me! Sharing is caring, so please enjoy a copy!

How does it work?

It actually just does all of the above for you automatically, and it automatically cleans up all of the temporary branches when it’s done! Here are two example runs to help show you the tool in action:

A failure scenario:

Here I add a patch with some trailing whitespace, which will easily get caught by the automatic test suite. You’ll note that I actually had to force the commit locally with git commit -n, because the pre-commit hook actually caught this first. You could extend this script to run your test suite locally before you push the patch, but that defeats the idea of a drive-by.

james@computer:~/code/mgmt$ git add -p
diff --git a/README.md b/README.md
index 277aa73..2c31356 100644
--- a/README.md
+++ b/README.md
@@ -34,6 +34,7 @@ Please get involved by working on one of these items or by suggesting something
 Please grab one of the straightforward [#mgmtlove](https://github.com/purpleidea/mgmt/labels/mgmtlove) issues if you're a first time contributor.
 
 ## Bugs:
+Some bugs are caused by bad drive-by patches to git master! 
 Please set the `DEBUG` constant in [main.go](https://github.com/purpleidea/mgmt/blob/master/main.go) to `true`, and post the logs when you report the [issue](https://github.com/purpleidea/mgmt/issues).
 Bonus points if you provide a [shell](https://github.com/purpleidea/mgmt/tree/master/test/shell) or [OMV](https://github.com/purpleidea/mgmt/tree/master/test/omv) reproducible test case.
 
Stage this hunk [y,n,q,a,d,/,e,?]? y
<stdin>:9: trailing whitespace.
Some bugs are caused by bad drive-by patches to git master! 
warning: 1 line adds whitespace errors.

james@computer:~/code/mgmt$ git commit -m 'XXX this is a bad patch'
README.md:37: trailing whitespace.
+Some bugs are caused by bad drive-by patches to git master! 
james@computer:~/code/mgmt$ git commit -nm 'XXX this is a bad patch'
[master 8b29fb3] XXX this is a bad patch
 1 file changed, 1 insertion(+)
james@computer:~/code/mgmt$ git tpush 
tpush branch is: tpush/test-0
Switched to a new branch 'tpush/test-0'
Counting objects: 3, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (3/3), done.
Writing objects: 100% (3/3), 357 bytes | 0 bytes/s, done.
Total 3 (delta 2), reused 0 (delta 0)
To git@github.com:purpleidea/mgmt.git
 * [new branch]      tpush/test-0 -> tpush/test-0
Switched to branch 'master'
Your branch is ahead of 'origin/master' by 1 commit.
  (use "git push" to publish your local commits)
- CI is starting...
/ Waiting for CI...
- Waiting for CI...
\ Waiting for CI...
| Waiting for CI...
Deleted branch tpush/test-0 (was 8b29fb3).
To git@github.com:purpleidea/mgmt.git
 - [deleted]         tpush/test-0
Upstream CI failed with:
failure: https://travis-ci.org/purpleidea/mgmt/builds/109472293
Fix branch 'master' with:
git commit --amend
or:
git reset --soft HEAD^
Happy hacking!

A successful scenario:

In this example, I amended the previous patch in my local copy of git master (which never got pushed to the public git master!) and then I ran git tpush again.

james@computer:~/code/mgmt$ git commit --amend
[master 1186d63] Add a link to my article about debugging golang
 Date: Mon Feb 15 17:28:01 2016 -0500
 1 file changed, 1 insertion(+)
james@computer:~/code/mgmt$ git tpush 
tpush branch is: tpush/test-0
Switched to a new branch 'tpush/test-0'
Counting objects: 3, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (3/3), done.
Writing objects: 100% (3/3), 397 bytes | 0 bytes/s, done.
Total 3 (delta 2), reused 0 (delta 0)
To git@github.com:purpleidea/mgmt.git
 * [new branch]      tpush/test-0 -> tpush/test-0
Switched to branch 'master'
Your branch is ahead of 'origin/master' by 1 commit.
  (use "git push" to publish your local commits)
- CI is starting...
/ Waiting for CI...
- Waiting for CI...
\ Waiting for CI...
| Waiting for CI...
Deleted branch tpush/test-0 (was 1186d63).
To git@github.com:purpleidea/mgmt.git
 - [deleted]         tpush/test-0
Counting objects: 3, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (3/3), done.
Writing objects: 100% (3/3), 397 bytes | 0 bytes/s, done.
Total 3 (delta 2), reused 0 (delta 0)
To git@github.com:purpleidea/mgmt.git
   6e68d6d..1186d63  master -> master
Done!

Please take note of the “Waiting for CI…” text. This is actually a spinner which updates itself. Run this tool yourself to get the full effect! This avoids having 200 line scrollback in your terminal while you wait for your CI to complete.

Lastly, you’ll note that I run git tpush instead of git-tpush. This is because I added an alias to my ~/.gitconfig. It’s quite easy, just add:

tpush = !git-tpush

under the [alias] section, and ensure the git-tpush shell script is in your ~/bin/ folder and is executable.

Pre-requisites:

This actually depends on the hub command line tool which knows how to ask our CI about the test status. It sucks using a distributed tool like git with a centralized thing like github, but it’s still pretty easy for me to move away if they go bad. You’ll also want the tput installed if you want the fancy spinner, but that comes with most GNU/Linux distros by default.

The code:

Grab a copy here!

Happy hacking,

James

PS: No, I couldn’t think of a better name than git-tpush, if you don’t like it feel free to rename it yourself!

Debugging golang programs

I’ve been writing a lot of golang lately. I’ve hit painful problems in the past. Here are some debugging tips. Hopefully they help you out. I bet you don’t know #2.

#0 Use log.Printf:

This should go without saying, but I’m ashamed to say it’s what I use the most. We’ve only been C programming for 44+ years, and it’s still what is most useful!

#1 Use go run -race:

Since many problems are caused by random races, ensuring you use the built-in tools will help you catch some problems. One of these is a race detector. You can use it by adding the -race flag to the go run, go build or go test commands. In practice, it only ever caught beginner issues, and it hasn’t set off any alarms since. Maybe I need to write more test cases! Maybe you need to write more test cases!

#2 Use Control+Backslash:

Say what? If you press control+backslash, you will cause a core dump. Example:

james@computer:/tmp$ sleep 42h
^\Quit (core dumped)

Obviously there are nicer ways to kill a process (I for one, welcome our robotic overlords) but in times of emergency, use what you’ve got. The interesting thing, is that when you do this to a golang program, you’ll get much more interesting output:

james@computer:~/code/mgmt$ ./mgmt run --file examples/graph0.yaml 
16:19:26 main.go:65: This is: mgmt, version: 0.0.2-6-g6e68d6d
16:19:26 main.go:66: Main: Start: 1455571166809588335
16:19:26 main.go:196: Main: Running...
16:19:26 main.go:106: Etcd: Starting...
16:19:26 etcd.go:132: Etcd: Watching...
16:19:26 configwatch.go:54: Watching: examples/graph0.yaml
16:19:26 etcd.go:159: Etcd: Waiting 1000 ms for connection...
16:19:26 main.go:149: Graph: Vertices(2), Edges(1)
16:19:26 main.go:152: Graphviz: No filename given!
16:19:26 main.go:163: State: graphNil -> graphStarting
16:19:26 main.go:165: State: graphStarting -> graphStarted
16:19:26 file.go:340: File[file1]: Apply
^\SIGQUIT: quit
PC=0x482a53 m=2

goroutine 0 [idle]:
runtime.futex(0xecae00, 0x0, 0x7f6f5dd8fde8, 0x0, 0x0, 0x4828ac, 0x3c, 0x0, 0x43392b, 0xecae00, ...)
    /usr/lib/golang/src/runtime/sys_linux_amd64.s:289 +0x23
runtime.futexsleep(0xecae00, 0x0, 0xdf8475800)
    /usr/lib/golang/src/runtime/os1_linux.go:56 +0xf0
runtime.notetsleep_internal(0xecae00, 0xdf8475800, 0xc820000900)
    /usr/lib/golang/src/runtime/lock_futex.go:171 +0x12b
runtime.notetsleep(0xecae00, 0xdf8475800, 0x0)
    /usr/lib/golang/src/runtime/lock_futex.go:191 +0x6b
runtime.sysmon()
    /usr/lib/golang/src/runtime/proc1.go:3022 +0x4aa
runtime.mstart1()
    /usr/lib/golang/src/runtime/proc1.go:715 +0xe8
runtime.mstart()
    /usr/lib/golang/src/runtime/proc1.go:685 +0x72

goroutine 1 [select]:
main.waitForSignal(0xc820076300)
    /home/james/code/mgmt/main.go:48 +0x5da
main.run(0xc82011c0f0)
    /home/james/code/mgmt/main.go:198 +0x779
github.com/codegangsta/cli.Command.Run(0xb4e9d8, 0x3, 0x0, 0x0, 0xc8200799d0, 0x1, 0x1, 0xb4e9d8, 0x3, 0x0, ...)
    /home/james/code/src/gopath/src/github.com/codegangsta/cli/command.go:127 +0x1052
github.com/codegangsta/cli.(*App).Run(0xc8200a8500, 0xc820074100, 0x4, 0x4, 0x0, 0x0)
    /home/james/code/src/gopath/src/github.com/codegangsta/cli/app.go:159 +0xc2f
main.main()
    /home/james/code/mgmt/main.go:283 +0xced

goroutine 17 [syscall, locked to thread]:
runtime.goexit()
    /usr/lib/golang/src/runtime/asm_amd64.s:1721 +0x1

goroutine 21 [syscall]:
os/signal.loop()
    /usr/lib/golang/src/os/signal/signal_unix.go:22 +0x18
created by os/signal.init.1
    /usr/lib/golang/src/os/signal/signal_unix.go:28 +0x37

goroutine 22 [select]:
main.run.func2(0xc82011c0f0, 0xc820122040, 0xc82011a5d0, 0xc82011e130, 0x5, 0xc820076360, 0xc82011e010)
    /home/james/code/mgmt/main.go:110 +0x1061
created by main.run
    /home/james/code/mgmt/main.go:168 +0x60c

goroutine 23 [select, locked to thread]:
runtime.gopark(0xc8a258, 0xc82002e728, 0xb4ec08, 0x6, 0x18, 0x2)
    /usr/lib/golang/src/runtime/proc.go:185 +0x163
runtime.selectgoImpl(0xc82002e728, 0x0, 0x18)
    /usr/lib/golang/src/runtime/select.go:392 +0xa64
runtime.selectgo(0xc82002e728)
    /usr/lib/golang/src/runtime/select.go:212 +0x12
runtime.ensureSigM.func1()
    /usr/lib/golang/src/runtime/signal1_unix.go:227 +0x353
runtime.goexit()
    /usr/lib/golang/src/runtime/asm_amd64.s:1721 +0x1

goroutine 7 [select]:
main.(*FileType).Watch(0xc820174180)
    /home/james/code/mgmt/file.go:156 +0x15f4
main.(*Graph).Start.func1(0xc82011e010, 0xc820010b80)
    /home/james/code/mgmt/pgraph.go:558 +0x7e
created by main.(*Graph).Start
    /home/james/code/mgmt/pgraph.go:560 +0x171

goroutine 25 [select]:
main.ConfigWatch.func1(0xc82009db40, 0x14, 0xc820076660)
    /home/james/code/mgmt/configwatch.go:74 +0x13c8
created by main.ConfigWatch
    /home/james/code/mgmt/configwatch.go:153 +0x67

goroutine 26 [sleep]:
time.Sleep(0x3b9aca00)
    /usr/lib/golang/src/runtime/time.go:59 +0xf9
main.(*EtcdWObject).EtcdWatch.func1(0x7f6f5c512c20, 0xc82009db80, 0xc820122040, 0xffffffffffffffff, 0xc820076360, 0xc820072af0)
    /home/james/code/mgmt/etcd.go:160 +0x7db
created by main.(*EtcdWObject).EtcdWatch
    /home/james/code/mgmt/etcd.go:205 +0xc2

goroutine 27 [chan send]:
main.(*EtcdWObject).EtcdChannelWatch.func1(0x7f6f5c512c78, 0xc820122140, 0x7f6f5c512ca0, 0xc8200786d0, 0xc8200766c0)
    /home/james/code/mgmt/etcd.go:109 +0xd9
created by main.(*EtcdWObject).EtcdChannelWatch
    /home/james/code/mgmt/etcd.go:111 +0x7b

goroutine 34 [syscall]:
syscall.Syscall6(0xe8, 0x4, 0xc82018dc24, 0x7, 0xffffffffffffffff, 0x0, 0x0, 0x0, 0x0, 0x0)
    /usr/lib/golang/src/syscall/asm_linux_amd64.s:44 +0x5
syscall.EpollWait(0x4, 0xc82018dc24, 0x7, 0x7, 0xffffffffffffffff, 0x0, 0x0, 0x0)
    /usr/lib/golang/src/syscall/zsyscall_linux_amd64.go:365 +0x89
gopkg.in/fsnotify%2ev1.(*fdPoller).wait(0xc820146000, 0xc89a00, 0x0, 0x0)
    /home/james/code/src/gopath/src/gopkg.in/fsnotify.v1/inotify_poller.go:85 +0xbc
gopkg.in/fsnotify%2ev1.(*Watcher).readEvents(0xc820158000)
    /home/james/code/src/gopath/src/gopkg.in/fsnotify.v1/inotify.go:179 +0x1af
created by gopkg.in/fsnotify%2ev1.NewWatcher
    /home/james/code/src/gopath/src/gopkg.in/fsnotify.v1/inotify.go:58 +0x315

goroutine 8 [select]:
main.(*NoopType).Watch(0xc82001c230)
    /home/james/code/mgmt/types.go:362 +0x31a
main.(*Graph).Start.func1(0xc82011e010, 0xc820010b60)
    /home/james/code/mgmt/pgraph.go:558 +0x7e
created by main.(*Graph).Start
    /home/james/code/mgmt/pgraph.go:560 +0x171

goroutine 31 [syscall]:
syscall.Syscall6(0xe8, 0x9, 0xc8201f7c24, 0x7, 0xffffffffffffffff, 0x0, 0x0, 0xc82017c440, 0xc82017c420, 0x15)
    /usr/lib/golang/src/syscall/asm_linux_amd64.s:44 +0x5
syscall.EpollWait(0x9, 0xc8201f7c24, 0x7, 0x7, 0xffffffffffffffff, 0x1, 0x0, 0x0)
    /usr/lib/golang/src/syscall/zsyscall_linux_amd64.go:365 +0x89
gopkg.in/fsnotify%2ev1.(*fdPoller).wait(0xc82009dcc0, 0x8000, 0x0, 0x0)
    /home/james/code/src/gopath/src/gopkg.in/fsnotify.v1/inotify_poller.go:85 +0xbc
gopkg.in/fsnotify%2ev1.(*Watcher).readEvents(0xc820090cd0)
    /home/james/code/src/gopath/src/gopkg.in/fsnotify.v1/inotify.go:179 +0x1af
created by gopkg.in/fsnotify%2ev1.NewWatcher
    /home/james/code/src/gopath/src/gopkg.in/fsnotify.v1/inotify.go:58 +0x315

rax    0xfffffffffffffffc
rbx    0x7f6f5dd8fde8
rcx    0x482a53
rdx    0x0
rdi    0xecae00
rsi    0x0
rbp    0x0
rsp    0x7f6f5dd8fdb0
r8     0x0
r9     0x0
r10    0x7f6f5dd8fde8
r11    0x246
r12    0x7ffec7010a4f
r13    0x7f6f5dd90700
r14    0x800000
r15    0x0
rip    0x482a53
rflags 0x246
cs     0x33
fs     0x0
gs     0x0

I find this particularly useful when you have a “stuck” goroutine. Killing the program usually makes it easy to find where everyone was waiting. AFAIK, everything is ordered by most recently used at the top. You’re welcome!

#3 Use a real debugger:

There is a GDB like golang debugger called “delve“. It’s probably something I would use more often, except I haven’t needed that much power, and it still has a number of rough edges. I’m sure if you could help improve that, it would help out the project. There was a talk at FOSDEM about it. If we’re lucky, we’ll eventually get video. I’m “looking forward” to using it more in the future.

Hope this helped,

Happy Hacking!

James

PS: The “other” slash, isn’t called “forward slash”, it’s just called “slash”

PPS: Yes, I say “golang”, no “go”. I think it’s shitty that google basically ripped off the go! name. I guess David lost this one.