Copyleft is Dead. Long live Copyleft!

Posted on October 17, 2017 by purpleidea

As you may have noticed, we recently re-licensed mgmt from the AGPL (Affero General Public License) to the regular GPL. This is a post explaining the decision and which hopefully includes some insights at the intersection of technology and legal issues.

Disclaimer:

I am not a lawyer, and these are not necessarily the opinions of my employer. I think I’m knowledgeable in this area, but I’m happy to be corrected in the comments. I’m friends with a number of lawyers, and they like to include disclaimer sections, so I’ll include this so that I blend in better.

Background:

It’s well understood in infrastructure coding that the control of, and trust in the software is paramount. It can be risky basing your business off of a product if the vendor has the ultimate ability to change the behaviour, discontinue the software, make it prohibitively expensive, or in the extreme case, use it as a backdoor for corporate espionage.

While many businesses have realized this, it’s unfortunate that many individuals have not. The difference might be protecting corporate secrets vs. individual freedoms, but that’s a discussion for another time. I use Fedora and GNOME, and don’t have any Apple products, but you might value the temporary convenience more. I also support your personal choice to use the software you want. (Not sarcasm.)

This is one reason why Red Hat has done so well. If they ever mistreated their customers, they’d be able to fork and grow new communities. The lack of an asymmetrical power dynamic keeps customers feeling safe and happy!

Section 13:

The main difference between the AGPL and the GPL is the “Remote Network Interaction” section. Here’s a simplified explanation:

Both licenses require that if you modify the code, you give back your contributions. “Copyleft” is Copyright law that legally requires this share-alike provision. These licenses never require this when using the software privately, whether as an individual or within a company. The thing that “activates” the licenses is distribution. If you sell or give someone a modified copy of the program, then you must also include the source code.

The AGPL extends the GPL in that it also activates the license if that software runs on a application providers computer which is common with hosted software-as-a-service. In other words, if you were an external user of a web calendaring solution containing AGPL software, then that provider would have to offer up the code to the application, whereas the GPL would not require this, and neither license would require distribution of code if the application was only available to employees of that company nor would it require distribution of the software used to deploy the calendaring software.

Network Effects and Configuration Management:

If you’re familiar with the infrastructure automation space, you’re probably already aware of three interesting facts:

Hosted configuration management as a service probably isn’t plausible
The infrastructure automation your product uses isn’t the product
Copyleft does not apply to the code or declarations that describe your configuration

As a result of this, it’s unlikely that the Section 13 requirement of the AGPL would actually ever apply to anyone using mgmt!

A number of high profile organizations outright forbid the use of the AGPL. Google and Openstack are two notable examples. There are others. Many claim this is because the cost of legal compliance is high. One argument I heard is that it’s because they live in fear that their entire proprietary software development business would be turned on its head if some sufficiently important library was AGPL. Despite weak enforcement, and with many companies flouting the GPL, Linux and the software industry have not shown signs of waning. Compliance has even helped their bottom line.

Nevertheless, as a result of misunderstanding, fear and doubt, using the AGPL still cuts off a portion of your potential contributors. Possible overzealous enforcing has also probably caused some to fear the GPL.

Foundations and Permissive Licensing:

Why use copyleft at all? Copyleft is an inexpensive way of keeping the various contributors honest. It provides an organization constitution so that community members that invest in the project all get a fair, representative stake.

In the corporate world, there is a lot of governance in the form of “foundations”. The most well-known ones exist in the United States and are usually classified as 501(c)(6) under US Federal tax law. They aren’t allowed to generate a profit, but they exist to fulfill the desires of their dues-paying membership. You’ve probably heard of the Linux Foundation, the .NET foundation, the OpenStack Foundation, and the recent Linux Foundation child, the CNCF. With the major exception being Linux, they primarily fund permissively licensed projects since that’s what their members demand, and the foundation probably also helps convince some percentage of their membership into voluntarily contributing back code.

Running an organization like this is possible, but it certainly adds a layer of overhead that I don’t think is necessary for mgmt at this point.

It’s also interesting to note that of the top corporate contributions to open source, virtually all of the licensing is permissive, usually under the Apache v2 license. I’m not against using or contributing to permissively licensed projects, but I do think there’s a danger if most of our software becomes a monoculture of non-copyleft, and I wanted to take a stand against that trend.

Innovation:

I started mgmt to show that there was still innovation to be done in the automation space, and I think I’ve achieved that. I still have more to prove, but I think I’m on the right path. I also wanted to innovate in licensing by showing that the AGPL isn’t actually harmful. I’m sad to say that I’ve lost that battle, and that maybe it was too hard to innovate in too many different places simultaneously.

Red Hat has been my main source of funding for this work up until now, and I’m grateful for that, but I’m sad to say that they’ve officially set my time quota to zero. Without their support, I just don’t have the energy to innovate in both areas. I’m sad to say it, but I’m more interested in the technical advancements than I am in the licensing progress it might have brought to our software ecosystem.

Conclusion / TL;DR:

If you, your organization, or someone you know would like to help fund my mgmt work either via a development grant, contract or offer of employment, or if you’d like to be a contributor to the project, please let me know! Without your support, mgmt will die.

Happy Hacking,

James

You can follow James on Twitter for more frequent updates and other random noise.

EDIT: I mentioned in my article that: “Hosted configuration management as a service probably isn’t plausible“. Turns out I was wrong. The splendiferous Nathen Harvey was kind enough to point out that Chef offers a hosted solution! It’s free for five hosts as well!

I was probably thinking more about how I would be using mgmt, and not about the greater ecosystem. If you’d like to build or use a hosted mgmt solution, please let me know!

Metaparameters in mgmt

Posted on March 1, 2017 by purpleidea

In mgmt we have meta parameters. They are similar in concept to what you might be familiar with from other tools, except that they are more clearly defined (in a single struct) and vastly more powerful.

In mgmt, a meta parameter is a parameter which is codified entirely in the engine, and which can be used by any resource. In contrast with Puppet, require/before are considered meta parameters, whereas in mgmt, the equivalent is a graph edge, which is not a meta parameter. [1]

Kinds

As of this writing we have seven different kinds of meta parameters:

AutoEdge
AutoGroup
Noop
Retry & Delay
Poll
Limit & Burst
Sema

The astute reader will note that there are actually nine different meta parameters listed, but I have grouped them into seven categories since some of them are very tightly interconnected. The first two, AutoEdge and AutoGroup have been covered in separate articles already, so they won’t be discussed here. To learn about the others, please read on…

Noop

Noop stands for no-operation. If it is set to true, we tell the CheckApply portion of the resource to not make any changes. It is up to the individual resource implementation to respect this facility, which is the case for all correctly written resources. You can learn more about this by reading the CheckApply section in the resource guide.

If you’d like to set the noop state on all resources at runtime, there is a cli flag which you can use to do so. It is unsurprisingly named --noop, and overrides all the resources in the graph. This is in stark contrast with Puppet which will allow an individual resource definition to override the user’s choice!

james@computer:/tmp$ cat noop.pp 
file { '/tmp/puppet.noop':
    content => "nope, nope, nope!\n",
    noop => false,    # set at the resource level
}
james@computer:/tmp$ time puppet apply noop.pp 
Notice: Compiled catalog for freed in environment production in 0.29 seconds
Notice: /Stage[main]/Main/File[/tmp/puppet.noop]/ensure: defined content as '{md5}d8bda32dd3fbf435e5a812b0ba3e9a95'
Notice: Applied catalog in 0.03 seconds

real    0m15.862s
user    0m7.423s
sys    0m1.260s
james@computer:/tmp$ file puppet.noop    # verify it worked
puppet.noop: ASCII text
james@computer:/tmp$ rm -f puppet.noop    # reset
james@computer:/tmp$ time puppet apply --noop noop.pp    # safe right?
Notice: Compiled catalog for freed in environment production in 0.30 seconds
Notice: /Stage[main]/Main/File[/tmp/puppet.noop]/ensure: defined content as '{md5}d8bda32dd3fbf435e5a812b0ba3e9a95'
Notice: Class[Main]: Would have triggered 'refresh' from 1 events
Notice: Stage[main]: Would have triggered 'refresh' from 1 events
Notice: Applied catalog in 0.02 seconds

real    0m15.808s
user    0m7.356s
sys    0m1.325s
james@computer:/tmp$ cat puppet.noop 
nope, nope, nope!
james@computer:/tmp$

If you look closely, Puppet just trolled you by performing an operation when you thought it would be noop! I think the behaviour is incorrect, but if this isn’t supposed to be a bug, then I’d sure like to know why!

It’s worth mentioning that there is also a noop resource in mgmt which is similarly named because it does absolutely nothing.

Retry & Delay

In mgmt we can run continuously, which means that it’s often more useful to do something interesting when there is a resource failure, rather than simply shutting down completely. As a result, if there is an error during the CheckApply phase of the resource execution, the operation can be retried (retry) a number of times, and there can be a delay between each retry.

The delay value is an integer representing the number of milliseconds to wait between retries, and it defaults to zero. The retry value is an integer representing the maximum number of allowed retries, and it defaults to zero. A negative value will permit an infinite number of retries. If the number of retries is exhausted, then the temporary resource failure will be converted into a permanent failure. Resources which depend on a failed resource will be blocked until there is a successful execution. When there is a successful CheckApply, the resource retry counter is reset.

In general it is best to leave these values at their defaults unless you are expecting spurious failures, this way if you do get a failure, it won’t be masked by the retry mechanism.

It’s worth mentioning that the Watch loop can fail as well, and that the retry and delay meta parameters apply to this as well! While these could have had their own set of meta parameters, I felt it would have unnecessarily cluttered up the interface, and I couldn’t think of a reason where it would be helpful to have different values. They do have their own separate retry counter and delay timer of course! If someone has a valid use case, then I’m happy to separate these.

If someone would like to implement a pluggable back-off algorithm (eg: exponential back-off) to be used here instead of a simple delay, then I think it would be a welcome addition!

Poll

Despite mgmt being event based, there are situations where you’d really like to poll instead of using the Watch method. For these cases, I reluctantly implemented a poll meta parameter. It does exactly what you’d expect, generating events every poll seconds. It defaults to zero which means that it is disabled, and Watch is used instead.

Despite my earlier knock of it, it is actually quite useful, in that some operations might require or prefer polling, and having it as a meta parameter means that those resources won’t need to duplicate the polling code.

This might be very powerful for an aws resource that can set up hosted Amazon ec2 resources. When combined with the retry and delay meta parameters, it will even survive outages!

One particularly interesting aspect is that ever since the converged graph detection was improved, we can still converge a graph and shutdown with the converged-timeout functionality while using polling! This is described in more detail in the documentation.

Limit & Burst

In mgmt, the events generated by the Watch main loop of a resource do not need to be 1-1 matched with the CheckApply remediation step. This is very powerful because it allows mgmt to collate multiple events into a single CheckApply step which is helpful for when the duration of the CheckApply step is longer than the interval between Watch events that are being generated often.

In addition, you might not want to constantly Check or Apply (converge) the state of your resource as often as it goes out of state. For this situation, that step can be rate limited with the limit and burst meta parameters.

The limit and burst meta parameters implement something known as a token bucket. This models a bucket which is filled with tokens and which is drained slowly. It has a particular rate limit (which sets a maximum rate) and a burst count which sets a maximum bolus which can be absorbed.

This doesn’t cause us to permanently miss events (and stay un-converged) because when the bucket overfills, instead of dropping events, we actually cache the last one for playback once the bucket falls within the execution rate threshold. Remember, we expect to be converged in the steady state, not at every infinitesimal delta t in between.

The limit and burst metaparams default to allowing an infinite rate, with zero burst. As it turns out, if you have a non-infinite rate, the burst must be non-zero or you will cause a Validate error. Similarly, a non-zero burst, with an infinite rate is effectively the same as the default. A good rule of thumb is to remember to either set both values or neither. This is all because of the mathematical implications of token buckets which I won’t explain in this article.

Sema

Sema is short for semaphore. In mgmt we have implemented P/V style counting semaphores. This is a mechanism for reducing parallelism in situations where there are not explicit dependencies between resources. This might be useful for when the number of operations might outnumber the number of CPUs on your machine and you want to avoid starving your other processes. Alternatively, there might be a particular operation that you want to add a mutex (mutual exclusion) around, which can be represented with a semaphore of size (1) one. Lastly, it was a particularly fun meta parameter to write, and I had been itching to do so for some time.

To use this meta parameter, simply give a list of semaphore ids to the resource you want to lock. These can be any string, and are shared globally throughout the graph. By default, they have a size of one. To specify a semaphore with a different size, append a colon (:) followed by an integer at the end of the semaphore id.

Valid ids might include: “some_id:42“, “hello:13“, and “lockname“. Remember, the size parameter is the number of simultaneous resources which can run their CheckApply methods at the same time. It does not prevent multiple Watch methods from returning events simultaneously.

If you would like to force a semaphore globally on all resources, you can pass in the --sema argument with a size integer. This will get appended to the existing semaphores. For example, to simulate Puppet’s traditional non-parallel execution, you could specify --sema 1.

Oh, no! Does this mean I can deadlock my graphs? Interestingly enough, this is actually completely safe! The reason is that because all the semaphores exist in the mgmt directed acyclic graph, and because that DAG represents dependencies that are always respected, there will always be a way to make progress, which eventually unblocks any waiting resources! The trick to doing this is ensuring that each resource always acquires the list of semaphores in alphabetical order. (Actually the order doesn’t matter as long as it’s consistent across the graph, and alphabetical is as good as any!) Unfortunately, I don’t have a formal proof of this, but I was able to convince myself on the back of an envelope that it is true! Please contact me if you can prove me right or wrong! The one exception is that a counting semaphore of size zero would never let anyone acquire it, so by definition it would permanently block, and as a result is not currently permitted.

The last important point to mention is about the interplay between automatic grouping and semaphores. When more than one resource is grouped, they are considered to be part of the same resource. As a result, the resulting list of semaphores is the sum of the individual semaphores, de-duplicated. This ensures that individual locking guarantees aren’t broken when multiple resources are combined.

Future

If you have ideas for future meta parameters, please let me know! We’d love to hear about your ideas on our mailing list or on IRC. If you’re shy, you can contact me privately as well.

Happy Hacking,

James

[1] This is a bit of an apples vs. flame-throwers comparison because I’m comparing the mgmt engine meta parameters with the puppet language meta parameters, but I think it’s worth mentioning because there’s a clear separation between the two in mgmt, where as the separation is much more blurry in the puppet scenario. It’s also true that the mgmt language might grow a concept of language-level meta parameters which has a partial set that only maps partially to engine meta parameters, but this is a discussion for another day!

Faster golang builds

Posted on February 26, 2017 by purpleidea

I’ve been hacking in golang since before version 1.4, and the speed at which my builds finished has been mostly trending downwards. Let’s look into the reasons and some fixes. TL;DR click-bait title: “Get 4x faster golang builds with this one trick!”.

Here are the three reasons my builds got slower:

The compiler

Before version 1.5, the compiler was written in C but with that release, it moved to being pure golang. This unfortunately reduced build performance quite measurably, even though it was the right decision for the project.

There have been slight improvements with newer versions, however Google’s focus has been improving runtime performance instead of build performance. This is understandable since they want to save lots of electricity dollars at scale, which is not as helpful for smaller shops where the developer iteration cycle is the metric to optimize.

This could still be improved if folks wanted to put in the effort. A gcc style -O0 option could help. The sad thing about this whole story is that “instant” builds were a major marketing feature of early golang presentations.

Project size

Over time, my main project (mgmt) has gotten much bigger! It compiles in a number of libraries, including etcd, prometheus, and more! This naturally increases build time and is a mostly unavoidable consequence of building cool things and standing on giants!

This mostly can’t be helped, but it can be mitigated…

Dependency caching

When you build a project and all of its dependencies, the unchanged dependencies shouldn’t need to be rebuilt! Unfortunately golang does a great job of silently rebuilding things unnecessarily. Here’s why…

When you run a build, golang will attempt to re-use any common artefacts from previous builds. If the golang versions or library versions don’t match, these won’t be used, and the compiler will redo this work. Unfortunately, by default, those results won’t be saved, causing you to waste CPU cycles every time you test!

When the intermediate results are kept, they are found in your $GOPATH/pkg/. To save them, you need to either run go install (which makes a mess in your $GOPATH/bin/, or you can run go build -i. (Thanks to Dave for the tip!)

The sad part of this story is that these aren’t cached by default, and stale results aren’t discarded by default! If you’re experiencing slow builds, you should rm -rf $GOPATH/pkg/ and then go build -i. After one successful build, future builds should be much faster!

Example

james@computer:~/code/mgmt$ time go build    # before

real    0m28.152s
user    1m17.097s
sys     0m5.235s

james@computer:~/code/mgmt$ time go build -i    # after

real    0m8.129s
user    0m12.014s
sys     0m0.839s

Debugging

If you want to debug what’s going on, you can always run go build -x.

Blame!

I don’t like assigning blame, but this feels like a case of the golang tools being obtuse, and the man pages being non-existent. The golang project has a lot of maturing to do to integrate sanely with a stock GNU environment:

build intermediates could be saved and discarded by default
man go build could exist and provide useful information
~~go build --help~~ go help build could provide more useful information
POSIX style flags could be used (eg: --help)
build cache could be stored in $XDG_CACHE_HOME.

Hope this helped improve your golang experience! I always knew something was going in in $GOPATH/pkg/, but I think it’s pretty absurd that I only fully understood it now. My builds are about 4x faster now. :)

Happy Hacking,

James

Remote execution in mgmt

Posted on October 7, 2016 by purpleidea

Bootstrapping a cluster from your laptop, or managing machines without needing to first setup a separate config management infrastructure are both very reasonable and fundamental asks. I was particularly inspired by Ansible‘s agent-less remote execution model, but never wanted to build a centralized orchestrator. I soon realized that I could have my ice cream and eat it too.

Prior knowledge

If you haven’t read the earlier articles about mgmt, then I recommend you start with those, and then come back here. The first and fourth are essential if you’re going to make sense of this article.

Limitations of existing orchestrators

Current orchestrators have a few limitations.

They can be a single point of failure
They can have scaling issues
They can’t respond instantaneously to node state changes (they poll)
They can’t usually redistribute remote node run-time data between nodes

Despite these limitations, orchestration is still very useful because of the facilities it provides. Since these facilities are essential in a next generation design, I set about integrating these features, but with a novel twist.

Implementation, Usage and Design

Mgmt is written in golang, and that decision was no accident. One benefit is that it simplifies our remote execution model.

To use this mode you run mgmt with the --remote flag. Each use of the --remote argument points to a different remote graph to execute. Eventually this will be integrated with the DSL, but this plumbing is exposed for early adopters to play around with.

Startup (part one)

Each invocation of --remote causes mgmt to remotely connect over SSH to the target hosts. This happens in parallel, and runs up to --cconns simultaneous connections.

A temporary directory is made on the remote host, and the mgmt binary and graph are copied across the wire. Since mgmt compiles down to a single statically compiled binary, it simplifies the transfer of the software. The binary is cached remotely to speed up future runs unless you pass the --no-caching option.

A TCP connection is tunnelled back over SSH to the originating hosts etcd server which is embedded and running inside of the initiating mgmt binary.

Execution (part two)

The remote mgmt binary is now run! It wires itself up through the SSH tunnel so that its internal etcd client can connect to the etcd server on the initiating host. This is particularly powerful because remote hosts can now participate in resource exchanges as if they were part of a regular etcd backed mgmt cluster! They don’t connect directly to each other, but they can share runtime data, and only need an incoming SSH port open!

Closure (part three)

At this point mgmt can either keep running continuously or it can close the connections and shutdown.

In the former case, you can either remain attached over SSH, or you can disconnect from the child hosts and let this new cluster take on a new life and operate independently of the initiator.

In the latter case you can either shutdown at the operators request (via a ^C on the initiator) or when the cluster has simultaneously converged for a number of seconds.

This second possibility occurs when you run mgmt with the familiar --converged-timeout parameter. It is indeed clever enough to also work in this distributed fashion.

Diagram

I’ve used by poor libreoffice draw skills to make a diagram. Hopefully this helps out my visual readers.

If you can improve this diagram, please let me know!

Example

I find that using one or more vagrant virtual machines for the remote endpoints is the best way to test this out. In my case I use Oh-My-Vagrant to set up these machines, but the method you use is entirely up to you! Here’s a sample remote execution. Please note that I have omitted a number of lines for brevity, and added emphasis to the more interesting ones.

james@hostname:~/code/mgmt$ ./mgmt run --remote examples/remote2a.yaml --remote examples/remote2b.yaml --tmp-prefix 
17:58:22 main.go:76: This is: mgmt, version: 0.0.5-3-g4b8ad3a
17:58:23 remote.go:596: Remote: Connect...
17:58:23 remote.go:607: Remote: Sftp...
17:58:23 remote.go:164: Remote: Self executable is: /home/james/code/gopath/src/github.com/purpleidea/mgmt/mgmt
17:58:23 remote.go:221: Remote: Remotely created: /tmp/mgmt-412078160/remote
17:58:23 remote.go:226: Remote: Remote path is: /tmp/mgmt-412078160/remote/mgmt
17:58:23 remote.go:221: Remote: Remotely created: /tmp/mgmt-412078160/remote
17:58:23 remote.go:226: Remote: Remote path is: /tmp/mgmt-412078160/remote/mgmt
17:58:23 remote.go:235: Remote: Copying binary, please be patient...
17:58:23 remote.go:235: Remote: Copying binary, please be patient...
17:58:24 remote.go:256: Remote: Copying graph definition...
17:58:24 remote.go:618: Remote: Tunnelling...
17:58:24 remote.go:630: Remote: Exec...
17:58:24 remote.go:510: Remote: Running: /tmp/mgmt-412078160/remote/mgmt run --hostname '192.168.121.201' --no-server --seeds 'http://127.0.0.1:2379' --file '/tmp/mgmt-412078160/remote/remote2a.yaml' --depth 1
17:58:24 etcd.go:2088: Etcd: Watch: Path: /_mgmt/exported/
17:58:24 main.go:255: Main: Waiting...
17:58:24 remote.go:256: Remote: Copying graph definition...
17:58:24 remote.go:618: Remote: Tunnelling...
17:58:24 remote.go:630: Remote: Exec...
17:58:24 remote.go:510: Remote: Running: /tmp/mgmt-412078160/remote/mgmt run --hostname '192.168.121.202' --no-server --seeds 'http://127.0.0.1:2379' --file '/tmp/mgmt-412078160/remote/remote2b.yaml' --depth 1
17:58:24 etcd.go:2088: Etcd: Watch: Path: /_mgmt/exported/
17:58:24 main.go:291: Config: Parse failure
17:58:24 main.go:255: Main: Waiting...
^C17:58:48 main.go:62: Interrupted by ^C
17:58:48 main.go:397: Destroy...
17:58:48 remote.go:532: Remote: Output...
|    17:58:23 main.go:76: This is: mgmt, version: 0.0.5-3-g4b8ad3a
|    17:58:47 main.go:419: Goodbye!
17:58:48 remote.go:636: Remote: Done!
17:58:48 remote.go:532: Remote: Output...
|    17:58:24 main.go:76: This is: mgmt, version: 0.0.5-3-g4b8ad3a
|    17:58:48 main.go:419: Goodbye!
17:58:48 remote.go:636: Remote: Done!
17:58:48 main.go:419: Goodbye!

You should see that we kick off the remote executions, and how they are wired back through the tunnel. In this particular case we terminated the runs with a ^C.

The example configurations I used are available here and here. If you had a terminal open on the first remote machine, after about a second you would have seen:

[root@omv1 ~]# ls -d /tmp/file*  /tmp/mgmt*
/tmp/file1a  /tmp/file2a  /tmp/file2b  /tmp/mgmt-412078160
[root@omv1 ~]# cat /tmp/file*
i am file1a
i am file2a, exported from host a
i am file2b, exported from host b

You can see the remote execution artifacts, and that there was clearly data exchange. You can repeat this example with --converged-timeout=5 to automatically terminate after five seconds of cluster wide inactivity.

Live remote hacking

Since mgmt is event based, and graph structure configurations manifest themselves as event streams, you can actually edit the input configuration on the initiating machine, and as soon as the file is saved, it will instantly remotely propagate and apply the graph differential.

For this particular example, since we export and collect resources through the tunnelled SSH connections, it means editing the exported file, will also cause both hosts to update that file on disk!

You’ll see this occurring with this message in the logs:

18:00:44 remote.go:973: Remote: Copied over new graph definition: examples/remote2b.yaml

While you might not necessarily want to use this functionality on a production machine, it will definitely make your interactive hacking sessions more useful, in particular because you never need to re-run parts of the graph which have already converged!

Auth

In case you’re wondering, mgmt can look in your ~/.ssh/ for keys to use for the auth, or it can prompt you interactively. It can also read a plain text password from the connection string, but this isn’t a recommended security practice.

Hierarchial remote execution

Even though we recommend running mgmt in a normal clustered mode instead of over SSH, we didn’t want to limit the number of hosts that can be configured using remote execution. For this reason it would be architecturally simple to add support for what we’ve decided to call “hierarchial remote execution”.

In this mode, the primary initiator would first connect to one or more secondary nodes, which would then stage a second series of remote execution runs resulting in an order of depth equal to two or more. This fan out approach can be used to distribute the number of outgoing connections across more intermediate machines, or as a method to conserve remote execution bandwidth on the primary link into your datacenter, by having the secondary machine run most of the remote execution runs.

This particular extension hasn’t been built, although some of the plumbing has been laid. If you’d like to contribute this feature to the upstream project, please join us in #mgmtconfig on Freenode and let us (I’m @purpleidea) know!

Docs

There is some generated documentation for the mgmt remote package available. There is also the beginning of some additional documentation in the markdown docs. You can help contribute to either of these by sending us a patch!

Novel resources

Our event based architecture can enable some previously improbable kinds of resources. In particular, I think it would be quite beautiful if someone built a provisioning resource. The Watch method of the resource API normally serves to notify us of events, but since it is a main loop that blocks in a select call, it could also be used to run a small server that hosts a kickstart file and associated TFTP images. If you like this idea, please help us build it!

Conclusion

I hope you enjoyed this article and found this remote execution methodology as novel as we do. In particular I hope that I’ve demonstrated that configuration software doesn’t have to be constrained behind a static orchestration topology.

Happy Hacking,

James

Seen in downtown Montreal…

Posted on August 3, 2016 by purpleidea

The Technical Blog of James was seen on an outdoor electronic display in downtown Montreal! Thanks to one of my readers for sending this in.

I guess the smart phone revolution is over, and people are taking to reading my articles on bigger screens! The “poutine” is decent proof that this is probably Montreal.

If you’ve got access to a large electronic display, put up the blog, snap a photo, and send it my way! I’ll post it here and send you some random stickers!

Happy Hacking,

James

PS: If you have some comments about this blog, please don’t be shy, send them my way.

Automatic clustering in mgmt

Posted on June 20, 2016 by purpleidea

In mgmt, deploying and managing your clustered config management infrastructure needs to be as automatic as the infrastructure you’re using mgmt to manage. With mgmt, instead of a centralized data store, we function as a distributed system, built on top of etcd and the raft protocol.

In this article, I’ll cover how this feature works.

Foreword:

Mgmt is a next generation configuration management project. If you haven’t heard of it yet, or you don’t remember why we use a distributed database, start by reading the previous articles:

Embedded etcd:

Since mgmt and etcd are both written in golang, the etcd code can be built into the same binary as mgmt. As a result, etcd can be managed directly from within mgmt. Unfortunately, there’s currently no recommended API to do this, but I’ve tried to get such a feature upstream to avoid code duplication in mgmt. If you can help out here, I’d really appreciate it! In the meantime, I’ve had to copy+paste the necessary portions into mgmt.

Clustering mechanics:

You can deploy an automatically clustered mgmt cluster by following these three steps:

1) If no mgmt servers exist you can start one up by running mgmt normally:

./mgmt run --file examples/graph0.yaml

2) To add any subsequent mgmt server, run mgmt normally, but point it at any number of existing mgmt servers with the --seeds command:

./mgmt run --file examples/graph0.yaml --seeds <ip address:port>

3) Profit!

We internally implement a clustering algorithm which does the hard-working of building and managing the etcd cluster for you, so that you don’t have to. If you’re interested, keep reading to find out how it works!

Clustering algorithm:

The clustering algorithm works as follows:

If you aren’t given any seeds, then assume you are the first etcd server (peer) and start-up. If you are given a seeds argument, then connect to that peer to join the cluster as a client. If you’d like to be promoted to a server, then you can “volunteer” by setting a special key in the cluster data store.

The existing cluster of peers will decide if they want additional peers, and if so, they can “nominate” someone from the pool of volunteers. If you have been nominated, you can start-up an etcd peer and peer with the rest of the cluster. Similarly, the cluster can decide to un-nominate a peer, and if you’ve been un-nominated, then you should shutdown your etcd server.

All cluster decisions are made by consensus using the raft algorithm. In practice this means that the elected cluster leader looks at the state of the system, and makes the necessary nomination changes.

Lastly, if you don’t want to be a peer any more, you can revoke your volunteer message, which will be seen by the cluster, and if you were running a server, you should receive an un-nominate message in response, which will let you shutdown cleanly.

Disclaimer:

It’s probably worth mentioning that the current implementation has a few issues, and at least one race. The goal is to have it polished up by the time etcd v3 is released, but it’s perfectly usable for testing and experimentation today! If you don’t want to automatically cluster, you can always use the --no-server flag, and point mgmt at a manually managed mgmt cluster using the --seeds flag.

Testing:

Testing this feature on a single machine makes development and experimentation easier, so as a result, there are a few flags which make this possible.

--hostname <hostname>
With this flag, you can force your mgmt client to pretend it is running on a host with the above mentioned name. You can use this to specify --hostname h1, or --hostname h2, and so on; one for each mgmt agent you want to run on the same machine.

--server-urls <ip:port>
With this flag you can specify which IP address and port the etcd server will listen on for peer requests. By default this will use 127.0.0.1:2379, but when running multiple mgmt agents on the same machine you’ll need to specify this manually to avoid collisions. You can specify as many IP address and port pairs as you’d like by separating them with commas or semicolons. The --peer-urls flag is an alias which does the same thing.

--client-urls <ip:port>
This flag specifies which IP address and port the etcd server will listen on for client connections. It defaults to 127.0.0.1:2380, but you’ll occasionally want to specify this manually for the same reasons as mentioned above. You can specify as many IP address and port pairs as you’d like by separating them with commas or semicolons. This is the address that will be used by the --seeds flag when joining an existing cluster.

Elastic clustering:

In the future, you’ll be able to specify a much more elaborate method to decide how many hosts should be promoted into peers, and which hosts should be nominated or un-nominated when growing or shrinking the cluster.

At the moment, we do the grow or shrink operation when the current peer count does not match the requested cluster size. This value has a default of 5, and can even be changed dynamically. To do so you can run:

ETCDCTL_API=3 etcdctl --endpoints 127.0.0.1:2379 put /_mgmt/idealClusterSize 3

You can also set it at start-up by using the --ideal-cluster-size flag.

Example:

Here’s a real example if you want to dive in. Try running the following four commands in separate terminals:

./mgmt run --file examples/etcd1a.yaml --hostname h1 --ideal-cluster-size 3
./mgmt run --file examples/etcd1b.yaml --hostname h2 --seeds http://127.0.0.1:2379 --client-urls http://127.0.0.1:2381 --server-urls http://127.0.0.1:2382
./mgmt run --file examples/etcd1c.yaml --hostname h3 --seeds http://127.0.0.1:2379 --client-urls http://127.0.0.1:2383 --server-urls http://127.0.0.1:2384
./mgmt run --file examples/etcd1d.yaml --hostname h4 --seeds http://127.0.0.1:2379 --client-urls http://127.0.0.1:2385 --server-urls http://127.0.0.1:2386

Once you’ve done this, you should have a three host cluster! Check this by running any of these commands:

ETCDCTL_API=3 etcdctl --endpoints 127.0.0.1:2379 member list
ETCDCTL_API=3 etcdctl --endpoints 127.0.0.1:2381 member list
ETCDCTL_API=3 etcdctl --endpoints 127.0.0.1:2383 member list

Note that you’ll need a v3 beta version of the etcdctl command which you can get by running ./build in the etcd git repo.

To grow your cluster, try increasing the desired cluster size to five:

ETCDCTL_API=3 etcdctl --endpoints 127.0.0.1:2381 put /_mgmt/idealClusterSize 5

You should see the last host start-up an etcd server. If you reduce the idealClusterSize, you’ll see servers shutdown! You’re responsible if you destroy the cluster by setting it too low! You can then try growing your cluster again, but unfortunately due to a bug, hosts can’t be re-used yet, and you’ll get a “bind: address already in use” error. We hope to have this fixed shortly!

Security:

Unfortunately no authentication security or transport security has been implemented yet. We have a great design, but are busy working on other parts of the project at the moment. If you’d like to help out here, please let us know!

Future work:

There’s still a lot of work to do, to improve this feature. The biggest challenge has been getting a reasonable embedded server API upstream. It’s not clear whether this patch can be made to work or if something different will need to be written, but at least one other project looks like it could benefit from this as well.

Video:

A recording from the recent Berlin CoreOSFest 2016 has been published! I demoed these recent features, but one interesting note is that I am actually presenting an earlier version of the code which used the etcd V2 API. I’ve since ported the code to V3, but it is functionally similar. It’s probably worth mentioning, that I found the V3 API to be more difficult, but also more correct and powerful. I think it is a net improvement to the project.

Community:

I can’t end this blog post without mentioning some of the great stuff that’s been happening in the mgmt community! In particular, Felix has written some great code to run existing Puppet code on mgmt. Check out his work!

Upcoming speaking:

I’ve got some upcoming speaking in Hong Kong at HKOSCon16 and in Cape Town at DebConf16 about the project. Please ping me if you’ll be in one of these cities and would like to hack on mgmt or just chat about the project. I’m happy to give some impromptu demos if you ask!

Thanks for reading!

Happy Hacking,

James

PS: We now have a community run twitter account. Check us out!

One hour hacks: Remote LUKS over SSH

Posted on April 25, 2016 by purpleidea

I have a GNU/Linux server which I mount a few LUKS encrypted drives on. I only ever interact with the server over SSH, and I never want to keep the LUKS credentials on the remote server. I don’t have anything especially sensitive on the drives, but I think it’s a good security practice to encrypt it all, if only to add noise into the system and for solidarity with those who harbour much more sensitive data.

This means that every time the server reboots or whenever I want to mount the drives, I have to log in and go through the series of luksOpen and mount commands before I can access the data. This turned out to be a bit laborious, so I wrote a quick script to automate it! I also made sure that it was idempotent.

I decided to share it because I couldn’t find anything similar, and I was annoyed that I had to write this in the first place. Hopefully it saves you some anguish. It also contains a clever little bash hack that I am proud to have in my script.

Here’s the script. You’ll need to fill in the map of mount folder names to drive UUID’s, and you’ll want to set your server hostname and FQDN to match your environment of course. It will prompt you for your root password to mount, and the LUKS password when needed.

Example of mounting:

james@computer:~$ rluks.sh 
Running on: myserver...
[sudo] password for james: 
Mount/Unmount [m/u] ? m
Mounting...
music: mkdir ✓
LUKS Password: 
music: luksOpen ✓
music: mount ✓
files: mkdir ✓
files: luksOpen ✓
files: mount ✓
photos: mkdir ✓
photos: luksOpen ✓
photos: mount ✓
Done!
Connection to server.example.com closed.

Example of unmounting:

james@computer:~$ rluks.sh 
Running on: myserver...
[sudo] password for james: 
Sorry, try again.
[sudo] password for james: 
Mount/Unmount [m/u] ? u
Unmounting...
music: umount ✓
music: luksClose ✓
music: rmdir ✓
files: umount ✓
files: luksClose ✓
files: rmdir ✓
photos: umount ✓
photos: luksClose ✓
photos: rmdir ✓
Done!
Connection to server.example.com closed.
james@computer:~$

It’s worth mentioning that there are many improvements that could be made to this script. If you’ve got patches, send them my way. After all, this is only a: one hour hack.

Happy hacking,

James

PS: One day this sort of thing might be possible in mgmt. Let me know if you want to help work on it!

Automatic grouping in mgmt

Posted on March 30, 2016 by purpleidea

In this post, I’ll tell you about the recently released “automatic grouping” or “AutoGroup” feature in mgmt, a next generation configuration management prototype. If you aren’t already familiar with mgmt, I’d recommend you start by reading the introductory post, and the second post. There’s also an introductory video.

Resources in a graph

Most configuration management systems use something called a directed acyclic graph, or DAG. This is a fancy way of saying that it is a bunch of circles (vertices) which are connected with arrows (edges). The arrows must be connected to exactly two vertices, and you’re only allowed to move along each arrow in one direction (directed). Lastly, if you start at any vertex in the graph, you must never be able to return to where you started by following the arrows (acyclic). If you can, the graph is not fit for our purpose.

An example DAG from Wikipedia

The graphs in configuration management systems usually represent the dependency relationships (edges) between the resources (vertices) which is important because you might want to declare that you want a certain package installed before you start a service. To represent the kind of work that you want to do, different kinds of resources exist which you can use to specify that work.

Each of the vertices in a graph represents a unique resource, and each is backed by an individual software routine or “program” which can check the state of the resource, and apply the correct state if needed. This makes each resource idempotent. If we have many individual programs, this might turn out to be a lot of work to do to get our graph into the desired state!

Resource grouping

It turns out that some resources have a fixed overhead to starting up and running. If we can group resources together so that they share this fixed overhead, then our graph might converge faster. This is exactly what we do in mgmt!

Take for example, a simple graph such as the following:

Simple DAG showing three pkg, two file, and one svc resource

Simple DAG showing one svc, two file, and three pkg resources…

We can logically group the three pkg resources together and redraw the graph so that it now looks like this:

DAG with the three pkg resources now grouped into one! Overlapping vertices mean that they act as if they’re one vertex instead of three!

This all happens automatically of course! It is very important that the new graph is a faithful, logical representation of the original graph, so that the specified dependency relationships are preserved. What this represents, is that when multiple resources are grouped (shown by overlapping vertices in the graph) they run together as a single unit. This is the practical difference between running:

$ dnf install -y powertop
$ dnf install -y sl
$ dnf install -y cowsay

if not grouped, and:

$ dnf install -y powertop sl cowsay

when grouped. If you try this out you’ll see that the second scenario is much faster, and on my laptop about three times faster! This is because of fixed overhead such as cache updates, and the dnf dep solver that each process runs.

This grouping means mgmt uses this faster second scenario instead of the slower first scenario that all the current generation tools do. It’s also important to note that different resources can implement the grouping feature to optimize for different things besides performance. More on that later…

The algorithm

I’m not an algorithmist by training, so it took me some fiddling to come up with an appropriate solution. I’ve implemented it along with an extensive testing framework and a series of test cases, which it passes of course! If we ever find a graph that does not get grouped correctly, then we can iterate on the algorithm and add it as a new test case.

The algorithm turns out to be relatively simple. I first noticed that vertices which had a relationship between them must not get grouped, because that would undermine the precedence ordering of the vertices! This property is called reachability. I then attempt to group every vertex to every other vertex that has no reachability or reverse reachability to it!

The hard part turned out to be getting all the plumbing surrounding the algorithm correct, and in particular the actual vertex merging algorithm, so that “discarded edges” are reattached in the correct places. I also took a bit of extra time to implement the algorithm as a struct which satisfies an “AutoGrouper” interface. This way, if you’d like to implement a different algorithm, it’s easy to drop in your replacement. I’m fairly certain that a more optimal version of my algorithm is possible for anyone wishing to do the analysis.

A quick note on nomenclature: I’ve actually decided to call this grouping and not merging, because we actually preserve the unique data of each resource so that they can be taken apart and mixed differently when (and if) there is a change in the compiled graph. This makes graph changeovers very cheap in mgmt, because we don’t have to re-evaluate anything which remains constant between graphs. Merging would imply a permanent reduction and loss of unique identity.

Parallelism and user choice

It’s worth noting two important points:

Auto grouping of resources usually decreases the parallelism of a graph.
The user might not want a particular resource to get grouped!

You might remember that one of the novel properties of mgmt, is that it executes the graph in parallel whenever possible. Although the grouping of resources actually removes some of this parallelism, certain resources such as the pkg resource already have an innate constraint on sequential behaviour, namely: the package manager lock. Since these tools can’t operate in parallel, and since each execution has a fixed overhead, it’s almost always beneficial to group pkg resources together.

Grouping is also not mandatory, so while it is a sensible default, you can disable grouping per resource with a simple meta parameter.

Lastly, it’s also worth mentioning that grouping doesn’t “magically” happen without some effort. The underlying resource needs to know how to optimize, watch, check and apply multiple resources simultaneously for it to support the feature. At the moment, only the pkg resource can do any grouping, and even then, there could always be some room for improvement. It’s also not optimal (or even logical) to group certain types of resources, so those will never be able to do any grouping. We also don’t group together resources of different kinds, although mgmt could support this if a valid use case is ever found.

File grouping

As I mentioned, only the pkg resource supports grouping at this time. The file resource demonstrates a different use case for resource grouping. Suppose you want to monitor 10000 files in a particular directory, but they are specified individually. This would require far too many inotify watches than a normal system usually has, so the grouping algorithm could group them into a single resource, which then uses a recursive watcher such as fanotify to reduce the watcher count by a factor of 10000. Unfortunately neither the file resource grouping, nor the fanotify support for this exist at the moment. If you’d like to implement either of these, please let me know!

If you can think of another resource kind that you’d like to write, or in particular, if you know of one which would work well with resource grouping, please contact me!

Science!

I wouldn’t be a very good scientist (I’m actually a Physiologist by training) if I didn’t include some data and a demonstration that this all actually works, and improves performance! What follows will be a good deal of information, so skim through the parts you don’t care about.

Science <3 data

I decided to test the following four scenarios:

single package, package check, package already installed
single package, package install, package not installed
three packages, package check, packages already installed
three packages, package install, packages not installed

These are the situations you’d encounter when running your tool of choice to install one or more packages, and finding them either already present, or in need of installation. I timed each test, which ends when the tool tells us that our system has converged.

Each test is performed multiple times, and the average is taken, but only after we’ve run the tool at least twice so that the caches are warm.

We chose small packages so that the fixed overhead delays due to bandwidth and latencies are minimal, and so that our data is more representative of the underlying tool.

The single package tests use the powertop package, and the three package tests use powertop, sl, and cowsay. All tests were performed on an up-to-date Fedora 23 laptop, with an SSD. If you haven’t tried sl and cowsay, do give them a go!

The four tools tested were:

puppet
mgmt
pkcon
dnf

The last two are package manager front ends so that it’s more obvious how expensive something is expected to cost, and so that you can discern what amount of overhead is expected, and what puppet or mgmt is causing you. Here are a few representative runs:

mgmt installation of powertop:

$ time sudo ./mgmt run --file examples/pkg1.yaml --converged-timeout=0
21:04:18 main.go:63: This is: mgmt, version: 0.0.3-1-g6f3ac4b
21:04:18 main.go:64: Main: Start: 1459299858287120473
21:04:18 main.go:190: Main: Running...
21:04:18 main.go:113: Etcd: Starting...
21:04:18 main.go:117: Main: Waiting...
21:04:18 etcd.go:113: Etcd: Watching...
21:04:18 etcd.go:113: Etcd: Watching...
21:04:18 configwatch.go:54: Watching: examples/pkg1.yaml
21:04:20 config.go:272: Compile: Adding AutoEdges...
21:04:20 config.go:533: Compile: Grouping: Algorithm: nonReachabilityGrouper...
21:04:20 main.go:171: Graph: Vertices(1), Edges(0)
21:04:20 main.go:174: Graphviz: No filename given!
21:04:20 pgraph.go:764: State: graphStateNil -> graphStateStarting
21:04:20 pgraph.go:825: State: graphStateStarting -> graphStateStarted
21:04:20 main.go:117: Main: Waiting...
21:04:20 pkg.go:245: Pkg[powertop]: CheckApply(true)
21:04:20 pkg.go:303: Pkg[powertop]: Apply
21:04:20 pkg.go:317: Pkg[powertop]: Set: installed...
21:04:25 packagekit.go:399: PackageKit: Woops: Signal.Path: /8442_beabdaea
21:04:25 packagekit.go:399: PackageKit: Woops: Signal.Path: /8443_acbadcbd
21:04:31 pkg.go:335: Pkg[powertop]: Set: installed success!
21:04:31 main.go:79: Converged for 0 seconds, exiting!
21:04:31 main.go:55: Interrupted by exit signal
21:04:31 pgraph.go:796: Pkg[powertop]: Exited
21:04:31 main.go:203: Goodbye!

real    0m13.320s
user    0m0.023s
sys    0m0.021s

puppet installation of powertop:

$ time sudo puppet apply pkg.pp 
Notice: Compiled catalog for computer.example.com in environment production in 0.69 seconds
Notice: /Stage[main]/Main/Package[powertop]/ensure: created
Notice: Applied catalog in 10.13 seconds

real    0m18.254s
user    0m9.211s
sys    0m2.074s

dnf installation of powertop:

$ time sudo dnf install -y powertop
Last metadata expiration check: 1:22:03 ago on Tue Mar 29 20:04:29 2016.
Dependencies resolved.
==========================================================================
 Package          Arch           Version            Repository       Size
==========================================================================
Installing:
 powertop         x86_64         2.8-1.fc23         updates         228 k

Transaction Summary
==========================================================================
Install  1 Package

Total download size: 228 k
Installed size: 576 k
Downloading Packages:
powertop-2.8-1.fc23.x86_64.rpm            212 kB/s | 228 kB     00:01    
--------------------------------------------------------------------------
Total                                     125 kB/s | 228 kB     00:01     
Running transaction check
Transaction check succeeded.
Running transaction test
Transaction test succeeded.
Running transaction
  Installing  : powertop-2.8-1.fc23.x86_64                            1/1 
  Verifying   : powertop-2.8-1.fc23.x86_64                            1/1 

Installed:
  powertop.x86_64 2.8-1.fc23                                              

Complete!

real    0m10.406s
user    0m4.954s
sys    0m0.836s

puppet installation of powertop, sl and cowsay:

$ time sudo puppet apply pkg3.pp 
Notice: Compiled catalog for computer.example.com in environment production in 0.68 seconds
Notice: /Stage[main]/Main/Package[powertop]/ensure: created
Notice: /Stage[main]/Main/Package[sl]/ensure: created
Notice: /Stage[main]/Main/Package[cowsay]/ensure: created
Notice: Applied catalog in 33.02 seconds

real    0m41.229s
user    0m19.085s
sys    0m4.046s

pkcon installation of powertop, sl and cowsay:

$ time sudo pkcon install powertop sl cowsay
Resolving                     [=========================]         
Starting                      [=========================]         
Testing changes               [=========================]         
Finished                      [=========================]         
Installing                    [=========================]         
Querying                      [=========================]         
Downloading packages          [=========================]         
Testing changes               [=========================]         
Installing packages           [=========================]         
Finished                      [=========================]         

real    0m14.755s
user    0m0.060s
sys    0m0.025s

and finally, mgmt installation of powertop, sl and cowsay with autogrouping:

$ time sudo ./mgmt run --file examples/autogroup2.yaml --converged-timeout=0
21:16:00 main.go:63: This is: mgmt, version: 0.0.3-1-g6f3ac4b
21:16:00 main.go:64: Main: Start: 1459300560994114252
21:16:00 main.go:190: Main: Running...
21:16:00 main.go:113: Etcd: Starting...
21:16:00 main.go:117: Main: Waiting...
21:16:00 etcd.go:113: Etcd: Watching...
21:16:00 etcd.go:113: Etcd: Watching...
21:16:00 configwatch.go:54: Watching: examples/autogroup2.yaml
21:16:03 config.go:272: Compile: Adding AutoEdges...
21:16:03 config.go:533: Compile: Grouping: Algorithm: nonReachabilityGrouper...
21:16:03 config.go:533: Compile: Grouping: Success for: Pkg[powertop] into Pkg[cowsay]
21:16:03 config.go:533: Compile: Grouping: Success for: Pkg[sl] into Pkg[cowsay]
21:16:03 main.go:171: Graph: Vertices(1), Edges(0)
21:16:03 main.go:174: Graphviz: No filename given!
21:16:03 pgraph.go:764: State: graphStateNil -> graphStateStarting
21:16:03 pgraph.go:825: State: graphStateStarting -> graphStateStarted
21:16:03 main.go:117: Main: Waiting...
21:16:03 pkg.go:245: Pkg[autogroup:(cowsay,powertop,sl)]: CheckApply(true)
21:16:03 pkg.go:303: Pkg[autogroup:(cowsay,powertop,sl)]: Apply
21:16:03 pkg.go:317: Pkg[autogroup:(cowsay,powertop,sl)]: Set: installed...
21:16:08 packagekit.go:399: PackageKit: Woops: Signal.Path: /8547_cbeaddda
21:16:08 packagekit.go:399: PackageKit: Woops: Signal.Path: /8548_bcaadbce
21:16:16 pkg.go:335: Pkg[autogroup:(cowsay,powertop,sl)]: Set: installed success!
21:16:16 main.go:79: Converged for 0 seconds, exiting!
21:16:16 main.go:55: Interrupted by exit signal
21:16:16 pgraph.go:796: Pkg[cowsay]: Exited
21:16:16 main.go:203: Goodbye!

real    0m15.621s
user    0m0.040s
sys    0m0.038s

Results and analysis

My hard work seems to have paid off, because we do indeed see a noticeable improvement from grouping package resources. The data shows that even in the single package comparison cases, mgmt has very little overhead, which is demonstrated by seeing that the mgmt run times are very similar to the times it takes to run the package managers manually.

In the three package scenario, performance is approximately 2.39 times faster than puppet for installation. Checking was about 12 times faster! These ratios are expected to grow with a larger number of resources.

Bigger bars is worse… Puppet is in Red, mgmt is in blue.

The four groups at the bottom along the x axis correspond to the four scenarios I tested, 1, 2 and 3 corresponding to each run of that scenario, with the average of the three listed there too.

Versions

The test wouldn’t be complete if we didn’t tell you which specific version of each tool that we used. Let’s time those as well! ;)

puppet:

$ time puppet --version 
4.2.1

real    0m0.659s
user    0m0.525s
sys    0m0.064s

mgmt:

$ time ./mgmt --version
mgmt version 0.0.3-1-g6f3ac4b

real    0m0.007s
user    0m0.006s
sys    0m0.002s

pkcon:

$ time pkcon --version
1.0.11

real    0m0.013s
user    0m0.006s
sys    0m0.005s

dnf:

$ time dnf --version
1.1.7
  Installed: dnf-0:1.1.7-2.fc23.noarch at 2016-03-17 13:37
  Built    : Fedora Project at 2016-03-09 16:45

  Installed: rpm-0:4.13.0-0.rc1.12.fc23.x86_64 at 2016-03-03 09:39
  Built    : Fedora Project at 2016-02-29 09:53

real    0m0.438s
user    0m0.379s
sys    0m0.036s

Yep, puppet even takes the longest to tell us what version it is. Now I’m just teasing…

Methodology

It might have been more useful to time the removal of packages instead so that we further reduce the variability of internet bandwidth and latency, although since most configuration management is used to install packages (rather than remove), we figured this would be more appropriate and easy to understand. You’re welcome to conduct your own study and share the results!

Additionally, for fun, I also looked at puppet runs where three individual resources were used instead of a single resource with the title being an array of all three packages, and found no significant difference in the results. Indeed puppet runs dnf three separate times in either scenario:

$ ps auxww | grep dnf
root     12118 27.0  1.4 417060 110864 ?       Ds   21:57   0:03 /usr/bin/python3 /usr/bin/dnf -d 0 -e 0 -y install powertop
$ ps auxww | grep dnf
root     12713 32.7  2.0 475204 159840 ?       Rs   21:57   0:02 /usr/bin/python3 /usr/bin/dnf -d 0 -e 0 -y install sl
$ ps auxww | grep dnf
root     13126  0.0  0.7 275324 55608 ?        Rs   21:57   0:00 /usr/bin/python3 /usr/bin/dnf -d 0 -e 0 -y install cowsay

Data

If you’d like to download the raw data as a text formatted table, and the terminal output from each type of run, I’ve posted it here.

Conclusion

I hope that you enjoyed this feature and analysis, and that you’ll help contribute to making it better. Come join our IRC channel and say hello! Thanks to those who reviewed my article and pointed out some good places for improvements!

Happy Hacking,

James

Automatic edges in mgmt

Posted on March 14, 2016 by purpleidea

It’s been two months since I announced mgmt, and now it’s time to continue the story by telling you more about the design of what’s now in git master. Before I get into those details, let me quickly recap what’s happened since then.

Mgmt community recap:

I gave the first public presentation about mgmt at CfgMgmtCamp.
I repeated the talk at DevConf.cz. The video recording is available.
Felix wrote about his work cross compiling puppet code to mgmt.
Three new contributors got their feet wet in git master.
Mgmt got a lot more attention on the web…

Okay, time to tell you what’s new in mgmt!

Types vs. resources:

Configuration management systems have the concept of primitives for the basic building blocks of functionality. The well-known ones are “package”, “file”, “service” and “exec” or “execute”. Chef calls these “resources”, while puppet (misleadingly) calls them “types”.

I made the mistake of calling the primitives in mgmt, “types”, no doubt due to my extensive background in puppet, however this overloads the word because it usually refers to programming types, so I’ve decided not to use it any more to refer to these primitives. The Chef folks got this right! From now on they’ll be referred to as “resources” or “res” when abbreviated.

Mgmt now has: “noop“, “file“, “svc“, “exec“, and now: “pkg“…

The package (pkg) resource:

The most obvious change to mgmt, is that it now has a package resource. I’ve named it “pkg” because we hackers prefer to keep strings short. Creating a pkg resource is difficult for two reasons:

Mgmt is event based
Mgmt should support many package managers

Event based systems would involve an inotify watch on the rpmdb (or a similar watch on /var/lib/dpkg/), and the logic to respond to it. This engineering problem boils down to being able to support the entire matrix of possible GNU/Linux packaging systems. Yuck! Additionally, it would be particularly unfriendly if we primarily supported RPM and DNF based systems, but left the DPKG and APT backend out as “an exercise for the community”.

Therefore, we solve both of these problems by basing the pkg resource on the excellent PackageKit project! PackageKit provides the events we need, and more importantly, it supports many backends already! If there’s ever a new backend that needs to be added, you can add it upstream in PackageKit, and everyone (including mgmt) will benefit from your work.

As a result, I’m proud to announce that both Debian and Fedora, (and many other friends) all got a working pkg resource on day one. Here’s a small demo:

Run mgmt:

root@debian:~/mgmt# time ./mgmt run --file examples/pkg1.yaml 
18:58:25 main.go:65: This is: mgmt, version: 0.0.2-41-g963f025
[snip]
18:18:44 pkg.go:208: Pkg[powertop]: CheckApply(true)
18:18:44 pkg.go:259: Pkg[powertop]: Apply
18:18:44 pkg.go:266: Pkg[powertop]: Set: installed...
18:18:52 pkg.go:284: Pkg[powertop]: Set: installed success!

The “powertop” package will install… Now while mgmt is still running, remove powertop:

root@debian:~# pkcon remove powertop
Resolving                     [=========================]         
Testing changes               [=========================]         
Finished                      [=========================]         
Removing                      [=========================]         
Loading cache                 [=========================]         
Running                       [=========================]         
Removing packages             [=========================]         
Committing changes            [=========================]         
Finished                      [=========================]         
root@debian:~# which powertop
/usr/sbin/powertop

It gets installed right back! Similarly, you can do it like this:

root@debian:~# apt-get -y remove powertop
Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following packages were automatically installed and are no longer required:
  libnl-3-200 libnl-genl-3-200
Use 'apt-get autoremove' to remove them.
The following packages will be REMOVED:
  powertop
0 upgraded, 0 newly installed, 1 to remove and 80 not upgraded.
After this operation, 542 kB disk space will be freed.
(Reading database ... 47528 files and directories currently installed.)
Removing powertop (2.6.1-1) ...
Processing triggers for man-db (2.7.0.2-5) ...
root@debian:~# which powertop
/usr/sbin/powertop

And it will also get installed right back! Try it yourself to see it happen “live”! Similar behaviour can be seen on Fedora and other distros.

As a quite aside. If you’re a C hacker, and you would like to help with the upstream PackageKit project, they would surely love your contributions, and in particular, we here working on mgmt would especially like it if you worked on any of the open issues that we’ve uncovered. In order from increasing to decreasing severity, they are: #118 (please help!), #117 (needs love), #110 (needs testing), and #116 (would be nice to have). If you’d like to test mgmt on your favourite distro, and report and fix any issues, that would be helpful too!

Automatic edges:

Since we’re now hooked into the pkg resource, there’s no reason we can’t use that wealth of knowledge to make mgmt more powerful. For example, the PackageKit API can give us the list of files that a certain package would install. Since any file resource would obviously want to get “applied” after the package is installed, we use this information to automatically generate the relationships or “edges” in the graph. This means that module authors don’t have to waste time manually adding or updating the “require” relationships in their modules!

For example, the /etc/drbd.conf file, will want to require the drbd-utils package to be installed first. With traditional config management systems, without this dependency chain, after one run, your system will not be in a converged state, and would require another run. With mgmt, since it is event based, it would converge, except it might run in a sub-optimal order. That’s one reason why we add this dependency for you automatically.

This is represented via what mgmt calls the “AutoEdges” API. (If you can think of a better name, please tell me now!) It’s also worth noting that this isn’t entirely a novel idea. Puppet has a concept of “autorequires”, which is used for some of their resources, but doesn’t work with packages. I’m particularly proud of my design, because in my opinion, I think the API and mechanism in mgmt are much more powerful and logical.

Here’s a small demo:

james@fedora:~$ ./mgmt run --file examples/autoedges3.yaml 
20:00:38 main.go:65: This is: mgmt, version: 0.0.2-42-gbfe6192
[snip]
20:00:38 configwatch.go:54: Watching: examples/autoedges3.yaml
20:00:38 config.go:248: Compile: Adding AutoEdges...
20:00:38 config.go:313: Compile: Adding AutoEdge: Pkg[drbd-utils] -> Svc[drbd]
20:00:38 config.go:313: Compile: Adding AutoEdge: Pkg[drbd-utils] -> File[file1]
20:00:38 config.go:313: Compile: Adding AutoEdge: Pkg[drbd-utils] -> File[file2]
20:00:38 main.go:149: Graph: Vertices(4), Edges(3)
[snip]

Here we define four resources: pkg (drbd-utils), svc (drbd), and two files (/etc/drbd.conf and /etc/drbd.d/), both of which happen to be listed inside the RPM package! The AutoEdge magic works out these dependencies for us by examining the package data, and as you can see, adds the three edges. Unfortunately, there is no elegant way that I know of to add an automatic relationship between the svc and any of these files at this time. Suggestions welcome.

Finally, we also use the same interface to make sure that a parent directory gets created before any managed file that is a child of it.

Automatic edges internals:

How does it work? Each resource has a method to generate a “unique id” for that resource. This happens in the “UIDs” method. Additionally, each resource has an “AutoEdges” method which, unsurprisingly, generates an “AutoEdge” object (struct). When the compiler is generating the graph and adding edges, it calls two methods on this AutoEdge object:

Next()
Test(…)

The Next() method produces a list of possible matching edges for that resource. Whichever edges match are added to the graph, and the results of each match is fed into the Test(…) function. This information is used to tell the resource whether there are more potential matches available or not. The process iterates until Test returns false, which means that there are no other available matches.

This ensures that a file: /var/lib/foo/bar/baz/ will first seek a dependency on /var/lib/foo/bar/, but be able to fall back to /var/lib/ if that’s the first file resource available. This way, we avoid adding more resource dependencies than necessary, which would diminish the amount of parallelism possible, while running the graph.

Lastly, it’s also worth noting that users can choose to disable AutoEdges per resource if they so desire. If you’ve got an idea for a clever automatic edge, please contact me, send a patch, or leave the information in the comments!

Contributing:

Good ideas and designs help, but contributors is what will make the project. All sorts of help is appreciated, and you can join in even if you’re not an “expert”. I’ll try and tag “easy” or “first time” patches with the “mgmtlove” tag. Feel free to work on other issues too, or suggest something that you think will help! Want to add more interesting resources! Anyone want to write a libvirt resource? How about a network resource? Use your imagination!

Lastly, thanks to both Richard and Matthias Klumpp in particular from the PackageKit project for their help so far, and to everyone else who has contributed in some way.

That’s all for today. I’ve got more exciting things coming. Please contribute!

Happy Hacking!

James

Introducing: git tpush

Posted on February 16, 2016 by purpleidea

On today’s issue of “one hour hacks”, I’ll show you how you can stop your git drive-by’s to git master from breaking your CI tests… Let’s continue!

The problem:

Sometimes I’ve got a shitty one-line patch that I want to push to git master. I’m usually right, and everything tests out fine, but usually isn’t always, and then I look silly while I frantically try to fix git master on a project that I maintain. Let’s use tools to hide our human flaws!

How you’re really supposed to do it:

Good hackers know that you’re supposed to:

checkout a new feature branch for your patch
- checkout -b feat/foo
write the patch and commit it
- git add -p && git commit -m 'my foo'
push that branch up to origin
- git push origin feat/foo
wait for the tests to succeed
- hub ci-status && sleep 1m && ...
merge and push to git master
- git checkout master && git merge feat/foo && git push
delete your local branch
- git branch -d feat/foo
delete the remote branch
- git push origin :feat/foo
mourn all the lost time it took to push this patch
- cat && drink && sleep 8h

If it happens to fail, you have to remember what the safe git reset command is, and then you have to start all over!

How do we make this easy?

I write tools, so I sat down and wrote a cheap little bash script to do it for me! Sharing is caring, so please enjoy a copy!

How does it work?

It actually just does all of the above for you automatically, and it automatically cleans up all of the temporary branches when it’s done! Here are two example runs to help show you the tool in action:

A failure scenario:

Here I add a patch with some trailing whitespace, which will easily get caught by the automatic test suite. You’ll note that I actually had to force the commit locally with git commit -n, because the pre-commit hook actually caught this first. You could extend this script to run your test suite locally before you push the patch, but that defeats the idea of a drive-by.

james@computer:~/code/mgmt$ git add -p
diff --git a/README.md b/README.md
index 277aa73..2c31356 100644
--- a/README.md
+++ b/README.md
@@ -34,6 +34,7 @@ Please get involved by working on one of these items or by suggesting something
 Please grab one of the straightforward [#mgmtlove](https://github.com/purpleidea/mgmt/labels/mgmtlove) issues if you're a first time contributor.
 
 ## Bugs:
+Some bugs are caused by bad drive-by patches to git master! 
 Please set the `DEBUG` constant in [main.go](https://github.com/purpleidea/mgmt/blob/master/main.go) to `true`, and post the logs when you report the [issue](https://github.com/purpleidea/mgmt/issues).
 Bonus points if you provide a [shell](https://github.com/purpleidea/mgmt/tree/master/test/shell) or [OMV](https://github.com/purpleidea/mgmt/tree/master/test/omv) reproducible test case.
 
Stage this hunk [y,n,q,a,d,/,e,?]? y
<stdin>:9: trailing whitespace.
Some bugs are caused by bad drive-by patches to git master! 
warning: 1 line adds whitespace errors.

james@computer:~/code/mgmt$ git commit -m 'XXX this is a bad patch'
README.md:37: trailing whitespace.
+Some bugs are caused by bad drive-by patches to git master! 
james@computer:~/code/mgmt$ git commit -nm 'XXX this is a bad patch'
[master 8b29fb3] XXX this is a bad patch
 1 file changed, 1 insertion(+)
james@computer:~/code/mgmt$ git tpush 
tpush branch is: tpush/test-0
Switched to a new branch 'tpush/test-0'
Counting objects: 3, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (3/3), done.
Writing objects: 100% (3/3), 357 bytes | 0 bytes/s, done.
Total 3 (delta 2), reused 0 (delta 0)
To git@github.com:purpleidea/mgmt.git
 * [new branch]      tpush/test-0 -> tpush/test-0
Switched to branch 'master'
Your branch is ahead of 'origin/master' by 1 commit.
  (use "git push" to publish your local commits)
- CI is starting...
/ Waiting for CI...
- Waiting for CI...
\ Waiting for CI...
| Waiting for CI...
Deleted branch tpush/test-0 (was 8b29fb3).
To git@github.com:purpleidea/mgmt.git
 - [deleted]         tpush/test-0
Upstream CI failed with:
failure: https://travis-ci.org/purpleidea/mgmt/builds/109472293
Fix branch 'master' with:
git commit --amend
or:
git reset --soft HEAD^
Happy hacking!

A successful scenario:

In this example, I amended the previous patch in my local copy of git master (which never got pushed to the public git master!) and then I ran git tpush again.

james@computer:~/code/mgmt$ git commit --amend
[master 1186d63] Add a link to my article about debugging golang
 Date: Mon Feb 15 17:28:01 2016 -0500
 1 file changed, 1 insertion(+)
james@computer:~/code/mgmt$ git tpush 
tpush branch is: tpush/test-0
Switched to a new branch 'tpush/test-0'
Counting objects: 3, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (3/3), done.
Writing objects: 100% (3/3), 397 bytes | 0 bytes/s, done.
Total 3 (delta 2), reused 0 (delta 0)
To git@github.com:purpleidea/mgmt.git
 * [new branch]      tpush/test-0 -> tpush/test-0
Switched to branch 'master'
Your branch is ahead of 'origin/master' by 1 commit.
  (use "git push" to publish your local commits)
- CI is starting...
/ Waiting for CI...
- Waiting for CI...
\ Waiting for CI...
| Waiting for CI...
Deleted branch tpush/test-0 (was 1186d63).
To git@github.com:purpleidea/mgmt.git
 - [deleted]         tpush/test-0
Counting objects: 3, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (3/3), done.
Writing objects: 100% (3/3), 397 bytes | 0 bytes/s, done.
Total 3 (delta 2), reused 0 (delta 0)
To git@github.com:purpleidea/mgmt.git
   6e68d6d..1186d63  master -> master
Done!

Please take note of the “Waiting for CI…” text. This is actually a spinner which updates itself. Run this tool yourself to get the full effect! This avoids having 200 line scrollback in your terminal while you wait for your CI to complete.

Lastly, you’ll note that I run git tpush instead of git-tpush. This is because I added an alias to my ~/.gitconfig. It’s quite easy, just add:

tpush = !git-tpush

under the [alias] section, and ensure the git-tpush shell script is in your ~/bin/ folder and is executable.

Pre-requisites:

This actually depends on the hub command line tool which knows how to ask our CI about the test status. It sucks using a distributed tool like git with a centralized thing like github, but it’s still pretty easy for me to move away if they go bad. You’ll also want the tput installed if you want the fancy spinner, but that comes with most GNU/Linux distros by default.

The code:

Grab a copy here!

Happy hacking,

James

PS: No, I couldn’t think of a better name than git-tpush, if you don’t like it feel free to rename it yourself!

The Technical Blog of James

Technical articles, writeups and discussion!

Tag Archives: gluster

Copyleft is Dead. Long live Copyleft!

Metaparameters in mgmt

Faster golang builds

Remote execution in mgmt

Seen in downtown Montreal…

Automatic clustering in mgmt

One hour hacks: Remote LUKS over SSH

Automatic grouping in mgmt

Automatic edges in mgmt

Introducing: git tpush