BitTorrent Sync for repository mirroring

Theron Conrey writes about using:

BitTorrent Sync as Geo-Replication for Storage

We got a chance to talk about this idea at Linuxcon. I’m not entirely convinced there aren’t some problem edge cases with this solution, but I think it will be hard to tell as long as the BitTorrent sync library is proprietary. I did come up with a special case of Theron’s idea that I believe could work well.

The special case relies on the synchronization (or file transfer) being unidirectional. This avoids the coherency complications that arise if both sides write to the same file. Combined with the BitTorrent protocol, this does what normal torrent usage does, except that with BitTorrent Sync, we’re looking at a folder full of files.

What kind of synchronization would benefit from this model? Repository mirroring! This is exactly a folder full of files, but going in only one direction. Instead of yum or deb mirrors each running rsync, they could use BitTorrent Sync. Because these mirrors usually have a large amount of upload bandwidth available, “seeding” wouldn’t be a problem, and the worldwide pool would synchronize faster.
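
To make the one-way model a bit more concrete, here is a minimal Python sketch of how a mirror could decide what to do when the upstream is always authoritative. This is not how BitTorrent Sync works internally (that code is proprietary), and the function names build_manifest and plan_one_way_sync are made up for this illustration. It only shows why unidirectional sync is simple: there is never a conflict to merge, just files to fetch and files to drop.

```python
import hashlib
import os


def build_manifest(root):
    """Map each file's path (relative to root) to the SHA-256 of its contents."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as f:
                digest = hashlib.sha256(f.read()).hexdigest()
            manifest[os.path.relpath(path, root)] = digest
    return manifest


def plan_one_way_sync(upstream, mirror):
    """Return (to_fetch, to_delete) so that the mirror ends up matching upstream.

    Both arguments are manifests as produced by build_manifest. The upstream
    manifest always wins, so a changed file is simply fetched again.
    """
    # Fetch anything that is missing on the mirror or whose hash differs.
    to_fetch = sorted(p for p, h in upstream.items() if mirror.get(p) != h)
    # Delete anything the upstream no longer carries.
    to_delete = sorted(p for p in mirror if p not in upstream)
    return to_fetch, to_delete
```

A mirror would run build_manifest on its local copy, obtain the upstream manifest somehow (in the torrent model, from the swarm), and then fetch the listed files from whichever peers already have them.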

Can we apply this to user mirroring, net installers, and machine updating? Absolutely. I believe someone has already looked into the updates scenario, but it didn’t progress for some reason. The more convincing case is still the server geo-replication of course.

Obviously, using glusterfs with puppet-gluster to host the mirrors could be a good fit. You might not even need to use any gluster replication when you have built-in geo-replication via other mirrors.

If someone works up the open source BitTorrent parts, I’m happy to hack together the puppet parts to turn this into a turn-key solution for mirror hosts.

Hope you liked this idea.

Happy hacking,

James

Gluster Community Day, Thursday

I’m here in New Orleans hacking up a storm and getting to meet fellow gluster users IRL. John Mark Walker started off with a great “State of the GlusterFS union” style talk.

Today Louis (semiosis) gave a great talk about running glusterfs on Amazon. It was highly pragmatic and he explained how he chose the number of bricks per host. The talk will be posted online shortly.

Marco Ceppi from Canonical gave a talk about juju and gluster. I haven’t had much time to look at juju, so it was good exposure. Marco’s gluster charm suffers from a lack of high availability peering, but I’m sure that is easily solved, and it isn’t a big issue. I had the same issue when working on puppet-gluster. I’ve written an article about how I solved this problem. I think it’s the most elegant solution, but if anyone has a better idea, please let me know. The solutions I used for puppet can be applied to juju too. Marco and I talked about porting puppet-gluster to Ubuntu. We also talked about using puppet inside of juju, with a puppetmaster, but we’re not sure how useful that would be beyond pure hack value.

Joe Julian gave a talk on running MySQL (MariaDB) on glusterfs and getting mostly decent performance. That man knows his gluster internals.

I presented my talk about puppet-gluster. I had a successful live demo, which ran over ssh+screen across the conference centre internet to my home cluster in Montreal. With interspersed talking, the full deploy took about eight minutes. Hope you enjoyed it. Let me know if you have any trouble with your setup and what features you’re missing. The video will be posted shortly.

Thanks again to John Mark Walker, Red Hat and gluster.org for sponsoring my trip.

Happy hacking,

James

Linuxcon day three, Wednesday

After hacking away on Monday and Tuesday and meeting fellow nerds IRL, I’ve landed even more changes to puppet-gluster. My git master branch now sits at 47 commits.

$ git clone https://github.com/purpleidea/puppet-gluster.git
Cloning into 'puppet-gluster'...
remote: Counting objects: 317, done.
remote: Compressing objects: 100% (144/144), done.
remote: Total 317 (delta 187), reused 275 (delta 148)
Receiving objects: 100% (317/317), 82.17 KiB | 12.00 KiB/s, done.
Resolving deltas: 100% (187/187), done.
$ cd puppet-gluster/
$ git log | grep '^commit' | wc -l
47
$ git log | head
commit fa3fd2eb4bab499031274e0918a40e7a99fe0086
Author: James Shubin <hidden>
Date:   Wed Sep 18 17:53:13 2013 -0400

    Added fancy volume creation.
    
    This moves the command into a separate file. This also adds temporary
    saving of stdout and stderr to /tmp for easy debugging of command
    output.

As you can see above, volume creation is now “fancier” and more robust. In case things go wrong, it’s easy to get fast access to gluster command line output (saved in /tmp/), and the volume creation commands are individually stored in your puppet-gluster working directory. Usually this is /var/lib/puppet/tmp/gluster/, and each volume creation command is in the volume subdirectory.
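
If you want to borrow the same debugging trick outside of puppet, the idea is simply to run the command and keep its stdout and stderr in files you can look at afterwards. Here is a minimal Python sketch of that pattern. It is not the puppet-gluster code (that is puppet and shell), and the run_and_capture helper and the .stdout/.stderr file names are my own choices for this illustration.

```python
import os
import subprocess


def run_and_capture(argv, log_dir, name):
    """Run argv, save its stdout and stderr under log_dir, return the exit code."""
    os.makedirs(log_dir, exist_ok=True)
    out_path = os.path.join(log_dir, name + ".stdout")
    err_path = os.path.join(log_dir, name + ".stderr")
    # Each stream goes to its own file so a failed run can be inspected later.
    with open(out_path, "wb") as out, open(err_path, "wb") as err:
        result = subprocess.run(argv, stdout=out, stderr=err)
    return result.returncode
```

When a command returns non-zero, you go and read the matching .stderr file instead of re-running things by hand.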

I also met gluster expert Joe Julian. He was recently hired at Rackspace. Congratulations Joe. We talked about puppet and gluster, and he is very knowledgeable about gluster internals and PRN source diving.

I was interviewed by Aaron Delp and Brian Gracely, on The Cloudcast. These two gentlemen are a pleasure to sit and chat with. Check out their podcast. We talked about puppet, gluster, puppet-gluster and how to dive in. Feel free to comment or email me if you have any questions about something that we didn’t cover in the interview.

All week, I’ve been hacking alongside Jayneil Dalal in the speaker room. He was kind enough to give me a BeagleBone Black! Where will its hack potential take you? Two features which are particularly useful are on-board Ethernet, and 2GB of flash storage. He’s at the conference showing off some Minnow boards. They’ve got an Intel Atom chip on board if you need something a little beefier.

I’m giving my puppet-gluster talk tomorrow (Thursday) here at Linuxcon! I hope you can make it. I’ll even have a live demo. Until then,

Happy hacking,

James

Linuxcon day one, Monday

I’m here in New Orleans at Linuxcon, hacking on puppet-gluster and talking to lots of interesting folks. I’ve met gluster hacker Theron Conrey, my host John Mark Walker, Fedora and Raspberry Pi experts Spot and Ruth Suehle, and many others too.

The hotel is very nice. The bathroom sink has two taps of course, but both of them are hot. The New Orleans heat is probably the cause of this.

I’m hacking at full speed to get some new features and testing in before my talk on Thursday. I’ve been reworking the simple firewall support in my puppet module. For those that want automated and correct firewall configuration, expect some improvements soon.

I also pushed some work on property management for gluster volumes. This commit adds a list of all available gluster properties. The patch is still missing type information for many of them, because I haven’t yet tested each one, but if there are some you’d like to use, please let me know. This is easy to patch.

More code is landing soon. Don’t be afraid to contact me if you’re not sure how to get started with puppet-gluster, or if there’s a use case that I am not currently solving.

Happy Hacking,

James

PS: Sorry I published this late!