adding range support to python’s http server to kickstart with anaconda

I’ve been working on automatic installs using kickstart and puppet. I’m using a modified python httpserver because it’s lightweight and easy to integrate into my existing python code base. The server was churning away perfectly until anaconda started downloading the full RPMs for installation. What was going wrong?

Traceback (most recent call last):
[...]
error: [Errno 32] Broken pipe
BorkedError: See TTBOJ for explanation and discussion

As it turns out, anaconda first downloads the headers, and then later requests the full rpm with an http range request. This second request, which begins at byte 1384, causes the “simple” httpserver to bork, because it doesn’t support range requests.
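To make the problem concrete, here is a minimal sketch of the first thing a range-aware server has to do: turn the Range header into byte offsets. This is not the rangehttpserver code. The function name parse_range is hypothetical, the 5000 byte file size is made up for the example, and the sketch only handles a single byte range.

```python
def parse_range(header, size):
    """Turn a single-range 'Range' header into inclusive (start, end) offsets.

    Returns None when the header is missing, malformed, or unsatisfiable,
    in which case this sketch assumes the caller sends the whole file.
    """
    if not header or not header.startswith('bytes='):
        return None
    spec = header[len('bytes='):]
    if ',' in spec:                    # multiple ranges: not handled in this sketch
        return None
    first, sep, last = spec.partition('-')
    if not sep:
        return None
    try:
        if first == '':                # suffix form: the last N bytes
            length = int(last)
            if length <= 0:
                return None
            start, end = max(size - length, 0), size - 1
        else:
            start = int(first)
            end = int(last) if last else size - 1
    except ValueError:
        return None
    end = min(end, size - 1)
    if start < 0 or start > end:
        return None
    return (start, end)


# a request like anaconda's second one, against a hypothetical 5000 byte file
start, end = parse_range('bytes=1384-', 5000)
print('Content-Range: bytes %d-%d/%d' % (start, end, 5000))
```

A server that gets a usable range back would then seek to start, send end - start + 1 bytes, and answer with a 206 Partial Content status instead of 200.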

After a bit of searching, I found rangehttpserver and was very grateful that I wouldn’t have to write this feature myself. This work by smgoller was based on the similar httpserver by xyne. Both of these people have been very responsive and kind in giving me special permission to use the relevant portions of their code that I needed under the GPLv2/3+. Thanks to these two for their contribution to Free Software, which lets us all see further instead of having to reinvent previously solved problems.

This derivative work is only one part of a larger software release that I have coming shortly, but I wanted to put this out here early to thank these guys and to make you all aware of the range issue and solution.

Thank you again and,
Happy Hacking,

James

How to send and receive files like a professional

Everyone needs to send and receive files sometimes. Traditionally people send files as email attachments. This still works great, and supports encryption, but many mail servers are slow and cap the upper file size limit.

ICQ was a great solution back in the 1990’s, but those days are now over. (I still remember my number.)

A lot of folks use dropbox, which requires a dropbox account, and for you to trust them with your files.

If you want a simple solution that doesn’t need internet access (if you’re on a LAN, for example) you can use droopy and woof. These are two small python scripts that I keep in my ~/bin/. Droopy lets you receive a file from a sender, and woof lets you send one their way. Unfortunately, they don’t support ssl. This could be a project for someone. (Hint)

I recently patched droopy to add inline image support. I’ve emailed my patch to the author, but until it gets merged, you can get my patched version here. (AGPLv3+)

Hopefully these are helpful to you.

Happy hacking,

James

including a recursive tree of files with distutils

It turns out it is non-trivial (afaict) to include a tree of files (a directory) in a python distutils data_files argument. Here’s how I managed to do it, while also allowing the programmer to include manual entries:

import os
import distutils.core

NAME = 'project_name'
distutils.core.setup(
    # ...
    data_files=[
        ('share/%s' % NAME, ['README']),
        ('share/%s' % NAME, ['files/somefile']),
        ('share/%s/templates' % NAME, [
            'files/templates/template1.tmpl',
            'files/templates/template2.tmpl',
        ]),
    ] + [
        # one (install_dir, [files]) tuple per directory found by os.walk
        (os.path.join('share', NAME, root), [os.path.join(root, f) for f in files])
        for root, dirs, files in os.walk('the_directory')
    ],
    # ...
)

Since data_files is a list, I’ve just appended our specially generated list to the end. You can do this as many times as you wish. The list is a comprehension which builds each tuple as it walks through the requested directory. I’ve chosen a root installation directory of ${prefix}/share/project_name/the_directory/ but you can change this code to match your own specifications.
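If you need this for more than one directory, the comprehension can be pulled out into a small helper. This is only a sketch: the name tree_data_files is hypothetical, it assumes source_dir is a path relative to setup.py, and unlike the comprehension it skips directories that contain no files.

```python
import os


def tree_data_files(source_dir, install_root):
    """Build distutils-style (install_dir, [files]) tuples for a whole tree.

    Assumes source_dir is relative to the directory holding setup.py.
    """
    source_dir = os.path.normpath(source_dir)   # drop any trailing slash
    entries = []
    for root, dirs, files in os.walk(source_dir):
        dirs.sort()                              # walk in a stable order
        if not files:
            continue                             # skip directories with no files
        target = os.path.join(install_root, root)
        entries.append((target, [os.path.join(root, f) for f in sorted(files)]))
    return entries
```

You would then write something like data_files=[...] + tree_data_files('the_directory/', 'share/%s' % NAME) in setup.py, and repeat the call for each extra tree.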

Strangely, I couldn’t find this solution when searching the Internets, so I had to write it myself. Perhaps my google-fu is weak, and maybe this post needs to get some linkage to help out the rest of us python programmers.

Happy hacking,
James

 

finding your software install $prefix from inside python

Good python software developers tend to use distutils and include a setup.py with their code. The problem I often encounter is finding out which prefix your software has been installed in from within the python code. This might be necessary if you want to interact with some data that you’ve installed into: $prefix/share/projectname/ Here are the various steps:

1) Distutils:

import distutils.core

NAME = 'someproject'
distutils.core.setup(
    name=NAME,
    version='0.1',
    author='James Shubin',
    author_email='purpleidea@gmail.com',
    url='https://ttboj.wordpress.com/',
    description='This is an example project',
    # http://pypi.python.org/pypi?%3Aaction=list_classifiers
    classifiers=[
        'Environment :: Console',
        'Intended Audience :: System Administrators',
        'License :: OSI Approved :: GNU Affero General Public License v3',
        'Operating System :: POSIX :: Linux',
        'Programming Language :: Python',
        'Topic :: Utilities',
    ],
    packages=[NAME],
    package_dir={NAME: 'src'},
    data_files=[
        ('share/%s' % NAME, ['README']),
        ('share/%s' % NAME, ['images/something.png']),
    ],
    scripts=['somebin'],
)

2) Install:

python setup.py install --prefix=~/testprefix/

Note: If you don’t specify a prefix, then this will get installed into your system prefix.

3) Run:

cd ~/testprefix/ # the prefix you chose above
PYTHONPATH=lib/python2.7/site-packages/ ./bin/somebin

Note: If you didn’t specify a prefix above, then you don’t need to set the PYTHONPATH variable, and also, the executable will already be in your default $PATH

4) Prefix:

I have written a small python module which I include in all of my python software. It returns the project’s installed prefix when run. I usually use it like so:

print 'something.png is located at: %s' % os.path.join(prefix.prefix(), 'share', NAME, 'images', 'something.png')

5) Code:

Here is the code for prefix.py. I put this file under my projectname/src/ directory.

#!/usr/bin/python
# -*- coding: utf-8 -*-
"""Find the prefix of the current installation, and other useful variables.

Finding the prefix that your program has been installed in can be non-trivial.
This simplifies the process by allowing you to import <packagename>.prefix and
get instant access to the path prefix by calling the function named: prefix().
If you'd like to join this prefix onto a given path, pass it as the first arg.

Example: if [ `./prefix.py` ]; then echo yes; else echo no; fi
Example: x=`./prefix.py`; echo 'prefix: '$x
"""
# Copyright (C) 2010-2012  James Shubin
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU Affero General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU Affero General Public License for more details.
#
# You should have received a copy of the GNU Affero General Public License
# along with this program.  If not, see <http://www.gnu.org/licenses/>.

__all__ = ('prefix', 'name')
#DEBUG = False

import os
import sys

def prefix(join=None):
    """Returns the prefix that this code was installed into."""
    # constants for this execution
    path = os.path.abspath(__file__)
    #if DEBUG: print 'path: %s' % path
    name = os.path.basename(os.path.dirname(path))
    #if DEBUG: print 'name: %s' % name
    this = os.path.basename(path)
    #if DEBUG: print 'this: %s' % this

    # rule set
    rules = [
        # to match: /usr/lib/python2.5/site-packages/project/prefix.py
        # or: /usr/local/lib/python2.6/dist-packages/project/prefix.py
        lambda x: x == 'lib',
        # version_info avoids truncating two digit minor versions like 3.10
        lambda x: x == ('python%d.%d' % sys.version_info[:2]),
        lambda x: x in ['site-packages', 'dist-packages'],
        lambda x: x == name,    # 'project'
        lambda x: x == this,    # 'prefix.py'
    ]

    # matching engine
    while len(rules) > 0:
        (path, token) = os.path.split(path)
        #if DEBUG: print 'path: %s, token: %s' % (path, token)
        rule = rules.pop()
        if not rule(token):
            #if DEBUG: print 'rule failed'
            return False

    # usually returns: /usr/ or /usr/local/ (but without slash postfix)
    if join is None:
        return path
    else:
        return os.path.join(path, join)    # add on join if it exists!

def name(pop=None, suffix=None):
    """Returns the name of this particular project. If pop is a list
    containing one or more elements, name() will remove those items
    from the path tail before deciding on the project name. If there
    is an element which does not exist in the path tail, then raise.
    If a suffix is specified, then it is removed if found at end."""
    path = os.path.dirname(os.path.abspath(__file__))
    if pop is None: pop = []
    elif isinstance(pop, str): pop = [pop]    # force single strings to list
    else: pop = list(pop)    # copy, so the caller's list isn't emptied
    while len(pop) > 0:
        (path, tail) = os.path.split(path)
        if pop.pop() != tail:
            #if DEBUG: print 'tail: %s' % tail
            raise ValueError("Element doesn't match path tail.")

    path = os.path.basename(path)
    if suffix is not None and path.endswith(suffix):
        path = path[0:-len(suffix)]
    return path

if __name__ == '__main__':
    join = None
    if len(sys.argv) > 1:
        join = ' '.join(sys.argv[1:])
    result = prefix(join)
    if result:
        print result
    else:
        sys.exit(1)

Why this sort of thing isn’t built into python boggles my mind, so if for some reason you have a better solution, please let me know. Also, don’t be fooled by the red herring that is: sys.prefix

Happy hacking,
James

getopt vs. optparse vs. argparse

sooner or later you’ll end up needing to do some argument parsing. the foolish end up writing their own yucky parser that ends up having a big if statement filled with things like:

if len(sys.argv) > 1:

in it. don’t do this unless you have a really good excuse.

sooner or later, someone directs you to getopt, and you happily continue on with buggy manual parsing thinking you’ve “found the way”. useful in some circumstances, but should generally be avoided.

since you’re a good student, you read the docs, and one chapter later, you find out about optparse. higher level parsing! alright! the library that we all wanted to write, actually exists, and it seems to follow some ideals too. this i actually appreciate, and it is lovely to use. you dream about all programs using this common library and unifying the world. consistency is a dream.

you then remember that the positional syntax of cp, git, man, and friends actually does make sense, and you’d like for them not to change. you go on with life, hacking up optparse when needed. everything is pretty good, and you’re a seasoned coder by now, but sooner or later, someone sets you straight with a nice blog post like this.

there’s a new kid in town, and it’s called argparse. you read the docs, and you promise yourself to use standard argument styles. subparsers and types finally exist in a sensible way. you love the inheritance schemes, and you’re one step away from being able to complete your parsing code, but you still haven’t found that magic place in the manual that hides the precious answer you need. and now you have (probably the fourth code block down from that link, maybe also the fifth). why this was buried in with the api specs, i don’t know, but i’m glad it was there.
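for a taste of what subparsers and types look like in practice, here is a minimal sketch. the program name and the copy and repeat subcommands are made up for illustration, and this is not the code block from the manual mentioned above.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog='example')
    parser.add_argument('--verbose', action='store_true', help='print more detail')
    subparsers = parser.add_subparsers(dest='command')

    # "example copy SRC DST": positional arguments, like cp
    copy = subparsers.add_parser('copy', help='copy a file')
    copy.add_argument('src')
    copy.add_argument('dst')

    # "example repeat --count 3 WORD": type=int converts and validates for us
    repeat = subparsers.add_parser('repeat', help='repeat a word')
    repeat.add_argument('--count', type=int, default=1)
    repeat.add_argument('word')
    return parser


# parse an explicit list here; with no argument, parse_args() reads sys.argv
args = build_parser().parse_args(['--verbose', 'repeat', '--count', '3', 'hello'])
print(args.command, args.count, args.word, args.verbose)
```

each subcommand gets its own positional arguments and its own --help output, which is what keeps the cp and git style of interface intact.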

thanks to ivan for getting me to check out argparse in the first place.

the python subprocess module

i’m sure that i won’t be able to tell you anything revolutionary which can’t be found out by reading the manual, but i thought i would clarify it by showing you a specific example which i needed.

subprocess.Popen accepts a bunch of args, one of which is the shell argument, which is False by default. If you specify shell=True then the first argument of popen should be a string which is what gets parsed by the shell and then eventually run. (nothing revolutionary)

the magic happens if you use shell=False (the default), in which case the first argument then accepts an array of arguments to pass. this array exactly transforms to become the sys.argv of the subprocess that you’ve opened with popen. magic!

this means you could pass an argument like: “hello how are you” and it will get received as one element in sys.argv, versus being split up into 4 arguments: “hello”, “how”, “are”, “you”. it’s still possible to try to do some shell quoting magic, and achieve the same result, but it’s much harder that way.
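a quick way to convince yourself of this is to have the child report what it received. this is a small sketch: the helper name child_argv is made up, and it uses sys.executable so the child is the same python that runs the script.

```python
import json
import subprocess
import sys

# the child prints its own argument list as json so we can inspect it
child = 'import json, sys; print(json.dumps(sys.argv[1:]))'


def child_argv(args):
    """Run a child python with the given args and return the argv it saw."""
    out = subprocess.check_output([sys.executable, '-c', child] + list(args))
    return json.loads(out.decode())


print(child_argv(['hello how are you']))           # one element in, one argument out
print(child_argv('hello how are you'.split()))     # four elements, four arguments
```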


>>> _ = subprocess.Popen(['python', '-c', 'print "dude, this is sweet"'])
>>> dude, this is sweet

vs.


>>> _ = subprocess.Popen("python -c 'print \"dude, this isnt so sweet\"'", shell=True)
>>> dude, this isnt so sweet

and i’m not 100% sure how i would even add an ascii apostrophe for the isn’t.
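one way out of the quoting puzzle is to let the standard library do the quoting. the sketch below assumes a posix shell and a python new enough to have shlex.quote; it runs the same one-liner in both styles, apostrophe included.

```python
import shlex
import subprocess
import sys

code = 'print("dude, this isn\'t so sweet")'

# list form: no shell is involved, so the apostrophe needs no escaping at all
out_list = subprocess.check_output([sys.executable, '-c', code])

# string form: let shlex.quote build the shell quoting for each word
command = '%s -c %s' % (shlex.quote(sys.executable), shlex.quote(code))
out_shell = subprocess.check_output(command, shell=True)

print(command)                        # shows the quoting that shlex.quote produced
print(out_list.decode(), end='')
print(out_shell.decode(), end='')
```

the printed command shows how much quoting the shell version needs, which is a decent argument for sticking with the list form.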

the second thing i should mention is that you have to remember that each argument actually needs to be split up; for example:


>>> _ = subprocess.Popen(['ls', '-F', '--human-readable', '-G'])
[ ls output ]

yes it’s true that you can combine flags into one argument, but that’s magic happening inside the program.

all this wouldn’t be powerful if we couldn’t pipe programs together. here is a simple example:


>>> p1 = subprocess.Popen(['dmesg'], stdout=subprocess.PIPE)
>>> p2 = subprocess.Popen(['grep', '-i', 'sda'], stdin=p1.stdout)
[ dmesg output that matches sda ]

i think it’s pretty self explanatory. now let’s say we wanted to add one more stage to the pipeline, but have it be something that usually gets executed with os.system:


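import os
import subprocess

p1 = subprocess.Popen(['dmesg'], stdout=subprocess.PIPE)
# grep needs its own stdout pipe, otherwise p2.stdout is None
p2 = subprocess.Popen(['grep', '-i', 'sda'], stdin=p1.stdout, stdout=subprocess.PIPE)
p3 = subprocess.Popen(['less'], stdin=p2.stdout)
sts = os.waitpid(p3.pid, 0)    # block until less exits
print 'all done'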

this above example should all be pasted into a file and run; the call to waitpid is important, because it stops the interpreter from continuing on before less has finished executing.
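the same chaining can be wrapped up for any number of stages. this is a sketch, not part of the subprocess module: run_pipeline is a made-up helper, and the two python one-liners stand in for dmesg and grep so that it runs anywhere.

```python
import subprocess
import sys


def run_pipeline(commands):
    """Chain commands like a shell pipeline and return the last one's output."""
    procs = []
    previous = None
    for argv in commands:
        proc = subprocess.Popen(argv, stdin=previous, stdout=subprocess.PIPE)
        if previous is not None:
            previous.close()    # the child holds its own copy of the pipe end
        previous = proc.stdout
        procs.append(proc)
    output = procs[-1].communicate()[0]
    for proc in procs[:-1]:
        proc.wait()             # reap the earlier stages
    return output


# stand-ins for: dmesg | grep -i sda
producer = [sys.executable, '-c', 'for d in ("sda1", "sdb1", "SDA2"): print(d)']
matcher = [sys.executable, '-c',
           'import sys; sys.stdout.writelines(l for l in sys.stdin if "sda" in l.lower())']
print(run_pipeline([producer, matcher]).decode())
```

closing our copy of each intermediate pipe matters: without it, an earlier stage never notices when a later stage exits early. with the real tools the call would be run_pipeline([['dmesg'], ['grep', '-i', 'sda']]).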

hope this took the learning curve and guessing out of using the new subprocess module, (even though it actually has existed for a while…)

the power to yield a better console interface

as part of a different project, i needed to duplicate some existing terminal magic in python. what i needed to write was something similar to the getch() function in curses. it can be found in: ncurses*/base/lib_getch.c after doing an: apt-get source libncurses5

what’s the magic? i need to stay in a continuous loop reading from the file descriptor, however i want to return periodically so that gobject doesn’t block and the interface can remain responsive. enter: yield, who comes in and saves the day. see the accompanying code for specifics.
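the idea can be sketched without any gobject at all. this is not the greadline code: read_chars is a made-up name, it assumes a posix system where select works on the file descriptor, and the pipe only stands in for a terminal.

```python
import os
import select


def read_chars(fd, timeout=0.05):
    """Yield one character at a time from fd, or None when nothing is ready.

    Yielding None hands control back to the caller, so an event loop can
    service other work between polls instead of blocking on the read.
    """
    while True:
        ready, _, _ = select.select([fd], [], [], timeout)
        if not ready:
            yield None              # nothing to read yet: let the caller run
            continue
        data = os.read(fd, 1)
        if not data:
            return                  # end of file: stop the generator
        yield data.decode('latin-1')    # latin-1 maps each byte to one character


r, w = os.pipe()
chars = read_chars(r, timeout=0)
print(next(chars))                  # nothing has been written yet
os.write(w, b'hi')
print(next(chars), next(chars))
os.close(w)
print(list(chars))                  # end of file ends the generator
os.close(r)
```

a main loop callback would then call next() once per invocation and return straight away, which is how the interface stays responsive without threads.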

as part of the bigger scheme, i wanted to write a console like interface for talking to a dbus server that both allows you to run methods, and receive signals. i wanted to use gobject, and i didn’t want to use threads! and because i wanted to make it pro, i decided it should look and feel like my standard bash shell (except prettier). it’s intended to be easy to use, and running the module will give you an example session. check back for a more complete and expressive code base.

1) if anyone knows how to do a ualarm (not alarm) in python, help is appreciated. either through ctypes or as a c extension for python.

2) please leave me your comments on my greadline module! more features are on the way, such as history and cursor support.

code: http://www.cs.mcgill.ca/~james/code/greadline.tar