Geek on a Rock

Coding in Bermuda ... and other stuff

About Me

My name is Damion Wilson and I've been a programming enthusiast for a while now (anyone else remember the Apple II ? Atari 800 ? Vic-20 ? I guess not).

I've been lucky to have been employed programming in the banking, telecoms, trading, and reinsurance industries and even did some sysadmin/devops work too. Even so, I can't help writing code all the time (someone should eventually stop me).

Right now, I'm in reinsurance working on risk systems with a group of really smart characters, and have a wonderful wife and two daughters, which is about the best anyone can hope for.

I won't bother to list everything I've written code in, but what I like to write in these days are Python (Pandas, NumPy, Django), C++ (Boost), C (yes, still), Erlang, and SQL (PostgreSQL, SQL Server). I have a weakness for network and systems programming, as evidenced by the rather ancient CIPE-Win32, but I've also been known to do GUI work in Perl, as evidenced by the similarly ancient Tk-DKW.

My more recent stuff will (eventually) appear here

Other interests are cycling, martial arts, cabinet carpentry and gardening

A Command Line Wrapper for Bash Applications

The Problem

Writing applications in Bash is considered ill-advised by many. But why?

Indeed, as the mother of scripting languages, its lineage leads directly to Lua, Python, Perl, Ruby, etc. So why shouldn't decent programming practices yield reliable, well-engineered solutions here too?

Of course Bash supports this kind of use, despite the arcane syntax (still not worse than APL !). But writing something that offers a command line interface that's intuitive and fits in with Unix's 'building block' approach requires quite a bit of boilerplate.

Basic Treatment

For those who don't know, getopt is a standalone command (not to be confused with the getopts builtin) that allows the specification and handling of command line options. You invoke it with a command line options specification and the list of command line arguments you want it to process, and it emits the processed options and their values, which you capture in a shell variable


#!/usr/bin/env bash

OPTION_TEMP=$(getopt --options "h" --longoptions "help" -- "$@")
eval set -- "$OPTION_TEMP" # Replace original options array

while true
do
    case "$1" in
        --help|-h)
            echo "Show help message"
            shift
            ;;

        --)
            shift
            break
            ;;
    esac
done

As you add options, the specification and the case statement get bigger. Also, there's no way to tie the option specs to descriptions...

Optionslib

After wrapping the behaviour in a basic, useful pattern and finding that I was duplicating the code (everywhere!), I faced the fact that it was time to make a generic implementation

The resulting work lives here: https://github.com/damionw/optionslib

It soon becomes quite clear that I've been (mal)affected by Python's argparse.

Installation

The makefile will install the package in /usr/local/bin and /usr/local/lib


$ git clone https://github.com/damionw/optionslib
$ cd optionslib
$ make install

Usage

We can test it out with a simple bash script


#!/usr/bin/env bash

. $(optionslib --lib)

logging_level=info

optionslib::parse::description "This is a demo application"

optionslib::parse::config "
    long_options=--help short_options=-h action=show_help description='Display instructions'
    long_options=--logging: action=store_var name=logging_level description='Set logging level'
"

if ! optionslib::parse::parse_arguments "$@"
then
    exit 255
fi

echo "Logging level = $logging_level"

Saving this program as /tmp/optionsdemo and running it yields...


$ /tmp/optionsdemo --help
Usage: /tmp/optionsdemo [--help] [--logging=]

This is a demo application

    --help           Display instructions
    --logging:       Set logging level

$ /tmp/optionsdemo --logging=info

Logging level = info

Follow Up

You're invited to experiment. Options are comma separated and the actions that optionslib::parse::config() knows about are:

  • store_var - stores values in the shell variable indicated by name
  • show_help - displays the help message and exits
  • command - runs the shell function indicated by name, passing the parameter value (sketched below)
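
If the config format follows the demo above, a hypothetical script using the command action might look like this. The --mode option and the set_mode handler are invented for illustration; only the actions and the config syntax shown earlier come from the library.


#!/usr/bin/env bash

. $(optionslib --lib)

# Hypothetical handler: the 'command' action should call this with the option's value
set_mode() {
    echo "Mode requested: $1"
}

optionslib::parse::description "Demo of the command action"

optionslib::parse::config "
    long_options=--help short_options=-h action=show_help description='Display instructions'
    long_options=--mode: action=command name=set_mode description='Select an operating mode'
"

optionslib::parse::parse_arguments "$@" || exit 255

Assuming the library behaves as described, running the script with --mode=verbose would invoke set_mode with the value verbose.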

A Case for Modular Bash Applications

What were you thinking ???

As an old Unix developer (circa 1986), I've successfully written and deployed mission-critical applications and services using shell. Treated as a proper programming language, Bash can be even more capable.

However, some of the drawbacks are difficult to ignore:

  • No Standard Library Concept
  • Arcane Syntax
  • Ambiguous Features
  • No Namespaces
  • Slow
Some of these concerns can be mitigated however...

File layout for a Bash Application

Historically, bash programs are single-file affairs. Why? Because there aren't standardised expectations for deploying and importing application libraries.

Binary applications commonly deposit shared objects in /usr/lib or /usr/local/lib. Python package management, similarly, creates and manages modules beneath /usr/local/lib/python-. Bash has no standard package management, but that doesn't mean we can't leverage standard library directories in the same way.

Flubber : A demo bash application

Let's call our test application flubber. Here's the on disk layout


/usr
    /bin
        /flubber
    /lib
        /flubber-0.01
            /features

        /flubber

/usr/bin/flubber is the program entry point. What does it do ?


#!/usr/bin/env bash

# Locate the library path
binary_file="$(readlink -f "${BASH_SOURCE[0]}")"
binary_path="$(dirname "${binary_file}")"
binary_name="$(basename "${binary_file}")"
library_import_file="$(readlink -f "${binary_path}/../lib/${binary_name}")"

This preamble assumes that the library will be located relative to the binary, in this case in /usr/lib/flubber. Then, we import the library with


. "${library_import_file}"

This lets us divorce the binary from the library code. It also lets other entities, or even an interactive session, import the library's namespace elements and functionality directly. More on this later. For now, let's look at /usr/lib/flubber...


#!/usr/bin/env bash

library_file="$(readlink -f "${BASH_SOURCE[0]}")"
library_path="$(dirname "${library_file}")"
package_name="$(basename "${library_file}")"

Similar file location behaviour as in the binary, but we're actually interested in determining something else...


export __FLUBBER_VERSION__="$(
    find "${library_path}/${package_name}"-[.0-9]* -maxdepth 0 -mindepth 0 -type d -printf "%f\n" |
    awk -F- '{print $NF;}' |
    sort -nr |
    head -1
)"

... which is capturing the numbered subdirectories prefixed with the package name; in this case there's only flubber-0.01. We then use that subdirectory to import the component modules.

There's only one: features.


import_path="${library_path}/${package_name}-${__FLUBBER_VERSION__}"

. "${import_path}/features"

After we import the components, for good measure, we clear the shell's cache of command locations with hash -r


hash -r

The file /usr/lib/flubber-0.01/features provides functions specific to flubber features


#!/usr/bin/env bash

flubber::features::construct() {
    echo "FLUBBER construction"
}

We're only defining one function in this module, using namespace naming semantics to forgo collisions with other libraries or modules.
Bash, of course, doesn't care that we're using colons in function names. The point is that flubber will be able to call flubber::features::construct and we'll understand what the function is and where it's coming from.
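
To illustrate why the convention matters, here's a purely hypothetical second module, say /usr/lib/flubber-0.01/logging (not part of the example layout above), whose functions can't collide with the features module or with anything else we import:


#!/usr/bin/env bash

# Hypothetical companion module: /usr/lib/flubber-0.01/logging
# The flubber::logging:: prefix keeps these names clear of
# flubber::features:: and of any other imported library

flubber::logging::message() {
    echo "$(date '+%Y-%m-%d %H:%M:%S') FLUBBER: $*" >&2
}

flubber::logging::error() {
    flubber::logging::message "ERROR: $*"
}

Importing it alongside features would just be one more dot-sourcing line in /usr/lib/flubber, e.g. . "${import_path}/logging"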

Using Flubber

Now we can finish off /usr/bin/flubber. We won't get into getopt based command line options handling just yet (another post will cover that). However, we will accept a single command line option


case "$1" in
    --lib)
        echo "${library_import_file}"
        shift
        exit 0
        ;;
esac

So, if we run flubber, we'll get the library import entry point


$ flubber --lib

/usr/lib/flubber

We can now import the namespace from the command line


$ . $(flubber --lib)
$ flubber::features::construct

FLUBBER construction

We can also endow the flubber application with the same capability, essentially 'exporting' the internal behaviour via the user interface. If we rewrite the case statement to add another option --doit


case "$1" in
    --lib)
        echo "${library_import_file}"
        shift
        exit 0
        ;;

    --doit)
        flubber::features::construct
        shift
        exit 0
        ;;
esac

We can run it like this


$ flubber --doit

FLUBBER construction

What Next?

  • As mentioned, there's no getopt parameter handling here, but there is a pattern (and a library) to handle that.

  • Adding features to flubber now consists of changing or adding files in /usr/lib/flubber-0.01. New versions should be exposed by renaming to, or adding, /usr/lib/flubber-0.0x

  • Makefiles for existing applications can be augmented to place components into the appropriate directory layout and monolithic code blocks can then be reliably refactored into discrete components.

A representative example of this pattern, which is also the library providing getopt integration, can be found here: https://github.com/damionw/optionslib

Shell Function keyword arguments

The Problem

For those of you who might be unfamiliar, BASH, or the Bourne Again Shell, really is a complete programming language, replete with every facility you might want.

However, there are features it infuriatingly lacks that one might have experienced in other languages, such as Erlang, Perl or Python

Having worked in Python constantly since 2008, I'm partial to the amenities offered there, especially when I'm using Bash as 'glue' to string together tools that are themselves written in Python.

To this end, one such feature is the ability in Python to specify function parameters either by keyword or by position.

And, in a moment of heightened irritation, I had to have it...

What it looks like

Here's a python function with two parameters:


def myfunc(one, two):
    print one
    print two

We can call it many ways:


myfunc(one=1, two=2)
myfunc(1, 2)
myfunc(1, two=2)

Each invocation ensures that, within the body of myfunc(), variables one and two are given the respective values 1 and 2

BASH Use Case

So, what would this look like in BASH?

We can't rely on BASH itself, which doesn't offer keyword parameters on its own. All it knows about are positional parameters. However, they're all strings, so what if we passed parameters that looked like this:


myfunc "one=1" "two=2" 3 4

Ok, so within the function, we have an arrangement where a positional string value could be in a special form, with a keyword name and an = sign prefixed to the value


$1="one=1"
$2="two=2"
$3="3"
$4="4"

After that, we'd need to preprocess the argument list, evaluate all the values, and populate a keyword hash table (an associative array, supported in Bash >= 4.0)


eval "$(argument_formatter "$@")"

Which would populate a hash table named kwargs and a positional array named args (echoing Python's args/kwargs naming)


$ echo ${kwargs["one"]}

1

$ echo ${kwargs["two"]}

2

$ echo ${args[0]}

3

$ echo ${args[1]}

4

How to mix positional and keyword interpretations is left up to the function itself. Given a call like


myfunc one=1 2

Extracting the values could look like so...


$ one=${kwargs["one"]:-${args[0]}}
$ two=${kwargs["two"]:-${args[1]}}
$ echo $one

1

$ echo $two

2

...ensuring the kwarg is preferred, but the positional argument is used as the default

Implementation

But how do we implement this feature? Glad you asked. We'll name our function argument_formatter


argument_formatter() {
...
}

Inside, we first create separate arrays to hold the keyword and the positional arguments as we parse them from the parameter list. Note the difference in declaration styles for the associative vs indexed arrays.


    local -A _keyword_args=()
    local -a _positional_args=()

Then, let's loop through them, looking for entries of the form <identifier>=<value>. Those'll be the keyword args, anything else is positional. We have to split the keywords from their values so we can store them in the associative array


    for _parameter in "$@"
    do
        if (echo "${_parameter}" | grep -q '^[[:alpha:]][[:alnum:]]*[=]')
        then
            _key="$(echo "${_parameter}" | sed -e 's/=.*$//g')"
            _value="$(echo "${_parameter}" | sed -e 's/^[^=]*=//g')"
            _keyword_args["${_key}"]="${_value}"
        else
            _positional_args[${#_positional_args[@]}]="${_parameter}"
        fi
    done

Then, since we're using eval, we just print out the bash source required to create the kwargs in the target scope


    echo -n "local -A kwargs=("
    for _key in "${!_keyword_args[@]}"
    do
        echo -n " [$"${_key}$"]=$"$(echo "${_keyword_args["${_key}"]}" | sed -e 's/$"/$$$"/g')$""
    done
    echo " )"

And, likewise, the source required to create the positional args in the target scope


    echo -n "local -a args=("
    for _value in "${_positional_args[@]}"
    do
        echo -n " $"$(echo "${_value}" | sed -e 's/$"/$$$"/g')$""
    done
    echo " )"

That's it! Once eval'ed, the calling function will have the kwargs and args arrays defined locally
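
Putting it to use, a caller might look like the hypothetical function below. The greet function and its option names are made up; argument_formatter is the function we just assembled.


greet() {
    # Bring the kwargs and args arrays into this function's scope
    eval "$(argument_formatter "$@")"

    # Prefer the keyword value, fall back to the positional one
    local name="${kwargs["name"]:-${args[0]}}"
    local greeting="${kwargs["greeting"]:-Hello}"

    echo "${greeting}, ${name}"
}

greet name=World                 # -> Hello, World
greet greeting=Goodbye "Andrea"  # -> Goodbye, Andrea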

You can see the full implementation here: https://github.com/damionw/bashLib/blob/master/src/lib/bashLib-0.12/arguments or clone https://github.com/damionw/bashLib and try it out from there.

I hope that was informative !

Reference Counted Process Group Supervision

The Provocation

Some months ago, in a flurry of frustration-provoked coding, I implemented my own email client. At some point it'll make a blog post but, for now, suffice it to say that my once torrid love affair with KMail from KDE3 ended with the advent of KDE4. As an aside, it's sad that software projects can and do reach perfection and then pass it by in their quests for relevance.

Email Services

Writing an email client involves essentially four elements, which can all be represented as separate services, though in mass-market clients they're strangely all bundled into a single executable, binding them unnecessarily:

Retrieval
Dispatching
Indexing
Viewing

In my case, I merged the Retrieval and Indexing into a single service, but I'm sure you get the picture

The nature of email clients is that, typically, the user will start the email program just after login, which is when any queued messages are sent, new messages are received, and the UI is displayed with whatever viewing preferences the user last selected.

In my case, each separate service needed to be started, too, but with a difference. Since the UI is only needed for me to view email, I don't need to start the UI then (more on that later), but email collection and dispatching should start happening independently.

With the pattern that emerged, it became clear that email collection and dispatch were services linked to my login. But which login ? I'm on Linux, where I could log in via ssh and conceivably want my email to start working. The keyboard and screen interface is only one part of the user experience.

So, it seemed that I needed a service 'group', consisting of three services:

email_send
email_retrieval
webmail_service

The webmail service materialised after I decided not to use a standalone Qt UI and, instead, implemented the client using a browser interface. The services are not interdependent in this case, but they are connected by a common file structure and by the fact that they are all subservient to at least one active user login session.

Supervised Processes with Reference Counting

After this set of requirements became clear, I implemented a Bash script that, when invoked, would test for the presence of a process group supervisor, which was responsible for ensuring that the three services were alive and maintaining the list of dependent processes. If the supervisor were not present, it would be started and then it would launch the three services.

The script would also inform the supervisor of the process id (PID) of the shell that invoked it, adding it to the list of dependents. Subsequently, the running supervisor would delete PIDs from the list as those processes disappeared or when explicitly commanded to do so.

Once the list of dependents was empty, the supervisor would shut down, after stopping its services.

After running the suite for several months, it became apparent that this was the arrangement I'd been missing the whole time! Indeed, I was provoked to add on other services, such as a notifier that fades the incoming subject lines in and out on the screen independently of an active email client.

Chaining

Soon, it became obvious that what was needed was a general purpose tool that would maintain arbitrary process groups, utilising the same approach. So, after another flurry of vicious coding, I produced such a tool which can be seen here

Normally, I'd show the code to implement this, but that's going to be a turnoff unless you're hot for Bash code. No, what's more interesting is what's possible with reference counted process groups.

The process groups can be chained. Let's imagine that for a moment:
Each of the managed services in a process group can run the same Bash script and express interest in another named group of services. Doing so introduces a managed group dependency, ensuring that, as long as the first group is running, the other group will be running too.

And, as it turns out, this is exactly what happens when an init system runs, or supervisord, or any of the other process supervision/startup frameworks. This approach differs in that it supports a dendritic dependency graph but does not attempt to manage it from the top down. And, if any group is explicitly stopped by pruning its list of dependents, all of the groups that list it as a dependent will be stopped if it was their only dependent.

I'll stop there, before I start showing you any Bash, but I encourage you to play with the tool and explore what can be done with these kinds of process groups.

Is Rational Thought Dying ?

I've observed a troubling theme appearing with increasing frequency in modern discourse. Available news sources and personal interactions all seem to indicate, at least anecdotally, that consideration of rationality has become something of an anachronism.

Ok, ok. Maybe that's just a bit too doom and gloom for a personal blog, but hear me out.

Historically, great effort was put into fitting preordained conclusions into at least a facade of reasonable explanation, making the subsequent decision making seem driven by confirmable evidence. Debate skills were in high demand for such work, whether for national or local politics, business, even science and engineering. In this regime, dissenters at least had a chance of provoking change since, if the evidence could be shown to be incorrect, and hence the conclusions, then reversals could be forced. Not always, but often enough. Famous examples of this kind of change are evident in fights for voting rights and civil actions against misbehaving corporations.

But now...

Now we see that the occurrences of such manoeuvring are disappearing. The new doctrine is to state one's goals from a position of authority or power. This effort does seem to (still) be accompanied by some attempt at providing a reason or a motivation for the subsequent actions, but the explanations are less substantial, less well built to resist challenge. Indeed, when challenges are made, the arguments are defended weakly or are abandoned, leaving the original behaviour in place with only the provocateur's authority to support it. Examples of this are found in the Global Warming 'debates', and denials of marriage rights.

Now, it's important to recognize that the behaviour of the powerful will not be limited by 'the people' so much as it will be limited by the self-policing of the powerful in an attempt to appear rational. The 'powerful' in this context simply refers to those with the ability to control policy outcomes that affect people's lives other than their own.
This could refer to people like judges, business owners, politicians, and even heads of criminal enterprise. But it could also mean climate scientists or biochemists employed in perverse circumstances. Of course, any one of us is susceptible to the twin sins of vanity and lust.

So, it appears that what we're actually losing is the desire of people in positions of control to appear to be rationally driven. Is this really a bad thing? Given that people will ultimately do what they want anyway, are we really losing anything by dispensing with the facade?

Well, I certainly think so, and here's why:

When someone attempts to enact policy, any attempt to fit that policy into a framework of rationality will change that policy. This is inescapable. The narrative acts as a form of logical cage that will illuminate any part of the policy that pokes out through the wires. Any actor keen to avoid that form of scrutiny will address those weaknesses. Though changing the narrative is an option, that's harder, since it needs to be verifiable to work.

By now, you're wondering why, in this anecdotally supported thesis of mine, we have dispensed with this particular form of delusion. In a word: inattention.

The evidence abounds that, in general, entire national populations are less engaged in the aspects of their own governance and economic health.

This is a terrifying prospect, and I really hope it's not true, but I can't escape that conclusion, hard as I might try. In essence, people have become consumed more and more with basic needs and are less inclined, or even able, to hold those in power accountable using the knowledge available to them. Even when there's more information available to them than at any time in human history.

I don't have a recipe for correcting this kind of circumstance. Indeed, the perversion is obvious. There's no reason that anyone previously under scrutiny, having escaped that scrutiny, would invite it back. More likely, that person will do everything possible to prevent said scrutiny from reappearing. And it's just so easy to give those that would otherwise be interested in your behaviour something else to trouble themselves with...

Of course, the savvy will correctly note that rational decision making as a concept is a recent thing (hey, they didn't call them the 'dark ages' for nothing). And the Romans themselves only had rules for those who weren't powerful enough to break them. Furthermore, it's only pertinent in functioning democracies. But that's just the thing. Democracies have been making the case, for centuries now, that this form of applied logic is the best thing ever. Progress in that context has always been toward making this form of thinking more prevalent. It worked for science, right?

Unfortunately for us, though, the big problems can only be solved when we come together on them. And we can't do that unless we can first agree on what's true. And that means talking sensibly about things.

Of course these opinions are only my own, and they're, of course, really just opinions :-)

Python Object Proxying

I stumbled upon this recipe and couldn't resist applying it to a common problem: converting linear computations into parallel tasks with asynchronous result delivery.

But, enough about that for a while. What we're really here for is a simple implementation and use case for this delectable technique

This time, we'll dive right into the implementation. Here is the basic class, without any augmentation


class BasicObjectProxy(object):
    __slots__ = ["_obj", "__weakref__"]

    def __init__(self, obj=None):
        object.__setattr__(self, "_obj", obj)

    #
    # proxying (special cases)
    #
    def __getattribute__(self, name):
        return getattr(object.__getattribute__(self, "_obj"), name)

    def __delattr__(self, name):
        delattr(object.__getattribute__(self, "_obj"), name)

    def __setattr__(self, name, value):
        setattr(object.__getattribute__(self, "_obj"), name, value)

    def __nonzero__(self):
        return bool(object.__getattribute__(self, "_obj"))

    def __str__(self):
        return str(object.__getattribute__(self, "_obj"))

    def __unicode__(self):
        return unicode(object.__getattribute__(self, "_obj"))

    def __repr__(self):
        return repr(object.__getattribute__(self, "_obj"))

    def _proxy_synchronise_(self):
        pass

    @classmethod
    def _create_class_proxy(cls, theclass):
        """creates a proxy for the given class"""

        def make_method(name):
            def method(self, *args, **kw):
                return getattr(object.__getattribute__(self, "_obj"), name)(*args, **kw)

            return method

        _proxied_accessor_methods = [
            '__abs__', '__add__', '__and__', '__call__', '__cmp__', '__coerce__',
            '__contains__', '__delitem__', '__delslice__', '__div__', '__divmod__',
            '__eq__', '__float__', '__floordiv__', '__ge__', '__getitem__',
            '__getslice__', '__gt__', '__hash__', '__hex__', '__iadd__', '__iand__',
            '__idiv__', '__idivmod__', '__ifloordiv__', '__ilshift__', '__imod__',
            '__imul__', '__int__', '__invert__', '__ior__', '__ipow__', '__irshift__',
            '__isub__', '__iter__', '__itruediv__', '__ixor__', '__le__', '__len__',
            '__long__', '__lshift__', '__lt__', '__mod__', '__mul__', '__ne__',
            '__neg__', '__oct__', '__or__', '__pos__', '__pow__', '__radd__',
            '__rand__', '__rdiv__', '__rdivmod__', '__reduce__', '__reduce_ex__',
            '__repr__', '__reversed__', '__rfloordiv__', '__rlshift__', '__rmod__',
            '__rmul__', '__ror__', '__rpow__', '__rrshift__', '__rshift__', '__rsub__',
            '__rtruediv__', '__rxor__', '__setitem__', '__setslice__', '__sub__',
            '__truediv__', '__xor__', 'next',
        ]

        return type(
            "%s(%s)" % (cls.__name__, theclass.__name__),
            (cls,),
            {
                _key: make_method(_key)
                for _key
                in _proxied_accessor_methods
                if hasattr(theclass, _key) and not hasattr(cls, _key)
            }
        )

    def __new__(cls, obj, *args, **kwargs):
        proxied_type = obj.__class__

        try:
            cache = cls.__dict__["_class_proxy_cache"]
        except KeyError:
            cls._class_proxy_cache = cache = {}
        try:
            synthesized_class = cache[proxied_type]
        except KeyError:
            cache[proxied_type] = synthesized_class = cls._create_class_proxy(proxied_type)

        return object.__new__(synthesized_class)

The entire trickery for BasicObjectProxy is contained in the overloaded __new__() and the class method _create_class_proxy()

__new__() uses the wrapped object's type and calls _create_class_proxy() to produce a new type that is built specifically to masquerade as the original type.

The artificial type (possibly cached) is used to instantiate a new object, which is returned from the 'constructor'.

The wrapped object is then passed to __init__() where it is captured to provide the underlying value for all of the proxy object's accessor methods.

_create_class_proxy() iterates over a static list of possible attribute names, and creates wrapper methods to implement each call that the wrapped type supports.
In this case, the wrapper methods just delegate to the underlying object, so the proxy appears to be no different than the original object.

So, now, we have a proxy class that can be used like so:


>>> value = BasicObjectProxy(22)

>>> print value

22

>>> print type(value)

<class '__main__.BasicObjectProxy(int)'>

>>> print value + 11

33

Not bad at all, but we want more...

What we really want is to intercept each accessor call and run a hook of our own before it proceeds.

To accomplish that, we have to do some work:


class BasicObjectProxy(object):
    __slots__ = ["_obj", "__weakref__"]

    def __init__(self, obj=None):
        object.__setattr__(self, "_obj", obj)

    #
    # proxying (special cases)
    #
    def __getattribute__(self, name):
        object.__getattribute__(self, "_proxy_synchronise_")()
        return getattr(object.__getattribute__(self, "_obj"), name)

    def __delattr__(self, name):
        object.__getattribute__(self, "_proxy_synchronise_")()
        delattr(object.__getattribute__(self, "_obj"), name)

    def __setattr__(self, name, value):
        object.__getattribute__(self, "_proxy_synchronise_")()
        setattr(object.__getattribute__(self, "_obj"), name, value)

    def __nonzero__(self):
        object.__getattribute__(self, "_proxy_synchronise_")()
        return bool(object.__getattribute__(self, "_obj"))

    def __str__(self):
        object.__getattribute__(self, "_proxy_synchronise_")()
        return str(object.__getattribute__(self, "_obj"))

    def __unicode__(self):
        object.__getattribute__(self, "_proxy_synchronise_")()
        return unicode(object.__getattribute__(self, "_obj"))

    def __repr__(self):
        object.__getattribute__(self, "_proxy_synchronise_")()
        return repr(object.__getattribute__(self, "_obj"))

    def _proxy_synchronise_(self):
        pass

    @classmethod
    def _create_class_proxy(cls, theclass):
        """creates a proxy for the given class"""

        def make_method(name):
            def method(self, *args, **kw):
                object.__getattribute__(self, "_proxy_synchronise_")()
                return getattr(object.__getattribute__(self, "_obj"), name)(*args, **kw)

            return method

        _proxied_accessor_methods = [
            '__abs__', '__add__', '__and__', '__call__', '__cmp__', '__coerce__',
            '__contains__', '__delitem__', '__delslice__', '__div__', '__divmod__',
            '__eq__', '__float__', '__floordiv__', '__ge__', '__getitem__',
            '__getslice__', '__gt__', '__hash__', '__hex__', '__iadd__', '__iand__',
            '__idiv__', '__idivmod__', '__ifloordiv__', '__ilshift__', '__imod__',
            '__imul__', '__int__', '__invert__', '__ior__', '__ipow__', '__irshift__',
            '__isub__', '__iter__', '__itruediv__', '__ixor__', '__le__', '__len__',
            '__long__', '__lshift__', '__lt__', '__mod__', '__mul__', '__ne__',
            '__neg__', '__oct__', '__or__', '__pos__', '__pow__', '__radd__',
            '__rand__', '__rdiv__', '__rdivmod__', '__reduce__', '__reduce_ex__',
            '__repr__', '__reversed__', '__rfloordiv__', '__rlshift__', '__rmod__',
            '__rmul__', '__ror__', '__rpow__', '__rrshift__', '__rshift__', '__rsub__',
            '__rtruediv__', '__rxor__', '__setitem__', '__setslice__', '__sub__',
            '__truediv__', '__xor__', 'next',
        ]

        return type(
            "%s(%s)" % (cls.__name__, theclass.__name__),
            (cls,),
            {
                _key: make_method(_key)
                for _key
                in _proxied_accessor_methods
                if hasattr(theclass, _key) and not hasattr(cls, _key)
            }
        )

    @classmethod
    def _allocator(cls, proxied_type):
        try:
            cache = cls.__dict__["_class_proxy_cache"]
        except KeyError:
            cls._class_proxy_cache = cache = {}
        try:
            synthesized_class = cache[proxied_type]
        except KeyError:
            cache[proxied_type] = synthesized_class = cls._create_class_proxy(proxied_type)

        return object.__new__(synthesized_class)

    def __new__(cls, obj, *args, **kwargs):
        return cls._allocator(obj.__class__)

Not too much has changed.

First, we instituted a call to a method named _proxy_synchronise_() whenever one of the accessor methods is called.

Because we can't trust the object's symbol table anymore, we have to call object.__getattribute__() every time we want a 'real' proxy object attribute.

In this class, we don't bother to endow _proxy_synchronise_() with any functionality, since we aim to subclass from it

We've also separated the __new__() from the wrapper class synthesis, which is now held in class method _allocator()

So, with the new base class implementing the 'guts' of the proxying work, we can do anything we want in a derived class...


def ProxyObjectFactory(proxied_type, value_callback):
    class _Internal(BasicObjectProxy):
        def __new__(cls, *args, **kwargs):
            return cls._allocator(proxied_type)

        def __init__(self, initial_value):
            super(_Internal, self).__init__()

            if type(initial_value) == proxied_type:
                object.__setattr__(self, "_obj", initial_value)

        def _proxy_synchronise_(self):
            previous_value = object.__getattribute__(self, "_obj")
            object.__setattr__(self, "_obj", value_callback(previous_value))

    return _Internal

Of course, we're doing more than just subclassing, but to what end ?

Well, we want to be able to create classes which wrap functions, and then use those classes to create any number of objects backed by the same underlying function.
To that end, ProxyObjectFactory needs a type to masquerade as and a function, which it uses in place of the wrapped object that supplied both the target type and the underlying value in the original implementation.

As we can see, _proxy_synchronise_() is overloaded to fetch a value using the provided callback method. In this case, the previously held wrapped value is passed into it. That new value is then used to replace the wrapped value.

Now we can synthesize a new wrapper type...


>>> IntegerFunctionProxy = ProxyObjectFactory(int, lambda previous: previous * 2)

implementing the interface for the int type and a function to multiply the previous value by 2.
We make a wrapped instance this way, storing the initial value 1...

>>> value = IntegerFunctionProxy(1)

We can probe the proxy's values, which continue to double per the underlying function

>>> print value

2

>>> print value

4

>>> print value

8

>>> print value + 7

23

I think you get the idea

The callback in this case could be any function that takes an int and returns an int.

One can already see that using a function to initiate an asynchronous request and letting the proxy object marshal the return value could be trivial...
...well, almost trivial
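
As a rough, hypothetical sketch of that idea (slow_square, LazyIntProxy and the threading plumbing are all invented here; only ProxyObjectFactory comes from above), the proxy can hide a background computation and only wait for it when the value is first touched:


import threading
import Queue

def slow_square(x, channel):
    # Stand-in for an expensive or remote computation
    channel.put(x * x)

channel = Queue.Queue(maxsize=1)
threading.Thread(target=slow_square, args=(11, channel)).start()

_result = []

def _fetch(previous):
    # Block for the background result on first access, reuse it afterwards
    if not _result:
        _result.append(channel.get())
    return _result[0]

LazyIntProxy = ProxyObjectFactory(int, _fetch)

value = LazyIntProxy(0)   # the computation is already running elsewhere
print value + 1           # waits here if necessary, then prints 122

The 'almost' is in details the sketch ignores, like error propagation and making _fetch safe to call from more than one thread.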

I hope that was entertaining. Until next time ...

Python Decorator Fun #2 : Prototyped Functions

Exposure to Erlang, as is common with any developer with multi-language experience, has caused me to wish for similar features that are otherwise absent from Python.

But some are not so difficult to implement, and one of these is Erlang's function signatures.

In short, Erlang functions are differentiated by the parameter types and their order, like C++ for instance, but also can be differentiated by the values that some or all of the parameters may have.

This is really cool, because it takes out a lot of value checking and comparisons in what should otherwise be a 'clean' representation of the problem's solution. Indeed, this is one of the goals of functional programming itself, a paradigm which Erlang embodies and which Python supports.

Here is a simple Erlang case


start_registered(ModuleName) ->
    start_registered(ModuleName, []).

start_registered(ModuleName, Arguments) ->
    start_registered(ModuleName, ModuleName, Arguments).

start_registered(RegisterName, ModuleName, Arguments) ->
    {ok, Pid} = start_link(ModuleName, Arguments),
    register(RegisterName, Pid),
    {ok, Pid}.

We won't get into the weird Erlang syntax (for Pythonistas) but we are defining three forms of the same function, start_registered.


start_registered(ModuleName) ->
    start_registered(ModuleName, []).

This form takes a single module name and calls the same function with that module name followed by an empty parameter list


start_registered(ModuleName, Arguments) ->
    start_registered(ModuleName, ModuleName, Arguments).

This takes a module name and a list of arguments, calling itself with the module name passed twice (once as the name to register the task under, once as the module name), followed by the parameter list


start_registered(RegisterName, ModuleName, Arguments) ->
    {ok, Pid} = start_link(ModuleName, Arguments),
    register(RegisterName, Pid),
    {ok, Pid}.

And this form takes a name to register the task under as its first parameter, followed by the module name and the list of arguments.

This kind of use saves having to define a single function that has to parse the intent of the caller on whether to use defaults or not.

What's not shown is the ability for Erlang to also employ value checking on any of the function parameters, called guards.

But Python has no strict typing and function parameters are really just managed as a list of names with no sign of intent other than what names are used.


def myfunction(name, address):
    pass

What can we infer about the name and address parameters ? Nothing really, because we can (obviously) call it this way...


myfunction(0.2, AutomobileClass(None))

... with no implications or penalties. It's up to myfunction to dynamically determine whether the parameters make sense in context. Of course, the fluid type handling of dynamic languages like Python is a desired attribute, but it's still nice to be able to have such features when you want them
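
To make the contrast concrete, here's roughly what that manual dispatch looks like when it has to live inside a single function (the body and its checks are purely illustrative):


def myfunction(one, two=None):
    # Manual dispatch: these are exactly the checks we'd like to hide
    if isinstance(one, int) and isinstance(two, int):
        print "myfunction(int, int)", one, two
    elif two is None and one in (1, 2, 3):
        print "myfunction(int=[1,2,3])", one
    elif two is None and isinstance(one, int):
        print "myfunction(int)", one
    else:
        raise NotImplementedError("No matching signature")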

What would this feature look like in Python ? Since we can't really justify modifying the language, and this is a decorators tutorial, we can try it with decorator logic !


@prototype(one=int, two=int)
def myfunction(one=1, two=0):
    print "myfunction(int, int)", one, two

@prototype(one=(int, {1,2,3}))
def myfunction(one):
    print "myfunction(int=[1,2,3])", one

@prototype(one=int)
def myfunction(one):
    print "myfunction(int)", one

We have three forms of myfunction(). One which takes two integers, named one and two. The next takes a single integer named one, which can have values 1, 2, or 3. And the last which takes a single integer of any value.

Calling these would have this effect


>>> myfunction(one=10, two=7)

myfunction(int, int) 10 7

>>> myfunction(one=2)

myfunction(int=[1,2,3]) 2

>>> myfunction(one=99)

myfunction(int) 99

Interested in seeing how such a thing might be implemented ? I thought so

We import some required modules


from inspect import getargspec
from collections import OrderedDict

We need getargspec() to interrogate the function parameters and OrderedDict because the function parameter order is significant when applying the type signature
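
For instance, on a throwaway function (example here is just an illustration), getargspec() reports the named parameters and their defaults:


>>> from inspect import getargspec

>>> def example(one, two=0): pass

>>> getargspec(example)

ArgSpec(args=['one', 'two'], varargs=None, keywords=None, defaults=(0,))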

We're implementing this decorator as a class because (1) we use the class namespace to store a persistent function registry and (2) we use instantiated objects to store the function's signature (temporarily)


class prototype(object):
    _registry = {}

    def __init__(self, **prototype):
        self._prototype = prototype

Each call to prototype(...) instantiates a new prototype object

This method lets us get a dictionary strictly mapping parameter names and their types


    @staticmethod
    def get_type_spec(prototype):
        for _key, _type in prototype.iteritems():
            if type(_type) == type:
                yield _key, _type
            elif type(_type) == tuple:
                yield _key, _type[0]
            else:
                raise Exception("Type specification '%s' must be either a single type or a tuple" % (_key))

This method lets us get a dictionary strictly mapping parameter names and the set of permissible values


    @staticmethod
    def get_value_spec(prototype):
        for _key, _type in prototype.iteritems():
            if type(_type) == type:
                yield _key, None
            else:
                yield _key, _type[1]

Since we're using an object to implement the function decoration, we have to handle the call with the function object as a parameter.


    def __call__(self, fn):

First, we make sure that the class function registry gets an entry for our function's name

        collection = self._registry.setdefault(fn.__name__, [])

Then we craft a function definition recording the function object, its parameter list, its type map and its permissible value map

        signature_definition = [
            fn,
            getargspec(fn),
            dict(self.get_type_spec(self._prototype)),
            dict(self.get_value_spec(self._prototype)),
        ]

And then record it in the collection for this function name

        collection.append(signature_definition)

Finally, we return a lambda that captures the function's name and will call our static handler method with it. This is what gets recorded as the decorated function.

        return lambda *args, **kwargs: self.handler(fn.__name__, args, kwargs)

The handler function is where the marshalling happens


    @staticmethod
    def handler(name, args, kwargs):

We must iterate through all of the function definitions recorded with the same name, checking each one for candidacy.

        for fn, spec, _prototype, _valuemap in prototype._registry[name]:

We build a dictionary containing the parameter names and their default values for the selected function instance.

            instance_parameters = OrderedDict(
                zip(
                    spec.args,
                    [] if spec.defaults is None else spec.defaults,
                )
            )


Then we update the dictionary with the positional parameters passed in the current call attempt

            instance_parameters.update(
                zip(
                    spec.args,
                    args,
                )
            )

And then update the dictionary with the keyword arguments passed in the current call attempt

            instance_parameters.update(kwargs)

We're careful to exclude cases where there are keyword arguments that don't match or not all of the provided parameters are assigned to named arguments

            if len(instance_parameters) < len(args):
                continue
            elif set(kwargs.keys()).difference(instance_parameters.keys()):
                continue

Now, prepare a dictionary representing the signature of the current function using the types of the parameters that were passed

            instance_prototype = {
                _key: type(_value) for _key, _value in instance_parameters.iteritems()
            }

If they don't match, this function is not the right candidate

            if _prototype != instance_prototype:
                continue

Iterate over the parameter values and see if the value specification for the parameter is None (any value) or, if there is a collection, whether the parameter is within the permissible values. As soon as a parameter value is deemed inconsistent, the loop is exited early. If the loop reaches the end of the parameters without failing, then the function is called immediately


            for key, value in instance_parameters.iteritems():
                permitted = _valuemap.get(key)

                if permitted is None or value in permitted:
                    continue

                # Found non permissible value
                break
            else:
                # All values were found to be permissible
                return fn(**instance_parameters)

If no match is found, then the function, as called, is unimplemented

        raise NotImplementedError("Function '%s' cannot be found with the expressed signature" % (name))

The class can be immediately used at this point, so something like

@prototype(a=float)
def myfunction(a):
    print "FLOAT", a

is possible

We can even simulate the Erlang cascading call


@prototype(name=str)
def start_service(name):
    print "Calling start service with %s and %s" % (name, [])
    return start_service(name, [])

@prototype(name=str, args=list)
def start_service(name, args):
    print "Calling start service as tag %s with %s and %s" % (name, name, args)
    return start_service(name, name, args)

@prototype(service_tag=str, name=str, args=list)
def start_service(service_tag, name, args):
    print "Starting service as %s with module %s and parameters %s" % (service_tag, name, args)

On execution, this yields

>>> start_service("fred")

Calling start service with fred and []
Calling start service as tag fred with fred and []
Starting service as fred with module fred and parameters []

I'm certain that there are ways to improve the (probably slow) function lookup method, but, at the least, this benefits by 'hiding' the value and type comparisons that would otherwise be done in plain view by a function that supported multiple call patterns

I hope you liked that !

Python Decorator Fun #1

Given the title's attached serial number, it would seem like this would portend an ongoing saga of python decorators ad nauseam. However, I promise not to bore you to tears (well, almost)

The intent of this post, and any to follow, is to demonstrate interesting and possibly helpful applications of decorator logic, where decorators themselves promise simpler, more fathomable code.

Did you ever wish that you could access an object member like it was an array, but "catch" the subscript and produce a result algorithmically instead ?

Well, have no fear. We will present an interface for just such a beast: The subscripted decorator.

We can jump right to a provoking use case. Here we instantiate our demo object, which is pretending to "hold" a range of numbers from 5 to 11

obj = MyRangeHandler(5, 12)

Let's look at the simple values

print obj.values

[5, 6, 7, 8, 9, 10, 11]

So let's get the square of the value at position 2 (7)


print obj.squared[2]

49

Get it ? However, we don't have to store the square values ahead of time; we can compute them based on the index applied to the original values

So, let's see how this is implemented...

First, we'll need functools


from functools import wraps

We'll define a function that'll produce our decorated method


def subscripted(function_reference):

This needs to return a wrapped inner function...


    @wraps(function_reference)
    def wrapper(obj, *args, **kwargs):

Inside that, we define a class whose sole purpose is to implement the subscripting protocol's lookup method __getitem__()


        class subscripter(object):
            def __getitem__(self, key):
                return function_reference(obj, key)

Notice that all it does is call the original object method with the lookup key that the wrapper caller provided. Now we tell the wrapper to return an instance of the newly defined class, which has captured the outer object's reference.


        return subscripter()

And, we have the decorator method return the wrapper as a property of the enclosing object, so it will fetch a new subscripter object whenever we use the . accessor on the decorated method name.


    return property(wrapper)

Now, we can create our MyRangeHandler class


class MyRangeHandler(object):
    def __init__(self, start, finish):
        self.values = range(start, finish, 1)

And let's implement the squared attribute...


    @subscripted
    def squared(self, index):
        return self.values[index] ** 2

And that's it. This use case can obviously be handled using normal function-style decomposition or simply storing all computable values. But, there may be cases where subscript lookup is more "natural" and precomputation is too expensive.

This example doesn't allow for fetching all of the square values, however, since the subscripter class doesn't support slicing, and isn't a "proper" stand-in for a list object in any case.
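
If slicing mattered, a hypothetical variant of the inner subscripter class could hand slice objects through as well, since __getitem__ receives slices as keys too. This sketch assumes, as MyRangeHandler does, that the wrapped object carries a values list to bound the slice against:


        class subscripter(object):
            def __getitem__(self, key):
                if isinstance(key, slice):
                    # Expand the slice against the object's own values list
                    return [
                        function_reference(obj, index)
                        for index in range(*key.indices(len(obj.values)))
                    ]
                return function_reference(obj, key)

With that in place, obj.squared[1:4] would come back as [36, 49, 64] for the range above.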

Hope you found that useful !

Finding Paradise ?

Is there anywhere that could truly be called paradise ?

Perhaps not, but when the sun rises after an all-night coding session, or after the night job we had to take to service the mortgage when wages were cut, or even if it's the only job we could get, we should still take a second to look outside, at that golden sun rising over the turquoise water, and remember that, whatever trouble befalls us, we still have something that speaks to the meaning of life.


We're starting to get our geek community to come together. The Google group is here. There have been various attempts in, for instance, the .NET and Arduino circles, and there is a robotics club, as well as Bermuda.io, which leans toward opening public data for analysis from a general position that all such data should be open (highly admirable).

Open Bermuda attempts to form a general community of technological enthusiasts, where sustaining such a community causes a perpetual cycle of circulating ideas and knowledge transfer between people interested in such fields as Mathematics, Programming, Biology, Oceanography, Public Policy, Climate Science, Electrical Engineering, Robotics ... Ok, there's just too much to list, but you get the picture.

If everyone works to keep the community at or above critical mass, then it should all work as planned and we'll make not only Bermuda a better place, but the World, too.