Saturday, August 28, 2010

Case classes in Python

I'm trying to build a parser. The main function I'm writing is called parse. It takes a str and returns a parse tree of token classes. I have a heap of tests that parse some text and then compare the parse tree against an expected parse tree.

The way I started testing this looked a little like:

# token classes
class Add(Token):
    def __init__(self, left, right):
        self.left = left
        self.right = right

    def __repr__(self):
        return 'Add(' + repr(self.left) + ', ' + repr(self.right) + ')'

class Number(Token):
    def __init__(self, numberStr):
        self.numberStr = numberStr

    def __repr__(self):
        return 'Number(' + repr(self.numberStr) + ')'

# tests
class Tests(unittest.TestCase):
    def test_add(self):
        self.assertEqual(
            "Add(Number('1'), Number('1'))",
            repr(parse('1 + 1')))  # input text is illustrative

Making the token classes was a real chore. After a while, I had around 10 of them and writing them was increasingly tedious. It occurred to me that if I had something like Scala case classes, I could change the code to something like:

# token classes
class Add(Token):
    def __init__(self, left, right):
        pass

class Number(Token):
    def __init__(self, numberStr):
        pass

# tests
class Tests(unittest.TestCase):
    def test_add(self):
        self.assertEqual(
            Add(Number('1'), Number('1')),
            parse('1 + 1'))  # input text is illustrative

This way the token classes basically write themselves and, because they have a handy __eq__ method, the tests don't need those messy string comparisons.

I did some research and found out that I could get most of the case class functionality (apart from the pattern matching aspect) using a custom metaclass.

Here is the result:

>>> from caseclasses import CaseMetaClass
>>> class MyCaseClass():
...     __metaclass__ = CaseMetaClass
...     def __init__(self, a, b):
...         pass
...
>>> instance = MyCaseClass(1, 'x')
>>> instance.a
1
>>> instance.b
'x'
>>> instance == MyCaseClass(1, 'y')
False
>>> instance == MyCaseClass(1, 'x')
True
>>> str(instance)
"MyCaseClass(1, 'x')"

This is how it works. Case classes are marked with the CaseMetaClass. For each argument in the __init__ method, a read-only property is generated with the same value as the argument when an instance is constructed. It should be obvious from the example how the generated __eq__ and __str__ methods work. I also added simple implementations of __ne__ and __hash__ to be consistent with __eq__.

Getting this working was much easier than I expected it to be. Here is the code:

import inspect

from decorator import decorator

class CaseMetaClass(type):
    def __new__(mcs, name, bases, dict):
        def noop(self):
            pass

        for meth in ('__eq__', '__ne__', '__hash__', '__str__'):
            if meth in dict:
                raise Exception('%s must not be defined on class.' % meth)

        if '__init__' in dict:
            args, varargs, varkw, _ = inspect.getargspec(dict['__init__'])
            if varkw is not None:
                raise Exception("__init__ can't take **kwargs")
            args = args[1:]  # drop self
        else:
            args = []
            varargs = None

        if args and varargs:
            raise Exception("Case class __init__ can't have both args (other than self) and *args")

        for arg in args + ([varargs] if varargs else []):
            if arg.startswith('_'):
                raise Exception("Case class attributes can't start with '_'.")
            dict[arg] = property(lambda self, arg=arg: getattr(self, '_' + arg))

        def _init(func, self, *init_args):
            setattr(self, '_CaseMetaClass__args', init_args)
            if varargs:
                setattr(self, '_' + varargs, init_args)
            else:
                for (name, value) in zip(args, init_args):
                    setattr(self, '_' + name, value)
        dict['__init__'] = decorator(_init, dict.get('__init__', noop))

        def str(self):
            values = [repr(x) for x in getattr(self,'_CaseMetaClass__args')]
            return name + '(' + ', '.join(values) + ')'
        dict['__str__'] = str
        dict['__repr__'] = str

        def eq(self, other):
            if other is None:
                return False
            if type(self) is not type(other):
                return False
            return self._CaseMetaClass__args == other._CaseMetaClass__args
        dict['__eq__'] = eq

        dict['__ne__'] = lambda self, other: not (self == other)
        dict['__hash__'] = lambda self: hash(type(self)) ^ hash(self._CaseMetaClass__args)

        return type.__new__(mcs, name, bases, dict)
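
The code above targets Python 2: Python 3 ignores the __metaclass__ attribute, and inspect.getargspec was removed in Python 3.11. For anyone who wants to try the idea on Python 3, here is a rough sketch of the same approach with no third-party dependency. The name CaseMeta and the _values attribute are my own inventions for this sketch, and it only handles plain positional arguments (no *args support and none of the validation above).

```python
import inspect


class CaseMeta(type):
    """Hypothetical Python 3 variant of CaseMetaClass (positional args only)."""

    def __new__(mcs, name, bases, namespace):
        init = namespace.get('__init__')
        # Field names come from __init__'s parameters, skipping self.
        fields = list(inspect.signature(init).parameters)[1:] if init else []

        def __init__(self, *values):
            if len(values) != len(fields):
                raise TypeError('%s takes %d arguments' % (name, len(fields)))
            self._values = tuple(values)

        namespace['__init__'] = __init__
        for index, field in enumerate(fields):
            # Bind index as a default so each property reads its own slot.
            namespace[field] = property(lambda self, i=index: self._values[i])

        namespace['__repr__'] = lambda self: '%s(%s)' % (
            name, ', '.join(map(repr, self._values)))
        namespace['__eq__'] = lambda self, other: (
            type(self) is type(other) and self._values == other._values)
        namespace['__hash__'] = lambda self: hash((type(self), self._values))
        return super().__new__(mcs, name, bases, namespace)


class Number(metaclass=CaseMeta):
    def __init__(self, numberStr):
        pass


class Add(metaclass=CaseMeta):
    def __init__(self, left, right):
        pass


tree = Add(Number('1'), Number('1'))
print(tree)
```

Python 3 derives __ne__ from __eq__ automatically, so the sketch doesn't define it.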

Sunday, April 18, 2010

UDFs and Underlying Schema Changes

I had a problem at work recently where we changed a column's collation but a User Defined Function that referenced this column didn't reflect the updates and started complaining about collation conflicts. In this situation, most people (at least most people I know) recommend dropping the UDF and recreating it to force the UDF to update, but there is a better way.

sp_refreshsqlmodule will refresh the UDF in a single step and without the risk involved in dropping an object. The MSDN documentation for CREATE FUNCTION (under Best Practices) recommends creating functions WITH SCHEMABINDING as an alternative.

Saturday, April 03, 2010

Truth and Python 3: The 'and' and 'or' Operators

The Python and and or operators are a little involved but work in a useful and helpful way. These operators combine two objects, called operands, into a single object. The operands can be of any type and aren't restricted to bool. Python uses keywords for these operators instead of the && and || symbols common in other languages.

The and operator returns the first false operand, or the last one if both operands are true. Some examples will make this clearer:

>>> '' and 0
''
>>> 'x' and 0
0
>>> 'x' and 1
1

The or operator returns the first true operand or the last one if both operands are false. Here are some examples:

>>> 'x' or 1
'x'
>>> '' or 1
1
>>> '' or 0
0

Of course, bool(x and y) will always be equal to bool(x) and bool(y). For this reason, the and operator will work as expected when used in conditional clauses. Similarly, bool(x or y) will always be equal to bool(x) or bool(y).
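
If you'd like to convince yourself of those two identities, a quick check over a handful of operands does the job. The sample list here is arbitrary; it's just a mix of true and false values of different types.

```python
import itertools

# Arbitrary mix of true and false operands of different types.
samples = ['', 'x', 0, 1, None, [], [0]]

mismatches = []
for x, y in itertools.product(samples, repeat=2):
    if bool(x and y) != (bool(x) and bool(y)):
        mismatches.append(('and', x, y))
    if bool(x or y) != (bool(x) or bool(y)):
        mismatches.append(('or', x, y))

print(mismatches)  # an empty list means both identities held
```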

These operators are 'lazy' or 'short-circuiting'. The second operand is only evaluated if it would make a difference to the result. Here's an example:

>>> def x():
...     print('x')
...     return True
...
>>> def y():
...     print('y')
...     return True
...
>>> x() or y()
x
True

In the example above y isn't called since x returns True.
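
The and operator is lazy in the same way: it stops as soon as it sees a false operand. Here is a small sketch that records which operands actually get evaluated. The record helper is something I made up for the illustration.

```python
calls = []


def record(name, value):
    """Hypothetical helper: note that it was called, then return value."""
    calls.append(name)
    return value


and_result = record('x', False) and record('y', True)
calls_after_and = list(calls)  # 'y' never ran: 'x' was already false

del calls[:]
or_result = record('x', False) or record('y', True)
calls_after_or = list(calls)   # both ran: 'x' was false, so 'y' decides

print(calls_after_and, calls_after_or)
```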

Logically or will return the first operand if it's true, otherwise it will return the last. This means that or works like the null coalescing operator (??) in C# and it's often used in that way. Here's a vague example:

def some_method(some_parameter=None):
    some_parameter = some_parameter or get_default_value()
    # details...
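
One caveat: or tests truth, not None-ness, so a caller who deliberately passes 0 or '' also gets the default. If that matters, an explicit None check is safer. Both functions below, and the default of 10, are hypothetical and only there to show the difference.

```python
def get_default_value():
    return 10  # hypothetical default, for illustration only


def with_or(some_parameter=None):
    # Any false value (None, 0, '', []) is replaced by the default.
    return some_parameter or get_default_value()


def with_none_check(some_parameter=None):
    # Only None is replaced; other false values are kept.
    if some_parameter is None:
        return get_default_value()
    return some_parameter


print(with_or(0), with_none_check(0))
```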

Monday, March 01, 2010

Truth and Python 3: The if statement

In its basic form, the Python if statement looks like this:

if expression:
    statements

If expression is true, statements are executed; otherwise statements are skipped. expression can be of any type; it doesn't have to evaluate to True or False. The if statement automatically converts expression to bool(expression) if needed.

Unlike many other languages, parentheses are not required around expression. In fact, it's considered bad style to include them.

If statements consists of a single statement, the whole if statement may be written as one line:

if expression: statement
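
As a concrete sketch, here is the one-line form with an expression that isn't a bool. The describe function is made up for the example; items is a list, and the if statement tests bool(items).

```python
def describe(items):
    label = 'empty'
    # items is a list, not a bool; the if statement tests its truth value.
    if items: label = 'non-empty'
    return label


print(describe([]), describe([1, 2]))
```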

The else Clause

The if statement can include an else clause. The else clause is optional; an if statement doesn't have to have one.

if expression:
    statements
else:
    else-statements

The else-statements are executed if none of the other branches of the if statement are executed. In the case above, if expression is false, else-statements are executed.

The elif Clauses

The if statement can include one or more elif or 'else-if' clauses. These clauses are also optional.

if expression:
    statements
elif elif-1-expression:
    elif-1-statements
elif elif-2-expression:
    elif-2-statements
more elif clauses...

An elif clause contains an expression and statements. The elif clauses are only examined if expression is false. The interpreter goes through the elif conditions from top to bottom until it finds one that is true. It then executes the corresponding statements and exits the if statement.

The if Statement in Full

An if statement may include both elif clauses and an else clause. The complete form of the if statement looks like this:

if expression:
    statements
elif elif-1-expression:
    elif-1-statements
elif elif-2-expression:
    elif-2-statements
more elif clauses...
else:
    else-statements
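
Here is a small worked example of the full form. The classify function and its labels are invented for the illustration. Note that 0 also satisfies n < 10, but only the first true branch runs.

```python
def classify(n):
    if n < 0:
        result = 'negative'
    elif n == 0:
        result = 'zero'
    elif n < 10:
        # Not reached for 0: the branch above already matched.
        result = 'small'
    else:
        result = 'large'
    return result


print([classify(n) for n in (-5, 0, 3, 42)])
```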

Sunday, January 31, 2010

Truth and Python 3: The bool Type

Python 3 has a bool type representing Boolean values. There are two builtin bool constants, True and False.

For legacy reasons, bool is actually a subtype of int and True behaves like 1 while False behaves like 0.

For example,

>>> True + 0
1
>>> False * 3
0

and even...

>>> True == 1
True

The main difference between True and 1 is that str(True) returns 'True' and not '1'. Similarly, str(False) returns 'False' and not '0'.
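
A short sketch of what the int heritage means in practice. Summing a list of bools to count the true ones is a common use of it; the variable names are just for this example.

```python
# bool really is a subtype of int.
is_subtype = issubclass(bool, int)

# True and False take part in arithmetic as 1 and 0.
total = True + True
count = sum([True, False, True])  # counts the True entries

# The string forms are where they differ from 1 and 0.
texts = (str(True), str(1), str(False), str(0))

print(is_subtype, total, count, texts)
```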

Any object can be converted to bool by running it through the bool constructor. Here are some examples:

>>> bool(True)
True
>>> bool(None)
False
>>> bool([])
False

For any object, x, if bool(x) is True, we say its truth value is true and we consider x as true in a Boolean context. If bool(x) is False, we say its truth value is false and we consider it as false in a Boolean context.

x's truth value is determined as follows:

  1. If x is None, it's false.
  2. If x defines a __bool__ method that returns False, it's false. (In Python 3, __bool__ must return a bool; returning 0 raises a TypeError.)
  3. If x doesn't define a __bool__ method but defines a __len__ method that returns 0, it's false.
  4. Otherwise, it's true.
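
The rules above can be sketched with a few throwaway classes. All of the class names are made up for this example.

```python
class AlwaysFalse:
    def __bool__(self):
        return False


class EmptyBag:
    # No __bool__, so __len__ decides.
    def __len__(self):
        return 0


class TrueButEmpty:
    # __bool__ takes precedence over __len__.
    def __bool__(self):
        return True

    def __len__(self):
        return 0


class Plain:
    # Neither method defined, so instances are true.
    pass


results = [bool(cls()) for cls in (AlwaysFalse, EmptyBag, TrueButEmpty, Plain)]
print(results)
```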

Given the rules above, it's no surprise that the vast majority of objects are considered true. The main ones that are considered false are:

  • None
  • False
  • The zero value of each numeric type i.e. 0, 0.0, 0+0j
  • Empty containers e.g. (), {} and [].
  • "" (the empty string)
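
To double-check the list above, this sketch runs each of those values through bool. Nothing here is new; it just restates the list as code.

```python
false_values = [None, False, 0, 0.0, 0 + 0j, (), {}, [], ""]

# Every entry should convert to False.
truth_values = [bool(value) for value in false_values]
print(truth_values)
```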

Sunday, January 17, 2010

FireFox without a Mouse

I finished reading The Productive Programmer a few weeks ago and since then I've been thinking about how to be more productive on the 'micro' level.

"Keep your hands on the keyboard as much as possible" is one of the recommendations that people always make. Constantly moving your hand from keyboard to mouse and back slows you down and feels pretty awkward too.

With that in mind, I've been learning how to browse with FireFox using the keyboard only. Some sites make this difficult but I can now go for quite a while without reaching for the mouse.

Here are some basic tasks and how to do them with the keyboard. Note that I've chosen the methods and shortcuts that I like best and that there are other ways of doing these things.

Entering an Address

  1. Use alt+d to focus the Location bar
  2. Type the URL
  3. Press enter to open the URL.
    You can hold these keys down to modify the behavior:
    • ctrl - prepend www. and append .com to the address
    • shift - prepend www. and append .net to the address
    • ctrl+shift - prepend www. and append .org to the address
    • alt - open in a new tab
    For example, pressing ctrl+shift+alt+enter with an address of python will open www.python.org in a new tab.

Searching the Web

  1. Use ctrl+k to focus the Search bar
  2. Type the search terms
  3. Press enter to open the search in the current tab
    You can hold down alt to open the search in a new tab instead

Navigating between tabs

On Ubuntu the numbered shortcuts (ctrl+1 to ctrl+9) use alt instead of ctrl.

  • ctrl+tab displays the next tab
  • shift+ctrl+tab displays the previous tab
  • ctrl+n, where n is a digit from 1 to 8, displays the nth tab
  • ctrl+9 displays the last tab
  • ctrl+w closes the current tab

Searching within a page

  1. Use / to open the Quick Find bar
  2. Type the search text
  3. Use F3 to cycle forward and shift+F3 to cycle backward through the matches

'Clicking' on links

  1. Use ' to open the Quick Find bar for links only
  2. Type the search text
  3. Use F3 and shift+F3 to cycle through the matches until the link you want is focused
  4. Hit enter to 'click' the link. The following modifiers are available:
    • shift - open the page in a new window
    • ctrl - open the page in a new tab
    • alt - save the page instead of opening it