Inventions, Modifications, and Ideas: March 2013

Sunday, March 24, 2013

Setting up matplotlib Python Package on a Mac

Thanks to this post, I've managed to get python and a few packages working on a Hackintosh running Snow Leopard. However, I found matplotlib to be particularly challenging, even with the instructions. Hence, for myself and others, I have decided to record my insights.

Once you have homebrew, python, and numpy installed as indicated in the post, above, do as follows in a terminal:

brew install freetype
brew install libpng

chmod +w /usr/local/lib
brew link freetype --force --overwrite

The last line should fail, but list a couple of files. Find the files (it tells you where they are) and delete them (or move them into a new folder for safe keeping). Retype the command and the link should occur. Then:

chmod +w /usr/local/include
brew link libpng --force --overwrite

As in the previous case, this will fail and list a few files. Proceed as in the previous case. Retype the command.

Finally type:

pip install matplotlib
or
pip install git+git://github.com/matplotlib/matplotlib.git#egg=matplotlib-dev

You'll have to use the latter if you're using the Mountain Lion OS. It was the latter that I concluded with, personally.

Monday, March 18, 2013

Disjoint-set Data Structure in Python

I needed this code to write Kruskal's minimum spanning tree algorithm.

class DisjointDataStruct(object):
"""Disjoint-set data structure.

For details see:
http://en.wikipedia.org/wiki/Disjoint-set_data_structure

Public functions:
union(x, y)
find(x)
"""

def __init__(self, vertices):
"""My personal representation of the spaghetti stack.

vertices - a list of vertex names from a graph
"""
self.struct = {v: [v, 0] for v in vertices}

def union(self, x, y):
"""Joins subsets x and y into a single subset.

Uses union by rank optimization.

x, y - vertex names corresponding to keys in self.struct
"""
xRoot = self.find(x)
yRoot = self.find(y)
if xRoot == yRoot:
return

xV = self.struct[xRoot]
yV = self.struct[yRoot]
if xV[1] < yV[1]:
xV[0] = yRoot
elif xV[1] > yV[1]:
yV[0] = xRoot
else:
yV[0] = xRoot
xV[1] += 1

def find(self, x):
"""Determines which subset a particular element x is in.

Uses path compression optimization.

x - vertex name corresponding to key in self.struct
"""
if self.struct[x][0] != x:
self.struct[x][0] = self.find(self.struct[x][0])
return self.struct[x][0]

The timings are as follows:
__init__ is 17.5ms per loop in 100 loops with 100000 vertices (using timeit)
union has a mean of 1.0ms, standard deviation of 1.10e-7, min of ~0.9ms, and max of 1.0ms.
The union data was computed over the 224 union time values that did not equal 0 over a total of 100000 random unions of 100000 vertices (using time.time end - start).

On a single run of cProfile with a vertex set of 100000, and 100000 random unions where the two elements are not equal (I increment one of them if they are), I get the following values:
0.042s for __init__
0.197s for union in total (100000 calls) so per call is ~1.97e-06s
0.130s for find in total (299990 total / 200000 primitive) so per call is ~4.33e-07s

All of this was clocked running an i5-2500K @ 3.3GHz

Sunday, March 10, 2013

2D Wraparound World

I figured I would share a finding I had while working on a project. For those of you who have attempted to code a representation of a wraparound map (2D torus) using a 1 or 2D array, you may have found the experience quite challenging to get operating smoothly. I was trying to find a more elegant solution when I stumbled upon the following formula:

sqrt(min(|x1 - x2|, w - |x1 - x2|)^2 + min(|y1 - y2|, h - |y1-y2|)^2)

Where the two points are (x1, y1) and (x2, y2);
w and h are the width and height;
min and sqrt are minimization and square root functions, respectively.

With that the process is absurdly easy.

Note: If you just need relative distance for a comparison, don't bother using the square root.

Function courtesy of:
http://stackoverflow.com/questions/2123947/calculate-distance-between-two-x-y-coordinates/2123977#2123977

Saturday, March 9, 2013

Python Nested List

In a previous post, I outlined a python function for creating nested lists. In retrospect, it was a naive, if functional, first attempt. As the size of the various dimensions increases, the function would become quite slow largely as a result of the deepcopy calls and recursion. A more efficient method would use the following syntax:


nestedList = [[[baseValue for x in range(axisLength)]

                for y in range(axisLength)] for z in ...]

Now, a naive attempt to turn this into a function might do something like the following:


def nestedList(axisLength, degree, base):

for x in range(degree):
base = [base for y in range(axisLength)]
return base

But this will actually just create shallow copies. The problem is that base is not recreated at each iteration. However, this is fixable with generators. For example:

def baseGen(base, axisLength):
while True:
try:
yield [base.next() for x in range(axisLength)]
except AttributeError:
yield [base for x in range(axisLength)]

def nestedList(base, axisLength, degree):
for d in range(degree):
base = baseGen(base, axisLength)
return base

Then, you would just call nList = nestedList(None, 5, 3) and then ls = nList.next() for a 5x5x5 cube with None as the base value. An added advantage of using generators is that you can just keep calling nList.next() every time you want a new cube.