Tuesday, December 23, 2008

Inkface update

So as promised, here is an update on Inkface. I am almost ready with v0.1.2.

The main change in this update is the definition of a new abstraction called 'Face', introduced to make programming apps more intuitive. A Face object is initialized with the name of an SVG file that holds UI elements. One can create several Face objects holding elements from different SVG files and, as the program logic dictates, add and remove these Face objects from the canvas. The elements in an SVG file can be addressed as attributes of the Face object into which they are loaded, which makes the app source code very readable. The management of the element objects happens behind the scenes, without the app programmer having to worry about it.
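The attribute-style access is easiest to see in code. Below is a minimal pure-Python sketch of the concept only, not the actual inkface API; the `_parse_svg` helper and the element names are made-up stand-ins:

```python
# A sketch of the 'Face' idea: elements parsed from an SVG file become
# attributes of the Face object. _parse_svg and the element names are
# hypothetical stand-ins, not the real inkface implementation.

class Face(object):
    def __init__(self, svgname):
        self.svgname = svgname
        self._elements = self._parse_svg(svgname)

    def _parse_svg(self, svgname):
        # Stand-in for the real SVG parser: map element ids to elements.
        return {'background': '<background element>',
                'okButton': '<okButton element>'}

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails, so SVG
        # elements appear as attributes of the Face.
        try:
            return self._elements[name]
        except KeyError:
            raise AttributeError(name)

face = Face('dialog.svg')
print(face.okButton)        # the element with id 'okButton' from dialog.svg
```

The app code then reads naturally, with all the element bookkeeping hidden inside the Face.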

To demonstrate these features, I have written a simple yet useful app using the above concepts. It's a Twitter client, written with the python-twitter library as the backend and inkface-python as the frontend. The app displays friends' and public tweets in a playful manner, as clouds and banners rather than rectangular GTK widgets. Text entry is handled by a Face object called 'Keyboard', which implements the text-entry logic (for the touch-screen interface as well as the N810's physical keyboard). I want to polish the app further before making a demo video, so I am not going to rush one out today.

The Inkface infrastructure is now divided into three libraries: libaltsvg, the SVG parser and renderer (a derivative of librsvg); inkface-python, the Python bindings for Inkface; and inklib, a helper Python library that defines the abstractions mentioned above. There are deb packages for each of these components. The good news is that I have permission to upload these packages to the maemo extras repo, so users won't have to worry about dependencies.

Soon I will have to concentrate seriously on the performance issues. As you will observe while running the Twitter client, the app consumes nearly 50% of the N810's 128 MB of memory, which is clearly not healthy. The Python refcount debugging ensured that memory is no longer leaking in buckets, but a vector-graphics-based app is known to need more resources. I plan to look deep into the librsvg and cairo data structures to trim down the memory usage.

Meanwhile, this update also includes some code for an OpenGL backend, though it's disabled by default. It is targeted at desktops and OpenGL ES based handhelds.

Tomorrow I am leaving for a vacation, so after I return next week I will do some final testing and release the deb packages.

That's all for now.

Merry Christmas to all!!!

Saturday, December 20, 2008

Debugging python reference counts

If you are wondering about the progress of Inkface: it's coming soon. I was planning to release version 0.1.2 a couple of weeks ago, along with a cool app demonstrating the improvements in the library (a Twitter client), but after I saw the abysmal memory performance of the app on my N810, I decided to wait. I spent the past two weeks tracking down the memory leaks in the library. In this post I would like to share my experiences debugging reference count leaks in my Python bindings.

The primary documentation on the subject from python.org is the first stepping stone. Considering how daunting the task of memory debugging is, some inspiration always helps; read this post from Guido, which explains how to approach the task.

In most cases, we start debugging a memory leak only after we have seen the program crawling, or after watching an ever-increasing heap profile graph from valgrind's massif. Either way, the theory of reference counts as explained in the above docs doesn't immediately map onto this scenario. In such a case, the following steps may help you get started.
  • Put debug statements in the dealloc functions of your Python type objects. This will tell you when they are getting called, if at all. They will be called when a Py_DECREF call on your Python object decrements its refcount to zero.
  • Identify the Python objects that you suspect are leaking (i.e., the objects whose refcounts are not reaching zero when you expect them to) and write a simple routine that dumps their refcounts. You can write this routine inside your C module or in Python, depending on how your code is organized. In C, you can read a PyObject's refcount from its ob_refcnt member; in Python, you can do the same by passing the object to sys.getrefcount(). Call this refcount-dump routine from various places in your code and monitor how the reference counts of these objects vary.
  • By calling the refcount-dump routine at strategic points, you will soon narrow down the area that is increasing the refcounts unexpectedly (or, more correctly, forgetting to decrease them when the job is done). Now look in the Python/C API docs for the behavior of the API calls you are making in this particular code segment, and understand the meanings of borrowed reference and new reference. Soon you will find the places where you are supposed to call Py_DECREF but haven't.
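On the Python side, such a refcount-dump routine can be very small. Here is a sketch; the 'canvas' and 'face' suspects are just placeholders:

```python
import sys

def dump_refcounts(label, suspects):
    # suspects maps a descriptive name to the object being watched.
    # sys.getrefcount() itself holds one temporary reference (its call
    # argument), so subtract 1 from what it reports.
    lines = []
    for name, obj in sorted(suspects.items()):
        lines.append('%s: refcount(%s) = %d'
                     % (label, name, sys.getrefcount(obj) - 1))
    for line in lines:
        print(line)
    return lines

canvas = object()
face = object()
dump_refcounts('after init', {'canvas': canvas, 'face': face})
```

Calling it before and after a suspicious code segment shows exactly where a count jumps.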
While debugging my code, I made the following observations.
  • The refcounts on my objects were astronomical. After a while I figured out that they were increasing over time, which clearly showed that the bug was in a loop. Soon I found the problematic code and added the missing Py_DECREFs.
  • One common place to forget releasing ownership of an object is while iterating over a sequence using PyIter. Note that PyIter_Next returns a new reference, which you have to release (roughly, at the end of each iteration).
  • Sometimes Heisenberg kicks in ("you can't observe something without influencing it", as Guido mentions in the above post). In my case, the Python objects I was tracking were stored as values in a dictionary, so the only way to refer to them was
    sys.getrefcount(dict.values()[index])
    This creates a list of the values, thereby incrementing the reference count of the object I am monitoring. I know of no easy way to work around this, but simply accounting for these additional refcounts helps.
  • In my code, I have a pure Python class that encapsulates the objects created by the C library, so the deallocation of the C objects depended on the deallocation of the pure Python object, and I had to track its refcounts as well. One surprising thing I noticed was that this object had a very high refcount immediately after instantiation, e.g.
    o = someclass()
    print sys.getrefcount(o) # 11
    This was very unexpected. The reason, of course, lay inside the constructor. The class has several methods, and inside the constructor I register these method objects as callback handlers. This understandably increases the refcounts of the method objects; however, as it turns out, it also increases the refcount of the parent object, because each bound method holds a reference to its instance. So in my case, I got rid of these references by unregistering the callback handlers when I wanted to release the object.
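The bound-method effect is easy to reproduce in isolation. The sketch below is illustrative only (Registry and Widget are made-up names, not the inkface code), but in CPython it shows the same inflated count and its cure:

```python
import sys

class Registry(object):
    # A stand-in callback registry.
    def __init__(self):
        self.handlers = []
    def register(self, handler):
        self.handlers.append(handler)
    def unregister_all(self):
        del self.handlers[:]

registry = Registry()

class Widget(object):
    def __init__(self):
        # Each bound method stored in the registry keeps a reference
        # back to self via its __self__ attribute.
        registry.register(self.on_click)
        registry.register(self.on_key)
    def on_click(self):
        pass
    def on_key(self):
        pass

w = Widget()
# 2 "expected" references (the variable w plus getrefcount's own
# argument) plus one per registered bound method:
print(sys.getrefcount(w))    # 4

registry.unregister_all()
print(sys.getrefcount(w))    # back to 2
```

With several callbacks registered in the constructor, a refcount like 11 right after instantiation stops being mysterious.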
After the whole exercise, I concluded that debugging memory leaks by tracking reference count leaks is more productive than wading through valgrind logs of a misbehaving C/C++ program.

These are my findings based on less than one day's work. If you find any mistakes in my understanding, please do point them out.

And yes... an update on Inkface is coming soon.

Saturday, December 06, 2008

Paul Graham nails it again!

Paul Graham published a new essay today, "Could VC be a Casualty of the Recession?" It is one of those awesome ones. The best part of the essay is this:
... it has gotten much
cheaper to start a startup. There are four main reasons: Moore's
law has made hardware cheap; open source has made software free;
the web has made marketing and distribution free; and more powerful
programming languages mean development teams can be smaller.

All the thoughts on my mind these days couldn't be summarized better. Check out the full essay; as usual, his opinions are radical yet convincing.

By the way, tomorrow

P.S. This is my first post from ScribeFire

Wednesday, December 03, 2008

A great essay

Today Slashdot had a story on Dijkstra's essay. I started to read it casually during my lunch, and after reading 30 pages of handwritten text I was left awestruck. Some of the points Dijkstra makes in this essay are nothing short of epiphanies, and the fact that the prose is 20 years old fills you with respect for the vision Dijkstra had. If you are in any way related to software engineering, this essay is a must-read.

Here are my favorite snippets:

... As the economics is known as "The Miserable Science", software engineering should be known as "The Doomed Discipline", doomed because it cannot even approach its goal since its goal is self-contradictory. Software engineering, of course, presents itself as another worthy cause, but that is eyewash: if you carefully read its literature and analyse what its devotees actually do, you will discover that software engineering has accepted as its charter "How to program if you cannot" ....

... The practice is pervaded by the reassuring illusion that programs are just devices like any others, the only difference admitted being that their manufacturer might require a new type of craftsmen, viz. programmers. From there it is only a small step to measuring "programmer productivity" in terms of "number of lines of code produced per month"....

... Unfathomed misunderstanding is further revealed by the term "software maintenance", as a result of which many people continue to believe that programs - and even programming languages themselves - are subject to wear and tear....

Tuesday, December 02, 2008

Fedora 10 on MacBook 2,1

I finally got Linux working on my MacBook. Although I had bought this MacBook in Dec '06 after confirming that Linux could be installed on it, I never felt any need to do so; my desktop was sufficient for the task. But soon I will be getting rid of my desktop, so I have to prepare the MacBook as my primary coding machine.

It is pretty easy to install Linux alongside Mac OS X, provided you are ready to completely format the hard drive. I attempted to salvage my original installation of Mac OS X by resizing the HFS volume with parted, but that didn't work. I had taken a backup of all the necessary data (hopefully!!).

Here is the recipe for converting a 2nd-generation MacBook with an Intel Core 2 Duo into a dual-boot box. Since Boot Camp has expired on Mac OS X Tiger, I couldn't use it; this recipe instead uses rEFIt and OS X's Disk Utility for repartitioning.

1. Download rEFIt and install it.
2. Reboot and see that you can boot into Mac OS X
3. Reboot with the Mac OS X install CD. During installation, use Disk Utility to partition the entire hard drive into three volumes. My layout: 30 GB journaled HFS for Mac OS X, 40 GB for Linux (the type doesn't matter; we will change it later anyway), and 4 GB for Linux swap.
4. Continue with the Mac OS X installation (this takes a looooong time)
5. Reboot into the newly installed Mac OS X
6. Reboot with the Fedora 10 (or another distro's) installer Live CD.
7. Install Fedora 10 as you normally would. I chose the root partition on the 40 GB volume (marked to be formatted as ext3) and the 4 GB swap. Also note that you should install the bootloader on the first sector of the Linux partition (in my case /dev/sda3) and NOT in the MBR.
8. After installation is complete, reboot.
9. In the rEFIt menu you should see the penguin alongside Mac OS X.

There are several tutorials that describe additional steps, but the steps above worked for me.

I was happy to find many things working out of the box: Wi-Fi, sound, and the battery meter in the tray. (Note that the 2nd-gen MacBook doesn't have an nVidia card, so don't go installing kmod-nvidia packages as suggested in many places for newer Macs.)

There are a few things that still need some investigation: right-click and scroll gestures on the touchpad. I have yet to test suspend and resume.

But overall the system looks good (I am writing this post from it). I am excited to try out KVM virtual machines for the first time, because this is the first personal machine I have owned with hardware support for virtualization.