Calvin's

designs and hacks. people and products.

A Review of Rob Napier, Mugunth Kumar’s iOS 6 book; and advanced debugging with Python + LLDB

I have had the benefit of a sneak preview of Rob and Mugunth’s upcoming book, iOS 6 Programming Pushing the Limits: Advanced Application Development for Apple iPhone, iPad, and iPod touch, which is an updated edition of their successful iOS 5 Pushing the Limits book.

As any veteran programmer knows, one of the most important skills to master (among a multitude of others) is the art of debugging an application.  Chapter 19 provides excellent coverage of this topic in the context of iOS apps.  Beginning with a conceptual introduction to LLDB (the LLVM project’s debugger, which Apple ships with Xcode) and how it differs from GDB, the older debugger it replaces, the chapter introduces the intermediate programmer to the actual mechanics of debugging with LLDB and the use of breakpoints (which is no different from using breakpoints in many other advanced editors or debugging tools).

Breakpoints

Having mucked around with almost 5 conceptually simple iOS apps, I am no stranger to using breakpoints to zero in on problems.  What is interesting, however, is the detailed explanation of the different types of breakpoints, such as exception breakpoints and symbolic breakpoints.  Of particular interest and learning value to me personally is the use of Ctrl-Click + “share” on a breakpoint to write its state into an `xcshareddata` directory, which can be committed to git and so shared with a fellow developer.

Watchpoints

Watchpoints are also interesting to me.  They are useful for tracking data mutation events, particularly changes to global state such as Singleton implementations, Core Data persistent store coordinators or API engines.

Advanced Use of Debugger through Python Scripting

For our normal, simple use cases, introspecting objects at breakpoints is a simple matter of typing

po myObject
po myDictionary
po myArray

For non-object values (e.g. integers or structs), we will of course use

p (int) self.myAge
p (CGPoint) self.view.center

What if we need to search through a very large array containing a large number of objects?  This is where we might have problems manually reading the 10,000 objects that are printed out.  In such a case, we can actually use Python to run a search.  Jump into the Python shell prompt from the lldb prompt by typing “script”.  Then, via the great example from the book:-

>>> import mypython_script
>>> array = lldb.frame.FindVariable("myArray")
>>> # search_object is a function we write ourselves in our custom mypython_script.py file.
>>> yes_or_no = mypython_script.search_object(array, "<search element>")
>>> print(yes_or_no)
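The book’s own `search_object` is not reproduced here.  As a rough sketch of what such a helper could look like, assuming `array` behaves like an `lldb.SBValue` whose children are the array’s elements, and assuming we match on each element’s printed description (both are my assumptions, not the book’s code):

```python
# mypython_script.py: hypothetical helper, not the book's implementation.

def search_object(array, needle):
    """Return True if any element of `array` has `needle` in its description.

    `array` is assumed to behave like an lldb.SBValue: it exposes
    GetNumChildren() and GetChildAtIndex(), and each child exposes
    GetObjectDescription() and GetSummary().
    """
    for index in range(array.GetNumChildren()):
        child = array.GetChildAtIndex(index)
        # GetObjectDescription() is roughly what `po` prints; fall back to
        # the summary string, and to "" when neither is available.
        text = child.GetObjectDescription() or child.GetSummary() or ""
        if needle in text:
            return True
    return False
```

Because the function only relies on those four methods, it stops at the first match instead of printing every element.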

This nice little trick from the book goes straight into my personal bookmark.

Finishing Up

The chapter ties up neatly with an explanation of how the NSZombieEnabled environment variable helps catch messages sent to objects that have already been deallocated, a summary of the various types of application crashes one commonly encounters, and how crash stacks can be collected – “natively” via iTunes as well as through 3rd party services like TestFlight and HockeyApp.

All in all, this is a must-have book for any professional iOS developer and certainly a level-up for me.  I have covered only this one chapter of a very detailed and extensive pro manual, and I am certainly looking forward to grabbing a copy (or receiving a copy free ;-)) from Mugunth.  Grab it at Amazon.com if you think this is something that would help you build even more awesome iOS apps - iOS 6 Programming Pushing the Limits: Advanced Application Development for Apple iPhone, iPad, and iPod touch.

Python lists

Python lists are not really lists based on computer science’s definition of the word.  Classically trained programmers who are new to Python may be confused about why a Python list’s `append` method is so much more efficient than its `insert` method.

The classical list (not the python list) – what computer scientists call a linked list – is implemented as a series of nodes, each node keeping a reference to the next node.  We can imagine such a linked list in Python like this:-

class Node:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

# Usage
>>> L = Node("a", Node("b", Node("c", Node("d"))))
>>> L.next.next.value
'c'

Computer scientists call this a “singly linked list”, as opposed to a “doubly linked list”.  In a “doubly linked list”, each node also keeps a reference to the previous node, so it is “bi-directional”, whereas our singly linked list example here only points to the next node and does not “remember” the previous node.
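To make the contrast concrete, here is a minimal sketch of a doubly linked list.  The names `DNode` and `link` are made up for this illustration:

```python
class DNode:
    """A node that remembers both of its neighbours."""
    def __init__(self, value):
        self.value = value
        self.prev = None
        self.next = None

def link(values):
    """Build a doubly linked list from an iterable; return (head, tail)."""
    head = tail = None
    for value in values:
        node = DNode(value)
        if tail is None:
            head = node          # first node becomes the head
        else:
            tail.next = node     # forward reference
            node.prev = tail     # backward reference
        tail = node
    return head, tail

head, tail = link("abcd")
second_to_last = tail.prev.value   # walk backwards from the end
```

With the extra `prev` reference we can start at the tail and walk towards the head, which the singly linked `Node` cannot do.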

But Python’s list type is implemented in a different way.  Instead of several separate nodes referencing each other, a Python list is a single contiguous slab of memory holding references to its elements.  Computer scientists call this an “array”.

Understanding these fundamentals reveals the implementation (and performance) differences.

1.  Iterating over Python List and Linked List

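When iterating over the contents of a list, both are equally efficient: each element is visited exactly once.  But the linked list carries some (resource) overhead, because every node has to store an extra reference to the next node.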
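As a small sketch of what iteration looks like in both cases (the generator name `iter_values` is my own):

```python
class Node:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

def iter_values(head):
    """Yield each value in the linked list by following `next` references."""
    node = head
    while node is not None:
        yield node.value
        node = node.next

linked = Node("a", Node("b", Node("c", Node("d"))))
array = ["a", "b", "c", "d"]

# Both loops touch every element exactly once.
linked_values = list(iter_values(linked))
array_values = [value for value in array]
```

Either way the work grows in step with the number of elements; the linked list just has to chase one reference per step.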

2.  Accessing an element in a Python List vs an element in a Linked List

When directly accessing an element at a given index, our Python list (an “array”) is a lot more efficient because the position of the element can be calculated and the right memory location accessed directly (since it is in a contiguous slab of memory)!  To access an element in the linked list, we need to traverse the list from the beginning (much like traversing a DOM tree in HTML).
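A sketch of index access on the linked list makes the traversal cost visible.  The helper `get` is hypothetical, assumes a non-negative index, and returns the number of hops it needed alongside the value:

```python
class Node:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

def get(head, index):
    """Return (value, hops) for the node at `index`, walking from the head."""
    node, hops = head, 0
    while node is not None and hops < index:
        node = node.next
        hops += 1
    if node is None:
        raise IndexError("linked list index out of range")
    return node.value, hops

linked = Node("a", Node("b", Node("c", Node("d"))))
array = ["a", "b", "c", "d"]

value, hops = get(linked, 2)   # has to pass "a" and "b" first
direct = array[2]              # position is computed, no walking
```

The further down the list the element sits, the more hops `get` needs, while `array[2]` costs the same as `array[0]`.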

3.  Inserting vs Appending into a Python List compared to a Linked List

The biggest puzzle, as mentioned initially, is the difference between `insert` and `append`.  `insert` in a linked list is very cheap – once we hold a reference to the node at the insertion point, insertion takes roughly the same amount of time no matter how many nodes we have in our linked list.  This is precisely because our linked list’s nodes live at different memory locations, so only a couple of references need to change.

On the other hand, the advantage we gained from Python’s list being an array that occupies a contiguous slab of memory is lost when we attempt an insertion, because this requires moving every element to the right of the insertion point, and possibly even moving all the elements to a larger array (a completely new memory slab).  This also explains why `append` is efficient for a Python list: `append` inserts at the end of the memory slab, where there are no elements to its right.
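The difference can be sketched by doing the array insertion by hand and counting how many elements have to move.  `insert_after` and `array_insert` are illustrative helpers of my own, not how CPython implements `list.insert`:

```python
class Node:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

def insert_after(node, value):
    """Linked list insert: rewire one reference, whatever the list length."""
    node.next = Node(value, node.next)

def array_insert(slots, index, value):
    """Insert into a list by hand; return how many elements were shifted."""
    slots.append(None)                        # make room for one more slot
    moved = 0
    for i in range(len(slots) - 1, index, -1):
        slots[i] = slots[i - 1]               # shift one step to the right
        moved += 1
    slots[index] = value
    return moved

linked = Node("a", Node("c"))
insert_after(linked, "b")                     # a -> b -> c

front = list("abcd")
moved_front = array_insert(front, 0, "x")     # every element shifts

back = list("abcd")
moved_back = array_insert(back, len(back), "x")   # same as append
```

Inserting at the front shifts all four existing elements, while inserting at the end shifts none, which is the `append` case.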

python threading bug: ‘_DummyThread’ object has no attribute ‘_Thread__block’

This bug, filed here – http://bugs.python.org/issue14308 - occurs because of a bad interaction between two things: the dummy thread objects that the threading API creates when we call threading.currentThread() on a foreign thread (one that was not started through the threading module), and the _after_fork function, which is called to clean up resources after `os.fork()`.
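To see where a dummy thread object comes from, here is a small sketch.  It is written with Python 3 names (`_thread` and `current_thread()`), which is my assumption for the sake of a runnable example; the bug itself concerns the Python 2 threading module:

```python
import _thread      # Python 3 name of the low-level `thread` module
import threading

result = {}
done = threading.Event()

def worker():
    # This thread was started behind the threading module's back, so
    # current_thread() has no Thread object for it and creates a placeholder.
    result["thread"] = threading.current_thread()
    done.set()

_thread.start_new_thread(worker, ())
done.wait(timeout=5)

kind = type(result["thread"]).__name__
print(kind)
```

The placeholder it prints is the `_DummyThread` type named in the error message.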

Stephen White also provided a code snippet that demonstrates this problem:-


import os
import thread
import threading
import time

def t():
    threading.currentThread() # Populate threading._active with a DummyThread
    time.sleep(3)

thread.start_new_thread(t, ())

time.sleep(1)

pid = os.fork()
if pid == 0:
    os._exit(0)    # child: exit immediately

os.waitpid(pid, 0)    # parent: wait for the child to finish

Running this script will give you the “no attribute ‘_Thread__block’” error, as explained.  For a detailed explanation and a monkey-patch solution that does not require modifying the Python source code, this is a good resource - http://stackoverflow.com/questions/13193278/understand-python-threading-bug

It so happens that django-debug-toolbar’s middleware causes exactly this problem.  And it’s extremely annoying to have my django dev server printing out ‘_DummyThread’ object has no attribute ‘_Thread__block’ in my terminal stdout repeatedly whenever my DebugToolbarMiddleware is enabled.

MIDDLEWARE_CLASSES += (
    'debug_toolbar.middleware.DebugToolbarMiddleware',
)

So here’s my pull request to resolve this issue on django-debug-toolbar - https://github.com/django-debug-toolbar/django-debug-toolbar/pull/333.  I have also taken the liberty to “upgrade” the original use of the thread module to the threading module in this pull request.  The thread module is no longer available under that name in Python 3 (it was renamed to _thread), but the threading module is, so in my opinion it’s better to simply use the threading module!
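As an illustration of that kind of change (a sketch only, not the code in the pull request), a worker started through `threading.Thread` is a regular, registered thread rather than a dummy one.  The function name `t` mirrors the snippet above and the thread name is made up:

```python
import threading

seen = []

def t():
    # current_thread() is the newer spelling of currentThread()
    seen.append(threading.current_thread())

worker = threading.Thread(target=t, name="worker-1")
worker.start()
worker.join()          # wait for the worker to finish

kind = type(seen[0]).__name__
name = seen[0].name
```

Here `current_thread()` hands back the very `Thread` object we created, so no placeholder is needed.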

Further criticisms and suggestions for improvement are welcome.