This was originally written as HTML slides for my talk at SyPy in September 2013. I've converted it to PDF for SpeakerDeck now that I have had the time.
isn't a name in Python.
• Hard to find information about for beginners.
• I thought it was pointers when I started.
• It unpacks arguments (equivalent to Go's ...something and something...)
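A minimal sketch of argument unpacking, assuming the operator this slide discusses is the `*` used in calls and function signatures. The function names `add` and `total` are made up for illustration.

```python
def add(a, b, c):
    """Return the sum of exactly three positional arguments."""
    return a + b + c


def total(*numbers):
    """Collect any number of positional arguments into a tuple and sum them."""
    return sum(numbers)


values = [1, 2, 3]

# In a call, * spreads the list into separate positional arguments,
# so this is the same as add(1, 2, 3).
unpacked = add(*values)

# In a signature, * gathers the positional arguments into a tuple,
# so inside total() numbers is (1, 2, 3, 4).
packed = total(1, 2, 3, 4)

print(unpacked, packed)
```

The same symbol does two jobs: spreading at the call site and gathering in the definition.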
the value of an iterable. Instead of:

    for i in range(0, len(l)):
        v = l[i]
        # do something with the index and value

Do this:

    for i, v in enumerate(l):
        # do something with index and value
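To make the comparison concrete, here is a small runnable sketch that builds the same (index, value) pairs both ways. The list `letters` and the result names are illustrative only.

```python
letters = ["a", "b", "c"]

# Index-based loop: works, but needs a manual lookup on every pass.
by_index = []
for i in range(len(letters)):
    by_index.append((i, letters[i]))

# enumerate yields (index, value) pairs directly.
by_enumerate = []
for i, v in enumerate(letters):
    by_enumerate.append((i, v))

# The optional start argument shifts the counter without touching the values.
from_one = list(enumerate(letters, start=1))

print(by_index == by_enumerate, from_one)
```

Both loops produce identical pairs; `enumerate` also works on iterables that do not support indexing, such as generators.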
and you know the list will only ever be integers, characters or bytes.
• When your list does not change much.
• When you don't need to do math (if you want to do math, use NumPy's array).
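The type restriction above can be seen in a short sketch using the standard library `array` module. The typecodes `'i'` (signed int) and `'B'` (unsigned byte) are real; the variable names are illustrative.

```python
import array

# 'i' stores signed C ints; every element must be an integer.
ints = array.array('i', [1, 2, 3])
ints.append(4)

# Anything that is not an integer is rejected with a TypeError,
# which is why this only suits data known to be homogeneous.
try:
    ints.append(2.5)
    rejected = False
except TypeError:
    rejected = True

# 'B' stores unsigned bytes (0-255), handy for raw byte data.
raw = array.array('B', [72, 105])

print(list(ints), rejected, raw.tobytes())
```

Unlike a list, the array stores the raw C values rather than Python objects, which is where the memory saving comes from.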
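        return array.array('B', list).tostring()  # tostring() became tobytes() in Python 3.2 and was removed in 3.9

    def f1(l):
        # build the string one character at a time
        string = ""
        for item in l:
            string = string + chr(item)
        return string

The one with arrays is 15x faster than the other function.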
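For readers on current Python 3, a hedged sketch of the same comparison follows. It uses `tobytes()` in place of `tostring()`, and the names `f_array` and `f_concat` are stand-ins chosen here, not the names from the slide. The timing ratio depends on the machine and Python version, so no figure is claimed.

```python
import array
import timeit


def f_array(items):
    """Pack small integers (0-255) into bytes in one C-level call."""
    return array.array('B', items).tobytes()


def f_concat(items):
    """Build the equivalent text one character at a time."""
    string = ""
    for item in items:
        string = string + chr(item)
    return string


codes = [72, 101, 108, 108, 111]
as_bytes = f_array(codes)    # bytes on Python 3
as_text = f_concat(codes)    # str on Python 3

# Rough timing on a larger input; results vary by machine.
data = list(range(256)) * 40
t_array = timeit.timeit(lambda: f_array(data), number=20)
t_concat = timeit.timeit(lambda: f_concat(data), number=20)
print(as_bytes, as_text, t_array, t_concat)
```

Note that on Python 3 the two functions return different types (`bytes` versus `str`), so the results have to be decoded before they can be compared.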
an array vs sorting a list of 1 million integers (naive sorts)

    timeit.timeit("sorted(l)", setup="from __main__ import l", number=100)
    54.20818901062012

    timeit.timeit("sorted(a)", setup="from __main__ import a", number=100)
    93.52262997627258
as a priority queue. Here's Guido's code to sort 1 million integers with an array and a heap queue:

    import sys, array, tempfile, heapq
    assert array.array('i').itemsize == 4

    def intsfromfile(f):
        while True:
            a = array.array('i')
            a.fromstring(f.read(4000))  # fromstring() was removed in Python 3.9; frombytes() replaces it
            if not a:
                break
            for x in a:
                yield x

    iters = []
    while True:
        a = array.array('i')
        a.fromstring(sys.stdin.buffer.read(40000))
        if not a:
            break
        f = tempfile.TemporaryFile()
        array.array('i', sorted(a)).tofile(f)
        f.seek(0)
        iters.append(intsfromfile(f))

    a = array.array('i')
    for x in heapq.merge(*iters):
        a.append(x)
        if len(a) >= 1000:
            a.tofile(sys.stdout.buffer)
            del a[:]
    if a:
        a.tofile(sys.stdout.buffer)
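The key step in the code above is `heapq.merge`, which lazily merges several already-sorted inputs. A minimal in-memory sketch, with small lists standing in for the sorted temporary files:

```python
import heapq

# Each chunk is already sorted, like the sorted runs written to temp files.
chunks = [
    [1, 4, 7],
    [2, 5, 8],
    [0, 3, 6, 9],
]

# heapq.merge returns an iterator; it keeps only one pending item
# per input in its internal heap, so memory use stays small.
merged = list(heapq.merge(*chunks))

print(merged)
```

Because the merge is lazy, the full data set never has to be held in memory at once, which is the point of the original exercise.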
    import random
    import array

    l = [x for x in range(0, 1000000)]  # generate 1 million integers

    # shuffle a few times
    random.shuffle(l)
    random.shuffle(l)
    random.shuffle(l)

    # create an array
    a = array.array('i', l)
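A scaled-down sketch that combines this setup with a `timeit` measurement, so it can be run as a single script. The size of 100,000 and the repeat count of 5 are arbitrary choices made here to keep the run short; the timings depend on the machine, so none are quoted.

```python
import array
import random
import timeit

# Smaller than the 1 million integers on the slide, to keep the run quick.
l = list(range(100000))
random.shuffle(l)

# Same data, stored as a compact array of C ints.
a = array.array('i', l)

# Time the built-in sort on each container.
list_time = timeit.timeit(lambda: sorted(l), number=5)
array_time = timeit.timeit(lambda: sorted(a), number=5)

# sorted() returns a plain list in both cases, so the results match.
same_result = sorted(l) == sorted(a)

print(list_time, array_time, same_result)
```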
integers to be sorted:
– There are no repeats
– There is a range (specifically from 0-1000000)
– Sound familiar? Sounds like database indices?
A real life example [insert example about URL classification]
    def customSortBA(iterable):
        # one bit per possible value, all initially unset
        ba = bitarray('0' * len(iterable))
        for i in iterable:
            ba[i] = True
        # the positions of the set bits are the sorted values
        return [j for j,k in enumerate(ba) if k]

Want to do it in pure Python?

    def customSortPP(iterable):
        a = [None for i in range(0, len(iterable))]
        for i in iterable:
            a[i] = True
        return [j for j,k in enumerate(a) if k]
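A runnable version of the pure-Python idea, offered as a sketch. It assumes the input holds each integer from 0 to len(iterable) - 1 exactly once, which matches the constraints stated for this data set. The name `custom_sort_pp` is a restyled stand-in for the function above.

```python
import random


def custom_sort_pp(iterable):
    """Sort unique integers drawn from range(len(iterable)) by marking presence."""
    items = list(iterable)
    seen = [False] * len(items)
    for i in items:
        seen[i] = True  # mark that value i occurs in the input
    # Walking the flags in index order yields the values in sorted order.
    return [index for index, present in enumerate(seen) if present]


data = list(range(10))
random.shuffle(data)
result = custom_sort_pp(data)

print(result)
```

No comparisons between elements are made; the values are used as positions, which is why the input must fit the stated range.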
Limited in application though.
• O(3N). (Technically O(2M + N), but in this case M = N)
• But Timsort has a best case of O(N) too!
• Not a general purpose sort
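The O(2M + N) count and the "not general purpose" caveat can both be seen in a sketch that takes the range size M as an explicit parameter. The function `presence_sort` and its parameter `m` are hypothetical names introduced here for illustration.

```python
def presence_sort(values, m):
    """Sort integers in range(m) by marking which ones are present."""
    flags = [False] * m                 # O(M): allocate one flag per possible value
    for v in values:                    # O(N): mark each input value
        flags[v] = True
    return [i for i, present in enumerate(flags) if present]  # O(M): scan the flags


# With M larger than N, the two O(M) passes dominate the cost.
sparse = presence_sort([5, 1, 3], 8)

# Duplicates collapse into a single flag, so repeated values are lost.
with_repeats = presence_sort([3, 1, 3, 0], 4)

print(sparse, with_repeats)
```

The second call shows why this cannot replace a general sort: the input had four items and the output has three.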