Interesting Things
With Python
SyPy
September
2013
Slide 2
Slide 2 text
Zip tricks
• [x amount of] things that I think are interesting (YMMV)
• Hacks in Python
• Based on my working experience
• Do NOT put this code into production
Slide 3
Slide 3 text
THE BASICS
Slide 4
Slide 4 text
The Basics
These are things I have found helpful:
1. In-place value swapping
2. Zip tricks
3. Splat
4. Enumerate
5. SimpleHTTPServer
Slide 5
Slide 5 text
In-place value swapping
>>> a = 1
>>> b = 2
>>> a, b = b, a
>>> print(a, b)
2 1
Slide 6
Slide 6 text
While we're on that topic...
>>> a, b, c, d = 1, 2, 3, 4
>>> a
1
>>> b
2
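Both slides rely on the same mechanism: the right-hand side is packed into a tuple first, then unpacked into the names on the left. Here is a minimal sketch of where that pays off. The function name `fib` is only an illustration and is not from the talk.

```python
def fib(n):
    """Return the first n Fibonacci numbers using tuple assignment."""
    a, b = 0, 1
    result = []
    for _ in range(n):
        result.append(a)
        # The right-hand side is evaluated in full before unpacking,
        # so no temporary variable is needed.
        a, b = b, a + b
    return result

print(fib(8))
```

Without tuple assignment the loop body would need a temporary variable to hold the old value of `a`.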
Slide 7
Slide 7 text
Zip tricks
Zip is really more useful than expected.
Take this 2 x 4 matrix:
1 2 3 4
5 6 7 8
Represented in Python:
m = [[1,2,3,4], [5,6,7,8]]
Make it a 4 x 2 matrix
Slide 8
Slide 8 text
Zip tricks
>>> list(zip(*m))
[(1, 5), (2, 6), (3, 7), (4, 8)]
It essentially is this:
1 5
2 6
3 7
4 8
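A small sketch of the transpose trick, assuming Python 3, where zip() is lazy and needs list() around it to show the result. The names `transposed` and `restored` are made up for this example.

```python
m = [[1, 2, 3, 4], [5, 6, 7, 8]]

# zip(*m) pairs up the i-th element of every row, i.e. the columns.
transposed = list(zip(*m))
print(transposed)

# Transposing twice gives back the original rows (zip yields tuples,
# so convert each one back to a list before comparing).
restored = [list(row) for row in zip(*transposed)]
print(restored == m)
```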
Slide 9
Slide 9 text
Splat Operator
• That's the name in Perl.
• Python has no catchy name for it; the docs call it argument unpacking.
• Hard for beginners to find information about.
• I thought it was pointers when I started.
• It unpacks arguments (equivalent to Go's ...something and something...)
Splat Operator
Now you can take any number of arguments!

def fn(*args):
    print(args)

>>> l = [1,2,3,4]
>>> fn(l)
([1, 2, 3, 4],)
>>> fn(*l)
(1, 2, 3, 4)
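The star does two jobs: in a definition it collects arguments, in a call it spreads them out. A short sketch of both, where `total` is a hypothetical helper and not something from the talk:

```python
def total(*args):
    """Accept any number of positional arguments; args is a tuple."""
    return sum(args)

numbers = [1, 2, 3, 4]

# With the star, each list element becomes its own argument.
print(total(*numbers))

# The star works on any iterable and can be mixed with plain arguments.
print(total(10, *range(3)))
```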
Slide 12
Slide 12 text
Enumerate
enumerate() returns an iterator that yields the index and the
value of each item in an iterable.
Instead of:
for i in range(0, len(l)):
    v = l[i]
    # do something with the index and value
Do this:
for i, v in enumerate(l):
    # do something with index and value
Slide 13
Slide 13 text
Enumerate
One more trick with enumerate: start at any
number.
for i, v in enumerate(l, start=2):
    # do something with index and value
Do this, and you will eliminate many off-by-one errors.
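One place where the start argument helps is anything numbered for humans, who count from 1. A minimal sketch with made-up sample data:

```python
lines = ["first", "second", "third"]

# Start the counter at 1 instead of writing i + 1 everywhere.
numbered = ["%d: %s" % (n, text) for n, text in enumerate(lines, start=1)]
print(numbered)
```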
Slide 14
Slide 14 text
SimpleHTTPServer
python -m SimpleHTTPServer 8080
That's it! You have a simple HTTP server running.
On Python 3 the module is named http.server:
python -m http.server 8080
Slide 15
Slide 15 text
Bonus: Pydoc
I love Godoc. Then I discovered Godoc was
inspired by Pydoc.
pydoc -p 6060
Now go to http://localhost:6060 for
documentation!
Slide 16
Slide 16 text
The Fun Stuff
Slide 17
Slide 17 text
Bit Twiddling
What does this code do?
def fn(x):
    return (3435973837*x)>>35
And this?
def fn2(y):
    return ((1717986919*y)>>34)-(y>>31)
Slide 18
Slide 18 text
Bit Twiddling
Answer: They both divide by 10 (integer division on 32-bit values: fn for unsigned x, fn2 for signed y, truncating toward zero)
Can you figure out what the magic numbers
3435973837 and 1717986919 are?
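If you want to check the claim (and spoil the puzzle), here is a sketch that tests both functions against ordinary division over the 32-bit ranges. The helper `c_div10` and the sample lists are my own additions, and the check assumes the inputs really are 32-bit values.

```python
import random

def fn(x):
    return (3435973837 * x) >> 35

def fn2(y):
    return ((1717986919 * y) >> 34) - (y >> 31)

def c_div10(y):
    """Divide by 10, truncating toward zero as C does."""
    q = abs(y) // 10
    return q if y >= 0 else -q

# The magic numbers are 2**35 / 10 and 2**34 / 10, rounded up.
assert 3435973837 == -(-2**35 // 10)
assert 1717986919 == -(-2**34 // 10)

rng = random.Random(0)
unsigned = [0, 9, 10, 2**32 - 1] + [rng.randrange(2**32) for _ in range(1000)]
signed = [-2**31, -10, -1, 0, 2**31 - 1] + [
    rng.randrange(-2**31, 2**31) for _ in range(1000)
]

# fn matches floor division for unsigned 32-bit inputs.
assert all(fn(x) == x // 10 for x in unsigned)
# fn2 matches truncating division for signed 32-bit inputs;
# the (y >> 31) term adds 1 back for negative numbers.
assert all(fn2(y) == c_div10(y) for y in signed)
```

Multiplying by a rounded-up reciprocal and shifting is the usual way compilers avoid a slow divide instruction.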
Slide 19
Slide 19 text
Little Known/Used
Libraries
Slide 20
Slide 20 text
Arrays
There is a difference between lists and
arrays.
Python has a built-in array library.
import array
Slide 21
Slide 21 text
When are arrays useful?
• When you have a list, and you know the
list will only ever be integers, characters
or bytes.
• When your list does not change much.
• When you don't need to do math (if you
want to do math, use Numpy's Array).
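A quick sketch of what "only ever integers" means in practice. The type code 'i' is a signed C int, and the variable names here are just for illustration.

```python
import array

# Every element must fit the declared type code.
a = array.array('i', [1, 2, 3])
a.append(4)
print(a.tolist())

# Unlike a list, an array rejects values of the wrong type.
try:
    a.append("five")
    rejected = False
except TypeError:
    rejected = True
print(rejected)
```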
Slide 22
Slide 22 text
Guido's Example
Compare these two functions:
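import array

def f7(l):
    # Pack the integers (0-255) into unsigned bytes in one call.
    # tobytes() replaces tostring(), which was removed in Python 3.9.
    return array.array('B', l).tobytes()

def f1(l):
    # Build the string one character at a time.
    string = ""
    for item in l:
        string = string + chr(item)
    return string
The one with arrays is 15x faster than the other function.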
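To try the comparison yourself, here is a sketch assuming Python 3, where the array version returns bytes and the loop version returns str. The test input `data` is made up, and the timings depend on your machine and Python version, so I make no claim about the ratio you will see.

```python
import array
import timeit

def f7(values):
    # Python 3 spelling: tobytes() returns a bytes object.
    return array.array('B', values).tobytes()

def f1(values):
    string = ""
    for item in values:
        string = string + chr(item)
    return string

data = list(range(256)) * 100  # hypothetical input: byte values only

# Both produce the same characters once the str is encoded to bytes.
same = f7(data) == f1(data).encode('latin-1')
print(same)

# Total seconds for 100 calls of each function.
print(timeit.timeit(lambda: f7(data), number=100))
print(timeit.timeit(lambda: f1(data), number=100))
```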
Slide 23
Slide 23 text
Caveat
Don't be seduced, however. Here's the
performance of sorting a list vs sorting an array
of 1 million integers (naive sorts), in seconds:

# l is a list
timeit.timeit("sorted(l)", setup="from __main__ import l", number=100)
54.20818901062012

# a is an array.array('i')
timeit.timeit("sorted(a)", setup="from __main__ import a", number=100)
93.52262997627258
Slide 24
Slide 24 text
Heap
The data structure, not the memory term. Very useful as a priority queue.
Here's Guido's code to sort 1 million integers with an array and a heap queue:

import sys, array, tempfile, heapq
assert array.array('i').itemsize == 4

def intsfromfile(f):
    # Lazily yield the integers stored in a sorted temp file.
    while True:
        a = array.array('i')
        # frombytes() replaces fromstring(), removed in Python 3.9
        a.frombytes(f.read(4000))
        if not a:
            break
        for x in a:
            yield x

# Sort the input in chunks, writing each sorted chunk to a temp file.
iters = []
while True:
    a = array.array('i')
    a.frombytes(sys.stdin.buffer.read(40000))
    if not a:
        break
    f = tempfile.TemporaryFile()
    array.array('i', sorted(a)).tofile(f)
    f.seek(0)
    iters.append(intsfromfile(f))

# Merge the sorted chunks and write the result out in batches.
a = array.array('i')
for x in heapq.merge(*iters):
    a.append(x)
    if len(a) >= 1000:
        a.tofile(sys.stdout.buffer)
        del a[:]
if a:
    a.tofile(sys.stdout.buffer)
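The two heapq ideas on this slide are easier to see at a small scale. A sketch with made-up data: heapq.merge combining already-sorted chunks, and a heap used as a priority queue.

```python
import heapq

# heapq.merge lazily merges inputs that are each already sorted.
chunks = [[1, 4, 9], [2, 3, 10], [0, 8]]
merged = list(heapq.merge(*chunks))
print(merged)

# As a priority queue: the smallest item always comes out first.
queue = []
heapq.heappush(queue, (2, "write slides"))
heapq.heappush(queue, (1, "book venue"))
heapq.heappush(queue, (3, "give talk"))
first = heapq.heappop(queue)
print(first)
```

Because merge is lazy, it only ever holds one pending item per input, which is what keeps the memory use low in the script above.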
Sorting
Python uses Timsort as its default sort.

import random
import array

l = list(range(1000000))  # generate 1 million integers

# shuffle a few times
random.shuffle(l)
random.shuffle(l)
random.shuffle(l)

# create an array
a = array.array('i', l)
Slide 27
Slide 27 text
Sorting
By now something should be obvious about the integers to be sorted:
– There are no repeats
– There is a range (specifically 0 to 999,999)
– Sound familiar? Sounds like database indices?
A real life example
[insert example about url classification]
Slide 28
Slide 28 text
Sorting
The Sorting Algorithm
from bitarray import bitarray

def customSortBA(iterable):
    # One bit per possible value; assumes unique ints in range(len(iterable)).
    ba = bitarray('0' * len(iterable))
    for i in iterable:
        ba[i] = True
    return [j for j,k in enumerate(ba) if k]
Want to do it in pure Python?
def customSortPP(iterable):
    a = [None] * len(iterable)
    for i in iterable:
        a[i] = True
    return [j for j,k in enumerate(a) if k]
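A sketch that checks the pure Python idea against the built-in sort. The function name `custom_sort` is mine, and it assumes what the slides assume: every value is a distinct integer in range(len(iterable)).

```python
import random

def custom_sort(iterable):
    # Assumes every value is a distinct integer in range(len(iterable)).
    seen = [False] * len(iterable)
    for i in iterable:
        seen[i] = True  # mark the slot whose index equals the value
    # Reading the marked slots in index order gives the sorted values.
    return [j for j, flag in enumerate(seen) if flag]

values = list(range(1000))
random.shuffle(values)
print(custom_sort(values) == sorted(values))
```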
Slide 29
Slide 29 text
Sorting
• Lots of optimizations can be done.
• Limited in application though.
• O(3N). (Technically O(2M + N), but in this case M = N)
• But Timsort has a best case of O(N) too!
• Not a general purpose sort
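Here is a sketch of why it is not a general purpose sort, using the same flag-list idea as customSortPP. The name `custom_sort` and the inputs are made up for illustration.

```python
def custom_sort(iterable):
    # Same idea as customSortPP: one flag per possible value.
    flags = [False] * len(iterable)
    for i in iterable:
        flags[i] = True
    return [j for j, flag in enumerate(flags) if flag]

# Duplicates collapse, because a flag can only record presence.
print(custom_sort([3, 1, 1, 0]))

# Values at or beyond len(iterable) do not fit in the flag list.
try:
    custom_sort([5, 1, 0])
    out_of_range_ok = True
except IndexError:
    out_of_range_ok = False
print(out_of_range_ok)
```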