Slide 1

Slide 1 text

Do you really think you know strings in Python?

Slide 2

Slide 2 text

What is a string?

Slide 3

Slide 3 text

A Data Structure?

Slide 4

Slide 4 text

In Python, strings are sequences, just like lists and tuples.

Slide 5

Slide 5 text

What is a sequence in Python?

Slide 6

Slide 6 text

Well, a sequence is an “ordered iterable” with random access. Huh?
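
One way to make the definition concrete is to ask Python itself which built-in types count as sequences. This is a minimal sketch using the abstract base classes in collections.abc; the sample values and variable names are illustrative only.

```python
from collections.abc import Iterable, Sequence

samples = {
    "str": "abc",
    "list": [1, 2, 3],
    "tuple": (1, 2, 3),
    "set": {1, 2, 3},
    "dict": {"a": 1},
}

# A sequence is iterable, keeps its elements in order, and supports
# random access by integer index. Sets and dicts are iterable too,
# but they are not sequences.
is_iterable = {name: isinstance(obj, Iterable) for name, obj in samples.items()}
is_sequence = {name: isinstance(obj, Sequence) for name, obj in samples.items()}

for name in samples:
    print(f"{name:5} iterable={is_iterable[name]} sequence={is_sequence[name]}")
```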

Slide 7

Slide 7 text

Sequences

Slide 8

Slide 8 text

Iterable, iterator, generator and containers
● Container: a data structure that holds elements and supports membership tests. Examples: list, set, dict, tuple, str.
● Iterable: any object, not necessarily a data structure, that can return an iterator.
● Iterator: a stateful helper object that produces the next value when you call next() on it. Any object with a __next__() method (and an __iter__() method that returns itself) is therefore an iterator.
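
The three terms can be seen side by side on a single string. The sketch below uses only the built-ins iter() and next(); the variable names are illustrative.

```python
word = "wtf"

# Container: supports membership tests.
has_t = "t" in word

# Iterable: iter() hands back an iterator over the string.
it = iter(word)

# Iterator: next() produces one value at a time and remembers its position.
first = next(it)
second = next(it)
third = next(it)

# Once exhausted, the iterator signals the end with StopIteration.
try:
    next(it)
    exhausted = False
except StopIteration:
    exhausted = True

# An iterator returns itself from iter(); the string it came from does not.
iterator_is_own_iter = iter(it) is it
string_is_own_iter = iter(word) is word
```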

Slide 9

Slide 9 text

Iterable, iterator, generator and containers
● Most containers are also iterable. But many more things are iterable as well. Examples are open files, open sockets, etc.
● Where containers are typically finite, an iterable may just as well represent an infinite source of data.
● An iterator is a value factory. Each time you ask it for "the next" value, it knows how to compute it because it holds internal state.
● A generator is a special kind of iterator (with an elegant syntax for writing one).
● Any generator, therefore, is a factory that lazily produces values.
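
A small generator shows both points at once: it is lazy, and it can represent an infinite source. The function letters_forever is a hypothetical helper written for this sketch, and it assumes a non-empty string.

```python
from itertools import islice


def letters_forever(text):
    """Yield the characters of a non-empty text in an endless cycle."""
    while True:
        for ch in text:
            yield ch


gen = letters_forever("ab")

# Nothing is computed until values are requested; islice takes just five.
first_five = list(islice(gen, 5))

# The generator keeps its internal state, so it resumes where it stopped.
next_one = next(gen)
```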

Slide 10

Slide 10 text

Iterable, iterator, generator and containers

Slide 11

Slide 11 text

Sequence operations: concatenation, multiplication, iteration, and indexing
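
The same four operations work on a string and on a list. A short sketch with illustrative values:

```python
text = "wtf"
items = ["w", "t", "f"]

# Concatenation
text_concat = text + "python"
list_concat = items + ["!"]

# Multiplication (repetition)
text_repeated = text * 2
list_repeated = items * 2

# Iteration
chars = [ch for ch in text]

# Indexing, including negative indices that count from the end
first_char = text[0]
last_item = items[-1]
```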

Slide 12

Slide 12 text

But, are strings exactly like lists?

Slide 13

Slide 13 text

Let’s play a game

Slide 14

Slide 14 text

1. Python snippet
2. Guess the output
3. Actual output
4. Repeat!!

Slide 15

Slide 15 text

Are strings exactly like lists?

Slide 16

Slide 16 text

No content

Slide 17

Slide 17 text

So unlike lists, strings are “immutable”
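
A minimal sketch of the difference: item assignment works on a list and raises TypeError on a string. The values are illustrative and are not taken from the original slide.

```python
letters = ["w", "t", "f"]
letters[0] = "W"  # lists allow item assignment

word = "wtf"
try:
    word[0] = "W"  # strings do not
    error = None
except TypeError as exc:
    error = exc

# The usual workaround is to build a new string instead.
capitalized = "W" + word[1:]
```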

Slide 18

Slide 18 text

Is there any other difference except immutability?

Slide 19

Slide 19 text

No content

Slide 20

Slide 20 text

No content

Slide 21

Slide 21 text

So strings are a special type of sequence containing only characters.
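
One consequence worth seeing in code: Python has no separate character type, so an element of a string is itself a string of length 1. This is an illustrative sketch, not necessarily the snippet the talk used.

```python
word = "wtf"
mixed = ["w", 1, None, [2, 3]]  # a list can hold anything

char = word[0]

# The element has the same type as the string it came from.
same_type = type(char) is type(word)

# So indexing can be repeated indefinitely on a one-character string.
nested = word[0][0][0]

# A list element, by contrast, can be of any type.
element_types = [type(item).__name__ for item in mixed]
```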

Slide 22

Slide 22 text

Objects and identities in Python

Slide 23

Slide 23 text

Objects and identities in Python (contd…)
● id() returns an integer that is guaranteed to be unique and constant for the object during its lifetime.
● The is operator checks whether both operands refer to the same object (i.e., whether their identities match).
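
Lists make the distinction between equality and identity easy to see, because they are never shared implicitly. A minimal sketch with illustrative names:

```python
a = [1, 2, 3]
b = a        # a second name for the same object
c = list(a)  # equal contents, but a different object

same_object = a is b
ids_match = id(a) == id(b)

equal_values = a == c
distinct_objects = a is not c
```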

Slide 24

Slide 24 text

What happens if you multiply strings by booleans?

Slide 25

Slide 25 text

No content

Slide 26

Slide 26 text

What’s up with the booleans?
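
The short answer is that bool is a subclass of int, so True behaves like 1 and False like 0 in arithmetic. The sketch below illustrates this; it is not the original slide's snippet.

```python
bool_is_int = issubclass(bool, int)
true_is_one = True == 1
false_is_zero = False == 0

kept = "wtf" * True      # same as "wtf" * 1
dropped = "wtf" * False  # same as "wtf" * 0
```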

Slide 27

Slide 27 text

Is the behavior the same for all numbers?
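
Not quite. A sketch of the cases, with illustrative values: zero and negative integers give an empty string, while a float is rejected even when it holds a whole number.

```python
zero = "wtf" * 0
negative = "wtf" * -3

try:
    "wtf" * 2.0  # a float is not accepted as a repeat count
    float_error = None
except TypeError as exc:
    float_error = exc
```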

Slide 28

Slide 28 text

No content

Slide 29

Slide 29 text

Are all the string objects different?

Slide 30

Slide 30 text

No content

Slide 31

Slide 31 text

Are all concatenated strings the same?

Slide 32

Slide 32 text

No content

Slide 33

Slide 33 text

Unlocking mysteries, one by one...

Slide 34

Slide 34 text

Implicit string interning
● Such behavior is due to a CPython optimization (called string interning) that tries to reuse existing immutable objects in some cases rather than creating a new object every time.
● After being interned, many variables may point to the same string object in memory (thereby saving memory).

Slide 35

Slide 35 text

When are strings interned implicitly?
● The decision of when to implicitly intern a string is implementation dependent. Some facts can be used to guess whether a string will be interned:
  ○ All strings of length 0 and length 1 are interned.
  ○ Strings are interned at compile time ('wtf' will be interned, but ''.join(['w', 't', 'f']) will not).
  ○ Strings that are not composed solely of ASCII letters, digits, or underscores are not interned.
● This explains why 'wtf!' was not interned: it contains !.
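
Because implicit interning is implementation dependent, the sketch below uses the explicit sys.intern() instead, which behaves predictably. It is meant only to show what "pointing to the same object" means; the variable names are illustrative.

```python
import sys

literal = "wtf!"
built = "".join(["w", "t", "f", "!"])  # built at run time

# Equal values; whether they are also the same object is up to the
# interpreter, so it is only printed here.
equal = literal == built
print("same object before interning:", literal is built)

# sys.intern() returns the single canonical copy for a given value.
interned_a = sys.intern(literal)
interned_b = sys.intern(built)
shared = interned_a is interned_b
```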

Slide 36

Slide 36 text

No content

Slide 37

Slide 37 text

Constant folding
● Constant folding is a peephole optimization technique in Python.
● This means the expression 'a'*20 is replaced by 'aaaaaaaaaaaaaaaaaaaa' during compilation to save a few clock cycles at runtime.
● Constant folding only occurs for strings of length up to 20 (the exact limit is a CPython implementation detail and varies between versions). Why a limit? Imagine the size of the .pyc file generated by the expression 'a'*10**10.
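
One way to check whether an expression was folded is to look at the constants stored in its compiled code object. This is a sketch; constants_of is a hypothetical helper, and what it prints depends on the interpreter and version.

```python
def constants_of(expression):
    """Return the constants stored in the code object compiled from expression."""
    return compile(expression, "<demo>", "eval").co_consts


small = constants_of("'a' * 20")
value = eval("'a' * 20")

# If folding happened, the 20-character result appears among the constants
# instead of the separate operands 'a' and 20.
print(small)
```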

Slide 38

Slide 38 text

The interactive environment optimization
● When a and b are set to "wtf!" on the same line, the Python interpreter creates one new object and makes the second variable reference it as well.
● If you do it on separate lines, it doesn't "know" that there is already a "wtf!" object (because "wtf!" is not implicitly interned, as per the facts mentioned above).
● It's a compiler optimization and specifically applies to the interactive environment.
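
The effect can be approximated outside the interactive prompt by compiling both assignments as a single unit, roughly as the prompt does for one input line. This is a sketch under that assumption; the identity result is an implementation detail, so it is printed rather than relied upon.

```python
source_together = "a = 'wtf!'; b = 'wtf!'"

namespace = {}
exec(compile(source_together, "<demo>", "exec"), namespace)

equal = namespace["a"] == namespace["b"]
same_object = namespace["a"] is namespace["b"]

# Equality always holds; identity depends on whether the compiler
# reused one constant for both assignments.
print("equal:", equal, "same object:", same_object)
```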

Slide 39

Slide 39 text

Half triple-quoted strings

Slide 40

Slide 40 text

No content

Slide 41

Slide 41 text

Implicit string literal concatenation
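
Adjacent string literals are merged by the compiler, which is why a string can end with three quotes but not start with them. The sketch below is illustrative; compiles is a hypothetical helper that only reports whether a piece of source parses.

```python
joined = "wtf" "python"   # two adjacent literals become one string
trailing = 'wtfpython'''  # 'wtfpython' followed by the empty literal ''


def compiles(source):
    """Return True if source parses as an expression, False on SyntaxError."""
    try:
        compile(source, "<demo>", "eval")
        return True
    except SyntaxError:
        return False


# Trailing triple quote: a normal literal plus an empty one.
trailing_ok = compiles("'wtfpython'''")

# Leading triple quote: opens a triple-quoted string that never closes.
leading_ok = compiles("'''wtfpython'")
```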

Slide 42

Slide 42 text

What’s the best way to make a “giant” string?

Slide 43

Slide 43 text

What’s the best way to make a “giant” string?

Slide 44

Slide 44 text

Let the race begin!
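
A race like this can be run with timeit. The sketch below compares just two contestants, repeated += and str.join; the function names, sizes, and repeat counts are assumptions, and the timings will differ from machine to machine.

```python
import timeit


def with_plus_equals(n):
    """Build the string by repeated in-place concatenation."""
    result = ""
    for i in range(n):
        result += str(i)
    return result


def with_join(n):
    """Build the string with a single str.join call."""
    return "".join(str(i) for i in range(n))


n = 1_000
timings = {
    func.__name__: timeit.timeit(lambda: func(n), number=200)
    for func in (with_plus_equals, with_join)
}

# Print the fastest contestant first.
for name, seconds in sorted(timings.items(), key=lambda item: item[1]):
    print(f"{name:18} {seconds:.4f}s")
```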

Slide 45

Slide 45 text

No content

Slide 46

Slide 46 text

Many different ways to concatenate strings

Slide 47

Slide 47 text

Many different ways to concatenate strings
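
For reference, here are several common ways to build the same string. This is an illustrative selection and may not match the list on the original slides.

```python
import io

name, version = "Python", 3

by_plus = "Hello " + name + " " + str(version)
by_percent = "Hello %s %d" % (name, version)
by_format = "Hello {} {}".format(name, version)
by_fstring = f"Hello {name} {version}"
by_join = " ".join(["Hello", name, str(version)])

# io.StringIO acts as an in-memory text buffer.
buffer = io.StringIO()
for part in ("Hello ", name, " ", str(version)):
    buffer.write(part)
by_stringio = buffer.getvalue()

variants = [by_plus, by_percent, by_format, by_fstring, by_join, by_stringio]
```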

Slide 48

Slide 48 text

Bonus

Slide 49

Slide 49 text

bit.ly/wtfpython
About me (an active Python blogger, and author of “What the f*ck Python?”)
https://satwikkansal.xyz
Handle: @satwikkansal