Do you really think you know strings in Python?

Understanding the tricky concepts behind strings in Python in an interesting way.

Satwik Kansal

June 22, 2018

Transcript

  1. Iterable, iterator, generator and containers

    Container: Containers are data structures that hold elements and support membership tests. Examples: list, set, dict, tuple, str.
    Iterable: An iterable is any object, not necessarily a data structure, that can return an iterator.
    Iterator: A stateful helper object that produces the next value when you call next() on it. Any object that has a __next__() method is therefore an iterator (by convention it also implements __iter__(), returning itself).
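The three definitions above can be seen side by side in a short sketch. The variable names are illustrative only; the calls used are the built-ins `iter()` and `next()`.

```python
numbers = [1, 2, 3]            # a container: holds elements, supports `in`
print(2 in numbers)            # membership test on the container

iterator = iter(numbers)       # the list is iterable: iter() returns an iterator
first = next(iterator)         # the iterator keeps track of its position
second = next(iterator)
third = next(iterator)

# A second call to iter() gives an independent iterator with its own state.
another = iter(numbers)
restart = next(another)

# Once an iterator is exhausted, next() raises StopIteration.
try:
    next(iterator)
    exhausted = False
except StopIteration:
    exhausted = True
```

The list itself never changes here; only the iterators carry state.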
  2. Iterable, iterator, generator and containers

    • Most containers are also iterable. But many more things are iterable as well. Examples are open files, open sockets, etc.
    • Where containers are typically finite, an iterable may just as well represent an infinite source of data.
    • An iterator is a value factory. Each time you ask it for "the next" value, it knows how to compute it because it holds internal state.
    • A generator is a special kind of iterator (with some elegant syntax for writing it).
    • Any generator, therefore, is a factory that lazily produces values.
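As a minimal sketch of the last three points, here is a generator that represents an infinite source of data and computes each value only when asked. The function name `naturals` is made up for this illustration.

```python
from itertools import islice

def naturals():
    """Generator function: yields 0, 1, 2, ... without ever finishing."""
    n = 0
    while True:
        yield n      # execution pauses here until the next value is requested
        n += 1

gen = naturals()                      # nothing is computed yet
first_five = list(islice(gen, 5))     # take only the first five values
next_value = next(gen)                # the generator resumes where it stopped
```

Because values are produced lazily, the infinite loop inside `naturals` is harmless as long as the consumer stops asking.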
  3. Objects and identities in Python (contd…)

    id() returns an integer (or long integer) which is guaranteed to be unique and constant for this object during its lifetime.
    The is operator checks whether both operands refer to the same object (i.e., it checks whether the identities of the operands match).
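A small sketch of the difference between identity (`is`, `id()`) and equality (`==`), using lists so that no interning or caching is involved. The variable names are illustrative.

```python
a = [1, 2, 3]
b = a            # b is another name for the same object
c = list(a)      # c is a new object with equal contents

same_object = a is b                  # identity: one object, two names
equal_but_distinct = (a == c) and (a is not c)
ids_agree = id(a) == id(b)            # `is` compares what id() reports
ids_differ = id(a) != id(c)           # both objects are alive, so ids differ
```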
  4. Implicit String interning

    • Such behavior is due to a CPython optimization (called string interning) that tries to use existing immutable objects in some cases rather than creating a new object every time.
    • After being interned, many variables may point to the same string object in memory (thereby saving memory).
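The slide is about implicit interning, which is implementation dependent. The effect it describes (several variables sharing one string object) can be reproduced deterministically with explicit interning via the standard-library function `sys.intern`. The sketch below assumes nothing beyond that function.

```python
import sys

parts = ["w", "t", "f", "!"]
a = "".join(parts)     # built at runtime
b = "".join(parts)     # built at runtime again

equal = a == b         # same characters

# sys.intern returns the canonical copy of a string, so both calls
# hand back one shared object.
a_interned = sys.intern(a)
b_interned = sys.intern(b)
shared = a_interned is b_interned
```

Whether `a is b` holds before interning is left unstated on purpose, since that depends on the interpreter.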
  5. When are strings interned implicitly?

    • The decision of when to implicitly intern a string is implementation dependent. There are some facts that can be used to guess if a string will be interned or not:
      ◦ All length 0 and length 1 strings are interned.
      ◦ Strings are interned at compile time ('wtf' will be interned but ''.join(['w', 't', 'f']) will not be interned).
      ◦ Strings that are not composed of ASCII letters, digits or underscores are not interned.
    • This explains why 'wtf!' was not interned: it contains !
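The rules of thumb above can be written down as a small predicate. This is a hypothetical helper (`looks_internable` is not part of Python or CPython); it only encodes the guesses listed on the slide for compile-time string constants and does not inspect what the interpreter actually did.

```python
import string

# Characters the slide treats as "safe" for implicit interning.
_NAME_CHARS = set(string.ascii_letters + string.digits + "_")

def looks_internable(s: str) -> bool:
    """Guess, per the slide's rules, whether a compile-time constant is interned."""
    if len(s) <= 1:
        return True                            # length 0 and 1 strings
    return all(ch in _NAME_CHARS for ch in s)  # ASCII letters, digits, underscores

guess_wtf = looks_internable("wtf")
guess_wtf_bang = looks_internable("wtf!")
```

The helper says nothing about strings built at runtime, such as the result of `''.join(...)`, which the slide notes are not interned implicitly.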
  6. Constant folding

    • Constant folding is a technique for peephole optimization in Python.
    • This means the expression 'a'*20 is replaced by 'aaaaaaaaaaaaaaaaaaaa' during compilation to save a few clock cycles at runtime.
    • Constant folding only occurs for strings of length 20 or less; the exact limit is a CPython implementation detail and has changed between versions. (Why a limit? Imagine the size of the .pyc file generated as a result of the expression 'a'*10**10.)
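One way to observe constant folding is to compile an expression and look at the constants stored in the resulting code object. This is a sketch that assumes CPython; `constants_of` is a made-up helper name, while `compile` and the `co_consts` attribute are standard.

```python
def constants_of(source: str) -> tuple:
    """Compile an expression and return the constants stored in its code object."""
    code = compile(source, "<demo>", "eval")
    return code.co_consts

consts = constants_of("'a' * 20")

# If the compiler folded the expression, the finished 20-character
# string is already present among the constants.
is_folded = ("a" * 20) in consts
```

Running the same check on longer results, such as `'a' * 10**10`, is not advisable; inspecting a modest size just above the limit of your interpreter version shows where folding stops.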
  7. The interactive environment optimization

    When a and b are set to "wtf!" on the same line, the Python interpreter creates one new object and makes the second variable refer to it as well. If you do it on separate lines, it doesn't "know" that there's already a "wtf!" object (because "wtf!" is not implicitly interned, as per the facts mentioned above). It's a compiler optimization and specifically applies to the interactive environment.
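The interactive prompt compiles each input line separately, which can be imitated with `compile` and `exec`. The sketch below assumes CPython, where equal constants inside one compiled unit are stored once; the dictionary names are illustrative.

```python
# One compilation unit, like typing both assignments on a single line.
same_line = {}
exec(compile("a = 'wtf!'; b = 'wtf!'", "<demo>", "exec"), same_line)
same_line_identical = same_line["a"] is same_line["b"]

# Two compilation units, like typing the assignments on separate lines.
separate = {}
exec(compile("a = 'wtf!'", "<demo>", "exec"), separate)
exec(compile("b = 'wtf!'", "<demo>", "exec"), separate)
separate_identical = separate["a"] is separate["b"]   # implementation dependent
```

In a script file the whole module is one compilation unit, which is why the surprise is usually seen only at the interactive prompt.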
  8. About me

    An active Python blogger, and author of “What the f*ck Python?”
    bit.ly/wtfpython
    https://satwikkansal.xyz
    Handle: @satwikkansal