Upgrade to Pro — share decks privately, control downloads, hide ads and more …

ASCII vs. Unicode in Python 2

Sponsored · Your Podcast. Everywhere. Effortlessly. Share. Educate. Inspire. Entertain. You do you. We'll handle the rest.
Avatar for Jessica McElroy Jessica McElroy
November 03, 2014
250

ASCII vs. Unicode in Python 2

Avatar for Jessica McElroy

Jessica McElroy

November 03, 2014

Transcript

  1.  All  data  is  stored  as  bits.      Bits  are

     meaningless  unHl  they  are   encoded.  
  2. H   01001000       A   01000001  

      C   01000011     K   01001011     ASCII  can  represent  up  to     2  ^  8  (256)  characters  
  3. How  does  unicode  work?   Think  of  it  like  a

     database  that  maps  any  symbol  you  can   think  of  to  a  number  (code  point)  and  to  a  unique  name.     Symbol/glyph       Code  Point       Unique  Name    
  4.   >> text_str = “Let’s go to a café.” >>

    type(text_str) <type ‘str’> >> text_uni = u“Let’s go to a café.” >> type(text_uni) <type ‘unicode’>
  5.   >> text_str = “Let’s go to a café.” >>

    len(text_str) 20 >> text_uni = u“Let’s go to a café.” >> len(text_uni) 19
  6. >> tomorrow = u”Hasta ma\u00f1ana.” >> print tomorrow Hasta mañana.

    >> love = u“I \U00002764 Hackbright!” >> print love I  ❤    Hackbright! Unicode  escape  sequences  
  7. Make  a  unicode  sandwich   1)  Decode  early   2)

     Unicode  everywhere   3)  Encode  late  
  8. Include  Unicode  in  Unit  Tests   ґ℮αḓαbℓℯ  ßü☂  ηø☂  αṧ¢I

        TesHng  monkeys:                 ✞ε✖⊥  ḟ◎я  ⊥εṧтḯᾔℊ      
  9. References     hhp://www.unicode.org/charts/     hhp://www.joelonso•ware.com/arHcles/Unicode.html     hhp://nedbatchelder.com/text/unipain/unipain.html#1

        hhp://blog.notdot.net/2010/07/Ge‚ng-­‐unicode-­‐right-­‐in-­‐Python     hhps://mathiasbynens.be/notes/javascript-­‐unicode     hhps://pythonhosted.org/kitchen/unicode-­‐frustraHons.html