Jonas Neubert
August 18, 2018

# Zebras & Lasers (PyBay 2018)

## Transcript

1. Zebras and Lasers:
A crash course on barcodes with Python
Jonas Neubert
3rd Annual Regional Python Conference August 16 - 19, 2018 | San Francisco, CA

2. Jonas Neubert
Software Engineer at Zymergen
slides: http://jonasneubert.com/talks/pybay2018.html

3. From First Principles
- Barcodes Theory
- Barcodes with Python
- Barcodes in the Real World

4. Barcodes Theory

5. Symbology
- Geometric primitives (lines, circles, matrix, ...)
- Arrangement of primitives (single row, concentric, stacked, …)
- Symbol table (pixels → bits)
- Encoding (bits → information)
- Checksums and Error Correction
Code
- What the numbers mean (part numbers, ZIP codes, ...)

6. Code 128: Full ASCII, variable length, continuous (11,3), three character sets

7. 11 slots
3 bars & 3 gaps per character
Maximum Bar Size: 4 Slots
106 Different Characters
3 Character Sets for “Full-Ascii”
Code 128: Full ASCII, variable length, continuous (11,3), three character sets

8. character “P”
(0x30)
Code 128: Full ASCII, variable length, continuous (11,3), three character sets

9. character “H”(0x28)
Code 128: Full ASCII, variable length, continuous (11,3), three character sets

10. character “P”(0x30)
Code 128: Full ASCII, variable length, continuous (11,3), three character sets

11. “PHP”
Code 128: Full ASCII, variable length, continuous (11,3), three character sets

12. “PHP” checksum
(0x43)
Code 128: Full ASCII, variable length, continuous (11,3), three character sets

13. “PHP”
Start code
(0x68)
Stop code
(0x6A)
Checksum
(0x43)
Code 128: Full ASCII, variable length, continuous (11,3), three character sets

14. “PHP”
Start code
(0x68)
Stop code
(0x6A)
Checksum
(0x43)
Quiet zones
Code 128: Full ASCII, variable length, continuous (11,3), three character sets
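
The values on the “PHP” slides can be reproduced in a few lines. This is a minimal sketch, assuming character set B only (symbol value = ASCII code minus 32) and the standard Code 128 check rule: start value plus each data value weighted by its 1-based position, modulo 103. The function name is made up for illustration.

```python
START_B = 0x68  # start symbol for character set B (value 104)

def code128b_symbols(text):
    """Return the symbol values for start code, data and checksum (set B only)."""
    values = []
    for ch in text:
        value = ord(ch) - 32  # set B covers ASCII 32..127 as values 0..95
        if not 0 <= value <= 95:
            raise ValueError(f"{ch!r} is not in character set B")
        values.append(value)
    # Weighted sum: the start code counts once, data symbols count position times.
    weighted = sum(pos * v for pos, v in enumerate(values, start=1))
    checksum = (START_B + weighted) % 103
    return [START_B, *values, checksum]

print([hex(s) for s in code128b_symbols("PHP")])
# ['0x68', '0x30', '0x28', '0x30', '0x43']
```

The stop symbol and the quiet zones are left out here; they are fixed and do not depend on the data.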

15. Code 39: Numbers and uppercase letters (39 chars), discrete 3-of-9 2-width
12 slots
5 bars & 4 gaps per character
Two distinct widths
3 out of 9 Bars or Gaps are 2x wide
Variable Gap between characters
39 Different Characters: A-Z 0-9 *$/+%

16. Start code? Just use “*”
Stop code? Another “*”
Checksum optional
* P H P *
Code 39: Numbers and uppercase letters (39 chars), discrete 3-of-9 2-width
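
Framing a Code 39 payload is simple enough to sketch. The example below assumes the 43-character alphabet commonly used with the optional modulo-43 check character (this is an assumption about which variant is in use; the function name is made up).

```python
# Character order used for the optional modulo-43 check character.
CODE39_CHARS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ-. $/+%"

def code39_frame(data, with_checksum=False):
    """Wrap data in '*' start/stop characters, optionally adding a check character."""
    for ch in data:
        if ch not in CODE39_CHARS:
            raise ValueError(f"{ch!r} cannot be encoded in Code 39")
    if with_checksum:
        total = sum(CODE39_CHARS.index(ch) for ch in data)
        data += CODE39_CHARS[total % 43]
    return f"*{data}*"

print(code39_frame("PHP"))                      # *PHP*
print(code39_frame("PHP", with_checksum=True))  # *PHPO*
```

Because the check character is optional, the reader must be configured to expect it; otherwise it is reported as part of the data.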

17. Interleaved 2-of-5: Numbers only, variable length, continuous 2-of-5 2-width

18. 0
8 8
1
Start guard
Stop guard
Checksum
optional
Interleaved 2-of-5: Numbers only, variable length, continuous 2-of-5 2-width
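
The “interleaved” part is easiest to see in code: digits are taken in pairs, the first digit of each pair sets the bar widths and the second sets the gap widths. This is a sketch under the usual 2-of-5 width patterns (N = narrow, W = wide); the function name is hypothetical.

```python
# Width pattern per digit: exactly two of the five elements are wide.
PATTERNS = {
    "0": "NNWWN", "1": "WNNNW", "2": "NWNNW", "3": "WWNNN", "4": "NNWNW",
    "5": "WNWNN", "6": "NWWNN", "7": "NNNWW", "8": "WNNWN", "9": "NWNWN",
}

def itf_widths(digits):
    """Return alternating bar/gap widths, starting with a bar."""
    if not digits or len(digits) % 2 or any(d not in PATTERNS for d in digits):
        raise ValueError("need a non-empty, even number of ASCII digits")
    widths = ["N", "N", "N", "N"]  # start guard: bar, gap, bar, gap
    for i in range(0, len(digits), 2):
        bars, gaps = PATTERNS[digits[i]], PATTERNS[digits[i + 1]]
        for bar, gap in zip(bars, gaps):
            widths += [bar, gap]  # bars from first digit, gaps from second
    widths += ["W", "N", "N"]  # stop guard: wide bar, narrow gap, narrow bar
    return "".join(widths)

print(itf_widths("08"))  # NNNNNWNNWNWWNNWNN
```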

19. POSTNET: Numbers only, fixed length, clocked continuous 2-of-5, height-modulated
USPS pre-2009

20. POSTNET: Numbers only, fixed length, clocked continuous 2-of-5, height-modulated
USPS pre-2009

21. Start guard
Stop guard
Checksum
9 4 1 4 3 - 3 0 0 8
POSTNET: Numbers only, fixed length, clocked continuous 2-of-5, height-modulated
USPS pre-2009
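
A short sketch of the POSTNET scheme for the ZIP+4 on this slide. It assumes the usual rules: each digit is five bars with weights 7, 4, 2, 1, 0 of which two are tall, the check digit brings the digit sum to a multiple of 10, and one tall guard bar sits at each end. Function names are made up; `|` stands for a tall bar and `.` for a short one.

```python
# Two tall bars out of five per digit (weights 7, 4, 2, 1, 0; zero is 7 + 4).
POSTNET = {
    "0": "||...", "1": "...||", "2": "..|.|", "3": "..||.", "4": ".|..|",
    "5": ".|.|.", "6": ".||..", "7": "|...|", "8": "|..|.", "9": "|.|..",
}

def postnet_check_digit(digits):
    """Digit that makes the sum of all digits a multiple of 10."""
    return (10 - sum(int(d) for d in digits) % 10) % 10

def postnet_bars(zip_code):
    digits = zip_code.replace("-", "")
    if not digits or any(d not in POSTNET for d in digits):
        raise ValueError("POSTNET encodes ASCII digits only")
    digits += str(postnet_check_digit(digits))
    return "|" + "".join(POSTNET[d] for d in digits) + "|"  # guard bars

print(postnet_check_digit("941433008"))  # 8
print(postnet_bars("94143-3008"))
```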

22. Zielcode: Numbers only, fixed length, continuous 2-of-5 1-width
Deutsche Post
Source: Helmut Schraets: Meilensteine der Briefcodierung (1998).
https://www.briefmarken-kevelaer.de/media/files/Meilensteine-Neu.pdf

23. Intelligent Mail®: Numbers only, variable length, clocked 4-state, height-modulated
US Postal since 2009
00-300-123456-123456789-94143-3008-00
“Send First Class letter with no services (300), no OEL information (00), to UCSF Conference
center (94143-3008-00) using Mailer ID 123456 and sequence number 123456789”
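
To make the field layout concrete, here is a sketch that splits the hyphenated example above into named fields. The hyphens are only a presentation aid, and the class, field and function names are assumptions for illustration; the length checks follow the payload layout of 2 + 3 + 15 digits plus a routing code of 0, 5, 9 or 11 digits.

```python
from typing import NamedTuple

class IntelligentMailFields(NamedTuple):
    barcode_id: str     # 2 digits, e.g. "00" for no OEL information
    service_type: str   # 3 digits, e.g. "300"
    mailer_id: str      # 6 or 9 digits
    serial_number: str  # 9 or 6 digits (mailer ID + serial = 15)
    routing_code: str   # 0, 5, 9 or 11 digits (ZIP, +4, delivery point)

def parse_hyphenated(payload):
    parts = payload.split("-")
    if len(parts) < 4 or any(not (p.isascii() and p.isdigit()) for p in parts):
        raise ValueError("expected at least four hyphen-separated digit groups")
    barcode_id, service_type, mailer_id, serial_number, *routing = parts
    fields = IntelligentMailFields(
        barcode_id, service_type, mailer_id, serial_number, "".join(routing)
    )
    if (len(barcode_id), len(service_type)) != (2, 3):
        raise ValueError("barcode ID must be 2 digits, service type 3 digits")
    if len(mailer_id) not in (6, 9) or len(mailer_id) + len(serial_number) != 15:
        raise ValueError("mailer ID and serial number must total 15 digits")
    if len(fields.routing_code) not in (0, 5, 9, 11):
        raise ValueError("routing code must be 0, 5, 9 or 11 digits")
    return fields

print(parse_hyphenated("00-300-123456-123456789-94143-3008-00"))
```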

24. Source: USPS: Intelligent Mail Barcode 4-state Specification (2015),
Intelligent Mail®: Numbers only, variable length, clocked 4-state, height-modulated
US Postal since 2009

25. :CueCat: Like Code 128 but at 22.5° angle and proprietary code

26. Rule of Thumb:
Height = 0.15 * Width
Anonymous 1980s engineer
pulling Number out of his Hat
Bacon

27. Bacon & Eggs Rule of Thumb:
Height = 0.15 * Width
Anonymous 1980s engineer
pulling Number out of his Hat

28. Bacon & Egg with Toast
Rule of Thumb:
Height = 0.15 * Width
Anonymous 1980s engineer
pulling Number out of his Hat

29. Code 49: Stacked linear codes, fixed length (49 chars)
Full Breakfast with Bacon & Eggs
and Toast & Jam

30. Code 16k: Stack of inverted Code 128 symbols, 77 chars or 154 digits

31. PDF417: “High capacity”, variable length, stack of continuous (17,4) codes
Full Breakfast with Bacon, Baked Beans and Eggs
and Toast or an English Muffin as a side.

32. PDF417: “High capacity”, variable length, stack of continuous (17,4) codes

33. Data Matrix: Arbitrary data, up to 1556 bytes/2335 chars/3116 digits
PyBay

34. Data Matrix: Arbitrary data, up to 1556 bytes/2335 chars/3116 digits
White in Top right
means: Uses ECC 200
PyBay
Left and bottom
guards are solid
Top and right guards
are clocked

35. Data Matrix: Arbitrary data, up to 1556 bytes/2335 chars/3116 digits
White in Top right
means: Uses ECC 200
PyBay
Left and bottom
guards are solid
Top and right guards
are clocked

36. Data Matrix: Arbitrary data, up to 1556 bytes/2335 chars/3116 digits
White in Top right
means: Uses ECC 200
PyBay
Left and bottom
guards are solid
Top and right guards
are clocked
P
y
B
P
y
a
B a
REED-SOLOMON ERROR
CORRECTION

37. Data Matrix: Arbitrary data, up to 1556 bytes/2335 chars/3116 digits

38. Data Matrix: Arbitrary data, up to 1556 bytes/2335 chars/3116 digits

39. Semacode: Code for storing URLs in Datamatrix
Source: Bernd Hopfengärtner: Hello World! (2007)
http://hello.w0r1d.net/

40. QR: up to 7089 digits or 4296 ASCII characters or 1817 Kanji characters

41. QR: up to 7089 digits or 4296 ASCII characters or 1817 Kanji characters
Source: Xinhua (2017)
http://www.xinhuanet.com/photo/2017-09/14/c_1121662356_2.htm

42. Aztec:
up to 3832 digits or
3067 characters
Maxicode:
exactly 596 bits,
hexagonal
elements
Grid Matrix:
optimized for
encoding Chinese
characters
Code One: First
non-proprietary 2D
symbology (1992)
Micro QR:
Variation of QR Code,
numeric, up to 35 digits,
single alignment mark
← HIBC Datamatrix:
Variation of Datamatrix,
up to 36 characters, used
in medical devices

43. UPC/EAN: Numbers only, fixed length of 12 or 13 digits, continuous (7,2) 4-width
Source: Businessweek, April 7, 1973. Available on the website of the
ID History Museum at http://idhistory.com/ibm/Busweek1973001.pdf

44. 1D
Interleaved 2-of-5
Code 39
Code 128
:CueCat
Bullseye
UPC/EAN
Stacked
Code 49
Code 16k
PDF417
Matrix
Datamatrix
Aztec
Maxicode
Grid Matrix
Code One
QR
Mini QR
POSTNET
RM4SCC
PostBar
Intelligent Mail

45. 1D
Interleaved 2-of-5
Code 39
Code 128
:CueCat
Codabar
Code 11
Code 93
93i
Channel Code
Bullseye
UPC-A, UPC-E
EAN-13, EAN-8
GS1-128
Plessey Code
Stacked
Code 49
Code 16k
PDF417
MicroPDF417
Codablock
SuperCode
UltraCode
GS1 Databar St.
Matrix
Datamatrix
Aztec
Maxicode
Grid Matrix
Code One
QR
Mini QR
Vericode
MiniCode
GoCode
POSTNET
RM4SCC
PostBar
Intelligent Mail
Data Logic
Leitcode
Pharmacode
PZN
ISBN
Composite
Aztec Mesas
EAN.UCC
EAN-14 Comp.

46. Interleaved 2 of 5
Code 128
Code 128
Maxicode
Micro QR
Code 39
Code 39
Code 128
Code 39

47. How to pick a symbology?
Does your application dictate a specific symbology?
Yes: Use the symbology dictated by your application.
No: Continue listening to this presentation.

- Really bright lamps
- Wands
- Laser
- Imaging

- Really bright lamps
- Wands
- Laser
- Imaging
Photos:
“ACI plate on an IANR hopper” by Feddacheenee on Wikimedia Commons.
https://commons.wikimedia.org/wiki/File:ACI_plate.jpg
“Petrified Forest, Flora, Summer 2010” Quinn Rossi on flickr, Licensed under CC BY 2.0.
https://www.flickr.com/photos/theeskimo/4898894840/

- Really bright lamps
- Wands
- Laser
- Imaging
Photo:
“Barcodelesestift mit Leseeinheit” by I,Nightflyer on Wikimedia.de under CC BY-SA 3.0
https://de.wikipedia.org/wiki/Barcodeleseger%C3%A4t#/media/File:Barcodelesestift.jpg

- Really bright lamps
- Wands
- Laser
- Imaging

- Really bright lamps
- Wands
- Laser
- Imaging
Photo: “Swap Paper for Mobile Boarding Passes“ by City of Mc Allen.

53. Printing Technology

54. Direct Parts Marking: Permanently marking a piece by punching, etching, laser, etc.
“Printing” Technology

55. “Printing” Technology
Otolith Marking: Information stored in fish bones for fishery management
Photo: randychiu on flickr, Licensed under CC BY 2.0.
https://www.flickr.com/photos/randychiu/4817651586/

56. Barcodes with Python

57. Generating Barcodes
>>> import barcode
>>> bc = barcode.get_barcode('code39', 'PYBAY2018')
>>> bc.save('mybc')
'mybc.svg'
pip install python-barcode

58. Generating Barcodes
>>> import barcode
>>> barcode.generate('code39', 'PYBAY2018', output='mybc')
'mybc.svg'
pip install viivakoodi

59. Generating Barcodes
>>> from pubcode import Code128
>>> bc = Code128('PyBay2018', charset='B')
>>> bc.bars
'21121431312121214113112312112421214122321112312212322131122
>>> bc.image(height=50).save('mybarcode.png')
MissingDependencyError: PIL module is required to use image
pip install PubCode

60. Generating Barcodes
>>> from pubcode import Code128
>>> bc = Code128('PyBay2018', charset='B')
>>> bc.bars
'21121431312121214113112312112421214122321112312212322131122
>>> bc.image(height=50).save('mybarcode.png')
pip install PubCode pillow

61. Generating Barcodes
import pdf417gen, this
data = pdf417gen.encode(this.s)
image = pdf417gen.render_image(data)
image.save('mybarcode.png')
pip install pdf417gen

62. Generating Barcodes

| Library | License | (unlabeled) | Latest release | Output |
|---|---|---|---|---|
| treepoem | MIT | 123 | June 2018 | 2 |
| pyBarcode | MIT | N/A | July 2013* | 1, [2] |
| ↳ python-barcode | MIT | 17 | June 2018 | 1, [2] |
| ↳ viivakoodi | MIT | 28 | Nov 2014 | 1, [2] |
| ↳ steenzout.barcode | MIT | 5 | Jan 2017 | 1, [2] |
| ↳ reBarcode | MIT | 0 | Oct 2017 | 1, [2] |
| pyStrich (fork of huBarcode) | Apache | N/A | July 2016 | 2 |
| qrcode | BSD | 1,481 | Mar 2018 | 1, 3, [2] |
| PyQRCode | BSD | 204 | June 2016 | 1, 3 |
| segno | BSD | 18 | Feb 2018 | various formats, no external dep. |
| pdf417gen | MIT | 13 | May 2017 | 2 |
| code128 | LGPL | N/A | Jan 2015 | 1, [2] |
| PubCode | MIT | 1 | Aug 2015 | 2 |
| candybar | Apache | 2 | Aug 2016 | 2 |

Output: 1: SVG, 2: pillow, 3: pypng, []: optional

Symbology columns (checkmarks not preserved): UPC-A, EAN, ISBN, PZN, I. 2 of 5, Code 39, Code 93, Code 128, Data Matrix, PDF417, QR, Micro QR, Aztec

63. Generating Barcodes
Easy to install, support for common symbologies:
python-barcode
license: MIT, last updated: June 2018
https://pypi.org/project/python-barcode/
Most feature rich tool:
Treepoem (uses BWIPP, requires Postscript)
license: MIT, last updated: Aug 2018
https://pypi.org/project/treepoem/
Symbology specific libraries:
Segno (QR, Micro QR), pdf417gen, libdmtx (Datamatrix), PubCode (Code 128)

65. Don’t read barcodes with Python
Use purpose-built hardware
whenever possible. These
can optimize every step of
the process going from light
to data.
Photos: “Symbol Motorola” by Lost Parcels on flickr, licensed under CC BY 2.0,
https://www.flickr.com/photos/lostparcels/31046764866/. By HP APJ on flickr, licensed
under CC BY-NC-ND 2.0, https://www.flickr.com/photos/personalagain/5444387315/

zbar
C/C++ library
EAN-8/13, UPC-A/E, EAN-8, Interleaved 2 of 5,
Code 39, Code 128, QR
(not documented: Codabar, Code 93, PDF417)
license: LGPL 2.0, last updated: Oct 2009
https://sourceforge.net/projects/zbar/
ZXing
Java library (and C++ rewrite/port/fork)
same as zbar, plus: Codabar, Code 93, ITF, Data
Matrix, Aztec, PDF417, Maxicode, GS1 Databar
(and various codes using these symbologies)
license: Apache 2.0, last updated: Aug 2018
https://github.com/zxing/zxing (Java)
https://github.com/glassechidna/zxing-cpp (C++)
libdmtx
C/C++ library, .NET fork, Datamatrix symbology only
license: LGPL 2.0, last updated: unclear
https://sourceforge.net/projects/libdmtx/, https://github.com/dmtx,
https://sourceforge.net/projects/datamatrixnet/

67. “The Original”
Last updated 2009
Python 2 only
Actively maintained
Python 2 and Python 3
Actively maintained
installation

68. There exists a sparsely
maintained C++ rewrite/port/fork
of ZXing. This wraps that.
https://github.com/glassechidna/zxing-cpp
https://github.com/lubo/zxinglight
- uses subprocess.Popen()
Python 3 only

69. ZXing example
>>> import zxing
>>> reader = zxing.BarCodeReader()
>>> reader.decode('barcode.png')  # hypothetical file name
BarCode('420941079374889676090739719680',
'420941079374889676090739719680', 'CODE_128', 'TEXT',
[(37.5, 96.0), (805.0, 96.0)])
pip install zxing

Commercial SDKs
Scandit (C)
https://www.scandit.com/products/barcode-scanner/barcode-scanner-details/
Cognex ManateeWorks
https://manateeworks.com/barcode-scanner-sdk
Vintasoft (.NET)
http://www.vintasoft.com/vsbarcode-dotnet-index.html
Leadtools Barcode SDK (Java, C, .NET Core)
DTKSoft Barcode SDK (Windows only)
https://www.dtksoft.com/?p=barcodesdk
Sension (C++)
https://www.sensionweb.com/codemax-barcode-sdk
https://www.dynamsoft.com/
https://pypi.org/project/dbr/
Matrox Imaging Library
https://www.matrox.com/imaging/en/products/software/mil/
(Python wrapper available as “update”)
NO endorsement is implied.

71. Barcodes in the Real World

72. Barcodes as real-world Browser Cookies

73. Barcodes as Information
Photo: “Heinz QR porn code too saucy for ketchup customer”, BBC, Jun
https://www.bbc.com/news/technology-33200142

74. Barcodes as Passwords (spoiler: don’t do it)

75. Barcodes as Passwords (spoiler: don’t do it)
'M1NEUBERT/JONAS EP8LD5W SFOSANUA 1900 034Y029D0071 15D>5180
W8034BUA 2A01123810700013 UA UA GV444737 N*30600 05
^QO6IGQATNXzqrr57ZahTjZGbeLkuah2ZCvlKPeBBodwewJDZWTitJftePwRGUZs
F7yCngxRxKj9x9MtIAbgLE21TueaF4GIT7QU'

76. Barcodes as Passwords (spoiler: don’t do it)

77. RTFM (part 1)
Understand the hardware you are using
A barcode reader with factory settings is like a Windows PC without a firewall

78. RTFM (part 2)
Understand the symbology you are using
Example: The Interleaved 2-of-5 symbology makes valid partial scans very
likely; only use it with fixed-length content
I AM A RED STICKER
THAT BREAKS
“4321”
“0987654321”

79. RTFM (part 3)
Understand the code you are using
Example: Google Chrome “crashes” when scanning a product code into a textbox
Here’s why:
- Product code is a GS1 Datamatrix code
- GS1 Datamatrix specifies “FNC1” character as separator
- Code 128 maps “FNC1” to 0x64
- My barcode reader decides to communicate 0x64 as F8 key
- In Chrome developer tools “F8” means: Stop code execution

80. Tips for working with barcodes
- “Outsource” the decoding if you can
- Use a symbology that is right and safe for your data type
- Limit the symbologies accepted to those you actually need
- Disable reader configuration through barcodes
- Validate input as if it was user input (because it is)
- Validate that only the characters or numbers you expect are used
- Check length of data (most symbologies are variable length)
- Protect against Little Bobby Barcodes injection attacks
- If your software hands out barcodes, treat them like browser cookies
- Never include secrets in barcodes
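
Several of these tips fit into one small gatekeeper function. The sketch below is only an illustration: the symbology name, the ten-digit format and the function name are assumptions for a hypothetical scanning station, not part of any reader's API.

```python
import re

# Hypothetical policy for one scanning station; adjust to your own data.
ALLOWED_SYMBOLOGIES = {"ITF"}
EXPECTED_FORMAT = re.compile(r"[0-9]{10}")  # exactly ten ASCII digits

def validate_scan(symbology, data):
    """Return data unchanged if it passes every check, else raise ValueError."""
    if symbology not in ALLOWED_SYMBOLOGIES:
        raise ValueError(f"unexpected symbology: {symbology!r}")
    # fullmatch checks characters and length at once, so a partial scan
    # such as "4321" is rejected along with injection attempts.
    if not EXPECTED_FORMAT.fullmatch(data):
        raise ValueError(f"unexpected content: {data!r}")
    return data

scans = [
    ("ITF", "0987654321"),
    ("ITF", "4321"),
    ("QR_CODE", "0987654321"),
    ("ITF", "1'; DROP--"),
]
for scan in scans:
    try:
        print("accepted:", validate_scan(*scan))
    except ValueError as error:
        print("rejected:", error)
```

Only the first scan is accepted; the partial scan, the unexpected symbology and the injection attempt are all rejected before the data reaches application code.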

81. My last slide
What’s the URL for these slides again?
http://jonasneubert.com/talks/pybay2018.html
Want to work with me?
https://www.zymergen.com/careers/
The best ways to reach me are Twitter (@jonemo)
and email.

82. Books
Roger C. Palmer: The Bar Code Book: A Comprehensive Guide To Reading, Printing, Specifying,
Evaluating, And Using Bar Code and Other Machine-Readable Symbols, 5th ed., Trafford Pub. 2007
John Berry: The Secret Life of Bar Codes, Wirksworth Books, 2013
Gavin Weightman: Eureka: How Invention Happens, Yale University Press, 2017. Excerpt available at
https://www.smithsonianmag.com/innovation/history-bar-code-180956704/

I didn’t read the following but came across references to them during my research for this presentation:
Ben Nelson: Punched Cards to Barcodes: A 200 Year Journey with Descriptions of over 260 Codes,
Helmers Publishing, 1997.
Stephen A. Brown: Revolution at the Checkout Counter: The Explosion of the Bar Code, Wertheim
Publications in Industrial Relations, 1997.
Bill Selmeier: Spreading the Barcode, Lulu.com, 2010.
George J. Laurer: Engineering Was Fun, 3rd ed., Lulu.com, 2012.

83. Articles
Katie Mingle: Barcodes. 99 Percent Invisible. 2014.
https://99percentinvisible.org/episode/barcodes/
Ernie Smith: Right Track, Wrong Station, Tedium. 2017.
Ernie Smith: Switching Labels, Tedium. 2017.
https://tedium.co/2017/11/13/retail-theft-ticket-switching/
Ernie Smith: When Tech Hit Retail, Tedium. 2017.
https://tedium.co/2017/07/20/point-of-sale-retail-history/
Larry Silverman: Barcodes, A Brief History, TrackAbout, 2015.
Brian Krebs: Why It’s Still A Bad Idea to Post or Trash Your Airline Boarding Pass, 2017.
Jonas Neubert: Intro to Barcode Readers, 2017.
Shaun Ewing: What’s contained in a boarding pass barcode, 2011.
https://shaun.net/notes/whats-contained-in-a-boarding-pass-barcode/

84. Videos & Talks
Barcode in Black and White, Groupe Média TFO.
“FX” Felix Lindner: Toying with Barcodes, DEFCON 16 (2008).
Karsten Nohl, Nemanja Nikodijevic: Where in the World Is Carmen San Diego, Becoming a Secret Travel
Agent, 33c3 (2016). https://media.ccc.de/v/33c3-7964-where_in_the_world_is_carmen_sandiego
Karina Ruzinov: Shipping secret messages through barcodes, PyCascades 2018.

85. Online Resources (besides Wikipedia)
Retail Identification History Museum. http://www.idhistory.com/
Egoditor UG: http://www.barcode-generator.org/ (scroll down past the generator for content)

86. The Original Barcode Patent
US2612994A: Classifying apparatus and method, 1949, Norman J Woodland, Silver Bernard (Drexel
University)
Otolith Marking (Barcodes in Fish Bones)
Volk, Schroder, Grimm, Ackley: Use Of A Bar Code Symbology to Produce Multiple Thermally Induced
Otolith Marks, Transactions of the American Fisheries Society, vol. 123, is. 5, pp. 811-816 (1994).
Volk, Schroder, Grimm: Otolith thermal marking, Fisheries Research, vol. 43, is. 1-3, pp. 205-219 (1999).
Helmut Schraets: Meilensteine der Briefcodierung (1998).
https://www.briefmarken-kevelaer.de/media/files/Meilensteine-Neu.pdf
USPS: Intelligent Mail Barcode 4-state Specification (2015).
https://ribbs.usps.gov/intelligentmail_mailpieces/documents/tech_guides/USPSB3200IntelligentMailBarcode4State.pdf
DNA Barcoding
https://en.wikipedia.org/wiki/DNA_barcoding