Slide 1

Slide 1 text

@PeterHilton http://hilton.org.uk/ Universal Bugs 🐛 🐛 🐛 🐛 🐛 🐛 🐛 🐛 🐛 🐛 http://hilton.org.uk/tag/ddd

Slide 2

Slide 2 text

That’s a nasty bug you’ve got there… Software has bugs, and then we fix them. Except when we don’t. Not all bugs appear because you didn’t write a unit test, or because the cat walked on your keyboard. Some bugs are endemic. Who can fix those?

Slide 3

Slide 3 text

House numbers

Slide 4

Slide 4 text

House numbers Peter Hilton

Slide 5

Slide 5 text

Any attribute with the word number in its name must be modelled as text https://hilton.org.uk/blog/non-numeric-numbers
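
A minimal sketch of this rule, assuming a hypothetical `Address` type with a `house_number` field (neither name comes from the talk). It only shows why a text field keeps information that an integer field would lose.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Address:
    """Hypothetical address model, for illustration only."""
    street: str
    house_number: str  # text, not int: suffixes, ranges and leading zeroes must survive


# Values that an integer field cannot represent, or would silently change.
examples = ["221B", "10-12", "007"]
addresses = [Address("Example Street", number) for number in examples]

for address in addresses:
    print(address.street, address.house_number)

# Converting to a number and back loses the leading zeroes.
round_tripped = str(int("007"))
print(round_tripped)
```

The same reasoning applies to the other ‘numbers’ later in the deck: they are identifiers, so arithmetic on them is meaningless and exact text is what matters.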

Slide 6

Slide 6 text

Post codes

Slide 7

Slide 7 text

Stadscykel / CC0

Slide 8

Slide 8 text

No content

Slide 9

Slide 9 text

Post codes
117 of 190 Universal Postal Union countries use post codes
Most countries use only digits (3-10 of them)
Some countries use letters as well
Only Ireland has a unique post code per address
Yet another country-specific value model
Ideally, you have each country’s up-to-date actual list
https://en.wikipedia.org/wiki/Postal_code
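
One way to sketch a country-specific value model, assuming a hypothetical `is_plausible_post_code` function and two illustrative format patterns (NL and US) that are assumptions of this example, not data from the talk. A format check like this is weaker than the up-to-date actual list the slide recommends.

```python
import re

# Illustrative format patterns only (assumptions for this sketch).
# A real system would ideally check against each country's actual list.
POST_CODE_PATTERNS = {
    "NL": re.compile(r"[1-9][0-9]{3} ?[A-Z]{2}"),   # four digits, two letters
    "US": re.compile(r"[0-9]{5}(-[0-9]{4})?"),      # five digits, optional +4
}


def is_plausible_post_code(country_code: str, post_code: str) -> bool:
    """Check the format for countries we have a rule for; accept everything else."""
    pattern = POST_CODE_PATTERNS.get(country_code)
    if pattern is None:
        # No rule for this country: rejecting would turn our ignorance into a bug.
        return True
    return pattern.fullmatch(post_code.strip().upper()) is not None


print(is_plausible_post_code("NL", "3011 aa"))
print(is_plausible_post_code("NL", "90210"))
print(is_plausible_post_code("IE", "D02 AF30"))
```

Accepting unknown countries by default is a deliberate choice here: the model is per country, so one country's rule should never reject another country's post code.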

Slide 10

Slide 10 text

Country lists

Slide 11

Slide 11 text

Countries Visible Each, NASA

Slide 12

Slide 12 text

Countries
By country, you probably mean sovereign state. Cf. ISO 3166-1 alpha-2 code, Unicode CLDR localised name.
The closest to a standard is United Nations membership, but 18 states are not universally recognised:
🇮🇱 🇰🇷 🇰🇵 🇨🇳 🇨🇾 🇦🇲 🇵🇸 🇹🇼 🇪🇭 🏳 🏳 🇽🇰 🏳 🏳
… four of which not even by the Apple Emoji font 🏳😢
https://hilton.org.uk/blog/country-lists

Slide 13

Slide 13 text

Standard text identifiers a.k.a. codes
ISO 3166-1 alpha-2 country (region) codes:
 → GB NL FR
ISO 4217 currency codes (country code plus one letter):
 → GBP NLG FRF EUR
ISO 639-1 language codes (lower-case!):
 → en nl fr

Slide 14

Slide 14 text

Country names

Slide 15

Slide 15 text

Standard names and their translations
English         Dutch                French       Ukrainian
France          Frankrijk            France       Франція
Netherlands     Nederland            Pays-Bas     Нідерланди
Ukraine         Oekraïne             Ukraine      Україна
United Kingdom  Verenigd Koninkrijk  Royaume-Uni  Велика Британія

Slide 16

Slide 16 text

No content

Slide 17

Slide 17 text

Unicode Common Locale Data Repository (CLDR)
Unicode CLDR provides standard lists and translations of: territories (including countries), currencies, time zones, languages, calendar names (quarters, months & weekdays), scripts (writing systems), units of measurement, etc.
⚠ You have to use validity data to filter the lists
📄 Unicode CLDR publishes XML and JSON on GitHub
https://hilton.org.uk/blog/l10n-cldr-names

Slide 18

Slide 18 text

Country list order

Slide 19

Slide 19 text

English         Dutch                French       Ukrainian
France          Frankrijk            France       Франція
Netherlands     Nederland            Pays-Bas     Нідерланди
Ukraine         Oekraïne             Ukraine      Україна
United Kingdom  Verenigd Koninkrijk  Royaume-Uni  Велика Британія
Sort order is language-dependent!
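
A small sketch of the effect, using the four names from the table. The `NAMES` mapping and `country_order` function are hypothetical names for this example. It sorts by plain code point, which happens to be enough for these four names; real collation needs locale-aware rules (for example from Unicode CLDR), which this sketch does not attempt.

```python
# Localised names from the slide, keyed by ISO 3166-1 alpha-2 country code
# and ISO 639-1 language code.
NAMES = {
    "FR": {"en": "France", "nl": "Frankrijk", "fr": "France", "uk": "Франція"},
    "NL": {"en": "Netherlands", "nl": "Nederland", "fr": "Pays-Bas", "uk": "Нідерланди"},
    "UA": {"en": "Ukraine", "nl": "Oekraïne", "fr": "Ukraine", "uk": "Україна"},
    "GB": {"en": "United Kingdom", "nl": "Verenigd Koninkrijk", "fr": "Royaume-Uni", "uk": "Велика Британія"},
}


def country_order(language: str) -> list[str]:
    """Country codes ordered by their name in the given language (code-point sort)."""
    return sorted(NAMES, key=lambda code: NAMES[code][language])


for language in ("en", "nl", "fr", "uk"):
    print(language, country_order(language))
```

The list of countries is the same in every language, but a list sorted once by English name is in the wrong order for French and Ukrainian readers.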

Slide 20

Slide 20 text

Email addresses

Slide 21

Slide 21 text

Email addresses
There’s a standard for email addresses. So what’s the problem?
1. Several updated/replaced standards:
 RFC 822 → RFC 2822 → RFC 5322 → RFC 6854
2. Four levels of email address validation:
 RFC format + domain + mailbox exists + correct person
3. Security risks of supporting the whole standard
https://hilton.org.uk/blog/mail-address-validation
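
The four validation levels can be sketched as a sequence of checks. Everything named here (`EmailCheck`, `validation_level`, the three injected check functions) is hypothetical, and the format pattern is a deliberately small subset rather than the full RFC grammar. How to look up a domain, probe a mailbox or confirm the owner is left to the caller.

```python
import re
from enum import IntEnum
from typing import Callable


class EmailCheck(IntEnum):
    """Highest validation level an address has passed."""
    NONE = 0
    FORMAT = 1    # looks like an address
    DOMAIN = 2    # domain exists
    MAILBOX = 3   # mailbox exists
    PERSON = 4    # belongs to the correct person


# Deliberately simple: one "@", no whitespace, a dotted domain.
SIMPLE_FORMAT = re.compile(r"[^@\s]+@[^@\s.]+(\.[^@\s.]+)+")


def validation_level(
    address: str,
    domain_exists: Callable[[str], bool],
    mailbox_exists: Callable[[str], bool],
    owner_confirmed: Callable[[str], bool],
) -> EmailCheck:
    """Run the checks in order and report the last level that passed."""
    if not SIMPLE_FORMAT.fullmatch(address):
        return EmailCheck.NONE
    domain = address.rsplit("@", 1)[1]
    if not domain_exists(domain):
        return EmailCheck.FORMAT
    if not mailbox_exists(address):
        return EmailCheck.DOMAIN
    if not owner_confirmed(address):
        return EmailCheck.MAILBOX
    return EmailCheck.PERSON


# Stub checks for demonstration; real ones would use DNS, delivery and a confirmation link.
level = validation_level(
    "someone@example.org",
    domain_exists=lambda domain: domain == "example.org",
    mailbox_exists=lambda address: True,
    owner_confirmed=lambda address: False,
)
print(level.name)
```

Only the first level is a regular expression; the later levels need the network and, in the end, the person.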

Slide 22

Slide 22 text

(((?:(?:(?:(?:(?:(?:[ \t]*\r\n)?[ \t]+)?\((?:(?:(?:[ \t]*\r\n)?[ \t]+)?[\x01-\x08\x0B\x0C\x0E- \x1F\x7F!-'*-\[\]-~]|(?:\\[\x01-\x09\x0B\x0C\x0E-\x7F]))*(?:(?:[ \t]*\r\n)?[ \t]+)?\))*(?:(?:(?:(?: [ \t]*\r\n)?[ \t]+)?\((?:(?:(?:[ \t]*\r\n)?[ \t]+)?[\x01-\x08\x0B\x0C\x0E-\x1F\x7F!-'*-\[\]-~]|(?:\\ [\x01-\x09\x0B\x0C\x0E-\x7F]))*(?:(?:[ \t]*\r\n)?[ \t]+)?\))|(?:(?:[ \t]*\r\n)?[ \t]+)))?[a-zA-Z0-9! #-'*+\-/=?^-`{-~.\[]]+(?:(?:(?:(?:[ \t]*\r\n)?[ \t]+)?\((?:(?:(?:[ \t]*\r\n)?[ \t]+)?[\x01- \x08\x0B\x0C\x0E-\x1F\x7F!-'*-\[\]-~]|(?:\\[\x01-\x09\x0B\x0C\x0E-\x7F]))*(?:(?:[ \t]*\r\n)?[ \t]+)? \))*(?:(?:(?:(?:[ \t]*\r\n)?[ \t]+)?\((?:(?:(?:[ \t]*\r\n)?[ \t]+)?[\x01-\x08\x0B\x0C\x0E- \x1F\x7F!-'*-\[\]-~]|(?:\\[\x01-\x09\x0B\x0C\x0E-\x7F]))*(?:(?:[ \t]*\r\n)?[ \t]+)?\))|(?:(?: [ \t]*\r\n)?[ \t]+)))?)|(?:(?:(?:(?:(?:[ \t]*\r\n)?[ \t]+)?\((?:(?:(?:[ \t]*\r\n)?[ \t]+)?[\x01- \x08\x0B\x0C\x0E-\x1F\x7F!-'*-\[\]-~]|(?:\\[\x01-\x09\x0B\x0C\x0E-\x7F]))*(?:(?:[ \t]*\r\n)?[ \t]+)? \))*(?:(?:(?:(?:[ \t]*\r\n)?[ \t]+)?\((?:(?:(?:[ \t]*\r\n)?[ \t]+)?[\x01-\x08\x0B\x0C\x0E- \x1F\x7F!-'*-\[\]-~]|(?:\\[\x01-\x09\x0B\x0C\x0E-\x7F]))*(?:(?:[ \t]*\r\n)?[ \t]+)?\))|(?:(?: [ \t]*\r\n)?[ \t]+)))?"(?>(?:(?:[ \t]*\r\n)?[ \t]+)?(?:[\x01-\x08\x0B\x0C\x0E-\x1F\x7F!#-\[\]-~]|(?:\ \[\x01-\x09\x0B\x0C\x0E-\x7F])))*(?:(?:[ \t]*\r\n)?[\t]+)?"(?:(?:(?:(?:[ \t]*\r\n)?[ \t]+)?\((?:(?: (?:[ \t]*\r\n)?[ \t]+)?[\x01-\x08\x0B\x0C\x0E-\x1F\x7F!-'*-\[\]-~]|(?:\\[\x01-\x09\x0B\x0C\x0E- \x7F]))*(?:(?:[ \t]*\r\n)?[ \t]+)?\))*(?:(?:(?:(?:[ \t]*\r\n)?[ \t]+)?\((?:(?:(?:[ \t]*\r\n)?[ \t]+)? [\x01-\x08\x0B\x0C\x0E-\x1F\x7F!-'*-\[\]-~]|(?:\\[\x01-\x09\x0B\x0C\x0E-\x7F]))*(?:(?:[ \t]*\r\n)? 
[ \t]+)?\))|(?:(?:[ \t]*\r\n)?[ \t]+)))?))(?:(?:(?:[ \t]*\r\n)?[ \t]+)(?:(?:(?:(?:(?:(?:[ \t]*\r\n)?[ \t]+)?\((?:(?:(?:[ \t]*\r\n)?[ \t]+)?[\x01-\x08\x0B\x0C\x0E-\x1F\x7F!-'*-\[\]-~]|(?:\\[\x01- \x09\x0B\x0C\x0E-\x7F]))*(?:(?:[ \t]*\r\n)?[ \t]+)?\))*(?:(?:(?:(?:[ \t]*\r\n)?[ \t]+)?\((?:(?:(?: [ \t]*\r\n)?[ \t]+)?[\x01-\x08\x0B\x0C\x0E-\x1F\x7F!-'*-\[\]-~]|(?:\\[\x01-\x09\x0B\x0C\x0E- \x7F]))*(?:(?:[ \t]*\r\n)?[ \t]+)?\))|(?:(?:[ \t]*\r\n)?[ \t]+)))?[a-zA-Z0-9!#-'*+\-/=?^-`{-~.\[]]+ (?:(?:(?:(?:[ \t]*\r\n)?[ \t]+)?\((?:(?:(?:[ \t]*\r\n)?[ \t]+)?[\x01-\x08\x0B\x0C\x0E-\x1F\x7F!-'*-\ [\]-~]|(?:\\[\x01-\x09\x0B\x0C\x0E-\x7F]))*(?:(?:[ \t]*\r\n)?[ \t]+)?\))*(?:(?:(?:(?:[ \t]*\r\n)? [ \t]+)?\((?:(?:(?:[ \t]*\r\n)?[ \t]+)?[\x01-\x08\x0B\x0C\x0E-\x1F\x7F!-'*-\[\]-~]|(?:\\[\x01- \x09\x0B\x0C\x0E-\x7F]))*(?:(?:[ \t]*\r\n)?[ \t]+)?\))|(?:(?:[ \t]*\r\n)?[ \t]+)))?)|(?:(?:(?:(?:(?:[ \t]*\r\n)?[ \t]+)?\((?:(?:(?:[ \t]*\r\n)?[ \t]+)?[\x01-\x08\x0B\x0C\x0E-\x1F\x7F!-'*-\[\]-~]|(?:\\

Slide 23

Slide 23 text

Personal names

Slide 24

Slide 24 text

Marit van Dijk
Simone de Gijt
Peter Hilton
tussenvoegsel

Slide 25

Slide 25 text

Falsehoods Programmers Believe About Names, Patrick McKenzie (@patio11)
24. My system will never have to deal with names from China.
25. Or Japan.
26. Or Korea.
27. Or Ireland, the United Kingdom, the United States, Spain, Mexico, Brazil, Peru, Russia, Sweden, Botswana, South Africa, Trinidad, Haiti, France, or the Klingon Empire, all of which have “weird” naming schemes in common use.
https://www.kalzumeus.com/2010/06/17/falsehoods-programmers-believe-about-names/

Slide 26

Slide 26 text

Naming conventions depend on the country: first name + last name is basically racist https://hilton.org.uk/blog/respect-personal-names

Slide 27

Slide 27 text

Guidelines
W3C offers examples and guidance for modelling:
1. Allow any character
 → letters, spaces, punctuation, digits, etc.
2. Allow long names
3. Don’t parse or split names (first name + last name)
4. Collect variants for different purposes
 → Sort by, What should we call you?
https://www.w3.org/International/questions/qa-personal-names
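
A possible shape for such a model, as a sketch only. The `PersonName` type and its field names are hypothetical, and the example values stand in for whatever the person chooses to enter. The point is one unparsed name field plus optional variants, each collected for a stated purpose.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PersonName:
    """Hypothetical name model: one free-text field, plus purpose-specific variants."""
    full_name: str                 # any characters, any length, never split
    sort_as: Optional[str] = None  # answer to "Sort by"
    call_me: Optional[str] = None  # answer to "What should we call you?"

    def sort_key(self) -> str:
        # Fall back to the full name instead of guessing a "last name".
        return (self.sort_as or self.full_name).casefold()

    def greeting(self) -> str:
        return f"Hello {self.call_me or self.full_name}"


with_variants = PersonName("Marit van Dijk", sort_as="Dijk, Marit van", call_me="Marit")
without_variants = PersonName("Peter Hilton")

print(with_variants.greeting(), "/", with_variants.sort_key())
print(without_variants.greeting(), "/", without_variants.sort_key())
```

When a variant is missing, the sketch falls back to the full name as entered, so the software never derives a name the person did not give.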

Slide 28

Slide 28 text

Your customers take this personally For more examples of domain model name acceptance failures, follow @yournameisvalid on Twitter

Slide 29

Slide 29 text

Genders

Slide 30

Slide 30 text

There is more than one thing wrong with this form field!

Slide 31

Slide 31 text

Photo @QuietMisdreavus, design by telegraham

Slide 32

Slide 32 text

Article 5 (General Data Protection Regulation) Principles relating to processing of personal data 1. Personal data shall be: (c) adequate, relevant and limited to what is necessary in relation to the purposes for which they are processed (‘data minimisation’) https://eur-lex.europa.eu/eli/reg/2016/679/oj#d1e1807-1-1

Slide 33

Slide 33 text

Article 9 (General Data Protection Regulation) Processing of special categories of personal data 1. Processing of personal data revealing racial or ethnic origin, political opinions, religious or philosophical beliefs, or trade union membership, and the processing of […] data concerning a natural person’s sex life or sexual orientation shall be prohibited. https://eur-lex.europa.eu/eli/reg/2016/679/oj#d1e2051-1-1

Slide 34

Slide 34 text

Build unisex software
1. Don’t ask people their gender
 (or require gendered personal titles)
2. Learn about the GDPR restrictions on personal data,
 and data minimisation
3. Don’t limit input to two options if you do need to know
 (and get that need to know in writing from legal)
https://hilton.org.uk/blog/build-unisex-software
https://hilton.org.uk/blog/refactor-boolean-enumeration

Slide 35

Slide 35 text

Telephone numbers

Slide 36

Slide 36 text

☎ Telephone numbers
Telephone numbers only use digits… except for the punctuation.
And they’re numbers… except for the significant leading zeroes.
Different formats for the same telephone number:
(010)-790 0185
0031 10 790 01 85
+31107900185
tel:+31107900185
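
To show that the four formats above denote one number, here is a sketch that normalises them to a single text form. The function name `normalise_nl_number` is hypothetical, and the rules it applies are assumptions that only hold for numbers dialled from the Netherlands: country calling code 31, trunk prefix 0, international prefix 00.

```python
import re


def normalise_nl_number(raw: str) -> str:
    """Return the number in +31... form, as text. Assumes a Dutch context."""
    text = raw.strip()
    if text.lower().startswith("tel:"):
        text = text[4:]                 # drop the tel URI scheme
    has_plus = text.startswith("+")
    digits = re.sub(r"\D", "", text)    # drop spaces, brackets and hyphens
    if has_plus:
        return "+" + digits             # already international
    if digits.startswith("00"):
        return "+" + digits[2:]         # international prefix 00
    if digits.startswith("0"):
        return "+31" + digits[1:]       # national number with trunk prefix 0
    raise ValueError(f"cannot interpret {raw!r} as a Dutch telephone number")


formats = ["(010)-790 0185", "0031 10 790 01 85", "+31107900185", "tel:+31107900185"]
normalised = {normalise_nl_number(value) for value in formats}
print(normalised)
```

The result stays a string throughout, since the leading zeroes and the plus sign carry meaning that a numeric type would discard.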

Slide 37

Slide 37 text

Telephone number standards
E.123 Notation for national and international… (2001):
 ‘9.1 Grouping of digits in a telephone number should be accomplished by means [of] spaces’
RFC 3966 The tel URI for Telephone Numbers (2004):
 ‘even though ITU-T E.123 recommends the use of space characters as visual separators […] “tel” URIs MUST NOT use spaces in visual separators to avoid excessive escaping’
https://hilton.org.uk/blog/telephone-number-formats

Slide 38

Slide 38 text

Bank account numbers

Slide 39

Slide 39 text

International Bank Account Number (IBAN)
ISO 13616:1997
😀 15-32 letters (A-Z) and digits
😀 Starts with an ISO 3166-1 alpha-2 country code
😀 Two check digits prevent the most common errors
😀 Includes bank code and account number
😀 Used in Europe, North Africa, Middle East, Caribbean
https://en.wikipedia.org/wiki/International_Bank_Account_Number
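
The check digits can be verified with the standard mod-97 calculation: move the first four characters to the end, replace each letter with a number (A = 10 … Z = 35), and test that the result leaves remainder 1 when divided by 97. A minimal sketch follows; the function name `iban_is_valid` is hypothetical, and it checks only the checksum, not country-specific lengths or bank codes.

```python
def iban_is_valid(iban: str) -> bool:
    """Check the IBAN check digits only (mod-97), ignoring per-country formats."""
    compact = iban.replace(" ", "").upper()
    if not (compact.isascii() and compact.isalnum()):
        return False
    if not (compact[:2].isalpha() and compact[2:4].isdigit()):
        return False                      # country code, then two check digits
    rearranged = compact[4:] + compact[:4]
    # int(ch, 36) maps "0".."9" to 0..9 and "A".."Z" to 10..35.
    numeric = "".join(str(int(ch, 36)) for ch in rearranged)
    return int(numeric) % 97 == 1


print(iban_is_valid("GB82 WEST 1234 5698 7654 32"))  # example IBAN from the Wikipedia article
print(iban_is_valid("GB82 WEST 1234 5698 7654 33"))  # last digit changed
```

The value is handled as text from start to finish; it only becomes an integer inside the checksum calculation.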

Slide 40

Slide 40 text

😢 Not used in North America, Asia, Australasia, etc…

Slide 41

Slide 41 text

Bonus numbers

Slide 42

Slide 42 text

Aircraft tail numbers CardMapr

Slide 43

Slide 43 text

Aircraft tail numbers
 → PH-BXD
Not an ISO 3166-1 country code 😢
Not even a number 😭

Slide 44

Slide 44 text

Time zones

Slide 45

Slide 45 text

Bug taxonomy

Slide 46

Slide 46 text

Programming errors
Some errors are uncontroversially bugs, not differences of opinion or feature requests, e.g. opposite sort direction
First name          Last name      Born ‐
Giovanni Pierluigi  da Palestrina  1525
Thomas              Tallis         1505
Nicolas             Gombert        1495
Josquin             des Prez       1450
🎼

Slide 47

Slide 47 text

Under-specification
Some bugs are due to not specifying or implementing any consistent behaviour, e.g. random order in lists and tables
First name          Last name      Born
Giovanni Pierluigi  da Palestrina  1525
Nicolas             Gombert        1495
Josquin             des Prez       1450
Thomas              Tallis         1505
🎼

Slide 48

Slide 48 text

Wrong functionality
A more subjective kind of bug is due to consistent behaviour that isn’t what’s needed, e.g. sort by date instead of name
First name          Last name      Born ‐
Josquin             des Prez       1450
Nicolas             Gombert        1495
Thomas              Tallis         1505
Giovanni Pierluigi  da Palestrina  1525
🎼

Slide 49

Slide 49 text

No content

Slide 50

Slide 50 text

Wrong model
Some bugs lie deeper in the model and its assumptions, e.g. sort by last name and getting it wrong
First name          Last name ‐    Born
Giovanni Pierluigi  da Palestrina  1525
Josquin             des Prez       1450
Nicolas             Gombert        1495
Thomas              Tallis         1505
🎼
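
The table above can be reproduced with a correct sort over the wrong model. In this sketch, the third value of each tuple is a separately collected sort name; those values are assumptions made for illustration, not data from the talk.

```python
# (first name, last name, separately collected sort name)
composers = [
    ("Thomas", "Tallis", "Tallis"),
    ("Josquin", "des Prez", "Josquin"),
    ("Giovanni Pierluigi", "da Palestrina", "Palestrina"),
    ("Nicolas", "Gombert", "Gombert"),
]

# Wrong model: the code works as specified, but "last name" is the wrong sort key.
by_last_name = sorted(composers, key=lambda composer: composer[1].casefold())

# Different model: sort on a value collected for that purpose.
by_sort_name = sorted(composers, key=lambda composer: composer[2].casefold())

print([last for _, last, _ in by_last_name])
print([sort_name for _, _, sort_name in by_sort_name])
```

No unit test on the first sort would fail, since it does exactly what the first name + last name model asks for; the fix is a change to the model.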

Slide 51

Slide 51 text

Four kinds of bugs
1. Programming errors, e.g. sorting broken
2. Under-specification, e.g. unsorted
3. Wrong functionality, e.g. sort by date instead of name
4. Wrong model, e.g. sort by last name

Slide 52

Slide 52 text

Summary

Slide 53

Slide 53 text

Guidelines
1. Don’t try to standardise or validate personal names
2. Validate email addresses in multiple steps, not just regex
3. Build unisex software - remove gender from your models
4. Use ISO codes for languages, countries and currencies
5. Use Unicode CLDR for localised names, for ISO codes
6. Choose country lists carefully
7. Model identifiers called ‘numbers’ as text

Slide 54

Slide 54 text

Summary
1. Learn to recognise bugs that better tests can’t fix
2. There’s more to modelling than entity relationships
3. There’s more to localisation than language
4. Some standards are more useful than others
5. Figure out how not to be part of the problem
6. It’s social and political, rather than only technical

Slide 55

Slide 55 text

@PeterHilton http://hilton.org.uk/ Work with me - available for product manager roles www.linkedin.com/in/peterhilton/ B2B SaaS automation, Europe remote