Universal bugs

Bugs that live in the gap between coders and product managers

You can blame programmers for bugs caused by typos in the code, and you can blame your product manager or product owner for features that your customer doesn’t need, but who do you blame for the universal bugs that all software has sooner or later? We navigate time zones carefully, but names and numbers still catch us out. Our initial naive solutions don’t take account of real-world complexity, even for solved problems.

Developers need to know what telephone numbers, house numbers and aircraft tail numbers have in common, apart from predating computers. Attendees will discover different kinds of numbers, learn about validating email addresses and bank account numbers, and realise how unoriginal some bugs are. More important than these straightforward bugs, we’ll also discover the bugs we have to fix to make our software inclusive.

Peter Hilton

May 10, 2023


  1. That’s a nasty bug you’ve got there…

    Software has bugs, and then we fix them. Except when we don’t. Not all bugs appear because you didn’t write a unit test, or because the cat walked on your keyboard. Some bugs are endemic. Who can fix those?

    @PeterHilton
  2. Any attribute with the word number in its name must be modelled as text

    https://hilton.org.uk/blog/non-numeric-numbers
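
A minimal Python sketch of why this rule matters. The telephone number is the Dutch national number from the telephone slide; the house number `221B` is an illustrative assumption, not taken from the talk. Parsing as an integer silently drops a significant leading zero, and fails outright once a letter appears.

```python
# Treating a 'number' attribute as an integer loses information.
phone_as_text = "0107900185"        # national format: the leading zero is significant
phone_as_int = int(phone_as_text)   # the leading zero is lost here
round_trip = str(phone_as_int)

house_number = "221B"               # hypothetical house number with a letter suffix


def parses_as_int(value: str) -> bool:
    """Return True if int() accepts the value."""
    try:
        int(value)
        return True
    except ValueError:
        return False


print(round_trip == phone_as_text)   # False: the value did not survive the round trip
print(parses_as_int(house_number))   # False: not every 'number' is numeric
```
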
  3. Post codes

    - 117 of 190 Universal Postal Union countries use post codes
    - Most countries only use (3-10) digits
    - Some countries use letters as well
    - Only Ireland has a unique post code per address
    - Yet another country-specific value model
    - Ideally, you have each country’s up-to-date actual list

    https://en.wikipedia.org/wiki/Postal_code
  4. Countries

    By country, you probably mean sovereign state, cf. ISO 3166-1 alpha-2 code, Unicode CLDR localised name.

    The closest to a standard is United Nations membership, but 18 states are not universally recognised: 🇮🇱 🇰🇷 🇰🇵 🇨🇳 🇨🇾 🇦🇲 🇵🇸 🇹🇼 🇪🇭 🏳 🏳 🇽🇰 🏳 🏳 … four of which not even by the Apple Emoji font 🏳😢

    https://hilton.org.uk/blog/country-lists
  5. Standard text identifiers a.k.a. codes

    - ISO 3166-1 alpha-2 country (region) codes: → GB NL FR
    - ISO 4217 currency codes (country code plus one letter): → GBP NLG FRF EUR
    - ISO 639-1 language codes (lower-case!): → en nl fr
  6. Standard names and their translations

    | English | Dutch | French | Ukrainian |
    | --- | --- | --- | --- |
    | France | Frankrijk | France | Франція |
    | Netherlands | Nederland | Pays-Bas | Нідерланди |
    | Ukraine | Oekraïne | Ukraine | Україна |
    | United Kingdom | Verenigd Koninkrijk | Royaume-Uni | Велика Британія |
  7. Unicode Common Locale Data Repository (CLDR)

    Unicode CLDR provides standard lists and translations of: territories (including countries), currencies, time zones, languages, calendar names (quarters, months & weekdays), scripts (writing systems), units of measurement, etc.

    - ⚠ You have to use validity data to filter the lists
    - 📄 Unicode CLDR publishes XML and JSON on GitHub

    https://hilton.org.uk/blog/l10n-cldr-names
  8. Sort order is language-dependent!

    | English | Dutch | French | Ukrainian |
    | --- | --- | --- | --- |
    | France | Frankrijk | France | Франція |
    | Netherlands | Nederland | Pays-Bas | Нідерланди |
    | Ukraine | Oekraïne | Ukraine | Україна |
    | United Kingdom | Verenigd Koninkrijk | Royaume-Uni | Велика Британія |
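
A small Python sketch of the point on this slide, using only the names shown on it. The names are keyed by ISO 3166-1 alpha-2 code, and the same four countries come out in a different order depending on the language used for display. Plain code point order stands in for real collation here, which is an assumption that only holds for this small sample; production code would more likely use a locale-aware collator.

```python
# Localised names from the slide, keyed by ISO 639-1 language code,
# then by ISO 3166-1 alpha-2 country code.
NAMES = {
    "en": {"FR": "France", "NL": "Netherlands", "UA": "Ukraine", "GB": "United Kingdom"},
    "nl": {"FR": "Frankrijk", "NL": "Nederland", "UA": "Oekraïne", "GB": "Verenigd Koninkrijk"},
    "fr": {"FR": "France", "NL": "Pays-Bas", "UA": "Ukraine", "GB": "Royaume-Uni"},
    "uk": {"FR": "Франція", "NL": "Нідерланди", "UA": "Україна", "GB": "Велика Британія"},
}


def codes_sorted_by_name(language: str) -> list[str]:
    """Return country codes ordered by their name in the given language.

    Assumption: code point order is a stand-in for proper collation.
    """
    names = NAMES[language]
    return sorted(names, key=lambda code: names[code])


for language in NAMES:
    print(language, codes_sorted_by_name(language))
```

Sorting a country list once by its English names and then translating the labels would therefore leave the French and Ukrainian lists out of order.
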
  9. Email addresses

    There’s a standard for email addresses. So what’s the problem?

    1. Several updated/replaced standards: RFC 822 → RFC 2822 → RFC 5322 → RFC 6854
    2. Four levels of email address validation: RFC format + domain + mailbox exists + correct person
    3. Security risks of supporting the whole standard

    https://hilton.org.uk/blog/mail-address-validation
  10. (Slide showing a very long regular expression for validating email addresses, truncated in this transcript)
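
The four validation levels listed on the email addresses slide could be staged roughly as follows. This is a sketch under stated assumptions, not the speaker's implementation: `check_email`, `EmailCheck` and the `domain_accepts_mail` hook are hypothetical names, and the format check is deliberately simple rather than a full RFC 5322 parser.

```python
from enum import Enum
from typing import Callable


class EmailCheck(Enum):
    BAD_FORMAT = "bad format"
    BAD_DOMAIN = "domain does not accept mail"
    UNCONFIRMED = "plausible, awaiting confirmation"


def check_email(address: str, domain_accepts_mail: Callable[[str], bool]) -> EmailCheck:
    """Run the first two validation levels; the last two need a confirmation email.

    `domain_accepts_mail` is a hypothetical hook: a real implementation
    might look up the domain's MX records.
    """
    # Level 1: a deliberately simple format check, not a full RFC 5322 parser.
    local, separator, domain = address.rpartition("@")
    if not separator or not local or "." not in domain.strip("."):
        return EmailCheck.BAD_FORMAT
    # Level 2: does the domain exist and accept mail?
    if not domain_accepts_mail(domain):
        return EmailCheck.BAD_DOMAIN
    # Levels 3 and 4 (mailbox exists, correct person) cannot be checked offline:
    # send a confirmation message and wait for the recipient to respond.
    return EmailCheck.UNCONFIRMED


# Stand-in for a DNS lookup, for demonstration only.
KNOWN_DOMAINS = {"example.org"}


def accepts(domain: str) -> bool:
    return domain in KNOWN_DOMAINS


for candidate in ("alice@example.org", "alice.example.org", "alice@example.invalid"):
    print(candidate, "->", check_email(candidate, accepts).value)
```
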
  11. Falsehoods Programmers Believe About Names, Patrick McKenzie (@patio11)

    24. My system will never have to deal with names from China.
    25. Or Japan.
    26. Or Korea.
    27. Or Ireland, the United Kingdom, the United States, Spain, Mexico, Brazil, Peru, Russia, Sweden, Botswana, South Africa, Trinidad, Haiti, France, or the Klingon Empire, all of which have “weird” naming schemes in common use.

    https://www.kalzumeus.com/2010/06/17/falsehoods-programmers-believe-about-names/
  12. Naming conventions depend on the country: first name + last name is basically racist

    https://hilton.org.uk/blog/respect-personal-names
  13. Guidelines

    W3C offers examples and guidance for modelling:

    1. Allow any character → letters, spaces, punctuation, digits, etc.
    2. Allow long names
    3. Don’t parse or split names (first name + last name)
    4. Collect variants for different purposes → Sort by, What should we call you?

    https://www.w3.org/International/questions/qa-personal-names
  14. Your customers take this personally

    For more examples of domain model name acceptance failures, follow @yournameisvalid on Twitter
  15. Article 5 (General Data Protection Regulation): Principles relating to processing of personal data

    1. Personal data shall be: (c) adequate, relevant and limited to what is necessary in relation to the purposes for which they are processed (‘data minimisation’)

    https://eur-lex.europa.eu/eli/reg/2016/679/oj#d1e1807-1-1
  16. Article 9 (General Data Protection Regulation): Processing of special categories of personal data

    1. Processing of personal data revealing racial or ethnic origin, political opinions, religious or philosophical beliefs, or trade union membership, and the processing of […] data concerning a natural person’s sex life or sexual orientation shall be prohibited.

    https://eur-lex.europa.eu/eli/reg/2016/679/oj#d1e2051-1-1
  17. Build unisex software

    1. Don’t ask people their gender (or require gendered personal titles)
    2. Learn about the GDPR restrictions on personal data, and data minimisation
    3. Don’t limit input to two options if you do need to know (and get that need to know in writing from legal)

    https://hilton.org.uk/blog/build-unisex-software
    https://hilton.org.uk/blog/refactor-boolean-enumeration
  18. ☎ Telephone numbers

    Telephone numbers only use digits… except for the punctuation. And they’re numbers… except for the significant leading zeroes.

    Different formats for the same telephone number:

    - (010)-790 0185
    - 0031 10 790 01 85
    - +31107900185
    - tel:+31107900185
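
As a rough illustration, the four formats on this slide can be normalised to one canonical text value. The sketch below assumes Dutch conventions (`00` as the international call prefix, a single leading `0` as the national trunk prefix, country code `31`); the function name `to_e164` is hypothetical, and production code would more likely rely on a dedicated telephone number library than on rules like these.

```python
import re


def to_e164(raw: str, country_code: str = "31") -> str:
    """Normalise a telephone number to '+' followed by digits only.

    Assumptions (Dutch conventions): '00' is the international call prefix
    and a single leading '0' is the national trunk prefix.
    """
    number = raw.strip().removeprefix("tel:")
    # Drop everything except digits: spaces, brackets, hyphens and the plus sign.
    digits = re.sub(r"\D", "", number)
    if number.startswith("+"):
        return "+" + digits
    if digits.startswith("00"):
        return "+" + digits[2:]
    if digits.startswith("0"):
        return "+" + country_code + digits[1:]
    raise ValueError(f"cannot tell which prefix applies to {raw!r}")


FORMATS = ["(010)-790 0185", "0031 10 790 01 85", "+31107900185", "tel:+31107900185"]
for text in FORMATS:
    print(text, "->", to_e164(text))
```

The result is still a string, in line with the rule that attributes called "number" are modelled as text.
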
  19. Telephone number standards

    E.123 Notation for national and international… (2001): ‘9.1 Grouping of digits in a telephone number should be accomplished by means [of] spaces’

    RFC 3966 The tel URI for Telephone Numbers (2004): ‘even though ITU-T E.123 recommends the use of space characters as visual separators […] “tel” URIs MUST NOT use spaces in visual separators to avoid excessive escaping’

    https://hilton.org.uk/blog/telephone-number-formats
  20. International Bank Account Number (IBAN)

    ISO 13616:1997

    - 😀 15-32 letters (A-Z) and digits
    - 😀 Starts with an ISO 3166-1 alpha-2 country code
    - 😀 Two check digits prevent the most common errors
    - 😀 Includes bank code and account number
    - 😀 Used in Europe, North Africa, Middle East, Caribbean

    https://en.wikipedia.org/wiki/International_Bank_Account_Number
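
The two check digits can be verified with the standard mod-97 calculation. The sketch below is a minimal version that checks only the country prefix shape and the check digits; it does not check country-specific lengths or bank codes. The function name `is_valid_iban` is hypothetical, and the sample value is the example IBAN from the Wikipedia article linked on the slide.

```python
def is_valid_iban(iban: str) -> bool:
    """Check an IBAN's country prefix shape and its mod-97 check digits."""
    compact = iban.replace(" ", "").upper()
    if len(compact) < 5 or not (compact.isascii() and compact.isalnum()):
        return False
    if not (compact[:2].isalpha() and compact[2:4].isdigit()):
        return False
    # Move the country code and check digits to the end,
    # then replace each letter with two digits (A=10 ... Z=35).
    rearranged = compact[4:] + compact[:4]
    as_digits = "".join(str(int(char, 36)) for char in rearranged)
    return int(as_digits) % 97 == 1


print(is_valid_iban("GB82 WEST 1234 5698 7654 32"))  # True
print(is_valid_iban("GB82 WEST 1234 5698 7654 33"))  # False: one digit changed
```
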
  21. Programming errors

    Some errors are uncontroversially bugs, not differences of opinion or feature requests, e.g. opposite sort direction

    | First name | Last name | Born ‐ |
    | --- | --- | --- |
    | Giovanni Pierluigi | da Palestrina | 1525 |
    | Thomas | Tallis | 1505 |
    | Nicolas | Gombert | 1495 |
    | Josquin | des Prez | 1450 |

    🎼
  22. Under-specification

    Some bugs are due to not specifying or implementing any consistent behaviour, e.g. random order in lists and tables

    | First name | Last name | Born |
    | --- | --- | --- |
    | Giovanni Pierluigi | da Palestrina | 1525 |
    | Nicolas | Gombert | 1495 |
    | Josquin | des Prez | 1450 |
    | Thomas | Tallis | 1505 |

    🎼
  23. Wrong functionality

    A more subjective kind of bug is due to consistent behaviour that isn’t what’s needed, e.g. sort by date instead of name

    | First name | Last name | Born ‐ |
    | --- | --- | --- |
    | Josquin | des Prez | 1450 |
    | Nicolas | Gombert | 1495 |
    | Thomas | Tallis | 1505 |
    | Giovanni Pierluigi | da Palestrina | 1525 |

    🎼
  24. Wrong model

    Some bugs lie deeper in the model and its assumptions, e.g. sort by last name and getting it wrong

    | First name | Last name ‐ | Born |
    | --- | --- | --- |
    | Giovanni Pierluigi | da Palestrina | 1525 |
    | Josquin | des Prez | 1450 |
    | Nicolas | Gombert | 1495 |
    | Thomas | Tallis | 1505 |

    🎼
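
One way to read this slide alongside the W3C guideline to collect variants for different purposes is to store an explicit sort key instead of deriving one from a "last name" field. In the sketch below, the first and last name values follow the slide's table, while the `sort_as` values are an assumption about how these composers are usually indexed, not data from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Composer:
    first_name: str  # the naive model from the slide's table
    last_name: str
    sort_as: str     # assumption: a sort key collected explicitly, not derived
    born: int


COMPOSERS = [
    Composer("Giovanni Pierluigi", "da Palestrina", "Palestrina", 1525),
    Composer("Josquin", "des Prez", "Josquin", 1450),
    Composer("Nicolas", "Gombert", "Gombert", 1495),
    Composer("Thomas", "Tallis", "Tallis", 1505),
]

# Wrong model: sorting on the 'last name' column puts the particles first.
by_last_name = sorted(COMPOSERS, key=lambda c: c.last_name.casefold())
# Alternative: sort on the key that was collected for this purpose.
by_sort_key = sorted(COMPOSERS, key=lambda c: c.sort_as.casefold())

print([c.last_name for c in by_last_name])
# ['da Palestrina', 'des Prez', 'Gombert', 'Tallis']
print([c.sort_as for c in by_sort_key])
# ['Gombert', 'Josquin', 'Palestrina', 'Tallis']
```
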
  25. Four kinds of bugs

    1. Programming errors, e.g. sorting broken
    2. Under-specification, e.g. unsorted
    3. Wrong functionality, e.g. sort by date instead of name
    4. Wrong model, e.g. sort by last name
  26. Guidelines

    1. Don’t try to standardise or validate personal names
    2. Validate email addresses in multiple steps, not just regex
    3. Build unisex software - remove gender from your models
    4. Use ISO codes for languages, countries and currencies
    5. Use Unicode CLDR for localised names, for ISO codes
    6. Choose country lists carefully
    7. Model identifiers called ‘numbers’ as text
  27. Summary

    1. Learn to recognise bugs that better tests can’t fix
    2. There’s more to modelling than entity relationships
    3. There’s more to localisation than language
    4. Some standards are more useful than others
    5. Figure out how not to be part of the problem
    6. It’s social and political, rather than only technical
  28. @PeterHilton

    http://hilton.org.uk/

    Work with me - available for product manager roles

    www.linkedin.com/in/peterhilton/

    B2B SaaS automaton, Europe remote