Universal bugs

Bugs that live in the gap between coders and product managers

You can blame programmers for bugs caused by typos in the code, and you can blame your product manager or product owner for features that your customer doesn’t need, but who do you blame for the universal bugs that all software has sooner or later? We navigate time zones carefully, but names and numbers still catch us out. Our initial naive solutions don’t take account of real-world complexity, even for solved problems.

Developers need to know what telephone numbers, house numbers and aircraft tail numbers have in common, apart from predating computers. Attendees will discover different kinds of numbers, learn about validating email addresses and bank account numbers, and realise how unoriginal some bugs are. More important than the straightforward bugs, we’ll also discover the bugs we have to fix to make our software inclusive.

Peter Hilton

May 10, 2023

Transcript

  1. @PeterHilton
    http://hilton.org.uk/
    Universal Bugs


    🐛 🐛 🐛 🐛 🐛 🐛 🐛 🐛 🐛 🐛
    http://hilton.org.uk/tag/ddd

  2. That’s a nasty bug you’ve got there…
    Software has bugs, and then we fix them.


    Except when we don’t.


    Not all bugs appear because you didn’t write a unit test, or
    because the cat walked on your keyboard.


    Some bugs are endemic. Who can fix those?

  3. House numbers

  4. House numbers
    Peter Hilton

  5. Any attribute with the word ‘number’ in its name must be modelled as text


    https://hilton.org.uk/blog/non-numeric-numbers
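
A minimal Python sketch of why the rule above matters. The two values are illustrative assumptions, not data from the talk: a house number with a letter suffix and a telephone number with a significant leading zero.

```python
# Illustrative values only (assumptions for this sketch).
house_number = "12A"          # house numbers often contain letters
phone_number = "0107900185"   # the leading zero carries meaning


def is_plain_integer(value: str) -> bool:
    """Return True when the text consists only of ASCII digits."""
    return value.isascii() and value.isdigit()


# Converting to an integer either fails or silently loses information.
round_tripped = str(int(phone_number))  # the leading zero is dropped

print(is_plain_integer(house_number))   # the letter suffix rules out int
print(round_tripped == phone_number)    # the round trip changed the value
```

Storing both values as text avoids the failed conversion and the lost zero.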

  6. Stadscykel / CC0

  7. Post codes
    117 of 190 Universal Postal Union countries use post codes


    Most countries only use (3-10) digits


    Some countries use letters as well


    Only Ireland has a unique post code per address


    Yet another country-specific value model


    Ideally, you have each country’s up-to-date actual list


    https://en.wikipedia.org/wiki/Postal_code

  8. Country lists

  9. Countries
    Visible Earth, NASA

  10. Countries
    By country, you probably mean sovereign state.


    cf. ISO 3166-1 alpha-2 code, Unicode CLDR localised name.


    The closest to a standard is United Nations membership,
    but 18 states are not universally recognised:


    🇮🇱 🇰🇷 🇰🇵 🇨🇳 🇨🇾 🇦🇲 🇵🇸 🇹🇼 🇪🇭 🏳 🏳 🇽🇰 🏳 🏳


    … four of which not even by the Apple Emoji font 🏳😢


    https://hilton.org.uk/blog/country-lists

  11. Standard text identifiers a.k.a. codes
    ISO 3166-1 alpha-2 country (region) codes:


    → GB NL FR


    ISO 4217 currency codes (country code plus one letter):


    → GBP NLG FRF EUR


    ISO 639-1 language codes (lower-case!):


    → en nl fr

  12. Country names

  13. Standard names and their translations
    English         Dutch                French       Ukrainian
    France          Frankrijk            France       Франція
    Netherlands     Nederland            Pays-Bas     Нідерланди
    Ukraine         Oekraïne             Ukraine      Україна
    United Kingdom  Verenigd Koninkrijk  Royaume-Uni  Велика Британія

  14. Unicode Common Locale Data Repository (CLDR)
    Unicode CLDR provides standard lists and translations of:


    territories (including countries), currencies, time zones,


    languages, calendar names (quarters, months & weekdays),


    scripts (writing systems), units of measurement, etc.


    ⚠ You have to use validity data to filter the lists


    📄 Unicode CLDR publishes XML and JSON on GitHub


    https://hilton.org.uk/blog/l10n-cldr-names

  15. Country list order

  16. Sort order is language-dependent!
    English         Dutch                French       Ukrainian
    France          Frankrijk            France       Франція
    Netherlands     Nederland            Pays-Bas     Нідерланди
    Ukraine         Oekraïne             Ukraine      Україна
    United Kingdom  Verenigd Koninkrijk  Royaume-Uni  Велика Британія
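
A small Python sketch of the effect, using only the four names from the slide. The `NAMES` table and the `country_order` helper are assumptions made for illustration. The plain code-point sort is enough for these four names only; a real country list needs locale-aware collation, for example from an ICU binding.

```python
# Names copied from the slide; keys are ISO 3166-1 alpha-2 codes.
NAMES = {
    "en": {"FR": "France", "NL": "Netherlands",
           "UA": "Ukraine", "GB": "United Kingdom"},
    "nl": {"FR": "Frankrijk", "NL": "Nederland",
           "UA": "Oekraïne", "GB": "Verenigd Koninkrijk"},
    "fr": {"FR": "France", "NL": "Pays-Bas",
           "UA": "Ukraine", "GB": "Royaume-Uni"},
    "uk": {"FR": "Франція", "NL": "Нідерланди",
           "UA": "Україна", "GB": "Велика Британія"},
}


def country_order(language: str) -> list[str]:
    """Return country codes ordered by their name in the given language.

    Naive code-point ordering: adequate for this sample, not for real lists.
    """
    names = NAMES[language]
    return sorted(names, key=lambda code: names[code])


for language in NAMES:
    print(language, country_order(language))
```

The same four countries come out in a different order in French and Ukrainian than in English, so a list sorted once by English name is wrong for other locales.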

  17. Email addresses

  18. Email addresses
    There’s a standard for email addresses.


    So what’s the problem?


    1. Several updated/replaced standards:

    RFC 822 → RFC 2822 → RFC 5322 → RFC 6854


    2. Four levels of email address validation:

    RFC format + domain + mailbox exists + correct person


    3. Security risks of supporting the whole standard


    https://hilton.org.uk/blog/mail-address-validation
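
The levels of validation can be sketched in Python as separate steps. This is an illustration under assumptions, not a complete validator: the format check is deliberately loose rather than full RFC 5322, and `domain_accepts_mail` is a hypothetical callable standing in for a DNS lookup. The last two levels (mailbox exists, correct person) can only be established by sending a confirmation message.

```python
from typing import Callable


def looks_like_address(address: str) -> bool:
    """Level 1: a loose format check (one '@', a dotted domain, no spaces)."""
    local, separator, domain = address.rpartition("@")
    return (bool(separator) and bool(local)
            and "." in domain and " " not in address)


def validate(address: str,
             domain_accepts_mail: Callable[[str], bool]) -> str:
    """Run the cheap checks first and report the next action to take."""
    if not looks_like_address(address):
        return "bad format"
    domain = address.rpartition("@")[2]
    # Level 2: hypothetical domain check, injected so it can be faked.
    if not domain_accepts_mail(domain):
        return "unknown domain"
    # Levels 3 and 4 need a confirmation message, not more parsing.
    return "send confirmation message"


print(validate("peter@example.org", lambda domain: True))
print(validate("no-at-sign", lambda domain: True))
```

Passing the domain check in as a function keeps the sketch runnable without network access and makes each level testable on its own.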

  19. (((?:(?:(?:(?:(?:(?:[ \t]*\r\n)?[ \t]+)?\((?:(?:(?:[ \t]*\r\n)?[ \t]+)?[\x01-\x08\x0B\x0C\x0E-
    \x1F\x7F!-'*-\[\]-~]|(?:\\[\x01-\x09\x0B\x0C\x0E-\x7F]))*(?:(?:[ \t]*\r\n)?[ \t]+)?\))*(?:(?:(?:(?:
    [ \t]*\r\n)?[ \t]+)?\((?:(?:(?:[ \t]*\r\n)?[ \t]+)?[\x01-\x08\x0B\x0C\x0E-\x1F\x7F!-'*-\[\]-~]|(?:\\
    [\x01-\x09\x0B\x0C\x0E-\x7F]))*(?:(?:[ \t]*\r\n)?[ \t]+)?\))|(?:(?:[ \t]*\r\n)?[ \t]+)))?[a-zA-Z0-9!
    #-'*+\-/=?^-`{-~.\[]]+(?:(?:(?:(?:[ \t]*\r\n)?[ \t]+)?\((?:(?:(?:[ \t]*\r\n)?[ \t]+)?[\x01-
    \x08\x0B\x0C\x0E-\x1F\x7F!-'*-\[\]-~]|(?:\\[\x01-\x09\x0B\x0C\x0E-\x7F]))*(?:(?:[ \t]*\r\n)?[ \t]+)?
    \))*(?:(?:(?:(?:[ \t]*\r\n)?[ \t]+)?\((?:(?:(?:[ \t]*\r\n)?[ \t]+)?[\x01-\x08\x0B\x0C\x0E-
    \x1F\x7F!-'*-\[\]-~]|(?:\\[\x01-\x09\x0B\x0C\x0E-\x7F]))*(?:(?:[ \t]*\r\n)?[ \t]+)?\))|(?:(?:
    [ \t]*\r\n)?[ \t]+)))?)|(?:(?:(?:(?:(?:[ \t]*\r\n)?[ \t]+)?\((?:(?:(?:[ \t]*\r\n)?[ \t]+)?[\x01-
    \x08\x0B\x0C\x0E-\x1F\x7F!-'*-\[\]-~]|(?:\\[\x01-\x09\x0B\x0C\x0E-\x7F]))*(?:(?:[ \t]*\r\n)?[ \t]+)?
    \))*(?:(?:(?:(?:[ \t]*\r\n)?[ \t]+)?\((?:(?:(?:[ \t]*\r\n)?[ \t]+)?[\x01-\x08\x0B\x0C\x0E-
    \x1F\x7F!-'*-\[\]-~]|(?:\\[\x01-\x09\x0B\x0C\x0E-\x7F]))*(?:(?:[ \t]*\r\n)?[ \t]+)?\))|(?:(?:
    [ \t]*\r\n)?[ \t]+)))?"(?>(?:(?:[ \t]*\r\n)?[ \t]+)?(?:[\x01-\x08\x0B\x0C\x0E-\x1F\x7F!#-\[\]-~]|(?:\
    \[\x01-\x09\x0B\x0C\x0E-\x7F])))*(?:(?:[ \t]*\r\n)?[\t]+)?"(?:(?:(?:(?:[ \t]*\r\n)?[ \t]+)?\((?:(?:
    (?:[ \t]*\r\n)?[ \t]+)?[\x01-\x08\x0B\x0C\x0E-\x1F\x7F!-'*-\[\]-~]|(?:\\[\x01-\x09\x0B\x0C\x0E-
    \x7F]))*(?:(?:[ \t]*\r\n)?[ \t]+)?\))*(?:(?:(?:(?:[ \t]*\r\n)?[ \t]+)?\((?:(?:(?:[ \t]*\r\n)?[ \t]+)?
    [\x01-\x08\x0B\x0C\x0E-\x1F\x7F!-'*-\[\]-~]|(?:\\[\x01-\x09\x0B\x0C\x0E-\x7F]))*(?:(?:[ \t]*\r\n)?
    [ \t]+)?\))|(?:(?:[ \t]*\r\n)?[ \t]+)))?))(?:(?:(?:[ \t]*\r\n)?[ \t]+)(?:(?:(?:(?:(?:(?:[ \t]*\r\n)?[
    \t]+)?\((?:(?:(?:[ \t]*\r\n)?[ \t]+)?[\x01-\x08\x0B\x0C\x0E-\x1F\x7F!-'*-\[\]-~]|(?:\\[\x01-
    \x09\x0B\x0C\x0E-\x7F]))*(?:(?:[ \t]*\r\n)?[ \t]+)?\))*(?:(?:(?:(?:[ \t]*\r\n)?[ \t]+)?\((?:(?:(?:
    [ \t]*\r\n)?[ \t]+)?[\x01-\x08\x0B\x0C\x0E-\x1F\x7F!-'*-\[\]-~]|(?:\\[\x01-\x09\x0B\x0C\x0E-
    \x7F]))*(?:(?:[ \t]*\r\n)?[ \t]+)?\))|(?:(?:[ \t]*\r\n)?[ \t]+)))?[a-zA-Z0-9!#-'*+\-/=?^-`{-~.\[]]+
    (?:(?:(?:(?:[ \t]*\r\n)?[ \t]+)?\((?:(?:(?:[ \t]*\r\n)?[ \t]+)?[\x01-\x08\x0B\x0C\x0E-\x1F\x7F!-'*-\
    [\]-~]|(?:\\[\x01-\x09\x0B\x0C\x0E-\x7F]))*(?:(?:[ \t]*\r\n)?[ \t]+)?\))*(?:(?:(?:(?:[ \t]*\r\n)?
    [ \t]+)?\((?:(?:(?:[ \t]*\r\n)?[ \t]+)?[\x01-\x08\x0B\x0C\x0E-\x1F\x7F!-'*-\[\]-~]|(?:\\[\x01-
    \x09\x0B\x0C\x0E-\x7F]))*(?:(?:[ \t]*\r\n)?[ \t]+)?\))|(?:(?:[ \t]*\r\n)?[ \t]+)))?)|(?:(?:(?:(?:(?:[
    \t]*\r\n)?[ \t]+)?\((?:(?:(?:[ \t]*\r\n)?[ \t]+)?[\x01-\x08\x0B\x0C\x0E-\x1F\x7F!-'*-\[\]-~]|(?:\\

  20. Personal names

  21. tussenvoegsel (Dutch: a surname prefix such as ‘van’ or ‘de’)
    Marit van Dijk

    Simone de Gijt

    Peter Hilton

  22. Falsehoods Programmers Believe About Names,
    Patrick McKenzie (@patio11)
    24. My system will never have to deal with names from
    China. 25. Or Japan. 26. Or Korea. 27. Or Ireland, the
    United Kingdom, the United States, Spain, Mexico, Brazil,
    Peru, Russia, Sweden, Botswana, South Africa, Trinidad,
    Haiti, France, or the Klingon Empire, all of which have
    “weird” naming schemes in common use.


    https://www.kalzumeus.com/2010/06/17/falsehoods-programmers-believe-about-names/

  23. Naming conventions
    depend on the country:


    first name + last name
    is basically racist


    https://hilton.org.uk/blog/respect-personal-names

  24. Guidelines
    W3C offers examples and guidance for modelling personal names:


    1. Allow any character

    → letters, spaces, punctuation, digits, etc.


    2. Allow long names


    3. Don’t parse or split names (first name + last name)


    4. Collect variants for different purposes

    → Sort by, What should we call you?


    https://www.w3.org/International/questions/qa-personal-names
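
One way to read guidelines 3 and 4 is sketched below in Python. The `PersonName` type, its field names and the `sort_as` and `call_me` values are all assumptions for illustration. The idea is that each person supplies these variants, so the software never parses them out of the full name.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PersonName:
    full_name: str  # stored exactly as entered, never split
    sort_as: str    # supplied by the person, used only for ordering
    call_me: str    # answer to "What should we call you?"


# Illustrative entries; the variants shown are assumed, not parsed.
people = [
    PersonName("Peter Hilton", "Hilton, Peter", "Peter"),
    PersonName("Marit van Dijk", "Dijk, Marit van", "Marit"),
    PersonName("Simone de Gijt", "Gijt, Simone de", "Simone"),
]

# Order by the collected variant instead of guessing a 'last name'.
ordered = sorted(people, key=lambda person: person.sort_as.casefold())
print([person.full_name for person in ordered])
```

Because the sort key is data rather than logic, a name with a tussenvoegsel needs no special-case code.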

  25. Your customers take this personally


    For more examples of domain model name acceptance failures,
    follow @yournameisvalid on Twitter

  26. There is more than one thing
    wrong with this form field!

  27. Photo @QuietMisdreavus, design by telegraham

  28. Article 5 (General Data Protection Regulation)


    Principles relating to processing of personal data
    1. Personal data shall be:


    (c) adequate, relevant and limited to what is
    necessary in relation to the purposes for which they
    are processed (‘data minimisation’)
    https://eur-lex.europa.eu/eli/reg/2016/679/oj#d1e1807-1-1

  29. Article 9 (General Data Protection Regulation)


    Processing of special categories of personal data
    1. Processing of personal data revealing racial or
    ethnic origin, political opinions, religious or
    philosophical beliefs, or trade union membership,
    and the processing of […] data concerning a natural
    person’s sex life or sexual orientation shall be
    prohibited.
    https://eur-lex.europa.eu/eli/reg/2016/679/oj#d1e2051-1-1

  30. Build unisex software
    1. Don’t ask people their gender

    (or require gendered personal titles)


    2. Learn about the GDPR restrictions on personal data,

    and data minimisation


    3. Don’t limit input to two options if you do need to know

    (and get that need to know in writing from legal)


    https://hilton.org.uk/blog/build-unisex-software


    https://hilton.org.uk/blog/refactor-boolean-enumeration

  31. Telephone numbers

  32. Telephone numbers
    Telephone numbers only use digits… except for the punctuation.


    And they’re numbers… except for the significant leading zeroes.


    Different formats for the same telephone number:


    (010)-790 0185


    0031 10 790 01 85


    +31107900185


    tel:+31107900185
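
A Python sketch that reduces the four formats above to one canonical text value. It assumes Dutch dialling conventions (country code 31, trunk prefix 0, international prefix 00), and the function name is hypothetical. A production system would more likely rely on a dedicated library such as libphonenumber.

```python
import re


def normalise_dutch_number(raw: str) -> str:
    """Return the number as '+' followed by digits, kept as text.

    Assumes the Netherlands: country code 31, trunk prefix '0',
    international call prefix '00'.
    """
    text = raw.strip()
    if text.lower().startswith("tel:"):
        text = text[4:]                  # drop the URI scheme
    has_plus = text.startswith("+")
    digits = re.sub(r"\D", "", text)     # remove spaces and punctuation
    if has_plus:
        return "+" + digits
    if digits.startswith("00"):
        return "+" + digits[2:]          # international prefix
    if digits.startswith("0"):
        return "+31" + digits[1:]        # national trunk prefix
    raise ValueError(f"cannot interpret {raw!r} as a Dutch number")


for example in ["(010)-790 0185", "0031 10 790 01 85",
                "+31107900185", "tel:+31107900185"]:
    print(normalise_dutch_number(example))
```

The result stays a string throughout, so the leading plus sign and any significant zeroes survive.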

  33. Telephone number standards
    E.123 Notation for national and international… (2001):


    ‘9.1 Grouping of digits in a telephone number should be
    accomplished by means [of] spaces’


    RFC 3966 The tel URI for Telephone Numbers (2004):


    ‘even though ITU-T E.123 recommends the use of space
    characters as visual separators […] “tel” URIs MUST NOT use
    spaces in visual separators to avoid excessive escaping’


    https://hilton.org.uk/blog/telephone-number-formats

  34. Bank account numbers

  35. International Bank Account Number (IBAN)


    ISO 13616:1997
    😀 15-32 letters (A-Z) and digits


    😀 Starts with an ISO 3166-1 alpha-2 country code


    😀 Two check digits prevent the most common errors


    😀 Includes bank code and account number


    😀 Used in Europe, North Africa, Middle East, Caribbean


    https://en.wikipedia.org/wiki/International_Bank_Account_Number
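
The check digits can be verified with the mod-97 calculation, sketched here in Python. The function name is an assumption, and the sketch checks only the general shape and the checksum, not country-specific lengths or bank codes. The sample value is the example IBAN used in the Wikipedia article linked above.

```python
import re


def iban_checksum_ok(iban: str) -> bool:
    """Check the general IBAN shape and its two check digits (mod 97)."""
    compact = re.sub(r"\s+", "", iban).upper()
    # Country code, two check digits, then letters and digits.
    if not re.fullmatch(r"[A-Z]{2}[0-9]{2}[A-Z0-9]+", compact):
        return False
    # Move the first four characters to the end.
    rearranged = compact[4:] + compact[:4]
    # Replace each letter with two digits: A=10, B=11, ... Z=35.
    as_digits = "".join(str(int(char, 36)) for char in rearranged)
    return int(as_digits) % 97 == 1


print(iban_checksum_ok("GB82 WEST 1234 5698 7654 32"))
print(iban_checksum_ok("GB83 WEST 1234 5698 7654 32"))
```

The IBAN itself stays text throughout; only the checksum calculation treats it as a number.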

  36. 😢 Not used in North America, Asia, Australasia, etc…

  37. Bonus numbers

  38. Aircraft tail numbers
    CardMapr

  39. Aircraft tail numbers → PH-BXD
    Not an ISO 3166-1 country code 😢
    Not even a number 😭

  40. Bug taxonomy

  41. Programming errors
    Some errors are uncontroversially bugs, not differences of
    opinion or feature requests, e.g. opposite sort direction
    First name          Last name      Born ‐
    Giovanni Pierluigi  da Palestrina  1525
    Thomas              Tallis         1505
    Nicolas             Gombert        1495
    Josquin             des Prez       1450
    🎼

  42. Under-specification
    Some bugs are due to not specifying or implementing any
    consistent behaviour, e.g. random order in lists and tables
    First name          Last name      Born
    Giovanni Pierluigi  da Palestrina  1525
    Nicolas             Gombert        1495
    Josquin             des Prez       1450
    Thomas              Tallis         1505
    🎼

  43. Wrong functionality
    A more subjective kind of bug is due to consistent behaviour
    that isn’t what’s needed, e.g. sort by date instead of name
    First name          Last name      Born ‐
    Josquin             des Prez       1450
    Nicolas             Gombert        1495
    Thomas              Tallis         1505
    Giovanni Pierluigi  da Palestrina  1525
    🎼

  44. Wrong model
    Some bugs lie deeper in the model and its assumptions, e.g.
    sort by last name and getting it wrong
    First name          Last name ‐    Born
    Giovanni Pierluigi  da Palestrina  1525
    Josquin             des Prez       1450
    Nicolas             Gombert        1495
    Thomas              Tallis         1505
    🎼

  45. Four kinds of bugs
    1. Programming errors, e.g. sorting broken


    2. Under-specification, e.g. unsorted


    3. Wrong functionality, e.g. sort by date instead of name


    4. Wrong model, e.g. sort by last name
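
Three of the four kinds can be reproduced in a few lines of Python using the composer data from the preceding slides. The variable names are assumptions for this sketch. Under-specification is left out because it means having no defined order at all.

```python
# (first name, last name, year born), as shown in the slide tables.
composers = [
    ("Giovanni Pierluigi", "da Palestrina", 1525),
    ("Thomas", "Tallis", 1505),
    ("Nicolas", "Gombert", 1495),
    ("Josquin", "des Prez", 1450),
]

# 1. Programming error: ascending by year was intended, but the
#    direction flag is wrong.
wrong_direction = sorted(composers, key=lambda row: row[2], reverse=True)

# 3. Wrong functionality: a correct, consistent sort by year, when
#    the users needed a sort by name.
by_year = sorted(composers, key=lambda row: row[2])

# 4. Wrong model: the 'last name' field includes the prefixes 'da'
#    and 'des', so Palestrina and des Prez both sort under 'd'.
by_last_name_field = sorted(composers, key=lambda row: row[1].casefold())

print([row[1] for row in by_last_name_field])
```

Only the first case is fixed by correcting the code. The last needs a different model of names.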

  46. Guidelines
    1. Don’t try to standardise or validate personal names


    2. Validate email addresses in multiple steps, not just regex


    3. Build unisex software - remove gender from your models


    4. Use ISO codes for languages, countries and currencies


    5. Use Unicode CLDR for localised names, for ISO codes


    6. Choose country lists carefully


    7. Model identifiers called ‘numbers’ as text

  47. Summary
    1. Learn to recognise bugs that better tests can’t fix


    2. There’s more to modelling than entity relationships


    3. There’s more to localisation than language


    4. Some standards are more useful than others


    5. Figure out how not to be part of the problem


    6. It’s social and political, rather than only technical

  48. @PeterHilton
    http://hilton.org.uk/
    Work with me - available for product manager roles
    www.linkedin.com/in/peterhilton/
    B2B SaaS automation, Europe remote
