Known unknowns

This is my second, bigger, better take on "little problems":
https://speakerdeck.com/abelar_s/little-needs-big-problems

It warns about the elephants in the room and adds two examples, for three in total:
1. "Just a checkbox", for starters.
2. Importing, the main dish, served with code and UX.
3. Reporting, with tips to help add value.

----

"That's easy, I just need... could you do that in 5 minutes?"
Love it or hate it, you've heard that a lot.

Young devs agree and play the rockstar; old devs refuse outright, seeing the request as a symptom of deeper and more dangerous causes.
How and why?

What I'll show here are simple needs you run into every day, the traps they hide, and some of the ways I've seen them evolve.

Sylvain Abélard

March 04, 2014

Transcript

  1. Known unknowns @abelar_s / maitre-du-monde.fr @BBL_FR / @parisrb

  2. Elephants in the room

  3. Phil Karlton naming, cache invalidation

  4. Hate encoding, timezones

  5. I18N, L10N names, phone numbers

  6. Distributed Computing
      1. The network is reliable.
      2. Latency is zero.
      3. Bandwidth is infinite.
      4. The network is secure.
      5. Topology doesn't change.
      6. There is one administrator.
      7. Transport cost is zero.
      8. The network is homogeneous.

  7. Little needs

  8. Big problems

  9. None
  10. Advice

  11. Gurus

  12. YAGNI Gurus

  13. YAGNI You Ain’t Gonna Need It So many devs...

  14. Suits

  15. $ Suits

  16. $ Contract So many salesguys...

  17. My friend

  18. $ My friend

  19. $ Free has no value Freelancer

  20. ? Myself

  21. ? My own company

  22. ? Long-term value Software editor

  23. Example #1

  24. Checkbox

  25. ‘Just a checkbox’

  26. Tri-state

  27. Tomorrow?

  28. Timestamps?

  29. Versioning?

  30. State Machine

  31. Checkbox + naming

  32. Imports + exports
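
The checkbox slides above (tri-state, timestamps, versioning) can be sketched in plain Ruby. This is only an illustration under assumptions: the names `Flag`, `Change`, `set` and `value_at` are hypothetical and do not come from the talk, and changes are assumed to be recorded in chronological order.

```ruby
# Hypothetical sketch: a "checkbox" that is really three states plus a history.
class Flag
  STATES = [nil, true, false].freeze # nil means "not answered yet"

  # One entry per change: the new value, when it happened, who did it.
  Change = Struct.new(:value, :at, :by)

  attr_reader :history

  def initialize
    @history = []
  end

  # The current value is the latest recorded change, or nil if never set.
  def value
    @history.empty? ? nil : @history.last.value
  end

  def set(value, by:, at: Time.now)
    raise ArgumentError, "unknown state #{value.inspect}" unless STATES.include?(value)
    @history << Change.new(value, at, by)
    self
  end

  # What was the value at a given time? Assumes history is in time order.
  def value_at(time)
    change = @history.select { |c| c.at <= time }.last
    change && change.value
  end
end

flag = Flag.new
flag.set(true,  by: "alice", at: Time.utc(2014, 3, 1))
flag.set(false, by: "bob",   at: Time.utc(2014, 3, 4))
puts flag.value.inspect
puts flag.value_at(Time.utc(2014, 3, 2)).inspect
```

Keeping the changes rather than a single boolean is what makes the later questions (who ticked it, when, what it was last week) answerable without a migration.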

  33. Example #2

  34. Import a CSV file

  35. IO.readlines('test.csv').each { |line|
        Model.create(attr: line[0])
      }
      30s: trivial

  36. FasterCSV.parse(File.open('test.csv', 'r')).each { |line|
        Client.create(
          attr: line[0],
          company: Company.find_by_name(line[1]),
          active: ['1', 'oui', 'X'].include?(line[2])
        )
      }
      5mn: some more data
  37. comp_cache = {}
      FasterCSV.parse(File.open('test.csv', 'r')).each { |line|
        Client.create(
          attr: line[0],
          # look each company up once, then reuse it
          company: comp_cache[line[1]] ||= Company.find_by_name(line[1]),
          active: ['1', 'oui', 'X'].include?(line[2])
        )
      }
      10mn: perf’s not good
  38. comp_cache = {}
      FasterCSV.parse(File.open('test.csv', 'r')).each { |line|
        next if line.blank? || line.size != 5
        comp = comp_cache[line[1]] ||= Company.find_by_name(line[1])
        if !comp
          puts "OMG IDK #{line[1]}"
          next
        end
        Client.create(
          attr: line[0],
          company: comp,
          active: ['1', 'oui', 'X'].include?(line[2])
        )
      }
      20mn: data not good
  39. comp_cache = {}
      FasterCSV.parse(File.open('test.csv', 'r')).each { |line|
        # line[6] is read below, so a row needs 7 columns
        next if line.blank? || line.size != 7
        comp = comp_cache[line[1]] ||= Company.find_by_name(line[1])
        if !comp
          puts "OMG IDK #{line[1]}"
          next
        end
        Client.create(
          attr: line[0],
          company: comp,
          active: ['1', 'oui', 'X'].include?(line[2]),
          first_name: line[3].split.first,
          last_name: line[3].split.last,
          comments: "#{line[4]} #{line[5]} (#{line[6]})"
        )
      }
      30mn: more info
  40. comp_cache = {}
      # map, not each: partition below needs one true/false per row
      FasterCSV.parse(File.open('test.csv', 'r')).map { |line|
        # line[6] is read below, so a row needs 7 columns
        next false if line.blank? || line.size != 7
        comp = comp_cache[line[1]] ||= Company.find_by_name(line[1])
        if !comp
          puts "OMG IDK #{line[1]}"
          next false
        end
        c = Client.find_or_create_by_attr(line[0])
        # assign, then save, so that validations run and errors can be logged
        c.assign_attributes(
          company: comp,
          active: ['1', 'oui', 'X'].include?(line[2]),
          first_name: line[3].split.first,
          last_name: line[3].split.last,
          comments: "#{line[4]} #{line[5]} (#{line[6]})"
        )
        ok = c.save
        puts c.errors.full_messages.inspect unless ok
        ok
      }.partition { |x| x }.map { |a| "#{a.size} #{a.first}" }
      45mn: some logs
  41. def to_date(date)
        months = ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
                  "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"]
        tmp = date.scan(/([0-9]{2})([A-Z]{3})([0-9]{4})/).flatten
        tmp[0] = tmp[0].to_i
        tmp[1] = months.index(tmp[1]) + 1
        tmp[2] = tmp[2].to_i
        Date.new(tmp[2], tmp[1], tmp[0])
      end

      def to_float(var)
        (var.blank? || var == '.' ? nil : var.to_s.tr(' ,', '_.').to_f)
      end

      def transcode_file(filepath)
        input = File.open(filepath)
        File.open(Rails.root.join("tmp", "import_utf8.csv"), "wb") do |f|
          until input.eof?
            content = input.read(2**16)
            detection = CharlockHolmes::EncodingDetector.detect(content)
            content = CharlockHolmes::Converter.convert(content, detection[:encoding], "UTF-8")
            f.write(content)
          end
        end
      end

      1h: some ‘utils’
  42. Solutions?

  43. ... automating...

  44. # gem install upsert
      # gem install smarter_csv
      SmarterCSV.process(Rails.root.join("tmp", "test.csv"),
                         :col_sep => ';', :chunk_size => 10000) do |chunk|
        # ...
        Upsert.batch(connection, 'infos') do |upsert_infos|
          # each chunk is an array of row hashes
          chunk.each do |row|
            upsert_infos.row({code: row[:code]}, name: row[:nom])
          end
          # ...
        end
      end
      Tools to the rescue
  45. Freedom!

  46. Independence day!

  47. Bad data: purge

  48. Where am I?

  49. Bad files

  50. OK, KO, warnings...
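
The "Where am I?" and "OK, KO, warnings..." slides suggest reporting a status per line instead of a bare `puts`. Here is a minimal sketch using only Ruby's standard `csv` library. The rules are invented for illustration (a missing name is KO, a missing company is only a warning), and `Result` and `check_rows` are hypothetical names. The line numbers assume no quoted field spans several lines.

```ruby
require "csv"

# Hypothetical result record: where the row was, how it went, and why.
Result = Struct.new(:line_no, :status, :message, :row)

def check_rows(csv_text, col_sep: ";")
  results = []
  CSV.parse(csv_text, col_sep: col_sep, headers: true).each_with_index do |row, i|
    line_no = i + 2 # +1 for the header line, +1 because lines start at 1
    if row["name"].to_s.strip.empty?
      results << Result.new(line_no, :ko, "name is missing", row.to_h)
    elsif row["company"].to_s.strip.empty?
      results << Result.new(line_no, :warning, "no company given", row.to_h)
    else
      results << Result.new(line_no, :ok, nil, row.to_h)
    end
  end
  results
end

sample = <<~CSV
  name;company
  Alice;Acme
  ;Acme
  Bob;
CSV

results = check_rows(sample)
# Count rows per status, then list everything that is not OK with its line.
summary = results.group_by(&:status).map { |status, rs| [status, rs.size] }.to_h
puts summary.inspect
results.reject { |r| r.status == :ok }.each do |r|
  puts "line #{r.line_no}: #{r.status} (#{r.message})"
end
```

Keeping the original row in each result is what later allows a rejected line to be shown, corrected, or exported again.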

  51. And much more...
      - source
      - field separator
      - encoding
      - XLS tab
      - behaviour
      - numbers
      - offset
      - errors
      - create / update
      - column match
      - update on... ?
  52. What else?

  53. Check my data

  54. Duplicates Duplicates

  55. Err0r Corektion

  56. Errors;As;CSV Export ;to ; report Import ;to ; merge
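
One reading of the "errors as CSV" slide is that rejected rows go back to the user in the same format they came in, with one extra column explaining the problem, so the file can be fixed and imported again. A sketch under that assumption, with the standard `csv` library; `errors_to_csv` and the sample data are hypothetical.

```ruby
require "csv"

# Hypothetical input: pairs of [original_row_hash, error_message].
def errors_to_csv(rejected, col_sep: ";")
  return "" if rejected.empty?
  columns = rejected.first.first.keys
  CSV.generate(col_sep: col_sep) do |csv|
    csv << (columns + ["error"])
    rejected.each do |row, message|
      # Same columns, same order as the input, plus the reason for rejection.
      csv << (row.values_at(*columns) + [message])
    end
  end
end

rejected = [
  [{ "name" => "", "company" => "Acme" }, "name is missing"],
  [{ "name" => "Bob; Jr", "company" => "Nope" }, "unknown company"]
]
report = errors_to_csv(rejected)
puts report
```

Generating the file with the CSV library rather than `join(";")` matters here: the second name contains the separator and has to be quoted to survive the round trip.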

  57. Another source

  58. Refactoring

  59. Performance

  60. Audit trail: file, format, person, timestamp, source, transforms + origins

  61. Blame all the things!
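
The audit trail fields listed on the slide (file, format, person, timestamp, source, transforms) fit in a small record created once per import. This is a sketch with invented names (`ImportAudit`, `audit_for`); hashing the file content with SHA-256 is an assumption added here so that the same file can be recognised when it is uploaded twice.

```ruby
require "digest"
require "time"

# Hypothetical audit record, one per imported file.
ImportAudit = Struct.new(:file_name, :sha256, :format, :person,
                         :source, :imported_at, :transforms)

def audit_for(file_name, content, person:, source:, format: "csv", transforms: [])
  ImportAudit.new(
    file_name,
    Digest::SHA256.hexdigest(content), # fingerprint of the exact bytes received
    format,
    person,
    source,
    Time.now.utc.iso8601,
    transforms
  )
end

content = "name;company\nAlice;Acme\n"
audit = audit_for("clients.csv", content,
                  person: "alice", source: "upload form",
                  transforms: ["transcode to UTF-8", "strip blank lines"])
puts audit.to_h
```

Storing the id of this record on every row the import creates or updates is one way to answer "where does this value come from?" later on.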

  62. Example #3

  63. Reporting

  64. ‘Just some sums’

  65. ... and filters

  66. and exports

  67. and graphs
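
"Just some sums" followed by "and filters" already means every total has to be computed from a filtered set. A minimal plain-Ruby sketch of that step, with hypothetical names (`Sale`, `report`) and invented sample figures; amounts are kept as integer cents to avoid float rounding in the sums.

```ruby
# Hypothetical data row for the report.
Sale = Struct.new(:region, :month, :amount_cents)

# filters is a hash of field => wanted value; an empty hash keeps every row.
def report(sales, filters = {})
  rows = sales.select do |sale|
    filters.all? { |field, wanted| sale[field] == wanted }
  end
  by_region = rows.group_by(&:region)
                  .map { |region, rs| [region, rs.sum(&:amount_cents)] }
                  .to_h
  { count: rows.size, total_cents: rows.sum(&:amount_cents), by_region: by_region }
end

sales = [
  Sale.new("north", "2014-01", 1200),
  Sale.new("north", "2014-02", 800),
  Sale.new("south", "2014-01", 500)
]
all_sales = report(sales)
january   = report(sales, { month: "2014-01" })
puts all_sales.inspect
puts january.inspect
```

Returning a plain hash keeps the same result usable for the on-screen table, the export and the graph.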

  68. Programmer Time Translation Table

      est    real      coder thinks     manager knows
      30s    1h        trivial!         do, build, test, deploy...
      5mn    2h        easy!            unexpected problem
      1h     2h        code...          not on the 1st try
      4h     4h        check docs       realistic
      8h     12~16d    minor refactor   many dependencies
      2d     5d        OK, real code    same as before
      1wk    2~20d     wow, er...       let’s see with the team

      http://coding.abel.nu/2012/06/programmer-time-translation-table/
  69. Thanks! @abelar_s / maitre-du-monde.fr HumanTalks 2013-11-12

  70. Questions?

  71. Questions? Little needs: look at what open source projects and your competitors have done. The return code: the batch and the cache