Little needs, big problems

"That's easy, I just need... could you do that in 5 minutes?"
Love it or hate it, you've heard that a lot.

Young devs agree to play the rockstar; old devs refuse outright, seeing such requests as a symptom of deeper and more dangerous problems.
How and why?

What I'll show here are simple needs you run into every day, the traps they hide, and some of the ways I've seen them evolve.

[Currently this deck only covers importing CSV; my next pet peeves could be versioning, logging, cleaning data...]

Sylvain Abélard

November 12, 2013

Transcript

  1. Little needs @abelar_s / maitre-du-monde.fr HumanTalks 2013-11-12

  2. Big problems

  3. Advice

  4. Gurus

  5. YAGNI Gurus

  6. YAGNI You Ain’t Gonna Need It So many devs...

  7. Suits

  8. $ Suits

  9. $ Contract So many salesguys...

  10. My friend

  11. $ My friend

  12. $ Free has no value Freelancer

  13. ? Myself

  14. ? My own company

  15. ? Long-term value Software editor

  16. Code?

  17. Import a CSV file

  18. IO.readlines('test.csv').each { |line|
        Model.create(attr: line[0])
      }

      30s: trivial

  19. FasterCSV.parse(File.open('test.csv', 'r')).each { |line|
        Client.create(
          attr: line[0],
          company: Company.find_by_name(line[1]),
          active: ['1', 'oui', 'X'].include?(line[2])
        )
      }

      5mn: some more data

  20. comp_cache = {}
      FasterCSV.parse(File.open('test.csv', 'r')).each { |line|
        Client.create(
          attr: line[0],
          company: comp_cache[line[1]] ||= Company.find_by_name(line[1]),
          active: ['1', 'oui', 'X'].include?(line[2])
        )
      }

      10mn: perf’s not good

  21. comp_cache = {}
      FasterCSV.parse(File.open('test.csv', 'r')).each { |line|
        next if line.blank? || line.size != 5
        comp = comp_cache[line[1]] ||= Company.find_by_name(line[1])
        if !comp
          puts "OMG IDK #{line[1]}"
          next
        end
        Client.create(
          attr: line[0],
          company: comp,
          active: ['1', 'oui', 'X'].include?(line[2])
        )
      }

      20mn: data not good

  22. comp_cache = {}
      FasterCSV.parse(File.open('test.csv', 'r')).each { |line|
        next if line.blank? || line.size != 5
        comp = comp_cache[line[1]] ||= Company.find_by_name(line[1])
        if !comp
          puts "OMG IDK #{line[1]}"
          next
        end
        Client.create(
          attr: line[0],
          company: comp,
          active: ['1', 'oui', 'X'].include?(line[2]),
          first_name: line[3].split.first,
          last_name: line[3].split.last,
          comments: "#{line[4]} #{line[5]} (#{line[6]})"
        )
      }

      30mn: more info

  23. comp_cache = {}
      FasterCSV.parse(File.open('test.csv', 'r')).map { |line|
        next if line.blank? || line.size != 5
        comp = comp_cache[line[1]] ||= Company.find_by_name(line[1])
        if !comp
          puts "OMG IDK #{line[1]}"
          next
        end
        c = Client.find_or_create_by_attr(line[0])
        c.assign_attributes(
          company: comp,
          active: ['1', 'oui', 'X'].include?(line[2]),
          first_name: line[3].split.first,
          last_name: line[3].split.last,
          comments: "#{line[4]} #{line[5]} (#{line[6]})"
        )
        ok = c.save
        puts c.errors.full_messages.inspect unless ok
        ok
      }.partition { |x| x }.map { |a| "#{a.size} #{a.first}" }

      45mn: some logs

  24. def to_date(date)
        months = ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
                  "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"]
        tmp = date.scan(/([0-9]{2})([A-Z]{3})([0-9]{4})/).flatten
        tmp[0] = tmp[0].to_i
        tmp[1] = months.index(tmp[1]) + 1
        tmp[2] = tmp[2].to_i
        Date.new(tmp[2], tmp[1], tmp[0])
      end

      def to_float(var)
        (var.blank? || var == '.' ? nil : var.to_s.tr(' ,', '_.').to_f)
      end

      def transcode_file(filepath)
        input = File.open(filepath)
        File.open(Rails.root.join("tmp", "import_utf8.csv"), "wb") do |f|
          until input.eof?
            content = input.read(2**16)
            detection = CharlockHolmes::EncodingDetector.detect(content)
            content = CharlockHolmes::Converter.convert(content, detection[:encoding], "UTF-8")
            f.write(content)
          end
        end
      end

      1h: some ‘utils’

  25. Feeling ashamed

  26. ... automating...

  27. # gem install upsert
      # gem install smartercsv
      SmarterCSV.process(Rails.root.join("tmp", "test.csv"),
                         :col_sep => ';', :chunk_size => 10000) do |chunk|
        # ...
        Upsert.batch(connection, 'infos') do |upsert_infos|
          chunk.each do |row|
            upsert_infos.row({ code: row[:code] }, name: row[:nom])
            # ...
          end
        end
      end

      Tools to the rescue

  28. See the client

  29. Independence day!

  30. Bad data: purge

  31. Where am I?

  32. Bad files

  33. OK, KO, warnings...
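
      One way such an OK / KO / warnings summary could look in plain Ruby (a rough sketch with a made-up file name and made-up rules, not code from the deck):

      require 'csv'

      # Tally every line as :ok, :warning or :ko with a reason,
      # then print a one-line summary for the user.
      results = Hash.new { |h, k| h[k] = [] }

      CSV.foreach('test.csv', col_sep: ';') do |line|
        if line.compact.empty?
          results[:warning] << [line, 'empty line, skipped']
        elsif line.size != 5
          results[:ko] << [line, "expected 5 columns, got #{line.size}"]
        else
          results[:ok] << line
        end
      end

      puts results.map { |status, lines| "#{status}: #{lines.size}" }.join(', ')
      # e.g. "ok: 120, warning: 3, ko: 2"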

  34. And much more...
      - source
      - field separator
      - encoding
      - XLS tab
      - behaviour
      - numbers
      - offset
      - errors
      - create / update
      - column match
      - update on... ?
      (a possible configuration sketch below)
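
      Each of those knobs eventually becomes an explicit setting. A rough sketch of what such a configuration could look like (every name below is made up for illustration):

      # Hypothetical settings object for the grown-up importer:
      # every hard-coded value from the first script becomes a user-visible option.
      ImportConfig = Struct.new(
        :source,        # uploaded file, URL, FTP drop...
        :col_sep,       # ';', ',', "\t"...
        :encoding,      # 'UTF-8', 'ISO-8859-1'...
        :sheet,         # which XLS tab to read
        :offset,        # header rows to skip
        :mode,          # :create, :update or both
        :column_match,  # CSV column => model attribute
        :update_key,    # which column identifies an existing record
        :on_error       # :abort, :skip or :report
      )

      config = ImportConfig.new(
        'clients.csv', ';', 'UTF-8', 0, 1, :update,
        { 0 => :attr, 1 => :company, 2 => :active }, :attr, :report
      )
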
  35. What else?

  36. Check my data
      ‣ find duplicates
        - error or merge
      ‣ error correction
        - merging tool
      ‣ errors as a CSV
        - import, export, reimport
      (a possible sketch below)
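
      A rough sketch of the "find duplicates" and "errors as a CSV" ideas (file names and the five-column rule are made up):

      require 'csv'

      rows = CSV.read('clients.csv', col_sep: ';')

      # Find duplicates on the first column; whether that is an error
      # or a merge is a business decision, not a technical one.
      rows.group_by { |row| row[0] }
          .select { |_, group| group.size > 1 }
          .each { |key, group| puts "#{group.size} rows share key #{key.inspect}" }

      # Errors as a CSV: write the rejected rows (plus a reason) to a file
      # the user can fix in a spreadsheet and re-import.
      invalid = rows.select { |row| row.size != 5 }
      CSV.open('clients_errors.csv', 'wb', col_sep: ';') do |csv|
        invalid.each { |row| csv << (row + ['expected 5 columns']) }
      end
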
  37. Another source
      ‣ another format?
        - some more refactoring ahead
        - perf issues?
      ‣ keep the source!
        - file
        - format
        - person
        - timestamp
        - BLAME ALL THE THINGS!
      (a provenance sketch below)
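
      Keeping the source can be one extra record per import run. A rough sketch, assuming a Rails app and a hypothetical Import model (none of these names come from the deck):

      require 'digest'

      # Hypothetical ActiveRecord model with columns:
      # filename, checksum, format, imported_by, imported_at
      class Import < ActiveRecord::Base
        has_many :clients   # assumes Client has a belongs_to :import
      end

      import = Import.create!(
        filename:    'clients.csv',
        checksum:    Digest::MD5.file('clients.csv').hexdigest,
        format:      'csv v2',
        imported_by: 'someone@example.com',
        imported_at: Time.now
      )

      # Tag every record touched by this run, so "who imported that line,
      # when, and from which file?" becomes a simple lookup later.
      Client.create!(attr: 'ACME-001', import: import)
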
  38. I just need...
      ‣ to import CSV files
        - that’s easy right?
        - right...
      ‣ reporting dashboards
        - just some sums right?
        - yeah, and filters
        - and export!
        - and graphs!
        - ...

  39. Programmer Time Translation Table

      estimate | real   | coder thinks   | manager knows
      30s      | 1h     | trivial!       | do, build, test, deploy...
      5mn      | 2h     | easy!          | unexpected problem
      1h       | 2h     | code...        | not on the 1st try
      4h       | 4h     | check docs     | realistic
      8h       | 12~16d | minor refactor | many dependencies
      2d       | 5d     | OK, real code  | same as before
      1wk      | 2~20d  | wow, er...     | let’s see with the team

      http://coding.abel.nu/2012/06/programmer-time-translation-table/

  40. Thanks! @abelar_s / maitre-du-monde.fr HumanTalks 2013-11-12

  41. Questions? @abelar_s / maitre-du-monde.fr HumanTalks 2013-11-12