Known unknowns

This is my second, bigger and better take on "little problems":
https://speakerdeck.com/abelar_s/little-needs-big-problems

It warns about the elephants in the room and adds two examples. The three parts:
1. "just a checkbox", for starters
2. importing, the main dish, served with code and UX
3. reporting, with tips to help add value

----

"That's easy, I just need... could you do that in 5 minutes?"
Love it or hate it, you've heard that a lot.

Young devs say yes and play the rockstar; old devs refuse outright, seeing the request as a symptom of deeper and more dangerous causes.
How and why?

I'll show simple needs you run into every day, the traps they hide, and some of the ways I've seen them evolve.

Sylvain Abélard

March 04, 2014

Transcript

  1. 6.

    Distributed Computing

    1. The network is reliable.
    2. Latency is zero.
    3. Bandwidth is infinite.
    4. The network is secure.
    5. Topology doesn't change.
    6. There is one administrator.
    7. Transport cost is zero.
    8. The network is homogeneous.
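The first assumption in this list is the one that import and batch code tends to trip over. Here is a minimal sketch of not assuming a reliable network: retry a bounded number of times with backoff, then let the error surface. The `with_retries` helper, its parameters and the simulated failure are assumptions made for illustration, not something shown in the talk.

```ruby
# Hypothetical helper: retry a block that may fail because of the network.
def with_retries(attempts: 3, base_delay: 0.0, errors: [IOError, SystemCallError])
  tries = 0
  begin
    tries += 1
    yield tries
  rescue *errors
    raise if tries >= attempts             # give up: let the caller see the failure
    sleep(base_delay * (2**(tries - 1)))   # exponential backoff between attempts
    retry
  end
end

# Simulated flaky call: fails twice, then succeeds.
calls = 0
result = with_retries(attempts: 5) do
  calls += 1
  raise IOError, "connection reset" if calls < 3
  "payload"
end
```

The cap on attempts matters as much as the retry itself: once it is reached, the original exception is re-raised instead of being swallowed.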
  2. 9.
  3. 10.
  4. 11.
  5. 14.
  6. 15.
  7. 17.
  8. 20.
  9. 24.
  10. 26.
  11. 27.
  12. 37.

    comp_cache = {}
    FasterCSV.parse(File.open('test.csv', 'r')).each { |line|
      Client.create(
        attr: line[0],
        company: comp_cache[line[1]] || (comp_cache[line[1]] = Company.find_by_name(line[1])),
        active: ['1', 'oui', 'X'].include?(line[2])
      )
    }

    10mn: perf’s not good
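One thing that can hurt performance in this version: the `cache[k] || (cache[k] = lookup)` idiom never remembers a miss, so every row naming an unknown company triggers another lookup. A possible variation, sketched with a hypothetical `find_company` standing in for `Company.find_by_name` and made-up data, lets a Hash default block store the result even when it is nil.

```ruby
# Stand-in for Company.find_by_name: counts how often each name is looked up.
LOOKUPS = Hash.new(0)
COMPANIES = { "Acme" => 1, "Globex" => 2 }   # hypothetical name => id data

def find_company(name)
  LOOKUPS[name] += 1
  COMPANIES[name]   # nil when the company is unknown
end

# The default block runs only when the key is missing and stores the result,
# nil included, so unknown names are not looked up a second time.
comp_cache = Hash.new { |cache, name| cache[name] = find_company(name) }

names = ["Acme", "Nope", "Acme", "Nope", "Globex"]
ids = names.map { |name| comp_cache[name] }
# ids     => [1, nil, 1, nil, 2]
# LOOKUPS => {"Acme"=>1, "Nope"=>1, "Globex"=>1}
```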
  13. 38.

    comp_cache = {}
    FasterCSV.parse(File.open('test.csv', 'r')).each { |line|
      next if line.blank? || line.size != 5
      comp = comp_cache[line[1]] || (comp_cache[line[1]] = Company.find_by_name(line[1]))
      if !comp
        puts "OMG IDK #{line[1]}"
        next
      end
      Client.create(
        attr: line[0],
        company: comp,
        active: ['1', 'oui', 'X'].include?(line[2])
      )
    }

    20mn: data not good
  14. 39.

    comp_cache = {}
    FasterCSV.parse(File.open('test.csv', 'r')).each { |line|
      next if line.blank? || line.size != 5
      comp = comp_cache[line[1]] || (comp_cache[line[1]] = Company.find_by_name(line[1]))
      if !comp
        puts "OMG IDK #{line[1]}"
        next
      end
      Client.create(
        attr: line[0],
        company: comp,
        active: ['1', 'oui', 'X'].include?(line[2]),
        first_name: line[3].split.first,
        last_name: line[3].split.last,
        comments: "#{line[4]} #{line[5]} (#{line[6]})"
      )
    }

    30mn: more info
  15. 40.

    comp_cache = {}
    # map (not each), so that partition below sees each row's result
    FasterCSV.parse(File.open('test.csv', 'r')).map { |line|
      next if line.blank? || line.size != 5
      comp = comp_cache[line[1]] || (comp_cache[line[1]] = Company.find_by_name(line[1]))
      if !comp
        puts "OMG IDK #{line[1]}"
        next
      end
      c = Client.find_or_create_by_attr(line[0])
      # assign_attributes (not update_columns), so that save runs the validations
      c.assign_attributes(
        company: comp,
        active: ['1', 'oui', 'X'].include?(line[2]),
        first_name: line[3].split.first,
        last_name: line[3].split.last,
        comments: "#{line[4]} #{line[5]} (#{line[6]})"
      )
      ok = c.save
      puts c.errors.full_messages.inspect unless ok
      ok
    }.partition { |x| x }.map { |a| "#{a.size} #{a.first}" }

    45mn: some logs
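The logging step can be pushed a little further: have each row return a status and a reason instead of printing and moving on, then summarise at the end. The sketch below uses only the standard `csv` library; the in-memory data, the three-column layout, `KNOWN_COMPANIES` and `import_row` are hypothetical stand-ins for the file and the models on the slide.

```ruby
require "csv"

# Hypothetical in-memory stand-ins for the CSV file and the Company table.
ROWS = <<~DATA
  C001,Acme,1
  C002,Unknown Corp,oui

  C003,Acme
  C004,Acme,0
DATA
KNOWN_COMPANIES = ["Acme"]

# Returns [status, detail] for one row instead of printing and skipping.
def import_row(line)
  return [:skipped, "blank or malformed row"] if line.empty? || line.size != 3
  return [:skipped, "unknown company #{line[1]}"] unless KNOWN_COMPANIES.include?(line[1])
  [:ok, { attr: line[0], company: line[1], active: ["1", "oui", "X"].include?(line[2]) }]
end

results = CSV.parse(ROWS).map { |line| import_row(line) }
imported, skipped = results.partition { |status, _| status == :ok }

summary = "#{imported.size} imported, #{skipped.size} skipped"
# Count how many rows were skipped for each reason.
reasons = skipped.each_with_object(Hash.new(0)) { |(_, detail), counts| counts[detail] += 1 }
# summary => "2 imported, 3 skipped"
```

A per-reason count is usually what the person who sent the file asks for next, so it is cheap to keep it from the start.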
  16. 41.

    def to_date(date)
      months = ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
                "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"]
      tmp = date.scan(/([0-9]{2})([A-Z]{3})([0-9]{4})/).flatten
      tmp[0] = tmp[0].to_i
      tmp[1] = months.index(tmp[1]) + 1
      tmp[2] = tmp[2].to_i
      Date.new(tmp[2], tmp[1], tmp[0])
    end

    def to_float(var)
      (var.blank? || var == '.' ? nil : var.to_s.tr(' ,', '_.').to_f)
    end

    def transcode_file(filepath)
      input = File.open(filepath)
      File.open(Rails.root.join("tmp", "import_utf8.csv"), "wb") do |f|
        until input.eof?
          content = input.read(2**16)
          detection = CharlockHolmes::EncodingDetector.detect(content)
          content = CharlockHolmes::Converter.convert(content, detection[:encoding], "UTF-8")
          f.write(content)
        end
      end
    end

    1h: some ‘utils’
  17. 44.

    # gem install upsert
    # gem install smarter_csv
    SmarterCSV.process(Rails.root.join("tmp", "test.csv"), :col_sep => ';', :chunk_size => 10000) do |chunk|
      # ...
      Upsert.batch(connection, 'infos') do |upsert_infos|
        upsert_infos.row({code: row[:code]}, name: row[:nom])
        # ...
      end
    end

    Tools to the rescue
  18. 45.
  19. 49.
  20. 51.

    And much more...

    - source
    - field separator
    - encoding
    - XLS tab
    - behaviour
    - numbers
    - offset
    - errors
    - create / update
    - column match
    - update on... ?
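One way to keep such a list from leaking into the import code as scattered conditionals is to make every choice an explicit option with a default. The sketch below is only an illustration: the option names, the defaults and the `import_options` helper are all hypothetical, and each one mirrors an item of the list.

```ruby
# Hypothetical option names; each one mirrors an item of the list above.
ImportOptions = Struct.new(
  :col_sep,    # field separator
  :encoding,   # source encoding, to be transcoded to UTF-8
  :offset,     # number of leading lines to skip
  :columns,    # column match: position => attribute name
  :mode,       # :create, :update or :upsert
  :match_on,   # attribute used to find the record to update
  :on_error,   # :skip the row or :abort the import
  keyword_init: true
)

IMPORT_DEFAULTS = {
  col_sep: ",", encoding: "UTF-8", offset: 0, columns: {},
  mode: :upsert, match_on: :code, on_error: :skip
}.freeze

# Builds the options, rejecting unknown keys so that typos fail loudly.
def import_options(overrides = {})
  unknown = overrides.keys - IMPORT_DEFAULTS.keys
  raise ArgumentError, "unknown option(s): #{unknown.join(', ')}" unless unknown.empty?
  ImportOptions.new(**IMPORT_DEFAULTS.merge(overrides))
end

opts = import_options(col_sep: ";", encoding: "ISO-8859-1", offset: 1)
opts.col_sep   # => ";"
opts.mode      # => :upsert
```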
  21. 63.
  22. 68.

    Programmer Time Translation Table

    est    real     coder thinks     manager knows
    30s    1h       trivial!         do, build, test, deploy...
    5mn    2h       easy!            unexpected problem
    1h     2h       code...          not on the 1st try
    4h     4h       check docs       realistic
    8h     12~16d   minor refactor   many dependencies
    2d     5d       OK, real code    same as before
    1wk    2~20d    wow, er...       let’s see with the team

    http://coding.abel.nu/2012/06/programmer-time-translation-table/
  23. 71.

    Questions?

    @abelar_s / maitre-du-monde.fr

    HumanTalks 2013-11-12
    Little needs: look at what open source projects and your competitors have done
    Le Code de Retour - batch and cache