Create libcsv based ruby/csv compatible CSV library

RubyKaigi 2018 - Create libcsv based ruby/csv compatible CSV library

秒速284km

May 31, 2018

Transcript

  1. Create libcsv based ruby/csv compatible CSV library Asakusa.rb @284km

  2. http://rubykaigi.org/2018/parties

     What the title covers: “Super fast CSV parser”
     - libcsv based
     - ruby/csv compatible
  3. # libcsv based
     - fast
     - standard

  4. # ruby/csv compatible
     - small && easy to read
     - backward compatibility
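
For context, "ruby/csv compatible" can be read as: code written against Ruby's standard `csv` library keeps working. The following is a minimal sketch of a few standard `csv` calls that such a library would presumably need to mirror. It uses only the standard library, not the libcsv branch from the talk, and the sample data is made up for illustration.

```ruby
require "csv"

# Illustrative sample input; the second row has a quoted field containing a comma.
data = <<~CSV
  id,name,note
  1,alice,"hello, world"
  2,bob,plain
CSV

# CSV.parse returns an array of rows, each row an array of strings.
rows = CSV.parse(data)

# CSV.parse_line parses a single line into one array of fields.
first = CSV.parse_line("a,b,c\n")

# With headers: true, the first row becomes the header and
# each following row can be addressed by column name.
table = CSV.parse(data, headers: true)
notes = table.map { |row| row["note"] }

puts rows.inspect
puts first.inspect
puts notes.inspect
```

A compatible parser would be expected to return the same values for the same calls, which is also a convenient way to test it against the standard library.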
  5. ## Motivation
     - CSV is often used
     - Sometimes I use large CSV files
  6. ## 284km/csv/tree/libcsv

     ```
     # 100,000 lines
     Comparison:
         (libcsv) quoted:   81022184 allocated
       (libcsv) unquoted:   82071384 allocated - 1.01x more
                unquoted:  110535101 allocated - 1.36x more
                  quoted:  133536546 allocated - 1.65x more
     ```
  7. ## libcsv
     - columns↑ performance↓
     - rows↑ performance➘

  8. ## memory usage

     ```
     # 5 columns, 10,000 lines
     Comparison:
         (libcsv) quoted:    8090304 allocated
       (libcsv) unquoted:    9139504 allocated - 1.13x more
                unquoted:   11053221 allocated - 1.37x more
                  quoted:   13354666 allocated - 1.65x more
     ```
  9. ## memory usage

     ```
     # 5 columns, 100,000 lines
     Comparison:
         (libcsv) quoted:   81022184 allocated
       (libcsv) unquoted:   82071384 allocated - 1.01x more
                unquoted:  110535101 allocated - 1.36x more
                  quoted:  133536546 allocated - 1.65x more
     ```
  10. ## memory usage

      ```
      # 10 columns, 10,000 lines
      Comparison:
                 unquoted:   13954361 allocated
          (libcsv) quoted:   14090304 allocated - 1.01x more
        (libcsv) unquoted:   15139504 allocated - 1.08x more
                   quoted:   18555121 allocated - 1.33x more
      ```
  11. ## memory usage

      ```
      # 10 columns, 1,000,000 lines
      Comparison:
                 unquoted: 1396650961 allocated
          (libcsv) quoted: 1411636904 allocated - 1.01x more
        (libcsv) unquoted: 1412686104 allocated - 1.01x more
                   quoted: 1856651721 allocated - 1.33x more
      ```
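
The slides do not show how the "allocated" figures above were collected. As a rough sketch of the idea only, the snippet below compares allocations for quoted and unquoted input using the standard `csv` library and `GC.stat`. Two assumptions to keep in mind: it counts allocated Ruby objects rather than whatever unit the slides report, and it does not exercise the libcsv branch at all. `build_csv` and `allocated_objects` are hypothetical helper names introduced here for illustration.

```ruby
require "csv"

# Hypothetical helper: build CSV text of the given shape.
# quoted: true wraps every field in double quotes.
def build_csv(rows:, cols:, quoted:)
  field = quoted ? '"value"' : "value"
  line = Array.new(cols, field).join(",")
  Array.new(rows, line).join("\n") + "\n"
end

# Hypothetical helper: number of Ruby objects allocated while the block runs.
# :total_allocated_objects only ever increases, so a simple difference works.
def allocated_objects
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

unquoted = build_csv(rows: 1_000, cols: 5, quoted: false)
quoted   = build_csv(rows: 1_000, cols: 5, quoted: true)

results = {
  "unquoted" => allocated_objects { CSV.parse(unquoted) },
  "quoted"   => allocated_objects { CSV.parse(quoted) },
}

# Print the smallest allocation count first, like the comparison tables above.
results.sort_by { |_, count| count }.each do |label, count|
  puts format("%10s: %10d objects allocated", label, count)
end
```

The absolute numbers depend on the Ruby and `csv` versions in use, so only the relative difference between the two inputs is meaningful.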
  12. # What to do after this talk, and how

      ## Continue development
      - libcsv multibyte support
      - column↑ performance→
      - replace the ffi gem with fiddle
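
For readers unfamiliar with fiddle: it ships with Ruby and can call C functions without the ffi gem. The sketch below shows the general binding pattern, using `strlen` from the C standard library as a stand-in, since the actual libcsv bindings are not shown in the slides. It assumes a Unix-like system where `Fiddle.dlopen(nil)` exposes libc symbols.

```ruby
require "fiddle"

# Open the running process itself; on Linux and macOS this makes the
# already-loaded C standard library's symbols available.
libc = Fiddle.dlopen(nil)

# Describe the C signature: size_t strlen(const char *s).
strlen = Fiddle::Function.new(
  libc["strlen"],          # address of the C function
  [Fiddle::TYPE_VOIDP],    # argument types: one pointer
  Fiddle::TYPE_SIZE_T      # return type
)

# A Ruby String passed for a pointer argument is handed to C as a char pointer.
length = strlen.call("a,b,c")
puts length
```

A real libcsv binding would load the libcsv shared library by name and describe its parser functions in the same way.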
  13. RubyKaigi 2018 Thank you