Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Command line & Data Science
Search
Yatish Mehta
October 28, 2014
0
59
Command line & Data Science
Yatish Mehta
October 28, 2014
Tweet
Share
More Decks by Yatish Mehta
See All by Yatish Mehta
Shore | A modern Ruby on Rails template to start your next project
yatish27
0
110
Taming The Rails Monolith Mammoth
yatish27
0
32
ActionCable and ReactJS tie the knot
yatish27
1
270
Featured
See All Featured
Practical Tips for Bootstrapping Information Extraction Pipelines
honnibal
PRO
23
1.4k
CoffeeScript is Beautiful & I Never Want to Write Plain JavaScript Again
sstephenson
162
15k
Documentation Writing (for coders)
carmenintech
74
5k
Intergalactic Javascript Robots from Outer Space
tanoku
272
27k
The Web Performance Landscape in 2024 [PerfNow 2024]
tammyeverts
9
810
The Straight Up "How To Draw Better" Workshop
denniskardys
236
140k
Measuring & Analyzing Core Web Vitals
bluesmoon
9
580
KATA
mclloyd
32
14k
StorybookのUI Testing Handbookを読んだ
zakiyama
31
6.1k
Distributed Sagas: A Protocol for Coordinating Microservices
caitiem20
333
22k
Reflections from 52 weeks, 52 projects
jeffersonlam
352
21k
Building Flexible Design Systems
yeseniaperezcruz
328
39k
Transcript
Yatish Mehta @yatish27
Command line & Data Science
• pip install csvkit • cat leads.csv | csvlook •
csvstat leads.csv • csvgrep -c 6 -m samplecompany.com | csvlook 1.csvkit
2. grep,sed,sort,uniq • cat wiki.txt | grep -oE '\w+' |
tee words • < words grep '^a' | sort | uniq -c | sort -r • sed ’s/data/tata/g’ wiki.txt > wiki2.txt
• brew install jq • < data.json jq ‘.[]’ •
< data.json jq ‘.[] | select(.age>22)’ • cat data.json | jq '.[] | {isActive: ._id, name: .name}' 3. jq JSON processor
4. qstats • qstats one_hundred_milion.dat Min.
44.947 1st Qu. 93.2553 Median 100.001 Mean 100.001 3rd Qu. 106.747 Max. 156.997 Range 112.05 Std Dev. 10.0002 Length 100000000 • Faster than awk, sort, R
5. parallel • iterative • shell parallel.sh , each action
as a job • parallel keyword
Thank You