Open Source Big Data / Data Science Platform
+ Scalable Computation
- Developed on:
- Operated by:
• COTS Apps (Excel, Tableau, Qlik...)
• Statistical Time Series Analysis
• Wider Big Data Analytics eco-systems
• Shell/APIs: HDFS, Hive, Spark, HBase, Sqoop, JDBC/ODBC
• Languages: Julia, Python, R, Scala
• NLTK: Natural Language
• Distributed Time Series / Geospatial / Graph Databases
• GIT Repo
• Data Products
• WebSocket
• Drag + Drop (CZML/GeoJSON)
• Web Browser (collaboration)
• Export to CSV/Excel
• Data: Geospatial data, Time Series data, Public Data, Market data, Real-time Streaming, Open Gov Data
• JDBC via Phoenix
• HDFS
• Hive/Pig w/ Geospatial
• Big Data Bootcamp
• Lunch and Learn KT Sessions
Big Data Technology is evolving so fast… here’s Hadoop related:
• Big data ELT with Apache Sqoop
• BI vs Data Science
• Data Scientist Career Path
• MOOC and Machine Learning
• Machine Learning with Apache Spark
• Map Reduce 101
• Big Data Security: Kerberos/Knox/Sentry
• Deep Learning and Use Cases
• Time Series and Geospatial
• Big Data Analytics with Impala
• HBase: Distributed Key-value BigTable
• Distributed Time Series DB: OpenTSDB
• Machine Learning with Hadoop and R
• Advanced Machine Bayesian Network