#DS101: understanding the basics


Muhammad Aswan Syahputra

February 18, 2020


  1. understanding the basics #DS101

  2. HELLO My name is ASWAN

  3. aswansyahputra

    Data Analyst @ Jabar Digital Service • Sensory Scientist @ Sensolution.ID • Instructor @ R Academy Telkom University • Initiator of Komunitas R Indonesia
  4. Jabar Digital Service

    A unit under the West Java Provincial Communication and Informatics Office (Dinas Komunikasi dan Informatika Provinsi Jawa Barat), envisioned to narrow the digital divide, support efficient and accurate policymaking based on data and technology, and revolutionize the use of technology in the lives of the people and government of West Java.
  5. R Indonesia • Web: www.r-indonesia.id • Telegram: @GNURIndonesia (t.me/GNURIndonesia) • GitHub: www.github.com/indo-r

  6. what is data?

  7. “Facts that can be analyzed or used in an effort to gain knowledge or make decisions; information.” – The American Heritage® Dictionary of the English Language, 5th Edition

    “a collection of facts, observations, or other information related to a particular question or problem.” – GNU version of the Collaborative International Dictionary of English

    “a collection of facts from which conclusions may be drawn.” – WordNet 3.0 Copyright 2006 by Princeton University
  8. why is it crucial?

  9. what is data science?

  10. Data science is the art of turning raw data into

  11. why the hype?

  13. what are the activities?

  14. Prepare

  15. Prepare Understand

  16. Prepare Understand Communicate

  17. how is the workflow?

  18. Import Tidy Transform Visualise Model Communicate

  19. Data import: Remote Files, Local Files, Database, API, Clipboard

  20. Data wrangling: Tidy, Transform
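The import and wrangling steps above can be sketched in a few lines. This is a minimal illustration in Python using only the standard library, not the tooling used in the talk. The data, the column names (`region`, `year`, `value`, `share`) and the helper functions are all made up for the example.

```python
import csv
import io

# Made-up "wide" data for illustration only: one column per year.
RAW_CSV = """region,2018,2019
A,10,12
B,20,25
"""


def import_rows(text):
    """Import: parse CSV text into a list of dicts (all values are strings)."""
    return list(csv.DictReader(io.StringIO(text)))


def tidy(rows, id_col="region"):
    """Tidy: reshape wide to long so that each row is one observation."""
    long_rows = []
    for row in rows:
        for key, value in row.items():
            if key != id_col:
                long_rows.append(
                    {id_col: row[id_col], "year": int(key), "value": float(value)}
                )
    return long_rows


def transform(rows):
    """Transform: add each observation's share of its year's total."""
    totals = {}
    for r in rows:
        totals[r["year"]] = totals.get(r["year"], 0.0) + r["value"]
    return [{**r, "share": r["value"] / totals[r["year"]]} for r in rows]


tidy_rows = transform(tidy(import_rows(RAW_CSV)))
for r in tidy_rows:
    print(r)
```

Once the data is in the long, one-observation-per-row shape, later steps such as plotting or modeling can treat every row the same way, which is the usual argument for tidying before transforming.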

  21. Data visualisation I. Exploration

  22. Data visualisation II. Presentation

  23. Data modeling: Predict, Explain

    “All models are wrong, but some are useful” – George Box
  24. Anscombe’s Quartet

    Mean of x: 9 (exact)
    Sample variance of x: 11 (exact)
    Mean of y: 7.50 (to 2 decimal places)
    Sample variance of y: 4.125 (±0.003)
    Correlation between x and y: 0.816 (to 3 decimal places)
    Linear regression line: y = 3.00 + 0.500x (to 2 and 3 decimal places, respectively)
    R²: 0.67 (to 2 decimal places)
  25. Viz + Stats
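The Anscombe's Quartet summary statistics listed above can be checked directly. The sketch below is written in Python with only the standard library (an assumption of this example, not the tool used in the talk) and uses the published quartet values. The function name `summarize` and the dictionary layout are illustrative choices.

```python
from math import sqrt

# Anscombe's quartet: datasets I-III share the same x values.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
ANSCOMBE = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarize(x, y):
    """Return means, sample variances, correlation, and the least-squares line."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)  # sum of squared deviations of x
    syy = sum((b - mean_y) ** 2 for b in y)  # sum of squared deviations of y
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    r = sxy / sqrt(sxx * syy)
    return {
        "mean_x": mean_x,
        "var_x": sxx / (n - 1),
        "mean_y": mean_y,
        "var_y": syy / (n - 1),
        "r": r,
        "intercept": mean_y - slope * mean_x,
        "slope": slope,
        "r2": r * r,  # for simple linear regression, R² is the squared correlation
    }


summaries = {name: summarize(x, y) for name, (x, y) in ANSCOMBE.items()}
for name, s in summaries.items():
    print(name, {k: round(v, 3) for k, v in s.items()})
```

All four datasets produce nearly the same numbers, yet they look completely different when plotted, which is why summary statistics and visualisation are best used together.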

  26. Data communication: Graphics, Dashboards, Reports, Slides, API

  27. what are the key skills?

  28. Import Tidy Transform Visualise Model Communicate

  29. Import Tidy Transform Visualise Model Communicate Understand

  30. Program Import Tidy Transform Visualise Model Communicate Understand

  31. Statistic

  32. Statistic Program

  33. Think Do Describe (precisely) Cognitive Computational

  34. Programs like SPSS are busses...

    Easy to use for the standard things, but very frustrating if you want to do something that is not already preprogrammed.
  35. R is a 4-wheel drive SUV (though environmentally friendly) with a bike on the back, a kayak on top, good walking and running shoes in the passenger seat, and mountain climbing and spelunking gear in the back.

    R can take you anywhere you want to go if you take time to learn how to use the equipment, but that is going to take longer than learning where the bus stops are in SPSS.
  36. It’s just text!

  37. It’s just text! Ctrl + C, Ctrl + V

  39. how is it in JDS?

  40. Program Import Tidy Transform Visualise Model Communicate Understand

  41. Program Import Tidy Transform Visualise Model Communicate Understand

  42. Demo