pandas.from[0]

First presented at Kariera IT on 2021-03-06
Updated for Warsaw IT Days on 2021-04-07

Vitaliy Rudnytskiy

March 06, 2021
Transcript

  1. pandas.from[0], Witalij Rudnicki, SAP, April 2021

  2. @sygyzmundovych

    - aka Vitaliy Rudnytskiy, Віталій Рудницький, Witalij Rudnicki
    - A Developer Advocate in SAP
    - All things Data (with the focus on SAP HANA, SAP Data Intelligence, Analytics)
    - Based in Wrocław, Poland
    - Organizer of Wrocław SAP Meetup
    - https://people.sap.com/vitaliy.rudnytskiy
    - 51°04'40.3"N 16°57'48.8"E (WGS84)
  3. Black Hole M87 (Image Credits: Event Horizon Telescope Collaboration)
  4. What do pandas have in common with a black hole?

    source: https://iopscience.iop.org/article/10.3847/2041-8213/ab0c57
  5. What is pandas?

    “[…] we are concerned with data structures and tools for working with data sets in-memory, […]”
    “We hope that pandas will help make scientific Python a more attractive and practical statistical computing environment for academic and industry practitioners alike.”
    “pandas is a new Python library of data structures and statistical tools initially developed for quantitative finance applications. Most of our examples here stem from time series and cross-sectional data arising in financial modeling.”
    source: https://conference.scipy.org/proceedings/scipy2010/pdfs/mckinney.pdf
  6. What is pandas?

    “As a bit of background, I started building pandas in early 2008 during my tenure at AQR Capital Management, a quantitative investment management firm. At the time, I had a distinct set of requirements that were not well addressed by any single tool at my disposal…”
    “The pandas name itself is derived from panel data, an econometrics term for multidimensional structured datasets, and a play on the phrase Python data analysis itself.”
    “pandas provides high-level data structures and functions designed to make working with structured or tabular data fast, easy, and expressive.”
    source: https://pandas.pydata.org/about/
  7. What is pandas? source: https://pandas.pydata.org/about/

  8. What is pandas?

    data analysis / manipulation in Python; working with structured or tabular data; open source; high-level data structures and functions; data sets in-memory; the most powerful … tool available in any language
  9. Traditional Data Analytics vs. Exploratory Data Analysis

    Traditional Data Analytics: Modelling (Schema), Data Collection, Data Analysis, Data Storytelling
    Exploratory Data Analysis: Data Collection, Data Analysis, Modelling, Data Storytelling
  10. Data Analysis?? I want to be the Data Scientist!!

    source: https://www.forbes.com/sites/gilpress/2016/03/23/data-preparation-most-time-consuming-least-enjoyable-data-science-task-survey-says/
  11. What is pandas?

    data analysis / manipulation in Python; working with structured or tabular data; open source; high-level data structures and functions; data sets in-memory; the most powerful … tool available in any language
  12. The Global CTO Survey 2020 Report by STX Next

    source: https://www.stxnext.com/cto-survey-2020-report/
  13. Chicken or Egg? source: https://dev.pandas.io/pandas-blog/pandas-10.html/

  14. What is pandas?

    data analysis / manipulation in Python; working with structured or tabular data; open source; high-level data structures and functions; data sets in-memory; the most powerful … tool available in any language
  15. The replacement of… MS Excel?

  16. What is pandas?

    data analysis / manipulation in Python; working with structured or tabular data; open source; high-level data structures and functions; data sets in-memory; the most powerful … tool available in any language
  17. https://github.com/pandas-dev/pandas

  18. pandas 1.x

  19. The Game of Logos

  20. What is pandas?

    data analysis / manipulation in Python; working with structured or tabular data; open source; high-level data structures and functions; data sets in-memory; the most powerful … tool available in any language
  21. Let’s have a look… Demo: https://github.com/Sygyzmundovych/pandasfrom0

  22. Let’s have a look… or better: let’s get our hands dirty!

    https://colab.research.google.com/notebooks/mlcc/intro_to_pandas.ipynb
    https://dev.pandas.io/pandas-blog/2019-pandas-user-survey.html
  23. How is it used in companies like SAP?

    By our own data engineers, data scientists and data analysts, obviously
    By our software engineers:
    § https://pypi.org/project/contextual-ai/
    § https://pypi.org/project/sailor/
    § https://pypi.org/project/hana-ml/
    https://www.youtube.com/watch?v=fSiVmL4S00w&list=PLSXNnd21oW416dQIZu5-XGx9K0806AUKN
  24. What is pandas?

    data analysis / manipulation in Python; working with structured or tabular data; open source; high-level data structures and functions; data sets in-memory; the most powerful … tool available in any language
  25. Is pandas OK in 2021? source: https://towardsdatascience.com/are-you-still-using-pandas-to-process-big-data-in-2021-850ab26ad919

  26. What is pandas?

    data analysis / manipulation in Python; working with structured or tabular data; open source; high-level data structures and functions; data sets in-memory; the most powerful … tool available in any language
  27. Witalij Rudnicki @Sygyzmundovych Thank you. Dziękuję! http://bit.ly/PandasFrom0
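
The recurring "What is pandas?" slide describes high-level data structures and functions for analysing and manipulating structured, tabular data sets in memory. A minimal sketch of what that can look like is given here. It is not taken from the talk's demo repository; the `sales` table, its column names, and its values are hypothetical and invented only for illustration.

```python
import pandas as pd

# Hypothetical sample data, invented for this illustration only.
sales = pd.DataFrame(
    {
        "city": ["Wrocław", "Warsaw", "Wrocław", "Warsaw"],
        "day": pd.to_datetime(
            ["2021-03-06", "2021-03-06", "2021-04-07", "2021-04-07"]
        ),
        "units": [10, 20, 30, 40],
    }
)

# Manipulation: derive a new column from an existing datetime column.
sales["month"] = sales["day"].dt.month

# Analysis: aggregate the in-memory table by a key column.
units_per_city = sales.groupby("city")["units"].sum()

print(units_per_city)
```

The `DataFrame` is the tabular structure the slides refer to, and `groupby(...).sum()` is one of the high-level functions that would otherwise need an explicit loop or a spreadsheet pivot table.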