Python for Data Science - Python Brasil 11 (2015)

This talk, presented at Python Brasil 11 (2015), demonstrates a complete Data Science process, involving Obtaining, Scrubbing, Exploring, Modeling and Interpreting data using Python ecosystem tools, like IPython Notebook, Pandas, Matplotlib, NumPy, SciPy and Scikit-learn.

Gabriel Moreira

November 10, 2015

Transcript

  1. TYPES OF ANALYTICS
    Investigative Analytics (Consumers: Humans)
    Operational Analytics (Consumers: Machines)
    http://blog.cloudera.com/blog/2014/03/why-apache-spark-is-a-crossover-hit-for-data-scientists/
    https://hbr.org/2014/08/the-question-to-ask-before-hiring-a-data-scientist/
  2. INQUIRE
    1. Which communities are more popular?
    2. Is the user engagement increasing?
    3. What is the distribution of user interactions?
    4. Is there a relationship between publishing hour and number of interactions?
  3. OBTAIN
    • Download data from another location (e.g., a web page or server)
    • Query data from a database (e.g., MySQL or Oracle)
    • Extract data from an API (e.g., Twitter, Facebook)
    • Extract data from another file (e.g., an HTML file or spreadsheet)
    • Generate data yourself (e.g., reading sensors or taking surveys)
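    As a rough illustration of the "extract data from another file" option above, the sketch below parses a small CSV export using only the Python standard library. The column names (`post_id`, `community`, `interactions`) and the sample rows are hypothetical, invented for this example and not taken from the talk's dataset.

```python
import csv
import io

# Hypothetical CSV export of posts; in practice this text would be read
# from a file, a database dump, or an API response.
RAW = """post_id,community,interactions
1,python,12
2,java,7
3,python,5
"""

def load_posts(text):
    """Parse CSV text into a list of dicts with typed fields."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {
            "post_id": int(row["post_id"]),
            "community": row["community"],
            "interactions": int(row["interactions"]),
        }
        for row in reader
    ]

posts = load_posts(RAW)
```

    Converting the numeric columns while loading keeps the later scrubbing and exploring steps free of string-to-number surprises.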
  4. 4 - RELATIONSHIP BETWEEN PUBLISHING TIME AND NUMBER OF INTERACTIONS?
    http://viverdeblog.com/melhoresahorarios-para-postar-nas-redes-sociais/
  5. Operational Analytics - Tasks example: Find Related Posts
    1. Discover the most relevant words in the posts
    2. Find related posts, with similar content
  6. 1 - RELEVANT WORDS IN A POST
    TF-IDF: the more “relevant” terms in a document are those that are frequent in that document and rare in other documents.
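    The TF-IDF idea above can be sketched in a few lines of dependency-free Python. This is a minimal illustration, assuming whitespace tokenization and the plain weighting tf × ln(N / df); it is not the talk's implementation, and library versions such as Scikit-learn's `TfidfVectorizer` use smoothed variants by default, so their values will differ. The sample documents are made up for the example.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per document.

    tf  = term count / number of tokens in the document
    idf = ln(number of documents / number of documents containing the term)
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights

# Made-up sample documents.
docs = [
    "jmeter load tests",
    "jmeter performance tests",
    "jmeter ruby tutorial",
]
weights = tf_idf(docs)
```

    In this toy corpus "jmeter" appears in every document, so its weight is zero everywhere, while "load" (unique to the first document) outweighs "tests" (shared by two documents).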
  7. 2 - SIMILAR POSTS
    Cosine Similarity: a measure of similarity between two vectors, being the cosine of the angle between them.
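    A minimal sketch of the cosine similarity described above, written with the standard library only. It assumes each post is represented as a sparse vector stored in a `{term: weight}` dict (raw term counts here; TF-IDF weights would work the same way). The function name and the sample texts are illustrative, not from the talk.

```python
import math
from collections import Counter

def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors given as {term: weight} dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for empty vectors
    return dot / (norm_a * norm_b)

# Illustrative term-count vectors for two short texts.
post_a = Counter("jmeter tests in ruby".split())
post_b = Counter("jmeter performance tests".split())
similarity = cosine_similarity(post_a, post_b)
```

    With non-negative weights the result lies between 0 (no shared terms) and 1 (same direction), so ranking candidate posts by this value gives a "most similar post" lookup.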
  8. 2 - SIMILAR POSTS
    Original post:
    Did you ever wonder how great it would be if you could write your jmeter tests in ruby ? This projects aims to do so. If you use it on your project just let me now. On the Architecture Academy you can read how jmeter can be used to validate your Architecture. modulo 13 arch definition architecture validation | academia de arquitetura

    Most similar post (cosine similarity = 0.30), translated from Portuguese:
    Some how-tos related to performance testing have been made available on the Enterprise Architecture site, in the performance Knowledge Base section. Among them: how to define the requirements (throughput, calculating threads for JMeter, etc.), using JMeter, generating test data, and monitoring. planning and executing performance testing | enterprise architecture - how to identify performance acceptance criteria | enterprise architecture - how to geracao de massa de dados | enterprise architecture - how to jmeter | enterprise architecture - how to monitoramento | enterprise architecture
  9. DATA PRODUCTS
    “If information has context and the context is interactive, insights are not predictable.” [Agile Data Science, O’Reilly, 2014]
  10. DATA SCIENCE COURSES
    • Introduction to Data Science (Univ. of Washington)
    • Data Science specialization (Johns Hopkins)
    • Intro to Hadoop and MapReduce (Cloudera)
    • Machine Learning (Stanford)
    • Statistical Learning (Stanford)
    • Mining Massive Datasets (Stanford)
    • Scalable Machine Learning (Berkeley)
    http://workingsweng.com.br/2014/04/cursos-mooc-e-especializacoes-em-data-science/