Introduction to Data Visualization in Python

First meetup of PyPereira, about dataviz using seaborn, a Python library.

Juan Sebastián Vega

October 26, 2017

Transcript

  1. seaborn • Based on matplotlib • High-level • Several themes for styling plots • Distributions, regressions, matrices, time series,…
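The "themes for styling plots" bullet can be illustrated with seaborn's built-in styles. The snippet below is a minimal sketch rather than part of the original deck; it assumes seaborn and matplotlib are installed, and the choice of `whitegrid` and `dark` is only for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import seaborn as sns

# seaborn ships five named styles: darkgrid, whitegrid, dark, white, ticks.
sns.set_style("whitegrid")           # applies to every plot drawn afterwards

# axes_style() with no argument returns the parameters currently in effect.
current = sns.axes_style()
grid_enabled = bool(current["axes.grid"])

# axes_style(name) returns the parameters of a style without applying it;
# it can also be used as a context manager for a temporary style.
dark_params = sns.axes_style("dark")
dark_has_grid = bool(dark_params["axes.grid"])

print("whitegrid draws a grid:", grid_enabled)
print("dark draws a grid:", dark_has_grid)
```

Calling `sns.set_style` once at the top of a notebook is usually enough; the context-manager form is handy when a single figure needs a different look.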
  2. TIPS dataset

       total_bill   tip     sex  smoker  day    time  size
    0       16.99  1.01  Female      No  Sun  Dinner     2
    1       10.34  1.66    Male      No  Sun  Dinner     3
    2       21.01  3.50    Male      No  Sun  Dinner     3
    3       23.68  3.31    Male      No  Sun  Dinner     2
    4       24.59  3.61  Female      No  Sun  Dinner     4
    5       25.29  4.71    Male      No  Sun  Dinner     4
    6        8.77  2.00    Male      No  Sun  Dinner     2
    7       26.88  3.12    Male      No  Sun  Dinner     4
    8       15.04  1.96    Male      No  Sun  Dinner     2
    9       14.78  3.23    Male      No  Sun  Dinner     2

    tips = sns.load_dataset('tips')
  3. Titanic dataset

       survived  pclass     sex   age  sibsp  parch     fare  embarked   class    who  adult_male  deck  embark_town  alive  alone
    0         0       3    male  22.0      1      0   7.2500         S   Third    man        TRUE   NaN  Southampton     no  FALSE
    1         1       1  female  38.0      1      0  71.2833         C   First  woman       FALSE     C    Cherbourg    yes  FALSE
    2         1       3  female  26.0      0      0   7.9250         S   Third  woman       FALSE   NaN  Southampton    yes   TRUE
    3         1       1  female  35.0      1      0  53.1000         S   First  woman       FALSE     C  Southampton    yes  FALSE
    4         0       3    male  35.0      0      0   8.0500         S   Third    man        TRUE   NaN  Southampton     no   TRUE
    5         0       3    male   NaN      0      0   8.4583         Q   Third    man        TRUE   NaN   Queenstown     no   TRUE
    6         0       1    male  54.0      0      0  51.8625         S   First    man        TRUE     E  Southampton     no   TRUE
    7         0       3    male   2.0      3      1  21.0750         S   Third  child       FALSE   NaN  Southampton     no  FALSE
    8         1       3  female  27.0      0      2  11.1333         S   Third  woman       FALSE   NaN  Southampton    yes  FALSE
    9         1       2  female  14.0      1      0  30.0708         C  Second  child       FALSE   NaN    Cherbourg    yes  FALSE

    titanic = sns.load_dataset('titanic')
  4. Anscombe's quartet

       dataset     x      y
    0        I  10.0   8.04
    1        I   8.0   6.95
    2        I  13.0   7.58
    3        I   9.0   8.81
    4        I  11.0   8.33
    5        I  14.0   9.96
    6        I   6.0   7.24
    7        I   4.0   4.26
    8        I  12.0  10.84
    9        I   7.0   4.82

    anscombe = sns.load_dataset('anscombe')
  5. Different types of models: outliers

    anscombe_3 = anscombe[anscombe.dataset == 'III']
    sns.lmplot(x='x', y='y', data=anscombe_3, ci=None)
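To make the effect of the outlier on this slide concrete, here is a small sketch that fits an ordinary least-squares line to Anscombe's dataset III with and without its single outlying point. It is not from the deck: it uses plain Python instead of seaborn, the dataset III values are typed in by hand from Anscombe's published quartet, and `fit_line` is a hypothetical helper written for this illustration.

```python
# Anscombe's dataset III; the point (13, 12.74) is the outlier.
x = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
y = [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]


def fit_line(xs, ys):
    """Ordinary least-squares fit of ys on xs; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sxx = sum((a - mean_x) ** 2 for a in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Fit on all eleven points, outlier included.
slope_all, intercept_all = fit_line(x, y)

# Fit again after dropping the outlier.
kept = [(a, b) for a, b in zip(x, y) if (a, b) != (13, 12.74)]
x_clean, y_clean = zip(*kept)
slope_clean, intercept_clean = fit_line(x_clean, y_clean)

print(f"with outlier:    y = {intercept_all:.2f} + {slope_all:.3f} x")
print(f"without outlier: y = {intercept_clean:.2f} + {slope_clean:.3f} x")
```

One point out of eleven pulls the slope from roughly 0.35 up to roughly 0.50, which is why plotting the data before trusting a fitted line matters.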
  6. How does the relationship between two variables change with respect to others? One extra variable

    sns.lmplot(x='total_bill', y='tip', hue='smoker', data=tips)
  7. How does the relationship between two variables change with respect to others? Two extra variables

    sns.lmplot(x='total_bill', y='tip', hue='smoker', col='time', data=tips)
  8. How does the relationship between two variables change with respect to others? Three extra variables

    sns.lmplot(x='total_bill', y='tip', hue='smoker', col='time', row='sex', data=tips)
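The three `lmplot` calls above map extra variables to color (`hue`), columns (`col`) and rows (`row`). The sketch below shows how that mapping turns into a grid of subplots. It is an illustration under stated assumptions, not the author's code: it assumes pandas, seaborn and matplotlib are installed, and it uses only the ten rows printed on the TIPS slide (typed in by hand so it runs offline), so the fitted lines themselves are not meaningful.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import pandas as pd
import seaborn as sns

# The ten rows shown on the TIPS slide. Every one of them has
# smoker='No' and time='Dinner', so hue and col each have a single level.
tips_head = pd.DataFrame({
    "total_bill": [16.99, 10.34, 21.01, 23.68, 24.59,
                   25.29, 8.77, 26.88, 15.04, 14.78],
    "tip": [1.01, 1.66, 3.50, 3.31, 3.61, 4.71, 2.00, 3.12, 1.96, 3.23],
    "sex": ["Female", "Male", "Male", "Male", "Female",
            "Male", "Male", "Male", "Male", "Male"],
    "smoker": ["No"] * 10,
    "time": ["Dinner"] * 10,
})

# ci=None skips the bootstrap confidence band, which would be
# meaningless on so few points.
grid = sns.lmplot(x="total_bill", y="tip", hue="smoker",
                  col="time", row="sex", data=tips_head, ci=None)

# lmplot returns a FacetGrid: one subplot row per level of `row`
# and one subplot column per level of `col`.
n_rows, n_cols = grid.axes.shape
print("subplot rows:", n_rows, "subplot columns:", n_cols)
```

With the full dataset loaded through `sns.load_dataset('tips')`, the same call produces one panel for each combination of `sex` and `time` present in the data.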
  9. Conclusions • Explore and understand the data • Seaborn: makes certain hard things easy to do • Seaborn complements matplotlib /sebasvega95 Thanks!!