Slide 1

Slide 1 text

Time Series Data Processing with pandas @PyConJP 2016 sinhrks

Slide 2

Slide 2 text

About me
• @sinhrks
• Work: data analysis
• OSS activities: PyData Development Team (pandas), Dask Development Team (Dask)
• GitHub: https://github.com/sinhrks

Slide 3

Slide 3 text

Goals
• Learn efficient ways to process data for time series analysis
• Get an introduction to time series models

Slide 4

Slide 4 text

Contents
• What is pandas
• Time series data processing
• Statistical models for time series data
• Bonus: development roadmap

Slide 5

Slide 5 text

What is pandas?

Slide 6

Slide 6 text

What is pandas
• Provides data structures for data analysis, plus convenient functions and methods for preprocessing and aggregating data
• R's "data.frame" + α
• Author: Wes McKinney
• License: BSD
• Name origin: PANel DAta System
• GitHub: 7000+ stars

Slide 7

Slide 7 text

Benefits of using pandas
• Copes with real-world (messy) data
• Intuitive operations
• Fast
• See also: pandas internals @PyConJP 2015

Slide 8

Slide 8 text

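pandas data structures
• Defined per number of dimensions:
    Series: 1 dimension
    DataFrame: 2 dimensions
    Panel: 3 dimensions
• Colored cells are labels
• Higher-dimensional data structures are deprecated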

Slide 9

Slide 9 text

DataFrame
• A 2-dimensional data structure:
    • Rows (index) and columns (columns) both carry labels
    • Each column has its own dtype
(Figure labels: Columns, Index, int dtype, object dtype)

Slide 10

Slide 10 text

DataFrame: reading a CSV file

    import pandas as pd

    df = pd.read_csv('adult.csv')
    df

Adult Dataset taken from UCI ML Repository: Lichman, M. UCI Machine Learning Repository. Irvine, CA: University of California, School of Information and Computer Science.

Slide 11

Slide 11 text

DataFrame
• Selecting columns:

    df[['age', 'marital-status']]

• Group → select a column → aggregate (mean):

    df.groupby('income')['hours-per-week'].mean()
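The group, select, aggregate chain above can be tried on a tiny table. This is a minimal sketch: the four rows below are made-up stand-ins for the adult dataset (an assumption for illustration), and only the column names match the slide.

```python
import pandas as pd

# Toy stand-in for the adult dataset; the values are invented for illustration.
df = pd.DataFrame({
    "income": ["<=50K", ">50K", "<=50K", ">50K"],
    "hours-per-week": [40, 50, 30, 60],
})

# Split the rows by income, pick one column, then take the mean of each group.
mean_hours = df.groupby("income")["hours-per-week"].mean()
print(mean_hours)
```

The result is a Series indexed by the group keys, so `mean_hours[">50K"]` looks up one group.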

Slide 12

Slide 12 text

pandas features
• Vectorized computation
• Grouping → aggregation (split-apply-combine)
• Reshaping and combining (merge, join, concat…)
• Varied I/O (SQL, CSV, Excel, …)
• Flexible time series processing
• Visualization

Slide 13

Slide 13 text

Environment
• Versions
    • Python 3.5.2
    • pandas 0.19.0rc1
    • statsmodels 0.8.0rc1
• Namespaces

    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt
    import statsmodels.api as sm

Slide 14

Slide 14 text

Time Series Data Processing with pandas

Slide 15

Slide 15 text

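What is time series data
"A series (sequence) of values obtained by observing how some phenomenon changes over time, either continuously or discontinuously at fixed intervals" (from Wikipedia)
• What about real-world data?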

Slide 16

Slide 16 text

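Real-world data
• The frequency you need differs from the frequency you have
    • You want to analyze daily data on a monthly basis
• The data is not periodic
    • You want to analyze logs recorded each time an event occurs
• The data is not labeled by time
    • You want to take raw data that contains datetimes as a column, aggregate it per period, and analyze it
Apply some preprocessing to get the data into a form that is easy to handle: raw data → time series data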

Slide 17

Slide 17 text

Preparing time series data

    import datetime

    # a list of datetimes
    values = [datetime.datetime(2001, 1, 1),
              datetime.datetime(2001, 2, 1),
              datetime.datetime(2001, 3, 1)]

    # 1-dimensional data labeled by datetime (Series)
    s = pd.Series(np.arange(3), index=values)
    s
    # 2001-01-01    0
    # 2001-02-01    1
    # 2001-03-01    2
    # dtype: int64

    # 2-dimensional data labeled by datetime (DataFrame)
    # 商品A / 商品B mean "product A" / "product B"
    df = pd.DataFrame({'商品A': [25, 27, 30],
                       '商品B': [10, 15, 17]},
                      index=values)
    df

Slide 18

Slide 18 text

Labels of time series data

    # the labels (index) have a datetime type: DatetimeIndex
    df.index
    # DatetimeIndex(['2001-01-01', '2001-02-01', '2001-03-01'],
    #               dtype='datetime64[ns]', freq=None)

    # selecting a column
    df['商品A']
    # 2001-01-01    25
    # 2001-02-01    27
    # 2001-03-01    30
    # Name: 商品A, dtype: int64

Slide 19

Slide 19 text

Preparing data
• What we want to do
    • 1. Attach datetime labels to data
    • 2. Parse arbitrary datetime formats

Slide 20

Slide 20 text

Generating sequences (pd.date_range)

    # data without datetime labels
    s = pd.Series(np.arange(3))
    s
    # 0    0
    # 1    1
    # 2    2
    # dtype: int64

    # create 3 monthly timestamps starting from 2001-01-01
    pd.date_range('2001-01-01', freq='M', periods=3)
    # DatetimeIndex(['2001-01-31', '2001-02-28', '2001-03-31'],
    #               dtype='datetime64[ns]', freq='M')

    # overwrite the labels (index)
    s.index = pd.date_range('2001-01-01', freq='M', periods=3)
    s
    # 2001-01-31    0
    # 2001-02-28    1
    # 2001-03-31    2
    # Freq: M, dtype: int64

Slide 21

Slide 21 text

Frequency String
• Specifies the frequency of the time series to generate
• 25 kinds in total, including:

    Frequency String   Meaning
    A                  year end
    M                  month end
    W                  week
    D                  day
    H                  hour
    T                  minute
    S                  second

Slide 22

Slide 22 text

Frequency String

    # M: month end
    pd.date_range('2016-01-01', freq='M', periods=3)
    # DatetimeIndex(['2016-01-31', '2016-02-29', '2016-03-31'],
    #               dtype='datetime64[ns]', freq='M')

    # MS: month start
    pd.date_range('2016-01-01', freq='MS', periods=3)
    # DatetimeIndex(['2016-01-01', '2016-02-01', '2016-03-01'],
    #               dtype='datetime64[ns]', freq='MS')

    # W: week
    pd.date_range('2016-01-01', freq='W', periods=3)
    # DatetimeIndex(['2016-01-03', '2016-01-10', '2016-01-17'],
    #               dtype='datetime64[ns]', freq='W-SUN')

    # W-TUE: week, anchored on Tuesday
    pd.date_range('2016-01-01', freq='W-TUE', periods=3)
    # DatetimeIndex(['2016-01-05', '2016-01-12', '2016-01-19'],
    #               dtype='datetime64[ns]', freq='W-TUE')
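As a quick check of how anchored frequency strings behave, the sketch below regenerates two of the ranges above and also uses a multiplied frequency ('2D'), which is not on the slide and is included as an extra illustration. It uses only 'MS', 'W-TUE' and 'D'.

```python
import pandas as pd

# 'MS' anchors each period at the first day of the month.
month_start = pd.date_range("2016-01-01", freq="MS", periods=3)

# 'W-TUE' yields weekly timestamps that all fall on a Tuesday.
weekly_tue = pd.date_range("2016-01-01", freq="W-TUE", periods=3)

# A leading integer multiplies the base frequency: '2D' steps two days at a time.
every_two_days = pd.date_range("2016-01-01", freq="2D", periods=3)

print(month_start.strftime("%Y-%m-%d").tolist())
print(weekly_tue.strftime("%Y-%m-%d").tolist())
print(every_two_days.strftime("%Y-%m-%d").tolist())
```

Note that the start date itself (a Friday) is not part of the 'W-TUE' range; generation begins at the first Tuesday on or after it.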

Slide 23

Slide 23 text

Parsing datetimes (pd.to_datetime)
• Parses datetime strings quickly
• C parser → regular expressions → dateutil

    pd.to_datetime(['2016-09-22', '2016-09-23'])
    # DatetimeIndex(['2016-09-22', '2016-09-23'],
    #               dtype='datetime64[ns]', freq=None)

    pd.to_datetime(['September 22nd, 2016', 'September 22nd, 2016'])
    # DatetimeIndex(['2016-09-22', '2016-09-22'],
    #               dtype='datetime64[ns]', freq=None)

    pd.to_datetime(['22 Sep 2016', '23 Sep 2016'])
    # DatetimeIndex(['2016-09-22', '2016-09-23'],
    #               dtype='datetime64[ns]', freq=None)

Slide 24

Slide 24 text

Parsing datetimes (pd.to_datetime)
• Flexible parsing is also possible by specifying a format

    # Japanese-style dates are not recognized automatically
    pd.to_datetime(['2016年9月22日', '2016年9月23日'])
    # ValueError: Unknown string format

    pd.to_datetime(['2016年9月22日', '2016年9月23日'],
                   format='%Y年%m月%d日')
    # DatetimeIndex(['2016-09-22', '2016-09-23'],
    #               dtype='datetime64[ns]', freq=None)
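A small sketch of format-based parsing, assuming the same Japanese date layout as above. The `errors="coerce"` variant is an addition for illustration: it turns strings that do not match the format into NaT instead of raising.

```python
import pandas as pd

raw = ["2016年9月22日", "2016年9月23日"]

# An explicit strptime-style format tells pandas how to read each string.
parsed = pd.to_datetime(raw, format="%Y年%m月%d日")

# With errors="coerce", values that do not match the format become NaT.
coerced = pd.to_datetime(["2016年9月22日", "not a date"],
                         format="%Y年%m月%d日", errors="coerce")

print(parsed)
print(coerced)
```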

Slide 25

Slide 25 text

Selecting data
• What we want to do
    • 1. Select a specific datetime
    • 2. Select a specific period
    • 3. Select datetimes that satisfy a condition

Slide 26

Slide 26 text

Selecting data

    idx = pd.date_range('2016-01-01', freq='D', periods=366)
    df = pd.DataFrame({'商品A': np.random.randint(100, size=366),
                       '商品B': np.random.randint(100, size=366)},
                      index=idx)
    df

    # select the row for 2016-01-02; the result is a Series
    df.loc[datetime.datetime(2016, 1, 2)]
    # 商品A    12
    # 商品B    64
    # Name: 2016-01-02 00:00:00, dtype: int64

    # strings are treated as datetimes
    df.loc['2016-01-02']
    # 商品A    12
    # 商品B    64
    # Name: 2016-01-02 00:00:00, dtype: int64

Slide 27

Slide 27 text

Selection by slicing

    # select 2016-09-22 and later
    df.loc['2016-09-22':]

    # select 2016-09-01 through 2016-09-30, every other day
    df.loc['2016-09-01':'2016-09-30':2]

Slide 28

Slide 28 text

Selection by partial string

    # the string contains no day, so all data for March is selected
    df.loc['2016-03']

    # select data from March through May
    df.loc['2016-03':'2016-05']
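The slicing and partial-string rules can be verified by counting rows. This is a minimal sketch with a hypothetical single-column frame (`val` is an invented column name) over the same daily 2016 index.

```python
import numpy as np
import pandas as pd

idx = pd.date_range("2016-01-01", freq="D", periods=366)
df = pd.DataFrame({"val": np.arange(366)}, index=idx)  # 'val' is a made-up column

# A year-month string matches every row in that month.
march = df.loc["2016-03"]

# Label slices include both endpoints, so this covers March 1 through May 31.
spring = df.loc["2016-03":"2016-05"]

# A step of 2 keeps every other row within the sliced range.
every_other = df.loc["2016-09-01":"2016-09-30":2]

print(len(march), len(spring), len(every_other))
```

Unlike positional slices, the end label is inclusive, which is why the March to May slice ends on May 31.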

Slide 29

Slide 29 text

Selection by condition

    # vectorized property access
    df.index.month
    # array([ 1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1, ...
    #        12, 12, 12, 12, 12, 12, 12, 12, 12], dtype=int32)

    (df.index.month == 1) | (df.index.month == 3)
    # array([ True,  True,  True,  True,  True,  True,  True, ...
    #        False, False, False, False, False, False], dtype=bool)

    # select the rows whose month is January or March
    df.loc[(df.index.month == 1) | (df.index.month == 3)]

Slide 30

Slide 30 text

Property access
• Processing that depends on datetime attributes is easy to write

    Property       Property
    year           date
    month          time
    day            dayofyear
    hour           weekofyear
    minute         week
    second         dayofweek
    microsecond    weekday
    nanosecond     weekday_name
    quarter
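A short sketch of condition-based selection built on these properties. The frame and its `val` column are hypothetical; `np.isin` is used here as one way to test membership in several months at once.

```python
import numpy as np
import pandas as pd

idx = pd.date_range("2016-01-01", freq="D", periods=366)
df = pd.DataFrame({"val": np.arange(366)}, index=idx)  # 'val' is a made-up column

# Rows in January or March: a boolean mask built from the month property.
jan_mar = df.loc[np.isin(df.index.month, [1, 3])]

# Weekend rows: dayofweek is 0 for Monday through 6 for Sunday.
weekends = df.loc[df.index.dayofweek >= 5]

print(len(jan_mar), len(weekends))
```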

Slide 31

Slide 31 text

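Preprocessing datetime data
• What we want to do
    • 1. Change the frequency of the datetimes
    • 2. Fill in missing values
    • 3. Compare with, or compute against, the preceding and following values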

Slide 32

Slide 32 text

Resampling (.resample)
• Sample data
    • Cumulative sum of normal random numbers (a random walk)

    idx = pd.date_range('2016-09-22', freq='H', periods=50)
    df = pd.DataFrame({'val': np.random.randn(50)}, index=idx)
    df = df.cumsum()
    df

Slide 33

Slide 33 text

Resampling (.resample)

    # downsampling: hourly → 6-hourly means
    df.resample('6H').mean()

    # upsampling: hourly → every 30 minutes, filled by interpolation
    df.resample('30T').interpolate()
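To make the two directions concrete, here is a minimal sketch on deterministic data (0 to 11, one value per hour) rather than a random walk, so the results can be checked by hand. The lowercase aliases 'h' and 'min' are used as an assumption about the installed pandas version; recent releases prefer them over 'H' and 'T'.

```python
import numpy as np
import pandas as pd

idx = pd.date_range("2016-09-22", freq="h", periods=12)
df = pd.DataFrame({"val": np.arange(12, dtype=float)}, index=idx)

# Downsampling: each 6-hour bin is reduced to the mean of its 6 hourly values.
down = df.resample("6h").mean()

# Upsampling: new half-hour rows are created, then filled by linear interpolation.
up = df.resample("30min").interpolate()

print(down)
print(up.head(4))
```

Downsampling needs an aggregation to decide how many values become one; upsampling needs a fill rule to decide what the new rows contain.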

Slide 34

Slide 34 text

Resampling (.resample)
• Many kinds of aggregation are available

    Aggregation method   Aggregation method
    ffill                median
    backfill             min
    pad                  ohlc
    fillna               prod
    interpolate          size
    count                sem
    nunique              std
    first                sum
    last                 var
    max

Slide 35

Slide 35 text

Interpolation (.interpolate)
• Sample data
    • The values contain missing entries (NaN)

    indexer = np.random.randint(4, size=50) == 1
    df.loc[indexer] = np.nan
    df

Slide 36

Slide 36 text

Interpolation (.interpolate)
• Fills in missing values
• Uses scipy.interpolate internally (for methods other than the default linear one)

    df.interpolate()
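A minimal sketch of what the default linear interpolation does, on a hand-made Series with gaps. Forward fill is shown next to it for contrast; it is not on the slide and is included as an illustrative alternative.

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, np.nan, np.nan, 4.0, np.nan, 6.0])

# Default method is linear: gaps are filled evenly between the known neighbours.
filled = s.interpolate()

# Forward fill instead repeats the last observed value.
forward = s.ffill()

print(filled.tolist())
print(forward.tolist())
```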

Slide 37

Slide 37 text

Window functions (.rolling)
• As with .resample, aggregation methods can be chained

    df.rolling(3).mean()
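The sketch below shows a 3-row moving average on a small hand-made Series. The `min_periods=1` variant is an extra illustration of how the leading NaN values can be avoided.

```python
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])

# Each output is the mean of the current row and the two rows before it.
# The first two positions have fewer than 3 observations, so they are NaN.
roll = s.rolling(3).mean()

# min_periods=1 computes a mean as soon as one observation is available.
partial = s.rolling(3, min_periods=1).mean()

print(roll.tolist())
print(partial.tolist())
```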

Slide 38

Slide 38 text

Shift (.shift)
• Shifts the values by the specified number of periods
• Handy for comparing with, or computing against, the preceding and following values

    df.shift(periods=1)

Slide 39

Slide 39 text

Difference (.diff)
• Takes the difference from the value the specified number of periods earlier
• Same as df - df.shift()

    df.diff(periods=1)
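The equivalence between `.diff()` and subtracting a shifted copy can be checked directly. This is a minimal sketch on an invented four-day Series.

```python
import pandas as pd

s = pd.Series([10.0, 13.0, 12.0, 20.0],
              index=pd.date_range("2016-09-22", periods=4, freq="D"))

shifted = s.shift(periods=1)   # each value moves one row down; the first row becomes NaN
by_diff = s.diff(periods=1)    # difference from the previous row
by_shift = s - shifted         # the same computation written out

print(by_diff.tolist())
print(by_diff.equals(by_shift))
```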

Slide 40

Slide 40 text

Using shift (example)

    idx = pd.date_range('2016-09-22 10:00', freq='T', periods=50)
    df = pd.DataFrame({'val': np.repeat([0, 1, 0, 1], [10, 20, 10, 10])},
                      index=idx)
    df

    # timestamps where the value differs from the previous row
    df.index[df['val'] != df['val'].shift()]
    # DatetimeIndex(['2016-09-22 10:00:00', '2016-09-22 10:10:00',
    #                '2016-09-22 10:30:00', '2016-09-22 10:40:00'],
    #               dtype='datetime64[ns]', freq=None)

Slide 41

Slide 41 text

Aggregation
• What we want to do
    • Aggregate raw data that contains datetimes

Slide 42

Slide 42 text

Aggregating datetime data
• Sample data
    • Product order data

    # 数量 = quantity, 商品名 = product name, 発注日 = order date
    df = pd.DataFrame({'数量': np.random.randint(100, size=1000),
                       '商品名': np.random.choice(list('ABC'), 1000),
                       '発注日': np.random.choice(idx, 1000)})
    df

Slide 43

Slide 43 text

Aggregating datetime data
• pd.Grouper
    • Grouping by a column name together with a frequency

    df.groupby([pd.Grouper(key='発注日', freq='M'), '商品名']).sum()
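A minimal sketch of `pd.Grouper` on four hand-made orders. The English column names (`order_date`, `product`, `quantity`) are assumptions standing in for the Japanese ones, and 'MS' (month start) is used for the bins instead of 'M'.

```python
import pandas as pd

# Hypothetical order data; names and values are invented for illustration.
orders = pd.DataFrame({
    "order_date": pd.to_datetime(["2016-01-05", "2016-01-20",
                                  "2016-02-03", "2016-02-10"]),
    "product": ["A", "B", "A", "A"],
    "quantity": [10, 5, 7, 3],
})

# Group by calendar month of order_date and by product, then sum quantities.
monthly = orders.groupby(
    [pd.Grouper(key="order_date", freq="MS"), "product"]
)["quantity"].sum()

print(monthly)
```

The datetime column stays an ordinary column; the Grouper bins it on the fly, so no index needs to be set first.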

Slide 44

Slide 44 text

Property access (.dt)
• Datetime properties of a column are accessible through the .dt accessor

    df['発注日'].dt.weekday
    # 0      6
    # 1      4
    # 2      6
    #       ..
    # 997    6
    # 998    5
    # 999    2
    # Name: 発注日, dtype: int64

    # aggregate by day of the week
    df.groupby(df['発注日'].dt.weekday).sum()

Slide 45

Slide 45 text

Cross tabulation (pd.pivot_table)
• pd.pivot_table combined with
    • pd.Grouper
    • property access

    pd.pivot_table(df, index=pd.Grouper(key='発注日', freq='M'),
                   columns='商品名', values='数量', aggfunc='sum')
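The same idea as a cross table, sketched on hypothetical order data with assumed English column names. `fill_value=0` is an addition for illustration: it replaces month and product combinations that have no orders with 0 instead of NaN.

```python
import pandas as pd

# Hypothetical order data; names and values are invented for illustration.
orders = pd.DataFrame({
    "order_date": pd.to_datetime(["2016-01-05", "2016-01-20",
                                  "2016-02-03", "2016-02-10"]),
    "product": ["A", "B", "A", "A"],
    "quantity": [10, 5, 7, 3],
})

# Rows: month of order_date; columns: product; cells: summed quantity.
table = pd.pivot_table(orders,
                       index=pd.Grouper(key="order_date", freq="MS"),
                       columns="product", values="quantity",
                       aggfunc="sum", fill_value=0)

print(table)
```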

Slide 46

Slide 46 text

Calendars
• pandas.tseries.offsets.CustomBusinessDay
    • Processing that takes holidays into account
• japandas (https://github.com/sinhrks/japandas)
    • JapaneseHolidayCalendar

    from pandas.tseries.offsets import CustomBusinessDay
    import japandas

    cal = japandas.JapaneseHolidayCalendar()
    cbd = CustomBusinessDay(calendar=cal)

    idx = pd.DatetimeIndex(['2016-09-20', '2016-09-21', '2016-09-22'])
    # 2016-09-22 is a holiday, so it is skipped
    idx + cbd
    # DatetimeIndex(['2016-09-21', '2016-09-23', '2016-09-23'],
    #               dtype='datetime64[ns]', freq=None)
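The holiday-skipping behaviour can be reproduced without japandas by passing the holiday explicitly. This is a sketch under that assumption: the one-entry `holidays` list stands in for the full Japanese calendar, and the offset is applied per timestamp rather than to the whole index.

```python
import pandas as pd
from pandas.tseries.offsets import CustomBusinessDay

# Assumption: a single hand-listed holiday instead of JapaneseHolidayCalendar.
cbd = CustomBusinessDay(holidays=["2016-09-22"])

idx = pd.DatetimeIndex(["2016-09-20", "2016-09-21", "2016-09-22"])

# Move each date forward by one business day, skipping weekends and the holiday.
shifted = pd.DatetimeIndex([d + cbd for d in idx])

print(shifted.strftime("%Y-%m-%d").tolist())
```

Both 2016-09-21 and the holiday itself land on 2016-09-23, the next day that counts as a business day.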

Slide 47

Slide 47 text

Visualization
• Plots time series data with the frequency adjusted automatically

    idx1 = pd.date_range('2016-09-01', freq='D', periods=50)
    df1 = pd.DataFrame({'val1': np.random.randn(50)}, index=idx1)
    df1.plot()

Slide 48

Slide 48 text

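Visualization
• Plots time series data with the frequency adjusted automatically
• Also adjusts automatically when the frequencies differ

    idx2 = pd.date_range('2016-09-01', freq='M', periods=3)
    df2 = pd.DataFrame({'val1': np.random.randn(3)}, index=idx2)

    # draw the monthly data on the same axes as the daily data
    ax = df1.plot()
    df2.plot(ax=ax)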

Slide 49

Slide 49 text

Statistical Models for Time Series Data

Slide 50

Slide 50 text

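Statistical models for time series data
• Goals
    • Examine relationships between time series
    • Forecast the future
    • Detect change points / anomalies
    • …
• Points to keep in mind with time series data
    • Is the series influenced by data from before a given point in time?
    • Is there a trend or seasonality?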

Slide 51

Slide 51 text

Python packages that include time series models
• Choose a package according to the model you want to use
• Use R where necessary (rpy2, pypeR)

                          StatsModels     PyFlux
    Statistics / tests    ✅
    ARIMA                 ✅              ✅
    VAR                   ✅              ✅
    GARCH                 ✅ (sandbox)    ✅
    GAS                                   ✅
    StateSpace            ✅ (rc)         ✅

Slide 52

Slide 52 text

Sample data
• AirPassengers
    • Monthly international airline passenger counts (in thousands)
    • Univariate, with trend and seasonality

    df = pd.read_csv('airpassengers.csv', index_col=0, parse_dates=[0])
    df

Slide 53

Slide 53 text

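Decomposing a time series into components
• Decomposes the series into trend, seasonality, and residual

    res = sm.tsa.seasonal_decompose(df)
    fig = res.plot();

(Plot panels: original data, trend, seasonality, residual)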

Slide 54

Slide 54 text

Statistics
• Sample autocorrelation (ACF)
    • The covariance between different points in time, standardized
• Sample partial autocorrelation (PACF)

    fig, axes = plt.subplots(1, 2)
    sm.tsa.graphics.plot_acf(df, ax=axes[0]);
    sm.tsa.graphics.plot_pacf(df, ax=axes[1]);
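To show what "standardized covariance between time points" means numerically, here is a minimal NumPy sketch of the sample autocorrelation. The helper `sample_acf` is hypothetical (not a statsmodels function) and uses one common convention: lag-k autocovariance divided by the lag-0 autocovariance, both computed around the overall mean.

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelation for lags 0..max_lag (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()                      # deviations from the mean
    denom = np.sum(d * d)                 # lag-0 term used for standardization
    n = len(x)
    return np.array([np.sum(d[k:] * d[:n - k]) / denom
                     for k in range(max_lag + 1)])

# An alternating series is strongly negatively correlated with itself at lag 1.
x = np.array([1.0, -1.0] * 4)
acf = sample_acf(x, 2)
print(acf)
```

Lag 0 is always 1 by construction; values near zero at other lags are what white noise would produce.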

Slide 55

Slide 55 text

Statistics
• For normal random numbers (white noise)
    • There is no correlation with past values

    wn = pd.Series(np.random.randn(100))
    fig, axes = plt.subplots(1, 2)
    sm.tsa.graphics.plot_acf(wn, ax=axes[0]);
    sm.tsa.graphics.plot_pacf(wn, ax=axes[1]);

Slide 56

Slide 56 text

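SARIMA model
• Seasonal autoregressive integrated moving average model
    • Autoregressive integrated moving average model (ARIMA)
    • + seasonal variation (ARIMA)
• ARIMA(p, d, q)
    • The time series $y_t$ obtained after taking the d-th order difference
        • is (weakly) stationary (its mean and autocovariance are constant over time)
        • follows the process below
    • $y_t = c + \phi_1 y_{t-1} + \dots + \phi_p y_{t-p} + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \dots + \theta_q \varepsilon_{t-q}$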
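The ARMA recursion for the differenced series can be written out directly. This is a minimal sketch, not how statsmodels fits or simulates models: the function `simulate_arma` is hypothetical, and terms that would reach before the start of the series are simply treated as zero.

```python
import numpy as np

def simulate_arma(c, phi, theta, eps):
    """y_t = c + sum_i phi_i * y_{t-i} + eps_t + sum_j theta_j * eps_{t-j}.

    Hypothetical helper; lags before t=0 contribute nothing.
    """
    y = np.zeros(len(eps))
    for t in range(len(eps)):
        ar = sum(phi[i] * y[t - 1 - i]
                 for i in range(len(phi)) if t - 1 - i >= 0)
        ma = sum(theta[j] * eps[t - 1 - j]
                 for j in range(len(theta)) if t - 1 - j >= 0)
        y[t] = c + ar + eps[t] + ma
    return y

# Feed in a single unit shock to see how an ARMA(1, 1) process responds to it.
eps = np.array([1.0, 0.0, 0.0, 0.0])
y = simulate_arma(c=0.0, phi=[0.5], theta=[0.2], eps=eps)
print(y)
```

With |φ₁| < 1 the response to the shock decays geometrically, which is the behaviour expected of a stationary process.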

Slide 57

Slide 57 text

Removing the trend

    df.plot()
    # first difference
    df.diff().plot()

(Plot annotation: the variance grows over time)

Slide 58

Slide 58 text

Log transformation

    # log transformation
    ldf = np.log(df)
    ldf.plot()
    ldf.diff().plot()
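Why the log helps can be seen on a synthetic series. The sketch below assumes a made-up series growing 10% per step (not the AirPassengers data): its raw differences keep growing, while the differences of its logarithm are constant.

```python
import numpy as np
import pandas as pd

# Synthetic series with constant 10% growth per period (illustrative only).
s = pd.Series(100 * 1.1 ** np.arange(10))

raw_diff = s.diff().dropna()          # grows along with the level of the series
log_diff = np.log(s).diff().dropna()  # constant: log(1.1) at every step

print(raw_diff.round(2).tolist())
print(log_diff.round(4).tolist())
```

The same reasoning applies to fluctuations that scale with the level: after the log transform they have roughly constant size.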

Slide 59

Slide 59 text

Removing seasonality

    res = sm.tsa.seasonal_decompose(ldf)
    # subtract the seasonal component
    seasonal_adjust = (ldf - res.seasonal)
    seasonal_adjust.plot()

Slide 60

Slide 60 text

Unit root test
• Augmented Dickey-Fuller test (element [1] of the result is the p-value)

    # original data
    sm.tsa.adfuller(df['Air passengers'])[1]
    # 0.99188024343764114

    # log-transformed
    sm.tsa.adfuller(ldf['Air passengers'])[1]
    # 0.42236677477038415

    # log-transformed + first difference
    sm.tsa.adfuller(ldf['Air passengers'].diff().dropna())[1]
    # 0.071120548150854057

    # log-transformed + seasonality removed + first difference
    sm.tsa.adfuller(seasonal_adjust['Air passengers'].diff().dropna())[1]
    # 8.0990048658604878e-09

Slide 61

Slide 61 text

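Estimating the SARIMA model

    # order: ARIMA parameters; seasonal_order: seasonal component parameters
    mod_seasonal = sm.tsa.SARIMAX(ldf, trend='c', order=(1, 1, 1),
                                  seasonal_order=(0, 1, 2, 12))
    res_seasonal = mod_seasonal.fit()
    res_seasonal.summary()
    # …

• Requires the statsmodels release candidate (rc)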

Slide 62

Slide 62 text

Forecasting from the model

    # forecast 36 periods ahead
    pred = res_seasonal.forecast(36)
    pred
    # 1961-01-01    6.110548
    # 1961-02-01    6.052912
    # 1961-03-01    6.174690
    # ...
    # 1963-10-01    6.388955
    # 1963-11-01    6.242262
    # 1963-12-01    6.345214
    # Freq: MS, dtype: float64

    # plot the original data together with the forecast
    ax = ldf.plot()
    pred.plot(ax=ax)

Slide 63

Slide 63 text

Summary
• How to handle time series processing with pandas
• How to work with time series models in Python (a first taste)

Slide 64

Slide 64 text

Development roadmap
• Plan
    • Release 0.19 (currently rc) → 0.20 → 1.0
• pandas 1.0
    • API freeze
    • Long Term Support
• pandas 2.0 (under discussion)
    • Support Python 3.x only
    • Specialize in data of 2 dimensions or fewer
    • Move the backend to C++ (Apache Arrow)