Slide 34
Implementing it, for starters
2. Create the Schema
from pyspark.sql.types import (
    StructType, StructField,
    DoubleType, DateType, StringType, LongType,
)

# Schema definition (a bit long)
STATCAST_SCHEMA: StructType = StructType(
    [
        StructField("pitch_type", StringType(), False),
        StructField("game_date", DateType(), False),
        StructField("release_speed", DoubleType(), False),
        StructField("release_pos_x", DoubleType(), False),
        StructField("release_pos_z", DoubleType(), False),
        # Omitted here because it is too long (91 fields in total)
        StructField("spin_axis", DoubleType(), False),
        StructField("delta_home_win_exp", DoubleType(), False),
        StructField("delta_run_exp", DoubleType(), False),
    ]
)
• For CSV input, define a Schema
• Without one, it does not work as intended
• Wrote out all 91 fields of the Schema by hand (tears)
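One way to reduce the typing for a long schema is sketched below, under stated assumptions. Spark's CSV reader also accepts a DDL-formatted string as its `schema` argument, so the column list can be kept as plain (name, type) pairs. The names `COLUMNS`, `to_ddl`, `STATCAST_DDL`, and `read_statcast` are hypothetical and do not come from the slide, and the CSV path is whatever the caller supplies. Only the eight fields shown on the slide are listed; the remaining fields stay elided. Unlike the `StructField(..., False)` entries above, columns declared this way are nullable.

```python
# Hypothetical sketch: build a DDL schema string from (name, type) pairs.
# Column names are the ones visible on the slide; the rest are still elided.
COLUMNS = [
    ("pitch_type", "STRING"),
    ("game_date", "DATE"),
    ("release_speed", "DOUBLE"),
    ("release_pos_x", "DOUBLE"),
    ("release_pos_z", "DOUBLE"),
    # ... remaining fields omitted, as on the slide
    ("spin_axis", "DOUBLE"),
    ("delta_home_win_exp", "DOUBLE"),
    ("delta_run_exp", "DOUBLE"),
]


def to_ddl(columns):
    """Join (name, type) pairs into a DDL string such as 'a STRING, b DOUBLE'."""
    return ", ".join(f"{name} {dtype}" for name, dtype in columns)


STATCAST_DDL = to_ddl(COLUMNS)


def read_statcast(spark, path):
    """Read a CSV with the explicit schema.

    `spark` is assumed to be an existing SparkSession and `path` a CSV file
    with a header row; both are supplied by the caller.
    """
    return spark.read.csv(path, header=True, schema=STATCAST_DDL)
```

Passing an explicit schema this way avoids the default behaviour in which every CSV column is read as a string, at the cost of losing the per-field `nullable=False` declarations.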