Slide 11
Slide 11 text
APACHE ICEBERG & ITS ROLE IN A LAKEHOUSE
Apache Iceberg is a high performance open table format purpose-built for large scale analytics. Plays the
role of the metadata layer in a Lake house Architecture.
Catalog: Tracks location of table’s current metadata file
Metadata file: File which defines a table’s schema, partition,, snapshot list etc
Snapshot: Snapshot of data after a write
Manifest file: Contains location, path and metadata about a list of data files
Manifest list: defines a single Snapshot as a list of manifest files along with stats
Data File: File containing the data of the table (parquet, orc, avro etc)