PERFORMANCE UPDATE - WHEN APACHE ORC MET APACHE SPARK

PERFORMANCE UPDATE: WHEN APACHE ORC MET APACHE SPARK Dongjoon Hyun
- [email protected] Owen O'Malley - [email protected] @owen_omalley

2 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Agenda Performance Update Spark – Apache ORC 1.4 Integration Benchmark Overview Results Roadmap

ORC History  Originally released as part of Hive – Released in Hive 0.11.0 (2013-05-16) – Included in each release before Hive 2.3.0 (2017-07-17)  Factored out of Hive – Improve integration with other tools – Shrink the size of the dependencies – Releases faster than Hive – Added C++ reader and new C++ writer

Integration History  Before Apache ORC – Hive 1.2.1 (2015-06-27) ➔ SPARK-2883 Since Spark 1.4, Hive ORC is used.  After Apache ORC – v1.0.0 (2016-01-25) – v1.1.0 (2016-06-10) – v1.2.0 (2016-08-25) – v1.3.0 (2017-01-23) – v1.4.0 (2017-05-08) ➔ SPARK-21422 For Spark 2.3, Apache ORC dependency is added and used.

Integration Design (in HDP 2.6.3 with Spark 2.2)  Switch between old and new formats by configuration FileFormat TextBasedFileFormat ParquetFileFormat OrcFileFormat HiveFileFormat JsonFileFormat LibSVMFileFormat CSVFileFormat TextFileFormat o.a.s.sql.execution.datasources o.a.s.ml.source.libsvm o.a.s.sql.hive.orc OrcFileFormat Old OrcFileFormat from Hive 1.2.1 (SPARK-2883) New OrcFileFormat with ORC 1.4.0 (SPARK-21422, SPARK-20682, SPARK-20728)

Benefit in Apache Spark  Speed – Use both Spark ColumnarBatch and ORC RowBatch together more seamlessly  Stability – Apache ORC 1.4.0 has many fixes and we can depend on ORC community more.  Usability – User can use ORC data sources without hive module, i.e, -Phive.  Maintainability – Reduce the Hive dependency and can remove old legacy code later.

Faster and More Scalable Number of columns Time (ms) Single Column Scan from Wide Tables (1M rows with all BIGINT columns) 0 500 1000 1500 2000 TINYINT SMALLINT INT BIGINT FLOAT DOULBE OLD NEW Vectorized Read (15M rows in a single-column table) Time (ms) 0 200 400 600 800 100 200 300 NEW OLD

Support Matrix  HDP 2.6.3 is going to add a new faster and stable ORC file format for ORC tables with the subset of limitations of Apache Spark 2.2.  The followings are not supported yet – Zero-byte ORC File – Schema Evolution • Adding columns at the end • Changing types and deleting columns  Please see the full JIRA issue list in next slides.

Done Tickets  SPARK-20901 Feature parity for ORC with Parquet – The parent ticket of all ORC issues  Done – SPARK-20566 ColumnVector should support `appendFloats` for array – SPARK-21422 Depend on Apache ORC 1.4.0 – SPARK-21831 Remove `spark.sql.hive.convertMetastoreOrc` config in HiveCompatibilitySuite – SPARK-21839 Support SQL config for ORC compression – SPARK-21884 Fix StackOverflowError on MetadataOnlyQuery – SPARK-21912 ORC/Parquet table should not create invalid column names

On-Going Tickets  SPARK-20682 Support a new faster ORC data source based on Apache ORC  SPARK-20728 Make ORCFileFormat configurable between sql/hive and sql/core  SPARK-16060 Vectorized Orc reader  SPARK-21791 ORC should support column names with dot  SPARK-21787 Support for pushing down filters for DATE types in ORC  SPARK-19809 Zero byte ORC file support  SPARK-14387 Enable Hive-1.x ORC compatibility with spark.sql.hive.convertMetastoreOrc

To-do Tickets  Configuration – SPARK-21783 Turn on ORC filter push-down by default  Read – SPARK-11412 Support merge schema for ORC – SPARK-16628 OrcConversions should not convert an ORC table  Alter – SPARK-21929 Support `ALTER TABLE ADD COLUMNS(..)` for ORC data source – SPARK-18355 Spark SQL fails to read from a ORC table with new column  Write – SPARK-12417 Orc bloom filter options are not propagated during file write

Benchmark Overview • Benchmark Objective • Compare performance of New ORC vs Old ORC in Spark • With TPC-DS at different scales 1 TB, 10 TB

Enabling New ORC in Spark  spark.sql.hive.convertMetastoreOrc=true  spark.sql.orc.enabled=true

Software/Hardware/Cluster Details  Software – Internal branch of Spark 2.2 (in HDP)  Node Configuration – 32 CPU, Intel E5-2640, 2.00GHz – 256 GB RAM – 6 SATA Disks (~4 TB, 7200 rpm) – 10 Gbps network card  Cluster – 10 nodes used for 1 TB – 15 nodes used for 10 TB – Double type used in data – TPC-DS (1.4) queries in Spark used for benchmarking

Spark Tuning Spark Parameter Name Default Value Tuned Value spark.sql.hive.convertMetastoreOrc false true spark.sql.orc.enabled false true spark.sql.orc.filterPushdown false true spark.sql.statistics.fallBackToHdfs false true spark.sql.autoBroadcastJoinThreshold 10L * 1024 * 1024 (10MB) 26214400 spark.shuffle.io.numConnectionsPerPeer 1 10 spark.io.compression.lz4.blockSize 32k 128kb spark.sql.shuffle.partitions 200 300 spark.network.timeout 120s 600s spark.locality.wait 3s 0s *Hive Parameter Name (hive-site.xml) Default Value Tuned Value hive.exec.max.dynamic.partitions 1000 10000 hive.exec.dynamic.partition.mode strict nonstrict *mainly for data ingestion *all tables analyzed with `noscan` option for getting basic stats

Results : 10 TB Scale (15 executors, 170 GB, 25 cores) * Total runtime of 74 queries in TPC-DS - New ORC is > 2x better than old ORC

Results : 10 TB Scale (15 executors, 170 GB, 25 cores) New ORC vs Old ORC New ORC consistently outperforms old ORC in all queries significantly

Results : 1 TB Scale (10 executors, 170 GB, 25 cores) * Total runtime of 97 queries in TPC-DS @ 1 TB scale • New ORC is ~2x faster than old ORC

*Profiler shows high CPU usage with libzip.so during ORC reads. ORC-175 can be used for improving this further *ORC-175 (Intel ISAL: intelligent storage acceleration libraries for inflate) can be useful here

Comparison of New ORC vs Old ORC vs Parquet (10 TB Scale) * Total runtime of 74 queries in TPC-DS * 10 TB Scale (15 executors, 170 GB, 25 cores)

RoadMap  Tune default ambari configs for spark based on these benchmark results to get better OOTB experience for end users  Additional enhancements like parallel reading of footers in ORC  Include ORC-175 (Intel ISAL) when complete  Contribute back the fixes back to community

Thanks!  Questions – For ORC – dev@@orc.apache.org – For Spark – [email protected] – Benchmarks – [email protected]

Results : 10 TB Scale (15 executors, 170 GB, 25 cores) Comparison of New ORC vs Parquet

PERFORMANCE UPDATE - WHEN APACHE ORC MET APACHE...

PERFORMANCE UPDATE - WHEN APACHE ORC MET APACHE SPARK

Dongjoon Hyun

More Decks by Dongjoon Hyun

Other Decks in Technology

Featured

Transcript

PERFORMANCE UPDATE: WHEN APACHE ORC MET APACHE SPARK Dongjoon Hyun

2 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

3 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

4 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

5 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

6 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

7 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

8 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

9 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

10 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

11 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

12 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

13 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

14 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

15 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

16 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

17 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

18 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

19 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

20 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

21 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

22 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

23 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

24 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

25 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

26 © Hortonworks Inc. 2011 – 2016. All Rights Reserved