Apache Phoenix

What Is Apache Phoenix?
• Massively parallel, relational database engine
• Supports OLTP for Hadoop
• Uses Apache HBase as its backing store
• Open source / Apache 2.0 license
• Written in Java, SQL
• ACID (atomicity, consistency, isolation, durability)
  – Via Apache Tephra integration

Phoenix SQL Support
• Accepts SQL queries
• Compiles them to HBase scans
• Orchestrates running of scans
• Produces regular JDBC result sets
• Creates performance gains by using
  – HBase API/coprocessors/custom filters
• Results in query response times
  – Milliseconds for small queries
  – Seconds for tens of millions of rows
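Since Phoenix returns regular JDBC result sets, a query can be sketched with plain java.sql calls. The class below is a minimal illustration, not the deck's own code: the ID column, the COUNT query and the localhost:2181 ZooKeeper address are assumptions, and a live run needs the Phoenix client jar on the classpath plus a reachable cluster. With no arguments it only prints the URL and SQL it would use.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class PhoenixQuerySketch {

    // Phoenix thick-client URLs have the form jdbc:phoenix:<ZooKeeper quorum>:<port>.
    public static String jdbcUrl(String zkQuorum, int zkPort) {
        return "jdbc:phoenix:" + zkQuorum + ":" + zkPort;
    }

    // Hypothetical query; assumes the table has a numeric ID column.
    public static String countQuery(String table) {
        return "SELECT COUNT(*) FROM " + table + " WHERE ID > ?";
    }

    // Runs the count over any JDBC connection and reads the single result row.
    public static long countRows(Connection conn, String table, long minId) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(countQuery(table))) {
            ps.setLong(1, minId);
            try (ResultSet rs = ps.executeQuery()) {
                rs.next();
                return rs.getLong(1);
            }
        }
    }

    public static void main(String[] args) throws SQLException {
        String url = jdbcUrl(args.length > 0 ? args[0] : "localhost", 2181);
        if (args.length == 0) {
            // Dry run: show what would be sent, without touching a cluster.
            System.out.println(url);
            System.out.println(countQuery("EXAMPLE"));
            return;
        }
        try (Connection conn = DriverManager.getConnection(url)) {
            System.out.println(countRows(conn, "EXAMPLE", 0L));
        }
    }
}
```

Passing a ZooKeeper host as the first argument switches from the dry run to a real connection.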

Phoenix SQL Support
• See phoenix.apache.org for full syntax support

Phoenix Environment

Phoenix Bulk Loading
• Bulk load data via:
• Single-threaded for CSV via psql, e.g.
  – bin/psql.py -t EXAMPLE localhost data.csv
  – Loads into the EXAMPLE table
  – For HBase on the local machine
• MapReduce-based for CSV and JSON
  – See next slide

Phoenix Bulk Loading
• Bulk load example for MapReduce
  – For CSV and JSON loads
  – Using Phoenix MapReduce library
  – Against the EXAMPLE table

Phoenix Performance

Phoenix User-Defined Functions (UDFs)
• Create temporary/permanent UDFs
  – Temporary for session only
• Use UDFs in SQL and indexes
• Permanent UDFs stored in SYSTEM.FUNCTION
• Tenant-specific UDF usage supported
• UDF jar files must be placed on HDFS
• UDF jar updates not currently possible
  – (without cluster bounce)
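As a rough illustration of the temporary/permanent distinction, the sketch below builds a CREATE FUNCTION statement and, when given a ZooKeeper host, runs it over JDBC. The function name MY_REVERSE, the class com.example.MyReverseFunction and the jar path are hypothetical placeholders. The client property phoenix.functions.allowUserDefinedFunctions is taken from the Phoenix UDF documentation as best recalled, so check it against the installed version.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Properties;

public class PhoenixUdfSketch {

    // Builds a CREATE [TEMPORARY] FUNCTION statement for a VARCHAR -> VARCHAR UDF.
    // className and jarPath are placeholders supplied by the caller.
    public static String createFunctionSql(boolean temporary, String name,
                                           String className, String jarPath) {
        return "CREATE " + (temporary ? "TEMPORARY " : "") + "FUNCTION " + name
                + "(VARCHAR) RETURNS VARCHAR AS '" + className + "'"
                + " USING JAR '" + jarPath + "'";
    }

    // Assumption: UDFs must be switched on via this client-side property.
    public static Properties udfProperties() {
        Properties props = new Properties();
        props.setProperty("phoenix.functions.allowUserDefinedFunctions", "true");
        return props;
    }

    public static void main(String[] args) throws SQLException {
        // Hypothetical UDF class and HDFS jar location.
        String sql = createFunctionSql(true, "MY_REVERSE",
                "com.example.MyReverseFunction", "hdfs:/hbase/lib/my-udfs.jar");
        if (args.length == 0) {
            // Dry run: print the statement only.
            System.out.println(sql);
            return;
        }
        try (Connection conn = DriverManager.getConnection("jdbc:phoenix:" + args[0], udfProperties());
             Statement st = conn.createStatement()) {
            st.execute(sql);
        }
    }
}
```

Passing false for the first argument of createFunctionSql drops the TEMPORARY keyword, which corresponds to the permanent case recorded in SYSTEM.FUNCTION.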

Phoenix Transactions
• Cross-row/cross-table ACID support using Apache Tephra
• Transactional functionality currently beta
• Enable transactions and snapshot dir in hbase-site.xml
• Also set a transactional timeout value
• Start Tephra
• Create tables with flag TRANSACTIONAL=true
• Then transactions act as follows
  – Start with statement against table
  – End with commit or rollback
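The start/commit/rollback flow can be sketched over JDBC as follows. This is an illustration under assumptions: the ID and NAME columns and the sample rows are invented, and the cluster is presumed to have transactions enabled and Tephra running. The hbase-site.xml settings are not shown; the Phoenix transactions page names them, as best recalled, phoenix.transactions.enabled, data.tx.snapshot.dir and data.tx.timeout, which should be verified for the version in use.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Statement;

public class PhoenixTxSketch {

    // DDL for a transactional table; the column layout is a hypothetical example.
    public static String createTableSql(String table) {
        return "CREATE TABLE IF NOT EXISTS " + table
                + " (ID BIGINT PRIMARY KEY, NAME VARCHAR) TRANSACTIONAL=true";
    }

    public static String upsertSql(String table) {
        return "UPSERT INTO " + table + " (ID, NAME) VALUES (?, ?)";
    }

    // The first statement starts the transaction; commit or rollback ends it.
    public static void upsertAtomically(Connection conn, String table,
                                        long[] ids, String[] names) throws SQLException {
        conn.setAutoCommit(false);
        try (PreparedStatement ps = conn.prepareStatement(upsertSql(table))) {
            for (int i = 0; i < ids.length; i++) {
                ps.setLong(1, ids[i]);
                ps.setString(2, names[i]);
                ps.executeUpdate();
            }
            conn.commit();
        } catch (SQLException e) {
            conn.rollback(); // discard every row written in this transaction
            throw e;
        }
    }

    public static void main(String[] args) throws SQLException {
        if (args.length == 0) {
            // Dry run: print the statements only.
            System.out.println(createTableSql("EXAMPLE"));
            System.out.println(upsertSql("EXAMPLE"));
            return;
        }
        try (Connection conn = DriverManager.getConnection("jdbc:phoenix:" + args[0])) {
            try (Statement st = conn.createStatement()) {
                st.execute(createTableSql("EXAMPLE"));
            }
            upsertAtomically(conn, "EXAMPLE", new long[] {1L, 2L}, new String[] {"a", "b"});
        }
    }
}
```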

Available Books
• See “Big Data Made Easy” – Apress Jan 2015
• See “Mastering Apache Spark” – Packt Oct 2015
• See “Complete Guide to Open Source Big Data Stack” – Apress Jan 2018
• Find the author on Amazon
  – www.amazon.com/Michael-Frampton/e/B00NIQDOOM/
• Connect on LinkedIn
  – www.linkedin.com/in/mike-frampton-38563020

Connect
• Feel free to connect on LinkedIn
  – www.linkedin.com/in/mike-frampton-38563020
• See my open source blog at
  – open-source-systems.blogspot.com/
• I am always interested in
  – New technology
  – Opportunities
  – Technology based issues
  – Big data integration

Mike Frampton
