
OrientDB


This presentation gives an overview of the OrientDB database project. It explains OrientDB in terms of its functionality, indexing and architecture. It also examines the ETL functionality and the available UI.

Links for further information and connecting

http://www.amazon.com/Michael-Frampton/e/B00NIQDOOM/

https://nz.linkedin.com/pub/mike-frampton/20/630/385

https://open-source-systems.blogspot.com/

Mike Frampton

June 03, 2020


Transcript

  1. What Is OrientDB?
     • A NoSQL database
     • Uses a multi-model approach, supporting the models
       – Graph, document, key/value, object
       – Reactive, geospatial
     • Open source, Apache 2.0 license
     • Offers scalability / high performance
     • Supports polyglot persistence, i.e.
       – The idea that different kinds of data benefit
       – From being stored in different formats
  2. What Is OrientDB?
     • Offered from orientdb.com as
       – Community and enterprise editions
     • Supports use of TinkerPop 2.x and 3.x
     • Has multi-master and sharded architecture
     • Offers Gremlin and extended SQL interfaces
     • Supports ACID transactions
     • Offers record-level security
     • Supports schema-full, schema-less and schema-mixed modes
     • Written in Java
  3. OrientDB Indexes
     • SB-Tree Index
       – Default, durable, transactional and supports range queries
     • Hash Index
       – Fast lookup, light on disk usage, no range queries
     • Auto Sharding Index
       – Durable and transactional, no range queries
     • Lucene Full Text Index
       – Durable and transactional, text only, supports range queries
     • Lucene Spatial Index
       – Durable and transactional, spatial only, supports range queries
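
The difference between an index that supports range queries and one that does not can be sketched in a few lines. This is not OrientDB code: a plain Python dict stands in for a hash index, and a sorted list searched with `bisect` stands in for an ordered index such as the SB-Tree. The class names `HashIndex` and `OrderedIndex`, and the sample record ids, are hypothetical and exist only for illustration.

```python
import bisect


class HashIndex:
    """Stand-in for a hash index: exact-match lookups only."""

    def __init__(self):
        self._buckets = {}

    def put(self, key, record_id):
        self._buckets.setdefault(key, []).append(record_id)

    def get(self, key):
        # Keys are unordered, so only an exact key can be looked up.
        return list(self._buckets.get(key, []))


class OrderedIndex:
    """Stand-in for an ordered index: entries are kept sorted by key."""

    def __init__(self):
        self._entries = []  # sorted list of (key, record_id) tuples

    def put(self, key, record_id):
        bisect.insort(self._entries, (key, record_id))

    def range(self, low, high):
        """Return record ids whose key satisfies low <= key <= high."""
        # (low,) sorts before every (low, record_id), so this finds
        # the first entry whose key is >= low.
        start = bisect.bisect_left(self._entries, (low,))
        result = []
        for key, record_id in self._entries[start:]:
            if key > high:
                break
            result.append(record_id)
        return result


hash_index = HashIndex()
ordered_index = OrderedIndex()
for age, rid in [(31, "#12:0"), (45, "#12:1"), (27, "#12:2"), (38, "#12:3")]:
    hash_index.put(age, rid)
    ordered_index.put(age, rid)

exact_match = hash_index.get(45)            # exact lookup works on both kinds
in_range = ordered_index.range(30, 40)      # only the ordered index can do this
print(exact_match, in_range)
```

Because the hash stand-in keeps no key order, answering "ages between 30 and 40" would require scanning every bucket, which is the trade-off the slide summarises as "no range queries".
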
  4. OrientDB Studio
     • Run queries, edit graphs
     • Manage schemas and security
     • Manage databases, servers + cluster ( enterprise edition )
     • Query profiler ( enterprise edition )
     • Query audit logs ( enterprise edition )
     • Backup management ( enterprise edition )
  5. OrientDB Integration
     • Uses ETL tool ( JSON configuration ) for data import
     • Compatible with most RDBMS with a JDBC driver
       – Tested Oracle, SQL Server, MySQL, PostgreSQL, HyperSQL
     • Has an Apache Spark connector (2.2.7+)
     • Has a Neo4j importer
       – Uses Neo4j Java API to extract graph
       – Uses OrientDB Java API to import graph
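
To make the "JSON configuration" point concrete, here is a minimal sketch that assembles an ETL configuration for a JDBC import and prints it as JSON. The section and field names (`extractor`, `transformers`, `loader`, `dbURL` and so on) follow the OrientDB ETL documentation as I recall it and should be checked against the version in use. The helper `build_etl_config`, the MySQL URL, the credentials, the `Client` class and the database path are all assumptions made up for this example.

```python
import json


def build_etl_config(jdbc_driver, jdbc_url, user, password, query,
                     vertex_class, db_url):
    """Return an ETL configuration (hypothetical helper) as a plain dict."""
    return {
        "config": {"log": "info"},
        # Extract: pull rows from the source RDBMS over JDBC.
        "extractor": {
            "jdbc": {
                "driver": jdbc_driver,
                "url": jdbc_url,
                "userName": user,
                "userPassword": password,
                "query": query,
            }
        },
        # Transform: turn each extracted row into a vertex of one class.
        "transformers": [
            {"vertex": {"class": vertex_class}},
        ],
        # Load: write the vertices into an OrientDB graph database.
        "loader": {
            "orientdb": {
                "dbURL": db_url,
                "dbType": "graph",
                "dbAutoCreate": True,
            }
        },
    }


# All values below are placeholders, not taken from the presentation.
etl_config = build_etl_config(
    jdbc_driver="com.mysql.jdbc.Driver",
    jdbc_url="jdbc:mysql://localhost/crm",
    user="root",
    password="",
    query="select * from Client",
    vertex_class="Client",
    db_url="plocal:/tmp/databases/crm",
)
etl_json = json.dumps(etl_config, indent=2)
print(etl_json)
```

The printed JSON would typically be saved to a file and passed to the ETL script shipped with OrientDB (`oetl.sh` in the distributions I am aware of).
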
  6. OrientDB Backups
     • Possible to export database in JSON format but
       – No locking so possibly not an exact replica
     • Backups lock database and create an exact replica
       – Database in read only mode
       – Concurrent writes blocked
     • Distributed cluster allows ( during backup )
       – Read / write
       – Snapshots
  7. OrientDB Cluster
     • Supports a distributed architecture
     • Uses Hazelcast for auto discovery of nodes
     • Has a multi-master system
     • Supports REPLICA nodes ( read only )
     • Records can be created in distributed mode
     • Supports distributed transactions
     • Cannot import DB in distributed mode
  8. OrientDB Teleporter
     • Uses JDBC to import RDBMS database
     • Enterprise version has a synchronisation function
     • Tested Oracle, SQL Server, MySQL, PostgreSQL, HyperSQL
     • Execution involves
       – Source DB schema building
       – Graph model building
       – OrientDB schema writing
       – OrientDB importing
     • Default admin, reader, writer accounts created
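
The first two execution steps listed above (source schema building and graph model building) can be illustrated with a small, self-contained sketch. It is not Teleporter's implementation: it uses Python's built-in `sqlite3` module in place of JDBC, and assumes a simple mapping where each table becomes a vertex class and each foreign key becomes an edge class. The function names, the sample tables and the `<table>_to_<target>` edge naming are hypothetical. The last two steps (writing the schema and importing into OrientDB) are omitted because they need a running OrientDB instance.

```python
import sqlite3


def build_source_schema(conn):
    """Step 1: read table, column and foreign-key metadata from the source."""
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name")]
    schema = {}
    for table in tables:
        # table_info rows: (cid, name, type, notnull, default, pk)
        columns = [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
        # foreign_key_list rows: (id, seq, target_table, from_column, to_column, ...)
        foreign_keys = [(row[3], row[2]) for row in
                        conn.execute(f"PRAGMA foreign_key_list({table})")]
        schema[table] = {"columns": columns, "foreign_keys": foreign_keys}
    return schema


def build_graph_model(schema):
    """Step 2: map tables to vertex classes and foreign keys to edge classes."""
    vertices = {table: info["columns"] for table, info in schema.items()}
    edges = []
    for table, info in schema.items():
        for from_column, target in info["foreign_keys"]:
            edges.append({
                "name": f"{table}_to_{target}",  # hypothetical naming scheme
                "from": table,
                "to": target,
                "via": from_column,
            })
    return {"vertices": vertices, "edges": edges}


# A tiny in-memory source database used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE employee (
        id INTEGER PRIMARY KEY,
        name TEXT,
        dept_id INTEGER REFERENCES department(id)
    );
""")

source_schema = build_source_schema(conn)
graph_model = build_graph_model(source_schema)
print(graph_model)
```

Here the `employee.dept_id` foreign key becomes a single edge class from `employee` to `department`, which is the general idea behind turning relational joins into graph traversals.
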
  9. Available Books
     • See “Big Data Made Easy” – Apress Jan 2015
     • See “Mastering Apache Spark” – Packt Oct 2015
     • See “Complete Guide to Open Source Big Data Stack” – Apress Jan 2018
     • Find the author on Amazon
       – www.amazon.com/Michael-Frampton/e/B00NIQDOOM/
     • Connect on LinkedIn
       – www.linkedin.com/in/mike-frampton-38563020
  10. Connect
      • Feel free to connect on LinkedIn
        – www.linkedin.com/in/mike-frampton-38563020
      • See my open source blog at
        – open-source-systems.blogspot.com/
      • I am always interested in
        – New technology
        – Opportunities
        – Technology based issues
        – Big data integration