Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Logstash SQL/JDBC Input Plugin

Dd9d954997353b37b4c2684f478192d3?s=47 Elastic Co
August 31, 2015

Logstash SQL/JDBC Input Plugin

Abstract:

Using your relational database with the ELK stack

This presentation covers how to ingest data from your database using the new Logstash JDBC plugin. We explore how to transform your data for seamless visualization in Kibana, and share strategies on how to keep Elasticsearch indices up-to-date with your database. This presentation concludes with a demo bringing everything together to ingest, visualize, and learn from Musicbrainz music data.

Presented at: http://www.meetup.com/Silicon-Valley-Elastic-Fantastics/events/224699595/

Dd9d954997353b37b4c2684f478192d3?s=128

Elastic Co

August 31, 2015
Tweet

More Decks by Elastic Co

Other Decks in Programming

Transcript

  1. SELECT FROM LOGSTASH *

  2. Me Tal Levy Software Engineer @ Elastic

  3. DISCLAIMER This is not a talk about running SQL queries

    against Elasticsearch
  4. FROM DATABASE TO ELASTICSEARCH

  5. Elasticsearch Rivers ~~~~~ ~~~~~ *

  6. input { jdbc {…} } Plugin Plugin Plugin Plugin Plugin

    Plugin Plugin Plugin Plugin Plugin Plugin Plugin Plugin Plugin PluginPluginPluginPlugin PluginPluginPluginPlugin
  7. MySQL Oracle Microsoft SQL Server Cassandra Elasticsearch Pagerduty Kafka Postgresql

    … …
  8. Options

  9. Options jdbc_driver_library

  10. Options jdbc_driver_library jdbc_driver_class

  11. Options jdbc_driver_library jdbc_driver_class jdbc_connection_string

  12. Options jdbc_driver_library jdbc_driver_class jdbc_connection_string jdbc_user

  13. Options jdbc_driver_library jdbc_driver_class jdbc_connection_string jdbc_user statement

  14. Options jdbc_driver_library jdbc_driver_class jdbc_connection_string jdbc_user statement parameters

  15. Options jdbc_driver_library jdbc_driver_class jdbc_connection_string jdbc_user statement parameters jdbc_paging_enabled jdbc_page_size

  16. Options jdbc_driver_library jdbc_driver_class jdbc_connection_string jdbc_user statement parameters jdbc_paging_enabled jdbc_page_size jdbc_fetch_size

  17. input { jdbc { # Postgres jdbc connection string to

    our database, mydb jdbc_connection_string => "jdbc:postgresql://localhost:5432/mydb" # The user we wish to execute our statement as jdbc_user => "postgres" # The path to our downloaded jdbc driver jdbc_driver_library => "/path/to/postgresql-9.4-1201.jdbc41.jar" # The name of the driver class for Postgresql jdbc_driver_class => "org.postgresql.Driver" # our query statement => "SELECT * from contacts" } } output { stdout { codec => json_lines } }
  18. parameters

  19. jdbc { … parameters => { “location” => “outside” }

    statement => “select * from table where location=‘:location’ }
  20. special parameters

  21. special parameters O N LY O N E * maybe

    more to com *
  22. sql_last_start The last time a statement was executed

  23. in-sync one-time dump VS

  24. statement => “SELECT * FROM table” one-time dump

  25. schedule

  26. schedule “0 * * * *” “* 5 * 1-3

    *” “0 6 * * * America/Chicago” =>
  27. statement => ” SELECT * FROM table WHERE created_at >=

    :sql_last_start “ schedule => “*****” in-sync
  28. Elasticsearch Indexing Strategy

  29. BYO_ID

  30. BYOID input { jdbc { jdbc_connection_string => "jdbc:postgresql://localhost:5432/mydb" jdbc_user =>

    "postgres" jdbc_validate_connection => true jdbc_driver_library => "/path/to/postgresql-9.4-1201.jdbc41.jar" jdbc_driver_class => "org.postgresql.Driver" statement => "SELECT * from contacts" } } output { elasticsearch { protocol => http index => "contacts" document_type => "contact" document_id => "%{uid}" host => "ES_NODE_HOST" } }
  31. curl ES_NODE_HOST:9200/contacts/contact/5?pretty { "_index" : "contacts", "_type" : "contact", "_id"

    : "5", "_version" : 1, "found" : true, "_source":{ "uid": 5, "email": "sam@example.com", "first_name": "Sam", "last_name": null, "@version": "1", "@timestamp": "2015-07-29T22:12:20.146Z" } } BYOID
  32. UPDATE contacts SET last_name = 'Smith' WHERE email = 'sam@example.com';

    BYOID
  33. curl ES_NODE_HOST:9200/contacts/contact/5?pretty { "_index" : "contacts", "_type" : "contact", "_id"

    : "5", "_version" : 2, "found" : true, "_source":{ "uid": 5, "email": "sam@example.com", "first_name": "Sam", "last_name": "Smith", "@version": "1", "@timestamp": "2015-07-29T22:12:20.146Z" } } BYOID
  34. None
  35. Artists Albums Tracks Countries

  36. input { jdbc { jdbc_driver_library => "/path/to/driver.jar" jdbc_driver_class => "org.postgresql.Driver"

    jdbc_url => "jdbc://postgresql" jdbc_user => "musicbrainz" statement_filepath => "query.sql" } } output { elasticsearch { protocol => http } } SELECT release_group.gid AS album_i release_group.type AS album_ release_group_primary_type.n release.name AS release_name artist.name AS artist_name, artist.gid AS artist_gid, artist_credit.id AS artist_c artist.type AS artist_type_i artist_type.name AS artist_t artist.begin_date_year artis area.name AS artist_country_ release_country.date_year AS release_country.date_month A release_country.date_day AS FROM musicbrainz.artist INNER JOIN musicbrainz.artist_cr ON artist_credit_name.artist INNER JOIN musicbrainz.artist_cr ON artist_credit.id = artist INNER JOIN musicbrainz.release_g ON release_group.artist_cred INNER JOIN musicbrainz.release
  37. None
  38. None
  39. None
  40. ¿Questions?

  41. http://discuss.elastic.co