Slide 1

Slide 1 text

SELECT FROM LOGSTASH *

Slide 2

Slide 2 text

Me Tal Levy Software Engineer @ Elastic

Slide 3

Slide 3 text

DISCLAIMER This is not a talk about running SQL queries against Elasticsearch

Slide 4

Slide 4 text

FROM DATABASE TO ELASTICSEARCH

Slide 5

Slide 5 text

Elasticsearch Rivers ~~~~~ ~~~~~ *

Slide 6

Slide 6 text

input { jdbc {…} } Plugin Plugin Plugin Plugin Plugin Plugin Plugin Plugin Plugin Plugin Plugin Plugin Plugin Plugin PluginPluginPluginPlugin PluginPluginPluginPlugin

Slide 7

Slide 7 text

MySQL Oracle Microsoft SQL Server Cassandra Elasticsearch Pagerduty Kafka Postgresql … …

Slide 8

Slide 8 text

Options

Slide 9

Slide 9 text

Options jdbc_driver_library

Slide 10

Slide 10 text

Options jdbc_driver_library jdbc_driver_class

Slide 11

Slide 11 text

Options jdbc_driver_library jdbc_driver_class jdbc_connection_string

Slide 12

Slide 12 text

Options jdbc_driver_library jdbc_driver_class jdbc_connection_string jdbc_user

Slide 13

Slide 13 text

Options jdbc_driver_library jdbc_driver_class jdbc_connection_string jdbc_user statement

Slide 14

Slide 14 text

Options jdbc_driver_library jdbc_driver_class jdbc_connection_string jdbc_user statement parameters

Slide 15

Slide 15 text

Options jdbc_driver_library jdbc_driver_class jdbc_connection_string jdbc_user statement parameters jdbc_paging_enabled jdbc_page_size

Slide 16

Slide 16 text

Options jdbc_driver_library jdbc_driver_class jdbc_connection_string jdbc_user statement parameters jdbc_paging_enabled jdbc_page_size jdbc_fetch_size

Slide 17

Slide 17 text

input { jdbc { # Postgres jdbc connection string to our database, mydb jdbc_connection_string => "jdbc:postgresql://localhost:5432/mydb" # The user we wish to execute our statement as jdbc_user => "postgres" # The path to our downloaded jdbc driver jdbc_driver_library => "/path/to/postgresql-9.4-1201.jdbc41.jar" # The name of the driver class for Postgresql jdbc_driver_class => "org.postgresql.Driver" # our query statement => "SELECT * from contacts" } } output { stdout { codec => json_lines } }

Slide 18

Slide 18 text

parameters

Slide 19

Slide 19 text

jdbc { … parameters => { “location” => “outside” } statement => “select * from table where location=‘:location’ }

Slide 20

Slide 20 text

special parameters

Slide 21

Slide 21 text

special parameters O N LY O N E * maybe more to com *

Slide 22

Slide 22 text

sql_last_start The last time a statement was executed

Slide 23

Slide 23 text

in-sync one-time dump VS

Slide 24

Slide 24 text

statement => “SELECT * FROM table” one-time dump

Slide 25

Slide 25 text

schedule

Slide 26

Slide 26 text

schedule “0 * * * *” “* 5 * 1-3 *” “0 6 * * * America/Chicago” =>

Slide 27

Slide 27 text

statement => ” SELECT * FROM table WHERE created_at >= :sql_last_start “ schedule => “*****” in-sync

Slide 28

Slide 28 text

Elasticsearch Indexing Strategy

Slide 29

Slide 29 text

BYO_ID

Slide 30

Slide 30 text

BYOID input { jdbc { jdbc_connection_string => "jdbc:postgresql://localhost:5432/mydb" jdbc_user => "postgres" jdbc_validate_connection => true jdbc_driver_library => "/path/to/postgresql-9.4-1201.jdbc41.jar" jdbc_driver_class => "org.postgresql.Driver" statement => "SELECT * from contacts" } } output { elasticsearch { protocol => http index => "contacts" document_type => "contact" document_id => "%{uid}" host => "ES_NODE_HOST" } }

Slide 31

Slide 31 text

curl ES_NODE_HOST:9200/contacts/contact/5?pretty { "_index" : "contacts", "_type" : "contact", "_id" : "5", "_version" : 1, "found" : true, "_source":{ "uid": 5, "email": "[email protected]", "first_name": "Sam", "last_name": null, "@version": "1", "@timestamp": "2015-07-29T22:12:20.146Z" } } BYOID

Slide 32

Slide 32 text

UPDATE contacts SET last_name = 'Smith' WHERE email = '[email protected]'; BYOID

Slide 33

Slide 33 text

curl ES_NODE_HOST:9200/contacts/contact/5?pretty { "_index" : "contacts", "_type" : "contact", "_id" : "5", "_version" : 2, "found" : true, "_source":{ "uid": 5, "email": "[email protected]", "first_name": "Sam", "last_name": "Smith", "@version": "1", "@timestamp": "2015-07-29T22:12:20.146Z" } } BYOID

Slide 34

Slide 34 text

No content

Slide 35

Slide 35 text

Artists Albums Tracks Countries

Slide 36

Slide 36 text

input { jdbc { jdbc_driver_library => "/path/to/driver.jar" jdbc_driver_class => "org.postgresql.Driver" jdbc_url => "jdbc://postgresql" jdbc_user => "musicbrainz" statement_filepath => "query.sql" } } output { elasticsearch { protocol => http } } SELECT release_group.gid AS album_i release_group.type AS album_ release_group_primary_type.n release.name AS release_name artist.name AS artist_name, artist.gid AS artist_gid, artist_credit.id AS artist_c artist.type AS artist_type_i artist_type.name AS artist_t artist.begin_date_year artis area.name AS artist_country_ release_country.date_year AS release_country.date_month A release_country.date_day AS FROM musicbrainz.artist INNER JOIN musicbrainz.artist_cr ON artist_credit_name.artist INNER JOIN musicbrainz.artist_cr ON artist_credit.id = artist INNER JOIN musicbrainz.release_g ON release_group.artist_cred INNER JOIN musicbrainz.release

Slide 37

Slide 37 text

No content

Slide 38

Slide 38 text

No content

Slide 39

Slide 39 text

No content

Slide 40

Slide 40 text

¿Questions?

Slide 41

Slide 41 text

http://discuss.elastic.co