Index Activiti data on Elasticsearch

Activiti User Day 2015 presentation by Mike Dias (@mike_dias) and Silvio Neto (@silvioneto) about integrating Activiti with Elasticsearch.

Mike Dias

June 11, 2015

Transcript

  1. Index Activiti data on Elasticsearch. Activiti User Day Paris 2015

  2. Silvio dos Passos Neto, CTO at iColabora, @silvioneto

  3. “Don’t bridge the business-IT divide. Obliterate it!” (2003) Smith & Fingar
  6. ?

  7. ?

  11. @mike_dias

  12. The big table problem

  13. ACT_HI_VARINST:

      ID_  NAME_  VALUE_*  …

  14. User form, Process Instance, ACT_HI_VARINST:

      ID_  NAME_        VALUE_*             …
      1    client_name  John                …
      2    client_tel   123456              …
      3    due_date     01/06/2015          …
      4    demand_desc  I have a problem…   …
  15. User form and Process Instance (twice), ACT_HI_VARINST:

      ID_  NAME_        VALUE_*             …
      1    client_name  John                …
      2    client_tel   123456              …
      3    due_date     01/06/2015          …
      4    demand_desc  I have a problem…   …
      5    client_name  Bob                 …
      6    client_tel   654321              …
      7    due_date     10/06/2015          …
      8    demand_desc  My internet conn…   …
  16. 85 fields × ~1,000 processes per day = ~85,000 variables per day
  17. ~15 million variables in 9 months
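
To make the growth on slides 14 to 16 concrete, the sketch below flattens each submitted form into one row per field, the way the slides depict ACT_HI_VARINST. It is a hypothetical in-memory illustration, not Activiti code: `form_to_rows` and the dictionary layout are assumptions, and `VALUE_` stands in for the several value columns the slides abbreviate as `VALUE_*`.

```python
from itertools import count


def form_to_rows(form, id_counter):
    """Flatten one submitted form into one row per field (hypothetical helper)."""
    return [
        {"ID_": next(id_counter), "NAME_": name, "VALUE_": value}
        for name, value in form.items()
    ]


ids = count(1)  # shared counter so IDs keep growing across process instances

rows = form_to_rows(
    {"client_name": "John", "client_tel": "123456",
     "due_date": "01/06/2015", "demand_desc": "I have a problem..."},
    ids,
)
rows += form_to_rows(
    {"client_name": "Bob", "client_tel": "654321",
     "due_date": "10/06/2015", "demand_desc": "My internet conn..."},
    ids,
)

# Every process instance adds one row per form field, so the table grows by
# fields * processes rows per day (figures taken from slide 16).
fields_per_form = 85
processes_per_day = 1000
variables_per_day = fields_per_form * processes_per_day
```

Two four-field forms already produce eight rows; with the figures from slide 16 the same layout yields 85,000 new rows per day.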

  19. The Tool

  21. Built on top of

  22. Analytics

  23. Distributed

  24. Indexing the data

  25. Historic Data

  26. [Diagram: many processes (P) in a “Process Lake”]

  27. [Diagram: the processes (P) in the Process Lake divided among CPU 1, CPU 2, CPU 3, and CPU 4]

  28. [Diagram: CPU 1 takes processes (P) with their variables (V) and tasks (T), builds a JSON document { } per process, and sends it to the REST API]
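
The historic-data slides suggest splitting the existing processes across CPUs and turning each process, together with its variables and tasks, into one JSON document for the REST API. A minimal sketch of that idea follows, under stated assumptions: the helper names, the document field names, and the `activiti` index name are all hypothetical, and the action line shown is the newline-delimited format of the Elasticsearch Bulk API (older Elasticsearch versions also expect a document type, which is omitted here).

```python
import json


def partition(items, workers):
    """Split items into `workers` round-robin chunks, one per CPU."""
    return [items[i::workers] for i in range(workers)]


def build_document(process, variables, tasks):
    """Combine one process with its own variables and tasks into one document."""
    doc = dict(process)
    doc["variables"] = [
        {"name": v["name"], "text": v["text"]}
        for v in variables if v["process_id"] == process["id"]
    ]
    doc["tasks"] = [
        {"name": t["name"]} for t in tasks if t["process_id"] == process["id"]
    ]
    return doc


def bulk_body(docs, index="activiti"):
    """Serialize documents as newline-delimited JSON: action line, then source."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index, "_id": doc["id"]}}))
        lines.append(json.dumps(doc, ensure_ascii=False))
    return "\n".join(lines) + "\n"


# Hypothetical rows standing in for the Activiti history tables.
processes = [{"id": "p1", "name": "support"}, {"id": "p2", "name": "support"}]
variables = [
    {"process_id": "p1", "name": "client_name", "text": "John"},
    {"process_id": "p2", "name": "client_name", "text": "Bob"},
]
tasks = [{"process_id": "p1", "name": "Answer client"}]

chunks = partition(processes, 4)  # one chunk per CPU; some may be empty
body = bulk_body(build_document(p, variables, tasks) for p in chunks[0])
```

Each worker would then send its own `body` to the cluster independently, which is what lets the backlog be indexed in parallel.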
  29. Real-Time Data

  30. [Diagram: a stream of engine events (E)]

  31. [Diagram: engine events (E) pass through listeners, become JSON { }, and go to the REST API]
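
For real-time data, the slides show engine events flowing through listeners into JSON sent to the REST API. The sketch below mimics that flow in Python purely for illustration. In Activiti itself a listener would be a Java class (the engine exposes an `ActivitiEventListener` interface); the event dictionaries, the `IndexingListener` class, and the `send` callback here are all assumptions.

```python
import json

# Event types this sketch reacts to; the names mirror Activiti's variable events.
HANDLED = {"VARIABLE_CREATED", "VARIABLE_UPDATED"}


class IndexingListener:
    """Hypothetical listener that turns engine events into JSON index requests."""

    def __init__(self, send):
        # send(doc_id, body) would wrap an HTTP call in a real deployment.
        self.send = send

    def on_event(self, event):
        """Forward variable events as JSON; ignore everything else."""
        if event["type"] not in HANDLED:
            return False
        doc = {
            "processInstanceId": event["processInstanceId"],
            "variables": [{"name": event["name"], "text": str(event["value"])}],
        }
        self.send(event["processInstanceId"], json.dumps(doc, ensure_ascii=False))
        return True


# Collect outgoing requests in a list instead of calling a server.
sent = []
listener = IndexingListener(lambda doc_id, body: sent.append((doc_id, body)))
listener.on_event({"type": "VARIABLE_CREATED", "processInstanceId": "42",
                   "name": "client_name", "value": "John"})
listener.on_event({"type": "TASK_COMPLETED", "processInstanceId": "42"})
```

Injecting `send` keeps the listener free of HTTP details, so the same class can be exercised without a running cluster.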
  32. Playing with the data

  33. Search

  34. {
        "query": {
          "nested": {
            "path": "variables",
            "query": {
              "match": { "text": "João Silva" }
            }
          }
        }
      }
  35. Search results
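
A search like the one on slide 34 can be assembled programmatically. This is a minimal sketch, assuming the variables are stored as nested objects under a `variables` field: `nested_match` is a hypothetical helper, and it places `path` inside `nested`, which is the layout the Elasticsearch nested query expects.

```python
import json


def nested_match(path, field, text):
    """Build a full-text match query scoped to one nested object path."""
    return {
        "query": {
            "nested": {
                "path": path,
                "query": {"match": {field: text}},
            }
        }
    }


# ensure_ascii=False keeps "João" readable in the serialized request body.
body = json.dumps(nested_match("variables", "text", "João Silva"),
                  ensure_ascii=False)
```

The resulting `body` would be sent to the index's `_search` endpoint; depending on the Elasticsearch version, the field may need its full path (`variables.text`).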

  36. Compare

  37. SELECT * FROM ACT_HI_VARINST WHERE NAME_ = 'passport' AND TEXT_ = '1234'
  38. {
        "filter": {
          "nested": {
            "path": "variables",
            "filter": {
              "bool": {
                "must": [
                  { "term": { "name": "passport" } },
                  { "term": { "text": "1234" } }
                ]
              }
            }
          }
        }
      }
  39. Response Time (chart axis: 0 to 180 secs): Elasticsearch 0.08 secs, MySQL 161 secs
  40. Response Time (chart axis: 0 to 180 secs): Elasticsearch 0.08 secs, MySQL 161 secs. CENSORED
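
The SQL query on slide 37 requires the name and the value to match on the same row, and the nested filter on slide 38 keeps that pairing per variable object. The sketch below is an in-memory stand-in (not Elasticsearch itself) showing why the pairing matters: if the variables were indexed as a plain object array, the fields would be flattened and could match across different variables. Both function names are hypothetical.

```python
def matches_nested(doc, name, text):
    """True only if a single variable object has both the name and the text."""
    return any(
        v["name"] == name and v["text"] == text for v in doc["variables"]
    )


def matches_flattened(doc, name, text):
    """Mimic a flattened object array: each field is matched independently."""
    names = {v["name"] for v in doc["variables"]}
    texts = {v["text"] for v in doc["variables"]}
    return name in names and text in texts


# The passport is 9999; the value 1234 belongs to a different variable.
doc = {
    "id": "p1",
    "variables": [
        {"name": "passport", "text": "9999"},
        {"name": "client_tel", "text": "1234"},
    ],
}

nested_hit = matches_nested(doc, "passport", "1234")
flattened_hit = matches_flattened(doc, "passport", "1234")
```

Here `nested_hit` is false, matching what the SQL query would return for this process, while `flattened_hit` is a false positive.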
  41. Lessons learned

  42. Full text search is a helpful feature

  43. Reduce MySQL workload

  44. ES is great for analytics

  45. Next steps

  46. Apache Spark: Lightning-Fast Cluster Computing

  47. Java EE dependency

  48. Open source

  49. Thank you! @mike_dias @silvioneto

  50. Questions? @mike_dias @silvioneto