Source
• Developer-driven culture that promotes the use of open source
• Significant usage by late 2014: 700+ nodes
• Centralized Elasticsearch engineering team formed mid-2014
• Diverse use cases
  – Search: workflow search, trade / order search …
  – Document search: legal documents, candidate resumes, source code
  – Metrics: JVM, network, app usage, alerts, transaction volumes …
  – Making real-time transaction data queryable
  – Data analytics: order flow dashboards, analysis
• Centralized monitoring and metrics
• Governance on proper usage
• Elastic vendor support
  – Global support
  – Design review
  – Performance tuning
  – Patching
• Integration with internal code base using custom language wrappers
(Problem)
• Currently, order transaction data is persisted into Sybase databases
• The total transactional volume is so large that DB instances need to be split into many stripes
• Longer time-range and aggregated queries are very difficult and slow (hours in some cases)
• As a result, extracting meaningful analytics from the data is difficult
• Different sources for historical and real-time data mean no code sharing
(Solution)
• Extract de-normalized views of historical data into Elasticsearch
• Intra-day data indexed from the live transaction feed
• Unified schema: querying historic and live data from a single source
• Utilize ES aggregations for fast analytic queries
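As a rough illustration of the aggregation approach listed above, the sketch below builds a search body that totals order quantity per market over a time range. The field names (`timestamp`, `market`, `quantity`) and the function name are assumptions made for this example, not the schema from the talk; only the `range` query and the `terms` and `sum` aggregations are standard Elasticsearch query DSL.

```python
import json
from typing import Any, Dict


def build_order_flow_query(start: str, end: str) -> Dict[str, Any]:
    """Build a search body that sums order quantity per market.

    All field names here are hypothetical placeholders.
    """
    return {
        # size 0: return only aggregation buckets, not individual documents
        "size": 0,
        # restrict to a time window; with a unified schema the same filter
        # can cover both historical and intra-day documents
        "query": {"range": {"timestamp": {"gte": start, "lt": end}}},
        "aggs": {
            "per_market": {
                # one bucket per distinct market value
                "terms": {"field": "market"},
                # inside each bucket, total the order quantity
                "aggs": {"total_quantity": {"sum": {"field": "quantity"}}},
            }
        },
    }


if __name__ == "__main__":
    body = build_order_flow_query("2015-01-01", "2015-02-01")
    print(json.dumps(body, indent=2))
```

A body like this could be sent to the search endpoint of an index, or of several indices at once, which is one way a single query can span historical and live data.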
• High-level dashboard showing per-market analysis of order flow data
• Replaced and greatly expanded upon a legacy equivalent
• Aggregated queries across both historical and live-updating data
• Ability to query the latest transaction state is critical
• Previous implementations relied on real-time transaction callbacks to perform the aggregations, which required a lot of custom code
• Utilizing the real-time feed to ES and aggregations for querying resulted in a drastically simplified architecture and code base
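The slides do not say how the latest transaction state is kept queryable. One possible approach, sketched here purely as an assumption, is to index every feed event under its order ID so that a later event for the same order replaces the earlier document. The event fields (`order_id`, `status`) and the index name `orders` are hypothetical; the action/document line pairs follow the newline-delimited format of the Elasticsearch bulk API (older releases also expect a type in the action line).

```python
import json
from typing import Any, Dict, Iterable, List


def to_bulk_lines(events: Iterable[Dict[str, Any]], index: str = "orders") -> List[str]:
    """Turn feed events into bulk-API line pairs: an action line, then the document."""
    lines: List[str] = []
    for event in events:
        # Using the order ID as the document ID means a newer event for the
        # same order overwrites the older document instead of adding a copy.
        action = {"index": {"_index": index, "_id": event["order_id"]}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(event))
    return lines


def latest_state(events: Iterable[Dict[str, Any]]) -> Dict[str, Dict[str, Any]]:
    """Local stand-in for the index: the last event per order ID wins."""
    state: Dict[str, Dict[str, Any]] = {}
    for event in events:
        state[event["order_id"]] = event
    return state


if __name__ == "__main__":
    feed = [
        {"order_id": "A1", "status": "NEW"},
        {"order_id": "B2", "status": "NEW"},
        {"order_id": "A1", "status": "FILLED"},
    ]
    # The bulk request body is these lines joined by newlines, ending with one.
    print("\n".join(to_bulk_lines(feed)) + "\n")
    print(latest_state(feed)["A1"]["status"])
```

With this keying, aggregations run against current state only, so no callback code is needed to reconcile earlier versions of an order.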
steps
• Ease of deployment and horizontal scaling are game changers
• Moving from the relational mental model to NoSQL and a completely new query language takes some adjustment
• Living without easy joins means thinking more about the data model up front
• Working with a fast-moving technology comes with risks and challenges
• Elastic's auto-schema feature is useful for development but can cause problems in a production system
• Indexes are low cost and easy to re-create. Types can't be easily re-created without re-creating the index
• Expanded use of Elasticsearch in other problem domains
• Plans to replace the relational data sources with Hadoop, retaining Elasticsearch as the high-speed query engine on top
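On the auto-schema point above: a common safeguard is to declare the mapping explicitly and set `dynamic` to `strict`, so that documents carrying undeclared fields are rejected rather than silently extending the schema. The sketch below is illustrative only. The field names and helper functions are hypothetical, and the typeless mapping layout and the `keyword` type follow recent Elasticsearch releases; releases from the period of this talk nest the mapping under a type name and use different string field types.

```python
from typing import Any, Dict, List


def build_strict_mapping(fields: Dict[str, str]) -> Dict[str, Any]:
    """Build an index-creation body with an explicit, strict mapping."""
    return {
        "mappings": {
            # "strict" makes indexing fail on any field not declared below
            "dynamic": "strict",
            "properties": {name: {"type": ftype} for name, ftype in fields.items()},
        }
    }


def undeclared_fields(doc: Dict[str, Any], mapping: Dict[str, Any]) -> List[str]:
    """List top-level fields in doc that the strict mapping does not declare."""
    declared = mapping["mappings"]["properties"]
    return sorted(name for name in doc if name not in declared)


if __name__ == "__main__":
    # Hypothetical order fields, for illustration only
    mapping = build_strict_mapping(
        {"order_id": "keyword", "market": "keyword", "quantity": "long", "timestamp": "date"}
    )
    doc = {"order_id": "A1", "market": "XNYS", "qty": 100}
    # "qty" is a typo for "quantity"; a strict mapping would reject this document
    print(undeclared_fields(doc, mapping))
```

Checking documents locally before indexing, as `undeclared_fields` does, is an optional convenience; the strict setting enforces the same rule on the server side.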