Tokyo Tour - Goldman Sachs Engineering

Elastic Co
December 16, 2015

  1. Goldman Sachs Engineering
     GS.com/Engineering, Dec 2015

  2. Elastic{ON}15: Goldman Sachs, Elastic in Equity Finance

     Asia Equities Engineering
     Ian MacLean, Vice President
     December 16, 2015
  3. Ian MacLean, Vice President

  4. Elastic @ GS: Grass Roots Usage of Open Source

     • Developer-driven culture that promotes the use of open source
     • Significant usage by late 2014: 700+ nodes
     • Centralized Elasticsearch engineering team formed mid-2014
     • Diverse use cases
       – Search: workflow search, trade / order search …
       – Document search: legal documents, candidate resumes, source code
       – Metrics: JVM, network, app usage, alerts, transaction volumes …
       – Making real-time transaction data queryable
       – Data analytics: order flow dashboards, analysis
  5. Center of Excellence Model: Standardized Software Packaging

     • Open source code from GitHub is built internally
     • Enhanced with GS plugins for security and backup
     • Includes other open source plugins and tools
  6. Center of Excellence Model: Support

     • Elasticsearch install inventory
     • Centralized monitoring and metrics
     • Governance on proper usage
     • Elastic vendor support
       – Global support
       – Design review
       – Performance tuning
       – Patching
     • Integration with internal code base using custom language wrappers
  7. Equities Engineering Use Cases: High Performance Order Search (Problem)

     • Order transaction data is currently persisted into Sybase databases
     • The total transactional volume is so large that DB instances need to be split into many stripes
     • Queries over longer time ranges and aggregated queries are very difficult and slow, taking hours in some cases
     • As a result, extracting meaningful analytics from the data is difficult
     • Different sources for historical and real-time data mean no code sharing
  8. Equities Engineering Use Cases: High Performance Order Search (Solution)

     • Extract de-normalized views of historical data into Elasticsearch
     • Intra-day data indexed from the live transaction feed
     • Unified schema: querying historical and live data from a single source
     • Utilize ES aggregations for fast analytic queries
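To make the aggregation point concrete, here is a minimal sketch of a search body that sums order quantity per market over a time range. The field names (`order_time`, `market`, `quantity`) are hypothetical and are not taken from the deck; only the query structure (a `range` query plus a `terms` aggregation with a nested `sum`) is standard Elasticsearch query DSL.

```python
import json


def order_volume_by_market(start, end):
    """Build a search body that sums order quantity per market in a time range.

    All field names here are illustrative assumptions, not the deck's schema.
    """
    return {
        "size": 0,  # only aggregation results are needed, not individual hits
        "query": {
            "range": {"order_time": {"gte": start, "lt": end}}
        },
        "aggs": {
            "by_market": {
                "terms": {"field": "market"},  # one bucket per market
                "aggs": {
                    "total_quantity": {"sum": {"field": "quantity"}}
                },
            }
        },
    }


body = order_volume_by_market("2015-01-01", "2015-12-16")
print(json.dumps(body, indent=2))
```

Because historical and intra-day documents share one schema, the same body could be sent to an alias that spans both sets of indices, so there is one query path instead of two.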
  9. Equities Engineering Use Cases: Management Analytics – Sharp-X

     • High-level dashboard showing per-market analysis of order flow data
     • Replaced and greatly expanded upon a legacy equivalent
     • Aggregated queries across both historical and live-updating data
     • The ability to query the latest transaction state is critical
     • Previous implementations relied on real-time transaction callbacks to perform the aggregations, which required lots of custom code
     • Utilizing the real-time feed to ES and aggregations for querying resulted in a drastically simplified architecture and code base
  10. Equities Engineering Use Cases: Management Analytics – Sharp-X (Continued)

      Equity Order Flow Dashboards
  11. Equities Engineering Use Cases: Lessons Learnt / Next Steps

      • Ease of deployment and horizontal scaling are game changers
      • Moving from the relational mental model to NoSQL and a completely new query language takes some adjustment
      • Living without easy joins means thinking more about the data model up front
      • Working with a fast-moving technology comes with risks and challenges
      • Elastic's auto-schema feature is useful for development but can cause problems in a production system
      • Indexes are low cost and easy to re-create. Types can't be easily re-created without re-creating the index
      • Expanded use of Elasticsearch in other problem domains
      • Plans to replace the relational data sources with Hadoop, retaining Elasticsearch as the high-speed query engine on top
  Learn more at GS.com/Engineering

    in this section is neither a licensed engineer nor an individual offering engineering services to the general public under applicable law. These materials (“Materials”) are confidential and for discussion purposes only. The Materials are based on information that we consider reliable, but Goldman Sachs does not represent that it is accurate, complete and/or up to date, and it should not be relied on as such. The Materials do not constitute advice nor is Goldman Sachs recommending any action based upon them. Opinions expressed may not be those of Goldman Sachs unless otherwise expressly noted. As a condition to Goldman Sachs presenting the Materials to you, you agree to treat the Materials in a confidential manner and not disclose the contents thereof without the permission of Goldman Sachs. © Copyright 2015 The Goldman Sachs Group, Inc. All rights reserved.