IstanbulFlow meetup - Always-on with OpsGenie

In this presentation, I talk about OpsGenie's past, present, and future as well as the product. In the second half, I talk about key points that make OpsGenie a successful global SaaS player and the challenges along the way.

Serhat Can

May 10, 2018

Transcript

  1. About me • Trying to write my master's thesis • Ex-Software

    Engineer • Technical Evangelist - WTF is this and why? • 2+ years at OpsGenie • Organizer of: ◦ Serverless Turkey meetup ◦ DevOps Turkey meetup ◦ DevOpsDays İstanbul • @srhtcn on Twitter
  2. Agenda First half OpsGenie’s history, present, and future The need

    and the product Second half Working hard towards success Challenges
  3. History, the beginning, 6 years before 2012 • CEO was

    working as a highly experienced field engineer in enterprise monitoring ◦ Enterprise monitoring products were clearly missing features & processes • Built software to integrate & rule all IT systems: ROSS • Customers came from the enterprise market ◦ Salesforce.com, ABC News, SITA, Hughes ... • Revenue model ◦ Product + yearly support + cost of custom work • Result: neither good nor bad ◦ Growth rate was below expectations ◦ We believed we could do better in terms of growth
  4. History, let's jump into SaaS business, 2011 • World is

    going SaaS • What can we offer as SaaS? OpsGenie ◦ Alert notifications, just one of the features of the old product ◦ We knew there was demand for it; it had the potential to sell ◦ We knew the domain well • Start with the minimum viable product: 1 year build time, 2012 release • Try to sell it ◦ Reach out to old customers ◦ Build & market integrations to reach new customers • Iterate as feedback flows in from visitors & customers • Revenue model ◦ Per user, per month / annual
  5. Growth in 6 years - Present • 3000+ customers •

    50,000+ users • 3x revenue increase each year - gets harder but doable • $10M investment from Battery Ventures • 3 offices: Ankara (~75), Boston (~80), and Washington DC (~10) ◦ Engineering: 3 to 25 to 60+ ◦ Sales (including account managers): 60+ ◦ Customer success: 15 ◦ Marketing: 12 ◦ Product: 5 ◦ Management & HR: 8
  6. Who uses OpsGenie • Those who want to do good

    monitoring by all means :) • Those using all the nice monitoring solutions • IT operations, DevOps, sysadmins, developers • Customer success, ticketing system operators
  7. 200+ Integrations • Native integrations with monitoring, ticketing, collaboration, and

    chat tools • Client SDKs and tools (lamp, marid) • Custom solutions
  8. The future Sky is the limit! Every company that operates

    always-on services (nowadays everyone!) needs OpsGenie.
  9. The impact of downtime & perf degradation Direct revenue loss

    Unhappy users Loss of credibility Loss of opportunity
  10. Role in IT Operations IT Systems Factories Scientific Labs Food

    trucks ... Real World Monitoring Alerting Alert Management Notifications Security tools Custom software Marid Operations Notify via phone call, SMS, mobile app, emails Notify right person, right time Notify collaboration tools & IT systems Help operator solve fast ... Need to understand & solve fast Needs clean & rich information from multiple systems Ticketing, Collaboration IT management All about monitoring ...
  11. Plan and prepare for incidents • Determine who should respond

    • Use templates to prepare messaging and communication channels to responders and stakeholders • Predefine collaboration methods, including video conferences and chat channels • Create status pages to communicate proactively to all stakeholders
  12. Ensures issues are never missed, and the right people are

    notified • Easily manage on-call schedules for multiple teams • Use OpsGenie’s flexible rules engine to route alerts to the right people • Notify responders using multiple channels • Automatically escalate alerts until action is taken
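
The "escalate until action is taken" idea on this slide can be sketched in a few lines. This is only an illustration, not OpsGenie's implementation or API: `EscalationStep`, `Alert`, `responders_to_notify`, and the minute-based timing are all hypothetical names and assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class EscalationStep:
    delay_minutes: int  # minutes after alert creation before this step fires (assumed unit)
    responder: str      # hypothetical responder or team name


@dataclass
class Alert:
    message: str
    acknowledged_at: Optional[int] = None  # minute at which someone took action, if any


def responders_to_notify(alert: Alert, steps: List[EscalationStep], now_minute: int) -> List[str]:
    """Return the responders whose escalation step has fired by now_minute."""
    notified = []
    for step in sorted(steps, key=lambda s: s.delay_minutes):
        if step.delay_minutes > now_minute:
            break  # this step and all later ones have not fired yet
        if alert.acknowledged_at is not None and alert.acknowledged_at <= step.delay_minutes:
            break  # someone took action before this step, so escalation stops
        notified.append(step.responder)
    return notified


# Hypothetical policy: primary on-call first, then secondary, then a manager.
policy = [
    EscalationStep(0, "primary-on-call"),
    EscalationStep(5, "secondary-on-call"),
    EscalationStep(15, "team-manager"),
]
print(responders_to_notify(Alert("database unreachable"), policy, now_minute=7))
```

With an unacknowledged alert at minute 7, the first two steps have fired; once the alert is acknowledged, later steps are skipped.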
  13. Provides insight to improve your operational efficiency • Track and

    analyze all response activities • Ensure on-call workloads are distributed and balanced • Understand the frequency and source of alerts
  14. Feature set in a nutshell • Reliable alerting through multiple notification

    channels • On-call schedules and escalations • Flexible routing and powerful integrations • Alert policies • Audio and video bridges • Incident command center • Reporting and analytics • Heartbeats • Incoming Call Service and incident management
  15. Startup Culture • One shared goal: leading the company to success

    • Everyone should take ownership & do their best towards success • Be agile & iterative • Definitely does not mean working 24/7
  16. Value all members • Let each member feel comfortable & feel

    ownership ◦ So that she can contribute her own efforts freely ◦ Listen to everybody • Let each member engage in all stages, let her become a master ◦ Requirement analysis, design, development, testing, quality assurance, delivery to production & monitoring ◦ So let her have end-to-end responsibility ◦ So let her have a say in & an understanding of all phases • Career paths should be valuable for everyone • Salary policy ◦ Show members that you value them with a good salary ◦ Common bonus model ◦ Stock options • ! Monitor your members by asking for feedback on whether they feel valued
  17. Value the customer The product itself should satisfy customer needs

    in well-defined ways, but we also: • Should respond to customer queries fast • Should solve problems fast, which requires good logging, alerting, monitoring • Should introduce minor features fast, which requires a flexible product & agile team • Making customers happy is surely harder than selling the product • ! Monitor customer happiness & usage ◦ Sales efforts are worthless if the customer leaves soon after
  18. Favor communication over bureaucracy • Sharing ideas makes us learn

    faster and makes the product better • Sharing ideas lets the team feel they are working towards the same goal • Sharing ideas saves time; experience becomes more visible • Encourage everyone to communicate as much as possible, even if you have well-designed requirements documents • Get rid of useless steps & communications
  19. Positive working environment • One should only deal with the

    business ◦ We are lucky that we are running our own business • One should find enough help when needed ◦ Open office ◦ Slack ◦ Great teamwork • One should be satisfied ◦ Tiny snack bar ◦ Tiny bicycles ◦ Humor factor ◦ MacBook & big screen ◦ Sofas
  20. Agile & iterative progress by all means • Not only during

    the development cycle • Also in product design & release • Also in growing the team • Also in growing the company • The benefit is lower risk and wiser use of time • Don’t forget to monitor all processes
  21. Design & Spend Time Wisely • Being effective is more

    important than team size ◦ Always pursue efficient, simple, cost-effective solutions • Time is very important for a startup ◦ If used efficiently, output grows significantly • Communication & agility greatly help in spending time wisely • How: this is the real engineering ◦ Always spend time on planning & designing as a whole ◦ ! Monitor the process & make improvements • Result ◦ Each team member becomes a master ◦ The whole team becomes a master
  22. Keep track of the world, use the right tools • Do

    not lose track of the IT world ◦ Even if you do not use the new technologies or products, track the culture, approach & design behind them ◦ Global companies like to share their experience everywhere: blogs & events • Adopt the right, effective set of tools for your needs ◦ Replace tools when they become a burden or hard to manage • Example of the right tool: AWS services ◦ It is very hard to achieve the same success without AWS ◦ Without AWS we would be spending significantly more manpower & resources ◦ Still need to use AWS services carefully :)
  23. Extending the Product • Keep track of changes in the IT management

    world • Get feedback from customers • Find suitable extensions to the product ◦ Use the experience & technology you already have to create new features ◦ Find more customers ◦ Find more users within the existing customer base • Still be iterative, agile, efficient & wise • ! Monitoring is needed when extending into new worlds
  24. Marketing is everyone’s job, but also is a department •

    The most important team - evangelism! :D • Create content for each stage of customer journey • Reuse content in different mediums, plan ahead • Keep track of everything - monitor • Keep your business data in a central place, ready to be used for making informed decisions • User experience and quality landing pages become a competitive advantage • Messaging matters • Follow the trends and produce videos! • Have regular webinars for your prospects and customers
  25. Sales • We are hiring! • Outbound and inbound •

    Once a small team uses OpsGenie, it is easier to expand the account because our product works • Support sales with good content and easy to use examples & demo accounts • US and EU are different • Enterprises require special care
  26. IT Monitoring Domain is too complex • Analysis & experience in

    the domain are very important • A bunch of tools to integrate with • Every customer has its own way of managing it • The solution should satisfy different customers & tools • A simple, extensible alert structure: customizable data, tags, actions, flow
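
One way to picture the "simple, extensible alert structure" from this slide is a small common shape that every tool-specific event is mapped into. The sketch below is an assumption made for illustration only: the field names, the `from_monitoring_event` helper, and the sample payload are hypothetical and are not OpsGenie's actual alert schema or API.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Alert:
    message: str
    source: str                                             # which tool raised the alert
    tags: List[str] = field(default_factory=list)           # free-form labels
    details: Dict[str, str] = field(default_factory=dict)   # customizable key/value data
    actions: List[str] = field(default_factory=list)        # custom actions offered to the responder


def from_monitoring_event(source: str, event: Dict[str, Any]) -> Alert:
    """Map a tool-specific event into the common shape.

    Keys the common shape does not know about are kept in `details`,
    so a new tool can be supported without changing the structure.
    """
    known = {"message", "tags", "actions"}
    return Alert(
        message=str(event.get("message", "")),
        source=source,
        tags=list(event.get("tags", [])),
        details={k: str(v) for k, v in event.items() if k not in known},
        actions=list(event.get("actions", [])),
    )


# Hypothetical payload from an imaginary monitoring tool.
alert = from_monitoring_event(
    "monitoring-tool-a",
    {"message": "CPU high", "tags": ["prod"], "host": "web-1", "cpu": 97},
)
print(alert)
```

Keeping the fixed part small and pushing everything else into `details` is what lets different customers and tools share one structure.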
  27. The decision to build integrations & client tools • What if

    customers implement their own integrations ◦ We have the genie running as SaaS doing alerting ◦ Is that enough to engage customers? ◦ Can every potential customer deal with the hassle of implementing integrations? • What if we build them ◦ We spend significant effort on implementation & maintenance ◦ A reusable piece of external software can be used by every company ◦ The customer spends less time & less effort ◦ It becomes much faster to integrate all your solutions ◦ Use existing integrations; if none fits, use the client SDKs & tools • Every step towards improving customer engagement helps success
  28. Scaling the technical product • Start with the minimum viable

    product, but do not forget monitoring • Detect potential problems as early as possible and plan future changes • We started with an already scalable product and made improvements along the way ◦ Technology shifts: AWS SimpleDB > AWS RDS > AWS DynamoDB • Work chaos earlier • Luckily we process already filtered data, so managing traffic is not a huge problem - still, the load can be unexpected • The cost of the services in use will increase as the business grows, which requires changes along the way
  29. Searching for new team members • Developer personality: self-learner,

    productive, agile, ready to share the burden • Developer skill set: ability to cope with problems, new tech, hacker-like • Hard to find developers who have both • Find good talent through internships and hire them part-time
  30. Scaling the team • Adding new members doesn’t scale productivity

    as expected • Find ways to train new members in the culture & progress ◦ 3-6 months of incremental issues, heavy pair development ◦ The product's technical depth is not huge, but we have a huge set of tech stuff • Split into teams: Platform (now 5 teams), Integrations, SRE, Customer Success, Marketing, Sales • The Platform team is still huge, and splitting it is very hard (mostly done!) • Define new roles for managing the process • Try and experiment on the way • Communicate openly while making changes
  31. Scaling the code-base • 3 people built the initial product

    with extreme programming & test-driven development • Also introduced ugly, unrelated approaches from the old product • As the team grows, we need to make it more readable, robust, error-free • Single responsibility & layers are very important • We are not living in a world in which every developer is able to learn extremely fast & knows every piece of the product
  32. Creating reliable and highly available software • Being highly available these

    days requires more than multiple AZs • Reducing the number of problems in prod in a fast-paced, fast-growing team requires proper testing and automation • As you grow, team sizes increase, but having a monolith brings a lot of problems • In some cases, boring technology is better • Still, try to take advantage of what is next - Serverless • We feel the pain of our customers and use OpsGenie at OpsGenie
  33. Security Employee training becomes a must Secure software requires thinking

    about security from the beginning Enterprises need extensive security audits and certificates
  34. Suggested resources Books: Incident Management for Operations, The DevOps Handbook

    Links: https://medium.com/@serhatcan/opsgenie-nedir-ne-yapar-b688eaf724f1 opsgenie.com/blog opsgenie.com/resources engineering.opsgenie.com