[EN] Mercari Server-Side Deployment: Present and Future

mercari

September 30, 2017

Transcript

  1. SRE
    • Abbreviation of Site Reliability Engineering
    • Coined by Ben Treynor, leader of Google’s operations team
    • Google SRE applies software engineering to increase site/service reliability across Google’s various products and services
  2. Mercari SRE
    • Mission: to ensure a reliable service that is enjoyable to use at any time
    • “Takes care of all engineering apart from new service development”
    • 2015/11: changed name from “Infrastructure Team” to “SREs”
    • SRE is a better indication of what we do
    • Currently 10 members, always looking for new hires
  3. Mercari Server Side Deployment Requirements
    • Continuous delivery
    • More than 10 deploys per day
    • GitHub pull request based development
    • Uses pull requests in the production environment
    • Merge into master branch = production deploy
    • Confirmation prior to deployment
    • Is it peer reviewed?
    • Is the database migration finished?
    • Do you have release approval from your manager?
  4. Confirmation of Deployment
    • From the production quality assurance point of view
    • It must be tied to tickets
    • It must be peer reviewed
    • It must be possible to go back and see what you released in the past and when
    • From the release accident prevention point of view
    • Check that DBMigration (DB table or record necessary for the code) has been implemented
    • If the code is changed again after a peer review, it must be reviewed again
  5. Directly after the launch - 2015/2/17
    • Deployed regularly once a week
    • The amount of information included in each deploy increased significantly
    • Difficult to differentiate between rollbacks and continued deployment
    • Emergency deployments every day
    • Weekdays (Mon-Thu) were DeployDays
    • We wrote the pull requests that we wanted to deploy on a whiteboard
    • System of taking turns
    • Checked whether release requirements were met each time
    • Deployment implemented with ansible-playbook
  6. Key Features
    • Automated with Slack bots
    • All of the deployment can be done on Slack
    • Works with Google Calendar
    • Can plan a time for the release
    • Confirms the pull request
    • Is a Redmine ticket URL included in the description?
    • What is the manager approval status in Redmine?
    • Is there an SQL file in the source diff?
    • Makes DBMigration essential
    • Has someone other than the PR creator added the LGTM label?
    • Shows that peer review has been done
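
The pull request checks listed on this slide could be sketched roughly as follows. This is a minimal illustration and not the bot from the talk: the `pr` and `files` shapes, the Redmine host `redmine.example.com`, and all function names are assumptions made for the example.

```javascript
// Hypothetical sketch of the slide's pull request checks.
// `pr` is assumed to look like { body: string } and `files` like
// [{ filename: string }]; neither shape comes from the talk.

// Assumed ticket URL pattern; the real Redmine host is not given in the talk.
const TICKET_URL = /https?:\/\/redmine\.example\.com\/issues\/\d+/;

// Does the description contain a Redmine ticket URL?
function hasTicketUrl(pr) {
  return TICKET_URL.test(pr.body || '');
}

// Does the diff contain at least one SQL file (the DB migration)?
function hasSqlFile(files) {
  return files.some(f => f.filename.toLowerCase().endsWith('.sql'));
}

// Run every check and collect the reasons for failure.
function confirmPullRequest(pr, files) {
  const failures = [];
  if (!hasTicketUrl(pr)) failures.push('no ticket URL in description');
  if (!hasSqlFile(files)) failures.push('no SQL file in diff');
  return { ok: failures.length === 0, failures };
}
```

Returning the list of failures, rather than a bare boolean, lets a bot explain in Slack or on the pull request which requirement is still missing.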
  7. Deploy Overview (1st Generation)
    [Sequence diagram]
    • Participants: <<Developer>>, <<Manager>>, GitHub, Redmine, Google Calendar, Deploy bot (Slack bot), API Servers
    • Messages: 1-1 git push; 1b-1 approve; 2-1 entry(DeployDATE); 2-2 approve?; 2-3 deploy(); 2-4 comment(“deploy done”); 2-5 entry(“deploy done”)
  8. Key Features
    • Separates the deploy bot and the ITGC review bot
    • In order to increase deployment possibilities in the future
    • GitHub review bot
    • A bot that reviews sources on GitHub
    • Confirms the pull request
    • Is there a JIRA issue link in the description?
    • Doesn’t apply to master branch
    • Doesn’t review PR
    • GitHub Integrations & services
    • Amazon SNS
    • AWS Lambda
  9. Deploy Overview (2nd Generation)
    [Sequence diagram]
    • Participants: <<Developer>>, <<Manager>>, GitHub, JIRA, Google Calendar, <<Amazon SNS>>, Review bot (AWS Lambda), Deploy bot (Slack bot), API Servers
    • Review messages: 1a-1 git push / comment; 1a-2 event(:PullRequest); 1a-3 result := review(:PullRequest); 1a-4 approve?; 1a-5 review(:result); 1b-1 approve
    • Deploy messages: 2-1 entry(DeployDATE); 2-2 comment(“review please”); 2-3 deploy(); 2-4 comment(“deploy done”); 2-5 entry(“deploy done”)
  10. GitHub Review Bot
    • AWS Lambda + SNS
    • npm github
    • Can also use Preview API
    • Doesn’t work with GraphQL v4

        // eval_items: array of Promises, each resolving to {approve, body}
        // p_r: PullRequest object received via Amazon SNS
        // `github` (client from the npm github package), MSG_OK and MSG_NG
        // are defined elsewhere
        (eval_items, p_r) => {
          // Return the chain so the caller can wait for the review to be posted
          return Promise.all(eval_items)
            .then(results => {
              // Approve only if every individual check approved
              const approve = results.every(r => r.approve);
              const body = approve ? MSG_OK : MSG_NG;
              return github.pullRequests.createReview({
                owner: 'xxxx',
                repo: p_r.head.repo.name,
                number: p_r.number,
                body: body,
                event: approve ? 'APPROVE' : 'REQUEST_CHANGES'
              });
            });
        }
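
As a hedged sketch of how the `p_r` object might reach the review function: a Lambda function subscribed to an SNS topic receives the published message as a JSON string in `event.Records[i].Sns.Message`. The example below assumes that GitHub publishes its webhook payload as that message and that pull request payloads carry a `pull_request` field; the function name `extractPullRequests` is hypothetical.

```javascript
// Sketch: pull the PullRequest objects out of an SNS-triggered Lambda event.
// Assumption: each SNS message body is a GitHub webhook payload in JSON.
function extractPullRequests(event) {
  return (event.Records || [])
    .map(record => JSON.parse(record.Sns.Message))
    // Keep only payloads that actually describe a pull request
    .filter(payload => payload.pull_request)
    .map(payload => payload.pull_request);
}
```

Each returned object then exposes the fields the slide's code reads, such as `number`, `user.login` and `head.repo.name`.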
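  11. Self P-R Label Confirmation
    • Label add/remove history is available as events from github.issues.getEvents
    • Compare event.actor.login and p_r.user.login

        // p_r: PullRequest object; `github` is the client from the npm github package
        (p_r) => {
          return github.issues.getEvents({
            owner: 'xxxx',
            repo: p_r.head.repo.name,
            number: p_r.number
          }).then(resp => {
            // As on the slide, resp is treated as the array of events;
            // depending on the client version it may be under resp.data
            // Get the most recent event that added the LGTM label
            const last_ev = resp.filter(ev => {
              return ev.event == 'labeled' && ev.label.name == 'LGTM';
            }).reverse()[0];
            // Guard added here (not on the slide): no LGTM label yet
            if (!last_ev) {
              return {approve: false, body: 'No LGTM label'};
            }
            if (last_ev.actor.login == p_r.user.login) {
              return {approve: false, body: 'Self Labeled'};
            } else {
              return {approve: true, body: 'OK'};
            }
          });
        }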
  12. The events below trigger the bot
    • Pull request creation
    • Pull request comment creation
    • Source push
    • Attaching/deleting labels
    • Peer review
    • DBMigration completion
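
One way to express these triggers is a lookup from GitHub webhook event names and actions to a "run the review" decision. The mapping below is an assumption about how the slide's triggers correspond to webhook events, not the configuration used in the talk; `TRIGGERS` and `shouldReview` are hypothetical names. DBMigration completion is left out because the talk does not say how that signal is delivered.

```javascript
// Assumed mapping from the slide's triggers to GitHub webhook events/actions.
const TRIGGERS = {
  // created, source push, label attached, label deleted
  pull_request: ['opened', 'synchronize', 'labeled', 'unlabeled'],
  // pull request comment creation
  issue_comment: ['created'],
  // peer review
  pull_request_review: ['submitted'],
};

// Decide whether an incoming webhook should start a bot review.
function shouldReview(eventName, action) {
  const actions = TRIGGERS[eventName];
  return Array.isArray(actions) && actions.includes(action);
}
```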
  13. First version
    • Showed the results in the status area
    • Developers found them hard to see alongside the CI test results
  14. 2nd beta
    • Showed the results as review comments
    • Feedback: the timeline became messy every time a push occurred
  15. 2nd beta (continued)
    • The NOT_IT_CONTROL label shows that the change is a simple adjustment that doesn’t need to be tied to a ticket
    • Determined by the types of files included in the pull request
    • The label is added automatically
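
A minimal sketch of file-type based labelling, assuming the rule is "every changed file has an extension from an allow-list". The talk does not say which file types qualify, so `SIMPLE_EXTENSIONS` and the function names are purely illustrative.

```javascript
// Hypothetical allow-list; the actual file types are not given in the talk.
const SIMPLE_EXTENSIONS = new Set(['.md', '.txt']);

// Lower-cased extension including the dot, or '' when there is none.
function extensionOf(filename) {
  const i = filename.lastIndexOf('.');
  return i < 0 ? '' : filename.slice(i).toLowerCase();
}

// A change is "simple" only if it touches files and all are on the allow-list.
function isSimpleAdjustment(filenames) {
  return filenames.length > 0 &&
    filenames.every(name => SIMPLE_EXTENSIONS.has(extensionOf(name)));
}

// Labels the bot would add automatically for this set of changed files.
function labelsFor(filenames) {
  return isSimpleAdjustment(filenames) ? ['NOT_IT_CONTROL'] : [];
}
```

Requiring every file to match keeps a pull request that mixes documentation and code under the normal ticket requirement.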
  16. Current version
    • As soon as a pull request is created, the bot review begins
    • Doesn’t show result details straight away
    • Shows specific review results in developer comments
  17. GitHub Bot Roundup: Considering User Experience
    • Don’t get in the way of developers (= users)
    • The developer is the lead
    • Shadow and watch over them
    • Keep info to a minimum, but enough to explain details if asked
    • Issues
    • There are times when developers are unsure whether the bot is operating properly
  18. Deployment that Works with Kubernetes (k8s)
    • How should we implement microservices?
    • Decided to make k8s the foundation
    • Started using Spinnaker
    • Continuous delivery platform
    • Developed by Netflix
    • Open-sourced in 2015, in collaboration with Google
    • Relationship with microservices
    • Developers can have a sense of ownership of deployment as well
  19. Spinnaker
    • Hopes
    • Possible to create a pipeline and deploy in various styles
    • High hopes for Automated Canary Analysis
    • Challenges
    • GUI-based configuration
    • We want to configure it by code!
    • Reviewing in pull requests / managing change history
    • Because the development community is new, there is still little information on troubleshooting
    • Often breaks on version updates
  20. The (Near) Future of Deployment
    • Operational improvement
    • More automation
    • Internal use of Spinnaker Deploy
    • Data Deploy
    • Machine learning data
    • Making it easier to read
    • Not just for engineers
    • Allowing the CS team to view it as well
    https://flic.kr/p/BKp7yQ
  21. References
    • Mercari Engineering Blog: http://tech.mercari.com/entry/2015/10/15/183000, http://tech.mercari.com/entry/2016/11/14/120000
    • Kubernetes: https://kubernetes.io/
    • Spinnaker: https://www.spinnaker.io/, http://tech.mercari.com/entry/2017/08/21/092743
    • GitHub REST API v3: https://developer.github.com/v3/
    • npm github: https://www.npmjs.com/package/github, http://mikedeboer.github.io/node-github/