- User Impact. - Detection. - Resolution. - Duration/Timeline. - What Went Well? - What Went Poorly? - Where We Got Lucky? - Action Items. The documentation for the Google Cloud Python libraries was unavailable for users.
As part of repository cleanup, an engineer with write access to the development repository deleted the gh-pages branch.
User Impact
During a two-hour period, hosted reference documentation for our libraries was unavailable.
Detection
The outage was first reported by an external customer. Shortly afterward, two internal teams also noticed and notified us. The initial report came about 30 minutes after the start of the outage.
Resolution
The documentation became available again after the branch was republished to gh-pages.
Duration/Timeline
2 hours.
2019-12-05 (all times PDT)
- 09:45 Branch gh-pages deleted during cleanup of repository branches.
- 10:21 GitHub issue filed stating that docs are not available.
- 10:45 Team responds to issue.
- 10:55 Branch has been republished.
- 11:45 GitHub is serving docs again.
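The intervals implied by the timeline can be checked with a short script. This is only a sketch: the times and event descriptions are copied from the timeline, while the names `TIMELINE`, `minutes_between`, and `intervals` are illustrative and not part of any tooling the team used.

```python
from datetime import datetime

# Timeline entries copied from the postmortem (2019-12-05, times as stated there).
TIMELINE = [
    ("09:45", "gh-pages branch deleted"),
    ("10:21", "GitHub issue filed"),
    ("10:45", "team responds to issue"),
    ("10:55", "branch republished"),
    ("11:45", "GitHub serving docs again"),
]


def minutes_between(start: str, end: str) -> int:
    """Return whole minutes elapsed between two HH:MM times on the same day."""
    fmt = "%H:%M"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds() // 60)


def intervals(timeline):
    """Yield (from_event, to_event, minutes) for each consecutive pair of events."""
    for (t0, e0), (t1, e1) in zip(timeline, timeline[1:]):
        yield e0, e1, minutes_between(t0, t1)


if __name__ == "__main__":
    for before, after, mins in intervals(TIMELINE):
        print(f"{before} -> {after}: {mins} min")
    total = minutes_between(TIMELINE[0][0], TIMELINE[-1][0])
    print(f"total outage: {total} min")
```

Run against these entries, the script shows that the time to first report was 36 minutes, and that the longest single gap was the 50 minutes between republishing the branch and GitHub serving the docs again.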
What Went Well?
We were notified fairly quickly, and the team monitors GitHub issues well enough that we knew about it soon after.
What Went Poorly?
The branch should have been protected from deletion. GitHub Pages publishing is a bit opaque: although the branch was pushed within about 5 minutes of finding out, it took an additional hour or so before the files were actually being served. The lack of debugging information made this more difficult.
Where We Got Lucky?
A member of the team had a local copy of the branch on hand and was able to republish it.
Action Items
- Protect the gh-pages branch; don't allow deletion.
- Investigate whether debugging gh-pages could be improved.
- Investigate whether there are other technologies we ought to be using instead of gh-pages.
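The first action item could be carried out through GitHub's branch protection REST endpoint (`PUT /repos/{owner}/{repo}/branches/{branch}/protection`). The sketch below only builds the request and does not send it. The owner, repository, and token values are placeholders, the helper name `build_protection_request` is hypothetical, and setting `enforce_admins` to true is an assumption about what the team would want rather than something stated in the postmortem.

```python
import json
import urllib.request

API_ROOT = "https://api.github.com"


def build_protection_request(owner: str, repo: str, branch: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a PUT request that forbids deleting `branch`."""
    payload = {
        # These four keys are required by the endpoint; null means "no rule".
        "required_status_checks": None,
        "enforce_admins": True,  # assumption: apply the rule to admins as well
        "required_pull_request_reviews": None,
        "restrictions": None,
        # The setting the action item asks for: the branch cannot be deleted.
        "allow_deletions": False,
    }
    url = f"{API_ROOT}/repos/{owner}/{repo}/branches/{branch}/protection"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="PUT",
    )


if __name__ == "__main__":
    # Placeholder values; substitute a real repository and token before sending.
    req = build_protection_request("example-org", "example-repo", "gh-pages", "TOKEN")
    print(req.get_method(), req.full_url)
    # To apply the rule for real: urllib.request.urlopen(req)
```

Protected branches also reject force pushes unless `allow_force_pushes` is set, so a publishing workflow that force-pushes to gh-pages would need that option reviewed before the rule is applied.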
executed that deleted the production database.
2. An engineer executed a script that deleted the production database without confirmation.
3. Chris executed a script that deleted the production database without confirmation.