This page documents the incident response playbooks the Core Services team can use when issues arise in the Managed Services Platform (MSP) fleet and shared platform.

<aside> 👋 This page is fairly high-level - looking for something more specific?

For more service-specific or MSP-operator-oriented guidance, refer to the Managed Services infrastructure pages instead.
For specific disaster recovery playbooks, see ‣
For general MSP platform availability information, see MSP platform availability </aside>

Basics

Declaring an incident

If an MSP service outage occurs, you should declare an incident, which in practice means using the /incident command to create one. Assess the impact of the outage and configure the incident as appropriate:

Use the owners field in the service specification to infer which channels and stakeholders need to be notified.
Make sure to fill out the Affected Services field of the incident creation template.
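
The owners lookup above can be sketched roughly as follows. This is only an illustration: the spec layout (a `service` block containing an `owners` list), the `channels_to_notify` helper, and the owner-to-channel mapping are all assumptions made for the example, not the actual MSP tooling or schema.

```python
def channels_to_notify(spec, channel_by_owner):
    """Split a service's owners into known channels and owners with no known channel.

    `spec` is assumed to be an already-parsed service specification with a
    `service.owners` list (hypothetical layout). `channel_by_owner` is a
    hypothetical mapping you maintain yourself; it is not part of MSP.
    """
    owners = spec.get("service", {}).get("owners", [])
    channels, unmapped = [], []
    for owner in owners:
        channel = channel_by_owner.get(owner)
        # Route to the right bucket, skipping duplicates while keeping order.
        target, value = (channels, channel) if channel else (unmapped, owner)
        if value not in target:
            target.append(value)
    return channels, unmapped
```

Owners that come back in the second list have no known channel and need to be contacted manually.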

Infrastructure access

Quick links and a brief summary are below; for more details, refer to the more generalized guidance.

Service Infrastructure
- You can request mspServiceEditor or mspServiceReader on the service's folder:
  - category: prod services: Entitle: mspServiceEditor on the Managed Services folder
  - category: internal services: Entitle: mspServiceEditor on the Internal Services folder
  - category: test services: All engineers should have access by default (test services are placed in the Engineering Projects folder)
- mspServiceEditor and mspServiceReader are available for convenience, and are configured in gcp/org/customer-roles/msp.tf in the infrastructure repo. Additional roles can be requested directly via Entitle.
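
As a quick reference, the category-to-access mapping above can be written as a small lookup. This is a minimal sketch: the `access_guidance` helper and the dictionary are hypothetical conveniences that transcribe the list above, not part of any MSP or Entitle tooling.

```python
# Access guidance per service category, transcribed from the list above.
ACCESS_BY_CATEGORY = {
    "prod": "Entitle: mspServiceEditor on the Managed Services folder",
    "internal": "Entitle: mspServiceEditor on the Internal Services folder",
    "test": "No request needed: all engineers should have access by default "
            "(Engineering Projects folder)",
}


def access_guidance(category):
    """Return the access guidance for a service category (hypothetical helper)."""
    key = category.strip().lower()
    if key not in ACCESS_BY_CATEGORY:
        raise ValueError(f"unknown service category: {category!r}")
    return ACCESS_BY_CATEGORY[key]
```

For example, `access_guidance("internal")` returns the Entitle request for the Internal Services folder, and an unrecognized category raises a `ValueError` rather than guessing.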
Terraform Cloud
- Core Services team members should be part of the Core Services team in Terraform Cloud, which should have access to all MSP TFC workspaces by default.
- Entitle: Managed Services Platform Operators can be used in case a non-Core-Services teammate needs access, or if there is some other issue accessing the workspace.
- Entitle: owners can be used for escalated access to Sourcegraph's entire Terraform Cloud account. Use with care!
- CLI-apply mode: sourcegraph-secrets GSM access is needed for sg msp tfc commands and for running terraform apply.

Service-specific guidance is generated in Managed Services infrastructure pages.

Disaster recovery playbooks

→ See ‣

Core concepts

Custom terraform