This page documents the incident response playbooks the Core Services team can use during incidents affecting the Sourcegraph Managed Services Platform (MSP) fleet and shared platform.
<aside> 💡 For more per-service, MSP-operator-oriented guidance, refer to the Managed Services infrastructure pages instead.
</aside>
If an MSP service outage occurs, declare an incident, which more or less means using the `/incident` command to create one. Assess the impact of the outage and configure the incident as appropriate:

- Use the `owners` field in the service specification to infer what channels and stakeholders need to be notified.
- Fill in the `Affected Services` field of the incident creation template.

Quick links and a brief summary are below. For more details, refer to the more generalized guidance.
- `Managed Services Platform Operators` can be used in case a non-Core-Services teammate needs access, or if there is some other issue accessing the workspace.
- `owners` can be used for escalated access to Sourcegraph's entire Terraform Cloud account. Use with care!
- `mspServiceEditor` or `mspServiceReader` via Entitle on the specific service environment's project.
- `mspServiceEditor` or `mspServiceReader` on the service's folder:
  - `category: prod` services: Entitle: `mspServiceEditor` on the `Managed Services` folder
  - `category: internal` services: Entitle: `mspServiceEditor` on the `Internal Services` folder
  - `category: test` services: All engineers should have access by default (test services are placed in the `Engineering Projects` folder)

`mspServiceEditor` and `mspServiceReader` are available for convenience, and are configured in `gcp/org/customer-roles/msp.tf` in the infrastructure repo. Additional roles can be requested directly via Entitle.

Service-specific guidance is generated in Managed Services infrastructure pages.
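The folder-level access guidance above can be summarized as a small lookup. The following is only an illustrative sketch: the category, role, and folder names are taken from this page, while `FolderAccess`, `FOLDER_ACCESS`, and `folder_access` are hypothetical names that do not exist in any Sourcegraph tooling.

```python
# Hypothetical helper summarizing the folder-level access guidance on this page.
# Category, role, and folder names come from the page; everything else is
# illustrative and not part of `sg msp` or any real tooling.
from typing import NamedTuple, Optional


class FolderAccess(NamedTuple):
    folder: str                  # GCP folder that holds the service's projects
    entitle_role: Optional[str]  # None means no Entitle request is needed


FOLDER_ACCESS = {
    "prod": FolderAccess("Managed Services", "mspServiceEditor"),
    "internal": FolderAccess("Internal Services", "mspServiceEditor"),
    # Test services live in a folder all engineers can access by default.
    "test": FolderAccess("Engineering Projects", None),
}


def folder_access(category: str) -> FolderAccess:
    """Return the GCP folder and Entitle role to request for a service category."""
    try:
        return FOLDER_ACCESS[category]
    except KeyError:
        raise ValueError(f"unknown service category: {category!r}") from None
```

For example, `folder_access("internal")` yields the `Internal Services` folder with `mspServiceEditor`, while `folder_access("test").entitle_role` is `None` because no request is needed.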
Custom Terraform (`*.tf`) can be added to relevant environment workspaces in the `managed-services` repository to quickly provision and manage custom infrastructure using Terraform Cloud during an incident, without needing to make significant changes to `sg msp` to introduce a new resource.
In peacetime, all service workspaces are left in "VCS mode", where the remote `managed-services` repository is used when running Terraform plan and apply in Terraform Cloud.
Changes to the repository automatically trigger a plan as part of repository CI, and merging to `main` automatically deploys the workspaces.