This page documents the core components of Sourcegraph Managed Services Platform (MSP) infrastructure that are provisioned for services, their availability properties, and the disaster recovery processes available for them.

See also MSP technical details for more about MSP internals.



Availability of GCP infrastructure

MSP primarily provisions GCP-managed infrastructure to support services. For GCP-managed infrastructure, in general, the following terms are used:

We are not currently aiming to provide multi-regional availability, and we do not offer SLA details beyond the ones offered by GCP for the products we use.

| Component | Description | Default | Regional HA option | Uptime SLA | References | Notes on enabling regional HA |
| --- | --- | --- | --- | --- | --- | --- |
| Cloud Run | Runs service containers | Regional | n/a | 99.95% | Cloud Run: Zonal redundancy, Cloud Run disaster recovery | n/a |
| Cloud SQL | PostgreSQL database instance provider | Zonal | Yes: 2x cost (~$200/mo for high-volume usage resources) | 99.95% with regional HA | Cloud SQL PostgreSQL High Availability, Pricing | Incurs downtime ~5mins |
| Cloud Memorystore | Redis instance provider | Zonal | Yes: 2x cost (~$50/mo for high-volume usage requirements) | 99.9% | High Availability for Memorystore, Pricing | |

Regional option spend increase estimates are based on Sourcegraph Accounts Management System (SAMS) billing observed on April 23, 2024. Services that require zonal failover capabilities should opt in to the regional HA offerings for all core components that the service uses.

Services should choose availability tiers that best match their product availability requirements, and also assign Environment categories appropriately.
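To make the uptime SLA figures in the table above more concrete when choosing a tier, the following is a minimal sketch that converts an SLA percentage into a monthly downtime budget and estimates the combined availability of a service that depends on all three components. It is illustrative only: the function names are hypothetical, a 30-day month is assumed, and the combined figure assumes component failures are independent and that the service is down whenever any one component is down, which is a simplification.

```python
# Illustrative arithmetic only; not part of MSP tooling.
# Assumptions: a 30-day month, independent component failures, and a
# service that is unavailable whenever any single component is unavailable.

MINUTES_PER_30_DAY_MONTH = 30 * 24 * 60  # 43,200 minutes


def downtime_budget_minutes(sla_percent, period_minutes=MINUTES_PER_30_DAY_MONTH):
    """Maximum downtime in the period that still meets the given uptime SLA."""
    return period_minutes * (1 - sla_percent / 100)


def composite_availability(sla_percents):
    """Combined availability (percent) when every component must be up."""
    availability = 1.0
    for percent in sla_percents:
        availability *= percent / 100
    return availability * 100


# SLA figures taken from the table above.
component_slas = {
    "Cloud Run": 99.95,
    "Cloud SQL (with regional HA)": 99.95,
    "Cloud Memorystore": 99.9,
}

for name, sla in component_slas.items():
    print(f"{name}: {sla}% allows about {downtime_budget_minutes(sla):.1f} min/month of downtime")

combined = composite_availability(component_slas.values())
print(f"All three combined: about {combined:.2f}% "
      f"(roughly {downtime_budget_minutes(combined):.0f} min/month)")
```

Under these assumptions, a 99.95% SLA corresponds to roughly 21.6 minutes of downtime per 30-day month and 99.9% to roughly 43.2 minutes, while a service that depends on all three components ends up closer to 99.8%. Comparing these figures against a product's own availability requirements can help decide whether the regional HA options are worth the added cost.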

Environment categories

Each service environment is assigned a category that Core Services uses to determine how important stability is for that environment, particularly when rolling out major platform changes (such as MSP infrastructure upgrades).

For more details about environment categories, please see Environment categories.

Disaster recovery

Disaster recovery playbooks

See MSP incident response, in particular Disaster recovery playbooks.

Testing disaster recovery