Dependency Failure Playbook — Mission Control

Operator guide for external dependency failures with detection signals, immediate mitigations, rollback choices, and copy-ready comms.

Dependency lanes: 5
AWS, SES, SNS, CloudFront, and Cognito failure modes.
Default comms SLA: 15m
First operator update should go out within 15 minutes of confirmed disruption.
Fastest fallback: Cache
CloudFront cache holds the public shell while deeper systems recover.
Rollback posture: Fail closed
Prefer safe degradation, queueing, or stale-safe responses over partial corruption.
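
The fail-closed posture above can be illustrated with a minimal sketch. It assumes a hypothetical fetch callable and an in-process cached copy; the names `CachedValue`, `DependencyUnavailable`, and `fetch_stale_safe` are illustrative and not part of this playbook. The idea is to serve a fresh response when possible, a stale copy only while it is still within a safe age, and otherwise refuse rather than return partial data.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class CachedValue:
    """Hypothetical cached copy of a dependency response."""
    body: str
    stored_at: float  # epoch seconds when the copy was stored


class DependencyUnavailable(Exception):
    """Raised when neither a fresh nor a stale-safe response exists."""


def fetch_stale_safe(
    fetch: Callable[[], str],
    cached: Optional[CachedValue],
    max_stale_seconds: float,
    now: Optional[float] = None,
) -> Tuple[str, bool]:
    """Return (body, is_stale); fail closed if no safe copy is available."""
    now = time.time() if now is None else now
    try:
        # Happy path: the dependency answered, so the response is fresh.
        return fetch(), False
    except Exception:
        # Degrade to the cached copy only if it is within the allowed age.
        if cached is not None and now - cached.stored_at <= max_stale_seconds:
            return cached.body, True
        # Fail closed: no response is better than a partial or corrupt one.
        raise DependencyUnavailable("no fresh or stale-safe response available")
```

The `max_stale_seconds` limit is an assumption; the right value depends on how long each lane's data stays safe to show.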
Use this during operator triage, not after the fact: confirm the lane, announce the customer-safe version first, then escalate into rollback or containment. Copy actions below are generated from the audience/owner fields so you can ship a comms update without rewriting under pressure.

Comms kit

Set the audience and owner once. Each lane will build a ready-to-send update for that failure mode.
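
As a rough illustration of how a lane update might be assembled from the audience and owner fields, here is a small sketch. The message wording, the function names `first_update_deadline` and `build_update`, and the field layout are assumptions for illustration only; the page's actual template is not shown here. The 15-minute value comes from the default comms SLA stated above.

```python
from datetime import datetime, timedelta, timezone

# Default comms SLA from this playbook: first update within 15 minutes.
COMMS_SLA = timedelta(minutes=15)


def first_update_deadline(confirmed_at: datetime) -> datetime:
    """Latest time the first operator update should go out."""
    return confirmed_at + COMMS_SLA


def build_update(audience: str, owner: str, lane: str, confirmed_at: datetime) -> str:
    """Assemble a hypothetical ready-to-send update for one dependency lane."""
    deadline = first_update_deadline(confirmed_at)
    return (
        f"To: {audience}\n"
        f"Owner: {owner}\n"
        f"Lane: {lane}\n"
        f"Disruption confirmed at {confirmed_at:%H:%M} UTC. "
        f"First update due by {deadline:%H:%M} UTC."
    )


confirmed = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
message = build_update("Customers", "On-call operator", "SES", confirmed)
print(message)
```

Setting the audience and owner once and passing them to every lane keeps the wording consistent across updates sent under pressure.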


Escalation ladder

Simple sequencing when the dependency is outside your control.

Current comms preview

This updates live from the controls above and the selected lane copy actions.
