Backup restore drill checklist when production looks healthy

Our dashboards said backups were green for months, but the first real restore drill failed on a staging server. The dump was present and the storage job had a success status. The failure came later: the target database was missing an extension, one role did not exist, and a scheduled job started writing before the app smoke test finished. I changed the drill into a repeatable runbook. First I…
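
The three failures above (missing extension, missing role, a scheduled job writing too early) can all be caught by a preflight gate before the smoke test. Here is a minimal sketch of that idea, not the runbook itself. The post does not name the database engine or the scheduler, so every name here (`DrillReport`, `preflight`, `example_extension`, `app_readonly`) is hypothetical, and the "present" sets are assumed to come from whatever catalog query your engine provides.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class DrillReport:
    """Result of comparing the restore target against what the app needs."""
    missing_extensions: set[str] = field(default_factory=set)
    missing_roles: set[str] = field(default_factory=set)
    scheduler_paused: bool = False

    @property
    def ready_for_smoke_test(self) -> bool:
        # Only proceed when nothing is missing and no scheduled job can write.
        return (
            not self.missing_extensions
            and not self.missing_roles
            and self.scheduler_paused
        )


def preflight(
    required_extensions: set[str],
    present_extensions: set[str],
    required_roles: set[str],
    present_roles: set[str],
    scheduler_paused: bool,
) -> DrillReport:
    """Compare required items with what exists on the restore target.

    The 'present' sets are assumed inputs; how you collect them depends
    on your database and is not specified in the original post.
    """
    return DrillReport(
        missing_extensions=set(required_extensions) - set(present_extensions),
        missing_roles=set(required_roles) - set(present_roles),
        scheduler_paused=scheduler_paused,
    )


# Hypothetical drill: one extension and one role absent, scheduler still running.
report = preflight(
    required_extensions={"example_extension"},
    present_extensions=set(),
    required_roles={"app_owner", "app_readonly"},
    present_roles={"app_owner"},
    scheduler_paused=False,
)
print(report.ready_for_smoke_test)  # False
```

Keeping the check as a pure function over sets means the same gate can run in a drill script or in CI, with the catalog lookups swapped in separately.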
