Service Reliability Suite
Tools to help you build reliable services: calculate error budgets, generate monitoring queries, and discover meaningful metrics. They grew out of problems my teams and I faced while building SRE programs.
Uptime Calculator
Calculate allowed downtime from uptime percentages. Understand your error budget in concrete time units.
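The arithmetic behind this is short enough to sketch. The snippet below is an illustration only, not the calculator's actual code; the function name `allowed_downtime` and the 30-day period are assumptions made for the example.

```python
from datetime import timedelta


def allowed_downtime(uptime_percent: float, period: timedelta) -> timedelta:
    """Return the downtime permitted within `period` for an uptime target.

    Hypothetical helper for illustration; not taken from the suite itself.
    """
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    # The error budget is whatever fraction of the period is NOT covered
    # by the uptime target.
    error_budget_fraction = (100 - uptime_percent) / 100
    return period * error_budget_fraction


# Example: a 99.9% target over an assumed 30-day month.
budget = allowed_downtime(99.9, timedelta(days=30))
print(f"Allowed downtime: {budget.total_seconds() / 60:.1f} minutes")
```

For a 99.9% target over 30 days this works out to roughly 43 minutes of allowed downtime.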
Service Level Calculator
Complete SLI/SLO/SLA calculator with burn rates and platform-specific query generation.
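As a rough sketch of the two ideas named here: burn rate is commonly defined as the observed error ratio divided by the error ratio the SLO allows, and a generated query is a template filled in with your metric names. Everything below is an assumption for illustration: the function names, the `http_requests_total` metric, and the `job` and `status` labels are hypothetical and may not match what the calculator emits.

```python
def burn_rate(error_ratio: float, slo_target: float) -> float:
    """Observed error ratio divided by the error ratio the SLO allows.

    A value of 1.0 means the budget is being spent exactly on schedule.
    """
    allowed_error_ratio = 1 - slo_target
    if allowed_error_ratio <= 0:
        raise ValueError("slo_target must be below 1.0")
    return error_ratio / allowed_error_ratio


def prometheus_error_ratio_query(metric: str, job: str, window: str) -> str:
    """Build a PromQL expression for the 5xx ratio of one job.

    Assumes a request counter with `job` and `status` labels
    (hypothetical naming; adjust to your own instrumentation).
    """
    selector = f'job="{job}"'
    errors = f'sum(rate({metric}{{{selector},status=~"5.."}}[{window}]))'
    total = f'sum(rate({metric}{{{selector}}}[{window}]))'
    return f"{errors} / {total}"


# 0.5% of requests failing against a 99.9% SLO spends budget ~5x too fast.
rate = burn_rate(error_ratio=0.005, slo_target=0.999)
query = prometheus_error_ratio_query("http_requests_total", "checkout", "1h")
print(round(rate, 2))
print(query)
```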
Advanced Calculator
Enterprise-grade SLO planning with composite SLIs, industry use cases, and multi-window burn alerts.
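A multi-window burn alert fires only when both a long and a short window exceed the same burn-rate threshold, so it reacts to sustained problems and clears quickly once they stop. The sketch below follows the approach described in the Google SRE Workbook; the function names are hypothetical, and the 2% budget over 1 hour figure is that book's commonly cited paging example, not necessarily this tool's default.

```python
def burn_rate_threshold(
    budget_fraction: float, slo_period_hours: float, window_hours: float
) -> float:
    """Burn rate that would spend `budget_fraction` of the error budget
    within `window_hours`, given the full SLO period length."""
    return budget_fraction * slo_period_hours / window_hours


def should_page(long_window_burn: float, short_window_burn: float,
                threshold: float) -> bool:
    """Page only if BOTH windows are burning faster than the threshold."""
    return long_window_burn > threshold and short_window_burn > threshold


# Spending 2% of a 30-day budget in 1 hour corresponds to a burn rate of 14.4.
threshold = burn_rate_threshold(0.02, slo_period_hours=30 * 24, window_hours=1)

# Long (1h) and short (5m) windows both hot: page.
page_now = should_page(long_window_burn=16.0, short_window_burn=15.2,
                       threshold=threshold)

# The 1h window is still elevated but the 5m window has recovered: no page.
recovered = should_page(long_window_burn=16.0, short_window_burn=0.5,
                        threshold=threshold)
print(threshold, page_now, recovered)
```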
Service Assessment
AI-guided tool to discover meaningful metrics for your service through interactive questions.
Metric Simulator
Generate and analyze simulated metrics with various distributions to understand SLO compliance.
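To give a feel for what such a simulation involves, here is a minimal sketch using only the Python standard library. It assumes a log-normal latency distribution and a latency-threshold SLI; the parameter values, function names, and the 99% target are made up for the example and do not describe the simulator's internals.

```python
import random


def simulate_latencies(n: int, mu: float, sigma: float, seed: int = 0) -> list[float]:
    """Draw `n` latency samples (ms) from a log-normal distribution."""
    rng = random.Random(seed)  # seeded so runs are reproducible
    return [rng.lognormvariate(mu, sigma) for _ in range(n)]


def good_event_ratio(latencies_ms: list[float], threshold_ms: float) -> float:
    """Fraction of requests at or below the latency threshold (the SLI)."""
    good = sum(1 for value in latencies_ms if value <= threshold_ms)
    return good / len(latencies_ms)


samples = simulate_latencies(10_000, mu=5.0, sigma=0.5, seed=42)
ratio = good_event_ratio(samples, threshold_ms=400.0)
slo_target = 0.99  # assumed target: 99% of requests under 400 ms
print(f"SLI: {ratio:.4f}, meets SLO: {ratio >= slo_target}")
```

Changing `sigma` or swapping the distribution shows how tail behavior, rather than the median, tends to decide whether a latency SLO holds.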
Why Use This Suite?
🎓 Educational
Inline help, tooltips, and examples teach SLO concepts as you use the tools.
🔗 Shareable
Every configuration is URL-shareable, so teammates can open the same setup from a link.
⚡ Practical
Generate actual queries for Datadog, Prometheus, New Relic, and more.
📋 Copy-Ready
One-click copy for all queries, formulas, and configurations.
🎨 Clean UI
Professional design that doesn't distract from the content.
💰 Cost-Aware
Translate error budgets into business impact with cost calculators.
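One simple way to put a number on that business impact is to multiply the error budget, in minutes, by an estimated cost of one minute of downtime. The sketch below assumes exactly that linear model; the function name and the 200-per-minute figure are hypothetical, and the suite's own cost calculators may use a different model.

```python
def error_budget_cost(
    slo_percent: float, period_minutes: float, cost_per_minute: float
) -> float:
    """Cost of spending the entire error budget, assuming downtime
    cost scales linearly with minutes of outage."""
    budget_minutes = period_minutes * (100 - slo_percent) / 100
    return budget_minutes * cost_per_minute


# Assumed inputs: 99.9% SLO, a 30-day month, 200 currency units per minute.
minutes_in_month = 30 * 24 * 60
cost = error_budget_cost(99.9, minutes_in_month, cost_per_minute=200.0)
print(f"Full budget spend costs about {cost:,.0f}")
```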
