Anirudh Sridharan (Ani)
Delivering 99.999% reliability through battle-tested practices, smart automation, and relentless optimization for Fortune 500 enterprises and hyper-growth startups alike. 10+ years across AWS, Azure, and GCP.
Engineering Leadership
Built high-performing teams from scratch and scaled multi-team initiatives across infrastructure operations, observability, and systems reliability.
Thought Leadership
Practicing customer-obsessed engineering, using real user journeys to shape reliability, CX, and operational guardrails.
Hands-on Engineer
Designing and shipping infrastructure as code, Kubernetes platforms, and backend services; debugging production systems.
AI/ML Expertise
Applying AIOps and self-healing automation to reduce toil, auto-remediating recurring issues and accelerating incident triage.
Systems Reliability & Operations
Defining SLIs/SLOs, capacity planning, and chaos/performance testing to make quiet on-call a first-class outcome.
Reliability at Scale
Delivering multi-region architectures and high-throughput telemetry pipelines with four-nines availability targets.
Operations & Incident Management
Leading incident command, tuning escalation policies, and using correlation-ID tracing to accelerate root cause analysis.
Tech I Get My Hands Dirty With
The platforms and tools I actually use, not just talk about.
AI & ML
Secure MCPs, LangChain, MLOps, OpenAI, PyTorch
Cloud
Multi-cloud expertise (AWS, GCP, Azure) + PCF/private clouds
Automation & CI/CD
Terraform, Jenkins, GitHub Actions
Containers & Orchestration
Docker, Kubernetes, Helm
Observability
Prometheus, Grafana, Datadog, ELK, AppDynamics, Dynatrace, New Relic, Splunk, SignalFx
Data
Databricks, Microsoft Fabric, Snowflake
FinOps
CUR pipelines at 1B+ pts/hr, 99%+ tagged allocation, and $8M+ verified savings across infra, observability, and people costs
Interactive Tools
25+ diagnostics, converters & calculators
Latest Insights
Experiments, playbooks & 2AM thoughts
