TechAni
DevOps Discovery

Deliver quickly without trading away operational excellence

Use this discovery to map culture, delivery automation, and measurement gaps. Each section pairs questions with TechAni’s recommended remediation patterns so you can move from insight to implementation rapidly.

Culture & Collaboration

Assess whether development, platform, and operations teams share goals, rituals, and ownership for production outcomes.

Discovery Questions

  • Are development and operations teams aligned around shared objectives?
  • How frequently do cross-functional retros or incident reviews occur?
  • Do developers participate in on-call or operational reviews?
  • How are feedback loops between product and platform teams facilitated?
  • Are teams organized around services or functional silos?
  • How much autonomy do product teams have over deployment and operations?

Evidence to Collect

  • Org charts and team responsibility matrices
  • Retro/postmortem documentation
  • Communication workflows

Team Topologies Alignment

Apply Team Topologies patterns to reduce cognitive load and improve flow.

Stream-Aligned Teams · Platform Teams · Enabling Teams

Implementation Steps

  1. Organize stream-aligned teams that own services end-to-end (build, deploy, operate).
  2. Establish platform teams that provide paved paths and self-service capabilities.
  3. Form enabling teams to coach others in adopting new practices and tooling.
  4. Limit team interaction modes to reduce coordination overhead.
  5. Define clear service ownership and on-call responsibilities for every team.

You Build It, You Run It

Shift operational accountability closer to the teams shipping code.

Implementation Steps

  1. Transition on-call rotation for services to the teams that develop them.
  2. Provide platform guardrails, runbooks, and paved paths so teams can operate safely.
  3. Run joint incident reviews and retrospectives to maintain shared empathy.

CI/CD & Automation

Measure how automated, reliable, and standardized the delivery pipeline is across teams.

Discovery Questions

  • What CI/CD toolchain is in place today?
  • Are build/test/deploy stages automated and observable?
  • How standardized are pipelines across teams?
  • Is deployment frequency measured and shared?
  • Are rollbacks and progressive delivery strategies in use?
  • How are secrets and configuration managed within pipelines?

Evidence to Collect

  • Pipeline definitions
  • Deployment metrics
  • Automation coverage reports

GitOps Continuous Delivery

Adopt declarative, Git-driven deployments with automated syncing and rollback.

ArgoCD · FluxCD · GitHub Actions

Implementation Steps

  1. Standardize builds and tests via reusable CI workflows.
  2. Update manifests in a GitOps repository as the source of truth.
  3. Use ArgoCD/FluxCD to sync manifests and manage rollbacks.
  4. Introduce canary or blue/green strategies with automated verification.
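Step 2 above — updating manifests in the GitOps repository — is typically a small scripted action in CI. A minimal sketch, assuming plain-text Kubernetes manifests with the usual `image: <name>:<tag>` convention (the function name and file layout here are illustrative, not part of any specific toolchain):

```python
import re
from pathlib import Path

def bump_image_tag(manifest_path: str, image: str, new_tag: str) -> str:
    """Rewrite `image: <name>:<tag>` lines in a manifest so the GitOps repo
    stays the single source of truth; ArgoCD/FluxCD then syncs the change
    (and its history gives you rollback) automatically."""
    text = Path(manifest_path).read_text()
    # Match the named image and replace only its tag.
    pattern = re.compile(rf"(image:\s*{re.escape(image)}):\S+")
    updated = pattern.sub(rf"\1:{new_tag}", text)
    Path(manifest_path).write_text(updated)
    return updated
```

A CI job would call this after a successful build, then commit and push the change; the sync controller does the rest.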

Progressive Delivery

Adopt incremental rollout strategies with automated analysis and rollback.

Argo Rollouts · Flagger · LaunchDarkly

Implementation Steps

  1. Define rollout steps (e.g., 5% → 25% → 50% → 100%).
  2. Measure key metrics (error rate, latency) during each stage.
  3. Automate rollback when thresholds are violated.
  4. Pair feature flags with progressive delivery for safe experimentation.
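The control loop in steps 1–3 can be sketched in a few lines. This is a simplified model of what Argo Rollouts or Flagger automate, assuming a single error-rate metric and an illustrative `get_error_rate` callback standing in for a real metrics query:

```python
def run_canary(steps, get_error_rate, max_error_rate=0.01):
    """Walk the rollout steps (traffic percentages), checking the key metric
    at each stage; stop and signal rollback the moment a threshold is
    violated, otherwise promote to full traffic."""
    for percent in steps:
        rate = get_error_rate(percent)  # stand-in for a Prometheus/Datadog query
        if rate > max_error_rate:
            return {"status": "rolled_back", "at_percent": percent,
                    "error_rate": rate}
    return {"status": "promoted", "at_percent": steps[-1]}
```

For example, `run_canary([5, 25, 50, 100], query_fn)` promotes only if the error rate stays under 1% at every stage; latency or saturation checks slot in the same way.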

Infrastructure as Code

Review how consistently infrastructure is expressed as code and governed across environments.

Discovery Questions

  • Which IaC frameworks are used (Terraform, Pulumi, CloudFormation)?
  • Is all infrastructure managed as code or partially manual?
  • How are changes reviewed, tested, and promoted across environments?
  • Are drift detection and remediation automated?
  • How do development and platform teams collaborate on infrastructure changes?

Evidence to Collect

  • IaC repositories
  • Change management workflows
  • Drift detection reports

Automated Terraform Workflow

Automate plans, reviews, and applies with safeguards.

Terraform Cloud · Atlantis · Spacelift

Implementation Steps

  1. Trigger terraform plan on pull requests and publish output to reviewers.
  2. Run policy-as-code checks (Sentinel/OPA) and security scanning (Checkov/tfsec).
  3. Estimate cost impact with Infracost before approval.
  4. Require approvals before terraform apply runs automatically on merge.
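One common safeguard in step 2 is a gate that flags destructive changes before approval. A minimal sketch that reads the machine-readable plan (`terraform show -json <planfile>`) and lists resources slated for deletion or replacement — the gate's behavior on a non-empty result (block, or require extra approval) is up to your workflow:

```python
import json

def destructive_changes(plan_json: str):
    """Scan `terraform show -json` output for resources whose planned
    actions include a delete (plain destroys and delete-then-create
    replacements both carry a "delete" action)."""
    plan = json.loads(plan_json)
    flagged = []
    for rc in plan.get("resource_changes", []):
        actions = rc.get("change", {}).get("actions", [])
        if "delete" in actions:
            flagged.append(rc["address"])
    return flagged
```

Policy-as-code tools (Sentinel, OPA) express the same check declaratively; this script form is handy when wiring the gate into a plain CI pipeline.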

Environment Parity

Ensure dev, staging, and production infrastructure remain consistent.

Implementation Steps

  1. Use shared modules with environment-specific variables for configuration.
  2. Promote changes through dev → staging → production pipelines.
  3. Automate smoke tests and post-apply validations in each environment.
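The shared-module pattern in step 1 boils down to one set of defaults plus small per-environment overrides. A sketch of the idea (the variable names and values are illustrative, not a recommendation):

```python
def environment_config(base: dict, overrides: dict) -> dict:
    """Merge shared defaults with environment-specific variables, mirroring
    how one Terraform module serves dev, staging, and production."""
    return {**base, **overrides}

# Shared defaults every environment inherits.
BASE = {"instance_type": "t3.small", "min_replicas": 2, "monitoring": True}

# Each environment overrides only what genuinely differs.
ENVIRONMENTS = {
    "dev": {"min_replicas": 1},
    "staging": {},
    "production": {"instance_type": "t3.large", "min_replicas": 4},
}
```

Keeping the override dictionaries small is the point: the fewer keys an environment overrides, the closer its parity with production.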

Delivery Performance (DORA Metrics)

Track the four key DevOps metrics and feed them into continuous improvement loops.

Discovery Questions

  • What is the current deployment frequency?
  • How long does it take for code to reach production?
  • How quickly are incidents resolved (MTTR)?
  • What percentage of changes cause incidents or rollbacks?
  • How are these metrics communicated to leadership and teams?

Evidence to Collect

  • Metrics dashboards
  • Incident records
  • Change approval logs

DORA Metrics Instrumentation

Automate measurement and reporting of the four key DevOps metrics.

Sleuth · LinearB · Grafana

Implementation Steps

  1. Pull deployment events from CI/CD systems.
  2. Calculate lead time from commit timestamp to production release.
  3. Integrate incident data to measure MTTR and change failure rate.
  4. Share trends with product and engineering leadership to drive prioritization.
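The calculations behind steps 1–3 are straightforward once deployment and incident events are collected. A minimal sketch using simple tuples in place of real CI/CD and incident-tracker records (the input shapes are assumptions for illustration):

```python
from datetime import datetime, timedelta

def dora_metrics(deployments, incidents, window_days=30):
    """Compute the four key metrics.
    deployments: list of (commit_time, deploy_time, failed) tuples
    incidents:   list of (opened, resolved) tuples
    """
    # Deployment frequency: deploys per day over the window.
    freq = len(deployments) / window_days
    # Lead time for changes: commit timestamp to production release.
    lead_times = [deploy - commit for commit, deploy, _ in deployments]
    avg_lead = sum(lead_times, timedelta()) / len(lead_times)
    # Change failure rate: share of deploys that caused an incident/rollback.
    cfr = sum(1 for *_, failed in deployments if failed) / len(deployments)
    # MTTR: mean time from incident open to resolution.
    mttr = sum((r - o for o, r in incidents), timedelta()) / len(incidents)
    return {"deploys_per_day": freq, "lead_time": avg_lead,
            "change_failure_rate": cfr, "mttr": mttr}
```

Commercial tools (Sleuth, LinearB) do this aggregation for you; the value of seeing the arithmetic is knowing exactly what each dashboard number means.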

Data-Driven Continuous Improvement

Use DORA data during retros to target systemic improvements.

Implementation Steps

  1. Review DORA metrics at sprint or release retrospectives.
  2. Correlate improvements with changes in process or tooling.
  3. Set quarterly targets and publish progress to the organization.

Continuous Improvement & Learning

Ensure learnings are captured, shared, and actioned across teams.

Discovery Questions

  • How frequently are retrospectives held, and are action items tracked?
  • Are metrics and learnings transparent across teams?
  • Does psychological safety exist for sharing incident root causes?
  • Are there internal communities of practice for DevOps or SRE?
  • How are new ideas piloted and scaled?

Evidence to Collect

  • Retro documentation
  • Action item trackers
  • Community of practice artifacts

Structured Retrospectives

Make retrospectives actionable and outcomes visible.

Implementation Steps

  1. Run team-level retros every sprint with standardized templates.
  2. Hold incident retrospectives within 48 hours of major events.
  3. Track action items in shared tooling with owners and due dates.

Communities of Practice

Encourage cross-team knowledge sharing and standardization.

Implementation Steps

  1. Host monthly DevOps/SRE guild meetings and tech talks.
  2. Maintain shared documentation hubs (Notion, Confluence).
  3. Facilitate async Q&A channels and annual internal summits.