June 11, 2025
6 min read

Architects, Don’t Skip Leg Day: Fitness Functions Matter

"How do we know the architecture is still good?"

I remember the question clearly. It was halfway through a large-scale microservices migration at a fintech I was consulting for. The CIO asked it in a steering meeting, eyes scanning the room. Silence. Until one engineer muttered, "We hope so…"

That moment stuck with me. It echoed a deeper truth: without feedback loops, architecture drifts. The elegant diagrams fade, entropy creeps in, and assumptions go stale. Most solution architects are familiar with architectural principles like scalability, security, and maintainability. But how often do we verify those principles once code hits production? In a world where software changes daily, even the best-designed architectures are vulnerable to erosion. We need tools that do more than document intent; we need mechanisms that continuously validate it.

That's where architecture fitness functions come in. They turn architectural qualities into measurable, testable, and enforceable rules. In this post, we'll explore what they are, why they matter, and how you can put them to work in your architecture practice.

Measuring What Matters in Your Architecture

In complex systems, especially distributed, cloud-native environments, architecture isn’t a "set-and-forget" endeavor. Things evolve: code, teams, infrastructure, and compliance needs. Yet many architecture documents remain frozen in time. Fitness functions, a concept popularized in Building Evolutionary Architectures by Neal Ford, Rebecca Parsons, and Patrick Kua, give us living tests that continuously assess whether your system’s architecture is still aligned with its desired qualities.

At their core, fitness functions are automated checks that provide objective feedback on specific architectural characteristics. Think of them as unit tests for architectural intent.

They answer:

  • Is our latency below 200ms at P95?
  • Are all internal APIs secured with OAuth2?
  • Have any new services broken our domain boundaries?
  • Are cost optimization practices actually being followed?

Not all fitness functions are created equal. They vary not just in purpose, but in when they execute and how they inform architectural decisions: some catch violations early in the development pipeline, while others run continuously and monitor behavior over time to reveal systemic issues. Here's a breakdown of the main types:

Type       | When It Runs                     | Example
Static     | At build or deploy time          | All service interfaces must use protocol buffers, not JSON.
Dynamic    | At runtime                       | No service may use more than 500MB RAM under normal load.
Triggered  | On specific events or conditions | On each new service deployment, validate tagging and security group rules.
Continuous | Always running (observability)   | Error rates should remain below 1% in all environments.
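As an example of a triggered fitness function, the sketch below runs on each new service deployment (for instance from a deploy hook) and validates that required ownership labels are present. The deployment name, namespace, and the team / cost-center label names are placeholders for illustration.

#!/usr/bin/env bash
# Triggered fitness function sketch: validate required labels on a freshly deployed service.
# Usage: ./validate-deployment.sh <deployment-name> [namespace]
DEPLOYMENT="$1"
NAMESPACE="${2:-production}"

for label in team cost-center; do
  value=$(kubectl get deployment "$DEPLOYMENT" --namespace "$NAMESPACE" -o json \
    | jq -r --arg l "$label" '.metadata.labels[$l] // empty')
  if [ -z "$value" ]; then
    echo "FAIL: deployment $DEPLOYMENT is missing required label '$label'"
    exit 1
  fi
done
echo "PASS: deployment $DEPLOYMENT carries all required labels"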

Embedding Fitness Functions in Practice

Here’s how to put fitness functions to work across a typical architecture lifecycle: from capturing architectural intent and tying it to measurable outcomes, to enforcing it through automation and observability. These steps help ensure your design principles are not just aspirational but continuously validated in day-to-day operations.

1. Tie Functions to Quality Attributes

Start by linking each fitness function to a specific architectural quality. This makes them purposeful and easy to communicate. Each quality attribute represents a non-functional requirement that can drift over time without explicit safeguards. By grounding your fitness functions in these qualities, you ensure the architecture evolves in line with business and technical goals.

Performance: Pods in the production namespace must not exceed 80% CPU utilization.

# kubectl top reports CPU in millicores; 800m is 80% of an assumed 1-core (1000m) limit.
kubectl top pods --namespace=production \
  | awk 'NR>1 {gsub(/m/,"",$2); if ($2+0 > 800) print $1 " exceeds 80% of its CPU budget"}'

Security: All endpoints require authentication.

# @PermitAll marks endpoints that skip authentication; any match is a violation.
grep -rn '@PermitAll' ./src \
  | awk -F: '{print "Public endpoint found: " $1 " (line " $2 ")"}'

Maintainability: Codebases must not exceed 15 direct package dependencies.

# Count direct dependencies only; the first line of npm ls output is the project itself.
npm ls --parseable --depth=0 | tail -n +2 | wc -l \
  | awk '{if ($1 > 15) print $1 " direct package dependencies exceed the maximum of 15"}'

Modularity: Services in different bounded contexts must not share databases.

# -l lists only the offending files instead of every matching line.
grep -rl 'jdbc:mysql://shared-db' ./services \
  | awk '{print "Shared DB usage detected: " $1}'

Cost Efficiency: Prevent cost sprawl in cloud-native environments by enforcing cost thresholds during infrastructure provisioning.

infracost breakdown --path . --format json \
  | jq '[.projects[].breakdown.totalMonthlyCost | tonumber] | add' \
  | awk '{ if ($1 > 300) { print "❌ FAIL: Monthly cost $" $1 " exceeds $300"; exit 1 } else { print "✅ PASS: Monthly cost is $" $1 } }'

2. Automate Early and Often

Use purpose-built tools, from structural test libraries such as ArchUnit to the kinds of policy, dependency, and cost checks shown above, to codify architectural intent into automated feedback loops that integrate directly into your CI/CD or runtime environments. These tools help you catch violations early, enforce consistency, and scale architectural governance across teams. A minimal CI gate is sketched below.
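Any of the shell checks from step 1 can be wrapped in a script that exits non-zero, which is all most CI systems need to fail a build or emit a warning. A minimal sketch, reusing the direct-dependency check (the threshold of 15 is the example limit from step 1):

#!/usr/bin/env bash
# ci-fitness-check.sh: a fitness function as a CI gate; a non-zero exit fails the build.
# Reuses the direct-dependency count check from step 1.
DEP_COUNT=$(npm ls --parseable --depth=0 2>/dev/null | tail -n +2 | wc -l)

if [ "$DEP_COUNT" -gt 15 ]; then
  echo "FAIL: $DEP_COUNT direct dependencies exceed the maximum of 15"
  exit 1
fi
echo "PASS: $DEP_COUNT direct dependencies are within the limit of 15"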

3. Expose via CI/CD & Observability Pipelines

Even the most well-crafted fitness functions lose value if their insights stay hidden. To truly influence development and operational behaviors, surface these checks where decisions are made: in CI/CD pipelines, monitoring dashboards, and alerting systems. This turns feedback into action and architecture into a living part of delivery.

  • Make architectural drift visible in pipelines: fail builds or emit warnings
  • Integrate with dashboards (e.g., Grafana) to visualize long-term trends (one way to publish results as metrics is sketched after this list)
  • Trigger alerts when thresholds are breached
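One lightweight way to feed dashboards and alerts is to have every fitness function publish its result as a metric. The sketch below pushes a pass/fail gauge to a Prometheus Pushgateway; the Pushgateway URL, job and check names, and the metric name are placeholders for illustration.

# Publish a fitness function result so Grafana panels and alert rules can track it over time.
# PUSHGATEWAY_URL, the job/check names, and the metric name are illustrative placeholders.
PUSHGATEWAY_URL="${PUSHGATEWAY_URL:-http://pushgateway:9091}"
RESULT=0   # 0 = pass, 1 = fail, as produced by whichever check ran before this step

cat <<EOF | curl --silent --data-binary @- "${PUSHGATEWAY_URL}/metrics/job/fitness_functions/check/dependency_limit"
# TYPE fitness_function_violation gauge
fitness_function_violation ${RESULT}
EOF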

Lessons from the Field

In one engagement, we used ArchUnit to enforce strict domain boundaries in a DDD-based monolith. One day, the pipeline failed. A junior dev had unknowingly injected a cross-domain repository call. The fitness function caught it. Without it, that violation might have taken root, silently compromising the architecture.

This is the power of automated architectural feedback. Fitness Functions transform passive blueprints into living, evolving systems that respond to change and resist entropy. If your architecture has goals, fitness functions are your feedback loop. Don’t wait for a production incident or a CIO's awkward question to realize you’ve drifted.

So, what’s your architecture testing?

  • Are you validating that observability is consistently implemented?
  • That data pipelines meet freshness SLAs?
  • That S3 buckets aren’t publicly accessible?

Let this post be your signal to begin. Start small. One test. One signal. Then evolve.
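If you need a candidate for that first test, the S3 question above is a good one. A minimal sketch, assuming the AWS CLI is configured and the bucket name is passed as an argument:

# Fitness function sketch: "S3 buckets aren't publicly accessible."
# Assumes the AWS CLI is configured; pass the bucket name as the first argument.
BUCKET="$1"

IS_PUBLIC=$(aws s3api get-bucket-policy-status --bucket "$BUCKET" \
  --query 'PolicyStatus.IsPublic' --output text 2>/dev/null)

if [ "$IS_PUBLIC" = "True" ]; then
  echo "FAIL: bucket $BUCKET is publicly accessible"
  exit 1
fi
echo "PASS: bucket $BUCKET is not public"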
