Health Endpoints: When Production Sleeps with One Eye Open

August 20, 2025

Picture this. Your team just rolled out a new payments service into production. It's late Friday night. Everything looks fine at first glance, but suddenly transactions stop flowing. You open the logs and dashboards, but nothing obvious shows up. Was it the database? The service? The network? You don't know yet. This is where a health endpoint could have been your first line of defense. Instead of digging through half a dozen monitoring tools, you could hit a simple /health URL and instantly know if the service was alive, if its dependencies were reachable, and if it was fit to serve requests.

For solution architects, these moments define the difference between a smooth-running system and a fragile one. A well-designed health endpoint does more than tell you "it’s up." It becomes the contract between your service and the ecosystem around it: load balancers, service meshes, orchestration platforms, and even your support teams. Without it, your architecture is running blind, and small issues turn into major outages.

Why Your Services Need a Pulse Check

Modern services sit in the middle of service meshes, queues, databases, and external APIs. When one piece falters, the ripple effect can take down customer journeys in seconds. Health endpoints provide a simple, consistent signal that helps the rest of the ecosystem decide whether a service is healthy enough to trust. They are like the green light on a machine: without it, operators and platforms have to guess whether it is safe to use.

Health endpoints are not just for developers or operations teams. For architects, they are an important part of designing systems that scale, recover, and integrate cleanly. Four considerations follow from this:

Early detection. A health endpoint lets load balancers and service meshes detect failing instances quickly and route traffic away before users notice.

Dependency awareness. A shallow "I'm alive" check is not enough. You must distinguish between a service that is up and one that is ready to serve requests.

Integration fit. Orchestration platforms such as Kubernetes depend on health endpoints for pod lifecycle management. Without them, scaling and self-healing are blind.

Business continuity. Proper health checks translate into uptime, which directly supports revenue, SLA compliance, and customer trust.
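
To make the early detection point concrete, here is a minimal sketch of the decision a load balancer or orchestrator makes from probe results. The InstanceHealthTracker class, the threshold of three, and the simulated probe sequence are all hypothetical and chosen for illustration. Real platforms expose this behavior as configuration (Kubernetes probes, for example, have a failureThreshold setting).

```csharp
using System;

var tracker = new InstanceHealthTracker(failureThreshold: 3);

// Simulated probe results for one instance: true means /health/ready returned 200.
bool[] probes = { true, false, false, true, false, false, false };
foreach (var ok in probes)
{
    tracker.Record(ok);
    Console.WriteLine($"probe={(ok ? "ok" : "fail")} inRotation={tracker.InRotation}");
}

// Hypothetical helper: takes an instance out of rotation after N consecutive
// failed probes and readmits it on the first successful probe.
public sealed class InstanceHealthTracker
{
    private readonly int _failureThreshold;
    private int _consecutiveFailures;

    public InstanceHealthTracker(int failureThreshold) => _failureThreshold = failureThreshold;

    public bool InRotation => _consecutiveFailures < _failureThreshold;

    public void Record(bool probeSucceeded) =>
        _consecutiveFailures = probeSucceeded ? 0 : _consecutiveFailures + 1;
}
```

With the simulated sequence, the instance stays in rotation through the isolated failures and is ejected only on the last probe, the third consecutive failure. Requiring several failures in a row keeps a single slow response from removing a healthy instance.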

Building Health Endpoints That Don’t Lie

Before you can design meaningful health checks, it helps to understand that they come in different depths and serve different purposes. In practice, health endpoints are often structured in three layers:

1. Liveness

Is the service running? This should be a cheap check that simply tells you the process hasn’t crashed. Keep this check lightweight and reliable so it can always respond quickly.

.NET example for the /health/live endpoint:

using Microsoft.AspNetCore.Diagnostics.HealthChecks;
using Microsoft.Extensions.Diagnostics.HealthChecks;
using System.Text.Json;

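var builder = WebApplication.CreateBuilder(args);

// Register a trivial check that reports Healthy whenever the process can run it.
// Liveness deliberately touches no databases, caches, or downstream services.
builder.Services.AddHealthChecks()
    .AddCheck("self", () => HealthCheckResult.Healthy());

var app = builder.Build();

app.MapHealthChecks("/health/live", new HealthCheckOptions
{
    // Run only the "self" check on this endpoint.
    Predicate = r => r.Name == "self",
    // Write a small JSON body instead of the default plain-text status.
    ResponseWriter = async (ctx, report) =>
    {
        ctx.Response.ContentType = "application/json";
        var body = new
        {
            status = report.Status.ToString()
        };
        await ctx.Response.WriteAsync(JsonSerializer.Serialize(body));
    }
});

app.Run();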

Example output:

{
  "status": "Healthy"
}

2. Readiness

Is the service ready to accept traffic? For example, has it connected to its database or warmed up its caches? Without this signal, load balancers and orchestrators may send traffic to a service that cannot handle real work yet. Use readiness checks to protect consumers during startup and dependency outages.

.NET example for the /health/ready endpoint:

using Microsoft.AspNetCore.Diagnostics.HealthChecks;
using Microsoft.Extensions.Diagnostics.HealthChecks;
using System.Text.Json;

var builder = WebApplication.CreateBuilder(args);

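// AddSqlServer and AddRedis are not part of ASP.NET Core itself. They come from the
// AspNetCore.HealthChecks.SqlServer and AspNetCore.HealthChecks.Redis NuGet packages.
// The ["ready"] collection expression requires C# 12; use new[] { "ready" } on older versions.
builder.Services.AddHealthChecks()
    .AddSqlServer(
        builder.Configuration.GetConnectionString("AppDb")!,
        name: "sql",
        failureStatus: HealthStatus.Unhealthy,
        tags: ["ready"])
    .AddRedis(
        builder.Configuration["Redis:Connection"]!,
        name: "redis",
        failureStatus: HealthStatus.Unhealthy,
        tags: ["ready"]);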

var app = builder.Build();

app.MapHealthChecks("/health/ready", new HealthCheckOptions
{
    Predicate = r => r.Tags.Contains("ready"),
    ResponseWriter = async (ctx, report) =>
    {
        ctx.Response.ContentType = "application/json";
        var body = new
        {
            status = report.Status.ToString()
        };
        await ctx.Response.WriteAsync(JsonSerializer.Serialize(body));
    }
});

app.Run();

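Example output when a dependency check fails (by default the endpoint also responds with HTTP 503 Service Unavailable):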

{
  "status": "Unhealthy"
}
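
The "warmed up caches" case can be covered with a custom check. What follows is a minimal sketch, assuming a hypothetical WarmupGate class and a five-second delay standing in for real cache priming. Only IHealthCheck, AddCheck, and MapHealthChecks come from ASP.NET Core; the other names are illustrative.

```csharp
using Microsoft.AspNetCore.Diagnostics.HealthChecks;
using Microsoft.Extensions.Diagnostics.HealthChecks;

var builder = WebApplication.CreateBuilder(args);

// Singleton so the warm-up code and the health check share the same flag.
builder.Services.AddSingleton<WarmupGate>();
builder.Services.AddHealthChecks()
    .AddCheck<WarmupHealthCheck>("warmup", tags: new[] { "ready" });

var app = builder.Build();

app.MapHealthChecks("/health/ready", new HealthCheckOptions
{
    Predicate = r => r.Tags.Contains("ready")
});

// Hypothetical warm-up work: replace the delay with real cache priming.
var gate = app.Services.GetRequiredService<WarmupGate>();
_ = Task.Run(async () =>
{
    await Task.Delay(TimeSpan.FromSeconds(5));
    gate.MarkReady();
});

app.Run();

// Hypothetical flag that flips once startup work has finished.
public sealed class WarmupGate
{
    private volatile bool _ready;
    public bool IsReady => _ready;
    public void MarkReady() => _ready = true;
}

// Reports Unhealthy until the gate is marked ready.
public sealed class WarmupHealthCheck : IHealthCheck
{
    private readonly WarmupGate _gate;
    public WarmupHealthCheck(WarmupGate gate) => _gate = gate;

    public Task<HealthCheckResult> CheckHealthAsync(
        HealthCheckContext context, CancellationToken cancellationToken = default)
        => Task.FromResult(_gate.IsReady
            ? HealthCheckResult.Healthy("Warm-up complete.")
            : HealthCheckResult.Unhealthy("Still warming up."));
}
```

Until MarkReady is called, /health/ready reports Unhealthy, which the default status mapping turns into HTTP 503, so a load balancer that honors the endpoint holds traffic back during startup.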

3. Deep Health

Are the dependencies healthy? You might run checks against downstream APIs, queues, or databases. Be careful, though: too much depth can create cascading failures, for example when a probe that fails on one slow downstream API takes otherwise healthy instances out of rotation. Provide detailed health information on a secured endpoint for operators, and expose only minimal information to automated systems. Also standardize the endpoint pattern across all services so that orchestration tools and humans know where to look.

.NET example for the /health/deep endpoint:

using Microsoft.AspNetCore.Diagnostics.HealthChecks;
using Microsoft.Extensions.Diagnostics.HealthChecks;
using System.Text.Json;

var builder = WebApplication.CreateBuilder(args);

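// AddSqlServer, AddRedis, and AddUrlGroup come from the AspNetCore.HealthChecks.SqlServer,
// AspNetCore.HealthChecks.Redis, and AspNetCore.HealthChecks.Uris NuGet packages.
builder.Services.AddHealthChecks()
    .AddSqlServer(
        builder.Configuration.GetConnectionString("AppDb")!,
        name: "sql",
        failureStatus: HealthStatus.Unhealthy,
        tags: ["deep"])
    .AddRedis(
        builder.Configuration["Redis:Connection"]!,
        name: "redis",
        failureStatus: HealthStatus.Unhealthy,
        tags: ["deep"])
    // Issues an HTTP GET against the configured payments URL.
    .AddUrlGroup(new Uri(builder.Configuration["Payments:URL"]!),
        name: "payments-api",
        tags: ["deep"]);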

var app = builder.Build();

app.MapHealthChecks("/health/deep", new HealthCheckOptions
{
    Predicate = r => r.Tags.Contains("deep"),
    ResponseWriter = async (ctx, report) =>
    {
        ctx.Response.ContentType = "application/json";
        var body = new
        {
            status = report.Status.ToString(),
            results = report.Entries.ToDictionary(
                x => x.Key,
                x => new
                {
                    status = x.Value.Status.ToString(),
                    description = x.Value.Description,
                    data = x.Value.Data
                })
        };
        await ctx.Response.WriteAsync(JsonSerializer.Serialize(body));
    }
});

app.Run();

Example output when the payments API check fails:

{
  "status": "Unhealthy",
  "results": {
    "sql": {
      "status": "Healthy",
      "description": null,
      "data": {
      }
    },
    "redis": {
      "status": "Healthy",
      "description": null,
      "data": {
      }
    },
    "payments-api": {
      "status": "Unhealthy",
      "description": null,
      "data": {
      }
    }
  }
}
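
One way to keep the detailed report away from public traffic is to bind it to a separate management port. The sketch below assumes ports 8080 (public) and 8081 (internal only), both illustrative, and uses a stand-in check instead of real dependency checks so that it needs no extra packages. RequireHost is the ASP.NET Core routing extension that restricts an endpoint to matching hosts or ports.

```csharp
using Microsoft.AspNetCore.Diagnostics.HealthChecks;
using Microsoft.Extensions.Diagnostics.HealthChecks;
using System.Text.Json;

var builder = WebApplication.CreateBuilder(args);

// Assumed ports: 8080 for public traffic, 8081 reachable only from the internal network.
builder.WebHost.UseUrls("http://*:8080", "http://*:8081");

builder.Services.AddHealthChecks()
    // Stand-in for a real dependency check, hard-coded to fail for the demo.
    .AddCheck("payments-api",
        () => HealthCheckResult.Unhealthy("Upstream timed out."),
        tags: new[] { "deep" });

var app = builder.Build();

// Minimal signal for automation: runs no checks and returns only the overall status.
app.MapHealthChecks("/health/live", new HealthCheckOptions
{
    Predicate = _ => false
});

// Detailed report for operators, matched only for requests addressed to port 8081.
app.MapHealthChecks("/health/deep", new HealthCheckOptions
{
    Predicate = r => r.Tags.Contains("deep"),
    ResponseWriter = async (ctx, report) =>
    {
        ctx.Response.ContentType = "application/json";
        var body = new
        {
            status = report.Status.ToString(),
            results = report.Entries.ToDictionary(
                x => x.Key,
                x => new
                {
                    status = x.Value.Status.ToString(),
                    description = x.Value.Description
                })
        };
        await ctx.Response.WriteAsync(JsonSerializer.Serialize(body));
    }
}).RequireHost("*:8081");

app.Run();
```

Host matching relies on the request's Host header, so treat this as one layer rather than full protection. Network rules that block the management port from outside are still needed, and RequireAuthorization is an alternative when the service already has authentication configured.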

Ship With a Pulse, Not a Guess

Health endpoints are the vital signs of your architecture. They don't fix problems by themselves, but they give you visibility when seconds matter. Without them, your systems leave operators, developers, and even automation tools in the dark, forcing teams to react slowly to incidents that could have been avoided. A well-structured set of health checks turns blind spots into signals, helping your ecosystem self-heal and keeping your customers shielded from failures.

For architects, the lesson is to design health endpoints into every service specification. Treat them like contracts that your service signs with the wider system. They will save you time in debugging, reduce downtime, and protect both revenue and reputation. If your systems can't report their health, then you can't claim they're truly resilient. Start small by defining consistent liveness and readiness checks across your services, and then expand with deeper health checks where they add business value.

Review one of your existing services today and check if its health endpoints are telling the full truth. If not, define what liveness, readiness, and deep health should look like and add them to your next sprint. Resilient systems start with honest signals.