Prometheus and Spring Boot Health Checks

When setting up alerting for Spring Boot services with Prometheus, I discovered the synthetic "up" time series, which is great for checking whether the monitoring system can reach my service instances. Useful as that is, I also wanted to alert on the health status of my instances, as reported by /actuator/health. Unfortunately, Spring Boot's /actuator/prometheus endpoint exposes nothing I could use for that.

After some pondering, I decided to expose my own "health" time series from Spring Boot. With Micrometer, this is quite easy: all I have to do is register a Gauge meter that fetches the health status from the Actuator's HealthEndpoint bean when sampled:

package de.mafr.demo.prometheus;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.actuate.autoconfigure.metrics.MeterRegistryCustomizer;
import org.springframework.boot.actuate.health.HealthEndpoint;
import org.springframework.boot.actuate.health.Status;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.context.annotation.Bean;

import io.micrometer.prometheus.PrometheusMeterRegistry;


@SpringBootApplication
public class PrometheusDemoApplication {

    @Autowired
    private HealthEndpoint healthEndpoint;

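    // Registers a "health" gauge; healthToCode runs each time the gauge is sampled.
    @Bean
    MeterRegistryCustomizer<PrometheusMeterRegistry> prometheusHealthCheck() {
        return registry -> registry.gauge("health", healthEndpoint, ep -> healthToCode(ep));
    }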

    private static int healthToCode(HealthEndpoint ep) {
        Status status = ep.health().getStatus();

        return status.equals(Status.UP) ? 1 : 0;
    }

    public static void main(String[] args) {
        SpringApplication.run(PrometheusDemoApplication.class, args);
    }
}

In this example, I simply map Status.UP to 1 and everything else to 0, but you can easily define your own convention that covers Status.OUT_OF_SERVICE, Status.UNKNOWN, and any custom codes you may have.
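
As a rough sketch of what such a convention could look like, the helper below maps the string code of a status to a number. The class name HealthCodes and the numeric values are assumptions made up for illustration, not anything defined by Spring Boot or Prometheus; the helper deliberately takes a plain String so it has no framework dependencies.

```java
import java.util.Map;

// Hypothetical helper: the name and the numeric convention are illustrative only.
final class HealthCodes {

    // Value reported for any status code that is not listed below,
    // for example a custom status defined by the application.
    static final int UNMAPPED = -1;

    // Assumed convention: keep UP = 1 and DOWN = 0 as in the example above,
    // and give the remaining built-in statuses their own values.
    private static final Map<String, Integer> CODES = Map.of(
            "UP", 1,
            "DOWN", 0,
            "OUT_OF_SERVICE", 2,
            "UNKNOWN", 3);

    private HealthCodes() {
        // static helper, not meant to be instantiated
    }

    // Translates a status code string into the numeric gauge value.
    static int toCode(String statusCode) {
        return CODES.getOrDefault(statusCode, UNMAPPED);
    }
}
```

With this in place, healthToCode could return HealthCodes.toCode(ep.health().getStatus().getCode()) instead of the UP check. An alert on "health != 1" would then still fire for every non-UP state, while the individual values remain visible in graphs.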
