Summaries for Timing
A gauge can only hold the last value it was set to, so how can we time events and measure latency?
The answer is to use a summary. It tracks both the total time taken by events and how many events there were:
    package io.robustperception.java_examples;

    import io.prometheus.client.Summary;
    import io.prometheus.client.hotspot.DefaultExports;
    import io.prometheus.client.exporter.HTTPServer;
    import java.util.Random;

    public class JavaExample {
      static final Summary functionLatency = Summary.build()
          .name("my_function_latency").unit("seconds")
          .help("Latency of my function").register();

      static void myFunction() throws Exception {
        Summary.Timer requestTimer = functionLatency.startTimer();
        try {
          Thread.sleep(new Random().nextInt(1000));
        } finally {
          requestTimer.observeDuration();
        }
      }

      public static void main(String[] args) throws Exception {
        DefaultExports.initialize();
        HTTPServer server = new HTTPServer(8000);
        while (true) {
          myFunction();
          Thread.sleep(1000);
        }
      }
    }
Here startTimer() is called when you want to start timing, and observeDuration() when you want to stop. The try...finally ensures that the duration is still observed if the timed code throws an exception.
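To make the bookkeeping concrete, here is a minimal, dependency-free sketch of what a summary keeps track of: a running sum of observed durations and a count of observations. TinySummary and its Timer are hypothetical names invented for this illustration. They are not part of the Prometheus client library and do not show how the library is implemented.

```java
import java.util.concurrent.atomic.DoubleAdder;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical toy class for illustration only; not the Prometheus client API.
class TinySummary {
  private final DoubleAdder sum = new DoubleAdder(); // total seconds observed
  private final LongAdder count = new LongAdder();   // number of observations

  // Record one event that took the given number of seconds.
  void observe(double seconds) {
    sum.add(seconds);
    count.increment();
  }

  // Start timing; call observeDuration() on the result to stop.
  Timer startTimer() {
    return new Timer(System.nanoTime());
  }

  double getSum() { return sum.sum(); }

  long getCount() { return count.sum(); }

  final class Timer {
    private final long startNanos;

    private Timer(long startNanos) {
      this.startNanos = startNanos;
    }

    // Record the elapsed time in seconds and return it.
    double observeDuration() {
      double elapsed = (System.nanoTime() - startNanos) / 1e9;
      observe(elapsed);
      return elapsed;
    }
  }

  public static void main(String[] args) throws Exception {
    TinySummary latency = new TinySummary();
    TinySummary.Timer timer = latency.startTimer();
    try {
      Thread.sleep(50); // stand-in for the work being timed
    } finally {
      timer.observeDuration(); // runs even if the work throws
    }
    System.out.println(latency.getCount() + " event(s), "
        + latency.getSum() + " seconds in total");
  }
}
```

Both values only ever increase, which is why a monitoring system can treat them as counters and compute rates over them.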
The metrics output will include my_function_latency_seconds_sum and my_function_latency_seconds_count. From these, the average latency in seconds over the last minute can be calculated in PromQL with the expression rate(my_function_latency_seconds_sum[1m]) / rate(my_function_latency_seconds_count[1m]).
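As a rough worked example of the arithmetic behind that expression, the sketch below divides the per-second increase of the sum by the per-second increase of the count. The sample values and the AverageLatency class are made up for illustration, and the calculation is simplified: it assumes exactly two samples at the edges of the window and ignores the counter-reset handling and extrapolation that PromQL's rate() performs.

```java
// Hypothetical helper for illustration; the sample values below are invented.
class AverageLatency {
  // Simplified stand-in for rate(): per-second increase across the window.
  static double perSecond(double start, double end, double windowSeconds) {
    return (end - start) / windowSeconds;
  }

  // Mirrors rate(..._sum[w]) / rate(..._count[w]) for two samples.
  static double averageLatency(double sumStart, double sumEnd,
                               double countStart, double countEnd,
                               double windowSeconds) {
    double sumRate = perSecond(sumStart, sumEnd, windowSeconds);
    double countRate = perSecond(countStart, countEnd, windowSeconds);
    return sumRate / countRate; // NaN when no events occurred in the window
  }

  public static void main(String[] args) {
    // Over 60 seconds the sum grew from 12.0 to 42.0 seconds
    // and the count grew from 20 to 80 events.
    double avg = averageLatency(12.0, 42.0, 20, 80, 60);
    System.out.println(avg); // prints 0.5
  }
}
```

The window length cancels out in the division, so the result is simply the seconds added divided by the events added: 30 seconds over 60 events, or 0.5 seconds per event.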