Summaries for Timing
A gauge can only hold the last value it was set to, so how can we time events and measure latency?
The answer is to use a summary. It tracks both the total time taken by all events and the number of events that occurred:
import random
import time

from prometheus_client import Summary, start_http_server

# Tracks the total time spent in my_function and the number of calls.
function_latency = Summary("my_function_latency_seconds",
                           "Latency of my_function", unit="seconds")


def my_function():
    # Everything inside the with block is timed and observed by the summary.
    with function_latency.time():
        time.sleep(random.random())


if __name__ == "__main__":
    # Expose the metrics over HTTP on port 8000.
    start_http_server(8000)
    while True:
        my_function()
        time.sleep(1)
Here the time() context manager is used to time some code. It can also be used as a function decorator.
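As a minimal sketch of the decorator form, the following times every call to a function without a with block. The metric name, the function name, and the separate registry are hypothetical choices made only so that the sketch stands alone; the fixed sleep stands in for real work.

```python
import time

from prometheus_client import CollectorRegistry, Summary

# A separate registry and metric name (both hypothetical) keep this sketch
# independent of any metrics already registered by the process.
registry = CollectorRegistry()
decorated_latency = Summary(
    "my_decorated_function_latency_seconds",
    "Latency of my_decorated_function",
    registry=registry,
)


@decorated_latency.time()
def my_decorated_function():
    # Stand-in for real work; each call is timed from entry to return.
    time.sleep(0.01)


my_decorated_function()
my_decorated_function()

# Read back the two series the summary maintains.
count = registry.get_sample_value("my_decorated_function_latency_seconds_count")
total = registry.get_sample_value("my_decorated_function_latency_seconds_sum")
print(count, total)
```

The decorator is convenient when the whole function should be timed; the context manager is the better fit when only part of a function is of interest.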
The metrics output will include my_function_latency_seconds_sum and my_function_latency_seconds_count. From these, the average latency in seconds over the last minute can be calculated in PromQL with the expression rate(my_function_latency_seconds_sum[1m]) / rate(my_function_latency_seconds_count[1m]).
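To see why dividing the two rates yields an average, the arithmetic can be worked through by hand. The sketch below uses made-up values for the two series at scrapes taken 60 seconds apart, and a simplified per-second rate; PromQL's rate() additionally handles counter resets and extrapolates to the edges of the range, which this illustration ignores.

```python
# Hypothetical values of the _sum and _count series at two scrapes
# taken 60 seconds apart.
earlier = {"sum": 12.0, "count": 30.0}
later = {"sum": 27.0, "count": 60.0}
window_seconds = 60.0


def per_second_rate(start, end, seconds):
    """Per-second increase of a counter, ignoring resets and extrapolation."""
    return (end - start) / seconds


# Seconds of latency accumulated per second of wall-clock time.
sum_rate = per_second_rate(earlier["sum"], later["sum"], window_seconds)
# Events completed per second of wall-clock time.
count_rate = per_second_rate(earlier["count"], later["count"], window_seconds)

# The window length cancels out, leaving seconds of latency per event.
average_latency = sum_rate / count_rate
print(average_latency)  # 0.5
```

Here 15 seconds of latency were spread over 30 events in the window, so the average is 0.5 seconds per event.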