Using Prometheus Exemplars to jump from metrics to traces in Grafana
This is a step-by-step guide to enable correlation between metrics and traces, using Prometheus, Grafana, Tempo, OpenTelemetry, and OpenMetrics.
Exemplars are coming! But what are exemplars? An exemplar is a specific, highlighted value in a time series, annotated with a set of key-value pairs (labels). The main use case is to include the request’s trace ID in the labels so that we can jump directly from a metric time series to the interesting traces. This is also known as correlation between metrics and traces.
Enabling Exemplars in an already instrumented Go application requires:
- to include the current request’s trace ID each time we increment a Prometheus metric
- to expose the Prometheus metrics in the OpenMetrics format
- to collect the metrics & exemplars using Prometheus
- to configure Grafana to visualize the time series with the exemplars — with a direct link to Tempo to visualize the trace associated with an exemplar
If you just want to play with Prometheus Exemplars and Grafana, you should use Grafana’s TNS demo.
Including the request’s trace ID in the Prometheus metrics
First, make sure you’re using version 1.4.0 or later of the Prometheus Go client library, which includes Pull Request #706. You can then change your code to something like:
A few things to note:
- we’re using OpenTelemetry for our traces, so we can easily retrieve the current trace ID from the context. Side note: if you’re not using OpenTelemetry yet, you should seriously consider it. The tracing part is GA now, and the collector is just awesome. We’re pushing our traces to Tempo (more on that later).
- we need to use the new ExemplarObserver interface because the original Histogram interface has not been changed, to avoid breaking backward compatibility.
- when using the ObserveWithExemplar method, note the TraceID key: we’ll need it later to configure Grafana, so that it knows which label to use to retrieve the trace ID.
Exposing Prometheus metrics in the OpenMetrics format
So what is the impact of adding our trace ID? Well, so far… nothing, because if you use the default Prometheus HTTP handler, it only exposes your metrics using the Prometheus format, which doesn't support Exemplars. This is where OpenMetrics comes in! It’s a spec based on the Prometheus format, but with support for Exemplars. And of course, Prometheus knows how to collect metrics exposed in the OpenMetrics format. So let’s change our Prometheus HTTP Handler to enable support for OpenMetrics:
You might be wondering why it’s an optional feature, and not enabled by default. The reason is that you can’t really switch between the Prometheus format and OpenMetrics without side effects. In particular:
Counters are expected to have the _total suffix in their metric name. In the output, the suffix will be truncated from the # TYPE and # HELP line. A counter with a missing _total suffix is not an error. However, its type will be set to unknown in that case to avoid invalid OpenMetrics output.
So once you’ve done that, if you want to check the output of the /metrics endpoint in the OpenMetrics format, you need to use the right Accept HTTP header; otherwise you’ll just get the default Prometheus format, without the Exemplars.
curl -H "Accept: application/openmetrics-text" http://host/metrics
and you should see something similar to:
So for each metric, we have the current value and the latest exemplar: the set of labels — only the trace ID for us — along with the recorded value and timestamp. If you do another request, you should see different exemplars.
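The same check can be scripted. The sketch below is a self-contained illustration, not real output from any application: it serves a made-up OpenMetrics payload from a local test server, requests it with the right Accept header, and extracts the exemplar that follows " # " on each sample line. The metric name, trace IDs, values and the parseExemplar helper are all hypothetical, and the parser is deliberately simplified (label values must not contain ',' or '}').

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strconv"
	"strings"
)

// samplePayload is illustrative OpenMetrics text with invented values.
const samplePayload = `# TYPE http_request_duration_seconds histogram
http_request_duration_seconds_bucket{le="0.1"} 8 # {TraceID="4bf92f3577b34da6a3ce929d0e0e4736"} 0.067 1612345678.123
http_request_duration_seconds_bucket{le="+Inf"} 10 # {TraceID="00f067aa0ba902b7a3ce929d0e0e4736"} 0.9 1612345680.456
http_request_duration_seconds_sum 2.5
http_request_duration_seconds_count 10
# EOF
`

// Exemplar is the part of a sample line that follows " # ".
type Exemplar struct {
	Labels map[string]string
	Value  float64
}

// parseExemplar extracts the exemplar from one OpenMetrics sample line.
// It returns false when the line carries no exemplar.
func parseExemplar(line string) (Exemplar, bool) {
	_, rest, found := strings.Cut(line, " # {")
	if !found {
		return Exemplar{}, false
	}
	rawLabels, tail, found := strings.Cut(rest, "}")
	if !found {
		return Exemplar{}, false
	}
	ex := Exemplar{Labels: map[string]string{}}
	for _, pair := range strings.Split(rawLabels, ",") {
		name, quoted, ok := strings.Cut(pair, "=")
		if !ok {
			continue
		}
		value, err := strconv.Unquote(quoted)
		if err != nil {
			return Exemplar{}, false
		}
		ex.Labels[name] = value
	}
	fields := strings.Fields(tail) // exemplar value, then an optional timestamp
	if len(fields) == 0 {
		return Exemplar{}, false
	}
	v, err := strconv.ParseFloat(fields[0], 64)
	if err != nil {
		return Exemplar{}, false
	}
	ex.Value = v
	return ex, true
}

// fetchOpenMetrics requests the URL with the OpenMetrics Accept header.
func fetchOpenMetrics(url string) (string, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return "", err
	}
	req.Header.Set("Accept", "application/openmetrics-text")
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return string(body), err
}

func main() {
	// Stand-in for an instrumented application: exemplars are only
	// served when the client asks for OpenMetrics.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if !strings.Contains(r.Header.Get("Accept"), "application/openmetrics-text") {
			fmt.Fprintln(w, "http_request_duration_seconds_count 10")
			return
		}
		io.WriteString(w, samplePayload)
	}))
	defer srv.Close()

	body, err := fetchOpenMetrics(srv.URL + "/metrics")
	if err != nil {
		fmt.Println("scrape failed:", err)
		return
	}
	scanner := bufio.NewScanner(strings.NewReader(body))
	for scanner.Scan() {
		if ex, ok := parseExemplar(scanner.Text()); ok {
			fmt.Printf("trace %s observed %v\n", ex.Labels["TraceID"], ex.Value)
		}
	}
}
```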
Collecting Exemplars with Prometheus
Now we need to ensure Prometheus can collect our metrics with the associated Exemplars. At the time of writing, Pull Request #6635, which adds support for Exemplars in Prometheus, has not been merged yet. It may be included in the next version: 2.26.0. In the meantime, we’ll need a custom Prometheus, such as the tomwilkie/prometheus:0ea72b6a6 container image. This is the Prometheus image used in the Grafana TNS demo. Of course, don’t use this in your production Prometheus setup.
The Exemplars storage feature needs to be explicitly enabled, using the --enable-feature=exemplar-storage flag (see the PR for the documentation). If you are using the Prometheus Helm Chart, you can set the following values to enable it:
server:
  image:
    repository: tomwilkie/prometheus
    tag: 0ea72b6a6
  extraArgs:
    enable-feature: exemplar-storage
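To check that exemplars are actually stored, you can query them back. The sketch below assumes your Prometheus build exposes the /api/v1/query_exemplars HTTP endpoint, as released versions with exemplar support do. The exemplarQueryURL and traceIDs helpers are hypothetical, and the JSON in main is a hand-written response in the documented shape, not output from a real server.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/url"
	"strconv"
)

// exemplarResponse mirrors the shape of a query_exemplars response.
type exemplarResponse struct {
	Status string `json:"status"`
	Data   []struct {
		SeriesLabels map[string]string `json:"seriesLabels"`
		Exemplars    []struct {
			Labels    map[string]string `json:"labels"`
			Value     string            `json:"value"`
			Timestamp float64           `json:"timestamp"`
		} `json:"exemplars"`
	} `json:"data"`
}

// exemplarQueryURL builds the request URL for a PromQL selector and a
// time range given as Unix seconds.
func exemplarQueryURL(base, query string, startUnix, endUnix int64) string {
	params := url.Values{}
	params.Set("query", query)
	params.Set("start", strconv.FormatInt(startUnix, 10))
	params.Set("end", strconv.FormatInt(endUnix, 10))
	return base + "/api/v1/query_exemplars?" + params.Encode()
}

// traceIDs decodes a response body and collects the values of the given
// exemplar label (for us, "TraceID").
func traceIDs(body []byte, label string) ([]string, error) {
	var resp exemplarResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, err
	}
	if resp.Status != "success" {
		return nil, fmt.Errorf("query failed with status %q", resp.Status)
	}
	var ids []string
	for _, series := range resp.Data {
		for _, ex := range series.Exemplars {
			if id, ok := ex.Labels[label]; ok {
				ids = append(ids, id)
			}
		}
	}
	return ids, nil
}

func main() {
	fmt.Println(exemplarQueryURL("http://prometheus-server.namespace",
		"http_request_duration_seconds_bucket", 1612345000, 1612346000))

	// Hand-written sample response, for illustration only.
	sample := []byte(`{"status":"success","data":[{"seriesLabels":{"__name__":"http_request_duration_seconds_bucket","le":"0.1"},"exemplars":[{"labels":{"TraceID":"4bf92f3577b34da6a3ce929d0e0e4736"},"value":"0.067","timestamp":1612345678.123}]}]}`)
	ids, err := traceIDs(sample, "TraceID")
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println(ids)
}
```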
Configuring Grafana
First, you will need a recent version of Grafana: Exemplars have been added in version 7.4.0-beta1 and I’ve been using v7.5.0-beta1 without any issue so far. Alternatively, you can use grafana/grafana:7.4.x-exemplars — which is the container image used in the Grafana TNS demo.
You will also need to configure Exemplars in your Prometheus data source. This is the part where you use the TraceID key from your code, which allows Grafana to extract the trace ID from the Exemplars and build a link using that ID. If you are using a data source defined in YAML format, you can configure it like this:
name: Prometheus
type: prometheus
access: proxy
url: http://prometheus-server.namespace
httpMethod: POST
version: 1
jsonData:
  exemplarTraceIdDestinations:
    - name: TraceID
      datasourceUid: tempo
It will automatically create links for our Tempo instance. Tempo is part of the Grafana stack, and used to ingest and display traces.
And finally, you need to change your Grafana dashboards:
- you need to use Time series panels, not Graph panels; otherwise you won’t see your Exemplars
- you need to enable the Exemplars checkbox for your Prometheus query, as in the following screenshot:
- and when you see dots in your graph, just mouse over them, and you will see a modal window with details of an Exemplar, such as the time and value, the trace ID with our link to Tempo, and a set of labels associated with the time series:
Note that if you have too many labels on your time series, the modal window might be truncated, and you might not be able to see the top of it, which is the most interesting part, with the trace ID. It happened to me because I was using the default kubernetes-pod scrape config from the Prometheus Helm Chart, which has a labelmap relabel config to extract all labels from the pod and use them on the time series. So I had to update this configuration to keep only a limited set of labels, which gave me a smaller Exemplar window and let me click on the trace ID link to Tempo.
You can see a demo of the whole flow — from metrics to traces:
Your turn now ;-)