Collect JSON metrics with Telegraf

Telegraf is a powerful, plugin-based metrics collector that also provides Prometheus-compatible outputs. A wide range of input plugins collect metrics from different sources. Even more powerful are the processor plugins, which process and manipulate metrics as they pass through and immediately emit results based on the values they see. In this short blog post I’ll explain how to fetch JSON metrics from the Docker registry API to track some data about a Docker Hub repository.

Even though Prometheus has become very popular, not all applications and APIs provide native Prometheus metrics. In this example, I’ll focus on the repository metrics provided by https://hub.docker.com/v2/repositories. While the API only reports the current pull count, storing this information in Prometheus makes it possible to display time-based information, e.g. the pull rate or the number of pulls in a certain period of time.

To fetch metrics from an HTTP endpoint, the generic HTTP input plugin can be used. This plugin collects metrics from one or more HTTP(S) endpoints and supports many different data formats, not only JSON.

By default, only numeric values are extracted from the collected JSON data. In the case of the Docker Registry API endpoint, these are the status, star_count, pull_count, and collaborator_count fields. This behavior can be customized by adding field names to the json_string_fields configuration option. Since Prometheus supports only numeric values for metrics, string fields are added as labels to each metric. In some cases, string fields contain useful information that can also be converted to a numeric representation. For example, status texts such as OK and ERROR can be converted to 0 and 1 and used as metrics in Prometheus. In the case of the Docker Registry API, I wanted to use the last_updated field as a dedicated metric in Prometheus instead of a label. Fortunately, the date time format used can be turned into a numeric value by converting it to a Unix timestamp.
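The field selection described above can be illustrated with a small Python sketch. It is only a rough approximation of the parser's behavior, not Telegraf's implementation: it looks at top-level keys only, the function name split_fields is hypothetical, and the last_updated string in the sample payload is an assumed value (the numeric values mirror the test output shown later in this post).

```python
import json

# Sample payload. The numeric values mirror the test output shown later in
# this post; the last_updated string is an illustrative assumption.
SAMPLE = json.loads("""
{
  "name": "telegraf",
  "namespace": "library",
  "repository_type": "image",
  "status": 1,
  "star_count": 560,
  "pull_count": 519804664,
  "collaborator_count": 0,
  "last_updated": "2022-03-03T00:24:36.123456Z"
}
""")


def split_fields(doc, json_string_fields=()):
    """Keep numeric values; keep strings only if they are explicitly listed."""
    fields = {}
    for key, value in doc.items():
        if isinstance(value, bool):
            # bool is a subclass of int in Python; this sketch skips booleans.
            continue
        if isinstance(value, (int, float)):
            fields[key] = float(value)
        elif isinstance(value, str) and key in json_string_fields:
            fields[key] = value
    return fields


# Without any extra configuration only the numeric keys survive.
default_fields = split_fields(SAMPLE)
# Listing string keys keeps them as well, similar to json_string_fields.
custom_fields = split_fields(
    SAMPLE,
    json_string_fields=("last_updated", "name", "namespace", "repository_type"),
)

print(sorted(default_fields))
print(sorted(custom_fields))
```

With the default call, only the four numeric keys remain. Once the string keys are listed, last_updated becomes available for further processing.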

For simple type conversions, I recommend always checking the available converter processor first. Sadly, there is no such converter for date or date time values available yet, so I had to go another way and used the Starlark processor. This processor calls a Starlark function for each matched metric, allowing for custom programmatic metric processing. While Starlark is a Python dialect and might look familiar, it only supports a very limited subset of the language, as explained in the specification. But it’s powerful enough for what I wanted to do: convert the date time value in the format 2006-01-02T15:04:05.999999Z to a valid Unix timestamp. This is what the final configuration looks like:

#jinja2: lstrip_blocks: True
[[inputs.http]]
  name_override = "dockerhub_respository"
  urls = [
    "https://hub.docker.com/v2/repositories/library/telegraf/"
  ]

  json_string_fields = [
    "last_updated",
    "name",
    "namespace",
    "repository_type",
  ]

  data_format = "json"

[[processors.starlark]]
  namepass = ["dockerhub_respository"]

  source = '''
load("time.star", "time")
def apply(metric):
  # Parse the date time string and replace it with a Unix timestamp in seconds.
  metric.fields["last_updated"] = time.parse_time(metric.fields["last_updated"], format="2006-01-02T15:04:05.999999Z").unix
  return metric
'''
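
To cross-check the conversion outside of Telegraf, the same calculation can be sketched in plain Python. This is an equivalent under stated assumptions, not the code Telegraf runs: the helper name to_unix is hypothetical, and the sample input is an assumed value whose whole seconds correspond to the last_updated number in the test output further down.

```python
from datetime import datetime, timezone


def to_unix(value):
    """Convert a UTC timestamp such as 2006-01-02T15:04:05.999999Z to Unix seconds."""
    # The fractional part is optional in the Go layout, so try both variants.
    for layout in ("%Y-%m-%dT%H:%M:%S.%fZ", "%Y-%m-%dT%H:%M:%SZ"):
        try:
            parsed = datetime.strptime(value, layout)
        except ValueError:
            continue
        # The trailing Z marks UTC; attach the zone explicitly before converting.
        return int(parsed.replace(tzinfo=timezone.utc).timestamp())
    raise ValueError("unsupported timestamp: " + value)


# Assumed sample input; prints 1646267076
print(to_unix("2022-03-03T00:24:36.123456Z"))
```

Attaching the UTC zone explicitly matters here: without it, Python would interpret the parsed value in the local time zone of the machine and the result could be off by several hours.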

To debug and test the Telegraf configuration, it’s useful to run the binary with the --test flag:

./usr/bin/telegraf --debug --test --config etc/telegraf/telegraf.conf --config-directory etc/telegraf/telegraf.d/
2022-03-12T17:01:07Z I! Starting Telegraf 1.21.4
2022-03-12T17:01:07Z I! Loaded inputs: http
2022-03-12T17:01:07Z I! Loaded aggregators:
2022-03-12T17:01:07Z I! Loaded processors: starlark
2022-03-12T17:01:07Z W! Outputs are not used in testing mode!
2022-03-12T17:01:07Z I! Tags enabled: host=localhost project=prometheus
[...]
> dockerhub_respository,host=localhost,name=telegraf,namespace=library,project=prometheus,repository_type=image,url=https://hub.docker.com/v2/repositories/library/telegraf/ collaborator_count=0,last_updated=1646267076i,pull_count=519804664,star_count=560,status=1 1647104469000000000

The final metrics generated by the Prometheus output plugin will look like this:

dockerhub_respository_collaborator_count{host="localhost",name="telegraf",namespace="library",project="prometheus",repository_type="image",url="https://hub.docker.com/v2/repositories/library/telegraf/"} 0
dockerhub_respository_last_updated{host="localhost",name="telegraf",namespace="library",project="prometheus",repository_type="image",url="https://hub.docker.com/v2/repositories/library/telegraf/"} 1.646267076e+09
dockerhub_respository_pull_count{host="localhost",name="telegraf",namespace="library",project="prometheus",repository_type="image",url="https://hub.docker.com/v2/repositories/library/telegraf/"} 5.19802966e+08
dockerhub_respository_star_count{host="localhost",name="telegraf",namespace="library",project="prometheus",repository_type="image",url="https://hub.docker.com/v2/repositories/library/telegraf/"} 560
dockerhub_respository_status{host="localhost",name="telegraf",namespace="library",project="prometheus",repository_type="image",url="https://hub.docker.com/v2/repositories/library/telegraf/"} 1
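
The naming pattern visible in this output (measurement name, underscore, field name, with the tags as labels) can be sketched in Python as follows. This is a simplified illustration, not the output plugin's implementation: the function name to_prometheus_samples is hypothetical, label values are not escaped, and numbers are printed as plain Python values, whereas the real output above uses scientific notation for large values.

```python
def to_prometheus_samples(measurement, tags, fields):
    """Render one sample line per numeric field, using tags as labels."""
    # Labels are emitted in sorted order, as in the output above.
    label_text = ",".join('{}="{}"'.format(key, tags[key]) for key in sorted(tags))
    samples = []
    for field in sorted(fields):
        value = fields[field]
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            # Only numeric fields become samples in this sketch.
            continue
        samples.append("{}_{}{{{}}} {}".format(measurement, field, label_text, value))
    return samples


samples = to_prometheus_samples(
    "dockerhub_respository",
    {"host": "localhost", "name": "telegraf", "namespace": "library"},
    {"star_count": 560, "status": 1, "collaborator_count": 0},
)
for line in samples:
    print(line)
```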

That’s it. The Telegraf HTTP input plugin is a very flexible, generic way to collect and transform JSON metrics from various sources. If that’s still not powerful enough, you can pass the raw data fetched by the HTTP input plugin to the Starlark processor and write your own functions to parse the input and extract the required information into metrics.