Instrumenting Rails with Prometheus
almost 7 years ago
People following me have occasionally seen me post graphs like this:
Usually people leave this type of instrumentation and graphing to NewRelic and Skylight. However, at our scale we find it extremely beneficial to keep instrumentation, graphing and monitoring local. Because we are in the business of hosting, this is a central part of our job.
Over the past few years Prometheus has emerged as one of the leading options for gathering metrics and alerting. However, sadly, people using Rails have had a very hard time extracting metrics.
Issue #9 on the official Prometheus client for Ruby has been open for 3 years now, and there is very little chance it will be “solved” any time soon.
The underlying fundamental issue is that Prometheus, unlike Graphite/Statsd, is centered around the concept of pulling metrics as opposed to pushing metrics.
This means you must provide a single HTTP endpoint that collects all the metrics you want exposed. This ends up being particularly complicated with Unicorn, Puma and Passenger, which usually run multiple forks of a process. If you simply implement a secured /metrics endpoint in your app, you have no guarantee over which forked process will handle the request. Without “cross fork” aggregation you would just report metrics for a single, random process, which is less than useful.
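To make the problem concrete, here is a minimal sketch in plain Ruby. It assumes three forked workers that each count their own requests. `WorkerReport` and `SketchCollector` are hypothetical names made up for this illustration; they are not part of any gem.

```ruby
# Hypothetical sketch: none of these names come from prometheus_exporter.
# Each forked worker only knows its own counters.
WorkerReport = Struct.new(:pid, :counters)

# A single collector process that merges what every worker sends it.
class SketchCollector
  def initialize
    @totals = Hash.new(0)
  end

  # Called each time a worker pushes its locally gathered counts.
  def observe(report)
    report.counters.each { |name, value| @totals[name] += value }
  end

  # What one scrape of the collector would expose.
  def metrics
    @totals.dup
  end
end

reports = [
  WorkerReport.new(101, { "http_requests_total" => 40 }),
  WorkerReport.new(102, { "http_requests_total" => 25 }),
  WorkerReport.new(103, { "http_requests_total" => 35 })
]

# Scraping the app directly: whichever worker answers reports only its share.
single_worker_view = reports.sample.counters["http_requests_total"]

# Scraping the collector: every worker's counts are summed.
collector = SketchCollector.new
reports.each { |report| collector.observe(report) }

puts "one random worker: #{single_worker_view}"
puts "aggregated:        #{collector.metrics["http_requests_total"]}"
```

The direct scrape returns a different number depending on which worker happens to answer, while the collector always returns the sum across workers.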
Additionally, knowing what to collect and how to collect it is a bit of an art; it can easily take multiple weeks just to figure out what you want.
Having solved this big problem for Discourse I spent some time extracting the patterns.
Introducing prometheus_exporter
The prometheus_exporter gem is a toolkit that provides all the facilities you need.
- It has an extensible collector that allows you to run a single process to aggregate metrics for multiple processes on one machine.
- It implements gauge, counter and summary metrics.
- It has default instrumentation that you can easily add to your app.
- It has a very efficient and robust transport channel between forked processes and the master collector. The master collector gathers metrics via HTTP but reduces overhead by using chunked encoding, so a single session can gather a very large amount of metrics.
- It exposes metrics to Prometheus over a dedicated port; the HTTP endpoint is compressed.
- It is completely extensible; you can pick as much or as little as you want.
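For readers new to the metric types, here is a rough sketch of how a counter differs from a gauge, and of what a collector ends up serving in the Prometheus text exposition format. The classes below (`SketchCounter`, `SketchGauge`, `SketchTextFormat`) are hypothetical stand-ins written for this illustration, not the gem's own implementation, and summaries are left out to keep it short.

```ruby
# Hypothetical stand-ins, not the gem's own classes.
class SketchMetric
  attr_reader :name, :help, :samples

  def initialize(name, help)
    @name = name
    @help = help
    @samples = {} # maps a labels hash to the current value
  end
end

# A counter only ever accumulates.
class SketchCounter < SketchMetric
  def type
    "counter"
  end

  def observe(increment = 1, labels = {})
    @samples[labels] = @samples.fetch(labels, 0) + increment
  end
end

# A gauge keeps the most recent value.
class SketchGauge < SketchMetric
  def type
    "gauge"
  end

  def observe(value, labels = {})
    @samples[labels] = value
  end
end

module SketchTextFormat
  # Renders one metric in the Prometheus text exposition format.
  def self.render(metric)
    lines = [
      "# HELP #{metric.name} #{metric.help}",
      "# TYPE #{metric.name} #{metric.type}"
    ]
    metric.samples.each do |labels, value|
      pairs = labels.map { |key, val| "#{key}=\"#{val}\"" }.join(",")
      selector = pairs.empty? ? "" : "{#{pairs}}"
      lines << "#{metric.name}#{selector} #{value}"
    end
    lines.join("\n")
  end
end

requests = SketchCounter.new("http_requests_total", "Total HTTP requests")
requests.observe(1, { status: 200 })
requests.observe(1, { status: 200 })
requests.observe(1, { status: 500 })

rss = SketchGauge.new("rss", "Resident set size in bytes")
rss.observe(120_000_000)
rss.observe(118_000_000) # overwrites the previous reading

puts SketchTextFormat.render(requests)
puts SketchTextFormat.render(rss)
```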
A minimal example implementing metrics for your Rails app
In your Gemfile:
gem 'prometheus_exporter'
# in config/initializers/prometheus.rb
if Rails.env != "test"
  require 'prometheus_exporter/middleware'

  # This reports stats per request like HTTP status and timings
  Rails.application.middleware.unshift PrometheusExporter::Middleware
end
At this point, your web app is instrumented: every request will keep track of SQL, Redis and total time (provided you are using PG).
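If you are wondering what a request-timing middleware does in principle, the following is a simplified sketch of the general Rack pattern. It is not the gem's actual code: `SketchTimingMiddleware` and its `sink` argument are hypothetical, and it records only total time, not SQL or Redis time.

```ruby
# Hypothetical sketch of a Rack-style timing middleware.
class SketchTimingMiddleware
  # app:  any object responding to call(env), as in Rack
  # sink: anything responding to <<, standing in for a metrics transport
  def initialize(app, sink)
    @app = app
    @sink = sink
  end

  def call(env)
    # A monotonic clock is not affected by system clock adjustments.
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    status, headers, body = @app.call(env)
    [status, headers, body]
  ensure
    # Runs even if the app raised; status is nil in that case.
    duration = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
    @sink << { path: env["PATH_INFO"], status: status, duration: duration }
  end
end

app = ->(env) { [200, { "content-type" => "text/plain" }, ["ok"]] }
observations = []
stack = SketchTimingMiddleware.new(app, observations)

stack.call({ "PATH_INFO" => "/health" })
puts observations.first.inspect
```

Putting such a middleware at the front of the stack, as the `unshift` above does, means the measured time covers every other middleware as well.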
You may also be interested in per-process stats, like:
# in config/initializers/prometheus.rb
if Rails.env != "test"
  require 'prometheus_exporter/instrumentation'

  # this reports basic process stats like RSS and GC info, type master
  # means it is instrumenting the master process
  PrometheusExporter::Instrumentation::Process.start(type: "master")
end

# in unicorn/puma/passenger be sure to run a new process instrumenter after fork
after_fork do
  require 'prometheus_exporter/instrumentation'
  PrometheusExporter::Instrumentation::Process.start(type: "web")
end
Also you may be interested in some Sidekiq stats:
Sidekiq.configure_server do |config|
  config.server_middleware do |chain|
    require 'prometheus_exporter/instrumentation'
    chain.add PrometheusExporter::Instrumentation::Sidekiq
  end
end
Finally, you may want to collect some global stats across all processes, like:
To do so we can introduce a “type collector”:
# lib/global_type_collector.rb
unless defined? Rails
  require File.expand_path("../../config/environment", __FILE__)
end

require 'raindrops'

class GlobalPrometheusCollector < PrometheusExporter::Server::TypeCollector
  include PrometheusExporter::Metric

  def initialize
    @web_queued = Gauge.new("web_queued", "Number of queued web requests")
    @web_active = Gauge.new("web_active", "Number of active web requests")
  end

  def type
    "app_global"
  end

  def observe(obj)
    # do nothing, we would only use this if metrics are transported from apps
  end

  def metrics
    # read the listener stats for the unicorn socket each time metrics are requested
    path = "/var/www/my_app/tmp/sockets/unicorn.sock"
    info = Raindrops::Linux.unix_listener_stats([path])[path]

    @web_active.observe(info.active)
    @web_queued.observe(info.queued)

    [
      @web_queued,
      @web_active
    ]
  end
end
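The role of `type` and `observe` may be easier to see with a toy model. The sketch below assumes a collector that routes each incoming payload to whichever type collector registered a matching `type` string. `SketchDispatcher` and `SketchTypeCollector` are hypothetical names for illustration only and do not show how the gem is implemented internally.

```ruby
# Hypothetical toy model of type-based routing; not the gem's internals.
class SketchTypeCollector
  def initialize(type)
    @type = type
    @observed = []
  end

  attr_reader :type

  # Receives payloads that were transported from app processes.
  def observe(payload)
    @observed << payload
  end

  # Returns whatever this collector wants exposed at scrape time.
  def metrics
    { "#{@type}_payloads_seen" => @observed.size }
  end
end

class SketchDispatcher
  def initialize
    @collectors = {}
  end

  def register(collector)
    @collectors[collector.type] = collector
  end

  # Routes a payload by its "type" key; unknown types are ignored here.
  def process(payload)
    collector = @collectors[payload["type"]]
    collector&.observe(payload)
  end

  # Merges the metrics of every registered collector.
  def metrics
    @collectors.values.map(&:metrics).reduce({}, :merge)
  end
end

dispatcher = SketchDispatcher.new
dispatcher.register(SketchTypeCollector.new("web"))
dispatcher.register(SketchTypeCollector.new("app_global"))

dispatcher.process({ "type" => "web", "duration" => 0.12 })
dispatcher.process({ "type" => "web", "duration" => 0.30 })
dispatcher.process({ "type" => "unknown" })

puts dispatcher.metrics.inspect
```

In this model a collector like the one above never receives payloads. Its `metrics` method gathers its values itself at scrape time, which is why its `observe` can stay empty.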
After all of this is done you need to run the collector (in a monitored process in production) using runit, supervisord, systemd or whatever your poison is (mine is runit).
bundle exec prometheus_exporter -t /var/www/my_app/lib/global_type_collector.rb
Then you follow the various guides online and set up Prometheus and the excellent Grafana, and you too can have wonderful graphs.
For those curious, here is a partial example of how the raw metric feed looks for an internal app we use that I instrumented yesterday: https://gist.github.com/SamSaffron/e2e0c404ff0bacf5fbca80163b54f0a4
I hope you find this helpful, good luck instrumenting all things!
EDIT: @bbonamin has shared a dashboard here which is a good starting point!
We are going the statsd_exporter way. Have you considered it? It lacks free-form tags, but it has a mapping that will do the job. I wrote a mapping config generator so it’s all automatic.