Ruby tracing part one - BPF

January 28, 2017

How many times did you need to ssh into production, and see what the Ruby app is actually doing now? How it allocates memory, which syscalls or what user-level functions is it calling?

Even with centralized logging, the granularity is less than ideal for observing performance, or even bugs in the wild. What you usually get is an after-the-fact summary of state, maybe requests per second report, maybe an exception and a backtrace. And you don’t run your Ruby apps on a MacOS server, do you? So DTrace is out of the question.

Enter eBPF.

Brendan Gregg has been blogging about this persistently for a few years now, but I found it only last week (oops!) and it blew my mind. eBPF stands for extended Berkeley Packet Filter. It allows you to write tiny programs, that get checked for safety, compiled, and run by a small VM inside of the Linux kernel, whenever some kernel-, or even user-space event fires. Basically DTrace, but on Linux. Awesome!

Let’s take a look how to set up and compile MRI Ruby to be trace-able using bcc and eBPF. Then we’ll go through some useful scripts from bcc toolbox, define our own probes in the Ruby VM and finally, talk about application level tracing.

Setting things up

Kernel

A pre-requisite is a fairly recent Linux kernel. eBPF is in mainline kernel since 4.1, so that, or newer one should do. I’ll be using Debian unstable with experimental repository here, which includes Linux 4.9. A nice thing is that even Raspberry Pi is running version 4.4 kernel, which makes it a cool box for experimenting with eBPF.

Ruby

With the right kernel in place, apt install the systemtap-sdt-dev package:

$ sudo apt install systemtap-sdt-dev

This will make dtrace command available. We can now compile MRI Ruby with dtrace support enabled. Using ruby-install:

$ ruby-install ruby 2.4 -- --enable-dtrace

If successful and with ruby in $PATH, this command should list the probes compiled into Ruby:

$ readelf -n $(which ruby) | grep Name
Name: raise
Name: gc__mark__end
Name: gc__mark__begin
Name: gc__sweep__begin
Name: gc__sweep__end
8< --- snip ---

Yay, so we now can see all probes that the MRI Ruby VM defines. Same steps should work with Rubinius as well.

Bcc tools

To be actually able to write or use some programs that fire on these events, we need to install BPF Compiler Collection (BCC). Rather than posting the instructions here, I’ll just link to them - while they are written for Ubuntu, they certainly work for Debian as well.

The only gotcha I encountered on Debian unstable is an issue with gcc 6.3 for which this patch needs to be applied, in order for the bcc toolchain to compile.

With bcc installed in /usr/, and /usr/share/bcc/tools/ in PATH, let’s list the probes again:

$ tplist -l $(which ruby)
/opt/rubies/ruby-2.4.0/bin/ruby ruby:raise
/opt/rubies/ruby-2.4.0/bin/ruby ruby:gc__mark__end
8< --- snip ---

Cool!

Usage

Now, let’s get to actual tracing. What about garbage collections?

Let’s start a tiny app and start tracing:

#!/usr/bin/env ruby
require 'webrick'
server = WEBrick::HTTPServer.new(Port: 8080)
server.mount_proc('/') { |req, res| res.body = 'Hi!' }
trap('INT') { server.shutdown }
server.start

# trace -p 16305 'u:ruby:gc__mark__begin'

If we now run run curl "http://localhost:8080" and trigger some GC runs, trace will happily report them:

# trace -p 16305 'u:ruby:gc__mark__begin'
PID    TID    COMM         FUNC
16305  16333  puma 001     gc__mark__begin

Great, but tracing single events is not that useful. And, as of today, writing our own bcc programs is not for the faint of heart. Luckily, others already chimed in with their scripts, so there is a couple of tools that can trace not only GC, but also object allocations, calls or method flow:

$ ucalls -l ruby $(pgrep ruby) 1
$ uflow ruby $(pgrep ruby)
$ uobjnew ruby $(pgrep ruby)
$ ugc ruby $(pgrep ruby)

You can about these in the merge request which adds them to the bcc toolchain.

Creating our own probes

Ruby VM probes

With the tracing capabilities at hand, a question immediately comes to mind:

Can we introduce our own probes to Ruby VM? And the answer is yes! Plus, it’s not complicated. Suppose we want to trace whenever a RegExp is matched.

Grep-ing the Ruby sources for ‘dtrace’ leads us to dtrace_probes.rdoc, that lists the exported probes, RUBY_DTRACE_HOOK in internal.h and a probes.d file, listing probe definitions.

So if we piece these bits of info together, we should:

Add a probe definition to probes.d:

probe regexp__match(const char *regexp_string, const char *filename, int lineno);

Add a RUBY_DTRACE_HOOK(NAME, arg) hook to the appropriate place in re.c from ruby 2.4.0 (for brevity, this covers only the string =~ regexp case):

diff --git a/re.c b/re.c
index e94b6a5..b6783ad 100644
--- a/re.c
+++ b/re.c
@@ -10,6 +10,7 @@
 **********************************************************************/

 #include "internal.h"
+#include "probes.h"
 #include "ruby/re.h"
 #include "ruby/util.h"
 #include "regint.h"
@@ -3076,6 +3078,7 @@ rb_reg_match(VALUE re, VALUE str)
     long pos = reg_match_pos(re, &str, 0);
     if (pos < 0) return Qnil;
     pos = rb_str_sublen(str, pos);
+		 RUBY_DTRACE_HOOK(REGEXP_MATCH, RREGEXP_SRC(re));
     return LONG2FIX(pos);
 }

When we now re-compile Ruby, we should see the probe in the list:

# tplist -l ./build/bin/ruby | grep regexp
./build/bin/ruby ruby:regexp__match

We will use a trivial loop that matches a regexp:

#!./build/bin/ruby
puts Process.pid
while true
  r = /test/
  s = "and now for some testing"
  s =~ r
  sleep 1
end

Ensuring PATH and RUBY_ROOT point to the ruby tree (running chruby . in the source tree should do the trick):

# trace -p 960 'u:ruby:regexp__match'
PID    TID    COMM         FUNC
960    960    test.rb      regexp__match
960    960    test.rb      regexp__match

Et voila! We have a trace! If you change the s to not match r, no probe will fire.

Application level tracing

OK, so, “That’s pretty neat, but do I have to patch ruby VM every time I want to trace something?. I want to add probes to my app directly.”…

It turns out that’s not that simple. As far as I know, and I’d love to be wrong - please do correct me! - the only way how to add application level probes is via libusdt and ruby-usdt, which, at the moment works with DTrace only :( I don’t know enough of the BPF internals (yet!), but maybe it should be possible to adapt the library to work with bcc’s usdt interface, making the ruby-usdt gem work out of the box!

Conclusion

So that’s it. It’s awesome to see Linux offering such advanced tracing capabilities. This in turn makes our apps more transparent at runtime and eases debugging in production.

Bcc is a really powerful toolkit for BPF-based tracing, and there are also tools like ply, making writing BPF programs (for kernel probes) a breeze by compiling into them from a higher-level scripting language. This brings the whole tracing system even closer to non-C developers and operations engineers. Yay!

In the next post, we’ll look at rbtrace gem, which works as an extension to MRI Ruby, so it is bound to a concrete Ruby implementation, but can run on any operating system.

If you’re solving Ruby performance issues in prod, have experience with DTracing Ruby or just would like to chat about Ruby & perf, shoot me a mail to (balazs at kutilovi dot cz), or a tweet! I’ll be happy to hear from you or help you out.