Handling forks in threaded code in Ruby

March 12, 2024

In MRI Ruby, when Process.fork is called, only the thread that called it keeps running in the child process; all other threads show up there as dead (Thread#alive? returns false).
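
To see this for yourself, here is a minimal sketch, assuming a Unix-like platform where fork is available. The worker thread and the pipe used to report back from the child are made up for illustration; fork is called from the main thread here.

```ruby
# A background thread that never finishes on its own.
worker = Thread.new { sleep }

# The pipe lets the child report what it sees back to the parent.
reader, writer = IO.pipe

pid = fork do
  reader.close
  # In the child, only the thread that called fork is still running.
  writer.puts "worker alive in child: #{worker.alive?}"
  writer.puts "threads in child: #{Thread.list.size}"
  writer.close
end

writer.close
Process.wait(pid)
child_report = reader.read
reader.close

puts "worker alive in parent: #{worker.alive?}"
puts child_report
```

The parent should still report the worker as alive, while the child sees it as dead and lists a single thread.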

In my case, a Thread::Queue was serviced by a worker Thread that got started inside Puma before it forked. Puma was set up in cluster mode, so it forked afterwards, and I saw that items pushed to the queue were not being processed as they should.

It turns out that this is documented behavior, and I’m probably not the only one who got caught off guard by it.

So apps/libraries that also do threading might want to somehow detect that a fork happened, and clean up/restore the previous state. But how?

There are multiple approaches to handling this, and the best summary and discussion is probably in this Ruby issue asking for before/after fork callbacks. Here are the main points from there:

Documentation: rely on users to do the right thing (e.g. to put their code into a before_fork/after_fork callback).
PID change detection (supposedly slow, 1-2% slowdown).
Check Thread#alive? to catch the now-dead thread (finicky, as threads can die for a multitude of unrelated reasons).
Decorate .fork (or ._fork in Ruby 3.1+) in Process/Kernel, and run the callback automatically.
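
As an illustration of the Thread#alive? option, here is a minimal sketch. BackgroundWorker and its push method are hypothetical names made up for this example, not code from a real library. It restarts the worker thread lazily whenever it finds it dead, which covers forks but, as noted above, also any other reason the thread might have died.

```ruby
# Hypothetical queue-backed worker that revives its thread on demand.
class BackgroundWorker
  def initialize
    @queue = Thread::Queue.new
    @mutex = Mutex.new
    @thread = nil
  end

  # Enqueue a callable job, making sure a live thread will pick it up.
  def push(job)
    ensure_worker
    @queue << job
  end

  private

  def ensure_worker
    return if @thread&.alive?

    @mutex.synchronize do
      # Re-check under the lock so two callers do not both start a thread.
      return if @thread&.alive?

      @thread = Thread.new { loop { @queue.pop.call } }
    end
  end
end
```

After a fork, the first push in the child finds the copied thread dead and starts a fresh one, so jobs enqueued there get processed again.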

In my case, I thought that the 1-2% performance hit was probably OK, so I went the simple / clean way:

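@pid = $$  # remember the PID of the process that started the work
# A different PID later on means we are running in a forked child.
fork { puts "forked, restart work" if $$ != @pid }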

This is guaranteed to work in all MRI Rubies that I wanted to support.

However, here’s where glibc made it more interesting:

There’s a quite significant performance loss for $$ and Process.pid, because glibc 2.25 removed its internal PID cache, so every lookup now ends up as a getpid system call.

This apparently got noticed by Shopify, which released a single-purpose pid_cache gem introducing its own cache for, well… the process ID. A small gotcha is that the library can fix only Process.pid, but not the $$ global variable.

Perhaps prepending a module to Process with a wrapper around _fork to trigger a callback might be a better idea, after all…
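
A minimal sketch of that idea, assuming Ruby 3.1+ (where Process._fork exists) and a Unix-like platform. ForkCallbacks and after_fork_in_child are hypothetical names made up for this example. The only real API used is Process._fork, which Kernel#fork and Process.fork go through and which returns 0 in the child.

```ruby
# Hypothetical registry of blocks to run in the child after every fork.
module ForkCallbacks
  CALLBACKS = []

  def self.after_fork_in_child(&block)
    CALLBACKS << block
  end

  # Wraps Process._fork: super performs the actual fork.
  def _fork
    pid = super
    # A return value of 0 means we are now in the child process.
    CALLBACKS.each(&:call) if pid == 0
    pid
  end
end

Process.singleton_class.prepend(ForkCallbacks)

# Example registration: this block runs only in forked children.
ForkCallbacks.after_fork_in_child do
  puts "forked, restart work (pid #{Process.pid})"
end
```

With this in place there is no PID comparison on the hot path at all; the cost is paid once per fork. Forks done by C extensions that bypass Ruby are not covered by such a wrapper.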

P.S.: Here are a couple of benchmarks. They show that on Ruby 3.3 $$ is a bit faster than Process.pid, but on the older version, using pid_cache and Process.pid instead is the way to go.

dpkg -s libc6 | grep Version:
Version: 2.31-13+deb11u7

Ruby 3.3

(Issue fixed in MRI).

ruby 3.3.0 (2023-12-25 revision 5124f9ac75) [x86_64-linux]
Warming up --------------------------------------
                  $$     1.027M i/100ms
         Process.pid   770.879k i/100ms
Calculating -------------------------------------
                  $$     10.785M (± 6.6%) i/s -     54.450M in   5.071929s
         Process.pid      7.653M (± 5.9%) i/s -     38.544M in   5.056729s

Comparison:
                  $$: 10784576.3 i/s
         Process.pid:  7653482.9 i/s - 1.41x  slower

Ruby 3.2

ruby 3.2.2 (2023-03-30 revision e51014f9c0) [x86_64-linux]
Warming up --------------------------------------
                  $$   162.670k i/100ms
         Process.pid   150.169k i/100ms
Calculating -------------------------------------
                  $$      1.527M (±14.5%) i/s -      7.483M in   5.076198s
         Process.pid      1.538M (± 6.2%) i/s -      7.659M in   4.999731s

Comparison:
         Process.pid:  1538337.7 i/s
                  $$:  1527259.8 i/s - same-ish: difference falls within error

Ruby 3.2 with pid_cache

Using the pid_cache gem, only Process.pid was fixed.

ruby 3.2.2 (2023-03-30 revision e51014f9c0) [x86_64-linux]
Warming up --------------------------------------
                  $$   167.073k i/100ms
         Process.pid   493.054k i/100ms
Calculating -------------------------------------
                  $$      1.629M (± 4.6%) i/s -      8.187M in   5.036055s
         Process.pid      5.046M (± 6.7%) i/s -     25.146M in   5.009096s

Comparison:
         Process.pid:  5045774.2 i/s
                  $$:  1629039.3 i/s - 3.10x  slower

(benchmark code is available on GitHub).
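
For a quick check on your own machine without any gems, a rough sketch like the one below can do. It is not the benchmark code linked above: it simply times a fixed number of calls with a monotonic clock, and iterations_per_second and ITERATIONS are names made up for this example. The block call overhead is included in both measurements, so treat the numbers as a relative comparison only.

```ruby
ITERATIONS = 500_000

# Runs the given block `iterations` times and returns calls per second.
def iterations_per_second(iterations)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  iterations.times { yield }
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  iterations / elapsed
end

results = {
  "$$"          => iterations_per_second(ITERATIONS) { $$ },
  "Process.pid" => iterations_per_second(ITERATIONS) { Process.pid },
}

results.each do |label, ips|
  puts format("%12s: %14.1f i/s", label, ips)
end
```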