Weekly notes #7

On the new hash table implementation in Ruby 2.4 and DB resiliency in the presence of filesystem-level errors.

Ruby, systems, performance, and programming links from last week.

Ruby

Towards Faster Ruby Hash Tables - how Ruby 2.4 got more performant hash tables by using open addressing and an array for entries instead of a doubly linked list, which improves data locality. The original ticket, Feature #12142: Hash tables with open addressing, reads almost like another Ruby-drama :>

Open Source Software is not a zero-sum game. I think that with the patches now in Ruby trunk, everybody wins. You win, Vladimir wins, everybody else wins, and Ruby wins! Thanks again to everybody.

But it’s worth skimming through anyway to see how an initial patch gets gradually improved based on feedback. Koichi Sasada’s evaluation of the CPU and memory performance of the competing implementations and Yura Sokolov’s final implementation, which looks even better than the one in the current Ruby trunk, are also worth checking out.
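To make the idea concrete, here is a toy sketch of the layout described above: entries live in one contiguous array in insertion order, and a separate open-addressed table of bins stores indexes into that array. It is not MRI's st.c; the class name ToyHash, the linear probing, the 0.5 load factor, and the lack of deletion are all simplifying assumptions made for illustration.

```ruby
# Toy hash table: insertion-ordered entries array plus an open-addressed index.
# Illustration only; names and probing strategy are assumptions, not MRI's code.
class ToyHash
  EMPTY = -1

  def initialize(capacity = 8)
    @entries = []                        # [key, value] pairs, in insertion order
    @bins = Array.new(capacity, EMPTY)   # each bin holds an index into @entries
  end

  def []=(key, value)
    # Keep the load factor at or below 0.5 so probing always finds a free bin.
    rebuild(@bins.size * 2) if (@entries.size + 1) * 2 > @bins.size
    slot = find_slot(key)
    if @bins[slot] == EMPTY
      @bins[slot] = @entries.size
      @entries << [key, value]
    else
      @entries[@bins[slot]][1] = value   # key already present: overwrite value
    end
    value
  end

  def [](key)
    idx = @bins[find_slot(key)]
    idx == EMPTY ? nil : @entries[idx][1]
  end

  # Iteration is a plain scan of the entries array, with no pointer chasing.
  def each
    @entries.each { |k, v| yield k, v }
  end

  private

  # Linear probing: walk the bins until we hit the key or an empty bin.
  def find_slot(key)
    slot = key.hash % @bins.size
    loop do
      idx = @bins[slot]
      return slot if idx == EMPTY || @entries[idx][0].eql?(key)
      slot = (slot + 1) % @bins.size
    end
  end

  # Growing only rebuilds the bins; the entries array is left untouched.
  def rebuild(new_size)
    @bins = Array.new(new_size, EMPTY)
    @entries.each_with_index { |(key, _), i| @bins[find_slot(key)] = i }
  end
end

h = ToyHash.new
h[:a] = 1
h[:b] = 2
h.each { |k, v| puts "#{k}=#{v}" }   # prints a=1 then b=2 (insertion order)
```

Because iteration and resizing only touch flat arrays, the sketch shows where the locality benefit comes from compared with following linked-list pointers.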

Ruby OMR JIT - IBM is developing OMR, a language runtime toolkit, and uses Ruby as a testbed for the JIT part, moving MRI towards the ‘3x3’ goal at the same time. Currently based on 2.4; looking for contributors and early adopters.

Resiliency against disk I/O errors

Redundancy does not imply fault tolerance - on resiliency against faults at the disk I/O and file system level, and how corrupted data can propagate to ‘healthy’ nodes in distributed storage.

All File Systems are Not Created Equal: On the Complexity of Crafting Crash Consistent Applications - explores persistence properties of filesystems, then analyzes vulnerabilities (crashes, data inconsistencies) of OSS databases & tools that appear when these properties do not hold.
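As a rough illustration of the kind of update protocol whose correctness depends on these persistence properties, here is a minimal sketch of the common write-temp-file, fsync, rename pattern. The helper name atomic_write and the temp-file naming are assumptions for illustration, and the sketch assumes a POSIX system where a directory can be opened and fsynced; it is not taken from the paper.

```ruby
# Sketch of a crash-conscious file replacement (assumes Linux/POSIX semantics).
# atomic_write is a hypothetical helper name, not a standard library method.
def atomic_write(path, data)
  dir = File.dirname(path)
  tmp = File.join(dir, ".#{File.basename(path)}.tmp.#{Process.pid}")

  File.open(tmp, File::WRONLY | File::CREAT | File::TRUNC, 0o644) do |f|
    f.write(data)
    f.fsync                        # force file contents to disk before the rename
  end

  File.rename(tmp, path)           # replace the old file in one step
  File.open(dir) { |d| d.fsync }   # persist the directory entry for the rename
  nil
end
```

Skipping either fsync relies on the filesystem ordering writes the way the application expects, which is the sort of assumption the paper shows does not hold everywhere.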

Data corruption is worse than you know - some statistics about disk, RAID and memory errors from CERN.

Articles

How We Organize GitHub Issues - A Styleguide For Tagging - handy system for organising and tracking issues/feature requests/ideas on GitHub. Useful for open-source software, too.

Toy Robot Coding Puzzle - why certain interview questions never get old (apparently, because of the quality of candidates?)