The HPC cluster as a reflection of values

Yesterday while I was cooking dinner, I happened to re-watch Bryan Cantrill’s talk on “Platform as a Reflection of Values”. (I watch a lot of tech talks while cooking or baking. I often have trouble focusing on a video unless I’m doing something with my hands, but if I know a recipe well I can often make it on autopilot.)

If you haven’t watched this talk before, I encourage you to check it out. Cantrill gave it in part to talk about why the node.js community and Joyent didn’t work well together, but I thought he had some good insights into how values get built into a technical artifact itself, as well as how the community around that artifact will prioritize certain values.

While I was watching the talk (and chopping some vegetables), I started thinking about what values are most important in the “HPC cluster platform”.


happy living close (-ish) to the metal

For various reasons, I’ve been doing a little bit of career introspection lately. One of the interesting realizations to come out of this is that, despite in practice doing mostly software work, I’ve been happiest when my work involved a strong awareness of the hardware I was running on.


Sketching out HPC clusters at different scales

High-performance computing (HPC) clusters come in a variety of shapes and sizes, depending on the scale of the problems you’re working on, the number of different people using the cluster, and what kinds of resources they need to use.

However, it’s often not clear what kinds of differences separate the kind of cluster you might build for your small research team:

Note: do not use in production

from the kind of cluster that might serve a large laboratory with many different researchers:

The Trinity supercomputer at Los Alamos National Lab, also known as “that goddamn machine” when I used to get paged at 3am

There are lots of differences between a supercomputer and my toy Raspberry Pi cluster, but also a lot in common. From a management perspective, a big part of the difference is how many different specialized node types you might find in the larger system.


handy utilities for every hpc cluster

I’ve built a lot of HPC clusters, and they’ve often looked very different from each other depending on the particular hardware and target applications. But I almost always find myself installing a few common tools on them, to make their management easier, so I thought I’d share the list.
