• by addaon on 5/20/2025, 10:23:38 PM

    An okay overview of some high-level context for on-disk storage, but it's perhaps more useful to say that disk hardware (and memory hardware) presents an abstraction of a bunch of bits. Even for DRAM, there isn't a one-to-one mapping between the capacitors the fab etches into the silicon and the bits that your software can access at a given physical address. At the lowest level, defective rows are bypassed and remapped. At the next level up, ECC means that a single bit can never be (reliably) pointed at on its own -- instead, the data of, say, 64 bits is smeared across 72 capacitors. For disks, this gets even worse, both because the hardware itself is less reliable and because the slow speed allows more and more tricks to be played. A bunch of bits gets mapped to a bunch of blocks, but blocks get remapped, bits within blocks get error corrected, multiple bits are stored in a single physical element, etc.
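
    To make the 64-into-72 point concrete, here's a rough sketch of one textbook way to do it: an extended Hamming (SECDED) code, with 7 check bits plus one overall parity bit. Treat it as an illustration only. Real memory controllers pick their own codes and bit layouts, and the names below (encode, decode, the position tables) are made up for this example.

```python
DATA_BITS = 64
TOTAL_BITS = 72  # positions 0..71; position 0 holds the overall parity bit
PARITY_POSITIONS = [1, 2, 4, 8, 16, 32, 64]  # Hamming check bits
# Every position in 1..71 that is not a power of two carries one data bit.
DATA_POSITIONS = [p for p in range(1, TOTAL_BITS) if p & (p - 1) != 0]
assert len(DATA_POSITIONS) == DATA_BITS


def _syndrome(word):
    """XOR together the position numbers of all set bits in positions 1..71."""
    s = 0
    for pos in range(1, TOTAL_BITS):
        if (word >> pos) & 1:
            s ^= pos
    return s


def encode(data):
    """Spread a 64-bit value across a 72-bit codeword."""
    if not 0 <= data < (1 << DATA_BITS):
        raise ValueError("data must fit in 64 bits")
    word = 0
    for i, pos in enumerate(DATA_POSITIONS):
        if (data >> i) & 1:
            word |= 1 << pos
    # Set check bits so the syndrome of the finished codeword is zero.
    s = _syndrome(word)
    for p in PARITY_POSITIONS:
        if s & p:
            word |= 1 << p
    # Overall parity: make the total number of set bits even.
    if bin(word).count("1") % 2:
        word |= 1
    return word


def decode(word):
    """Return (data, status); status is 'ok', 'corrected', or 'uncorrectable'."""
    s = _syndrome(word)
    odd = bin(word).count("1") % 2 == 1
    if s == 0 and not odd:
        status = "ok"
    elif odd and s < TOTAL_BITS:
        # Odd parity means a single flipped bit, located at position s
        # (s == 0 means the overall parity bit itself was hit).
        word ^= 1 << s
        status = "corrected"
    else:
        # Nonzero syndrome with even parity (or an out-of-range syndrome):
        # more than one bit flipped, so the data cannot be trusted.
        return None, "uncorrectable"
    data = 0
    for i, pos in enumerate(DATA_POSITIONS):
        if (word >> pos) & 1:
            data |= 1 << i
    return data, status
```

    With this layout, flipping any one of the 72 stored bits in encode(x) still decodes back to x, and flipping any two is reported as uncorrectable. No single stored bit "is" bit 17 of your data in any reliable sense, since every data bit also feeds several check bits.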

  • by Liftyee on 5/20/2025, 11:22:12 PM

    For my previously-shallow level of understanding, this was an insightful article that showed me a little of how the filesystem actually works. I'm vaguely aware of abstractions at the hardware level (especially with solid state memory controllers, wear-levelling...) but that's another layer of abstraction down from that explained here. I'll learn the magic of working around nanoscale physics another day.

    The author seems to have a number of explanations of this quality. I've put the one about git submodules on my reading list.

  • by ggm on 5/21/2025, 2:15:24 AM

    Most of the complications can be learned after you get comfortable with a basic model. It is entirely true that things have got more complicated, but the key concepts and, most importantly (to me), the language of what disks are come from their history. The whole block/sector/inner/outer, cache/written, and addressing models come from the realities of spinning objects. We didn't inherit very many concepts from mercury delay lines in the longer term, but we did from core memory, because addressing models "made sense" in the X/Y plane model they exposed, and we carried some of that into the future and into disk sector/block models.

    Shingled, SMR, CVR, checksums, RAID, RAM backed, the impact of VM models, L1 and L2 cache, unified file buffer caches... it's all add-ons which assume you have the basic language around disk "concepts".

  • by stevetron on 5/20/2025, 10:36:05 PM

    A disk is circular.