Ken said:
Now let's start writing. We write a block at the beginning of line 0,
then we seek to the beginning of line 1 and write another block. Since
Linux allows sparse files it doesn't write anything in the blocks between
these two blocks.
Yes
They'll get filled in later when we write more blocks
there. The two blocks we just wrote are now next to each other on disk.
Not necessarily. Filesystems may protect themselves against these
access patterns to avoid fragmentation (ext2/3 already prevent several
processes writing to the same cylinder group at the same time from
fragmenting each other's files). In fact, if I understand the ext2
source code correctly, this is one of the things indirect block
allocation partially prevents (i.e. non-contiguous parts are allocated
in different parts of the disk, and the fs tries to place each part
next to a contiguous one).
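The sparse write pattern described at the start of the thread is easy to observe from user space. A minimal sketch (the block size and the per-line byte count are assumptions for illustration): write one block, seek past a hole, write another block, then compare the file's logical size with what the filesystem actually allocated.

```python
import os
import tempfile

BLOCK = 4096             # assumed filesystem block size
ROW_BYTES = 1 << 20      # assumed length of one matrix line: 1 MiB

fd, path = tempfile.mkstemp()
try:
    with os.fdopen(fd, "wb") as f:
        f.write(b"\x01" * BLOCK)   # first block of line 0
        f.seek(ROW_BYTES)          # jump to the beginning of line 1
        f.write(b"\x02" * BLOCK)   # first block of line 1

    st = os.stat(path)
    logical = st.st_size               # ROW_BYTES + BLOCK
    physical = st.st_blocks * 512      # bytes actually allocated
    print(logical, physical)           # on a sparse-capable fs, physical << logical
finally:
    os.remove(path)
```

On a filesystem that supports holes, the allocated bytes stay close to two blocks no matter how far apart the writes land; where the two blocks end up relative to each other on the platter, though, is entirely up to the allocator.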
I guess you'd probably end up with a matrix split into large contiguous
chunks spread across one or perhaps more cylinder groups on disk.
Keep going, and you'll find that both your reads and writes are
contiguous.
This probably isn't true, and if you want to get down to the real meat
you'll have to evaluate the seek times and the number of seeks. On a
modern drive, a cylinder probably stores on the order of megabytes or
tens of megabytes. This means that more than one line of the matrix
fits in a single cylinder: consecutive lines can be accessed with no
seek, or at most a minimal (track-to-adjacent-track) seek. The only
penalty is that you use a small fraction of what the disk could deliver
in one revolution.
On the other hand, if you access lines in non-contiguous parts of the
matrix in smaller segments (typical when you transpose by sub-matrix),
you diminish the amount of data used on each revolution and incur more
seeks. If you are doing tens or hundreds of thousands of them, this
begins to take a huge toll. Indeed, reading 2G on a modern SATA disk
takes at most 40s (50+MB/s), so seeking can become the performance
bottleneck as soon as you get into the tens of thousands of seeks, even
quick ones, and a 2G 64-bit integer matrix already has a number of
lines in this ballpark.
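The back-of-envelope numbers above can be checked directly. A sketch using the figures from this thread (throughput and average seek time are assumptions typical of the drives being discussed):

```python
# All figures are assumptions taken from or typical of the discussion.
matrix_bytes = 2 * 1024**3               # 2 GiB matrix of 64-bit integers
elem = 8
n = int((matrix_bytes // elem) ** 0.5)   # square matrix: number of lines
throughput = 50 * 1024**2                # ~50 MiB/s sequential SATA read
seek_ms = 8.0                            # assumed average seek time

sequential_s = matrix_bytes / throughput     # time to stream the whole file
one_seek_per_line_s = n * seek_ms / 1000     # time spent if every line costs a seek
print(n, round(sequential_s), round(one_seek_per_line_s))
```

With these numbers the matrix has 16384 lines, streaming it takes about 41 s, and one seek per line already costs more than the entire sequential read, which is the point being made: past tens of thousands of seeks, seeking dominates.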
On the other hand, if you try to transpose back, the file
that you just finished writing is the pathological worst case for seek
time.
This is not obvious, and it is at least not the case for ext2/3 (unless
your filesystem is already heavily fragmented). I believe the same
allocation heuristics are used in various modern filesystems to avoid
file fragmentation.
Does this sound about right? In practice it may not be worth optimizing
seek time, since that's very much the realm of the operating system.
Unfortunately, when you deal with data that can't fit in cache, you
pretty much defeat any mechanism the OS uses to optimize disk access,
so you can't rely on it. From the number of lines in the matrix you can
estimate the order of magnitude of the seeks needed, and you can guess
that algorithms which don't optimize them will perform far worse than
those that do.
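The "transpose by sub-matrix" approach mentioned earlier is the usual way to keep seeks under control. A toy sketch (names, tile size, and the one-byte-per-element layout are all illustrative assumptions, not anyone's actual code): each t x t tile is read as t short contiguous runs and written back as t contiguous runs, so long jumps happen once per tile row rather than once per element.

```python
import tempfile

N = 8    # toy matrix dimension, one byte per element for simplicity
T = 4    # tile side; real code would size tiles to track/cylinder capacity

def transpose_tiled(src, dst, n, t):
    # Process the n x n matrix in t x t tiles: read t contiguous runs,
    # transpose in memory, write t contiguous runs at the mirrored tile.
    with open(src, "rb") as fi, open(dst, "r+b") as fo:
        for bi in range(0, n, t):
            for bj in range(0, n, t):
                rows = []
                for r in range(t):
                    fi.seek((bi + r) * n + bj)
                    rows.append(fi.read(t))
                for c in range(t):
                    fo.seek((bj + c) * n + bi)
                    fo.write(bytes(rows[r][c] for r in range(t)))

data = bytes(range(N * N))
src = tempfile.NamedTemporaryFile(delete=False)
src.write(data); src.close()
dst = tempfile.NamedTemporaryFile(delete=False)
dst.write(b"\x00" * N * N); dst.close()
transpose_tiled(src.name, dst.name, N, T)
with open(dst.name, "rb") as f:
    out = f.read()
```

With n lines and tile side t, the number of long seeks scales with (n/t)^2 * t instead of n^2, which is exactly the trade the thread is arguing about: larger tiles mean fewer seeks but more memory per tile.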
Now, what would be really fun is the OP telling us he uses an SSD with
access times on the order of 0.1 ms for storing these matrices.
Then
I'd eat my hat and advise going with the easiest algorithm to code.
Lionel