Ouch, that's a lot of precision bits to give up just for this kind of stuffing
trick. I would not be willing to sacrifice more than, say, 3. That's enough
for a type tag with 8 values.
I had considered giving it a dedicated 56-bit range before, but never
really got around to it, and found the current setup good enough.
it is still considerably more accurate than the 28-bit flonum used on
32-bit targets.
Instead of shifting and using an offset, you could put a couple of tag bits
into the low order bits of the pointer which indicate "this is a double". Then
mask them to zero to retrieve the float.
this is possible, except that low-order tag bits are really annoying as
they require constraining that every pointer be aligned, say, on 8 bytes
(and a random byte-aligned character pointer can make a mess of things).
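to make the alignment issue concrete, here is a minimal sketch of the low-order-tag approach. the 3-bit tag width, the tag values, and all of the names are assumptions made up for illustration, not anything from an actual implementation:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: the low 3 bits of a 64-bit word hold the type tag. */
#define TAG_MASK   UINT64_C(7)
#define TAG_PTR    UINT64_C(0)  /* pointer, must be aligned on 8 bytes */
#define TAG_FLONUM UINT64_C(1)  /* double with its low 3 mantissa bits dropped */

typedef uint64_t value_t;

/* Overwrite the low 3 mantissa bits of the double with the flonum tag. */
static inline value_t box_double(double d) {
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);
    return (bits & ~TAG_MASK) | TAG_FLONUM;
}

/* Mask the tag bits back to zero to recover the (truncated) double. */
static inline double unbox_double(value_t v) {
    uint64_t bits = v & ~TAG_MASK;
    double d;
    memcpy(&d, &bits, sizeof d);
    return d;
}

static inline int is_flonum(value_t v) {
    return (v & TAG_MASK) == TAG_FLONUM;
}

/* An address can only be stored as a pointer if its low 3 bits are zero. */
static inline int pointer_is_boxable(uint64_t addr) {
    return (addr & TAG_MASK) == TAG_PTR;
}

static inline value_t box_pointer(const void *p) {
    uint64_t addr = (uint64_t)(uintptr_t)p;
    assert(pointer_is_boxable(addr));
    return addr;
}
```

with this layout, an address like 0x1001 (say, a pointer into the middle of a string) has the same low bits as a flonum and would be misread, which is the mess referred to above.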
in my case, I had used a different strategy:
unusable parts of the address space are used for tagged values.
on x86, this basically means the range from 0xC0000000 to 0xFFFFFFFF,
since this space is (generally) reserved for the OS on both Windows
and Linux.
both "fixnum" and "flonum" are 28 bits in this case, with the rest of
the space being used for other smaller type ranges.
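roughly, the 32-bit case could look like the sketch below. the placement of fixnum at 0xC0000000 and flonum at 0xD0000000 is a guess for illustration, as is building the 28-bit flonum by dropping the low 4 bits of a 32-bit float; the function names are made up as well:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TAGGED_BASE  UINT32_C(0xC0000000)  /* start of the OS-reserved range */
#define FIXNUM_BASE  UINT32_C(0xC0000000)  /* hypothetical 28-bit sub-range */
#define FLONUM_BASE  UINT32_C(0xD0000000)  /* hypothetical 28-bit sub-range */
#define REGION_MASK  UINT32_C(0xF0000000)
#define PAYLOAD_MASK UINT32_C(0x0FFFFFFF)

/* Anything at or above 0xC0000000 is a tagged value, not a real pointer. */
static inline int is_tagged(uint32_t v) { return v >= TAGGED_BASE; }
static inline int is_fixnum(uint32_t v) { return (v & REGION_MASK) == FIXNUM_BASE; }
static inline int is_flonum(uint32_t v) { return (v & REGION_MASK) == FLONUM_BASE; }

static inline uint32_t box_fixnum(int32_t i) {
    assert(i >= -(1 << 27) && i < (1 << 27));  /* must fit in 28 signed bits */
    return FIXNUM_BASE | ((uint32_t)i & PAYLOAD_MASK);
}

static inline int32_t unbox_fixnum(uint32_t v) {
    uint32_t p = v & PAYLOAD_MASK;
    /* sign-extend from bit 27 */
    return (int32_t)(p ^ UINT32_C(0x08000000)) - 0x08000000;
}

/* Assumed encoding: a 32-bit float with its low 4 mantissa bits dropped. */
static inline uint32_t box_flonum(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return FLONUM_BASE | (bits >> 4);
}

static inline float unbox_flonum(uint32_t v) {
    uint32_t bits = (v & PAYLOAD_MASK) << 4;
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

the nice property is that the "is this a pointer?" check is a single unsigned compare, and ordinary pointers need no alignment constraint at all.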
on x86-64, a 56-bit glob of address space was used for tagged values
(IIRC, starting at something like 0x7F000000_00000000), which was
in turn divided into around 256 48-bit regions.
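as a sketch of how such a layout decomposes (taking the 0x7F top byte from the address recalled above at face value; the helper names are invented for this example):

```c
#include <assert.h>
#include <stdint.h>

#define TAGGED_PREFIX UINT64_C(0x7F)  /* top byte of every tagged value */
#define PAYLOAD_BITS  48
#define PAYLOAD_MASK  ((UINT64_C(1) << PAYLOAD_BITS) - 1)

/* A value is tagged if it falls inside the 56-bit glob at 0x7F00...00. */
static inline int is_tagged(uint64_t v) {
    return (v >> 56) == TAGGED_PREFIX;
}

/* The next 8 bits select one of the 256 48-bit regions. */
static inline unsigned region_of(uint64_t v) {
    assert(is_tagged(v));
    return (unsigned)((v >> PAYLOAD_BITS) & 0xFF);
}

/* The low 48 bits carry the actual value. */
static inline uint64_t payload_of(uint64_t v) {
    return v & PAYLOAD_MASK;
}

static inline uint64_t make_tagged(unsigned region, uint64_t payload) {
    assert(region < 256 && payload <= PAYLOAD_MASK);
    return (TAGGED_PREFIX << 56) | ((uint64_t)region << PAYLOAD_BITS) | payload;
}
```

for example, region 3 with payload 0x123456789ABC comes out as 0x7F03123456789ABC.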
on current HW though, a person could probably get by with a larger 60 or
62-bit region.
say, a 60 bit region starting at 0x70000000_00000000, probably using the
next 4 bits as a tag, giving 16 regions each being 56 bits.
or, more extreme:
0x40000000_00000000 to 0xBFFFFFFF_FFFFFFFF is used as a single giant
63-bit region, with 3 more bits as tags, allowing for up to 8 60-bit
regions.
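the "more extreme" variant could be sketched as below. since the region does not start on a power-of-2 boundary of its own size, the base is subtracted before extracting the tag. all of the names are hypothetical, and this is just one way the split could be done:

```c
#include <assert.h>
#include <stdint.h>

#define REGION_BASE    UINT64_C(0x4000000000000000)
#define REGION_SIZE    UINT64_C(0x8000000000000000)  /* 2^63 values */
#define PAYLOAD60_MASK ((UINT64_C(1) << 60) - 1)

/* Unsigned wraparound makes this a single range check:
   true for 0x40000000_00000000 through 0xBFFFFFFF_FFFFFFFF. */
static inline int is_tagged(uint64_t v) {
    return (v - REGION_BASE) < REGION_SIZE;
}

/* After subtracting the base, bits 60..62 are the 3-bit tag (0..7). */
static inline unsigned tag_of(uint64_t v) {
    assert(is_tagged(v));
    return (unsigned)((v - REGION_BASE) >> 60);
}

/* The low 60 bits are untouched by the base offset. */
static inline uint64_t payload_of(uint64_t v) {
    return v & PAYLOAD60_MASK;
}

static inline uint64_t make_tagged(unsigned tag, uint64_t payload) {
    assert(tag < 8 && payload <= PAYLOAD60_MASK);
    return REGION_BASE + ((uint64_t)tag << 60) + payload;
}
```

tag 0 with payload 0 lands on 0x40000000_00000000, and tag 7 with an all-ones payload lands on 0xBFFFFFFF_FFFFFFFF, matching the range given above.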
but, even with 16 bits shaved off, a double is still plenty accurate for
most things IME.
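to put a number on that: dropping the low 16 bits of a double leaves the sign, the 11 exponent bits, and 36 mantissa bits, so the relative error from truncation stays below 2^-36 (about 1.5e-11). a minimal sketch, assuming the flonum is stored as the double's bits shifted right by 16, placed in a made-up region 0x01 under the 0x7F prefix:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define FLONUM_PREFIX UINT64_C(0x7F01)  /* hypothetical: glob 0x7F, region 0x01 */
#define PAYLOAD_MASK  ((UINT64_C(1) << 48) - 1)

static inline int is_flonum(uint64_t v) {
    return (v >> 48) == FLONUM_PREFIX;
}

/* Keep the top 48 bits of the double: sign, exponent, 36 mantissa bits. */
static inline uint64_t box_flonum(double d) {
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);
    return (FLONUM_PREFIX << 48) | (bits >> 16);
}

/* Shift the payload back up; the 16 dropped bits come back as zeros. */
static inline double unbox_flonum(uint64_t v) {
    uint64_t bits;
    double d;
    assert(is_flonum(v));
    bits = (v & PAYLOAD_MASK) << 16;
    memcpy(&d, &bits, sizeof d);
    return d;
}
```

values like 1.5 or -0.25 round-trip exactly, and something like pi comes back good to roughly 10 or 11 significant decimal digits.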
note that this stuff is not used for addressable memory objects though,
which use a different strategy for identifying types:
namely, identifying type either by memory region (ex: cons cells or
interned strings) or by tags within the memory-object headers (most
other allocated memory). granted, technically, these are themselves
based on address regions (for example, the MM/GC will check against
relevant "heap chunks" or similar, when relevant).
note that the MM/GC does not require pointers to point at the start of
memory objects (it can find the start of the memory object by itself),
and it can also determine when an address is outside of the areas it knows about.
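a much-simplified sketch of that kind of lookup follows. it assumes each heap chunk holds fixed-size cells, which is a simplification made for the example (a real MM/GC would more likely consult per-chunk bitmaps or object headers), and the struct and function names are invented:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical chunk descriptor: every object in a chunk has the same size. */
typedef struct {
    uintptr_t base;       /* address of the first byte of the chunk */
    size_t    size;       /* chunk size in bytes */
    size_t    cell_size;  /* size of each object in the chunk */
} heap_chunk;

/* Map any address (including an interior pointer) to the start of the
   object containing it; return 0 if no known chunk contains the address. */
static inline uintptr_t find_object_start(const heap_chunk *chunks, size_t n,
                                          uintptr_t addr) {
    for (size_t i = 0; i < n; i++) {
        const heap_chunk *c = &chunks[i];
        assert(c->cell_size > 0);
        if (addr >= c->base && addr - c->base < c->size) {
            size_t off = addr - c->base;
            return c->base + (off / c->cell_size) * c->cell_size;
        }
    }
    return 0;  /* not an address the allocator knows about */
}
```

a linear scan is fine for illustration; with many chunks, a sorted table or a page-indexed lookup would be the more plausible choice.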
pretty much all types are (canonically) identified by "type names" (as
strings), rather than by tag bits or similar (although tag values are
used internally).
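as a toy illustration of the idea, internal tag values can be mapped to type-name strings, with type checks done by name. the table contents and function names here are entirely made up:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table: internal tag value -> canonical type name. */
static const char *const type_names[] = { "fixnum", "flonum", "cons", "string" };
#define NUM_TAGS (sizeof type_names / sizeof type_names[0])

/* Look up the canonical name for an internal tag; NULL if unknown. */
static inline const char *type_name_of(unsigned tag) {
    return tag < NUM_TAGS ? type_names[tag] : NULL;
}

/* Type checks are phrased in terms of the name, not the tag value. */
static inline int has_type(unsigned tag, const char *name) {
    const char *n = type_name_of(tag);
    assert(name != NULL);
    return n != NULL && strcmp(n, name) == 0;
}
```

the tag numbers can then be reshuffled freely without anything outside the table caring.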
or such...