dragan
Wouldn't it be nice if one byte was well-defined as 32 bits? Please list the
pros and cons of such a hypothetical thing in this thread.
It isn't hypothetical on some platforms and it would be a royal pain in
dragan said: Wouldn't it be nice if one byte was well-defined as 32 bits? Please list the
pros and cons of such a hypothetical thing in this thread.
Some portable standards like MPEG or Ogg avoided the word byte, since
its historic meaning was 'storage for one (Latin) character'. They
defined 'octet' instead. However, the meaning of byte has silently
changed with time and is now the same as octet. This is a common
evolution in living languages.
Marcel said: Hi,
pros: none
reason: nothing changes, you only use another terminology. You still
need 8 bits where 8 bits are required, and so on.
cons: a lot of confusion
reason: most people use 'byte' as synonym for 8 bits and not for any
type that could be of arbitrary length.
So why do you want to change the terminology? Or what do you expect to
change with that? There are already thousands of CPU designs around that
use 32 bits (or even more) as the machine word size.
Some portable standards like MPEG or Ogg avoided the word byte, since
its historic meaning was 'storage for one (Latin) character'. They
defined 'octet' instead.
dragan said: Wouldn't it be nice if one byte was well-defined as 32 bits? Please list the
pros and cons of such a hypothetical thing in this thread.
Yeah, it would most definitely be a great thing! When we change that we
could also change "chair" to "car", "car" to "window" and "window" to
"chair". Maybe change the meaning of "seven" and "thirteen", but I'm
still unsure.
dragan said: Wouldn't it be nice if one byte was well-defined as 32 bits? Please
list the pros and cons of such a hypothetical thing in this thread.
dragan said: I guess I'll have to help you all with the concept. A key part of this
exercise is the use of imagination. Apparently none of the responders have
any (hehe). Maybe you'll give it another go with this further
information, which I did not want to give right away so that I wouldn't be
coloring the responses in any way. So, herein is some more information.
The key word in the OP is 'hypothetical'.
There are not many constraints in the OP. I was hoping that someone would
have IMAGINED what everything would be like today if a byte was 32 bits
from the start, but all scenarios are fair game: there are no limits (or
there weren't? First responders have beaten the dead-horse cliches?). That
probably gets some imaginations going. (?) All who responded wrongly
assumed that the OP was suggesting changing the definition of "byte".
Perhaps the OP then is kind of like one of those drawings that contain
multiple scenes depending on how you are processing the information (and
who you are also, no doubt: your values, experiences and such): the pretty
girl or the witch-like woman.
That's probably enough information for a second round, if you want to "try
again" or "try" for the first time (there are no wrong responses, though
the ones so far have been cliche or lame or something IMO). So, let your
imagination run wild and post a response!
Code which assumes that UCHAR_MAX is a small number will blow up.
    int char_translation_table[UCHAR_MAX + 1]; // oops!
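To make the point above concrete, here is a minimal sketch (not from the thread; `CharTranslation`, its members and `flat_table_is_affordable` are made-up names). A flat table with one entry per unsigned char value needs UCHAR_MAX + 1 elements, which is 256 on an octet machine but 2^32 if CHAR_BIT were 32. Storing only the non-identity mappings avoids tying the table size to UCHAR_MAX, at the cost of a hash lookup.

```cpp
#include <climits>
#include <unordered_map>

// Hypothetical sparse translation table: stores only the characters that
// actually map to something else, so its size does not depend on UCHAR_MAX.
class CharTranslation {
public:
    void set(unsigned char from, int to) { map_[from] = to; }

    // Returns the mapped value, or the character itself when unmapped.
    int translate(unsigned char c) const {
        auto it = map_.find(c);
        return it == map_.end() ? static_cast<int>(c) : it->second;
    }

private:
    std::unordered_map<unsigned char, int> map_;
};

// True when a flat array indexed by unsigned char is affordable,
// i.e. when a byte is an octet (256 entries).
constexpr bool flat_table_is_affordable() { return CHAR_BIT == 8; }
```

Code that wants the fast flat array could check `flat_table_is_affordable()` (or a `static_assert` on CHAR_BIT) and fall back to the sparse form otherwise.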
Communication with other machines and file formats is
difficult, because the rest of the world is ``byte == octet''.
When you open a binary file on this platform, you are reading
32 bit unsigned chars. If the file came from another system,
what happens to its bytes? When we read a 32-bit byte of
a JPEG file, are we getting four octets of its header? In what
order?
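The "in what order?" question can be sketched as follows. This is only an illustration with assumed names (`Word`, `pack_big_endian`, `pack_little_endian`); it says nothing about how any real 32-bit-char platform maps file octets onto its bytes. The same four octets give two different 32-bit values depending on which convention the I/O layer picks.

```cpp
#include <cstdint>

// At least 32 bits wide on every conforming platform, whatever CHAR_BIT is.
using Word = std::uint_least32_t;

// Each parameter carries one octet in its low 8 bits; higher bits are ignored.
// The first octet read ends up in the most significant position.
inline Word pack_big_endian(Word o0, Word o1, Word o2, Word o3) {
    return ((o0 & 0xFFu) << 24) | ((o1 & 0xFFu) << 16) |
           ((o2 & 0xFFu) << 8)  |  (o3 & 0xFFu);
}

// The first octet read ends up in the least significant position.
inline Word pack_little_endian(Word o0, Word o1, Word o2, Word o3) {
    return  (o0 & 0xFFu)        | ((o1 & 0xFFu) << 8) |
           ((o2 & 0xFFu) << 16) | ((o3 & 0xFFu) << 24);
}
```

For the octets FF D8 FF E0 (a JPEG file starts with FF D8; the following FF E0 is what a typical JFIF file has), the first function yields 0xFFD8FFE0 and the second 0xE0FFD8FF, so a file format has to state the order explicitly.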
BGB / cr88192 said: well, it is worth noting also that, among programmers, the SJ temperament
is apparently very common (some research was apparently done which found
it to be more common than the NT temperament among programmers, which is
still more common than NF, with SP in last place...).
myself included, as an ESTJ apparently (MBTI, LSE is the socionics
equivalent...).
so, what then is the SJ stereotype?
errm... not having whole lots of imagination, and maybe worrying about
rules and conventions (or, oddly, some things say tending towards a
systematic and materialistic worldview, ...), ...
there are pros and cons I guess, or one can also assert that people are
people and can be whoever they want to be, granted...
ok then:
you would either need way the hell more RAM;
otherwise, teh craploads of shifting and masking to access any smaller
members (such as, what we currently call bytes...).
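A rough sketch of the shifting and masking mentioned above, assuming the smallest addressable unit is 32 bits wide and octets are numbered from the least significant end (both the numbering and the names `get_octet` / `set_octet` are my own choices, not anything from the thread):

```cpp
#include <cassert>
#include <cstdint>

using Word = std::uint_least32_t;

// Read octet number `index` (0 = least significant) out of a 32-bit unit.
inline Word get_octet(Word word, unsigned index) {
    assert(index < 4);
    return (word >> (8 * index)) & 0xFFu;
}

// Return a copy of `word` with octet number `index` replaced by `value`.
// Only the low 8 bits of `value` are used.
inline Word set_octet(Word word, unsigned index, Word value) {
    assert(index < 4);
    const unsigned shift = 8 * index;
    const Word cleared = word & ~(Word{0xFF} << shift);  // zero the old octet
    return cleared | ((value & 0xFFu) << shift);         // drop in the new one
}
```

Every access to a "small" member costs a shift and a mask on read, and a read-modify-write on store, which is the overhead being complained about.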
not even using 16 bits for character strings is all that compelling. why?
because, in the average case, more space is wasted than would be
used with UTF-8, even though the extended characters take more bytes.
this is because for "most" text, the characters fit nicely in 1 or 2
bytes. it is then mostly just asian languages which need more bytes, as
most of the other non-latin alphabets (greek, cyrillic, arabic, ...)
happen to fall nicely in the 2-byte break-even range.
it can also be observed that, in many uses, even for common forms of asian
text, many latin/... characters are present, and so may reduce the overall
cost of the expansion of the non-latin chars (or even improve on the net
size).
consider, for example, Chinese or Japanese source code, or even HTML,
where the vast majority of the characters are typically latin (either part
of the source-code character set, or as markup tags, ...).
hence, in the general case, UTF-8 may be a denser encoding on average than
UTF-16 (for some hypothetical open-ended collection of text).
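The size argument can be checked with a small sketch that counts octets per code point under each encoding. The names (`utf8_octets`, `utf16_octets`, `EncodedSizes`, `encoded_sizes`) are illustrative, and the input is assumed to contain only valid Unicode scalar values (no surrogates, nothing above U+10FFFF).

```cpp
#include <cstddef>
#include <string>

// Octets needed to encode one code point in UTF-8.
inline std::size_t utf8_octets(char32_t cp) {
    if (cp < 0x80) return 1;      // ASCII
    if (cp < 0x800) return 2;     // Greek, Cyrillic, Arabic, ...
    if (cp < 0x10000) return 3;   // rest of the BMP, including most CJK
    return 4;                     // supplementary planes
}

// Octets needed to encode one code point in UTF-16.
inline std::size_t utf16_octets(char32_t cp) {
    return cp < 0x10000 ? 2 : 4;  // one code unit, or a surrogate pair
}

struct EncodedSizes {
    std::size_t utf8;
    std::size_t utf16;
};

// Total encoded size of a string under both encodings.
inline EncodedSizes encoded_sizes(const std::u32string& text) {
    EncodedSizes sizes{0, 0};
    for (char32_t cp : text) {
        sizes.utf8 += utf8_octets(cp);
        sizes.utf16 += utf16_octets(cp);
    }
    return sizes;
}
```

Three CJK characters alone cost 9 octets in UTF-8 against 6 in UTF-16, but wrapped in `<p>...</p>` markup the totals become 16 against 20, which is the mixed-text effect described above. Greek text breaks even at 2 octets per character in both.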
similarly, there is little to be gained WRT performance,
since most modern CPUs handle memory in larger units anyways (I think many
newer processors generally work with data 128 bits at a time,
Then those people are confused anyway, since that was never the definition
of "byte", neither in C++, nor in general.
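For what the language itself promises, a short sketch: the number of bits in a byte is whatever CHAR_BIT says, it is guaranteed to be at least 8, and sizeof(char) is 1 by definition. The helper `bits_in` is a made-up name for illustration.

```cpp
#include <climits>
#include <cstddef>

// These hold on every conforming implementation, octet-based or not.
static_assert(CHAR_BIT >= 8, "a byte has at least 8 bits");
static_assert(sizeof(char) == 1, "sizeof(char) is 1 by definition");

// Storage width of T in bits (counts any padding bits as well).
template <typename T>
constexpr std::size_t bits_in() {
    return sizeof(T) * CHAR_BIT;
}
```

On a platform where a byte is 32 bits, `bits_in<char>()` would be 32 and `sizeof(long)` could legitimately be 1.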
That is an interesting conclusion; I see KB, MB, GB etc. used all
over the place by everybody! Some hard-core folks have been
trying to introduce KiB, MiB, etc. to replace the deceptive
KB, MB, GB, TB and the ilk (especially hard-drive
manufacturers have found the distinction between 10^3 and 2^10
very useful), but I digress.
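The 10^3 versus 2^10 gap is easy to put numbers on. A small sketch with assumed helper names (`decimal_units`, `binary_units`), where `power` is 1 for kilo/kibi, 2 for mega/mebi, and so on:

```cpp
#include <cstdint>

// 1000^power: the decimal (SI) meaning of K, M, G, T.
inline std::uint_least64_t decimal_units(unsigned power) {
    std::uint_least64_t result = 1;
    for (unsigned i = 0; i < power; ++i) result *= 1000u;
    return result;
}

// 1024^power: the binary meaning (Ki, Mi, Gi, Ti). Fits in 64 bits up to power 6.
inline std::uint_least64_t binary_units(unsigned power) {
    std::uint_least64_t result = 1;
    for (unsigned i = 0; i < power; ++i) result *= 1024u;
    return result;
}
```

A drive sold as "1 TB" in the decimal sense holds `decimal_units(4) / binary_units(3)` whole GiB, which comes to 931, and the gap widens with every step up in prefix.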
"just pointing out the obvious"
Most modern transmission protocols are defined in terms of octets
(although in the past, I worked on one which used 5 bit
elements).
I was hoping that someone would
have IMAGINED what everything would be like today if a byte was 32-bits from
the start,
The obvious what? I've only seen KB, MB, etc. used with regards
to a specific hardware: my PC has 4MB main memory, etc. On most
specific hardware, the bytes have a specific size (8 bits on a
PC, although on the old PDP-10, the size of a byte was
programmable, with both 7 and 9 bit bytes being commonly used).
And on machines which aren't byte addressable, you probably
won't hear KB, MB etc. being used.