bytes calculation

Pietro Cerutti · Jul 30, 2007

Mark said:
My guess is that if Richard H says it invokes undefined behaviour, it
does. I'm a little disappointed that he has not felt fit to explain why
and give chapter and verse, but that's his prerogative.

I agree on both your first and second sentence.

However, the undefined behaviour on the implementations you happen to
use (or indeed most or even all implementations) may be for the code to
do what you expect.

Yes, I'm not assuming that my code is correct only because it gives the
result I was expecting.

Richard Heathfield · Jul 30, 2007

jacob navia said:

Please, are you a lawyer?

Can you tell me of a machine where char_bit != 8 ?

Assuming you mean CHAR_BIT, the use of 16- and even 32-bit bytes is
commonplace in modern digital signal processors. SHARC DSPs are an
obvious example.

And please, a machine in use 2007 ok?

It is not unlikely that you use several such processors yourself,
whether you realise it or not. DSPs are in common use in mobile phones,
for example.

jacob navia · Jul 30, 2007

Mark said:
My guess is that if Richard H says it invokes undefined behaviour, it
does. I'm a little disappointed that he has not felt fit to explain why
and give chapter and verse, but that's his prerogative.

However, the undefined behaviour on the implementations you happen to
use (or indeed most or even all implementations) may be for the code to
do what you expect.

You agree with me then, that a shift right of 32 bits should give
you the upper 4 bytes ok?

I am getting nuts with all the language lawyers around.

Here we enter into the realms of philosophy and personal taste - Mr
Heathfield wishes for all C code to be provably correct (loosely
speaking) for a general abstract C implementation which sticks to the
letter of the language specification,

Typical legalese of lawyers' taste.

while Mr Navia will accept C code

working on the sample of machines he happens to work with.

What?

If sizeof(unsigned long long) is 8 and sizeof(int) is 4,
(and CHAR_BIT is 8)

unsigned long long >> 32 gives the upper 32 bits,
that can be safely stored into a 32 bit integer!

This in all machines as far as I can see.

By the way, I am using this code in
PowerPC (64 bits)
Solaris (Sparc)
64 bit windows
64 bit linux
32 bit windows
32 bit linux

[Disclaimer - I do not speak for either party here, and my comments
above are my interpretations of their positions]

OK, but I tell you: my position is that a right shift should do
it in ANY machine. You disagree with that???

You yourself you are unable to say why it should NOT work.
You have used (surely) this code.

Incredible, how lawyers scare people with legalese!

Spoon · Jul 30, 2007

Jacob said:
Please, are you a lawyer?
Can you tell me of a machine where char_bit != 8 ?
And please, a machine in use 2007 ok?

cf. comp.arch.embedded

Nick Keighley · Jul 30, 2007

Please, are you a lawyer?

you say that like it's a bad thing...

Note: sizeof returns the number of chars in an object (or type)
*not* the number of bits.

Can you tell me of a machine where char_bit != 8 ?

And please, a machine in use 2007 ok?

But like you I'm puzzled as to why a 32-bit shift
on a 64-bit quantity doesn't yield the top 32-bits.
Unless I misunderstood something...

Mark Bluemel · Jul 30, 2007

Richard Heathfield wrote:
....

Here's a solution that I believe to be correct - I don't have a C99
compiler, so I can't check it (because of the unsigned long long) - but
corrections are most welcome:

Good job too.

[snip]

/* first ensure that a result is possible */
if(sizeof scnlen_lo < 4 || scnlen_hi < 4)

This should surely be:-

if(sizeof scnlen_lo < 4 ||sizeof scnlen_hi < 4)

And given the issues we meet later, I'd be inclined to add parentheses
(but then I'm often inclined to add parentheses):-

if((sizeof scnlen_lo < 4) ||(sizeof scnlen_hi < 4))

....

if(scnlen & mask <= UINT_MAX)

if((scnlen & mask) <= UINT_MAX)

....

if(scnlen & mask <= INT_MAX)

if((scnlen & mask) <= INT_MAX)

....

Pietro Cerutti · Jul 30, 2007

Richard said:
Pietro Cerutti said:

Under those assumptions and assuming CHAR_BIT is 8, sclen_hi now
actually contains the top 28 bytes of scnlen, which isn't what was
asked.

If you meant those to be bit counts rather than byte counts, your code
invokes undefined behaviour.

You seem not to be willing to help me understand why my code invokes
undefined behavior.
Maybe you're expecting some more work on my side, which I've done:
My source is n1124.

According to "6.5.7", the shift operators invoke undefined behavior only
if the right operand is negative (not the case in my code) or greater than
or equal to the width of the promoted left operand (not the case in my code).
Moreover, the left shift operator (which I do not use) invokes undefined
behavior if the result is not representable in the result type.

According to "6.5.10", the bitwise AND operator never causes undefined behavior.

So, please, could you point me to where the problem is?
Thank you in advance.

jacob navia · Jul 30, 2007

Richard said:
jacob navia said:

Assuming you mean CHAR_BIT, the use of 16- and even 32-bit bytes is
commonplace in modern digital signal processors. SHARC DSPs are an
obvious example.

It is not unlikely that you use several such processors yourself,
whether you realise it or not. DSPs are in common use in mobile phones,
for example.

1) ADSP-TS201 TigerSHARC Processor Programming Reference 1-11 cites
bytes as a processor data type. (8 bit bytes)

2) The "expand" instruction can use explicitly 8 bit values as
input. ADSP-TS201 TigerSHARC Processor Programming Reference 10-41

3) The Add instruction can add bytes in parallel, as documented in
ADSP-TS201 TigerSHARC Processor Programming Reference 3-18

4) data in the registers can be accessed as 8 bit bytes.
<quote>
To select the operand size within a register file register, a register
name prefix selects a size that is equal or less than the size of the
register. These operand size prefixes for fixed-point data work as
follows.
• B – indicates byte (8-bit) word data. The data in a single 32-bit
register is treated as four 8-bit words. Example register names with
byte word operands are BR1, BR1:0, and BR3:0.
<end quote>

And you tell me that bytes are not recognized by that processor?

Richard Heathfield · Jul 30, 2007

Mark Bluemel said:

My guess is that if Richard H says it invokes undefined behaviour, it
does.

That's a good guess, but on this occasion you're mistaken. The code does
not in fact invoke undefined behaviour, and I was mistaken to claim
otherwise. I therefore owe apologies to Mr Cerutti and Mr Navia.

I'm a little disappointed that he has not felt fit to explain
why and give chapter and verse, but that's his prerogative.

My reasoning (which would have worked just fine if I'd been right about
the UB) was that Mr Navia should have been perfectly capable of working
out the answer for himself and announcing it by himself. This turns out
to be impossible, of course, because I was mistaken in my claim, and
therefore incorrectly "corrected" a more or less correct post, in which
there was no UB to spot.

Now that I've taken my lumps, I'll explain what I thought was wrong. It
was the right shift by 32 bits. This is *not* undefined behaviour
because it is performed on an integer that is guaranteed to occupy more
than 32 bits (at least 64, in fact).

To make matters worse, I checked the code *twice* after spotting the
"bug" (as I thought it was), before posting my "correction".

The assumptions by both Mr Navia and Mr Cerutti that bytes are always 8
bits wide, whilst incorrect, pale in comparison with my own blunder.

<snip>

And now I suppose it's time for the clowns to have their field day. Pace
regs - I shall not be reading their responses.

Richard Heathfield · Jul 30, 2007

Mark Bluemel said:

Richard Heathfield wrote:
...

Good job too.

<Three corrections snipped>

Today is not being a good day, is it?

Thank you for the corrections.

Ben Pfaff · Jul 30, 2007

Richard Heathfield said:
Pietro Cerutti said:

If you meant those to be bit counts rather than byte counts, your code
invokes undefined behaviour.

Let me expand on this. Consider the first assignment:
sclen_hi = scnlen >> 32;
scnlen has type "unsigned long long", which is at least 64 bits
wide. Thus, shifting it right by a mere 32 bits is
well-defined. The result is then assigned to sclen_hi, which has
type "unsigned int". Regardless of whether "unsigned int" is
large enough to hold 32 bits, the assignment is acceptable: any
high bits will simply be trimmed off.

Consider the second assignment:
sclen_lo = scnlen & 0xFFFFFFFF;
I believe that 0xFFFFFFFF has type "int", "unsigned int",
"unsigned long int", or "unsigned long long int", depending.
Regardless of its type, the "&" operation against scnlen will
cause it to be promoted to "unsigned long long int". The result
is assigned to sclen_lo.

The possibility for undefined behavior seems to be in the range
of "int", which is the type of sclen_lo. If "int" is 32 bits,
then converting a large 32-bit unsigned integer to "int" invokes
undefined behavior, because it is outside the range of "int". I
imagine that this is what Richard is talking about.

Another possibility is that Richard is concerned about the
possibility of padding bits in various types. Padding bits are
interesting theoretically but rarely come up in real life, so I'm
not going to go any further in this direction.

The example number of 888888 would not yield undefined behavior,
as far as I can see. I don't think that's the issue at hand.

Mark Bluemel · Jul 30, 2007

jacob said:
You agree with me then, that a shift right of 32 bits should give
you the upper 4 bytes ok?

Given the full set of constraints - 8-bit bytes, 64-bit unsigned long
longs, 32-bit signed and unsigned ints - yes, I agree it should work.
However, I admit that my understanding is not complete, especially when
we look at the fine detail of the language specification.

As I said above, if Richard H says that this invokes undefined
behaviour, based on my experiences of his postings I'm inclined to
believe him, though I'd rather like to see chapter and verse.

I am getting nuts with all the language lawyers around.

Typical legalese of lawyers' taste.

This is not legalese - or at least I did not intend it to be so. I was
trying to accurately characterise Richard H's position, which is a
purist position relying only on what the language specification asserts.

What?

If sizeof(unsigned long long) is 8 and sizeof(int) is 4,
(and CHAR_BIT is 8)

unsigned long long >> 32 gives the upper 32 bits,
that can be safely stored into a 32 bit integer!

This in all machines as far as I can see.

By the way, I am using this code in
PowerPC (64 bits)
Solaris (Sparc)
64 bit windows
64 bit linux
32 bit windows
32 bit linux

This is the point where we need to wheel out the DS9K I suspect.

Other posters have already pointed out platforms where some of the
initial constraints are broken.

[Disclaimer - I do not speak for either party here, and my comments
above are my interpretations of their positions]

OK, but I tell you: my position is that a right shift should do
it in ANY machine. You disagree with that???

You yourself you are unable to say why it should NOT work.

I have stated above that I recognise that my knowledge is limited, and
I'm more inclined to trust Richard's informed judgement than mine on
this point.

You have used (surely) this code.

Code I work with uses such constructions, but it is code which is
regarded as platform-specific, not as totally portable.

jacob navia · Jul 30, 2007

Ben said:
Let me expand on this. Consider the first assignment:
sclen_hi = scnlen >> 32;
scnlen has type "unsigned long long", which is at least 64 bits
wide. Thus, shifting it right by a mere 32 bits is
well-defined. The result is then assigned to sclen_hi, which has
type "unsigned int". Regardless of whether "unsigned int" is
large enough to hold 32 bits, the assignment is acceptable: any
high bits will simply be trimmed off.

Consider the second assignment:
sclen_lo = scnlen & 0xFFFFFFFF;
I believe that 0xFFFFFFFF has type "int", "unsigned int",
"unsigned long int", or "unsigned long long int", depending.
Regardless of its type, the "&" operation against scnlen will
cause it to be promoted to "unsigned long long int". The result
is assigned to sclen_lo.

The possibility for undefined behavior seems to be in the range
of "int", which is the type of sclen_lo.

No, it is *unsigned* int, so the assignment of an unsigned long long
is perfectly legal, see the message where Heathfield apologizes
for his mistake in this thread

If "int" is 32 bits,
then converting a large 32-bit unsigned integer to "int" invokes
undefined behavior, because it is outside the range of "int". I
imagine that this is what Richard is talking about.

This is the same mistake as Heathfield. The declaration for
sclen_lo is unsigned int.

Another possibility is that Richard is concerned about the
possibility of padding bits in various types. Padding bits are
interesting theoretically but rarely come up in real life, so I'm
not going to go any further in this direction.

I have always seen those arguments when lawyers do not find
any more arguments.

The example number of 888888 would not yield undefined behavior,
as far as I can see. I don't think that's the issue at hand.

Yes, it is not the issue at hand.

Richard · Jul 30, 2007

Richard Heathfield said:
Mark Bluemel said:

That's a good guess, but on this occasion you're mistaken. The code does
not in fact invoke undefined behaviour, and I was mistaken to claim
otherwise. I therefore owe apologies to Mr Cerutti and Mr Navia.

My reasoning (which would have worked just fine if I'd been right about
the UB) was that Mr Navia should have been perfectly capable of working
out the answer for himself and announcing it by himself. This turns out
to be impossible, of course, because I was mistaken in my claim, and
therefore incorrectly "corrected" a more or less correct post, in which
there was no UB to spot.

Now that I've taken my lumps, I'll explain what I thought was wrong. It
was the right shift by 32 bits. This is *not* undefined behaviour
because it is performed on an integer that is guaranteed to occupy more
than 32 bits (at least 64, in fact).

To make matters worse, I checked the code *twice* after spotting the
"bug" (as I thought it was), before posting my "correction".

The assumptions by both Mr Navia and Mr Cerutti that bytes are always 8
bits wide, whilst incorrect, pale in comparison with my own blunder.

<snip>

And now I suppose it's time for the clowns to have their field day. Pace
regs - I shall not be reading their responses.

You should. Thou shalt reap what you sow.

FWIW, there is nothing wrong with making mistakes. There is something
very wrong, however, when sneering at people and acting like a little
princess and being wrong all in one.

But kudos for admitting your mistake.

Charles Bailey · Jul 30, 2007

Let me expand on this. Consider the first assignment:
sclen_hi = scnlen >> 32;
scnlen has type "unsigned long long", which is at least 64 bits
wide. Thus, shifting it right by a mere 32 bits is
well-defined. The result is then assigned to sclen_hi, which has
type "unsigned int". Regardless of whether "unsigned int" is
large enough to hold 32 bits, the assignment is acceptable: any
high bits will simply be trimmed off.

Consider the second assignment:
sclen_lo = scnlen & 0xFFFFFFFF;
I believe that 0xFFFFFFFF has type "int", "unsigned int",
"unsigned long int", or "unsigned long long int", depending.
Regardless of its type, the "&" operation against scnlen will
cause it to be promoted to "unsigned long long int". The result
is assigned to sclen_lo.

The possibility for undefined behavior seems to be in the range
of "int", which is the type of sclen_lo. If "int" is 32 bits,
then converting a large 32-bit unsigned integer to "int" invokes
undefined behavior, because it is outside the range of "int". I
imagine that this is what Richard is talking about.

I think that you have the original types of scnlen_hi and scnlen_lo
swapped. As far as I can see the assignment scnlen_lo = scnlen should
set scnlen_lo to the lower bits of scnlen due to the way that unsigned
conversions work. This may be exactly what is required if the width
of unsigned int is 4 times the width of an unsigned char.

I think that there is, indeed, the possibility of undefined behaviour
in the scnlen_hi part of the algorithm as you reasoned for scnlen_lo.

Richard · Jul 30, 2007

Mark Bluemel said:
This is not legalese - or at least I did not intend it to be so. I was
trying to accurately characterise Richard H's position, which is a
purist position relying only on what the language specification
asserts.

How can you accurately characterise his position when he didn't post any
justification - just an incomplete, buggy and long winded replacement
which he has since acknowledged is wrong?

Richard Heathfield · Jul 30, 2007

Ben Pfaff said:

The possibility for undefined behavior seems to be in the range
of "int", which is the type of sclen_lo. If "int" is 32 bits,
then converting a large 32-bit unsigned integer to "int" invokes
undefined behavior, because it is outside the range of "int". I
imagine that this is what Richard is talking about.

Well, it wasn't. Whether you are right about the UB is a matter on
which, for today at least, I think I should remain silent - I don't
think my foot can take any more bullets.

John Smith · Jul 30, 2007

Mark said:
Would it do anyone any harm if you were to a) explain why this is the
case, and b) explain how to solve the original poster's problem correctly?

No, but RH is c.l.c's official Jacob Navia baiter and in that
capacity he's not obligated to explain anything. Sounds like a
homework question anyway.

JS

jacob navia · Jul 30, 2007

John said:
No, but RH is c.l.c's official Jacob Navia baiter and in that capacity
he's not obligated to explain anything. Sounds like a homework question
anyway.

JS

What is funny (or maybe sad) is how people that are surely
competent programmers start doubting the most elementary
stuff when a lawyer starts telling them:

this is illegal... You can be prosecuted/sued for this.

Richard Heathfield · Jul 30, 2007

John Smith said:

RH is c.l.c's official Jacob Navia baiter

If that is true, I hereby resign the position. I'm not interested in
baiting Mr Navia. When I have replied to him, it is to correct his
mistakes (although in this very thread I seem to have managed to
uncorrect an unmistake, for a change).

But I'm weary of being pilloried for correcting his stupidities,
misconceptions, and false assumptions.

The obvious course of action for me is simply to killfile him - and so
that is what I will now do. Let other people correct his many blunders.
The alternative, of course, is to let those blunders go uncorrected.
