Can't Convert BYTE to WORD?

Bryan Parkoff

I have two variables: "char A" and "short B". I can convert from A to B
with an explicit cast such as "B = short(A);" with no problem. Now I have
two other variables: "char T[6]" and "short A". T is an array of six
elements. I want to capture the first and second elements as two bytes
packed into one word (a short).
The problem is that "A" captures only one element instead of two. I
looked at the generated machine code and found that the C++ compiler uses
MOV EAX, BYTE PTR [T] instead of MOV EAX, WORD PTR [T], which looks like
the wrong instruction to me.
Is there a way to fix this in my source code with an explicit cast? I
tried dynamic_cast<>, but it gives the same result.
My example code is below.

Bryan Parkoff

int main(void)
{
    unsigned char T[6] = { "Bryan" }; // I chose unsigned char for string instead of char.
    unsigned short A;

    A = unsigned short (*T); // Should capture "Br"

    return 0;
}
 
Victor Bazarov

Bryan said:
I have two variables: "char A" and "short B". I can convert from A to B
with an explicit cast such as "B = short(A);" with no problem.

Actually, AFAIK, there is no need to be explicit. Implicit conversion
should work just fine:

B = A;

Bryan said:
The problem is that "A" captures only one element instead of two. [...]
Is there a way to fix this in my source code with an explicit cast?

No, you need an arithmetic (or bit manipulation) expression.

Bryan said:
A = unsigned short (*T); // Should capture "Br"

Why should it? You say here, essentially,

A = unsigned short(T[0]);

so it does as you ask, only takes the first one. You should do something
like

A = (T[0] << CHAR_BIT) | T[1];

(or vice versa depending on where in A you want the 'B' and where the 'r')

V
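
The shift-and-OR suggestion above can be sketched as two small functions, one for each placement of the bytes. This is only an illustration: the helper names pack_high_first and pack_low_first are made up here, and the values mentioned afterwards assume ASCII and 8-bit bytes.

```cpp
#include <cassert>
#include <climits>

// Hypothetical helper (not from the thread): p[0] goes into the high byte,
// p[1] into the low byte.
unsigned short pack_high_first(const unsigned char* p)
{
    assert(p != nullptr);
    return static_cast<unsigned short>((p[0] << CHAR_BIT) | p[1]);
}

// Hypothetical helper (not from the thread): p[0] goes into the low byte,
// p[1] into the high byte.
unsigned short pack_low_first(const unsigned char* p)
{
    assert(p != nullptr);
    return static_cast<unsigned short>((p[1] << CHAR_BIT) | p[0]);
}
```

With T holding "Bryan" in ASCII ('B' is 0x42, 'r' is 0x72), pack_high_first(T) gives 0x4272 and pack_low_first(T) gives 0x7242. Both results are the same on every machine, because the byte order is chosen by the expression rather than by the hardware.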
 
red floyd

Bryan said:
A = unsigned short (*T); // Should capture "Br"

*T is an unsigned char, so you have the equivalent of:
unsigned char C = 'B';
unsigned short A = unsigned short(C);

Why would you expect anything else?
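
To make that point concrete, here is a minimal sketch of what the questioned expression does. The function name first_element_only is hypothetical, and static_cast is used in place of the original "unsigned short (*T)" spelling, which is not accepted by every compiler.

```cpp
#include <cassert>

// Hypothetical helper: mirrors "A = unsigned short (*T)" from the question.
unsigned short first_element_only(const unsigned char* t)
{
    assert(t != nullptr);
    // *t is the same as t[0]: one unsigned char value.
    // The conversion widens that single value; t[1] is never read.
    return static_cast<unsigned short>(*t);
}
```

For T holding "Bryan", the result equals the character code of 'B' and nothing more, which matches the BYTE PTR load seen in the generated code.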
 
Default User

Bryan said:
A = unsigned short (*T); // Should capture "Br"


Seems like you are thinking that what you have will start copying at
the pointer value into the short. It doesn't.

If you did want to do that, you could use memcpy().


memcpy (&A, T, 2);


However, you have to be sure that's the byte order you want and all
that.

It would be helpful if you explained exactly what you are trying to do,
as copying two characters into a short isn't all that typical of an
operation.



Brian
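
A minimal sketch of the memcpy() approach, wrapped in a function so the byte-order caveat is easy to see. The name load_native is an assumption for illustration, and the caller is assumed to pass at least sizeof(unsigned short) readable bytes.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: copies the first sizeof(unsigned short) bytes of src
// into an unsigned short, in whatever byte order the machine uses.
unsigned short load_native(const unsigned char* src)
{
    assert(src != nullptr);
    unsigned short value = 0;
    std::memcpy(&value, src, sizeof value);
    return value;
}
```

Assuming a 2-byte short and ASCII, load_native on "Bryan" yields 0x7242 on a little-endian machine and 0x4272 on a big-endian one. That difference is the "byte order you want" caveat mentioned above.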
 
Bryan Parkoff

Default User said:
If you did want to do that, you could use memcpy().


memcpy (&A, T, 2);


However, you have to be sure that's the byte order you want and all
that.

It would be helpful if you explained exactly what you are trying to do,
as copying two characters into a short isn't all that typical of an
operation.

Brian,

Thank you for the information. I am sure that memcpy() works, but I
want an x86 instruction that loads a WORD instead of a BYTE, and the C++
compiler only emits the BYTE form. That is how the C++ compiler works. If
there is no other solution, I will be forced to use an __asm block. I do
not want to shift the first byte left and then "or" in the second byte to
turn two bytes into a word. SHIFT and ROTATE are costly on the Intel
Pentium 4 because they take about 7 clock cycles rather than one.
The variable T is defined as a BYTE array, but I want to force the C++
compiler to capture two bytes from it. Please look at my C++ source code
and my comments below.

// Example 1
int main(void)
{
    unsigned char T[6] = { "Bryan" }; // I chose unsigned char for string instead of char.
    unsigned short A; // C++ Compiler assigns "movzx EAX, BYTE PTR [T]" because T is defined as BYTE ARRAY.

    A = unsigned short (*T); // Should capture "Br"

    return 0;
}

// Example 2
int main(void)
{
    unsigned char T[6] = { "Bryan" }; // I chose unsigned char for string instead of char.
    unsigned short A; // C++ Compiler assigns "movzx EAX, BYTE PTR [T]" because T is defined as BYTE ARRAY.

    // A = unsigned short (*T); // Should capture "Br"
    __asm
    {
        movzx EAX, WORD PTR [T] // Remove "movzx EAX, BYTE PTR [T]" from C++ Compiler
        mov WORD PTR [A], AX
    }

    return 0;
}

Bryan Parkoff
 
red floyd

Bryan said:
[redacted]

1. How a specific compiler generates code is OT.
2. 80x86 ASM is OT (the __asm keyword).
3. To do what you want, you need to drop into the realm of UB
(specifically, cast &T[0] to an unsigned short * and dereference that).
4. What, specifically, are you attempting to do, where memcpy() would be
insufficient (whereas it is defined behavior and portable, save for
endian issues)?
 
Bryan Parkoff

Todd Brylski said:
A = *((unsigned short *)T);

Todd,

You got the right answer. Thank you very much. It helps me a lot with
performance.

Bryan Parkoff
 
Old Wolf

Bryan said:
"Todd Brylski":
A = *((unsigned short *)T);

Note that those brackets are unnecessary:
A = *(unsigned short *)T;

Bryan said:
You got the right answer. Thank you very much. It helps me a
lot to save performance.

I guess that depends on what you mean by 'performance'. This code
is non-portable, e.g. it may crash the program on Sun hardware.

Try this one:
memcpy(&A, T, sizeof A);
 
Mike Smith

Bryan said:
int main(void)
{
    unsigned char T[6] = { "Bryan" }; // I chose unsigned char for string instead of char.
    unsigned short A;

    A = unsigned short (*T); // Should capture "Br"

    return 0;
}

I think what you're looking for might have been:

A = *((unsigned short *)T);

But this is naughty as it assumes that sizeof(short) == 2, and that your
machine architecture is little-endian.
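
Those two assumptions can be checked rather than left implicit. The sketch below is one possible way to do it; is_little_endian and expected_native_word are hypothetical names, and the byte-order probe assumes 8-bit bytes with no padding bits in unsigned short.

```cpp
#include <cassert>
#include <climits>
#include <cstring>

// Turns the "short is two bytes" assumption into a compile-time error
// on platforms where it does not hold.
static_assert(sizeof(unsigned short) == 2, "sketch assumes a 2-byte unsigned short");

// Hypothetical helper: true when the lowest-addressed byte of a short
// holds its least significant bits.
bool is_little_endian()
{
    const unsigned short probe = 1;
    unsigned char first_byte = 0;
    std::memcpy(&first_byte, &probe, 1);
    assert(first_byte == 0 || first_byte == 1); // a 2-byte value has only two layouts
    return first_byte == 1;
}

// Hypothetical helper: the value a native 2-byte load of {first, second}
// produces on the current machine.
unsigned short expected_native_word(unsigned char first, unsigned char second)
{
    return is_little_endian()
        ? static_cast<unsigned short>((second << CHAR_BIT) | first)
        : static_cast<unsigned short>((first << CHAR_BIT) | second);
}
```

On a little-endian machine expected_native_word(0x42, 0x72) is 0x7242, and on a big-endian machine it is 0x4272, so code that depends on one particular value has to pick the byte order explicitly.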
 
Bryan Parkoff

Old Wolf said:
Note that those brackets are unnecessary:
A = *(unsigned short *)T;


I guess that depends what you mean by 'performance'. This code
is non-portable, eg. it may crash the program on Sun hardware.

Try this one:
memcpy(&A, T, sizeof A);

Well, I ran a loop 0x10000000 times. With the left shift and "or" it
takes 400 ms on a Pentium III, but with the WORD pointer instead of the
left shift and "or" it takes 370 ms.
I doubt that it would crash on other, non-x86 machines unless the
difference between little endian and big endian comes into play. I don't
accept the memcpy() function because it takes too many cycles, which can
degrade performance, whereas the WORD pointer uses only two x86
instructions.
I will try to test on a non-x86 machine later.

Bryan Parkoff
 
Old Wolf

Bryan said:
I doubt that it may crash on other non-x86 machine

On some machines, the program crashes if you try to read a short
out of an odd-numbered memory location. This is called "alignment".
Sun hardware is one such platform.

Of course, you might be lucky and end up with T being an even-
numbered address.
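
As a rough illustration of the alignment point, the sketch below reads a short from an arbitrary offset with memcpy(), which has no alignment requirement. The names is_aligned_for_short and load_at are hypothetical, and checking alignment by converting a pointer to std::uintptr_t is a common convention rather than something the language guarantees in detail.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: true if p meets the alignment requirement of
// unsigned short on this platform.
bool is_aligned_for_short(const void* p)
{
    return reinterpret_cast<std::uintptr_t>(p) % alignof(unsigned short) == 0;
}

// Hypothetical helper: reads an unsigned short starting at buffer[offset].
// The copy works for odd offsets too, whereas
// *(unsigned short *)(buffer + offset) may trap on strict-alignment hardware.
unsigned short load_at(const unsigned char* buffer, std::size_t offset)
{
    assert(buffer != nullptr);
    unsigned short value = 0;
    std::memcpy(&value, buffer + offset, sizeof value);
    return value;
}
```

For example, load_at(T, 1) reads the bytes 'r' and 'y' of "Bryan" even though T + 1 is unlikely to satisfy is_aligned_for_short when T itself does.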
 
Rolf Magnus

Bryan said:
Well, I ran a loop 0x10000000 times. With the left shift and "or" it
takes 400 ms on a Pentium III, but with the WORD pointer instead of the
left shift and "or" it takes 370 ms.

So? This seems like a negligible difference to me. Considering that your
final program is likely to do other things than just copying char values
into short variables, I'd be surprised if there is any noticeable
difference at all.

Bryan said:
I doubt that it would crash on other, non-x86 machines unless the
difference between little endian and big endian comes into play.

You doubt it, but you don't know why it might crash, do you?

Bryan said:
I don't accept the memcpy() function because it takes too many cycles,
which can degrade performance,

It does? That's strange. On my compiler, memcpy is extremely fast. How big
is the difference on your system?

Bryan said:
whereas the WORD pointer uses only two x86 instructions.

My compiler transforms the memcpy call into two assembler instructions on
x86.
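
One way to answer the "how big is the difference" question is to time the candidates directly. The following is a rough sketch only: sum_pairs_memcpy and elapsed_ms are hypothetical names, and a serious measurement would also need repeated runs and care that the optimizer does not remove the work.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstring>

// Hypothetical helper: adds up every adjacent byte pair in data,
// loading each pair with memcpy.
unsigned long sum_pairs_memcpy(const unsigned char* data, std::size_t size)
{
    assert(data != nullptr);
    unsigned long total = 0;
    for (std::size_t i = 0; i + sizeof(unsigned short) <= size; ++i) {
        unsigned short value = 0;
        std::memcpy(&value, data + i, sizeof value);
        total += value;
    }
    return total;
}

// Hypothetical helper: runs work() once and returns the elapsed
// wall-clock time in milliseconds.
template <typename Work>
double elapsed_ms(Work work)
{
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

A caller could wrap sum_pairs_memcpy and a shift-and-OR equivalent in lambdas, pass each to elapsed_ms over the same large buffer, and compare the two figures with optimization enabled.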
 
Default User

Bryan said:
Thank you for the information. I am sure that memcpy() works, but I
want an x86 instruction that loads a WORD instead of a BYTE, and the C++
compiler only emits the BYTE form. That is how the C++ compiler works. If
there is no other solution, I will be forced to use an __asm block. I do
not want to shift the first byte left and then "or" in the second byte to
turn two bytes into a word. SHIFT and ROTATE are costly on the Intel
Pentium 4 because they take about 7 clock cycles rather than one.

It seems to me that you are deep in premature-optimization land. Your
"test" that you mention in another post is far from conclusive.

You should use the construct that is clearest and most portable until
such time as you have a demonstrated (not just desired) need for
something else.

Library routines such as memcpy() are usually highly optimized for an
implementation. Invoking undefined behavior with a substitute while
chasing a (possibly illusory) performance advantage is a fool's gambit.

It is possible to switch the problem around safely, that is to cast the
short* to a char* and stuff the two bytes that way. I doubt it's of
much advantage, nor will it likely give you that particular assembly
instruction set you seem to crave.

At any rate, you are rapidly heading off into waters that are off-topic
for this newsgroup.



Brian
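
The "switch the problem around" idea mentioned above might look like the sketch below: the short is addressed through an unsigned char pointer and filled byte by byte, which the language permits. The name store_bytewise is hypothetical, and the result still depends on the machine's byte order.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: fills the bytes of an unsigned short one at a time
// through an unsigned char pointer, which may alias any object.
unsigned short store_bytewise(const unsigned char* src)
{
    assert(src != nullptr);
    unsigned short value = 0;
    unsigned char* bytes = reinterpret_cast<unsigned char*>(&value);
    for (std::size_t i = 0; i < sizeof value; ++i) {
        bytes[i] = src[i];
    }
    return value;
}
```

Assuming a 2-byte short and ASCII, store_bytewise on "Bryan" gives 0x7242 on a little-endian machine and 0x4272 on a big-endian one, the same values a memcpy() of two bytes would produce.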
 
