What would you think about this method of string copy?

Good Guy

To achieve higher performance when copying strings, especially when the
string has a lot of characters (a lot more than 12), do you think the
method below violates any rule?

unsigned char one[12]={1,2,3,4,5,6,7,8,9,10,11,12};
unsigned char two[12];
unsigned int (& three)[3]=reinterpret_cast<unsigned int (&)[3]>(one);
unsigned int (& four)[3]=reinterpret_cast<unsigned int (&)[3]>(two);
for (unsigned int i=0;i<3;i++)
  four[i]=three[i];

Some people say that, because there are copy functions such as memcpy,
using such a method is never right, but I disagree.
 
Niklas Holsti

Most memcpy functions that I have looked at do such "chunking"
optimizations internally, and can do them better, because they can use
processor-specific chunks that are wider than an "int" and can take into
account alignments and the left-over characters at the end of the
strings when the length is not a multiple of the chunk size.

So I would agree with the "some people" and prefer memcpy, unless you
know that your system's memcpy is stupid. Moreover, your "int" loop will
fail the moment someone changes the length of the strings during
maintenance, while the memcpy solution keeps working.
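
As a rough sketch of the memcpy approach (the helper name copy_block is made up for illustration and is not from the original post), the copy below takes its length from the array type, so it should keep working when the length is not a multiple of sizeof(unsigned int):

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: copies N bytes between two arrays of the same length.
// N is deduced from the array type, so nothing needs editing when the
// arrays are resized during maintenance.
template <std::size_t N>
void copy_block(unsigned char (&dst)[N], const unsigned char (&src)[N])
{
    assert(dst != src);        // memcpy requires non-overlapping ranges
    std::memcpy(dst, src, N);  // the library chooses the chunk size and handles tail bytes
}
```

With the arrays from the question, the call would be copy_block(two, one).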
 
Richard Kettlewell

Your approach may actually have *worse* performance. For instance GCC
does a better job of optimizing fixed-size memcpy() than your loop.
 
Goran Pusic

Niklas Holsti said:
Most memcpy functions that I have looked at do such "chunking"
optimizations internally, and can do them better, because they can use
processor-specific chunks that are wider than an "int" and can take into
account alignments and the left-over characters at the end of the
strings when the length is not a multiple of the chunk size.

So I would agree with the "some people" and prefer memcpy, unless you
know that your system's memcpy is stupid.

That's not enough. One needs to know that memcpy is stupid __and__
that it being stupid (meaning slow) actually matters for the code at
hand __and__ that it's not realistically possible to get needed speed
elsewhere (or simply by using a better memcpy).

Goran.
 
Niklas Holsti

Agreed, agreed. I was assuming that the OP had a real problem.
 
James Kanze

Well, it's undefined behavior, according to the standard. And
it doesn't work on Sparcs and many other processors (causes core
dumps), and runs slower than the standard memcpy on Linux Intel.

Other than that...
 
SG

Why is it UB exactly?  Creating a reference to an object of different
type is fine according to the standard:

"That is, a reference
cast reinterpret_cast<T&>(x) has the same effect as the
conversion *reinterpret_cast<T*>(&x) with the built-in &
and * operators (and similarly for reinterpret_cast<T&&>(x)).
The result refers to the same object as the source lvalue,
but with a different type."

Don't forget §3.10/15:

"If a program attempts to access the stored value of an object
through an lvalue of other than one of the following types the
behavior is undefined
- the dynamic type of the object,
- a cv-qualified version of the dynamic type of the object,
- a type similar (as defined in 4.4) to the dynamic type of the
object,
- a type that is the signed or unsigned type corresponding to the
dynamic type of the object,
- a type that is the signed or unsigned type corresponding to a
cv-qualified version of the dynamic type of the object,
- an aggregate or union type that includes one of the
aforementioned types among its members (including, recursively,
a member of a subaggregate or contained union),
- a type that is a (possibly cv-qualified) base class type of the
dynamic type of the object,
- a char or unsigned char type."

Cheers!
SG
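
To illustrate the last bullet of that list, here is a minimal sketch (the helper byte_at is a hypothetical name, not something from the thread) of an access that the quoted rule permits: reading the bytes of an unsigned int through an unsigned char lvalue. The reverse direction, reading unsigned char storage through an unsigned int lvalue as in the original question, is not covered by any bullet.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: returns byte i of the object representation of value.
// Access through unsigned char is allowed by the "char or unsigned char"
// bullet of 3.10/15.
unsigned char byte_at(const unsigned int& value, std::size_t i)
{
    assert(i < sizeof value);  // stay inside the object
    return reinterpret_cast<const unsigned char*>(&value)[i];
}
```

Which index holds the low-order byte depends on the implementation's byte order, so only the set of bytes, not their order, is portable.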
 
James Kanze

Why is it UB exactly?

Because it causes the program to crash on some machines?

Creating a reference to an object of different
type is fine according to the standard:
"That is, a reference
cast reinterpret_cast<T&>(x) has the same effect as the conversion
*reinterpret_cast<T*>(&x) with
the built-in & and * operators (and similarly for
reinterpret_cast<T&&>(x)). The result refers to the
same object as the source lvalue, but with a different type."

I don't have my copy of the standard where I can look at it at
the moment, but I'm pretty sure you'll find that reference
conversions obey the same rules as pointer conversions. And
that the same holds for accesses through the resulting
references. In other words, the only accesses allowed are those
to the original type (here, unsigned char) or to a character
type. (As I say, I don't have my copy handy, but IIRC, the
critical text is somewhere in the lifetime of object or the
object model text. Something about accessing an object through
an lvalue expression of the type.)

In practice, of course, the real problem with modern machines is
alignment: there's no guarantee that an unsigned char will be
aligned correctly to be accessed as an unsigned int. On
a Sparc, a misaligned access will cause the program to core
dump, and on an Intel, it will slow things down considerably.
(I think the implementation of memcpy under Linux takes this
into account when it "optimizes".)
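
One way to sidestep the alignment problem, sketched here under the assumption that the goal is to read an unsigned int out of a byte buffer (load_uint is a hypothetical helper name), is to copy the bytes into a real unsigned int with memcpy instead of dereferencing a cast pointer:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: reads an unsigned int stored at byte offset `offset`
// of a buffer of `size` bytes. memcpy has no alignment requirement, so this
// is valid even when buf + offset is not suitably aligned for unsigned int.
unsigned int load_uint(const unsigned char* buf, std::size_t size, std::size_t offset)
{
    assert(offset <= size && size - offset >= sizeof(unsigned int));
    unsigned int value;
    std::memcpy(&value, buf + offset, sizeof value);
    return value;
}
```

The bytes at that offset still have to form a valid unsigned int representation, which matters on the machines with illegal representations mentioned in this post.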

Since you don't like to admit that anything but Intel exists,
you can skip the following. But I've worked on machines where
an int* wouldn't fit into a char*. And on at least one machine
still being sold, ints (and unsigned int) have illegal
representations; the compiler will AND out (mask off) the irrelevant bits
when assigning an unsigned int (which means that the resulting
unsigned char won't be equal). All of these considerations are
what motivated the undefined behavior.
 
Joshua Maurice

Please ignore my two previous posts; a post on moderated newsgroup
confused me.

I assume that you're talking about the recent thread in which I have
participated. I can't speak to the version that you know that uses a
char*, but I would like to caution others. It doesn't matter if you
cast through a char*. What matters is how you access the object. Ex:

  #include <iostream>
  int main()
  {
    int* x = new int;

    //fine, though potentially dangerous
    char* c = reinterpret_cast<char*>(x);

    //fine because of the char exception in 3.10 / 15
    std::cout << *c;

    /*not UB (undefined behavior) yet (?), but it is nonsensical
    because the result cannot be used in any meaningful way.
    At the very least, it is implementation dependent because short
    may have a different alignment requirement than char.
    */
    short* s = reinterpret_cast<short*>(c);

    /*UB, accessing an object through an lvalue of the wrong type
    */
    std::cout << *s;
  }
 
Joshua Maurice

Actually, technically, according to what I just wrote, I'm wrong. I
realized it just after I hit submit.

std::cout << *s;
is still undefined behavior. Actually, so is
std::cout << *c;
for the same reason: both read uninitialized memory, so UB. Let's take
this example instead:

  #include <iostream>
  int main()
  {
    int* x = new int(1);

    //fine, though potentially dangerous
    char* c = reinterpret_cast<char*>(x);

    //fine because of the char exception in 3.10 / 15
    std::cout << *c;

    /*not UB (undefined behavior) yet (?), but it is nonsensical
    because the result cannot be used in any meaningful way.
    At the very least, it is implementation dependent because short
    may have a different alignment requirement than char.
    */
    short* s = reinterpret_cast<short*>(c);

    /*UB, accessing an object through an lvalue of the wrong type
    */
    std::cout << *s;
  }
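
If the goal really is to look at the first sizeof(short) bytes of the int as a short, a sketch that avoids the wrong-typed lvalue copies the bytes instead. The helper name leading_bytes_as_short is invented for this illustration, and the value it returns depends on the implementation's byte order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: copies the first sizeof(short) bytes of some storage
// into a short object. No short lvalue ever refers to the original object,
// so the aliasing rule in 3.10 / 15 is not involved.
short leading_bytes_as_short(const void* storage, std::size_t size)
{
    assert(size >= sizeof(short));  // do not read past the end of the storage
    short result;
    std::memcpy(&result, storage, sizeof result);
    return result;
}
```

For the program above, the call would be leading_bytes_as_short(x, sizeof *x).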

Moreover, let's also consider this example, with descriptions correct
as far as I can tell from reading the standard, and as to its intent
and how the compiler writers interpret it:

  int main()
  {
    int * x = new int;

    //fine, but probably stupid,
    short * s = reinterpret_cast<short*>(x);

    /* As far as I can tell from reading the standard, this is
    actually fine. We have a piece of storage, so we can create a new POD
    object in that storage by writing to that storage through an lvalue of
    that POD type.
    */
    *s = 1;

    /* UB (undefined behavior).
    An int object does not exist at that storage - a short object
    does, so you're reading an object through an lvalue of the wrong
    type.

    Instead, if we wrote to the storage through an int lvalue instead
    of reading from the storage, then we would have reused the storage and
    created a new int object, and we would not have undefined behavior.
    */
    return *x;
  }
 
Joshua Maurice

Stroustrup's pool allocator casts from a char pointer (which points to
within a char array) to a "Link" pointer and uses (writes) to storage
referred to by this pointer.  As "Link" is POD, is this undefined
behaviour or not?  The char/unsigned char UB exclusion only applies in
the other direction as far as I can tell (accessing a stored value of an
object through an lvalue of char/unsigned char type).

I would really want to see the whole code before commenting. This is
rather tricky stuff.

I waited a couple of days to see if I would be corrected in
comp.lang.c++.moderated, but I actually got a reply from a respected guy
agreeing with me that it's the most reasonable interpretation, so I feel
better.

In short, if Link has a trivial constructor (that is, it has no
user-declared constructor, no member with a non-trivial constructor,
and no base with a non-trivial constructor), then you can start
the lifetime of a Link object by simply writing to some storage
through a Link lvalue. Any object in the storage has its lifetime
ended, and any subsequent read on that storage must be through a Link
lvalue, a char or unsigned char lvalue, or one of the other exceptions
in 3.10 / 15 - until such time as the Link object's lifetime ends, which
can only happen when the storage is released or reused to hold
another object.
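
For readers who would rather not rely on that interpretation, a more conservative sketch uses placement new to start the lifetime explicitly. The struct Link below is a hypothetical stand-in for the pool allocator's type, and make_link is an invented helper, so treat this as an illustration rather than Stroustrup's actual code.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical stand-in for the pool allocator's "Link": a POD with a
// trivial constructor.
struct Link
{
    Link* next;
};

// Starts the lifetime of a Link in raw storage with placement new, which
// sidesteps the question of whether a plain write through a Link lvalue
// is enough.
// Assumption: storage holds at least sizeof(Link) bytes and is suitably
// aligned for Link; neither is checked here.
Link* make_link(void* storage, Link* next)
{
    assert(storage != 0);
    Link* link = new (storage) Link;  // a Link object now lives in the storage
    link->next = next;
    return link;
}
```

Storage obtained from new unsigned char[n] is suitably aligned for any object that fits in it, so chunks at multiples of sizeof(Link) inside such an array satisfy the alignment assumption.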

Of course, there are many more restrictions than this: you can't reuse
the storage of a static or stack const object; the storage of a stack
object with a non-trivial destructor must contain an object of that
type when the implicit automatic destructor call happens (so you could
reuse the storage for a different object as long as you put an object
of the original type back before the implicit destructor call, which I
strongly advise against); and so on. See the post in
comp.lang.c++.moderated:

http://groups.google.com/group/comp.lang.c++.moderated/browse_thread/thread/bf18512b5b848c0b#
Newsgroups: comp.lang.c++.moderated
Followup-To: comp.lang.c++.moderated
From: Alberto Griggio
Date: Fri, 19 Nov 2010 14:17:45 CST
Local: Fri, Nov 19 2010 12:17 pm
Subject: understanding strict aliasing

PS: I'm still not quite clear on the intent and practice of reading
uninitialized data through a char or unsigned char lvalue. Some
sections say you can, one says you can't, and I don't know what
actually happens on real machines with trap values, rare as those are.
 
Joshua Maurice

Whilst I don't disagree that this is the case could you please cite the
part of the standard that codifies this?  I cannot find anything in 3.8
of the 0x draft standard (which is all I have available).

It doesn't. This has been my interpretation of intent and
reasonableness of some very badly phrased sections in the standard.
Please see the comp.lang.c++.moderated thread.

In short, 3.8 Object Lifetime [basic.life] reads:

1 The lifetime of an object is a runtime property of the object. The
lifetime of an object of type T begins
when:
— storage with the proper alignment and size for type T is obtained,
and
— if T is a class type with a non-trivial constructor (12.1), the
constructor call has completed.
The lifetime of an object of type T ends when:
— if T is a class type with a non-trivial destructor (12.4), the
destructor call starts, or
— the storage which the object occupies is reused or released.

This is frankly bullshit. It means that any such storage has a nearly
limitless number of objects co-existing in it. The only reasonable way
out seems to be to use the same fix as the union DR, which I've heard
is finally making it into the draft (maybe). (See that same
comp.lang.c++.moderated thread for some discussion. Someone there
mentions a recent thread on comp.std.c++.)

3.8 Object Lifetime / 1 should read:

1 The lifetime of an object is a runtime property of the object. The
lifetime of an object of type T begins
when:
— storage with the proper alignment and size for type T is obtained,
and
— if T is a class type with a non-trivial constructor (12.1), the
constructor call has completed, /+ or if T is not a class type with a
non-trivial constructor, a write was made to the storage through a T
lvalue +/.
The lifetime of an object of type T ends when:
— if T is a class type with a non-trivial destructor (12.4), the
destructor call starts, or
— the storage which the object occupies is reused or released.

On the bright side, no one has corrected me yet, and I've gotten some
support, so I'm going to run with it. This appears to be the only
option which actually works on common implementations, such as gcc
without -fno-strict-aliasing, and it appears to be the solution
consistent with the proposed union DR fix. (I think / I hope. I'm
planning on reading that shortly.)
 
Joshua Maurice

Meh. That's not sufficient, actually. You'd need to fix up the note
section of 3.8 / 2, and maybe other sections. I haven't done an
exhaustive reading on this. Also, I'm not sure if my proposed wording
is clear enough. A correct fix would need to cover cases such as:
  int main()
  {
    char* c = new char[sizeof(int)];
    int* x = reinterpret_cast<int*>(c);
    *x = 1;
  }
and
  int main()
  {
    int x = 1;
  }

I'm not sure if the initialization in the second case counts as "a
write to the storage through an int lvalue" in standardese.
 
Joshua Maurice

See
http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_active.html
1116. Aliasing of union members

Sweets.
 
