std::copy versus pointer copy

F

franky.backeljauw

Hello,

I am wondering which of these two methods is the fastest: std::copy, which
is included in the standard library, or a manually written pointer copy?
Do any of you have any experience with this?

I would think that the library function std::copy would perform optimally,
as it is a library function, and therefore the writers of this function
would know best how to optimize it ... but some tests seem to indicate
that my pointer copy function can do a lot better, especially when its
loop is vectorized. I have used g++ and icc with -O2, and icc with -O2
-march=pentiumiii -xK -tpp6 as compiler options.

Or is there another way to look at this? For completeness, I include my
test program:

<code>
#include <ctime>
#include <algorithm>
#include <iostream>

using namespace std;

typedef double Element;

class Vector
{
public:

Vector( unsigned int newsize ) { array = new Element[ newsize ]; size = newsize; }
~Vector() { if ( array ) delete( array ), array = 0; }

void init() { for ( unsigned int i = 0; i<size; i++ ) array[ i ] = i; }
void print() { for ( unsigned int i = 0; i<size; i++ ) cout << array[ i ] << " "; cout << endl; }

friend void iCopy( Element *dst, const Element *src, int n ) { int i = n + 1; while ( i-- > 0 ) *dst++ = *src++; }
friend void sCopy( Element *dst, const Element *src, int n ) { std::copy( src, src+n+1, dst ); }

unsigned int size;
Element *array;
};

int main()
{
Vector a( 10 ), b( 10 ), c( 10 ); a.init();
cout << "a : "; a.print();
cout << "b : "; b.print();
cout << "c : "; c.print();

iCopy( b.array, a.array, 7 );
cout << "b after iCopy : "; b.print();

sCopy( c.array, a.array, 7 );
cout << "c after sCopy : "; c.print();

clock_t start, stop, taken;

int VECTOR_SIZE = 5000000;

Vector d( VECTOR_SIZE ), e( VECTOR_SIZE );
d.init();

start = clock();
for ( int i = 0 ; i < 10; i++ ) iCopy( e.array, d.array, VECTOR_SIZE-10 );
stop = clock();
taken = ( stop - start ) / 1000;
cerr << "time for iCopy : " << taken << endl;

start = clock();
for ( int i = 0; i < 10; i++ ) sCopy( e.array, d.array, VECTOR_SIZE-1 );
stop = clock();
taken = ( stop - start ) / 1000;
cerr << "time for sCopy : " << taken << endl;

start = clock();
for ( int i = 0 ; i < 10; i++ ) iCopy( e.array, d.array, VECTOR_SIZE-10 );
stop = clock();
taken = ( stop - start ) / 1000;
cerr << "time for iCopy : " << taken << endl;

start = clock();
for ( int i = 0; i < 10; i++ ) sCopy( e.array, d.array, VECTOR_SIZE-1 );
stop = clock();
taken = ( stop - start ) / 1000;
cerr << "time for sCopy : " << taken << endl;

return 0;
}
</code>

Thanks for any reply.

Regards,

Franky B.
 
W

White Wolf

Hello,

I am wondering which of these two methods is the fastest: std::copy,
which is included in the standard library, or a manually written pointer
copy?
Do any of you have any experience with this?

Yes. The answer is: It depends. On the compiler, the flags, the library
implementation etc.

The rule of thumb: do not optimize, unless you have to. Go for the easiest
solution and if it proves to be a bottlneck, then try to find out the cause.
 
J

Josh Sebastian

Hello,

I am wondering which of these two methods is the fastest: std::copy, which
is included in the standard library, or a manually written pointer copy?
Do any of you have any experience with this?

friend void iCopy( Element *dst, const Element *src, int n )
{ int i = n + 1; while ( i-- >dst++ = *src++; }
friend void sCopy( Element *dst, const Element *src, int n )
{ std::copy(src, src+n+1, dst ); }

I think that std::copy is going to be /at least/ as fast as the hand-coded
copy. Alternatively, if your standard library implementation is clever
enough to meta-programatically determine that Element is POD, std::copy
could simply call std::memcpy, which might be faster.
I would think that the library function std::copy would perform optimally,
as it is a library function, and therefore the writers of this function
would know best how to optimize it ... but some tests seem to indicate
that my pointer copy function can do a lot better, especially when its
loop is vectorized. I have used g++ and icc with -O2, and icc with -O2
-march=pentiumiii -xK -tpp6 as compiler options.
YMMV.

start = clock();
for ( int i = 0 ; i < 10; i++ ) iCopy( e.array, d.array, VECTOR_SIZE-10 );
stop = clock();
taken = ( stop - start ) / 1000;
cerr << "time for iCopy : " << taken << endl;

Why are you dividing by 1000? Either divide by CLOCKS_PER_SEC or nothing
at all. Also, std::clock_t is integral; you probably want to be doing
floating-point division.

You didn't post your results. When I tested with GCC 3.3, I get basically
the same results for the two copies.

Josh
 
K

Kevin Goodsell

Hello,

I am wondering which of these two methods is the fastest: std::copy, which
is included in the standard library, or a manually written pointer copy?
Do any of you have any experience with this?

I would think that the library function std::copy would perform optimally,
as it is a library function, and therefore the writers of this function
would know best how to optimize it

I think you would probably find that it is not very optimized at all. It
is designed to be generic, so it can apply to objects that need to be
copied via their assignment operators (where a bit-wise copy would break
the objects horribly).
... but some tests seem to indicate
that my pointer copy function can do a lot better, especially when its
loop is vectorized. I have used g++ and icc with -O2, and icc with -O2
-march=pentiumiii -xK -tpp6 as compiler options.

Or is there another way to look at this? For completeness, I include my
test program:

<code>
#include <ctime>
#include <algorithm>
#include <iostream>

using namespace std;

typedef double Element;

class Vector
{
public:

Vector( unsigned int newsize ) { array = new Element[ newsize ]; size = newsize; }
~Vector() { if ( array ) delete( array ), array = 0; }

This is broken. delete is not a function, it's an operator. You also
must use the proper 'delete' operator for the 'new' operator you used,
so you need to use delete[] here.

As for the rest of it, it seems quite unnecessary... delete works fine
with null pointers, and 'array' is disappearing as soon as the
destructor completes, so setting it to 0 is rather pointless.
void init() { for ( unsigned int i = 0; i<size; i++ ) array[ i ] = i; }
void print() { for ( unsigned int i = 0; i<size; i++ ) cout << array[ i ] << " "; cout << endl; }

friend void iCopy( Element *dst, const Element *src, int n ) { int i = n + 1; while ( i-- > 0 ) *dst++ = *src++; }
friend void sCopy( Element *dst, const Element *src, int n ) { std::copy( src, src+n+1, dst ); }

This is bizarre (putting friend definitions inside the class), and I'm
not sure if it's legal.

Also, you are copying one too many elements with your std::copy. The
second argument should be src+n.

-Kevin
 
F

franky.backeljauw

Hello,

let me add another one: just a normal member copy, like dst=src. As
it seems from my test program, this method seems to be the fastest of all
.... any ideas why? Can it be optimized better by the compiler, as it is
easier to see what's going on? Or is there some penalty for the other
methods?

Thanks again for any reply.

<code>
#include <ctime>
#include <algorithm>
#include <iostream>

using namespace std;

typedef double Element;

class Vector
{
public:

Vector( unsigned int newsize ) { array = new Element[ newsize ]; size = newsize; }
~Vector() { if ( array ) delete( array ), array = 0; }

void init() { for ( unsigned int i = 0; i<size; i++ ) array[ i ] = i; }
void print() { for ( unsigned int i = 0; i<size; i++ ) cout << array[ i ] << " "; cout << endl; }

friend void iCopy( Element *dst, const Element *src, int n ) { int i = n + 1; while ( i-- > 0 ) *dst++ = *src++; }
friend void mCopy( Element *dst, const Element *src, int n ) { int i = n + 1; while ( i-- > 0 ) dst[ i ] = src; }
friend void sCopy( Element *dst, const Element *src, int n ) { std::copy( src, src+n+1, dst ); }

unsigned int size;
Element *array;
};

int main()
{
Vector a( 10 ), b( 10 ), c( 10 ); a.init();
cout << "a : "; a.print();
cout << "b : "; b.print();
cout << "c : "; c.print();

iCopy( b.array, a.array, 7 );
cout << "b after iCopy : "; b.print();

sCopy( c.array, a.array, 7 );
cout << "c after sCopy : "; c.print();

clock_t start, stop, taken;

int VECTOR_SIZE = 5000000;

Vector d( VECTOR_SIZE ), e( VECTOR_SIZE );
d.init();

start = clock();
for ( int i = 0 ; i < 10; i++ ) iCopy( e.array, d.array, VECTOR_SIZE-10 );
stop = clock();
taken = ( stop - start ) / 1000;
cerr << "time for iCopy : " << taken << endl;

start = clock();
for ( int i = 0 ; i < 10; i++ ) mCopy( e.array, d.array, VECTOR_SIZE-10 );
stop = clock();
taken = ( stop - start ) / 1000;
cerr << "time for mCopy : " << taken << endl;

start = clock();
for ( int i = 0; i < 10; i++ ) sCopy( e.array, d.array, VECTOR_SIZE-10 );
stop = clock();
taken = ( stop - start ) / 1000;
cerr << "time for sCopy : " << taken << endl;

start = clock();
for ( int i = 0 ; i < 10; i++ ) iCopy( e.array, d.array, VECTOR_SIZE-10 );
stop = clock();
taken = ( stop - start ) / 1000;
cerr << "time for iCopy : " << taken << endl;

start = clock();
for ( int i = 0 ; i < 10; i++ ) mCopy( e.array, d.array, VECTOR_SIZE-10 );
stop = clock();
taken = ( stop - start ) / 1000;
cerr << "time for mCopy : " << taken << endl;

start = clock();
for ( int i = 0; i < 10; i++ ) sCopy( e.array, d.array, VECTOR_SIZE-10 );
stop = clock();
taken = ( stop - start ) / 1000;
cerr << "time for sCopy : " << taken << endl;

return 0;
}
</code>

Regards,

Franky B.
 
F

franky.backeljauw


What does "YMMV" mean? Never heard it before ...
Why are you dividing by 1000? Either divide by CLOCKS_PER_SEC or nothing
at all. Also, std::clock_t is integral; you probably want to be doing
floating-point division.

Okay, I shall do that.
You didn't post your results. When I tested with GCC 3.3, I get basically
the same results for the two copies.

Here are some results for the three methods (see my second message):

<results>
$ icc stdcopy2.cpp -o stdcopy2 -O2 -march=pentiumiii -tpp6 -xK :
/opt/intel/compiler70/ia32/include/xlocnum(102) : (col. 64) remark: LOOP
WAS VECTORIZED.
/opt/intel/compiler70/ia32/include/xlocnum(103) : (col. 69) remark: LOOP
WAS VECTORIZED.
/opt/intel/compiler70/ia32/include/xlocnum(104) : (col. 67) remark: LOOP
WAS VECTORIZED.
$ ./stdcopy2
time for iCopy : 2090
time for mCopy : 2090
time for sCopy : 3090
time for iCopy : 2100
time for mCopy : 2090
time for sCopy : 3090

$ icc stdcopy2.cpp -o stdcopy2 -O2
$ ./stdcopy2
time for iCopy : 2100
time for mCopy : 2090
time for sCopy : 3090
time for iCopy : 2110
time for mCopy : 2100
time for sCopy : 3100

$ g++ stdcopy2.cpp -o stdcopy2 -O2
$ ./stdcopy2
time for iCopy : 2970
time for mCopy : 2090
time for sCopy : 3090
time for iCopy : 3010
time for mCopy : 1990
time for sCopy : 3070
</results>

According to the results I got, std::copy is the slowest, followed closely
by a pointer copy, but the member copy wins by quite a margin. Could you
run my second test program and check your results ... thanks!

Regards,

Franky B.
 
F

franky.backeljauw

Yes. The answer is: It depends. On the compiler, the flags, the
library implementation etc. The rule of thumb: do not optimize, unless
you have to. Go for the easiest solution and if it proves to be a
bottlneck, then try to find out the cause.

I am looking for the most optimized and thus fastest solution ...

Regards,

Franky B.
 
F

franky.backeljauw

I think you would probably find that it is not very optimized at all. It
is designed to be generic, so it can apply to objects that need to be
copied via their assignment operators (where a bit-wise copy would break
the objects horribly).

In an earlier thread somebody was saying that std::copy was the fastest
way of copying, so that's why I wanted to test it, as it is very important
for what I'm doing - a lot of copying, basically :). But my results seem
to indicate that is not very optimized indeed ...
~Vector() { if ( array ) delete( array ), array = 0; }

This is broken. delete is not a function, it's an operator. You also
must use the proper 'delete' operator for the 'new' operator you used,
so you need to use delete[] here.

Indeed - sorry, my mistake - blame it on the moonlight!
As for the rest of it, it seems quite unnecessary... delete works fine
with null pointers, and 'array' is disappearing as soon as the
destructor completes, so setting it to 0 is rather pointless.

True, once again ...
This is bizarre (putting friend definitions inside the class), and I'm
not sure if it's legal.

Maybe, but it's not of concern to this mail.
Also, you are copying one too many elements with your std::copy. The
second argument should be src+n.

The test in the beginning (copying a to b and c) seemed to show that
std::copy needed the +1 to copy the same amount of elements ...

Regards,

Franky B.
 
J

Juergen Heinzl

Hello,

I am wondering which of these two methods is the fastest: std::copy, which
is included in the standard library, or a manually written pointer copy?
Do any of you have any experience with this?

I would think that the library function std::copy would perform optimally,
as it is a library function, and therefore the writers of this function
would know best how to optimize it ... but some tests seem to indicate
that my pointer copy function can do a lot better, especially when its
loop is vectorized. I have used g++ and icc with -O2, and icc with -O2
-march=pentiumiii -xK -tpp6 as compiler options.
[-]
I seriously doubt your tests as std::copy() isn't a function but a template
and its actual implementation tends to result in either a simple loop or,
where possible, even a call to memmove() or some similar, optimized function.

You'd look what actually happens in algorithm,
Juergen
 
J

Josh Sebastian

What does "YMMV" mean? Never heard it before ...

Your mileage may vary.
According to the results I got, std::copy is the slowest, followed closely
by a pointer copy, but the member copy wins by quite a margin. Could you
run my second test program and check your results ... thanks!

I added a couple of tests: s2Copy (which is std::copy with the arithmetic
operations taken out of the timing) and bCopy (which calls memcpy to do a
bitwise copy). Here's my results:

curien@balar:~/prog$ g++ blah.cpp
curien@balar:~/prog$ ./a.out
a : 0 1 2 3 4 5 6 7 8 9
b : 0 0 0 0 0 0 0 0 0 0
c : 0 0 0 0 0 0 0 0 0 0
b after iCopy : 0 1 2 3 4 5 6 7 0 0
c after sCopy : 0 1 2 3 4 5 6 7 0 0
time for iCopy : 1600
time for mCopy : 1920
time for sCopy : 1500
time for s2Copy : 1490
time for bCopy : 1500
time for iCopy : 1580
time for mCopy : 1920
time for sCopy : 1490
time for s2Copy : 1500
time for bCopy : 1500

curien@balar:~/prog$ g++ -O1 blah.cpp
curien@balar:~/prog$ ./a.out
a : 0 1 2 3 4 5 6 7 8 9
b : 0 0 0 0 0 0 0 0 0 0
c : 0 0 0 0 0 0 0 0 0 0
b after iCopy : 0 1 2 3 4 5 6 7 0 0
c after sCopy : 0 1 2 3 4 5 6 7 0 0
time for iCopy : 1540
time for mCopy : 1430
time for sCopy : 1500
time for s2Copy : 1520
time for bCopy : 1490
time for iCopy : 1490
time for mCopy : 1330
time for sCopy : 1480
time for s2Copy : 1490
time for bCopy : 1490

curien@balar:~/prog$ g++ -O2 blah.cpp
curien@balar:~/prog$ ./a.out
a : 0 1 2 3 4 5 6 7 8 9
b : 0 0 0 0 0 0 0 0 0 0
c : 0 0 0 0 0 0 0 0 0 0
b after iCopy : 0 1 2 3 4 5 6 7 0 0
c after sCopy : 0 1 2 3 4 5 6 7 0 0
time for iCopy : 1510
time for mCopy : 1460
time for sCopy : 1490
time for s2Copy : 1480
time for bCopy : 1490
time for iCopy : 1480
time for mCopy : 1460
time for sCopy : 1510
time for s2Copy : 1490
time for bCopy : 1470

As you can see, different methods are faster, depending on optimization
settings, but the different is miniscule. Interestingly, I also show that
mCopy is the fastest.

Josh
 
W

White Wolf

I am looking for the most optimized and thus fastest solution ...

Then again: you are looking at the last steps of any possible optimization
(step #4). Did you do the previous step? Measure that this copy is really
a bottleneck of your application?

And the answer is still the same. It depends. It depends on the quality of
your compiler, your standard library implementation, the quality of their
interaction etc. In theory the standard library copy should be the fastest
possible implementation. If it is not, you may roll your own as well as
report to the implementors that their is non-optimal and what can they do to
make it better. But while doing so you will wonder into
implementation-dependent-land. Meaning that if you change compiler/standard
libarry implementation you may find that now your loop is inferior and
std::copy is faster. That is why I am saying: if it is an academic try just
staying with this answer: it depends. If you really really (swear on your
grave) have this as a serious bottleneck, you will need to find out the
cause of it. Since g++ is GNU, if you have found the cause and fixed it
professionally it (I am sure) will make it into the implementation. Until
it does, you may stick with a home-made solution.
 
K

Kevin Goodsell

<code>
#include <ctime>
#include <algorithm>
#include <iostream>

using namespace std;

typedef double Element;

class Vector
{
public:

Vector( unsigned int newsize ) { array = new Element[ newsize ]; size = newsize; }
~Vector() { if ( array ) delete( array ), array = 0; }

This is still wrong.
void init() { for ( unsigned int i = 0; i<size; i++ ) array[ i ] = i; }
void print() { for ( unsigned int i = 0; i<size; i++ ) cout << array[ i ] << " "; cout << endl; }

friend void iCopy( Element *dst, const Element *src, int n ) { int i = n + 1; while ( i-- > 0 ) *dst++ = *src++; }
friend void mCopy( Element *dst, const Element *src, int n ) { int i = n + 1; while ( i-- > 0 ) dst[ i ] = src; }
friend void sCopy( Element *dst, const Element *src, int n ) { std::copy( src, src+n+1, dst ); }


These are all going 1 element too far.
unsigned int size;
Element *array;
};

int main()
{
Vector a( 10 ), b( 10 ), c( 10 ); a.init();
cout << "a : "; a.print();
cout << "b : "; b.print();
cout << "c : "; c.print();

b and c have uninitialized memory that you are trying to print. This
gives undefined behavior.

return 0;
}
</code>

I had to make a number of changes to get sensible output. The result was
being truncated to 0. Here are the results (Visual C++ 6.0, default
project settings, original library with a few of the fixes published by
Dinkumware, none of which seem likely to affect this test):

No optimizations (debug build)
----
time for iCopy : 0.828
time for mCopy : 0.812
time for sCopy : 0.797
time for iCopy : 0.813
time for mCopy : 0.812
time for sCopy : 0.797

Normal optimizations (release build)
----
time for iCopy : 0.735
time for mCopy : 0.671
time for sCopy : 0.672
time for iCopy : 0.703
time for mCopy : 0.672
time for sCopy : 0.687

Not a terribly significant difference in any case. I wouldn't expect
this to be a bottleneck

-Kevin
 
W

White Wolf

What does "YMMV" mean? Never heard it before ...

The FAQ is your friend.

http://www.parashift.com/c++-faq-lite/how-to-post.html#faq-5.1

and in there you see:

http://www.astro.umd.edu/~marshall/abbrev.html
According to the results I got, std::copy is the slowest, followed
closely by a pointer copy, but the member copy wins by quite
a margin.

As you see: it depends. Now if this is really a serious problem in your
app, look at the code of std::copy in the g++ implementation and try to find
out why is it slow.
 
K

Kevin Goodsell

The test in the beginning (copying a to b and c) seemed to show that
std::copy needed the +1 to copy the same amount of elements ...

It's the other way around. The other tests are copying too many elements.

As for your issue, I would recommend writing your copy function to call
std::copy. Later, when everything's working, you can replace it with a
faster function if you feel it's necessary. But the "faster function"
might be slower on some systems, so you have to watch that. Simply using
std::copy is probably a good compromise anyway.

-Kevin
 
F

franky.backeljauw

Then again: you are looking at the last steps of any possible
optimization (step #4). Did you do the previous step? Measure that
this copy is really a bottleneck of your application?

As the application I am writing involves a large amount of array copies,
it is very important to make it work as fast as possible ...
And the answer is still the same. It depends. It depends on the
quality of your compiler, your standard library implementation, the
quality of their interaction etc. In theory the standard library copy
should be the fastest possible implementation. If it is not, you may
roll your own as well as report to the implementors that their is
non-optimal and what can they do to make it better.

Indeed, that was my first thought, namely that the implementors of the
library would know best how to make their function work optimally, as they
developed the library themselves.

On the other hand, as somebody else replied, it might not be very optimal,
since it is written with generality in mind, that is, it should work on
classes too ... while if you would write your own copying routine, it is
optimized for the application at hand (in this case, an array copy).
But while doing so you will wonder into implementation-dependent-land.
Meaning that if you change compiler/standard libarry implementation you
may find that now your loop is inferior and std::copy is faster. That
is why I am saying: if it is an academic try just staying with this
answer: it depends. If you really really (swear on your grave) have
this as a serious bottleneck, you will need to find out the cause of it.
Since g++ is GNU, if you have found the cause and fixed it
professionally it (I am sure) will make it into the implementation.
Until it does, you may stick with a home-made solution.

Yet, even in the case of g++, I am sure it is written as optimal as much
as it is made generic ...

Regards,

Franky B.
 
J

Juergen Heinzl

As the application I am writing involves a large amount of array copies,
it is very important to make it work as fast as possible ...


Indeed, that was my first thought, namely that the implementors of the
library would know best how to make their function work optimally, as they
developed the library themselves.

On the other hand, as somebody else replied, it might not be very optimal,
since it is written with generality in mind, that is, it should work on
classes too ... while if you would write your own copying routine, it is
optimized for the application at hand (in this case, an array copy).
[-]
A compiler can detect whether std::copy is applied to some POD type or
something more complex and choose the best template for each case.
Yet, even in the case of g++, I am sure it is written as optimal as much
as it is made generic ...
[-]
There're more that one copy templates behind std::copy(). The source is
there, so why make assumptions ?

Juergen
 
F

franky.backeljauw

Your mileage may vary.

I never would have guessed it 8-|
I added a couple of tests: s2Copy (which is std::copy with the arithmetic
operations taken out of the timing) and bCopy (which calls memcpy to do a
bitwise copy). Here's my results:

[..]
curien@balar:~/prog$ g++ -O2 blah.cpp
curien@balar:~/prog$ ./a.out
a : 0 1 2 3 4 5 6 7 8 9
b : 0 0 0 0 0 0 0 0 0 0
c : 0 0 0 0 0 0 0 0 0 0
b after iCopy : 0 1 2 3 4 5 6 7 0 0
c after sCopy : 0 1 2 3 4 5 6 7 0 0
time for iCopy : 1510
time for mCopy : 1460
time for sCopy : 1490
time for s2Copy : 1480
time for bCopy : 1490
time for iCopy : 1480
time for mCopy : 1460
time for sCopy : 1510
time for s2Copy : 1490
time for bCopy : 1470

As you can see, different methods are faster, depending on optimization
settings, but the different is miniscule. Interestingly, I also show that
mCopy is the fastest.

Thanks,

Franky B.
 
F

franky.backeljauw

A compiler can detect whether std::copy is applied to some POD type or
something more complex and choose the best template for each case.
[..]
There're more that one copy templates behind std::copy(). The source is
there, so why make assumptions ?

Because my results show that an ordinary member copy (and even the pointer
copy) are faster than the std::copy case ... so that would mean that
either the correct template isn't used, or that the code is not optimally
written. I still need to check the code however, to see what method of
copying is used.

Regards,

Franky B.
 
F

franky.backeljauw


Know this one ...

Didn't quite see that one ... YAB! (*)
As you see: it depends. Now if this is really a serious problem in your
app, look at the code of std::copy in the g++ implementation and try to
find out why is it slow.

Yes, I need to check the code to see what's going on ... but I would not
be asking this question if it didn't matter of if it wasn't a serious
problem!

Regards,

Franky B.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,769
Messages
2,569,581
Members
45,056
Latest member
GlycogenSupporthealth

Latest Threads

Top