Reading from array is faster than writing ???

Oleg Kornilov

Hello !
Who can explain why reading from array[][] is fast, but
writing (same place) might take a lot of time (I need many loops)
How to solve it (declare as static or force some compiler (VC 6.0)
options) ??? Thanks in advance
Here is example (but critical !!!) code...

void main(int argc, char* argv[])
{
    unsigned char r[360][180], dat2, i,j,k;
    int p;

    for (p=0; p<650; p++) {
        for (i=0; i<200; i++) {
            for (j=0; j<200; j++) {
                for (k=0; k<100; k++) {

                    dat2=r[j][i]; dat2=r[j][i]; dat2=r[j][i];

                    // r[j][i]=dat2; r[j][i]=dat2; r[j][i]=dat2;
                    // critical point: without it almost no execution time
                    // with it - 6 seconds on P3-1000 ???

                }
            }
        }
    }
}
 
Tom St Denis

Oleg Kornilov said:
Hello !
Who can explain why reading from array[][] is fast, but
writing (same place) might take a lot of time (I need many loops)
How to solve it (declare as static or force some compiler (VC 6.0)
options) ??? Thanks in advance
Here is example (but critical !!!) code...

Because writes to SRAM/DRAM are slower than reads? Also by reading chances
are the CPU keeps the array in cache whereas when you write the data does go
to cache (unless you mark the region as uncacheable) but it also has to then
go to external memory. The CPU will only have so many write buffers which
shall induce stalls.

The post though is off-topic for this group. Try targeting a group
specific to your processor [or say comp.lang.asm.x86]

Tom
 
Dan Pop

Oleg Kornilov said:
Who can explain why reading from array[][] is fast, but
writing (same place) might take a lot of time (I need many loops)

If you manage to disable the cache(s), you may get the reads about as
slow as the writes...

Dan
 
Nils Petter Vaskinn

Hello !
Who can explain why reading from array[][] is fast, but
writing (same place) might take a lot of time (I need many loops)
How to solve it (declare as static or force some compiler (VC 6.0)
options) ??? Thanks in advance
Here is example (but critical !!!) code...

void main(int argc, char* argv[])
{
unsigned char r[360][180], dat2, i,j,k;
int p;

for (p=0; p<650; p++) {
for (i=0; i<200; i++) {
for (j=0; j<200; j++) {
for (k=0; k<100; k++) {

dat2=r[j][i]; dat2=r[j][i]; dat2=r[j][i];


You never do anything with the value read into dat2, your compiler may
recognize that and optimize away to no instruction.
// r[j][i]=dat2; r[j][i]=dat2; r[j][i]=dat2;
// critical point: without it almost no execution time
// with it - 6 seconds on P3-1000 ???


But when you include this it may be unable to do the optimization.

6000000 clock cycles

7800000000 reads
+ 7800000000 assignments.

6000000/(7800000000*2) =

0.000384615 operations per clock cycle.


I think you've got fairly good performance. Any operation taking less than
a clock cycle means some optimization has been done and you're getting
better performance than what you programmed for.
 
Leor Zolman

Hello !
Who can explain why reading from array[][] is fast, but
writing (same place) might take a lot of time (I need many loops)
How to solve it (declare as static or force some compiler (VC 6.0)
options) ??? Thanks in advance
Here is example (but critical !!!) code...

void main(int argc, char* argv[])
{
unsigned char r[360][180], dat2, i,j,k;
int p;

for (p=0; p<650; p++) {
for (i=0; i<200; i++) {
for (j=0; j<200; j++) {
for (k=0; k<100; k++) {

dat2=r[j][i]; dat2=r[j][i]; dat2=r[j][i];

[snip]

A side note: If performance is what you're interested in, you may be able
to speed up both reading /and/ writing by employing "extra" pointers
whenever the opportunity arises, saving multiplications in the
multiple-subscripting operations:

int main(int argc, char* argv[])
{
    unsigned char r[360][180], dat2, i,j,k;
    int p;

    for (p=0; p<650; p++) {
        for (i=0; i<200; i++) {
            for (j=0; j<200; j++) {
                unsigned char *rj = r[j]; // added this
                for (k=0; k<100; k++) {
                    // dat2=r[j][i]; dat2=r[j][i]; dat2=r[j][i]; // before
                    dat2 = rj[i]; dat2 = rj[i]; dat2 = rj[i]; // after
                }
            }
        }
    }
    return 0;
}

Don't know how likely contemporary optimizers are to do this sort of thing
for you automatically.... profile, profile, profile.


Leor Zolman
BD Software
www.bdsoft.com -- On-Site Training in C/C++, Java, Perl & Unix
C++ users: Download BD Software's free STL Error Message
Decryptor at www.bdsoft.com/tools/stlfilt.html
 
Tristan Miller

Greetings.

Oleg Kornilov said:
Who can explain why reading from array[][] is fast, but
writing (same place) might take a lot of time (I need many loops)

It's not necessarily. If writing happens to be slower for you, then that's
some issue with your compiler or operating system or hardware, not with the
C language itself, and there's no standards-compliant way of changing it.
Here is example (but critical !!!) code...

void main(int argc, char* argv[])

P.S. -- main() returns int.

Regards,
Tristan
 
Martin Ambuhl

Oleg said:
Hello !
Who can explain why reading from array[][] is fast, but
writing (same place) might take a lot of time (I need many loops)
How to solve it (declare as static or force some compiler (VC 6.0)
options) ??? Thanks in advance
Here is example (but critical !!!) code...

void main(int argc, char* argv[])

We _know_ you have not followed the newsgroup, checked the archives, or
checked the FAQ before posting. A good user of usenet does these things.
How do we know you have not done any of these things? Because, had you
done them, you would not have used the nonsensical return type 'void' for
main, and in "(but critical !!!) code", at that. Continuing to your
question ...
{
unsigned char r[360][180], dat2, i,j,k;
^^^^^^^^^^^
Why do you set the size of your array or the limits of your for-loops to
absurd values? "for (i=0; i<200; i++) /* ... */ dat2 = r[j][i];" is a joke
if the r array is declared with size char[360][180].

int p;

for (p=0; p<650; p++) {
for (i=0; i<200; i++) {
for (j=0; j<200; j++) {
for (k=0; k<100; k++) {

dat2=r[j][i]; dat2=r[j][i]; dat2=r[j][i];


This massive loop need not do 650*200*200*100*3 assignments;
it can be optimized away to simply
dat2=r[199][199];
No loops, no triple assignments.
// r[j][i]=dat2; r[j][i]=dat2; r[j][i]=dat2;


While p and k play no role in this, and the three-fold assignment is silly
(and optimizable to one), you do need the 200*200 assignments, one for each
r[j][i].

Writing every r[j][i] takes 40,000 assignments; the assignment to
dat2 takes 1.
 
Sidney Cadot

Leor said:
[snip]

A side note: If performance is what you're interested in, you may be able
to speed up both reading /and/ writing by employing "extra" pointers
whenever the opportunity arises, saving multiplications in the
multiple-subscripting operations:

A decent compiler will be able to do this by itself during optimization.

In fact, you may lose performance by doing it by hand since the
compiler may be able to use clever tricks for the array case, that it
cannot apply otherwise (such as aliasing analysis).

Best regards,

Sidney
 
Morris Dovey

Artie said:
Morris said:
Tristan said:
_V.-o Tristan Miller [en,(fr,de,ia)] >< Space is limited
/ |`-' -=-=-=-=-=-=-=-=-=-=-=-=-=-=-= <> In a haiku, so it's hard
(7_\\ http://www.nothingisreal.com/ >< To finish what you

What language is "ia"?

It's not IowAn? ;-) <g, d & r>

No. That's "en-ia" ( more or less :)

My first Google search pattern was ill-constructed and produced
overwhelming clutter (mostly about the State of Iowa in the USA).
A better search yielded an ISO 639 page telling me that "ia"
signifies Interlingua.
 
Nils Petter Vaskinn

In <[email protected]> Nils Petter


You seem to be off by three orders of magnitude...

And I even wrote the wrong "units" in the result. I meant clock cycles per
operation. Let's try this again:

6000000000 clock cycles / (650 * 200 * 200 * 100 * 6) operations =

0.384615 clock cycles per operation

Still pretty good I think.

My point was really that when you do something 7800000000 times you
shouldn't be surprised that it takes a little time.

And since it takes less than a clock cycle per operation it means he gets
a lot for free from optimizations. So if he needs better performance maybe
rethinking the algorithm is in order.
 
Dan Pop

Nils Petter Vaskinn said:
And I even wrote the wrong "units" in the result. I meant clock cycles per
operation. Let's try this again:

6000000000 clock cycles / (650 * 200 * 200 * 100 * 6) operations =

0.384615 clock cycles per operation

Still pretty good I think.

My point was really that when you do something 7800000000 times you
shouldn't be surprised that it takes a little time.

And since it takes less than a clock cycle per operation it means he gets
a lot for free from optimizations.

A trivial optimisation is to read once and write once, which would reduce
the number of operations in the inner loop by a factor of three,
so you'd get > 1 cycle per operation.

Dan
 
Nils Petter Vaskinn

A trivial optimisation is to read once and write once, which would
reduce the number of operations in the inner loop by a factor of three,
so you'd get > 1 cycle per operation.

Yes.

Btw let's draw a few conclusions here:

OP gets more than one 'operation' per clock cycle => OP's compiler did
automatic optimizations => program doesn't do exactly what the code tells
it to, only something that produces the same end result => OP's benchmark
/ profiling is worthless because he's not measuring what he thinks he is.

For all we know writing to memory on his platform could be faster than
reading, despite what his test says. If he is to get any useful results he
needs to make a better test, and compile it with optimizations turned off.
 
Dan Pop

Nils Petter Vaskinn said:
For all we know writing to memory on his platform could be faster than
reading, despite what his test says. If he is to get any useful results he
needs to make a better test, and compile it with optimizations turned off.

Even with optimisations turned off, many compilers take short cuts WRT
the program's behaviour in the abstract C machine. Using volatile may
help, but one still has to inspect the generated object code.

It's much better to design the benchmark in such a way that its
relevant operations cannot be optimised away by the compiler and compile
it with full optimisations.

Dan
 
