pointer arithmetic

A

Andy Zhang

Hello,

I'm trying to allocate an array inside a function and read data from a file
into it. I can read data into the first element of the array fine, but I'm
quite confused as to how to read data into the second element. Code below:

int get_mft(FILE *f, unsigned int blocksize, MFTENTRY **mft)
{
int MFTEntries;
*mft = (MFTENTRY*) malloc(sizeof(MFTENTRY));
//read MFT to get size of MFT
fread(*mft, sizeof(MFTENTRY), 1, f);
MFTEntries = blocks_to_bytes(mft[0]->blockend[0] - mft[0]->blockstart[0],
blocksize) / sizeof(MFTENTRY);

//read rest of MFT
realloc(*mft, MFTEntries * sizeof(MFTENTRY));
fread(*mft+sizeof(MFTENTRY), sizeof(MFTENTRY), MFTEntries-1, f); //crashes
here

return MFTEntries;
}

It _seems_ to work fine if I pass &mft[0][1] to the second fread(), until I
try to free() mft. BTW, it's not a "real" filesystem, just something I
thought I'd do for fun.

Thanks.

Andy Zhang
 
J

Jack Klein

Yes, you are having a problem with your use of pointer arithmetic, but
there are a few other issues as well...
Hello,

I'm trying to allocate an array inside a function and read data from a file
into it. I can read data into the first element of the array fine, but I'm
quite confused as to how to read data into the second element. Code below:

int get_mft(FILE *f, unsigned int blocksize, MFTENTRY **mft)
{
int MFTEntries;
*mft = (MFTENTRY*) malloc(sizeof(MFTENTRY));

Danger! Alert! Asking for trouble starting with the line above...

It is never necessary to cast the value returned by malloc() in C.
The conversion from void pointer to any type of object pointer is
automatic and requires no cast.

If your compiler complains about assigning the return value to a
pointer, that means you have forgotten to include <stdlib.h> with the
proper prototype for malloc().

And never, Never, NEVER use the pointer returned by malloc() without
first testing for NULL.

Also you almost never need to apply the sizeof operator to the name of
a type when using a memory allocation or file read/write function,
unless you are doing some sort of type punning.
//read MFT to get size of MFT

Do you have a C99 conforming compiler? There are very, very few of
them. If not, comments beginning with // are not legal.
fread(*mft, sizeof(MFTENTRY), 1, f);

Never use the contents of an area of memory after an input operation
without checking the result of the input operation first. See my
sample code at the end.
MFTEntries = blocks_to_bytes(mft[0]->blockend[0] - mft[0]->blockstart[0],
blocksize) / sizeof(MFTENTRY);

//read rest of MFT
realloc(*mft, MFTEntries * sizeof(MFTENTRY));

You're asking for a memory leak by passing your original pointer to
realloc() without making a temporary copy first. If the realloc()
fails, it will return NULL. The original memory is still allocated
but you no longer have a pointer to it to free it.

And of course, never use the value returned by malloc(), calloc(), or
realloc() without first checking for NULL.
fread(*mft+sizeof(MFTENTRY), sizeof(MFTENTRY), MFTEntries-1, f); //crashes
here

Here's where pointer arithmetic is clobbering you. When you add an
offset to a pointer, the compiler automatically scales the offset by
the size of the type pointed to.

The expression "*mft+sizeof(MFENTRY)" is not equivalent to the second
in an array of these structures. If the size of the MFENTRY type is
50 bytes, for example, this expression is equivalent to accessing the
50th element of an array of (at least) 50 structures.

If you want to access the second element of an array via a pointer,
you want *mft+1.
return MFTEntries;
}

It _seems_ to work fine if I pass &mft[0][1] to the second fread(), until I
try to free() mft. BTW, it's not a "real" filesystem, just something I
thought I'd do for fun.

Please set your newsreader to add a proper signature to your posts. A
proper signature line consists of "-- \n", that is two minus signs, a
single blank space, and then an end of line. See mine below for an
example. News software automatically recognizes that pattern and
trims the signature when replying.
Thanks.

Andy Zhang

Personally I would never dereference a pointer to something as small
as a structure pointer that many times. If you reference it that many
times it is just as efficient, if not more so, to make a local copy on
most implementations. Even more than any slight gain in efficiency,
it makes the code much more readable.

Try this on for size:

int get_mft(FILE *f, unsigned int blocksize, MFTENTRY **mft)
{
int MFTEntries = 0; /* if unsuccessful */
int number_read;
MFTENTRY *mft_real, *mft_temp;

*mft = NULL; /* in case the function fails */

mft_real = malloc(sizeof *mft_real); /* don't need sizeof(type) */
if (NULL == mft_real)
{
return 0;
}

number_read = fread(mft_real, sizeof *mft_real, 1, f);
if (1 != number_read)
{
free(mft_real);
return 0;
}

MFTEntries = blocks_to_bytes(mft_real->blockend[0] -
mft_real->blockstart[0], blocksize / sizeof *mft_real);

mft_temp = realloc(mft_real, MFTEntries * sizeof *mft_real);

if (NULL == mft_temp)
{
free(mft_real);
return 0;
}

mft_real = mft_temp;

number_read = fread(mft_real + 1, sizeof *mft_real,
MFTEntries - 1, f);

if (MFTEntries - 1 != f)
{
free(mft_real);
return 0;
}

*mft = mft_real;

return MFTEntries;
}

Note:

1. How much more straight forward the code is without the repeated
dereference to the pointer-to-pointer. And probably at least as
efficient on most platforms.

2. The pointer arithmetic in the second fread() call, just add 1!

3. At no point is the sizeof operator applied to the name of a type.
As long as you have a pointer to the type around, you can do sizeof
*pointer_to_type. Even if the pointer is not initialized or NULL,
because sizeof does not evaluate its operand.

4. No cast on malloc() or realloc(), but make sure <stdlib.h> is
included.

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++ ftp://snurse-l.org/pub/acllc-c++/faq
 
A

Andy Zhang

Thank you for the anwser, Jack. I'm using Visual C++ and it complains about
having to cast voids to non-voids explicitly even if I include stdlib.h. I
don't usually use Outlook Express for newsreading, but I will add a
signature.

--
Andy Zhang


Jack Klein said:
Yes, you are having a problem with your use of pointer arithmetic, but
there are a few other issues as well...
Hello,

I'm trying to allocate an array inside a function and read data from a file
into it. I can read data into the first element of the array fine, but I'm
quite confused as to how to read data into the second element. Code below:

int get_mft(FILE *f, unsigned int blocksize, MFTENTRY **mft)
{
int MFTEntries;
*mft = (MFTENTRY*) malloc(sizeof(MFTENTRY));

Danger! Alert! Asking for trouble starting with the line above...

It is never necessary to cast the value returned by malloc() in C.
The conversion from void pointer to any type of object pointer is
automatic and requires no cast.

If your compiler complains about assigning the return value to a
pointer, that means you have forgotten to include <stdlib.h> with the
proper prototype for malloc().

And never, Never, NEVER use the pointer returned by malloc() without
first testing for NULL.

Also you almost never need to apply the sizeof operator to the name of
a type when using a memory allocation or file read/write function,
unless you are doing some sort of type punning.
//read MFT to get size of MFT

Do you have a C99 conforming compiler? There are very, very few of
them. If not, comments beginning with // are not legal.
fread(*mft, sizeof(MFTENTRY), 1, f);

Never use the contents of an area of memory after an input operation
without checking the result of the input operation first. See my
sample code at the end.
MFTEntries = blocks_to_bytes(mft[0]->blockend[0] - mft[0]->blockstart[0],
blocksize) / sizeof(MFTENTRY);

//read rest of MFT
realloc(*mft, MFTEntries * sizeof(MFTENTRY));

You're asking for a memory leak by passing your original pointer to
realloc() without making a temporary copy first. If the realloc()
fails, it will return NULL. The original memory is still allocated
but you no longer have a pointer to it to free it.

And of course, never use the value returned by malloc(), calloc(), or
realloc() without first checking for NULL.
fread(*mft+sizeof(MFTENTRY), sizeof(MFTENTRY), MFTEntries-1, f); //crashes
here

Here's where pointer arithmetic is clobbering you. When you add an
offset to a pointer, the compiler automatically scales the offset by
the size of the type pointed to.

The expression "*mft+sizeof(MFENTRY)" is not equivalent to the second
in an array of these structures. If the size of the MFENTRY type is
50 bytes, for example, this expression is equivalent to accessing the
50th element of an array of (at least) 50 structures.

If you want to access the second element of an array via a pointer,
you want *mft+1.
return MFTEntries;
}

It _seems_ to work fine if I pass &mft[0][1] to the second fread(), until I
try to free() mft. BTW, it's not a "real" filesystem, just something I
thought I'd do for fun.

Please set your newsreader to add a proper signature to your posts. A
proper signature line consists of "-- \n", that is two minus signs, a
single blank space, and then an end of line. See mine below for an
example. News software automatically recognizes that pattern and
trims the signature when replying.
Thanks.

Andy Zhang

Personally I would never dereference a pointer to something as small
as a structure pointer that many times. If you reference it that many
times it is just as efficient, if not more so, to make a local copy on
most implementations. Even more than any slight gain in efficiency,
it makes the code much more readable.

Try this on for size:

int get_mft(FILE *f, unsigned int blocksize, MFTENTRY **mft)
{
int MFTEntries = 0; /* if unsuccessful */
int number_read;
MFTENTRY *mft_real, *mft_temp;

*mft = NULL; /* in case the function fails */

mft_real = malloc(sizeof *mft_real); /* don't need sizeof(type) */
if (NULL == mft_real)
{
return 0;
}

number_read = fread(mft_real, sizeof *mft_real, 1, f);
if (1 != number_read)
{
free(mft_real);
return 0;
}

MFTEntries = blocks_to_bytes(mft_real->blockend[0] -
mft_real->blockstart[0], blocksize / sizeof *mft_real);

mft_temp = realloc(mft_real, MFTEntries * sizeof *mft_real);

if (NULL == mft_temp)
{
free(mft_real);
return 0;
}

mft_real = mft_temp;

number_read = fread(mft_real + 1, sizeof *mft_real,
MFTEntries - 1, f);

if (MFTEntries - 1 != f)
{
free(mft_real);
return 0;
}

*mft = mft_real;

return MFTEntries;
}

Note:

1. How much more straight forward the code is without the repeated
dereference to the pointer-to-pointer. And probably at least as
efficient on most platforms.

2. The pointer arithmetic in the second fread() call, just add 1!

3. At no point is the sizeof operator applied to the name of a type.
As long as you have a pointer to the type around, you can do sizeof
*pointer_to_type. Even if the pointer is not initialized or NULL,
because sizeof does not evaluate its operand.

4. No cast on malloc() or realloc(), but make sure <stdlib.h> is
included.

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++ ftp://snurse-l.org/pub/acllc-c++/faq
 
C

CBFalconer

Andy said:
Thank you for the anwser, Jack. I'm using Visual C++ and it
complains about having to cast voids to non-voids explicitly even
if I include stdlib.h. I don't usually use Outlook Express for
newsreading, but I will add a signature.

Don't toppost. You are having problems with casts because you are
not using a C compiler. Either configure that thing correctly, or
name your source files correctly.
 
A

Andy Zhang

Jack Klein said:
<snip>
Try this on for size:

int get_mft(FILE *f, unsigned int blocksize, MFTENTRY **mft)
{
int MFTEntries = 0; /* if unsuccessful */
int number_read;
MFTENTRY *mft_real, *mft_temp;

*mft = NULL; /* in case the function fails */

mft_real = malloc(sizeof *mft_real); /* don't need sizeof(type) */
if (NULL == mft_real)
{
return 0;
}

number_read = fread(mft_real, sizeof *mft_real, 1, f);
if (1 != number_read)
{
free(mft_real);
return 0;
}

MFTEntries = blocks_to_bytes(mft_real->blockend[0] -
mft_real->blockstart[0], blocksize / sizeof *mft_real);

mft_temp = realloc(mft_real, MFTEntries * sizeof *mft_real);

if (NULL == mft_temp)
{
free(mft_real);
return 0;
}

mft_real = mft_temp;

number_read = fread(mft_real + 1, sizeof *mft_real,
MFTEntries - 1, f);

if (MFTEntries - 1 != f)
{
free(mft_real);
return 0;
}

*mft = mft_real;

return MFTEntries;
}

<snip>

If I try allocating more memory with malloc() just before that last return,
BOOM. What's going on? Did I corrupt the heap/stack/malloc's internal
structures?
 
M

Mark McIntyre

Thank you for the anwser, Jack. I'm using Visual C++ and it complains about
having to cast voids to non-voids explicitly even if I include stdlib.h.

You're compiling it as C++ then. Make sure the filename is something.c
not .C or .C++
 
B

Barry Schwarz

Hello,

I'm trying to allocate an array inside a function and read data from a file
into it. I can read data into the first element of the array fine, but I'm
quite confused as to how to read data into the second element. Code below:

int get_mft(FILE *f, unsigned int blocksize, MFTENTRY **mft)
{
int MFTEntries;
*mft = (MFTENTRY*) malloc(sizeof(MFTENTRY));
//read MFT to get size of MFT
fread(*mft, sizeof(MFTENTRY), 1, f);
MFTEntries = blocks_to_bytes(mft[0]->blockend[0] - mft[0]->blockstart[0],
blocksize) / sizeof(MFTENTRY);

//read rest of MFT
realloc(*mft, MFTEntries * sizeof(MFTENTRY));

In addition to all the other good advice you received, this statement
is the source of most of your problems. You tell realloc to
reallocate the area pointed to by *mft but you never save the pointer
returned by realloc. You don't know where realloc has moved your
allocation. Note that like all functions, realloc receives a copy of
*mft and therefore cannot update in any way that you can see.

You need something like
temp_ptr = realloc(*mft, MFTEntries * sizeof *temp_ptr);
if (temp_ptr == NULL) {/*error handler */}
*mft = tepm_ptr;
fread(*mft+sizeof(MFTENTRY), sizeof(MFTENTRY), MFTEntries-1, f); //crashes
here

Here you use the old value of *mft which realloc has probably made
obsolete. Once make the changes suggested above, *mft will point to
the reallocated area and you solve one problem.

The other problem is that *mft is a pointer. When you do arithmetic
on a pointer, C takes the size of the "thing" pointed to into
consideration. Since you want to read into the second MFTENTRY (*mft
points to the first and you want the next one), the expression should
be *mft+1 which will be evaluated intuitively as
(MFTENTRY*)((char*)*mft + 1*sizeof(MFTENTRY))
return MFTEntries;
}

It _seems_ to work fine if I pass &mft[0][1] to the second fread(), until I
try to free() mft. BTW, it's not a "real" filesystem, just something I
thought I'd do for fun.

&mft[0][1] is identical in every way to *mft+1 as discussed above.
However, the only reason it appears to work is that realloc was able
to expand your initial allocation in place (without moving anything).
For obvious reasons, it is not good for you code to depend on this.
Thanks.

Andy Zhang



<<Remove the del for email>>
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,767
Messages
2,569,570
Members
45,045
Latest member
DRCM

Latest Threads

Top