Strange Behaviour in finding Size of a File

felix

This method was written to create a new log file when the size of the log file reaches a maximum size defined by the user [10MB in our case]. Here is the code snippet that does this check:

//-- Code starts here : --

static size_t LogSize = 1048576;
bool CreateNewLogs = false;


if ( stat ( logFile, &results ) == 0 )
{
    if ( results.st_size > LogSize )
    {
        CreateNewLogs = true;
    }
}
//-- Code ends here : --

It is strange that the condition got satisfied when results.st_size = 2589116.
And we are sure that the size of the data that is written is between 50 to 100 bytes in one operation. And this check is done before writing into the LogFile.

I am not sure if I am missing anything in our understanding of the stat function. Any input or pointers in this regard would be really helpful.


Thanks in advance, and please let me know if any other information is required.
 
James Kuyper

This method was written to create new Log File, when the size of the Log File reaches a max size defined by user [10MB in our case]. Here is the code snippet that does this check:

//-- Code starts here : --

static size_t LogSize = 1048576;
bool CreateNewLogs = false;


if ( stat ( logFile, &results ) == 0 )

That is presumably the POSIX stat() function, or something similar? If
so, its behavior is defined by the POSIX standard, not the C standard,
and you'll get better answers to your questions in comp.unix.programmer
than in this newsgroup.
{
if ( results.st_size > LogSize )
{
CreateNewLogs = true;
}
}
//-- Code ends here : --

It is strange that the condition got satisfied when results.st_size = 2589116.
And we are sure that the size of the data that is written is between 50 to 100 bytes in one operation. And this check is done before writing into the LogFile.

Keep in mind that file I/O is normally buffered, so the buffer size is
more relevant than the size of your individual writes. Still, that seems
to be a rather large jump to explain by buffering.

I am not sure if I am missing anything in our understanding of the stat function. Any inputs or pointers on this regard will be really Helpful.

The people in comp.unix.programmer may need to know more details about
how data is written to the file, and whether or not you've used any
POSIX functions to change the file mode.

Just to get a better idea of what's going on, I'd recommend reporting
the file size somewhere (probably in a separate log file) every time you
call stat().
 
Eric Sosman

This method was written to create new Log File, when the size of the Log File reaches a max size defined by user [10MB in our case]. Here is the code snippet that does this check:

//-- Code starts here : --

static size_t LogSize = 1048576;

Ah. This is obviously some strange usage of "10 MB" that
I hadn't previously been aware of.

It is strange that the condition got satisfied when results.st_size = 2589116.

Not all *that* strange ...
 
Mark Bluemel

You're surprised that 2589116 is greater than 1048576?

No. Given that "we are sure that the size of the data that is written is
between 50 to 100 bytes in one operation. And this check is done before
writing into the LogFile." I think the OP is surprised that the
condition wasn't satisfied earlier.

I think James Kuyper has given some good advice.
 
James Kuyper

You're surprised that 2589116 is greater than 1048576?

No, he's surprised that, when checking this condition periodically,
separated by writes of no more than 100 bytes, it doesn't trigger
until 2589116. Naively, it could be expected to trigger with
results.st_size no more than 100 bytes larger than LogSize. Buffering is
the simplest of the many reasons invalidating that conclusion; there are
several others, and many people who are better equipped to explain those
issues than I am, so I'll leave that explanation to them, rather than
embarrassing myself by getting it wrong.
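
To make the buffering effect concrete, here is a minimal sketch, assuming a POSIX system. The helper names size_on_disk and demo_buffering, the 64 KiB buffer, and the 64-byte record size are hypothetical choices made for illustration; none of them come from the original program. The sketch writes small records through a fully buffered stream and asks stat() for the file size before and after fflush().

```c
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Size of the file at `path` as reported by stat(), or -1 on failure. */
static long long size_on_disk(const char *path)
{
    struct stat st;
    return stat(path, &st) == 0 ? (long long)st.st_size : -1;
}

/*
 * Writes `records` lines of 64 bytes each to `path` through a fully
 * buffered stream and records what stat() reports before and after
 * fflush(). Returns 0 on success, -1 on failure.
 */
static int demo_buffering(const char *path, int records,
                          long long *before_flush, long long *after_flush)
{
    static char buffer[1 << 16];   /* 64 KiB stdio buffer (assumed size) */
    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return -1;
    if (setvbuf(fp, buffer, _IOFBF, sizeof buffer) != 0) {
        fclose(fp);
        return -1;
    }
    for (int i = 0; i < records; i++)
        fprintf(fp, "%063d\n", i);          /* 63 digits + newline = 64 bytes */

    *before_flush = size_on_disk(path);     /* data may still sit in the buffer */
    if (fflush(fp) != 0) {
        fclose(fp);
        return -1;
    }
    *after_flush = size_on_disk(path);      /* now the kernel has seen the data */
    return fclose(fp) == 0 ? 0 : -1;
}
```

With 100 records, stat() would typically report a size well short of the 6400 bytes written until the flush happens, which is the kind of lag that lets a size check overshoot its limit by far more than one record.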
 
Greg Martin

This method was written to create new Log File, when the size of the Log File reaches a max size defined by user [10MB in our case]. Here is the code snippet that does this check:

//-- Code starts here : --

static size_t LogSize = 1048576;
bool CreateNewLogs = false;


if ( stat ( logFile, &results ) == 0 )
{
if ( results.st_size > LogSize )
{
CreateNewLogs = true;
}
}
//-- Code ends here : --

It is strange that the condition got satisfied when results.st_size = 2589116.
And we are sure that the size of the data that is written is between 50 to 100 bytes in one operation. And this check is done before writing into the LogFile.

I am not sure if I am missing anything in our understanding of the stat function. Any inputs or pointers on this regard will be really Helpful.


Thanks in advance, and please let me know if any other information is required.

I would think it strange if I knew for sure that the function got called
before the write in every place that the file was possibly being written
to and that there weren't multiple process/threads that could write to it.
 
Keith Thompson

felix said:
This method was written to create new Log File, when the size of the
Log File reaches a max size defined by user [10MB in our case]. Here
is the code snippet that does this check:

//-- Code starts here : --

static size_t LogSize = 1048576;
bool CreateNewLogs = false;


if ( stat ( logFile, &results ) == 0 )
{
if ( results.st_size > LogSize )
{
CreateNewLogs = true;
}
}
//-- Code ends here : --

It is strange that the condition got satisfied when results.st_size =
2589116. And we are sure that the size of the data that is written is
between 50 to 100 bytes in one operation. And this check is done
before writing into the LogFile.

It's been pointed out that 1048576 is the wrong value if you want 10 MB
(more pedantically, 10 MiB). But LogSize is also the wrong type.
It should be the same type as the st_size member.

You probably want to use "const" rather than "static" in the definition
of LogSize, unless the value can change.

Neither of these is likely to be the cause of the problem you're seeing.
Since stat() is defined by POSIX, not by C, you'll likely get better
answers in comp.unix.programmer.
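
As a sketch of the two corrections above (the limit value and its type), assuming a POSIX system: the names LOG_SIZE_LIMIT and log_exceeds_limit are hypothetical, and treating a failed stat() as "do not rotate" is an assumption rather than something the original post specifies.

```c
#include <stdbool.h>
#include <sys/stat.h>
#include <sys/types.h>

/* 10 MiB, held in the same (signed) type as the st_size member. */
static const off_t LOG_SIZE_LIMIT = (off_t)10 * 1024 * 1024;

/* Returns true when the file at `path` exists and is larger than `limit`. */
static bool log_exceeds_limit(const char *path, off_t limit)
{
    struct stat results;

    if (stat(path, &results) != 0)
        return false;   /* assumption: a file we cannot stat is not rotated */

    /* Both operands are off_t, so no signed/unsigned conversion occurs. */
    return results.st_size > limit;
}
```

A caller would write something like `CreateNewLogs = log_exceeds_limit(logFile, LOG_SIZE_LIMIT);`. As noted, this tidies the comparison but does not by itself explain the late trigger.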
 
Keith Thompson

Alain Ketterlin said:
What makes you think it will not? At least ftell() is C.

We know that stat() isn't working as the OP expects. I'd expect
ftell() to yield exactly the same results -- and it requires opening
the file and seeking to the end of it. Furthermore, there's no
guarantee in C that the fseek()/ftell() trick will accurately yield
the size of a file. Binary streams may legally be padded with an
implementation-defined number of null characters (N1370 7.21.2p3),
and "A binary stream need not meaningfully support fseek calls
with a whence value of SEEK_END" (7.21.9.2p3). For text streams,
the value returned by ftell() isn't necessarily meaningful except
as an argument to fseek() (7.21.9.4p2).

A POSIX environment makes more guarantees -- but as long as you're
depending on POSIX, there's no good reason not to use stat()
(or fstat() or lstat()).
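
For a stream that is still open, one POSIX-only way to apply this is to flush first and then call fstat() on the underlying descriptor. The following is a minimal sketch under that assumption; the helper name flushed_size is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L   /* for fileno() under strict ISO C modes */
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Returns the current size of the file behind an open stream, or -1 on
 * failure. fflush() hands buffered output to the kernel first, so the
 * st_size reported by fstat() includes everything written so far.
 */
static long long flushed_size(FILE *fp)
{
    struct stat st;

    if (fflush(fp) != 0)
        return -1;
    if (fstat(fileno(fp), &st) != 0)
        return -1;
    return (long long)st.st_size;
}
```

The trade-off is a flush on every check, which gives up some of the benefit of buffering in exchange for an accurate size.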

My point is that you suggested ftell() as a solution to the OP's
problem. It isn't.

Unixism. Just ignore it if it hurts you.

If you're referring to section 2 of the manual (ftell(2),
documentation available via "man 2 ftell"), I've never heard of
that being referred to as "level 2". Try being a little less
condescending and actually answering the question.
 
Alain Ketterlin

Keith Thompson said:
We know that stat() isn't working as the OP expects. I'd expect
ftell() to yield exactly the same results -- and it requires opening
the file and seeking to the end of it.

If stat() doesn't give the correct result in the OP's use case (writing
chunks, and testing whether the size has reached a limit), it's probably
because the file is still open. And several people have suggested that
the problem was probably with the buffering. The doc of ftell() says (on
my system):

| The ftell() function obtains the current value of the file position
| indicator for the stream pointed to by stream.

In my opinion, that's a pretty good hint given the problem description.
Furthermore, there's no guarantee in C that the fseek()/ftell() trick
will accurately yield the size of a file. Binary streams may legally
be padded with an implementation-defined number of null characters
(N1370 7.21.2p3), and "A binary stream need not meaningfully support
fseek calls with a whence value of SEEK_END" (7.21.9.2p3). For text
streams, the value returned by ftell() isn't necessarily meaningful
except as an argument to fseek() (7.21.9.4p2).

I know all this, thank you, but I have no indication that the OP is in
any of these cases. I just gave the OP a track to follow. He/she would
surely come back with another (and maybe more precise) question if
things turn out to be more difficult.

A POSIX environment makes more guarantees -- but as long as you're
depending on POSIX, there's no good reason not to use stat()
(or fstat() or lstat()).

Keeping the file open is a good reason to not use stat() (see code
above).

My point is that you suggested ftell() as a solution to the OP's
problem. It isn't.

OK, call it a hint if you want...

If you're referring to section 2 of the manual (ftell(2),
documentation available via "man 2 ftell"), I've never heard of
that being referred to as "level 2".

OK now you have. I didn't think it was that hard to understand, given
that lseek was two words away.

Try being a little less condescending and actually answering the
question.

Try being a little less condescending and actually helping people asking
for help instead of being pedantic (your words) on size units and
everything but the OP's problem.

-- Alain.
 
Philip Lantz

Alain said:
OK now you have. I didn't think it was that hard to understand, given
that lseek was two words away.

I didn't understand it either--I thought you meant to use a value of 2
for whence, and I thought it was a strange way to express that. It never
occurred to me you were referring to section 2 of the Unix Programmer's
Manual.
 
Kenny McCormack

Philip Lantz said:
I didn't understand it either--I thought you meant to use a value of 2
for whence, and I thought it was a strange way to express that. It never
occurred to me you were referring to section 2 of the Unix Programmer's
Manual.

Well, live & learn!

See, the Usenet can be a helpful thing after all.
 
James Kuyper

We know that stat() isn't working as the OP expects. I'd expect
ftell() to yield exactly the same results -- and it requires opening
the file and seeking to the end of it. ...

Opening the file and seeking to the end is not a problem; it's already
open, or at least it was at the time of the last write, and the current
write position was presumably at the end of the file, otherwise it
wouldn't be growing. I'd expect ftell() to give a better indication of
the bytes written than stat()=>st_size, since I wouldn't expect the
value returned by ftell() to be affected by issues such as buffering.
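
A minimal sketch of that idea, assuming the code doing the check has access to the FILE* used for writing and that the platform is POSIX-like, where ftell() on such a stream counts bytes from the start of the file. The helper name append_and_tell is hypothetical.

```c
#include <stdio.h>

/*
 * Appends `len` bytes from `record` to the open log stream and returns the
 * stream position afterwards as reported by ftell(), or -1 on failure.
 * No fflush() is needed: ftell() accounts for bytes still in the buffer.
 */
static long append_and_tell(FILE *log, const char *record, size_t len)
{
    if (fwrite(record, 1, len, log) != len)
        return -1L;
    return ftell(log);
}
```

The caller can compare the returned position against the limit after each write and rotate the log once it is exceeded, without consulting stat() at all.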
 
Keith Thompson

James Kuyper said:
Opening the file and seeking to the end is not a problem; it's already
open, or at least it was at the time of the last write, and the current
write position was presumably at the end of the file, otherwise it
wouldn't be growing. I'd expect ftell() to give a better indication of
the bytes written than stat()=>st_size, since I wouldn't expect the
value returned by ftell() to be affected by issues such as buffering.

Ok, good point.

That's assuming that the program that's querying the size of the file
is the same one that's writing to the file, and that the querying
code has access to the relevant FILE*. That's not entirely obvious
from the original post, but it seems likely.

(As I said earlier, C doesn't guarantee that the value returned by
ftell() is meaningful for text streams, but that's unlikely to be
an issue for the OP.)
 
James Kuyper

Ok, good point.

That's assuming that the program that's querying the size of the file
is the same one that's writing to the file, and that the querying
code has access to the relevant FILE*. That's not entirely obvious
from the original post, but it seems likely.

The behavior that's controlled by the results of this query is the
creation of a new log file, so it never even occurred to me to consider
that the log file might be being written by some other program.
 
Philip Lantz

Kenny said:
Well, live & learn!

See, the Usenet can be a helpful thing after all.

What am I supposed to have learned? Is the term "level" commonly used to
identify a section of the Unix Programmer's Manual?
 
Eric Sosman

[...]
(As I said earlier, C doesn't guarantee that the value returned by
ftell() is meaningful for text streams, but that's unlikely to be
an issue for the OP.)

If all the logging happens in one place (or a small
number of nearby places), and if it all happens during one
execution of the program, the O.P. can do the entire job in
purely portable C. Something like

#include <stdarg.h>
#include <stdio.h>

static FILE *logStream;
static size_t logLength;

void writeLog(const char *format, ...) {
    if (logStream == NULL) {
        logStream = openLog(...);
        logLength = 0;
    }

    va_list ap;
    va_start(ap, format);
    /* vfprintf() takes the stream first and returns a negative value on error */
    int written = vfprintf(logStream, format, ap);
    va_end(ap);
    if (written > 0)
        logLength += (size_t)written;

    if (logLength > LIMIT) {
        closeLog(logStream);
        logStream = NULL;
    }
}

... should do it, along with a little error-checking and such.

(Okay, okay: The number of characters written to a stream is
not necessarily the same thing as the number of bytes stored on
a disk. Nonetheless, for "start a new log when the old one gets
too big" purposes it should be close enough.)
 
