How to retrieve the name of the file from a FILE *

CBFalconer · Oct 14, 2004

Jarno said:
... snip ...

How do you hash pointers portably?

You cast them into integers (allowable, but not the inverse) and
use hashing methods for integers (see the references in my hashlib
package and its tests). The operation of the hash table only
requires equal/non-equal comparisons of pointers, which is always
allowable.
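
A minimal sketch of the cast-and-hash idea, assuming the implementation provides the optional C99 type uintptr_t. The name hash_stream, the shift amount, and the multiplier are illustrative choices and are not taken from hashlib.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: map a FILE * to a bucket index in [0, nbuckets).
   Assumes uintptr_t exists (it is optional in C99) and nbuckets > 0. */
static size_t hash_stream(FILE *fp, size_t nbuckets)
{
    uintptr_t v = (uintptr_t)(void *)fp;   /* pointer -> integer */

    /* Assumption: the low bits are often zero because of alignment,
       so drop a few of them before mixing. */
    v >>= 4;
    v *= 2654435761u;                      /* multiplicative mixing step */
    return (size_t)(v % nbuckets);
}
```

The table itself would still compare the stored FILE * values with ==, so the integer is only used to pick a bucket.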

Please refrain from snipping attributes for material you quote.

CBFalconer · Oct 14, 2004

Dan said:
Your method doesn't cover the predefined streams, which are the
interesting cases. If I open a file myself, I already know its name,
but it is sometimes helpful to know where your stdin data comes from.

Actually it is worse than that, because any stream may be connected
to one (or more) i/o devices or disk files, which also have names
or designators in most systems. Then that device may be connected
to something else, which in turn has its own cascade of names.

So it is better to simply cut the Gordian knot, and say that the
names are unknown to the program.

jacob navia · Oct 14, 2004

Under Linux, running

ls -l /proc/self/fd

prints a table with each integer file descriptor linked
to the real file it is using (/dev/pts1 for a console,
or /home/jacob/getfilename.c for a regular file).

So that's how you do it under Linux.

Maybe other OSes will follow.
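
The same lookup can be done from inside a program. This is a Linux-only sketch, assuming /proc is mounted and that POSIX fileno() and readlink() are available; the helper name stream_name is hypothetical.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for fileno() and readlink() */
#endif
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: copy the path behind fp into buf.
   Returns 0 on success, -1 on failure. Linux-specific. */
static int stream_name(FILE *fp, char *buf, size_t size)
{
    char link[64];
    ssize_t n;
    int fd = fileno(fp);

    if (fd < 0 || size == 0)
        return -1;
    snprintf(link, sizeof link, "/proc/self/fd/%d", fd);

    /* readlink() does not add a terminator, so leave room for one.
       A path longer than size - 1 bytes is silently truncated. */
    n = readlink(link, buf, size - 1);
    if (n < 0)
        return -1;
    buf[n] = '\0';
    return 0;
}
```

For pipes and sockets the link target is a designator rather than a path, so the result is best treated as informational.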

Jonathan Adams · Oct 14, 2004

jacob navia said:
Under linux making

ls -l /proc/self/fd

will print a nice table with each integer file descriptor linked
to the real file it is using (/dev/pts1 for a console file),
or /home/jacob/getfilename.c for a real file

So, that's how you do it under linux.

Maybe other OSes will follow.

<OT>
Solaris 10 has similar functionality:

% ls -l /proc/self/path/[0-9]*
</OT>

Cheers,
- jonathan

Jarno A Wuolijoki · Oct 14, 2004

You cast them into integers (allowable, but not the inverse) and
use hashing methods for integers (see the references in my hashlib
package and its tests). The operation of the hash table only
requires equal/non-equal comparisons of pointers, which is always
allowable.

Does the standard guarantee that pointers that compare equal convert to
integers that do so as well?
(think of x86 real mode, b000:8000 vs b800:0000)

Please refrain from snipping attributes for material you quote.

Oops. I accidentally followed my own queer 'netiquette' instead of ng's.

(That is, I tend to think that only first level attributions are really
relevant in the context of my reply. I learned this ugly habit in
BBS's where it was typical to nest much farther than here)

Keith Thompson · Oct 14, 2004

You alias them with an array of unsigned char of size sizeof(FILE *).

That was my thought as well, but I can imagine an implementation in
which two FILE* values have the same value (as pointers) but different
representations (as arrays of unsigned char). Realistically, an
implementation is unlikely to generate two such distinct
representations for the same value, but I think a conforming
implementation could do so.
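
A sketch of the unsigned char approach, under the assumption questioned above: that a given FILE * value always has the same object representation. The function name and the djb2-style mixing are illustrative choices.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: hash the bytes that make up the pointer itself.
   Assumes equal pointers always have identical representations, which
   the standard does not promise. */
static unsigned long hash_stream_bytes(FILE *fp)
{
    unsigned char bytes[sizeof fp];
    unsigned long h = 5381;            /* djb2 starting value */
    size_t i;

    memcpy(bytes, &fp, sizeof fp);     /* copy the representation */
    for (i = 0; i < sizeof bytes; i++)
        h = h * 33 + bytes[i];
    return h;
}
```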

Keith Thompson · Oct 15, 2004

You call the POSIX stat() function on that file descriptor.

<OT><QUIBBLE>fstat()</QUIBBLE></OT>
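
For reference, a POSIX-only sketch of what fstat() can and cannot do here: it identifies a file by device and inode number, which is enough to tell whether two streams refer to the same file, but it does not return a name. The helper name same_file is hypothetical.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for fileno() */
#endif
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical helper: 1 if both streams refer to the same file,
   0 if they do not, -1 if either fstat() call fails. */
static int same_file(FILE *a, FILE *b)
{
    struct stat sa, sb;

    if (fstat(fileno(a), &sa) != 0 || fstat(fileno(b), &sb) != 0)
        return -1;
    return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}
```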

CBFalconer · Oct 15, 2004

Jarno said:
Does the standard guarantee that pointers that compare equal convert
to integers that do so as well?
(think of x86 real mode, b000:8000 vs b800:0000)

It doesn't matter. You are working only with the value that was
returned from fopen.

Fao, Sean · Oct 15, 2004

jacob said:
Yes, it is based on the win32 API. That is why I did not publish the
code here, just giving a pointer to the code

I suppose you think that made it right?

goose · Oct 15, 2004

jacob navia said:
Recently there was a discussion in this group about
how to retrieve the file name given a FILE *.

The question raised my curiosity, and after some
research I have come up with a good implementation.

The solution is in the tutorial for lcc-win32
(http://www.cs.virginia.edu/~lcc-win32) page
331.

It piqued my curiosity too, but I doubt that
a good implementation of this exists. I've not read
your solution; make it plain text and I'll read it.
I don't download binaries (at all!).

Here is my solution for the group to pick at

#include <stdio.h>
#include <stdlib.h>

#define fopen(x,y) (save_name(x, y))

FILE *save_name (char *name, char *mode) {
    /* (fopen) in parentheses is not expanded as the macro above */
    FILE *t = (fopen) (name, mode);
    if (t) {
        /* here we save t and filename
           and mode somewhere; in an array
           maybe?
        */
        printf ("%s saved\n", name);
    }
    return t;
}

int main (void) {
    FILE *test = fopen ("test.txt", "w");

    if (test==NULL) {
        printf ("failure\n");
    } else {
        printf ("success\n");
        fprintf (test, "success\n");
        fclose (test);
    }

    return EXIT_SUCCESS;
}

goose,
I suspect that the above is not allowed; please comment.
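
One way to sidestep the question is to leave fopen alone and give the wrapper its own name. The sketch below fills in the "save it in an array" step under that assumption; named_fopen, stream_name_of, named_fclose, and the fixed table sizes are all hypothetical. Like the original, it only knows about streams the program opened through the wrapper, so stdin and friends stay anonymous.

```c
#include <stdio.h>
#include <string.h>

#define MAX_TRACKED 32     /* arbitrary limits for this sketch */
#define MAX_NAME    256

static struct {
    FILE *fp;
    char  name[MAX_NAME];
} tracked[MAX_TRACKED];

/* Open a file and remember the name it was opened with. */
static FILE *named_fopen(const char *name, const char *mode)
{
    FILE *fp = fopen(name, mode);
    size_t i;

    if (fp == NULL)
        return NULL;
    for (i = 0; i < MAX_TRACKED; i++) {
        if (tracked[i].fp == NULL) {       /* first free slot */
            tracked[i].fp = fp;
            strncpy(tracked[i].name, name, MAX_NAME - 1);
            tracked[i].name[MAX_NAME - 1] = '\0';
            break;
        }
    }
    return fp;   /* the stream is usable even if the table was full */
}

/* Return the recorded name, or NULL for streams opened elsewhere. */
static const char *stream_name_of(FILE *fp)
{
    size_t i;

    for (i = 0; i < MAX_TRACKED; i++)
        if (tracked[i].fp != NULL && tracked[i].fp == fp)
            return tracked[i].name;
    return NULL;
}

/* Forget the stream, then close it. fp must be a valid open stream. */
static int named_fclose(FILE *fp)
{
    size_t i;

    for (i = 0; i < MAX_TRACKED; i++) {
        if (tracked[i].fp == fp) {
            tracked[i].fp = NULL;
            break;
        }
    }
    return fclose(fp);
}
```

The lookup uses only == on the stored pointers, so it does not depend on any of the hashing questions raised elsewhere in the thread.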

Dan Pop · Oct 15, 2004

CBFalconer said:
You cast them into integers (allowable, but not the inverse)

Wrong. The cast is allowed in both directions, but the results are not
guaranteed to be meaningful in any direction.

For maximal portability, you have to use the unsigned char array approach.
Even on C99, intptr_t is an optional typedef.

Dan

Dan Pop · Oct 15, 2004

Keith Thompson said:
That was my thought as well, but I can imagine an implementation in
which two FILE* values have the same value (as pointers) but different
representations (as arrays of unsigned char). Realistically, an
implementation is unlikely to generate two such distinct
representations for the same value, but I think a conforming
implementation could do so.

It doesn't matter: you get only one representation from fopen() and you
keep using it. There is no way for that representation to metamorphose
into the other.

Dan

Michael Wojcik · Oct 15, 2004

That was my thought as well, but I can imagine an implementation in
which two FILE* values have the same value (as pointers) but different
representations (as arrays of unsigned char). Realistically, an
implementation is unlikely to generate two such distinct
representations for the same value, but I think a conforming
implementation could do so.

I'm dubious about that, but another option is converting the value
to a string with sprintf and the %p conversion specifier. fscanf
requires that converting the result of a *printf %p generated by
the same program execution using the *scanf %p conversion specifier
produce a void* that compares equal to the original pointer; thus, %p
must produce unique strings for each distinct pointer value (during a
given execution of that program). (C90 7.9.6.2)
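
A sketch of the %p route, assuming C99 snprintf is available and that 128 bytes is enough for whatever text %p produces on the implementation. It also assumes that the same pointer value always prints the same way, which the round-trip requirement alone does not obviously guarantee. The function name is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: hash the text that %p produces for the pointer. */
static unsigned long hash_stream_text(FILE *fp)
{
    char text[128];          /* assumed large enough for any %p output */
    unsigned long h = 5381;  /* djb2 starting value */
    const char *s;

    snprintf(text, sizeof text, "%p", (void *)fp);
    for (s = text; *s != '\0'; s++)
        h = h * 33 + (unsigned char)*s;
    return h;
}
```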

Keith Thompson · Oct 15, 2004

It doesn't matter: you get only one representation from fopen() and you
keep using it. There is no way for that representation to metamorphose
into the other.

You may be right, but I'm still not quite sure of that. Could a
pointer assignment change its representation? Similarly, can a
floating-point assignment change the representation (without changing
the represented value)? I *think* it can; for example, loading a
floating-point value into a register might automatically normalize it.
The same thing could happen with an address register. As long as the
before and after values compare equal, I don't see a problem.

Realistically, though, if automatic pointer normalization happens so
easily, it's unlikely that a non-normalized pointer could survive long
enough to be returned from fopen().

If my guess is right, hashing pointers by converting them to arrays of
unsigned char will probably work reliably on every system other than
the DS9000.

CBFalconer · Oct 15, 2004

Dan said:
CBFalconer said:

You cast them into integers (allowable, but not the inverse)


Wrong. The cast is allowed in both directions, but the results are
not guaranteed to be meaningful in any direction.

For maximal portability, you have to use the unsigned char array
approach. Even on C99, intptr_t is an optional typedef.

I have my doubts. Consider that the representation of a pointer
may contain trap bits, which are accessed by the unsigned char
attack. There is no guarantee that those trap bits do not change
with time and/or actual storage location (of the pointer). The
cast technique eliminates those trap bits. If it doesn't convert
back to the pointer, so what, it is just one phase of the hashing
mechanism.

So I claim that the cast makes the pointer to hash function single
valued, while the unsigned char approach does not. I would be hard
put to find a system where the unsigned char method would not work,
but it is not guaranteed.

Keith Thompson · Oct 16, 2004

CBFalconer said:
Dan Pop wrote: [...]

For maximal portability, you have to use the unsigned char array
approach. Even on C99, intptr_t is an optional typedef.


I have my doubts. Consider that the representation of a pointer
may contain trap bits, which are accessed by the unsigned char
attack. There is no guarantee that those trap bits do not change
with time and/or actual storage location (of the pointer). The
cast technique eliminates those trap bits. If it doesn't convert
back to the pointer, so what, it is just one phase of the hashing
mechanism.

Did you mean padding bits rather than trap bits? A type can have trap
*representations*, but a valid pointer value (of the kind that we're
interested in hashing) won't be one of them.

I don't believe that the cast necessarily eliminates padding bits.

Assume the following:

void *p1 = foo();
void *p2 = bar();
uintptr_t u1 = uintptr_t(p1);
uintptr_t u2 = uintptr_t(p1);

Assume that p1 == p2 (they point to the same address), but that they
have different internal representations (perhaps one is normalized and
the other is not).

We know from C99 7.18.1.4 that (void*)u1 == (void*)u2, but we don't
know that u1 == u2. For example, if the cast simply copies the bits,
the values of u1 and u2 would reflect the difference in
representations of the two pointer values; converting back to void*
yields two pointers that have different representations, but compare
equal to each other.

The cast *might* normalize the representation, but it doesn't have to.
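
The distinction can be put into code. This is only a sketch, assuming uintptr_t is provided; the function names are hypothetical. The first function checks the property C99 7.18.1.4 promises, the second checks the one it does not.

```c
#include <stdint.h>

/* Promised: converting to uintptr_t and back yields a pointer that
   compares equal to the original, for any valid pointer to void. */
static int round_trip_equal(void *p)
{
    uintptr_t u = (uintptr_t)p;

    return (void *)u == p;
}

/* Not promised: that two pointers which compare equal also convert to
   equal integers. On flat-address machines this normally returns 1
   whenever p1 == p2, but that is a property of the platform. */
static int integers_equal(void *p1, void *p2)
{
    return (uintptr_t)p1 == (uintptr_t)p2;
}
```

A hash that relies on integers_equal behaving like == is therefore leaning on the implementation rather than on the standard.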

Chris Torek · Oct 16, 2004

[regarding hashing pointers by first converting them to uintptr_t]

Assume the following:

void *p1 = foo();
void *p2 = bar();
uintptr_t u1 = uintptr_t(p1);
uintptr_t u2 = uintptr_t(p1);

Minor nit: this is (old) C++ syntax; you mean:

uintptr_t u1 = (uintptr_t)p1;

and so on.

Assume that p1 == p2 (they point to the same address), but that they
have different internal representations (perhaps one is normalized and
the other is not).

We know from C99 7.18.1.4 that (void*)u1 == (void*)u2, but we don't
know that u1 == u2. For example, if the cast simply copies the bits,
the values of u1 and u2 would reflect the difference in
representations of the two pointer values; converting back to void*
yields two pointers that have different representations, but compare
equal to each other.

The cast *might* normalize the representation, but it doesn't have to.

Indeed, consider the historical implementations that are the very
reason the C standards are full of this kind of weirdness with
pointer arithmetic. In other words, think back to the 1980s and
C compilers for the IBM PC that ran under MS-DOS with its various
"extender" schemes to access more than 64K and 640K of memory.

One of the models under which code ran had 20-bit pointers, so that
uintptr_t would have to be defined as "unsigned long" ("int" being
only 16 bits on these compilers). If functions foo() and bar()
returned "un-normalized" pointers, and you assigned these to u1 and
u2 via casts, you get -- unnormalized integers. The "normalization"
operation was done by the "==" operators (only). Relational
comparisons ("<" and ">", and their "<=" and ">=" variants) compared
only offsets. This led to the peculiar case that:

printf("p1 is %sequal to p2\n", p1 == p2 ? "" : "not ");
printf("p1 is %sless than p2\n", p1 < p2 ? "" : "not ");

would sometimes print:

p1 is equal to p2
p1 is less than p2

In other words, p1 < p2 && p1 == p2, both at the same time.

(The only things that behave this way on modern CPUs are
floating-point numbers. If x is set to NaN, a surprising number of
comparisons all produce "false" as their result.)
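
A small sketch of the NaN behaviour, assuming IEEE 754 floating point and the C99 NAN macro from <math.h>. The function names are made up for the example.

```c
#include <math.h>

/* Count how many of the five comparisons evaluate to true. */
static int count_true_comparisons(double x, double y)
{
    return (x < y) + (x > y) + (x <= y) + (x >= y) + (x == y);
}

/* With a NaN operand every one of them is false, so this returns 0. */
static int nan_demo(void)
{
    double x = NAN;

    return count_true_comparisons(x, 1.0);
}
```

For ordinary equal values three of the five (<=, >= and ==) are true; with NaN even x == x is false.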

Keith Thompson · Oct 16, 2004

Chris Torek said:
[regarding hashing pointers by first converting them to uintptr_t]

Assume the following:

void *p1 = foo();
void *p2 = bar();
uintptr_t u1 = uintptr_t(p1);
uintptr_t u2 = uintptr_t(p1);


Minor nit: this is (old) C++ syntax; you mean:

uintptr_t u1 = (uintptr_t)p1;

and so on.

D'oh! (It wasn't (deliberately) C++ syntax, it was just a mistake;
I'm not going to admit to the thought process that led to it.) And I
used the wrong variable on the last line. What I meant, of course,
was:

void *p1 = foo();
void *p2 = bar();
uintptr_t u1 = (uintptr_t)p1;
uintptr_t u2 = (uintptr_t)p2;

[snip]

Thanks for confirming (somewhat to my surprise) that there are
real-world examples of what I was talking about.

Mark McIntyre · Oct 16, 2004

Obviously I have just written it, so it is not well tested. Can you give
any examples for your assertion? In which circumstances it doesn't work?

When I run it through my C interpreter on my Palmpilot, or compile it on my
Vax 8800, and on my IBM S/360. And it also fails on my spare PC, on my Mac,
on my Atari, on my Symbian phone, etc etc....

But I think you probably knew that !
