fopen

jacob navia

Mr said:
Do you find fopen() absurd and impossible to use too? Or are file or
stream attributes (such as the POSIX permission bits, or O_EXCL) less
important to you than some of the thread attributes that POSIX or
Windows have?

That prompted me to look a bit into the history of fopen(). Where does it
come from? When was the first version of fopen?

The oldest version of it that I found a reference to was in the first
Unix Manual: (http://cm.bell-labs.com/cm/cs/who/dmr/man31.pdf)

mov $filename, r0
jsr r5,fopen; iobuf

It is dated Nov 3rd, 1971. It says in the "Bugs" section:

For greater speed, the buffer should be 512 bytes long. Unfortunately,
this will cause several existing programs to stop working.

!!!

By 1978 things were already well in place:
(http://plan9.bell-labs.com/7thEdMan/index.html)

NAME
fopen, freopen, fdopen - open a stream

SYNOPSIS
#include <stdio.h>
FILE *fopen(filename, type)
char *filename, *type;

DESCRIPTION
fopen opens the file named by filename and associates a stream with it.
fopen returns a pointer to be used to identify the stream in subsequent
operations.
Type is a character string having one of the following values:
"r" open for reading
"w" create for writing
"a" append: open for writing at end of file, or create for writing

----------------------------------------------------------------------

The interface of fopen is thus at least 33 years old. A whole professional
life.

When designing interfaces NOW, taking fopen(), one of the oldest
interfaces still in use, as an example is not really a good idea.

Considerations that were crucial then (small RAM footprint, efficiency,
etc.) aren't as important now, with machines that have 3 or 4
orders of magnitude more RAM and much more power.
 
Ian Collins

In 1978 things are already well in place:
(http://plan9.bell-labs.com/7thEdMan/index.html)

NAME
fopen, freopen, fdopen - open a stream

SYNOPSIS
#include <stdio.h>
FILE *fopen(filename, type)
char *filename, *type;

DESCRIPTION
fopen opens the file named by filename and associates a stream with it.
fopen returns a pointer to be used to identify the stream in subsequent
operations.
Type is a character string having one of the following values:
"r" open for reading
"w" create for writing
"a" append: open for writing at end of file, or create for writing

Does that mean I expire in a couple of years?

When designing interfaces NOW, taking as an example fopen() one of the
oldest interfaces still in use is not really a good idea.

Um, what else do you want when opening a file?

Considerations that then were crucial (small RAM footprint, efficiency,
etc) aren't of such an importance now, with machines that have 3 or 4
orders of magnitude more RAM and much more power.

A) none of those criteria apply to fopen().

B) make that 5 or 6 orders of magnitude!
 
Ian Collins

A buffered disc file copies disc sectors to virtual memory which can be paged
to disc. Which fopen can act as a rather complicated disc to disc copier. Memory
mapping files on pageable devices (as Multics did) gives you more control over
paging and page alignment.

But that's just detail. Let the OS manage all the paging behind the
scenes.
And the request/response nature of sockets isn't handled well with fopen buffers.

They (sockets) don't make sense at all with the fxxx() family.
 
James Kuyper

On 07/ 8/11 10:59 PM, jacob navia wrote: ....

Um, what else do you want when opening a file?

Take a look at the options available when using the Unix system functions
open() and fcntl(). C streams only support a small fraction of those
options. Of course, the fraction that they do support includes all of
the most popular ones; I've seldom had any need to use the ones they
don't support.
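
As one illustration of an option that streams do not expose directly, here is a minimal sketch, assuming a POSIX system, that reaches through a stream to its descriptor with fileno() and sets the close-on-exec flag with fcntl(). The helper name set_close_on_exec is made up for this example.

```c
#include <fcntl.h>
#include <stdio.h>

/* Hypothetical helper: mark the descriptor behind a stream close-on-exec.
   Returns 0 on success, -1 on failure. POSIX only, not ISO C. */
int set_close_on_exec(FILE *fp)
{
    int fd = fileno(fp);
    if (fd < 0)
        return -1;

    /* Read the current descriptor flags so the others are preserved. */
    int flags = fcntl(fd, F_GETFD);
    if (flags < 0)
        return -1;

    return fcntl(fd, F_SETFD, flags | FD_CLOEXEC) < 0 ? -1 : 0;
}
```

The stream itself is still opened with plain fopen(); only the extra attribute needs the system-specific call.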
 
Todd Carnes

Considerations that then were crucial (small RAM footprint, efficiency,
etc) aren't of such an importance now, with machines that have 3 or 4
orders of magnitude more RAM and much more power.

Small RAM footprint, efficiency, etc... *ARE* still important today and
always will be. Not trying to attain such things, when one can, is just
plain laziness on the programmer's part.

Not every C program that is written is destined to be run on the latest
and greatest gamer's rig.

Todd
 
Seebs

A buffered disc file copies disc sectors to virtual memory which can be paged
to disc. Which fopen can act as a rather complicated disc to disc copier. Memory
mapping files on pageable devices (as Multics did) gives you more control over
paging and page alignment.

Er.

In general, C implementations will do a reasonable job of implementing fopen()
sensibly.
Also the buffering of tty style devices only really made sense for the decade
or two when telnet and modem muxes were the dominant connection between
terminals and computers. X-Windows and the Xerox windows that begat Macs and
Windows once again have the CPU reacting to each single character as it is
typed. (Terminal.app and xterm actually use per-character events to simulate a
telnet-like connection.)

This doesn't make any sense.

The major advantage of tty buffering is efficiency over connections where
latency is relatively high. Such as, say, any remote connection whatsoever.
If you've got access to machines which support both character and line
buffering, and aren't physically adjacent, try it out sometime; line buffering
with local editing is a huge win.

And the request/response nature of sockets isn't handled well with fopen buffers.

Which is why no one uses it that way.

fopen does make sense for nonpageable devices like.....ummm....does anyone still
use tape drives? Parallel port printers?

fopen makes sense for plain files. Arguing that it's possible that the buffer
will get paged is... well, frankly, totally irrelevant. The parts of the
system that do buffering are usually aware of those tradeoffs.

-s
 
Seebs

That requires fopen to be unbuffered or buffers to be page aligned.

I have this feeling that if only I were drunk, I could comprehend this.

The OS typically provides the C library, which is written with direct
knowledge of how the OS does paging.

Tell you what. On your choice of modern operating systems, give it a
try. Use an unbuffered read mechanism, then try a buffered one, and see
how they perform. (Note that fopen() allows you to turn off buffering.)
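
For anyone who wants to try that comparison, a minimal sketch of turning buffering off on a stream with the standard setvbuf() call. The helper name open_unbuffered is invented for this example.

```c
#include <stdio.h>

/* Hypothetical helper: open a file as a stream with buffering turned off. */
FILE *open_unbuffered(const char *path, const char *mode)
{
    FILE *fp = fopen(path, mode);
    if (fp == NULL)
        return NULL;

    /* setvbuf must be called before any other operation on the stream. */
    if (setvbuf(fp, NULL, _IONBF, 0) != 0) {
        fclose(fp);
        return NULL;
    }
    return fp;
}
```

Passing _IOFBF or _IOLBF instead of _IONBF selects full or line buffering, so the same program can be timed in each mode.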

The cycle of growth is this:

1. We understand that sometimes fopen() may not behave absolutely perfectly.
2. We try unbuffered I/O.
3. We realize that unbuffered I/O is painfully slow and inefficient.
4. We write a little wrapper around our unbuffered I/O to provide buffering.
5. We realize that the wrapper is so useful it should be in a library.
6. We realize that this is what everyone was telling us about fopen().

-s
 
Seebs

That would explain why so many people have replaced it.

How many are those? I'm not sure I've ever seen a replacement of
fopen() outside of newbie code.

And how often do people use a remote connection any more?

Pretty much constantly. Do you know the word "server"?

Do you seriously think that server farms have keyboards and mice on
everything?

I hate to break it to you, but stdio is increasingly the interface
of last resort.

And your authority for this statement is?

Everything I've seen from you suggests that you are one of those people
who hasn't yet gotten a sense of just how huge the world of computing is,
and you tend to overgeneralize from a very limited personal experience.

I don't know that much about the bulk of the computing world, but I'm
at least *aware* that my experience is in a few smallish subsets. All I
mostly work with is desktops, servers, embedded systems, and the like.

-s
 
Seebs


Language implementors are a sort of special case. :)
Again your experience betrays you.
Yeah.

I have run HTTP through telnet for debugging,
but usually I use a browser. Safari/FireFox/Chrome/Opera use the sockets, not
me. MT-Newswatcher uses sockets, not me. Mail uses sockets, not me. I don't even
do FTP by hand anymore.

Yeah, uhm.

Not what I was talking about.

Server farms are usually unmonitored. Some servers have a keyboard and monitor
mounted in the rack, some are attached from a trolley, some are remotely
administered. For the most part however they churn away without any people
connecting to a telnet, ssh, or serial port.

And yet, people who use stuff like build farms end up logging into machines
all the time.

I think the difference is, you're really happy to view the latest and newest
and best things as the entire world, and I'm used to having to care about
stuff that isn't like that.

There's a reason that screen and tmux exist and are actively maintained, and
it's not that no one uses command line tools remotely anymore.

-s
 
BGB

Small RAM footprint, efficiency, etc... *ARE* still important today and
always will be. Not trying to attain such things, when one can, is just
plain laziness on the programmer's part.

Not every C program that is written is destined to be run on the latest
and greatest gamer's rig.

yeah...

and, my projects, although relatively "lean" by modern PC standards,
will still have to have their memory footprint somewhat shaved down to
really be practical, say, on Android devices (me starting a new personal
project of trying to port some of my VM stuff, ... to work on Android).
I doubt it will really be all that terribly useful though, given both
the constrained resources and nature of the UI (it has given me some
ZUI-related thoughts though).

actually, I am initially porting it to Linux/ARM in an emulator (first
major goal: make it all work on ARM, as-is it is all a bit x86
specific...), rather than directly to Android, since it is a little
easier to develop and test on a target where I have things like, say,
the ability to look at console output, ...


anyways, in what ways is fopen()'s interface subject to performance or
memory concerns?...

well, in my case I have a VFS which does a lot more "stuff", but still
uses an interface similar to "fopen()"/... it has itself been in use in
my projects since the late 90s or so, originally intended to address
some uses which were not readily addressable with stdio's FS interface
(mostly in-program virtual files, ...).

some years back, its internals were rewritten, mostly to make it simpler
and less nasty, but dropping some functionality (mostly sockets and more
esoteric features), which I have not as of yet bothered to re-add.


or such...
 
Angel

I have this feeling that if only I were drunk, I could comprehend this.

The OS typically provides the C library, which is written with direct
knowledge of how the OS does paging.

Tell you what. On your choice of modern operating systems, give it a
try. Use an unbuffered read mechanism, then try a buffered one, and see
how they perform. (Note that fopen() allows you to turn off buffering.)

Well, why not, I'm bored. Very quick & dirty file copying program, using
stream I/O:


#include <stdio.h>

int main(int argc, char *argv[])
{
    if (argc < 3)
    {
        fprintf(stderr, "Usage: %s <src> <dst>\n", argv[0]);
        return 1;
    }

    FILE *src = fopen(argv[1], "r");
    FILE *dst = fopen(argv[2], "w");
    if (src == NULL || dst == NULL)
    {
        perror("fopen");
        return 1;
    }

    /* Copy one character at a time; stdio buffers the underlying I/O. */
    int data;
    while ((data = fgetc(src)) != EOF)
        fputc(data, dst);

    fclose(src);
    fclose(dst);
    return 0;
}

Test results of copying one file, size 46M, from a nfs share to a local
disk:

real 0m12.656s
user 0m1.227s
sys 0m0.087s


Same program, now done in unbuffered system calls. Note that the program
now is non-portable and more complex:


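#include <stdio.h>

#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

int main(int argc, char *argv[])
{
    if (argc < 3)
    {
        fprintf(stderr, "Usage: %s <src> <dst>\n", argv[0]);
        return 1;
    }

    int src = open(argv[1], O_RDONLY);
    /* O_CREAT so a missing destination is created, as fopen(..., "w") does. */
    int dst = open(argv[2], O_WRONLY | O_CREAT | O_TRUNC, 0666);
    if (src < 0 || dst < 0)
    {
        perror("open");
        return 1;
    }

    /* One system call per byte in each direction. */
    unsigned char data;
    while (read(src, &data, 1) == 1)
    {
        if (write(dst, &data, 1) != 1)
        {
            perror("write");
            return 1;
        }
    }

    close(src);
    close(dst);
    return 0;
}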

Test results, same file:

real 0m28.066s
user 0m6.861s
sys 0m21.180s


Mm... So using stream I/O more than halves the run time (12.7s versus
28.1s real), makes my program portable, and slightly easier to understand.
I'd say it pays to use stream I/O wherever possible. :)
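
The low-level version above makes one read() and one write() call per byte, which is presumably where most of the sys time goes. For completeness, a minimal sketch of the same copy done in larger blocks, assuming POSIX. The helper name copy_fd and the 64 KiB block size are arbitrary choices for this example, and no timings were taken for it.

```c
#include <unistd.h>

/* Hypothetical helper: copy everything from src to dst in 64 KiB blocks.
   Returns 0 on success, -1 on a read or write error. */
int copy_fd(int src, int dst)
{
    char buf[65536];   /* block size is an arbitrary choice for this sketch */
    ssize_t n;

    while ((n = read(src, buf, sizeof buf)) > 0) {
        ssize_t done = 0;
        while (done < n) {   /* write() may accept fewer bytes than asked */
            ssize_t w = write(dst, buf + done, (size_t)(n - done));
            if (w <= 0)
                return -1;
            done += w;
        }
    }
    return n < 0 ? -1 : 0;
}
```

This is essentially what the stdio version already does behind fgetc() and fputc(), minus the portability.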


For those that want to know, the system used is an Intel Quad Core 6600
(2.6GHz), running 64-bit Gentoo Linux. The system where the nfs server
resides is a Sun Netra 200 (UltraSPARC IIe, 500 MHz RISC), running 64-bit
Gentoo Linux with 32-bit user land. Network is 100 Mbit CAT-5E.
 
Ian Collins

Take a look at the options available when using the Unix system functions
open() and fcntl(). C streams only support a small fraction of those
options. Of course, the fraction that they do support includes all of
the most popular ones; I've seldom had any need to use the ones they
don't support.

Exactly my point, thank you.
 
Ian Collins

That would explain why so many people have replaced it.


And how often do people use a remote connection any more? An xterm is not a
remote connection. Most people now use desktops, laptops, and workstations. They
are rarely using telnet, ftp, etc directly.

Most of the time, most days.

I hate to break it to you, but stdio is increasingly the interface of last resort.

Says who?

stdio is the best choice until proven otherwise. Even in the small
embedded world, stdio is the common channel for communicating with the
outside world.
 
Keith Thompson

China Blue Dolls said:
Keith Thompson said:
In general, C implementations will do a reasonable job of
implementing fopen() sensibly.

That would explain why so many people have replaced it.

How many are those? I'm not sure I've ever seen a replacement of
fopen() outside of newbie code.

Here's one: http://tcl.sourceforge.net/
Possibly another: http://www.python.org/download/source/
[...]

Those are languages other than C. If those are the best examples you
have, I don't think you're supporting your point very well.

What language is the Tcl library written in?

As far as I can tell, most of the library appears to be written
in Tcl. The language implementation appears to be in C.

Are you trying to make some point about C code that uses some kind
of replacement for fopen() rather than using fopen() itself? If so,
please be more specific. Posting the URLs for two large source
trees that may or may not contain something relevant is not helpful.

On the other hand, if you don't *want* to make your point clearly,
that's fine with me.
 
Seebs

China Blue Dolls said:
How many are those? I'm not sure I've ever seen a replacement of
fopen() outside of newbie code.

Here's one: http://tcl.sourceforge.net/
Possibly another: http://www.python.org/download/source/ [...]

Those are languages other than C. If those are the best examples you
have, I don't think you're supporting your point very well.

Sort of. Those are languages of which the most widely-used implementations
are indeed in C. And they tend to replace stdio so they can do their own
magic.

Not sure I'd consider those very representative cases, though.

-s
 
Dr Nick

China Blue Dolls said:
That would explain why so many people have replaced it.


And how often do people use a remote connection any more? An xterm is
not a remote connection. Most people now use desktops, laptops, and
workstations. They are rarely using telnet, ftp, etc directly.

Just about. But "the cloud" and web-based services look like pushing
it back again.

All my current work expects to get streams of stuff in. Even when you
write things to cope with character-by-character activities (suggesting
word completions for example) you are reading from HTTP GETs and sending
HTTP formatted blocks of results.

All of which fits the tty model to a T.

I hate to break it to you, but stdio is increasingly the interface of
last resort.

Obviously we program in very different domains. What's nice is that C
lets us do that.
 
Dr Nick

James Kuyper said:
Take a look at the options available when using the Unix system functions
open() and fcntl(). C streams only support a small fraction of those
options. Of course, the fraction that they do support includes all of
the most popular ones; I've seldom had any need to use the ones they
don't support.

I've had the need for "create and open the file if it doesn't exist,
otherwise fail" and had to stick it in my "system specific bodges" bit.
Very useful if more than one instance of your program (or other programs
using the same file structure) could be running at once.
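
For what it's worth, the C11 revision of the standard added an exclusive mode to fopen(): a mode string ending in x (such as "wx") makes the call fail if the file already exists. A minimal sketch, assuming a C11 library; the helper name create_exclusive is made up for this example.

```c
#include <stdio.h>

/* Hypothetical helper: create path for writing, failing if it already exists.
   Relies on the C11 "x" mode; older libraries may reject or ignore it. */
FILE *create_exclusive(const char *path)
{
    return fopen(path, "wx");
}
```

On POSIX systems without C11 support, the same effect comes from open() with O_CREAT | O_EXCL followed by fdopen(), which is the kind of system-specific bodge described above.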
 
Malcolm McLean

I hate to break it to you, but stdio is increasingly the interface of last resort.

Typically on a modern hosted system you've got programs which do
processing, and write to and from stdio. Then you've got programs with
graphical user interfaces which will use stdio for communicating with
backing store but not for communicating with the user. Then you've got
Internet aware programs that use stdio for spitting out html,
sometimes for backing store, but also use sockets for retrieving data
from the web, and may or may not have a GUI. Finally you've got
programs like the clipboard which use special OS protocols but which
the user expects to work with most programs.

Basically it's a mess and stdio is part of that mess. You use stdio
for gluing together different programs, but it's far less efficient
than calling subroutines, which sometimes matters and sometimes
doesn't. Non-technical users like to be protected from the console as
far as possible. Technical users sometimes, but not always, prefer it,
depending on the job in hand. Sometimes stdio is semi-transparent to
networks, and often you do want to specify that data is to be stored
locally.
 
Nobody

Typically on a modern hosted system you've got programs which do
processing, and write to and from stdio. Then you've got

Then you've got programs which do processing, and use low-level I/O
(open/read/write/close) or memory-mapped I/O.

For many of the GNU coreutils programs (i.e. classic Unix "filter"
programs), stdio is the interface of last resort. Even if they use
buffered I/O, they typically open() the file first then use fdopen() to
get a FILE* from the descriptor if they want it.
 
