The C FAQ

Seebs · Feb 13, 2010

|What this ends up doing is allowing me to intercept every system call that
|deals with the filesystem and is affected by privileges, and act as though
|you had root privileges, or at least, make it look as though it acted as

Somebody already did this. It's called fakeroot. Terrible duplication of
effort if you reinvented it from scratch

We used to use fakeroot. It was unsuitable for our purposes in a number
of ways.

Among them:

1. Little persistence. Fakeroot is good at handling a single build run
which completes in a few minutes. Our system has to handle dozens or
hundreds of runs, over a period of months.
2. Extreme vulnerability to database corruption if anything is changed
outside of it.
3. No real database to speak of; on exit, it writes a flat file list of
plain text data, and it can slurp that back in.
4. No file name tracking. This makes it harder to detect other errors.
5. Can get confused enough to, say, report that a directory is a plain
file, or vice versa.
6. Occasionally crashes, breaking programs relying on it.
7. Also, in default configuration, leaves shared memory segments allocated
and not cleaned up.

There were other issues. After careful study, we concluded that it would be
best to write one from scratch, addressing many of these concerns.

Mine's called pseudo, and we're currently doing a code clean-up pass before
making it more widely available under some GPL variant. Key differences:

* Tracks file names in nearly-all cases. (You can get past it, but it is
very good at finding out file names.)
* Persistent database using sqlite3 maintained by the daemon.
* Database tracks both names and dev/inode pairs.
* Server does consistency checks and correctly handles cases such as, say,
a database entry which has the same inode as an incoming query, but has
a necessarily-incompatible type (e.g., is tagged as a directory but is
not).
* Client code can automatically restart the server if it goes away or isn't
available, so server doesn't need to be encouraged to stay running for a
long time just because a build might be slow.
* Can log every filesystem action it sees.

We used fakeroot in production systems for a couple of product releases.
Typically, if you loaded things up (say, started and stopped it dozens of
times during an extended build series), sooner or later, it would start
yielding garbage like claiming that a plain file was a directory, or giving
arbitrarily wrong results for files, or losing parts of the recorded state.
Or just crash. On average, if you were using the build system heavily, you'd
hit some kind of "fakeroot failed, you have to re-install all target packages"
thing about once or twice a week.

Since we put in pseudo, we've had one customer bug (which was in an area where
we'd decided to not fully implement something in the first release). When the
bug was reported, the report could tell us what file name pseudo expected to
see for a given file, and what file name had shown up, and why this was
surprising, allowing us to fix the bug in five minutes rather than spending
hours trying to figure out when that inode had been created or freed. We've
had a handful of bugs discovered internally during development, but as of this
writing, I have out for review fixes for the last bugs we are aware of that
have ever caused any kind of failure. (In fact, the last bugs we are aware
of. There have been bugs that didn't cause any failures, because pseudo is
extremely aggressive about sanity-checking its results.)

Basically, if you just look at the time that our developers lost to project
directories that got harmed by fakeroot failures, the time I have put into
this has been returned to us about five-fold since I started.

I would not call it a waste of effort. We did research the alternatives
before trying this, and none of them were a good fit for our usage scenario.

(Relevance, such as it is: It's worth noting that there IS such a thing as
a time to rebuild something that already exists.)

-s

Ersek, Laszlo · Feb 13, 2010

With a magic bit of code using "dlsym()" to look up symbols in other
libraries, relying on a feature where it's possible to specify "give me
the next version after the one you already found".

Yes, yes, that's RTLD_NEXT as a handle, as I wrote previously (check
it). Or perhaps you use something more sophisticated, but that's not
what I'm after.

How do you assign the value returned by dlsym() to a
pointer-to-function?

Cheers,
lacos

Ersek, Laszlo · Feb 13, 2010

Mine's called pseudo, and we're currently doing a code clean-up pass before
making it more widely available under some GPL variant.

Thank you.
lacos

Seebs · Feb 13, 2010

Yes, yes, that's RTLD_NEXT as a handle, as I wrote previously (check
it). Or perhaps you use something more sophisticated, but that's not
what I'm after.

How do you assign the value returned by dlsym() to a
pointer-to-function?

=

Seriously, it's that simple.

Actually, it's sorta fancy. What I have is a bunch of declarations
like:

    int (*real_open)(const char *, int, ...) = dummy_open;

and then a table that looks like

[...]
{ "open",
(int (**)(void)) &real_open,
(int (*)(void))dummy_open,
(int (*)(void))wrap_open },
[...]
{ NULL, NULL, NULL, NULL }

and then:

    if (*pseudo_functions.real == pseudo_functions.dummy) {
        int (*f)(void);
        f = dlsym(RTLD_NEXT, pseudo_functions.name);
        if (error_check) {
            /* warn about the error */
        } else {
            *pseudo_functions.real = f;
        }
    }

Once that's done, real_open is a pointer of the correct type that points to
the correct function. Since function pointers may be safely cast to and
from other types, this is *almost* safe. Strictly speaking, I suspect that
I ought to be converting the pointer to the correct type, and storing the
correct types in the table, but that turns out to be very hard to write
clearly, and I'm reasonably confident that any system where dlsym works will
be one where the type punning works too.

-s

Phred Phungus · Feb 13, 2010

Andrew said:
Well, I realized after the fact that if I replaced "though"
with "however" and "Bad" with "Evil", I could have been rid
of those pesky spaces.

But I should add that as near as I can tell, there aren't a
whole lot of people coming here with silly questions caused
by mistrust of the FAQ. Perhaps merely being Usenet in 2010
is enough to be inaccessible to most newbies?

Andrew

I haven't read this thread carefully, but I think DOS lives on in
windows, if only because that's the way it exists in the minds of many
windows users.

If you told someone to "pull up a dos window," would that sentence be
more precise if one said, "I need you to pull up a windows window"?

If one wants to use gnu development tools on the windows platform, he
can go the cygwin route. This option didn't work well for me, probably
because I couldn't do crap off a bash prompt when I tried this. A big
part of *nix is using the command line effectively and quickly. Those of us
who grew up in the shadow of Microsoft will find a steep learning curve here.

Alternatively, he can use a DOS prompt that begins life as a link. This
environment accepts the DOS commands that guys like me were using in
1983. So the shell has subsumed DOS, but with smoke and mirrors, it
seems like DOS to me.

Now that I'm finally on a pimped-out linux terminal, I won't miss DOS a bit.

Ersek, Laszlo · Feb 14, 2010

=

Seriously, it's that simple.

Actually, it's sorta fancy. What I have is a bunch of declarations
like:

I'll insert [1], [2] etc below.

    int (*real_open)(const char *, int, ...) = dummy_open;

and then a table that looks like

[...]
{ "open",
[1] (int (**)(void)) &real_open,
[2] (int (*)(void))dummy_open,
(int (*)(void))wrap_open },
[...]
{ NULL, NULL, NULL, NULL }

and then:

    if (*pseudo_functions.real == pseudo_functions.dummy) {
        int (*f)(void);
[3]     f = dlsym(RTLD_NEXT, pseudo_functions.name);
        if (error_check) {
            /* warn about the error */
        } else {
[4]         *pseudo_functions.real = f;
        }
    }

Once that's done, real_open is a pointer of the correct type that points to
the correct function. Since function pointers may be safely cast to and
from other types, this is *almost* safe. Strictly speaking, I suspect that
I ought to be converting the pointer to the correct type, and storing the
correct types in the table, but that turns out to be very hard to write
clearly, and I'm reasonably confident that any system where dlsym works will
be one where the type punning works too.

In my opinion, [1] might have an alignment problem theoretically (C90
6.3.4 "Cast operators" p6, C99 6.3.2.3 "Pointers" p7); [2] is conformant
if dummy_open and wrap_open are cast back to the correct
pointer-to-function type before being called (C90 6.3.4 p7, C99 6.3.2.3 p8).
I guess [4] might store garbage through the incompatible pointer,
theoretically.
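
To illustrate the point about [2], here is a minimal sketch in plain ISO C of storing function pointers under one generic pointer-to-function type and converting them back to their true type before the call. The names generic_fn, fn_entry, fn_table, dummy_len, wrap_len, find_entry and call_wrapper are hypothetical stand-ins invented for this example; they are not taken from pseudo.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Generic pointer-to-function type, used only for storage, never for calls. */
typedef int (*generic_fn)(void);

/* Hypothetical stand-ins for the thread's dummy_open / wrap_open. */
int dummy_len(const char *s) { (void)s; return -1; }
int wrap_len(const char *s) { return (int)strlen(s); }

struct fn_entry {
    const char *name;
    generic_fn dummy;    /* stored under the generic type */
    generic_fn wrapper;  /* stored under the generic type */
};

static const struct fn_entry fn_table[] = {
    { "len", (generic_fn)dummy_len, (generic_fn)wrap_len },
    { NULL, NULL, NULL }
};

/* Find a table entry by name; returns NULL if there is none. */
const struct fn_entry *find_entry(const char *name)
{
    for (const struct fn_entry *e = fn_table; e->name != NULL; e++)
        if (strcmp(e->name, name) == 0)
            return e;
    return NULL;
}

/* Convert the stored pointer back to its real type, then call it. */
int call_wrapper(const char *name, const char *arg)
{
    const struct fn_entry *e = find_entry(name);
    assert(e != NULL);
    int (*fn)(const char *) = (int (*)(const char *))e->wrapper;
    return fn(arg);
}
```

Calling through e->wrapper directly, without the cast back, would be a call through an incompatible function type, which is what the cited paragraphs leave undefined.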

[3] is undefined under ISO C, but it's interesting to compare the
SUSv[234] specifications of dlsym():

http://www.opengroup.org/onlinepubs/007908799/xsh/dlsym.html

http://www.opengroup.org/onlinepubs/009695399/functions/dlsym.html

http://www.opengroup.org/onlinepubs/9699919799/functions/dlsym.html
http://www.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_12_03

The example of the C90-based SUSv2 simply casts the (void *) value
returned by dlsym() to a pointer-to-function type.

SUSv3 (based on C99) notes in the Rationale that

----v----
The ISO C standard does not require that pointers to functions can be
cast back and forth to pointers to data. Indeed, the ISO C standard does
not require that an object of type void * can hold a pointer to a
function. Implementations supporting the XSI extension, however, do
require that an object of type void * can hold a pointer to a function.
The result of converting a pointer to a function into a pointer to
another data type (except void *) is still undefined, however.
----^----

Thus it requires *conversion*, but then it goes on and, supposedly to
suppress a mandatory warning,

----v----
Note that compilers conforming to the ISO C standard are required to
generate a warning if a conversion from a void * pointer to a function
pointer is attempted as in:

fptr = (int (*)(int))dlsym(handle, "my_function");
----^----

simply accesses the function pointer as if it was a pointer to void:

----v----
int (*fptr)(int);

*(void **)(&fptr) = dlsym(handle, "my_function");
----^----

SUSv4 finally covers both conversion and representation explicitly:

----v----
2.12.3 Pointer Types

All function pointer types shall have the same representation as the
type pointer to void. Conversion of a function pointer to void * shall
not alter the representation. A void * value resulting from such a
conversion can be converted back to the original function pointer type,
using an explicit cast, without loss of information.

Note:
The ISO C standard does not require this, but it is required for
POSIX conformance.
----^----

Another interesting thing is that this requirement was moved from the
XSI extension to the POSIX Base (along with dlsym() itself).

I was curious which way you chose for [3]. The one above follows the
example of SUSv2, clashes with SUSv3 (whose dlsym() rationale and
example are inconsistent anyway, in my view), and it is explicitly
permitted by SUSv4.

I'm obviously unqualified to advise you, but under SUSv4 (ie. a modern
GNU/Linux, even though it's probably not certified), you could make [1]
and [4] more elegant by making the "real" struct member a (void **), and
the "f" auto variable a (void *).
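
A rough sketch of what that could look like, assuming a SUSv4-style platform where function pointers and void * share a representation. The names lookup_entry, lookups, dummy_strlen, real_strlen and resolve_all are made up for the example, and strlen is used only because it is certain to be found in the C library. On glibc, RTLD_NEXT needs _GNU_SOURCE, and older versions need -ldl at link time. Storing through the void ** is still outside what ISO C guarantees; it leans on the POSIX representation requirement quoted above.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* glibc exposes RTLD_NEXT only with this defined */
#endif
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical placeholder that marks a slot as not yet resolved. */
static size_t dummy_strlen(const char *s) { (void)s; return 0; }

/* The typed pointer that callers actually use. */
size_t (*real_strlen)(const char *) = dummy_strlen;

struct lookup_entry {
    const char *name;
    void **real;   /* address of the typed pointer, viewed as void ** */
    void *dummy;   /* placeholder value, viewed as void * */
};

static struct lookup_entry lookups[] = {
    { "strlen", (void **)&real_strlen, (void *)dummy_strlen },
    { NULL, NULL, NULL }
};

/* Resolve every slot still holding its placeholder; returns failure count. */
int resolve_all(void)
{
    int failures = 0;
    for (struct lookup_entry *e = lookups; e->name != NULL; e++) {
        if (*e->real != e->dummy)
            continue;                  /* already resolved */
        dlerror();                     /* clear any stale error state */
        void *f = dlsym(RTLD_NEXT, e->name);
        if (dlerror() != NULL || f == NULL)
            failures++;                /* a real program would warn here */
        else
            *e->real = f;              /* no function-pointer cast needed */
    }
    return failures;
}
```

After resolve_all() returns 0, real_strlen("pseudo") calls the C library's strlen through the typed pointer.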

Thanks,
lacos

Nick · Feb 19, 2010

jacob navia said:
Seebs wrote:

First bet:

It doesn't use any GUI

Second bet:

It doesn't use any sound/video/multimedia

Third bet:

It doesn't use any graphics, or mouse

Obviously, THEN... it is portable. But it would be portable to
Mac/windows/whatever too without much pain.

So what. Apart from a few specialist niche programs like operating
systems and web browsers nobody needs to use old-fashioned things like
that any more. That's all last millennium stuff.

How much GUI, sound or mouse interaction do you think the programmers of
recent hits like Facebook or Twitter have had to do, eh?

Nick, only half tongue in cheek.

jacob navia · Feb 19, 2010

Nick wrote:

So what. Apart from a few specialist niche programs like operating
systems and web browsers nobody needs to use old-fashioned things like
that any more. That's all last millennium stuff.

How much GUI, sound or mouse interaction do you think the programmers of
recent hits like Facebook or Twitter have had to do, eh?

Ahhhh of course.

To use facebook you call a command line program, and you pass it the
list of your friends using the argc/argv mechanism of course...

You obtain a file of contacts, email, etc, and you read it using vi
or emacs.

Sure.

Nick · Feb 19, 2010

jacob navia said:
Nick wrote:

Ahhhh of course.

To use facebook you call a command line program, and you pass it the
list of your friends using the argc/argv mechanism of course...

You obtain a file of contacts, email, etc, and you read it using vi
or emacs.

Sure.

No, you send it a stream of text (let's say something like
GET facebook.com HTTP/1.1\n\n
) and it responds with a stream of text.

Really. That's how it's done.

Have a look at the URLs below. That's all done in C. OK, there's a few
non-ansi bits, but they are to do with data changes when multiple
instances of the program are running simultaneously and similar.

It doesn't use GUI. It doesn't use sound/video/multimedia. It doesn't
use graphics or mouse.

Why on earth should I do all that sort of tedious old-fashioned stuff
when the browser and server can do it for me?

Nick, moving tongue further as it didn't seem to get noticed last time.

gwowen · Feb 19, 2010

To use facebook you call a command line program, and you pass it the
list of your friends using the argc/argv mechanism of course...

From the point of view of the backend, yes, that is pretty much
exactly what it does. Remember, in all likelihood the program
generating your facebook page runs on a machine that does not have a
mouse, keyboard, soundcard or monitor. It's hard to do GUI
programming on such a machine.

From the point of view of the user interface, all that sound and event
handling have been abstracted away to a cross platform mechanism known
as a "web browser".

Phred Phungus · Feb 19, 2010

gwowen said:
From the point of view of the backend, yes, that is pretty much
exactly what it does. Remember, in all likelihood the program
generating your facebook page runs on a machine that does not have a
mouse, keyboard, soundcard or monitor. It's hard to do GUI
programming on such a machine.

From the point of view of the user interface, all that sound and event
handling have been abstracted away to a cross platform mechanism known
as a "web browser".

Jacob,

Are you able to manipulate using C?

bartc · Feb 19, 2010

gwowen said:
From the point of view of the backend, yes, that is pretty much
exactly what it does. Remember, in all likelihood the program
generating your facebook page runs on a machine that does not have a
mouse, keyboard, soundcard or monitor. It's hard to do GUI
programming on such a machine.

From the point of view of the user interface, all that sound and event
handling have been abstracted away to a cross platform mechanism known
as a "web browser".

Maybe that explains why I find facebook totally impossible to use.

Seebs · Feb 19, 2010

Ahhhh of course.

To use facebook you call a command line program, and you pass it the
list of your friends using the argc/argv mechanism of course...

You obtain a file of contacts, email, etc, and you read it using vi
or emacs.

Sure.

You don't seem to have caught his point:

A single web browser is used with thousands of applications -- nearly all of
which are 100% GUI-free apps. A CGI program in C does nothing but standard
input, standard output, and standard error. No mouse, no GUI. And there are
a LOT of those programs.
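
As a rough illustration of how little such a program needs, here is a sketch of a CGI-style responder that touches nothing but an output stream and one environment variable. The function names write_cgi_response and run_cgi are invented for the example; QUERY_STRING is the standard CGI variable that the web server sets before running the program.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Write a minimal CGI response: one header, a blank line, then the body. */
int write_cgi_response(FILE *out, const char *query)
{
    assert(out != NULL);
    if (query == NULL)
        query = "";                    /* variable unset: treat as empty */
    if (fputs("Content-Type: text/plain\r\n\r\n", out) == EOF)
        return -1;
    if (fprintf(out, "query string: %s\n", query) < 0)
        return -1;
    return 0;
}

/* What a CGI program's main() would do: read the environment, write stdout. */
int run_cgi(void)
{
    return write_cgi_response(stdout, getenv("QUERY_STRING"));
}
```

A complete program would simply have main() return run_cgi() == 0 ? EXIT_SUCCESS : EXIT_FAILURE; the browser at the other end supplies all of the mouse and GUI handling.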

-s

Richard Bos · Mar 1, 2010

jacob navia said:
Nick Keighley wrote:

Apparently you have never heard about the PDP 11 then.
A 286 board was MUCH more powerful than the PDP 11,
and Xenix proved that Unix was feasible in a 386 board.

Oh, come on, jacob! We know that you have a tendency to think that your
experiences are all there ever were in computing, but even you must be
aware that the 80_2_86 cannot seriously be called one of the early
microprocessors!

Microsoft targeted the mass market. And won, against all
odds because of the attitude of Unix vendors.

No, Microsoft won because they did then, and still do now, cheat.

Richard

Richard Bos · Mar 1, 2010

Seebs said:
Huh! Well, that's interesting. Maybe we need a poll.

I know that the first time I posted a system-specific question to clc, I
got a very polite note letting me know I would need a system-specific group,
and also an explanation of why.

The first question I posted to c.l.c, I asked whether there was a way to
read directories that did not rely on MS-DOS extensions. Which was in
the FAQ. Which I _had_ read - it was the very reason I had sought out
c.l.c - but not well enough... IIRC the reactions were concise but
polite.

Richard
