How to write code which minimizes page faults?

Sune

Hi!

Pre-requisites:
-------------------
1) Suppose I'm about to write a fairly large program, say 500K
lines.
2) Part of this code will consist of 50 structs with, say, at most
1K bytes of data.
3) These structs are to be used throughout the other 500K lines, in
various places.
4) Linux, SUN Solaris

Design decisions:
-------------------
- Add functions to the structs so they handle their own data. Kind of
C++ OOP.
- Just make the structs carry data, and write macros to handle the
data of each struct, and insert those macros in appropriate places
in the 500 K lines of code, wherever they are used.

This question does not deal in what is good programming in terms of
macros or OOP and data encapsulation in C!!!!! So please no remarks
on this unless it makes your answers to the questions below more
clear.

Question:
--------------------
- I'm afraid that adding code/functions to structs results in worse
locality, i.e. this will result in increased paging. Is that so?

- Is inserting code in the 500K lines wherever it's needed worth
the effort? Or am I exaggerating the risk and consequences of
page faults?

Thanks in advance
/Sune
 
EventHelix.com

Use of object oriented programming should result in keeping related
code and data in close proximity. This would improve the locality
of reference for code as well as data.

Other than that, you should not worry about this issue. Do not
compromise readability and maintainability of code to achieve higher
locality of reference.
 
Sune

Hi!

What you say makes sense, thanks for sharing.

IF ANYBODY HAS ANYTHING TO ADD, PLEASE DO. I'LL GO BACK TO THIS MESSAGE
ON AND OFF FOR 1-2 WEEKS.

BRs
/Olle
 
Artie Gold

Unfortunately, none of the information you've provided says much about
what the (potential) paging behavior of your program will be -- assuming
there's any paging at all. (50 1K structs is a trivial amount, so,
likely, is the amount of machine code corresponding to a 500KLOC source.)

What is more significant here, assuming there's enough data where paging
becomes an issue, is the *pattern* in which data are accessed (this
often works in ways that are counterintuitive, BTW).

Perhaps the classic situation is as follows:
The amount of memory you have is M whatevers.
You access M + 1 whatevers of data in a loop -- i.e. just barely more
than will fit in memory.

If you count the number of page faults that would be generated using a
typical LRU replacement algorithm, you'll see that this is pretty much a
pessimal situation -- which absolutely *clobbers* performance.

If, in this situation, you were to split your loop so it only iterates
over half the memory at a time, the number of page faults would be
*drastically* reduced; in fact they would nearly be eliminated! [Doing
the calculation is a very worthwhile exercise.]

HTH,
--ag
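
The calculation suggested above can be checked with a small simulation. What follows is only a sketch under stated assumptions: it models a pure LRU policy over a fixed number of page frames, which is a simplification of what a real kernel does, and every name in it (`lru_sim`, `faults_whole_loop`, `faults_split_loop`) is made up for illustration.

```c
#include <assert.h>

#define MAX_FRAMES 64  /* arbitrary cap for this sketch */

/* Hypothetical model: `frames` physical page frames under strict LRU. */
struct lru_sim {
    int  page[MAX_FRAMES];      /* page held by each frame, -1 if empty */
    long last_use[MAX_FRAMES];  /* logical time of the last access */
    int  frames;
    long clock;
    long faults;
};

static void lru_init(struct lru_sim *s, int frames)
{
    int i;
    assert(frames > 0 && frames <= MAX_FRAMES);
    s->frames = frames;
    s->clock = 0;
    s->faults = 0;
    for (i = 0; i < frames; i++) {
        s->page[i] = -1;
        s->last_use[i] = 0;  /* empty frames look oldest, so they fill first */
    }
}

/* Access one page; on a miss, replace the least recently used frame. */
static void lru_touch(struct lru_sim *s, int page)
{
    int i, victim = 0;
    s->clock++;
    for (i = 0; i < s->frames; i++) {
        if (s->page[i] == page) {       /* hit: refresh the timestamp */
            s->last_use[i] = s->clock;
            return;
        }
        if (s->last_use[i] < s->last_use[victim])
            victim = i;
    }
    s->faults++;                        /* miss: count a page fault */
    s->page[victim] = page;
    s->last_use[victim] = s->clock;
}

/* Sweep pages [first, last) in order, `passes` times. */
static void sweep(struct lru_sim *s, int first, int last, int passes)
{
    int pass, p;
    for (pass = 0; pass < passes; pass++)
        for (p = first; p < last; p++)
            lru_touch(s, p);
}

/* One loop over all pages, repeated `passes` times. */
static long faults_whole_loop(int frames, int pages, int passes)
{
    struct lru_sim s;
    lru_init(&s, frames);
    sweep(&s, 0, pages, passes);
    return s.faults;
}

/* Same total work, but each half is finished before the other starts. */
static long faults_split_loop(int frames, int pages, int passes)
{
    struct lru_sim s;
    lru_init(&s, frames);
    sweep(&s, 0, pages / 2, passes);
    sweep(&s, pages / 2, pages, passes);
    return s.faults;
}
```

With 8 frames, 9 pages and 100 passes, `faults_whole_loop(8, 9, 100)` counts 900 faults (every single access misses), while `faults_split_loop(8, 9, 100)` counts 9, one per page. The numbers are specific to this toy model, but the ratio is the point.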
 
Malcolm

Sune said:
- I'm afraid that adding code/functions to structs results in worse
locality, i.e. this will result in increased paging. Is that so?

In C you cannot have functions as members of structs; in C++ you can.
There's quite a strong argument that if you want to hold your data in plain
structs, you should use C rather than C++. C++ people on comp.lang.c++ might
have more to say on this.

Once you move from plain data types, you begin to lose control of the way
your data is laid out in memory. It becomes much more difficult to keep
track of which data is accessed and when.

Sune said:
- Is inserting code in the 500K lines wherever it's needed, worth
the effort? or am I over exaggerating the risk and consequences of
page faults?

We cannot say. Is running time absolutely critical, and is the speed of
memory access the real culprit? Sometimes the answer will be yes, but it is
unusual, and getting more unusual, for these types of questions to dominate
design decisions.

Generally you should just lay out structures to make them readable, and
assume that the compiler does a reasonable job of putting them on
appropriate page boundaries. However, sometimes it is worth trying to beat
the compiler to get that little bit of extra performance.
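
Whether memory access "is the real culprit" can be measured rather than guessed. The sketch below assumes a POSIX system (Linux and Solaris both provide `getrusage`) and reads the process's minor and major page-fault counters before and after a piece of work. The helper names `measure_faults`, `touch_every_page`, `struct fault_counts` and `struct buffer` are invented for this example.

```c
#define _POSIX_C_SOURCE 200112L

#include <assert.h>
#include <stddef.h>
#include <sys/resource.h>
#include <unistd.h>

/* Hypothetical result type: faults incurred while the work ran. */
struct fault_counts {
    long minor;  /* serviced without disk I/O */
    long major;  /* required disk I/O, i.e. real paging */
};

/* Run work(arg) and report how many page faults the process took. */
static int measure_faults(void (*work)(void *), void *arg,
                          struct fault_counts *out)
{
    struct rusage before, after;

    assert(work != NULL && out != NULL);
    if (getrusage(RUSAGE_SELF, &before) != 0)
        return -1;
    work(arg);
    if (getrusage(RUSAGE_SELF, &after) != 0)
        return -1;
    out->minor = after.ru_minflt - before.ru_minflt;
    out->major = after.ru_majflt - before.ru_majflt;
    return 0;
}

/* Example workload: write one byte in every page of a buffer. */
struct buffer {
    unsigned char *data;
    size_t size;
};

static void touch_every_page(void *arg)
{
    struct buffer *b = arg;
    long page = sysconf(_SC_PAGESIZE);
    size_t step = (page > 0) ? (size_t)page : 4096;  /* fallback is a guess */
    size_t i;

    for (i = 0; i < b->size; i += step)
        b->data[i] = 1;
}
```

If the major-fault count stays at or near zero under a realistic workload, paging is probably not the problem, and the design can be chosen on other grounds.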
 
Christian Bau

Would be better if you posted what you actually try to achieve.

If you try to keep a bunch of programmers busy for an extended amount of
time, modifying 500,000 lines of code and fixing all the bugs that you
introduce while making these changes seems a good idea. Who is going to
pay for it?
 
akarl

Why do you want to use C (a high level assembler) for such a large program?

August
 
E. Robert Tisdale

Sune said:
Design decisions:
-------------------
- Add functions to the structs so they handle their own data. Kind of
C++ OOP.

Don't do this.

Sune said:
- Just make the structs carry data
and write macros to handle the data of each struct

Use inline [static] functions instead,
and [use those inline functions] in appropriate places
in the 500 K lines of code, wherever they are used.
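
One common reason for preferring a `static inline` function to a macro is that a function evaluates each argument exactly once. The sketch below is an illustration only: `struct counter`, `COUNTER_SET_CLAMPED` and `counter_set_clamped` are hypothetical names, not anything from the poster's code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical plain-data struct. */
struct counter {
    int value;
    int limit;
};

/* Macro version: the argument `v` appears twice in the expansion. */
#define COUNTER_SET_CLAMPED(c, v) \
    ((c)->value = ((v) > (c)->limit ? (c)->limit : (v)))

/* Function version: `v` is evaluated once, and its type is checked. */
static inline int counter_set_clamped(struct counter *c, int v)
{
    assert(c != NULL);
    c->value = (v > c->limit) ? c->limit : v;
    return c->value;
}

/* Each helper returns how many times `next++` was actually evaluated. */
static int evaluations_with_macro(void)
{
    struct counter c = { 0, 10 };
    int next = 0;
    COUNTER_SET_CLAMPED(&c, next++);  /* compares, then evaluates again */
    return next;
}

static int evaluations_with_inline(void)
{
    struct counter c = { 0, 10 };
    int next = 0;
    counter_set_clamped(&c, next++);
    return next;
}
```

Here `evaluations_with_macro()` returns 2 and `evaluations_with_inline()` returns 1. A compiler is free to expand the inline function at the call site, so the function form need not cost anything compared with the macro.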

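Sune said:
Question:

- I'm afraid that adding code/functions to structs results in worse
locality, i.e. this will result in increased paging. Is that so?

It's nonsense.

Sune said:
- Is inserting code in the 500K lines wherever it's needed
worth the effort?

Use inline [static] functions instead.

Sune said:
Or am I over exaggerating the risk and consequences of page faults?

Yes.

Just write code that is as clean and readable as possible.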

You can't include functions in a struct.
You can include *pointers* to functions in structs
but *don't* do it unless you need run-time polymorphism.
If you need run-time polymorphism,
include a pointer to a "virtual function table" in your struct
then write inline [static] functions
to (de)reference the appropriate function from the table.
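
A minimal sketch of the arrangement described above might look like the following. The shape types, the `shape_vtable` layout and all function names are assumptions made up for this example; only the overall pattern (a pointer to a table of function pointers, reached through a `static inline` wrapper) comes from the post.

```c
#include <assert.h>
#include <stddef.h>

struct shape;

/* The "virtual function table": one shared instance per concrete type. */
struct shape_vtable {
    long (*area)(const struct shape *self);
};

/* The struct carries data plus a single pointer to its table. */
struct shape {
    const struct shape_vtable *vt;
    long width;
    long height;
};

static long rect_area(const struct shape *s)
{
    return s->width * s->height;
}

static long triangle_area(const struct shape *s)
{
    return s->width * s->height / 2;
}

static const struct shape_vtable rect_vt     = { rect_area };
static const struct shape_vtable triangle_vt = { triangle_area };

/* Callers use this wrapper and never touch the table directly. */
static inline long shape_area(const struct shape *s)
{
    assert(s != NULL && s->vt != NULL && s->vt->area != NULL);
    return s->vt->area(s);
}

/* Run-time polymorphism: one loop, different behaviour per element. */
static long total_area(const struct shape *shapes, size_t count)
{
    long sum = 0;
    size_t i;
    for (i = 0; i < count; i++)
        sum += shape_area(&shapes[i]);
    return sum;
}
```

For an array holding a 4x3 rectangle (`{ &rect_vt, 4, 3 }`) and a 4x3 triangle (`{ &triangle_vt, 4, 3 }`), `total_area` returns 18. Each object pays for one pointer regardless of how many operations the table holds, which is why the table is shared rather than embedded.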
 
Default User

akarl wrote:

Why do you want to use C (a high level assembler) for such a large
program?


Why do you want to make false and ridiculous statements on newsgroups?




Brian
 
Malcolm

akarl said:
Why do you want to use C (a high level assembler) for such a large
program?

C++ is a good language for many complex projects. It doesn't follow that C
is a bad or worse language for such projects.
 
akarl

Default said:
Why do you want to make false and ridiculous statements on newsgroups?

No, let me rephrase that:

Why do you want to use a low-level and unsafe programming language
for a project of this size?

It's not a rant, I'm only curious.


August
 
Default User

akarl wrote:

Why do you want to use a low-level and unsafe programming language
for a project of this size?

Why would you bother to post to a newsgroup dedicated to a language you
despise?

akarl wrote:
It's not a rant, I'm only curious.

It's more a troll, I'd say. So,


*plonk*




Brian
 
akarl

Default said:
Why would you bother to post to a newsgroup dedicated to a language you
despise.

I don't. I think C is good for making small, efficient and/or low-level
programs.

Default said:
It's more a troll, I'd say. So,

The comp.lang.* groups are not only for religious followers of one
language, you know. I would certainly be a troll if I said that C sucks
in every respect, but I haven't.

Since neither you nor Sune seem to be able to answer my question you
probably have no good answer, so I won't be bothering you with it.


August
 
Richard Bos

akarl said:
No, let me rephrase that:

Why do you want to use a low-level and unsafe programming language
for a project of this size?

Let me, then, rephrase Brian's reply to fit your rephrased question:

Why do you want to make false and ridiculous statements on newsgroups?

akarl said:
It's not a rant, I'm only curious.

Feh.

Richard
 
Sune

Hi guys!

Thanks for giving me insight on a non-issue it seems. I'm usually a C++
programmer so that's why some of my suggestions went off topic (like
functions in structs).

The reason I need info on C is that I'm rather fed up with OO and think
I can achieve the same abstractions and the necessary level of
encapsulation in C without having to instantiate 25-100 objects in the
traffic path.

I believe C with small intelligent C++ objects is the way to
go...together with exceptions, RAII etc. Nice mix I think...

BRs
/Olle
 
Malcolm

akarl said:
Since neither you nor Sune seem to be able to answer my question you
probably have no good answer, so I won't be bothering you with it.

C is a simple language, so maintenance and configuration is easy. That's
what you probably mean by "low level".
All programming languages are unsafe, in that none can guarantee that the
programmer hasn't made an error. Since C is so simple, mistakes are unlikely
to be due to some quirk of the language.
 
akarl

Malcolm said:
C is a simple language, so maintenance and configuration is easy. That's
what you probably mean by "low level".

No, I mean direct memory access, raw pointers etc.

Malcolm said:
All programming languages are unsafe, in that none can guarantee that the
programmer hasn't made an error.

A safe language is a language where a large class of errors are detected
when they occur and not when data gets corrupted. Such a language
provides e.g. array bounds checking, garbage collection and restricted
pointers.

Malcolm said:
Since C is so simple, mistakes are unlikely
to be due to some quirk of the language.

Lots of laughs. You're being ironic, I hope. The quirks of C are part of
its charm and success. If you want to see a *real* simple (procedural)
language, check out Oberon-2.

August
 
Gordon Burditt

Malcolm said:
All programming languages are unsafe, in that none can guarantee that the
programmer hasn't made an error.

Unless, of course, it is impossible to make an error. Consider an
assembly language with one-bit opcodes, NOOP and HALT, which can
be arranged in any order. (A HALT opcode is implied if you run
off the end of the program.) Of course, you can't do anything
USEFUL with it, either. All those ugly things that cause errors
like pointers, variables, arithmetic, loops, etc. are prohibited.

Gordon L. Burditt
 
