String search and length portability questions

Walter Dnes

I have taken an intro evening C course at a local university, but my
C programming experience is otherwise nil. I've come up with a little
project that I want to cut my teeth on. After reading the FAQ, I have a
few questions.

1) What proportion of modern OSes/compilers would a string in excess of
64K characters break?

2) Is memmem a standard function in C? I want to find whether (and where) a
smaller string occurs within a larger string. Spelunking through
/usr/include/string.h (Red Hat 7.3; gcc 2.96) I find...

#ifdef __USE_GNU
/* Find the first occurrence of NEEDLE in HAYSTACK.
   NEEDLE is NEEDLELEN bytes long;
   HAYSTACK is HAYSTACKLEN bytes long. */
extern void *memmem (__const void *__haystack, size_t __haystacklen,
                     __const void *__needle, size_t __needlelen)
     __THROW __attribute_pure__;

The call looks simple. The parameters would be
- pointer to the larger (containing) string
- containing string size (why size_t rather than int?)
- pointer to the substring
- substring size (why size_t rather than int?)

If the function is void, how do I find out what it has returned?

3) Is there a list somewhere on the web of the standard functions and the
libraries they're found in?

4) What does fubar_t signify, where fubar can be anything? This
looks like some sort of standard naming convention.
 
Malcolm

Walter Dnes said:
1) What proportion of modern OSes/compilers would a string in excess
of 64K characters break?
Not many. It was really only the old DOS compilers which imposed a limit of
64K.
However, a C string is NUL terminated, which means that to do anything much
with it the computer has to scan through to the end. For a string as long as
64K this could be fairly slow, so it would be unusual to store a character
sequence that long as a single string.
 
Walter Dnes

The libraries in which they are found sometimes vary.
http://www.dinkumware.com/refxc.html provides a reference.

A little bit more spelunking allowed me to create a list of all
occurrences of "#include <fubar.h>" in /usr/share/man/man3. Here it is for
everybody else's benefit...

#!/bin/bash
# List every header named on a "#include <...>" line in the section-3 man pages.
cd /usr/share/man/man3
ls -1 |\
grep '^[a-z]' |\
xargs zcat |\
grep '#include' |\
sed 's/^.* <//
s/>.*$//' |\
sort -u > ~/liblist.txt

I can then "man" those headers and get a list of the functions they declare.
 
Dan Pop

Malcolm said:
Not many. It was really only the old DOS compilers which imposed a limit of
64K.

And not even the DOS implementations imposed such a limit if you
didn't mind the overhead of using the huge memory model. The programmer
had to choose between fast and compact code that couldn't manipulate
objects larger than 64K and somewhat slower code that could transparently
handle any object fitting into the memory available to the program.
Malcolm said:
However a C string is NUL terminated, which means that to do anything much
with it the computer has to scan through to the end.

There are plenty of things you can do without having to scan through to
the end: strchr, strstr, str[c]spn, str[n]cmp, strncpy. I would have
added sscanf, but apparently it is not that uncommon for sscanf
implementations to call strlen on their first parameter before starting
the actual scanning.
Malcolm said:
For a string as long as
64K this could be fairly slow, so it would be unusual to store a character
sequence that long as a single string.

It depends on the application domain. If it makes sense to have strings
that long, you just use them.

Dan
 
