Please help optimize (and standarize) this code...

Mark F. Haigh · Mar 13, 2005

gtippery wrote:

The platform limit is 64KB for any one item (i8086). The array would
be the limiting factor in this case (at something over 5,000
elements),

[OT]

The limits change depending on the compiler / memory model combination.
I know there's one out there (a huge model variant) using 16 bit int
and 32 bit size_t. Can't remember which compiler, and don't care to--
hopefully my DOS programming days are far behind me (pun intended),
never to return.

If size_t isn't actually an int (of some standard type), isn't that
going to be a problem in the for()? I thought the index variable had
to be an integer or enumeration.

The standard states (7.17) that "size_t [...] is the unsigned integer
type of the result of the sizeof operator". Use it like any other
integral type, including in for statements and subscripting operators
(ie []). Generally, size_t is a better choice than int for array
subscripting.

So, making a few simple modifications to my previously posted code:

---snip---
#include <stdio.h>

#define MAXCOL 2
#define NAMLEN 8
#define EXTLEN 3

struct NameExt
{
    char name[NAMLEN];
    char ext[EXTLEN];
};

int main(void)
{
    /* Some sample filenames */
    struct NameExt list[] = {
        { "One     ", "1  " },
        { "TwoTwo  ", "22 " },
        { "ThreeThr", "333" },
        { "Four    ", "4  " },
        { "FiveFive", "55 " },
        { "SixSixSi", "666" },
    };

    /* Print the array out */
    size_t i;
    for(i = 0; i < sizeof(list) / sizeof(list[0]); i++)
        printf("%-*.*s.%-*.*s%s",
               NAMLEN, NAMLEN, list[i].name,
               EXTLEN, EXTLEN, list[i].ext,
               (i + 1) % MAXCOL ? " " : "\n");

    return 0;
}
--- snip ---

[mark@icepick ~]$ gcc -Wall -O2 -ansi -pedantic foo.c -o foo
[mark@icepick ~]$ ./foo
One     .1   TwoTwo  .22
ThreeThr.333 Four    .4
FiveFive.55  SixSixSi.666

I believe that's what you're looking for. That's the simplest code I
can think of off the top of my head.

Mark F. Haigh
(e-mail address removed)

gtippery · Mar 13, 2005

Mark said:
gtippery wrote:

If size_t isn't actually an int (of some standard type), isn't that
going to be a problem in the for()? I thought the index variable had
to be an integer or enumeration.

The standard states (7.17) that "size_t [...] is the unsigned integer
type of the result of the sizeof operator". Use it like any other
integral type, including in for statements and subscripting operators
(ie []). Generally, size_t is a better choice than int for array
subscripting.

I still don't understand why it's "better", by which I suppose you mean
"bigger" (meaning wider range - for positive values, of course).
Wouldn't an implementation with e.g. 16-bit size_t and 32-bit type int
meet the spec you quote? Do you mean it's _likely_ to be bigger? Or
do you just mean that for a given size, unsigned int is (probably)
"bigger" than signed int?

Or maybe I've missed the point. Other than range, is there some reason
specific to array subscripting to prefer size_t ?

That is indeed the desired output format, and indeed simple. With
luck, the compiler will factor out the constant expression in the for's
test expression.

I note you've changed the internal data representation somewhat, but I
assume it was just to simplify the example by using preinitialization.
I wanted to do that for the initial posting, but couldn't figure out
how. Is that an array of structures of strings? (As I mentioned, I
have trouble with C's declaration syntax, but I'm learning -- I hope.)

Michael Mair · Mar 13, 2005

gtippery said:
Mark said:

gtippery wrote:

If size_t isn't actually an int (of some standard type), isn't that
going to be a problem in the for()? I thought the index variable had
to be an integer or enumeration.

The standard states (7.17) that "size_t [...] is the unsigned integer
type of the result of the sizeof operator". Use it like any other
integral type, including in for statements and subscripting operators
(ie []). Generally, size_t is a better choice than int for array
subscripting.

I still don't understand why it's "better", by which I suppose you mean
"bigger" (meaning wider range - for positive values, of course).
Wouldn't an implementation with e.g. 16-bit size_t and 32-bit type int
meet the spec you quote? Do you mean it's _likely_ to be bigger? Or
do you just mean that for a given size, unsigned int is (probably)
"bigger" than signed int?

Or maybe I've missed the point. Other than range, is there some reason
specific to array subscripting to prefer size_t ?

size_t is the type of the result of the sizeof operator and the
type of argument taken by the dynamic memory allocation routines
malloc/calloc/realloc -- so, basically, every object you work with
has a size in bytes which can be expressed in size_t. The same
guarantee does not hold for short, int, long of either signed or
unsigned variety. So, using size_t for your index variables, you
can _never_ (*) go wrong. The downside is that you have to be
more careful with your loop tests as unsigned integer types never
can provide values <0.

(*) never: There are no absolutes. Using automatic or static
or dynamically allocated storage, within standard C you will be
on the safe side. Note that size_t has not to be large enough
to count, for example, the number of bytes in a file.

[snip: solved problem]

Cheers
Michael

Mark F. Haigh · Mar 13, 2005

gtippery wrote:

I still don't understand why it's "better", by which I suppose you mean
"bigger" (meaning wider range - for positive values, of course).
Wouldn't an implementation with e.g. 16-bit size_t and 32-bit type int
meet the spec you quote? Do you mean it's _likely_ to be bigger? Or
do you just mean that for a given size, unsigned int is (probably)
"bigger" than signed int?

Or maybe I've missed the point. Other than range, is there some reason
specific to array subscripting to prefer size_t ?

When you're talking about the *size* of things, use size_t. Take the
canonical example (incidentally, on a DOS-like platform):

#define SIZE 20000
char buf[SIZE];

Let's say it was changed to:

#define SIZE 40000

The thing is, any loop using a signed int (16 bit) index will fail
after element 32767, causing undefined behavior (signed integer
overflow). A size_t would make it to the maximum *size* buf can be.
The size of int is really unrelated to the maximum size of objects.

That is indeed the desired output format, and indeed simple. With
luck, the compiler will factor out the constant expression in the for's
test expression.

I note you've changed the internal data representation somewhat, but I
assume it was just to simplify the example by using preinitialization.
I wanted to do that for the initial posting, but couldn't figure out
how. Is that an array of structures of strings? (As I mentioned, I
have trouble with C's declaration syntax, but I'm learning -- I hope.)

The internal data representation is the same as you need. C does not
include the terminating null ('\0') if there is no room for it:

6.7.8 Initialization

[...]

[#14] An array of character type may be initialized by a
character string literal, optionally enclosed in braces.
Successive characters of the character string literal
(including the terminating null character if there is room
or if the array is of unknown size) initialize the elements
of the array.

Since the size of each array is specified in each case (8 and 3,
respectively), if you provide exactly 8 and exactly 3 characters for
each initializer, the \0 will not be tacked on to the end. In other
words, the C implementation will not overflow the buffer for you, it'll
leave you to do that on your own. ;-)

Keep it up. We like the hard questions around here.

Mark F. Haigh
(e-mail address removed)

pete · Mar 13, 2005

Michael Mair wrote:

size_t is the type of the result of the sizeof operator and the
type of argument taken by the dynamic memory allocation routines
malloc/calloc/realloc -- so, basically, every object you work with
has a size in bytes which can be expressed in size_t. The same
guarantee does not hold for short, int, long of either signed or
unsigned variety.

The same guarantee has to hold for the highest ranking
unsigned type, which is unsigned long in C89.
size_t has the option of being smaller than the highest ranking
unsigned type, for implementations where that may be desirable.

gtippery · Mar 13, 2005

Michael Mair wrote:
....

Note that size_t has not to be large enough
to count, for example, the number of bytes in a file.

"has not to be"? Ah, was that idiom for "does not have to be", or typo
for "has got to be"?

gtippery · Mar 13, 2005

pete said:
The same guarantee has to hold for the highest ranking
unsigned type, which is unsigned long in C89.
size_t has the option of being smaller than the highest ranking
unsigned type, for implementations where that may be desirable.

I'm thinking perhaps a segmented or paged architecture. Wider
operands, but paged addressing using fewer address bits. Seems to me a
number of older computers were like this, as well as some present-day
microcontrollers (i8051?)

And some nominally 32-bit machines have a wider integer type supported
by their FPU. You can calculate with it, but you can't address with
it.

gtippery · Mar 13, 2005

Mark F. Haigh wrote:

....

The internal data representation is the same as you need. C does not
include the terminating null ('\0') if there is no room for it:

6.7.8 Initialization

[...]

[#14] An array of character type may be initialized by a
character string literal, optionally enclosed in braces.
Successive characters of the character string literal
(including the terminating null character if there is room
or if the array is of unknown size) initialize the elements
of the array.

Since the size of each array is specified in each case (8 and 3,
respectively), if you provide exactly 8 and exactly 3 characters for
each initializer, the \0 will not be tacked on to the end. In other
words, the C implementation will not overflow the buffer for you, it'll
leave you to do that on your own. ;-)

That's handy. What happens if you initialize a char[8] with
"123456789"? I mean, I can check what "happens" for me, but what's
_supposed_ to happen?

Keep it up. We like the hard questions around here.

That's _definitely_ a Good Thing <grin>.

Michael Mair · Mar 13, 2005

gtippery said:
Michael Mair wrote:
...

"has not to be"? Ah, was that idiom for "does not have to be", or typo
for "has got to be"?

The former ;-)

Cheers
Michael

Michael Mair · Mar 13, 2005

pete said:
Michael Mair wrote:

The same guarantee has to hold for the highest ranking
unsigned type, which is unsigned long in C89.
size_t has the option of being smaller than the highest ranking
unsigned type, for implementations where that may be desirable.

Thanks for the expansion -- I thought I'd better leave that
out as we then get to C89 vs. C99.
size_t does always fit.

-Michael

pete · Mar 14, 2005

gtippery wrote:

That's handy. What happens if you initialize a char[8] with
"123456789"? I mean, I can check what "happens" for me, but what's
_supposed_ to happen?

char array[8] = "123456789";
is undefined.
It violates a "shall constraint".

N869
6.7.8 Initialization
[#2] No initializer shall attempt to provide a value for an
object not contained within the entity being initialized.

gtippery · Mar 16, 2005

pete said:
gtippery said:

That's handy. What happens if you initialize a char[8] with
"123456789"? I mean, I can check what "happens" for me, but what's
_supposed_ to happen?

char array[8] = "123456789";
is undefined.
It violates a "shall constraint".

N869
6.7.8 Initialization
[#2] No initializer shall attempt to provide a value for an
object not contained within the entity being initialized.

That interpretation isn't very obvious to me from what you quote;
perhaps it is in the larger context. (Can't check, the copy I
downloaded only goes to 4.13, including the library. Either it got
renumbered or we're on different versions.)
Sounds to me like it's saying you can't initialize object z in object
y's initializer unless z is part of y.

Walter Roberson · Mar 16, 2005

pete wrote:
:> char array[8] = "123456789";
:> is undefined.
:> It violates a "shall constraint".

:That interpretation isn't very obvious to me from what you quote;

The initializer is attempting to provide a value for the
"object" which is the character at array+8, but that object is
not part of the object array[] which is being initialized, since
array[] goes from array+0 to array+7.

Stephen Sprunk · Mar 16, 2005

gtippery said:
gtippery said:

That's handy. What happens if you initialize a char[8] with
"123456789"? I mean, I can check what "happens" for me, but what's
_supposed_ to happen?

char array[8] = "123456789";
is undefined.
It violates a "shall constraint".

N869
6.7.8 Initialization
[#2] No initializer shall attempt to provide a value for an
object not contained within the entity being initialized.

That interpretation isn't very obvious to me from what you quote;
perhaps it is in the larger context. (Can't check, the copy I
downloaded only goes to 4.13, including the library. Either it got
renumbered or we're on different versions.)
Sounds to me like it's saying you can't initialize object z in object
y's initializer unless z is part of y.

Your initializer is trying to provide values for array[8] and array[9],
which are not contained within the char[8] object called "array". The
behavior is undefined.

S

pete · Mar 17, 2005

gtippery said:
pete wrote:

char array[8] = "123456789";
is undefined.
It violates a "shall constraint".

N869
6.7.8 Initialization
[#2] No initializer shall attempt to provide a value for an
object not contained within the entity being initialized.

That interpretation isn't very obvious to me from what you quote;
perhaps it is in the larger context. (Can't check, the copy I
downloaded only goes to 4.13, including the library. Either it got
renumbered or we're on different versions.)
Sounds to me like it's saying you can't initialize object z in object
y's initializer unless z is part of y.

That's what I'm saying.
I think the language is a little plainer in the C89 standard.

http://dev.unicals.com/papers/c89-draft.html#3.5.7

3.5.7 Initialization
Constraints

There shall be no more initializers in an initializer list
than there are objects to be initialized.

lawrence.jones · Mar 18, 2005

pete said:
I think the language is a little plainer in the C89 standard.

http://dev.unicals.com/papers/c89-draft.html#3.5.7

3.5.7 Initialization
Constraints

There shall be no more initializers in an initializer list
than there are objects to be initialized.

That language had to change in C99 due to designated initializers. For
example:

int a[10] = {[11] = 0};

is invalid even though there are 10 objects and only one initializer.

-Larry Jones

Years from now when I'm successful and happy, ...and he's in
prison... I hope I'm not too mature to gloat. -- Calvin

Joe Wright · Mar 18, 2005

lawrence.jones said:
pete said:

I think the language is a little plainer in the C89 standard.

http://dev.unicals.com/papers/c89-draft.html#3.5.7

3.5.7 Initialization
Constraints

There shall be no more initializers in an initializer list
than there are objects to be initialized.

That language had to change in C99 due to designated initializers. For
example:

int a[10] = {[11] = 0};

is invalid even though there are 10 objects and only one initializer.

-Larry Jones

Years from now when I'm successful and happy, ...and he's in
prison... I hope I'm not too mature to gloat. -- Calvin

Given..
int a[10] = {[11] = 0};
            ^^^^^^^^^^
...I have never seen an initializer like that. What do you think it
means? What is the type of this expression?

Dave Vandervies · Mar 18, 2005

Joe Wright said:
Given..
int a[10] = {[11] = 0};
            ^^^^^^^^^^
..I have never seen an initializer like that.

It's a C99ism.

What do you think it
means? What is the type of this expression?

It means precisely what 6.7.8#6 of n869 (and, unless the numbering has
changed, the same paragraph of C99) says it means, and has the type that
the same paragraph says it has.

(In this case, it's an attempt to initialize an array[10] of int with
an initializer of type array[more-than-10], which is what makes it an
appropriate example of an invalid initializer.)

dave

lawrence.jones · Mar 20, 2005

Joe Wright said:
Given..
int a[10] = {[11] = 0};
            ^^^^^^^^^^
..I have never seen an initializer like that. What do you think it
means? What is the type of this expression?

It's a designated initializer (a new feature in C99) -- it's an attempt
to initialize a[11] to 0 (which is invalid since a only has 10
elements). Since it's not an expression, it has no type. A more
complete (and valid!) example:

int a[10] = {0, 1, [7] = 7, 8, [2] = 2, 3, 4};

explicitly initializes a[0] to 0, a[1] to 1, a[2] to 2, a[3] to 3, a[4]
to 4, a[7] to 7, a[8] to 8, and implicitly initializes all the other
elements to 0. There's a similar construct for struct and union
members and they can be combined:

struct foo s = {.x = 10, .y = 20, .u.z = 4, .a[3] = 3};

-Larry Jones

What's Santa's definition? How good do you have to be to qualify as good?
-- Calvin

gtippery · Apr 2, 2005

pete said:
That's what I'm saying.
I think the language is a little plainer in the C89 standard.

http://dev.unicals.com/papers/c89-draft.html#3.5.7

3.5.7 Initialization
Constraints

There shall be no more initializers in an initializer list
than there are objects to be initialized.

Yeah, that'd cover it. I forget an array isn't "an object".
