C Strings not returning from a function

S

sven_tmp

Hi Richard,

I was not thinking in every detail about the
code pkirk25 posted. The main thing I wanted to say is that the
return in the FixString-function is not needed.

You always write about undefined behavior, I don't know exactly
why but I suggest it is because of the use of unchecked
pointer-variables.

Can you give me the answers to your quetions?
void FixString(char *strIn, char *strOut)
{
int i = 0;
int j = strlen(strIn);

Undefined behaviour. Bonus question: why?
for (i = 0; i <= j; ++i) {
/* removing the cleanup code for brevity on usenet */
strOut = strIn;
}
printf("%s\n", strOut);

Undefined behaviour. Bonus question: why?
char *strOut = "This is a line from a web page <br>";
int i = strlen(strOut);
Undefined behaviour. Bonus question: why?
char *backBuff = malloc(i);
Constraint violation. Bonus question: why?
FixString(strOut, backBuff);
Undefined behaviour. Bonus question: why?
printf("%s\n", backBack); /* Prints out the string
perfectly */
Undefined behaviour. Bonus question: why?

At least you got main's return type right.

Thank you :)

Sven
 
P

pkirk25

If I understand rightly, the original version needed help in 3 areas:
1. Basic C logic - can't return pointer to local variable
2. C style - working with pointers is often better than working with
function returns
3. Security - what if the input file has a huge mangled string?

This is my effort to fix all three. It works so many thanks.

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

#define STRING_MAX_SIZE 254 /* ample headroom here */

/*
Function that
*/
int FixString(const char *strIn, /* string to fix */
const char cToken, /* token to use */
char *strOut) /* the fixed string */
{
int i = 0;
int j = strlen(strIn);


if (j > STRING_MAX_SIZE)
{
printf("Input too big\n");
return 1;
}




for (i = 0; i <= j; ++i)
{
/* The cleanup code */
if (cToken != strIn)
{
strOut = strIn;
} else
{
/* the token is found */
strOut = '\0'; /* terminates the string */
j = i; /* ends the loop */
}
}

return 0;
}

int main(void)
{
char *strOut = "Moonglade-Alliance<br>";
char cToken = '-';
int i = strlen(strOut);

char *backBuff = malloc(STRING_MAX_SIZE);
int j = FixString(strOut, cToken, backBuff);

if (j > 0)
{
printf("Something has gone wrong\n");
return 1;
} else
{
printf("%s\n", backBuff); /* show fixed string */
return 0;
}


}
 
R

Richard Heathfield

(e-mail address removed) said:
Hi Richard,

I was not thinking in every detail about the
code pkirk25 posted. The main thing I wanted to say is that the
return in the FixString-function is not needed.

You always write about undefined behavior, I don't know exactly
why but I suggest it is because of the use of unchecked
pointer-variables.

Can you give me the answers to your quetions?

1) You don't have a prototype in scope for strlen (which, incidentally, does
*not* return int, so I don't know why you're using an int to catch its
return value), so compiler will not know to supply the required conversion
from size_t to int.

2) You don't have a prototype in scope for printf, which is a variadic
function.

3) See 1)

4) You don't have a prototype in scope for malloc, so the assignment is
illegal and must be diagnosed.

5) You're passing a potentially meaningless pointer to a function that
depends utterly on having a valid pointer to a certain minimum number of
bytes. Incidentally, you didn't allocate enough space for the null
terminator.

6) See 2)
 
R

Richard Heathfield

pkirk25 said:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

Huge win, well done.
#define STRING_MAX_SIZE 254 /* ample headroom here */
/*
Function that
*/
int FixString(const char *strIn, /* string to fix */
const char cToken, /* token to use */
char *strOut) /* the fixed string */
{
int i = 0;
int j = strlen(strIn);

size_t would be better. Incidentally, you already measured the length of the
string, in main. Why not pass that length in as an argument, to save
recalculating it? (Measuring a string's length is O(n), so it's not
something you want to repeat if you can avoid it.)
for (i = 0; i <= j; ++i)

If you write to strOut[j], you're in trouble.
{
/* The cleanup code */
if (cToken != strIn)
{
strOut = strIn;


Here, you write to strOut[j] (last time round the loop).
} else
{
/* the token is found */
strOut = '\0'; /* terminates the string */


Here, too.
int i = strlen(strOut);

Why int? strlen doesn't return int.
char *backBuff = malloc(STRING_MAX_SIZE);

i + 1 would have been enough. But you want to check that it worked:

if(backBuff != NULL)
{
 
G

Guest

Richard said:
(e-mail address removed) said:

Constraint violation. Bonus question: why?

A question: since the behaviour is undefined if you declare any
standard library function (and more) inappropriately, isn't the
compiler allowed to try and generate a proper declaration for malloc(),
and accept this code without any diagnostic?
 
R

Richard Heathfield

Harald van D?k said:
A question: since the behaviour is undefined if you declare any
standard library function (and more) inappropriately, isn't the
compiler allowed to try and generate a proper declaration for malloc(),
and accept this code without any diagnostic?

No, because a diagnostic message *must* be generated if the source code
breaks any syntax rule or violates any constraint - which this one does -
even if the compiler can silently fix the code and proceed with the
translation (which it may, at its discretion). To be totally clear, the
diagnostic message is required. A successful translation (which may or may
not do what the programmer intended) is not required, but is allowed. That
is, the compiler may accept the code, or reject it, and it may either
understand (and thus, presumably, fix) the code or misinterpret it; the
Standard allows any of those options.
 
G

Guest

Richard said:
Harald van D?k said:


No, because a diagnostic message *must* be generated if the source code
breaks any syntax rule or violates any constraint - which this one does -
even if the compiler can silently fix the code and proceed with the
translation (which it may, at its discretion).

I mostly agree, except for the "which this one does" part. I think
you're overlooking my point. I'm not so sure the standard specifies
that any constraint is violated, since the standard does not specify
what the effect of the implicit declaration of malloc() is.
 
R

Richard Heathfield

Harald van D?k said:
I mostly agree, except for the "which this one does" part. I think
you're overlooking my point. I'm not so sure the standard specifies
that any constraint is violated, since the standard does not specify
what the effect of the implicit declaration of malloc() is.

But it does. The (C90) standard requires that, in the absence of an explicit
declaration for a function, that function is treated as if it returned int.
So the compiler is required to treat malloc(i) as returning int (because we
don't have a prototype). This int is then assigned to a char *, and that's
what violates a constraint.
 
G

Guest

Richard said:
Harald van D?k said:


But it does. The (C90) standard requires that, in the absence of an explicit
declaration for a function, that function is treated as if it returned int.

And as soon as malloc() is declared as returning int, before malloc()'s
new declaration is used, the behaviour is undefined. Would you also say
a diagnostic is required for

#define IGNORE(x) /* nothing */

int main (void)
{
IGNORE(")
void *p = malloc();
IGNORE(")
}

After all, here too the standard specifies both that the behaviour is
undefined before malloc()'s result is used, and that there would be a
constraint violation if the behaviour weren't explicitly undefined.
 
R

Richard Tobin

I mostly agree, except for the "which this one does" part. I think
you're overlooking my point. I'm not so sure the standard specifies
that any constraint is violated, since the standard does not specify
what the effect of the implicit declaration of malloc() is.
[/QUOTE]
But it does. The (C90) standard requires that, in the absence of an explicit
declaration for a function, that function is treated as if it returned int.

So now we have an (implicit) incorrect declaration for a standard
library function, which results in undefined behaviour. Could that
undefined behaviour not include magically using a correct declaration
instead?

-- Richard
 
R

Richard Heathfield

Richard Tobin said:
But it does. The (C90) standard requires that, in the absence of an
explicit declaration for a function, that function is treated as if it
returned int.

So now we have an (implicit) incorrect declaration for a standard
library function, which results in undefined behaviour. Could that
undefined behaviour not include magically using a correct declaration
instead?[/QUOTE]

Yes (provided the constraint violation is still diagnosed). :)
 
R

Richard Heathfield

Harald van D?k said:
Richard Heathfield wrote:


And as soon as malloc() is declared as returning int, before malloc()'s
new declaration is used, the behaviour is undefined.

C&V, please.
Would you also say a diagnostic is required for

#define IGNORE(x) /* nothing */

int main (void)
{
IGNORE(")
void *p = malloc();
IGNORE(")
}

I don't understand this code. I would have expected it to pre-process to:

int main (void)
{

void *p = malloc();

}

but in fact cpp pre-processed it to:

int main (void)
{



}

Why?

Assuming gcc is correct, the C source code doesn't ever see the malloc, so
the point is moot.
 
G

Guest

Richard said:
Harald van D?k said:


C&V, please.

In C99, it's 7.1.3p2.
"No other identiï¬ers are reserved. If the program declares or
deï¬nes an identiï¬er in a
context in which it is reserved (other than as allowed by 7.1.4), or
deï¬nes a reserved
identiï¬er as a macro name, the behavior is undeï¬ned."
I know C89 has similar wording, but I don't know the C&V.
I don't understand this code. I would have expected it to pre-process to:

int main (void)
{

void *p = malloc();

}

but in fact cpp pre-processed it to:

int main (void)
{



}

Why?

As far as the standard is concerned, the tokens before preprocessing
are

[#] [define] [IGNORE] [(] [x] [)]

[int] [main] [(] [void] [)]
[{]
[IGNORE] [(] ["] [)]
[void] [*] [p] [=] [malloc] [(] [)] [;]
[IGNORE] [(] ["] [)]
[}]

And the behaviour is undefined per 6.4p3. Because the behaviour is
undefined, compilers are permitted to support multi-line string
literals as an extension, and what gcc makes of it is

[#] [define] [IGNORE] [(] [x] [)]

[int] [main] [(] [void] [)]
[{]
[IGNORE] [(] [")\n\
void *p = malloc();\n\
IGNORE("] [)]
[}]
 
R

Richard Heathfield

[non-ASCII symbols removed or replaced]

Harald van D?k said:
In C99, it's 7.1.3p2.
"No other identifiers are reserved. If the program declares or
defines an identifier in a
context in which it is reserved (other than as allowed by 7.1.4), or
defines a reserved
identifier as a macro name, the behavior is undefined."
I know C89 has similar wording, but I don't know the C&V.

It's 4.1.2:

"If the program defines an external identifier with the same name as a
reserved external identifier, even in a semantically equivalent form, the
behavior is undefined."

So the whole thing has been re-worded in C99, and we have become divided by
a common language. My argument is based on C90, where it remains valid,
because malloc(i) is not a definition of malloc.

As far as the standard is concerned, the tokens before preprocessing

....do not legally include ". Very clever. :)
 
G

Guest

Richard said:
[non-ASCII symbols removed or replaced]

Sorry, that's what kpdf does when I simply copy and paste.
Harald van D?k said:


It's 4.1.2:

"If the program defines an external identifier with the same name as a
reserved external identifier, even in a semantically equivalent form, the
behavior is undefined."

So the whole thing has been re-worded in C99, and we have become divided by
a common language. My argument is based on C90, where it remains valid,
because malloc(i) is not a definition of malloc.

Odd and interesting. It may also be addressed by C99's 6.2.7p2; is this
in C90?
"All declarations that refer to the same object or function shall have
compatible type;
otherwise, the behavior is undefined."
 
R

Richard Heathfield

Harald van Dijk said:
Richard Heathfield wrote:


Odd and interesting. It may also be addressed by C99's 6.2.7p2; is this
in C90?
"All declarations that refer to the same object or function shall have
compatible type;
otherwise, the behavior is undefined."

It is [3.1.2.6], but in the absence of <stdlib.h> there is no prior
declaration for malloc, so it doesn't apply here.
 
M

Michal Nazarewicz

pkirk25 said:
#define STRING_MAX_SIZE 254 /* ample headroom here */

int FixString(const char *strIn, /* string to fix */
const char cToken, /* token to use */
char *strOut) /* the fixed string */
{
int i = 0;
int j = strlen(strIn);


if (j > STRING_MAX_SIZE)
{
printf("Input too big\n");
return 1;
}

Personally, I would change the function to accept the length of the
strOut buffer instead of gicing a fixed value, like, so:

#v+
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

void strnfix(const char *strIn, char cToken, char *strOut, size_t len) {
if (!len) return;
while (--len && *strIn && *strIn != cToken) {
*strOut++ = *strIn++;
}
*strOut = 0;
}

int main(void) {
char *string = "Moonglade-Alliance<br>";
char *buffer = malloc(256);
if (!buffer) {
puts("not enough memory");
return 1;
}
strnfix(string, '-', buffer, 256);
puts(buffer);
return 0;
}
#v-

BTW. The function can also by written using strrchr(), strncpy() and
memcpy().
 
G

Guest

Richard said:
Harald van Dijk said:
Richard Heathfield wrote:


Odd and interesting. It may also be addressed by C99's 6.2.7p2; is this
in C90?
"All declarations that refer to the same object or function shall have
compatible type;
otherwise, the behavior is undefined."

It is [3.1.2.6], but in the absence of <stdlib.h> there is no prior
declaration for malloc, so it doesn't apply here.

I'm not sure what you mean. One possibility is that you're mixing up
6.2.7p2 with 6.7p4 ("All declarations in the same scope that refer to
the same object or function shall specify compatible types."). The
latter does not apply, but the former applies even to declarations in
different translation units. The other possibility is that you're
saying the (C90) standard requires implementations to act as if the
standard library is not written in C.
 
R

Richard Heathfield

Harald van D?k said:
Richard said:
Harald van Dijk said:
Richard Heathfield wrote:

So the whole thing has been re-worded in C99, and we have become
divided by a common language. My argument is based on C90, where it
remains valid, because malloc(i) is not a definition of malloc.

Odd and interesting. It may also be addressed by C99's 6.2.7p2; is this
in C90?
"All declarations that refer to the same object or function shall have
compatible type;
otherwise, the behavior is undefined."

It is [3.1.2.6], but in the absence of <stdlib.h> there is no prior
declaration for malloc, so it doesn't apply here.

I'm not sure what you mean. One possibility is that you're mixing up
6.2.7p2 with 6.7p4 ("All declarations in the same scope that refer to
the same object or function shall specify compatible types.").
Nope.

The
latter does not apply, but the former applies even to declarations in
different translation units.

A different translation unit is a completely different kettle of bananas.
It's compiled separately, and the compiler has no way of knowing what
declarations exist within Translation Unit A, at the time it is compiling
Translation Unit B. It may not even be the same compiler that was used for
compiling Translation Unit A. So it's hard to see how Translation Unit A's
contents can be relevant to whether a diagnostic message is required when
compiling Translation Unit B.
The other possibility is that you're
saying the (C90) standard requires implementations to act as if the
standard library is not written in C.

No, I'm not saying that at all.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,777
Messages
2,569,604
Members
45,217
Latest member
topweb3twitterchannels

Latest Threads

Top