malloc's strang behavior

G

Gustavo G. Rondina

Hi all

I'm writting a simple code to solve an ACM problem
(http://acm.uva.es, it is the problem #468). In its code I have the
following fragment:

freq = calcfreq(hashfreq, strfreq, input);
printf("before malloc: %s (%p)\n", input+INPUTLEN);
hchars = (char *)malloc(freq*sizeof(char));
schars = (char *)malloc(freq*sizeof(char));
printf("after malloc: %s\n (%p)\n", input+INPUTLEN);

Since input has nothing to do with the mallocs I expected it would be
unchanged but this isn't happening. Suppouse input points to the
string "xxxxyy", this fragment of code outputs like this:

before malloc: xxxxyy (0x8049ef0)
after malloc: xxxx (0x8049ef0)

Anyone got a clue on what could be happening here?


Thanks in advance
 
K

Keith Thompson

Gustavo G. Rondina said:
Hi all

I'm writting a simple code to solve an ACM problem
(http://acm.uva.es, it is the problem #468). In its code I have the
following fragment:

freq = calcfreq(hashfreq, strfreq, input);
printf("before malloc: %s (%p)\n", input+INPUTLEN);
hchars = (char *)malloc(freq*sizeof(char));
schars = (char *)malloc(freq*sizeof(char));
printf("after malloc: %s\n (%p)\n", input+INPUTLEN);

Since input has nothing to do with the mallocs I expected it would be
unchanged but this isn't happening. Suppouse input points to the
string "xxxxyy", this fragment of code outputs like this:

before malloc: xxxxyy (0x8049ef0)
after malloc: xxxx (0x8049ef0)

Anyone got a clue on what could be happening here?

In both printf calls, the format string has a "%s" and a "%p"
conversion, but you only provide one addtional argument rather than
two. I wouldn't necessarily expect this to cause the specific problem
you're seeing, but you should definitely fix it before attempting to
proceed any further.

Casting the result of malloc() is unnecessary, and can mask errors
such as forgetting the "#include <stdlib.h>". Also sizeof(char) is
always 1, by definition. Thus the two assignments can be simplified
to

hchars = malloc(freq);
schars = malloc(freq);

Finally, if you still have questions after making these changes,
please show us a small, complete, compilable program that exhibits the
problem. We can't see the declarations of your variables, and we
can't tell whether you've included the proper headers. That makes it
difficult to help you. (And there's a fairly good chance that you'll
solve the problem yourself while paring it down to something small
enough to post.)
 
B

Barry Schwarz

Hi all

I'm writting a simple code to solve an ACM problem
(http://acm.uva.es, it is the problem #468). In its code I have the
following fragment:

freq = calcfreq(hashfreq, strfreq, input);
printf("before malloc: %s (%p)\n", input+INPUTLEN);
hchars = (char *)malloc(freq*sizeof(char));
schars = (char *)malloc(freq*sizeof(char));
printf("after malloc: %s\n (%p)\n", input+INPUTLEN);

Since input has nothing to do with the mallocs I expected it would be
unchanged but this isn't happening. Suppouse input points to the
string "xxxxyy", this fragment of code outputs like this:

before malloc: xxxxyy (0x8049ef0)
after malloc: xxxx (0x8049ef0)

Anyone got a clue on what could be happening here?


Thanks in advance

I give up. How do you get two variables formatted in your output (to
match the two format specifiers) when you have only one argument
following the format string?

Why are you casting the return from malloc? If it is to silence a
compiler diagnostic about converting int to char* then you have
invoked undefined behavior by not providing the prototype for malloc.
If it is to silence a compiler diagnostic about converting void* to
int* then you are compiling it as C++ and this is the wrong group.

Since sizeof(char) is always 1, you can remove that expression from
the arguments to malloc.

Can you provide a compilable example that exhibits the behavior?


<<Remove the del for email>>
 
G

Gustavo G. Rondina

Sorry guys, copy and paste problem. Here is the correct code fragment:

freq = calcfreq(hashfreq, strfreq, input);
printf("%s (%p)\n", input+INPUTLEN, input+INPUTLEN);
hchars = (char *)malloc(freq*sizeof(char));
schars = (char *)malloc(freq*sizeof(char));
printf("%s (%p)\n", input+INPUTLEN, input+INPUTLEN);

The problem it is still there. The output follows:

before malloc: xxxxyy (0x8049ef0)
after malloc: xxxx (0x8049ef0)

The full code is avaliable at: http://www.brlivre.org/c/468.c

If anyone have spare time I would appreciate some help.


Thanks,
Gustavo

PS. this is not any kind of homework, I code (at least I try to) as a
hobby.
 
G

Gustavo G. Rondina

Sorry about the mess, here is the _right_ code:

http://www.brlivre.org/c/468.c

This code isn't finished yet, I'll try to fix this problem before
going on. The problem seems to be on this code fragment:

freq = calcfreq(hashfreq, strfreq, input);
printf("before malloc: %s (%p)\n", input+INPUTLEN, input+INPUTLEN);
hchars = (char *)malloc(freq*sizeof(char));
schars = (char *)malloc(freq*sizeof(char));
printf("after malloc: %s (%p)\n", input+INPUTLEN, input+INPUTLEN);

Here is what I got:

$ ./468
1

aaaaab
xxxxxxyyy
before malloc: xxxxxxyyy (0x8049f00)
after malloc: xxxx (0x8049f00)
$

Anyone got a clue?


Thanks (again)
 
M

Martin Ambuhl

Gustavo said:
Hi all

I'm writting a simple code to solve an ACM problem
(http://acm.uva.es, it is the problem #468). In its code I have the
following fragment:

freq = calcfreq(hashfreq, strfreq, input);

We don't know, of course, what any of freq, calcfreqm hashfreq, (badly
named) strfreq, or input are.
printf("before malloc: %s (%p)\n", input+INPUTLEN);

This is nonsense. The specification string expects two more arguments
(for three in all). You give only one more. Luckily for you, we can't
tell if you forgot to cast the argument corresponding to "%p" to
(void *).
hchars = (char *)malloc(freq*sizeof(char));
schars = (char *)malloc(freq*sizeof(char));

At this point we know for sure you have not followed normal usenet
etiquette and lurked before posting, or checked the archives, or checked
the FAQ. You would do better with
hchars = malloc(freq);
schars = malloc(freq);
or
hchars = malloc(freq * sizeof *hchars);
schars = malloc(freq * sizeof *schars);
printf("after malloc: %s\n (%p)\n", input+INPUTLEN);

This is nonsense, again.
Since input has nothing to do with the mallocs I expected it would be
unchanged but this isn't happening. Suppouse input points to the
string "xxxxyy", this fragment of code outputs like this:

before malloc: xxxxyy (0x8049ef0)
after malloc: xxxx (0x8049ef0)

Anyone got a clue on what could be happening here?

It appears that the array which either is 'input' or is pointed to by
'input' has a size of 4 and was initialized incorrectly, perhaps with
char input[4]="xxxx";
Somehow you have gained a '\0' in the first char _following_ input in
the process of allocating space.

All this is guessing, since you have not not posted compilable code, but
have posted something that is *not* your real code, else the output
would not be what you claim.
 
M

Martin Ambuhl

Gustavo said:
Sorry guys, copy and paste problem. Here is the correct code fragment:

freq = calcfreq(hashfreq, strfreq, input);
printf("%s (%p)\n", input+INPUTLEN, input+INPUTLEN);
^^^^^^^^^^^^^^
This is almost certainly wrong. Provide the cast (void *) to the type
%p expects.
hchars = (char *)malloc(freq*sizeof(char));
schars = (char *)malloc(freq*sizeof(char));
printf("%s (%p)\n", input+INPUTLEN, input+INPUTLEN);

The problem it is still there. The output follows:

before malloc: xxxxyy (0x8049ef0)
after malloc: xxxx (0x8049ef0)

The full code is avaliable at: http://www.brlivre.org/c/468.c

Come on: do the small amount of work necessary to produce a compilable
minimal program that exhibits your problem. If it's too much trouble
for you to do that, it's too much trouble for us. Post the resulting
code here. No one here has reason to go clicking on wesites we have no
reason to trust.
 
B

Barry Schwarz

Sorry guys, copy and paste problem. Here is the correct code fragment:

freq = calcfreq(hashfreq, strfreq, input);
printf("%s (%p)\n", input+INPUTLEN, input+INPUTLEN);
hchars = (char *)malloc(freq*sizeof(char));
schars = (char *)malloc(freq*sizeof(char));
printf("%s (%p)\n", input+INPUTLEN, input+INPUTLEN);

The problem it is still there. The output follows:

before malloc: xxxxyy (0x8049ef0)
after malloc: xxxx (0x8049ef0)

The full code is avaliable at: http://www.brlivre.org/c/468.c

If anyone have spare time I would appreciate some help.

Your readinput function invokes undefined behavior. It attempts to
read twice as many strings as you allocate space for.

Your calcfreq function will invoke undefined behavior if any of your
input contains anything other than lowercase ASCII characters since
your offset into the arrays will be negative.

The readinput error can very easily break your run-time memory
management and cause subsequent calls to malloc to do very strange
things.


<<Remove the del for email>>
 
C

Chris Dollin

Gustavo said:
Sorry about the mess, here is the _right_ code:

http://www.brlivre.org/c/468.c

This code isn't finished yet, I'll try to fix this problem before
going on. The problem seems to be on this code fragment:

freq = calcfreq(hashfreq, strfreq, input);
printf("before malloc: %s (%p)\n", input+INPUTLEN, input+INPUTLEN);
hchars = (char *)malloc(freq*sizeof(char));
schars = (char *)malloc(freq*sizeof(char));
printf("after malloc: %s (%p)\n", input+INPUTLEN, input+INPUTLEN);

No, the problem is that you've written past the end of `input`
somewhere, since if you mallocate an extra 1000 bytes for it,
the problem goes away.

Given that the code is horribly opaque (eg why on Earth do you write
`*(E+F)` for `E[F]`, what is 97 when it's a home, what it is all *for*?),
motivation for proceeding further in the analysis escapes me.
 
R

Rob Thorpe

Gustavo G. Rondina said:
Sorry about the mess, here is the _right_ code:

http://www.brlivre.org/c/468.c

This code isn't finished yet, I'll try to fix this problem before
going on. The problem seems to be on this code fragment:

freq = calcfreq(hashfreq, strfreq, input);
printf("before malloc: %s (%p)\n", input+INPUTLEN, input+INPUTLEN);
hchars = (char *)malloc(freq*sizeof(char));
schars = (char *)malloc(freq*sizeof(char));
printf("after malloc: %s (%p)\n", input+INPUTLEN, input+INPUTLEN);

Here is what I got:

$ ./468
1

aaaaab
xxxxxxyyy
before malloc: xxxxxxyyy (0x8049f00)
after malloc: xxxx (0x8049f00)
$

Anyone got a clue?


Thanks (again)


The most likely explanation is that you have corrupted memory
elsewhere in your program, and you are only seeing the result here.
For instance if the storage pointed to by "input" has already being
freed.
 
K

Kelsey Bjarnason

[snips]

The full code is avaliable at: http://www.brlivre.org/c/468.c

Actually, it's not. However, a little poking around found it. Allow me
to start by saying this code is *bad*, in every conceivable meaning of the
term.

A quick compile complains about bzero, and so it should - what the heck is
bzero? It's not defined anywhere in the code. Apparently - from a little
searching - it's just a memset, which leads to the obvious question, why
not use the standard memset? Also, in decode, the variable k is unused,
according to the compiler. That it still exists despite being unused
suggests either you forgot to write some code, or you're using one of the
other variables incorrectly - assigning to j when you should be assigning
to k, say. There are other possibilities, of course, but the unused
variable warning is quite sufficient to render the entire piece of code
questionable.

Okay, fine, it compiles. Let's run it. Hmm. No prompt, no nothing. So
I give it some input to see what happens - I type "woof". It spews out a
"cannot allocate memory" message and quits. Well, if it didn't want
"woof" as an input, why didn't it ask for what it did want?

So I read the code. It wants something called "cases". I'll give it
three and see what happens.

More waiting for unknown things to happen. Attempting to read through
your readinput function proves... futile, really. It's bad, evil, nasty.
It doesn't "read input", it loops, it jumps, it calculates, it glues, it
appends, it does all sorts of weird and wonderful things. A line such as
if (!(j + 1) % 2 && j < (c * 2) - 1) has absolutely no business in a
"readinput" function, the purpose of which is - from the name, at least -
to simply _read_ the input, possibly with validation. Not to process it,
store it, redirect it, do database lookups from it, or stuff it, quite
possibly incorrectly. into buffers it probably shouldn't be touching other
than to store one piece of finally-accepted data into.

However... we'll send some text at it, since it seems to want three pieces
of data. I give it "This is a test." and it promptly spews "Wrong input
format". Really? So why didn't it tell me what format it wanted? Let's
try again.

Give it a "3", give it "woof" and... it pukes. All in all, a most
unfriendly application, so let's ignore that and focus on the code.

While wading through it, I see some odd things. For example, I see chcode
= *foo - 97; the purpose of which isn't entirely clear. Oops... it looks
like this is intended to produce a character value, presumably on the
notion that, say, 'a'-97 == 0 or some such. Except that this notion can't
cope with different cases, for one thing, and isn't portable, as there's
no guarantees about the order of characters or what their numeric values
are.

A little further on, we see hchcount = 0. hchcount? What, exactly, is an
hch, and why are you counting them? No indication is given. Nor is there
any indication of what an sch is and why you're counting those.

On we go to decode(). Here's the function header:

void decode(int *hf, int *sf, char *s, char *hch, char *sch, int f)

Yeesh. That's no fewer than 5 separate things you're planning on
modifying. You're going to modify hf and sf and s and hch and sch. I've
met *very* few functions which actually do so much work that they require
*five* separate outputs. Indeed, I'd tend to suggest that such a function
needs a complete redesign.

For example, a decompression function might return three things - the
decompressed data, the size of the decompressed data, and a result code.
You're returning five things, but you're using completely non-descriptive
names for the parameters, so one has to simply guess what the function
actually means.

Moving on to main... there are 9 separate variables declared in main.
That's actually a _lot_ of variables for a single function, and this is
compounded by using some of the worst variable names imaginable; there is
absolutely nothing about "schars" which indicates what it really is, and
the comment isn't clear, either - what is a "diff character"? While
we're at it, what's with this: scanf("%d", &cases); ? Not even an
attempt to validate the input.

Between lack of comments, bad variable names, overly-busy functions,
non-standard functions, functions which do things other than what they
claim to do, non-portable assumptions, lack of prompts for inputs and the
use of what I'd regard as being at _best_ a truly bizarre handling of user
inputs, the best thing that could happen to this code would be to print it
out, gather around a campfire with a few wiccan priestesses, burn it while
chanting ritual cleansing mantras, then bury the ashes for all time.
 
K

Keith Thompson

Kelsey Bjarnason said:
Okay, fine, it compiles. Let's run it. Hmm. No prompt, no nothing. So
I give it some input to see what happens - I type "woof". It spews out a
"cannot allocate memory" message and quits. Well, if it didn't want
"woof" as an input, why didn't it ask for what it did want?

If I'm not mistaken, that behavior is imposed by the problem
description. (It's one of the problems at
<http://acm.uva.es/problemset/>.)

No comment on the rest of your criticisms.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,744
Messages
2,569,482
Members
44,900
Latest member
Nell636132

Latest Threads

Top