Warning to newbies

blmblm · Feb 8, 2010

Is that really how you, um, pronounce(?) "++i" and "i++"? Hm!
(I say "plus plus i" and "i plus plus".)

That's been my justification too -- even though in some contexts a
"good" compiler will almost certainly generate the same code for both.

Just asking out of interest, would a post-increment enable any
significant optimisations by the compiler that wouldn't be possible
with a pre-increment?

Rather the reverse, no? though in some contexts both should be treated
the same.

blmblm · Feb 8, 2010

[ snip ]

That's correct. However, the approximate solution is very useful and
according to my information in use. I recall a logistics system
written by my employer in the 1970s, before the science was
understood, that "seemed to loop" when it was given real routes to
plan. Later on, my brother's firm, a trucking firm, used the simulated
annealing algorithm.

Moore's law won't help us much with NP complete solutions.

Or NP-complete problems. (Terminology quibble? Maybe.)

(Agreed, for the record, that approximations can sometimes be
useful.)

And as
Chris points out it may run out of gas. Because of the need to master
parallelism and avoid problems such as the Toyota recall (IF that was
caused by a software bug, as Steve Wozniak has suggested), computer
programmers by 2025 may have to have PhDs. That might be a good thing.

It's not clear to me that a credential intended to certify that
the holder can do independent research would be very useful as a
qualification for writing concurrent/parallel code. Some sort of
professional certification, such as is required(?) for engineers,
would seem to be more appropriate. More quibbling about terminology,
maybe.

[ snip ]

Phil Carmody · Feb 8, 2010

Seebs said:
So far, this looks fine.

Lack of consts makes it look amateurish.

Phil

Ben Bacarisse · Feb 8, 2010

Joe Wright said:
Ben said:

Joe Wright said:

santosh wrote:
Joe Wright wrote:
spinoza1111 wrote:
[ all snipped ]
I suggest you spend overlong on a relatively simple task. Regard..

/*
Program: stuff.c
Author: Joe Wright
Date: 05/24/1999

The dBASE stuff() function
*/

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define LEN 256

void usage(void) {
    printf("Usage: stuff 'cString' nStart nDelete 'cInsert'\n"
           "Example: stuff 'Now is the time.' 7 0 'NOT '\n"
           "Note: nStart is rel-0 in C\n"), exit(0);
}

char *stuff(char *string, int start, int delete, char *insert) {
    static char str1[LEN]; /* Return buffer */
    char str2[LEN]; /* Temporary buffer */
    strncpy(str1, string, start);
    str1[start] = 0;
    strcpy( str2, string+start+delete);
    strcat( str1, insert);
    return strcat(str1, str2);
}

int main(int argc, char **argv) {
    if (argc < 5) usage();
    printf("%s\n", stuff(argv[1], atoi(argv[2]), atoi(argv[3]), argv[4]));
    return 0;
}
/* End of stuff.c */

Note stuff() is eight lines, not eighty. You are a fraud.
This doesn't exactly do what spinoza1111 set out to do, and what the
subsequent examples from others like Ben and Willem did. That combined
with the fact that you're using static buffers means that a line count
comparison of this program with the others is not very meaningful.
Gimme a break. One malloc() and one free() add two lines.

How does that give you a function like the one being discussed
elsewhere in the thread?

I must be missing something. spinoza1111 came up with almost 100 lines
of code to solve what I consider to be a simple problem solvable in 10
lines or so. Maybe I didn't appreciate how hard a real programmer
would work on this sort of thing. Finding a substring and replacing it
with another is almost trivial.

If your point was simply that "this is trivial" then we are in total
agreement; but it looked like you were saying "*this* is how trivial
it is" by giving a function to solve the problem. My point was that
you really have to match the specification not use a related but
slightly simpler one.

Ben Bacarisse · Feb 8, 2010

Nick Keighley said:
no. The standard doesn't specify algorithms. I'd be slightly surprised
if strstr() used Boyer-Moore, I thought it was a bit over the top for
the simple cases.

You may be slightly surprised then. From the current glibc strstr
sources:

/* We use the Two-Way string matching algorithm, which guarantees
linear complexity with constant space. Additionally, for long
needles, we also use a bad character shift table similar to the
Boyer-Moore algorithm to achieve improved (potentially sub-linear)
performance.

See http://www-igm.univ-mlv.fr/~lecroq/string/node26.html#SECTION00260
and http://en.wikipedia.org/wiki/Boyer-Moore_string_search_algorithm
*/

It's not exactly Boyer-Moore but I suspect that this is more because
this hybrid is superior rather than a reluctance to use BM. I don't
know enough about this to have an opinion (at least not one that anyone
should listen to) about this choice.

Anyway, here is a very important point for C programmers here: anyone
who decides not to use the C library is cutting themselves off from a
considerable pool of expertise.

Nick Keighley · Feb 8, 2010

Nick said:
Nick said:

Although I initially accepted your [Richard Heathfield's] foolish
ideas about quoting rather
than angle-bracketing included files, I am beginning to change my
mind. Perhaps C code written for a specific compiler should have a
look and feel unique to that compiler, with let us say upper case type
suffixes on file names. This would inform the reader that we're using
Windows, and it would have the added benefit of driving you bat shit.

Richard Heathfield's general advice on header files is that standard
headers should be enclosed in angle brackets <> and non-implementation
ones should be enclosed in string quotes "". [ ... ]

Why Richard Heathfield's advice? That's the usual practice in C
programming isn't it?

yes. I was trying to stress (for the lurkers) that what Richard was
recommending was normal practice. I'm sorry if that wasn't clear.

Normal practice: standard headers are enclosed in angle brackets <>,
non-standard headers in string quotes "". System headers may go either
way (in practice).

Nick Keighley · Feb 8, 2010

You may be slightly surprised then. From the current glibc strstr
sources:

/* We use the Two-Way string matching algorithm, which guarantees
linear complexity with constant space. Additionally, for long
needles, we also use a bad character shift table similar to the
Boyer-Moore algorithm to achieve improved (potentially sub-linear)
performance.

See http://www-igm.univ-mlv.fr/~lecroq/string/node26.html#SECTION00260
and http://en.wikipedia.org/wiki/Boyer-Moore_string_search_algorithm
*/

Colour me surprised then! I was assuming the common case would be
small strings (for both strings) and that BM wasn't worth it. Thanks
for the enlightenment.

It's not exactly Boyer-Moore but I suspect that this is more because
this hybrid is superior rather than a reluctance to use BM. I don't
know enough about this to have an opinion (at least not one that anyone
should listen to) about this choice.

Anyway, here is a very important point for C programmers here: anyone
who decides not to use the C library is cutting themselves off from a
considerable pool of expertise.

I always go for the C library first and only hunt for alternatives if
it doesn't do what I want. (Or I think I can do better- less likely
every day!).

Nick Keighley · Feb 8, 2010

Nick said:
Nick said:

[...] So I fixed these and compiled and ran it. It does produce
the expected output for the test cases you include in main(), but I
commented out both the TESTER macro and your main() definition and
supplied my own given below. This allows me to test your function from
the command-line without having to include each test case in the code
and go through the build process each time.

the argument for embedding the test cases in code (not necessarily the
actual delivered code) is that you have a permanent way of running all
the tests if any modification is ever made to the code.

Why not just read the test cases from a file? That way, you can extend
the test pack whenever required, without having to recompile *and*
without having to type them all in on the command line.

that's kind of why I said use a scripting language. I was thinking you
got the best of both worlds; rapid development and permanent record
AND without having to write a parser for your test file!

(I must admit I just write C code in another file. I don't regard
compile times as being long enough to warrant anything more
complicated).

Willem · Feb 8, 2010

santosh wrote:
) Willem wrote:
)> spinoza1111 wrote:
)> ) The point being that a professional, whether or not he uses string.H,
)>
)> Oh so you're wanting a solution that doesn't use string.h ?
)>
)> I've seen some iterations of your code. No offense, but it looks
)> needlessly complex. I think even if we were to replicate strstr()
)> and memcpy() with our own versions, it would still be simpler.
)>
)> But sticking it all in one function, off the top of my head:
)>
)> #include <stdlib.h> /* for malloc() */
)>
)> size_t replace_string(char *string, char *replace, char *with, char *target)
)> {
)> char *string_ptr = string;
)> size_t target_len = 0;
)>
)> while (*string_ptr) {
)> size_t replace_len = 0;
)> while (string_ptr[replace_len] == replace[replace_len]) {
)> replace_len++;
)> }
)> if (replace[replace_len] == 0) {
)> size_t with_len;
)> for (with_len = 0; with[with_len]; with_len++) {
)> if (target) { target[target_len] = with[with_len]; }
)> target_len++;
)> }
)> string_ptr += replace_len;
)> } else {
)> if (target) { target[target_len] = *string_ptr; }
)> target_len++;
)> string_ptr++;
)> }
)> }
)> if (target) { target[target_len] = 0; }
)> target_len++;
)> return target_len;
)> }
)>
)> char *replace(char *string, char *replace, char *with)
)> {
)> char *result;
)> if (result = malloc(replace_string(string, replace, with, 0) + 1)) {
)> replace_string(string, replace, with, result);
)> }
)> return result;
)> }
)>
)>
)> Of course, this will get horridly bad performance for some nasty inputs,
)> especially when the 'replace' string has repeats.
)>
)> To fix that you need Boyer-Moore, IIRC (which I think strstr uses ?)
)
) $ ./willem "aaa" "aaa" "t"
)
) aaa
)
) $ ./willem "aaa" "a" "t"
)
) tta
)
) So a bug somewhere.

You should get into professional testing.
There's even the danger of UB there, looks like.

PS: did you actually randomly throw test cases at it
or did you just spot the bug with the end of string case ?

The fix to the above code would involve some ifs for that edge case,
but perhaps a more elegant solution can be found.

State machine anyone ?

SaSW, Willem
--
Disclaimer: I am in no way responsible for any of the statements
made in the above text. For all I know I might be
drugged or something..
No I'm not paranoid. You all think I'm paranoid, don't you !
#EOT

Seebs · Feb 8, 2010

[SNIP - aaaaaaaaaaaaaaaaaaaarrrrrrrrrrrrghhhhhhhhhhhhh!!!!!]
Hee.

That was a work of art! You should print that out and frame it.
I think I'll submit it for code review later today at work ;-)

Thanks! I will say up front that I wrote it BEFORE looking at his actual
attempt to solve the trivial string-replace problem. I just thought it
would be amusing.

One of the reasons I wrote that was to explore a theory: My theory is that,
just as you can recognize writers from their text in natural languages, you
can often recognize them from their code. Trying to write in someone else's
"voice" is a great challenge for writers, and doing the same thing with code
is a similar challenge for programmers.

It can teach you a lot about how you think about expressing things to work on
learning to express them differently.

-s

Seebs · Feb 8, 2010

I don't agree here either.

I don't particularly either, but I dispute the implicit claim that I ever
said it was particularly a good thing that I have taken no CS courses. I
pointed it out because I thought it was funny when contrasted with the
constant claims (by Nilges) that I was an ivory tower academic who had
spent too much time on graduate-level CS and not enough time programming.
His reaction was completely predictable, but still funny. (Usually,
humor has to surprise, but as anyone who has ever watched a Charlie Chaplin
routine can tell you, you can be just as funny with a completely predictable
failure mode.)

-s

Seebs · Feb 8, 2010

Lack of consts makes it look amateurish.

While I chose to use consts in mine, I am quite sympathetic to avoiding
const, because I did it for years. That said, I do agree that it would
be improved by the use of const qualifiers.

-s

Seebs · Feb 8, 2010

Only if you like slapstick. Personally, I can't stand it.

I like very little of it. It is very rarely done well. You'll notice that
my example of the field was Charlie Chaplin, not, say, any of the hundreds
of people trying to cram badly-done slapstick into films in the last decade
or three.

-s

Squeamizh · Feb 8, 2010

My code for this was written in about five minutes, I have not yet found a bug
in it (although it does have the documented limitation that, as of this
writing, it simply assumes that "%" will be the first character of "%s";
since it only has to run on two strings, and they're in the same module,
I was okay with that), and ran correctly on the first try. (It would have
been the second try, except that I'd just done something sort of similar
half an hour earlier and made a fencepost error there, which kept the question
clearly in mind.)

I did analyze the problem carefully, and I concluded that it was simple enough
and easy enough to get right in this particular case that it would be better
to write a fairly simple loop which the reader could verify by inspection
than to try to solve the general problem.

I don't know what's more funny- Spinoza's inability to get this simple
problem right, or Seebs' continued childish insistence that he only
spent five minutes implementing his solution.

Keith Thompson · Feb 8, 2010

Squeamizh said:
I don't know what's more funny- Spinoza's inability to get this simple
problem right, or Seebs' continued childish insistence that he only
spent five minutes implementing his solution.

What evidence do you have that Seebs spent more than five minutes
on it?

Seebs · Feb 8, 2010

What evidence do you have that Seebs spent more than five minutes
on it?

The original one (which was, admittedly, a hack) can't have been much more,
it was part of a much larger program that I think took about half an hour
total. The one I posted that addressed that issue (but still had buggy
behavior for an empty find string) took around 10 minutes before breakfast --
it absolutely took less time than the interval between putting a frozen pizza
in the oven and taking it out, and the pizza was not burned.

Why should it take more than five minutes? It's very close to being
completely trivial.

Up next: My astounding bragging continues as I claim to have spent "less
than an hour" writing a program to multiply corresponding numbers in two
arrays of equal length together, storing the results in a third array of
the same length.

(for (int i = 0; i < len; ++i) { c[i] = a[i] * b[i]; })

-s
p.s.: Someone let me know if Nilges tries to "improve" that, I'll have
to read it.

Moi · Feb 8, 2010

Of course, this will get horridly bad performance for some nasty inputs,
especially when the 'replace' string has repeats.

To fix that you need Boyer-Moore, IIRC (which I think strstr uses ?)

Boyer-Moore needs additional storage for its jump/fallback table.
IMHO it is a sane choice _not_ to let a library routine depend
(unless needed) on dynamic (or even worse: static) storage.

AvK

Phil Carmody · Feb 8, 2010

Or NP-complete problems. (Terminology quibble? Maybe.)

Not at all. The bilgemaster encounters what he considers to be
academic-sounding terms, and then repeats them hoping that he'll
be treated as someone even vaguely erudite. Unfortunately, he
uses the terms in completely hatstand contexts, and thus fails
dramatically. Tragic, and best ignored.

Phil

Richard Tobin · Feb 8, 2010

Check out Norman Wisdom. He does it very well indeed.

He's very popular in Albania, apparently.

-- Richard

Squeamizh · Feb 8, 2010

[...]

I don't know what's more funny- Spinoza's inability to get this simple
problem right, or Seebs' continued childish insistence that he only
spent five minutes implementing his solution.

What evidence do you have that Seebs spent more than five minutes
on it?

None, and I didn't mean to imply that I didn't believe him. I just
find it funny.
