Warning to newbies

S

spinoza1111

There's a physics debate on a forum I participate in (on the brainteaser
"can you make a completely wind-powered vehicle which can, steady state,
go directly downwind faster than the wind" -- the answer is obviously "no"
and actually "yes").  I bring it up because it has a participant who reminds
me a great deal of Nilges.  He has a genuinely awe-inspiring ability to
get ANYTHING wrong.  He refers to kilowatt-hours as "kilowatts per hour",
then argues at length that this must be correct because a Google search on
"kW/hr" turns up pages about kilowatt hours.  Someone described wind speed
as "30 to 40 miles per hour, with a peak recorded gust of 47"; our hero
pointed out that 47 is 16 miles per hour faster than the average of 35.

This is what the Nazis did. They substituted a caricature of "der
Ewige Jude" for the reality in hopes that the general public would
think in terms of the caricature.

And while you were posting this vile garbage, you were failing to find
the major bug in my code despite the fact you claim that you find bugs
professionally, and despite the fact that there's evidence in this
thread that you carefully reviewed the code.

This is sort of like that.  It's practically hypnotic trying to say something
so simple that he can't misunderstand it.  It was, to nearly everyone, obvious
what the objection was.  I mean, come on.  It's even *in the FAQ*.

        "The situation is more complicated in the case of nonstandard
        headers.  Some are completely system- or compiler-specific.
        Some are completely unnecessary, and should be replaced by their
        Standard equivalents.  (For example, instead of <malloc..h>, use
        <stdlib.h>.)"

(clc FAQ 10.11)

For that matter... "replace substring with other string" would be a
typical first year problem.  I liked my snprintf solution because it

How would you know? You have proudly announced that you haven't taken
ANY computer science classes.

Furthermore, this problem is in fact visited and revisited in graduate
level classes in algorithm analysis, such as my own in 1978 (my grade
was A). Had you been properly educated in your professed profession,
you would know that algorithm analysis visits very simple problems to
show their real complexity, and you would not be the sort of nasty
little clerk you are in fact.
 
S

spinoza1111

I see Spinny's promise to be nice lasted about five minutes.

I didn't promise to be nice. I promised to abide my suggested charter,
in which you don't say "Raca, thou fool" but instead engage in logical
argument and verbal self-defense. Self-defense in my case may be
engaging my lawyer to deal with what's going on here, and I needed to
put the OP on notice.
 
S

spinoza1111

Ben said:
santosh wrote:
Joe Wright wrote:
spinoza1111wrote:
[ all snipped ]
I suggest you spend overlong on a relatively simple task. Regard..
/*
    Program: stuff.c
    Author:  Joe Wright
    Date:    05/24/1999
    The dBASE stuff() function
*/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define LEN 256
void usage(void) {
    printf("Usage: stuff 'cString' nStart nDelete 'cInsert'\n"
           "Example: stuff 'Now is the time.' 7 0 'NOT '\n"
           "Note: nStart is rel-0 in C\n"), exit(0);
}
char *stuff(char *string, int start, int delete, char *insert) {
    static char str1[LEN];                 /* Return buffer */
    char str2[LEN];                        /* Temporary buffer */
    strncpy(str1, string, start);
    str1[start] = 0;
    strcpy( str2, string+start+delete);
    strcat( str1, insert);
    return strcat(str1, str2);
}
int main(int argc,  char **argv) {
    if (argc < 5) usage();
    printf("%s\n", stuff(argv[1], atoi(argv[2]), atoi(argv[3]), argv[4]));
    return 0;
}
/* End of stuff.c */
Note stuff() is eight lines, not eighty. You are a fraud.
This doesn't exactly do whatspinoza1111set out to do, and what the
subsequent examples from others like Ben and Willem did. That combined
with the fact that you're using static buffers means that a line count
comparison of this program with the others is not very meaningful.
Gimme a break. One malloc() and one free() add two lines.
How does that give you a function like the one being discussed
elsewhere in the thread?

I must be missing something.spinoza1111came up with almost 100 lines of
code to solve what I consider to be a simple problem solvable in 10 lines
or so. Maybe I didn't appreciate how hard a real programmer would work on
this sort of thing. Finding a substring and replacing it with another is
almost trivial.

Read Cormen's Algorithms.

It isn't trivial. In

replace("banana", "ana", "111")

there are three answers depending on whether the operation is right to
left, left to right, or all occurences (there are three occurences of
ana in banana if overlapping occurences are allowed).

In fact, there are two problems. One is "find all occurences of a
string". The second is "replace occurences".

John Kenneth Galbraith, in a short essay entitled "The Economics of
Innocent Fraud" theorized that CEOs, not being in any way particularly
well-educated or gifted with any Platonic attributes of superior
wisdom (or basic ethics in many cases) are purely a "priesthood"
appointed more or less randomly, and based on quite arbitrary rules,
as a buffer class between the brutal fact of private ownership of the
means of production, and the common sort.

My extension is that programmers today are in the main a minor
priesthood or sexton class appointed owing to technology and money
worship who need not be qualified or intelligent, who need in fact
certain aporias (such as Peter's failure to have any CS classes) which
makes them overresponsive to the higher priesthood's commands.

As such, these eunuchs' primary function is to chant reassuring
*mantras*, one being that "this problem is simple, don't think, a
first year student could solve it, don't think, if you are worried
about it there's a problem with you, don't think, if you can't solve
it in ten minutes with no bugs you're obviously incompetent, don't
think, look I solved it while microwaving pizza, don't think, my
solution has bugs but I'm cute, don't think...AAAAAUUUUUMMMMMMMMM...".

These eunuchs have especial venom and contempt for "academic"
solutions including the principled effort to get it right and the
attendant openness about one's own limitations and simple decency that
results, a decency which is evident in only some of the participants
in this discussion.
 
S

spinoza1111

I don't think I noticed it.  

No, I don't think you did. And once you shape up you might actually
start contributing.
But yes, once you look, it's "obvious".  Except

No, once I tell you, Peter. And you could have found the error instead
of your stupid, malicious, and Fascistic attempt at satire.
for being carefully buried under a sea of poorly-chosen variable names.

Poorly chosen by what standard? Yours?
This was in the hand-written replacement for strstr(), which is a great
example of why you should just about ALWAYS use the standard library functions
when they have the right spec.

It has been already pointed out to you by Bacarisse that that wasn't
the spec.

But, oh typewriting monkey, you're right. The creation of the standard
libraries was an enormous amount of mental effort which should be
reused. The principled EXIT from the cave of C engineered first by
Stroustrup and later by the Microsoft C Sharp team was even more
effort which should be used.
 
G

gwowen

(there are three occurences of ana in banana
if overlapping occurences are allowed).

Really? A word with two 'n's has three occurences of 'ana'. I'm
struggling to see how that is the case. Can you enlighten me?
 
N

Nick Keighley

This was in the hand-written replacement for strstr(), which is a great
example of why you should just about ALWAYS use the standard library functions
when they have the right spec.

....and performance
 
S

spinoza1111

Really?  A word with two 'n's has three occurences of 'ana'.  I'm
struggling to see how that is the case.  Can you enlighten me?

My typo. There are two occurrences of ana in banana if overlapping
occurrences are allowed, but a one way scan and replace that sets the
index past the end of a complete match after finding a complete match
sees only one occurrence. If it goes left to right changing ana to
ono, banana becomes bonona: if it goes right to left, banana becomes
banono. Arguably the right answer is bonono. You say tomato, I say
tomahto. Microsoft Word changes to bonona.
 
S

spinoza1111

...and performance

But...isn't the algorithm I've used (disregarding the banana bonona
bonono banoono issue) the best we can get, given no "magically find
the next matching character" instruction? And don't the super studly
algorithms like Boyer Moore and Knuth Morris Pratt always require
preprocessing and tables, with some of them best implemented as
members of hash tables indexed by targets so we can reuse them for the
same target? And doesn't this imply that the C library function can't
be re-entrant if it uses them? And does not THIS imply that the super-
studly algorithms are best implemented in an OO language like C Sharp?

(Not rhetorical questions, genuinely want your views)
 
S

spinoza1111

But...isn't the algorithm I've used (disregarding the banana bonona
bonono banoono issue) the best we can get, given no "magically find
the next matching character" instruction? And don't the super studly
algorithms like Boyer Moore and Knuth Morris Pratt always require
preprocessing and tables, with some of them best implemented as
members of hash tables indexed by targets so we can reuse them for the
same target? And doesn't this imply that the C library function can't
be re-entrant if it uses them? And does not THIS imply that the super-
studly algorithms are best implemented in an OO language like C Sharp?

(Not rhetorical questions, genuinely want your views)

Discuss this while I implement it:

"One of the great tragedies in life is the murder of a beautiful
theory by a brutal gang of facts".

My "beautiful theory" was that on finding a partial match, I could set
the main index ptrIndex1 to the position of the unmatched character
and blast off.

It was brutally murdered by

TESTER(ptrResult,
"bbb1111123bbbbb",
"111123",
"ono",
"bbb1onobbbbb",
1,
lngReplacements)

But: perhaps we can revive the lovely lady.

As you're scanning for a match, note whether the "handle" of the
target re-occurs!

If the handle occurs between master and master[j] where j is the
point of match failure, say at master[k], start at k!

Otherwise, start at j!

I realize that I'm reinventing the wheel, or fighting battles probably
fought long ago, even as the second Battle of Breitenfeld was fought
amidst the skulls of the fallen during the Thirty Years War. Prior art
exists. But are we not men? Cf. Fukuyama The End of HIstory and the
Last Man: if all problems are solved, there's no way to go but
backwards in time to the Time of the Troglodytes.
 
S

spinoza1111

On Feb 10, 4:29 pm, Nick Keighley <[email protected]>
wrote:
But...isn't the algorithm I've used (disregarding the banana bonona
bonono banoono issue) the best we can get, given no "magically find
the next matching character" instruction? And don't the super studly
algorithms like Boyer Moore and Knuth Morris Pratt always require
preprocessing and tables, with some of them best implemented as
members of hash tables indexed by targets so we can reuse them for the
same target? And doesn't this imply that the C library function can't
be re-entrant if it uses them? And does not THIS imply that the super-
studly algorithms are best implemented in an OO language like C Sharp?
(Not rhetorical questions, genuinely want your views)

Discuss this while I implement it:

"One of the great tragedies in life is the murder of a beautiful
theory by a brutal gang of facts".

My "beautiful theory" was that on finding a partial match, I could set
the main index ptrIndex1 to the position of the unmatched character
and blast off.

It was brutally murdered by

    TESTER(ptrResult,
           "bbb1111123bbbbb",
           "111123",
           "ono",
           "bbb1onobbbbb",
           1,
           lngReplacements)

But: perhaps we can revive the lovely lady.

As you're scanning for a match, note whether the "handle" of the
target re-occurs!

If the handle occurs between master and master[j] where j is the
point of match failure, say at master[k], start at k!

Otherwise, start at j!

I realize that I'm reinventing the wheel, or fighting battles probably
fought long ago, even as the second Battle of Breitenfeld was fought
amidst the skulls of the fallen during the Thirty Years War. Prior art
exists. But are we not men? Cf. Fukuyama The End of HIstory and the
Last Man: if all problems are solved, there's no way to go but
backwards in time to the Time of the Troglodytes.


I am posting out of scale to others here, so it is time for me to take
a short sabbatical. The next version of this code will not be posted
prior to Chinese New Year (Sat 13 Feb) and I shall neither read nor
respond to posts. Thanks to all, including Seebach and Heathfield, for
your constructive participation and insights.

Gung hey fat choy to all.
 
M

Moi

Boyer Moore doesn't require preprocessing/tables. Knuth Morris Pratt
does. Both can be implemented as re-entrant routines. There are faster
algorithms that do require significant reprocessing. BM and KMP can be
implemented re-entrantly as can almost all algorithms. Requiring
re-entrancy is (almost) orthogonal to OO.

Oops, I think I mixed them up, too.

BTW, here is mine, very similar to Willem's
(error handling, and adding const qualifiers are left as an exercise to the reader ...)
Sorry for my strange style of indentation...

****/

#include <string.h>
#include <stdlib.h>
#include <stdio.h>

struct nuke {
struct nuke *nxt;
char *ptr;
};

char * replace(char * haystack, char *needle, char *nail)
{
struct nuke *ammo, *next, **hnd ;
char *result, *tail, *src,*dst;
size_t haystack_len, result_len, needle_len, nail_len, len;
unsigned nhit;

needle_len = strlen (needle);
nail_len = strlen (nail);

nhit = 0;
ammo = NULL;
hnd = &ammo;
for (tail = haystack; src = strstr(tail, needle); tail = src + needle_len) {
*hnd = malloc (sizeof *hnd);
fprintf(stderr, "Hit @%u:%s\n", (unsigned) (src-haystack), src);
(*hnd)->ptr = src;
(*hnd)->nxt = NULL;
nhit++;
hnd = &(*hnd)->nxt;
}

haystack_len = (tail-haystack) + strlen (tail);
result = malloc (haystack_len + nhit * (nail_len-needle_len) );

/* if (!nhit) {
memcpy( result, haystack, haystack_len+1);
return result;
} */

dst = result, src = haystack;
for ( ; ammo; free (ammo), ammo=next) {
next = ammo->nxt;
if (len = (ammo->ptr - src)) {
fprintf(stderr, "Copy %u bytes @%u:%s\n"
, (unsigned) len, (unsigned) (src-haystack), src);
memcpy( dst, src, (unsigned) len );
src += len;
dst += len;
}
fprintf(stderr, "replace %u/%u bytes @%u:%s\n"
, needle_len, nail_len, (unsigned) (src-haystack), src);
memcpy( dst, nail, nail_len);
src += needle_len;
dst += nail_len;
}
/* copy the final segment */
strcpy(dst,src);
return result;
}


#if WANT_MAIN
int main(int argc, char **argv)
{
char *res;
if (argc <= 3) exit(EXIT_FAILURE);

res = replace (argv[1], argv[2], argv[3] );

fprintf(stdout, "[%s,%s,%s] := \"%s\"\n", argv[1], argv[2], argv[3], res );

exit (EXIT_SUCCESS);
}
#endif

/*****************

AvK
 
S

spinoza1111

I found that the version with the fix to the "full match overlaps a
partial match on the left" bug was munged with extra blanks in the
macro and line breaks, so here is an unmunged version.

It also has been changed to incorporate a modification of the idea
discussed above: the idea was to take the LAST occurrence of the
"handle" (the first character of the target) prior to the point of no-
match in a partial match. But this will miss overlapping full matches
that start to the LEFT, and our desire to set to the rightmost handle
character to the left of the unmatched character is merely a
modification of our original sin, which was to lust after setting the
search pointer, over-eagerly, to the right of the partial match.

But, this seems to be the solution, implemented in the following code:
take the first occurence of the handle after the start of any matched
handle, remember it in ptrIndex4 (a new pointer), and update the main
search pointer, ptrIndex1 to either this value, or else the index of
the unmatched character after the partial match.

If the handle occurs prior to the partial match fail point, the
leftmost point at which it occurs is the first possible point at which
a full match may start. If the handle does not so occur, we may start
at the end of the partial match in purity of heart.

I haven't looked at any responses and shall not do so until Chinese
New Year (Saturday). This is to relieve the imbalance of postings.

Again, gung hey, fat choy.


// ***************************************************************
// * *
// * replace() demo *
// * *
// * Demonstrates how to replace non-NUL-defined strings in C *
// * using a simple function, and bypassing string.h. *
// * *
// * C H A N G E R E C O R D --------------------------------- *
// * DATE PROGRAMMER DESCRIPTION OF CHANGE *
// * -------- ---------- --------------------------------- *
// * 02 07 10 Nilges Version 1.0 *
// * *
// * 02 07 10 Nilges Bug: partial matches not handled *
// * correctly: need to iterate search *
// * for match. *
// * *
// * 02 07 10 Nilges 1. Santosh suggested tests *
// * 2. Heathfield put the boot in re *
// * including malloc *
// * *
// * 02 07 10 Nilges 1. Remove string.h use and code *
// * strlen by hand *
// * 2. Add comment block and comments*
// * inline. *
// * 3. free() storage *
// * 4. Use macro for testing *
// * 5. Bug: calculation of *
// * lngNewLength used incorrect *
// * index (intIndex3 instead of 2)*
// * which caused a memory leak. *
// * *
// * 02 07 10 Nilges 1. Bug: Ike Naar test failed. *
// * At end of scan loop, main *
// * string pointer was not *
// * correctly updated from index3 *
// * *
// * 02 07 10 Nilges Added new Santosh test *
// * *
// * 02 07 10 Nilges Added new Ike Naar test *
// * *
// * 02 08 10 Nilges 1. Added some new comments *
// * 2. Make "check for a complete *
// * match "structured" by means *
// * of a tested assignment *
// * 3. Get rid of "replace at end" *
// * evilness: the only time this *
// * flag is meaningful is in the *
// * LAST segment. *
// * 4. Return replace count *
// * 5. TESTER macro assertion added *
// * *
// * 02 10 10 Nilges 1. Bug fix: in a partial match, *
// * the main scan index is set to *
// * one past the end, which misses*
// * full matches that start *
// * between the first character of*
// * the partial match and its end.*
// * *
// * No longer updating the main *
// * scan index (ptrIndex1) to the *
// * point of partial match *
// * failure: setting it one past *
// * the first character of the *
// * partial match. *
// * *
// * 2. Line up expected & actual *
// * results per excellent *
// * suggestion (who made this?) *
// * *
// * 3. After a partial match, update *
// * the main handle search index *
// * (ptrIndex1) to the first *
// * occurrence of the handle *
// * character after the start of *
// * the partial match, or to the *
// * index of the unmatched char. *
// * ----------------------------------------------------------- *
// * *
// * "In the near future we shall have to live with the *
// * superstition that programming is 'so easy that even a *
// * Republican can do it!'" *
// * *
// * - E. W. Dijkstra *
// * *
// * *
// ***************************************************************
#include <stdio.h>
#include <stdlib.h>
// ***** Segmentation *****
struct TYPsegmentstruct
{ char * strSegment;
long lngSegmentLength;
struct TYPsegmentstruct * ptrNext; };
// ---------------------------------------------------------------
// Calculate string length
//
//
long strLength(char *strInstring)
{
char *ptrInstring;
for (ptrInstring = strInstring; *ptrInstring; ptrInstring++);
return ptrInstring - strInstring;
}

// ---------------------------------------------------------------
// Replace target by replacement string in master string
//
//
// Caution: the string returned by this function should be freed.
//
//
char * replace(char * strMaster,
char * strTarget,
char * strReplacement,
long * ptrReplacements)
{
char * ptrIndex0;
char * ptrIndex1;
char * ptrIndex2;
char * ptrIndex3;
char * ptrIndex4;
char * strNew;
char * strNewStart;
long lngNewLength;
long lngCount;
long lngReplacementLength;
struct TYPsegmentstruct * ptrSegmentStructStarts;
struct TYPsegmentstruct * ptrSegmentStruct;
struct TYPsegmentstruct * ptrSegmentStructPrev;
lngReplacementLength = strLength(strReplacement);
if (!*strTarget)
{
printf("Error in calling replace(): target can't be null");
abort();
}
ptrIndex1 = strMaster;
ptrSegmentStructPrev = 0;
lngNewLength = 0;
*ptrReplacements = 0;
while(*ptrIndex1)
{
ptrIndex0 = ptrIndex1;
while (-1)
{
// --- Check for (one character) handle
for(;
*ptrIndex1 && *ptrIndex1 != *strTarget;
ptrIndex1++);
// --- Check for complete match while remembering the
// --- last position of the handle
ptrIndex4 = 0;
for(ptrIndex2 = strTarget + 1,
ptrIndex3 = ptrIndex1 + 1;
*ptrIndex3
&&
*ptrIndex2
&&
*ptrIndex3 == *ptrIndex2;
ptrIndex3++, ptrIndex2++)
{
if (*ptrIndex3 == *strTarget
&&
ptrIndex4 == 0) ptrIndex4 = ptrIndex3;
}
// End test: check complete match, update main ptr past
// partial match while checking for end of loop
if ((!*ptrIndex2 ? ((*ptrReplacements)++, -1) : 0)
||
(!*ptrIndex3 ? (ptrIndex1 = ptrIndex3, -1) : 0))
break;
// Update the main search pointer
ptrIndex1 = (ptrIndex4 == 0 ? ptrIndex3 : ptrIndex4);
}
// --- Create new segment
if (!(ptrSegmentStruct =
malloc(sizeof(struct TYPsegmentstruct))))
abort();
ptrSegmentStruct->strSegment = ptrIndex0;
ptrSegmentStruct->lngSegmentLength =
ptrIndex1 - ptrIndex0;
ptrSegmentStruct->ptrNext = 0;
if (ptrSegmentStructPrev != 0)
ptrSegmentStructPrev->ptrNext = ptrSegmentStruct;
else
ptrSegmentStructStarts = ptrSegmentStruct;
ptrSegmentStructPrev = ptrSegmentStruct;
// --- Update mallocation length
lngNewLength += ptrSegmentStruct->lngSegmentLength +
(!*ptrIndex2
?
lngReplacementLength
:
0);
// --- Get past end of target string & iterate
ptrIndex1 = ptrIndex3;
}
// --- Allocate just enough storage for the new string
if (!(strNewStart = malloc(lngNewLength + 1))) abort();
// --- Build the new string whilst freeing the list
strNew = strNewStart;
ptrSegmentStruct = ptrSegmentStructStarts;
while (ptrSegmentStruct)
{
for (ptrIndex1 = ptrSegmentStruct->strSegment,
lngCount = 0;
lngCount < ptrSegmentStruct->lngSegmentLength;
ptrIndex1++, lngCount++, strNew++)
*strNew = *ptrIndex1;
if (ptrSegmentStruct->ptrNext || !*ptrIndex2)
for (ptrIndex1 = strReplacement;
*ptrIndex1;
ptrIndex1++, ++strNew)
*strNew = *ptrIndex1;
ptrSegmentStructPrev = ptrSegmentStruct;
ptrSegmentStruct = ptrSegmentStruct->ptrNext;
free(ptrSegmentStructPrev);
}
*strNew = '\0';
return strNewStart;
}

// ---------------------------------------------------------------
// Statement-format test macro
//
//
#define TESTER(resultPtr, master, target, replacement, expected,
expectedReplacements, replacements) \
{ \
printf("Replace \"%s\" by \"%s\" in \"%s\"\n", \
(target), (replacement), (master)); \
printf("Expect \"%s\":\n \"%s\"\n", \
(expected), \
resultPtr = replace((master), \
(target), \
(replacement), \
&(replacements))); \
printf("Replacements expected: %d: replacements: %d\n", \
(expectedReplacements), \
(replacements)); \
if (!(strLength(resultPtr) \
== \
strLength(master) \
+ \
(strLength(replacement)-strLength(target)) \
* \
replacements)) \
printf("Assertion failed\n"); \
printf("\n\n"); \
free(resultPtr); \
}

// ---------------------------------------------------------------
// Main procedure
//
//
int main()
{
char *ptrResult;
long lngReplacements;
printf("\nReplace\n\n\n");
TESTER(ptrResult,
"1111123bbb1111123bbb11123bb11111231111112111111123",
"111123",
"ono",
"1onobbb1onobbb11123bb1ono1111112111ono",
4,
lngReplacements)
TESTER(ptrResult,
"bbb1111123bbbbb",
"111123",
"ono",
"bbb1onobbbbb",
1,
lngReplacements)
TESTER(ptrResult,
"a stupid error",
"stupid error",
"miracle",
"a miracle",
1,
lngReplacements)
TESTER(ptrResult,
"a stupid error",
"stupid",
"miracle",
"a miracle error",
1,
lngReplacements)
TESTER(ptrResult,
"the stupid error",
"the stupid error",
"a miracle",
"a miracle",
1,
lngReplacements)
TESTER(ptrResult,
"the miracle",
"the",
"a",
"a miracle",
1,
lngReplacements)
TESTER(ptrResult,
"a miraclsnirpKamunkle",
"snirpKamunkle",
"e",
"a miracle",
1,
lngReplacements)
TESTER(ptrResult,
"a miraclesnirpKamunkle",
"a miracle",
"",
"snirpKamunkle",
1,
lngReplacements)
TESTER(ptrResult,
" a miraclesnirpKamunkle",
"a miracle",
"",
" snirpKamunkle",
1,
lngReplacements)
TESTER(ptrResult,
" a miraclesnirpKamunklea miraclea miracle",
"a miracle",
"",
" snirpKamunkle",
3,
lngReplacements)
TESTER(ptrResult,
"a miracle a miraclesnirpKamunkle a Miraclea miraclea
miracle",
"a miracle",
"",
" snirpKamunkle a Miracle",
4,
lngReplacements)
TESTER(ptrResult,
"a stupid errord",
"stupid error",
"miracle",
"a miracled",
1,
lngReplacements)
TESTER(ptrResult,
"a stupid errod",
"stupid error",
"miracle",
"a stupid errod",
0,
lngReplacements)
TESTER(ptrResult,
"a sstupid error",
"stupid error",
"miracle",
"a smiracle",
1,
lngReplacements)
TESTER(ptrResult,
"a stupid errorstupid error",
"stupid error",
"miracle",
"a miraclemiracle",
2,
lngReplacements)
TESTER(ptrResult,
"a stupid error stupiderror",
"stupid error",
"miracle",
"a miracle stupiderror",
1,
lngReplacements)
TESTER(ptrResult,
"bbbbbbbbbb",
"b",
"a",
"aaaaaaaaaa",
10,
lngReplacements)
TESTER(ptrResult,
"In the halls of R'yleh great %s lies dreaming",
"%s",
"Cthulu",
"In the halls of R'yleh great Cthulu lies dreaming",
1,
lngReplacements)
TESTER(ptrResult,
"%s%s%s%s%s%s",
"%s",
"Cthulu",
"CthuluCthuluCthuluCthuluCthuluCthulu",
6,
lngReplacements)
TESTER(ptrResult,
"banana",
"ana",
"oat",
"boatna",
1,
lngReplacements)
TESTER(ptrResult,
" a stupid errorstupid errorHeystupid errors",
"stupid error",
"+",
" a ++Hey+s",
3,
lngReplacements)
TESTER(ptrResult,
"foo barfoo barf",
"foo bar",
"bas",
"basbasf",
2,
lngReplacements)
TESTER(ptrResult,
"abab",
"ba",
"ba",
"abab",
1,
lngReplacements)
TESTER(ptrResult,
"abab",
"bab",
"boop",
"aboop",
1,
lngReplacements)
TESTER(ptrResult,
"banana",
"ana",
"ono",
"bonona",
1,
lngReplacements)
printf("\n\nTesting complete: check output carefully: \"Assertion
failed\" should not occur!\n\n");
return 0;
}


Replace


Replace "111123" by "ono" in
"1111123bbb1111123bbb11123bb11111231111112111111123"
Expect "1onobbb1onobbb11123bb1ono1111112111ono":
"1onobbb1onobbb11123bb1ono1111112111ono"
Replacements expected: 4: replacements: 4


Replace "111123" by "ono" in "bbb1111123bbbbb"
Expect "bbb1onobbbbb":
"bbb1onobbbbb"
Replacements expected: 1: replacements: 1


Replace "stupid error" by "miracle" in "a stupid error"
Expect "a miracle":
"a miracle"
Replacements expected: 1: replacements: 1


Replace "stupid" by "miracle" in "a stupid error"
Expect "a miracle error":
"a miracle error"
Replacements expected: 1: replacements: 1


Replace "the stupid error" by "a miracle" in "the stupid error"
Expect "a miracle":
"a miracle"
Replacements expected: 1: replacements: 1


Replace "the" by "a" in "the miracle"
Expect "a miracle":
"a miracle"
Replacements expected: 1: replacements: 1


Replace "snirpKamunkle" by "e" in "a miraclsnirpKamunkle"
Expect "a miracle":
"a miracle"
Replacements expected: 1: replacements: 1


Replace "a miracle" by "" in "a miraclesnirpKamunkle"
Expect "snirpKamunkle":
"snirpKamunkle"
Replacements expected: 1: replacements: 1


Replace "a miracle" by "" in " a miraclesnirpKamunkle"
Expect " snirpKamunkle":
" snirpKamunkle"
Replacements expected: 1: replacements: 1


Replace "a miracle" by "" in " a miraclesnirpKamunklea miraclea
miracle"
Expect " snirpKamunkle":
" snirpKamunkle"
Replacements expected: 3: replacements: 3


Replace "a miracle" by "" in "a miracle a miraclesnirpKamunkle a
Miraclea miraclea miracle"
Expect " snirpKamunkle a Miracle":
" snirpKamunkle a Miracle"
Replacements expected: 4: replacements: 4


Replace "stupid error" by "miracle" in "a stupid errord"
Expect "a miracled":
"a miracled"
Replacements expected: 1: replacements: 1


Replace "stupid error" by "miracle" in "a stupid errod"
Expect "a stupid errod":
"a stupid errod"
Replacements expected: 0: replacements: 0


Replace "stupid error" by "miracle" in "a sstupid error"
Expect "a smiracle":
"a smiracle"
Replacements expected: 1: replacements: 1


Replace "stupid error" by "miracle" in "a stupid errorstupid error"
Expect "a miraclemiracle":
"a miraclemiracle"
Replacements expected: 2: replacements: 2


Replace "stupid error" by "miracle" in "a stupid error stupiderror"
Expect "a miracle stupiderror":
"a miracle stupiderror"
Replacements expected: 1: replacements: 1


Replace "b" by "a" in "bbbbbbbbbb"
Expect "aaaaaaaaaa":
"aaaaaaaaaa"
Replacements expected: 10: replacements: 10


Replace "%s" by "Cthulu" in "In the halls of R'yleh great %s lies
dreaming"
Expect "In the halls of R'yleh great Cthulu lies dreaming":
"In the halls of R'yleh great Cthulu lies dreaming"
Replacements expected: 1: replacements: 1


Replace "%s" by "Cthulu" in "%s%s%s%s%s%s"
Expect "CthuluCthuluCthuluCthuluCthuluCthulu":
"CthuluCthuluCthuluCthuluCthuluCthulu"
Replacements expected: 6: replacements: 6


Replace "ana" by "oat" in "banana"
Expect "boatna":
"boatna"
Replacements expected: 1: replacements: 1


Replace "stupid error" by "+" in " a stupid errorstupid errorHeystupid
errors"
Expect " a ++Hey+s":
" a ++Hey+s"
Replacements expected: 3: replacements: 3


Replace "foo bar" by "bas" in "foo barfoo barf"
Expect "basbasf":
"basbasf"
Replacements expected: 2: replacements: 2


Replace "ba" by "ba" in "abab"
Expect "abab":
"abab"
Replacements expected: 1: replacements: 1


Replace "bab" by "boop" in "abab"
Expect "aboop":
"aboop"
Replacements expected: 1: replacements: 1


Replace "ana" by "ono" in "banana"
Expect "bonona":
"bonona"
Replacements expected: 1: replacements: 1




Testing complete: check output carefully: "Assertion failed" should
not occur!
 
B

Ben Bacarisse

Moi said:
BTW, here is mine, very similar to Willem's

I don't think Willem's used a list. I am not sure that the cost of
strstr is high enough that saving a call at the expense of a malloc
and free worthwhile. Of course, it depends on the quality of the
implementation of all of these functions and on the nature of the
replacements being made.

One thing I'd consider doing is special-casing those calls that have a
few replacements. With a fixed array of a few pointers all of the
repeat strstr calls can be avoided for all of these simple cases.
(error handling, and adding const qualifiers are left as an exercise to the reader ...)
Sorry for my strange style of indentation...

****/

#include <string.h>
#include <stdlib.h>
#include <stdio.h>

struct nuke {
struct nuke *nxt;
char *ptr;
};

char * replace(char * haystack, char *needle, char *nail)
{
struct nuke *ammo, *next, **hnd ;
char *result, *tail, *src,*dst;
size_t haystack_len, result_len, needle_len, nail_len, len;
unsigned nhit;

needle_len = strlen (needle);
nail_len = strlen (nail);

nhit = 0;
ammo = NULL;
hnd = &ammo;
for (tail = haystack; src = strstr(tail, needle); tail = src + needle_len) {
*hnd = malloc (sizeof *hnd);
fprintf(stderr, "Hit @%u:%s\n", (unsigned) (src-haystack), src);
(*hnd)->ptr = src;
(*hnd)->nxt = NULL;
nhit++;
hnd = &(*hnd)->nxt;
}

haystack_len = (tail-haystack) + strlen (tail);
result = malloc (haystack_len + nhit * (nail_len-needle_len) );

If nail_len is > needle_len this goes wrong. You need something like:

haystack_len + nhit * nail_len - nhit * needle_len
/* if (!nhit) {
memcpy( result, haystack, haystack_len+1);
return result;
} */

dst = result, src = haystack;
for ( ; ammo; free (ammo), ammo=next) {
next = ammo->nxt;
if (len = (ammo->ptr - src)) {
fprintf(stderr, "Copy %u bytes @%u:%s\n"
, (unsigned) len, (unsigned) (src-haystack), src);
memcpy( dst, src, (unsigned) len );
src += len;
dst += len;
}

I am pretty sure that memcpy(dst, src, 0); is safe, so testing for len
!= 0 is not needed. Do you have some evidence that it is worthwhile?
I'd worry about pipeline flushes.
fprintf(stderr, "replace %u/%u bytes @%u:%s\n"
, needle_len, nail_len, (unsigned) (src-haystack), src);
memcpy( dst, nail, nail_len);
src += needle_len;
dst += nail_len;
}
/* copy the final segment */
strcpy(dst,src);
return result;
}

<snip>
 
B

Ben Bacarisse

Boyer Moore doesn't require preprocessing/tables.

I can't see how it can be done without. Is there a trick I'm missing?
Knuth Morris
Pratt does. Both can be implemented as re-entrant routines.
There are faster algorithms that do require significant
reprocessing. BM and KMP can be implemented re-entrantly as can
almost all algorithms. Requiring re-entrancy is (almost)
orthogonal to OO.

Ack.
 
B

blmblm

I don't particularly either, but I dispute the implicit claim that I ever
said it was particularly a good thing that I have taken no CS courses.

Indeed, in one of your posts I quoted upstream, I think you said, or
implied, the contrary:

To me that seems to be a fairly clear statement that you recognize that
your lack of formal training has advantages and disadvantages. Why it
isn't perceived as such by spinoza1111 -- eh, not going there.
I
pointed it out because I thought it was funny when contrasted with the
constant claims (by Nilges) that I was an ivory tower academic who had
spent too much time on graduate-level CS and not enough time programming.

Did I miss a previous interaction? As best I can tell from skimming
through older posts, the first mention of your lack of CS coursework
in this thread was in response to a spinoza1111 comment about your
having "AP'd out of CS 101", which if anything would seem to imply
insufficient coursework .... :


I'm not sure this really says anything about academic work, since one
could "overspecialize" without it, no?

Not important. Just sayin'. Trying to resist the impulse to continue
this beyond-topic-drift ....

[ snip ]
 
B

blmblm

spinoza1111 said:
Nick Keighley <[email protected]> wrote:

[ snip ]
as may Dijsktra's memory; we badly need a solid
scientific biography of Dijkstra for the general reader, since so few
people outside the field know of his contributions.
And the ones inside the field seem to mainly know him as the inventor
of a particular graph algorithm. :-(
Interesting guy, as best I can tell, and/but a bit eccentric.
God, I hate that word, "eccentric". Its use normalizes deviance.

Propose a better one. (I'm mildly curious about what "normalizes
deviance" means in context, but not optimistic that an attempt to
clarify would help. <shrug>)

It is a term of art developed by the engineering anthropologist Diane
Vaughan to describe predominantly male technical groups that find ways
of "normalizing" deviations from standards and rules "in order to get
a job done".

I was right not to be optimistic .... Who (other than you) knew
that "eccentric" was a "term of art"? It seems to me to be,
in most contexts, an ordinary word with a fairly well-defined
meaning, though perhaps the meaning is to some extent a function
of the speaker's worldview.
She developed it in a groundbreaking study of the 1986 "let's kill a
teacher" Challenger mission, doing a detailed study of the physics of
the O-ring assembly that failed in that launch.

In reviewing the transcript of the meeting at the O-ring subcontractor
before the launch, she discovered that managers were "deviantly"
nagging engineers to approve the launch although the engineers KNEW
that the O-rings had not been tested in full under the unusually low
temperatures they'd endure in the January launch.

Her book "The Challenger Launch Decision: Risky Technology, Culture
and Deviance at NASA" (http://www.amazon.com/Challenger-Launch-
Decision-Technology-Deviance/dp/0226851761/ref=sr_1_1?
ie=UTF8&s=books&qid=1265704835&sr=8-1) identified what anthropologists
call a culture of dismissing engineers' concerns when management, over-
responsive to political pressure to demonstrate American macho
prowess. Unfortunately, although she served on the review panel, her
message wasn't heeded. The same sort of normalized deviance caused the
crash of the Columbia space shuttle in 2003. Here, known problems with
insulation padding were "normalized" in the same way Seebach
"normalized" his deviant failure to foresee that a character scan
would create a problem, with the same sort of pseudo-macho posturing,
and the same sort of bullying of dissidents.

So, what are you saying here? That "eccentric" should be taken to
mean "not complying with sensible standards, in a way that may cause
harm but will be excused"? I'm not familiar with this usage. <shrug>
Enough topic drift for now, maybe.

[ snip ]
 
B

blmblm

[ snip ]

[ "he" == Seebs ]
So don't agree. However, he's established it:

"The heap is a DOS term".

A statement which he later acknowledged was not the best way
of making his intended point. Why keep quoting a phrase that
its author has acknowledged was unclear? (Mostly a rhetorical
question.)
"Here's my code. It doesn't rilly work, and to fix it you have to mung
two places to add a lookahead, and if I knew my ass from a hole in the
ground, which I don't, I'd know that the lookahead has to lookahead
for 's' in two places, meaning that if the character changes there are
too places to change, but I'm cute."

What does it mean for code to "work"? One meaning is that it
meets its specification, in the sense that if the input meets the
stated preconditions the output meets the stated postconditions.
As best I can tell, Seebs's code does that. That he didn't write
a more-general function for replacing text within a string -- I
suppose one could make a case for the idea that this is not best
practice, because some future reader of the code might think,
despite clear documentation of its limitations, that it was more
general than it is. Whether that's worse than not solving the
immediate problem within the time constraints ....

I think I'll leave it at that.
You're missing the main reason. It's to be able to think about code
without some thug manager breathing down your neck. It is a taste of
human freedom and the civility that results.

Well, I would certainly never have guessed that as a reason.
In my experience, CS instructors are not immune to the temptation
to invent silly rules and penalize those who break them. *Good*
CS instructors may encourage their students to think deeply
about what they're doing, but it's far from clear to me that all
instructors meet this lofty goal.

Again, probably best to stop there.

[ snip ]
 
B

blmblm

The more I think about it, the more I think this is not a mere
quibble about terminology, and that it does matter whether it's
the problem or the solution that's "NP complete" -- I mean, the
point is not that an NP-complete problem has *A* solution that's
not polynomial-time, but that *ALL* solutions are thought to be
in that category.

That such problems may have approximate solutions that are
polynomial-time is useful in practice but somewhat beside the
point.
You're mistaking the verbal flexibility of someone who's actually
studied NP complete issues (graduate level class in algorithms, grade
== A) with misuse of "terminology", where unqualified little clerks
have to watch their mouths and use of "terminology" lest their brutal
slave-masters kill them for using the wrong "ears of corn", of
shibboleths.

You know, I'm inclined to think that quibbling about terminology
is apt to be *more* prevalent in academia than elsewhere.
Perhaps your experience differs from mine in that regard.

[ snip ]
 
M

Moi

I don't think Willem's used a list. I am not sure that the cost of

This thread is becoming too fuzzy. I admit: I haven't looked Willems
implementation up. (well, I did, but I could not find it)
strstr is high enough that saving a call at the expense of a malloc and
free worthwhile. Of course, it depends on the quality of the
implementation of all of these functions and on the nature of the
replacements being made.

I don't think that saving the extra pass is worthwhile, either.
And the linked list is overkill, too.
Just an oversized linear array of pointers + maybe realloc()
One thing I'd consider doing is special-casing those calls that have a
few replacements. With a fixed array of a few pointers all of the
repeat strstr calls can be avoided for all of these simple cases.
Agreed.



If nail_len is > needle_len this goes wrong. You need something like:

haystack_len + nhit * nail_len - nhit * needle_len


Oops, yes. Either that, or cast to signed int.
I am pretty sure that memcpy(dst, src, 0); is safe, so testing for len
!= 0 is not needed. Do you have some evidence that it is worthwhile?
I'd worry about pipeline flushes.

It's a tic I got from the DOS ages, before K&R2 and c89 / c90.
(Before Schildt, actually...)

The DOS-compilers often disagreed about memcpy(a,b,0), (maybe because
of the i86 "REPZ MOVSB" instruction?) and this was the way to have it
always behave well.

Nowadays memcpy() is often inlined, so the first two arguments will
probably not even dereferenced before the the third is proven non-zero.
So I would expect no cache effects.
And the extra condition is bad for the pipeline, yes.
(my guess is that the compiler would combine the two jump-targets,
so the number of jumps-per-loop is probably the same.


AvK
 
S

Seebs

1. Such an instruction may exist. Some processors have hardware
string processing.
2. "Best" algorithm is always restricted by input set. For the input
set I was using, a fairly naive strstr wins handily. For some input
sets, yours will be horrible and the "super studly" algorithms will
win by enough that the preprocessing and tables are worth it. So
no, yours wasn't "the best", it was just "a pretty good bet most of the
time" -- and on at least one system I know, strstr() uses precisely
that algorithm, only without the bugs.

Not really.
No.

Boyer Moore doesn't require preprocessing/tables. Knuth Morris
Pratt does. Both can be implemented as re-entrant routines.
There are faster algorithms that do require significant
reprocessing. BM and KMP can be implemented re-entrantly as can
almost all algorithms. Requiring re-entrancy is (almost)
orthogonal to OO.

Exactly.

-s
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,780
Messages
2,569,611
Members
45,280
Latest member
BGBBrock56

Latest Threads

Top