Efficiency and the standard library

spinoza1111 · Feb 27, 2010

Thanks, I think.

I guess.

I still have no clue what his complaint about my version is. It works, and

I'm not talking about your version. You get an F, since you never met
the problem requirements (to not use string.h). I'm talking about
WILLEM'S solution, not yours.

The fact is, Seebach, you're bragging about taking so little time that
it's becoming obvious, and not only to me, that you're an
inexperienced and careless programmer whose main job (which you said
was finding and reporting bugs) isn't programming. You cover this fact
up with bluster about how little time you take, but we know that you
work too fast: your strlen **** up is Exhibit A.

spinoza1111 · Feb 27, 2010

[ snip ]

Oh gee, this was Walter. If anything I said (including my poem)
implies that my opinion of blm is lower, I withdraw it. I already had
a low opinion of Walter.

Which poem was that .... (I have to say that I usually skip your
verse.)

Of course you do, dear Ms Massingill,
Of course you do, for we live in an age of songs of the doomed
Tuneless dirges are all you hear
The music of language, what you fear.

It was where I replied to Walter's saying I'm in the middle of the
pack. Got my goat, since I wouldn't be in the middle of any pack that
would have me. I am either way ahead or light years behind any pack
here.

spinoza1111 · Feb 27, 2010

spinoza1111 said:

[ snip ]
[ .... ] Professor Massengill [ .... ]
That's twice -- well, sort of, because this time you came a little
closer to spelling it right, though the variant you chose does
lend a bit of credibility to Seebs's suggestion about whether
the spelling mistake is a simple typo or something else. (I'm
uncertain about whether to add a here.)

Please don't imagine that he has any credibility. I was merely too
lazy to check the spelling, and frankly, people have abused my own
name too much for me to be all sensitive.

And yet you don't hesitate to point out others' misspellings, and
to call them -- I forget, is it "illiterate" or "aliterate"?

Give me a break. That reasoning makes no sense. You've got one data
point when my posts are long enough to make the magnitude of the error
1/bigX, whereas other posters are large n over small x. I am often too
lazy to check my spelling. Wanna know why? Because in fact, I'm
literate enough not to have to.

Just sayin'.

[ snip ]

Here, my extra care is not compensatory, I see, any more than
Seebach's.

I have no idea what you mean here.

Deal.

Okay, good.

[ snip ]

spinoza1111 · Feb 27, 2010

"spinoza1111" wrote in message:
while(*ptrIndex1)
{
    ptrIndex0 = ptrIndex1;
    while (-1)
    {
        // --- Check for (one character) handle
        for(;
            *ptrIndex1 && *ptrIndex1 != *strTarget;
            ptrIndex1++);
        // --- Check for complete match while remembering the
        // --- last position of the handle

Yes, this "for" seems to run fine,
but remembering the last position of the handle takes memory, which means using malloc or taking
memory from the stack: what if there are 1,000,000s of "matches"?

There's no problem, I think. The last position of the handle is used
and thrown away. If there are millions of matches, there IS a
"problem". This is a large linked list of unmatching segments.
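For anyone following the fragment quoted above, here is a minimal sketch of the "handle" idea: scan for the first character of the target, then check for a complete match, with no `<string.h>`. The name `find_from` is hypothetical and this is not the code posted in the thread, only an illustration of the technique under that assumption.

```c
#include <stddef.h>

/* Hypothetical helper (not from the thread): returns a pointer to the
   first occurrence of target in s, or NULL if there is none.
   Uses no <string.h> calls. */
static const char *find_from(const char *s, const char *target)
{
    if (*target == '\0')
        return s;                  /* an empty target matches at once */

    for (; *s; s++) {
        const char *p, *t;

        if (*s != *target)         /* scan for the one-character handle */
            continue;

        /* handle found: check for a complete match */
        for (p = s, t = target; *t && *p == *t; p++, t++)
            ;
        if (*t == '\0')
            return s;              /* the whole target matched here */
    }
    return NULL;
}
```

Nothing is stored per match here: the handle position lives in a local variable and is discarded when the scan moves on, which is the point being made about storage.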

However, Willem's code and mine both demonstrate, I think, the minimal
storage complexity of the problem. It appears to me that most other
solutions lazily do a big malloc, and some of them may even forget to
free() what isn't needed.

As far as I can see (and I must say I may be missing something, since
the meanness and incompetence of many posters makes me loth to read
their ugly code), all other solutions are not elegant or minimal in
their use of storage. Indeed, one loudmouth here (Heathfield) would
have been incapable of arriving at a clean solution, since the linked
list of segments must for proper performance be a linked list of
pointers. The evidence from his book is that he would have allocated a
potentially huge struct element, with an idiotic char array, and
copied bytes, resulting in completely unacceptable time complexity in
addition to unacceptable storage usage.

But I would I could write and read your native language. I may be
missing something.

spinoza1111 · Feb 27, 2010

Nick Keighley said:

Nick Keighley wrote: [...]
For instance Jacob has been attacked in
this ng for not giving his compiler away for free, [...]
I don't agree with such attacks. He has as
much right to earn his daily bread as anyone else.
Had such attacks been made, I wouldn't have agreed with them
either.

I'm pretty sure someone asked him why he hadn't GPL'd it, and
pretty strongly implied that anyone that didn't GPL their software
also ate babies and told people what was going to happen next in
the film

This was quite a while back, IIRC, and it was either the "teapot/tea
leaves" person, or someone else clearly taunting Jacob just to get a
rise out of him and enjoy the inevitable subsequent debates. So IMO
neither Jacob nor anyone else need take those posts as serious
criticism of his product.

They just good ole boys. They just havin' a little FUN with that
nigra. Pity things got outa hand.

Chris M. Thomasson · Feb 27, 2010

"spinoza1111" wrote in message:

There's no problem, I think. The last position of the handle is used
and thrown away. If there are millions of matches, there IS a
"problem". This is a large linked list of unmatching segments.

However, Willem's code and mine both demonstrate, I think, the minimal
storage complexity of the problem. It appears to me that most other
solutions lazily do a big malloc, and some of them may even forget to
free() what isn't needed.

Well, IMVHO, your solution is fairly inefficient wrt storage because it
dynamically allocates a linked list node via `malloc()' for every darn
match. For instance, the following:

replace("abababababababababa", "a", "b", &x);

gives me 11 calls to `malloc()', 10 calls to `free()', plus one `free()' I
have to do in order to reclaim the returned string. Okay, this means that
you are using 160 bytes (or perhaps 240 on a 64-bit system) of dynamically
allocated storage to hold the linked-list for a source string that is only
19 characters long.

Why not at least do a simple stack-based region allocator for linked-list
nodes and switch over to `malloc()' when that is exhausted? Or attempt to
amortize everything by letting a single list node hold multiple matches?
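A rough sketch of the first suggestion, assuming a node layout of my own invention (`seg_node`, `node_region`, and `REGION_NODES` are hypothetical names, not taken from any code in this thread): nodes come from a fixed array in the caller's stack frame until it runs out, and only then from `malloc()`.

```c
#include <stddef.h>
#include <stdlib.h>

#define REGION_NODES 64            /* assumed capacity; tune for the workload */

struct seg_node {                  /* hypothetical node layout */
    const char *start;             /* start of an unmatched segment */
    size_t len;                    /* its length */
    struct seg_node *next;
    int from_heap;                 /* nonzero if the node came from malloc */
};

struct node_region {
    struct seg_node slots[REGION_NODES];   /* lives in the caller's frame */
    size_t used;                            /* slots handed out so far */
};

/* Hand out a region slot while any remain, else fall back to malloc.
   Returns NULL only if the fallback malloc fails. */
static struct seg_node *node_alloc(struct node_region *r)
{
    struct seg_node *n;

    if (r->used < REGION_NODES) {
        n = &r->slots[r->used++];
        n->from_heap = 0;
    } else {
        n = malloc(sizeof *n);
        if (n != NULL)
            n->from_heap = 1;
    }
    return n;
}

/* Only nodes that came from malloc are passed to free(). */
static void node_free(struct seg_node *n)
{
    if (n != NULL && n->from_heap)
        free(n);
}
```

The caller declares a `struct node_region` as a local, sets `used` to 0, and walks the list calling `node_free()` when done. For a short input such as the 19-character example, every node would presumably fit in the region, so the list itself would cost no `malloc()` calls at all.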

[...]

spinoza1111 · Feb 28, 2010

Well, IMVHO, your solution is fairly inefficient wrt storage because it
dynamically allocates a linked list node via `malloc()' for every darn
match. For instance, the following:

replace("abababababababababa", "a", "b", &x);

gives me 11 calls to `malloc()', 10 calls to `free()', plus one `free()' I
have to do in order to reclaim the returned string. Okay, this means that
you are using 160 bytes (or perhaps 240 on a 64-bit system) of dynamically
allocated storage to hold the linked-list for a source string that is only
19 characters long.

Good point. You present a string which results in a linked list of
addresses of characters as a potential worst case, and you're right.

This could occur in genetics applications with which Malcolm seems to
work.

And there are many things we can do about this problem. Sure, each
linked list node could in general hold not a pointer and a length, but
an array of pointers and lengths of fixed size as a cache.
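To make that concrete, here is a hedged sketch of such a node. The names (`segment`, `seg_chunk`, `chunk_append`) and the size 32 are assumptions for illustration and do not come from the posted code.

```c
#include <stddef.h>
#include <stdlib.h>

#define SEGS_PER_NODE 32           /* assumed cache size per node */

struct segment {
    const char *p;                 /* start of an unmatched run in the source */
    size_t len;                    /* length of that run */
};

struct seg_chunk {
    struct segment segs[SEGS_PER_NODE];
    size_t count;                  /* slots in use */
    struct seg_chunk *next;
};

/* Append one segment, allocating a new chunk only when the tail is full
   (or when there is no tail yet). Returns the possibly new tail, or NULL
   if malloc fails, in which case the existing list is left untouched. */
static struct seg_chunk *chunk_append(struct seg_chunk *tail,
                                      const char *p, size_t len)
{
    if (tail == NULL || tail->count == SEGS_PER_NODE) {
        struct seg_chunk *c = malloc(sizeof *c);
        if (c == NULL)
            return NULL;
        c->count = 0;
        c->next = NULL;
        if (tail != NULL)
            tail->next = c;
        tail = c;
    }
    tail->segs[tail->count].p = p;
    tail->segs[tail->count].len = len;
    tail->count++;
    return tail;
}
```

With this layout the allocator is called once per 32 segments instead of once per match, at the cost of some unused slots in the last chunk.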

Or we could throw out linked lists and return instead an array of
pointers and lengths. We could use run length encoding in this array,
whose element would look like this:

struct changes { int repeatMe; char *p; int len; };

But soft. The problem is becoming one of representing strings in the
best way possible. The string in your example is itself "inefficient"
since it could be represented more compactly using a repeat count. You
pass my solution weird data, yes, its performance suffers.

And yes, we're somewhat responsible for not having surprise
performance issues for unexpected data. Those of us who are real
programmers lose beauty sleep over this, unlike Richard Heathfield,
who doesn't seem to care that in his "reusable" linked list a large
amount of data will be a performance hog.

But now you're talking about how best to represent strings, which is
best done not in C but in an OO language, because inside the object we
can represent the string how the hell we want.

Chris M. Thomasson · Feb 28, 2010

Good point. You present a string which results in a linked list of
addresses of characters as a potential worst case, and you're right.
This could occur in genetics applications with which Malcolm seems to
work.

Indeed.

And there are many things we can do about this problem. Sure, each
linked list node could in general hold not a pointer and a length, but
an array of pointers and lengths of fixed size as a cache.

Or we could throw out linked lists and return instead an array of
pointers and lengths. We could use run length encoding in this array,
whose element would look like this:

struct changes { int repeatMe; char *p; int len; };

But soft. The problem is becoming one of representing strings in the
best way possible. The string in your example is itself "inefficient"
since it could be represented more compactly using a repeat count. You
pass my solution weird data, yes, its performance suffers.

Okay; fair enough Edward. BTW, I created an "automated" testing framework
for sub-string replacement/search algorithms:

http://groups.google.com/group/comp.lang.c/browse_frm/thread/6c021e75cf801832

As you can see I am using it to test your algorithm and a "newer" one of
mine that I have not formally posted to this news group yet. Both of our
algorithms are passing 10,000,000 iterations of the test. So far, this is
pretty good news!

I cannot seem to find a bug in our algorithms; cool!

:^)

And yes, we're somewhat responsible for not having surprise
performance issues for unexpected data. Those of us who are real
programmers lose beauty sleep over this, unlike Richard Heathfield,
who doesn't seem to care that in his "reusable" linked list a large
amount of data will be a performance hog.

But now you're talking about how best to represent strings, which is
best done not in C but in an OO language, because inside the object we
can represent the string how the hell we want.

If you examine the newer solution you might come to the conclusion that my
technique of amortizing the number of calls to `realloc()' is "kind of
okay". However, it just might be a performance nightmare because of the
chance that the piece of shi% native allocator might need to relocate the
storage and perform a full-blown O(N) copy on each damn call. Therefore, I
conclude that I will probably get a very big performance enhancement if I
represent the final destination string as a singly linked-list. This way, I
can amortize the number of calls to `malloc()' and guarantee that I will not
ever have to do a horrible O(N) copy. I think this is a case in favor of
more exotic representations of strings.

My algorithm builds a limited number of matches (e.g., 4096 in the code
as-is) and then flushes all of those matches to the destination string.
Expansion of the destination string only happens once per 4096 matches.
Pretty good. However, `realloc()' sucks if it has to fuc%ing copy!!!!!!!!!

GRRRRR!!!

Therefore, instead of calling `realloc()' every 4096 matches... I would
instead create a new segment of the destination string using `malloc()' and
link it to the previous via linked-list. BAM! Problem solved...
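As a rough illustration of that segment idea (this is only a sketch under my own assumptions, not the linked-list version promised later in this post; `dest_seg`, `seg_push`, and `seg_join` are hypothetical names), each flushed batch could become one heap segment that is never moved again:

```c
#include <stddef.h>
#include <stdlib.h>

struct dest_seg {
    struct dest_seg *next;
    size_t len;                    /* bytes used in data[] */
    char data[];                   /* C99 flexible array member */
};

/* Allocate a segment holding a copy of len bytes from src and link it
   after tail (tail may be NULL for the first segment). Returns the new
   tail, or NULL if malloc fails. Earlier segments are never touched. */
static struct dest_seg *seg_push(struct dest_seg *tail,
                                 const char *src, size_t len)
{
    struct dest_seg *s = malloc(sizeof *s + len);
    size_t i;

    if (s == NULL)
        return NULL;
    for (i = 0; i < len; i++)
        s->data[i] = src[i];
    s->len = len;
    s->next = NULL;
    if (tail != NULL)
        tail->next = s;
    return s;
}

/* Join all segments into one NUL-terminated string. This is the only
   full O(N) copy, done once at the end instead of on every growth step. */
static char *seg_join(const struct dest_seg *head)
{
    const struct dest_seg *s;
    size_t total = 0, pos = 0, i;
    char *out;

    for (s = head; s != NULL; s = s->next)
        total += s->len;
    out = malloc(total + 1);
    if (out == NULL)
        return NULL;
    for (s = head; s != NULL; s = s->next)
        for (i = 0; i < s->len; i++)
            out[pos++] = s->data[i];
    out[pos] = '\0';
    return out;
}
```

A caller that can consume the segments directly (writing them to a file, say) can skip `seg_join()` entirely; the join is there only for callers that still want a flat C string.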

Here is a snippet of code that encompasses my newer replacement algorithm:

/* Chris M. Thomasson's Sub-String Replacement Algorithm
__________________________________________________________________*/
#include <stdlib.h> /* malloc, realloc, free */
#include <string.h>
#define xstrstr strstr
#define xstrlen strlen
#define xstrcmp strcmp
#define xmemcpy memcpy

#define MATCH_MAX 4096

char*
chris_replace(char const* src,
              char const* cmp,
              char const* xchg)
{
  size_t cmp_size = xstrlen(cmp);
  size_t xchg_size = xstrlen(xchg);

  char* dest = NULL;
  size_t src_offset = 0;   /* source bytes consumed by flushed matches */
  size_t dest_head = 0;    /* next write position in dest */
  size_t total_count = 0;  /* matches found so far, over all batches */
  char const* head = src;
  char const* tail = src;
  char const* matches[MATCH_MAX];

  if (! cmp_size)
  {
    /* special case handler: empty search string */
    size_t src_size = xstrlen(src);

    if (! src_size)
    {
      if ((dest = malloc(xchg_size + 1)))
      {
        memcpy(dest, xchg, xchg_size);
        dest[xchg_size] = '\0';
      }
    }

    else
    {
      if ((dest = malloc(src_size + 1)))
      {
        memcpy(dest, src, src_size);
        dest[src_size] = '\0';
      }
    }

    return dest;
  }

  while (head)
  {
    size_t i;
    size_t dest_size;
    size_t src_size;
    size_t count = 0;
    char* dest_tmp;
    char const* prev_head = head;

    /* acquire matches, at most MATCH_MAX per batch */
    while ((head = xstrstr(head, cmp)))
    {
      matches[count++] = head;
      tail = head;
      head += cmp_size;
      if (count == MATCH_MAX) break;
    }

    /* calculate sizes and expand dest buffer */
    total_count += count;

    src_size = (! head) ? (tail - src) + xstrlen(tail)
                        : (head - src);

    dest_size = (src_size - (total_count * cmp_size)) +
                (total_count * xchg_size);

    if (! (dest_tmp = realloc(dest, dest_size + 1)))
    {
      free(dest);
      return NULL;
    }

    dest = dest_tmp;

    src_size -= src_offset;

    /* flush matches to dest buffer */
    for (i = 0; i < count; ++i)
    {
      size_t offset = matches[i] - prev_head;
      size_t stride = offset + cmp_size;
      memcpy(dest + dest_head, prev_head, offset);
      src_offset += stride;
      src_size -= stride;
      prev_head += stride;
      dest_head += offset;
      memcpy(dest + dest_head, xchg, xchg_size);
      dest_head += xchg_size;
    }

    /* copy whatever follows the last match of this batch */
    if (src_size)
    {
      memcpy(dest + dest_head, prev_head, src_size);
      dest_head += src_size;
    }

    dest[dest_size] = '\0';
  }

  return dest;
}

I know that I flaked out on avoiding `string.h', but I wanted to focus on a
specific portion of the algorithm before I integrate my non-naive sub-string
search algorithm into it.

When I get some more time, I will convert this to a linked-list and alter my
testing function to be able to verify a result string in a linked-list
against the expected NUL-terminated string.

By the time I add the linked-list destination string ability, and the
non-naive sub-string search... Well, it should outperform every solution
posted here so far. BTW, your solution, and my previous solution, are no
good for extremely large input. Both of ours might build very large
linked-lists. Mine will be amortized, but it is still not good enough! This
newer solution works well with huge input. Of course, the linked-list
version I am going to code will be even better...

Stay tuned!

;^)

spinoza1111 · Feb 28, 2010

Looks great. I will try to find time to get into this more.

spinoza1111 · Feb 28, 2010

If the linked list becomes overlarge, you should think coroutine, to
get started on assembling the output string while doing the replace.

But this involves multithreading, which is much easier in C#.

The problem is gradually becoming "how can we best replace NUL-
terminated strings with a truly encapsulated string".

spinoza1111 · Feb 28, 2010

[email protected] said:

spinoza1111 <[email protected]> wrote:

[more of the same]

Quoted for Seebs's benefit. I won't reply.

[...]

Perhaps you could explain to us how re-posting spinoza1111's ravings
benefits Seebs, or anyone else for that matter.

Eventually you will probably realize that debating spinoza1111 is
a waste of your own time (which is yours to waste) and of the rest
of our time as well. Most of his posts here are responses to other
posters' responses to him. If people started ignoring him completely,
he would probably post less, and this newsgroup's signal-to-noise
ratio would improve vastly.

*Please* stop feeding the troll (or whatever he is; there's been
some debate about whether he qualifies as a "troll", but the effect
is the same whether he does or not).

But if I'm not a troll, then why should people stop responding? Even
Seebach has admitted that the discussion I started concerning replace
was very productive for many participants, and I'd suggest that it's
been on topic and useful to people here. Whereas your contribution
seems to have been to tech review Seebach's famous off by one strlen
and miss the bug.

And...I'm not a troll.

spinoza1111 · Feb 28, 2010

[ snip ]

Results are in the middle of the pack not the stuff of a A+.
w..
Oh gee, this was Walter. If anything I said (including my poem)
implies that my opinion of blm is lower, I withdraw it. I already had
a low opinion of Walter.

Which poem was that .... (I have to say that I usually skip your
verse.)

Of course you do, dear Ms Massingill,
Of course you do, for we live in an age of songs of the doomed
Tuneless dirges are all you hear
The music of language, what you fear.

This was poorly written and unnecessarily offensive on my part, as
opposed to when it is necessary to be offensive to the truly offensive. I
apologize for this.

It was where I replied to Walter's saying I'm in the middle of the
pack. Got my goat, since I wouldn't be in the middle of any pack that
would have me. I am either way ahead or light years behind any pack
here.

spinoza1111 · Mar 1, 2010

However, it's plain that I'm a piker when it comes to being offensive.
See Heathfield's post in the thread Edward Nilges' Lie.

Because he cannot code properly and doesn't know computer science, and
is an uncultivated boor who works as a temp, he thinks it's cute to
forge letters said to be from me, a violation of the law.

Julienne and blm, you are enablers, because you don't complain to him
about his behavior: like many women in this type of situation, you're
a little dazzled by the thug and his transgressions; perhaps an
atavistic part of you is not a little excited by blood, metaphorical
or otherwise. I am using this newsgroup as intended and even Seebach
has conceded that these threads have been useful and productive. I
start them, whereas Heathfield and Seebach endeavor to destroy them
because they're not qualified to participate on a level with people
like Navia and Willem, let alone me (100% bug rate in strlen, absurd
linked list, heap a DOS term, is not a programmer per se, etc).

Julienne, blm, and Malcolm, I shall not participate until you find it
in yourselves to complain to Heathfield and Seebach in the thread
Edward Nilges' lie where Heathfield posts a letter he says is from my
lawyer: this was a criminal act. I will not read or post to this
newsgroup, and you people can return to your regularly scheduled
programming.

You may email me at (e-mail address removed). But what I would most
appreciate is copies of your post or emails to Seebach and Heathfield,
asking them to desist. This type of behavior is the norm in groups of
fair and decent people.

Otherwise, I am wasting my time here.

Malcolm, Julienne, blm: unless I hear from you by Monday March 8,
under advice of my counsel, genuine letters are going this week to
Seebach's employer and my publisher (who is also Seebach's publisher)
concerning his behavior. In addition, a letter is going to SAMS
concerning Heathfield.

Chris M. Thomasson · Mar 1, 2010

Chris M. Thomasson said:
Okay; fair enough Edward. BTW, I created an "automated" testing framework
for sub-string replacement/search algorithms:

http://groups.google.com/group/comp.lang.c/browse_frm/thread/6c021e75cf801832

As you can see I am using it to test your algorithm and a "newer" one of
mine that I have not formally posted to this news group yet. Both of our
algorithms are passing 10,000,000 iterations of the test. So far, this is
pretty good news!

I cannot seem to find a bug in our algorithms; cool!

:^)

Well, Ben Bacarisse so very kindly pointed out that my test was totally
missing a bug in your code. It turns out that my damn test was not
generating zero-length strings!!

;^(...

Now I have fixed that crap and it makes your code seg-fault. I need to hunt
down the most recent version of your code! It's hard to find in this huge
thread.

The fixed test also found a "bug" in Ben's code when the source and
comparand are both empty because it fails to output the exchange string:

http://groups.google.com/group/comp.lang.c/msg/eb7f0a86f7f1d752

http://groups.google.com/group/comp.lang.c/msg/26fb64f7229a42c8
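
For readers following along, here is a minimal sketch of the kind of randomized test described above, with the fix that string lengths are drawn from 0..max so that empty sources and comparands actually occur. It is not the framework linked above: `random_string`, `naive_find`, and `run_random_tests` are hypothetical names, the function under test is a simple stand-in search, and `strstr` is assumed to be an acceptable oracle.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: fill buf with a random string of length 0..max_len.
   The two-letter alphabet makes matches likely. buf needs max_len + 1 bytes. */
static void random_string(char *buf, size_t max_len)
{
    size_t len = (size_t)rand() % (max_len + 1);   /* 0 is a valid length */
    for (size_t i = 0; i < len; i++)
        buf[i] = (char)('a' + rand() % 2);
    buf[len] = '\0';
}

/* Stand-in for the code under test: index of the first match, or -1.
   An empty comparand matches at index 0, mirroring strstr. */
static long naive_find(const char *src, const char *cmp)
{
    size_t n = strlen(src), m = strlen(cmp);
    if (m > n)
        return -1;
    for (size_t i = 0; i + m <= n; i++)
        if (memcmp(src + i, cmp, m) == 0)
            return (long)i;
    return -1;
}

/* Returns -1 on the first mismatch against the oracle, otherwise 1 if at
   least one empty source or comparand was generated, 0 if none was. */
int run_random_tests(unsigned iterations, unsigned seed)
{
    char src[9], cmp[4];
    int saw_empty = 0;

    srand(seed);
    for (unsigned i = 0; i < iterations; i++) {
        random_string(src, sizeof src - 1);
        random_string(cmp, sizeof cmp - 1);
        if (src[0] == '\0' || cmp[0] == '\0')
            saw_empty = 1;

        const char *hit = strstr(src, cmp);          /* oracle */
        long want = hit ? (long)(hit - src) : -1;
        if (naive_find(src, cmp) != want)
            return -1;
    }
    return saw_empty;
}
```

Returning whether an empty string was ever generated lets the caller check the generator's coverage as well as the algorithm, which is the gap described above.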

Richard Bos · Mar 1, 2010

Nick Keighley said:
I'm pretty sure someone asked him why he hadn't GPL'd it, and pretty
strongly implied that anyone that didn't GPL their software also ate
babies and told people what was going to happen next in the film

Yeah, but that's just Gnidiots for you. They have that attitude towards
everybody - they don't pick out jacob specifically.

Richard

Richard Bos · Mar 1, 2010

Seebs said:
I think the key is the realization that it is quite possible to have
interesting discussions about C informed by spinoza1111's code, even though
his code is usually garbage and he appears to have an almost supernatural
talent for getting things wrong.

It's also quite possible to have interesting discussions about the
exothermic properties of the Three Mile Island reactors, but is it worth
the fallout?

Richard

Richard Bos · Mar 1, 2010

Moi said:
I have never understood the intentions of the silly DWORD typedefs.

Were they intended
* as "a machine wide int"
* an int with *exactly* 32 bits
* or just inherited from assembler naming ?

* All of the above, plus
* a vague, only fractionally understanding nod towards data hiding.

And then there's LPSTR and all that grot.
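
As a rough illustration of the difference between those intentions, the sketch below contrasts an alias in the style of DWORD with the C99 `<stdint.h>` types. The name `dword_like` and its definition as `unsigned long` are assumptions made for the example, not a copy of any platform header.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical alias in the style of DWORD; assumed for illustration only.
   Its width follows unsigned long: at least 32 bits, 64 on some platforms. */
typedef unsigned long dword_like;

/* Storage size in bits of the alias on this implementation. */
size_t dword_like_bits(void)
{
    return sizeof(dword_like) * CHAR_BIT;
}

/* uint32_t has exactly 32 bits and no padding wherever it is provided.
   uint_least32_t and uint_fast32_t are always provided but may be wider. */
size_t exact32_bits(void)
{
    return sizeof(uint32_t) * CHAR_BIT;
}
```

The alias by itself says nothing about which of the three meanings was intended, whereas the `<stdint.h>` names state it in the type.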

Richard

Ben Bacarisse · Mar 1, 2010

Chris M. Thomasson said:
The fixed test also found a "bug" in Ben's code when the source and
comparand are both empty because it fails to output the exchange
string:

This issue has come up before. There is no clear meaning for such a
call so I opted to exclude it "by contract" rather than by returning
something arbitrary. Given that the issue has come up (I think) three
times now, that seems to have been a mistake, but it can't be a bug in
the classic sense since it was by design.
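
A minimal sketch of what excluding a case "by contract" can look like in C follows. This is not the code discussed above: `replace_all` is a hypothetical function, it uses `<string.h>` for brevity, and it assumes the contract is simply that the comparand is non-empty.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical replace_all: returns a malloc'd copy of src in which every
   non-overlapping occurrence of cmp is replaced by rep, or NULL if the
   allocation fails. Precondition (the contract): cmp is not empty. */
char *replace_all(const char *src, const char *cmp, const char *rep)
{
    assert(src != NULL && cmp != NULL && rep != NULL);
    assert(*cmp != '\0');              /* empty comparand excluded by contract */

    size_t slen = strlen(src), clen = strlen(cmp), rlen = strlen(rep);
    size_t count = 0;

    /* First pass: count non-overlapping matches to size the result. */
    for (const char *p = src; (p = strstr(p, cmp)) != NULL; p += clen)
        count++;

    char *out = malloc(slen - count * clen + count * rlen + 1);
    if (out == NULL)
        return NULL;

    /* Second pass: copy the gaps between matches and the replacements. */
    char *w = out;
    const char *p = src, *hit;
    while ((hit = strstr(p, cmp)) != NULL) {
        size_t gap = (size_t)(hit - p);
        memcpy(w, p, gap);
        w += gap;
        memcpy(w, rep, rlen);
        w += rlen;
        p = hit + clen;
    }
    strcpy(w, p);                      /* tail, including the terminator */
    return out;
}
```

With the precondition in place, a test generator that produces empty comparands has to skip those cases or expect the assertion to fire, so the contract and the test's expectations must agree.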

<snip>

Chris M. Thomasson · Mar 1, 2010

Ben Bacarisse said:
This issue has come up before. There is no clear meaning for such a
call so I opted to exclude it "by contract" rather than by returning
something arbitrary. Given that the issue has come up (I think) three
times now, that seems to have been a mistake, but it can't be a bug in
the classic sense since it was by design.

You are correct. My replace algorithm does indeed special case this... Not
sure if it was even worth it. Rather than altering your algorithm, I think I
will remove the special case from mine and alter the actual test to change
its expectations wrt zero-length comparand and/or source.

Seebs · Mar 1, 2010

It's also quite possible to have interesting discussions about the
exothermic properties of the Three Mile Island reactors, but is it worth
the fallout?

As long as no one is putting his code in production systems on which outcomes
anyone cares about might depend, I don't think the risks are comparable.

-s
