Why doesn't strrstr() exist?

  • Thread starter Christopher Benson-Manica

Christopher Benson-Manica

(Followups set to comp.std.c. Apologies if the crosspost is unwelcome.)

strchr() is to strrchr() as strstr() is to strrstr(), but strrstr()
isn't part of the standard. Why not?
 

Default User

Christopher said:
(Followups set to comp.std.c. Apologies if the crosspost is
unwelcome.)

Followup UNSET.

I think it's dumb to do this. It annoys the crap out of me to have a
post in a group I read, with a followup set to one I DON'T read. If the
post was appropriate for comp.lang.c, then so are replies. If replies
aren't, then the post should never have been made here.

I find it rude and obnoxious.




Brian
 

Christopher Benson-Manica

Default User said:
I think it's dumb to do this. It annoys the crap out of me to have a
post in a group I read, with a followup set to one I DON'T read. If the
post was appropriate for comp.lang.c, then so are replies. If replies
aren't, then the post should never have been made here.

The fact that many comp.lang.c regulars do not (judging from the
paucity of posts) follow comp.std.c was my primary motivation for
crossposting it to this group as well.

I set the followups to c.s.c only because tin seems to think it is bad
netiquette to set followups to more than one newsgroup...

Default User said:
I find it rude and obnoxious.

For that I humbly apologize; consider the lesson learned.
 

SM Ryan

# (Followups set to comp.std.c. Apologies if the crosspost is unwelcome.)
#
# strchr() is to strrchr() as strstr() is to strrstr(), but strrstr()
# isn't part of the standard. Why not?

#include <stdlib.h>
#include <string.h>

/* Reverse both strings, search forward, then map the match back onto x. */
char *strrstr(char *x,char *y) {
    int m = strlen(x);
    int n = strlen(y);
    char *X = malloc(m+1);
    char *Y = malloc(n+1);
    int i;
    for (i=0; i<m; i++) X[m-1-i] = x[i]; X[m] = 0;
    for (i=0; i<n; i++) Y[n-1-i] = y[i]; Y[n] = 0;
    char *Z = strstr(X,Y);
    if (Z) {
        int ro = Z-X;
        int lo = ro+n-1;
        int ol = m-1-lo;
        Z = x+ol;
    }
    free(X); free(Y);
    return Z;
}
 

Walter Roberson

SM Ryan said:
char *strrstr(char *x,char *y) {
int m = strlen(x);
int n = strlen(y);
char *X = malloc(m+1);
char *Y = malloc(n+1);

Small changes: strlen has a result type of size_t, not int, and
malloc() takes a parameter of type size_t, not int. A small change to
the declarations of m and n fixes both issues.

int i;
for (i=0; i<m; i++) X[m-1-i] = x[i]; X[m] = 0;
for (i=0; i<n; i++) Y[n-1-i] = y[i]; Y[n] = 0;


As per the above, m and n are size_t not int, so i needs to be size_t
as well.

Also, you don't check to see whether the malloc() returned NULL.
char *Z = strstr(X,Y);
if (Z) {
int ro = Z-X;
int lo = ro+n-1;
int ol = m-1-lo;
Z = x+ol;

This starts to get into murky waters. Z-X is a subtraction
of pointers, the result of which is ptrdiff_t, which is a signed
integral type. Logically, though, Z-X could be of size_t, which
is unsigned. This difference has probably been discussed in the past,
but I have not happened to see the discussion of what happens with
pointer subtraction if the object size would fit in the unsigned
type but not in the signed type. Anyhow, ro, lo, ol should not be int.
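
For reference, the review points above (size_t for the lengths and the loop index, a check on malloc(), no int for the offsets) could be folded into the reverse-and-search approach roughly as follows. This is only a sketch: the name strrstr_rev is made up for illustration, and returning NULL on allocation failure is an assumption, which makes that case indistinguishable from "not found".

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical name; not a standard function. */
char *strrstr_rev(const char *x, const char *y)
{
    assert(x != NULL && y != NULL);

    size_t m = strlen(x);
    size_t n = strlen(y);
    char *X = malloc(m + 1);   /* reversed copy of x */
    char *Y = malloc(n + 1);   /* reversed copy of y */
    char *result = NULL;

    if (X != NULL && Y != NULL) {
        size_t i;
        for (i = 0; i < m; i++) X[m - 1 - i] = x[i];
        X[m] = '\0';
        for (i = 0; i < n; i++) Y[n - 1 - i] = y[i];
        Y[n] = '\0';

        const char *Z = strstr(X, Y);
        if (Z != NULL) {
            /* Z >= X here, so the difference is non-negative. */
            size_t ro = (size_t)(Z - X);
            /* A match at reversed offset ro starts at m - ro - n in x;
               ro + n <= m holds whenever a match exists. */
            result = (char *)x + (m - ro - n);
        }
    }
    /* Assumption: allocation failure is reported as NULL, same as "no match". */
    free(X);
    free(Y);
    return result;
}
```

The subtraction Z - X still has type ptrdiff_t, so the question raised above about very large objects is not settled by the cast.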
 

Default User

Christopher said:
The fact that many comp.lang.c regulars do not (judging from the
paucity of posts) follow comp.std.c was my primary motivation for
crossposting it to this group as well.

Right. If I don't read c.s.c, I sure don't want to have to subscribe to
follow one thread.
I set the followups to c.s.c only because tin seems to think it is bad
netiquette to set followups to more than one newsgroup...


For that I humbly apologize; consider the lesson learned.

I was harsher than I needed to be there. Sorry for going a bit over the
top.




Brian
 

Default User

Stephen said:
...says Usenet expert "Default User" :)


You feel that my choice of moniker reflects something about my level
of expertise? Note that "Default User" is NOT the default name in
XanaNews, my current newsreader.



Brian
 

Douglas A. Gwyn

SM said:
# strchr() is to strrchr() as strstr() is to strrstr(), but strrstr()
# isn't part of the standard. Why not?
char *strrstr(char *x,char *y) {
int m = strlen(x);
int n = strlen(y);
char *X = malloc(m+1);
char *Y = malloc(n+1);
...

If one really wanted to use the function, that implementation
would be problematic.

I think the real answer is that there were lots of uses for
strstr() and few if any requests for strrstr() functionality.
Why specify/require it if it won't be used?

Also note that if you want to implement such a function you
might benefit from reading my chapter on string searching in
"Software Solutions in C" (ed. Dale Schumacher).
 

Eric Sosman

SM said:
# (Followups set to comp.std.c. Apologies if the crosspost is unwelcome.)
#
# strchr() is to strrchr() as strstr() is to strrstr(), but strrstr()
# isn't part of the standard. Why not?

char *strrstr(char *x,char *y) {
int m = strlen(x);
int n = strlen(y);

ITYM size_t, here and throughout.
char *X = malloc(m+1);
char *Y = malloc(n+1);

if (X == NULL || Y == NULL) ...?
int i;
for (i=0; i<m; i++) X[m-1-i] = x[i]; X[m] = 0;
for (i=0; i<n; i++) Y[n-1-i] = y[i]; Y[n] = 0;
char *Z = strstr(X,Y);
if (Z) {
int ro = Z-X;
int lo = ro+n-1;
int ol = m-1-lo;
Z = x+ol;
}
free(X); free(Y);
return Z;
}


Untested:

#include <string.h>
/* @NOPEDANTRY: ignore use of reserved identifier */
char *strrstr(const char *x, const char *y) {
    char *prev = NULL;
    char *next;
    if (*y == '\0')
        return strchr(x, '\0');
    while ((next = strstr(x, y)) != NULL) {
        prev = next;
        x = next + 1;
    }
    return prev;
}

The behavior when y is empty is a matter of taste
and/or debate. The code above takes the view that the
rightmost occurrence in x of the empty string is the
one that appears (if that's the right word) just prior
to x's terminating zero; other conventions are surely
possible and might turn out to be better.

Note that simply omitting the test on y would be
an error: an empty y would then cause the while loop
to run off the end of x.
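
To make the conventions above concrete, here is the same loop under a non-reserved, made-up name (find_last), with the assertion as an added precondition check. It is a sketch of the behavior described, not a tested library routine.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical name chosen to avoid the reserved str* namespace. */
char *find_last(const char *x, const char *y)
{
    char *prev = NULL;
    char *next;

    assert(x != NULL && y != NULL);

    /* Convention from the post: the rightmost empty string sits just
       before x's terminating zero. */
    if (*y == '\0')
        return strchr(x, '\0');

    /* Restart one byte past each hit so overlapping matches are seen. */
    while ((next = strstr(x, y)) != NULL) {
        prev = next;
        x = next + 1;
    }
    return prev;
}
```

With s pointing at "abcabc", find_last(s, "bc") yields s + 4, find_last(s, "") yields s + 6 (the terminator), and find_last(s, "zz") yields NULL. Because the search restarts at next + 1 rather than next + strlen(y), searching "aaa" for "aa" finds the overlapping match at offset 1.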
 

Douglas A. Gwyn

Walter said:
This starts to get into murky waters. Z-X is a subtraction
of pointers, the result of which is ptrdiff_t, which is a signed
integral type. Logically, though, Z-X could be of size_t, which
is unsigned. This difference has probably been discussed in the past,
but I have not happened to see the discussion of what happens with
pointer subtraction if the object size would fit in the unsigned
type but not in the signed type. Anyhow, ro, lo, ol should not be int.

ptrdiff_t is supposed to be defined as a type wide enough to
accommodate *any* possible result of a valid subtraction of
pointers to objects. If an implementation doesn't *have* a
suitable integer type, that is a deficiency.

Anyway, when you know which pointer is less than the other,
you can always subtract the lesser from the greater and the
result will then always be appropriately represented using
size_t. If you really had to worry about these limits in
some situation, you could first test which is lesser, then
use two branches in the code with size_t in each one.
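
The two-branch idea could look something like the sketch below. The helper name ptr_distance and its out-parameter are invented for illustration, and both pointers are assumed to point into (or one past) the same array, since comparing or subtracting unrelated pointers is undefined.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: magnitude of the distance between p and q as a
   size_t. If p_is_lower is non-NULL it is set to 1 when p < q, else 0. */
size_t ptr_distance(const char *p, const char *q, int *p_is_lower)
{
    assert(p != NULL && q != NULL);

    if (p < q) {
        if (p_is_lower != NULL)
            *p_is_lower = 1;
        return (size_t)(q - p);   /* greater minus lesser */
    }
    if (p_is_lower != NULL)
        *p_is_lower = 0;
    return (size_t)(p - q);       /* greater minus lesser, or zero */
}
```

One caveat: in standard C the expression q - p itself still has type ptrdiff_t, and the conversion to size_t happens only afterwards. The branches guarantee a non-negative value, not that the subtraction is representable.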
 

Walter Roberson

Anyway, when you know which pointer is less than the other,
you can always subtract the lesser from the greater and the
result will then always be appropriately represented using
size_t. If you really had to worry about these limits in
some situation, you could first test which is lesser, then
use two branches in the code with size_t in each one.

It seems to me that you are implying that the maximum
object size that a C implementation may support is only
half of the memory addressable in that address mode --
e.g., maximum 2 GB object on a 32 bit (4 GB span)
pointer machine. This limitation being necessary so that
the maximum object size would fit in a signed storage
location, just in case you wanted to do something like

(object + sizeof object) - object

"logically" the result would be sizeof object, an
unsigned type, but the pointer subtraction is defined
as returning a signed value, so the maximum
magnitude of the signed value would have to be at least
as great as the maximum magnitude of the unsigned value...

number_of_usable_bits(size_t) <= number_of_usable_bits(ptrdiff_t)

[provided, that is, that one is not using a separate-sign-bit
machine.]


The machines I use most often -happen- to have that property
anyhow, because the high-bit on a pointer is reserved for
indicating kernel memory space, but I wonder about the extent
to which this is true on other machines?
 

pete

Douglas A. Gwyn wrote:
ptrdiff_t is supposed to be defined as a type wide enough to
accommodate *any* possible result of a valid subtraction of
pointers to objects.

What are you talking about?

Is your point that ptrdiff_t is actually defined
opposite of the way that it's supposed to be?

"If the result is not representable in an object of that type,
the behavior is undefined.
In other words, if the expressions P and Q point to,
respectively, the i-th and j-th elements of an array object,
the expression (P)-(Q) has the value i-j
provided the value fits in an object of type ptrdiff_t."
 

Old Wolf

SM said:
#
# strchr() is to strrchr() as strstr() is to strrstr(), but strrstr()
# isn't part of the standard. Why not?

char *strrstr(char *x,char *y) {
int m = strlen(x);
int n = strlen(y);
char *X = malloc(m+1);
char *Y = malloc(n+1);

Using dynamic allocation for this function? You have got
to be kidding.

int i;
for (i=0; i<m; i++) X[m-1-i] = x[i]; X[m] = 0;
for (i=0; i<n; i++) Y[n-1-i] = y[i]; Y[n] = 0;
char *Z = strstr(X,Y);
if (Z) {
int ro = Z-X;
int lo = ro+n-1;
int ol = m-1-lo;
Z = x+ol;


I don't know which is more obfuscated -- your code, or your
quote marker
 

Keith Thompson

Christopher Benson-Manica said:
strchr() is to strrchr() as strstr() is to strrstr(), but strrstr()
isn't part of the standard. Why not?

I don't think anyone has posted the real reason: it's arbitrary. The
C standard library isn't a coherently designed entity. It's a
collection of functionality from historical implementations,
consisting largely of whatever seemed like a good idea at the time,
filtered through the standards committee. Just look at the continuing
existence of gets(), or the design of <time.h>.

It's remarkable (and a tribute to the original authors and to the
committee) that the whole thing works as well as it does.
 

SM Ryan

#
#
# SM Ryan wrote:
# > # (Followups set to comp.std.c. Apologies if the crosspost is unwelcome.)
# > #
# > # strchr() is to strrchr() as strstr() is to strrstr(), but strrstr()
# > # isn't part of the standard. Why not?
# >
# > char *strrstr(char *x,char *y) {
# > int m = strlen(x);
# > int n = strlen(y);

Time complexity can be O(m+n), since strstr can be O(m+n)
and O(2m+2n) = O(m+n).

# char *strrstr(const char *x, const char *y) {
# char *prev = NULL;
# char *next;
# if (*y == '\0')
# return strchr(x, '\0');
# while ((next = strstr(x, y)) != NULL) {
# prev = next;
# x = next + 1;
# }
# return prev;
# }

Potentially O(m*n), depending on how often characters repeat in y.
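
For comparison, a third variant (not posted in the thread, sketched here under a made-up name) avoids both the allocation and the repeated strstr() calls by testing candidate positions from the right. Its worst case is still O(m*n) byte comparisons, but it stops at the first hit, which is the rightmost one.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical name; scans start positions from right to left. */
char *find_last_backward(const char *x, const char *y)
{
    assert(x != NULL && y != NULL);

    size_t m = strlen(x);
    size_t n = strlen(y);

    if (n > m)
        return NULL;

    /* i runs from m - n down to 0; the "i-- > 0" form avoids unsigned
       wraparound in the loop test. */
    for (size_t i = m - n + 1; i-- > 0; ) {
        if (memcmp(x + i, y, n) == 0)
            return (char *)(x + i);
    }
    return NULL;
}
```

An empty y matches at x + m, i.e. at the terminating zero, which is the same convention as the loop version quoted above.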
 

Antoine Leca

Also note that if you want to implement such a function you
might benefit from reading my chapter on string searching in
"Software Solutions in C" (ed. Dale Schumacher).

Wouldn't the straightforward idea (calling strstr() in a loop and returning
the last non-NULL answer, as strrchr() usually does) be a good one?
At least it would benefit from the optimized form of strstr() often
found (several people reported here that the shipped strstr()'s regularly
outperform crafted algorithms like Boyer-Moore.)

Not that I see any use for strrstr(), except perhaps to do the same as
strrchr() when c happens to be a multibyte character in a stateless
encoding.
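
As an illustration of that multibyte use, the sketch below looks for the last occurrence of a character that takes more than one byte, which strrchr() cannot do because it takes a single int. It assumes UTF-8, where a byte-level match cannot begin in the middle of another character; other stateless encodings do not all give that guarantee. The name last_mbchar is invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: last occurrence in s of the multibyte character
   whose bytes are given as the NUL-terminated string mbc.
   Assumption: both are UTF-8. */
char *last_mbchar(const char *s, const char *mbc)
{
    char *last = NULL;
    char *hit;

    assert(s != NULL && mbc != NULL && *mbc != '\0');

    while ((hit = strstr(s, mbc)) != NULL) {
        last = hit;
        s = hit + 1;   /* continue searching past this hit */
    }
    return last;
}
```

For the UTF-8 string "caf\xC3\xA9 ol\xC3\xA9", searching for "\xC3\xA9" (e with acute accent) returns a pointer to byte offset 8, the start of the second occurrence.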


Antoine
 

Michael Wojcik

Default User said:
I think it's dumb to [set followups].
I find it rude and obnoxious.

For that I humbly apologize; consider the lesson learned.

What lesson? That Brian doesn't like the Followup-To header? I
wouldn't recommend tailoring your posting habits solely to his
preferences. Setting Followup-To on crossposted messages is
recommended by a number of netiquette guides and Son-of-1036. Some
people dislike it; other people - some of whom felt sufficiently
animated by the subject to formalize their thoughts in usage guides -
do not.

My inclination, frankly, is to follow the recommendations of the
group which can be bothered to promulgate guidelines, over the
complaints of those who can't be bothered to do more than complain.
Sometimes there are good reasons (a clear majority of opinion or
well-established practice in a given group, for example) for
observing other conventions, but I don't see any of those here. What
I see is one poster (well, two, since I've seen Alan chime in as well)
complaining about a widely-recommended practice.

--
Michael Wojcik (e-mail address removed)

I will shoue the world one of the grate Wonders of the world in 15
months if Now man mourders me in Dors or out Dors
-- "Lord" Timothy Dexter, _A Pickle for the Knowing Ones_
 

Christopher Benson-Manica

Michael Wojcik said:
Sometimes there are good reasons (a clear majority of opinion or
well-established practice in a given group, for example) for
observing other conventions, but I don't see any of those here. What
I see is one poster (well, two, since I've seen Alan chime in as well)
complaining about a widely-recommended practice.

Be that as it may, I would not want to deprive myself of the
opportunity of receiving a response from Brian or Alan over a
netiquette nitpick; perhaps the safe thing to do is to avoid including
comp.lang.c in crossposts altogether. There aren't many cases where
doing so is appropriate anyway...
 
