substring

Jupiter

Hey,
I would like to know 2 things.
1)Is there any function (in C standard library) that extracts a
substring from a string?
2)Is there any function (in C standard library) that returns the
position of a substring in a string?



Thx a lot...
 
Ivan Vecerina

Jupiter said:
Hey,
I would like to know 2 things.
1)Is there any function (in C standard library) that extracts a
substring from a string?
Not directly -- this has to be emulated with a more general
function such as memcpy (how exactly depends on the
destination storage for the substring).
2)Is there any function (in C standard library) that returns the
position of a substring in a string?
Yes:
char* pos = strstr(str_to_search_through,str_to_find);
(null if not found).

hth, Ivan
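
For illustration, here is a minimal sketch of the memcpy approach, assuming the destination is a caller-supplied buffer of known size. The helper name substr_copy and its parameters are made up for this example; nothing like it exists in the standard library.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the standard library): copy `len`
 * characters of `src`, starting at offset `start`, into `dest` and
 * null-terminate the result. Returns 0 on success, -1 if the range
 * lies outside `src` or does not fit in `destsize` bytes.
 */
int substr_copy(char *dest, size_t destsize,
                const char *src, size_t start, size_t len)
{
    size_t srclen = strlen(src);

    if (start > srclen || len > srclen - start)
        return -1;              /* requested range falls outside src */
    if (len >= destsize)
        return -1;              /* no room for the characters plus terminator */

    memcpy(dest, src + start, len);
    dest[len] = '\0';           /* memcpy does not add a terminator */
    return 0;
}
```

With a 10-byte buffer out, substr_copy(out, sizeof out, "Hello World!", 4, 3) should leave "o W" in out. If the destination is allocated dynamically instead, the size check would be replaced by a malloc of len + 1 bytes.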
 
Peter Pichler

Jupiter said:
Hey,
I would like to know 2 things.
1)Is there any function (in C standard library) that extracts a
substring from a string?
Yes.

2)Is there any function (in C standard library) that returns the
position of a substring in a string?
Yes.

Thx a lot...

The questions you presumably *wanted* to ask were "what is the function
to..." :)

To extract a substring on positions n to m from a string s you can use

strncpy(dest, s + n, m - n + 1);
dest[m - n] = 0;

(or in short strncpy(dest, s + n, m - n + 1)[m - n] = 0;)

Look up strncpy in your manual to see how it works and why adding the
zero is necessary. You must also make sure beforehand that:
a) s is at least n + 1 characters long and
b) you have enough space in dest for m - n + 1 characters.

To find a substring in a string you can use

char * position = strstr(substring, string);

If you want an offset rather than a pointer to the substring, use

size_t offset = position - string;

Again, look up strstr in your manual for reference. Please note that
strstr returns NULL if the substring is not found in string; you must
check for that.
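
A rough sketch of the NULL check combined with the offset calculation might look like the following. The helper name find_offset is invented for this example. Note that strstr takes the string to be searched as its first argument and the substring to look for as its second.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: store the offset of the first occurrence of
 * `needle` in `haystack` through `offset`. Returns 1 if found, 0 if
 * not (in which case *offset is left untouched).
 */
int find_offset(const char *haystack, const char *needle, size_t *offset)
{
    const char *pos = strstr(haystack, needle);  /* string to search comes first */

    if (pos == NULL)
        return 0;               /* not found: do not subtract from NULL */

    *offset = (size_t)(pos - haystack);
    return 1;
}
```

For "Hello World!" and "o W" this should report offset 4. Returning a found/not-found flag separately from the offset avoids having to reserve a special size_t value for "not found".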
 
Andreas Kahari

Hey,
I would like to know 2 things.
1)Is there any function (in C standard library) that extracts a
substring from a string?

Yes, strncpy() can be made to copy a substring of one string
into another:

#include <string.h> /* for strncpy() */
#include <stdio.h> /* for printf() */

int
main(void)
{
char msg[] = "Hello World!";
char submsg[10]; /* Must be long enough */

/* Copy the substring "o W" from msg to submsg */
strncpy(submsg, &msg[4], 3);

/* Terminate the resulting string since strncpy() doesn't */
submsg[3] = '\0';

printf("msg[] = '%s'\nsubmsg[] = '%s'\n", msg, submsg);

return 0;
}

2)Is there any function (in C standard library) that returns the
position of a substring in a string?

No, but you may use strstr() like this:

#include <string.h> /* for strstr() */
#include <stdio.h> /* for printf(), fprintf() */
#include <stddef.h> /* for ptrdiff_t, NULL */

int
main(void)
{
char msg[] = "Hello World!";
char *ptr;

ptrdiff_t ptrpos;

/* Locate the substring "o W" in msg */
ptr = strstr(msg, "o W");

if (ptr == NULL) {
fprintf(stderr, "Substring not found\n");
} else {
/* Calculate the position of ptr in msg */
ptrpos = ptr - &msg[0];

printf("Position of 'o W' in '%s' is %ld\n", msg, (long)ptrpos);
}

return 0;
}
Thx a lot...

Wlcm a lot...
 
Peter Pichler

Peter Pichler said:
To find a substring in a string you can use

char * position = strstr(substring, string); ....
Again, look up strstr in your manual for reference.

Err, *I* should have looked up strstr in the manual. Swap the two
parameters around. Doh!
 
Ed Morton

Jupiter said:
Hey,
I would like to know 2 things.
1)Is there any function (in C standard library) that extracts a
substring from a string?
2)Is there any function (in C standard library) that returns the
position of a substring in a string?

Yes. Take a look at http://www-ccs.ucsd.edu/c/string.html for a list of
the string functions in string.h. Buying a copy of K&R 2 wouldn't be a
bad idea either...

Ed.
 
Tristan Miller

Greetings.

Not directly -- this has to be emulated with a more general
function such as memcpy (how exactly depends on the
destination storage for the substring).

Eh? You got something against strncpy() or something?
 
Al Bowers

Tristan said:
Greetings.




Eh? You got something against strncpy() or something?

'Something' might be more appropriate than strncpy. It depends on
the definition of 'extract'. I interpret this to mean to remove a
substring from a string.

Take string:
char s[] = "Have a very good day";
and extract the substring "very " to make s the
string "Have a good day".

If this is the intent of the OP, then a solution would not involve
strncpy or memcpy. A function duo of strstr and memmove would
do the job.
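
As a sketch of that strstr plus memmove combination, assuming "extract" means removing the first occurrence in place (the helper name remove_substring is made up for this example):

```c
#include <string.h>

/*
 * Hypothetical helper: remove the first occurrence of `sub` from the
 * modifiable string `s`, in place. Returns 1 if something was
 * removed, 0 otherwise.
 */
int remove_substring(char *s, const char *sub)
{
    size_t sublen = strlen(sub);
    char *pos;

    if (sublen == 0)
        return 0;               /* nothing to remove */

    pos = strstr(s, sub);
    if (pos == NULL)
        return 0;               /* substring not present */

    /* Source and destination overlap, so memmove rather than memcpy.
       The + 1 moves the terminating null character as well. */
    memmove(pos, pos + sublen, strlen(pos + sublen) + 1);
    return 1;
}
```

Called on the array s above with "very " as the substring, this should leave "Have a good day" in s. It must not be called on a string literal, since the string is modified.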
 
Richard Heathfield

Tristan said:
Greetings.



Eh? You got something against strncpy() or something?

Well, as a substring extractor, it's suboptimal.

Consider, for example:

char foo[16];
char bar[16] = "abcdefghijklmno";

strncpy(foo, bar, 3);

At the end of this operation, foo does not contain a string. Oops.
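
One way around this, sketched here with a made-up helper name copy_prefix, is to add the terminator unconditionally after the strncpy call. It assumes dest has room for n + 1 characters.

```c
#include <string.h>

/*
 * Hypothetical helper: copy at most `n` characters of `src` into
 * `dest` and always terminate the result. `dest` must have room for
 * n + 1 characters.
 */
char *copy_prefix(char *dest, const char *src, size_t n)
{
    /* strncpy writes exactly n characters: it pads with null
       characters if src is shorter than n, and adds no terminator
       at all if src has n or more characters. */
    strncpy(dest, src, n);
    dest[n] = '\0';
    return dest;
}
```

With the arrays above, copy_prefix(foo, bar, 3) should leave the string "abc" in foo.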
 
Tristan Miller

Greetings.

Consider, for example:

char foo[16];
char bar[16] = "abcdefghijklmno";

strncpy(foo, bar, 3);

At the end of this operation, foo does not contain a string. Oops.

Well, it doesn't contain a nul-terminated string, but it does contain the
first three characters of bar, which may be all that is needed in some
cases. Does the C standard use the term "string" to refer to both nul- and
non-nul-terminated strings, or does it make a nomenclatural distinction
between "string" (which always includes the sentinel) and the more general
"array-of-char"?
 
Richard Heathfield

Tristan said:
Does the C standard use the term "string" to refer to both nul-
and non-nul-terminated strings, or does it make a nomenclatural
distinction between "string" (which always includes the sentinel) and the
more general "array-of-char"?

The C Standard defines a string to be an array of characters terminated by
the first null character. If you want the exact wording, I'm sure someone
can oblige.
 
Mark McIntyre

cases. Does the C standard use the term "string" to refer to both nul- and
non-nul-terminated strings

C defines a string as
7.1.1 (1) A string is a contiguous sequence of characters terminated
by and including the first null character.
 
pete

Richard said:
The C Standard defines a string to be an array
of characters terminated by the first null character.

The word "array" is conspicuously absent from the
standard definition of string.
If a program can determine whether or not two separately
allocated objects are contiguous,
then a string may span them if they are contiguous.
Also, an array may contain several strings.
If you want the exact wording, I'm sure someone
can oblige.

Here's the 411 from the C89 last public draft:
4. LIBRARY
4.1 INTRODUCTION
4.1.1 Definitions of terms

A string is a contiguous sequence of characters terminated by and
including the first null character. It is represented by a pointer to
its initial (lowest addressed) character and its length is the number
of characters preceding the null character.
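
To illustrate the point that one array may contain several strings, here is a small sketch. The names two_strings and next_string are invented for this example.

```c
#include <string.h>

/* One array holding two strings, "abc" and "def". Each ends at its
   own null character; the literal's implicit terminator ends the
   second one, so the array has 8 elements. */
static const char two_strings[] = "abc\0def";

/*
 * Hypothetical helper: return a pointer to the string that starts
 * right after the terminator of `s`. Only valid if the caller knows
 * that another string follows within the same array.
 */
const char *next_string(const char *s)
{
    return s + strlen(s) + 1;
}
```

Here strlen(two_strings) is 3, because the length counts only the characters preceding the first null character, and next_string(two_strings) points at "def".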
 
Thomas Stegen

pete said:
Richard Heathfield wrote:


The word "array" is conspicuously absent from the
standard definition of string.
If a program can determine whether or not two separately
allocated objects are contiguous,
then a string may span them if they are contiguous.

It would be a useless string though since neither pointer arithmetic
nor the array index operator is defined for accessing these things.

Let's assume that the following two arrays are contiguous in memory.
(s2 follows s1 directly).

char s1[3] = "123";
char s2[4] = "456";

You cannot do this:

int i = 0;
while(s1[i] != '\0')
putchar(s1[i++]);

Since it will cause UB. The same argument applies to pointers (since the
above really operates on pointers anyway).

I am not so sure if the same goes for arguments to library functions.
But the whole string will be accessed by the same pointer so I would
say you cannot do the above. The avoidance of the word array in the
standard might be to allow string literals to be strings proper.

So s1 and s2 constitute a single string but you will have to treat the
arrays separately and cannot draw any benefit from them being contiguous.
 
pete

Thomas said:
It would be a useless string

It's just some C trivia.
though since neither pointer arithmetic
nor the array index operator is defined for accessing these things.

Let's assume that the following two arrays are contiguous in memory.
(s2 follows s1 directly).

char s1[3] = "123";
char s2[4] = "456";

You cannot do this:

int i = 0;
while(s1[i] != '\0')
putchar(s1[i++]);

Since it will cause UB. The same argument applies to pointers (since the
above really operates on pointers anyway).
I am not so sure if the same goes for arguments to library functions.


If the arrays are shown to be contiguous,
then you can have
puts(s1);
 
Richard Bos

pete said:
Thomas said:
char s1[3] = "123";
char s2[4] = "456";

You cannot do this:

int i = 0;
while(s1[i] != '\0')
putchar(s1[i++]);

Since it will cause UB.


If the arrays are shown to be contiguous,
then you can have
puts(s1);


No, you can't. Any reference to s1[3] and above, including those
implicit in puts(s1), invoke undefined behaviour. Although it is true
that on many architectures this instance of UB will behave as if it is
defined, you cannot rely on that.

Richard
 
pete

Richard said:
pete said:
Thomas said:
char s1[3] = "123";
char s2[4] = "456";

You cannot do this:

int i = 0;
while(s1[i] != '\0')
putchar(s1[i++]);

Since it will cause UB.


If the arrays are shown to be contiguous,
then you can have
puts(s1);


No, you can't.

Any reference to s1[3] and above, including those
implicit in puts(s1), invoke undefined behaviour.

That doesn't matter.
If you give puts a pointer to a string,
then the behavior is defined.
How puts accomplishes the behavior, is up to the implementors.
If two objects are being spanned by a string,
and if puts doesn't want to index across them,
then puts may deal with the objects separately,
or do it some other way.
 
Richard Bos

pete said:
Richard said:
pete said:
Thomas Stegen wrote:

char s1[3] = "123";
char s2[4] = "456";
If the arrays are shown to be contiguous,
then you can have
puts(s1);
Any reference to s1[3] and above, including those
implicit in puts(s1), invoke undefined behaviour.

That doesn't matter.

Yes, it does.
If you give puts a pointer to a string,
then the behavior is defined.

s1 is not a string. It may happen to look like one on your favourite
architecture, but that doesn't make it one.

Richard
 
pete

Richard said:
pete said:
Richard said:
Thomas Stegen wrote:

char s1[3] = "123";
char s2[4] = "456";
If the arrays are shown to be contiguous,
then you can have
puts(s1);
Any reference to s1[3] and above, including those
implicit in puts(s1), invoke undefined behaviour.

That doesn't matter.

Yes, it does.
If you give puts a pointer to a string,
then the behavior is defined.

s1 is not a string. It may happen to look like one on your favourite
architecture, but that doesn't make it one.

I believe we are only talking about cases where s1 and s3
are shown to be contiguous.
In that case, s1 satisfies the definition for "pointer to a string"

N869

7.1.1 Definitions of terms

[#1] A string is a contiguous sequence of characters
terminated by and including the first null character. The
term multibyte string is sometimes used instead to emphasize
special processing given to multibyte characters contained
in the string or to avoid confusion with a wide string. A
pointer to a string is a pointer to its initial (lowest
addressed) character.
 
pete

pete said:
Richard said:
pete said:
Richard Bos wrote:


Thomas Stegen wrote:

char s1[3] = "123";
char s2[4] = "456";
If the arrays are shown to be contiguous,
then you can have
puts(s1);
Any reference to s1[3] and above, including those
implicit in puts(s1), invoke undefined behaviour.

That doesn't matter.

Yes, it does.
If you give puts a pointer to a string,
then the behavior is defined.

s1 is not a string. It may happen to look like one on your favourite
architecture, but that doesn't make it one.

I believe we are only talking about cases where
s1 and s3

That should be "s1 and s2".
are shown to be contiguous.
In that case, s1 satisfies the definition for "pointer to a string"

N869

7.1.1 Definitions of terms

[#1] A string is a contiguous sequence of characters
terminated by and including the first null character. The
term multibyte string is sometimes used instead to emphasize
special processing given to multibyte characters contained
in the string or to avoid confusion with a wide string. A
pointer to a string is a pointer to its initial (lowest
addressed) character.
 
