# substring

Discussion in 'C Programming' started by Jupiter, Nov 2, 2003.

1. ### JupiterGuest

Hey,
I would like to know 2 things.
1)Is there any function (in C standard library) that extracts a
substring from a string?
2)Is there any function (in C standard library) that returns the
position of a substring in a string?

Thx a lot...

Jupiter, Nov 2, 2003

2. ### Ivan VecerinaGuest

"Jupiter" <> wrote in message
news:...
> Hey,
> I would like to know 2 things.
> 1)Is there any function (in C standard library) that extracts a
> substring from a string?

Not directly -- this has to be emulated with a more general
function such as memcpy (how exactly depends on the
destination storage for the substring).
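
A minimal sketch of that emulation, assuming the caller supplies the destination buffer and its size. The name `substring` and its parameters are hypothetical; nothing like it exists in the standard library.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy up to `len` characters of `src`, starting at offset `start`, into
 * `dest` (capacity `destsize` bytes) and null-terminate the result.
 * Returns the number of characters copied, not counting the terminator.
 * `substring` is a hypothetical helper, not a standard library function.
 */
size_t substring(char *dest, size_t destsize,
                 const char *src, size_t start, size_t len)
{
    size_t srclen = strlen(src);

    if (destsize == 0)
        return 0;               /* no room even for the terminator */
    if (start > srclen)
        start = srclen;         /* start past the end: empty result */
    if (len > srclen - start)
        len = srclen - start;   /* clamp to what src actually holds */
    if (len > destsize - 1)
        len = destsize - 1;     /* clamp to what dest can hold */

    memcpy(dest, src + start, len);
    dest[len] = '\0';           /* memcpy does not add a terminator */
    return len;
}
```

With `char out[10];`, calling `substring(out, sizeof out, "Hello World!", 4, 3)` should leave "o W" in `out`. Clamping instead of reporting an error is only one possible design; a caller that needs to detect truncation can compare the return value with the requested length.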

> 2)Is there any function (in C standard library) that returns the
> position of a substring in a string?

Yes:
char* pos = strstr(str_to_search_through,str_to_find);

hth, Ivan
--
http://ivan.vecerina.com

Ivan Vecerina, Nov 2, 2003

3. ### Peter PichlerGuest

"Jupiter" <> wrote in message
news:...
> Hey,
> I would like to know 2 things.
> 1)Is there any function (in C standard library) that extracts a
> substring from a string?

Yes.

> 2)Is there any function (in C standard library) that returns the
> position of a substring in a string?

Yes.

> Thx a lot...

The questions you presumably *wanted* to ask were "what is the function
to..."

To extract the substring at positions n to m (inclusive) from a string s
you can use

strncpy(dest, s + n, m - n + 1);
dest[m - n + 1] = 0;

(or, in short, strncpy(dest, s + n, m - n + 1)[m - n + 1] = 0;)

Look up strncpy in your manual to see how it works and why adding the
zero is necessary. You must also make sure beforehand that:
a) s is at least n + 1 characters long and
b) you have enough space in dest for m - n + 2 characters (the substring
plus the terminating zero).

To find a substring in a string you can use

char * position = strstr(substring, string);

If you want an offset rather than a pointer to the substring, use

size_t offset = position - string;

Again, look up strstr in your manual for reference. Please note that
strstr returns NULL if the substring is not found in string; you must
check for that.
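
The NULL check and the pointer subtraction can be wrapped in one small function. This is only a sketch: `find_offset` is a hypothetical name, and returning -1 for "not found" is an assumed convention. Note that strstr takes the string to search as its first argument.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return the offset of the first occurrence of `needle` in `haystack`,
 * or -1 if it does not occur. `find_offset` is a hypothetical helper.
 */
long find_offset(const char *haystack, const char *needle)
{
    const char *pos = strstr(haystack, needle);  /* haystack comes first */

    if (pos == NULL)
        return -1;              /* no match: there is no offset to compute */
    return (long)(pos - haystack);
}
```

For example, `find_offset("Hello World!", "o W")` should give 4, and searching for a substring that is absent should give -1.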

Peter Pichler, Nov 2, 2003
4. ### Andreas KahariGuest

In article <>, Jupiter wrote:
> Hey,
> I would like to know 2 things.
> 1)Is there any function (in C standard library) that extracts a
> substring from a string?

Yes, strncpy() can be made to copy a substring of one string
into another:

#include <string.h> /* for strncpy() */
#include <stdio.h> /* for printf() */

int
main(void)
{
    char msg[] = "Hello World!";
    char submsg[10]; /* Must be long enough */

    /* Copy the substring "o W" from msg to submsg */
    strncpy(submsg, &msg[4], 3);

    /* Terminate the resulting string since strncpy() doesn't */
    submsg[3] = '\0';

    printf("msg[] = '%s'\nsubmsg[] = '%s'\n", msg, submsg);

    return 0;
}

> 2)Is there any function (in C standard library) that returns the
> position of a substring in a string?

No, but you may use strstr() like this:

#include <string.h> /* for strstr() */
#include <stdio.h> /* for printf(), fprintf() */
#include <stddef.h> /* for ptrdiff_t, NULL */

int
main(void)
{
    char msg[] = "Hello World!";
    char *ptr;

    ptrdiff_t ptrpos;

    /* Locate the substring "o W" in msg */
    ptr = strstr(msg, "o W");

    if (ptr == NULL) {
        /* strstr() returns NULL when there is no match */
        fprintf(stderr, "Substring not found\n");
    } else {
        /* Calculate the position of ptr in msg */
        ptrpos = ptr - &msg[0];

        /* Cast for printing: %d does not match ptrdiff_t */
        printf("Position of 'o W' in '%s' is %ld\n", msg, (long)ptrpos);
    }

    return 0;
}

>
>
>
> Thx a lot...

Wlcm a lot...

--
Andreas Kähäri

Andreas Kahari, Nov 2, 2003
5. ### Peter PichlerGuest

"Peter Pichler" <> wrote in message
news:UO4pb.422\$...
> To find a substring in a string you can use
>
> char * position = strstr(substring, string);

....
> Again, look up strstr in your manual for reference.

Err, *I* should have looked up strstr in the manual. Swap the two
parameters around. Doh!

Peter Pichler, Nov 2, 2003
6. ### Ed MortonGuest

Jupiter wrote:
> Hey,
> I would like to know 2 things.
> 1)Is there any function (in C standard library) that extracts a
> substring from a string?
> 2)Is there any function (in C standard library) that returns the
> position of a substring in a string?

Yes. Take a look at http://www-ccs.ucsd.edu/c/string.html for a list of
the string functions in string.h. Buying a copy of K&R 2 wouldn't be a

Ed.

Ed Morton, Nov 2, 2003
7. ### Tristan MillerGuest

Greetings.

In article <3fa4d902\$>, Ivan Vecerina wrote:
>> 1)Is there any function (in C standard library) that extracts a
>> substring from a string?

>
> Not directly -- this has to be emulated with a more general
> function such as memcpy (how exactly depends on the
> destination storage for the substring).

Eh? You got something against strncpy() or something?

--
_
_V.-o Tristan Miller [en,(fr,de,ia)] >< Space is limited
/ |`-' -=-=-=-=-=-=-=-=-=-=-=-=-=-=-= <> In a haiku, so it's hard
(7_\\ http://www.nothingisreal.com/ >< To finish what you

Tristan Miller, Nov 2, 2003
8. ### Al BowersGuest

Tristan Miller wrote:
> Greetings.
>
> In article <3fa4d902\$>, Ivan Vecerina wrote:
>
>>>1)Is there any function (in C standard library) that extracts a
>>>substring from a string?

>>
>>Not directly -- this has to be emulated with a more general
>>function such as memcpy (how exactly depends on the
>>destination storage for the substring).

>
>
> Eh? You got something against strncpy() or something?
>

'Something' might be more appropriate than strncpy. It depends on
the definition of 'extract'. I interpret this to mean to remove a
substring from a string.

Take string:
char s[] = "Have a very good day";
and extract the substring "very " to make s the
string "Have a good day".

If this is the intent of the OP, then a solution would not involve
strncpy or memcpy. A function duo of strstr and memmove would
do the job.
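
A rough sketch of that strstr/memmove duo, assuming the string lives in a modifiable array. The function name `remove_substring` is hypothetical.

```c
#include <string.h>

/*
 * Remove the first occurrence of `sub` from `s`, in place.
 * Returns 1 if something was removed, 0 otherwise.
 * `remove_substring` is a hypothetical helper.
 */
int remove_substring(char *s, const char *sub)
{
    size_t sublen = strlen(sub);
    char *pos;

    if (sublen == 0)
        return 0;               /* nothing to remove */
    pos = strstr(s, sub);
    if (pos == NULL)
        return 0;               /* substring not present */

    /* Shift the tail, terminator included, left over the match.
       The regions overlap, so memmove is required rather than memcpy. */
    memmove(pos, pos + sublen, strlen(pos + sublen) + 1);
    return 1;
}
```

Applied to the array s above with "very " as the substring, this should leave "Have a good day" in s. It must not be called on a string literal, since the string is modified in place.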

--
Al Bowers
Tampa, Fl USA
http://www.geocities.com/abowers822/

Al Bowers, Nov 2, 2003
9. ### Richard HeathfieldGuest

Tristan Miller wrote:

> Greetings.
>
> In article <3fa4d902\$>, Ivan Vecerina wrote:
>>> 1)Is there any function (in C standard library) that extracts a
>>> substring from a string?

>>
>> Not directly -- this has to be emulated with a more general
>> function such as memcpy (how exactly depends on the
>> destination storage for the substring).

>
> Eh? You got something against strncpy() or something?

Well, as a substring extractor, it's suboptimal.

Consider, for example:

char foo[16];
char bar[16] = "abcdefghijklmno";

strncpy(foo, bar, 3);

At the end of this operation, foo does not contain a string. Oops.
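
One way to keep using strncpy while still ending up with a string is to add the terminator by hand. The sketch below assumes dest has room for at least n + 1 bytes; `copy_prefix` is a hypothetical name.

```c
#include <string.h>

/*
 * Copy at most `n` characters of `src` into `dest` and always terminate.
 * `dest` must have room for at least n + 1 bytes.
 * `copy_prefix` is a hypothetical helper.
 */
char *copy_prefix(char *dest, const char *src, size_t n)
{
    strncpy(dest, src, n);  /* writes no terminator if strlen(src) >= n */
    dest[n] = '\0';         /* so add one explicitly */
    return dest;
}
```

With the declarations above, `copy_prefix(foo, bar, 3)` should leave the string "abc" in foo. When src is shorter than n, strncpy pads the remainder with null characters, so the explicit terminator is harmless in that case.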

--
Richard Heathfield :
"Usenet is a strange place." - Dennis M Ritchie, 29 July 1999.
C FAQ: http://www.eskimo.com/~scs/C-faq/top.html
K&R answers, C books, etc: http://users.powernet.co.uk/eton

Richard Heathfield, Nov 2, 2003
10. ### Tristan MillerGuest

Greetings.

In article <bo3mqn\$6k7\$>, Richard Heathfield wrote:
> Consider, for example:
>
> char foo[16];
> char bar[16] = "abcdefghijklmno";
>
> strncpy(foo, bar, 3);
>
> At the end of this operation, foo does not contain a string. Oops.

Well, it doesn't contain a nul-terminated string, but it does contain the
first three characters of bar, which may be all that is needed in some
cases. Does the C standard use the term "string" to refer to both nul- and
non-nul-terminated strings, or does it make a nomenclatural distinction
between "string" (which always includes the sentinel) and the more general
"array-of-char"?

--
_
_V.-o Tristan Miller [en,(fr,de,ia)] >< Space is limited
/ |`-' -=-=-=-=-=-=-=-=-=-=-=-=-=-=-= <> In a haiku, so it's hard
(7_\\ http://www.nothingisreal.com/ >< To finish what you

Tristan Miller, Nov 2, 2003
11. ### Richard HeathfieldGuest

Tristan Miller wrote:

> Does the C standard use the term "string" to refer to both nul-
> and non-nul-terminated strings, or does it make a nomenclatural
> distinction between "string" (which always includes the sentinel) and the
> more general "array-of-char"?

The C Standard defines a string to be an array of characters terminated by
the first null character. If you want the exact wording, I'm sure someone
can oblige.

--
Richard Heathfield :
"Usenet is a strange place." - Dennis M Ritchie, 29 July 1999.
C FAQ: http://www.eskimo.com/~scs/C-faq/top.html
K&R answers, C books, etc: http://users.powernet.co.uk/eton

Richard Heathfield, Nov 2, 2003
12. ### Mark McIntyreGuest

On Sun, 02 Nov 2003 21:51:51 +0100, in comp.lang.c , Tristan Miller
<> wrote:

>cases. Does the C standard use the term "string" to refer to both nul- and
>non-nul-terminated strings

C defines a string as
7.1.1 (1) A string is a contiguous sequence of characters terminated
by and including the first null character.

--
Mark McIntyre
CLC FAQ <http://www.eskimo.com/~scs/C-faq/top.html>


Mark McIntyre, Nov 2, 2003
13. ### peteGuest

Richard Heathfield wrote:
>
> Tristan Miller wrote:
>
> > Does the C standard use the term "string" to refer to both nul-
> > and non-nul-terminated strings, or does it make a nomenclatural
> > distinction between "string"
> > (which always includes the sentinel) and the
> > more general "array-of-char"?

>
> The C Standard defines a string to be an array
> of characters terminated by the first null character.

The word "array" is conspicuously absent from the
standard definition of string.
If a program can determine whether or not two separately
allocated objects are contiguous,
then a string may span them if they are contiguous.
Also, an array may contain several strings.

> If you want the exact wording, I'm sure someone
> can oblige.

Here's the 411 from the C89 last public draft:
4. LIBRARY
4.1 INTRODUCTION
4.1.1 Definitions of terms

A string is a contiguous sequence of characters terminated by and
including the first null character. It is represented by a pointer to
its initial (lowest addressed) character and its length is the number
of characters preceding the null character.
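
To illustrate the point that one array may contain several strings, here is a small sketch that counts the null-terminated strings stored in a buffer. The function `count_strings` is hypothetical and only follows the definition quoted above.

```c
#include <stddef.h>
#include <string.h>

/*
 * Count the strings stored in the first `size` bytes of `buf`.
 * Each null character ends exactly one string (possibly an empty one);
 * trailing characters with no terminator do not form a string.
 * `count_strings` is a hypothetical helper.
 */
size_t count_strings(const char *buf, size_t size)
{
    size_t count = 0;
    const char *p = buf;
    const char *end = buf + size;

    while (p < end) {
        const char *nul = memchr(p, '\0', (size_t)(end - p));
        if (nul == NULL)
            break;          /* no terminator left: not a string */
        count++;
        p = nul + 1;        /* the next string, if any, starts here */
    }
    return count;
}
```

Given `char buf[] = "abc\0def";`, the array is 8 bytes long and holds two strings of length 3, one starting at buf and one at buf + 4. An array such as `char s1[3] = "123";` holds no string at all, because it has no null character.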

--
pete

pete, Nov 3, 2003
14. ### Thomas StegenGuest

pete wrote:

> Richard Heathfield wrote:

>>The C Standard defines a string to be an array
>>of characters terminated by the first null character.

>
>
> The word "array" is conspicuously absent from the
> standard definition of string.
> If a program can determine whether or not two separately
> allocated objects are contiguous,
> then a string may span them if they are contiguous.

It would be a useless string, though, since neither pointer arithmetic
nor the array index operator is defined for accessing these things.

Let's assume that the following two arrays are contiguous in memory
(s2 follows s1 directly).

char s1[3] = "123";
char s2[4] = "456";

You cannot do this:

int i = 0;
while(s1[i] != '\0')
putchar(s1[i++]);

Since it will cause UB. The same argument applies to pointers (since the
above really operates on pointers anyway).

I am not so sure if the same goes for arguments to library functions.
But the whole string will be accessed by the same pointer so I would
say you cannot do the above. The avoidance of the word array in the
standard might be to allow string literals to be strings proper.

So s1 and s2 constitute a single string, but you will have to treat the
arrays separately and cannot draw any benefit from their being contiguous.

--
Thomas.

Thomas Stegen, Nov 3, 2003
15. ### peteGuest

Thomas Stegen wrote:
>
> pete wrote:
>
> > Richard Heathfield wrote:

>
> >>The C Standard defines a string to be an array
> >>of characters terminated by the first null character.

> >
> >
> > The word "array" is conspicuously absent from the
> > standard definition of string.
> > If a program can determine whether or not two separately
> > allocated objects are contiguous,
> > then a string may span them if they are contiguous.

>
> It would be a useless string

It's just some C trivia.

> though since neither pointer arithmetic
> nor the array index operator is defined for accessing these things.
>
> Let's assume that the following two arrays are contiguous in memory
> (s2 follows s1 directly).
>
> char s1[3] = "123";
> char s2[4] = "456";
>
> You cannot do this:
>
> int i = 0;
> while(s1[i] != '\0')
> putchar(s1[i++]);
>
> Since it will cause UB. The same argument applies to pointers (since the
> above really operates on pointers anyway).
> I am not so sure if the same goes for arguments to library functions.

If the arrays are shown to be contiguous,
then you can have
puts(s1);

--
pete

pete, Nov 4, 2003
16. ### Richard BosGuest

pete <> wrote:

> Thomas Stegen wrote:
>
> > char s1[3] = "123";
> > char s2[4] = "456";
> >
> > You cannot do this:
> >
> > int i = 0;
> > while(s1[i] != '\0')
> > putchar(s1[i++]);
> >
> > Since it will cause UB.

>
> If the arrays are shown to be contiguous,
> then you can have
> puts(s1);

No, you can't. Any reference to s1[3] and above, including those
implicit in puts(s1), invoke undefined behaviour. Although it is true
that on many architectures this instance of UB will behave as if it is
defined, you cannot rely on that.

Richard

Richard Bos, Nov 4, 2003
17. ### peteGuest

Richard Bos wrote:
>
> pete <> wrote:
>
> > Thomas Stegen wrote:
> >
> > > char s1[3] = "123";
> > > char s2[4] = "456";
> > >
> > > You cannot do this:
> > >
> > > int i = 0;
> > > while(s1[i] != '\0')
> > > putchar(s1[i++]);
> > >
> > > Since it will cause UB.

> >
> > If the arrays are shown to be contiguous,
> > then you can have
> > puts(s1);

>
> No, you can't.

> Any reference to s1[3] and above, including those
> implicit in puts(s1), invoke undefined behaviour.

That doesn't matter.
If you give puts a pointer to a string,
then the behavior is defined.
How puts accomplishes the behavior is up to the implementors.
If two objects are being spanned by a string,
and if puts doesn't want to index across them,
then puts may deal with the objects separately,
or do it some other way.

--
pete

pete, Nov 4, 2003
18. ### Richard BosGuest

pete <> wrote:

> Richard Bos wrote:
> >
> > pete <> wrote:
> >
> > > Thomas Stegen wrote:
> > >
> > > > char s1[3] = "123";
> > > > char s2[4] = "456";

> > > If the arrays are shown to be contiguous,
> > > then you can have
> > > puts(s1);

> > Any reference to s1[3] and above, including those
> > implicit in puts(s1), invoke undefined behaviour.

>
> That doesn't matter.

Yes, it does.

> If you give puts a pointer to a string,
> then the behavior is defined.

s1 is not a string. It may happen to look like one on your favourite
architecture, but that doesn't make it one.

Richard

Richard Bos, Nov 4, 2003
19. ### peteGuest

Richard Bos wrote:
>
> pete <> wrote:
>
> > Richard Bos wrote:
> > >
> > > pete <> wrote:
> > >
> > > > Thomas Stegen wrote:
> > > >
> > > > > char s1[3] = "123";
> > > > > char s2[4] = "456";

>
> > > > If the arrays are shown to be contiguous,
> > > > then you can have
> > > > puts(s1);

>
> > > Any reference to s1[3] and above, including those
> > > implicit in puts(s1), invoke undefined behaviour.

> >
> > That doesn't matter.

>
> Yes, it does.
>
> > If you give puts a pointer to a string,
> > then the behavior is defined.

>
> s1 is not a string. It may happen to look like one on your favourite
> architecture, but that doesn't make it one.

I believe we are only talking about cases where s1 and s3
are shown to be contiguous.
In that case, s1 satisfies the definition for "pointer to a string".

N869

7.1.1 Definitions of terms

[#1] A string is a contiguous sequence of characters
terminated by and including the first null character. The
term multibyte string is sometimes used instead to emphasize
special processing given to multibyte characters contained
in the string or to avoid confusion with a wide string. A
pointer to a string is a pointer to its initial (lowest

--
pete

pete, Nov 4, 2003
20. ### peteGuest

pete wrote:
>
> Richard Bos wrote:
> >
> > pete <> wrote:
> >
> > > Richard Bos wrote:
> > > >
> > > > pete <> wrote:
> > > >
> > > > > Thomas Stegen wrote:
> > > > >
> > > > > > char s1[3] = "123";
> > > > > > char s2[4] = "456";

> >
> > > > > If the arrays are shown to be contiguous,
> > > > > then you can have
> > > > > puts(s1);

> >
> > > > Any reference to s1[3] and above, including those
> > > > implicit in puts(s1), invoke undefined behaviour.
> > >
> > > That doesn't matter.

> >
> > Yes, it does.
> >
> > > If you give puts a pointer to a string,
> > > then the behavior is defined.

> >
> > s1 is not a string. It may happen to look like one on your favourite
> > architecture, but that doesn't make it one.

>
> I believe we are only talking about cases where

> s1 and s3

That should be "s1 and s2".

> are shown to be contiguous.
> In that case, s1 satisfies the definition for "pointer to a string".
>
> N869
>
> 7.1.1 Definitions of terms
>
> [#1] A string is a contiguous sequence of characters
> terminated by and including the first null character. The
> term multibyte string is sometimes used instead to emphasize
> special processing given to multibyte characters contained
> in the string or to avoid confusion with a wide string. A
> pointer to a string is a pointer to its initial (lowest