Matching a nonempty sequence of '^' in scanf with the "%[" specifier

Discussion in 'C Programming' started by regis, Apr 27, 2006.

  1. regis

    regis Guest

    Greetings,

    about scanf matching nonempty sequences using the "%[" specifier...

    "%[^-]" matches a nonempty sequence of anything except '-'
    "%[^[]" matches a nonempty sequence of anything except '['
    "%[^]]" matches a nonempty sequence of anything except ']'
    "%[^^]" matches a nonempty sequence of anything except '^'

    "%[-]" matches a nonempty sequence of '-'
    "%[[]" matches a nonempty sequence of '['
    "%[]]" matches a nonempty sequence of ']'

    ....but how to match a nonempty sequence of '^' ?

    "%[^]" is not possible because here ']' is not the closing bracket
    but a character in the inverted scanset.

    Assuming that '^' is 0136 in octal, then "%[\136" still has the
    meaning "%[^" with '^' interpreted as a special character,
    so this is not possible either.

    "%[^-^]" is not interpreted as matching a nonempty sequence in the
    degenerated range {'^', ..., '^'} but as matching anything
    except '^' and '-'.

    "\^" is not a valid escape sequence...

    is there a solution ?

    --
    regis
     
    regis, Apr 27, 2006
    #1

  2. P.J. Plauger

    P.J. Plauger Guest

    "regis" <-mrs.fr> wrote in message
    news:e2qm6s$81s$-mrs.fr...

    > about scanf matching nonempty sequences using the "%[" specifier...
    >
    > "%[^-]" matches a nonempty sequence of anything except '-'
    > "%[^[]" matches a nonempty sequence of anything except '['
    > "%[^]]" matches a nonempty sequence of anything except ']'
    > "%[^^]" matches a nonempty sequence of anything except '^'
    >
    > "%[-]" matches a nonempty sequence of '-'
    > "%[[]" matches a nonempty sequence of '['
    > "%[]]" matches a nonempty sequence of ']'
    >
    > ...but how to match a nonempty sequence of '^' ?


    ^^*

    > "%[^]" is not possible because here ']' is not the closing bracket
    > but a character in the inverted scanset.
    >
    > Assuming that '^' is 0136 in octal, then "%[\136" still has the
    > meaning "%[^" with '^' interpreted as a special character,
    > so this is not possible either.
    >
    > "%[^-^]" is not interpreted as matching a nonempty sequence in the
    > degenerated range {'^', ..., '^'} but as matching anything
    > except '^' and '-'.
    >
    > "\^" is not a valid escape sequence...
    >
    > is there a solution ?


    ^^*

    P.J. Plauger
    Dinkumware, Ltd.
    http://www.dinkumware.com
     
    P.J. Plauger, Apr 27, 2006
    #2

  3. Robert Gamble

    P.J. Plauger wrote:
    > "regis" <-mrs.fr> wrote in message
    > news:e2qm6s$81s$-mrs.fr...
    >
    > > about scanf matching nonempty sequences using the "%[" specifier...
    > >
    > > "%[^-]" matches a nonempty sequence of anything except '-'
    > > "%[^[]" matches a nonempty sequence of anything except '['
    > > "%[^]]" matches a nonempty sequence of anything except ']'
    > > "%[^^]" matches a nonempty sequence of anything except '^'
    > >
    > > "%[-]" matches a nonempty sequence of '-'
    > > "%[[]" matches a nonempty sequence of '['
    > > "%[]]" matches a nonempty sequence of ']'
    > >
    > > ...but how to match a nonempty sequence of '^' ?

    >
    > ^^*


    I am obviously missing something here, could you elaborate or provide a
    complete example that demonstrates this?

    Robert Gamble
     
    Robert Gamble, Apr 27, 2006
    #3
  4. P.J. Plauger

    P.J. Plauger Guest

    "Robert Gamble" <> wrote in message
    news:...

    > P.J. Plauger wrote:
    >> "regis" <-mrs.fr> wrote in message
    >> news:e2qm6s$81s$-mrs.fr...
    >>
    >> > about scanf matching nonempty sequences using the "%[" specifier...
    >> >
    >> > "%[^-]" matches a nonempty sequence of anything except '-'
    >> > "%[^[]" matches a nonempty sequence of anything except '['
    >> > "%[^]]" matches a nonempty sequence of anything except ']'
    >> > "%[^^]" matches a nonempty sequence of anything except '^'
    >> >
    >> > "%[-]" matches a nonempty sequence of '-'
    >> > "%[[]" matches a nonempty sequence of '['
    >> > "%[]]" matches a nonempty sequence of ']'
    >> >
    >> > ...but how to match a nonempty sequence of '^' ?

    >>
    >> ^^*

    >
    > I am obviously missing something here, could you elaborate or provide a
    > complete example that demonstrates this?


    I was being glib. You talked only about matching the sequence,
    not storing it. In that case, "^^*" matches exactly the sequence
    you want, and discards it. When I want to match just a sequence of
    carets, and store it in a string, I do something dirty like "%[\377^]",
    substituting for \377 any character I don't expect to be in the input.
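
A minimal sketch of the workaround described above, assuming the byte '\377' never occurs in the input. The helper name `read_carets` and the 32-byte buffer are illustrative choices, not part of the original post. Because '\377' comes first in the scanset, the '^' that follows is taken literally instead of negating the set.

```c
#include <stdio.h>

/* Hypothetical helper: copies a leading run of '^' from `input` into
   `out` (capacity at least 32 bytes). Returns 1 if at least one
   character was stored, 0 otherwise. */
int read_carets(const char *input, char out[32])
{
    out[0] = '\0';
    /* '\377' is listed first so that '^' is not in the negating
       position. The width 31 leaves room for the terminating null. */
    return sscanf(input, "%31[\377^]", out) == 1;
}
```

Calling `read_carets("^^^abc", buf)` should leave "^^^" in `buf`. The scanset also accepts '\377' itself, so this is only safe when that byte cannot appear in the input.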

    P.J. Plauger
    Dinkumware, Ltd.
    http://www.dinkumware.com
     
    P.J. Plauger, Apr 27, 2006
    #4
  5. Ben Pfaff

    Ben Pfaff Guest

    "P.J. Plauger" <> writes:

    > "Robert Gamble" <> wrote in message
    > news:...
    >
    >> P.J. Plauger wrote:
    >>> "regis" <-mrs.fr> wrote in message
    >>> news:e2qm6s$81s$-mrs.fr...
    >>>
    >>> > about scanf matching nonempty sequences using the "%[" specifier...


    [...]

    >>> > ...but how to match a nonempty sequence of '^' ?
    >>>
    >>> ^^*

    >>
    >> I am obviously missing something here, could you elaborate or provide a
    >> complete example that demonstrates this?

    >
    > I was being glib. You talked only about matching the sequence,
    > not storing it. In that case, "^^*" matches exactly the sequence
    > you want, and discards it. [...]


    It does? As far as I can tell it only matches those three
    characters literally, not a sequence of carets. I don't know
    about any special handling of * outside a conversion
    specification. Perhaps you can educate me.
    --
    "The lusers I know are so clueless, that if they were dipped in clue
    musk and dropped in the middle of pack of horny clues, on clue prom
    night during clue happy hour, they still couldn't get a clue."
    --Michael Girdwood, in the monastery
     
    Ben Pfaff, Apr 28, 2006
    #5
  6. Robert Gamble

    P.J. Plauger wrote:
    > "Robert Gamble" <> wrote in message
    > news:...
    >
    > > P.J. Plauger wrote:
    > >> "regis" <-mrs.fr> wrote in message
    > >> news:e2qm6s$81s$-mrs.fr...
    > >>
    > >> > about scanf matching nonempty sequences using the "%[" specifier...
    > >> >
    > >> > "%[^-]" matches a nonempty sequence of anything except '-'
    > >> > "%[^[]" matches a nonempty sequence of anything except '['
    > >> > "%[^]]" matches a nonempty sequence of anything except ']'
    > >> > "%[^^]" matches a nonempty sequence of anything except '^'
    > >> >
    > >> > "%[-]" matches a nonempty sequence of '-'
    > >> > "%[[]" matches a nonempty sequence of '['
    > >> > "%[]]" matches a nonempty sequence of ']'
    > >> >
    > >> > ...but how to match a nonempty sequence of '^' ?
    > >>
    > >> ^^*

    > >
    > > I am obviously missing something here, could you elaborate or provide a
    > > complete example that demonstrates this?

    >
    > I was being glib. You talked only about matching the sequence,
    > not storing it. In that case, "^^*" matches exactly the sequence
    > you want, and discards it.


    I am not the OP, but what I don't understand is the implied significance
    of the asterisk character in your example. Could you expand on this?

    > When I want to match just a sequence of
    > carets, and store it in a string, I do something dirty like "[\377^]"
    > or something besides \377 I don't expect to be in the input.


    As far as I can tell, it is not possible to match/store a sequence of
    only carets with the %[] conversion specifier. Do you agree that this
    is not possible? If it is possible to match (but not store) a sequence
    of one or more carets without using %[] (as you indicate is possible
    above), then it would be possible to cleanly obtain the number of
    characters matched using a couple of well-placed %n specifiers, but so
    far I haven't seen any evidence that this is the case.
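
A sketch of the "well-placed %n" idea, assuming the '\377' scanset workaround quoted above is acceptable and that '\377' never appears in the input. It still relies on %[ rather than avoiding it. The function name `caret_run_length` is hypothetical.

```c
#include <stdio.h>

/* Returns the length of the leading run of carets in `input`
   (0 if there is none). */
int caret_run_length(const char *input)
{
    int start = 0, end = 0;
    /* %*[...] matches without storing. The two %n directives record
       how many characters were read before and after the run. If the
       scanset fails to match, the second %n is never reached and
       `end` keeps its initial value. */
    sscanf(input, "%n%*[\377^]%n", &start, &end);
    return end - start;
}
```

Initializing both counters matters here: %n assignments do not count toward sscanf's return value, so the return value alone cannot tell whether the second %n was reached.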

    Robert Gamble
     
    Robert Gamble, Apr 28, 2006
    #6
  7. ais523

    ais523 Guest

    P.J. Plauger wrote:

    > "regis" <-mrs.fr> wrote in message
    > news:e2qm6s$81s$-mrs.fr...
    >
    > > about scanf matching nonempty sequences using the "%[" specifier...
    > >
    > > "%[^-]" matches a nonempty sequence of anything except '-'
    > > "%[^[]" matches a nonempty sequence of anything except '['
    > > "%[^]]" matches a nonempty sequence of anything except ']'
    > > "%[^^]" matches a nonempty sequence of anything except '^'
    > >
    > > "%[-]" matches a nonempty sequence of '-'
    > > "%[[]" matches a nonempty sequence of '['
    > > "%[]]" matches a nonempty sequence of ']'
    > >
    > > ...but how to match a nonempty sequence of '^' ?

    >
    > ^^*
    >
    > > "%[^]" is not possible because here ']' is not the closing bracket
    > > but a character in the inverted scanset.
    > >
    > > Assuming that '^' is 0136 in octal, then "%[\136" still has the
    > > meaning "%[^" with '^' interpreted as a special character,
    > > so this is not possible either.
    > >
    > > "%[^-^]" is not interpreted as matching a nonempty sequence in the
    > > degenerated range {'^', ..., '^'} but as matching anything
    > > except '^' and '-'.
    > >
    > > "\^" is not a valid escape sequence...
    > >
    > > is there a solution ?

    >
    > ^^*


    I think ^^* is an attempt to create a regexp that matches any number of
    carets (in which case \^\^* is what is needed), but the %[ specifier
    doesn't match regexps (not a standard C concept), only scansets (which
    appear similar to regexps). %[^^*] matches anything but carets and
    asterisks when in a scanf format string.
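
The two readings of "^^*" discussed here can be checked with a small sketch. The function names are hypothetical. The first treats "^^*" as ordinary format text, where it matches only those three literal characters; the second uses it inside an inverted scanset.

```c
#include <stdio.h>

/* Returns the number of characters of `input` consumed by the literal
   format "^^*", or -1 if the three characters did not all match. */
int consumed_by_literal(const char *input)
{
    int n = -1;
    /* Outside a conversion specification, '^' and '*' are ordinary
       characters. %n records how many characters were read so far. */
    sscanf(input, "^^*%n", &n);
    return n;
}

/* Stores the leading run matched by the inverted scanset %[^^*] into
   `out` (capacity at least 32 bytes). Returns 1 on a match. */
int read_non_caret_star(const char *input, char out[32])
{
    out[0] = '\0';
    return sscanf(input, "%31[^^*]", out) == 1;
}
```

With the input "^^^^" the literal format stops at the third character, because '*' does not match '^', so no run of carets is consumed. With "abc^def" the scanset version stores "abc" and stops at the caret.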

    To the OP: One slightly extreme solution is to write %[^] followed by
    every character in the character set apart from '^' and ']', then a
    ']'. The main problem with this is the inefficiency, and the handling
    of '\0' (which can't be written in the scanset, as it would terminate
    the string). However, this is not recommended; I would use strspn to
    input the carets followed by a sscanf on the rest of the string to
    accomplish a similar effect.
    Note also that %[ without a width specifier has the same problem as
    gets if used with scanf; it can only be used safely on sscanf (where
    you know the length of the input string) or possibly fscanf (if you're
    sure you know the contents of the file and nothing but your program can
    have modified it).
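
A minimal sketch of the strspn approach suggested above, assuming the line is already in memory and that an integer follows the carets. The helper name `carets_then_int` and the choice of "%d" for the remainder are illustrative only.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: counts the leading carets with strspn, then
   parses an integer from the rest of the line with sscanf.
   Returns the caret count, or -1 if there are no leading carets or
   no integer follows them. */
int carets_then_int(const char *line, int *value)
{
    size_t carets = strspn(line, "^");   /* length of the caret run */
    if (carets == 0)
        return -1;
    if (sscanf(line + carets, "%d", value) != 1)
        return -1;
    return (int)carets;
}
```

For the line "^^^42" this returns 3 and stores 42 in `*value`. No scanset is involved, so the special meaning of '^' never comes up.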
     
    ais523, Apr 28, 2006
    #7
  8. regis

    regis Guest

    ais523 wrote:
    >
    >>"regis" wrote
    >>
    >>>about scanf matching nonempty sequences using the "%[" specifier...
    >>>
    >>>"%[^-]" matches a nonempty sequence of anything except '-'
    >>>"%[^[]" matches a nonempty sequence of anything except '['
    >>>"%[^]]" matches a nonempty sequence of anything except ']'
    >>>"%[^^]" matches a nonempty sequence of anything except '^'
    >>>
    >>>"%[-]" matches a nonempty sequence of '-'
    >>>"%[[]" matches a nonempty sequence of '['
    >>>"%[]]" matches a nonempty sequence of ']'
    >>>
    >>>...but how to match a nonempty sequence of '^' ?


    > To the OP: One slightly extreme solution is to write %[^] followed by
    > every character in the character set apart from '^' and ']', then a
    > ']'. The main problem with this is the inefficiency, and the handling
    > of '\0' (which can't be written in the scanset, as it would terminate
    > the string). However, this is not recommended; I would use strspn to
    > input the carets followed by a sscanf on the rest of the string to
    > accomplish a similar effect.
    > Note also that %[ without a width specifier has the same problem as
    > gets if used with scanf; it can only be used safely on sscanf (where
    > you know the length of the input string) or possibly fscanf (if you're
    > sure you know the contents of the file and nothing but your program can
    > have modified it).


    The point of my question is that, in general, when the designers of
    some syntax introduce a special character, they also introduce a
    simple lexical way to get back the literal meaning of this character
    in the process, e.g. by backslashing it, by doubling it,
    or, as is the case in the examples above,
    by analysing its position in the scanset.

    The designers of scanf seem to have taken care that this is the case
    for the special characters '-', '[', ']' in both scansets and inverted
    scansets, but seem to have done only half the work for '^'.
     
    regis, Apr 28, 2006
    #8
  9. P.J. Plauger

    P.J. Plauger Guest

    "Ben Pfaff" <> wrote in message
    news:...

    > "P.J. Plauger" <> writes:
    >
    >> "Robert Gamble" <> wrote in message
    >> news:...
    >>
    >>> P.J. Plauger wrote:
    >>>> "regis" <-mrs.fr> wrote in message
    >>>> news:e2qm6s$81s$-mrs.fr...
    >>>>
    >>>> > about scanf matching nonempty sequences using the "%[" specifier...

    >
    > [...]
    >
    >>>> > ...but how to match a nonempty sequence of '^' ?
    >>>>
    >>>> ^^*
    >>>
    >>> I am obviously missing something here, could you elaborate or provide a
    >>> complete example that demonstrates this?

    >>
    >> I was being glib. You talked only about matching the sequence,
    >> not storing it. In that case, "^^*" matches exactly the sequence
    >> you want, and discards it. [...]

    >
    > It does? As far as I can tell it only matches those three
    > characters literally, not a sequence of carets. I don't know
    > about any special handling of * outside a conversion
    > specification. Perhaps you can educate me.


    And I promised I wouldn't shoot from the hip for a whole month.
    Never mind.

    P.J. Plauger
    Dinkumware, Ltd.
    http://www.dinkumware.com
     
    P.J. Plauger, Apr 28, 2006
    #9