the usage of sscanf

Discussion in 'C Programming' started by Da Wang, Apr 1, 2005.

  1. Da Wang

    Da Wang Guest

    Hi, all

    I am trying to use sscanf to parse the headers for a web server.
    According to the requirement, it needs to ignore all the blanks in the
    header.
    For example, all of the following should be equivalent, and the values
    should be read correctly (giving "Host" and "localhost"):
    " Host: localhost "
    " Host : localhost "
    " Host :localhost "
    "Host:localhost"
    etc.

    I have tried various ways and wrote the following code:
    --------
    st=sscanf(header, " %[a-zA-Z0-9_-] : %[^ ]" ,name, value);
    ---------
    and so far it seems to work. However, it only supports a limited set of
    characters, and if I want more, I need to add all of them inside the
    brackets, which looks awkward. I am wondering if anyone has a better
    solution to my problem, and I hope you could kindly help me out.

    Many thanks.
    --
    Life is an opportunity to do something.
    .-._
    o_oo'_)
    `._ `._
    `, \
    //_(_)_/
    ~~
    Da Wang, Apr 1, 2005
    #1

  2. Da Wang

    Guest

    On Thu, 31 Mar 2005 22:37:16 -0500, Da Wang <>
    wrote:

    >Hi, all
    >
    >I am trying to use sscanf to parse the header for a web server,
    >according to the requirement, it need to neglect all the blanks in the
    >header
    >for example, all the following should be equvalient and the value should
    >be read correctly( get "Host" and "localhost" )
    >" Host: localhost "
    >" Host : localhost "
    >" Host :localhost "
    >"Host:localhost"
    >etc.
    >
    >I have tried various ways and wrote the following code:
    >--------
    >st=sscanf(header, " %[a-zA-Z0-9_-] : %[^ ]" ,name, value);
    >---------
    >and so far it seems works..however, it only support a limit set of chars
    >and if I want more, I need to add all of them into the bracket, which
    >looks awkward. I am wondering if anyone has a better solution to my
    >problem and hope you could kindly help me out.


    Use a #define with your character set in it...
    Use the resulting constant in your code...

    #define MY_CS a-zA-Z0-9_-

    st = sscanf(header, " %[MY_CS] : %[^ ]" ,name, value)
    , Apr 1, 2005
    #2

  3. writes:
    [...]
    > Use a #define with your character set in it...
    > Use the resulting constant in your code...
    >
    > #define MY_CS a-zA-Z0-9_-
    >
    > st = sscanf(header, " %[MY_CS] : %[^ ]" ,name, value)


    Macros aren't expanded in string literals.

    I suppose you could do:

    #define MY_CS "a-zA-Z0-9_-"
    st = sscanf(header, " %[" MY_CS "] : %[^ ]" ,name, value);

    but that's just equivalent to:

    st = sscanf(header, " %[a-zA-Z0-9_-] : %[^ ]" ,name, value);
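
The string-literal concatenation described above can be sketched as follows. This is only an illustration: the names HDR_NAME_CS and parse_header are hypothetical, and the %63 field widths are an added assumption (64-byte buffers) that the original call does not have.

```c
#include <stdio.h>

/* Hypothetical macro: the scanset body, written as a string literal
   so that it can be pasted between other literals. */
#define HDR_NAME_CS "a-zA-Z0-9_-"

/* Hypothetical helper. name and value must each have room for 64 bytes.
   Returns the sscanf result: 2 when both fields were converted. */
int parse_header(const char *header, char *name, char *value)
{
    /* Adjacent string literals are concatenated at compile time, so the
       format seen by sscanf is " %63[a-zA-Z0-9_-] : %63[^ ]". */
    return sscanf(header, " %63[" HDR_NAME_CS "] : %63[^ ]", name, value);
}
```

The macro only gives the character set a name and a single place to edit it; the format string that reaches sscanf is the same as the hand-written one.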

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
    We must do something. This is something. Therefore, we must do this.
    Keith Thompson, Apr 1, 2005
    #3
  4. Da Wang

    Da Wang Guest

    Keith Thompson wrote:
    > writes:
    > [...]
    >
    >>Use a #define with your character set in it...
    >>Use the resulting constant in your code...
    >>
    >>#define MY_CS a-zA-Z0-9_-
    >>
    >>st = sscanf(header, " %[MY_CS] : %[^ ]" ,name, value)

    >
    >
    > Macros aren't expanded in string literals.
    >
    > I suppose you could do:
    >
    > #define MY_CS "a-zA-Z0-9_-"
    > st = sscanf(header, " %[" MY_CS "] : %[^ ]" ,name, value);
    >
    > but that's just equivalent to:
    >
    > st = sscanf(header, " %[a-zA-Z0-9_-] : %[^ ]" ,name, value);
    >

    Many thanks.

    Another question, is there any way to use another form of regular
    expression without using the charset?

    Thanks in advance again.
    --
    Life is an opportunity to do something.
    .-._
    o_oo'_)
    `._ `._
    `, \
    //_(_)_/
    ~~
    Da Wang, Apr 2, 2005
    #4
  5. Chris Torek

    Chris Torek Guest

    >Keith Thompson wrote:
    [slight editing]
    >> #define MY_CS "a-zA-Z0-9_-"
    >> st = sscanf(header, " %[" MY_CS "] : %[^ ]" ,name, value);
    >>but that's just equivalent to:
    >> st = sscanf(header, " %[a-zA-Z0-9_-] : %[^ ]" ,name, value);


    In article <9oy3e.24794$>
    Da Wang <> wrote:
    >Another question, is there any way to use another form of regular
    >expression without using the charset?


    No. In fact, scanf does not really do regular expressions at
    all -- the character-class %[ conversion is the equivalent of
    [class]+ (i.e., one or more characters from the scanset), but no
    other regular-expression features are available. (As a result,
    the scanf engine does not need the amount of code found in most
    RE matchers. The obvious trivial algorithm has linear behavior
    and never needs to back up.)
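
A small sketch of both points, using hypothetical helper names and an assumed 64-byte buffer size. The first helper shows that a scanset must match at least one character; the second shows the lack of backing up: %[a-z] consumes the final 'z' as well, so the literal z after it can never match, whereas the regular expression [a-z]+z would accept "abcz".

```c
#include <stdio.h>

/* Hypothetical helper: number of conversions for a digit scanset.
   out must have room for 64 bytes. */
int scan_digits(const char *s, char *out)
{
    return sscanf(s, "%63[0-9]", out);
}

/* Hypothetical helper: offset reached after "%[a-z]z", or -1 if the
   literal 'z' was never matched. */
int scan_lower_then_z(const char *s)
{
    char buf[64];
    int end = -1;   /* stays -1 unless the %n directive is reached */

    /* The scanset is greedy and never gives characters back, so the
       'z' that follows it in the format finds nothing left to match. */
    sscanf(s, "%63[a-z]z%n", buf, &end);
    return end;
}
```

Checking a %n offset like this is one common way to tell whether the whole format was consumed, since the return value alone only counts conversions.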
    --
    In-Real-Life: Chris Torek, Wind River Systems
    Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
    email: forget about it http://web.torek.net/torek/index.html
    Reading email is like searching for food in the garbage, thanks to spammers.
    Chris Torek, Apr 2, 2005
    #5
  6. On Thu, 31 Mar 2005 22:37:16 -0500, Da Wang
    <> wrote:

    > Hi, all
    >
    > I am trying to use sscanf to parse the header for a web server,
    > according to the requirement, it need to neglect all the blanks in the
    > header
    > for example, all the following should be equvalient and the value should
    > be read correctly( get "Host" and "localhost" )
    > " Host: localhost "
    > " Host : localhost "
    > " Host :localhost "
    > "Host:localhost"
    > etc.
    >

    Your requirement is wrong. Treating a header line beginning with
    whitespace as a new item violates the RFC 2068 syntax, inherited via
    RFC 1945 from RFC 822, which makes such a line a continuation of the
    preceding "folded" header. Space after the header name before the colon
    is also explicitly forbidden, and I've never seen it used, although it
    can be parsed unambiguously under the "liberal receive" principle.

    > I have tried various ways and wrote the following code:
    > --------
    > st=sscanf(header, " %[a-zA-Z0-9_-] : %[^ ]" ,name, value);
    > ---------


    The meaning of the range syntax a-z etc. in a scanset is
    implementation-defined in standard C and thus not guaranteed portable,
    but in practice it probably works on all but EBCDIC systems.

    This isn't _ignoring_ spaces in the value part; it is terminating the
    value at the first space. For Host in particular this is OK because a
    domain name (or IP address) can't contain whitespace, but it may be
    wrong for other header fields.
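
For fields whose value may contain spaces, one possible approach (a sketch, not the only way) is to read the value up to the end of the line and then trim trailing whitespace by hand. The helper name parse_field and the 128-byte buffer size are assumptions for illustration.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: splits "name: value", keeps spaces inside the
   value, and strips trailing whitespace from it.
   name and value must each have room for 128 bytes.
   Returns 1 on success, 0 if either part is missing. */
int parse_field(const char *line, char *name, char *value)
{
    size_t len;

    /* %[^\r\n] reads to the end of the line instead of stopping at the
       first space. */
    if (sscanf(line, " %127[^ \t:] : %127[^\r\n]", name, value) != 2)
        return 0;

    len = strlen(value);
    while (len > 0 && isspace((unsigned char)value[len - 1]))
        value[--len] = '\0';
    return 1;
}
```

Note that a header with an empty value, such as "Host:", makes the second conversion fail, so this sketch reports it as an error; a real parser would need to decide how to treat that case.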

    > and so far it seems works..however, it only support a limit set of chars
    > and if I want more, I need to add all of them into the bracket, which
    > looks awkward. I am wondering if anyone has a better solution to my
    > problem and hope you could kindly help me out.
    >

    If you want to accept anything in the header label, except colon and
    maybe space (or HWS?) just use %[^:] or %[^ :] etc. If you want to
    restrict it to given characters, you have to state those characters
    somehow. You might find some systems that allow POSIX-style classes in
    a *scanf scanset (as well as a regex) like %[[:alpha:][:digit:]-_] ,
    but this isn't required and isn't that much better anyway.
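
A minimal sketch of the negated-scanset suggestion, checked against the four spellings from the original post. The helper names and the 64-byte buffers (with matching %63 widths) are assumptions added for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper. name and value must each have room for 64 bytes.
   Returns the sscanf result: 2 when both fields were converted. */
int split_header(const char *header, char *name, char *value)
{
    /* %[^ :] accepts any run of characters up to a space or a colon,
       so no list of permitted characters is needed. */
    return sscanf(header, " %63[^ :] : %63[^ ]", name, value);
}

/* Returns how many of the sample spellings parse to Host/localhost. */
int count_matching_forms(void)
{
    static const char *forms[] = {
        " Host: localhost ", " Host : localhost ",
        " Host :localhost ", "Host:localhost"
    };
    char name[64], value[64];
    int ok = 0;
    size_t i;

    for (i = 0; i < sizeof forms / sizeof forms[0]; i++)
        if (split_header(forms[i], name, value) == 2 &&
            strcmp(name, "Host") == 0 && strcmp(value, "localhost") == 0)
            ok++;
    return ok;
}
```

The trade-off is that a negated scanset accepts every character it does not exclude, including ones that are not legal in a header name, so any stricter validation would have to happen after the split.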

    - David.Thompson1 at worldnet.att.net
    Dave Thompson, Apr 9, 2005
    #6
