Wide string initializer syntax

Discussion in 'C Programming' started by Derrick Coetzee, Sep 11, 2004.

  1. Looking through the C90 standard, it occurred to me that the possible
    syntaxes for initializers, particularly of wchar_t arrays, are really
    bizarre. Consider the following:

    wchar_t s1[] = { L"abcdef" };
    wchar_t* s2[] = { L"abcdef" };
    wchar_t s3[][6] = { L"abcdef" };
    wchar_t* s4[][6] = { L"abcdef" };

    That's four different types initialized with exactly the same
    initializer syntax, but it means four different things. In the first
    case, a mutable buffer is being initialized, and the standard lets you
    wrap the string initializing the buffer in braces for no apparent reason.
    In the second case, an array containing one pointer to a literal string
    is declared. In the third case, an array containing one initialized
    mutable buffer is declared. In the fourth case, a 1 by 6 two-dimensional
    array is declared, with s4[0][0] set to a literal string, and s4[0][1]
    through s4[0][5] set to a null pointer. I could continue with
    larger-dimensional arrays right up to the environment limits.

    Thoughts?
    --
    Derrick Coetzee
    I grant this newsgroup posting into the public domain. I disclaim all
    express or implied warranty and all liability. I am not a professional.
    Derrick Coetzee, Sep 11, 2004
    #1

  2. Derrick Coetzee wrote:
    > Looking through the C90 standard, it occurred to me that the possible
    > syntaxes for initializers, particularly of wchar_t arrays, are really
    > bizarre. Consider the following:


    Are you sure that wchar_t is a built-in type in C? I don't know about
    C99, but in C90 there is definitely no built-in wchar_t type!

    Kind regards,
    Nicolas
    Nicolas Pavlidis, Sep 11, 2004
    #2

  3. Nicolas Pavlidis wrote:
    > Derrick Coetzee wrote:
    >
    >> Looking through the C90 standard, it occurred to me that the possible
    >> syntaxes for initializers, particularly of wchar_t arrays, are really
    >> bizarre. Consider the following:

    >
    >
    > Are you sure that wchar_t is a built-in type in C? I don't know about
    > C99, but in C90 there is definitely no built-in wchar_t type!


    The wchar_t type is not built-in, but is required to be defined in the
    standard header stddef.h. Wide string literals are always arrays of
    whatever wchar_t is defined to be, even if the type's definition is not
    available. The standard mentions wchar_t in several places.
    --
    Derrick Coetzee
    I grant this newsgroup posting into the public domain. I disclaim all
    express or implied warranty and all liability. I am not a professional.
    Derrick Coetzee, Sep 12, 2004
    #3
  4. In article <news:chu1gr$fni$>
    Derrick Coetzee <> wrote:
    >Looking through the C90 standard, it occurred to me that the possible
    >syntaxes for initializers, particularly of wchar_t arrays, are really
    >bizarre. Consider the following:
    >
    > wchar_t s1[] = { L"abcdef" };
    > wchar_t* s2[] = { L"abcdef" };
    > wchar_t s3[][6] = { L"abcdef" };
    > wchar_t* s4[][6] = { L"abcdef" };
    >
    >That's four different types initialized with exactly the same
    >initializer syntax, but it means four different things. ...


    Indeed, this is all correct and true, but it is not special to wide
    characters. Replace "wchar_t" with "char", and remove the uppercase
    L's, and it is still all correct and true.

    (Versions of gcc helpfully warn about incomplete/inconsistent
    brace-bracketing of the fourth line, given the appropriate options.)
    --
    In-Real-Life: Chris Torek, Wind River Systems
    Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
    email: forget about it http://web.torek.net/torek/index.html
    Reading email is like searching for food in the garbage, thanks to spammers.
    Chris Torek, Sep 15, 2004
    #4
  5. Chris Torek wrote:
    >> wchar_t s1[] = { L"abcdef" };
    >> wchar_t* s2[] = { L"abcdef" };
    >> wchar_t s3[][6] = { L"abcdef" };
    >> wchar_t* s4[][6] = { L"abcdef" };

    >
    > Indeed, this is all correct and true, but it is not special to wide
    > characters. Replace "wchar_t" with "char", and remove the uppercase
    > L's, and it is still all correct and true.


    Ah, you're right. It was the first one I was unsure of, but:

    "An array of character type may be initialized by a character string
    literal, optionally enclosed in braces."
    "An array with element type compatible with wchar_t may be initialized
    by a wide string literal, optionally enclosed in braces."
    - C90, 6.5.7

    I can't figure out what these optional braces are for. I suppose yet
    another concession to existing implementations.
    --
    Derrick Coetzee
    I grant this newsgroup posting into the public domain. I disclaim all
    express or implied warranty and all liability. I am not a professional.
    Derrick Coetzee, Sep 15, 2004
    #5
  6. Derrick Coetzee <> wrote in message news:<ci9jr9$qt5$>...
    >
    > "An array of character type may be initialized by a character string
    > literal, optionally enclosed in braces."
    > "An array with element type compatible with wchar_t may be initialized
    > by a wide string literal, optionally enclosed in braces."
    > - C90, 6.5.7
    >
    > I can't figure out what these optional braces are for. I suppose yet
    > another concession to existing implementations.


    Consistency. In general, initializers for aggregate types are enclosed
    in braces.
    J. J. Farrell, Sep 15, 2004
    #6
  7. In article <>, (J. J. Farrell) writes:
    > Derrick Coetzee <> wrote in message news:<ci9jr9$qt5$>...
    > >
    > > I can't figure out what these optional braces are for. I suppose yet
    > > another concession to existing implementations.

    >
    > Consistency. In general, initializers for aggregate types are enclosed
    > in braces.


    The braces are also optional for initializers for scalar types.

    This consistency simplifies things for source-code generators, and
    means that {0} is a valid initializer for any object type or any
    array of unknown size (in a declaration where initialization is
    permitted).

    --
    Michael Wojcik
    Michael Wojcik, Sep 16, 2004
    #7