API for stream (char vs wchar_t)

Discussion in 'C++' started by mathieu, Feb 14, 2011.

  1. mathieu

    mathieu Guest

    Dear all,

    I am looking for suggestions on API design when dealing with
    filenames. Should I expose an API taking char*, wchar_t*, or both to
    my users? Since I am reading image files (binary files), I would be
    using std::ifstream; however, the std::ifstream API taking wchar_t is
    only available on Microsoft compilers.

    Thanks for comments,
    mathieu, Feb 14, 2011

  2. Jorgen Grahn

    Jorgen Grahn Guest

    What if you just let your API deal with istreams and ostreams, and
    let the user do the opening of files (or whatever she chooses)?

    Jorgen Grahn, Feb 14, 2011
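
A minimal sketch of the stream-based idea, not taken from the thread: the names `read_leading_bytes` and `read_leading_bytes_from_memory` are hypothetical, and reading a few leading bytes merely stands in for whatever the image library would really do. The point is that the library only ever sees a `std::istream`.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical library entry point: reads up to `count` bytes from any
// input stream. No filename is involved, so the library itself never has
// to choose between char and wchar_t.
std::string read_leading_bytes(std::istream& in, std::size_t count)
{
    if (count == 0) {
        return std::string();
    }
    std::string bytes(count, '\0');
    in.read(&bytes[0], static_cast<std::streamsize>(count));
    // gcount() reports how many bytes the read() above actually delivered.
    bytes.resize(static_cast<std::size_t>(in.gcount()));
    assert(bytes.size() <= count);  // never more than requested
    return bytes;
}

// The same entry point also works on data that never came from a file.
std::string read_leading_bytes_from_memory(const std::string& data,
                                           std::size_t count)
{
    std::istringstream in(data, std::ios::in | std::ios::binary);
    return read_leading_bytes(in, count);
}
```

The caller would open the file in whatever way the platform allows (for example `std::ifstream file(name, std::ios::binary);`) and pass the stream in, so the filename type never appears in the library's interface.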

  3. Goran

    Goran Guest

    Plain English, no other languages? char*.

    If not (welcome to the real world...)

    It's much worse than streams having a wchar_t version only with MS.
    wchar_t means UTF-16 under Windows, and it means UTF-32 under Unix-like
    systems. char* under Unix-like systems of today means UTF-8, and most
    often means MBCS under Windows (UTF-8 is less often seen there).

    So IMO... You need to pick one encoding for your core functionality
    (typically UTF-8 or UTF-16; note that ICU, Qt, Windows, Java and CLR
    use UTF-16), and you should slap a platform-specific layer on top of
    that.

    Goran, Feb 14, 2011
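
One way to picture the "single core encoding plus platform layer" approach, assuming the core keeps filenames as UTF-8 in `std::string`: the Windows-facing layer would need a UTF-8 to UTF-16 conversion before calling any wide-character API. The function name `utf8_to_utf16` is hypothetical, and this is a minimal sketch rather than a complete validator (it does not reject overlong encodings or encoded surrogates).

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Converts UTF-8 to UTF-16. Throws on bad lead bytes, bad continuation
// bytes, truncated sequences and code points above U+10FFFF.
std::u16string utf8_to_utf16(const std::string& utf8)
{
    std::u16string out;
    std::size_t i = 0;
    while (i < utf8.size()) {
        const unsigned char lead = static_cast<unsigned char>(utf8[i]);
        char32_t cp = 0;
        std::size_t extra = 0;  // number of continuation bytes expected
        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");

        if (extra > utf8.size() - i - 1) {
            throw std::invalid_argument("truncated UTF-8 sequence");
        }
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(utf8[i + k]);
            if ((c & 0xC0) != 0x80) {
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            }
            cp = static_cast<char32_t>((cp << 6) | (c & 0x3F));
        }
        i += extra + 1;

        if (cp > 0x10FFFF) {
            throw std::invalid_argument("code point out of range");
        }
        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Code points beyond the BMP become a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    assert(i == utf8.size());  // every byte was consumed exactly once
    return out;
}
```

On Unix-like systems the same layer could pass the UTF-8 bytes through unchanged, which is why only the Windows direction is sketched here.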
  4. Öö Tiib

    Öö Tiib Guest

    With file names and paths you can use boost::filesystem. It is quite
    portable and supports wchar_t. char is "a byte" in C++. Does a buffer
    of bytes contain text, and if so, in what encoding? That is
    meta-information, and the C++ compiler does not help with it at all.
    wchar_t is at least a character type for the compiler, and if your
    users need a particular encoding somewhere, then it is less
    error-prone for them to convert from a wchar_t buffer than from a
    char buffer.
    Öö Tiib, Feb 15, 2011
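
A rough sketch of the path-object approach. Note the substitution: it uses `std::filesystem` from C++17, which was modeled on boost::filesystem, rather than Boost itself, so a C++17 compiler is assumed. The helper name `read_file_bytes` is hypothetical.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <iterator>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: reads a whole file as raw bytes. fs::path can be
// built from char or wchar_t strings and stores the name in the platform's
// native form, so one signature serves both kinds of caller.
std::string read_file_bytes(const fs::path& file)
{
    assert(!file.empty());  // an empty path cannot name a file
    std::ifstream in(file, std::ios::binary);  // C++17 overload taking fs::path
    if (!in) {
        return std::string();  // sketch only: a real API would report the error
    }
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

A caller could write either `read_file_bytes("image.png")` or `read_file_bytes(L"image.png")`; the conversion to the native filename type happens inside the path object.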
  5. Nobody

    Nobody Guest

    char on Unix, wchar_t on Windows.

    On Unix, filenames are strings of bytes with no associated encoding. On
    Windows, they're wide-character strings.

    Trying to use one approach on both operating systems will be sub-optimal,
    although it may be close enough depending upon the application (e.g. for a
    user-level application, you can probably assume that any filenames which
    you encounter will be in the encoding of the locale, but that won't work
    for an administrative utility, which may have to deal with files belonging
    to various users, each using a different locale).
    Nobody, Feb 17, 2011
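
The "char on Unix, wchar_t on Windows" advice could be sketched with a per-platform typedef. The names `native_char`, `native_string`, `NATIVE_TEXT` and `open_image` are hypothetical, and the Windows branch assumes Microsoft's standard library, whose wchar_t overload of `std::ifstream::open` is the extension mentioned in the original question.

```cpp
#include <cassert>
#include <fstream>
#include <string>

#ifdef _WIN32
typedef wchar_t native_char;      // Windows filenames are wide strings
#define NATIVE_TEXT(s) L##s
#else
typedef char native_char;         // Unix filenames are plain byte strings
#define NATIVE_TEXT(s) s
#endif

typedef std::basic_string<native_char> native_string;

// Opens `filename` for binary reading and reports whether that succeeded.
// On Windows this relies on the Microsoft-only wchar_t overload of open().
bool open_image(std::ifstream& in, const native_string& filename)
{
    assert(!filename.empty());  // an empty name cannot refer to a file
    in.open(filename.c_str(), std::ios::in | std::ios::binary);
    return in.is_open();
}
```

User code would then spell filenames as `NATIVE_TEXT("image.png")`, much like the `TCHAR` and `_T()` convention in Windows headers, and would never convert between encodings on its way to the filesystem.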