Why Am I Getting an Inverted Question Mark?

Discussion in 'C++' started by Phil Staite, Mar 14, 2005.

  1. Phil Staite

    Phil Staite Guest

    Seems odd. Maybe, just maybe there is an empty or blank line at the
    beginning of your source file? In that case during the first iteration
    of the while loop line would be empty. Now, it *should* be ok to call
    write with a char count of 0 and have it do nothing... But maybe there
    is a problem with your stream code? Try adding a simple test:

    while (getline(in,line)) {
        if( ! line.empty() )
        {
            out.write(line.c_str(),line.size());
            out.put('\n');
        }
    }
    Phil Staite, Mar 14, 2005
    #1

  2. mary

    mary Guest

    When I read an HTML file starting with

    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=UTF-8">

    and then I write it into another file, say OUTPUT.txt, I get an
    inverted question mark, "¿",
    at the beginning of the OUTPUT.txt file. Why is that?
    Thanks!

    mary

    PS. I use:

    string line;
    while (getline(in,line)) {
        out.write(line.c_str(),line.size());
        out.put('\n');
    }
    mary, Mar 14, 2005
    #2

  3. mary

    mary Guest

    Phil,

    Here is the code. It still does it with any file starting with
    anything!
    Thanks!

    Mary

    @@@@@@@@@@@@@@@@@@@@@@@

    #include <iostream>
    #include <fstream>
    #include <string>

    using namespace std;

    string line;
    int main()
    {
        ifstream in("INPUT.txt",ios::in);
        if (!in) {
            cout << "Cannot Open the INPUT file.\n";
            return 1;
        }
        ofstream out("OUTPUT.txt",ios::out);
        if (!out) {
            cout << "Cannot Open the OUTPUT file.\n";
            in.close();
            return 1;
        }
        while (getline(in,line)) {
            if( ! line.empty() ) {
                out.write(line.c_str(),line.size());
                out.put('\n');
            }
        }
        in.close();
        out.close();
        return 0;
    }

    @@@@@@@@@@@@@@@@@@@@@@@@
    On Sun, 13 Mar 2005 21:26:05 -0700, Phil Staite wrote:

    >Seems odd. Maybe, just maybe there is an empty or blank line at the
    >beginning of your source file? In that case during the first iteration
    >of the while loop line would be empty. Now, it *should* be ok to call
    >write with a char count of 0 and have it do nothing... But maybe there
    >is a problem with your stream code? Try adding a simple test:
    >
    >while (getline(in,line)) {
    > if( ! line.empty() )
    > {
    > out.write(line.c_str(),line.size());
    > out.put('\n');
    > }
    >}
    mary, Mar 14, 2005
    #3
  4. Phlip

    Phlip Guest

    mary wrote:

    > When I read an HTML file starting with
    >
    > <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=UTF-8">
    >
    > and then I write it into another file, say OUTPUT.txt, I get an
    > inverted question mark, "¿",
    > at the beginning of the OUTPUT.txt file. Why is that?


    Are you saving the file with Notepad.exe?

    That program prefixes files that it perceives as Unicode (even UTF-8) with a
    Byte Order Mark. If you use an editor to open your file in hex (or "binary")
    mode, you might see the BOM, FEFF or FFEF, at the beginning.

    Your output system does not interpret the codes as UTF-8, so it probably
    uses ISO Latin-1. That has no glyph for FF or EF, so you get a "missing
    glyph" symbol as ¿.

    This could all be wrong, but the details are off-topic, so nobody is allowed
    to contradict me.

    --
    Phlip
    http://industrialxp.org/community/bin/view/Main/TestFirstUserInterfaces
    Phlip, Mar 14, 2005
    #4
  5. mary wrote:
    > out.write(line.c_str(),line.size());
    > out.put('\n');


    I don't see anything wrong with your code, but the above lines could be
    simplified to:
    out << line << '\n';
    Kurt Stutsman, Mar 14, 2005
    #5
  6. On Mon, 14 Mar 2005 06:16:12 GMT, Phlip wrote:

    > mary wrote:
    >
    >> When I read an HTML file starting with
    >>
    >> <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=UTF-8">
    >>
    >> and then I write it into another file, say OUTPUT.txt, I get an
    >> inverted question mark, "¿",
    >> at the beginning of the OUTPUT.txt file. Why is that?

    >
    > Are you saving the file with Notepad.exe?
    >
    > That program prefixes files that it perceives as Unicode (even UTF-8) with a
    > Byte Order Mark. If you use an editor to open your file in hex (or "binary")
    > mode, you might see the BOM, FEFF or FFEF, at the beginning.
    >
    > Your output system does not interpret the codes as UTF-8, so it probably
    > uses ISO Latin-1. That has no glyph for FF or EF, so you get a "missing
    > glyph" symbol as ¿.
    >
    > This could all be wrong, but the details are off-topic, so nobody is allowed
    > to contradict me.


    Well, your reasoning is correct, but not your facts. A Unicode file may
    start with FEFF or FFFE (not FFEF) to indicate endianness. A UTF-8 file,
    however, starts with EFBBBF if it has a BOM mark at all. But, no doubt, the
    BOM mark is what the OP is seeing.

    --
    Sven Axelsson, Sweden
    Sven Axelsson, Mar 14, 2005
    #6