Why Am I Getting an Inverted Question Mark?

Phil Staite · Mar 14, 2005

Seems odd. Maybe, just maybe there is an empty or blank line at the
beginning of your source file? In that case during the first iteration
of the while loop line would be empty. Now, it *should* be ok to call
write with a char count of 0 and have it do nothing... But maybe there
is a problem with your stream code? Try adding a simple test:

while (getline(in,line)) {
if( ! line.empty() )
{
out.write(line.c_str(),line.size());
out.put('\n');
}
}

mary · Mar 14, 2005

When I read an HTML file starting with

<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=UTF-8">

and then I write it into another file, say OUTPUT.txt, I get an
inverted question mark, "¿",
at the beginning of the OUTPUT.txt file. Why is that?
Thanks!

mary

PS. I use:

string line;
while (getline(in,line)) {
out.write(line.c_str(),line.size());
out.put('\n');
}

mary · Mar 14, 2005

Phil,

Here is the code. It still does it with any file starting with
anything!
Thanks!

Mary

@@@@@@@@@@@@@@@@@@@@@@@

#include <iostream>
#include <fstream>
#include <string>

using namespace std;

string line;
int main()
{
ifstream in("INPUT.txt",ios::in);
if (!in) {
cout << "Cannot Open the INPUT file.\n";
return 1;
}
ofstream out("OUTPUT.txt",ios::out);
if (!out) {
cout << "Cannot Open the OUTPUT file.\n";
in.close();
return 1;
}
while (getline(in,line)) {
if( ! line.empty() ) {
out.write(line.c_str(),line.size());
out.put('\n');
}
}
in.close();
out.close();
return 0;
}

@@@@@@@@@@@@@@@@@@@@@@@@

Phlip · Mar 14, 2005

mary said:
When I read an HTML file starting with

<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=UTF-8">

and then I write it into another file, say OUTPUT.txt, I get an
inverted question mark, "¿",
at the beginning of the OUTPUT.txt file. Why is that?

Are you saving the file with Notepad.exe?

That program prefixes files that it perceives as Unicode (even UTF-8) with a
Byte Order Mark. If you use an editor to open your file in hex (or "binary")
mode, you might see the BOM, FEFF or FFEF, at the beginning.

Your output system does not interpret the codes as UTF-8, so it probably
uses ISO Latin-1. That has no glyph for FF or EF, so you get a "missing
glyph" symbol as ¿.

This could all be wrong, but the details are off-topic, so nobody is allowed
to contradict me.
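
For anyone without a hex editor at hand, the check suggested above can be sketched in a few lines of C++. This is only an illustration: the helper name leading_bytes_hex and the default of four bytes are made up for this example, and the file is opened in binary mode so that no bytes are translated on the way in.

```cpp
#include <cstddef>
#include <fstream>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper (not from the thread): returns the first `count`
// bytes of the file at `path` as upper-case hex pairs separated by spaces.
// Returns an empty string if the file cannot be opened or is empty.
std::string leading_bytes_hex(const std::string& path, std::size_t count = 4)
{
    std::ifstream in(path.c_str(), std::ios::in | std::ios::binary);
    std::ostringstream hex;
    hex << std::hex << std::uppercase << std::setfill('0');
    char c;
    for (std::size_t i = 0; i < count && in.get(c); ++i) {
        if (i != 0) hex << ' ';
        // Convert via unsigned char so bytes above 0x7F do not sign-extend.
        hex << std::setw(2) << static_cast<int>(static_cast<unsigned char>(c));
    }
    return hex.str();
}
```

Printing leading_bytes_hex("INPUT.txt") shows whether anything precedes the first visible character of the file.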

Kurt Stutsman · Mar 14, 2005

mary said:
out.write(line.c_str(),line.size());
out.put('\n');

I don't see anything wrong with your code, but the above lines could be
simplified to:
out << line << '\n';

Sven Axelsson · Mar 14, 2005

Phlip said:
Are you saving the file with Notepad.exe?

That program prefixes files that it perceives as Unicode (even UTF-8) with a
Byte Order Mark. If you use an editor to open your file in hex (or "binary")
mode, you might see the BOM, FEFF or FFEF, at the beginning.

Your output system does not interpret the codes as UTF-8, so it probably
uses ISO Latin-1. That has no glyph for FF or EF, so you get a "missing
glyph" symbol as ¿.

This could all be wrong, but the details are off-topic, so nobody is allowed
to contradict me.

Well, your reasoning is correct, but not your facts. A Unicode file may
start with FEFF or FFFE (not FFEF) to indicate endianness. A UTF-8 file,
however, starts with EFBBBF if it has a BOM mark at all. But, no doubt, the
BOM mark is what the OP is seeing.
