O_TEXT, in microsoft environment

O_TEXT

When a file is opened with open and O_TEXT in a Microsoft environment
and then read with read, the reading is done in text mode.

What exactly does text mode mean?

How is it handled when reading?

How is it handled when writing?

Does text mode have an impact on the functions that return the position
within the file? If yes, on which ones?
 
jacob navia

O_TEXT said:
> When a file is opened with open and O_TEXT in a Microsoft environment
> and then read with read, the reading is done in text mode.
>
> What exactly does text mode mean?
>
> How is it handled when reading?
>
> How is it handled when writing?
>
> Does text mode have an impact on the functions that return the position
> within the file? If yes, on which ones?

DO YOUR OWN HOMEWORK!
 
Malcolm McLean

O_TEXT said:
> When a file is opened with open and O_TEXT in a Microsoft environment
> and then read with read, the reading is done in text mode.
>
> What exactly does text mode mean?
>
> How is it handled when reading?
>
> How is it handled when writing?
>
> Does text mode have an impact on the functions that return the position
> within the file? If yes, on which ones?

If you are printing on an old-fashioned line printer you need to tell
the printer to move down to the next line (line feed, "\n") and also to
return to the start of the line (carriage return, "\r").
However if you are typing on a modern keyboard, the return key usually
moves the cursor to the next line automatically.
So the question is whether to represent newlines as "\n" or "\r\n".
Operating systems make different choices. However ANSI C has decided that
inside a program a line always ends with a single "\n". So if you open a
file in text mode on an "\r\n" operating system, each "\r\n" pair is
silently translated to "\n" on input (and "\n" back to "\r\n" on output).
This is only good for files that actually represent text. Binary files might
have "\r\n" sequences embedded in them purely by chance.

So in the standard library we use

fopen(filename, "r");

to open in text mode for reading

fopen(filename, "rb")

to open in binary mode.

O_TEXT is just an alternative interface Microsoft have provided to the same
underlying system
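
As a rough illustration of what text mode does on input, the translation
can be modelled in portable C. This is only a sketch: crlf_to_lf is a
made-up helper, not part of any library, and a real runtime performs the
translation inside its own read path.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative helper (not a library function): collapse every "\r\n"
 * pair in buf[0..len) to a single '\n', in place, and return the new
 * length. A lone '\r' that is not followed by '\n' is kept unchanged. */
size_t crlf_to_lf(char *buf, size_t len)
{
    size_t in = 0, out = 0;

    assert(buf != NULL);
    while (in < len) {
        if (buf[in] == '\r' && in + 1 < len && buf[in + 1] == '\n')
            in++;                   /* skip the '\r', keep the '\n' */
        buf[out++] = buf[in++];
    }
    return out;
}
```

Applied to the 10 bytes "abc\r\n123\r\n", the helper returns 8 and leaves
"abc\n123\n" at the start of the buffer.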
 
O_TEXT

Malcolm McLean wrote:
> If you are printing on an old-fashioned line printer you need to tell
> the printer to move down to the next line (line feed, "\n") and also to
> return to the start of the line (carriage return, "\r").
> However if you are typing on a modern keyboard, the return key usually
> moves the cursor to the next line automatically.
> So the question is whether to represent newlines as "\n" or "\r\n".
> Operating systems make different choices. However ANSI C has decided
> that inside a program a line always ends with a single "\n". So if you
> open a file in text mode on an "\r\n" operating system, each "\r\n"
> pair is silently translated to "\n" on input.
> This is only good for files that actually represent text. Binary files
> might have "\r\n" sequences embedded in them purely by chance.
>
> So in the standard library we use
>
> fopen(filename, "r");
>
> to open in text mode for reading
>
> fopen(filename, "rb")
>
> to open in binary mode.
>
> O_TEXT is just an alternative interface Microsoft have provided to the
> same underlying system


Okay, I understand the \r\n -> \n translation mechanism.

But what I do not understand is whether the read and lseek values are
byte offsets or character offsets.

If you read 4096 bytes which contain 3900 translated characters, will
read return 4096 or 3900? What about lseek?
 
O_TEXT

jacob navia wrote:
DO YOUR OWN HOMEWORK!

Malcom provides a better answer.

your answer is useless.
Moreover, you should learn keyboards contain a key which allow typing
non capitalized characters.
 
Charlie Gordon

O_TEXT said:
> Okay, I understand the \r\n -> \n translation mechanism.
>
> But what I do not understand is whether the read and lseek values are
> byte offsets or character offsets.
>
> If you read 4096 bytes which contain 3900 translated characters, will
> read return 4096 or 3900? What about lseek?

This is highly Microsoft specific. You are referring to non standard
low-level I/O. You will get a more accurate answer from a microsoft
specific forum.
 
Mark Bluemel

O_TEXT said:
> When a file is opened with open and O_TEXT in a Microsoft environment
> and then read with read, the reading is done in text mode.

If you want to know about Microsoft specifics, there are probably more
appropriate groups in which to ask. We tend to concentrate on the C
language as defined by the ISO standard, rather than platform-specific
features.

> What exactly does text mode mean?

If O_TEXT means opening in what the standard defines as a text stream,
it means what the standard says...

> How is it handled when reading?
> How is it handled when writing?

The standard says that that is implementation-defined, as I understand it.

> Does text mode have an impact on the functions that return the position
> within the file? If yes, on which ones?

The standard provides ftell() (and fgetpos()) for this and defines the
ftell() result as, in effect, "opaque" data for text streams. By this I
mean that the value can be used by fseek() on the same stream but has no
externally-meaningful value.
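
In portable terms, the only safe way to use a text-stream position is to
save it with ftell() and hand it back to fseek() with SEEK_SET. A minimal
sketch, assuming a scratch file can be created at the path passed in (the
function name and the file contents are purely illustrative):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Illustrative demo: returns 1 if re-reading after fseek() to a saved
 * ftell() position yields the same line, 0 if it differs, -1 on error.
 * The scratch file at 'path' is created and removed again. */
int text_position_round_trip(const char *path)
{
    char line[16], again[16];
    long pos;
    int same = -1;
    FILE *fp;

    assert(path != NULL);
    fp = fopen(path, "w+");                 /* text stream, read/write */
    if (fp == NULL)
        return -1;

    if (fputs("abc\n123\n", fp) != EOF) {
        rewind(fp);
        if (fgets(line, sizeof line, fp) != NULL &&     /* consume "abc\n" */
            (pos = ftell(fp)) != -1L &&                 /* opaque position */
            fgets(line, sizeof line, fp) != NULL &&     /* reads "123\n" */
            fseek(fp, pos, SEEK_SET) == 0 &&            /* saved value + SEEK_SET */
            fgets(again, sizeof again, fp) != NULL)     /* reads it again */
            same = (strcmp(line, again) == 0);
    }
    fclose(fp);
    remove(path);
    return same;
}
```

The value stored in pos is never interpreted as a byte or character count;
it is only passed back to fseek() on the same stream.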
 
Charlie Gordon

O_TEXT said:
> Malcom provides a better answer.

It's Malcolm actually.
But names don't matter, do they, O_TEXT?

> your answer is useless.

You should begin your phrases with a capital letter.

> Moreover, you should learn keyboards contain a key which allow typing non
> capitalized characters.

Keyboards do not "contain" keys.

Your question is probably not homework, but it is O/S specific. It is too
bad you were rude to Jacob, for he could have given you a detailed answer,
Windows being his platform of choice.
 
Richard Heathfield

Charlie Gordon said:
> Your question is probably not homework, but it is O/S specific. It is
> too bad you were rude to Jacob,

Yes indeed - and it was also too bad that Jacob was rude to him.

> for he could have given you detailed
> answer, Windows being his platform of choice.

Yes, but then C is his language of choice. Just because <foo> is your <bar>
of choice, that doesn't make you an expert <foo>er.
 
Charlie Gordon

Richard Heathfield said:
> Yes indeed - and it was also too bad that Jacob was rude to him.
>
> Yes, but then C is his language of choice. Just because <foo> is your
> <bar> of choice, that doesn't make you an expert <foo>er.

I used to waste my energy programming on DOS around these stupid
Microsoft-specific text mode hacks. I even rewrote a C I/O library because
I was disgusted to see them do the \r\n translation at the low-level
interface instead of just at the stdio level. The benefits of buffering
were lost because of this, as the low-level I/O was not done on nice round
aligned blocks, leading to general sluggishness.

I have since given up on this broken platform, and focus my development
efforts on Linux, only occasionally porting stuff to MingW or Cygwin.

Jacob makes a Windows based C compiler. He is certainly more up to date on
this issue than I am, and more an expert at it than most regulars on this
forum. Your constant bashing of his abilities is tiresome and provocative.
He makes mistakes, you make mistakes, I make mistakes, we all do, both on
technical issues and on communication skills. We are all on the same side
here, defending our language of choice, let's reserve our attacks for all
the crud out there that deserves it: java and C++ bloatware from all the big
names, MSFT and ORCL leading the pack...
 
O_TEXT

Charlie Gordon wrote:
I used to waste my energy programming on DOS around these stupid Microsoft
specific text mode hacks. I even rewrote a C I/O library because I was
disgusted to see them do the \r\n translation at the low level interface
instead of just the stdio level. The benefits of bufferization were lost
because of this, as the low level I/O was not done on nice round aligned
blocks, leading to general sluggishness.

I have since given up on this broken platform, and focus my development
efforts on Linux, only occasionally porting stuff to MingW or Cygwin.

Jacob makes a Windows based C compiler. He is certainly more up to date on
this issue than I am, and more an expert at it than most regulars on this
forum. Your constant bashing of his abilities is tiresome and provocative.
He makes mistakes, you make mistakes, I make mistakes, we all do, both on
technical issues and on communication skills. We are all on the same side
here, defending our language of choice, let's reserve our attacks for all
the crud out there that deserves it: java and C++ bloatware from all the big
names, MSFT and ORCL leading the pack...

I agree with you.
I am sorry if I have hurt somebody. I am sorry if I do not write your
language as well as you do.

What I'd like to do is understand how the stdio and io libraries of the
ReactOS CRT work (or do not work). For this I needed to find some
information on what the original Microsoft behavior was.

I quickly searched the internet and did not find anything.
I have just asked in microsoft.public.vc.language; is this an appropriate
forum?
 
O_TEXT

Mark Bluemel wrote:
> If you want to know about Microsoft specifics, there are probably more
> appropriate groups in which to ask. We tend to concentrate on the C
> language as defined by the ISO standard, rather than platform-specific
> features.

Yes, which one?

> If O_TEXT means opening in what the standard defines as a text stream,
> it means what the standard says...

Is the standard available at a known URL?

> The standard says that that is implementation-defined, as I understand it.
>
> The standard provides ftell() (and fgetpos()) for this and defines the
> ftell() result as, in effect, "opaque" data for text streams. By this I
> mean that the value can be used by fseek() on the same stream but has no
> externally-meaningful value.

Okay.

ftell is at the stdio level, isn't it? (I mean it is not at the same
level as lseek, read, write and open.)

Doesn't lseek (with offset 0 and SEEK_CUR) provide the same kind of
functionality?


I had three other subsidiary questions:

Does read take the buffer size in bytes as input and return the number
of characters read?

Does lseek take the number of bytes to move or the number of characters?

If I write a newline character into an existing text file, will the
behavior be deterministic in some way? Will this change the three-byte
file from "x\r\n" to "\r\n\n"?
 
Richard

Charlie Gordon said:
I used to waste my energy programming on DOS around these stupid Microsoft
specific text mode hacks. I even rewrote a C I/O library because I was
disgusted to see them do the \r\n translation at the low level interface
instead of just the stdio level. The benefits of bufferization were lost
because of this, as the low level I/O was not done on nice round aligned
blocks, leading to general sluggishness.

I have since given up on this broken platform, and focus my development
efforts on Linux, only occasionally porting stuff to MingW or Cygwin.

Jacob makes a Windows based C compiler. He is certainly more up to date on
this issue than I am, and more an expert at it than most regulars on this
forum. Your constant bashing of his abilities is tiresome and
provocative.

Well said. He comes across as spiteful and bitter. I simply don't know
why he cannot rein himself in, as his C knowledge is generally better
than most. Unfortunately a small clique have oiled each other up so much
for the past while, that a small core element really do think they own
this newsgroup. One rule for them and another rule for others.
 
jacob navia

Charlie said:
I used to waste my energy programming on DOS around these stupid Microsoft
specific text mode hacks. I even rewrote a C I/O library because I was
disgusted to see them do the \r\n translation at the low level interface
instead of just the stdio level. The benefits of bufferization were lost
because of this, as the low level I/O was not done on nice round aligned
blocks, leading to general sluggishness.

I have since given up on this broken platform, and focus my development
efforts on Linux, only occasionally porting stuff to MingW or Cygwin.

Jacob makes a Windows based C compiler. He is certainly more up to date on
this issue than I am, and more an expert at it than most regulars on this
forum. Your constant bashing of his abilities is tiresome and provocative.
He makes mistakes, you make mistakes, I make mistakes, we all do, both on
technical issues and on communication skills. We are all on the same side
here, defending our language of choice, let's reserve our attacks for all
the crud out there that deserves it: java and C++ bloatware from all the big
names, MSFT and ORCL leading the pack...

The original poster did not give any name in his post besides the O_TEXT
pseudonym. That, and the wording of the message, together with the content
(this is explained in ANY Microsoft documentation), led me to think that
this was yet another student wanting us to do his/her homework.

Now that the original poster has answered, I see that this wasn't
the case.

Note (Mr Heathfield) that I did not insult anyone or treat anyone
badly. I just said that he/she should do his/her own homework.

As far as I know, that is not an insult or being "rude" or whatever
you want to see in my answer.
 
O_TEXT

jacob navia wrote:
The original poster did not give any name to his post besides the O_TEXT
pseudo. That, and the wording of the message, together with the content
(this is explained in ANY Microsoft documentation) led me to think that
this was yet another student seeking that we do his/her homework.

Now that the original poster has answered, I see that it wasn't
the case.

Note (Mr Heathfield) that I did not insult anyone or treated anyone
badly. I just said that he/she should do his/her own homework.

As far as I know, that is not an insult or being "rude" or whatever
you want to see in my answer.

Nor was my answer an insult.

Was my English not understandable?

What is the URL for the «ANY Microsoft documentation» answering the
questions?

PS: I hope no student is taught to use these low-level, non-portable
things.
 
Mark Bluemel

O_TEXT said:
> Yes, which one?

See section 11 of the C FAQ at http://www.c-faq.com

> Is the standard available at a known URL?

See section 11 of the C FAQ at http://www.c-faq.com

> ftell is at the stdio level, isn't it? (I mean it is not at the same
> level as lseek, read, write and open.)

lseek, read, write and open are not specified by the C language.
Any discussion of them is necessarily platform-specific.

> Doesn't lseek (with offset 0 and SEEK_CUR) provide the same kind of
> functionality?

It may do, and probably does on the platforms to which you refer, but as
far as standard C is concerned, lseek could fill a bathtub with brightly
coloured machine tools...

> I had three other subsidiary questions:
> Does read take the buffer size in bytes as input and return the number
> of characters read?

You'd need to refer to the documentation for read() on your platform.
On my Linux system, that is the case, but the standard doesn't specify a
read() function.

> Does lseek take the number of bytes to move or the number of characters?

Again the standard doesn't specify lseek(). You need to refer to the
documentation for your platform.

As I've already stated, the standard says that for ftell() and fseek()
on text files, the "position" is opaque data, as far as I can see. For a
text stream the standard only allows fseek() with an offset of zero, or
with a value previously returned by ftell() together with SEEK_SET.

> If I write a newline character into an existing text file, will the
> behavior be deterministic in some way? Will this change the three-byte
> file from "x\r\n" to "\r\n\n"?

As I understand it, all the standard guarantees (and that is only for
the functions that the standard specifies) is that when a file is
written in text mode, the character represented by '\n' will be
converted to whatever character sequence "normally" terminates a
line on your specific platform.
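
To make the "x\r\n" question concrete, here is a small model of an
in-place text-mode write on a platform whose line terminator is "\r\n".
It assumes that each '\n' is expanded to two bytes and simply overwrites
whatever bytes are already there; text_write_at is a hypothetical helper
working on a memory buffer, not a real I/O call, and the actual behaviour
of a given runtime should be checked against its documentation.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative model: store src[0..n) into file[0..file_len) starting at
 * byte offset pos, expanding each '\n' to "\r\n". The model never grows
 * the file; it stops when the next item would not fit. Returns the number
 * of bytes actually stored. */
size_t text_write_at(char *file, size_t file_len, size_t pos,
                     const char *src, size_t n)
{
    size_t i, written = 0;

    assert(file != NULL && src != NULL && pos <= file_len);
    for (i = 0; i < n; i++) {
        if (src[i] == '\n') {
            if (pos + written + 2 > file_len)
                break;                      /* no room for "\r\n" */
            file[pos + written++] = '\r';
            file[pos + written++] = '\n';
        } else {
            if (pos + written + 1 > file_len)
                break;                      /* no room for one byte */
            file[pos + written++] = src[i];
        }
    }
    return written;
}
```

Under these assumptions, writing the single character '\n' at offset 0 of
the three bytes "x\r\n" stores two bytes and leaves "\r\n\n".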
 
Richard Heathfield

Charlie Gordon said:
> Your constant bashing of his abilities is tiresome and
> provocative.

I have never bashed his abilities, only his apparent *in*ability to learn
from his mistakes.

> He makes mistakes, you make mistakes, I make mistakes, we
> all do, both on technical issues and on communication skills.

Right. When you make a mistake, you are generally quick to recognise it,
and even when you aren't so quick to do that, you make an effort to
understand what your critic is getting at. Well done you!

> We are all on the same side here,

I'm not convinced.

> defending our language of choice,

Oh, that isn't the side I'm on. If people want to use something else, let
them. I'm not here to defend C, but to help people to learn it. Part of
helping people to learn C involves pointing out mistakes made not just by
the OP but by those who respond to him. But I have made a conscious effort
(not always successfully) to resist replying to Jacob Navia's articles in
recent months, despite the large number of mistakes he makes, simply
because such discussions tend to generate more heat than light. Since few
others here want to attract the kind of flak that I get from him, the
result is that many of his mistakes go uncorrected. This is not good for
the OPs or for the group.

> let's reserve our attacks
> for all the crud out there that deserves it:

I'll make you a deal. If you will personally undertake to post corrections
to all the mistakes Jacob Navia makes in this group, I'll never so much as
mention his name again. Fair?

> java and C++ bloatware from
> all the big names, MSFT and ORCL leading the pack...

I see no reason to attack Java or C++ in a C newsgroup. If people want to
find out how good or how bad those technologies are, comp.lang.c is hardly
the right place to do it.
 
jacob navia

O_TEXT said:
What is the URL for the «ANY Microsoft documentation» answering the
questions?

When you go to the Visual Studio IDE and type
O_TEXT in the "search" text box, the following documentation appears:
---------------------------------------------------------------------------
#include <fcntl.h>


Remarks
The _O_BINARY and _O_TEXT manifest constants determine the translation
mode for files (_open and _sopen) or the translation mode for streams
(_setmode).

The allowed values are:

_O_TEXT
Opens file in text (translated) mode. Carriage return – linefeed (CR-LF)
combinations are translated into a single linefeed (LF) on input.
Linefeed characters are translated into CR-LF combinations on output.
Also, CTRL+Z is interpreted as an end-of-file character on input. In
files opened for reading and reading/writing, fopen checks for CTRL+Z at
the end of the file and removes it, if possible. This is done because
using the fseek and ftell functions to move within a file ending with
CTRL+Z may cause fseek to behave improperly near the end of the file.

_O_BINARY
Opens file in binary (untranslated) mode. The above translations are
suppressed.

_O_RAW
Same as _O_BINARY. Supported for C 2.0 compatibility.

For more information, see Text and Binary Mode File I/O and File
Translation.

See Also
Reference
_open, _wopen
-----------------------------------------------------------------------------
You can also look on the internet if you wish:
http://msdn2.microsoft.com/en-us/library/ktss1a9b(VS.80).aspx

You can also see the correspondence between the _open flags and the fopen mode strings in
http://msdn2.microsoft.com/en-us/library/yeby3zcb(VS.80).aspx

You can set the text/binary mode with the function _setmode. See:
http://msdn2.microsoft.com/en-us/library/tw4k6df8(VS.80).aspx

You can query the state of the text/binary flag with _getmode
See:
http://search.msdn.microsoft.com/search/Default.aspx?brand=msdn&query=O_TEXT&refinement=02&lang=


And I will stop here. A search for O_TEXT in
www.msdn.com took 0.8 seconds.
 
O_TEXT

jacob navia wrote:
When you go to visual studio IDE, then type
O_READ in the "search" text box, following documentation appears:
--------------------------------------------------------------------------- snip
_O_TEXT
Opens file in text (translated) mode. Carriage return – linefeed (CR-LF)
combinations are translated into a single linefeed (LF) on input.
Linefeed characters are translated into CR-LF combinations on output.
Also, CTRL+Z is interpreted as an end-of-file character on input. In
files opened for reading and reading/writing, fopen checks for CTRL+Z at
the end of the file and removes it, if possible. This is done because
using the fseek and ftell functions to move within a file ending with
CTRL+Z may cause fseek to behave improperly near the end of the file.

It is pretty clear. Thanks.

You can also look in the internet if you wish:
http://msdn2.microsoft.com/en-us/library/ktss1a9b(VS.80).aspx

You can also see the correspondence between O_READ and fread flags in
http://msdn2.microsoft.com/en-us/library/yeby3zcb(VS.80).aspx

You can set the text/binary mode with the function _setmode. See:
http://msdn2.microsoft.com/en-us/library/tw4k6df8(VS.80).aspx

You can query the state of the text/binary flag with _getmode
See:
http://search.msdn.microsoft.com/search/Default.aspx?brand=msdn&query=O_TEXT&refinement=02&lang=
And I will stop here. A search for O_TEXT in
www.msdn.com took 0.8 seconds.

On this site, I have found information on Microsoft's lseek and read:

lseek:
parameter offset
Number of bytes from origin.
Return Value
_lseek returns the offset, in bytes, of the new position from the
beginning of the file.

and on read:
_read returns the number of bytes read, which might be less
than count if there are fewer than count bytes left in the file
or if the file was opened in text mode, in which case each
carriage return–line feed (CR-LF) pair is replaced with a
single linefeed character.
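
Taken together, the two excerpts suggest that _read counts bytes after
translation while _lseek counts raw bytes in the file. The following
sketch models that reading on a memory buffer; model_text_read is a
hypothetical function, and treating a '\r' as part of a pair only when
the '\n' falls inside the same chunk is a simplifying assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative model of a text-mode read: take at most 'count' raw bytes
 * from file[*pos..file_len), drop the '\r' of every CR-LF pair, copy the
 * rest to buf, advance *pos by the raw byte count, and return the number
 * of bytes placed in buf. buf must have room for 'count' bytes. */
size_t model_text_read(const char *file, size_t file_len, size_t *pos,
                       char *buf, size_t count)
{
    size_t raw, i, out = 0;

    assert(file != NULL && pos != NULL && buf != NULL && *pos <= file_len);
    raw = file_len - *pos;
    if (raw > count)
        raw = count;                    /* at most 'count' raw bytes */

    for (i = 0; i < raw; i++) {
        char c = file[*pos + i];
        if (c == '\r' && i + 1 < raw && file[*pos + i + 1] == '\n')
            continue;                   /* drop the '\r' of a CR-LF pair */
        buf[out++] = c;
    }
    *pos += raw;                        /* the position counts raw bytes */
    return out;                         /* the result counts translated bytes */
}
```

Under this model, a 4096-byte request over data containing 196 CR-LF pairs
that lie fully inside the chunk would return 3900, while the file position
advances by 4096.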
 
Kenneth Brody

O_TEXT wrote:
[...]
okay, I understand the \r\n -> \n translation mechanism.

But what I do not understand is whether read and lseek values are bytes
offset or character offset.

If you read 4096 bytes which contain 3900 translated characters, will
read return 4096 or 3900? what's about lseek?

Well, open/read/lseek are not part of standard C. However, the
same discussion applies to text mode of fopen/fread/fseek.

I will use MS-Windows as an example, and any specific numbers are
implementation-specific. However, the ideas are cross-platform.

Suppose you have a file containing the following 10 characters:

'a' 'b' 'c' '\r' '\n' '1' '2' '3' '\r' '\n'

If the file is fopened in text mode, and you call fread() with a
length of 5, you will be returned the following 5 characters,
and fread() will tell you that you read 5 characters:

'a' 'b' 'c' '\n' '1'

Note that you get the '1' from the second line of the file. This
is because you requested 5 characters, and there were 5 available.
Note, too, that although 5 characters were returned, you have
advanced 6 bytes within the file, because the "\r\n" pair counted
as a single character.

If you were to call ftell(), it would tell you that you were at
position 6 in the file, despite having read only 5 characters.

If you were to later fseek() to the value returned from ftell(),
you would be positioned where you were before -- at the '2'.

However, if you were to assume that, having read 5 characters,
you were at offset 5, and passed 5 to fseek() (which, as I recall,
is not "legal", as it's not a value returned from ftell), you would
be positioned elsewhere -- at the '1'.
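
The arithmetic in this example can be checked with a few lines of C. The
sketch below models a Windows-style text stream on a memory buffer and
counts how many raw bytes are consumed to deliver a given number of
translated characters; raw_bytes_for is an illustrative name, not a
library function.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative model: return how many raw bytes of file[0..file_len) are
 * consumed in order to deliver 'want' characters when every "\r\n" pair
 * is delivered as a single '\n'. */
size_t raw_bytes_for(const char *file, size_t file_len, size_t want)
{
    size_t raw = 0, got = 0;

    assert(file != NULL);
    while (got < want && raw < file_len) {
        if (file[raw] == '\r' && raw + 1 < file_len && file[raw + 1] == '\n')
            raw++;              /* the '\r' is consumed but not delivered */
        raw++;
        got++;
    }
    return raw;
}
```

For the 10-byte file above and a request of 5 characters it returns 6, the
offset of the '2', whereas offset 5 is the '1'.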

--
+-------------------------+--------------------+-----------------------+
| Kenneth J. Brody | www.hvcomputer.com | #include |
| kenbrody/at\spamcop.net | www.fptech.com | <std_disclaimer.h> |
+-------------------------+--------------------+-----------------------+
Don't e-mail me at: <mailto:[email protected]>
 
