Difference between binary file and ASCII file

vim

hello everybody
Plz tell the difference between binary file and ascii
file...............


Thanks in advance,
vim
 
Richard Heathfield

vim said:
> hello everybody
> Plz tell the difference between binary file and ascii
> file...............

Well, that's really the wrong question.

The right question is: "what is the difference between a stream opened in
binary mode, and a stream opened in text mode?"

Let's deal with the easy one first. When you associate a binary stream with
a file, the data flows in from the file, through the stream, unmodified
(or, if you're writing, it flows out, through the stream, to the file,
unmodified). It's just a raw stream of bytes, to do with as you will.

Okay, now the hard one. When you associate a /text/ stream with a file, you
are assuming the convention that the data comprises zero or more lines,
where each line is composed of 0 or more bytes followed by a newline marker
of some kind.

The newline marker defined by C is '\n'.

On Unix, this agrees well with the marker actually used by longstanding
convention on that system.

On CP/M and derivatives (such as Q-DOS and that point-and-click adventure
game it spawned), the marker is '\r' and '\n', in that order.

On the Mac, it's just plain '\r'.

On the mainframe - well, you /really/ don't want to know.

All this is a bit of a nuisance, and it would be nice if we didn't have to
bother with such niceties when processing plain ol' text. And so, when you
are reading from a text stream, the standard library performs any necessary
conversions on incoming data, to force-map the newline marker into a nice
simple '\n'. And when you are writing to the stream, the standard library
looks for '\n' characters and replaces them with the byte or bytes used for
marking newlines on the particular system on which the program is running.

So, when you are writing your code, you can just pretend that the newline
marker is '\n', and - to all intents and purposes - so it is! So you don't
have to mess about with detecting whether you're running on a Mac or a mini
or a mainframe - you can just assume a '\n' delimiter and let the standard
library worry about the underlying representation.
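
A minimal sketch of that round trip, assuming a writable working directory and a scratch file name chosen by the caller. The helpers text_round_trip() and raw_size() are hypothetical names for illustration, not Standard library functions. Text written through a text stream reads back unchanged through a text stream, while a binary stream shows however many bytes the host really stored.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write text to path through a text stream, then read it back through a
   text stream into out. Returns the number of characters read, or -1. */
long text_round_trip(const char *path, const char *text,
                     char *out, size_t outsize)
{
    FILE *fp;
    size_t len, n;

    assert(path != NULL && text != NULL && out != NULL && outsize > 0);

    len = strlen(text);
    fp = fopen(path, "w");              /* no 'b': text mode */
    if (fp == NULL)
        return -1;
    if (fwrite(text, 1, len, fp) != len) {
        fclose(fp);
        return -1;
    }
    if (fclose(fp) != 0)
        return -1;

    fp = fopen(path, "r");              /* text mode again */
    if (fp == NULL)
        return -1;
    n = fread(out, 1, outsize - 1, fp);
    out[n] = '\0';
    fclose(fp);
    return (long)n;
}

/* Count the bytes of the file as seen through a binary stream. */
long raw_size(const char *path)
{
    FILE *fp = fopen(path, "rb");       /* 'b': binary mode */
    long count = 0;

    if (fp == NULL)
        return -1;
    while (getc(fp) != EOF)
        count++;
    fclose(fp);
    return count;
}
```

On a system whose native marker is '\n', raw_size() reports the same count that text_round_trip() returned. Where the marker is '\r' followed by '\n', it reports one extra byte per line.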

If you don't /want/ the system to do this, open the file in binary mode. But
then managing the newline stuff all falls to you instead.
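
As a rough idea of what "managing the newline stuff" can involve, here is a sketch that folds the three markers mentioned above into '\n' inside a buffer already read through a binary stream. normalize_newlines() is a hypothetical helper, and it assumes the whole text is in memory.

```c
#include <assert.h>
#include <stddef.h>

/* Convert "\r\n" pairs and lone '\r' characters to '\n', in place.
   Returns the new length, which is never larger than len. */
size_t normalize_newlines(char *buf, size_t len)
{
    size_t in, out = 0;

    assert(buf != NULL || len == 0);

    for (in = 0; in < len; in++) {
        if (buf[in] == '\r') {
            /* Treat CR LF as one newline by skipping the LF. */
            if (in + 1 < len && buf[in + 1] == '\n')
                in++;
            buf[out++] = '\n';
        } else {
            buf[out++] = buf[in];
        }
    }
    return out;
}
```

The result is not null-terminated by the function; the caller uses the returned length.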
 
Parahat Melayev

vim said:
> hello everybody
> Plz tell the difference between binary file and ascii

Mate, you are in big trouble now. You just used 'silly' 'Plz'.
And this question is off-topic for someone.

If we come to answer. Every file is binary but when you open ASCII
file, you will see bytes representing elements of ASCII set. Every
element of ASCII set can be stored in char in C programming language.
And ASCII char is one byte and one byte is eight bits. And one bit only
can be 0 or 1. You will find more if you google for "binary
arithmetic".
 
Richard Heathfield

Parahat Melayev said:
> Mate, you are in big trouble now.

No, he isn't.

> You just used 'silly' 'Plz'.

Yes, he did.

> And this question is off-topic for someone.

No, it isn't. But in some ways, your answer is.

> If we come to answer. Every file is binary

The C Standard does not guarantee this.

> but when you open ASCII file,

The C Standard does not specify the concept "ASCII file".

> you will see bytes representing elements of ASCII set. Every
> element of ASCII set can be stored in char in C programming language.
> And ASCII char is one byte and one byte is eight bits.

C does not specify that one byte is eight bits wide - although it does
specify that one byte is *at least* eight bits wide.
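
To check what a particular implementation uses, <limits.h> provides CHAR_BIT. A small sketch follows; bits_in() and byte_is_octet() are made-up helper names for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Number of bits in an object that occupies size bytes here. */
size_t bits_in(size_t size)
{
    assert(CHAR_BIT >= 8);      /* the minimum the C Standard allows */
    return size * CHAR_BIT;
}

/* Returns 1 if a byte on this implementation is exactly 8 bits wide. */
int byte_is_octet(void)
{
    return CHAR_BIT == 8;
}
```

Code that calls bits_in(sizeof(int)) instead of hard-coding a multiple of 8 keeps working on implementations with wider bytes.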
 
santosh

Parahat said:
> Mate, you are in big trouble now. You just used 'silly' 'Plz'.

It *is* a silly abbreviation. While nobody is going to be in big
trouble, consistent use of such incomprehensible English will simply
result in the poster being ignored.

> And this question is off-topic for someone.

Who said it was off-topic? Certainly it is perfectly within topic as far
as I'm aware.

> And ASCII char is one byte and one byte is eight bits.

A C byte is not always eight bits in size. Though 8-bit bytes are
common on PCs, other sizes are possible, and indeed are prevalent, on
Mainframes, DSPs etc. A char is garunteed by the standard to be at least
8 bits but it could be more.

> You will find more if you google for "binary arithmetic".

What has binary arithmetic got to do with C streams?
 
Mike S

Richard said:
> [When] you
> are reading from a text stream, the standard library performs any necessary
> conversions on incoming data, to force-map the newline marker into a nice
> simple '\n'. And when you are writing to the stream, the standard library
> looks for '\n' characters and replaces them with the byte or bytes used for
> marking newlines on the particular system on which the program is running.

From this explanation, I am under the impression that stdin is
therefore opened in binary mode, since I find I have to explicitly deal
with '\r's to ensure that redirected input from text files works. For
example, I once wrote an rtrim function to remove trailing whitespace
from an input line assumed to have come from stdin (being part of a K&R
exercise, I didn't give myself the luxury of using things like
isspace(); the code quality is in any event not the focus of my
curiosity :)

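#include <stdio.h>

/* rtrim: removes trailing whitespace from s (len = number of chars
   before the terminating '\0'); re-attaches '\n' if necessary */
void rtrim(char s[], int len)
{
    int i, newline;

    newline = 0;
    i = len - 1;
    /* walk back over trailing whitespace without running past s[0] */
    while (i >= 0 && (s[i] == '\t' || s[i] == ' ' || s[i] == '\n' ||
                      s[i] == '\r')) {
        if (s[i] == '\n')
            newline = 1;
        --i;
    }

    /* i is the index of the last kept char, or -1 for a blank line;
       i >= 0 so that one-character lines keep their newline too */
    if (newline && i >= 0)
        s[++i] = '\n';
    s[++i] = '\0';
}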

I added the check for '\r' as an afterthought in order to make the code
more portable (since it broke under Windows) - the input line I was
passing to rtrim was being read from stdin using getchar() and was
simply a line terminated by a sole '\n' character. The '\r's of course
cropped up when I redirected Windows text files to stdin.

So again, it seems stdin is opened in binary mode, and not text mode,
since the newlines don't get converted to a single '\n'. Does the
Standard make any statement about the default mode stdin opens in (and
for that matter, stdout and stderr), and is it possible or worthwhile
to explicitly put stdin into text mode if you know that you are going
to deal with text input exclusively?

Or it may well be the case I am missing something more fundamental
here....

Mike S
 
Richard Heathfield

Mike S said:
> Richard said:
>> [When] you
>> are reading from a text stream, the standard library performs any
>> necessary conversions on incoming data, to force-map the newline marker
>> into a nice simple '\n'. And when you are writing to the stream, the
>> standard library looks for '\n' characters and replaces them with the
>> byte or bytes used for marking newlines on the particular system on which
>> the program is running.
>
> From this explanation, I am under the impression that stdin is
> therefore opened in binary mode, since I find I have to explicitly deal
> with '\r's to ensure that redirected input from text files works.

No, that almost certainly means simply that you've got a Windows text file
on a Linux system. Linux doesn't know, bless it, that you've been sullying
its filesystem with foreign muck. :)
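
On the Standard's side of the question: it describes stdin, stdout and stderr as text streams, so the translation applied is the host's own, and a '\r' from a foreign file arrives as ordinary data. If such files are expected, one option is to strip the terminator yourself after reading a line with fgets(). The sketch below assumes a null-terminated line; chomp() is a hypothetical helper name, not a library function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Remove one trailing line terminator ("\n", "\r\n" or "\r") from s.
   Returns the new length of s. */
size_t chomp(char *s)
{
    size_t len;

    assert(s != NULL);

    len = strlen(s);
    if (len > 0 && s[len - 1] == '\n')
        len--;
    if (len > 0 && s[len - 1] == '\r')
        len--;
    s[len] = '\0';
    return len;
}
```

Unlike rtrim, this leaves trailing spaces and tabs alone and only deals with the line ending.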
 
Alan

santosh said:
> It *is* a silly abbreviation. While nobody is going to be in big
> trouble, consistent use of such incomprehensible English will simply
> result in the poster being ignored.
>
> Who said it was off-topic? Certainly it is perfectly within topic as far
> as I'm aware.

This newsgroup is for C language related questions. The OP is asking a more
generalized question.

The OP may find help here -
http://en.wikipedia.org/wiki/Binary_arithmetic

> A C byte is not always eight bits in size. Though 8-bit bytes are
> common on PCs, other sizes are possible, and indeed are prevalent, on
> Mainframes, DSPs etc.

A byte is always 8 bits by definition!!! On older CDC computers, for
example, there was a "character" of 6 bits but it was never referred to as
a "byte".

> A char is garunteed by the standard to be at least
> 8 bits but it could be more.

A character is not some arbitrary size. A character is either one Byte (ie
8 bits) or in the case of Unicode it is two Bytes (ie 16 bits). BTW - the
word is "guaranteed".

> What has binary arithmetic got to do with C streams?

Don't you know??


Alan
 
Richard Heathfield

Alan said:
> This newsgroup is for C language related questions. The OP is asking a
> more generalized question.

No, he was asking for an explanation of binary mode and text mode in C
streams.

> The OP may find help here -
> http://en.wikipedia.org/wiki/Binary_arithmetic

Unlikely, since he was not asking about binary arithmetic.

> A byte is always 8 bits by definition!!!

Not true in C. I've used a system with 32-bit bytes, and I'm by no means the
only one here who has done so.

> A character is not some arbitrary size.

It is exactly CHAR_BIT bits wide, and CHAR_BIT is at least 8 but can be
more.

> A character is either one Byte (ie
> 8 bits) or in the case of Unicode it is two Bytes (ie 16 bits).

No, a character is always exactly one byte in size. If a Unicode glyph
representation won't fit in a single byte, then it won't fit in a character
either, but will have to make do with a "wide character".
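
A sketch of that distinction, assuming nothing beyond <wchar.h>: sizeof(char) is 1 by definition, while a wide character occupies sizeof(wchar_t) bytes, a size the implementation chooses. wide_storage() is a hypothetical helper for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Bytes needed to store a wide string, including its terminator. */
size_t wide_storage(const wchar_t *ws)
{
    assert(ws != NULL);
    return (wcslen(ws) + 1) * sizeof(wchar_t);
}
```

For an ordinary string the equivalent figure is simply strlen(s) + 1, because each char is one byte.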
> BTW - the word is "guaranteed".

Careful - those who live by the spelling flame will die by the spelling
flame.

> Don't you know??

No. Please explain the connection.
 
Alan

Richard said:
> Parahat Melayev said:
>
> No, he isn't.
>
> Yes, he did.
>
> No, it isn't. But in some ways, your answer is.
>
> The C Standard does not guarantee this.
>
> The C Standard does not specify the concept "ASCII file".
>
> C does not specify that one byte is eight bits wide - although it does
> specify that one byte is *at least* eight bits wide.

I think you mean character not byte

"Byte: a group of eight binary digits, often used to represent one
character". The Concise Oxford Dictionary.

"Byte: a group of eight binary digits processed as a unit by a computer and
used to represent an alphanumeric character". Merriam-Webster Dictionary.

We have to use some standard definition of words otherwise we will fall into
a morass of misunderstanding.

Alan
 
Richard Heathfield

Alan said:
> I think you mean character not byte

Well, you're wrong. I meant byte.

> "Byte: a group of eight binary digits, often used to represent one
> character". The Concise Oxford Dictionary.

"byte: addressable unit of data storage large enough to hold any member of
the basic character set of the execution environment.
Note 1: It is possible to express the address of each individual byte of an
object uniquely.
Note 2: A byte is composed of a contiguous sequence of bits, the number of
which is implementation-defined." - ISO/IEC 9899:1999

Authoritative technical definitions trump dictionary definitions.

> We have to use some standard definition of words otherwise we will fall
> into a morass of misunderstanding.

That's why we have an International C Standard, which defines "byte" very
precisely.
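
Note 1 in that definition is what lets C code walk over any object one byte at a time through an unsigned char pointer. A rough sketch, where count_nonzero_bytes() is a hypothetical helper invented for this example:

```c
#include <assert.h>
#include <stddef.h>

/* Count the bytes of an object that hold a non-zero value, visiting
   each byte of the object through its own address. */
size_t count_nonzero_bytes(const void *obj, size_t size)
{
    const unsigned char *p = obj;
    size_t i, count = 0;

    assert(obj != NULL || size == 0);

    for (i = 0; i < size; i++) {
        if (p[i] != 0)
            count++;
    }
    return count;
}
```

Calling it as count_nonzero_bytes(&x, sizeof x) works for any object x, since sizeof is measured in bytes whatever their width.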
 
jjf

Alan said:
> I think you mean character not byte

You think incorrectly.

> "Byte: a group of eight binary digits, often used to represent one
> character". The Concise Oxford Dictionary.
>
> "Byte: a group of eight binary digits processed as a unit by a computer and
> used to represent an alphanumeric character". Merriam-Webster Dictionary.

It's surprising that these two authorities should both be wrong, but
they are. A byte is a group of bits, the number defined by the context
in which the term is used. That's why the International Standardization
groups had to invent the word 'octet' to mean a byte of 8 bits.

> We have to use some standard definition of words otherwise we will fall into
> a morass of misunderstanding.

Indeed. In the C context that standard definition is provided by the C
Standard. A byte is defined to be the same size as a char, which is an
implementation-defined size greater than 7 bits.
 
jjf

Alan said:
> A byte is always 8 bits by definition!!! On older CDC computers, for
> example, there was a "character" of 6 bits but it was never referred to as
> a "byte".

Nonsense. A byte is a group of bits of a size defined by its context.
I've worked on systems which had bytes of 6 bits. That couldn't be used
as a byte in C of course.

> A character is not some arbitrary size. A character is either one Byte (ie
> 8 bits) or in the case of Unicode it is two Bytes (ie 16 bits). BTW - the
> word is "guaranteed".

The size of a character is defined by its character set definition.
ASCII characters are 7 bits; 8859-1 characters are 8 bits; Unicode
characters are 21 bits (assuming you use the single-word
representation).

> Don't you know??

I don't; an explanation would be welcome.
 
Keith Thompson

Richard Heathfield said:
> Alan said:
>
> No, he was asking for an explanation of binary mode and text mode in C
> streams.

Perhaps, but he didn't say so. The original question was:

| hello everybody
| Plz tell the difference between binary file and ascii
| file...............

I don't think we can necessarily assume he was asking about C streams
(though he should have been, given that he posted the question here).
 
Flash Gordon

Alan wrote:

> Indeed. In the C context that standard definition is provided by the C
> Standard. A byte is defined to be the same size as a char, which is an
> implementation-defined size greater than 7 bits.

Greater than 8 bits. Of course, as 8 is greater than 7 it is also
greater than 7 bits ;-)
--
Flash Gordon, living in interesting times.
Web site - http://home.flash-gordon.me.uk/
comp.lang.c posting guidelines and intro:
http://clc-wiki.net/wiki/Intro_to_clc

 
Richard Heathfield

Flash Gordon said:
> Greater than 8 bits.

No, greater than 7 bits. It is legal for CHAR_BIT to be 8, as you really
ought to know. Please make sure your "corrections" are correct before
posting.
 
Chris Hills

Richard Heathfield said:
> Parahat Melayev said:
>
> No, he isn't.
>
> Yes, he did.
>
> No, it isn't. But in some ways, your answer is.
>
> The C Standard does not guarantee this.
>
> The C Standard does not specify the concept "ASCII file".
>
> C does not specify that one byte is eight bits wide - although it does
> specify that one byte is *at least* eight bits wide.

Even though in some architectures it isn't 8 bits (or 7 for that matter),
though fortunately that is more historical than current.
 
