Reading LAST line from text file without iterating through the file?

Arne Vajhøj

Funny that something so "completely different" intersects with ASCII in
the entirety of ASCII's range (0-127). It just specifies what 128-255
mean instead of leaving those values undefined. Unicode specifies what
128-65535 mean and still intersects with ASCII on 0-127.

Occasionally backwards compatibility is a design goal.

If you knew about programming, then you would have
seen that before.

Arne
 
Arne Vajhøj

Funny then that bog-standard ASCII files seem to read and write just fine
in Notepad on the occasions that I use Windows computers.

That just means that it uses something ASCII compatible - not that
it uses ASCII.

And you can easily verify that it indeed supports characters
not part of ASCII.
All of those seem to be ASCII plus another up to 128 characters, or in
the case of UTF-16, another up to 65408 characters.

Saying that a 7-bit-clean file interpreted in one of those is not ASCII
is like saying that humans are not mammals.

And?

No one is saying that such a file is not ASCII.

We are saying that the system is not reading ASCII. It reads
a character set that is backwards compatible with ASCII.

Arne

PS: UTF-16 is *not* ASCII compatible.
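
To make the compatibility point concrete, here is a minimal Java sketch. The class name AsciiCompatDemo, its helper methods and the sample strings are made up for illustration. It checks that 7-bit-clean text encodes to the same bytes in US-ASCII, ISO-8859-1 and UTF-8, while UTF-16 gives different bytes for the same text.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class AsciiCompatDemo {

    // True if every byte is in the ASCII range 0-127 (high bit clear).
    static boolean isSevenBitClean(byte[] data) {
        for (byte b : data) {
            if (b < 0) { // Java bytes are signed, so 128-255 appear as negative values
                return false;
            }
        }
        return true;
    }

    // True if the text encodes to exactly the same bytes in both charsets.
    static boolean sameBytes(String text, Charset a, Charset b) {
        return Arrays.equals(text.getBytes(a), text.getBytes(b));
    }

    public static void main(String[] args) {
        String plain = "Hello, world\n"; // illustrative 7-bit-clean sample

        System.out.println("7-bit clean:          "
                + isSevenBitClean(plain.getBytes(StandardCharsets.US_ASCII)));
        System.out.println("ASCII == ISO-8859-1:  "
                + sameBytes(plain, StandardCharsets.US_ASCII, StandardCharsets.ISO_8859_1));
        System.out.println("ASCII == UTF-8:       "
                + sameBytes(plain, StandardCharsets.US_ASCII, StandardCharsets.UTF_8));
        // UTF_16BE is used so that no byte order mark is written; the bytes still differ,
        // because every character takes at least two bytes.
        System.out.println("ASCII == UTF-16BE:    "
                + sameBytes(plain, StandardCharsets.US_ASCII, StandardCharsets.UTF_16BE));

        // As soon as a non-ASCII character appears, the compatible encodings diverge too.
        String name = "Vajh\u00f8j";
        System.out.println("ISO-8859-1 == UTF-8 for non-ASCII text: "
                + sameBytes(name, StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8));
    }
}
```

The check says nothing about which encoding a program actually used to read the file, only that the bytes happen to be valid in several of them.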
 
Arne Vajhøj

have no bearing on this discussion, which has to do with the majority of
*computers* and, secondarily, what will be encountered routinely by the
majority of *IT workers*.

Well the topic was market share.

Market share is counted in dollars.

And since somebody is willing to pay a lot more for a mainframe
running an entire bank than for a box that lets somebody read email,
counting computers does not really reflect market share.

Arne
 
Arne Vajhøj

Your employer may happen to be using such legacy systems, but I very much
doubt that very many people deal with them in an IT capacity. Far, *far*
fewer than deal with Unix, Windows, and Mac boxes in such a capacity.

How many end-users interact indirectly with these systems is of course
irrelevant.

Not really - the high number of end users means that the company
is willing to pay a lot of money for those systems, which impacts
the market share.

Arne
 
Arne Vajhøj

Those aren't text files. Text is, notionally, a string of characters,
including perhaps spaces and line-end characters. A text file is
therefore a file whose content is a string of characters, including
perhaps spaces and line-end characters. Such a thing is, logically, the
only native way to represent raw text. Anything more structured is
obviously not a plain text file. It may be a text-containing file of some
kind but it is not a text file.

A text file is something you read and write as lines of text.

Whether the system uses LF delimiters or CR LF delimiters or
a counted approach does not matter.
That's nonsense. The only character a normal text file cannot have in
lines is a line break, and in actual fact you cannot have a line break in
the middle of a line *by definition*. Wherever there is a line break one
line ENDS and another one BEGINS, *by definition*. If that weren't the
case then it wouldn't be a line break!

No.

Newline is code 10 in many character sets.

It is perfectly valid as content in the middle of a line on
older MacOS systems (because they use another line delimiter)
and on all systems using count prefixes (no line delimiter at
all).
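
A small Java sketch of the count-prefix idea may help here. The layout is purely hypothetical (a 2-byte big-endian length followed by that many bytes per line) and is not the on-disk format of any particular operating system; the class and method names are likewise invented. It shows that with a count prefix, a byte with code 10 can sit in the middle of a line without ending it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class CountPrefixedLines {

    // Hypothetical layout: [2-byte length][bytes of the line], repeated. No delimiter at all.
    static byte[] write(List<String> lines) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buffer);
            for (String line : lines) {
                byte[] bytes = line.getBytes(StandardCharsets.ISO_8859_1);
                if (bytes.length > 0xFFFF) {
                    throw new IllegalArgumentException("line too long for a 2-byte count");
                }
                out.writeShort(bytes.length); // the count prefix
                out.write(bytes);             // the line content, any byte values allowed
            }
            out.flush();
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads the lines back by following the counts; content bytes are never inspected.
    static List<String> read(byte[] data) {
        List<String> lines = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            while (in.available() > 0) {
                byte[] bytes = new byte[in.readUnsignedShort()];
                in.readFully(bytes);
                lines.add(new String(bytes, StandardCharsets.ISO_8859_1));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("first", "code 10 \n inside a line", "last");
        List<String> roundTrip = read(write(lines));
        System.out.println(roundTrip.size() + " lines, unchanged: " + roundTrip.equals(lines));
    }
}
```

Note that scanning such a file backward for a byte with value 10 would split the second line in the wrong place, and that a count-only layout like this one cannot be walked backward at all without extra information.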
So there is no "advantage" here. What you are actually describing is a
"list-of-strings" file, not a text file

A text file is a list of strings.
Face it: those record-oriented file formats are not text files. They have
additional structure that cannot be represented natively in a String,

Neither can delimited files
therefore represent more than just a String, such as a collection of
Strings, and therefore are not text files but something else -- archives
of multiple text files bundled into single files.

The main use for such a thing over plain ordinary text files that I can
think of is storing a mailbox without resorting to hacks that behave
oddly when lines in the bodies start with the word "from". And these days
filesystems work fine with large directories full of tiny files, so
there's less need for that sort of thing than there once was.


Nobody is assuming records use delimiters. They are assuming text files
are text files. The lines in text files use delimiters as an inherent
property.

No.

That is an illusion that you seem to have.
If you have a text in a String, seeking backward from the end
until a newline character (or the beginning of the String, whichever you
hit first) will reliably find the start of the last line in the String.

No.

It will not work on systems that use CR as the line delimiter or on
systems using count-prefixed lines.
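
For reference, the backward-seeking approach under discussion can be sketched in Java roughly as follows. This is only an illustration under stated assumptions, not a general solution: it assumes LF or CR LF line ends and an encoding in which the byte 0x0A only ever means LF (ASCII, ISO-8859-x, UTF-8). The class name LastLineReader is made up. It deliberately does not handle CR-only files, UTF-16, or count-prefixed files, which is exactly the limitation pointed out above.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class LastLineReader {

    // Returns the last line of the file without reading the lines before it.
    // Assumption: lines end in LF or CR LF, and 0x0A never occurs inside a character.
    static String lastLine(String path, Charset charset) throws IOException {
        try (RandomAccessFile file = new RandomAccessFile(path, "r")) {
            long end = file.length();

            // Skip one trailing line terminator, if the file ends with one.
            if (end > 0 && byteAt(file, end - 1) == '\n') {
                end--;
                if (end > 0 && byteAt(file, end - 1) == '\r') {
                    end--;
                }
            }

            // Walk backward until the previous LF or the start of the file.
            // (Byte-at-a-time seeking keeps the sketch short; reading in blocks is faster.)
            long start = end;
            while (start > 0 && byteAt(file, start - 1) != '\n') {
                start--;
            }

            byte[] bytes = new byte[(int) (end - start)]; // assumes the line fits in an array
            file.seek(start);
            file.readFully(bytes);
            return new String(bytes, charset);
        }
    }

    private static int byteAt(RandomAccessFile file, long position) throws IOException {
        file.seek(position);
        return file.read();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.out.println("usage: java LastLineReader <file>");
            return;
        }
        System.out.println(lastLine(args[0], StandardCharsets.UTF_8));
    }
}
```

On a CR-delimited file this returns the entire file content as one "line", which illustrates why the technique depends on the platform's text file format rather than on the text itself.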
The same is true of any disk file format that faithfully represents the
String as a flat string of text rather, and in particular of the formats
commonly used to store, e.g., C source files.

Wrong.

C source files are stored in count-prefixed line format on systems
that use such a format.

Arne
 
Arne Vajhøj

Who says "all the world's ASCII", Sosman? I can't recall anybody doing so
in this group recently.

It is true that almost all the world seems to use encodings that contain
ASCII as a subset. That is not quite the same thing.

Somebody with the name of Ken Wesson wrote:

# Since those days, the world has standardized on ASCII flat files for
# text files.

# Windows text files are flat ASCII files (with CRLF line ends). Mac text
# files are flat ASCII files (with CR line ends). Unix text files are flat
# ASCII files (with LF line ends).

Arne
 
Daniele Futtorovic

Alleged by whom? That distorted quote is most certainly not what I wrote.

Alleged by my Usenet provider.

I was trying to extract the wisdom in your postings. Give me some credit
here. That quote is most certainly what you (pertinently) wrote, minus
the fluff.

And please, I beg of you sincerely and benevolently, stop acting like
such a loonie.
 
Daniele Futtorovic

This is an IT group.

Not a group for hairdressers or chefs.

This means that we use exact terms.

Dear Mr. Vajhøj ,

We'll see you in court.

Yours frivolously,

D. Futtorovic
Chief Representative of the Local (Pubic) Hairdresser's Union
 
Lew

Alleged by my Usenet provider.

I was trying to extract the wisdom in your postings. Give me some credit
here. That quote is most certainly what you (pertinently) wrote, minus
the fluff.

And please, I beg of you sincerely and benevolently, stop acting like
such a loonie.

"Your grand-daddy's ASCII" is exactly today's ASCII. Ergo, "It's not
your grand-daddy's ASCII" is exactly "It's not ASCII".
 
Jim Janney

Arne Vajhøj said:
Not really - the high number of end users means that the company
is willing to pay a lot of money for those systems, which impacts
the market share.

Indeed. And let's not forget where a lot of Eclipse funding comes from.
 
Tom Anderson

Dear Mr. Vajhøj ,

We'll see you in court.

Yours frivolously,

D. Futtorovic
Chief Representative of the Local (Pubic) Hairdresser's Union

My members wish to join you in this suit.

t. Anderson
Chair of Delegates, cljp Local (Pubic) Chef's Union

--
Formal logical proofs, and therefore programs - formal logical proofs
that particular computations are possible, expressed in a formal system
called a programming language - are utterly meaningless. To write a
computer program you have to come to terms with this, to accept that
whatever you might want the program to mean, the machine will blindly
follow its meaningless rules and come to some meaningless conclusion. --
Dehnadi and Bornat
 
Arne Vajhøj

My members wish to join you in this suit.

t. Anderson
Chair of Delegates, cljp Local (Pubic) Chef's Union

I think I have a problem.

Microwave food and DIY hair cut for the rest of my life ...

:)

Arne
 
Tom Anderson

"Your grand-daddy's ASCII" is exactly today's ASCII.

My grandfather's ASCII would probably have been ASCII-1963, which is not
today's ASCII.

Actually, my grandfather's ASCII would probably have been ITA2, IYSWIM.

tom

 
Tom Anderson

Good question.

I don't know if they sell or lease them out.

IBM deliver boxes to customers and get a ton of money in return.

Well, that's them, FedEx, and Emperors Club then.

tom

 
Daniele Futtorovic

"Your grand-daddy's ASCII" is exactly today's ASCII. Ergo, "It's not
your grand-daddy's ASCII" is exactly "It's not ASCII".

Precisely. ASCII is not a /technique/, it's a _standard_.

It's /also/ a technique, of course, but only secondarily so: a technique
that derives from a standard.
 
Tom Anderson

Given that:
data + LF
data + CR + LF
are also record formats then that is nonsense.

The thing about CR and LF is that lineprinters, and things which are
pretending to be lineprinters, like terminal emulators and text editors,
know how to deal with them; they write the next character lower down
and/or at the start of the line. They aren't record separators, they're
format effectors (ASCII does have record separators - an impressive range
of them, in fact - but i don't know of anybody using them).
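
For anyone who has not met them, the ASCII separators mentioned above are the control codes FS (28), GS (29), RS (30) and US (31). A rough Java sketch of what using two of them could look like follows; the class AsciiSeparators and its encode/decode methods are invented for illustration and do not correspond to any standard API or file format.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

final class AsciiSeparators {

    static final char RECORD_SEPARATOR = 0x1E; // ASCII RS, code 30
    static final char UNIT_SEPARATOR = 0x1F;   // ASCII US, code 31

    // Joins fields with US and records with RS. CR and LF stay free to appear as content.
    static String encode(List<List<String>> records) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < records.size(); i++) {
            if (i > 0) {
                out.append(RECORD_SEPARATOR);
            }
            out.append(String.join(String.valueOf(UNIT_SEPARATOR), records.get(i)));
        }
        return out.toString();
    }

    // Splits on RS, then on US. The -1 limit keeps trailing empty fields.
    static List<List<String>> decode(String data) {
        List<List<String>> records = new ArrayList<>();
        if (data.isEmpty()) {
            return records; // simplification: empty input is treated as "no records"
        }
        String recordPattern = Pattern.quote(String.valueOf(RECORD_SEPARATOR));
        String unitPattern = Pattern.quote(String.valueOf(UNIT_SEPARATOR));
        for (String record : data.split(recordPattern, -1)) {
            records.add(Arrays.asList(record.split(unitPattern, -1)));
        }
        return records;
    }

    public static void main(String[] args) {
        List<List<String>> records = Arrays.asList(
                Arrays.asList("Foobar.java", "public class Foobar\n{\n}"),
                Arrays.asList("empty.txt", ""));
        System.out.println(decode(encode(records)).equals(records));
    }
}
```

The catch, as the post says, is that printers, terminals and editors treat CR and LF as format effectors and generally do nothing useful with RS or US, so a file written this way would not display as lines.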

What happens if you send one of these alleged text files from a mainframe
to a printer or a shell? Do the printers and shells in mainframe land
handle those formats, or does there have to be a program that reads the
format and then talks to the printer? Or does that all happen down in the
OS? How does the lineprinter know to move the golf ball across the paper
when it gets to the end of a record?

tom

 
Tom Anderson

Of course they are text files.

If I edit Foobar.java in a text editor and write a Java program and
save it, then why should it be less of a text file, because the record
format used on that system is not delimited?

If i edit Foobar.java in Google Docs and write a Java program and save it,
then why should it be less of a text file, because it's stored in some
mysterious cloud database?

Or how about:

$ dbm Foobar.java init
$ dbm Foobar.java set 1 "public class Foobar"
$ dbm Foobar.java set 2 "{"
$ dbm Foobar.java set 3 "}"

?

That a file has text somewhere in it does not make it a text file.

tom
 
Tom Anderson

And with what do you support your claim for this definition of "text
file"?

I hope it's something more solid than KW's flailing appeals to "notion"
and the like, which are unsupported by contemporary or historical uses
of the term "text", in the computing disciplines or more broadly. Have
you something better to offer?

Merely my observations of the usage of the term by people.

tom
 
Arne Vajhøj

If i edit Foobar.java in Google Docs and write a Java program and save
it, then why should it be less of a text file, because it's stored in
some mysterious cloud database?

Or how about:

$ dbm Foobar.java init
$ dbm Foobar.java set 1 "public class Foobar"
$ dbm Foobar.java set 2 "{"
$ dbm Foobar.java set 3 "}"

?

That a file has text somewhere in it does not make it a text file.

Well - the fact that:
- the Java compiler reads Java source in that format
- the C compiler reads C source in that format
- Java BufferedReader/FileReader readLine can read the files
- C fopen with t and fgets can read the files

seems to distinguish it a lot from what you mention.
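
As a small illustration of the readLine point, here is a hedged Java sketch (the class name ReadLineDemo is invented). BufferedReader.readLine is documented to treat LF, CR, and CR followed by LF all as line terminators, so the calling code sees lines rather than delimiters. How a given Java runtime maps record-oriented files onto this is platform specific and not shown here.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

final class ReadLineDemo {

    // Returns the last line the Reader yields, or null if it yields none.
    // This iterates over every line; it is the portable way, not the fast way.
    static String lastLine(Reader source) {
        try (BufferedReader reader = new BufferedReader(source)) {
            String last = null;
            String line;
            while ((line = reader.readLine()) != null) {
                last = line; // readLine has already stripped the terminator
            }
            return last;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The same three lines with Unix, old Mac and Windows line ends.
        System.out.println(lastLine(new StringReader("one\ntwo\nthree\n")));
        System.out.println(lastLine(new StringReader("one\rtwo\rthree\r")));
        System.out.println(lastLine(new StringReader("one\r\ntwo\r\nthree\r\n")));
    }
}
```

The trade-off against seeking backward from the end of the file is clear: this version reads the whole input, but it does not need to know how lines are stored.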

Arne

 
