Reading LAST line from text file without iterating through the file?

Arne Vajhøj

Ken Wesson wrote:
On Thu, 24 Feb 2011 21:23:34 +0800, Peter Duniho wrote:

On 2/24/11 9:06 PM, Ken Wesson wrote:
[...]
Obsolete systems do not interest me.

then…

Since those days, the world has standardized on ASCII flat files
for text files.

LOL!

Windows text files are flat ASCII files (with CRLF line ends).

No.

They are CP-1252, UTF-8 or UTF-16.

All of which are ASCII++, for all intents and purposes.
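For readers who want to check how far the "ASCII++" description holds at the byte level, here is a minimal sketch. The class and method names (NewlineBytes, countLfBytes) are made up for this illustration and do not come from anyone in the thread. It counts how many bytes equal to 0x0A appear in the encoded form of a string.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only.
final class NewlineBytes {

    /** Counts the bytes equal to 0x0A (ASCII LF) in the encoded form of text. */
    static int countLfBytes(String text, Charset charset) {
        int count = 0;
        for (byte b : text.getBytes(charset)) {
            if (b == 0x0A) {
                count++;
            }
        }
        return count;
    }

    /** Convenience wrapper: the same count for UTF-16 big-endian without a BOM. */
    static int countLfBytesUtf16Be(String text) {
        return countLfBytes(text, StandardCharsets.UTF_16BE);
    }
}
```

With the single character U+010A (a capital C with a dot above), which contains no line break at all, countLfBytes returns 1 for StandardCharsets.UTF_16BE because the character is encoded as the bytes 01 0A, and 0 for StandardCharsets.UTF_8 because the encoding there is C4 8A. So a byte-level scan for 0x0A is safe in UTF-8 and in single-byte ASCII supersets, but it can misfire on UTF-16 data.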

This is an IT group.

Not a group for hairdressers or chefs.

Evasive change of subject noted.
The PC/mainframe ratio is probably like 100000:1.

Hence why I said *at least*. I was being conservative in my estimates --
as generous to *your* case as possible. And still I was demolishing it.

Not at all.

Because market share is counted in dollars.
One computer is still one computer, no matter how expensive it is. It's
the price tag whose relevance is not that big.

I have news for you: money is always important.
Funnily enough, all of them can cope just fine with ASCII text files. I
wonder how that can be, Arne, unless of course you're wrong yet again.

It is called "backwards compatibility".

Try googling the term.

Oops - sorry - I know you are too lazy to do that.

You will have to trust me that such a term exists.

Arne
 
Arne Vajhøj

The topic was as I stated above.

No.

You wrote:

# And that exhausts 99.99% of the
# operating system market share right there

Note the word "market share".
The original debate arose from your worry that if the thread's OP counted
backward from the end of a file to the final newline character in it,
this would break on a handful of oddball systems.

What does your hand look like?

There are probably something like 100000 of these systems
in production.

Arne
 
Arne Vajhøj

Yes, really.

Let us recall the context in which this silly argument blew up, shall we?

Someone asked for an efficient way to get at the last line of a text
file. *Several* people suggested seeking backwards from the end to find a
newline character. (Interestingly, only one of those people has since
been subjected to flamage for this suggestion. I wonder why?)
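For reference, the backward-seeking idea could look roughly like the sketch below. It is not code posted by anyone in the thread. The names LastLineReader and lastLine are invented here, and the sketch assumes a byte-stream file in an encoding where the byte 0x0A only ever means a line feed (ASCII, CP-1252, UTF-8). It does not handle UTF-16 or record-oriented file systems, which is exactly the portability caveat being argued about.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.Charset;
import java.nio.file.Path;

// Hypothetical helper for illustration only.
final class LastLineReader {

    /** Returns the last line of the file without its terminator, or "" for an empty file. */
    static String lastLine(Path file, Charset charset) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            long end = raf.length();

            // Skip one trailing terminator ("\n" or "\r\n") so it is not
            // mistaken for the boundary in front of the last line.
            if (end > 0 && byteAt(raf, end - 1) == '\n') {
                end--;
                if (end > 0 && byteAt(raf, end - 1) == '\r') {
                    end--;
                }
            }

            // Walk backwards until the previous LF or the start of the file.
            long start = end;
            while (start > 0 && byteAt(raf, start - 1) != '\n') {
                start--;
            }

            // Assumes the last line fits in a single byte array.
            byte[] buf = new byte[(int) (end - start)];
            raf.seek(start);
            raf.readFully(buf);
            return new String(buf, charset);
        }
    }

    private static int byteAt(RandomAccessFile raf, long pos) throws IOException {
        raf.seek(pos);
        return raf.read();
    }
}
```

Reading one byte per seek keeps the logic easy to follow but is slow on long lines; reading a block at a time is the usual refinement.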

Actually no one was flamed for suggesting using that trick.

You got flamed because:
- you do not understand the concept of portability
- you do not understand the IT market
which resulted in some weird statements from you.
You and some others pooh-poohed that suggestion because it might not work
properly on fewer than 0.01% (you've since admitted fewer than 0.001%) of
computers, with the implied grounds that any number at all above zero is
unacceptable.

I don't even think anyone claimed it was unacceptable.

The point was that it was a solution that was not guaranteed
to work on all platforms.
I've got news for you. That suggestion also won't work on any system
without a JVM.

It won't work on a 2MB RAM 286 too cramped for the program to run on.

It won't work on an Xbox 360 or any other system that won't run unsigned
binaries and whose signing authority won't sign this program.

And so on.

NO program will work on *all computers*, Arne. So that goal is
unattainable.

That is not so relevant.

A Java program will not work on a system with no Java.

Everyone should know that.

That is not an excuse for writing non portable Java code
without even noting the portability problem.
Hardly any users, if any at all, of the OP's program would be using it on
a weird machine like those 0.001% you say *a subset* of which don't store
text normally. (Apparently having no true text files at all!)

They have text files.

Just different physical storage than those computers you have
experience with.
I'd expect those few users will expect their quirky computers to not
accept software that works nearly everywhere else, and accept that, or
else they would have gotten a less quirky computer instead.

They will expect that Java code is written in a portable way.

Portability was one of the main design goals of Java.
And in actual fact the OP's user base almost certainly consists of Unix
sysadmins who want to view the last entry or last few entries of a
ginormous log file without difficulty, in which case the OP could
probably get away even with hardcoding \u000A as the line-end character
(though I wouldn't recommend they actually do so).

*That* is what's relevant here.

No.

What is relevant is that suggesting such a thing without any notes
about portability, based on guesses about the user's context,
is pretty bad programming.

Arne
 
Lew

Daniel said:
You can do something a little better than seeking backwards. You can make some
guesses about line length. If it is a typical text file, you can guess that
the length of that line is < 1024 (for instance). Seek to that location before
the end of the file and then perform the typical "tail" operation.

If you don't find the EOL as expected, you would then do the same thing, but
start further back.

That's a form of seeking backwards.
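A rough sketch of the guess-and-retry variant follows, for anyone who wants to try it. The names TailGuess and lastLine, and the choice to double the window on each retry, are assumptions made for this example and were not specified in the post. Like any byte-level scan, it assumes an encoding in which the byte 0x0A only occurs as a line feed.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.Charset;
import java.nio.file.Path;

// Hypothetical helper for illustration only.
final class TailGuess {

    /** Returns the last line, reading a window of `guess` bytes from the end and doubling it on a miss. */
    static String lastLine(Path file, Charset charset, int guess) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            long length = raf.length();
            long window = Math.max(1, guess);
            while (true) {
                long start = Math.max(0, length - window);
                // Assumes the window (and so the last line) fits in one array.
                byte[] buf = new byte[(int) (length - start)];
                raf.seek(start);
                raf.readFully(buf);

                // Ignore one trailing "\n" or "\r\n".
                int end = buf.length;
                if (end > 0 && buf[end - 1] == '\n') {
                    end--;
                    if (end > 0 && buf[end - 1] == '\r') {
                        end--;
                    }
                }

                // Look for the line feed that precedes the last line.
                int nl = end - 1;
                while (nl >= 0 && buf[nl] != '\n') {
                    nl--;
                }

                // Found it, or the window already covers the whole file.
                if (nl >= 0 || start == 0) {
                    return new String(buf, nl + 1, end - (nl + 1), charset);
                }
                window *= 2; // guess was too small: start further back
            }
        }
    }
}
```

A call such as TailGuess.lastLine(path, StandardCharsets.UTF_8, 1024) usually needs a single read for a typical log file. Because the window doubles, a long last line costs only a few extra reads.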
 
Arne Vajhøj

It's a "different char set" in the manner that adding a suit of clothes
to a naked person results in a "different person", Arne.

It is a different char set in the manner that adding a
suit of clothes to a naked person results in it no
longer being a naked person.

Arne
 
Arne Vajhøj

I wasn't counting non-general-purpose computers. Only computers you can
get an open-ended set of vari-purposed apps for.


It's pretty sharp if you use the criterion I just articulated above.

No.

You can get Java ME apps for a BlackBerry and for a cheap Nokia
phone.

One is considered a smart phone, the other is not.
Not at all. I said desktop PCs would have been in the majority over a
decade ago, and that phones might have an edge now. I doubt they are "far
and away" in the majority, though.

Googling suggests 41 million iPhones and another over 8 million Android
phones have been sold. Let's round this up to an even 50 million
smartphones, total, absorbing the small numbers of true smartphones with
open-ended sets of downloadable apps that are neither iPhones nor
Androids.

The same methods indicate the number of PCs in use (not just sold ever,
but in use now) at over 1 billion.

So it is likely that PCs are still outnumbering phones, perhaps by as
much as 20 to 1.

Better googling skills would have led you to:

http://en.wikipedia.org/wiki/Mobile_phones

<quote>
In the twenty years from 1990 to 2010, worldwide mobile phone
subscriptions grew from 12.4 million to over 4.6 billion
</quote>

Arne
 
Arne Vajhøj

But he wasn't counting business entities; he was counting the sectors
themselves.

There is not much point in counting sectors at arbitrary
granularity.

You could also argue that PCs are only used in 2 places
(work and private).

BTW, rather unusual to refer to yourself as "he".
Ah, so all of the thread-OP's users don't matter. They're mere flies,
because they aren't filthy stinking rich.

But that is your claim.

You are claiming that the people using large shared computers
do not count.

We want to count people - you want to count computers.
That is not what "widely used" means. By your definition, Ferraris are
more widely used than ordinary four-door sedans, for Christ's sake.

No.

Ferrari revenue is small. Mainframe revenue is big.
That's not a useful way of looking at it when the topic is software
compatibility. How large a fraction of machines the OP's software will
run correctly on, out of the set people might try to run it on, is the
metric that matters there.

If the programmer is developing software for money, then he cares
about how much money he can make not how many computers it will
run on.
That's clearly not true.

It is what the various analyses say.

That is what
You're joking, right? It might cost that much to replace them with more
of the same, but to replace them with commodity hardware and operating
systems will certainly cost a lot less, modulo the cost of porting
software. (In practice it probably makes more sense to phase out their
use by just not getting new ones, or even just by having new companies
that enter those fields use modern systems and waiting for the older
companies in the space to die off over time, because of that porting
cost.)

Not joking.

Why do you think it has not happened?

Arne
 
Arne Vajhøj

Not at all.


By that definition the concept of "record-based" vs. "not-record-based"
becomes completely meaningless.

But most of us use "records" to mean a structure that involves out-of-
band boundaries of some sort. Linear text with inline line break etc.
characters has only in-band boundaries and is much less structured than
what a "record" typically implies.

A line is by definition a structure because there is something
that determines where it starts and where it ends.

Neither a count prefix nor the line delimiter is part of the
line itself.

Arne
 
Jukka Lahtinen

Ken Wesson said:
Your personal opinions of others are not the topic of this newsgroup. Do
you have anything Java-related to say?

Most of your recent rambling here has had nothing to do with Java, just
attacking other people personally and repeating yourself about your
weird ideas of what ASCII is or is not.
Look in the mirror, PLEASE!
 
Jim Janney

Ken Wesson said:
Yes, really.

Let us recall the context in which this silly argument blew up, shall we?

Someone asked for an efficient way to get at the last line of a text
file. *Several* people suggested seeking backwards from the end to find a
newline character. (Interestingly, only one of those people has since
been subjected to flamage for this suggestion. I wonder why?)

You and some others pooh-poohed that suggestion because it might not work
properly on fewer than 0.01% (you've since admitted fewer than 0.001%) of
computers, with the implied grounds that any number at all above zero is
unacceptable.

I've got news for you. That suggestion also won't work on any system
without a JVM.

It won't work on a 2MB RAM 286 too cramped for the program to run on.

It won't work on an Xbox 360 or any other system that won't run unsigned
binaries and whose signing authority won't sign this program.

And so on.

Your argument works for Java programs that only work with files stored
on the system that they are running on, but this is another bad
assumption. The programs I work on run under Windows JVMs but still
read and write files on the AS/400 (or whatever IBM is calling it this
month) file system.

Even dinosaurs know about networked file systems.
 
Arne Vajhøj

Fascinating. But we weren't counting all cell phones. We were only
counting phones with app stores and the like.

Practically all of these phones can run Java ME apps.

Arne
 
Arne Vajhøj

Perhaps, but if the OP's code is for working with Unix log files -- well,
who the heck is likely to store Unix log files on an AS/400 box? The Unix
machines networked with the hypothetical AS/400 box will no doubt store
their log files on their own hard drives, not the AS/400 box's.

I don't think that it is good to give advice on how
to read the last line from a text file in Java based
on an assumption that it must be a Unix log file when the
OP did not indicate so.

Arne
 
Arne Vajhøj

[...]
Better googling skills bark bark bark bark bark!

Bark bark bark bark bark bark bark bark bark!

If you jump up at me, I will take action to defend myself, and I outmass
all terriers by *at least* a factor of 20 to 1, so you *will* get the raw
end of it!

20 to 1? _All_ terriers? Well, let's see…a small Airedale runs around 50
lbs, which puts you at around half a ton.

Even a small terrier at 20 pounds puts him at 400 pounds, which would
cause some health concerns.

Arne
 
Lew

I don't think that it is good to give advice on how
to read the last line from a text file in Java based
on an assumption that it must be a Unix log file when the
OP did not indicate so.

In this forum people all the time try to make the claim that an OP meant
something not in the original post. For example, the other thread where the
OP asked how to produce a 'List <String1>' and everyone (except me) assumed
and argued that they *must* have meant 'List <String>', even though they took
great pains not to say so.

So I would expand your advice to add that one eschew assuming anything outside
the problem statement. Beyond that, because this is a discussion forum and
not a help desk, it is entirely appropriate to discuss the general
applicability of principles elicited from a specific problem. Thus, even if
the OP did want to speak only of log files, it is important and highly
relevant to point out that "text files" (about which they actually did ask)
have a wider and fuzzier meaning than certain ignorant trolls would believe.
 
Lew

[...]
Better googling skills bark bark bark bark bark!

Bark bark bark bark bark bark bark bark bark!

If you jump up at me, I will take action to defend myself, and I outmass
all terriers by *at least* a factor of 20 to 1, so you *will* get the raw
end of it!

Now we know why he finds fear of violence to be an acceptable workplace
parameter. Clearly Paul here, ahem, sorry, I mean "Ken Wesson" here is a
violent psychopath.
 
