IO.lineno= not behaving as expected

Belorion

What am I missing here (apologies if this is just my 7am brain fog
speaking):

The contents of my file (test) are:
1
2
3
4
5
6

f = File.open('test')
f.lineno = 4
puts f.readline # <= This returns 1, instead of 4 as expected

I'm trying to do fast lookups in a file based on line number, but this isn't
working like I expected it to. 1) What am I missing here? 2) If this is not
what lineno=() is designed for, then what is it supposed to be used for? 3)
Is there another easy way for fast lookups by line#? Thanks.

J. Merrill

belorion said:
What am I missing here (apologies if this is just my 7am brain fog
speaking):

The contents of my file (test) are:
1
2
3
4
5
6

f = File.open('test')
f.lineno = 4
puts f.readline # <= This returns 1, instead of 4 as expected [snip]

If I remember correctly, the docs say that lineno= just changes the current
value of lineno: if you check the result of f.lineno after your
f.readline above, you'll get 5. I'm not sure how useful that behavior is,
but that's how it's defined.
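A small sketch of that behavior, for anyone who wants to check it. It assumes a throwaway temp file with the same six lines as the original post; the helper name lineno_demo is made up for illustration.

```ruby
require "tempfile"

# Hypothetical helper: returns the counter before the read, the line that
# readline actually yields, and the counter after the read.
def lineno_demo
  Tempfile.create("lineno_demo") do |tmp|
    tmp.write("1\n2\n3\n4\n5\n6\n")
    tmp.flush

    File.open(tmp.path) do |f|
      f.lineno = 4             # only overwrites the counter; nothing is skipped
      before = f.lineno
      line = f.readline        # still reads from the start of the file
      [before, line, f.lineno] # readline bumped the counter from 4 to 5
    end
  end
end

before, line, after = lineno_demo
puts "lineno before read: #{before}"     # 4
puts "line read: #{line.inspect}"        # "1\n"
puts "lineno after read: #{after}"       # 5
```

So the setter changes what lineno reports, not where the next read starts.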

Suggest that you read the file into memory and split it by lines
(File#readlines IIRC, not near the manual now).
 
Brian Schröder

What am I missing here (apologies if this is just my 7am brain fog
speaking):

The contents of my file (test) are:
1
2
3
4
5
6

f = File.open('test')
f.lineno = 4
puts f.readline # <= This returns 1, instead of 4 as expected

I'm trying to do fast lookups in a file based on line number, but this isn't
working like I expected it to. 1) What am I missing here? 2) If this is not
what lineno=() is designed for, then what is it supposed to be used for? 3)
Is there another easy way for fast lookups by line#? Thanks.

Hello Belorion,

it is impossible to do a fast lookup by line without building an index
first (i.e. loading everything into an array or hashing line-number to
byte position) because the program has to count the number of newlines
to the line. So the best would be to simply do

DATA = File.read("file").split(/\n/)

and use

DATA[linenumber-1]

to access the data. Try to restructure your program in such a way that
it need not read the file repeatedly. You could even go overboard and
create a multiton class that reads the file into memory and checks on
each access if the file has changed. Something like this: (Untested)

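class DataFile
  @@files = {}

  # Return the cached DataFile for this path, creating it on first use.
  def self.open(file)
    @@files[file] ||= new(file)
  end

  def initialize(filename)
    @filename = filename
    reread
  end

  # Return the given line (1-based), rereading the file if it has changed.
  def read(line)
    reread if File.mtime(@filename) > @mt
    @data[line - 1]
  end

  private

  def reread
    @mt = File.mtime(@filename)
    @data = File.read(@filename).split(/\n/)
  end

  # Force callers to go through DataFile.open.
  private_class_method :new
end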

But that was more for the fun of it.
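The other kind of index mentioned above, mapping line number to byte position, could be sketched roughly as follows. This is only an illustration: build_line_index and line_at are made-up names, the file is opened in binary mode so the offsets are plain byte counts, and the index itself grows with the number of lines, so for a very large file it may need to be thinned out or stored on disk.

```ruby
require "tempfile"

# Hypothetical helper: one pass over the file, recording the byte offset
# at which each line starts. offsets[0] belongs to line 1.
def build_line_index(path)
  offsets = []
  File.open(path, "rb") do |f|
    until f.eof?
      offsets << f.pos
      f.gets
    end
  end
  offsets
end

# Hypothetical helper: jump straight to a 1-based line number using the
# index. Returns nil when the number is out of range.
def line_at(path, offsets, number)
  return nil if number < 1 || number > offsets.length

  File.open(path, "rb") do |f|
    f.seek(offsets[number - 1])
    f.gets.chomp
  end
end

# Usage with a throwaway six-line file like the one in the question.
Tempfile.create("index_demo") do |tmp|
  tmp.write("1\n2\n3\n4\n5\n6\n")
  tmp.flush

  offsets = build_line_index(tmp.path)
  puts line_at(tmp.path, offsets, 4)   # prints 4
end
```

Building the index still costs one full read of the file, but every lookup after that is a single seek.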

Brian
 
Belorion

If I remember correctly, the docs say that lineno= just changes the current
value of lineno: if you check the result of f.lineno after your
f.readline above, you'll get 5. I'm not sure how useful that behavior is,
but that's how it's defined.


I noticed that ... but, what, exactly is that useful for? (as you already
questioned) It seems like I am missing something here, because otherwise
lineno=() seems useless and misleading.


Suggest that you read the file into memory and split it by lines
(File#readlines IIRC, not near the manual now).

The only problem with that is I need to query, say, only 1000 lines in a
file with 195_199_572 lines in it.

 
Brian Schröder

I noticed that ... but, what, exactly is that useful for? (as you already
questioned) It seems like I am missing something here, because otherwise
lineno=() seems useless and misleading.

It could be used if you have read part of a file by other means and want
to update lineno manually, e.g.

File.open("f") do |f|
  header = f.read(1024)
  f.lineno = header.count("\n")  # number of newlines in the header
  do_something_with_f_that_needs_linenumbers(f)
end
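To make that concrete, here is a small runnable sketch. It assumes a four-line temp file and a header that happens to end on a line boundary; lineno_after_header is a made-up name.

```ruby
require "tempfile"

# Hypothetical helper: reads a fixed-size header with read (which does not
# touch lineno), corrects the counter by hand, then reads one more line.
def lineno_after_header(path, header_bytes)
  File.open(path) do |f|
    header = f.read(header_bytes)
    untouched = f.lineno            # read leaves the counter at 0
    f.lineno = header.count("\n")   # account for the lines inside the header
    f.gets                          # gets increments the corrected counter
    [untouched, f.lineno]
  end
end

Tempfile.create("header_demo") do |tmp|
  tmp.write("a\nb\nc\nd\n")
  tmp.flush

  # The first 4 bytes are "a\nb\n", i.e. two complete lines.
  untouched, corrected = lineno_after_header(tmp.path, 4)
  puts "after read: #{untouched}, after fix and gets: #{corrected}"
end
```

Without the manual assignment, the line read after the header would be reported as line 1 rather than line 3.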
Suggest that you read the file into memory and split it by lines

The only problem with that is I need to query, say, only 1000 lines in a
file with 195_199_572 lines in it.

Are the lines you need known at once? Then you could do it like this:

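lines = [1, 2, 4, 8, 16, 32, 1024]
lines = lines.sort.reverse   # smallest wanted line number ends up last
File.open("file") do |f|
  while line = f.gets
    if f.lineno == lines.last
      puts line
      lines.pop
      break if lines.empty?  # stop once every wanted line has been printed
    end
  end
end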
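The same single-pass idea can be wrapped in a method that returns the lines instead of printing them. This is only a sketch: fetch_lines is a made-up name, and it uses a Set of wanted line numbers rather than the sorted array above.

```ruby
require "set"
require "tempfile"

# Hypothetical helper: returns { line_number => text } for the wanted
# 1-based line numbers, reading the file once and stopping as soon as
# every wanted line has been seen.
def fetch_lines(path, wanted)
  remaining = Set.new(wanted)
  found = {}
  return found if remaining.empty?

  File.open(path) do |f|
    f.each_line do |line|
      # delete? returns nil when the current line number was not wanted.
      next unless remaining.delete?(f.lineno)

      found[f.lineno] = line.chomp
      break if remaining.empty?
    end
  end
  found
end

# Usage with a throwaway six-line file like the one in the question.
Tempfile.create("fetch_demo") do |tmp|
  tmp.write("1\n2\n3\n4\n5\n6\n")
  tmp.flush

  p fetch_lines(tmp.path, [2, 5])
end
```

Line numbers past the end of the file are simply absent from the result. Memory use depends on the number of wanted lines, not on the size of the file.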

Brian
 
