Too many open files on OpenBSD

Edd Barrett

Hi,

I am the maintainer of the TeXLive typesetting suite for OpenBSD. Each
year I must parse a database in order to calculate subsets of the
TeXLive system for packaging purposes.

The database format changed this year, so I have thrown away my Python
script and had a go at it with Ruby. I am quite new to Ruby, but not to
programming (previously C, C++, Java, BASIC...).

This is the error I am stuck with:

---8<---
./tlpsrcnode.rb:19:in `initialize': Too many open files -
/home/tl/tl/Master/tlpkg/tlpsrc/wordcount.tlpsrc (Errno::EMFILE)
from ./tlpsrcnode.rb:19:in `new'
from ./tlpsrcnode.rb:19:in `parse'
from ./tlpsrcnode.rb:30:in `parse'
from ./tlpsrcnode.rb:20:in `each'
from ./tlpsrcnode.rb:20:in `parse'
from ./tlpsrcnode.rb:30:in `parse'
from ./tlpsrcnode.rb:20:in `each'
from ./tlpsrcnode.rb:20:in `parse'
... 3055 levels...
from ./tlpsrcnode.rb:20:in `each'
from ./tlpsrcnode.rb:20:in `parse'
from ./roottlpsrcnode.rb:53:in `startParse'
from ./rbmfsplit:54
---8<---

This appears to happen after 297 files are concurrently opened.

I immediately put myself into the staff login class, uncapped the file
descriptor count in login.conf, then logged out and back in.

---8<---
puff% ulimit -a
-t: cpu time (seconds) unlimited
-f: file size (blocks) unlimited
-d: data seg size (kbytes) 524288
-s: stack size (kbytes) 4096
-c: core file size (blocks) unlimited
-m: resident set size (kbytes) 1747436
-l: locked-in-memory size (kb) 582993
-u: processes 128
-n: file descriptors 1024
---8<---

Uncapped in this case appears to be 1024. Plenty.

The error still occurs after 297 files are open.
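
One way to rule out a mismatch between the shell's limit and what the interpreter actually sees is to ask the process itself. This is only a minimal sketch, assuming a Ruby build that provides Process.getrlimit on this platform; the variable names are purely illustrative:

```ruby
# Ask the running Ruby process for its own file descriptor limits.
# Process.getrlimit returns a two-element array: [soft_limit, hard_limit].
soft, hard = Process.getrlimit(Process::RLIMIT_NOFILE)

# The soft limit is the one that triggers Errno::EMFILE when exceeded.
puts "file descriptors: soft=#{soft} hard=#{hard}"
```

If the soft limit printed here is lower than what `ulimit -n` reports in the shell, the script is probably being started from an environment with different limits.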

Is there a workaround for this, or is Ruby not suitable for this kind
of processing?

Thanks

Edd
 
Edd Barrett

Edd said:
This appears to happen after 297 files are concurrently opened.

Sorry, this statement is inaccurate.

297 files have been opened up to this point, and each one is closed
immediately after it has been parsed.

Shouldn't the call stack be an indication of concurrently opened files? In
this case 5 files. (It's recursive.)

I suppose the source of the parse function helps (it probably disobeys all
coding conventions, sorry):
---8<---
def parse()
  fileHandle = File.new @path
  for line in fileHandle do

    line =~ /^(.*?) (.*)/
    cmd = $~[1]
    arg = $~[2]

    if cmd =~ /^#.*/ then
      nil # comment
    elsif cmd == "depend" then
      newNode = TlpSrcNode.new @basePath + arg + ".tlpsrc", @rootNode
      newNode.parse # and so recursion starts
      @rootNode.incrementDeps
    elsif cmd == "runpattern" then
      @rootNode.addRunFile arg
    elsif cmd == "docpattern" then
      @rootNode.addDocFile arg
    elsif cmd == "srcpattern" then
      @rootNode.addSrcFile arg
    elsif @nothingCmds.detect { |i| i == cmd } then
      nil # OK
    elsif @todoCmds.detect { |i| i == cmd } then
      nil # XXX work out what these are
    else
      puts "*error: unknown cmd: '#{cmd}' arg='#{arg}'"
      exit 1
    end
  end

  fileHandle.close
end
---8<---
 
Gerardo Santana Gómez Garrido

Hello Edd, an OpenBSD user here too.

The error message speaks of more than 3055 recursion levels. Not
every level is a call to parse, but if you take out the #new and #each
methods I'm sure there are a lot more than just 5 files opened at the
same time.

Are there that many dependencies? I don't think so. I haven't seen
those .tlpsrc files yet. I'll take a look.

 
Damjan Rems

Yukihiro said:
Hi,

It seems you recursively call parse, which keeps files open. How
about reading the whole file content into a string with File.read, then
calling each_line on the string? E.g.

def parse()
  File.read(@path).each_line do |line|
    line =~ /^(.*?) (.*)/

    ....
  end
end

Or even better:

def parse()
  lines = File.open(@path) {|f| f.read}
  lines.each_line do |line|
    line =~ /^(.*?) (.*)/

    ....
  end
end

With File.open, the file gets closed automatically after the block is processed.
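
To make the closing behaviour concrete, here is a small sketch that can be run on its own. The temporary file is only a hypothetical stand-in for a .tlpsrc file, and the names `tmp`, `handle` and `contents` are made up for the demonstration:

```ruby
require "tempfile"

# Hypothetical stand-in for a .tlpsrc file, created only for this demo.
tmp = Tempfile.new("demo")
tmp.write("depend foo\nrunpattern bar\n")
tmp.close

handle = nil
contents = File.open(tmp.path) do |f|
  handle = f   # keep a reference so the handle can be inspected afterwards
  f.read       # the block's value becomes the return value of File.open
end

# The descriptor has been released by the time the block returns,
# so any recursion that happens from here on holds no open file.
puts handle.closed?
contents.each_line { |line| puts line }

tmp.unlink
```

The point of the block form is that the close happens even if the block raises, which a trailing fileHandle.close at the end of a method does not guarantee.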

by
TheR
 
Edd Barrett

Hello,

Thanks for all the replies so far. I will later try the suggestion to
read the file into memory straight away then close the file.

Gerardo said:
Hello Edd, an OpenBSD user here too.

Great :) Ready for 4.3?

Gerardo said:
The error message speaks of more than 3055 recursion levels. Not
every level is a call to parse, but if you take out the #new and #each
methods I'm sure there are a lot more than just 5 files opened at the
same time.

Ah, I see. So each call frame is a "file" in Ruby? I probably
misunderstood.

Gerardo said:
Are there that many dependencies? I don't think so. I haven't seen
those .tlpsrc files yet. I'll take a look.

http://tug.org/svn/texlive/trunk/Master/tlpkg/tlpsrc/

More than 1000 last time I checked.

This means I will be storing a lot of strings. Is the GC in Ruby
efficient, or should I be encouraging it Java style? System.gc() :p

Thanks

Edd
 
Gerardo Santana Gómez Garrido

Sorry for top posting, Gmail doesn't support my browser.

Indeed, I'm waiting for that OPENBSD_4_3_BASE tag ;-)

What I meant is that you are opening a file in every call to parse. If
you have many recursive calls to parse, you'll end up with many files
open.

As some others have suggested, you could transform recursion into iteration.
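
A rough sketch of what that could look like, using an explicit work list instead of nested parse calls. This is not the original script: `collect_deps`, `pending` and `seen` are hypothetical names, only the "depend" command is handled, and the `seen` set is an added assumption to guard against packages that depend on each other in a loop. It also assumes a reasonably recent Ruby (File.write, String#start_with?):

```ruby
require "set"
require "tmpdir"

# Walk a package and everything it depends on without recursion.
# Only one file is open at any moment; File.foreach closes it when done.
def collect_deps(base_path, start)
  pending = [start]   # packages still waiting to be read
  seen = Set.new      # packages already handled, so cycles terminate

  until pending.empty?
    name = pending.pop
    next unless seen.add?(name)   # add? returns nil if already present

    File.foreach(File.join(base_path, "#{name}.tlpsrc")) do |line|
      cmd, arg = line.chomp.split(" ", 2)
      next if cmd.nil? || cmd.start_with?("#")   # blank line or comment
      pending.push(arg) if cmd == "depend"       # queue it, do not recurse
    end
  end

  seen
end

Dir.mktmpdir do |dir|
  # Hypothetical miniature tree; "b" depends back on "a" to show the guard.
  File.write(File.join(dir, "a.tlpsrc"), "depend b\ndepend c\n")
  File.write(File.join(dir, "b.tlpsrc"), "# comment\ndepend a\n")
  File.write(File.join(dir, "c.tlpsrc"), "runpattern f texmf/c.sty\n")

  puts collect_deps(dir, "a").to_a.sort.join(", ")
end
```

Because dependencies are queued rather than parsed on the spot, the number of open descriptors stays constant no matter how deep the dependency chain is.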
 
