Got SystemStackError exception: stack level too deep

Bezan Kapadia

I have a master process that forks 2 child processes in the
background.

One of the child processes sends mail to me through the SMTP
server every minute.

After about 20 hours I get the following errors:
Got Timeout::Error exception: execution expired
Got SystemExit exception: exit

I am guessing the above error has to do with a slow network or server.
The program continues fine because I have trapped the exception using
"rescue".

After about 6 hours I get the following:
Got SystemStackError exception: stack level too deep

Once I get this error my entire program begins to bomb and work
incorrectly.

Can someone explain what this error really means: "Got
SystemStackError exception: stack level too deep"?
Is there anything to look for or keep in mind to contain or
prevent this error, or some kind of workaround?

Thanks in advance
 
Tony Arcieri

Bezan said:
Can someone explain what does this error really mean "Got
SystemStackError exception: stack level too deep" ?
and if there are anything to look for or keep in mind to contain or
prevent this error or some kind of work around etc...?

It means that you have too many calls to functions which are in turn calling
other functions before returning. A quick and easy way to cause a similar
error is:

def foo; foo; end
foo

The stack trace you get with the SystemStackError will be enormously helpful
in diagnosing this problem. Perhaps you can paste it to the list?
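
To see the error in a controlled way, here is a minimal sketch that recurses with no base case, counts the calls, and rescues the SystemStackError at the top level. The StackProbe class and its runaway method are made up for illustration, and the depth reached will vary with the Ruby version and the stack size.

```ruby
# Hypothetical helper class: counts how many nested calls happen
# before Ruby runs out of stack.
class StackProbe
  attr_reader :calls

  def initialize
    @calls = 0
  end

  # No base case: every call adds a frame and never returns normally.
  def runaway
    @calls += 1
    runaway
  end
end

probe = StackProbe.new
message = nil

begin
  probe.runaway
rescue SystemStackError => e
  # By now the stack has unwound back to the top level.
  message = e.message
  puts "Got #{e.class} exception: #{message}"
  puts "nested calls before the overflow: #{probe.calls}"
end
```

The count printed at the end gives a rough idea of how many frames fit on the stack in this particular environment.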
 
Bezan Kapadia

I see ...

My code below : The function 1 calls the function mail_system every 60
seconds.

Take a look at my 2 functions below and can u suggest of what should I
really change to do away with this issue

require 'net/smtp'

def mail_system(message,i)

user=`echo $USER`.chomp
recipient=user+"@gmail.com"
msg = "#{message}"
Net::SMTP.start('smtp.gmail.com',25) do |smtp|
smtp.send_message(msg, "#{recipient}", "#{recipient}")
Notification for Email Sent completion - #{message}"
end
rescue Timeout::Error => e
$mes_trace.error("Got #{e.class} exception: #{e.message}")
puts "Got #{e.class} exception: #{e.message}"
exit false
end



def function1
........
mail_system(message,i)
rescue Exception => e
$mes_trace.error("Got #{e.class} exception: #{e.message}")
puts "Got #{e.class} exception: #{e.message}"
end

sleep(60)
end
 
Jayce Meade

Bezan said:
Notification for Email Sent completion - #{message}"
end

It looks to me like you have a " to terminate a string but you don't have
one to begin it. Because of that, it's probably encountering the word 'for'
and interpreting it as a loop construct, and it just goes over and over and
over since there's nothing to break the loop. My guess is that the root of
the stack level error is in the code being taken as part of the loop...

Maybe I missed it, if so, I apologize, but that's what it looks like to me.
Hope it helps.

- jayce
 
Bezan Kapadia

Ahh, no... that's an issue with my cut-and-paste, sorry. I pasted that
line incorrectly here. The correct line is below:
puts "Notification for Email Sent completion - #{message}"

Where do I get this from..?
 
Jayce Meade

If this is all in a single file, you should paste all of it. It seems to be
missing some code, and you should also paste the whole error.

-jayce
 
Bezan Kapadia

Well, there are multiple files and it's complicated, but I think the
issue definitely lies somewhere in the part of the code that I have
pasted above.

The scripts have run for days in the background, and the error that I have
pasted is all that is there in the trace file.
 
Jayce Meade

Trace file? If you're sending attachments, I don't think the mailing list
allows them.
 
Tony Arcieri

The scripts have run days in the background and the error that I have
pasted is all that is there in the trace File.

Can you provide the complete stack trace associated with the
SystemStackError?
 
Jayce Meade

Perhaps your error-checking code is causing the program to restart when the
error crops up. If so, it might lead to the stack error. If say, every six
hours, something changes (I think that's how long you said it took) and it
causes an error, the original error is caught and the program
continues/restarts execution. If the error trapping code restarts by calling
the methods used to start the script in the first place, it would turn into
a stack error if it's not done correctly.

It would be like writing something on a piece of paper. Say once this
figurative piece of paper is full, the stack error crops up. If your program
error is equivalent to the end of a line of text on this piece of paper,
after every error, or line, the program restarts, or a new line is started.
The problem here is that as the errors or lines of text add up on this piece
of paper, eventually, the paper is full, and you can't write anything else
on it, and the StackError pops up.

Basically put, your program may be just writing line after line to this
piece of paper rather than starting with a clean sheet every time it
restarts. What you want to do is go all the way back up to the program's
initiation, where the first line of code used to start the program is, and
set up your error rescuing code there, that way, the error is caught where
the program begins, and you can safely restart it because that's where the
program started in the first place.

This probably isn't even proper syntax, but bear with me:

def method1() # B
#...
rescue Exception # C
# restarting code here
end

method1() # A

###

A is the last line where method1 is called.
B is where method1 is defined/executed
C is where you have your error catching code

Right here, it's basically doing this:

A -> B -> Error pops up -> Rescued by C in the context of B -> A -> B ->
Error - > rescued by C in the context of B -> A and so on.

You want it to do this instead:

def method1 #B
#...
end

begin
method1() # A

rescue Exception #C
#... Error catching code here, maybe output to console
retry

end


this time, it's doing this:

A -> B -> Error pops up -> Rescued and restarted by C in top - level
context -> Program restarts, executing line A in top-level context

In this one, instead of restarting in the context of B, it's going all the
way back to the top level of execution and restarting from there, rather
than in the context of the program's execution code itself.
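
As a rough check of the retry pattern, the sketch below records the stack depth at the start of every attempt. The flaky_job method is a made-up stand-in that fails twice and then succeeds. Since retry jumps back to begin rather than calling anything, the recorded depths should all be the same.

```ruby
# Hypothetical stand-in for the real work: fails on the first two attempts.
def flaky_job(attempt)
  raise "simulated failure on attempt #{attempt}" if attempt < 3
  :ok
end

attempts = 0
depths = []   # stack depth observed at the start of each attempt
result = nil

begin
  attempts += 1
  depths << caller.length
  result = flaky_job(attempts)
rescue RuntimeError => e
  puts "rescued: #{e.message}"
  retry   # jumps back to `begin`; no new stack frame is created
end

puts "result=#{result} attempts=#{attempts} depths=#{depths.inspect}"
```

Note that an unconditional retry loops forever if the failure is permanent, so real code would normally cap the number of attempts.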

A good way to set up an error system so you know what happened and where,
and specifically why it happened is to raise a runtime error like so and
then catch it later:

def method1
  # ...
rescue Exception => e
  raise("Errored in method1: #{e}")
end

begin
method1()

rescue RuntimeError => e
puts e
# .. what to do next
end

Hope that all made sense! =P

Hope it helps, too. From what I've learned you want to restart your
program from where it was started initially to avoid things building up and
getting problematic.

- jayce

 
Brian Candler

Bezan said:
rescue Timeout::Error => e
$mes_trace.error("Got #{e.class} exception: #{e.message}")
puts "Got #{e.class} exception: #{e.message}"

puts e.backtrace.join("\n")

This should give you the backtrace you need to understand what's
happening. (Although sometimes I think it can be garbled in the case of
a stack overflow error)
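
Put together, logging the backtrace next to the message might look like the sketch below. It assumes $mes_trace in the original code is a standard Logger. The StringIO sink and the send_mail method are stand-ins invented here so the example runs without a mail server or a trace file.

```ruby
require 'logger'
require 'stringio'
require 'timeout'

sink = StringIO.new            # stand-in for the real trace file
mes_trace = Logger.new(sink)   # plays the role of $mes_trace from the post

# Hypothetical stand-in for the SMTP call that timed out.
def send_mail
  raise Timeout::Error, "execution expired"
end

begin
  send_mail
rescue Timeout::Error => e
  mes_trace.error("Got #{e.class} exception: #{e.message}")
  mes_trace.error(e.backtrace.join("\n"))   # one frame per line, innermost first
end

puts sink.string
```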
 
Bezan Kapadia

Ok .. Well I tested the Code again.

This Time the code errored out after 40 hrs with only the System Stack
error.

Some things I have noted this time..
_________________________________________________
1)This is how my child process code that runs in the background looks
like -

The code is precisely doing what Tony has indicated above that could be
possibly triggering the Stack error exception.


#Main Program

def check_for_running_jobs
  begin
    # do a whole lot of work
  rescue Exception => e
    $mes_trace.error("In MES_monitor check_for_running_jobs : Got #{e.class} exception: #{e.message}")
    puts "Got #{e.class} exception: #{e.message}"
    $mes_trace.error(e.backtrace.join("\n"))
    puts e.backtrace.join("\n")
  end
end


def run_function
  # this function internally calls tons of XYZ functions...
  check_for_running_jobs
  # there are rescue statements in various of these XYZ functions to trap exceptions....

  sleep(60) # note it gets activated every 60 seconds

  run_function
end

run_function

###################################
One thing I wanted to know: if an exception is raised in some function
(run_function) and I rescue it in that function (or within other functions
that it calls), does that restart execution of the entire program, leading
to the stack error? Note that I am not calling some other method or doing
anything else except rescuing the error.
_______________________________________________

2) Just a Side Note - Am trying to run this background child process for
days / may be even weeks something like a cron job that sleeps , wakes
up, polls the status and goes to sleep.

Is it robust way to Run a detached child process in the background for
weeks ...?
______________________________________________

3) Ok I have got hold of my backtrace if you folks want to take closer
look...

I, [06:12:05#22059] INFO -- : Sleeping for 2 Minutes
I, [06:13:05#22059] INFO -- : Finished Sleeping for 2 Minutes
E, [06:13:05#22059] ERROR -- : In MES_monitor check_for_running_jobs : Got SystemStackError exception: stack level too deep
E, [06:13:05#22059] ERROR -- :
/usr/pkgs/ruby/1.8.5-p12/lib/ruby/1.8/monitor.rb:273:in `mon_check_owner'
/usr/pkgs/ruby/1.8.5-p12/lib/ruby/1.8/monitor.rb:220:in `mon_exit'
/usr/pkgs/ruby/1.8.5-p12/lib/ruby/1.8/monitor.rb:240:in `synchronize'
/usr/pkgs/ruby/1.8.5-p12/lib/ruby/1.8/logger.rb:496:in `write'
/usr/pkgs/ruby/1.8.5-p12/lib/ruby/1.8/logger.rb:326:in `add'
/usr/pkgs/ruby/1.8.5-p12/lib/ruby/1.8/logger.rb:374:in `info'
/MES_monitor.rb:90:in `check_for_running_jobs'
/MES_monitor.rb:78:in `times'
/MES_monitor.rb:78:in `check_for_running_jobs'
/MES_monitor.rb:70:in `each'
/MES_monitor.rb:70:in `check_for_running_jobs'
/MES_monitor.rb:272:in `run_function'
/MES_monitor.rb:281:in `run_function'
/MES_monitor.rb:283

_________________________________________________

4)

Jayce from what you suggested ...
#Main program

def run_function
  # this function internally calls tons of XYZ functions...
  check_for_running_jobs
  # there are rescue statements in various of these XYZ functions to trap exceptions....

  sleep(60) # note it gets activated every 60 seconds

  run_function
end

begin
  run_function
rescue Exception => e
  $mes_trace.error("In MES_monitor check_for_running_jobs : Got #{e.class} exception: #{e.message}")
  puts "Got #{e.class} exception: #{e.message}"
  $mes_trace.error(e.backtrace.join("\n"))
  puts e.backtrace.join("\n")
end



Am going to fire test run with the above change...Hopefully this will
solve my problem ...
let me know if you folks suggest anything else....
__________________________________________
 
Brian Candler

Bezan said:
2) Just a Side Note - Am trying to run this background child process for
days / may be even weeks something like a cron job that sleeps , wakes
up, polls the status and goes to sleep.

Is it robust way to Run a detached child process in the background for
weeks ...?

Yes, I have written systems like that. They have been robust enough not
to need any sort of supervisor. (If you do want to use a supervisor,
something like 'monit' should do the job)

Bezan said:
3) Ok I have got hold of my backtrace if you folks want to take closer
look...

I, [06:12:05#22059] INFO -- : Sleeping for 2 Minutes
I, [06:13:05#22059] INFO -- : Finished Sleeping for 2 Minutes
E, [06:13:05#22059] ERROR -- : In MES_monitor check_for_running_jobs : Got SystemStackError exception: stack level too deep
E, [06:13:05#22059] ERROR -- :
/usr/pkgs/ruby/1.8.5-p12/lib/ruby/1.8/monitor.rb:273:in `mon_check_owner'
/usr/pkgs/ruby/1.8.5-p12/lib/ruby/1.8/monitor.rb:220:in `mon_exit'
/usr/pkgs/ruby/1.8.5-p12/lib/ruby/1.8/monitor.rb:240:in `synchronize'
/usr/pkgs/ruby/1.8.5-p12/lib/ruby/1.8/logger.rb:496:in `write'
/usr/pkgs/ruby/1.8.5-p12/lib/ruby/1.8/logger.rb:326:in `add'
/usr/pkgs/ruby/1.8.5-p12/lib/ruby/1.8/logger.rb:374:in `info'

Two things strike me from this:

1. You're using a rather old version of Ruby

2. You're using thread primitives (monitor / synchronize) and there are
known problems with them in older versions of Ruby.

I'd strongly suggest you upgrade to ruby-1.8.6p287. (Or you could try
1.8.6p114, but I'm not sure if all the thread issues were resolved by
then).

As an alternative, there's a separate library called "fastthread" which
you could try installing under your old version of Ruby.

http://groups.google.com/group/ruby-talk-google/browse_thread/thread/7933e7e987dad1c3

But I'd strongly recommend going 1.8.6 instead.

Regards,

Brian.
 
Bezan Kapadia

Thanks..
I can move to a newer version,
but I think I am hitting the issue that Tony indicated above:
def foo; foo; end
foo

I am wondering what the workaround to that is in the context of my
code. I'm trying what Jayce suggested and will move to a newer version.
Let's see what happens...

Please continue to pour in your suggestions...
 
Bezan Kapadia

Since it takes 40 hrs for the real code to cause the SystemStackError,
in the meantime I tried all the above suggestions on the code below,
and none seemed to work (i.e. I used newer versions of Ruby, rescue
statements in the manner indicated above, etc.). The stack level
too deep error keeps coming back.

The issue that triggers the exception:
def foo;foo;end
foo

__________________________

So I have tried an alternative solution, a while loop, which seems to
be working fine so far, but let's see..

def foo
end

while true
  foo
end
___________________________

But if anyone knows of a neat way to get around the issue that Tony
indicated please let me know ....
 
Brian Candler

Bezan said:
Since it takes 40 hrs for the Real code to cause the System Stack Error
I tried all the above suggestions in the mean while on the code below
and none seem to worked..(i.e I used newer versions of Ruby, rescue
statements in the manner indicated above etc etc )...but the stack level
too deep error keeps coming back.

The issue that triggers the exception:
def foo;foo;end
foo

Sorry I didn't notice this in your original post:

def run_function
...
sleep(60) #note it gets activated every 60 seconds
run_function
end

Yes, what you've got there is infinite recursion. Some languages will
optimise out this "tail recursion" into a flat loop, but Ruby doesn't;
each call adds another stack frame until the stack is exhausted and Ruby
raises SystemStackError.
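
The stack growth can be observed directly. The sketch below is a bounded model of run_function: it calls itself as its last step, but stops after a fixed number of cycles so that the demo terminates. The name run_recursively and the cycle limit are assumptions made for illustration. It assumes a default Ruby build, where tail calls are not optimised.

```ruby
# Hypothetical model of run_function: "polls" by calling itself, but stops
# after a fixed number of cycles so the demo terminates.
def run_recursively(cycles_left, depths = [])
  depths << caller.length          # how many frames sit below this call
  return depths if cycles_left <= 1
  run_recursively(cycles_left - 1, depths)
end

depths = run_recursively(5)
growth = depths.each_cons(2).map { |a, b| b - a }
puts "stack depth per cycle: #{depths.inspect}"
puts "growth per cycle:      #{growth.inspect}"
```

With one extra frame per 60-second cycle, it takes many hours before the limit is reached, which fits the long delay before the error shows up.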

So you need to write this as an explicit loop, e.g.

def run_function
while true
...
sleep(60)
end
end

Bezan said:
So i have tried an alternative solution of a while loop which seems to
be working fine so far but lets see..

OK, you got there before me :)

Bezan said:
But if anyone knows of a neat way to get around the issue that Tony
indicated please let me know ....

ruby 1.9 does have the ability to optimise tail recursion, but only if
you compile it with a special flag. So in Ruby, I'd say it's better to
write an infinite loop as a loop, rather than as recursion.
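
For a long-running poller, one possible shape is a loop that rescues errors per cycle, so that a failure in one cycle neither ends the loop nor adds to the stack. This is only a sketch: run_monitor, job, max_cycles and sleeper are hypothetical names, and max_cycles and sleeper exist only so the example can be exercised without waiting.

```ruby
# Hypothetical names throughout: run_monitor, job, max_cycles and sleeper are
# not from the thread. Pass max_cycles: nil to loop forever.
def run_monitor(job, interval: 60, max_cycles: nil, sleeper: ->(s) { sleep(s) })
  cycles = 0
  failures = []
  while max_cycles.nil? || cycles < max_cycles
    begin
      job.call
    rescue StandardError => e
      failures << e        # record and carry on; the loop itself keeps going
    end
    cycles += 1
    sleeper.call(interval)
  end
  failures
end

calls = 0
job = lambda do
  calls += 1
  raise "simulated failure" if calls == 2
end

# Three quick cycles with a no-op sleeper, the second of which fails.
failures = run_monitor(job, interval: 0, max_cycles: 3, sleeper: ->(_s) {})
puts "cycles run: #{calls}, failures rescued: #{failures.length}"
```

Rescuing StandardError rather than Exception leaves SystemExit and SystemStackError alone, so a deliberate exit still stops the process.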

Regards,

Brian.
 
