Strange logger rotation failure

Daniel Berger · Jul 11, 2005

Hi all,

Ruby 1.8.2
Solaris 10

I have a couple Ruby scripts running via cron that connect to an Oracle
database and execute some stored procedures - nothing fancy. On
Saturday, July 2, 2005, the logger package spewed a strange error at me:

Your "cron" job on itwsp193910wss
log_cleanup_dev.rb

produced the following output:

/usr/local/lib/ruby/1.8/logger.rb:516:in `write': Shifting failed.
'log_cleanup_dev.log.20050630' already exists. (Logger::ShiftingError)
from /usr/local/lib/ruby/1.8/logger.rb:318:in `add'
from /usr/local/lib/ruby/1.8/logger.rb:399:in `fatal'
from log_cleanup_dev.rb:68

First, a bit of background. There are two nearly identical scripts that
run - one is log_cleanup.rb and the other is log_cleanup_dev.rb. The
only difference between them is the database connect string (i.e. one
connects to a production database, and the script in question connects
to a dev database).

The first curious thing to me is that this error was raised on July
*2nd* (it also happened on July 8th, btw). The second (less) curious
thing is that this was only raised by the dev script. It's less
curious, because the reason the script failed in both instances was
because the DBA was doing some maintenance on the dev database on those
mornings.

# Relevant snippet from log_cleanup_dev.rb (line 68 noted):

# Logger instance
log = Logger.new("log_cleanup_dev.log", "monthly")

# and later...
begin
sth = dbh.prepare("begin my_proc; end;")
sth.execute
rescue DBI::Error => e
msg = "Error executing stored procedure 'my_proc': #{e.errstr}"
log.fatal{ msg } # Line 68
raise
rescue => e
msg = "Unexpected error executing stored procedure 'my_proc': #{e}"
log.fatal{ msg }
raise
ensure
sth.finish if sth
end

Any ideas? It looks to me like it's trying to rotate when it shouldn't
or something, but I haven't dug too deeply into this yet.

Regards,

Dan

Ara.T.Howard · Jul 11, 2005

Hi all,

Ruby 1.8.2
Solaris 10

I have a couple Ruby scripts running via cron that connect to an Oracle
database and execute some stored procedures - nothing fancy. On Saturday,
July 2, 2005, the logger package spewed a strange error at me:

Your "cron" job on itwsp193910wss
log_cleanup_dev.rb

produced the following output:

/usr/local/lib/ruby/1.8/logger.rb:516:in `write': Shifting failed.
'log_cleanup_dev.log.20050630' already exists. (Logger::ShiftingError)
from /usr/local/lib/ruby/1.8/logger.rb:318:in `add'
from /usr/local/lib/ruby/1.8/logger.rb:399:in `fatal'
from log_cleanup_dev.rb:68

First, a bit of background. There are two nearly identical scripts that run
- one is log_cleanup.rb and the other is log_cleanup_dev.rb. The only
difference between them is the database connect string (i.e. one connects to
a production database, and the script in question connects to a dev
database).

The first curious thing to me is that this error was raised on July *2nd* (it
also happened on July 8th, btw). The second (less) curious thing is that
this was only raised by the dev script. It's less curious, because the
reason the script failed in both instances was because the DBA was doing some
maintenance on the dev database on those mornings.

# Relevant snippet from log_cleanup_dev.rb (line 68 noted):

# Logger instance
log = Logger.new("log_cleanup_dev.log", "monthly")

# and later...
begin
sth = dbh.prepare("begin my_proc; end;")
sth.execute
rescue DBI::Error => e
msg = "Error executing stored procedure 'my_proc': #{e.errstr}"
log.fatal{ msg } # Line 68
raise
rescue => e
msg = "Unexpected error executing stored procedure 'my_proc': #{e}"
log.fatal{ msg }
raise
ensure
sth.finish if sth
end

Any ideas? It looks to me like it's trying to rotate when it shouldn't or
something, but I haven't dug too deeply into this yet.

Regards,

Dan

probably what happened is that

first cron starts, then hangs since db is being maintained

second cron job starts, connects, begins logging and rolls

first cron jobs finally blows up/times out. raises error, attempts to roll
log, and then fails to roll it since second already did.

i've seen things like this many, many times in our production system whe
things are cron'd. basically cron jobs can stack if one or more hangs and
code rarely deals with this gracfully. of course, this may not be your issue
but looking at timestamps in the log file in question, the mtime of the log
file itself, and the cron interval should give you a clue. if it's cron'd
with any frequency i'd put money on this.

the way i've worked around this for some of my stuff, like rq and dirwatch, is
to make the code run as a daemon. the daemon aquires a lockfile and refuses
to start if one is already running. then i simply attempt to start the daemon
via cron every fifteen minutes. that one only one instance ever runs at a
time, which greatly simplifies making robust systems. it's also nice in the
case of things which are supposed to run at, say, 7pm - but don't since the
system crashed at 6:59. the cron would simply restart the daemon when the
machine comes up and then whatever work gets done gets done at 7:01 or
whatever.

another thought is nfs. mtime cannot be trusted on nfs unless you sysads are
completely stellar and totally understand nfs setups wrt inode caching -
which, in my experience, is extremely rare.

for what it's worth i use the builtin logger in around 200 production
applications and have never seen an error - even on nfs - so i'm guessing
the logger code is o.k. then again i don't use monthly rolling and don't
exercise that code. i looked at the monthly code - it look fine to me
but...

hth and good luck.

-a
--
===============================================================================
| email :: ara [dot] t [dot] howard [at] noaa [dot] gov
| phone :: 303.497.6469
| My religion is very simple. My religion is kindness.
| --Tenzin Gyatso
===============================================================================

More on logger hang-ups	2	Apr 5, 2010
Logger rotation problem in multi-process/Rails environment	7	Oct 5, 2006
gem install failure - probably easy to fix	6	Sep 2, 2009
Poblem with RubyGems after upgrade to 1.8.7p334	1	Mar 10, 2011
aws::s3 intermittent failure with -> SSL_write:: bad write retry	0	Jul 7, 2007
Failure to load "openssl.bundle" (RuntimeError) - need help	0	Jan 15, 2008
net/ssh makes log4r fail	0	Oct 19, 2007
Temporary failure in name resolution	1	Aug 16, 2006

Strange logger rotation failure

Daniel Berger

Ara.T.Howard

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads