Downtime last night
Last night around 9PM PST, some of the Lighthouse mongrels started acting funny and returning errors. By the time I had logged in to take a look, Ezra at Engineyard had already restarted the mongrels. However, the app didn’t come back up properly. After going through some log files, I found the cause: a log file had grown to an enormous size and filled up the drive. After I cleared the log and restarted, everything came right back up. Total downtime was about 15 minutes.
I immediately filed a ticket with EY to get some extra logs from some new plugins that were installed, and to get a hard drive size increase on our slices. Sorry for the troubles lately. We’re working on some stuff behind the scenes so we can keep better tabs on how the application is performing.
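For anyone who wants to catch this kind of failure before the drive fills up, here is a minimal sketch of a check that could run from cron. It is not what we actually run: the `log` directory, the 500 MB size limit, the 90% disk threshold, and the function names are all assumptions made up for illustration.

```python
import shutil
from pathlib import Path


def find_large_logs(log_dir, max_bytes):
    """Return (path, size) pairs for *.log files in log_dir larger than max_bytes, biggest first."""
    large = []
    for path in Path(log_dir).glob("*.log"):
        size = path.stat().st_size
        if size > max_bytes:
            large.append((path, size))
    return sorted(large, key=lambda item: item[1], reverse=True)


def disk_usage_fraction(path):
    """Return the used fraction (0.0 to 1.0) of the filesystem that contains path."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total


if __name__ == "__main__":
    # Hypothetical values: adjust the directory and thresholds for your own app.
    log_dir = Path("log")
    max_log_bytes = 500 * 1024 * 1024
    max_disk_fraction = 0.90

    if log_dir.is_dir():
        for path, size in find_large_logs(log_dir, max_log_bytes):
            print(f"WARNING: {path} is {size / 1024 / 1024:.0f} MB")
        fraction = disk_usage_fraction(log_dir)
        if fraction > max_disk_fraction:
            print(f"WARNING: disk holding {log_dir} is {fraction:.0%} full")
```

A check like this only warns. It does not rotate or clear anything, so it works best alongside proper log rotation rather than instead of it.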
Update: We actually had some more downtime yesterday (Feb 5th) related to the same plugin (a different problem, though). The offending plugin has been updated.



Discussion
Logs filling up the HDs – sounds familiar. Has happened to me several times at EY, including just the night before last. Seems like a weak point.
Did you solve this problem? Which plugin did you update, if it’s not a secret?
I can’t say which plugin right now. It’s been brought up with the author and fixed, though, and I’ll be able to say more later.
Ben: sure, just make sure EY sets up chronolog for you. I just asked them to set it up for log/*.log. I have the bj and viking plugins creating their own logs now too.