cron/logrotate chicken and egg problem


 
# 1  
Old 09-11-2008

I have run into a problem on about a dozen machines, all running the same x86_64 2.6.12 GNU/Linux. For some reason these machines fill up their /var partition (10G) because their logs never get rotated. There is no error message from logrotate (it would be in /var/log/messages), and the last time logrotate ran (according to /var/log/logrotate.log) was August 11, 2008.

This is a puzzling problem, and it looks (however unlikely) as though the fault lies with cron rather than logrotate, i.e. cron failed first, so logrotate never ran. I had originally thought that logrotate just couldn't do its job because there was no space available for the rotation to occur. But even when I gave it enough space, cron never ran the logrotate script. On top of that, I added my own "append timestamp to file" script to cron.hourly, and it never ran either. As a test, on one of the servers, I restarted cron... BINGO, it fixed the whole thing.

I couldn't care less about the fact that I can fix it by hand. I want to prevent this from ever happening in the field. In my lab it is one thing, but in the field, there go all my logs for debugging.

Here is a sanity check I ran (just proving that cron was functioning and then stopped functioning, all within the same uptime):
Code:
[root@babylon5 ~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda2             9.5G  3.7G  5.3G  42% /
/dev/sda1             190M   18M  163M  10% /boot
/dev/shm              2.0G  4.0K  2.0G   1% /dev/shm
/dev/sda6              44G  8.4G   33G  21% /stuff
/dev/sda5             9.5G  9.5G     0 100% /var

[root@babylon5 ~]# ls -lut /etc/init.d/crond
-rwxr-xr-x  1 root root 1904 Aug 11 15:14 /etc/init.d/crond

[root@babylon5 ~]# ls -lut /etc/crontab
-rw-r--r--  1 root root 255 Jul 15 10:19 /etc/crontab

[root@babylon5 ~]# grep "Aug 11 15" /var/log/cron
Aug 11 15:01:02 babylon5 crond[9429]: (root) CMD (run-parts /etc/cron.hourly)

[root@babylon5 ~]# ps auxwww|grep cr[o]n
root      3014  0.0  0.0  60404  1296 ?        Ss   Jul15   0:00 crond

[root@babylon5 ~]# uptime
 13:33:34 up 57 days,  3:15,  1 user,  load average: 15.63, 15.37, 15.29

[root@babylon5 ~]# grep "STARTUP" /var/log/cron*
/var/log/cron.4:Jul 14 14:16:44 babylon5 crond[3156]: (CRON) STARTUP (V5.0)
/var/log/cron.4:Jul 15 08:53:45 babylon5 crond[3008]: (CRON) STARTUP (V5.0)
/var/log/cron.4:Jul 15 10:19:35 babylon5 crond[3014]: (CRON) STARTUP (V5.0)

[root@babylon5 ~]# tail -n 1 /var/log/cron.4
Jul 20 00:01:01 babylon5 crond[15555]: (root) CMD (run-parts /etc/cron.hourly)

[root@babylon5 ~]# tail -n 1 /var/log/cron.3
Jul 27 01:01:02 babylon5 crond[27941]: (root) CMD (run-parts /etc/cron.hourly)

[root@babylon5 ~]# tail -n 1 /var/log/cron.2
Aug  3 01:01:01 babylon5 crond[17755]: (root) CMD (run-parts /etc/cron.hourly)

[root@babylon5 ~]# tail -n 1 /var/log/cron.1
Aug 10 01:01:02 babylon5 crond[29697]: (root) CMD (run-parts /etc/cron.hourly)

[root@babylon5 ~]# tail -n 1 /var/log/cron
Sep 10 13:01:01 babylon5 crond[22186]: (root) MAIL (mailed 146 bytes of output but got status 0x007f )

[root@babylon5 ~]# ll /var/log/cron*
-rw-r--r--  1 root root 138502 Sep 10 13:01 /var/log/cron
-rw-------  1 root root  14407 Aug 10 01:01 /var/log/cron.1
-rw-------  1 root root  14581 Aug  3 01:01 /var/log/cron.2
-rw-------  1 root root  14483 Jul 27 01:01 /var/log/cron.3
-rw-------  1 root root  14513 Jul 20 00:01 /var/log/cron.4

Any help?
# 2  
Old 09-11-2008
Can you be sure that the jobs aren't being run if you are relying on logs stored in /var to check? If /var is 100% full, it's quite possible that the jobs *are* running, but simply can't add a log entry to say so. Did your "append timestamp to file" test script use a file on another filesystem?

Having said that, I think it's quite likely that cron would stop working once /var becomes 100% full... there must be some pretty big logs being generated somewhere to fill a 9.5GB /var?
# 3  
Old 09-11-2008
The "append timestamp to file" test used /tmp... which is on /dev/sda2.
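
A minimal sketch of that kind of test, for reference. The output path below is an illustrative assumption, not the exact one from the affected machines; the only thing that matters is that the file lives on a filesystem other than /var.

```shell
#!/bin/sh
# Hypothetical canary script: drop it into /etc/cron.hourly/ and make it executable.
# It appends one timestamped line per run to a file that is NOT on /var,
# so a full /var cannot hide whether cron actually ran the job.
CANARY_LOG="${CANARY_LOG:-/tmp/cron-canary.log}"   # assumed path, on / rather than /var
date '+%Y-%m-%d %H:%M:%S cron.hourly ran' >> "$CANARY_LOG"
```

If cron is healthy, the file gains one line per hour; if the last line stops advancing while crond is still in the process list, cron has stalled.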

If you look at the output from "ll /var/log/cron*" you will see that messages continued to be written to /var/log/cron after August 10. The last line of the cron.1 file confirms that August 10 was when the last rotation of the cron log happened. Given that the cron logs rotate weekly, one would expect the next rotation to occur on August 17... but it did not. Also, because the cron log is so small (~14k), it is clear that logrotate did not fail on the 17th for lack of space on the filesystem... instead logrotate just never ran.

According to /var/log/logrotate.log, the last time that logrotate ran was:
1218492062, which translates to August 11th, 2008, 10:01:02 PM (UTC)
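
For anyone wanting to check the conversion, this is roughly how it can be done. It is a sketch that assumes GNU date (the -d @SECONDS form is a GNU extension). The result is in UTC, which would line up with the 15:01:02 cron.hourly entry in /var/log/cron if the machine's local time is UTC-7 (an assumption on my part).

```shell
# Convert the epoch value from /var/log/logrotate.log to a readable UTC date.
# LC_ALL=C keeps the month name and AM/PM marker in English.
last_run=$(LC_ALL=C date -u -d @1218492062 '+%B %d, %Y %I:%M:%S %p UTC')
echo "$last_run"
# -> August 11, 2008 10:01:02 PM UTC
```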

As for the logs being generated, yeah, they are big, but it's normal for my app with debug on max...
# 4  
Old 09-11-2008
Have you established whether cron stops running all jobs when it's in this state? Just the cron.hourly? Or just jobs for a specific user (i.e. root in this case)? Also, are there any other cron jobs stuck at the time?

Have you tried strace-ing a crond that's stuck like this? Also, does a pkill -HUP crond "wake" it up again?

It might also be worth running cron with some debugging options, although they don't seem to be very well documented. The best reference (for Vixie cron) I've found so far is:

Mac OS X Manual Page For cron(8)
# 5  
Old 09-12-2008
I tried to use GDB but it would appear that the symbols are stripped from the crond binary, as the backtrace only provided the following:
Code:
(gdb) bt
#0  0x00002aaaaae6d5a0 in ?? ()
#1  0x00002aaaaae6d3f4 in ?? ()
#2  0x0000555555557772 in ?? () from /usr/sbin/crond
#3  0x0000000000000000 in ?? ()

I HUPed the cron daemon and waited to see if the test script ran, but unfortunately it did not... even though the script writes to somewhere other than the /var partition (namely /tmp).

As for strace, I was unaware that I could attach strace to a process that is already running... if that is so, can you reference an example? I will gladly kick it off as a test.
# 6  
Old 09-13-2008
Just a silly question:
Are you sure you have inodes available on /var?
('df -i', depending on your O/S)

If your inode table is full, you may have space on the disk but you can't make a new file to rotate the data into.
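
A quick way to check, as a rough sketch. It assumes a df that accepts -P and -i together (GNU coreutils does); /var is simply the partition from this thread.

```shell
# Report inode usage for the filesystem holding /var.
# -P forces POSIX single-line output so the columns are stable to parse.
target=/var
df -Pi "$target"

# Pull out the IUse% column (5th field of the data line) without the percent sign.
inode_pct=$(df -Pi "$target" | awk 'NR==2 { gsub("%", "", $5); print $5 }')
echo "inode usage on $target: ${inode_pct}%"
```

An IUse% of 100 with free blocks still available would point at inode exhaustion rather than disk space.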
# 7  
Old 09-14-2008
strace -f -p <pid> should do the trick (-p attaches to the already-running process, -f follows any children it forks).
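
A slightly fuller sketch of attaching to the stuck daemon. The helper name trace_proc and the /tmp output path are illustrative assumptions; it needs root, and pgrep from procps to look up the PID.

```shell
# Attach strace to an already-running daemon by name (default: crond).
trace_proc() {
    name="${1:-crond}"
    # -o picks the oldest matching process, -x requires an exact name match.
    pid=$(pgrep -o -x "$name") || {
        echo "no $name process found" >&2
        return 1
    }
    # -f follows forked children, -tt adds timestamps,
    # -o writes the trace to a file that is not on /var.
    strace -f -tt -o "/tmp/$name.strace" -p "$pid"
}
```

Run trace_proc with no arguments, wait past the top of the hour, then press Ctrl-C to detach and read /tmp/crond.strace to see which system call crond is sitting in.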