I work on Solaris 8, 9 and 10 platforms and have run into a problem: my wtmpx files appear to be corrupted, as every entry shows a date in 1970 (the Unix epoch).
Now this is obviously not the case, so my query is:
1 - Can the existing wtmpx files be manipulated to provide correct dates?
2 - Is it possible to recreate the wtmpx files with historic data?
I have seen data "messed up" on boxes that have excessively large utmpx or wtmpx files, or that have disk space issues.
There is no known bug (or patch) for this problem (AFAIK), so I would look at your file sizes, and maybe use dd to rescue the good data from earlier in the file.
The utmpd daemon writes to utmpx and wtmpx, so you can stop the daemon for a minute while you rename the old files.
How big is the wtmpx file? If it is very near 2 GB, it is too big.
Is (or was) the disc partition containing the wtmpx file full?
Is the output from "who -u" correct (it comes from the utmpx file) ?
There is a known bug in Solaris if the hardware realtime clock was wrong (or had a flat battery) at the time the system was booted. Correcting the time with the unix "date" command is not enough.
"Piping to a file"? sounds unlikely.
Something like this (assuming /var/tmp has enough space).
Then look at the early records to see how long this file has been in use.
There is no automatic cleanup process; you need to write one.
I normally start a new wtmpx file each week and keep 8 weeks' worth.
Some sites would require more history, but bear in mind that once a wtmpx file gets to be a year old, the output from commands like "last" can be nonsense.
When starting a new file, copy the old file to a new name with "cp -p" and then null the wtmpx file (only) with ">wtmpx". Do not rename the wtmpx file or you will have to create a new file and fix the permissions and stop/start the accounting process.
Do not touch the utmpx file. If you null that one you will have to reboot the server to fix the problem.
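As a rough sketch of the copy-then-null rotation described above (not a tested production script): the function name `rotate_wtmpx`, the numbered `.1`, `.2` suffixes and the default of 8 saved copies are my own choices for illustration. It uses POSIX shell syntax, so on Solaris it would need ksh or /usr/xpg4/bin/sh rather than the legacy /bin/sh.

```shell
#!/bin/sh
# rotate_wtmpx FILE [KEEP]
# Hypothetical helper: FILE would normally be /var/adm/wtmpx.
# It only ever handles the file it is given; never point it at utmpx.
rotate_wtmpx() {
    file=$1
    keep=${2:-8}                 # assumed default: keep 8 saved copies

    # Shift the saved copies up by one: FILE.1 -> FILE.2, FILE.2 -> FILE.3 ...
    # The oldest copy (FILE.KEEP) is overwritten.
    i=$keep
    while [ "$i" -gt 1 ]; do
        prev=$((i - 1))
        if [ -f "$file.$prev" ]; then
            mv "$file.$prev" "$file.$i"
        fi
        i=$prev
    done

    # Copy with "cp -p" so mode and timestamps are preserved, then null the
    # live file in place. The live file is never renamed, so its inode and
    # permissions stay as they were.
    cp -p "$file" "$file.1" || return 1
    : > "$file"
}
```

It could be called weekly from root's crontab as something like `rotate_wtmpx /var/adm/wtmpx 8`. Any record written between the copy and the truncation is lost, which is usually acceptable for a weekly rotation.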
I've used dd to read "records" up to a certain point in the old file and move them to a new file, discarding the rest.
/usr/lib/acct/wtmpfix - this will attempt to correct dates, but it does not always work. The fact that it exists testifies to wtmpx corruption being an old problem.
methyl is right about rotating accounting files, very important to do.
I made up the 100000 in the dd example below; you have to determine the point where the file goes south and can no longer be fixed:
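A minimal sketch of that kind of dd call, assuming 372-byte records (the size usually quoted for Solaris wtmpx; confirm it on your own system before relying on it). The function name `keep_first_records` and the file names in the usage line are hypothetical, and 100000 is the same made-up count.

```shell
#!/bin/sh
# keep_first_records SRC DST COUNT [RECSIZE]
# Copy only the first COUNT fixed-length records of SRC into DST.
keep_first_records() {
    src=$1
    dst=$2
    count=$3
    recsize=${4:-372}            # assumption: 372 bytes per wtmpx record

    # One dd block per record, so count= is a number of records.
    dd if="$src" of="$dst" bs="$recsize" count="$count" 2>/dev/null
}

# Example (hypothetical paths, made-up record count):
#   keep_first_records /var/adm/wtmpx.old /var/tmp/wtmpx.good 100000
```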
You want to keep as many records as possible.
And I don't think it is a bug, per se. Sun used to explicitly tell you to rotate accounting logs to avoid corruption. And fwtmp was made for sysadmins who did not read that warning; I guess they got tired of hearing about it.
Last edited by jim mcnamara; 12-29-2011 at 02:28 PM..
I've tried to avoid mentioning "wtmpfix" because it is a very specific repair tool and not really relevant to this problem. Historically I have run "wtmpfix" every day on a system which was feeding data to a commercial stats package.
I have used "fwtmp" to export a wtmp file for basic repairs following a power failure, and a few more times when a computer had been started with the clock set incorrectly or had run out of disc space. I would normally copy the old file and start a new file before attempting any repair. If the length of history actually matters, you can combine the repaired file with the current file. It's easier to have a script which runs "last" on current and saved files.
I don't rely on wtmp for long-term "last login" history and prefer keeping a brief rolling history in the user's home directory. In fact, for me the main use of wtmp is for basic weekly server usage statistics tabulated by IP network. I also maintain the "btmp" files in parallel and check them automatically for basic hack attempts.
In my experience, a corrupt wtmpx (or wtmp) file is usually due to a write to the file being interrupted in the middle of writing a record. This means that log entries after this event will be shifted by a number of bytes that is not a whole record.
The file has fixed-length records. When a corrupted file is read from the start, there is somewhere a record which is shorter than the record length, and from that point on the reading program is out of sync with the records.
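One quick check that follows from the fixed record length: if the file size is not a whole multiple of the record size, that is at least consistent with a short record somewhere in the file. This is only a sketch; the function name `check_record_alignment` is made up, and the 372-byte default is an assumption about Solaris wtmpx that should be verified on your system. A remainder of 0 does not prove the file is intact, and a non-zero remainder on a live file may just be a record in the middle of being written.

```shell
#!/bin/sh
# check_record_alignment FILE [RECSIZE]
# Report how many whole records fit in FILE and how many bytes are left over.
# Returns 0 when the size is an exact multiple of RECSIZE, 1 otherwise.
check_record_alignment() {
    file=$1
    recsize=${2:-372}            # assumption: 372 bytes per wtmpx record

    # Some wc implementations pad the count with spaces, so strip them.
    size=$(wc -c < "$file" | tr -d ' ')

    whole=$((size / recsize))
    left=$((size % recsize))
    echo "$whole whole records, $left bytes left over"
    [ "$left" -eq 0 ]
}

# Example (hypothetical path to a saved copy):
#   check_record_alignment /var/tmp/wtmpx.copy
```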
So the way to fix the file is to find and remove the incomplete record. This can be done in a binary-capable editor such as Emacs (I have used that), where you look for recurring patterns to find the start of records, and when you find the short record you remove that and save the file. Formatting it with fwtmp will aid you in finding the number of records you need to pass before reaching the faulty record.
Possibly a simpler method would be to use dd in intelligent ways to first read the uncorrupted part of the file and then skip an offset of a number of bytes until you get output which can be formatted correctly by fwtmp.
What I am getting at is that you don't have to throw away the last part of the file, the information can be recovered by using my method.
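The dd variant could be sketched roughly like this. It assumes you already know how many good records precede the damage (for instance from counting fwtmp output lines) and then try different values for the length of the short record until fwtmp formats the result sensibly. The function name `salvage_tail`, its arguments and the 372-byte default record size are assumptions for illustration, not a fixed recipe.

```shell
#!/bin/sh
# salvage_tail SRC DST GOOD JUNK [RECSIZE]
#   GOOD = number of intact records before the damaged one
#   JUNK = length in bytes of the short (incomplete) record, at least 1
# Writes everything after the short record to DST, so that DST starts on a
# record boundary again.
salvage_tail() {
    src=$1
    dst=$2
    good=$3
    junk=$4
    recsize=${5:-372}            # assumption: 372 bytes per wtmpx record
    tmp=$dst.tmp

    # Step 1: skip the records already known to be good.
    dd if="$src" of="$tmp" bs="$recsize" skip="$good" 2>/dev/null || return 1

    # Step 2: drop the short record. With bs=JUNK, skip=1 removes exactly
    # JUNK bytes from the front. A regular file is used instead of a pipe so
    # that dd always reads full blocks.
    dd if="$tmp" of="$dst" bs="$junk" skip=1 2>/dev/null || return 1

    rm -f "$tmp"
}

# Example (hypothetical numbers and paths), then inspect DST with fwtmp:
#   salvage_tail /var/tmp/wtmpx.copy /var/tmp/wtmpx.tail 100000 140
```

Together with the good records read from the start of the file, this keeps both halves and discards only the incomplete record.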
Last edited by sebofo; 01-24-2012 at 05:56 AM..
Reason: Added info