Full Discussion: awk last n lines of file
Shell Programming and Scripting: awk last n lines of file. Post 302906460 by Don Cragun on Thursday 19th of June 2014, 11:16:51 PM
I keep getting lost. You have shown several different bits and pieces of your code and commented that you have made other changes without showing them to us. The code below makes some guesses at what you want based on my understanding from re-reading all of the posts in this thread.

You still refuse to show us the format you want for the sums and averages of the times the system has been running for the last 7 and 30 entries in the log file. So, I have arbitrarily decided to print those values in the following format:
Code:
n line sum: h:mm; average: h:mm:ss

where n is the number of lines used from the end of the file to perform the calculations. (If there are fewer than 7 or 30 lines in the log file, the output will tell you how many lines were actually used. The sums are shown in hours and minutes. The averages are shown in hours, minutes, and seconds.)
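As a small worked illustration of that format (the seven uptime values here are invented, in minutes), the sum and average can be printed the same way the script below does:

```shell
# Hypothetical uptimes, in minutes, for 7 log entries:
printf '%s\n' 125 200 75 60 90 180 30 |
awk '{ t += $1 }
END {
	n = NR			# lines actually used
	a = t / n		# average in minutes
	printf("%d line sum: %d:%02d; ", n, int(t / 60), t % 60)
	printf("average: %d:%02d:%02.0f\n", int(a / 60), int(a % 60), (a % 1) * 60)
}'
# 760 minutes total -> "7 line sum: 12:40; average: 1:48:34"
```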

There are several things that still make absolutely no sense to me, but I think the script below does what you asked for (even if it doesn't make sense to me).

First some notes before we get into the code:
  1. If there is any chance that a script can be run close to midnight, it is dangerous to make multiple calls to the date utility to gather bits and pieces of various date components. The code below only invokes date once and extracts all of the date information used in the script from that one invocation.
  2. Once a machine has been up for just 4 days and 4 hours (100 hours), the substring extraction you're using on the uptime output will no longer work, because the hours field grows to three digits. (For example, after 4 days, 9 hours, and 19 minutes, uptime reports 105:19, and your current code would put 105:1 in your log file.) The code below instead stores just the minutes (without the "min" or "mi" suffix) or the complete hours and minutes in each log file entry.
  3. I repeat that using /home/uplog.txt instead of $HOME/uplog.txt will fail if more than one user on a system runs this script unless they are running with root privileges.
  4. I repeat that the number you are extracting from the uptime output says absolutely nothing about how long a given user has been working on a project, how long they have been logged in, or anything else related to any particular user. The number you are using just tells you how long it has been since the machine you are using was rebooted. I see absolutely no reason why adding or averaging 7 or 30 of these values would be important to anyone!
  5. This script uses echo to add a line to the end of your log file. The awk script then processes that log file. If, and only if, the log file contains more than 200 (or maxrec) lines, the awk script will rewrite the log file in place throwing away all but the last maxrec lines. Until you are confident that this works correctly, make a backup copy of your log file between the echo that adds a line to the log file and the awk script that rewrites the log file and writes the averages file.
  6. This awk script should work for the uptime formats you have described for your version of the uptime utility. (The BSD and OS X versions of the uptime utility produce considerably more complex output formats than you have described for your version of uptime. This script will not work correctly with the BSD or OS X versions of uptime.)
  7. This awk script uses two circular buffers. The array l[] holds up to maxrec lines from the log file. After maxrec lines have been read, the next input line overwrites the oldest input line so when we get to the end of file, l[] holds only the last maxrec lines that were in the file. The array mins[] holds up to 30 up time values (after converting them to minutes) from the last 30 lines read from the log file.
  8. The 7 and 30 for the number of lines to be averaged and the 200 for the number of lines to be kept in the log file are parameterized, so you just need to change the number used to initialize nl_av1, nl_av2, and maxrec, respectively, to have the script use whatever values you want. It will automatically adjust such that nl_av1 and nl_av2 will never be greater than maxrec.
  9. The averages are saved in the file /home/averages.txt.
  10. The input and output file pathnames are also parameterized.
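To illustrate the minutes conversion mentioned in item 7 (the two sample field values here are invented), the same split() on ":" or "," that the script below applies to $5 turns an hours:minutes value into minutes, while a bare minutes value passes through unchanged:

```shell
# Invented uptime fields as they would appear in a log file entry:
printf '%s\n' '105:19' '44' |
awk '{
	# nf == 1 means the field was already plain minutes.
	nf = split($1, hm, /[:,]/)
	v = (nf == 1) ? hm[1] : hm[1] * 60 + hm[2]
	print v
}'
# prints 6319, then 44
```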

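The circular-buffer technique described in item 7 can also be sketched in isolation (maxrec=3 and the letter input are arbitrary choices for the demo):

```shell
printf '%s\n' a b c d e |
awk -v maxrec=3 '
# Overwrite the oldest slot each time; l[] never holds more than maxrec lines.
{ l[NR % maxrec] = $0 }
# Walk the buffer in age order to recover the last maxrec input lines.
END {
	for (i = NR - maxrec + 1; i <= NR; i++)
		print l[i % maxrec]
}'
# prints the last three input lines: c, d, e
```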
Now for the code:
Code:
#!/bin/bash
# Set script related variables:
# Input and output files:
input="/home/uplog.txt"			# Input log file (this file will be
					# updated in place if it contains more
					# than "maxrec" records).
output_averages="/home/averages.txt"	# Output averages file.

# Record counts:
maxrec=200				# Maximum # of lines to keep in log file.
nl_av1=7				# # of lines to be used for 1st average.
nl_av2=30				# # of lines to be used for 2nd average.
# Time and date stuff:
read V R <<-EOF
	$(date '+%x %A  %d/%m/%y week %V')
EOF
T=$((86400/3600))			# Hours in a day (the hard way)

# Data for log entry:
machine=$(uname -n)			# Node name.

# Get time since reboot from uptime:
IFS=' ,' read junk junk H_or_HM junk <<-EOF
	$(uptime)
EOF

echo $T "not yet"			# ???
echo $USER				# shows me the actual user
echo "today is" $R 

# Add entry to log file:
echo "uptime $USER on $machine   $H_or_HM   $V" >> "$input"

# Process log file:
awk -v maxrec="$maxrec" -v nl_av1="$nl_av1" -v nl_av2="$nl_av2"  \
	-v avoutf="$output_averages" '
# Normalize script variables:
BEGIN {	# Do not accept averages for more lines than we are saving...
	# Do not average more input lines than we are keeping in the output.
	if(nl_av1 > maxrec) {
		printf("nl_av1 reduced from %d to %d.\n", nl_av1, maxrec)
		nl_av1 = maxrec
	}
	if(nl_av2 > maxrec) {
		printf("nl_av2 reduced from %d to %d.\n", nl_av2, maxrec)
		nl_av2 = maxrec
	}
	# Set size of uptime data circular buffer:
	m = nl_av1 > nl_av2 ? nl_av1 : nl_av2
}
# Save input file pathname in case we need to trim it to maxrec lines.
NR == 1 {
	inf = FILENAME
}
# Process input log file data:
{	# Save last maxrec lines in input circular buffer (l[]):
	l[NR % maxrec] = $0
	# Convert uptime "x mins," or "hours:minutes," to minutes and save it in
	# another circular buffer (mins[]).
	nf = split($5, hm, /[:,]/)
	if (nf == 1) {
		# We just have minutes.
		mins[NR % m] = hm[1]
	} else {# We have hours and minutes separated by a ":".
		mins[NR % m] = hm[1] * 60 + hm[2]
	}
	printf("mins[%d] = %d from %s\n", NR % m, mins[NR % m], $5)
}
# Function to calculate sum and average uptime from last "n" entries in the
# mins[] circular buffer.
# Returns number of lines actually used in calculations.
function av(n,		i) {
	if(n > NR) {
		# Calculate sum and average based on all saved lines.
		n = NR
	}
	t = 0
	for(i = NR - n + 1; i <= NR; i++)
		t += mins[i % m]
	a = t / n
	# printf("av(%d): total=%d minutes, average=%.2f minutes\n", n, t, a)
	return n
}
# Function to convert "m" minutes to "H" hours, "M" minutes, and "S" seconds.
function m2hms(m) {
	S = (m % 1) * 60		# seconds
	M = int(m % 60)			# minutes
	H = int(m / 60)			# hours
}
# We have hit EOF on input, rewrite input file if we found more than "maxrec"
# lines, calculate and print both sums and averages.
END {	# If we had more than "maxrec" input lines, trim input file to "maxrec"
	# lines.
	if(NR > maxrec) {
		for(i = NR - maxrec + 1; i <= NR; i++)
			print l[i % maxrec] > inf
		printf("Input file %s trimmed from %d lines to %d lines\n",
			inf, NR, maxrec)
	} else	printf("Input file %s: %d lines processed.\n", inf, NR)
	# Print averages.
	L = av(nl_av1)
	m2hms(t)
	printf("%d line sum: %d:%02d; ", L, H, M) > avoutf
	m2hms(a)
	printf("average: %d:%02d:%02.0f\n", H, M, S) > avoutf
	L = av(nl_av2)
	m2hms(t)
	printf("%d line sum: %d:%02d; ", L, H, M) > avoutf
	m2hms(a)
	printf("average: %d:%02d:%02.0f\n", H, M, S) > avoutf
}' "$input"

There is nothing bash-specific about this script. It will also work with ksh or any other shell that meets basic POSIX shell syntax requirements.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.