awk last n lines of file

06-15-2014

Registered User

12,315, 4,560

Join Date: Jul 2012

Last Activity: 22 November 2019, 4:29 PM EST

Location: San Jose, CA, USA

Posts: 12,315

Thanks Given: 952

Thanked 4,560 Times in 3,818 Posts

Quote:

Originally Posted by 1in10

@Don Cragun
Much ambition means many errors, no errors means no trouble at all, I agree that this is a task for me.
From the very outset I want to set this file to just 200 entries, nothing more. And catch the last seven and in a further step the last thirty lines of the fifth field for a calculation. The text-string " swap " could be any other. BTW I am not a pro so: I do not know anything about circular buffers, excuse me.
My OS here is debian wheezy 7.5, no server.
So I cut out the first statement do direct it after the awk-statement to the stdout. As you may see, this could be a beginner, but I assure you I am right in the middle, because this is my third week around with awk.

We don't care if you're a beginner. As long as you want to learn, we want to help you.

But, to help you we need to understand what you're trying to do.

You want to

Quote:

set this file to just 200 entries

??? (The 1st 200 entries? The last 200 entries? What constitutes an entry?)

I don't understand what you mean by:

Quote:

So I cut out the first statement do direct it after the awk-statement to the stdout.

Show us a sample of your input file. Give us details about the format of this file, the size of the file, the field separators in the file, etc.

Explain to us in detail in English what you want to do to that input.

Show us a sample of the output you want your script to produce.

Don Cragun

06-15-2014

Registered User

122, 4

Join Date: Jun 2014

Last Activity: 10 August 2017, 2:46 PM EDT

Location: Brazil

Posts: 122

Thanks Given: 38

Thanked 4 Times in 4 Posts

Code:

#!/bin/bash 

#path=/home/Desktop/bashes
machine=$(uname -n);
R=`date +%A'  '%d'/'%m'/'%y' die '%V'.'week`;
V=`date +%x`              #will be used later
T=$((86400/3600));          #will be used later    

echo $T "not yet";

# shows me the actual user
echo $USER;

if (( "$T" < 31 ))
then 
    echo "today is" $R 
    else :
fi;

# I do know that this notation below of string4 and string5 is not the pretty version, but I want to keep it!!!

string4=`uptime`
string5=`date +%x`

echo "uptime" $USER "an" $machine " " ${string4:13:5} " " ${string5}  | head -c 10K >> /home/uplog.txt | awk  'END{print NR " full " NF$5; exit}' /home/uplog.txt

This is the whole script. Indeed, I just want the file to have a maximum of 200 lines. Nothing else.
If I cut out the first >> /home/uplog.txt, the file won't be updated, so it may remain. I tested it without and it stopped at that line. What gave me some hope was an old thread right here that was turning the file upside down.

This script shall keep the last 200 uptime values of a user, that simple. Furthermore, I want to fetch the last seven and the last thirty entries for a calculation: average uptime and total uptime.
Beyond this, once the uptime reaches a certain value, I want to change my MAC address or trigger a redial.
@Don Cragun sure I am willing to learn, I think that keeps me afloat. Since I do not have at least five posts here yet, I have to wait before I can post the link. The user who gave that hint is cfajohnson, and his answer dates back to 2007.

Last edited by Don Cragun; 06-15-2014 at 11:41 PM.. Reason: Change ICODE tags to CODE tags.

1in10

06-16-2014

You don't need 5 posts to cut and paste sample data into a post (just like you did with your code). It looks like you're saying you have a file (in a very strange place unless you're running as root) that contains lines like:

Code:

uptime username an   ays,   MM/DD/YYYY
uptime username an   days,   MM/DD/YYYY
            or
uptime username an   N day   MM/DD/YYYY

depending on how many days the machine has been up (where username, MM, DD, and YYYY are obvious and N is the last digit in the number of days the system has been running if the system has been up for more than 999 days). (Of course, your version of uptime may print something else in the 5 characters starting in position 15.)

And, this would mean the output from awk could be something like:

Code:

325 full 506/15/2014
       or
325 full 6day

assuming you had collected 324 samples before you ran the script and that awk saw the last sample you added to the log using echo and head.
Note that the head in this script is a fairly expensive no-op. And, the awk can be replaced by a set and an echo.

I repeat. Please tell us in English what you are trying to do!
Show us sample input!
Show us desired output!

Keeping only the last 200 lines in your log file can be done by:

Code:

(tail -n 199 /home/uplog.txt
echo "uptime" $USER "an" $machine " " ${string4:13:5} " " ${string5}) > newuplog.txt && mv newuplog.txt /home/uplog.txt

(assuming that you are running this in a directory that is on the same filesystem as /home).
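The same idea can be wrapped in a small function. This is only a sketch: the helper name append_capped is hypothetical, and it assumes mktemp is available (it is on Debian) so that the temporary file is created next to the log and mv never has to cross filesystems.

```shell
#!/bin/sh
# append_capped FILE MAX LINE
# Hypothetical helper: append LINE to FILE and keep only the last MAX lines.
append_capped() {
    file=$1
    max=$2
    line=$3
    # Create the temporary file in the same directory as the log so that
    # mv is a plain rename. Note that mktemp creates the file with
    # owner-only permissions, and the log inherits them after mv.
    tmp=$(mktemp "$(dirname "$file")/uplog.XXXXXX") || return 1
    {
        # Keep the newest MAX-1 old lines (nothing if the log does not exist).
        [ -f "$file" ] && tail -n "$((max - 1))" "$file"
        printf '%s\n' "$line"
    } > "$tmp" && mv "$tmp" "$file"
}

# Demo on a scratch file: append 5 lines with a cap of 3.
demo=$(mktemp)
for i in 1 2 3 4 5; do
    append_capped "$demo" 3 "entry $i"
done
cat "$demo"
rm -f "$demo"
```

The demo prints entry 3, entry 4 and entry 5; in the real script the call would be something like append_capped /home/uplog.txt 200 "$newline".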

If this is code being run by a normal user, I would have expected it to use $HOME/uplog.txt rather than /home/uplog.txt.

How is capturing 7 or 30 copies of the word day possibly mixed with dates of the form MM/DD/YYYY going to help you calculate average or total uptime? This makes no sense to me. PLEASE SHOW US SAMPLE DATA!

Last edited by Don Cragun; 06-16-2014 at 01:51 AM.. Reason: Fix sample output produced by your echo.

Don Cragun

06-16-2014

My aim is just to set a maximum number of records (entries) for that file; that maximum shall be 200 lines. I also want to extract the last seven and the last thirty entries to compute their sum and their average: the sum and average of the last seven entries, and the same calculation for the last thirty.
For now, one line is added to the file each time I run the script in the interpreter. Later this should happen when shutting down the computer, because it is not a server. Therefore the script will be placed in /etc/rc0.d/ with a k-link. While I was trying to figure this out, the file grew to more than 3400 lines. Yes, I do use root for that purpose, so there is nothing strange about the location of any of the files.
I have also tried "tac", "sort -nrk5" and similar commands, but I want to go on with the upside-down example given by the user cfajohnson, shown in the code snippet below.

Code:

 awk '{x[NR] = $0}
  END { while ( NR > 0 ) print x[NR--] }' /home/uplog.txt;

I am hoping to find a solution with NR==1,NR==7 as the range for one calculation, e.g.

Code:

 awk '{sum=sum+$5} END {print sum}' /home/uplog.txt

and

Code:

 awk '{sum=sum+$5} END {print sum/NR}' /home/uplog.txt

for the average value and the sum of that row.
I suppose this should work for both targets: the first seven values (after turning the file upside down) and the first thirty values, even with the date in the format dd/mm/yyyy. So far string4 shows five characters starting at position 13.
The output so far is the last line of the file, as shown below.

Code:

 24 not yet
sandy
Today is monday  16/06/14 the 25.th Week
99 full 61:52

This output has just been adapted to English, but the date format remains dd/mm/yyyy. In this example the script on this machine has run 99 times, the 24 (hours) for the user (in this case sandy) are not completed, then come the date plus the week, 99 lines and the total value.

1in10

06-16-2014

Obviously your version of uptime produces significantly different output than uptime on the laptop I have running OS X. If you continue to refuse to show us sample data from /home/uplog.txt I can't help you any more.

How have you determined that the 24 (hours) for the user (in this case sandy) are not completed? The uptime utility reports how long a system has been running and what recent load averages are. It says absolutely nothing about how long sandy or any other user has been logged in. And, the last line of your output seems to show that this machine has been running for almost 62 hours.

You don't need tac to get the last 7 or 30 lines. You definitely don't want to use sort -nrk5 if you're trying to process the last 7 or 30 lines of your input file. If this script is being run by root, why does $USER expand to sandy?
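Taking the last 7 or 30 lines without reversing anything can be sketched with tail feeding awk. The helper name last_n_sum_avg and the sample lines are made up for illustration, and the sketch assumes the 5th field is a plain number (say, minutes), which is not what the log currently holds.

```shell
#!/bin/sh
# last_n_sum_avg FILE N
# Hypothetical helper: print the sum and average of field 5 over the last
# N lines of FILE. Assumes field 5 is a plain number.
last_n_sum_avg() {
    tail -n "$2" "$1" |
    awk '{ sum += $5 }
         END { if (NR > 0) printf "sum=%d avg=%.2f\n", sum, sum / NR }'
}

# Demo: ten made-up log lines whose 5th field is 10, 20, ..., 100.
demo=$(mktemp)
i=1
while [ "$i" -le 10 ]; do
    echo "uptime sandy an jarbo3 $((i * 10)) 17.06.2014" >> "$demo"
    i=$((i + 1))
done
last_n_sum_avg "$demo" 7
rm -f "$demo"
```

For the demo data the last seven values are 40 through 100, so this prints sum=490 avg=70.00. Calling it with 30 instead of 7 covers the second calculation; if the file has fewer lines than requested, tail simply returns what is there.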

It is nice that you have learned how to emulate tac using awk, but unless there is some reason why you want to reverse the lines in your log file, that isn't what you need for this project.

If the 5th field in /home/uplog.txt is hours and minutes separated by a colon, sum+=$5 isn't even going to come close to doing what you want. In awk the command sum+=$5 will keep a running sum of integer or floating point values; it won't sum up values given as hours and minutes.
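To make that concrete: with values such as 2:45, 1:30 and 61:52, awk's numeric conversion only uses the leading digits, so sum+=$5 would add 2 + 1 + 61. A minimal sketch of one way around it, assuming the 5th field always has the form H:MM (the helper name sum_hhmm and the sample lines are hypothetical), is to split each value on the colon and total the minutes:

```shell
#!/bin/sh
# sum_hhmm: read lines from standard input, treat field 5 as H:MM,
# and print the total as H:MM. Hypothetical helper for illustration.
sum_hhmm() {
    awk '{
             # split "61:52" into t[1]=61 (hours) and t[2]=52 (minutes)
             if (split($5, t, ":") == 2)
                 total += t[1] * 60 + t[2]
         }
         END { printf "%d:%02d\n", int(total / 60), total % 60 }'
}

printf '%s\n' \
    'uptime sandy an jarbo3 2:45 17.06.2014' \
    'uptime sandy an jarbo3 1:30 18.06.2014' \
    'uptime sandy an jarbo3 61:52 19.06.2014' | sum_hhmm
```

The three sample values come to 165 + 90 + 3712 = 3967 minutes, so the sketch prints 66:07. Lines whose 5th field does not contain exactly one colon are skipped, which matters because uptime switches to a "N days," format on long-running machines.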

Thank you for showing us what your script produces. But, we know that isn't the output you want. So, please:

Show us the output from the uptime command on your system.
Show us what the last 35 lines are in /home/uplog.txt!
Show us exactly what output you want to have produced from those 35 lines.

If you'll do that for us, we'll show you how to use circular buffers in awk to save the last 200 lines of your input file, to get sums and/or averages from the last 7 lines in your input file, and how to get sums and/or averages from the last 30 lines in your input file.
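For readers who have not met the term, one possible shape of a circular buffer in awk looks like the sketch below. It is an illustration only: the helper name keep_last is hypothetical, and the real script would depend on the sample data requested above.

```shell
#!/bin/sh
# keep_last FILE N
# Hypothetical helper: print the last N lines of FILE using a circular
# buffer, so awk never stores more than N lines at once.
keep_last() {
    awk -v n="$2" '
        { buf[NR % n] = $0 }   # slot NR % n overwrites the line seen n lines ago
        END {
            # Oldest line still in the buffer; line 1 if the file is short.
            start = (NR > n) ? NR - n + 1 : 1
            for (i = start; i <= NR; i++)
                print buf[i % n]
        }' "$1"
}

# Demo: a 7-line scratch file, keep the last 3 lines.
demo=$(mktemp)
for i in 1 2 3 4 5 6 7; do echo "line $i"; done > "$demo"
keep_last "$demo" 3
rm -f "$demo"
```

The demo prints line 5, line 6 and line 7. The same indexing works for the sums: storing $5 instead of $0 in the buffer keeps the last N values available in the END block.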

Don Cragun

06-17-2014

@Don Cragun
Eagle-eyed Don Cragun was right: my first attempt worked, but it is confusing. I corrected that part to the following variables and their output. But I still do not agree with your point of view on using "head -c". All I could find on that was ulimit for setting a limit. So I left that first "head -c" in.
Thanks for pointing out that first mess. I was focused entirely on that awk.

Code:

machine="Today at: ";
machine=$(printf "%s %s" "$machine" "$(uname -n)");

and

Code:

stringZ=`uptime`
stringZ=$(printf "%s %s" "$stringZ" `date +%x`);

That gives me the following output

Code:

24 not yet
sandy
uptime sandy   Today at: jarbo3   2:45

Here jarbo3 is the name of the computer.
This is the current content of that file; I deleted the old one because of that confusion. It does not contain the awk output yet.

Code:

22:00:55 up 2:45, 3 users, load average: 0.02, 0.04, 0.00 17.06.2014

I guess after clearing that, I can get a coffee to spend the time with awk. Thanks for being so harsh. :-)
There is no specific reason for making a difference between user and root on this computer, since I am the only user. And yes, the 24 hours will be changed as the script develops. When the uptime hits a certain value, e.g. 3 hours, it shall trigger a redial and a change of the MAC address as well.

Last edited by 1in10; 06-17-2014 at 10:30 PM..

1in10

06-18-2014

Please trust me. The:

Code:

| head -c 10K

in:

Code:

echo "uptime" $USER "an" $machine " " ${string4:13:5} " " ${string5}  | head -c 10K >> /home/uplog.txt

isn't doing anything but slowing down your script. If you replace that with:

Code:

echo "uptime" $USER "an" $machine " " ${string4:13:5} " " ${string5} >> /home/uplog.txt

it will produce the same output, but do it faster. We'll take care of your desired maximum number of lines to be kept later in your awk script.
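A quick way to check this yourself is to compare the two forms on one sample line. The line below is made up, and the sketch spells the limit as 10240 bytes, which is what GNU head understands by 10K, so that it also runs where the K suffix is not supported.

```shell
#!/bin/sh
# One made-up log line, far shorter than 10240 bytes.
line="uptime sandy an jarbo3 2:45 17.06.2014"

# head -c copies at most 10240 bytes; a short line passes through unchanged.
with_head=$(echo "$line" | head -c 10240)
without_head=$(echo "$line")

if [ "$with_head" = "$without_head" ]; then
    echo "identical"
else
    echo "different"
fi
```

This prints identical. The limit applies only to the single line passing through the pipe on each run, not to the size of the log file, which is why it cannot cap the file at 200 lines.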

We still need to see some sample lines from /home/uplog.txt and we still need to see an exact sample of the output you want to get when processing that input. (Note that you don't have to wait 24 hours between invocations of your script. If you invoke it once per minute for a half hour, you'll have enough data in your file to compute 7 and 30 entry sums and averages.) Until you show us sample data from that file, we can't help.

Don Cragun
