Reducing file lines in awk


 
# 1  
Old 11-18-2011

Hi,

I need to compare fields $3 and $4 of each record with fields $1 and $2 of the next record, respectively. If they match, and $2 of the first record also equals $4 of the second record, the two records should be reduced to a single record, as in the desired output below.

Input file:
Code:
1 1 2 1
2 1 3 1
3 1 4 1
3 1 3 2

Desired output file:
Code:
1 1 4 1 
3 1 3 2

# 2  
Old 11-19-2011
I think this does what you are looking for:

Code:
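awk '
    {
        if( NR > 1 )
        {
            # a[] holds the previous (possibly merged) record, b[] the current one
            split( $0, b, " " );
            # merge when the previous end (a[3], a[4]) equals the current start
            # (b[1], b[2]) and the previous $2 equals the current $4
            if( b[1] == a[3] && b[2] == a[4] && a[2] == b[4] )
            {
                b[1] = a[1];
                b[2] = a[2];
            }
            else
                printf( "%s %s %s %s\n", a[1], a[2], a[3], a[4] );

            # the current (or merged) record becomes the previous one
            for( i = 1; i < 5; i++ )
                a[i] = b[i];
        }
        else
            split( $0, a, " " );
    }

    END {
        # flush the last pending record
        printf( "%s %s %s %s\n", a[1], a[2], a[3], a[4] );
    }
' input-file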

Might be possible to refine it, but off the top of my head the output from your sample matches what you posted as desired.
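
For readers who want to try the rule on the sample data, here is a minimal sketch of the same adjacent-record merge written with plain fields instead of split(). It is a compact variant, not the code posted above; the function name merge_adjacent and the p1..p4 variables are made up for illustration, and it assumes four whitespace-separated columns on every line.

```shell
# Hypothetical helper: reads four-column records on stdin and merges chained neighbours.
merge_adjacent() {
    awk '
        NR > 1 {
            # chain test: previous end (p3,p4) equals current start ($1,$2)
            # and the previous $2 equals the current $4
            if ($1 == p3 && $2 == p4 && p2 == $4) {
                p3 = $3; p4 = $4      # extend the pending record to the current end
                next
            }
            print p1, p2, p3, p4      # no chain: emit the pending record
        }
        { p1 = $1; p2 = $2; p3 = $3; p4 = $4 }   # current record becomes pending
        END { if (NR) print p1, p2, p3, p4 }     # flush the last pending record
    '
}

# Sample input from the first post
printf '%s\n' '1 1 2 1' '2 1 3 1' '3 1 4 1' '3 1 3 2' | merge_adjacent
# prints:
# 1 1 4 1
# 3 1 3 2
```

Like the posted script, this only ever compares a line with the one directly before it, so records that belong together but are not adjacent in the file are left alone.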
# 3  
Old 11-19-2011
Hi,

Thanks.

Using the same logic on the input file below:

Code:
 
1 2 3 4       
1.275 3 1.325 3 
1.275 3 1.225 3.025 
1.325 3 1.375 3
1.375 3 1.425 3 
1.425 3 1.475 3 
1.475 3 1.525 3
1.525 3 1.575 3
1.625 3 1.575 3 
1.625 3 1.675 3 
1.675 3 1.725 3 
1.725 3 1.775 3 
1.775 3 1.825 3 
1.825 3 1.875 3 
1.875 3 1.925 3


Expected output
Code:
1 2 3 4    
1.275 3 1.925 3 
1.275 3 1.225 3.025

But the output I got is:
Code:
1 2 3 4
1.275 3 1.325 3
1.275 3 1.225 3.025
1.325 3 1.575 3
1.625 3 1.575 3
1.625 3 1.925 3

There are still repeated records.

Where did it go wrong?

---------- Post updated at 02:09 AM ---------- Previous update was at 12:36 AM ----------

Hi,

Since the algorithm is becoming confusing, I have changed it to the following:

If $2 == $4, add an extra column $5 holding that common value.

Find the minimum of $1 and the maximum of $3.

The final output will be:

min($1) common_value max($3) common_value
# 4  
Old 11-19-2011
Quote:
Originally Posted by vasanth.vadalur
Expected output
Code:
1 2 3 4    
1.275 3 1.925 3 
1.275 3 1.225 3.025

But the output I got is:
Code:
1 2 3 4
1.275 3 1.325 3       
1.275 3 1.225 3.025
1.325 3 1.575 3       
1.625 3 1.575 3
1.625 3 1.925 3

There are still repeated records.

Where did it go wrong?

Well, actually it didn't go wrong. Your original post indicated that only sequential lines in the file need to be tested, and I inferred that the 'new line' was to be matched against the next line in the file if there was a match. The programme is doing exactly this and the output you see is expected given those parameters.
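
To see where the sequential rule stops merging, a small diagnostic sketch along these lines can help. It is not part of the script above; the function name find_breaks is made up, and it simply compares each raw line with the line before it using the same three-way test.

```shell
# Hypothetical diagnostic: list every line that does NOT chain onto the line before it.
find_breaks() {
    awk '
        NR > 1 && !($1 == p3 && $2 == p4 && p2 == $4) { print NR ": " $0 }
        { p2 = $2; p3 = $3; p4 = $4 }   # remember the fields needed for the next test
    '
}

# Second input file from the thread
find_breaks <<'EOF'
1 2 3 4
1.275 3 1.325 3
1.275 3 1.225 3.025
1.325 3 1.375 3
1.375 3 1.425 3
1.425 3 1.475 3
1.475 3 1.525 3
1.525 3 1.575 3
1.625 3 1.575 3
1.625 3 1.675 3
1.675 3 1.725 3
1.725 3 1.775 3
1.775 3 1.825 3
1.825 3 1.875 3
1.875 3 1.925 3
EOF
# prints:
# 2: 1.275 3 1.325 3
# 3: 1.275 3 1.225 3.025
# 4: 1.325 3 1.375 3
# 9: 1.625 3 1.575 3
# 10: 1.625 3 1.675 3
```

Five breaks in fifteen lines leave six records, which is the number of lines in the output that was reported. Line 9 runs "backwards" (its $1 is larger than its $3), so a purely sequential chain cannot join it to either neighbour.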

I'm thinking about the minimum/maximum redefinition of the problem.

---------- Post updated at 11:50 ---------- Previous update was at 11:30 ----------

I'm not as confident in this as I don't know what combinations fields 2 and 4 might take. I've made an assumption based on your example and this does work for it, but there might be other unexpected results. Have a go with this and see how it does:

Code:
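awk '
    {
        # group records by the ($2, $4) pair
        idx = $2 "," $4;
        # track the smallest $1 and the largest $3 seen for each pair
        if( min[idx] == "" || min[idx] > $1+0 )
             min[idx] = $1+0;
        if( max[idx] == "" || max[idx] < $3+0 )
             max[idx] = $3+0;
    }

    END {
        # note: for( x in min ) visits the pairs in no particular order
        for( x in min )
        {
            split( x, a, "," );
            printf( "%.3f %.3f %.3f %.3f\n", min[x], a[1], max[x], a[2] );
        }
    }
' input-file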
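
Because awk does not define the order in which for( x in min ) walks the array, the lines can come out in a different order from one awk to another. A minimal sketch of the same grouping wrapped in a shell function and piped through sort gives repeatable output; the function name minmax_by_pair is made up, and the use of the in operator instead of the == "" test is a stylistic swap, not a change of logic.

```shell
# Hypothetical wrapper: min of $1 and max of $3 per ($2,$4) pair, sorted for stable output.
minmax_by_pair() {
    awk '
        {
            key = $2 "," $4
            # first sighting of the pair, or a new extreme: update the stored value
            if (!(key in min) || $1 + 0 < min[key]) min[key] = $1 + 0
            if (!(key in max) || $3 + 0 > max[key]) max[key] = $3 + 0
        }
        END {
            for (key in min) {
                split(key, part, ",")
                printf "%.3f %.3f %.3f %.3f\n", min[key], part[1], max[key], part[2]
            }
        }
    ' | sort
}

# Sample input from the first post
printf '%s\n' '1 1 2 1' '2 1 3 1' '3 1 4 1' '3 1 3 2' | minmax_by_pair
# prints:
# 1.000 1.000 4.000 1.000
# 3.000 1.000 3.000 2.000
```

Note that this groups every record by its ($2, $4) pair, including pairs where $2 and $4 differ, so it is slightly broader than the "if $2 == $4" wording of the restated problem.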


Last edited by agama; 11-19-2011 at 12:30 PM.. Reason: typo