Reducing file lines in awk

11-18-2011

Hi,

I need to compare $3 and $4 of the first record with $1 and $2 of the second record, respectively. If they match, I then check whether $2 of the first record equals $4 of the second record. If it does, the two records should be reduced to a single record, as in the desired output.

Input_file

Code:

desired output file:

Code:

1 1 4 1 
3 1 3 2

vasanth.vadalur

11-19-2011

I think this does what you are looking for:

Code:

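awk '
    {
        if( NR > 1 )
        {
            split( $0, b, " " );        # b = current record, a = pending record
            if( b[1] == a[3] && b[2] == a[4]  && a[2] == b[4] )
            {
                b[1] = a[1];            # current continues pending: keep the pending start
                b[2] = a[2];
            }
            else                        # no match: the pending record is complete
                printf( "%s %s %s %s\n", a[1], a[2], a[3], a[4] );

            for( i = 1; i < 5; i++ )    # current (possibly merged) becomes pending
                a[i] = b[i];
        }
        else
            split( $0, a, " " );        # first record only primes a
    }

    END {
        if( NR > 0 )                    # nothing to flush when the input is empty
            printf( "%s %s %s %s\n", a[1], a[2], a[3], a[4] );
    }
' input-file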

Might be possible to refine it, but off the top of my head the output from your sample matches what you posted as desired.

agama

11-19-2011

Hi,

Thanks.

With the same logic, for the input file below:

Code:

 
1 2 3 4       
1.275 3 1.325 3 
1.275 3 1.225 3.025 
1.325 3 1.375 3
1.375 3 1.425 3 
1.425 3 1.475 3 
1.475 3 1.525 3
1.525 3 1.575 3
1.625 3 1.575 3 
1.625 3 1.675 3 
1.675 3 1.725 3 
1.725 3 1.775 3 
1.775 3 1.825 3 
1.825 3 1.875 3 
1.875 3 1.925 3

Expected output

Code:

1 2 3 4    
1.275 3 1.925 3 
1.275 3 1.225 3.025

But the output I got is

Code:

1 2 3 4
1.275 3 1.325 3
1.275 3 1.225 3.025
1.325 3 1.575 3
1.625 3 1.575 3
1.625 3 1.925 3

There are still repeated entries.

Where did it go wrong?

---------- Post updated at 02:09 AM ---------- Previous update was at 12:36 AM ----------

Hi,

Since the algorithm is becoming confusing, I have changed it as follows:

If $2 == $4, add an extra column, $5, holding the value of $2.

Find the minimum of $1 and the maximum of $3.

The final output will be:

min($1) common-value max($3) common-value

vasanth.vadalur

11-19-2011

Quote:

Originally Posted by vasanth.vadalur

Expected output

Code:

1 2 3 4    
1.275 3 1.925 3 
1.275 3 1.225 3.025

But the output I got is

Code:

1 2 3 4
1.275 3 1.325 3       
1.275 3 1.225 3.025
1.325 3 1.575 3       
1.625 3 1.575 3
1.625 3 1.925 3

There are still repeated entries.

Where did it go wrong?

Well, actually it didn't go wrong. Your original post indicated that only sequential lines in the file need to be tested, and I inferred that the 'new line' was to be matched against the next line in the file if there was a match. The programme is doing exactly this and the output you see is expected given those parameters.
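
A minimal sketch of that adjacent-only behaviour, using three lines taken from the sample input. The function name `merge_adjacent` is made up for illustration, and the awk body is only a condensed restatement of the first script in this thread (positional fields instead of `split`), so treat it as a sketch rather than a replacement.

```shell
# Hypothetical helper: merge a record into the pending one only when it
# directly follows it in the file and continues it.
merge_adjacent() {
    awk '
        NR > 1 {
            if ($1 == p3 && $2 == p4 && p2 == $4) {
                p3 = $3; p4 = $4      # extend the pending record, keep its start
                next
            }
            print p1, p2, p3, p4      # chain broken: flush the pending record
        }
        { p1 = $1; p2 = $2; p3 = $3; p4 = $4 }
        END { if (NR) print p1, p2, p3, p4 }
    ' "$@"
}

# File order as posted: the middle line breaks the chain, nothing merges.
printf '%s\n' '1.275 3 1.325 3' '1.275 3 1.225 3.025' '1.325 3 1.375 3' | merge_adjacent
# prints all three lines unchanged

# Same lines with the two chainable records adjacent: they merge.
printf '%s\n' '1.275 3 1.325 3' '1.325 3 1.375 3' '1.275 3 1.225 3.025' | merge_adjacent
# prints:
# 1.275 3 1.375 3
# 1.275 3 1.225 3.025
```

So the "repeats" come from records that would chain but are separated by a line that does not continue them.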

I'm thinking about the minimum/maximum redefinition of the problem.

---------- Post updated at 11:50 ---------- Previous update was at 11:30 ----------

I'm not as confident in this as I don't know what combinations fields 2 and 4 might take. I've made an assumption based on your example and this does work for it, but there might be other unexpected results. Have a go with this and see how it does:

Code:

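awk '
    {
        idx = $2 "," $4;                    # group records by their ($2, $4) pair
        if( min[idx] == ""  ||  min[idx] > $1+0 )
             min[idx] = $1+0;               # smallest $1 seen for this pair
        if( max[idx] == "" || max[idx] < $3+0 )
             max[idx] = $3+0;               # largest $3 seen for this pair
    }

    END {
        for( x in min )                     # for-in order is unspecified
        {
            split( x, a, "," );
            printf( "%.3f %.3f %.3f %.3f\n", min[x], a[1], max[x], a[2] );
        }
    }
'  input-file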
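
For comparison, here is a rough sketch that follows the restated rule more literally: only rows where $2 equals $4 are reduced (grouped by that common value), and every other row is passed through. The function name `reduce_by_common` and the arrays `lo`, `hi`, `order` and `isgroup` are invented for this example. It assumes the common value is written identically on every row (a row with `3` and one with `3.0` would land in different groups) and that lines with fewer than four fields can be skipped.

```shell
# Hypothetical helper: reduce rows with $2 == $4 to min($1) / max($3) per
# common value; pass all other rows through in their original position.
reduce_by_common() {
    awk '
        NF < 4 { next }                          # assumption: skip blank or short lines

        $2 == $4 {
            k = $2
            if (!(k in lo)) {
                order[++n] = k; isgroup[n] = 1   # group is printed where it first appeared
                lo[k] = $1; hi[k] = $3
            } else {
                if ($1 + 0 < lo[k] + 0) lo[k] = $1
                if ($3 + 0 > hi[k] + 0) hi[k] = $3
            }
            next
        }

        { order[++n] = $1 " " $2 " " $3 " " $4; isgroup[n] = 0 }

        END {
            for (i = 1; i <= n; i++) {
                if (isgroup[i]) {
                    k = order[i]
                    print lo[k], k, hi[k], k
                } else {
                    print order[i]
                }
            }
        }
    ' "$@"
}

printf '%s\n' '1 2 3 4' '1.275 3 1.325 3' '1.275 3 1.225 3.025' \
              '1.325 3 1.375 3' '1.625 3 1.575 3' | reduce_by_common
# prints:
# 1 2 3 4
# 1.275 3 1.575 3
# 1.275 3 1.225 3.025
```

Unlike the `%.3f` version, this keeps the numbers as they were written in the input and prints each group at the position of its first row, at the cost of holding all output lines in memory until the end.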

Last edited by agama; 11-19-2011 at 12:30 PM.. Reason: typo

agama
