Visit Our UNIX and Linux User Community

Top Forums Shell Programming and Scripting Need to print duplicate row along with highest version of original Post 302829709 by Yoda on Friday 5th of July 2013 04:52:40 PM
Old 07-05-2013
Here is a brief explanation:
Code:
awk '
        # Skip first two records of input file
        NR > 2 {
                # Set variable V = $0 (current record)
                V = $0
                # Remove first field to get the description in variable: V value
                sub ( $1, X, V )
                # Remove leading and trailing space from description in variable: V value
                gsub ( /^[ ]*|[ ]*$/, X, V )
                # Create indexed array: R with 1st and 2nd field separated by comma
                R[++c] = $1 "," V
                Check if associative array: A contain record indexed by variable: V value
                if ( V in A )
                {
                        If yes, compare if existing vale is less that 1st field value
                        if ( A[V] < $1 )
                        {
                                Set associate array: M = $1 (maximum value)
                                M[V] = $1
                        }
                }
                # If associative array: A does not contain record indexed by V value
                else
                {
                        Set associative array: A indexed by V = $1
                        A[V] = $1
                        Set associative array: M indexed by V = $1
                        M[V] = $1
                }
        }
        # END Block
        END {
                # For each element in indexed array: R
                for ( i = 1; i <= c; i++ )
                {
                        # Split record separated by comma into array: T
                        n = split ( R[i], T, "," )
                        # Print records that are having duplicates and not having maximum value
                        if ( M[T[n]] != T[1] && M[T[n]] )
                                print T[1], T[n], "IS DUPLICATE OF \"" M[T[n]], T[n] "\""
                }
        }
' file

This User Gave Thanks to Yoda For This Post:
 
Test Your Knowledge in Computers #468
Difficulty: Medium
Both Ethernet and IPv4 use an all-zeros broadcast address to indicate a broadcast packet.
True or False?

10 More Discussions You Might Find Interesting

1. Linux Benchmarks

Original BYTE UNIX Benchmarks (Version 3.11)

Just dusted off an old version of the Byte UNIX Benchmarks from our old benchmark days at http://linux.silkroad.com/ and ran them against www.unix.com: ============================================================== BYTE UNIX Benchmarks (Version 3.11) System -- Linux www 2.4.20 #2 Mon... (0 Replies)
Discussion started by: Neo
0 Replies

2. UNIX for Dummies Questions & Answers

about UNIX? original version?

sorry for my English We'll report about Unix in my school, for Operating Systems subject... with Installation demo.... I'm wondering if System V, which is from original developers AT&T still exist and downloadable? because I cant find it anywhere... then i found out that Solaris, MacOS... (4 Replies)
Discussion started by: slowchem
4 Replies

3. UNIX for Advanced & Expert Users

tar: how to preserve atime? (also on extracted version, not just original)

How do I make tar set the correct atime on the extracted version? The option --atime-preserve works just on the original, not on the extracted file. The extracted files always have current time as atime, which is bad. (10 Replies)
Discussion started by: frankie06
10 Replies

4. Shell Programming and Scripting

How to delete a duplicate line and original with sed.

I am completely new to shell scripting but have been assigned the task of creating several batch files to manipulate data. My final task requires me to find lines that have duplicates present then delete not only the duplicate but the original as well. The script will be used in a windows... (9 Replies)
Discussion started by: chino_1
9 Replies

5. UNIX for Dummies Questions & Answers

Print line with highest value from one column

Hi everyone, This is my first post, but I have already received a lot of help from the forums in the past. Thanks! I've searched the forums and my question is very similar to an earlier post entitled "Printing highest value from one column", which I am apparently not yet allowed to post a... (1 Reply)
Discussion started by: dliving3
1 Replies

6. Shell Programming and Scripting

Print the key with highest value

print the key with highest value input a 10 a 20 a 30 b 2 b 3 b 1 output a 30 b 3 (9 Replies)
Discussion started by: quincyjones
9 Replies

7. Shell Programming and Scripting

Remove duplicates and update last 2 digits of the original row with 0's

Hi, I have a requirement where I have to remove duplicates from a file based on the first 8 chars (It is fixed width file of 10 chars length) and whenever a duplicate row is found, its original row's last 2 chars should be updated to all 0's. I thought of using sort -u -k 1.1,1.8... (4 Replies)
Discussion started by: farawaydsky
4 Replies

8. Shell Programming and Scripting

Filtering out duplicates with the highest version number

Hi, I have a huge text file with filenames which which looks like the following ie uniquenumber_version_filename: e.g. 1234_1_xxxx 1234_2_vfvfdbb 343333_1_vfvfdvd 2222222_1_ggggg 55555_1_xxxxxx 55555_2_vrbgbgg 55555_3_grgrbr What I need to do is examine the file, look for... (4 Replies)
Discussion started by: mantis
4 Replies

9. Shell Programming and Scripting

Need to show highest version line from the list

Hi All, Need help here, can you tell me the syntax to line grep the highest file version? 0 04-05-2016 08:00 lib/SBSSchemaProject.jar/schemas/ 0 04-05-2016 08:00 lib/SBSSchemaProject.jar/schemas/airprice/ 0 04-05-2016 08:00 ... (2 Replies)
Discussion started by: 100rin
2 Replies

10. Shell Programming and Scripting

Print whole line with highest value from one column

Hi, I have a little issue right now. I have a file with 4 columns test0000002,10030010330,c_,218 test0000002,10030010330,d_,202 test0000002,10030010330,b_,193 test0000002,10030010020,c_,178 test0000002,10030010020,b_,170 test0000002,10030010330,a_,166 test0000002,10030010020,a_,151... (3 Replies)
Discussion started by: Ebk
3 Replies
PSC(1)							      General Commands Manual							    PSC(1)

NAME
psc - prepare sc files SYNOPSIS
psc [-fLkrSPv] [-s cell] [-R n] [-C n] [-n n] [-d c] DESCRIPTION
Psc is used to prepare data for input to the spreadsheet calculator sc(1). It accepts normal ascii data on standard input. Standard out- put is a sc file. With no options, psc starts the spreadsheet in cell A0. Strings are right justified. All data on a line is entered on the same row; new input lines cause the output row number to increment by one. The default delimiters are tab and space. The column for- mats are set to one larger than the number of columns required to hold the largest value in the column. OPTIONS
-f Omit column width calculations. This option is for preparing data to be merged with an existing spreadsheet. If the option is not specified, the column widths calculated for the data read by psc will override those already set in the existing spreadsheet. -L Left justify strings. -k Keep all delimiters. This option causes the output cell to change on each new delimiter encountered in the input stream. The default action is to condense multiple delimiters to one, so that the cell only changes once per input data item. -r Output the data by row first then column. For input consisting of a single column, this option will result in output of one row with multiple columns instead of a single column spreadsheet. -s cell Start the top left corner of the spreadsheet in cell. For example, -s B33 will arrange the output data so that the spreadsheet starts in column B, row 33. -R n Increment by n on each new output row. -C n Increment by n on each new output column. -n n Output n rows before advancing to the next column. This option is used when the input is arranged in a single column and the spreadsheet is to have multiple columns, each of which is to be length n. -d c Use the single character c as the delimiter between input fields. -P Plain numbers only. A field is a number only when there is no imbedded [-+eE]. -S All numbers are strings. -v Print the version of psc SEE ALSO
sc(1) AUTHOR
Robert Bond PSC 7.16 19 September 2002 PSC(1)

Featured Tech Videos

All times are GMT -4. The time now is 03:45 AM.
Unix & Linux Forums Content Copyright 1993-2021. All Rights Reserved.
Privacy Policy