Sponsored Content
Top Forums Shell Programming and Scripting Remove somewhat Duplicate records from a flat file Post 302560291 by Corona688 on Thursday 29th of September 2011 11:18:45 AM
Old 09-29-2011
It keeps stripping them out because you're not putting them in code tags.

It could be as simple as
Code:
awk 'NF>5' < infile > outfile

to exclude all records with less than 6 fields.

Or, if some 'short' fields do NOT have duplicates, then:

Code:
awk '{if(NF == 6)  {        K=$1 $3 $4 $5 $6;        }
        else           {        K=$1 $2 $3 $4 $5; }

        if(!L[K]) { O[N++]=K; L[K]=$0; }
        else if(length(L[K]) < length($0)) L[K]=$0; }
END { for(M=0; M<N; M++) print L[O[M]]; }' < data

 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Inserting records from flat file to db table

I have 20000 numbers present in a file in each line like 25663, 65465, 74579, 56446, .. .. I have created a table in db with single number column in it. create table testhari (no number(9)); I want to insert all these numbers into that table. how can i do it? can anybody please... (4 Replies)
Discussion started by: Hara
4 Replies

2. Shell Programming and Scripting

Remove all instances of duplicate records from the file

Hi experts, I am new to scripting. I have a requirement as below. File1: A|123|NAME1 A|123|NAME2 B|123|NAME3 File2: C|123|NAME4 C|123|NAME5 D|123|NAME6 1) I have 2 merge both the files. 2) need to do a sort ( key fields are first and second field) 3) remove all the instances... (3 Replies)
Discussion started by: vukkusila
3 Replies

3. Solaris

How to remove duplicate records with out sort

Can any one give me command How to delete duplicate records with out sort. Suppose if the records like below: 345,bcd,789 123,abc,456 234,abc,456 712,bcd,789 out tput should be 345,bcd,789 123,abc,456 Key for the records is 2nd and 3rd fields.fields are seperated by colon(,). (2 Replies)
Discussion started by: svenkatareddy
2 Replies

4. Shell Programming and Scripting

How to remove duplicate records with out sort

Can any one give me command How to delete duplicate records with out sort. Suppose if the records like below: 345,bcd,789 123,abc,456 234,abc,456 712,bcd,789 out tput should be 345,bcd,789 123,abc,456 Key for the records is 2nd and 3rd fields.fields are seperated by colon(,). (19 Replies)
Discussion started by: svenkatareddy
19 Replies

5. Shell Programming and Scripting

Sorting the records in the Flat file

Hi all, I am using this command "sort -d -u -k1 IMSTEST.74E -o tmp.txt" to the records in the flat. Can any tell me how to sort the file except first line in the file For ex: i/p First line: DXYZ Second line : jumy third : cmhk fourth : andy Output should... (5 Replies)
Discussion started by: sudhir_barker
5 Replies

6. Shell Programming and Scripting

How do I load records from table to a flat file!

Hi All, I need to load records from oracle table XYZ to a flat file say ABC.dat. could any one tell me how do i do this in UNXI, Regards Ann (1 Reply)
Discussion started by: Haque123
1 Replies

7. Shell Programming and Scripting

Remove duplicate records

I want to remove the records based on duplicate. I want to remove if two or more records exists with combination fields. Those records should not come once also file abc.txt ABC;123;XYB;HELLO; ABC;123;HKL;HELLO; CDE;123;LLKJ;HELLO; ABC;123;LSDK;HELLO; CDF;344;SLK;TEST key fields are... (7 Replies)
Discussion started by: svenkatareddy
7 Replies

8. Shell Programming and Scripting

Remove Duplicate Records

Hi frinds, Need your help. item , color ,desc ==== ======= ==== 1,red ,abc 1,red , a b c 2,blue,x 3,black,y 4,brown,xv 4,brown,x v 4,brown, x v I have to elemnet the duplicate rows on the basis of item. the final out put will be 1,red ,abc (6 Replies)
Discussion started by: imipsita.rath
6 Replies

9. Shell Programming and Scripting

Deleting duplicate records from file 1 if records from file 2 match

I have 2 files "File 1" is delimited by ";" and "File 2" is delimited by "|". File 1 below (3 record shown): Doc1;03/01/2012;New York;6 Main Street;Mr. Smith 1;Mr. Jones Doc2;03/01/2012;Syracuse;876 Broadway;John Davis;Barbara Lull Doc3;03/01/2012;Buffalo;779 Old Windy Road;Charles... (2 Replies)
Discussion started by: vestport
2 Replies

10. Shell Programming and Scripting

Remove duplicate records

Hi, i am working on a script that would remove records or lines in a flat file. The only difference in the file is the "NOT NULL" word. Please see below example of the input file. INPUT FILE:> CREATE a ( TRIAL_CLIENT NOT NULL VARCHAR2(60), TRIAL_FUND NOT NULL... (3 Replies)
Discussion started by: reignangel2003
3 Replies
IGAWK(1)							 Utility Commands							  IGAWK(1)

NAME
igawk - gawk with include files SYNOPSIS
igawk [ all gawk options ] -f program-file [ -- ] file ... igawk [ all gawk options ] [ -- ] program-text file ... DESCRIPTION
Igawk is a simple shell script that adds the ability to have ``include files'' to gawk(1). AWK programs for igawk are the same as for gawk, except that, in addition, you may have lines like @include getopt.awk in your program to include the file getopt.awk from either the current directory or one of the other directories in the search path. OPTIONS
See gawk(1) for a full description of the AWK language and the options that gawk supports. EXAMPLES
cat << EOF > test.awk @include getopt.awk BEGIN { while (getopt(ARGC, ARGV, "am:q") != -1) ... } EOF igawk -f test.awk SEE ALSO
gawk(1) Effective AWK Programming, Edition 1.0, published by the Free Software Foundation, 1995. AUTHOR
Arnold Robbins (arnold@skeeve.com). Free Software Foundation Nov 3 1999 IGAWK(1)
All times are GMT -4. The time now is 02:32 AM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy