awk - Remove duplicates during array build Post: 302976803

Post 302976803 by chill3chee on Wednesday 6th of July 2016 11:17:18 AM

07-06-2016

I adapted your post to suit my requirements and it works great. As always, awesome, and thank you, RudiC.
I have a small question. My script uses two input files, and it runs into trouble reading the second file when written like this:

Code:

awk -F "@" ' NR==FNR { ....; next; } {  # second file processing
}' file1.txt file2.txt

It doesn't even seem to read the second file: I tested with some print statements in the second-file block and they never get printed. When I changed it to use if (NR!=FNR), the same block works great.

Code:

awk -F "@" ' { if(NR==FNR) {
....; next; }
if(NR!=FNR){
# second file processing
} }' file1.txt file2.txt

This works. I am not complaining about awk; I am sure I messed up somewhere and just cannot figure out where. But if (NR!=FNR) comes to my rescue at this point, so I am using it.
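For comparison, here is a minimal, self-contained sketch of the two forms side by side. The sample data, the seen array, and the temporary directory are all assumptions made up for illustration, not the real script or files. On this toy input both forms print the same lines, which suggests the difference seen in the real script comes from something not shown above.

```shell
# Hypothetical sample data; the real file1.txt and file2.txt are not shown in the post.
tmp=$(mktemp -d)
printf 'a@1\nb@2\n' > "$tmp/file1.txt"
printf 'a@x\nc@y\n' > "$tmp/file2.txt"

# Form 1: pattern-action pair, with next ending work on first-file records.
out1=$(awk -F "@" '
NR==FNR { seen[$1] = $2; next }                   # first file: remember field 2 by key
{ print $1, (($1 in seen) ? seen[$1] : "-") }     # reached only for the second file
' "$tmp/file1.txt" "$tmp/file2.txt")

# Form 2: explicit if tests inside a single action.
out2=$(awk -F "@" '{
    if (NR==FNR) { seen[$1] = $2; next }
    if (NR!=FNR) { print $1, (($1 in seen) ? seen[$1] : "-") }
}' "$tmp/file1.txt" "$tmp/file2.txt")

printf '%s\n' "$out1"
[ "$out1" = "$out2" ] && echo "both forms agree"
rm -rf "$tmp"
```

Note that each comment sits on a line where nothing else follows it: in awk a # comment runs to the end of the line, so a closing brace placed after a comment on the same line is not seen by the parser.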

Though I know it is not possible to pin down the issue without seeing the script and files, I am looking for a guess: has anyone ever faced a similar issue?
!T[$1 OFS $2 OFS $3]++ should have worked in my script. But strangely, it didn't. So I just tweaked it to:

Code:

if (!T[$1 OFS $2 OFS $3]) {
v_array[$1 OFS $2]=(v_array[$1 OFS $2]? v_array[$1 OFS $2] "," $3 : $3)
T[$1 OFS $2 OFS $3]="1"
}

and this works. I am not sure what difference there is between the ++ form and the explicit if, as they should be identical.

Thank you for your time.

chill3chee

LEARN ABOUT V7

join

JOIN(1) 						      General Commands Manual							   JOIN(1)

NAME

       join - relational database operator

SYNOPSIS

       join [ options ] file1 file2

DESCRIPTION

       Join  forms,  on the standard output, a join of the two relations specified by the lines of file1 and file2.  If file1 is `-', the standard
       input is used.

       File1 and file2 must be sorted in increasing ASCII collating sequence on the fields on which they are to be joined, normally the  first	in
       each line.

       There  is  one line in the output for each pair of lines in file1 and file2 that have identical join fields.  The output line normally con-
       sists of the common field, then the rest of the line from file1, then the rest of the line from file2.

       Fields are normally separated by blank, tab or newline.	In this case, multiple separators count as one, and leading  separators  are  dis-
       carded.

       These options are recognized:

       -an    In addition to the normal output, produce a line for each unpairable line in file n, where n is 1 or 2.

       -e s   Replace empty output fields by string s.

       -jn m  Join on the mth field of file n.	If n is missing, use the mth field in each file.

       -o list
	      Each  output line comprises the fields specified in list, each element of which has the form n.m, where n is a file number and m is a
	      field number.

       -tc    Use character c as a separator (tab character).  Every appearance of c in a line is significant.

SEE ALSO

       sort(1), comm(1), awk(1)

BUGS

       With default field separation, the collating sequence is that of sort -b; with -t, the sequence is that of a plain sort.

       The conventions of join, sort, comm, uniq, look and awk(1) are wildly incongruous.

																	   JOIN(1)
