If the 1st column of file f1 and file f2 is the same, then export the line with the maximum string of

Tags
shell scripts, sorting awk numbers mix letter

# 1  
Old 04-26-2018

Please help me write an awk command-line program that does the following. Thanks in advance.

Request description:
Compare two files, f1 and f2, and export the result to file f3:
1 Delete duplicate rows in file f1 and file f2.
2 If the 1st column of file f1 and file f2 is the same, then export the line with the maximum string in the 2nd column.
For example:
Code:
  
 0.1-37    < 0.2-53;   
 6.1.4-b.0 < 6.1.5-c.2;   
 9.13.2    < 11.5.6;    
 18b-16    > 8c-7;   
 D15       < F4;   
 1.b5_a    < 1.b12_d   
 4c5.8     < 4c12.8   
 d18g      < d18j

3 Rules for the 2nd column of the two files:
> The number of consecutive digits (0-9) may differ, such as 9.13.2 vs 11.5.6, or D15 vs F4.
> The type, order, and number of the other characters (such as '.', '_', '-', 'A-Z', 'a-z') are the same,
as in 6.1.4-b.0 vs 6.1.5-c.2, 1.b5_a vs 1.b12_d, D15 vs F4 ....
> As soon as a comparison finds the first larger component, stop comparing the 2nd column and output that file's line,
for example 'IO 1.b5_a' in f1 and 'IO 1.b12_d' in f2 will output 'IO 1.b12_d'.
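
One way to read the rule above is as a left-to-right, chunk-by-chunk comparison that stops at the first difference. Here is a minimal sketch under that reading. The names `vercmp` and `chunk` are hypothetical (they do not come from this thread), and it assumes that runs of digits compare by numeric value while every other character compares as a plain string.

```shell
# vercmp A B: print "<", ">" or "=" (hypothetical helper, assumptions as stated above)
vercmp() {
  awk -v a="$1" -v b="$2" '
  # Return the leading chunk of s: a whole run of digits, or one other character.
  function chunk(s) {
    if (match(s, /^[0-9]+/)) return substr(s, 1, RLENGTH)
    return substr(s, 1, 1)
  }
  BEGIN {
    while (a != "" && b != "") {
      x = chunk(a); y = chunk(b)
      a = substr(a, length(x) + 1)
      b = substr(b, length(y) + 1)
      if (x ~ /^[0-9]/ && y ~ /^[0-9]/) {
        # Both chunks are digit runs: compare by numeric value, so 5 < 12.
        if (x + 0 != y + 0) { r = (x + 0 < y + 0) ? "<" : ">"; print r; exit }
      } else if (x != y) {
        # Otherwise compare the chunks as plain strings.
        r = (x < y) ? "<" : ">"; print r; exit
      }
    }
    # No difference found in the common part: the shorter string sorts first.
    if (a == b) r = "="; else r = (a == "") ? "<" : ">"
    print r
  }'
}
```

For example, `vercmp 1.b5_a 1.b12_d` prints `<` and `vercmp 18b-16 8c-7` prints `>`, matching the examples in item 2.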

4 cat f1:
Code:
 PK      0.1-37  
 Art     6.1.4-b.0  
 Fle     9.13.2     
 Uni     18b-16   
 STD     D15   
 IO      1.b5_a  
 FPG     4c5.8 
 SRA     d18g 
 .... 
 ....

cat f2:
Code:
 Uni     8c-7 
 IO      1.b12_d 
 Art     6.1.5-c.2 
 PK      0.2-53 
 Fle     11.5.6 
 SRA     d18j 
 STD     F4 
 FPG     4c12.8 
 .... 
 ....

desired file f3:
Code:
 Art     6.1.5-c.2 
 Fle     11.5.6 
 IO      1.b12_d 
 PK      0.2-53 
 STD     F4 
 Uni     18b-16   
 FPG     4c12.8 
 SRA     d18j 
 ... 
 ...

# 2  
Old 04-26-2018
Code:
awk '
# Build a fixed-width key for a version-like string so that a plain
# string comparison orders digit runs by value (e.g. 5 before 12).
function key(s,    k, c, i, j, d, e, t, u, a1, a2, a3) {
   k = ""
   c = split(s, a1, /[^0-9A-Za-z]/)       # tokens between separators such as . _ -
   for (i = 1; i <= c; i++) {
      t = u = a1[i]
      gsub(/[0-9]/, " ", t)               # t keeps only the letter runs
      gsub(/[A-Za-z]/, " ", u)            # u keeps only the digit runs
      d = split(t, a2); e = split(u, a3)
      if (a1[i] ~ /^[0-9]/) {             # token starts with a digit: digit runs first
         for (j = 1; j <= e; j++) k = k sprintf("%5s", a3[j])
         for (j = 1; j <= d; j++) k = k sprintf("%5s", a2[j])
      } else {                            # token starts with a letter: letter runs first
         for (j = 1; j <= d; j++) k = k sprintf("%5s", a2[j])
         for (j = 1; j <= e; j++) k = k sprintf("%5s", a3[j])
      }
   }
   return k
}
NR==FNR {a[$1]=$2; next}                  # first file (f1): remember column 2 per key
{
   if (length(a[$1])) {                   # key is also in f1: print the larger value
      if (key($2) > key(a[$1])) {print $1, $2} else {print $1, a[$1]}
   } else {
      print $1, $2                        # key is only in f2
   }
   delete a[$1]
}
END {
   for (i in a) if (length(a[i])) print i, a[i]   # keys that are only in f1
}
' OFS="\t" f1 f2
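
The script above keeps only the last f1 value per key and does not treat requirement 1 (dropping duplicate rows) as a separate step. If that step is wanted as a pre-pass, a common awk idiom is sketched below. The helper name `dedupe` is hypothetical, and the sketch assumes that "duplicate" means the same fields after collapsing runs of whitespace, since the sample lines carry leading and trailing blanks.

```shell
# dedupe [FILE...]: print each distinct line once, in first-seen order
# (hypothetical helper; whitespace handling is an assumption, see above)
dedupe() {
  # $1 = $1 rebuilds the record with single spaces between fields,
  # then the pattern is true only the first time a rebuilt line is seen.
  awk '{ $1 = $1 } !seen[$0]++' "$@"
}
```

Usage could look like `dedupe f1 > f1.uniq; dedupe f2 > f2.uniq`, with the two cleaned files then passed to the script above in place of f1 and f2.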
