Shell Programming and Scripting


If the 1st column of file f1 and file f2 is the same, then export the line with the maximum string of



    #1  
04-26-2018
weichanghe2000

Please help me write an awk command-line program that does the following. Thanks in advance.

Request description:
Compare two files, f1 and f2, and write the result to file f3:
1 Delete duplicate rows in file f1 and file f2.
2 If the 1st column of file f1 and file f2 is the same, export the line whose 2nd column holds the larger string.
For example:
Code:
  
 0.1-37    < 0.2-53;   
 6.1.4-b.0 < 6.1.5-c.2;   
 9.13.2    < 11.5.6;    
 18b-16    > 8c-7;   
 D15       < F4;   
 1.b5_a    < 1.b12_d   
 4c5.8     < 4c12.8   
 d18g      < d18j

3 Rules for comparing the 2nd column of the two files:
> The number of consecutive digits (0-9) may differ, such as 9.13.2 vs 11.5.6, or D15 vs F4.
> The type, order, and number of all other characters (such as '.', '_', '-', 'A-Z', 'a-z') are the same,
as in 6.1.4-b.0 vs 6.1.5-c.2, 1.b5_a vs 1.b12_d, D15 vs F4 ....
> Once the comparison finds the first place where one string is larger, stop comparing the 2nd column and output that line from its file,
so 'IO 1.b5_a' in f1 and 'IO 1.b12_d' in f2 will output 'IO 1.b12_d'.

4 cat f1:
Code:
 PK      0.1-37  
 Art     6.1.4-b.0  
 Fle     9.13.2     
 Uni     18b-16   
 STD     D15   
 IO      1.b5_a  
 FPG     4c5.8 
 SRA     d18g 
 .... 
 ....

cat f2:
Code:
 Uni     8c-7 
 IO      1.b12_d 
 Art     6.1.5-c.2 
 PK      0.2-53 
 Fle     11.5.6 
 SRA     d18j 
 STD     F4 
 FPG     4c12.8 
 .... 
 ....

desired file f3:
Code:
 Art     6.1.5-c.2 
 Fle     11.5.6 
 IO      1.b12_d 
 PK      0.2-53 
 STD     F4 
 Uni     18b-16   
 FPG     4c12.8 
 SRA     d18j 
 ... 
 ...
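
One way to read rules 1 to 3 above is a token-wise comparison: cut each 2nd-column string into runs of digits and runs of other characters, compare digit runs as numbers and everything else as text, and stop at the first difference. The sketch below is only an illustration under that reading. The names `merge_max`, `tok`, `vercmp` and `best` are made up here, and the output is sorted by name simply to make it repeatable, which the request does not ask for.

```shell
# merge_max FILE1 FILE2: print one "name<TAB>version" line per name,
# keeping the largest version seen in either file.
merge_max() {
    LC_ALL=C awk '
    # Leading token of s: a run of digits, or else a run of non-digits.
    function tok(s) {
        if (match(s, /^[0-9]+/)) return substr(s, 1, RLENGTH)
        match(s, /^[^0-9]+/)
        return substr(s, 1, RLENGTH)
    }
    # Compare token by token; return -1, 0 or 1 at the first difference.
    function vercmp(a, b,    x, y) {
        while (a != "" && b != "") {
            x = tok(a); a = substr(a, length(x) + 1)
            y = tok(b); b = substr(b, length(y) + 1)
            if (x ~ /^[0-9]/ && y ~ /^[0-9]/) {
                # Two digit runs: compare as numbers, so 12 beats 5.
                if (x + 0 != y + 0) return (x + 0 < y + 0) ? -1 : 1
            } else if (x != y) {
                # Anything else: compare as plain text.
                return (x < y) ? -1 : 1
            }
        }
        if (a == b) return 0          # both strings used up
        return (a == "") ? -1 : 1     # the longer string wins
    }
    # Lines with fewer than two fields (such as "....") are skipped.
    # Storing one value per name also drops duplicate rows.
    NF >= 2 {
        if (!($1 in best) || vercmp($2, best[$1]) > 0) best[$1] = $2
    }
    END { for (k in best) print k "\t" best[k] }
    ' "$1" "$2" | LC_ALL=C sort
}
```

With the sample files from this post it would be called as `merge_max f1 f2 > f3`. `LC_ALL=C` is set so that text comparison and sorting do not depend on the locale.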

    #2  
04-26-2018
rdrtx1
Code:
awk '
# Build a comparison key from a version string: split it on punctuation,
# then right-justify every run of digits or letters to width 5, so that a
# plain string comparison ranks 12 above 5.
function key(s,    parts, a2, a3, c, d, e, i, j, t, u, k) {
   k = "";
   c = split(s, parts, "[^0-9A-Za-z]");
   for (i = 1; i <= c; i++) {
      t = u = parts[i];
      gsub("[0-9]", " ", t); gsub("[A-Za-z]", " ", u);
      d = split(t, a2); e = split(u, a3);   # a2: letter runs, a3: digit runs
      if (parts[i] ~ /^[0-9]/) {
         for (j = 1; j <= e; j++) k = k sprintf("%5s", a3[j]);
         for (j = 1; j <= d; j++) k = k sprintf("%5s", a2[j]);
      } else {
         for (j = 1; j <= d; j++) k = k sprintf("%5s", a2[j]);
         for (j = 1; j <= e; j++) k = k sprintf("%5s", a3[j]);
      }
   }
   return k;
}
# First file (f1): remember the 2nd column, indexed by the 1st.
NR == FNR {a[$1] = $2; next}
# Second file (f2): print whichever of the two versions has the larger key.
{
   if (length(a[$1])) {
      if (key($2) > key(a[$1])) {print $1, $2} else {print $1, a[$1]}
   } else {
      print $1, $2;
   }
   delete a[$1];
}
# Names that occur only in f1.
END {
   for (i in a) if (length(a[i])) print i, a[i];
}
' OFS="\t" f1 f2
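
The script above never compares numbers directly. It pads every run of digits or letters to five characters with `sprintf("%5s", ...)` and then relies on ordinary string comparison. A minimal sketch of why that works, taken out of context (the function name `pad_demo` is made up for this illustration, and `LC_ALL=C` is an added assumption so that a space reliably sorts before a digit):

```shell
pad_demo() {
    LC_ALL=C awk 'BEGIN {
        x = "8"; y = "18"
        # As plain strings, "8" sorts after "18" because "8" > "1".
        rel = (x > y) ? ">" : "<"
        print "unpadded: " x " " rel " " y
        # Right-justified to width 5, the shorter number starts with an
        # extra space, and a space sorts before any digit.
        px = sprintf("%5s", x); py = sprintf("%5s", y)
        rel = (px > py) ? ">" : "<"
        print "padded:   " x " " rel " " y
    }'
}
pad_demo
# prints:
# unpadded: 8 > 18
# padded:   8 < 18
```

The trick holds only while every run fits in five characters; a longer run would not be padded and the string comparison could then give the wrong order.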
