[awk] Compare two files


 
# 1  
Old 07-01-2015
[awk] Compare two files

Hi!
I am trying to compare two files using awk, but I have run into some problems. I need to count how many times each letter is used in two texts. This is my script:
Code:
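{
   long=length($0)

   for (i=1;i<=long;i++)
      {
      aux=substr($0,i,1)
      # Only the increment is guarded by this test (no braces follow it).
      if ( aux != " " && aux != "" )
          letter[tolower(aux)]++
      # This test runs for blanks too; merely referencing letter[" "] creates
      # an empty entry, which later shows up as a line with no count.
      if ( letter[tolower(aux)] > max )
          {
          max=letter[tolower(aux)]
          max_letter=aux    # keeps the original case of the character
          }
      }
}
END {
   print "Lettera maggiormente utilizzata:"max_letter" Occorrenze:"max
   # "in" visits the letters in an unspecified order
   for (item in letter)
       print "Lettera:"item" Occorrenze:"letter[item]
}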

It works well for one file, but I need the results presented like this:
file n.1 :
file n.2:
It's important to keep the results of the two files distinct.
I have thought of a script like this, but I have some problems with the syntax and I can't merge the two things.
Code:
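BEGIN {
    n = 0
}
# First line of each input file: start a new file index.
FNR==1 {
    idx = 0
    n += 1
}

# files[n][idx] is an array of arrays, which needs gawk 4.0 or later.
{ files[n][idx++]=$0 }

# Note: "in" visits the indices in an unspecified order.
END {
    for(i=1;i<=n; i++)
        for (line in files[i])
            printf "file n.%s: %s\n", i, files[i][line]
}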
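
The two scripts above can be merged without storing every line: the `FNR==1` test already marks the start of each file, so it can be used to print the counts of the previous file and reset them. The following is only a sketch under assumptions, not a tested answer from the thread: the `report` function, the sample files and their contents are made up for illustration, and `split("", letter)` is used to empty the array so that it does not depend on gawk.

```sh
#!/bin/sh
# Sample inputs (hypothetical): two tiny files in a temporary directory.
tmp=$(mktemp -d)
printf 'Abba\n' > "$tmp/file1"
printf 'cc d\n' > "$tmp/file2"

out=$(awk '
# Print the counts collected for the file just read, then reset the state.
function report(   item) {
    n++
    print "file n." n ":"
    print "Lettera maggiormente utilizzata:" max_letter " Occorrenze:" max
    for (item in letter)
        print "Lettera:" item " Occorrenze:" letter[item]
    split("", letter)      # portable way to empty an array
    max = 0; max_letter = ""
}
FNR == 1 && NR > 1 { report() }   # first line of a later file: flush the previous one
{
    len = length($0)
    for (i = 1; i <= len; i++) {
        aux = tolower(substr($0, i, 1))
        if (aux != " ") {          # braces keep blanks out of both steps
            letter[aux]++
            if (letter[aux] > max) { max = letter[aux]; max_letter = aux }
        }
    }
}
END { if (NR > 0) report() }
' "$tmp/file1" "$tmp/file2")

printf '%s\n' "$out"
rm -rf "$tmp"
```

The letters inside each block come out in an unspecified order, and an empty input file produces no block at all because it never yields a line with `FNR==1`.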

# 2  
Old 07-01-2015
How about
Code:
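awk '
        {for (i=1; i<=length; i++)
                {aux=tolower(substr($0,i,1))
                 # only the increment below is guarded by this test (no braces)
                 if ( aux != " " && aux != "" )
                        letter[FILENAME,aux]++
                 # this test also runs for blanks; referencing letter[FILENAME," "]
                 # creates an empty entry, printed later as a line without a count
                 if ( letter[FILENAME,aux] > max[FILENAME] )
                        {max[FILENAME]=letter[FILENAME,aux]
                         max_letter[FILENAME]=aux
                        }
                }
        }
END     {for (F in max) {print "Lettera maggiormente utilizzata in " F ": " max_letter[F] ", Occorrenze:" max[F]
                         # item ~ F is a regex match on the combined key, so a name such as
                         # file1 would also select the entries of file10
                         for (item in letter)  if (item ~ F)
                                {split (item, T, SUBSEP)
                                 print "Lettera:" T[2] ", Occorrenze: " letter[item], F | "sort"
                                }
                         close ("sort")
                        }
        }
' file[12]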
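
The script above selects each file's entries with the regex match `item ~ F` and leaves the second `if` outside the blank test. As a sketch only (the file names `file1` and `file10` and their contents are invented for the demonstration, and this is a variant, not the version posted in the thread), the same idea can compare the file part of the key exactly and skip blanks completely:

```sh
#!/bin/sh
tmp=$(mktemp -d)
printf 'aab\n'  > "$tmp/file1"
printf 'b cc\n' > "$tmp/file10"   # name deliberately contains "file1"

out=$(cd "$tmp" && awk '
{
    len = length($0)
    for (i = 1; i <= len; i++) {
        aux = tolower(substr($0, i, 1))
        if (aux != " ") {
            letter[FILENAME, aux]++
            if (letter[FILENAME, aux] > max[FILENAME]) {
                max[FILENAME] = letter[FILENAME, aux]
                max_letter[FILENAME] = aux
            }
        }
    }
}
END {
    for (F in max) {
        print "Lettera maggiormente utilizzata in " F ": " max_letter[F] ", Occorrenze:" max[F]
        fflush()                 # write the header before sort emits its lines
        for (item in letter) {
            split(item, T, SUBSEP)
            if (T[1] == F)       # exact comparison of the file part of the key
                print "Lettera:" T[2] ", Occorrenze: " letter[item], F | "sort"
        }
        close("sort")
    }
}
' file1 file10)

printf '%s\n' "$out"
rm -rf "$tmp"
```

With the regex test, the entries of `file10` would also be listed under `file1`, because the pattern `file1` matches inside the longer name. The order in which the files are reported is still unspecified, since it comes from `for (F in max)`.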

# 3  
Old 07-01-2015
Quote:
Originally Posted by RudiC
How about

Thanks, but I tried it and I get the results of the two files together: the counts have been summed up!
# 4  
Old 07-01-2015
Not for me:
Code:
Lettera maggiormente utilizzata in file1: a, Occorrenze:35
Lettera: 0, Occorrenze: 24 file1
Lettera: 1, Occorrenze: 26 file1
Lettera: 2, Occorrenze: 5 file1
Lettera: 3, Occorrenze: 6 file1
Lettera: a, Occorrenze: 35 file1
Lettera: b, Occorrenze: 12 file1
Lettera: c, Occorrenze: 11 file1
Lettera: d, Occorrenze: 8 file1
Lettera: e, Occorrenze: 23 file1
Lettera: f, Occorrenze: 4 file1
Lettera: g, Occorrenze: 12 file1
Lettera: h, Occorrenze: 8 file1
Lettera: i, Occorrenze: 20 file1
Lettera: j, Occorrenze: 6 file1
Lettera: k, Occorrenze: 2 file1
Lettera: l, Occorrenze: 17 file1
Lettera: m, Occorrenze: 11 file1
Lettera: n, Occorrenze: 17 file1
Lettera: ,, Occorrenze: 12 file1
Lettera: -, Occorrenze: 2 file1
Lettera: :, Occorrenze: 2 file1
Lettera: (, Occorrenze: 2 file1
Lettera: ), Occorrenze: 2 file1
Lettera: +, Occorrenze: 2 file1
Lettera: ;, Occorrenze: 4 file1
Lettera: ", Occorrenze: 4 file1
Lettera: =, Occorrenze: 6 file1
Lettera: #, Occorrenze: 8 file1
Lettera:  , Occorrenze:  file1
Lettera: o, Occorrenze: 20 file1
Lettera: p, Occorrenze: 26 file1
Lettera: q, Occorrenze: 8 file1
Lettera: r, Occorrenze: 25 file1
Lettera: s, Occorrenze: 20 file1
Lettera: t, Occorrenze: 17 file1
Lettera: u, Occorrenze: 17 file1
Lettera: v, Occorrenze: 2 file1
Lettera: w, Occorrenze: 6 file1
Lettera: y, Occorrenze: 16 file1
Lettera maggiormente utilizzata in file2: a, Occorrenze:12
Lettera: 0, Occorrenze: 8 file2
Lettera: 1, Occorrenze: 9 file2
Lettera: 2, Occorrenze: 2 file2
Lettera: 3, Occorrenze: 4 file2
Lettera: 7, Occorrenze: 1 file2
Lettera: a, Occorrenze: 12 file2
Lettera: b, Occorrenze: 3 file2
Lettera: d, Occorrenze: 1 file2
Lettera: e, Occorrenze: 2 file2
Lettera: g, Occorrenze: 8 file2
Lettera: i, Occorrenze: 2 file2
Lettera: j, Occorrenze: 3 file2
Lettera: l, Occorrenze: 1 file2
Lettera: m, Occorrenze: 3 file2
Lettera: n, Occorrenze: 2 file2
Lettera: #, Occorrenze: 3 file2
Lettera: o, Occorrenze: 2 file2
Lettera: p, Occorrenze: 11 file2
Lettera: q, Occorrenze: 2 file2
Lettera: r, Occorrenze: 2 file2
Lettera: s, Occorrenze: 3 file2
Lettera: w, Occorrenze: 2 file2

---------- Post updated at 18:21 ---------- Previous update was at 18:21 ----------

What OS and what version of awk do you use?
# 5  
Old 07-01-2015
Quote:
Originally Posted by RudiC
Not for me:
What OS and what version of awk do you use?
Your result is what I want! I use GAWK. Maybe I made a mistake when substituting the names of the files.
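
The thread never states what went wrong with the file names. One way to get summed counts, offered purely as an assumption about the cause, is to feed both files to awk as a single stream: counts keyed by `FILENAME` stay separate only when each file is passed as its own operand. The sketch below uses made-up files and counts just the letter "a" to keep the comparison short.

```sh
#!/bin/sh
tmp=$(mktemp -d)
printf 'aa\n' > "$tmp/file1"
printf 'ab\n' > "$tmp/file2"

# Count occurrences of "a" per FILENAME; gsub returns the number of matches.
count_a='{ n[FILENAME] += gsub(/a/, "a") } END { for (f in n) print "[" f "] a=" n[f] }'

# Each file is its own operand: FILENAME changes, so the counts stay apart.
separate=$(cd "$tmp" && awk "$count_a" file1 file2 | sort)

# One concatenated stream on standard input: a single key, so the counts add up.
merged=$(cd "$tmp" && cat file1 file2 | awk "$count_a")

printf 'separate operands:\n%s\n' "$separate"
printf 'concatenated input:\n%s\n' "$merged"
rm -rf "$tmp"
```

The first run reports one count per file, while the second reports a single combined count under whatever value `FILENAME` has for standard input.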
# 6  
Old 07-01-2015
Post the output for one file, the other file, and both files together. Use small files.
# 7  
Old 07-01-2015
Quote:
Originally Posted by RudiC
Post the output for one file, the other file, and both files together. Use small files.
file1:
Lettera maggiormente utilizzata: a, Occorrenze: 22
Lettera:', Occorrenze: 1
Lettera:-, Occorrenze: 1
Lettera: , Occorrenze:
Lettera:., Occorrenze: 3
Lettera:¿, Occorrenze: 1
Lettera:├, Occorrenze: 3
Lettera:á, Occorrenze: 2
Lettera:a, Occorrenze: 2
Lettera:c, Occorrenze: 5
Lettera:d, Occorrenze: 7
Lettera:e, Occorrenze: 1
Lettera:f, Occorrenze: 3
Lettera:i, Occorrenze: 1
Lettera:l, Occorrenze: 1
Lettera:m, Occorrenze: 2
Lettera:n, Occorrenze: 1
Lettera:o, Occorrenze: 1
Lettera:p, Occorrenze: 3
Lettera:r, Occorrenze: 1
Lettera:s, Occorrenze: 5
Lettera:t, Occorrenze: 1
Lettera:u, Occorrenze: 4, Occorrenze: 10
Lettera:v, Occorrenze: 1

file2:
Lettera maggiormente utilizzata:a, Occorrenze:47
Lettera:', Occorrenze: 1
Lettera: , Occorrenze:
Lettera:(, Occorrenze: 1
Lettera:), Occorrenze: 1
Lettera:,, Occorrenze: 4
Lettera:¨, Occorrenze: 2
Lettera:a, Occorrenze: 4
Lettera:b, Occorrenze: 2
Lettera:c, Occorrenze: 8
Lettera:d, Occorrenze: 1
Lettera:e, Occorrenze: 4
Lettera:f, Occorrenze: 6
Lettera:g, Occorrenze: 6
Lettera:h, Occorrenze: 1
Lettera:i, Occorrenze: 3
Lettera:l, Occorrenze: 2
Lettera:m, Occorrenze: 1
Lettera:n, Occorrenze: 3
Lettera:o, Occorrenze: 3
Lettera:p, Occorrenze: 1
Lettera:q, Occorrenze: 1
Lettera:r, Occorrenze: 3
Lettera:s, Occorrenze: 2
Lettera:t, Occorrenze: 2
Lettera:u, Occorrenze: 1
Lettera:v, Occorrenze: 4
Lettera:z, Occorrenze: 4

File together:
Lettera maggiormente utilizzata : a, Occorrenze:69
Lettera:', Occorrenze: 2
Lettera:-, Occorrenze: 1
Lettera: , Occorrenze:
Lettera:(, Occorrenze: 1
Lettera:), Occorrenze: 1
Lettera:,, Occorrenze: 4
Lettera:., Occorrenze: 3
Lettera:¨, Occorrenze: 2
Lettera:¿, Occorrenze: 1
Lettera:├, Occorrenze: 3
Lettera:á, Occorrenze: 2
Lettera:a, Occorrenze: 69
Lettera:b, Occorrenze: 2
Lettera:c, Occorrenze: 13
Lettera:d, Occorrenze: 23
Lettera:e, Occorrenze: 65
Lettera:f, Occorrenze: 9
Lettera:g, Occorrenze: 6
Lettera:h, Occorrenze: 1
Lettera:i, Occorrenze: 51
Lettera:l, Occorrenze: 37
Lettera:m, Occorrenze: 12
Lettera:n, Occorrenze: 46
Lettera:o, Occorrenze: 48
Lettera:p, Occorrenze: 15
Lettera:q, Occorrenze: 1
Lettera:r, Occorrenze: 47
Lettera:s, Occorrenze: 26
Lettera:t, Occorrenze: 39
Lettera:u, Occorrenze: 14
Lettera:v, Occorrenze: 5
Lettera:z, Occorrenze: 9

---------- Post updated at 12:07 PM ---------- Previous update was at 12:02 PM ----------

Quote:
Originally Posted by RudiC
Post the output for one file, the other file, and both files together. Make it small files.
Solved! Thank you!
