Determining Word Frequency of Specific Terms


 
# 8  
Old 03-06-2009
Hi, can we take out this line:

Total number of SOA records = 30

I only need counts for the following record types in each db.x:

PTR
MX
NS
CNAME
A

I guess the code must be smart enough to handle tabs/spaces?


A copy of a db.x looks like this:

;
; THIS FILE IS AUTOMATICALLY GENERATED. DO NOT EDIT IT.
; THIS FILE IS AUTOMATICALLY GENERATED. DO NOT EDIT IT.
; THIS FILE IS AUTOMATICALLY GENERATED. DO NOT EDIT IT.
; THIS FILE IS AUTOMATICALLY GENERATED. DO NOT EDIT IT.
;
; generated from: $Id: master.txt,v 2.1230 2009/01/05 22:29:21 root Exp $
;

$TTL 3600

beerprime.com. IN SOA iqedns1.internet.com. hostmaster.beer.com. (
2009010501 ; Serial
900 ; Refresh
300 ; Retry
1209600 ; Expire
3600 ) ; Minimum
beerprime.com. IN NS iqdns1.internet.com.

integ4 IN A 192.168.205.156
beerprime.com. IN A 192.168.205.175
www IN CNAME intg4.beerprime.com.
86.96.168.192.in-addr.arpa. IN PTR sepapp.beerprime.com


;
; END OF beerprime.com
;

Thanks
# 9  
Old 03-06-2009
It's smart enough :)
Try this and let me know if the output is OK:

Code:
awk 'END {
  print f ":"
    for (Z in z)
      printf "Total number of %s records = %d\n", \
      Z, z[Z]
    print RS
    }
FNR == 1 {
  if (f) {
    print f ":"
    for (Z in z)
      printf "Total number of %s records = %d\n", \
      Z, z[Z]
    print RS
    }
    f = FILENAME
  }    
$3 ~ /^(PTR|MX|NS|CNAME|A)$/ { z[$3]++ }' db*
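
A quick sanity check of the tabs/spaces question (not part of the original reply; the two input lines are made up for illustration). awk's default field splitting treats any run of blanks and tabs as a single separator, so `$3` should be the record type either way:

```shell
# Hypothetical input: one record separated by spaces, one by tabs.
out=$(printf 'integ4 IN A 192.168.205.156\nwww\t\tIN\tCNAME\tintg4.beerprime.com.\n' |
  awk '$3 ~ /^(PTR|MX|NS|CNAME|A)$/ { print $3 }')

# Show which record types were recognized.
printf '%s\n' "$out"
```

Both lines should be recognized. Note the assumption baked into `$3`: the record type has to be the third field. A line that starts with whitespace (owner name omitted) or carries an explicit TTL before IN would put the type in a different field and would not be counted.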


Last edited by radoulov; 03-06-2009 at 08:47 AM.. Reason: corrected $2 -> $3
# 10  
Old 03-06-2009
Hi!

OK, it looks like we've got a count issue; see db.beerstearns.com:

beerstearns.com. IN SOA iqedns1.internet.com. hostmaster.beer.com. (
2009010501 ; Serial
900 ; Refresh
300 ; Retry
1209600 ; Expire
3600 ) ; Minimum
bearstearns.com. IN NS iqedns1.internet.com.

fbhp IN A 192.168.205.124
futures IN A 192.168.205.165
bigdog IN A 192.168.205.195
bigdog2 IN A 192.168.205.196

; SPECIALS
;
situnifiedportal.bearstearns.com. IN NS whdgss1cnis-pri1.clearco.com.
situnifiedportal.bearstearns.com. IN NS metgss1cnis-sec1.clearco.com.
qa.bearstearns.com. IN NS whdgss1cnis-pri1.clearco.com.
qa.bearstearns.com. IN NS metgss1cnis-sec1.clearco.com.


The output came out as:

db.bearstearns.com:

Total number of CNAME records = 1
Total number of A records = 6
Total number of NS records = 26
Total number of PTR records = 166

There are only 4 A records, and I don't see any CNAME.

I also need a count if it detects the word "Special".
So maybe:
Total number of Special records = 4
Sorry, I only just noticed that.
# 11  
Old 03-06-2009
You're right, I have to empty the array at the beginning of every file. Try this one:

Code:
awk 'END {
  print f ":"
    for (Z in z)
      printf "Total number of %s records = %d\n", \
      Z, z[Z]
    if (sc) printf "Total number of Special records = %d\n", \
    sc
    print RS
    }
FNR == 1 {
  if (f) {
    print f ":"
    for (Z in z)
      printf "Total number of %s records = %d\n", \
      Z, z[Z]
    if (sc) printf "Total number of Special records = %d\n", \
    sc    
    print RS
    split(x, z)
    s = sc = 0
    }
    f = FILENAME
  }    
$3 ~ /^(PTR|MX|NS|CNAME|A)$/ { z[$3]++; s && sc++ }
/SPECIALS/ { s = 1 }' db*
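
For anyone wondering how the array gets emptied: `split(x, z)` relies on `x` never being assigned, so an empty string is split and `z` ends up with no elements. A minimal sketch of the same idea, using two throwaway files with made-up data and `split("", z)` spelled out explicitly:

```shell
# Two hypothetical files: two A records in the first, one in the second.
tmp=$(mktemp -d)
printf 'a IN A 10.0.0.1\nb IN A 10.0.0.2\n' > "$tmp/db.one"
printf 'c IN A 10.0.0.3\n' > "$tmp/db.two"

out=$(awk '
FNR == 1 && NR > 1 {     # first line of every file after the first
  print "A=" z["A"]      # report the previous file
  split("", z)           # empty the array before counting the new file
}
$3 == "A" { z[$3]++ }
END { print "A=" z["A"] }' "$tmp/db.one" "$tmp/db.two")

printf '%s\n' "$out"
rm -rf "$tmp"
```

With the reset in place, the second file reports only its own record. Without the `split` line the second count would also include the records from the first file, which is the kind of inflated total reported above.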

Do you want the special records included in the totals, or do you want a separate count for them?
For db.beerstearns.com, do you want this:

Code:
db.beerstearns.com:
Total number of A records = 4
Total number of NS records = 5
Total number of Special records = 4

Or this:

Code:
db.beerstearns.com:
Total number of A records = 4
Total number of NS records = 1
Total number of Special records = 4


Last edited by radoulov; 03-06-2009 at 09:55 AM.. Reason: corrected again: reset s at the beginning of every file
# 12  
Old 03-06-2009
Hi, I would like to have it like the second option:

Code:
db.beerstearns.com:
Total number of A records = 4
Total number of NS records = 1
Total number of Special records = 4
# 13  
Old 03-06-2009
Is this OK?
Do you want the IN strings (I don't know the exact word) for the special records too, or is the count sufficient?
Code:
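awk 'END {
  # Report the last file: no FNR == 1 follows it.
  print f ":"
    for (Z in z)
      printf "Total number of %s records = %d\n", \
      Z, z[Z]
    if (sc) printf "Total number of Special records = %d\n", \
    sc
    print RS
    }
FNR == 1 {
  # First line of a new file: report the previous file, then reset.
  if (f) {
    print f ":"
    for (Z in z)
      printf "Total number of %s records = %d\n", \
      Z, z[Z]
    if (sc) printf "Total number of Special records = %d\n", \
    sc    
    print RS
    # x is never assigned, so this empties z
    split(x, z)
    s = sc = 0
    }
    f = FILENAME
  }    
$3 ~ /^(PTR|MX|NS|CNAME|A)$/ {  
  # After the SPECIALS marker, count records separately.
  if (s) sc++ 
  else z[$3]++
  }
/SPECIALS/ { s = 1 }' db*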
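
The report code appears twice, once in END and once at FNR == 1. One possible way to avoid the duplication is a user-defined function. The sketch below is a rearrangement for illustration, not radoulov's code: the function name `report` and the sample file `db.example` are made up, and it prints the Special line for every file, including the last one.

```shell
# Hypothetical sample file modeled on the zone data in this thread.
tmp=$(mktemp -d)
cat > "$tmp/db.example" <<'EOF'
example.com. IN NS ns1.example.net.
host1 IN A 192.0.2.1
host2 IN A 192.0.2.2
; SPECIALS
qa.example.com. IN NS ns2.example.net.
qa.example.com. IN NS ns3.example.net.
EOF

out=$(awk '
# report() is a made-up helper: print the counts for one file, then reset.
function report(name,    t) {
  print name ":"
  for (t in z)
    printf "Total number of %s records = %d\n", t, z[t]
  if (sc) printf "Total number of Special records = %d\n", sc
  split("", z)
  s = sc = 0
}
FNR == 1 { if (f) report(f); f = FILENAME }
/SPECIALS/ { s = 1; next }
$3 ~ /^(PTR|MX|NS|CNAME|A)$/ {
  if (s) sc++
  else z[$3]++
}
END { if (f) report(f) }' "$tmp"/db.*)

printf '%s\n' "$out"
rm -rf "$tmp"
```

The order of the per-type lines depends on how awk walks the array, so it may differ between awk implementations.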

# 14  
Old 03-06-2009
Awesome!

Thanks a whole bunch!