I need your help with this. I have a text file, and for each line I need to count how often each word with a frequency above 40 occurs, then write the counts to another file with columns like this:
word1,word2,word3,...,wordn
0,0,1
1,2,0
3,2,0
etc. Each row holds the word counts for one line of the original text file;
the numbers are the frequencies of word1 through wordn in that line.
This awk command does the first part (it collects the list of words to count).
This one does the searching and counting.
How do I put them together? In awk? Sorry, I am a complete newbie.
It worked on my PC too; perhaps the OP should use nawk instead of awk.
A couple of things to note: yinyuemi's code does a search and replace, so if one word is a substring of another (e.g. "the" and "thesis"), the counts start going wrong.
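To illustrate the substring problem, here is a small sketch. It assumes the counting is done with gsub, which may not be exactly what yinyuemi's code does; the sample line and the variable names are made up for the example.

```shell
line="the thesis is the theme"

# Substring matching: gsub returns the number of replacements made,
# so "the" is also found inside "thesis" and "theme".
substr_count=$(printf '%s\n' "$line" | awk '{ print gsub(/the/, "&") }')

# Whole-word matching: compare each whitespace-separated field exactly.
word_count=$(printf '%s\n' "$line" |
    awk '{ n = 0; for (i = 1; i <= NF; i++) if ($i == "the") n++; print n }')

echo "substring: $substr_count, whole word: $word_count"
```

On this sample line the substring count is 4 while the whole-word count is 2, which is the kind of overcounting described above.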
This update fixes the issue for me (change >=1 to >=40 when you're ready to limit the output to words with 40 or more total occurrences):
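The code from the posts in this thread is not shown here, so what follows is only a minimal sketch of one way to combine both steps in a single awk script, not necessarily what was posted. It reads the file twice: the first pass totals every word, the second pass prints one row of counts per line. The function name wordcount_matrix and the min variable are made up for this example. It assumes words are whitespace-separated fields, compared exactly (case-sensitive, punctuation left attached), which avoids the "the"/"thesis" problem.

```shell
# wordcount_matrix FILE [MIN]
# Hypothetical helper: prints a CSV header of the words whose total count
# in FILE is >= MIN (default 40), then one row of per-line counts.
wordcount_matrix() {
    awk -v min="${2:-40}" '
    # Pass 1: total each word and remember the order of first appearance.
    NR == FNR {
        for (i = 1; i <= NF; i++) {
            if (!($i in total)) order[++n] = $i
            total[$i]++
        }
        next
    }
    # Start of pass 2: pick the columns and print the header once.
    FNR == 1 {
        for (k = 1; k <= n; k++)
            if (total[order[k]] >= min) col[++ncol] = order[k]
        for (c = 1; c <= ncol; c++)
            printf "%s%s", col[c], (c < ncol ? "," : "\n")
    }
    # Pass 2: count the words on this line and print one row.
    {
        split("", cnt)    # portable way to empty the array
        for (i = 1; i <= NF; i++) cnt[$i]++
        for (c = 1; c <= ncol; c++)
            printf "%d%s", cnt[col[c]], (c < ncol ? "," : "\n")
    }
    ' "$1" "$1"
}
```

Usage would be something like `wordcount_matrix input.txt 40 > counts.csv`; pass 1 as the second argument while testing on a small file. Columns come out in order of first appearance in the file.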
Last edited by Chubler_XL; 03-06-2011 at 09:16 PM..