Post 302540277 by shoaibjameel123 on Wednesday 20th of July 2011 08:34:40 AM
Finding consecutive same words in a file

Hi All,

I am having trouble formulating a solution to the following problem.
I have a file that looks like this (it is only a sample; the actual words can differ):

Code:
network
router
frame
network
router
computer
card
host
computer
card

In this file, "network" is immediately followed by "router" twice, and "computer" is immediately followed by "card" twice. Taking two adjacent words at a time, I want to count how many times each pair has occurred so far in the file. I expect my output to look like this:
Code:
1
1
1
2
1
1
1
1
2

The output above means that the pair "network", "router" has a count of 1 at its first occurrence. Then "router", "frame" occurs 1 time, and then "frame", "network" occurs 1 time. When "network", "router" is encountered again, its count becomes 2. This continues for the rest of the file.
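
The counting rule described above can be sketched with awk, which is normally available alongside BASH on Linux. This is only a minimal sketch, not a definitive solution: it assumes one word per line with no blank lines, as in the sample, and the function name `count_pairs` is made up for illustration.

```shell
#!/bin/bash
# count_pairs is a hypothetical helper name, not part of the original post.
# Assumes one word per line, as in the sample file.
count_pairs() {
    awk '
        # From the second line on, pair the previous word with the current
        # one, increment the count for that pair, and print the running total.
        NR > 1 { n = ++seen[prev SUBSEP $0]; print n }
        # Remember the current word for the next line.
        { prev = $0 }
    ' "$1"
}
```

Calling `count_pairs sample.dat` on a file holding the ten sample words should print one running count per adjacent pair, so nine lines for ten words.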

This is what I tried, but the problem is that I am supplying the words manually. Moreover, I have more than one file, all with the .dat extension. As the code shows, I read the .dat files and store the results in .txt files:

Code:
# The two words are supplied by hand.
network=network
router=router
for page in *.dat
do
    grep "${network}[[:blank:]]*${router}" "$page" > "$page.txt"
done


I am using BASH in Linux.
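
For the several .dat files mentioned above, the same pair counting could be run once per file, with each result written next to its input as in the attempt shown. This is a rough sketch, assuming one word per line in every file; the function name `count_pairs_all` and its directory argument are made up for illustration.

```shell
#!/bin/bash
# count_pairs_all is a hypothetical name; it takes a directory (default: current).
count_pairs_all() {
    local dir=${1:-.} page
    for page in "$dir"/*.dat
    do
        # Skip the unexpanded pattern when no .dat file exists.
        [ -e "$page" ] || continue
        # Running count of each (previous word, current word) pair,
        # written to <name>.dat.txt; counts start again for every file.
        awk 'NR > 1 { n = ++seen[prev SUBSEP $0]; print n } { prev = $0 }' \
            "$page" > "$page.txt"
    done
}
```

Running `count_pairs_all` with no argument processes the .dat files in the current directory. Using a glob in the `for` loop instead of parsing `ls` output keeps file names with spaces intact.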
 
