Finding consecutive same words in a file Post: 302540277


Post 302540277 by shoaibjameel123 on Wednesday 20th of July 2011 08:34:40 AM


Hi All,

I am having trouble formulating a solution to the following problem.
I have a file that looks like this (this is only a sample file; the actual words can be different):

Code:

network
router
frame
network
router
computer
card
host
computer
card

In this file, "network" and "router" occur together twice, and so do "computer" and "card". Taking two consecutive words at a time, I want to count how often each such pair occurs in the file. I expect my output to look like this:

Code:

The output above means that "network" followed by "router" has a count of 1 at its first occurrence. Next, "router" and "frame" occur together 1 time, and then "frame" and "network" occur together 1 time. When "network" and "router" appear together again, their count becomes 2. This continues for the rest of the file.

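The running count described above can be sketched with awk, which Bash can call directly. This is only a sketch under stated assumptions: the input has one word per line, the helper name count_pairs and the file name sample.dat are hypothetical, and the output layout (first word, second word, running count) is chosen for illustration because the post does not fix an exact format.

```shell
#!/bin/bash
# Sketch: print each pair of adjacent lines with a running count of how
# often that pair has been seen so far. Assumes one word per line.
# count_pairs is a hypothetical helper name, not from the original post.
count_pairs() {
    awk '
        NR > 1 {                       # a pair needs a previous word
            pair = prev " " $0         # current pair, e.g. "network router"
            seen[pair]++               # bump the running count for this pair
            print pair, seen[pair]
        }
        { prev = $0 }                  # remember this word for the next line
    ' "$1"
}

# Example run on the sample data from the post (file name is an assumption).
printf '%s\n' network router frame network router computer card host computer card > sample.dat
count_pairs sample.dat
```

On the sample data this prints "network router 1", "router frame 1", "frame network 1", then "network router 2", and the last line is "computer card 2". Note that blank lines in the input would be treated as words, so they may need to be filtered out first.
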
This is what I tried, but the problem is that I am supplying the words manually. Moreover, I have more than one file, all with the .dat extension. As the code shows, I am reading the .dat files and storing the results in .txt files:

Code:

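# Assumption: the two words are set by hand here, which is the limitation described above.
network="network"
router="router"
for page in *.dat
do
    # grep works line by line, so this only matches when both words are on the same line.
    grep "${network}[[:blank:]]*${router}" "$page" > "$page.txt"
done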

I am using Bash on Linux.
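
For several .dat files at once, one possible approach is to let a single awk call reset its state at the start of each input file and write to a matching .txt file. This is a minimal sketch, not a definitive solution: it assumes one word per line, uses a hypothetical function name count_pairs_per_file, and picks the pair-plus-running-count layout purely for illustration. The example file names a.dat and b.dat are also assumptions.

```shell
#!/bin/bash
# Sketch: for every .dat file given, write running pair counts to <file>.txt.
# count_pairs_per_file is a hypothetical name; one word per line is assumed.
count_pairs_per_file() {
    awk '
        FNR == 1 {                     # first line of a new input file
            if (out != "") close(out)  # finish the previous output file
            out = FILENAME ".txt"
            printf "" > out            # create or truncate the output file
            split("", seen)            # forget counts from the previous file
            prev = $0
            next
        }
        {
            pair = prev " " $0         # current pair of adjacent words
            print pair, ++seen[pair] > out
            prev = $0
        }
    ' "$@"
}

# Example run in a scratch directory (file names are assumptions).
cd "$(mktemp -d)" || exit 1
printf '%s\n' network router frame network router > a.dat
printf '%s\n' computer card host computer card > b.dat
count_pairs_per_file ./*.dat
cat ./a.dat.txt
```

Each output file is named after its input with .txt appended (a.dat becomes a.dat.txt), mirroring the naming in the attempt above. Counts are kept per file rather than across files; dropping the split call would make them cumulative instead.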

shoaibjameel123

