Remove repeated letter words


 
# 1  
Old 11-14-2017

Hi,
I have a text file containing the words below, and I need help removing the words with repeated letters from it.

Code:
      1 ama
      5 bib
     29 bob
      2 bub
      5 civic
      2 dad
     10 deed
      1 denned
    335 did
      1 eeee
      1 eeeee
      2 eke
      8 ere
      4 eve
    116 eye
      1 gig
      2 hah
      1 huuh
      3 III
      1 kraark
     12 level
      1 lil
      6 maam
      2 madam
      1 mem
      1 minim
     13 non
      8 noon
     11 nun
      1 pap
      5 peep
      1 pip
      2 poop
      9 pop
      2 pup
      1 rever
      1 sas
     10 sees
      1 ses
      1 solos
      1 tattarrattat
      1 tot
      1 tut
      1 txt
      2 wow

I want to remove words like 'eeee' and 'III'.

I know there is a way of using sed to remove specific lines, I think it's along the lines of 's/'eeee/', but I want a way to remove any repeated letters using one command.

Any help would be great. Thank you.
# 2  
Old 11-14-2017
What operating system and shell are you using?

Is this a homework assignment? We have seen a request similar to this recently, except that request was looking for words that only contained a single character instead of a repeated letter.

Please define more clearly exactly what you are trying to do. Every word in your sample text file has at least one letter that appears more than once. Are you saying that you want to remove every line from your text file?
# 3  
Old 11-15-2017
Does ama also count as a word with repeating letters? After all, the 'a' occurs twice, but not in succession.


# 4  
Old 11-15-2017
Welcome crepe6,

Please always include the output from uname -a wrapped in CODE tags so we know which OS and version you are using.

Please confirm the context of this issue, i.e. is it homework/course assignment so we know how to answer.


If you are looking for 3 or more of the same character in succession (given that double-letters are valid in many words) would you be okay with some Perl?

An expression something like /(.)\1\1/ might help I think, i.e. match any character followed by the same twice. You might want to refine that to letters only to avoid matching on white-space or numbers if they are in your file too.
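As a rough sketch of that idea without Perl, the same back-reference pattern can be given to grep. This assumes "repeated" means the same letter three or more times in succession, restricts the match to letters as suggested above, and uses a made-up function name (drop_triple_letters) purely for illustration:

```shell
# Hypothetical helper: drop lines that contain any letter repeated
# three or more times in a row (assumed reading of "repeated letters").
drop_triple_letters() {
    # \([[:alpha:]]\) captures one letter; \1\1 demands two more copies of it.
    # -v inverts the match, so only lines WITHOUT such a run are printed.
    grep -v '\([[:alpha:]]\)\1\1' "$@"
}

# Usage (file name "infile" is an assumption):
#   drop_triple_letters infile
```

With the sample list this would drop eeee, eeeee and III, but keep words such as huuh or noon, which only have a doubled letter.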


Of course, this doesn't match kraark, tattarrattat, wibblewobblesnoozydunno or even Hh_ee_ll_ll_oo_WW_oo_rr_ll_dd if they are in (or added to) the list, so should they be in or out?

Is there a dictionary list you can match against if you want real words only?


You need to be clearer about your criteria. Please first tell us whether this is homework and, if it is not, give us the context, so we can help you in the most suitable way.



Kind regards,
Robin
# 5  
Old 11-16-2017
Code:
# Print each line unless its word (field 2) consists only of its first letter repeated.
awk '{l=$0; if (length($2)!=gsub(substr($2,1,1), "", $2)) print l}' infile
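The one-liner above counts how many times the first letter of the word occurs (gsub returns the number of substitutions) and skips the line when that count equals the word length. A more spelled-out variant of the same idea might look like the sketch below. It compares characters literally instead of using the first character as a regular expression, and it assumes a one-character word does not count as "repeated", which differs from the one-liner. The function name drop_single_letter_words is hypothetical:

```shell
# Hypothetical helper: drop lines whose word (field 2) is one letter
# repeated two or more times, e.g. "eeee" or "III".
drop_single_letter_words() {
    awk '{
        w = $2
        first = substr(w, 1, 1)
        same = 1
        # Compare every later character literally against the first one.
        for (i = 2; i <= length(w); i++)
            if (substr(w, i, 1) != first) { same = 0; break }
        # Assumption: a one-character word is kept, not treated as repeated.
        if (!(same && length(w) > 1)) print
    }' "$@"
}

# Usage (file name "infile" is an assumption):
#   drop_single_letter_words infile
```

Since no field is modified, each kept line is printed exactly as it was read, including the leading spaces before the count.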
