Sponsored Content
Top Forums UNIX for Dummies Questions & Answers Grep words with X doubles only Post 302825171 by sudon't on Sunday 23rd of June 2013 09:03:22 PM
Old 06-23-2013
Grep words with X doubles only

Hi!
I'm trying to figure out how to find words with X number of doubles, only. I'm searching a dictionary, (one word per line). For instance, if you want to find words containing only one pair of double letters, you could do something like this:
Code:
egrep '(.)\1' wordlist.txt |egrep -v '(.)\1.*(.)\2'

That'll get rid of words with two, or more, doubles. But when you want to search for two or three sets of doubles, it gets a bit unwieldy.
Code:
egrep '(.)\1.*(.)\2' wordlist.txt |egrep -v '(.)\1.*(.)\2.*(.)\3'

And so on...
It seems to me that there must be a way to specify the max number of doubles in a single regex, but I cannot figure out how. I've found a number of pages online that talk about finding doubles, but none of them mention how to limit them to only the desired amount. I thought maybe a negative backreference could do it but, either I'm writing it wrong, or it just doesn't work. I still get words with more than X doubles.
Code:
$ grep -P '(.)\1.*(?!\1)' twl |head
aa
aah
aahed
aahing
aahs
aal
aalii
aaliis
aals
aardvark

$ grep -P '^(.)\1.*(?!(.)\1)$' twl |head
aa
aah
aahed
aahing
aahs
aal
aalii
aaliis
aals
aardvark

And I tried a bunch of other stuff, but can't figure it out, so I'm turning to you all.
 

10 More Discussions You Might Find Interesting

1. Programming

long doubles

hey there, i've been trrying to calculate the first 10000 fibonacci numbers using a long double. weird thing is that from a certain value it returns Inf. i'm declaring the vars as long double var; and printing them to a file using: fprintf(filepointer, "%.0Ld\n", var); am i doing... (1 Reply)
Discussion started by: crashnburn
1 Replies

2. Shell Programming and Scripting

find words with grep....

I have a .txt file which contains several lines of text. I need to write a script program using grep or any other unix tool so as to detect part of the text (words) between / / that begin with the symbol ~. For example if somewhere in the text appears a webpage address like... (8 Replies)
Discussion started by: chrisxgr
8 Replies

3. UNIX for Dummies Questions & Answers

Grep Three Words

I have been trying to find files containing the words AAA, BBB and CCC. I tried: grep AAA `grep BBB files*` grep CCC files* but is does not work I tried several ways this is an easy one but I am a dummy, Does anyone can help me? Thanks :( (12 Replies)
Discussion started by: murbina
12 Replies

4. UNIX for Dummies Questions & Answers

search multiple words using grep

Hi frnds i want to desplay file names that should be word1 and word2 ex : i have 10 *.log files 5 files having word1 and word2 5 files having only word1, i have used below command egrep -l 'word1|word2' *.log its giving all 10 files, but i want to display only 5... (20 Replies)
Discussion started by: pb18798
20 Replies

5. UNIX Desktop Questions & Answers

need help writing a program to look for doubles

to determine if two two doubles are equal, we check to see if their absolute difference is very close to zero. . .if two numbers are less than .00001 apart, theyre equal. keep a count field in each record (as you did in p5). once the list is complete, ask the user to see if an element is on... (2 Replies)
Discussion started by: rickym2626
2 Replies

6. Shell Programming and Scripting

grep for words in file

Hi Please can you help me on this: How to grep for multiple words in a file, BUT for every word found output it to a new line. regards FR (8 Replies)
Discussion started by: fretagi
8 Replies

7. Shell Programming and Scripting

grep words from txt

Queue on node in domain description : type : local max message len : 104857600 max queue depth : 5000 queue depth max event : enabled persistent msgs : yes backout threshold : 0 msg delivery seq :... (4 Replies)
Discussion started by: Daniel Gate
4 Replies

8. UNIX for Dummies Questions & Answers

Remove Doubles Without Sort?

Hi! I have concatenated two files which are wordlists, i.e., one word per line. The new file contains some doubles, but I cannot use sort and uniq as I need to keep the sort order that it is already in, which is not alphabetical, and uniq only compares adjacent lines, and the doubles are not on... (15 Replies)
Discussion started by: sudon't
15 Replies

9. Shell Programming and Scripting

How to grep the words with space between?

see I have a text like: 27-MAY 14:00 4 aaa 5.30 0.01 27-MAY 14:00 3 aaa 0.85 0.00 27-MAY 14:00 2 aaa 1.09 0.00 27-MAY 14:00 5 aaa 0.03 0.00 27-MAY 14:00... (3 Replies)
Discussion started by: netbanker
3 Replies

10. Shell Programming and Scripting

Grep only words containing specific string

Hello, I have two files. All urls are space seperated. source http://xx.yy.zz http://df.ss.sd.xz http://09.09.090.01 http://11.22.33 http://canada.xx.yy http://01.02.03.04 http://33.44.55 http://98.87.76.65 http://russia.xx.zz http://aa.tt.xx.zz http://1w.2e.3r.4t http://china.rr.tt ... (4 Replies)
Discussion started by: baris35
4 Replies
egrep(1)																  egrep(1)

NAME
egrep - search a file for a pattern using full regular expressions SYNOPSIS
/usr/bin/egrep [-bchilnsv] [-e pattern_list] [-f file] [strings] [file...] /usr/xpg4/bin/egrep [-bchilnsvx] [-e pattern_list] [-f file] [strings] [file...] The egrep (expression grep) utility searches files for a pattern of characters and prints all lines that contain that pattern. egrep uses full regular expressions (expressions that have string values that use the full set of alphanumeric and special characters) to match the patterns. It uses a fast deterministic algorithm that sometimes needs exponential space. If no files are specified, egrep assumes standard input. Normally, each line found is copied to the standard output. The file name is printed before each line found if there is more than one input file. /usr/bin/egrep The /usr/bin/egrep utility accepts full regular expressions as described on the regexp(5) manual page, except for ( and ), ( and ), { and }, < and >, and , and with the addition of: 1. A full regular expression followed by + that matches one or more occurrences of the full regular expression. 2. A full regular expression followed by ? that matches 0 or 1 occurrences of the full regular expression. 3. Full regular expressions separated by | or by a NEWLINE that match strings that are matched by any of the expressions. 4. A full regular expression that can be enclosed in parentheses ()for grouping. Be careful using the characters $, *, [, ^, |, (, ), and in full regular expression, because they are also meaningful to the shell. It is safest to enclose the entire full regular expression in single quotes '... '. The order of precedence of operators is [], then *?+, then concatenation, then | and NEWLINE. /usr/xpg4/bin/egrep The /usr/xpg4/bin/egrep utility uses the regular expressions described in the EXTENDED REGULAR EXPRESSIONS section of the regex(5) manual page. The following options are supported for both /usr/bin/egrep and /usr/xpg4/bin/egrep: -b Precede each line by the block number on which it was found. This can be useful in locating block numbers by context (first block is 0). -c Print only a count of the lines that contain the pattern. -e pattern_list Search for a pattern_list (full regular expression that begins with a -). -f file Take the list of full regular expressions from file. -h Suppress printing of filenames when searching multiple files. -i Ignore upper/lower case distinction during comparisons. -l Print the names of files with matching lines once, separated by NEWLINEs. Does not repeat the names of files when the pat- tern is found more than once. -n Precede each line by its line number in the file (first line is 1). -s Work silently, that is, display nothing except error messages. This is useful for checking the error status. -v Print all lines except those that contain the pattern. /usr/xpg4/bin/egrep The following option is supported for /usr/xpg4/bin/egrep only: -x Consider only input lines that use all characters in the line to match an entire fixed string or regular expression to be matching lines. The following operands are supported: file A path name of a file to be searched for the patterns. If no file operands are specified, the standard input is used. /usr/bin/egrep pattern Specify a pattern to be used during the search for input. /usr/xpg4/bin/egrep pattern Specify one or more patterns to be used during the search for input. This operand is treated as if it were specified as -epattern_list. USAGE
See largefile(5) for the description of the behavior of egrep when encountering files greater than or equal to 2 Gbyte ( 2**31 bytes). See environ(5) for descriptions of the following environment variables that affect the execution of egrep: LC_COLLATE, LC_CTYPE, LC_MES- SAGES, and NLSPATH. The following exit values are returned: 0 If any matches are found. 1 If no matches are found. 2 For syntax errors or inaccessible files (even if matches were found). See attributes(5) for descriptions of the following attributes: /usr/bin/egrep +-----------------------------+-----------------------------+ | ATTRIBUTE TYPE | ATTRIBUTE VALUE | +-----------------------------+-----------------------------+ |Availability |SUNWcsu | +-----------------------------+-----------------------------+ |CSI |Not Enabled | +-----------------------------+-----------------------------+ /usr/xpg4/bin/egrep +-----------------------------+-----------------------------+ | ATTRIBUTE TYPE | ATTRIBUTE VALUE | +-----------------------------+-----------------------------+ |Availability |SUNWxcu4 | +-----------------------------+-----------------------------+ |CSI |Enabled | +-----------------------------+-----------------------------+ fgrep(1), grep(1), sed(1), sh(1), attributes(5), environ(5), largefile(5), regex(5), regexp(5), XPG4(5) Ideally there should be only one grep command, but there is not a single algorithm that spans a wide enough range of space-time tradeoffs. Lines are limited only by the size of the available virtual memory. /usr/xpg4/bin/egrep The /usr/xpg4/bin/egrep utility is identical to /usr/xpg4/bin/grep -E (see grep(1)). Portable applications should use /usr/xpg4/bin/grep -E. 23 May 2005 egrep(1)
All times are GMT -4. The time now is 08:45 AM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy