Comparing lines within a word list
Post 302954699 by drl on Thursday 10th of September 2015 08:46:37 AM
Hi.

Off the top of my head (but before more than one cup of coffee):

1) I agree with RudiC. The OP asked for all sets.

2) I think I'd try applying the idea of a sieve as in https://en.wikipedia.org/wiki/Sieve_of_Eratosthenes

So I'd sort the list and, starting from the top, record matches and erase the matched items from the original. Then move to the next. Perhaps hashes would be faster than going through the list a number of times, but I haven't thought about that in detail.

I'd save the sets in filenames like, for example, for the string abc: _bc, a_c, and ab_
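
As a rough illustration of the hash idea (not the sieve itself, and not tested against the OP's data), here is a minimal Python sketch. It assumes the goal is to collect words of equal length that differ in exactly one position, which is my reading of the _bc / a_c / ab_ naming. The function name `group_by_wildcard` and the sample words are made up for the example; it returns the sets in a dictionary keyed by pattern rather than writing files.

```python
from collections import defaultdict


def group_by_wildcard(words):
    """Map each one-blank pattern (e.g. 'a_c') to the sorted words that fit it."""
    groups = defaultdict(set)
    for word in words:
        for i in range(len(word)):
            # Blank out position i to form the set name, e.g. abc -> a_c.
            pattern = word[:i] + "_" + word[i + 1:]
            groups[pattern].add(word)
    # Keep only patterns shared by at least two distinct words.
    return {p: sorted(ws) for p, ws in groups.items() if len(ws) > 1}


if __name__ == "__main__":
    sample = ["cat", "cot", "cut", "bat", "dog"]  # made-up sample list
    for pattern, members in sorted(group_by_wildcard(sample).items()):
        print(pattern, " ".join(members))
```

Each word is touched once per character, so this makes a single pass over the list instead of repeated scans. Writing each set to a file named after its pattern would be a small extension.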

The list is presumably from some work, perhaps a dictionary as mentioned by the OP, so it is not as large as a complete set of permutations could be (26 items taken 3 at a time, some 15K if I did that correctly). I tried a few initial stabs using a common list of words, 25143 items total, of which 800 were 3 characters long.
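
To check the arithmetic above, a small Python sketch follows. The helper names are hypothetical, and it assumes "26 items taken 3 at a time" means ordered arrangements without repeated letters; allowing repeats gives a somewhat larger bound. The last helper shows how one might count the words of a given length in a list held in memory.

```python
import math


def arrangements_without_repeats(alphabet_size, length):
    # Ordered selections with no letter reused: n! / (n - k)!
    return math.perm(alphabet_size, length)


def arrangements_with_repeats(alphabet_size, length):
    # Ordered selections where letters may repeat: n ** k
    return alphabet_size ** length


def count_by_length(words, length):
    # How many words in the list have exactly the given length.
    return sum(1 for w in words if len(w) == length)


if __name__ == "__main__":
    print(arrangements_without_repeats(26, 3))
    print(arrangements_with_repeats(26, 3))
    print(count_by_length(["cat", "cot", "horse", "ox"], 3))  # made-up words
```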

So perhaps an awk or perl script to start, though compiled code might be needed if that's too slow.

3) I wish that the OP had given more information about why this is needed.

Best wishes ... cheers, drl