A little help using grep for anagram solving with BASH


 
# 1  
Old 07-08-2009

Hi guys,

I have been making a simple script for looking for anagram solutions in a word list (a file of 22k or so words).

At the moment it functions like so:

User enters an 8 character string (whatever letters you want to find anagrams of, or solve rather)

The script moves all the words from the list that contain the first character to a temp file, then filters this to another temp file for those containing the 2nd character as well and so on until the only words left contain all 8 characters.

The problem I've hit is that there are, of course, words left at the end that contain all 8 characters but also contain others, and these are not filtered out.

I'm sure there is a simple way of removing these, and it may just be because I'm tired, but it's not going anywhere at the moment!

My other thought was: is there an option for grep that tells it to look for the letters of a string in any order? I.e. you grep for the string 'Man' and it gives you back lines that match 'Man', 'nam', 'nma', 'anm' or 'mna' (if they were real words, of course). This would obviously make the code rather simple, and is possibly a bit hopeful.

Anyway, here's the code so far. I'm trying to avoid sed for the time being, out of habit. I'm looking for a bit of advice rather than a complete solution.

Code:
function Start(){
    echo "Enter 8 Letter String."
    read letters
    # Store each of the 8 letters in its own array element.
    for a in $(seq 1 8)
    do
        array[$a]=$(echo "$letters" | cut -c"$a")
    done
}

function FilterList(){
    # One pass per letter: keep only the words that contain letter b.
    for b in $(seq 1 8)
    do
        c=$(($b+1))
        grep "${array[$b]}" "TempList$b" > "TempList$c"
        rm "TempList$b"
    done
    cat TempList9
}

cat WordList > TempList1
Start
FilterList

Thanks for any advice offered.

Edit: N.B. this is not a pressing matter, just a small exercise I set myself to try and refresh my memory of shell scripting, so I am open to discussion of various other approaches you might use.
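
A note on the grep question above: grep has no option for matching the letters of a string in any order. One common workaround is to sort the letters of both the input and each word, then compare the results. The following is only a minimal sketch of that idea, not part of the original script: `signature` and `find_anagrams` are hypothetical helper names, and the words are an inline sample standing in for the real WordList file.

```shell
#!/bin/bash
# Hypothetical helper: print the letters of a word lowercased and sorted,
# so that "Man", "nam" and "mna" all produce the same string.
signature() {
    printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]' | fold -w1 | sort | tr -d '\n'
    echo
}

# Hypothetical helper: read words on stdin and print those whose sorted
# letters match the sorted letters of $1.
find_anagrams() {
    local target word
    target=$(signature "$1")
    while IFS= read -r word; do
        [[ $(signature "$word") == "$target" ]] && printf '%s\n' "$word"
    done
    return 0
}

# Inline sample instead of the real word list (assumption for illustration).
printf '%s\n' Man Manager Mandate name | find_anagrams nma
```

With the real list this would be run as `find_anagrams "$letters" < WordList`. Because whole sorted strings are compared, repeated letters and extra letters are both handled without any temp files, at the cost of running a small pipeline for every word.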
# 2  
Old 07-08-2009
Stuff like this worries me. It might be homework. Or it might just be a private exercise. Well, I'll take a chance. Here is one solution...
Code:
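$ cat bashtest
#! /bin/bash
x=abcdefg
# Split the input into an array with one letter per element.
eval letters=( $(echo $x | sed 's/./& /g') )
exec < candidates
while read candidate ; do
        temp=$candidate
        # Delete one occurrence of each input letter from the candidate.
        for((k=0; ${#letters[@]}>k; k++ )) ; do
                temp=${temp/${letters[k]}/}
        done
        # Test once, after the loop, so a short candidate is not printed repeatedly.
        [[ -z $temp ]] && echo $candidate
done
exit 0
$ cat candidates
abcdefg
axbcdef
cgfaebd
$ ./bashtest
abcdefg
cgfaebd
$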

# 3  
Old 07-09-2009
Homework? Ha, no, homework is quite a few years behind me!

Just an exercise I've set myself to revise my shell scripting, because I've got interviews coming up.

Thanks for the reply. I'll take a proper look in the morning at the office; I have to get to sleep soon.

If anyone has any thoughts on other variations on my idea that might work, I'd like to hear them. I'll have another crack at finishing it in the next couple of days when I'm done with LoadRunner.

---------- Post updated at 09:19 AM ---------- Previous update was at 12:19 AM ----------

Thanks for the reply, Perderabo. I tried running your script to see how it functioned, but without success. I have not used sed before, so it's not really ideal for helping me revise my scripting knowledge (which is of a fairly basic standard), but thank you nonetheless for taking the time to reply.

What I was really after was either an option for grep that I had not thought of, or suggestions on how one might go about removing the words from the final temp file that contain extra, unwanted characters.

I.e. you want to find solutions for the letters 'nma' and you get:
Man
Manager
Mandate
etc etc

Of course the only one you actually want is 'Man', but the others appear because they still contain all the letters...

I'll see what I can come up with today but any thoughts/discussion on the subject is welcome and hopefully I can expand my knowledge a little by getting involved with the online community.
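
For the 'Man', 'Manager', 'Mandate' example above, one possible extra filter is to keep only the words that are exactly as long as the input. This is a rough sketch under stated assumptions: `same_length` is a hypothetical helper name, and the words are an inline sample of what the per-letter filtering might leave behind.

```shell
#!/bin/bash
# Hypothetical helper: read words on stdin and keep only those that are
# exactly as long as the input letters in $1.
# -x makes grep match whole lines; .\{N\} means "any N characters".
same_length() {
    grep -x ".\{${#1}\}"
}

# "Manager" and "Mandate" contain n, m and a but are too long.
printf '%s\n' Man Manager Mandate | same_length nma
```

This only finishes the job when the input letters are all different. With a repeated input letter, a word of the right length that uses a letter the wrong number of times would still get through, so doubled letters need separate handling.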

EDIT:


I came up with an idea. I'm going to try something along the lines of:

Load all the letters of the alphabet into an array.
Remove those contained in the user-entered variable.
Remove any words from the word list that contain letters left in the array.

In theory this should leave just the words that contain all the input letters and no others. I still need to work out something for double letters, but I'll work on that later.

This should also take away the need to create so many temp files to whittle down the results...
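
The "remove words containing any other letter" step can probably be done without building an alphabet array, by using a negated bracket expression in grep. A minimal sketch, assuming the input consists of plain letters only; `only_these_letters` is a hypothetical helper name and the words are an inline sample.

```shell
#!/bin/bash
# Hypothetical helper: read words on stdin and drop any word containing a
# character that is not one of the input letters in $1.
# [^...] matches any character outside the set, -v inverts the match so
# those lines are removed, and -i ignores case.
only_these_letters() {
    grep -iv "[^$1]"
}

printf '%s\n' Man Manager Mandate Nan | only_these_letters nma
```

On this sample the filter keeps 'Man' and 'Nan'. 'Nan' survives because it uses no foreign letters even though it lacks an 'm', so this step still has to be combined with the per-letter filtering, and it does nothing about double letters.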

Last edited by Donthommo; 07-09-2009 at 05:42 AM..