Why randomize at all? You could scan kord, eliminating each word in turn if it is not the same length or does not use the same letters with the same frequency. 162,060 isn't a lot to brute-force.
Regards,
Alister
That certainly sounds like a much better solution. Thanks.
Can you supply some hints on how to achieve this? I'm not an experienced scripter.
Hi, Scrutinizer.
Quote:
Originally Posted by Scrutinizer
Hold on, there is nothing random about sort -r, obviously
--
Edit: Ah I see you probably mean (GNU) sort -R
Thanks for catching and correcting my too-hastily posted code.
Looking back at my history, I originally used:
but then I remembered that I had written unsort (back in 1996), and I didn't want to post that code, so I punted with GNU sort, and got the case wrong. Mea culpa.
As it turns out, Linux also provides an unsort, so I could have left it in, sigh ... cheers, drl
Quote:
Can you supply some hints how to achieve this? I'm not an experienced scripter..
The following reads the dictionary from stdin and takes the word for which anagrams are to be found as the first argument to the script. It is not very efficient (for that, an AWK or perl solution would be better), nor is it very convenient even if it is efficient enough for you: for example, you can't give it more than one word to match against per invocation. Perhaps it's enough to get you started on a better shell script (or an AWK/perl solution).
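A minimal bash sketch of that approach (an illustration, not necessarily alister's script): it reads the word list on stdin, takes the target word as the first argument, skips words of a different length, and compares sorted letters so that letter frequencies must match too. The function names `sort_letters` and `find_anagrams` are made up for this example.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: print the anagrams of "$1" found in a word list on stdin.

# sort_letters WORD: print the letters of WORD in sorted order ("listen" -> "eilnst").
sort_letters() {
    local word=$1 i
    for ((i = 0; i < ${#word}; i++)); do
        printf '%s\n' "${word:i:1}"      # one character per line
    done | sort | tr -d '\n'
}

# find_anagrams WORD: read one word per line from stdin and print every
# word that uses exactly the same letters as WORD (WORD itself excluded).
find_anagrams() {
    local target=$1 key word
    key=$(sort_letters "$target")
    while IFS= read -r word; do
        [ "${#word}" -eq "${#target}" ] || continue   # wrong length: cannot match
        [ "$word" != "$target" ] || continue          # skip the word itself
        if [ "$(sort_letters "$word")" = "$key" ]; then
            printf '%s\n' "$word"
        fi
    done
}

# Run only when a word was given on the command line.
if [ "$#" -gt 0 ]; then
    find_anagrams "$1"
fi
```

Saved under an assumed name such as `anagram.sh`, and assuming `kord` is the word-list file, it would be called as `./anagram.sh listen < kord`. The length check is cheap and filters out most words before the slower sort is run.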
Regards,
Alister
Last edited by alister; 06-19-2012 at 11:13 AM..
Reason: typos
It runs more slowly than my rough "try a random permutation and see if it matches a different word" script.
Ideally, I would like a list of all possible anagrams in Danish. So if I ever get this script done, it should start with word 1 in the word-list, check whether its letters can be rearranged to match one or more other words in the same word-list, then move on to the next word, and so on.
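One way to get that complete list in a single pass, sketched here under the assumption of bash 4 or later (associative arrays): give every word a key made of its sorted letters, collect the words per key, and print only the keys that ended up with more than one word. The function name `anagram_groups` is hypothetical.

```shell
#!/usr/bin/env bash
# anagram_groups: read one word per line from stdin and print each set of
# mutual anagrams on its own line, words separated by spaces.
anagram_groups() {
    local word key i
    declare -A groups        # sorted-letter key -> space-separated words (bash 4+)
    while IFS= read -r word; do
        [ -n "$word" ] || continue
        # Build the key: the letters of the word in sorted order.
        key=$(for ((i = 0; i < ${#word}; i++)); do
                  printf '%s\n' "${word:i:1}"
              done | sort | tr -d '\n')
        groups[$key]+="${groups[$key]:+ }$word"
    done
    for key in "${!groups[@]}"; do
        case ${groups[$key]} in
            *' '*) printf '%s\n' "${groups[$key]}" ;;   # keep sets of 2+ words
        esac
    done
}
```

Called as `anagram_groups < kord` (assuming `kord` is the word list), it prints the sets in no particular order. It starts one `sort` per word, so a list of this size will take a while; the same grouping in AWK or perl should be much faster.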
I also wonder if there is a (hopefully simple) way to generate a list of all possible combinations of letters.
@jeppe83
Hold on!
Are you trying to write an anagram cracker (like you would use for crosswords) against your Danish words list?
If so, you don't need to generate all the combinations at all.
Ideally you would have your words list indexed by a key made of the letters in the word sorted, with duplicate keys allowed. Then take the input string, sort the letters and look up the anagrams.
This is trivial in most modern database packages, and is often set as a final piece. I haven't seen this one set as Homework on a Shell course so we shall assume that this is hobby computing.
It is also fairly trivial in Shell, but the important part is the script to prepare your look-up file(s) with each record containing a sorted letter key field and the matching word. When working with flat files, splitting the data by word length into separate files should be faster, but it depends how many seconds you are prepared to wait for an answer.
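A rough bash sketch of such a preparation script and the matching look-up, assuming one flat file per word length with tab-separated "key, word" records. The names `letter_key`, `build_index`, `lookup` and the `len_N.tsv` file layout are invented for this example.

```shell
#!/usr/bin/env bash
# letter_key WORD: the letters of WORD in sorted order ("listen" -> "eilnst").
letter_key() {
    local word=$1 i
    for ((i = 0; i < ${#word}; i++)); do
        printf '%s\n' "${word:i:1}"
    done | sort | tr -d '\n'
}

# build_index DIR: read words from stdin and append "key<TAB>word" records
# to DIR/len_N.tsv, where N is the word length. Duplicate keys are allowed.
# Records are appended, so start from an empty DIR when rebuilding.
build_index() {
    local dir=$1 word
    mkdir -p "$dir"
    while IFS= read -r word; do
        [ -n "$word" ] || continue
        printf '%s\t%s\n' "$(letter_key "$word")" "$word" >> "$dir/len_${#word}.tsv"
    done
}

# lookup DIR LETTERS: print every indexed word made of exactly LETTERS.
lookup() {
    local dir=$1 letters=$2 key file
    key=$(letter_key "$letters")
    file="$dir/len_${#letters}.tsv"
    [ -f "$file" ] || return 0          # no words of that length
    awk -F '\t' -v k="$key" '$1 == k { print $2 }' "$file"
}
```

The slow part (`build_index index < kord`, assuming `kord` is the word list and `index` an empty directory) runs once; after that `lookup index tinsel` only has to scan the file for six-letter words.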
Did you mention anything about your computer or your own skills?
Operating System and version.
Preferred Shell.
Any programming languages which you know?
P.S. I have an old-technology version: the Longman Anagram Dictionary (a book). It is ordered first by the length of the word, then by the sorted letters of the word in alphabetical order. If you can beat me with that book in my hand, your program is good!
I realize I don't need to generate all combinations, but I would like to know how anyway.
It's not homework. I'm a linguist, and years ago I took a course, "information technology for linguists", where I was introduced to the wonders of grep, sed, etc. I've recently taken it up again, just for fun and to see how much I can remember. I don't aspire to be a scripting wizard.
I use
GNU bash, version 4.1.5(1)-release (x86_64-pc-linux-gnu)
I only have experience with bash-scripts and a little knowledge of awk.
Generating every permutation of a string exactly once is not a trivial piece of code. I last wrote a program to do this in Basic-A (for those with long memories) to drive a stage light show.
The essence is that you take each character in turn, remove it from its position in the original string, and then insert it into every possible position in the remaining string (including front and back). At the end of the process you have every possible permutation exactly once. Purists would take account of duplicate letters (I didn't).
Somebody who has this GNU bash will be able to find a substring function (like that in Basic-A) which makes this easy.
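For what it's worth, here is one way to do it with bash substring expansion (`${var:offset:length}`). This is a sketch of the common recursive formulation (pick each character in turn as the next letter, then permute what is left), not the exact insert-at-every-position method described above. The function name `permute` is made up.

```shell
#!/usr/bin/env bash
# permute PREFIX REST: print every ordering of the characters in REST,
# each one prefixed with PREFIX, one per line.
permute() {
    local prefix=$1 rest=$2 i
    if [ -z "$rest" ]; then
        printf '%s\n' "$prefix"          # nothing left to place: one result
        return
    fi
    for ((i = 0; i < ${#rest}; i++)); do
        # Move character i to the end of the prefix, recurse on the others.
        permute "$prefix${rest:i:1}" "${rest:0:i}${rest:i+1}"
    done
}
```

Start it with an empty prefix, e.g. `permute "" abc`. Like the version described above, it takes no account of duplicate letters, so repeated letters produce repeated lines; piping through `sort -u` removes them. The output has n! lines for n letters, so this is only practical for short words.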
Last edited by methyl; 06-19-2012 at 08:03 PM..
Reason: try to remove some ambiguity.