Quote:
Originally Posted by
Lipidil
How do you write a script that counts the number of times a word appears in a file and output it?
The first thing you need to clarify is: what is a "word"? This is not as obvious as it looks: "SMARCB1" in your example is seemingly one, but is "SMARCB1-Jon" also one word or is it two words, connected by a dash?
Usually it is a matter of some characters continuing a word and others ending one. In your example, all letters (lowercase and uppercase) as well as digits obviously continue a word, while the semicolon (and probably blanks) ends one.
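To illustrate only this definition question (not the counting itself), here is a minimal sketch in Python. The function name split_words, its "continuing" parameter and the sample string are made up for illustration and are not taken from your data; the point is merely that the same input yields different words depending on whether the dash counts as a continuing character.

```python
import re

def split_words(text, continuing="A-Za-z0-9"):
    # A "word" is a maximal run of continuing characters;
    # any other character ends the current word.
    # `continuing` is inserted into a regex character class as-is.
    return re.findall(f"[{continuing}]+", text)

# Hypothetical sample input, not taken from the original file.
sample = "SMARCB1-Jon;SMARCB1"

# Dash ends a word: "SMARCB1-Jon" becomes two words.
print(split_words(sample))

# Dash continues a word: "SMARCB1-Jon" stays one word.
print(split_words(sample, "A-Za-z0-9-"))
```

Which of the two results is "correct" depends entirely on your definition, which is why you need to settle that first.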
Once you have this clearly defined, write a filter which inserts a line break at every "non-continuing character", sort the result, then simply count.
You might wonder why, instead of providing a ready-to-use command line, we are a bit vague about what to do. First, this is the beginners forum. We do not want to spoil your joy of learning the trade yourself, so we give pointers to guide you but won't spoon-feed you solutions. Second, what you present looks suspiciously like homework. If it is, there is a special forum for this, and your thread can be transferred there. Ask any moderator and we will gladly assist you. Note, however, that special rules apply in that forum, and you will then have to hand in the required questionnaire.
I hope this helps.
bakunin