Frequency of Words in a File, sed script from 1980

07-11-2016


Code:

tr -cs A-Za-z\' '\n' | tr A-Z a-z | sort | uniq -c | sort -k1,1nr -k2 | sed ${1:-25} < book7.txt

This is not my script; it dates back to 1980, and it used to work fine, giving me the most used words in a text file.
Now the shell complains about an error in sed:

Code:

sed: -e expression #1, Character 2: missing command

The instructions for this one-liner say to put it into an executable script, but lazy people ask first, because in my former configuration it worked fine for finding the most used words in a large text file. Can anyone give me a hint about this sed error and its missing command? I am running it in the very directory where book7.txt is located.
Thanks in advance.

1in10


07-11-2016


One might guess that a current sed would work if you change:

Code:

sed ${1:-25}

in that pipeline to:

Code:

sed -n "1,${1:-25}p"

which would print the 1st 25 lines if no command line arguments are given to your script or the top X lines if the 1st argument to your script is X.
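As a rough sketch of that behaviour, the replacement can be wrapped in a function and fed a few numbered lines. The function name `top_lines` is made up for this illustration, and the function's own `$1` stands in for the script's first argument.

```shell
# Hypothetical helper: print the first N lines of standard input (default 25).
# -n suppresses sed's automatic printing; "1,Np" prints only lines 1 through N.
top_lines() {
    sed -n "1,${1:-25}p"
}

# Ten numbered lines as stand-in input; ask for the first three.
printf '%s\n' 1 2 3 4 5 6 7 8 9 10 | top_lines 3
```

Called without an argument, the same function falls back to 25 lines.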

Don Cragun


07-11-2016


Did you follow the instructions and run the executable script with an adequate parameter? In Bourne-compatible shells, ${1:-25} expands to the contents of the first positional parameter or, if that is missing, to 25; cf. man bash:

Quote:

${parameter:-word} Use Default Values.

With no parameter given, I get the same error message as you do, as sed can't cope with 25 as the sole "command". With a first positional parameter of e.g. 1,15!d, the above script will print the 15 topmost words in the text presented.

I'm a bit surprised that the script should ever have run with no parameters given.
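The default-value expansion can be tried on its own. This is only a small sketch: `show_count` is a hypothetical function whose `$1` plays the role of the script's first positional parameter.

```shell
# Hypothetical function: echo what ${1:-25} expands to for a given call.
show_count() {
    echo "${1:-25}"
}

show_count        # no argument: falls back to 25
show_count 10     # argument given: prints 10
show_count ""     # empty argument: the ':' form treats it like a missing one, so 25
```

At an interactive prompt `$1` is normally unset, so `sed ${1:-25}` there becomes `sed 25`, an address with no command after it.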

RudiC


07-11-2016


Where do you think tr is getting its input?

cfajohnson


07-11-2016


Quote:

Originally Posted by cfajohnson

Where do you think tr is getting its input?

Good point. A better chance at a working script might be any one of the following three commands:

Code:

{ tr -cs A-Za-z\' '\n' | tr A-Z a-z | sort | uniq -c | sort -k1,1nr -k2 | head -n ${1:-25}
} < book7.txt

or:

Code:

(tr -cs A-Za-z\' '\n' | tr A-Z a-z | sort | uniq -c | sort -k1,1nr -k2 | head -n ${1:-25}) < book7.txt

or:

Code:

tr -cs A-Za-z\' '\n' < book7.txt | tr A-Z a-z | sort | uniq -c | sort -k1,1nr -k2 | head -n ${1:-25}

Don Cragun


07-11-2016


@RudiC It worked under Debian squeeze.
@Don Cragun I will try the given options, many thanks, really.
@cfajohnson I do not know; I thought the input would be supplied by the < character.
While moving to another living space, I will try the given hints and reply which one solved the problem. It will take some days...

@Don Cragun All I get as an answer is that there is a wrong modifier in all three cases:

Code:

$ (-)

So I guess it would be better to make it an executable script to test it. Taking out the " - " character did not work either.
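The `$ (-)` message looks like what csh-family shells report when they meet `${1:-25}`, which is Bourne-shell syntax. That is a guess, since the post does not say which shell was in use. Under that assumption, the executable-script route could be sketched as below; the script name `wordfreq` and the temporary directory are made up for the illustration, and the sample sentence stands in for book7.txt.

```shell
# Write the pipeline to a script file. The #!/bin/sh line makes it run under sh
# regardless of which interactive shell starts it.
dir=$(mktemp -d)
cat > "$dir/wordfreq" <<'EOF'
#!/bin/sh
# Read text on standard input; the optional first argument is the number of lines to show.
tr -cs "A-Za-z'" '\n' | tr A-Z a-z | sort | uniq -c | sort -k1,1nr -k2 | head -n "${1:-25}"
EOF
chmod +x "$dir/wordfreq"

# Show the two most frequent words of a sample sentence.
printf 'to be or not to be\n' | "$dir/wordfreq" 2
```

With a real file the call would be along the lines of `./wordfreq 25 < book7.txt`, so the input redirection applies to the first `tr` in the pipeline.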

Well, I am probably unable to just apply a script. The following is from 2009 and should work as well for counting the frequency of words in a given text file.

Code:

  cat test.file | tr -d '[:punct:]' | tr ' ' '\n' | tr 'A-Z' 'a-z' | sort | uniq -c | sort -rn

I put in another .txt-file and it works fine.

Last edited by 1in10; 08-14-2016 at 06:20 PM.. Reason: solved

1in10

