Help with text analysis - UNIX


 
#1, posted 03-30-2011

Hey Guys

I posted yesterday about trying to count how many times each separate word occurs in a text file, e.g. walle.txt.
I want the output to be a list of words, each followed by a number indicating how many times it came up in the file, e.g.:
cat 20
the 11
if 40

I'm completely new to Unix and am currently using the bash terminal on a MacBook Pro. I am running this on some example file scripts. Is what I'm trying to do possible? If so, please help.

Thanks
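
Yes, this is possible with the standard tools that ship with macOS. Below is a minimal sketch, not the only way to do it. The function name word_freq is made up for illustration, and it assumes a "word" is any run of letters and that upper and lower case should be counted together, so "The" and "the" are the same word and "it's" is split into "it" and "s".

```shell
#!/bin/sh
# word_freq FILE  (hypothetical helper name, for illustration only)
# Prints every distinct word in FILE followed by its count, most frequent first.
word_freq() {
    # tr -cs   : replace each run of non-letter characters with one newline,
    #            leaving one word per line
    # tr       : fold upper case to lower case so "The" and "the" match
    # sort     : put identical words next to each other, which uniq needs
    # uniq -c  : collapse each group of identical lines into "count word"
    # sort -rn : order numerically by count, highest first
    # awk      : skip any blank entry and swap the columns to "word count"
    tr -cs '[:alpha:]' '\n' < "$1" |
        tr '[:upper:]' '[:lower:]' |
        sort |
        uniq -c |
        sort -rn |
        awk 'NF == 2 { print $2, $1 }'
}
```

After pasting the function into the terminal, running word_freq walle.txt should print one "word count" pair per line in the same shape as the example in the question. Words with equal counts may come out in any order.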
 