Shell Programming and Scripting: post 302860149 by khaled79 on Friday 4th of October 2013 04:35:17 PM
Frequent words and trigraphs in text

Hello all,
How can I get the 30 most frequent words in a text, and the most frequent trigraphs (three consecutive characters, in the order they appear in the text)?

Note that the text is not English; it is Arabic.

I would like the result to look like this:

Code:
 
top 30 words 
abdbdns
asddd
wqwfqw
 
top 30 trigraphs 
abc
sed
asd

Thanks
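
For reference, one possible way to approach this is sketched below. It is written in Python rather than shell, because Python 3 strings are Unicode-aware and Arabic letters need no locale setup. Several things here are assumptions, not part of the question: the function names `extract_words`, `top_words` and `top_trigraphs` are made up for illustration, a "word" is taken to be a run of letters, diacritics are stripped before counting, and trigraphs are counted inside words only (they never span a space).

```python
import re
import unicodedata
from collections import Counter

# Runs of letters only: \w minus digits and underscore. For str patterns
# in Python 3 this is Unicode-aware, so Arabic letters match.
WORD_RE = re.compile(r"[^\W\d_]+")


def extract_words(text):
    """Split text into words after dropping combining marks."""
    # Assumption: Arabic diacritics (harakat) should not make two spellings
    # of a word count separately, so every character in a Unicode "Mark"
    # category is removed before matching.
    stripped = "".join(
        ch for ch in text if not unicodedata.category(ch).startswith("M")
    )
    return WORD_RE.findall(stripped)


def top_words(text, n=30):
    """Return up to n (word, count) pairs, most frequent first."""
    return Counter(extract_words(text)).most_common(n)


def top_trigraphs(text, n=30):
    """Return up to n (trigraph, count) pairs, most frequent first."""
    counts = Counter()
    for word in extract_words(text):
        # Slide a three-character window over each word; words shorter
        # than three letters contribute nothing.
        for i in range(len(word) - 2):
            counts[word[i:i + 3]] += 1
    return counts.most_common(n)
```

To use it on a file, read the contents with `open(path, encoding="utf-8").read()`, pass the resulting string to `top_words` and `top_trigraphs`, and print the first element of each returned pair. If trigraphs should also cross word boundaries, the inner loop would need to run over the whole text instead of word by word.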
 

Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.