Regex to identify a full-stop as a sentence delimiter
Post 302678463 by Chirel on Saturday 28th of July 2012 03:51:00 AM
Hi,

The input and output you want are not clear to me, but here is a thought about parsing the full stop.

Maybe you could say that a full stop only counts as a sentence delimiter when it is followed by whitespace (\s) and a capital letter, or by the end of the file?
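
Here is a minimal sketch of that rule in Python, just to make the idea concrete. It assumes "followed by" means whitespace and then an ASCII capital letter, and that "end of file" means the end of the input string. The names SENTENCE_END and split_sentences are made up for this example.

```python
import re

# A full stop counts as a sentence delimiter only when it is followed by
# whitespace and a capital letter, or by optional whitespace and the end
# of the input. [A-Z] is an assumption: ASCII capitals only.
SENTENCE_END = re.compile(r"\.(?=\s+[A-Z]|\s*\Z)")


def split_sentences(text):
    """Split text at qualifying full stops (hypothetical helper)."""
    sentences = []
    start = 0
    for match in SENTENCE_END.finditer(text):
        sentence = text[start:match.end()].strip()
        if sentence:
            sentences.append(sentence)
        start = match.end()
    # Keep any trailing text that does not end in a full stop.
    tail = text[start:].strip()
    if tail:
        sentences.append(tail)
    return sentences


if __name__ == "__main__":
    sample = "The file is 3.5 MB. It lives on example.com and is small. Done."
    for sentence in split_sentences(sample):
        print(sentence)
```

The dots in "3.5" and "example.com" are left alone because no whitespace and capital letter follow them. The rule is still too eager for abbreviations: "Dr. Smith arrived." would be cut after "Dr.", so a list of known abbreviations would probably be needed on top of it.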
 

DICTION(1)							   User commands							DICTION(1)

NAME
       diction - print wordy and commonly misused phrases in sentences

SYNOPSIS
       diction [-b] [-d] [-f file [-n|-L language]] [file...]
       diction [--beginner] [--ignore-double-words] [--file file [--no-default-file|--language language]] [file...]
       diction -h|--help
       diction --version

DESCRIPTION
       Diction finds all sentences in a document that contain phrases from a
       database of frequently misused, bad or wordy diction. It further checks
       for double words. If no files are given, the document is read from
       standard input. Each found phrase is enclosed in [ ] (brackets).
       Suggestions and advice, if any and if asked for, are printed headed by
       a right arrow ->.

       A sentence is a sequence of words that starts with a capitalised word
       and ends with a full stop, double colon, question mark or exclamation
       mark. A single letter followed by a dot is considered an abbreviation,
       so it does not terminate a sentence. Various multi-letter abbreviations
       are recognized; they do not terminate a sentence either, and neither do
       fractional numbers.

       Diction understands cpp(1) #line lines for being able to give precise
       locations when printing sentences.

OPTIONS
       -b, --beginner
              Complain about mistakes typically made by beginners.

       -d, --ignore-double-words
              Ignore double words and do not complain about them.

       -s, --suggest
              Suggest better wording, if any.

       -f file, --file file
              Read the user specified database from the specified file in
              addition to the default database.

       -n, --no-default-file
              Do not read the default database, so only the user-specified
              database is used.

       -L language, --language language
              Set the phrase file language.

       -h, --help
              Print a short usage message.

       --version
              Print the version.

ERRORS
       On usage errors, 1 is returned. Termination caused by lack of memory is
       signalled by exit code 2.

EXAMPLE
       The following example first removes all roff constructs and headers
       from a document and feeds the result to diction with a German database:

              deroff -s file.mm | diction -L de | fmt

ENVIRONMENT
       LC_MESSAGES=de|en
              specifies the message language and is also used as default for
              the phrase language. The default language is en.

FILES
       /usr/share/diction/*
              databases for various languages

AUTHOR
       This program is GNU software, copyright 1997-2005 Michael Haardt
       <michael@moria.de>. The English phrase file contains contributions by
       Greg Lindahl <lindahl@pbm.com>, Wil Baden, Gary D. Kline, Kimberly
       Hanks and Beth Morris.

       This program is free software; you can redistribute it and/or modify it
       under the terms of the GNU General Public License as published by the
       Free Software Foundation; either version 2 of the License, or (at your
       option) any later version.

       This program is distributed in the hope that it will be useful, but
       WITHOUT ANY WARRANTY; without even the implied warranty of
       MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
       General Public License for more details.

       You should have received a copy of the GNU General Public License
       along with this program. If not, write to the Free Software
       Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307,
       USA.

HISTORY
       There has been a diction command on old UNIX systems, which is now part
       of the AT&T DWB package. The original version was bound to roff by
       enforcing a call to deroff. This version is a reimplementation and must
       run in a pipe with deroff(1) if you want to process roff documents.
       Similarly, you can run it in a pipe with dehtml(1) or detex(1) to
       process HTML or TeX documents.

SEE ALSO
       deroff(1), fmt(1), style(1)

       Cherry, L.L.; Vesterman, W.: Writing Tools--The STYLE and DICTION
       programs, Computer Science Technical Report 91, Bell Laboratories,
       Murray Hill, N.J. (1981), republished as part of the 4.4BSD User's
       Supplementary Documents by O'Reilly.

       Strunk, William: The elements of style, Ithaca, N.Y.: Priv. print.,
       1918, http://coba.shsu.edu/help/strunk/

GNU                              June 09, 2006                      DICTION(1)