07-28-2012
Hi, many thanks.
I tried the regex you provided.
Here is the input:
Quote:
The temperature was 32.8 degrees Celsius. His B.Sc. degree was deemed insufficient. He owed the bank USD 4000.50 which he had not paid back. On 27.07.2004 a major earthquake occurred. It was 17.05 by the clock.
What I need is for the regex to split only at full stops that actually end a sentence, not at those inside numbers, dates or abbreviations.
The expected output would be:
Quote:
The temperature was 32.8 degrees Celsius.
His B.Sc. degree was deemed insufficient.
He owed the bank USD 4000.50 which he had not paid back.
On 27.07.2004 a major earthquake occurred.
It was 17.05 by the clock.
and not, for example:
Quote:
His B.
Sc.
degree was deemed insufficient.
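To pin down the rule I have in mind, here is a minimal sketch in Python. It is not the regex I was given and not what I ran; the function name `split_sentences` and the heuristic itself are my own assumptions for illustration. The idea is to treat a full stop as a sentence end only when it is followed by whitespace and then an uppercase letter, or when it ends the text.

```python
import re

# Assumed heuristic (illustration only): a sentence boundary is whitespace
# that comes right after a full stop and right before an uppercase letter.
# Full stops inside "32.8", "27.07.2004" or "B.Sc" are not followed by
# whitespace, and "Sc. degree" is followed by a lowercase letter, so none
# of them is treated as a boundary.
_BOUNDARY = re.compile(r"(?<=\.)\s+(?=[A-Z])")


def split_sentences(text):
    """Split text into sentences using the heuristic described above."""
    return [part for part in _BOUNDARY.split(text.strip()) if part]


if __name__ == "__main__":
    sample = (
        "The temperature was 32.8 degrees Celsius. "
        "His B.Sc. degree was deemed insufficient. "
        "He owed the bank USD 4000.50 which he had not paid back. "
        "On 27.07.2004 a major earthquake occurred. "
        "It was 17.05 by the clock."
    )
    for sentence in split_sentences(sample):
        print(sentence)
```

This handles the sample above, but it would still split wrongly after an abbreviation that is followed by a capitalised word (for example "Dr. Smith"), so a real solution would probably need a list of known abbreviations as well.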
The regex which you furnished, and which I applied as a Unix regex, gave me the following:
Quote:
His B.Sc.
degree was deemed insufficient.
I tried quite a few tweaks, but they made it worse.
Are there any workarounds, please? I have a huge database of strings of this type and need to identify the valid sentences in them.
Many thanks