Regex to identify a full-stop as a sentence delimiter
Post 302678469 by Chirel on Saturday 28th of July 2012 04:08:39 AM
Hmm, I guess that when I write in English it's not clear, so let's talk regex.

I said:
Quote:
Maybe you could say that a full stop must be followed by a \s (whitespace) and a capital letter, or by end of file?
That could mean something like: '\.\s[A-Z]'
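
To make the quoted rule concrete, here is a minimal sketch in Python. It is only one possible reading, not a definitive solution: it assumes the character class after the full stop is meant to be whitespace (`\s`), it adds the "end of file" case with `\Z`, and it uses a lookahead so the capital letter stays with the next sentence. The names `SENTENCE_END` and `split_sentences` are made up for this illustration.

```python
import re

# Assumption: the class after the full stop is read as whitespace ("\s");
# "\w" would match a word character such as a letter or a digit instead.
# A full stop ends a sentence when it is followed by whitespace and a capital
# letter (lookahead, so the capital is not consumed) or by the end of the text.
SENTENCE_END = re.compile(r"\.(?=\s+[A-Z]|\s*\Z)")


def split_sentences(text):
    """Return the sentences of text, each keeping its closing full stop."""
    sentences = []
    start = 0
    for match in SENTENCE_END.finditer(text):
        sentences.append(text[start:match.end()].strip())
        start = match.end()
    tail = text[start:].strip()
    if tail:
        sentences.append(tail)  # trailing text with no final full stop
    return sentences


if __name__ == "__main__":
    sample = "The file is 3.5 MB. It was sent to example.com today. Done."
    for sentence in split_sentences(sample):
        print(sentence)
```

With this reading, the dots in "3.5" and "example.com" are left alone because no whitespace plus capital letter follows them. Abbreviations such as "Mr. Smith" would still be split, so the rule is a starting point rather than a complete sentence splitter.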
 

FMT(1)                    BSD General Commands Manual                    FMT(1)

NAME
     fmt -- simple text formatter

SYNOPSIS
     fmt [-cmnps] [-d chars] [-l num] [-t num] [goal [maximum] | -width | -w width] [file ...]

DESCRIPTION
     The fmt utility is a simple text formatter which reads the concatenation of input files (or standard input if none are given) and produces on standard output a version of its input with lines as close to the goal length as possible without exceeding the maximum. The goal length defaults to 65 and the maximum to 10 more than the goal length. Alternatively, a single width parameter can be specified either by prepending a hyphen to it or by using -w. For example, ``fmt -w 72'', ``fmt -72'', and ``fmt 72 72'' all produce identical output. The spacing at the beginning of the input lines is preserved in the output, as are blank lines and interword spacing. Lines are joined or split only at white space; that is, words are never joined or hyphenated.

     The options are as follows:

     -c      Center the text, line by line. In this case, most of the other options are ignored; no splitting or joining of lines is done.

     -m      Try to format mail header lines contained in the input sensibly.

     -n      Format lines beginning with a '.' (dot) character. Normally, fmt does not fill these lines, for compatibility with nroff(1).

     -p      Allow indented paragraphs. Without the -p flag, any change in the amount of whitespace at the start of a line results in a new paragraph being begun.

     -s      Collapse whitespace inside lines, so that multiple whitespace characters are turned into a single space. (Or, at the end of a sentence, a double space.)

     -d chars
             Treat the chars (and no others) as sentence-ending characters. By default the sentence-ending characters are full stop ('.'), question mark ('?') and exclamation mark ('!'). Remember that some characters may need to be escaped to protect them from your shell.

     -l number
             Replace multiple spaces with tabs at the start of each output line, if possible. Each number spaces will be replaced with one tab. The default is 8. If number is 0, spaces are preserved.

     -t number
             Assume that the input files' tabs assume number spaces per tab stop. The default is 8.

     The fmt utility is meant to format mail messages prior to sending, but may also be useful for other simple tasks. For instance, within visual mode of the ex(1) editor (e.g., vi(1)) the command !}fmt will reformat a paragraph, evening the lines.

SEE ALSO
     mail(1), nroff(1)

HISTORY
     The fmt command appeared in 3BSD. The version described herein is a complete rewrite and appeared in FreeBSD 4.4.

AUTHORS
     Kurt Shoens
     Liz Allen (added goal length concept)
     Gareth McCaughan

BUGS
     The program was designed to be simple and fast - for more complex operations, the standard text processors are likely to be more appropriate. When the first line of an indented paragraph is very long (more than about twice the goal length), the indentation in the output can be wrong. The fmt utility is not infallible in guessing what lines are mail headers and what lines are not.

BSD                              June 25, 2000                              BSD