If your file contains many paragraphs, say several thousand, then you may wish to avoid creating additional processes.
The perl code listed below reads the file one paragraph at a time and wraps the records that are not tagged with % internally, using the standard (but possibly not installed) perl module Text::Wrap. There is a bit of overhead in splitting each paragraph from a scalar into an array of lines. That could be optimized if necessary, but reading by paragraph tends to simplify the code. Everything is, as usual, a trade-off.
The core of the solution is listed after the versions of OS, commands, and modules are noted.
% ./s1
Environment: LC_ALL = C, LANG = C
(Versions displayed with local utility "version")
OS, ker|rel, machine: Linux, 2.6.26-2-amd64, x86_64
Distribution : Debian GNU/Linux 5.0.8 (lenny)
bash GNU bash 3.2.39
perl 5.10.0
divepm (local) 1.4
1.06 warnings
1.04 strict
2006.1117 Text::Wrap
-----
Input data file data1:
%
1 2 3 4 5 6 7 8
12345678901234567890123456789012345678901234567890123456789012345678901234567890
%
Here is the first paragraph of type 1 which is very
long and should be formatted.
%
Here is the second paragraph of type 2 which should stay as it is.
%
Here is the third paragraph of type 1 which is very
long and should be well formatted.
%
Here is the fourth paragraph of type 2
which should stay as it is.
%
-----
perl code:
#!/usr/bin/env perl
# @(#) p1 Demonstrate reading paragraphs, re-formatting.
use warnings;
use strict;
use Text::Wrap;
my (@p);
$/ = "\n\n"; # Records are paragraphs, ending with empty line.
$Text::Wrap::columns = 78;
while (<>) {
    if (m{^%}) {
        print;
    }
    else {
        @p = split( /\n/, $_ ); # split paragraph into array of lines
        print wrap( '', '', @p );
        print "$/";
    }
}
exit(0);
-----
Results:
%
1 2 3 4 5 6 7 8
12345678901234567890123456789012345678901234567890123456789012345678901234567890
%
Here is the first paragraph of type 1 which is very long and should be
formatted.
%
Here is the second paragraph of type 2 which should stay as it is.
%
Here is the third paragraph of type 1 which is very long and should be well
formatted.
%
Here is the fourth paragraph of type 2
which should stay as it is.
%
The data file was modified to illustrate actual re-formatting, as well as avoidance of it.
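If the splitting overhead mentioned above ever matters, one possible variation is to join the lines inside the scalar with tr and hand the single string to wrap. This is only a sketch under assumptions: reflow is a hypothetical helper name that is not part of p1, and the two sample records are modeled on data1 with record boundaries that are a guess.

```perl
#!/usr/bin/env perl
# Sketch only: wrap a paragraph without first splitting it into an array.
use warnings;
use strict;
use Text::Wrap qw(wrap);

$Text::Wrap::columns = 78;

# reflow() is a hypothetical helper name, not part of the script p1.
sub reflow {
    my ($para) = @_;
    return $para if $para =~ m{^%};    # %-tagged record: pass through as is
    $para =~ s/\n+\z//;                # drop the paragraph terminator
    $para =~ tr/\n/ /;                 # join the lines in place, no array
    return wrap( '', '', $para ) . "\n\n";
}

# Sample records modeled on data1 (the record boundaries are an assumption).
my @records = (
    "Here is the first paragraph of type 1 which is very\n"
      . "long and should be formatted.\n\n",
    "%\nHere is the fourth paragraph of type 2\n"
      . "which should stay as it is.\n\n",
);
print reflow($_) for @records;
```

To run this over a file instead of the built-in samples, the same loop as in p1 should work: set $/ = "\n\n" and print reflow($_) for each record read with <>. Whether this is measurably faster than the split version would need to be timed on real data.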
fmt(1) General Commands Manual fmt(1)
NAME
fmt - Formats mail messages prior to sending
SYNOPSIS
fmt [-width] file...
DESCRIPTION
The fmt command reads the input file or files, or standard input if no files are specified, and writes to standard output a version of the
input with lines of a length as close as possible to width columns. (Because fmt is internationalized software, the number of display
columns is not necessarily equivalent to the number of bytes.)
The fmt command both joins and splits lines to achieve the desired width, but words are never joined or split; spaces are always preserved,
and lines are split at spaces only. In effect, fmt ignores newline characters in the input and wraps words to make lines as close as
possible to width columns, resulting in individual lines of varying length but a consistent (new) text width overall. Because blank lines are
always preserved, fmt does not merge paragraphs separated by blank lines.
If you specify more than one file, the files are concatenated as input to fmt. If you do not specify -width, the default line length is 72
columns. Spacing at the beginning of input lines is always preserved in the output.
The fmt command is generally used to format mail messages to improve their appearance before they are sent. It may also be useful,
however, for other simple formatting tasks. For example, when you are using vi, you can use the command :%!fmt -60 to reformat your text so
that all lines are approximately 60 columns long.
NOTES
The fmt command is a fast, simple formatting program. Standard text editing programs are more appropriate than fmt for complex formatting
operations. Do not use the fmt command if the message contains embedded messages or preformatted information from other files. This
command formats the heading information in embedded messages and may change the format of preformatted information.
EXAMPLES
file1 contains these lines:
Australia is an island-continent, home to many very interesting plants and animals.
To reformat this text to a narrower width, enter:
fmt -30 file1
This results in the following, displayed on your screen:
Australia is an island-continent, home to many very interesting plants and animals.
To make file1 wider, enter:
fmt -60 file1
This results in:
Australia is an island-continent, home to many very interesting plants and animals.
To format a message you have created with the mailx editor, at the left margin enter:
~|fmt
After you enter the command, your message is formatted, in this case to the default line length of 72 columns, and the word continue
is displayed to indicate that you can enter more information or send your message.
SEE ALSO
Commands: mail(1), mailx(1), vi(1)
fmt(1)