Formatting paragraphs with fmt with rule
Post 302851027 by drl on Friday 6th of September 2013 10:11:09 AM
Hi.

If your file contains many paragraphs, say several thousand, then you may wish to avoid creating additional processes.

The perl code listed below reads records as paragraphs and wraps the records that are not tagged with % internally, using the standard (but possibly not installed) perl module Text::Wrap. There is a bit of overhead in splitting each paragraph from a scalar into an array of lines. That could be optimized if necessary, but reading by paragraph tends to simplify the code. Everything is, as usual, a trade-off.

The core of the solution is listed after the versions of OS, commands, and modules are noted.
Code:
#!/usr/bin/env bash

# @(#) s1	Demonstrate perl wrap paragraphs.

# Utility functions: print-as-echo, print-line-with-visual-space, debug.
# export PATH="/usr/local/bin:/usr/bin:/bin"
pe() { for _i;do printf "%s" "$_i";done; printf "\n"; }
pl() { pe;pe "-----" ;pe "$*"; }
db() { ( printf " db, ";for _i;do printf "%s" "$_i";done;printf "\n" ) >&2 ; }
db() { : ; }
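# context and divepm are local utilities (not standard commands) that
# report the versions shown at the top of the output.
C=$HOME/bin/context && [ -f "$C" ] && "$C" perl divepm
divepm -q -i p1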

FILE=${1-data1}

pl " Input data file $FILE:"
cat "$FILE"

pl " perl code:"
cat p1

pl " Results:"
./p1 "$FILE"

exit 0

producing:
Code:
% ./s1

Environment: LC_ALL = C, LANG = C
(Versions displayed with local utility "version")
OS, ker|rel, machine: Linux, 2.6.26-2-amd64, x86_64
Distribution        : Debian GNU/Linux 5.0.8 (lenny) 
bash GNU bash 3.2.39
perl 5.10.0
divepm (local) 1.4
 1.06	warnings
 1.04	strict
 2006.1117	Text::Wrap

-----
 Input data file data1:
%
         1         2         3         4         5         6         7         8
12345678901234567890123456789012345678901234567890123456789012345678901234567890
%

Here is the first paragraph of type 1 which is very 
long and should be formatted.

%
Here is the second paragraph of type 2 which should stay as it is.
%

Here is the third paragraph of type 1 which is very 
long and should be well formatted.

%
Here is the fourth paragraph of type 2
which should stay as it is.
%


-----
 perl code:
#!/usr/bin/env perl

# @(#) p1	Demonstrate reading paragraphs, re-formatting.

use warnings;
use strict;
use Text::Wrap;

my (@p);

$/ = "\n\n";    # Records are paragraphs, ending with empty line.
$Text::Wrap::columns = 78;

while (<>) {
  if (m{^%}) {
    print;
  }
  else {
    @p = split( /\n/, $_ );	# split paragraph into array of lines
    print wrap( '', '', @p );
    print "$/";
  }
}

exit(0);


-----
 Results:
%
         1         2         3         4         5         6         7         8
12345678901234567890123456789012345678901234567890123456789012345678901234567890
%

Here is the first paragraph of type 1 which is very long and should be
formatted.

%
Here is the second paragraph of type 2 which should stay as it is.
%

Here is the third paragraph of type 1 which is very long and should be well
formatted.

%
Here is the fourth paragraph of type 2
which should stay as it is.
%

The data file was modified to illustrate actual re-formatting, as well as avoidance of it.

See the man pages for details, for example perldoc Text::Wrap for wrap and perldoc perlvar for $/.

Best wishes ... cheers, drl
 
