Formatting paragraphs with fmt with rule


 
# 1  
Old 09-04-2013

Hi,
I know that I can format a whole file using
Code:
 fmt -w 78 file > file_new

Now I would like to apply a rule as follows:
- If a paragraph starts with % (and ends with %), the paragraph should remain unchanged.
- Otherwise the command stated above should be applied to the paragraph.

As I am not yet familiar enough with shell scripting, I have not succeeded in implementing a rule that selects paragraphs.

I would appreciate any comments or help!

Thanks in advance!
John
# 2  
Old 09-05-2013
Hi - welcome to the forums.

No one has answered, and there is a reason. First, fmt has no rule mechanism, only a fixed set of behaviors, so we would have to write code to do what you ask.

Second, without example input and expected output we could only guess. If you want a quick answer, most modern editors let you set up a macro that reformats the lines of a paragraph. If the file is gigantic, code is the best choice.
# 3  
Old 09-06-2013
Hi,
thanks a lot for your reply! I see the problem.
Unfortunately, I do not yet have an approach of my own that you could correct.

Let me restate the problem with an example and then describe how I imagine solving it:

The file is of the form
Code:
Here is the first paragraph of type 1 which is very long and should be formated.

%
Here is the second paragraph of type 2 which should stay as it is.
%

Here is the third paragraph of type 1 which is very long and should be formated.

%
Here is the fourth paragraph of type 2 which should stay as it is.
%

...

I would like to format the paragraphs of type 1 using

Code:
 fmt -w 78

such that the output should look as follows:

Code:
Here is the first paragraph of type 1 which is very 
long and should be formated.

%
Here is the second paragraph of type 2 which should stay as it is.
%

Here is the third paragraph of type 1 which is very 
long and should be formated.

%
Here is the fourth paragraph of type 2 which should stay as it is.
%

...

I am trying to split the text into its paragraphs and apply the fmt command only to paragraphs of type 1, but as a beginner I have not managed to do so and would appreciate your help.

Thanks in advance.
# 4  
Old 09-06-2013
Perl can open and close a pipe to the fmt program for each paragraph that does not end with %:
Code:
perl -ne 'BEGIN {$|=1} $s.=$_; emit() if /^$/; chomp($prev=$_); sub emit {if ($prev ne "%") {open(OUT,"|fmt -w 78"); print OUT $s; close OUT} else {print $s} $s=""} END {emit() if length $s}' file
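
For readers who prefer to stay in the shell, the same idea can be sketched without Perl. This is only a minimal sketch under stated assumptions: the function names emit_paragraph and fmt_unless_marked are made up for illustration, paragraphs are assumed to be separated by empty lines, and a paragraph counts as protected when its first line starts with %. Like the one-liner above, it starts one fmt process per unprotected paragraph.

```shell
#!/bin/sh
# Sketch (hypothetical helper names): run fmt on every paragraph except
# those whose first line starts with "%". Paragraphs are assumed to be
# separated by empty lines.

# emit_paragraph WIDTH TEXT
# Print TEXT unchanged if it starts with "%", otherwise pipe it through fmt.
emit_paragraph() {
  [ -n "$2" ] || return 0
  case $2 in
    %*) printf '%s' "$2" ;;
    *)  printf '%s' "$2" | fmt -w "$1" ;;
  esac
}

# fmt_unless_marked [WIDTH]
# Filter stdin to stdout; WIDTH defaults to 78.
fmt_unless_marked() {
  width=${1:-78}
  para=
  # "|| [ -n "$line" ]" keeps a final line that lacks a trailing newline.
  while IFS= read -r line || [ -n "$line" ]; do
    if [ -n "$line" ]; then
      # Collect the lines of the current paragraph.
      para="$para$line
"
    else
      # An empty line ends the paragraph; emit it, then keep the empty line.
      emit_paragraph "$width" "$para"
      para=
      printf '\n'
    fi
  done
  # Emit the last paragraph if the input did not end with an empty line.
  emit_paragraph "$width" "$para"
}
```

After sourcing these definitions, a call such as fmt_unless_marked 78 < file > file_new would mirror the command from the first post. The exact line breaks depend on the installed fmt.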

# 5  
Old 09-06-2013
Hi.

If your file contains many paragraphs, say several thousand, then you may wish to avoid creating additional processes.

The perl code listed below demonstrates reading records as paragraphs, and wrapping non-%-tagged records internally with the standard (but possibly not installed) perl module Text::Wrap. There is a bit of overhead in splitting the paragraphs from the scalar into the array. That could be optimized if necessary, but the paragraph-reading tends to simplify the code. Everything is, as usual, a trade-off.

The core of the solution is listed after the versions of OS, commands, and modules are noted.
Code:
#!/usr/bin/env bash

# @(#) s1	Demonstrate perl wrap paragraphs.

# Utility functions: print-as-echo, print-line-with-visual-space, debug.
# export PATH="/usr/local/bin:/usr/bin:/bin"
pe() { for _i;do printf "%s" "$_i";done; printf "\n"; }
pl() { pe;pe "-----" ;pe "$*"; }
db() { ( printf " db, ";for _i;do printf "%s" "$_i";done;printf "\n" ) >&2 ; }
db() { : ; }
C=$HOME/bin/context && [ -f $C ] && $C perl divepm
divepm -q -i p1

FILE=${1-data1}

pl " Input data file $FILE:"
cat $FILE

pl " perl code:"
cat p1

pl " Results:"
./p1 $FILE

exit 0

producing:
Code:
% ./s1

Environment: LC_ALL = C, LANG = C
(Versions displayed with local utility "version")
OS, ker|rel, machine: Linux, 2.6.26-2-amd64, x86_64
Distribution        : Debian GNU/Linux 5.0.8 (lenny) 
bash GNU bash 3.2.39
perl 5.10.0
divepm (local) 1.4
 1.06	warnings
 1.04	strict
 2006.1117	Text::Wrap

-----
 Input data file data1:
%
         1         2         3         4         5         6         7         8
12345678901234567890123456789012345678901234567890123456789012345678901234567890
%

Here is the first paragraph of type 1 which is very 
long and should be formatted.

%
Here is the second paragraph of type 2 which should stay as it is.
%

Here is the third paragraph of type 1 which is very 
long and should be well formatted.

%
Here is the fourth paragraph of type 2
which should stay as it is.
%


-----
 perl code:
#!/usr/bin/env perl

# @(#) p1	Demonstrate reading paragraphs, re-formatting.

use warnings;
use strict;
use Text::Wrap;

my (@p);

$/ = "\n\n";    # Records are paragraphs, ending with empty line.
$Text::Wrap::columns = 78;

while (<>) {
  if (m{^%}) {
    print;
  }
  else {
    @p = split( /\n/, $_ );	# split paragraph into array of lines
    print wrap( '', '', @p );
    print "$/";
  }
}

exit(0);


-----
 Results:
%
         1         2         3         4         5         6         7         8
12345678901234567890123456789012345678901234567890123456789012345678901234567890
%

Here is the first paragraph of type 1 which is very long and should be
formatted.

%
Here is the second paragraph of type 2 which should stay as it is.
%

Here is the third paragraph of type 1 which is very long and should be well
formatted.

%
Here is the fourth paragraph of type 2
which should stay as it is.
%

The data file was modified to illustrate actual re-formatting, as well as avoidance of it.

See man pages for details.
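
If Perl or Text::Wrap is not available, a single awk process can do comparable work without starting one fmt per paragraph. The following is only a sketch, not a drop-in replacement for fmt: the function name wrap_unless_marked is hypothetical, the wrapping is a simple greedy fill, so line breaks may differ from those chosen by fmt or Text::Wrap, runs of several empty lines collapse to one, and a word longer than the width is left on a line of its own.

```shell
#!/bin/sh
# Sketch (hypothetical helper name): wrap paragraphs inside one awk process.
# Paragraphs whose first line starts with "%" are printed unchanged.

# wrap_unless_marked [WIDTH]
# Filter stdin to stdout; WIDTH defaults to 78.
wrap_unless_marked() {
  awk -v width="${1:-78}" '
    BEGIN { RS = "" }          # paragraph mode: records end at empty lines
    NR > 1 { print "" }        # one empty line between paragraphs
    /^%/ { print; next }       # protected paragraph: copy as is
    {
      out = ""; line = ""
      # In paragraph mode the fields are the words of the whole paragraph.
      for (i = 1; i <= NF; i++) {
        if (line == "")
          line = $i
        else if (length(line) + 1 + length($i) <= width)
          line = line " " $i
        else {
          out = out line "\n"  # current line is full; start a new one
          line = $i
        }
      }
      print out line
    }
  '
}
```

Usage would be along the lines of wrap_unless_marked 78 < data1. The trade-off is the same one described above: no extra processes, at the cost of reimplementing the wrapping rather than reusing fmt.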

Best wishes ... cheers, drl