Posted by spindoctor on Monday 28th of May 2007 02:17:53 PM (post 302119182) in UNIX for Dummies Questions & Answers
Copying Text between two unique text patterns

Dear Colleagues:
I have .rtf files containing a collection of newspaper articles. Each article starts with a variation of the phrase "Document * of 20" and is separated from the next article by the character string "===================".

I would like to take the text of each news article from between these two patterns and write each one to a separate, uniquely named file. I've been experimenting with sed, grep, cut and csplit, but nothing has worked so far. I have regular expressions that match the two lines "Document * of 20" and "--------" independently, but I can't figure out how to capture and work with the text between them. I hope you can help.
Yours,
Simon J. Kiss
Queen's University
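
One possible approach, sketched with awk under a few assumptions: the .rtf files have already been converted to plain text, each article begins with a line such as "Document 3 of 20", and the separator is a line made up only of "=" characters (adjust the second regular expression if the separator is a line of dashes instead). The file names articles.txt and article_NN.txt are hypothetical and chosen only for illustration.

```shell
#!/bin/sh
# Sample input standing in for one converted file (hypothetical content).
cat > articles.txt <<'EOF'
Document 1 of 20
First article, line one.
First article, line two.
===================
Document 2 of 20
Second article.
===================
EOF

# Open a new output file at each "Document N of 20" line, stop copying at
# the separator line, and write every line in between to the current file.
awk '
/^Document [0-9]+ of 20/ {           # start marker: pick the next file name
    if (out != "") close(out)
    out = sprintf("article_%02d.txt", ++n)
    next                             # do not copy the marker line itself
}
/^=+$/ {                             # end marker: stop copying
    if (out != "") close(out)
    out = ""
    next
}
out != "" { print > out }            # body line: write to the current file
' articles.txt
```

Running this on the sample input leaves one file per article (article_01.txt, article_02.txt), each holding only the lines between the two markers. Dropping either `next` statement would keep the corresponding marker line in the output or pass it on to the later rules, so they are worth keeping unless the marker lines are wanted.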
 
