How to extract a paragraph containing a given string?


 
# 1  
Old 06-29-2016

Hello:

I have a very annoying problem:

I need to extract the paragraphs that contain a specific string from a very large file
with a repeating record separator.

Example data: a file called test.out

Code:
CREATE VIEW view1
AS something
FROM table1 ,table2 as A, table3 (something FROM table4)
FROM table5, table6
USING file1
;
CREATE VIEW view1
FROM table1 ,table2 ,table6 ,table9
something
something
FROM table5 ,table (something FROM table4 ,table5(this is something FROM table8)
USING file2
;
CREATE VIEW view1
FROM table1 ,table2 ,table6 ,table8
something
something
FROM table5 ,table (something FROM table4 ,table5(this is something FROM table8)
USING file2
;
CREATE VIEW view1
FROM table1 ,table2 ,table6 ,table7
something
something
FROM table5 ,table7 (something FROM table4 ,table5(this is something FROM table8)
USING file2
;
CREATE VIEW view1
FROM table1 ,table2 ,table6 ,table6
something
something
FROM table5 ,table (something FROM table4 ,table5(this is something FROM table8)
USING file2
;

Suppose I want to extract the paragraph containing the string "table7":

Code:
 
awk -v RS="CREATE VIEW" '/table7/' test.out

 view1
FROM table1 ,table2 ,table6 ,table7
something
something
FROM table5 ,table7 (something FROM table4 ,table5(this is something FROM table8)
USING file2
;

The problem is that awk always strips the RS value itself from each record, as you can see.

How do I tell awk to print the RS value too?

Thanks in advance.


Moderator's Comments:
Thanks for trying to use the required tags, but please use CODE tags instead of ICODE tags.

Last edited by RudiC; 06-29-2016 at 08:26 AM.. Reason: Changed ICODE to CODE tags.
# 2  
Old 06-29-2016
You can set the ORS variable equal to RS, but I doubt you'd be happy with the result of this applied to your problem.
Wouldn't ";" lend itself as the RS/ORS character in this case? Although I know it can show up in other places in DDL as well.
# 3  
Old 06-29-2016
Yes, I tried setting RS and ORS to the same value, but I am afraid the output is the same.
It still cuts out the RS value:

Code:
awk -v RS="CREATE" -v ORS="CREATE" '/table7/' test.out

==========================
 VIEW view1
FROM table1 ,table2 ,table6 ,table7
something
something
FROM table5 ,table7 (something FROM table4 ,table5(this is something FROM table8)
USING file2
;
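
A note on why ORS does not help here: awk writes ORS after each printed record, so it cannot restore a separator that was stripped from the front of the record. One possible workaround is to print RS yourself ahead of the record. The sketch below assumes an awk that accepts a multi-character RS (gawk and mawk do; POSIX only guarantees the first character). The helper name `extract_with_rs` and the shortened sample data are illustrative, not from the thread.

```shell
# Shortened sample data in the same shape as test.out (assumption).
sample=$(mktemp)
cat > "$sample" <<'EOF'
CREATE VIEW view1
FROM table1 ,table2 ,table6 ,table9
USING file1
;
CREATE VIEW view1
FROM table1 ,table2 ,table6 ,table7
USING file2
;
CREATE VIEW view1
FROM table1 ,table2 ,table6 ,table6
USING file2
;
EOF

# extract_with_rs PATTERN FILE  (hypothetical helper name)
# Splits on the literal text "CREATE VIEW" and prints matching records
# with that text put back in front.
extract_with_rs() {
    awk -v RS='CREATE VIEW' -v pat="$1" '
        # printf writes RS back and, unlike print, does not append ORS
        $0 ~ pat { printf "%s%s", RS, $0 }
    ' "$2"
}

extract_with_rs table7 "$sample"
```

Using `printf` instead of `print RS $0` avoids the extra blank line that the default ORS would add after the record's own trailing newline.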

# 4  
Old 06-29-2016
What about:
Code:
awk 'BEGIN{RS=ORS=";\n"}/table7/' text

CREATE VIEW view1
FROM table1 ,table2 ,table6 ,table7
something
something
FROM table5 ,table7 (something FROM table4 ,table5(this is something FROM table8)
USING file2
;

I just can't get rid of the leading empty line.
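
One plausible explanation for the leading empty line, which the thread does not confirm: some awk implementations use only the first character of RS, so the separator is effectively ";" and every record after the first begins with the newline that followed the previous ";". A minimal sketch under that assumption splits on ";" alone and strips the leftover newline explicitly. The helper name `extract_semicolon` and the sample data are illustrative, and the approach still breaks if ";" appears inside a paragraph.

```shell
# Shortened sample data in the same shape as test.out (assumption).
sample=$(mktemp)
cat > "$sample" <<'EOF'
CREATE VIEW view1
FROM table1 ,table2 ,table6 ,table9
USING file1
;
CREATE VIEW view1
FROM table1 ,table2 ,table6 ,table7
USING file2
;
EOF

# extract_semicolon PATTERN FILE  (hypothetical helper name)
extract_semicolon() {
    awk -v pat="$1" '
        BEGIN { RS = ";"; ORS = ";\n" }   # single-character RS works in any awk
        { sub(/^\n/, "") }                # drop the newline left after the previous ";"
        $0 ~ pat                          # print matching records, terminated by ORS
    ' "$2"
}

extract_semicolon table7 "$sample"
```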
# 5  
Old 06-29-2016
Thanks. But when I try it on the real data, it just prints the matching string,
not the whole paragraph.
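
The thread does not say why the real data behaves differently. Two guesses are trailing blanks after the ";" or CRLF line endings, either of which would stop `";\n"` from ever matching as a separator. A line-based sketch that does not depend on RS at all, and tolerates both, could look like this. The helper name `extract_block` and the sample data are assumptions for illustration.

```shell
# Sample with CRLF line endings and a trailing blank after one ";" (assumption).
sample=$(mktemp)
printf '%s\r\n' \
    'CREATE VIEW view1' 'FROM table1 ,table9' 'USING file1' ';' \
    'CREATE VIEW view1' 'FROM table1 ,table7' 'USING file2' '; ' \
    'CREATE VIEW view1' 'FROM table1 ,table6' 'USING file2' ';' > "$sample"

# extract_block PATTERN FILE  (hypothetical helper name)
# Collects lines from "CREATE VIEW" up to a line holding only ";" and
# prints the collected block if any of its lines matched PATTERN.
extract_block() {
    awk -v pat="$1" '
        { sub(/\r$/, "") }                        # tolerate CRLF line endings
        /^CREATE VIEW/ { block = ""; hit = 0 }    # a new paragraph starts here
        { block = block $0 "\n" }                 # accumulate the current line
        $0 ~ pat { hit = 1 }                      # remember that this block matched
        /^;[ \t]*$/ {                             # terminator, trailing blanks allowed
            if (hit) printf "%s", block
            block = ""; hit = 0
        }
    ' "$2"
}

extract_block table7 "$sample"
```

This reads the file one line at a time, so memory use is bounded by the largest paragraph rather than by the file size.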
# 6  
Old 06-29-2016
Code:
perl -ne 'BEGIN{$/=";\n"} /table7/ and print' file

Output:
Code:
CREATE VIEW view1
FROM table1 ,table2 ,table6 ,table7
something
something
FROM table5 ,table7 (something FROM table4 ,table5(this is something FROM table8)
USING file2
;

# 7  
Old 06-29-2016
Hi.

cgrep, a grep-like utility from ATT, accepts three patterns: the pattern you are primarily interested in, plus the two end-point patterns of an enclosing window. Here is an example:
Code:
#!/usr/bin/env bash

# @(#) s1       Demonstrate extraction by matching token in paragraph.
# cgrep source:
# http://sourceforge.net/projects/cgrep/ (verified: 2016.06.29)

# Utility functions: print-as-echo, print-line-with-visual-space, debug.
# export PATH="/usr/local/bin:/usr/bin:/bin"
LC_ALL=C ; LANG=C ; export LC_ALL LANG
pe() { for _i;do printf "%s" "$_i";done; printf "\n"; }
pl() { pe;pe "-----" ;pe "$*"; }
db() { ( printf " db, ";for _i;do printf "%s" "$_i";done;printf "\n" ) >&2 ; }
db() { : ; }
C=$HOME/bin/context && [ -f $C ] && $C cgrep

FILE=${1-data1}

pl " Sample input data file $FILE:"
head $FILE

pl " Results:"
cgrep -D -w '^CREATE' +w '^;' table7 $FILE

exit 0

producing:
Code:
$ ./s1

Environment: LC_ALL = C, LANG = C
(Versions displayed with local utility "version")
OS, ker|rel, machine: Linux, 3.16.0-4-amd64, x86_64
Distribution        : Debian 8.4 (jessie) 
bash GNU bash 4.3.30
cgrep ATT cgrep 8.15

-----
 Sample input data file data1:
CREATE VIEW view1
AS something
FROM table1 ,table2 as A, table3 (something FROM table4)
FROM table5, table6
USING file1
;
CREATE VIEW view1
FROM table1 ,table2 ,table6 ,table9
something
something

-----
 Results:
CREATE VIEW view1
FROM table1 ,table2 ,table6 ,table7
something
something
FROM table5 ,table7 (something FROM table4 ,table5(this is something FROM table8)
USING file2
;

You will need to compile cgrep. I have done it several times on both 32-bit and 64-bit systems without trouble.

Best wishes ... cheers, drl