Extracting text between two constant strings


 
# 1  
Old 09-11-2011

Hi All,

I have a file whose common pattern is like this:

Code:
.I 1
.U    
87049087
.S
Some text here too
.M
This is a text
.T
Some another text here
.P
Name of the book
.W
Some lines of more text. This text needs to be extracted.
.A
more text goes here too
.I 2
.U    
87049088
.S
Some text here too. More text from previous text
.M
This is a text
.T
Some another text here
.P
Name of the book
.W
Some lines of more text. This text needs to be extracted. This is text 2.
.A
more text goes here too

I want to extract the text that lies between .W and .A and store it in a file. The above pattern repeats throughout the entire file, so the text from the first record should be stored in 1.txt:

Code:
Some lines of more text. This text needs to be extracted.

and the text from the second record should be stored in 2.txt:

Code:
Some lines of more text. This text needs to be extracted. This is text 2.

As you can see, the file numbers come from the .I line of each record in the above pattern.

I am using Linux with Bash. This is what I have tried, but it does not produce the desired result:

Code:
awk '/\.W/,/\.A/{c++}{print > c ".txt"}' FILE

# 2  
Old 09-11-2011
Code:
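awk -v start=.W -v finish=.A '
$0 ~ start  { t = 1; c++; next }    # start marker: begin capturing, move to the next output file
$0 ~ finish { t = 0 }               # end marker: stop capturing
t           { print > c ".txt" }    # while capturing: write the line to <c>.txt
' INPUTFILE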
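
The script above numbers the output files with a running counter, and its start and finish values are used as regular expressions, so the dot in .W matches any character. The following is only a sketch of a variant that takes the file number from the .I line, as the question describes, and anchors the markers to whole lines. The file name sample.txt and its shortened contents are assumptions made for illustration.

```shell
# Hypothetical sample input, shortened from the record layout in the question.
cat > sample.txt <<'EOF'
.I 1
.U
87049087
.W
Some lines of more text. This text needs to be extracted.
.A
more text goes here too
.I 2
.U
87049088
.W
Some lines of more text. This text needs to be extracted. This is text 2.
.A
more text goes here too
EOF

awk '
/^\.I /             { if (out != "") close(out)   # finished with the previous output file
                      out = $2 ".txt"; next }     # name the file after the number on the .I line
/^\.W[[:space:]]*$/ { t = 1; next }               # start capturing on the line after .W
/^\.A[[:space:]]*$/ { t = 0 }                     # stop capturing at .A
t && out != ""      { print > out }               # write captured lines to <number>.txt
' sample.txt
```

With this sample, 1.txt and 2.txt each end up holding the single line found between .W and .A in their record. The [[:space:]]* part is there because some marker lines in the posted data carry trailing spaces.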

# 3  
Old 09-21-2011
Hi,

I've been trying to modify this script for a new case, but it is not working. I want to do the same thing, except that extraction should stop when a newline is encountered rather than at .A. That is, I read the text after .W and keep extracting until a newline is encountered.

I tried this, but it does not work:

Code:
awk -v start=.W -v finish='\n' '                                               
$0 ~ start  { t = 1; c++; next }
$0 ~ finish { t = 0 }
t           { print > c ".txt" }
' INPUTFILE

My input and the desired output are the same as before; the only difference is that, starting from .W, I read until a newline instead of until .A.
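
One likely reason the attempt above never stops: awk reads its input one line at a time by default, so a record never contains a "\n" for the finish pattern to match. The sketch below assumes that "newline" here means an empty line following the text, and tests for that instead. The file name sample_blank.txt and its contents are made up for illustration.

```shell
# Hypothetical sample input: the text after .W ends at an empty line, with no .A marker.
cat > sample_blank.txt <<'EOF'
.I 1
.W
First extracted line.
Second extracted line.

text after the blank line, not extracted
.I 2
.W
Text for the second record.

trailing text
EOF

awk '
/^\.W[[:space:]]*$/ { t = 1; c++; next }       # .W line: start capturing into the next numbered file
NF == 0             { t = 0 }                  # empty or whitespace-only line: stop capturing
t                   { print > (c ".txt") }     # while capturing: write the line to <c>.txt
' sample_blank.txt
```

With this sample, 1.txt receives the two lines after the first .W and 2.txt receives the single line after the second one. If the text always sits on exactly one line, the NF == 0 rule could be replaced by clearing t right after the first printed line.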