Extract data between two strings


 
# 1  
Old 03-16-2010
Hi, I have a billing CDR file containing repeated lines like the one below, and I need to extract the data between each pair of tags (i.e., between <?> and </?>) and then map each value to its corresponding field. I'm new to Unix; any help will be greatly appreciated.
Gamini

Input (single line):
! TICKET NBR : 78 ! 3100.2.13.1 0.30 ! 3100.2.14.2 51 ! 1.2.1.8 <A>91368599971</A><B>Mobility</B><C>9138599971</C><D>9134284050</D><E></E><F>0220_KANSAS</F><G>NPA_913</G><H>402404</H><I>2002</I><J>12/03/2010 10:08:20</J><Q>A18</Q><R>P_30_LOCAL</R><K>RL 0.30/60/60 QL 60 CL 0.30 </K><L>0.30</L><M>RL 0.00/1/1 QL 60 CL 0.00 </M><N>0.00</N><O>0</O><P>9.70</P><P1></P1><TA>0</TA><MS>302614168599971</MS><P2></P2><BRQ></BRQ> ! 3100.2.984.45 0 !

Output:
Field A: 91368599971
Field B: Mobility
Field C: 9138599971
.
.
Field MS: 302614168599971
Field P2: <empty>
Field BRQ: <empty>
# 2  
Old 03-16-2010
Hello, jaygamini:

With your sample data in file named "data":
Code:
$ sed 's/[^<]*<\([^>]*\)>\([^<]*\)<\/\1>[^<]*/Field \1: \2</g;y/</\n/' data
Field A: 91368599971
Field B: Mobility
Field C: 9138599971
Field D: 9134284050
Field E: 
Field F: 0220_KANSAS
Field G: NPA_913
Field H: 402404
Field I: 2002
Field J: 12/03/2010 10:08:20
Field Q: A18
Field R: P_30_LOCAL
Field K: RL 0.30/60/60 QL 60 CL 0.30 
Field L: 0.30
Field M: RL 0.00/1/1 QL 60 CL 0.00 
Field N: 0.00
Field O: 0
Field P: 9.70
Field P1: 
Field TA: 0
Field MS: 302614168599971
Field P2: 
Field BRQ:

It can also be done without the y command kludge (which, like the rest of the regular expression, assumes that '<' does not appear outside of the tag markup itself). Note, however, that the newline in the substitution text must be immediately preceded by a backslash and must sit inside strong (single) quotes:
Code:
sed 's/[^<]*<\([^>]*\)>\([^<]*\)<\/\1>[^<]*/Field \1: \2\
/g' data

Regards,
Alister

Last edited by alister; 03-16-2010 at 02:38 PM..
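To try alister's second form without creating a data file first, here is a minimal sketch. The shortened sample record and the variable names (record, fields, marked) are made up for illustration, and the second sed pass that fills in <empty> is an addition meant to match the output requested in the first post, not part of the original command.

```shell
# Hypothetical sample: a shortened copy of the record from the first post.
record='! TICKET NBR : 78 ! 1.2.1.8 <A>91368599971</A><B>Mobility</B><E></E> ! 3100.2.984.45 0 !'

# Same expression as in the post above. The backslash must be the very last
# character on its line; anything after it (even a space) leaves the newline
# unescaped, and sed will typically reject the command.
fields=$(printf '%s\n' "$record" |
  sed 's/[^<]*<\([^>]*\)>\([^<]*\)<\/\1>[^<]*/Field \1: \2\
/g')

# Optional second pass (an addition, not part of the command above): mark
# fields that carry no value, as in the requested output.
marked=$(printf '%s\n' "$fields" | sed 's/^\(Field [^:]*\): $/\1: <empty>/')

printf '%s\n' "$marked"
```

As with the original command, this assumes that '<' never appears outside the tags themselves.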
# 3  
Old 03-16-2010

You are the best. I really appreciate your help.
# 4  
Old 03-16-2010
If the input data is on a single line, then:

Code:
$ cat -n f7
     1  ! TICKET NBR : 78 ! 3100.2.13.1 0.30 ! 3100.2.14.2 51 ! 1.2.1.8 <A>91368599971</A><B>Mobility</B><C>9138599971</C><D>9134284050</D><E></E><F>0220_KANSAS</F><G>NPA_913</G><H>402404</H><I>2002</I><J>12/03/2010 10:08:20</J><Q>A18</Q><R>P_30_LOCAL</R><K>RL 0.30/60/60 QL 60 CL 0.30 </K><L>0.30</L><M>RL 0.00/1/1 QL 60 CL 0.00 </M><N>0.00</N><O>0</O><P>9.70</P><P1></P1><TA>0</TA><MS>302614168599971</MS><P2></P2><BRQ></BRQ> ! 3100.2.984.45 0 !
$ perl -lne 'while(/<(.*?)>(.*?)<\/.*?>/g){print "Field $1 : $2"}' f7
Field A : 91368599971
Field B : Mobility
Field C : 9138599971
Field D : 9134284050
Field E :
Field F : 0220_KANSAS
Field G : NPA_913
Field H : 402404
Field I : 2002
Field J : 12/03/2010 10:08:20
Field Q : A18
Field R : P_30_LOCAL
Field K : RL 0.30/60/60 QL 60 CL 0.30
Field L : 0.30
Field M : RL 0.00/1/1 QL 60 CL 0.00
Field N : 0.00
Field O : 0
Field P : 9.70
Field P1 :
Field TA : 0
Field MS : 302614168599971
Field P2 :
Field BRQ :

tyler_durden