Pattern replace from a text file using sed


 
# 1  
Old 05-17-2015

I have a sample text format as given below

Code:
<Text Text_ID="10155645315851111_10155645333076543" From="460350337461111" Created="2011-03-16T17:05:37+0000" use_count="123">This is the first text</Text>
<Text Text_ID="10155645315851111_10155645317023456" From="1626711840902323" Created="2011-03-16T17:01:02+0000" use_count="234">This is the second text</Text>
<Text Text_ID="10155645315851111_10155645320006543" From="1481727095384343" Created="2011-03-16T17:02:04+0000" use_count="3456">This is the third text 
If counted  
GOT IT... 👍👍</Text>
<Text Text_ID="10155645315851111_10155645326222345" From="411021195696789" Created="2011-04-16T17:03:44+0000" use_count="5433">This is the fourth text........</Text>

There are many lines like these in the file. Which script would be suitable for extracting only the MESSAGE text between the markers, i.e. <Text ...> MESSAGE </Text>? Please note that the MESSAGE can span multiple lines and can include special characters, as in the third text message. Can someone help me out with a sample script? Thanks in advance.
# 2  
Old 05-18-2015
Regex is not really suited to parsing HTML or XML; you should consider using a Perl module such as XML::Simple.

That said, if you want a sed script that will strip everything within the angle brackets <...>:
Code:
 
sed 's/<[^>]\+>//g' ~/tmp/tmp.txt
This is the first text
This is the second text
This is the third text
If counted
GOT IT... 👍👍
This is the fourth text........
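
Note that `\+` is a GNU extension in sed's basic regular expressions. A minimal sketch of a more portable spelling, using the POSIX interval `\{1,\}`, is shown below. The file name `sample.txt` is an assumption, and the two records are copied from the sample in the first post (trailing whitespace and the emoji dropped). Like the command above, it removes every tag, not only <Text>, and it would misbehave if an attribute value contained a `>`.

```shell
# Build a small sample file (hypothetical name: sample.txt) from two records in the thread.
cat > sample.txt <<'EOF'
<Text Text_ID="10155645315851111_10155645333076543" From="460350337461111" Created="2011-03-16T17:05:37+0000" use_count="123">This is the first text</Text>
<Text Text_ID="10155645315851111_10155645320006543" From="1481727095384343" Created="2011-03-16T17:02:04+0000" use_count="3456">This is the third text
If counted
GOT IT...</Text>
EOF

# POSIX basic regex: \{1,\} means "one or more", the portable equivalent of GNU's \+.
# sed works line by line, so a multi-line message simply has its tags stripped on each line.
result=$(sed 's/<[^>]\{1,\}>//g' sample.txt)
printf '%s\n' "$result"
```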

This User Gave Thanks to Skrynesaver For This Post:
# 3  
Old 05-18-2015
If you've got tags other than <Text> and want to eliminate those, printing only the contents of the <Text> tags, try:
Code:
awk '/<Text/ {P=1E9; sub(/<Text[^>]*>/,"")} /<\/Text>/ {P=NR; sub(/<\/Text>/,"")} P>=NR' file
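
The one-liner keeps printing while P is at least the current line number: P is set to a huge value at an opening tag and reset to the current line number at the closing tag. The same idea can be sketched with an explicit flag, which may be easier to read. The names `intext` and `tags.txt`, and the <Header>/<Footer> lines, are assumptions added for illustration; the <Text> records come from the first post (emoji dropped).

```shell
# Hypothetical input: two <Text> records from the thread plus two unrelated tags.
cat > tags.txt <<'EOF'
<Header>not wanted</Header>
<Text Text_ID="10155645315851111_10155645333076543" From="460350337461111" Created="2011-03-16T17:05:37+0000" use_count="123">This is the first text</Text>
<Text Text_ID="10155645315851111_10155645320006543" From="1481727095384343" Created="2011-03-16T17:02:04+0000" use_count="3456">This is the third text
If counted
GOT IT...</Text>
<Footer>not wanted either</Footer>
EOF

messages=$(awk '
  # Opening tag: start printing and drop the tag itself.
  /<Text/    { intext = 1; sub(/<Text[^>]*>/, "") }
  # Inside a message: print the line with any closing tag removed from a copy.
  intext     { line = $0; sub(/<\/Text>/, "", line); print line }
  # Closing tag: stop printing after this line.
  /<\/Text>/ { intext = 0 }
' tags.txt)
printf '%s\n' "$messages"
```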

This User Gave Thanks to RudiC For This Post:
# 4  
Old 05-28-2015
Considering the above text format, I would like to filter the messages by their From number, for example to get all the messages that match From="460350337461111" or From="411021195696789" in a single run of a sed script. Can someone help me out with a sample sed script? Thanks in advance.

Last edited by my_Perl; 05-29-2015 at 12:24 AM..
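
One possible approach to this follow-up is sketched below. It uses awk rather than sed, which is a deliberate swap: a sed range such as `/From="..."/,/<\/Text>/` only starts looking for the closing pattern on the line after the opening match, so single-line records would spill into the next record. The names `ids`, `want`, `keep` and `msgs.txt` are assumptions; the records are the four from the first post (emoji dropped).

```shell
# Hypothetical input file holding the four sample records from the first post.
cat > msgs.txt <<'EOF'
<Text Text_ID="10155645315851111_10155645333076543" From="460350337461111" Created="2011-03-16T17:05:37+0000" use_count="123">This is the first text</Text>
<Text Text_ID="10155645315851111_10155645317023456" From="1626711840902323" Created="2011-03-16T17:01:02+0000" use_count="234">This is the second text</Text>
<Text Text_ID="10155645315851111_10155645320006543" From="1481727095384343" Created="2011-03-16T17:02:04+0000" use_count="3456">This is the third text
If counted
GOT IT...</Text>
<Text Text_ID="10155645315851111_10155645326222345" From="411021195696789" Created="2011-04-16T17:03:44+0000" use_count="5433">This is the fourth text........</Text>
EOF

filtered=$(awk -v ids='460350337461111 411021195696789' '
  # Turn the space-separated list of wanted From values into a lookup table.
  BEGIN { n = split(ids, a, " "); for (i = 1; i <= n; i++) want[a[i]] = 1 }
  /<Text/ {
    keep = 0
    # Pull the digits out of From="..." and check them against the table.
    if (match($0, /From="[0-9]+"/)) {
      id = substr($0, RSTART + 6, RLENGTH - 7)
      keep = (id in want)
    }
    sub(/<Text[^>]*>/, "")
  }
  # Print message lines of wanted records, minus the closing tag.
  keep       { line = $0; sub(/<\/Text>/, "", line); print line }
  /<\/Text>/ { keep = 0 }
' msgs.txt)
printf '%s\n' "$filtered"
```

With the two From values given in the question, this prints the first and fourth messages. More values can be added to the `ids` string, separated by spaces.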