Parse data


 
# 1  
Old 10-03-2010
Hi,

I have a file p1.htm:

Code:
<div class="colorID2">
   
    aaaa aaaa aa <br/>
    bbbbbbbb   bbb<br/>


    <br/>cccc ccc ccc 
</div><div class="colorID1">
   
    dddd d ddddd<br/>
 

eeee eeee eeeeeeeeee<br/>
      fffff

<br/>g gg<br/>
</div>
<div ...

Desired output:
Code:
aaaa aaaa aa.bbbbbbbb   bbb.cccc ccc ccc.dddd d ddddd.eeee eeee eeeeeeeeee.fffff.g gg

My code:

Code:
awk -vRS="" '{gsub(/<br/>/,".",$0)}1' p1.htm

but it doesn't work.

Thanks
# 2  
Old 10-03-2010
Try:
Code:
awk '{$1=$1;gsub(/<\/*div[^>]*>/,"");gsub(/ *(<br\/>)+ */,".")}1' RS= ORS= infile


Last edited by Scrutinizer; 10-03-2010 at 06:10 AM.. Reason: made change so first space get removed too
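To see what each part of the one-liner does, here is a commented sketch of the same awk program run on a small sample. The sample file is hypothetical (one block with no blank lines, created with mktemp), not the poster's p1.htm. RS= puts awk in paragraph mode, so the whole block is read as one record, and ORS= prints records without a trailing separator.

```shell
# Hypothetical sample standing in for infile (a single block, no blank lines)
infile=$(mktemp)
printf '%s\n' '<div class="colorID2">' \
              '    aaaa aaaa aa <br/>' \
              '    bbbbbbbb   bbb<br/>' \
              '</div>' > "$infile"

out=$(awk '
{
    $1 = $1                      # rebuild the record: newlines and runs of blanks become single spaces
    gsub(/<\/*div[^>]*>/, "")    # drop opening and closing div tags
    gsub(/ *(<br\/>)+ */, ".")   # each run of <br/> plus adjacent spaces becomes one dot
}
1                                # print the modified record
' RS= ORS= "$infile")

printf '[%s]\n' "$out"           # prints [ aaaa aaaa aa.bbbbbbbb bbb.]
rm -f "$infile"
```

On this sample the result keeps a leading space (where the opening div tag was) and ends with a dot (from the last <br/>). The three spaces inside "bbbbbbbb   bbb" are also reduced to one by the $1=$1 rebuild.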
# 3  
Old 10-03-2010
Quote:
Originally Posted by Scrutinizer
Try:
Code:
awk '{$1=$1;gsub(/<\/*div[^>]*>/,"");gsub(/(<br\/>)+ */,".")}1' RS= ORS= infile

Thanks, Scrutinizer.

---------- Post updated at 04:17 AM ---------- Previous update was at 03:40 AM ----------

Quote:
Originally Posted by Scrutinizer
Try:
Code:
awk '{$1=$1;gsub(/<\/*div[^>]*>/,"");gsub(/ *(<br\/>)+ */,".")}1' RS= ORS= infile


Scrutinizer, sorry, can you explain this to me:

Code:
/ *(<br\/>)+ */

and how it differs from:

Code:
/<br\/>/

# 4  
Old 10-03-2010
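It matches zero or more spaces, followed by one or more occurrences of the string <br/>, followed by zero or more spaces. The plain /<br\/>/ matches only a single <br/> tag and leaves any surrounding spaces in place.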
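A small comparison may make the difference concrete. The input line below is a made-up example, not taken from p1.htm; both commands use the same gsub call and differ only in the pattern.

```shell
# Hypothetical sample line: two consecutive tags with a space on each side
line='aa <br/><br/> bb'

# Plain /<br\/>/: every tag becomes its own dot, surrounding spaces stay
plain=$(printf '%s\n' "$line" | awk '{gsub(/<br\/>/,".")}1')

# / *(<br\/>)+ */: the spaces and the whole run of tags collapse into one dot
greedy=$(printf '%s\n' "$line" | awk '{gsub(/ *(<br\/>)+ */,".")}1')

printf '[%s]\n[%s]\n' "$plain" "$greedy"
# prints:
# [aa .. bb]
# [aa.bb]
```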
# 5  
Old 10-03-2010
Code:
sed 's/<[^<]*>//g' infile | tr '\n' ' '

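This removes every tag and joins the lines with spaces. It does not squeeze tabs or runs of spaces (and it does not insert the dots), but the result can still be read word by word by the shell, awk, and so on.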
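As a rough illustration of that remark, the sketch below runs the same sed and tr pipeline on a hypothetical four-line sample (not the original p1.htm) and then lets awk re-split the result into fields, which is one way to squeeze the leftover whitespace.

```shell
# Hypothetical sample input, fed through a pipe instead of infile
sample=$(printf '%s\n' '<div class="c1">' '  aaaa  aa<br/>' '  bbb<br/>' '</div>')

# Strip the tags, then turn every newline into a space
stripped=$(printf '%s\n' "$sample" | sed 's/<[^<]*>//g' | tr '\n' ' ')

# Rebuilding the record in awk collapses the extra blanks to single spaces
squeezed=$(printf '%s\n' "$stripped" | awk '{$1=$1}1')

printf '[%s]\n[%s]\n' "$stripped" "$squeezed"
# prints:
# [   aaaa  aa   bbb  ]
# [aaaa aa bbb]
```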