Text to column starting/ending with special character in each row


 
# 1  
Old 10-09-2015

Hello,

Here is my text data excerpted from the webpage:

input
Quote:
<tr id="Row1" class=alt1 onMouseOver='setupdateRow(1)' onMouseOut='setupdateRow(0)'><td><a href='/connector?id=1'>www.example1.com:10004</a></td><td><tr>
<tr id="Row2" class=alt2 onMouseOver='setupdateRow(2)' onMouseOut='setupdateRow(0)'><td><a href='/connector?id=2'>www.example2.com:5555</a></td><tr>
<tr id="Row3" class=alt1 onMouseOver='setupdateRow(3)' onMouseOut='setupdateRow(0)'><td><a href='/connector?id=3'>www.example8.com:4532</a></td><td><tr>
<tr id="Row4" class=alt2 onMouseOver='setupdateRow(4)' onMouseOut='setupdateRow(0)'><td><a href='/connector?id=4'>www.example21.com:9501</a></td><tr>
<tr id="Row5" class=alt1 onMouseOver='setupdateRow(5)' onMouseOut='setupdateRow(0)'><td><a href='/connector?id=5'>host17.com:9501</a></td><tr>
<tr id="Row6" class=alt2 onMouseOver='setupdateRow(6)' onMouseOut='setupdateRow(0)'><td><a href='/connector?id=6'>host.com:4445</a></td><tr>

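My target is to get:

Quote:
www.example1.com:10004
www.example2.com:5555
www.example8.com:4532
www.example21.com:9501
host17.com:9501
host.com:4445
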
What I tried is:

Code:
sed 's/.*\(connector\)/\1/' input > output

but all characters coming before the word "connector" are deleted which is not good for me.

My question:
Is there a quicker way to get the requested target file?
Is it possible to convert each line to one column by extracting the data between <a> and </a>?


Many thanks
Baris

Last edited by baris35; 10-09-2015 at 12:20 PM.
# 2  
Old 10-09-2015
Code:
sed 's#.*\(www\)#\1#;s#<.*##' input

# 3  
Old 10-09-2015
Quote:
Originally Posted by Yoda
Code:
sed 's#.*\(www\)#\1#;s#<.*##' input

Thanks Yoda,
What if the last two connector domains do not include www?
# 4  
Old 10-09-2015
How about awk?
Code:
awk -F'[<>]' '{ print $7 }' input

or
Code:
sed 's#.*connector[^>]*>##;s#<.*##'
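
To see why the awk one-liner picks field 7, it can help to print every field that results when both < and > act as separators. The following is only a sketch under stated assumptions: the shell variables line and host are names chosen here for illustration, and the sample row is copied from the first line of the input above.

```shell
# One sample row, copied from the first line of the thread's input.
line="<tr id=\"Row1\" class=alt1 onMouseOver='setupdateRow(1)' onMouseOut='setupdateRow(0)'><td><a href='/connector?id=1'>www.example1.com:10004</a></td><td><tr>"

# List every field awk produces when either < or > is a separator.
# Empty fields appear wherever a ">" is immediately followed by a "<".
printf '%s\n' "$line" |
    awk -F'[<>]' '{ for (i = 1; i <= NF; i++) printf "%d: [%s]\n", i, $i }'

# Field 7 is the text between the <a ...> tag and </a>.
host=$(printf '%s\n' "$line" | awk -F'[<>]' '{ print $7 }')
printf '%s\n' "$host"
```

Note that the field number depends on how many tags come before the link, so this approach only holds as long as every row has the same layout as the sample.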

# 5  
Old 10-09-2015
Quote:
Originally Posted by Yoda
How about awk?
Code:
awk -F'[<>]' '{ print $7 }' input

or
Code:
sed 's#.*connector[^>]*>##;s#<.*##'

Perfect! Many thanks.
# 6  
Old 10-09-2015
Quote:
but all characters coming before the word "connector" are deleted which is not good for me.
What else do you want to do to those chars?

Try
Code:
sed 's/.*connector[^>]*>//; s/<.*$//' file
www.example1.com:10004
www.example2.com:5555
www.example8.com:4532
www.example21.com:9501
host17.com:9501
host.com:4445

# 7  
Old 10-09-2015
Could you please explain how this code works?
Code:
sed 's#.*connector[^>]*>##;s#<.*##'

---------- Post updated at 10:47 AM ---------- Previous update was at 10:40 AM ----------

Quote:
Originally Posted by RudiC
What else do you want to do to those chars?

Try
Code:
sed 's/.*connector[^>]*>//; s/<.*$//' file
www.example1.com:10004
www.example2.com:5555
www.example8.com:4532
www.example21.com:9501
host17.com:9501
host.com:4445

What I am trying to do is extract the abusive connections so that I can block them. Thank you for the prompt support. I am trying to understand how each sed command works, but I am not familiar with sed.

Thanks
Baris
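
On how the sed command works: it is two substitutions separated by a semicolon, and the # after each s is only the delimiter, used in place of the usual / so that slashes in the data need no escaping. Running the two substitutions one at a time shows what each removes. This is a minimal sketch, assuming a POSIX sed; the variables line, step1 and step2 are names chosen here for illustration, and the sample row is the fifth line of the input above.

```shell
# One sample row, copied from the fifth line of the thread's input.
line="<tr id=\"Row5\" class=alt1 onMouseOver='setupdateRow(5)' onMouseOut='setupdateRow(0)'><td><a href='/connector?id=5'>host17.com:9501</a></td><tr>"

# Step 1: ".*connector" greedily matches from the start of the line through
# the word "connector"; "[^>]*>" then matches the rest of the <a ...> tag up
# to and including its closing ">". Replacing all of that with nothing
# leaves the line starting at the host name.
step1=$(printf '%s\n' "$line" | sed 's#.*connector[^>]*>##')
printf '%s\n' "$step1"

# Step 2: "<.*" matches from the first remaining "<" to the end of the line,
# which removes the trailing </a></td><tr>.
step2=$(printf '%s\n' "$step1" | sed 's#<.*##')
printf '%s\n' "$step2"
```

Because the first pattern anchors on the word "connector" rather than on "www", it also handles rows whose host name does not start with www.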