How to get part of string in awk from match


 
# 1  
Old 05-05-2010

Hi,

I'm an awk noob and I'm having trouble extracting the part of a line that matched.

Here is my script:

Code:
#!/bin/gawk -f
BEGIN {}
$0 ~ /<a href=".*">.*<\/a>/{print}

Ideally I want to be able to get the actual link and print it. In PHP you can do preg_replace and get the match you want by using \\1, where 1 is the number of the parenthesized group in the pattern.

I have no idea how to do this in AWK.
# 2  
Old 05-05-2010
Perl to the rescue. If you want the complete line:
Code:
perl -lne 'print if /<a href=".*?">/;' file.html

If you want just the URI:
Code:
perl -lne 'print $1 if /<a href="(.*?)">/;' file.html

If you want just the link text:
Code:
perl -lne 'print $1 if /<a href=".*?">(.*?)<\/a>/;' file.html

# 3  
Old 05-05-2010
Ah, your use of brackets helped me devise it in awk (gensub is a gawk extension):

Code:
$0 ~ /<a href=".*">(.*)<\/a>/ {print gensub(/<a href=\"(.*)\">.*<\/a>/, "\\1", 1)}

Now I just need to find out how to capture multiple links that are on the same line.
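
One possible approach to the multiple-links case, offered as a rough sketch rather than a definitive answer: loop with match(), which sets RSTART and RLENGTH, and keep cutting the line down with substr() until nothing matches. Note that the greedy `.*` in the gensub pattern above would run from the first href to the last `">` on a line holding several links, so this sketch uses `[^"]*` instead. The function name extract_links and the sample input are made up for illustration, and the code assumes href is the first attribute and is double-quoted, as in the pattern from the thread. It uses only match(), substr(), RSTART and RLENGTH, so it should not need gawk.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): print every href value found
# in the input, one per line, even when several links share a line.
extract_links() {
    awk '
    {
        line = $0
        # match() sets RSTART and RLENGTH for the leftmost match.
        while (match(line, /<a href="[^"]*">/)) {
            # Skip the 9 characters of <a href=" and drop the trailing ">.
            print substr(line, RSTART + 9, RLENGTH - 11)
            # Continue scanning after the tag just consumed.
            line = substr(line, RSTART + RLENGTH)
        }
    }' "$@"
}

# Example usage with made-up sample input:
printf '%s\n' '<a href="a.html">A</a> text <a href="b.html">B</a>' | extract_links
```

Called with a file name, as in `extract_links file.html`, it reads that file instead of standard input. A regex like this is only a quick hack for simple, regular markup; attributes in a different order or single-quoted values would need a different pattern.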