Match text and print/pipe only that text


 
# 1  
Old 03-30-2015

I'm trying to pull an image source URL from an HTML source file. I'm new to regex and I'm working in Bash. I've tried:
Code:
grep -E 'http.*jpg' file

which highlights the matched text, but leaves me with 2 problems:

1) The results aren't standalone and can't be piped to another command. (I believe each result includes the whole line, i.e. everything between line breaks.)

2) The match includes spaces. This is a problem when there's an anchor before the image, because the match then runs from href='http.....'><img src.....jpg

I've tried putting lookbehind in there
Code:
grep -E '(?<=src\=\[\'|\"])http.*jpg' file
grep -E '(?<!href..)http.*jpg' file
grep -E '(?<=src..)http.*jpg' file

I either get errors or nothing returned. I don't know whether it's something simple, but any help would be appreciated. I'm not opposed to sed or awk, but my knowledge of them is basically a clean slate.

Also, this is *NOT* a homework assignment. It's me trying to learn by doing and hitting a wall repeatedly. Thanks for the help!!!
# 2  
Old 03-30-2015
Please show us a sample input file (including samples of the lines that are giving you problems with spaces) and the output you are trying to produce.

What operating system are you using? Some implementations of grep have a non-standard -o option that prints only the text matched by the search pattern, not the entire line containing the matched text.
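
As a rough illustration of the difference, here is a minimal sketch. It assumes a grep that supports -o (GNU and BSD grep do); the sample file and its contents are made up for the example and are not the poster's data.

```shell
# Hypothetical sample input, written to a temporary file for illustration.
sample=$(mktemp)
printf '%s\n' '<a href="http://example.com/page"><img src="http://example.com/a.jpg"></a>' > "$sample"

# Without -o, grep prints the whole line that contains a match.
whole_line=$(grep 'http[^"]*jpg' "$sample")

# With -o, grep prints only the matched text, one match per output line,
# so the result can be piped directly into another command.
only_match=$(grep -o 'http[^"]*jpg' "$sample")

printf 'whole line: %s\n' "$whole_line"
printf 'only match: %s\n' "$only_match"

rm -f "$sample"
```

The bracket expression [^"] keeps the match from running across a closing double quote, which is why the href URL is skipped here.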
This User Gave Thanks to Don Cragun For This Post:
# 3  
Old 03-30-2015

I am using Ubuntu 14.04. Thanks! The -o option was a big part of what I was looking for. Thank you!

Currently, I'm getting
Code:
http://site.com/image/word/tag/funny"><img src="http://images.site.com/pic/19434-3201cd0ed412e26b2c06cf00a0803c64.jpg
http://images.site.com/uploaded_pics/thumbs/19434.jpg

Desired result is
Code:
http://images.site.com/pic/19434-3201cd0ed412e26b2c06cf00a0803c64.jpg

Whenever I use lookbehinds I get no results at all. For example:
Code:
grep -Eo '(?<=")http.*jpg' file
grep -Eo '(?<=\")http.*jpg' file
grep -Eo "(?<!\')http.*jpg' file



I was able to get the desired result with
Code:
grep -Eo 'http\S*\/pic\S*jpg' file

I am still not sure why the lookbehinds didn't work. Please let me know if you have any insight into what I'm doing wrong and thank you so much for the help!
# 4  
Old 03-30-2015
The grep on the system I use doesn't support lookbehinds, so I can't experiment and say definitively why your lookbehind attempts were failing. The \S escape is not part of standard REs either. A standard ERE that matches a string starting with http:, containing /pic/, ending with jpg, and containing no spaces is:
Code:
http:[^ ]*/pic/[^ ]*jpg

This same string used as a BRE produces the same results, so I would just use:
Code:
grep -o 'http:[^ ]*/pic/[^ ]*jpg' file
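
To see the pattern at work, here is a small sketch. The HTML lines are an assumption reconstructed from the output posted above, not the real input file, and the second pattern (which also excludes double quotes) is a variation added for illustration, not part of the suggestion above.

```shell
# Assumed sample input, modeled on the URLs shown earlier in the thread.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<a href="http://site.com/image/word/tag/funny"><img src="http://images.site.com/pic/19434-3201cd0ed412e26b2c06cf00a0803c64.jpg"></a>
<img src="http://images.site.com/uploaded_pics/thumbs/19434.jpg">
EOF

# The BRE from the post: no spaces allowed inside the match, /pic/ required.
pic_url=$(grep -o 'http:[^ ]*/pic/[^ ]*jpg' "$sample")

# Variation (an assumption, not from the post): also stop at double quotes,
# so the match cannot run from an href attribute into a following src.
strict_url=$(grep -o 'http:[^" ]*/pic/[^" ]*jpg' "$sample")

printf '%s\n' "$pic_url" "$strict_url"
rm -f "$sample"
```

The first pattern relies on a space appearing somewhere between the anchor URL and the image URL. The stricter class does not need that, at the cost of rejecting URLs that contain a literal double quote.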

# 5  
Old 03-30-2015
Quote:
Originally Posted by amx401
...Whenever I use lookbehinds I get no results at all. ex:
Code:
grep -Eo '(?<=")http.*jpg' file
grep -Eo '(?<=\")http.*jpg' file
grep -Eo "(?<!\')http.*jpg' file

...I am still not sure why the lookbehinds didn't work. Please let me know if you have any insight into what I'm doing wrong ...
I have GNU grep, and the lookbehinds don't work there either.
Most likely grep's "-E" option does not support lookarounds: neither the man page nor the GNU Grep 2.21 manual mentions them.
The "-E" option selects extended REs, which give metacharacters like "+", "|", "{" etc. their special meaning, unlike BREs.

Code:
$
$ cat f31
http://site.com/image/word/tag/funny"><img src="http://images.site.com/pic/19434-3201cd0ed412e26b2c06cf00a0803c64.jpg
http://images.site.com/uploaded_pics/thumbs/19434.jpg
$
$ grep -E '(?<!")http://images' f31
$
$
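
If lookbehinds aren't available, one common workaround is to match the surrounding context and keep only a captured group. The sed sketch below does that. The sample file is an assumption modeled on the URLs in this thread, and the command assumes the URL sits in a double-quoted src attribute.

```shell
# Assumed sample input, modeled on the URLs shown earlier in the thread.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<a href="http://site.com/image/word/tag/funny"><img src="http://images.site.com/pic/19434-3201cd0ed412e26b2c06cf00a0803c64.jpg"></a>
<img src="http://images.site.com/uploaded_pics/thumbs/19434.jpg">
EOF

# -n suppresses default output; the p flag prints only lines where the
# substitution succeeded. \( ... \) captures the URL and \1 replaces the
# whole line with it. Using | as the delimiter avoids escaping slashes.
sed_url=$(sed -n 's|.*src="\(http[^"]*/pic/[^"]*\.jpg\)".*|\1|p' "$sample")

printf '%s\n' "$sed_url"
rm -f "$sample"
```

Note that this prints at most one URL per input line (the last one on the line, because the leading .* is greedy), so it is not a drop-in replacement for grep -o when a line holds several images.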

If you have GNU grep, you could try the experimental "-P" option which provides support for Perl-compatible regular expressions:

Code:
$
$ grep -Po '(?<!")(http://images.*jpg)' f31
http://images.site.com/uploaded_pics/thumbs/19434.jpg
$
$

Or you could use Perl:

Code:
$
$ perl -ne 'printf("Line = %d, Matched Text = %s\n", $., $1) if /(?<!")(http:\/\/images.*jpg)/' f31
Line = 2, Matched Text = http://images.site.com/uploaded_pics/thumbs/19434.jpg
$
$
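
For the result the original poster wanted (the /pic/ image only), a positive lookbehind on src=" is one option. This is only a sketch: it assumes a GNU grep built with PCRE support for -P, and the sample file is reconstructed from the URLs in this thread rather than taken from the real input.

```shell
# Assumed sample input, modeled on the URLs shown earlier in the thread.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<a href="http://site.com/image/word/tag/funny"><img src="http://images.site.com/pic/19434-3201cd0ed412e26b2c06cf00a0803c64.jpg"></a>
<img src="http://images.site.com/uploaded_pics/thumbs/19434.jpg">
EOF

# (?<=src=") requires src=" immediately before the match without including it.
lookbehind_url=$(grep -Po '(?<=src=")http[^"]*/pic/[^"]*\.jpg' "$sample")

# \K is a PCRE alternative: everything matched before it is dropped from
# the reported match, which gives the same result here.
keep_url=$(grep -Po 'src="\Khttp[^"]*/pic/[^"]*\.jpg' "$sample")

printf '%s\n' "$lookbehind_url" "$keep_url"
rm -f "$sample"
```

Both commands should print only the /pic/ URL. The thumbs URL on the second line is skipped because it has no /pic/ path component.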

# 6  
Old 04-01-2015
I was operating under the incorrect assumption that lookarounds were universal. Thank you both for your help and responses!