Wget, grep, sort, sed in 1 command/script


 
# 1  
Old 04-19-2017

Hi, I need to join these statements for efficiency, and without having to make a new directory for each batch. I'm annotating commands below.
Code:
wget -q -r -l1 URL 
^^ can't use -O - here and pipe | to grep because of -r
grep -hrio "\b[a-z0-9.-]\+@[a-z0-9.-]\+\.[a-z]\{2,4\}\+\b" * > first.txt
^^ Need to grep the output of wget only; at present it's grepping other files in the directory.
sort -u first.txt > second.txt && sed '0~5 a\\' second.txt > third.txt
^^ piping | these doesn't work; && does.

Thanks in advance for direction.
# 2  
Old 04-19-2017
What operating system are you using?

What shell are you using?

If the commands:
Code:
grep -hrio "\b[a-z0-9.-]\+@[a-z0-9.-]\+\.[a-z]\{2,4\}\+\b" * > first.txt
sort -u first.txt > second.txt && sed '0~5 a\\' second.txt > third.txt

produce the output you want in third.txt after running the command:
Code:
wget -q -r -l1 URL

what is the difference between third.txt and fourth.txt after you also run:
Code:
grep -hrio "\b[a-z0-9.-]\+@[a-z0-9.-]\+\.[a-z]\{2,4\}\+\b" * |
    sort -u |
    sed '0~5 a\\' > fourth.txt

? Are any diagnostic messages produced by either of these sets of commands? If so, exactly what are those diagnostic messages?
# 3  
Old 04-19-2017
Hi, thanks for responding:
Linux Mint 18.1
GNU bash, version 4.3.46(1)-release (x86_64-pc-linux-gnu)

The results are identical in the example you suggested, no errors.

wget's -O - option doesn't work in recursive mode, so how can I feed the wget output to the grep statement programmatically, without grep reading every previously wget'd folder in the local directory? (That is what is happening now.)
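One possible way to keep grep away from earlier downloads is to send each crawl into a throwaway directory with wget's -P (directory prefix) option and search only that directory. The sketch below is not from the thread: the function names extract_emails and crawl_and_extract are made up for illustration, it assumes GNU grep, GNU sed and mktemp are available, and the regex is the one posted above minus the trailing `\+` after `\{2,4\}`.

```shell
#!/usr/bin/env bash
# Sketch only: function names are hypothetical, not from the thread.

# Print the unique e-mail-like strings found under directory $1.
# sed '0~5 a\\' is the thread's GNU sed command for adding an empty
# line after every 5th line.
extract_emails() {
    grep -hrio '\b[a-z0-9.-]\+@[a-z0-9.-]\+\.[a-z]\{2,4\}\b' "$1" |
        sort -u |
        sed '0~5 a\\'
}

# Crawl URL $1 one level deep into a temporary directory,
# extract the addresses, then delete the download.
crawl_and_extract() {
    local tmp
    tmp=$(mktemp -d) || return 1
    wget -q -r -l1 -P "$tmp" "$1"
    extract_emails "$tmp"
    rm -rf "$tmp"
}
```

It would be called as `crawl_and_extract URL > third.txt`. A temporary directory is still created, but the script makes and removes it itself, so nothing has to be set up by hand for each batch.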
# 4  
Old 04-20-2017
According to the wget man page, old versions of wget (before 1.11) and new versions (1.11.2 and later, although these may issue a warning in this case) should work just fine with:
Code:
wget -q -r -l1 -O - URL |
    grep -hrio "\b[a-z0-9.-]\+@[a-z0-9.-]\+\.[a-z]\{2,4\}\+\b" |
    sort -u |
    sed '0~5 a\\' > fifth.txt
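A note on reading from a pipe, as a hedged sketch rather than a fix confirmed in the thread: as far as I know, recent GNU grep searches the current directory when -r is given without a file operand, and ignores whatever arrives on standard input. Dropping -r lets grep read the pipe. The function name emails_from_stdin is hypothetical, and the wget line in the comment is for a single, non-recursive page only.

```shell
#!/usr/bin/env bash
# Sketch only: emails_from_stdin is a hypothetical helper name.

# Print unique e-mail-like strings read from standard input.
# No -r here: with -r and no file operand, GNU grep is expected to
# search the current directory instead of reading the pipe.
emails_from_stdin() {
    grep -hio '\b[a-z0-9.-]\+@[a-z0-9.-]\+\.[a-z]\{2,4\}\b' |
        sort -u |
        sed '0~5 a\\'
}

# Possible use for one page, without -r (URL is a placeholder):
#   wget -q -O - URL | emails_from_stdin > fifth.txt
```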

This User Gave Thanks to Don Cragun For This Post:
# 5  
Old 04-20-2017
Code:
wget -q -r -l1 -O - URL/ |
    grep -hrio "\b[a-z0-9.-]\+@[a-z0-9.-]\+\.[a-z]\{2,4\}\+\b" |
    sort -u |
    sed '0~5 a\\' > 6.txt
-k or -r can be used together with -O only if outputting to a regular file.

...and this yields an empty file when run in its own directory (or, in a shared directory, picks up every previously wget'd folder; see my post above):
Code:
wget -q -r -l1 -O wget.html URL/ |
    grep -hrio "\b[a-z0-9.-]\+@[a-z0-9.-]\+\.[a-z]\{2,4\}\+\b" |
    sort -u |
    sed '0~5 a\\' > 6.txt

So the problem from my earlier post persists: grep is not receiving the output of the wget I'm trying to attach to it. Downloading the URL creates a folder with many files, so I can see why sending everything to one .html file (or any single file) is not going to work. Even if it did work, nothing is piped to grep: when the script runs in a directory that holds previously wget'd URLs, it extracts info from all of those other folders as well. I need the script to read only the folder wget'd by the current command. Creating a new directory for each run does that (I know, I'm repeating myself...), but I'd like to do it in the same directory.
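One way to stay in the same directory, sketched under assumptions: with default options (no -nH, no -P), `wget -r` normally stores a crawl under a directory named after the host, so grep can be pointed at that directory alone. The helper names url_host and crawl_host_dir are hypothetical, and the host extraction assumes a plain URL without user info or a port.

```shell
#!/usr/bin/env bash
# Sketch only: url_host and crawl_host_dir are hypothetical names.

# Reduce a URL to its host part: http://example.com/a/b -> example.com
url_host() {
    local rest=${1#*://}          # drop "scheme://" if present
    printf '%s\n' "${rest%%/*}"   # drop the first "/" and what follows
}

# Crawl URL $1, then search only the directory wget created for its host.
crawl_host_dir() {
    local host
    host=$(url_host "$1")
    wget -q -r -l1 "$1"
    grep -hrio '\b[a-z0-9.-]\+@[a-z0-9.-]\+\.[a-z]\{2,4\}\b' "$host" |
        sort -u |
        sed '0~5 a\\'
}
```

Files left over from an earlier crawl of the same host would still be searched, so this only separates crawls of different hosts.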
# 6  
Old 04-20-2017
Would using a fifo (as a regular file for -O) be an option?
This User Gave Thanks to RudiC For This Post:
# 7  
Old 04-20-2017
Just Googled fifo...this isn't doing anything differently than what I've already posted (no error either).
Code:
mkfifo /tmp/jobqueue |
wget -q -r -l1 -O /tmp/jobqueue URL/ |
    grep -hrio "\b[a-z0-9.-]\+@[a-z0-9.-]\+\.[a-z]\{2,4\}\+\b" |
    sort -u |
    sed '0~5 a\\' > 7.txt
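For comparison, here is a minimal sketch of how a fifo is usually wired up: mkfifo runs as its own command (not joined with |), the writer is started in the background, and the reader opens the fifo by name. In this sketch printf stands in for the downloader, because the message quoted in post #5 suggests wget wants a regular file when -r and -O are combined, and a fifo is not one. The function name fifo_demo and the sample address are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: fifo_demo is a hypothetical name; printf plays the
# role of the program that writes to the fifo.
fifo_demo() {
    local dir fifo
    dir=$(mktemp -d) || return 1
    fifo=$dir/jobqueue

    mkfifo "$fifo"                                      # create the fifo first
    printf 'mail carol@example.net now\n' > "$fifo" &   # writer, in background

    # Reader: no -r, input redirected from the fifo.
    grep -hio '\b[a-z0-9.-]\+@[a-z0-9.-]\+\.[a-z]\{2,4\}\b' < "$fifo"

    wait            # let the background writer finish
    rm -rf "$dir"
}
```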
