Shell Programming and Scripting: Extracting URLs from curl output
Post 302964625 by jozo95, 16 January 2016, 04:14 PM
Quote:
Originally Posted by RavinderSingh13
Hello jozo95,

Sorry, I hadn't accounted for links without <img, which is why it didn't match properly.
Could you please try the following and let me know if this helps you.
Code:
awk -F"[><]" '{for(i=1;i<=NF;i++){if($i ~ /a href=.*\//){print "<" $i ">"}}}'   Input_file


Thanks,
R. Singh

That works well.

I solved it using this code:

Code:
grep -o '<a href="[a-z]\+[^>"]*' | sed -ne 's/^<a href="\(.*\)/\1/p'
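For reference, a hypothetical end-to-end invocation of that pipeline (the URL is only a placeholder) could look like this:

Code:
# fetch a page quietly, then keep only the href values of its <a> tags
curl -s 'http://example.com/page.html' | grep -o '<a href="[a-z]\+[^>"]*' | sed -ne 's/^<a href="\(.*\)/\1/p'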

---------- Post updated at 04:14 PM ---------- Previous update was at 04:12 PM ----------

Quote:
Originally Posted by Aia
Any href:
Unfortunately I don't know Perl yet, but thanks for your input anyway, much appreciated.
 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Extracting fields from an output 8-)

I am getting a variable as x=2006/01/18 and now I have to extract each field from it, like x1=2006, x2=01 and x3=18. Any idea how? Thanks a lot for the help. CSaha (6 Replies)
Discussion started by: csaha
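Not from the thread itself, but a minimal bash sketch of one way to split such a value (the variable names mirror the question above):

Code:
x=2006/01/18
IFS=/ read -r x1 x2 x3 <<< "$x"   # here-string; IFS=/ splits the value on "/"
echo "$x1 $x2 $x3"                # prints: 2006 01 18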

2. Shell Programming and Scripting

let curl output to stdout AND save to a file

Hello hackers. I have a curl process running as CGI, directly pushing stdout to the client, but I want to additionally save that stream to a file at the same time. Any directions madly welcome. Thanks in advance. (3 Replies)
Discussion started by: scarfake
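For the question above, one common sketch is to pipe curl through tee (the URL and filename are placeholders):

Code:
# tee copies the stream to stdout (for the CGI client) and to a file at the same time
curl -s 'http://example.com/feed' | tee /tmp/curl_copy.out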

3. Shell Programming and Scripting

Pattern matching extracting urls from rss, shell scripts

Hi all, how could I do this? I have an RSS file and I want to extract only the URLs (many) matching http://www.xxx.com/trailers/ from that file and copy them into another file, like " <pubDate>Wed, 29 Apr 2009 00:00:00 PST</pubDate> <content:encoded><!Apple - Movie Trailers - The Hangover"><img... (3 Replies)
Discussion started by: BremboloIV
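A rough sketch of one way to pull just those URLs out of the feed, assuming it is saved as rss.xml (both filenames are placeholders):

Code:
# keep only URLs starting with the trailers prefix; stop at a quote, angle bracket or space
grep -o 'http://www\.xxx\.com/trailers/[^"< ]*' rss.xml > trailer_urls.txt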

4. Shell Programming and Scripting

script to output curl result as html

Hi, I'm new to scripting and would like to know how I can have a script which will curl a few URLs and save the results, such as the URLs being curled, DNS lookup time, connection time, total time, etc., in HTML format as a table with columns and rows. Thank you. (4 Replies)
Discussion started by: squidusr
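A hedged sketch of the usual approach to the question above: curl's documented --write-out timing variables emitted as HTML table rows (the URLs and output file are placeholders):

Code:
#!/bin/sh
# hypothetical report generator: one table row per URL
out=report.html
printf '<table>\n<tr><th>URL</th><th>DNS lookup</th><th>Connect</th><th>Total</th></tr>\n' > "$out"
for url in http://example.com http://example.org; do
    curl -s -o /dev/null \
         -w "<tr><td>$url</td><td>%{time_namelookup}</td><td>%{time_connect}</td><td>%{time_total}</td></tr>\n" \
         "$url" >> "$out"
done
printf '</table>\n' >> "$out"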

5. Shell Programming and Scripting

Getting cURL to output verbose to a file

This is about to drive me crazy. What I want to do is simple: output ALL the verbose information from curl to a file. I have read the manual, tried several options and searched this forum, but no salvation... I'm using curl -k -Q "command" --user user:passwd --ftp-pasv --ftp-ssl -v... (1 Reply)
Discussion started by: caramandi
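The verbose output from -v goes to stderr, so redirecting stream 2 (or using curl's --stderr option) is the usual answer; a sketch with a placeholder host and credentials:

Code:
# capture the body on stdout and all verbose information on stderr
curl -k -v --user user:passwd --ftp-pasv --ftp-ssl 'ftp://example.com/file' > output.dat 2> verbose.log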

6. Shell Programming and Scripting

web service call: curl output to xsltproc input

I need to invoke a web service and extract what I need from the response using a combination of curl and xsltproc. However, any file-based parameters supplied to both these programs must come from stdin and not actual files. At least with curl, it seems to think that I am supplying a... (3 Replies)
Discussion started by: webuser
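A sketch of the common pattern for the question above, assuming the stylesheet stays in a file (extract.xsl and the URL are placeholders), since only one of the two inputs can realistically come from stdin:

Code:
# '-' tells xsltproc to read the XML document from stdin
curl -s 'http://example.com/service/response.xml' | xsltproc extract.xsl -
# if curl itself needs a request body from stdin, --data @- reads it from there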

7. Shell Programming and Scripting

Very weird wget/curl output - what should I do?

Hi, I'm trying to write a script to download RedHat's errata digest. It comes in a txt.gz format, and I can get it easily with Firefox. HOWEVER: the output is VERY strange when downloading it in a script. It seems I'm getting a file of the same size - but partly text and partly binary! It... (5 Replies)
Discussion started by: jstilby
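Not an answer from the thread, but a hypothetical diagnostic sketch (the URL is a placeholder) to check whether the downloaded file is really a valid gzip archive:

Code:
curl -s -o errata.txt.gz 'http://example.com/errata/digest.txt.gz'
file errata.txt.gz        # should report "gzip compressed data" if the download is intact
gunzip -t errata.txt.gz   # test the archive without extracting it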

8. Shell Programming and Scripting

Encapsulating output of CURL and/or WGET

I use curl and wget quite often and set up alarms on their output. For instance, I would run a "wget" on a URL and then search for certain strings within the output given by the "wget". The problem is, I can't get the entire output or response of my wget/curl command to show up correctly in... (3 Replies)
Discussion started by: SkySmart
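One common sketch for the question above: capture the complete response, including error text, in a shell variable before alarming on it (the URL and match string are placeholders):

Code:
# 2>&1 folds stderr into the captured output so nothing is lost
response=$(curl -sS 'http://example.com/status' 2>&1)
printf '%s\n' "$response" | grep -q 'ERROR' && echo "alarm: ERROR found in response"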

9. Shell Programming and Scripting

Filter output in curl

Hello guys, I'm writing a little script which sends me SMS messages via the API of an SMS provider. The problem is I can't filter my curl output for this site: site url:... (1 Reply)
Discussion started by: genius90

10. Web Development

Filename output in curl

How can I get the default output filename from curl when using the argument -O? Using -o one can choose a filename; I want to get the name of the original file, but don't understand how to get it. curl -o filename http://www.website.com curl -O http://www.website.com The... (3 Replies)
Discussion started by: locoroco
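For the question above, a sketch under the assumption that the URL has a file name in its path: -O names the output after the last segment of the URL, and reasonably recent curl versions can report that name via %{filename_effective}:

Code:
url='http://www.website.com/files/archive.tar.gz'   # placeholder URL
curl -O "$url"
name=$(basename "$url")   # the same last-path-segment name that -O uses
# newer curl can report it directly:
# name=$(curl -O -s -w '%{filename_effective}' "$url")
echo "$name"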