Full Discussion: Multi html download.
Post 302738079 by Corona688 on Friday 30th of November 2012 10:15:20 AM
Quote:
Originally Posted by bipinajith
Code:
while read URL
do
    wget "$URL" >> download.txt # Downloading URL using wget & appending it to file: download.txt
done < urls_list.dat            # Reading from a file: urls_list.dat which has list of URLs

Good use of while read. You can redirect the entire loop instead of reopening download.txt 1000 times though:
Code:
while read line
do
        wget ...
done > download.txt
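
For what it's worth, here is a minimal sketch of that pattern which runs without a network. fetch_one is a hypothetical stand-in for the real download command, and the two example.com URLs are made up for illustration; only the redirection on the last line is the point.

```shell
#!/bin/sh
# Hypothetical stand-in for the real download command, so the sketch runs offline.
fetch_one() {
    printf 'fetched %s\n' "$1"
}

# Sample URL list, created here only for illustration.
printf '%s\n' 'http://example.com/a.html' 'http://example.com/b.html' > urls_list.dat

# Both the input file and the output file are opened once for the whole loop,
# not once per iteration.
while IFS= read -r url
do
    fetch_one "$url"
done < urls_list.dat > download.txt
```

IFS= and -r are there so read leaves leading whitespace and backslashes in a URL alone.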

wget also has some features which make a loop unnecessary, though.

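wget can read a list of URLs from a file with -i. The -nv option is also useful: it still prints a line for each completed file, without all the complicated junk wget usually prints. wget writes those messages to standard error rather than standard output, so use -o to send them to a file.

Code:
wget -nv -i urls_list.dat -o download.txt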

This should be much faster than calling wget 1000 times, since wget can reuse the same connection when consecutive URLs are on the same site. Concurrency may not be necessary (and may not even be desirable: how fast is your connection?), but if it is, I'd split the list into parts and run wget -i on each part.
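
A rough sketch of the split-and-run idea, in case it helps. The chunk size of 2, the urls_part_ prefix, the sample URLs and the DRY_RUN switch are all assumptions made for this example. By default it only prints the wget commands it would run; set DRY_RUN to an empty value to really start the downloads.

```shell
#!/bin/sh
# Sample list of 5 URLs, created here only for illustration.
printf 'http://example.com/page%s.html\n' 1 2 3 4 5 > urls_list.dat

# Split into chunks of at most 2 lines: urls_part_aa, urls_part_ab, ...
split -l 2 urls_list.dat urls_part_

# DRY_RUN is a hypothetical switch for this sketch: when unset it defaults to
# "echo", so the commands are printed instead of executed.
DRY_RUN=${DRY_RUN-echo}

for part in urls_part_??
do
    # One background wget per chunk; -o writes that chunk's messages to its own log.
    $DRY_RUN wget -nv -i "$part" -o "$part.log" &
done
wait    # block until every background job has finished
```

Each chunk gets its own log file so the parallel jobs don't write over each other. How many chunks is sensible depends on your connection and on how much load the remote site will tolerate.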
 

DGET(1)                                                                DGET(1)

NAME
       dget -- Download Debian source and binary packages

SYNOPSIS
       dget [options] URL ...
       dget [options] package[=version]

DESCRIPTION
       dget downloads Debian packages. In the first form, dget fetches the requested URLs. If this is a .dsc or .changes file, then dget acts as a source-package aware form of wget: it also fetches any files referenced in the .dsc/.changes file. The downloaded source is then checked with dscverify and, if successful, unpacked by dpkg-source.

       In the second form, dget downloads a binary package (i.e., a .deb file) from the Debian mirror configured in /etc/apt/sources.list(.d). Unlike apt-get install -d, it does not require root privileges, writes to the current directory, and does not download dependencies. If a version number is specified, this version of the package is requested.

       In both cases dget is capable of getting several packages and/or URLs at once.

       (Note that .udeb packages used by debian-installer are located in separate packages files from .deb packages. In order to use .udebs with dget, you will need to have configured apt to use a packages file for component/debian-installer).

       Before downloading files listed in .dsc and .changes files, and before downloading binary packages, dget checks to see whether any of these files already exist. If they do, then their md5sums are compared to avoid downloading them again unnecessarily. dget also looks for matching files in /var/cache/apt/archives and directories given by the --path option or specified in the configuration files (see below). Finally, if downloading (.orig).tar.gz or .diff.gz files fails, dget consults apt-get source --print-uris. Download backends used are curl and wget, looked for in that order.

       dget was written to make it easier to retrieve source packages from the web for sponsor uploads. For checking the package with debdiff, the last binary version is available via dget package, the last source version via apt-get source package.

OPTIONS
       -b, --backup
              Move files that would be overwritten to ./backup.

       -q, --quiet
              Suppress wget/curl non-error output.

       -d, --download-only
              Do not run dpkg-source -x on the downloaded source package. This can only be used with the first method of calling dget.

       -x, --extract
              Run dpkg-source -x on the downloaded source package to unpack it. This option is the default and can only be used with the first method of calling dget.

       -u, --allow-unauthenticated
              Do not attempt to verify the integrity of downloaded source packages using dscverify.

       --build
              Run dpkg-buildpackage -b -uc on the downloaded source package.

       --path DIR[:DIR ...]
              In addition to /var/cache/apt/archives, dget uses the colon-separated list given as argument to --path to find files with a matching md5sum. For example: "--path /srv/pbuilder/result:/home/cb/UploadQueue". If DIR is empty (i.e., "--path ''" is specified), then any previously listed directories or directories specified in the configuration files will be ignored. This option may be specified multiple times, and all of the directories listed will be searched; hence, the above example could have been written as: "--path /srv/pbuilder/result --path /home/cb/UploadQueue".

       --insecure
              Allow SSL connections to untrusted hosts.

       --no-cache
              Bypass server-side HTTP caches by sending a Pragma: no-cache header.

       -h, --help
              Show a help message.

       -V, --version
              Show version information.

CONFIGURATION VARIABLES
       The two configuration files /etc/devscripts.conf and ~/.devscripts are sourced by a shell in that order to set configuration variables. Command line options can be used to override configuration file settings. Environment variable settings are ignored for this purpose. The currently recognised variable is:

       DGET_PATH
              This can be set to a colon-separated list of directories in which to search for files in addition to the default /var/cache/apt/archives. It has the same effect as the --path command line option. It is not set by default.

       DGET_UNPACK
              Set to 'no' to disable extracting downloaded source packages. Default is 'yes'.

       DGET_VERIFY
              Set to 'no' to disable checking signatures of downloaded source packages. Default is 'yes'.

BUGS AND COMPATIBILITY
       dget package should be implemented in apt-get install -d.

       Before devscripts version 2.10.17, the default was not to extract the downloaded source. Set DGET_UNPACK=no to revert to the old behaviour.

AUTHOR
       This program is Copyright (C) 2005-08 by Christoph Berg <myon@debian.org>. Modifications are Copyright (C) 2005-06 by Julian Gilbey <jdg@debian.org>.

       This program is licensed under the terms of the GPL, either version 2 of the License, or (at your option) any later version.

SEE ALSO
       apt-get(1), debcheckout(1), debdiff(1), dpkg-source(1), curl(1), wget(1).

Debian Utilities                  2013-12-23                           DGET(1)