wget and xml issue


 
# 1  
Old 12-22-2009

Hi All,
I need to download with wget all files with the "xml" extension from a specific URL, for instance https://www.example.com/xmlfiles/
I need to do this 3 times a day, downloading only the files that are new or have changed since the last download. To do this I have used the following command:
Code:
wget -r -nd -N -A xml --no-check-certificate https://user:password@www.example.com/xmlfiles/

I authenticate successfully with the server, but then I get:
403 FORBIDDEN

If I do the following:
Code:
wget -r -nd -N --no-check-certificate https://user:password@www.example.com/xmlfiles/file.xml

then I can successfully download the file(s) from the URL.

Where is my mistake?
P.S. I cannot use FTP, only HTTPS.

Then I need to parse all the downloaded XML files and extract the data into a CSV file. What would be the best way to do this?
Attached you can find one of my XML files, as well as the output CSV (exported as XLS here because CSV attachments are not allowed) that I need to obtain after parsing the XML.



Thank you in advance for your help and Merry Christmas to all.
Nino

Last edited by pludi; 12-22-2009 at 07:53 AM.. Reason: removed links and added code tags
# 2  
Old 12-24-2009
If you point a web browser at:
Code:
https://www.example.com/xmlfiles/

does it try to retrieve index.htm or index.html (whatever the web server's default is set to)?
If so, I imagine wget(1) is doing the same, and presumably that page does not exist?
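One way to check what the server answers for the directory URL is to have wget print the response headers without downloading anything. The following is only a rough sketch: `last_status` and `check_url` are hypothetical helper names, and it assumes awk is available. `-S` (`--server-response`) and `--spider` are standard wget options.

```shell
#!/bin/sh
# Print the last HTTP status code found in "wget -S" output read from stdin.
# wget indents header lines, so awk's default field splitting is relied on.
last_status() {
    awk '$1 ~ /^HTTP\// { code = $2 } END { print code }'
}

# Request the URL without saving it and report the final status code.
# wget writes the headers to stderr, hence the 2>&1.
check_url() {
    wget -S --spider --no-check-certificate "$1" 2>&1 | last_status
}

# Usage (URLs as in the first post):
#   check_url "https://user:password@www.example.com/xmlfiles/"
#   check_url "https://user:password@www.example.com/xmlfiles/file.xml"
```

If the directory URL reports 403 while the individual file does not, that would be consistent with the server refusing to serve a default page or listing for the directory.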
# 3  
Old 12-25-2009
Did you try using -A.xml as an option?
# 4  
Old 12-28-2009
Hi there, thank you for your reply.
I have also tried -A.xml, but no luck. You are right, Tony: wget is looking for the index.html page, which in this case doesn't exist.
Any idea how to solve the problem?
Thank you again.
Greetings
# 5  
Old 12-29-2009
Either:

1. Try a wget of index.htm, in case it returns a list of the files in the directory. You can then extract the names of the files you are interested in and wget each one in turn.

2. Just wget each file you are expecting, one at a time.

3. Get the web server configuration amended so that index.html or index.htm returns a directory listing, and then follow suggestion 1.
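Suggestion 1 could be scripted roughly as follows. This is only a sketch under assumptions: it presumes the directory URL (or index.htm) returns an HTML page whose links are relative `href="name.xml"` attributes, and that grep supports `-o` (GNU and BSD grep do). `BASE_URL`, `extract_xml_links` and `fetch_xml_files` are hypothetical names chosen for illustration.

```shell
#!/bin/sh
# Directory URL from the first post; adjust the credentials and path as needed.
BASE_URL="https://user:password@www.example.com/xmlfiles/"

# Read an HTML listing on stdin and print each linked name ending in .xml.
extract_xml_links() {
    grep -o 'href="[^"]*\.xml"' |
    sed -e 's/^href="//' -e 's/"$//' |
    sort -u
}

# Fetch the listing, then download each XML file in turn.
# -N makes wget skip files that are not newer than the local copy.
fetch_xml_files() {
    wget -q -O - --no-check-certificate "$BASE_URL" |
    extract_xml_links |
    while IFS= read -r name; do
        wget -N -nd --no-check-certificate "${BASE_URL}${name}"
    done
}

# Usage: call fetch_xml_files from the directory where the files should land.
```

Because each file is requested by name with `-N`, running this three times a day should only transfer new or changed files, provided the server sends Last-Modified headers.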