wget and xml issue


# 1  

Hi All,
I need to use wget to download all files with the "xml" extension from a specific URL, for instance https://www.example.com/xmlfiles/
I need to do this 3 times a day, downloading only the files that have been added or changed since the last download. To do this I have used the following command:
Code:
wget -r -nd -N -A xml --no-check-certificate https://user:password@www.example.com/xmlfiles/

I authenticate successfully with the server, but then I get:
403 FORBIDDEN

If I do the following:
Code:
wget -r -nd -N --no-check-certificate https://user:password@www.example.com/xmlfiles/file.xml

then I can successfully download the file from the URL.

Where is my mistake?
P.S. I cannot use FTP, only HTTPS.

Then I need to parse all the downloaded XML files and extract the data into a CSV file. What would be the best way to do this?
Attached you can find one of my XML files as well as the output CSV file (exported as XLS here because CSV attachments are not allowed) that I need to obtain after parsing the XML.



Thank you in advance for your help and Merry Christmas to all.
Nino

Last edited by pludi; 12-22-2009 at 08:53 AM.. Reason: removed links and added code tags
# 2  
If you point a web browser at:
Code:
https://www.example.com/xmlfiles/

does it try to retrieve index.htm or index.html (or whatever the web server's default page is set to)?
If so, I imagine wget(1) is doing the same, and presumably that page does not exist?
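One way to check this from the command line is to ask wget for the response headers of the directory URL without downloading anything. The following is only a sketch: last_http_status is a helper name made up for illustration, and the URL and credentials in the usage comment are the placeholders from the first post.

```shell
# Hypothetical helper: print the last HTTP status code found in the
# "wget --server-response" output read from stdin.
last_http_status() {
    awk '$1 ~ /^HTTP\// { code = $2 } END { if (code != "") print code }'
}

# Usage (placeholder URL and credentials; wget writes the headers to stderr):
# wget --spider --server-response --no-check-certificate \
#      --user=user --password=password \
#      https://www.example.com/xmlfiles/ 2>&1 | last_http_status
```

If this prints 403 for the directory URL while a direct file URL gives 200, the server is refusing to serve an index for the directory, so wget -r has no page from which to discover the file names.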
# 3  
Did you try using -A.xml as an option?
# 4  
Hi there, thank you for your reply.
I have also tried -A.xml, but no luck. You are right, Tony: wget is looking for the index.html page, which in this case doesn't exist.
Any idea how to solve the problem?
Thank you again.
Greetings
# 5  
Either:

1. Try a wget of index.htm, just in case it gives you a list of the files in the directory. From that list you can extract the names of the files you are interested in and then wget each file in turn.

2. Just wget, in turn, each file you are expecting to get.

3. Get the web server configuration amended so that index.html or index.htm returns a directory listing, and then follow suggestion 1.
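Suggestion 1 could be sketched roughly as follows. This is only an illustration under several assumptions: the index page links to the files with lowercase, double-quoted, relative href attributes, and your grep supports -o. The function names extract_xml_links and fetch_xml_files are made up for this example, and the URL and credentials are placeholders.

```shell
# Hypothetical helper: print the href targets ending in .xml found in the
# HTML read from stdin (assumes href="name.xml" with double quotes).
extract_xml_links() {
    grep -o 'href="[^"]*\.xml"' | sed -e 's/^href="//' -e 's/"$//'
}

# Hypothetical helper: fetch the index page, then download each listed file.
# -N makes wget skip files whose local copy is not older than the server's.
# Arguments: base URL (ending in /), user, password.
fetch_xml_files() {
    base_url=$1
    wget -q -O - --no-check-certificate --user="$2" --password="$3" "$base_url" |
        extract_xml_links |
        while read -r name; do
            wget -N --no-check-certificate --user="$2" --password="$3" "$base_url$name"
        done
}

# Usage (placeholders):
# fetch_xml_files https://www.example.com/xmlfiles/ user password
```

Because each file is fetched with -N, running the same script three times a day should only transfer files that are new or have a newer timestamp on the server, provided the server sends Last-Modified headers.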