01-29-2011
Help with using lynx/wget/curl when a link has an ampersand
Hi, for my own interest I want to scrape a lot of data off the Maple Story game rankings page.
The problem is, when I want to get the data at this page
maplestory(dot)nexon(dot)net/Rankings/OverallRanking.aspx?type=overall&s=&world=0&job=0&pageIndex=6
It gives me the data at this page
maplestory(dot)nexon(dot)net/Rankings/OverallRanking.aspx?type=overall
so I think it has to do with the ampersands. I've tried it with curl/wget/lynx --dump and none of them work. Maybe I'm just missing a command or using the wrong tool. Does anyone have advice?
Thanks.
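A likely cause, assuming the URL was typed unquoted at a shell prompt: in a POSIX shell an unquoted `&` ends the command and runs it in the background. The tool then receives only the part before the first `&`, which is exactly the `...?type=overall` page described above. The leftover pieces (`s=`, `world=0`, and so on) are read as variable assignments. Quoting the URL should avoid this. The sketch below is not a tested scraper. It writes the host with real dots and assumes an `http://` scheme, and the output file names are made up for illustration.

```shell
#!/bin/sh
# Assumption: real dots and an http:// scheme; the original post
# obfuscates the host as "(dot)".
# Single quotes stop the shell from treating '&' as "run in background".
url='http://maplestory.nexon.net/Rankings/OverallRanking.aspx?type=overall&s=&world=0&job=0&pageIndex=6'

# Print the argument exactly as the tool would receive it.
printf '%s\n' "$url"

# Uncomment one of these to fetch the page (file names are hypothetical):
# curl -s -o page6.html "$url"
# wget -q -O page6.html "$url"
# lynx -dump "$url" > page6.txt
```

If the printed line still ends in `pageIndex=6`, the whole query string is reaching the tool. If the server still returns the first page after that, the cause is probably something else, such as cookies or a redirect.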
CURLOPT_PATH_AS_IS(3) curl_easy_setopt options CURLOPT_PATH_AS_IS(3)
NAME
CURLOPT_PATH_AS_IS - do not handle dot dot sequences
SYNOPSIS
#include <curl/curl.h>
CURLcode curl_easy_setopt(CURL *handle, CURLOPT_PATH_AS_IS, long leaveit);
DESCRIPTION
Set the long leaveit to 1, to explicitly tell libcurl to not alter the given path before passing it on to the server.
This instructs libcurl to NOT squash sequences of "/../" or "/./" that may exist in the URL's path part and that are supposed to be removed
according to RFC 3986 section 5.2.4.
Some server implementations are known to (erroneously) require the dot dot sequences to remain in the path and some clients want to pass
these on in order to try out server implementations.
By default libcurl will merge such sequences before using the path.
DEFAULT
0
PROTOCOLS
All
EXAMPLE
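CURL *curl = curl_easy_init();
if(curl) {
  curl_easy_setopt(curl, CURLOPT_URL, "http://example.com/../../etc/password");

  /* send the path exactly as given, without squashing "/../" */
  curl_easy_setopt(curl, CURLOPT_PATH_AS_IS, 1L);

  curl_easy_perform(curl);
  curl_easy_cleanup(curl); /* release the handle when done */
}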
AVAILABILITY
Added in 7.42.0
RETURN VALUE
Returns CURLE_OK if the option is supported, and CURLE_UNKNOWN_OPTION if not.
SEE ALSO
CURLOPT_STDERR(3), CURLOPT_DEBUGFUNCTION(3), CURLOPT_URL(3)
libcurl 7.54.0 February 14, 2016 CURLOPT_PATH_AS_IS(3)