How to use cURL to download a web page with authentication (form)?

# 1  

Hello,

I'm new to the forum and really a beginner, so sorry in advance for my bad English.

I use Linux and want to create a little program that automatically downloads some PDFs (invoices) and puts them in a folder on my computer. I'm learning and writing it on Ubuntu, but the program will eventually run on Mac OS X, so I'm looking for software that works on both.

That led me to cURL.

I don't know whether there is already a thread about cURL and authentication; if so, please redirect me.

So, I can download a PDF from the command line (a good start), but the files I want are on a website behind an authentication form, and it's impossible to go directly to the URL of the PDF. I have to go through the login page first, and once that's done I land on the "home" part of the site, which asks for a second authentication.

So I have to get past the login form, then past a second form that asks only for a numeric "personal code", and only then go to the URL of the file I want to download.

I tried to pass the first authentication, but I don't know whether it worked, and I don't know how to keep the cookies afterwards, pass the second authentication, and then ask cURL for the URL of the file.

To begin, can somebody tell me whether this command line passes the form successfully or not? (XXXXX stands for the username and password.)

Code:
curl --trace-ascii debugcyclosoftware.txt -d klantid=XXXXX -d wachtwoord=XXXXX https://s02.cyclesoftware.nl/app/cs/account/login/login/
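
From reading the curl man page, I think the next step would be to save the session cookies with -c and send them back with -b on the following requests, something like this (the field name "personal_code" and the two "PLACEHOLDER" URLs are only placeholders, I don't know the real ones yet):

Code:
# step 1: log in and save the session cookies in cookies.txt
curl -c cookies.txt -L \
     -d klantid=XXXXX -d wachtwoord=XXXXX \
     https://s02.cyclesoftware.nl/app/cs/account/login/login/

# step 2: send the cookies back and POST the "personal code" form
#         (URL and field name are placeholders)
curl -b cookies.txt -c cookies.txt -L \
     -d personal_code=XXXXX \
     https://s02.cyclesoftware.nl/PLACEHOLDER_SECOND_FORM_URL

# step 3: still with the same cookies, download the invoice PDF
#         (URL is a placeholder)
curl -b cookies.txt -L -o invoice.pdf \
     https://s02.cyclesoftware.nl/PLACEHOLDER_PDF_URL

Is that roughly the right idea?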

You can see the form at the address in the command.

The debugcyclosoftware.txt file is attached.

The second authentication is also a form, but it asks for the "personal code" needed to browse the rest of the site.

When I look at the debugcyclosoftware.txt file, I see that I end up at the location I wanted to reach (the home page), so does that mean it worked?

If I add a second URL after the first one to do a second POST (for the second form), I get sent straight back to the login page with an HTTP/1.1 302 Found.

Maybe I need a language like Python or PHP to drive cURL, do the different actions step by step, and keep the cookies.

I don't know much PHP, but I see that it works well with cURL.

Any advice or suggestions to help me keep learning and get this done?

I hope I'm relatively clear...

Thanks a lot for reading =)
# 2  
I haven't tried that with cURL, but I have used wget (a similar tool) to achieve about the same thing.

Here is a test script I once wrote for a bot that edits wiki pages. It is not exactly a solution to your problem, but it shows how things work and how you can keep and reuse persistent login data across several calls of wget:

Code:
#! /bin/ksh93

typeset WGET=$(which wget)
typeset chWikiInst="my_wiki"
typeset chWikiURL="http://my_system/wiki/api.php"
typeset fWorkDir="/home/bakunin/projects/wiki/work"
typeset fOut="$fWorkDir/outfile"
typeset fTok="$fWorkDir/token"
typeset chUser="BotUser"
typeset chPwd="UserBot"
typeset chToken=""
typeset chSessionID=""
typeset chEditToken=""

rm "${fOut}*"
rm "${fTok}*"

# ----------------------- login1 --------------------
$WGET --post-data "action=login&lgname=${chUser}&lgpassword=${chPwd}&format=xml" \
      --save-cookies="${fTok}.login1" \
      --output-document="${fOut}.login1" \
      --keep-session-cookies \
      -q \
      "$chWikiURL"

                                                       # extract info
chToken="$(sed 's/.*\ token="\([^"]*\)".*/\1/' "${fOut}.login1")"
chSessionID="$(sed 's/.*\ sessionid="\([^"]*\)".*/\1/' "${fOut}.login1")"

print - "sessionID: $chSessionID \t Token: $chToken"

# ----------------------- confirm token --------------------
$WGET --post-data "action=login&lgname=${chUser}&lgpassword=${chPwd}&lgtoken=${chToken}&format=xml" \
      --load-cookies="${fTok}.login1" \
      --save-cookies="${fTok}.login2" \
      --output-document="${fOut}.login2" \
      --keep-session-cookies \
      -q \
      "$chWikiURL"

# ----------------------- get edit token --------------------
$WGET --post-data "action=tokens&type=edit&format=xml" \
      --load-cookies="${fTok}.login2" \
      --save-cookies="${fTok}.edit" \
      --output-document="${fOut}.edit" \
      --keep-session-cookies \
      -q \
      "$chWikiURL"

                                                       # extract info
chEditToken="$(sed 's/.*\ edittoken="\([^"]*\)+\\".*/\1/' "${fOut}.edit")"
                                                       # pseudo-URL-encode trailing "+\"
chEditToken="${chEditToken}%2B%5C"
print - "sessionID: $chSessionID\nToken....: $chToken\nEditToken: $chEditToken"

# ----------------------- create new page --------------------
$WGET --post-data "action=edit&title=MyTestPage&contentformat=text/x-wiki&format=xml&text='Hello-World'&token=${chEditToken}" \
      --load-cookies="${fTok}.edit" \
      --save-cookies="${fTok}.create" \
      --output-document="${fOut}.create" \
      --keep-session-cookies \
      -q \
      "$chWikiURL"

# ----------------------- logout -------------------
$WGET --post-data "action=logout&format=xml" \
      --load-cookies="${fTok}.edit" \
      --save-cookies="${fTok}.sessionend" \
      --output-document="${fOut}.sessionend" \
      "$chWikiURL"
exit 0
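
I haven't tested it, but as far as I know the cURL counterparts of --save-cookies / --load-cookies / --keep-session-cookies are -c (write a cookie jar) and -b (read one), so if you prefer to stay with cURL the same pattern should look roughly like this (untested sketch, reusing the variables from the script above):

Code:
# log in and store the session cookies
curl -c cookies.txt \
     -d "action=login&lgname=${chUser}&lgpassword=${chPwd}&format=xml" \
     "$chWikiURL"

# reuse (and update) the cookies on the next call
curl -b cookies.txt -c cookies.txt \
     -d "action=tokens&type=edit&format=xml" \
     "$chWikiURL"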

I hope this helps.

bakunin