TechArena Community > Software > Windows Software
Automatic download of data from website that requires login and password

  #1  
Old 19-04-2010
Member
 
Join Date: Dec 2009
Posts: 23
  

I recently wrote a script that automatically downloads data from a website that requires a login and password, with no manual logging in. I am sharing it for anyone who may need something similar.

Investbulls is a website that supplies stock data for the BSE (Bombay Stock Exchange). It is free but requires registration. Normally, users log in to the website manually, navigate to the download pages, select the previous trading day's data, download it, log out, and then open the downloaded .zip file on their computer.

This script automates all of that. It is written in biterscripting. I have commented the code to explain the script's workflow.

Investbulls is a PHP site. After login, the site sends the client a cookie, which must be submitted in the headers of all subsequent requests to its server. I am using the command 'script SS_ISCookies.txt' to do that.
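For readers who do not use biterscripting, the cookie exchange can be illustrated in Python. This is only a sketch of the general idea (turn the server's Set-Cookie values into a Cookie request header); it is not the source of SS_ISCookies.txt, which is not shown in this thread. The function name cookie_header is hypothetical, and the cookie value is made up.

```python
from http.cookies import SimpleCookie


def cookie_header(set_cookie_values):
    """Build a Cookie request header value from Set-Cookie response values."""
    jar = SimpleCookie()
    for value in set_cookie_values:
        # Attributes such as "path" are parsed and attached to the cookie.
        jar.load(value)
    # Only name=value pairs go back to the server; attributes stay client-side.
    return "; ".join(f"{name}={morsel.value}" for name, morsel in jar.items())


if __name__ == "__main__":
    # PHPSESSID is PHP's default session cookie name; the value is invented.
    print(cookie_header(["PHPSESSID=abc123; path=/"]))
```

The resulting string would be sent as the Cookie header on each later request in the same session.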


Code:
# Script investbulls.txt
# Input arguments
var string username # Username that one would manually enter on the login page
var string password # Password that one would manually enter on the login page

var string out, downloadlink, description, zipfile

# Start a browser session and connect it to investbulls.com.
isstart ib "daily download of EOD" "Mozilla/4.0"
isconnect ib "http://www.investbulls.com" > $out

# Submit the login form.
issubmit ib "/news.php" ("username="+$username) ("userpass="+$password) "autologin=0" "userlogin=Login" > null

# Exchange cookies with the server.
script SS_ISCookies.txt from("ib") to("ib")

# Retrieve the page that has the links for the last few days.
# The link at the top is for the immediately preceding day's data,
#    which is the data we are interested in.
isretrieve ib "/download.php?list.40" > $out

# The source for this page is now in string $out.
# Use string extractions (the stex command) to get the link for yesterday.
# Also get its description; we will save the data under that file name.
stex -c "]^<a href='download.php?view^" $out > null
stex -c "^<a href='^]" $out > null
stex -c "]^'^" $out > $downloadlink
stex -c "^'>^]" $out > null
stex -c "]^<^" $out > $description
set $zipfile=$description+".zip"

# Yesterday's URL is in $downloadlink. Get that page. It has a link
# that, when clicked in a browser, starts the file download.
isretrieve ib $downloadlink > $out
stex -c "]^<a href='request.php?^" $out > null
stex -c "^<a href='^]" $out > null
stex -c "]^'^" $out > $downloadlink

# OK. The link to the actual downloadable .zip file is now in $downloadlink.
# Retrieve this file in binary (-b) mode.
isretrieve -b ib $downloadlink > $out

# The binary file data is in the session's buffer. Save it to the local file.
issave -b ib $zipfile

# Open the local .zip file for the user.
system ("\""+$zipfile+"\"")

# We are done. Click the 'logout' link, disconnect session, close browser.
isretrieve ib "index.php?logout" > null
isdisconnect ib
isend ib
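
The stex extractions in the script above (find the first matching link, take its URL and its description) can be approximated in Python. This is a minimal sketch, not the script's actual method: FirstLinkFinder and first_link are hypothetical names, and the sample HTML is invented to resemble a download list rather than copied from investbulls.

```python
from html.parser import HTMLParser


class FirstLinkFinder(HTMLParser):
    """Record the href and text of the first <a> whose href starts with a prefix."""

    def __init__(self, prefix):
        super().__init__()
        self.prefix = prefix
        self.href = None
        self.text = ""
        self._capturing = False
        self._done = False

    def handle_starttag(self, tag, attrs):
        if tag != "a" or self._done or self._capturing:
            return
        href = dict(attrs).get("href") or ""
        if href.startswith(self.prefix):
            self.href = href
            self._capturing = True

    def handle_data(self, data):
        # Text between the matching <a> and its </a> is the description.
        if self._capturing:
            self.text += data

    def handle_endtag(self, tag):
        if tag == "a" and self._capturing:
            self._capturing = False
            self._done = True


def first_link(html, prefix):
    """Return (href, link text) for the first matching link, or (None, "")."""
    finder = FirstLinkFinder(prefix)
    finder.feed(html)
    finder.close()
    return finder.href, finder.text.strip()


if __name__ == "__main__":
    # Invented page fragment: newest entry first, as described in the script.
    page = (
        "<ul>"
        "<li><a href='download.php?view.101'>BSE 16-04-2010</a></li>"
        "<li><a href='download.php?view.100'>BSE 15-04-2010</a></li>"
        "</ul>"
    )
    print(first_link(page, "download.php?view"))
```

Parsing the markup instead of cutting strings is a design choice here: it tolerates changes in quoting and spacing, at the cost of a little more code.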

To try:

Save the script in file "C:/Scripts/investbulls.txt", start biterscripting, and enter this command:

Code:
script "C:/Scripts/investbulls.txt" username("your user name") password("your password")

In the above command, replace "your user name" and "your password" with the username and password you registered with investbulls.

On my computer, the script takes about a minute to complete. When it finishes, the downloaded .zip file opens for the user. No actual browser is opened; the script simulates the communication a browser would carry out with the web server.
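
To show what "simulating a browser" amounts to, here is a rough Python sketch of the session setup and the login form submission. It only builds the request and does not send it. The form field names and the /news.php path are taken from the script above; make_session and build_login_request are hypothetical names, and whether the site accepts such a request from Python is untested.

```python
import urllib.parse
import urllib.request
from http.cookiejar import CookieJar


def make_session(user_agent="Mozilla/4.0"):
    """Return an opener that keeps cookies in memory, plus its cookie jar."""
    jar = CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    opener.addheaders = [("User-Agent", user_agent)]
    return opener, jar


def build_login_request(base_url, username, password):
    """Build (but do not send) the POST that submitting the login form makes."""
    fields = {
        "username": username,
        "userpass": password,
        "autologin": "0",
        "userlogin": "Login",
    }
    body = urllib.parse.urlencode(fields).encode("ascii")
    url = urllib.parse.urljoin(base_url, "/news.php")
    # Supplying data makes urllib issue a POST instead of a GET.
    return urllib.request.Request(url, data=body)


if __name__ == "__main__":
    opener, jar = make_session()
    request = build_login_request("http://www.investbulls.com", "your user name", "your password")
    print(request.get_method(), request.full_url)
    # opener.open(request) would send it; cookies from the reply would land in jar.
```

Every later page would be fetched through the same opener, so the cookie jar plays the role of the session buffer in the biterscripting version.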

Thought this may be useful to someone in the future.

Similar code will work on most PHP sites. For .NET sites, you may need to extract some variable values such as __VIEWSTATE from each response ($out) and return them to the server with the next request. Feel free to post a question if you need help automating downloads from .NET websites.
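
As an illustration of the .NET case, the hidden form fields can be collected from each response and sent back with the next request. The Python sketch below assumes the value lives in an input element of type "hidden", which is how ASP.NET pages usually carry __VIEWSTATE; HiddenFieldCollector and hidden_fields are hypothetical names, and the sample value is invented.

```python
from html.parser import HTMLParser


class HiddenFieldCollector(HTMLParser):
    """Collect name/value pairs from <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        # Self-closing <input ... /> tags also arrive here by default.
        if tag != "input":
            return
        attr = dict(attrs)
        if (attr.get("type") or "").lower() == "hidden" and attr.get("name"):
            self.fields[attr["name"]] = attr.get("value") or ""


def hidden_fields(html):
    """Return a dict of all hidden form fields found in the page source."""
    collector = HiddenFieldCollector()
    collector.feed(html)
    collector.close()
    return collector.fields


if __name__ == "__main__":
    # Invented page fragment; a real __VIEWSTATE value is much longer.
    page = (
        '<form><input type="hidden" name="__VIEWSTATE" value="dDwtMTIz" />'
        '<input type="text" name="q" /></form>'
    )
    print(hidden_fields(page))
```

The returned dictionary would be merged into the form fields of the next POST, alongside the values the user would normally type.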

  #2  
Old 20-04-2010
Member
 
Join Date: Aug 2009
Posts: 2,881
Re: Automatic download of data from website that requires login and password

The tip is really helpful for getting data from various sites without logging in. But the process is a bit lengthy and needs some advanced technical skills. I have another solution that lets you enter any site or forum without registering an account: install the User Agent add-on for Firefox and configure it to identify as Googlebot. You then get access to the site without a login ID and password. But some sites can sue you for this.
  #3  
Old 20-04-2010
Member
 
Join Date: Dec 2009
Posts: 23
Re: Automatic download of data from website that requires login and password

I will be quite surprised if this can be done using Firefox without a proper login and password. I am not saying it can't be done, just that I can't imagine how, and I would be surprised.

My scripting approach isn't that brave: it requires a proper login and password. In essence, we are creating our own mini-browser using a scripting language; we are merely automating the manual typing, reading, entering, pointing, and clicking.

Notice that I did not hardcode the login and password in the script, so the caller of the script must supply their own login and password. That way, if the website bills based on login, it will have the correct information.

Another advantage of the scripting approach, in case you did not notice it, is that the site cookies remain in the script's buffer; cookies are never written to the hard drive. Google Analytics scripts inserted in most websites routinely scan the browser's computer for tracing these cookies and collect info on which sites you visited, even if you never went to those sites through Google.

In the scripting approach, Google would not see your cookies. That's the correct approach, since Google does not need to know, nor has any right to know, which sites you visit if you did NOT go to those sites through Google.

Last edited by ranjankumar09 : 20-04-2010 at 08:51 PM.