Monday, May 20, 2024
Asked Wednesday, July 22, 2015

When I browse website A in a normal browser (Chrome) and click a link on it, Chrome immediately downloads a report as a CSV file.



When I checked the server's response headers, I got the following:



Cache-Control:private,max-age=31536000
Connection:Keep-Alive
Content-Disposition:attachment; filename=report.csv
Content-Encoding:gzip
Content-Language:de-DE
Content-Type:text/csv; charset=UTF-8
Date:Wed, 22 Jul 2015 12:44:30 GMT
Expires:Thu, 21 Jul 2016 12:44:30 GMT
Keep-Alive:timeout=15, max=75
Pragma:cache
Server:Apache
Transfer-Encoding:chunked
Vary:Accept-Encoding


Now I want to download and parse this file using PhantomJS. I set the page's onResourceReceived listener to see whether Phantom receives/downloads the file.



clientRequests.phantomPage.onResourceReceived = function(response) {
    console.log('Response (#' + response.id + ', stage ' + response.stage + '): ' + JSON.stringify(response));
};


When I make a Phantom request for the file (that is, page.open('URL OF THE FILE')), I can see in the Phantom log that the file is downloaded. Here are the logs:



contentType: text/csv; charset=UTF-8,
headers: {
name: Date,
value: Wed, 22 Jul 2015 12:57:41 GMT
},
name: Content-Disposition,
value: attachment; filename=report.csv,
status:200,statusText:OK


I received the file and its content, but how do I access the file data? When I print the current PhantomJS page object, I get the HTML of page A, which I don't want. I want the CSV file, which I need to parse using JavaScript.



 Answers

After days of investigation, I found that there are a few solutions:




  • In your evaluate function you can make an AJAX call to download and encode the file, then return its content back to the Phantom script

  • You can use one of the custom Phantom libraries available on GitHub
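
The first option can be sketched roughly as follows. This is only a minimal sketch under stated assumptions, not a tested recipe: fetchCsvInPage is a hypothetical helper name, both URLs are placeholders, and it assumes the CSV is served from the same origin as the page that is currently open (otherwise the request may be blocked). The function is handed to page.evaluate, so it runs inside the page and returns the response text to the Phantom script.

```javascript
// Runs inside the web page (via page.evaluate), so it must not rely on
// variables from the outer PhantomJS script.
function fetchCsvInPage(url) {
    var xhr = new XMLHttpRequest();
    // Third argument false = synchronous request, so the text can be returned directly.
    xhr.open('GET', url, false);
    xhr.send(null);
    return xhr.status === 200 ? xhr.responseText : null;
}

// Only run the PhantomJS part when the script is actually executed by PhantomJS.
if (typeof phantom !== 'undefined') {
    var page = require('webpage').create();
    var pageUrl = 'http://example.com/';            // placeholder for "website A"
    var csvUrl = 'http://example.com/report.csv';   // placeholder for the report link

    page.open(pageUrl, function (status) {
        if (status !== 'success') {
            console.log('Could not open ' + pageUrl);
            phantom.exit(1);
            return;
        }
        // Extra arguments to evaluate are passed to the function inside the page.
        var csv = page.evaluate(fetchCsvInPage, csvUrl);
        console.log(csv === null ? 'Download failed' : csv);
        phantom.exit(0);
    });
}
```

Because the request is made from inside the page, it is sent with that page's cookies and session, which is usually what a report link behind a login needs.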



If you need to download a file using PhantomJS, then run away from PhantomJS and use CasperJS. CasperJS is based on PhantomJS, but it has a much better, more intuitive syntax and program flow.



Here is a good post explaining why CasperJS is better than PhantomJS. It includes a section about file download.



How to download a CSV file using CasperJS (this works even when the server sends the header Content-Disposition: attachment; filename='file.csv')



Here is a sample CSV file available for download: http://captaincoffee.com.au/dump/items.csv



To download this file using CasperJS, run the following code:



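var casper = require('casper').create();

// Open the page first so the request below runs in its context.
casper.start('http://captaincoffee.com.au/dump/', function() {
    this.echo(this.getTitle());
});

casper.then(function() {
    var url = 'http://captaincoffee.com.au/dump/csv.csv';
    // base64encode fetches the URL and returns its contents as a base64 string.
    require('utils').dump(this.base64encode(url, 'get'));
});

casper.run();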


The code above downloads the http://captaincoffee.com.au/dump/csv.csv file and prints its contents as a base64 string. This way you don't have to save the data to a file at all; you already have it as a base64 string.
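
To get from the base64 string back to rows you can work with, something like the following may help. It is a rough sketch with hypothetical helper names (decodeBase64Utf8, parseSimpleCsv). It assumes the file is UTF-8 encoded, that atob is available in the script context (with a Node.js Buffer fallback), and that the CSV has no quoted fields or embedded newlines; a real CSV parser is the safer choice for anything more complex.

```javascript
// Decode a base64 string into a JavaScript string, treating the bytes as UTF-8.
function decodeBase64Utf8(b64) {
    // atob exists in browser-like contexts; Buffer is a Node.js fallback.
    var binary = (typeof atob === 'function')
        ? atob(b64)
        : Buffer.from(b64, 'base64').toString('binary');
    // Percent-escape each byte, then decode the escaped sequence as UTF-8.
    return decodeURIComponent(escape(binary));
}

// Naive CSV split: assumes no quoted fields, embedded delimiters or embedded newlines.
function parseSimpleCsv(text, delimiter) {
    var sep = delimiter || ',';
    var lines = text.split(/\r?\n/);
    var rows = [];
    for (var i = 0; i < lines.length; i++) {
        // Skip empty lines, such as the one after a trailing newline.
        if (lines[i] !== '') {
            rows.push(lines[i].split(sep));
        }
    }
    return rows;
}

// Possible use inside a CasperJS step (not executed here):
//   var rows = parseSimpleCsv(decodeBase64Utf8(this.base64encode(url, 'get')));
```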



If you explicitly want to save the file to the file system, you can use the download function that CasperJS provides.
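
A minimal sketch of the download variant might look like this. The helper fileNameFromUrl and the fallback name download.csv are hypothetical, introduced only to pick a local file name; the CasperJS part is guarded so that it runs only under PhantomJS/CasperJS, and it is meant to be started with the casperjs command.

```javascript
// Derive a local file name from the last path segment of a URL.
function fileNameFromUrl(url, fallback) {
    // Drop the fragment and query string before looking at the path.
    var path = url.split('#')[0].split('?')[0];
    var name = path.substring(path.lastIndexOf('/') + 1);
    return name || fallback || 'download.csv';
}

// Only run the CasperJS part when executed by CasperJS (which runs on PhantomJS).
if (typeof phantom !== 'undefined') {
    var casper = require('casper').create();

    casper.start('http://captaincoffee.com.au/dump/', function () {
        var url = 'http://captaincoffee.com.au/dump/csv.csv';
        // download(url, target) saves the remote resource to the given local path.
        this.download(url, fileNameFromUrl(url));
    });

    casper.run();
}
```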


Answered Monday, July 20, 2015
josefn
