answers: 1 / hits: 20019 / Wednesday, August 10, 2016

Hey guys and ladies, first of all, this is my first question here on Stack Overflow, so don't be too hard on me.
I'm totally new to web scraping, and at the moment my problem is that I can't select the right elements. My code looks like this:



var express = require('express');
var path = require('path');
var request = require('request');
var cheerio = require('cheerio');
var fs = require('fs');

var app = express();
var port = 8000;

var url = 'http://www.finanzparasiten.de/html/links/awd.html';

// Fetch the page and load the returned HTML into cheerio
request(url, function (err, resp, body) {
    if (!err) {
        var $ = cheerio.load(body);

        // Select the paragraphs inside the nested tables
        var test = $('body table table table > tbody > tr > td > p');
        console.log(test.html());
        test.each(function (ii, asdf) {
            // Text of the second cell in the first row of the inner table
            var rr = $(asdf).find('table').find('tr').first().find('td:nth-child(2)').text();
            console.log(asdf);
        });
    } else {
        console.log('we encountered an error: ' + err);
    }
});

app.listen(port);
console.log('server is listening on ' + port);


It keeps logging null for the variable test (the output of test.html()).
It seems like cheerio has problems with the > selector. With jQuery this selection would work as expected.



Thanks to @logol's answer I could solve the first problem, but now I'm facing another one: I have to select direct children of body, and that seems to be buggy in the same way as tbody. Does anyone have a workaround?



 Answers

Original:



As far as I remember (from the last time I used cheerio), tbody is not recognized in cheerio; just leave it out and use this instead:



table > tr > td



PS: thead was working
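
A likely reason, assuming the scraped page has no explicit tbody tags in its source: browsers add a tbody element to tables while parsing, so a selector copied from the browser's dev tools can contain a step that does not exist in the tree cheerio builds (this may depend on the cheerio version and parser in use). As a minimal sketch of the "just leave it out" advice, the hypothetical helper below (withoutTbody is not part of cheerio) drops tbody steps from a selector that uses the > combinator:

```javascript
// Hypothetical helper, not part of cheerio: removes every "tbody" step from a
// selector whose steps are joined by the child combinator ">".
function withoutTbody(selector) {
    return selector
        .split('>')                                            // split into steps at each ">"
        .map(function (step) { return step.trim(); })          // normalize whitespace
        .filter(function (step) { return step !== 'tbody'; })  // drop standalone tbody steps
        .join(' > ');
}

console.log(withoutTbody('table > tbody > tr > td'));
console.log(withoutTbody('body table table table > tbody > tr > td > p'));
```

The result can be passed to $() in place of the original selector. The sketch only handles tbody as a standalone step between > combinators; any other selector comes back unchanged apart from the spacing around >.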



Update:



It seems to work sometimes even with tbody; try this in the REPL:



const cheerio = require('cheerio');

// Template literal (backticks) so the HTML string can span multiple lines
const html = `
<!DOCTYPE html>
<html>
<head>
    <title>Cheerio Test</title>
</head>
<body>
<div id="#1">
    <table>
        <thead>
            <tr>
                <th>Month</th>
                <th>Savings</th>
            </tr>
        </thead>
        <tfoot>
            <tr>
                <td>Sum</td>
                <td>180</td>
            </tr>
        </tfoot>
        <tbody>
            <tr>
                <td>January</td>
                <td>100</td>
            </tr>
            <tr>
                <td>February</td>
                <td>80</td>
            </tr>
        </tbody>
    </table>
</div>
</body>
</html>`;
const dom = cheerio.load(html);

// not working (possibly because the id attribute here is literally "#1",
// while the selector div#1 looks for an id of "1"):
let tds1 = dom('div#1 > table > tbody > tr > td').map(function () {
    return dom(this).text().trim();
}).get();

// working:
let tds2 = dom('table > tbody > tr > td').map(function () {
    return dom(this).text().trim();
}).get();

// not working:
let tds3 = dom('div#1 > table > tr > td').map(function () {
    return dom(this).text().trim();
}).get();

console.log(tds1);
console.log(tds2);
console.log(tds3);

[#61072] Monday, August 8, 2016
taliac
