rated 0 times [  192] [ 5]  / answers: 1 / hits: 23035  / 9 Years ago, sun, march 1, 2015, 12:00:00

I have the following HTML that I'd like to parse with Cheerio.



    var $ = cheerio.load('<html><head><meta http-equiv=Content-Type content=text/html; charset=UTF-8/></head><body style=word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;><div>This works well.</div><div><br clear=none/></div><div>So I have been doing this for several hours. How come the space does not split? Thinking that this could be an issue.</div><div>Testing next paragraph.</div><div><br clear=none/></div><div>Im testing with another post. This post should work.</div><div><br clear=none/></div><h1>This is for test server.</h1></body></html>', {
        normalizeWhitespace: true,
    });

    // trying to parse the html
    // the goals are to
    // 1. remove all the 'div'
    // 2. clean up <br clear=none/> into <br>
    // 3. Have all the new 'empty' element added with 'p'

    var testData = $('div').map(function(i, elem) {
        var test = $(elem)
        if ($(elem).has('br')) {
            console.log('spaceme');
            var test2 = $(elem).removeAttr('br');
        } else {
            var test2 = $(elem).removeAttr('div').add('p');
        }
        console.log(i +' '+ test2.html());
        return test2.html()
    })

    res.send(test2.html())


My end goals are to parse the HTML and:




  • remove all the 'div' elements

  • clean up <br clear=none/> into <br>

  • and finally wrap the text that is left without an element (the sentences that were inside a 'div') in <p> ... </p>



I tried to start with a smaller goal in the code above. Removing all the 'div' elements works, but I'm unable to find the 'br'. I've been trying for days and have made no headway.



So I'm writing here to ask for some help and hints on how I can reach my end goal.



Thank you :D



 Answers

It's easier than it looks. First, you iterate over all the DIVs



$('div').each(function() { ...


and for each div, you check if it has a <br> tag



$(this).find('br').length


if it does, you remove the attribute



$(this).find('br').removeAttr('clear');


if not, you create a P with the same content



var p = $('<p>' + $(this).html() + '</p>');


and then just replace the DIV with the P



$(this).replaceWith(p);


and output



res.send($.html());


All together, it's



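$('div').each(function() {
    if ( $(this).find('br').length ) {
        // this div holds a <br>: strip its 'clear' attribute
        $(this).find('br').removeAttr('clear');
    } else {
        // no <br> inside: build a <p> with the same content and swap it in
        var p = $('<p>' + $(this).html() + '</p>');
        $(this).replaceWith(p);
    }
});

// serialize the whole modified document
res.send($.html());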

[#67616] Friday, February 27, 2015
victorr

Total Points: 193
Total Questions: 86
Total Answers: 105

Location: Pitcairn Islands
Member since Thu, Jun 24, 2021
3 Years ago
victorr questions
Fri, Nov 13, 20, 00:00, 4 Years ago
Sat, Jul 25, 20, 00:00, 4 Years ago
Thu, Jun 11, 20, 00:00, 4 Years ago
;