Monday, May 20, 2024
rated 0 times [  43] [ 3]  / answers: 1 / hits: 44606  / 9 Years ago, mon, december 28, 2015, 12:00:00

Given this HTML as a string html, how can I split it into an array where each heading tag (<h1>, <h2>, and so on) marks the start of a new element?



Begin with this:



<h1>A</h1>
<h2>B</h2>
<p>Foobar</p>
<h3>C</h3>


Result:



[<h1>A</h1>, <h2>B</h2><p>Foobar</p>, <h3>C</h3>]


What I've tried:



I wanted to use String.prototype.split() with a regex, but the result puts each <h into its own element. I need to capture from the start of one <h up to the next <h, including the first one but excluding the second.



var html = '<h1>A</h1><h2>B</h2><p>Foobar</p><h3>C</h3>';
var foo = html.split(/(<h)/);


Edit: Regex is not a requirement in any way; it's just the only solution I thought would work for splitting HTML strings in this way in general.



 Answers

In your example you can use:



/
<h    // Match literal <h
(.)   // Match any single character and save it in group 1
>     // Match literal >
.*?   // Match any character zero or more times, non-greedy
<\/h  // Match literal </h (the slash is escaped inside a regex literal)
\1    // Match whatever group 1 captured
>     // Match literal >
/g


var str = '<h1>A</h1><h2>B</h2><p>Foobar</p><h3>C</h3>';
str.match(/<h(.)>.*?<\/h\1>/g); // ['<h1>A</h1>', '<h2>B</h2>', '<h3>C</h3>']
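
Note that the match above returns only the heading elements themselves, so the <p>Foobar</p> from the question is dropped. As a minimal sketch of one way to keep trailing content attached to its heading, you could split on a zero-width lookahead instead. This assumes the headings are h1 through h6 and that the markup is flat and simple, as in the question; the helper name splitAtHeadings is hypothetical and not part of the original answer.

```javascript
// Hypothetical helper: split an HTML string so each chunk starts at a heading tag.
// Assumes flat, simple markup; it is not a general HTML parser.
function splitAtHeadings(html) {
  // (?=<h[1-6]\b) is a lookahead: it matches the position just before
  // "<h1" ... "<h6" without consuming any characters, so the heading
  // stays at the start of the chunk that follows the split point.
  return html
    .split(/(?=<h[1-6]\b)/i)
    .filter(function (part) { return part.length > 0; }); // drop empty chunks, if any
}

var html = '<h1>A</h1><h2>B</h2><p>Foobar</p><h3>C</h3>';
var parts = splitAtHeadings(html);
console.log(parts);
// [ '<h1>A</h1>', '<h2>B</h2><p>Foobar</p>', '<h3>C</h3>' ]
```

Matching <h[1-6] rather than a bare <h avoids splitting at tags such as <hr> or <header>. Any text that appears before the first heading ends up as its own leading chunk.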


But please don't parse HTML with regular expressions; see the Stack Overflow question "RegEx match open tags except XHTML self-contained tags" for why.


[#63925] Thursday, December 24, 2015
trayvon