Monday, May 20, 2024
rated 0 times [  116] [ 4]  / answers: 1 / hits: 21942  / 13 Years ago, fri, december 9, 2011, 12:00:00

Suppose I have a long string containing newlines and tabs, such as:



var x = "This is a long string.\n\t This is another one on next line.";


So how can we split this string into tokens using a regular expression?



I don't want to use .split(' ') because I want to learn JavaScript's regex support.



A more complicated string could be this:



var y = "This @is a #long $string. Alright, lets split this.";


Now I want to extract only the valid words from this string, without special characters or punctuation, i.e., I want these:



var xwords = ["This", "is", "a", "long", "string", "This", "is", "another", "one", "on", "next", "line"];

var ywords = ["This", "is", "a", "long", "string", "Alright", "lets", "split", "this"];


 Answers

Here is a jsfiddle example of what you asked: http://jsfiddle.net/ayezutov/BjXw5/1/



Basically, the code is very simple:



var y = "This @is a #long $string. Alright, lets split this.";
var regex = /[^\s]+/g; // One or more non-whitespace characters; the g flag returns every match, not just the first

var match = y.match(regex) || []; // match() returns null when nothing matches
for (var i = 0; i < match.length; i++)
{
    document.write(match[i]);
    document.write('<br>');
}
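To make the result of the whitespace-based pattern concrete, here is a minimal sketch that collects the tokens into an array instead of writing them to the page. The helper name `tokenizeByWhitespace` is made up for illustration and is not part of the fiddle.

```javascript
// Hypothetical helper: returns every run of non-whitespace characters.
// String.prototype.match returns null when there is no match, so fall back to [].
function tokenizeByWhitespace(text) {
  return text.match(/[^\s]+/g) || [];
}

const y = "This @is a #long $string. Alright, lets split this.";
const tokens = tokenizeByWhitespace(y);

console.log(tokens);
// ["This", "@is", "a", "#long", "$string.", "Alright,", "lets", "split", "this."]
```

Note that punctuation and the special characters are still attached to the tokens at this stage, because only whitespace is excluded.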


UPDATE:
Basically you can expand the list of separator characters: http://jsfiddle.net/ayezutov/BjXw5/2/



var regex = /[^\s.,!?]+/g;
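As a rough illustration of what the expanded separator list changes, the sketch below applies it to the second string from the question. The variable names are only for this example.

```javascript
const y = "This @is a #long $string. Alright, lets split this.";

// Exclude whitespace plus the listed punctuation marks from each token.
const separatorsPattern = /[^\s.,!?]+/g;
const tokens = y.match(separatorsPattern) || [];

console.log(tokens);
// ["This", "@is", "a", "#long", "$string", "Alright", "lets", "split", "this"]
```

The periods and the comma are gone, but `@`, `#`, and `$` survive because they are not in the excluded set; each additional separator has to be listed explicitly with this approach.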


UPDATE 2:
Word characters only (letters, digits, and underscore):
http://jsfiddle.net/ayezutov/BjXw5/3/



var regex = /\w+/g;
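Applied to both strings from the question, the word-character pattern should give the arrays the asker wanted. The sketch below also shows a `split`-based variant for comparison; the variable names are chosen for this example only.

```javascript
const x = "This is a long string.\n\t This is another one on next line.";
const y = "This @is a #long $string. Alright, lets split this.";

// \w+ matches runs of letters, digits, and underscores.
const xwords = x.match(/\w+/g) || [];
const ywords = y.match(/\w+/g) || [];

// Alternative: split on runs of non-word characters. The trailing "." leaves
// an empty string at the end of the array, so filter empty entries out.
const xsplit = x.split(/\W+/).filter(Boolean);

console.log(xwords);
// ["This", "is", "a", "long", "string", "This", "is", "another", "one", "on", "next", "line"]
console.log(ywords);
// ["This", "is", "a", "long", "string", "Alright", "lets", "split", "this"]
```

Because `\w` also accepts digits and underscores, a stricter letters-only pattern such as `/[A-Za-z]+/g` may be closer to "valid words" for some inputs. Either pattern would split a word containing an apostrophe into two tokens.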

[#88650] Thursday, December 8, 2011
georgeh
