Asked Wed, May 11, 2011

Okay, I have read about regex all day now, and still don't understand it properly. What I'm trying to do is validate a name, but the functions I can find for this on the internet only use [a-zA-Z], leaving out characters that I need to accept too.



I basically need a regex that checks that the name is at least two words and that it does not contain numbers or special characters like !#¤%&/()=... However, the words may contain characters like æ, é, Â and so on.



An example of an accepted name would be: John Elkjærd or André Svenson
A non-accepted name would be: Hans, H4nn3 Andersen or Martin Henriksen!



If it matters, I use JavaScript's .match() function on the client side and want to use PHP's preg_replace() on the server side only in the negative sense (removing non-matching characters).



Any help would be much appreciated.



Update:

Okay, thanks to Alix Axel's answer I have the important part down, the server-side one.



But as the page from LightWing's answer suggests, I'm unable to find anything about Unicode support in JavaScript, so I ended up with half a solution for the client side, just checking for at least two words and a minimum of 5 characters, like this:



// Count whitespace-separated words; match() returns null when nothing matches
if ((name.match(/\S+/g) || []).length >= minWords && name.length >= 5) {
    // valid
}


An alternative would be to specify all the Unicode characters explicitly, as suggested in shifty's answer. I might end up doing something like that along with the solution above, but it is a bit impractical.



 Answers

Try the following regular expression:



^(?:[\p{L}\p{Mn}\p{Pd}'\x{2019}]+\s[\p{L}\p{Mn}\p{Pd}'\x{2019}]+\s?)+$


In PHP this translates to:



if (preg_match('~^(?:[\p{L}\p{Mn}\p{Pd}\'\x{2019}]+\s[\p{L}\p{Mn}\p{Pd}\'\x{2019}]+\s?)+$~u', $name) > 0)
{
    // valid
}


You should read it like this:



^                # start of subject
(?:              # match this:
    [            # match a:
        \p{L}    # Unicode letter, or
        \p{Mn}   # Unicode accents, or
        \p{Pd}   # Unicode hyphens, or
        '        # single quote, or
        \x{2019} # single quote (alternative)
    ]+           # one or more times
    \s           # any kind of space
    [            # match a:
        \p{L}    # Unicode letter, or
        \p{Mn}   # Unicode accents, or
        \p{Pd}   # Unicode hyphens, or
        '        # single quote, or
        \x{2019} # single quote (alternative)
    ]+           # one or more times
    \s?          # any kind of space (zero or one time)
)+               # one or more times
$                # end of subject


I honestly don't know how to port this to JavaScript; I'm not even sure JavaScript supports Unicode properties. In PHP's PCRE, though, this seems to work flawlessly when tested on IDEOne.com:



$names = array
(
    'Alix',
    'André Svenson',
    'H4nn3 Andersen',
    'Hans',
    'John Elkjærd',
    'Kristoffer la Cour',
    'Marco d\'Almeida',
    'Martin Henriksen!',
);

foreach ($names as $name)
{
    echo sprintf('%s is %s' . "\n", $name, (preg_match('~^(?:[\p{L}\p{Mn}\p{Pd}\'\x{2019}]+\s[\p{L}\p{Mn}\p{Pd}\'\x{2019}]+\s?)+$~u', $name) > 0) ? 'valid' : 'invalid');
}


I'm sorry I can't help you with the JavaScript part, but someone here probably will.






Validates:




  • John Elkjærd

  • André Svenson

  • Marco d'Almeida

  • Kristoffer la Cour



Invalidates:




  • Hans

  • H4nn3 Andersen

  • Martin Henriksen!
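
For the JavaScript side, engines that implement ES2018 accept Unicode property escapes such as \p{L} when the regex carries the u flag, so the PCRE pattern can probably be ported fairly directly. The following is only a minimal sketch under that assumption: isValidName and NAME_PATTERN are hypothetical names, and \x{2019} is written as \u2019 because JavaScript has no \x{...} syntax.

```javascript
// Hypothetical port of the PCRE pattern, assuming an ES2018+ engine.
// The "u" flag is required for \p{...} Unicode property escapes.
const NAME_PATTERN = /^(?:[\p{L}\p{Mn}\p{Pd}'\u2019]+\s[\p{L}\p{Mn}\p{Pd}'\u2019]+\s?)+$/u;

// Returns true when the name has at least two words made only of
// letters, combining accents, hyphens and apostrophes.
function isValidName(name) {
    return NAME_PATTERN.test(name);
}

const names = [
    'Alix',
    'André Svenson',
    'H4nn3 Andersen',
    'Hans',
    'John Elkjærd',
    'Kristoffer la Cour',
    "Marco d'Almeida",
    'Martin Henriksen!',
];

for (const name of names) {
    console.log(`${name} is ${isValidName(name) ? 'valid' : 'invalid'}`);
}
```

Older browsers without property-escape support would reject this regex literal with a syntax error, so the word-count check from the question may still be useful as a fallback there.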






To replace invalid characters, though I'm not sure why you need this, you just need to change it slightly:



$name = preg_replace('~[^\p{L}\p{Mn}\p{Pd}\'\x{2019}\s]~u', '', $name);


Examples:




  • H4nn3 Andersen -> Hnn Andersen

  • Martin Henriksen! -> Martin Henriksen



Note that you always need to use the u modifier.
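
If the same clean-up is ever wanted on the client, a comparable replacement can be sketched in JavaScript, again assuming an ES2018+ engine with Unicode property escapes. stripInvalidNameChars is a hypothetical helper name, not part of any library; the g flag replaces every occurrence and the u flag plays the same role as PCRE's u modifier here.

```javascript
// Hypothetical helper, assuming ES2018+ Unicode property escapes.
// Removes every character that is not a letter, combining accent,
// hyphen, apostrophe (' or U+2019) or whitespace.
function stripInvalidNameChars(name) {
    return name.replace(/[^\p{L}\p{Mn}\p{Pd}'\u2019\s]/gu, '');
}

console.log(stripInvalidNameChars('H4nn3 Andersen'));    // Hnn Andersen
console.log(stripInvalidNameChars('Martin Henriksen!')); // Martin Henriksen
```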


Answered Monday, May 9, 2011