Asked Friday, January 25, 2019 / answers: 1 / hits: 20279

I have a small problem.

I'm using Node.js as a backend. A user has a biography field where they can write something about themselves.

Suppose this field has a maximum length of 220, and suppose this is the input:



👶🏻👦🏻👧🏻👨🏻👩🏻👱🏻‍♀️👱🏻👴🏻👵🏻👲🏻👳🏻‍♀️👳🏻👮🏻‍♀️👮🏻👷🏻‍♀️👷🏻💂🏻‍♀️💂🏻🕵🏻‍♀️👩🏻‍⚕️👨🏻‍⚕️👩🏻‍🌾👨🏻‍🌾👨🏻‍🌾👨🏻‍🌾👨🏻‍🌾👨🏻‍🌾👨🏻‍🌾👨🏻‍🌾👨🏻‍🌾👨🏻‍🌾👨🏻‍🌾👨🏻‍🌾👨🏻‍🌾👨🏻‍🌾👨🏻‍🌾👨🏻‍🌾 


As you can see, there aren't 220 emojis (there are 37), but if I run this on my Node.js server:



console.log(bio.length)


where bio is the input text, I get 221. How can I parse the input string to get the correct length? Is this a Unicode problem?



SOLVED



I used this library: https://github.com/orling/grapheme-splitter



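I tried this:

// grapheme-splitter splits a string into user-perceived characters
const GraphemeSplitter = require('grapheme-splitter');
const splitter = new GraphemeSplitter();
// Count grapheme clusters instead of UTF-16 code units
console.log(splitter.splitGraphemes(bio).length);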


and the length is 37. It works very well!
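
For the 220-character limit from the question, the same grapheme count can drive the validation itself. The following is a minimal sketch that swaps in the native Intl.Segmenter for the grapheme-splitter library, so it is not the approach used above and it assumes a runtime that ships Intl.Segmenter. The names MAX_BIO_LENGTH, graphemeLength and isBioWithinLimit are hypothetical, and the sample bio is just one emoji from the input repeated 37 times.

```javascript
// Hypothetical limit, taken from the maxlength mentioned in the question.
const MAX_BIO_LENGTH = 220;

// Count user-perceived characters (grapheme clusters).
// Assumption: Intl.Segmenter is available in this runtime.
function graphemeLength(text) {
  const segmenter = new Intl.Segmenter(undefined, { granularity: "grapheme" });
  return [...segmenter.segment(text)].length;
}

// Hypothetical validator: true when the bio fits the limit in graphemes.
function isBioWithinLimit(bio, max = MAX_BIO_LENGTH) {
  return graphemeLength(bio) <= max;
}

// Man farmer with light skin tone (U+1F468 U+1F3FB U+200D U+1F33E), 37 times.
const sampleBio = "\u{1F468}\u{1F3FB}\u200D\u{1F33E}".repeat(37);
console.log(sampleBio.length);            // 259 UTF-16 code units, over the limit
console.log(graphemeLength(sampleBio));   // 37
console.log(isBioWithinLimit(sampleBio)); // true
```

Validating on the server with a grapheme count keeps the check consistent with what the user sees, while a check on bio.length would reject this input.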



 Answers

  1. str.length gives the number of UTF-16 code units, so a character outside the Basic Multilingual Plane (which includes most emoji) counts as 2.



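  2. A Unicode-aware way to get the string length in code points is [...str].length, because the string iterator splits the string into code points. A single visible emoji can still consist of several code points (for example a skin-tone modifier or a zero-width joiner sequence).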



  3. If we need the length in graphemes (grapheme clusters), we have these native ways:


    a. Unicode property escapes in RegExp. See for example: "Unicode-aware version of \w" or "Matching emoji".


    b. Intl.Segmenter: at the time of writing it was expected around ES2021. It could be tested behind a flag in earlier V8 versions (the implementation was synced with the latest spec in V8 86) and shipped unflagged in V8 87.
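
The three ways of counting listed above can be compared side by side. This is a small sketch rather than a definitive implementation: the helper names countCodeUnits, countCodePoints and countGraphemes are hypothetical, and the grapheme variant assumes a runtime where Intl.Segmenter is available (V8 87 or later, per the note above).

```javascript
// 1. UTF-16 code units: what String.prototype.length reports.
function countCodeUnits(str) {
  return str.length;
}

// 2. Code points: iterating a string yields one item per code point.
function countCodePoints(str) {
  return [...str].length;
}

// 3. Grapheme clusters (user-perceived characters) via Intl.Segmenter.
// Assumption: the runtime ships Intl.Segmenter.
function countGraphemes(str, locale = "en") {
  const segmenter = new Intl.Segmenter(locale, { granularity: "grapheme" });
  let count = 0;
  for (const _segment of segmenter.segment(str)) {
    count += 1;
  }
  return count;
}

// Man farmer with light skin tone: U+1F468 U+1F3FB U+200D U+1F33E.
const farmer = "\u{1F468}\u{1F3FB}\u200D\u{1F33E}";
console.log(countCodeUnits(farmer));  // 7
console.log(countCodePoints(farmer)); // 4
console.log(countGraphemes(farmer));  // 1
```

The emoji is written with escapes so that the individual code points stay visible: two surrogate pairs for the man and the skin tone, one zero-width joiner, and one surrogate pair for the sheaf of rice.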







[#52712] Tuesday, January 22, 2019