Surrogate Pairs? UTF-16 can represent every code point from U+0000 to U+10FFFF, but a single 16-bit code unit cannot hold a code point at or above U+10000. Unicode solves this with surrogate code points (U+D800 to U+DFFF), a range that is reserved in every Unicode encoding form and never assigned to characters. UTF-16 uses this surrogate range to express code points at or above U+10000 by combining two code units: a high surrogate followed by a low surrogate.
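The arithmetic behind a surrogate pair can be sketched as follows. This is a minimal illustration, not part of any library: `toUtf16CodeUnits` is a hypothetical helper name, and it does not reject lone surrogate values passed in as input.

```javascript
// Hypothetical helper: split one Unicode code point into UTF-16 code units.
function toUtf16CodeUnits(codePoint) {
  if (codePoint < 0 || codePoint > 0x10FFFF) {
    throw new RangeError("code point out of range");
  }
  if (codePoint < 0x10000) {
    return [codePoint]; // fits in a single 16-bit code unit
  }
  const offset = codePoint - 0x10000;     // 20-bit value
  const high = 0xD800 + (offset >> 10);   // top 10 bits -> high surrogate
  const low = 0xDC00 + (offset & 0x3FF);  // bottom 10 bits -> low surrogate
  return [high, low];
}

const units = toUtf16CodeUnits(0x20BB7);  // U+20BB7
const text = String.fromCharCode(...units);

console.log(units.map((u) => u.toString(16))); // [ 'd842', 'dfb7' ]
console.log(text.length);                      // 2
```

A string holding one such character therefore reports a `length` of 2, because `length` counts code units.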
What happens? RegExp in JavaScript treats a string value the same way JavaScript itself does: as a sequence of UTF-16 code units. So any (.) matches one code unit, not one character, if the string contains a surrogate pair. So how can we handle surrogate pairs with RegExp easily?
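The mismatch can be reproduced with a short sketch. The variable names are only illustrative, and the patterns are plain literals with no flags.

```javascript
// U+20BB7 written as its surrogate pair: one character, two code units.
const kanji = "\uD842\uDFB7";

// A bare "." consumes only the first code unit (the high surrogate).
const firstMatch = kanji.match(/./)[0];

console.log(kanji.length);            // 2
console.log(firstMatch.length);       // 1
console.log(firstMatch === "\uD842"); // true

// The single character needs two dots to match in full.
console.log(/^.$/.test(kanji));       // false
console.log(/^..$/.test(kanji));      // true
```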