represented as a sequence of UTF-16 code units. All lengths, character indexes, and ranges are expressed in terms of 16-bit platform-endian values, with index values starting at 0.
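To make the "lengths are counted in 16-bit code units" point concrete, here is a minimal Python sketch. The helper names `utf16_units` and `utf16_length` are hypothetical and used only for illustration. Little-endian order is an assumption made to keep the output deterministic, whereas the quoted text says platform-endian.

```python
def utf16_units(text: str) -> list[int]:
    """Return the UTF-16 code units of text as a list of integers."""
    # Assumption: little-endian, no BOM. Each code unit occupies 2 bytes.
    data = text.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little") for i in range(0, len(data), 2)]


def utf16_length(text: str) -> int:
    """Length of text measured in UTF-16 code units."""
    return len(utf16_units(text))


ascii_text = "abc"
emoji = "\U0001F600"  # one code point outside the Basic Multilingual Plane

print(utf16_length(ascii_text))            # 3
print(utf16_length(emoji))                 # 2 (a surrogate pair)
print([hex(u) for u in utf16_units(emoji)])  # ['0xd83d', '0xde00']
```

A character outside the Basic Multilingual Plane therefore counts as length 2, and index 1 of such a string points into the middle of a surrogate pair.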
single encoding
→ Has idea of "canonical equivalence"
→ Several different underlying representations
→ Can choose between them depending upon the application
→ S ̵͘ ͝ ͖͎̯̱͕̣̝ u ̵̊̈́̓̈́̾̎̔̄̃ ̢͚͓̞̺ ͜ p ̶͓͔͕̞͆͆̇̂̄ 㸅 ͍̘̰̮̝̪̖ e ̴ ̚ ͑̿ ̚ ̙̬̜͕͓͈̝ r ̷̛͠ ͐̈́͌̎̒́͑ ̕ ͖ ̴̔̎̇ ́͆͐̔̚ ̢̳̲͔ ͎ l ̸̀̎̿̚ ͉̥ a ̴̾̅̿́ ̢̪̤͇͎̣ ͔̠ r ̷̓̈́̓͑ ̡̥̯͈ ̨̜̬̘ g ̵͛̅ ͚̤̰ e ̸̡͍͚̞̩͇̱̘̭͇̝̦͕̃́́̀̈́̌̄̆́ ̸̽̈́̃͘ ̚ ̐ ͒͝ ̕ ̓̽ ̨͍͍̬͖̭̠͓ s ̷̓̔͌̌ ̡ ̞͍͕̻̰̥ t ̵̆͂̍͐͝ ͋ ͕̭̯ r ̷̖͎̟̻̯̗̥̓̅ i̴ ̠̝̑̚ ͇̼͎̖̝̮ n ̸̛͉ g ̵̀͛̓̒͗̓͌͆̑͘̚ ͓ s ̷̏́͛͒ ̡̙̻̪
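The points above about canonical equivalence and stacked combining marks can be sketched with Python's standard `unicodedata` module. This is an illustration under assumptions, not a statement about how any particular string class compares text. The helper `canonically_equivalent` is a hypothetical name, and NFD is just one of the normalization forms that could be used for the comparison.

```python
import unicodedata

composed = "\u00e9"      # "é" as a single precomposed code point
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT


def canonically_equivalent(a: str, b: str) -> bool:
    """Compare two strings after normalizing both to the same form (NFD here)."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


# Different underlying representations of the same text:
print(composed == decomposed)                        # False
print(canonically_equivalent(composed, decomposed))  # True

# Normalization converts between the representations:
print(unicodedata.normalize("NFC", decomposed) == composed)  # True
print(unicodedata.normalize("NFD", composed) == decomposed)  # True

# Stacking combining marks grows the string without adding base letters:
stacked = "e" + "\u0301" * 50
marks = sum(1 for ch in stacked if unicodedata.combining(ch) != 0)
print(len(stacked), marks)  # 51 50
```

The last lines show why decorated text like the line above can be "super large": a single base letter can carry an arbitrary number of combining marks, each of which is a separate code point.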