Developing applications to handle the natural languages and written scripts of the world, or even a small handful of them, is an impressively large task. Fortunately, Unicode provides tools to do just that. It’s more than just a character set; it’s a collection of standards for working with the world’s textual data. The problem is that Unicode itself is complex!
This talk will help make supporting Unicode easier by presenting best practices for your projects, whether they are CPAN modules, RESTful services, or web applications. We’ll briefly review Unicode and then dive into best practices for handling Unicode text in the following areas:
◦ User experience
◦ Collation (comparison and sorting)
◦ Input, output, and logging
◦ Security considerations
◦ Debugging
◦ Testing (unit tests and QA)
Presented at:
◦ 2013-06-19: Open Source Bridge 2013, Portland, OR
Speaker notes: http://opensourcebridge.org/wiki/2013/Unicode_Best_Practices
Example code in multiple languages: https://github.com/patch/unicode-programming