International Components for Unicode

Internationalisation gets tough when you have to implement every data representation (numbers, currencies, dates, collations and much more) yourself. Fear not: the International Components for Unicode, or ICU for short, give developers access to the formats and locale information they need. ICU is available for C, C++ and Java and, through the Intl extension, for PHP too. This talk shows how you can keep your projects or shops ready for every country.

Claudio Zizza

April 21, 2016

Transcript

  1. International Components for Unicode

  2. Claudio Zizza

    Developer Developer Developer (Currently PHP)
    PHP Snippets: php.budgegeria.de
    Twitter: @SenseException
  3. None
  4. None
  5. None
  6. Mars Climate Orbiter

  7. None
  8. Date representation es_ES: 21/4/16 en_US: 4/21/16

  9. ICU - International Components for Unicode icu-project.org

  10. ICU

    Open Source Project
    Unicode and Globalization support (C/C++/Java)
    Released in 1999
    Sponsored by IBM and others
    Current version: ICU 57.1
  11. Intl-Extension

  12. NumberFormatter

    1'000.45
    CHF 1'000.45
    CHF

    <?php
    $numberFormatter = new NumberFormatter('de_CH', NumberFormatter::DECIMAL);
    echo $numberFormatter->format(1000.45);

    $numberFormatter = new NumberFormatter('de_CH', NumberFormatter::CURRENCY);
    echo $numberFormatter->format(1000.45);
    echo $numberFormatter->getSymbol(NumberFormatter::CURRENCY_SYMBOL);
  13. Date & Time Formatter

    18 aprile 2016 21:51
    20 gennaio 2014 22:22

    <?php
    $dateFormatter = new IntlDateFormatter(
        'it_IT',
        IntlDateFormatter::LONG,
        IntlDateFormatter::SHORT
    );
    $date = new DateTime();
    echo $dateFormatter->format($date) . PHP_EOL;
    echo $dateFormatter->format(1390252923);
  14. MessageFormatter

    Am Sonntag, 17. April 2016 waren es 1.240.000 Besucher.
    Am Sonntag, 17. April 2016 waren es 1'240'000 Besucher.

    <?php
    $text = 'Am {dateval,date,full} waren es {visitor,number,integer} Besucher.';
    $msgDe = new MessageFormatter('de_DE', $text);
    $msgCh = new MessageFormatter('de_CH', $text);
    $args = array(
        'visitor' => 1240000,
        'dateval' => new DateTime(),
    );
    echo $msgDe->format($args);
    echo $msgCh->format($args);
  15. MessageFormatter

    Type      Style
    number    (none), integer, currency, percent, (styletext)
    date      (none), short, medium, long, full, (styletext)
    time      (none), short, medium, long, full, (styletext)
    spellout
    ordinal
    duration
  16. IntlCalendar

    1461093920108 gregorian false 2016 true

    <?php
    $calendar = IntlCalendar::createInstance('Europe/Berlin', 'de_DE');
    var_dump(
        $calendar->getTime(),
        $calendar->getType(),
        $calendar->isWeekend(),
        $calendar->get(IntlCalendar::FIELD_YEAR),
        $calendar->inDaylightTime()
    );
  17. Calendar information

    April
    Mo. Di. Mi. Do. Fr. Sa. So.
     28  29  30  31   1   2   3
      4   5   6   7   8   9  10
     11  12  13  14  15  16  17
     18  19  20  21  22  23  24
     25  26  27  28  29  30   1

    April
    Sun Mon Tue Wed Thu Fri Sat
     27  28  29  30  31   1   2
      3   4   5   6   7   8   9
     10  11  12  13  14  15  16
     17  18  19  20  21  22  23
     24  25  26  27  28  29  30
  18. TimeZone

    Central European Standard Time false true 3600000

    <?php
    $timezone = IntlTimeZone::createTimeZone('Europe/Berlin');
    $timezone2 = IntlTimeZone::createTimeZone('Europe/Paris');
    var_dump(
        $timezone->getDisplayName(),
        $timezone->hasSameRules($timezone2),
        $timezone->useDaylightTime(),
        $timezone->getRawOffset()
    );
  19. Locale

    en_US_POSIX
    de_DE

    <?php
    var_dump(Locale::getDefault());
    Locale::setDefault('de_DE');
    var_dump(Locale::getDefault());
  20. Locale's Region

    Deutschland Germania Germany

    <?php
    var_dump(
        Locale::getDisplayRegion('de_DE', 'de'),
        Locale::getDisplayRegion('de_DE', 'it'),
        Locale::getDisplayRegion('de_DE', 'en')
    );
  21. Locale's Language

    Deutsch tedesco German

    <?php
    var_dump(
        Locale::getDisplayLanguage('de_DE', 'de'),
        Locale::getDisplayLanguage('de_DE', 'it'),
        Locale::getDisplayLanguage('de_DE', 'en')
    );
  22. Spoofchecker

    false false true true

    <?php
    $spoof = new Spoofchecker();
    // are strings visually confusable?
    var_dump(
        $spoof->areConfusable("Körner", "Körner\0"),
        $spoof->areConfusable("Körner", "Korner"),
        $spoof->areConfusable('lol', '1o1'),
        $spoof->areConfusable('lol', 'IoI')
    );
  23. Character Encoding

    corazón
    corazón

    <?php
    // "\xF3" is ó encoded as a single Latin-1 byte
    $uconv = new UConverter('UTF-8', 'latin-1');
    echo $uconv->convert("coraz\xF3n");
    echo UConverter::transcode("coraz\xF3n", 'UTF-8', 'latin-1');
  24. IntlBreakIterator

    Si contano i danni.
    ----- next -----
    A Pescara, 1.500 sfollati per l'esondazione del Fosso Vallelunga.
    ----- next -----
    Dall'inizio dell'anno l'agricoltura ha subito un miliardo di euro di danni.
    ----- next -----

    <?php
    $text = "Si contano i danni. A Pescara, "
        . "1.500 sfollati per l'esondazione del Fosso Vallelunga. "
        . "Dall'inizio dell'anno l'agricoltura ha subito un miliardo "
        . "di euro di danni.";
    $i = IntlBreakIterator::createSentenceInstance('it_IT');
    $i->setText($text);
    foreach ($i->getPartsIterator() as $sentence) {
        echo $sentence . PHP_EOL . '----- next -----' . PHP_EOL;
    }
  25. Sorting

    sort = A,a,g,j,z,ß,ä

    <?php
    $array = array('a', 'g', 'A', 'ß', 'ä', 'j', 'z');
    sort($array);
  26. Sorting with Collator

    Collator::sort = a,A,ä,g,j,ß,z

    <?php
    $array = array('a', 'g', 'A', 'ß', 'ä', 'j', 'z');
    $collator = new Collator('de_DE');
    $collator->setAttribute(Collator::CASE_FIRST, Collator::LOWER_FIRST);
    $collator->sort($array);
  27. Transliterator

    kon'nichiha

    <?php
    $trans = Transliterator::create('Any-Latin');
    echo $trans->transliterate('こんにちは');
  28. Transliterator

    array(286) {
      [0]=>
      string(11) "ASCII-Latin"
      [1]=>
      string(11) "Latin-Arabic"
      ...

    <?php
    var_dump(Transliterator::listIDs());
  29. Custom Resources

    DEM
    Marco Tedesco

    <?php
    // returns null on error
    $curr = new ResourceBundle('it', __DIR__ . '/resources');
    // get old german currency
    $demCurrency = $curr->get('Currencies')->get('DEM');
    echo $demCurrency->get(0) . PHP_EOL;
    echo $demCurrency->get(1) . PHP_EOL;
  30. Custom Resources - it.txt

    it {
        Currencies {
            ADP {
                "ADP",
                "Peseta Andorrana",
            }
            ....
            DEM {
                "DEM",
                "Marco Tedesco",
            }
  31. Convert Resources for ResourceBundle

    genrb -d /path/to/resources/currency/it.txt

    Created resource: it.res
  32. genrb http://linux.die.net/man/1/genrb

  33. Characters

    Ä č F

    <?php
    echo IntlChar::toupper('ä');
    echo IntlChar::tolower('Č');
    echo IntlChar::totitle('f');
  34. Characters

    true false false true

    <?php
    var_dump(IntlChar::isUUppercase('A'));
    var_dump(IntlChar::isULowercase('A'));
    var_dump(IntlChar::isdigit(3));
    var_dump(IntlChar::isdigit('3'));
  35. Characters

    LATIN CAPITAL LETTER U WITH DIAERESIS

    <?php
    echo IntlChar::charName('Ü');
  36. and Intl

  37. Thank you Claudio Zizza php.budgegeria.de @SenseException
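
The "Date representation" slide contrasts es_ES (21/4/16) with en_US (4/21/16) but shows no code for it. The following is a minimal sketch of how that comparison might be reproduced with the Intl extension. It is not taken from the talk: the fixed date, the UTC time zone and the variable names are assumptions chosen for illustration, and the exact strings depend on the installed ICU version.

```php
<?php
// Illustrative sketch, not from the slides: format one fixed date
// in the short style of two different locales.
$date = new DateTime('2016-04-21 12:00:00', new DateTimeZone('UTC'));

$formatted = array();
foreach (array('es_ES', 'en_US') as $locale) {
    $formatter = new IntlDateFormatter(
        $locale,
        IntlDateFormatter::SHORT, // short date style
        IntlDateFormatter::NONE,  // leave out the time part
        'UTC'                     // assumed time zone, keeps the day stable
    );
    $formatted[$locale] = $formatter->format($date);
}

foreach ($formatted as $locale => $text) {
    echo $locale . ': ' . $text . PHP_EOL;
}
```

Only the locale string changes between the two runs; the order of day and month comes from ICU's locale data rather than from a hand-written pattern.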