Remove AS::Mb::Unicode::UnicodeDatabase
Fumiaki MATSUSHIMA
August 05, 2017
Presentation slides for ぎんざRuby会議01 (Ginza Ruby Kaigi 01)
https://ginzarb.github.io/kaigi01/
Transcript
@mtsmfm I wanted to remove ActiveSupport::Multibyte::Unicode::UnicodeDatabase
Fumiaki MATSUSHIMA GitHub, Twitter @mtsmfm Web Developer
https://www.quipper.com/
https://ninirb.github.io
https://www.meetup.com/ja-JP/GraphQL-Tokyo/
http://rubykaigi.org/2017/speakers
http://contributors.rubyonrails.org/
Do you know which file is the largest in Rails?
$ find vendor/bundle/gems/acti* -type f -exec du -h -a {} + | sort -h -r | head -n 10
1.1M vendor/bundle/gems/activesupport-5.1.2/lib/active_support/values/unicode_tables.dat
104K vendor/bundle/gems/actionview-5.1.2/lib/action_view/helpers/form_helper.rb
100K vendor/bundle/gems/activerecord-5.1.2/lib/active_record/associations.rb
76K vendor/bundle/gems/actionpack-5.1.2/lib/action_dispatch/routing/mapper.rb
60K vendor/bundle/gems/actionview-5.1.2/lib/action_view/helpers/date_helper.rb
52K vendor/bundle/gems/activerecord-5.1.2/lib/active_record/connection_adapters/abstract/schema_statements.rb
44K vendor/bundle/gems/activerecord-5.1.2/lib/active_record/migration.rb
44K vendor/bundle/gems/actionview-5.1.2/lib/action_view/helpers/form_tag_helper.rb
44K vendor/bundle/gems/actionview-5.1.2/lib/action_view/helpers/form_options_helper.rb
40K vendor/bundle/gems/activerecord-5.1.2/lib/active_record/relation/query_methods.rb

1.1M!
active_support/values/unicode_tables.dat
https://github.com/rails/rails/blob/16f2b2044eaaa54b7bc205ef9af1689a152b2fdf/activesupport/lib/active_support/multibyte/unicode.rb
The largest file in Rails ↓ the .dat file of ActiveSupport::Multibyte::Unicode::UnicodeDatabase
https://github.com/rails/rails/pull/26743
@mtsmfm I wanted to remove ActiveSupport::Multibyte::Unicode::UnicodeDatabase
http://agile.esm.co.jp/news/2016-04-08-rails-study-session.html
In-house Rails study session ↓ OSS patch party
https://speakerdeck.com/a_matsuda/3x-rails
https://speakerdeck.com/a_matsuda/3x-rails?slide=156
What can AS::Mb::Unicode do in the first place?
I will talk based on the Rails v5.0.0.1 codebase, as it was when I opened the PR (it has not changed much since). That was shortly before Ruby 2.4 was released.
- Normalize
- Case mapping
- Pack/unpack grapheme
- Tidy bytes (does not use AS::Mb::Unicode::UnicodeDatabase)

- Normalize
- Case mapping
- Pack/unpack grapheme
What is Unicode normalization?
Decompose: ‘が’ → [‘か’, ‘゛’]
Compose: [‘か’, ‘゛’] → ‘が’
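As a small illustration of the decompose and compose steps above, here is a sketch using Ruby's built-in String#unicode_normalize (available since Ruby 2.2). It assumes the combining mark involved is U+3099 (the combining voiced sound mark), which is what NFD produces; the slide shows the spacing form ‘゛’ for readability.

```ruby
# Decompose and recompose 'が' with Ruby's built-in normalization.
composed = "\u304C" # 'が' as a single code point

# NFD splits it into a base character plus a combining mark.
decomposed = composed.unicode_normalize(:nfd)
p decomposed.codepoints.map { |c| format("U+%04X", c) }
# => ["U+304B", "U+3099"]  ('か' followed by the combining voiced sound mark)

# NFC puts the pieces back together.
recomposed = decomposed.unicode_normalize(:nfc)
p recomposed == composed
# => true
```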
Normalize-related methods
- AS::Mb::Unicode#normalize
- AS::Mb::Unicode#decompose
- AS::Mb::Unicode#compose
- AS::Mb::Unicode#reorder_characters
Unicode normalization
- NFD
- NFC
- NFKD
- NFKC

NF = Normalization Form, D = Decompose, C = Compose

“In NFKC and NFKD, a K is used to stand for compatibility to avoid confusion with the C standing for composition.” http://unicode.org/reports/tr15/

K = (C)ompatibility (compatibility equivalence)
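Equivalence in Unicode normalization
- Canonical equivalence (the forms without K): reversible
- Compatibility equivalence (the forms with K): looser; not reversible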
㈱
Canonical equivalence: ‘㈱’ != [‘(’, ‘株’, ‘)’]
Compatibility equivalence: ‘㈱’ == [‘(’, ‘株’, ‘)’]
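The two kinds of equivalence can be checked by normalizing both sides and comparing. The sketch below does that with String#unicode_normalize; the helper names canonically_equivalent? and compatibility_equivalent? are hypothetical and exist only for this illustration.

```ruby
# Hypothetical helpers: two strings are treated as equivalent when they
# normalize to the same sequence of code points.
def canonically_equivalent?(a, b)
  a.unicode_normalize(:nfd) == b.unicode_normalize(:nfd)
end

def compatibility_equivalent?(a, b)
  a.unicode_normalize(:nfkd) == b.unicode_normalize(:nfkd)
end

p canonically_equivalent?("㈱", "(株)")   # => false
p compatibility_equivalent?("㈱", "(株)") # => true

# Compatibility normalization is lossy: once '㈱' has become '(株)',
# composing again does not bring the original character back.
flattened = "㈱".unicode_normalize(:nfkc)
p flattened.unicode_normalize(:nfc) == "㈱" # => false
```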
Normalize-related methods
- AS::Mb::Unicode#normalize
- AS::Mb::Unicode#decompose
- AS::Mb::Unicode#compose
- AS::Mb::Unicode#reorder_characters
https://github.com/rails/rails/blob/v5.0.0.1/activesupport/lib/active_support/multibyte/unicode.rb#L285-L301
https://github.com/rails/rails/blob/v5.0.0.1/activesupport/lib/active_support/multibyte/unicode.rb#L159-L177
https://github.com/rails/rails/blob/v5.0.0.1/activesupport/lib/active_support/multibyte/unicode.rb#L180-L236
https://github.com/rails/rails/blob/v5.0.0.1/activesupport/lib/active_support/multibyte/unicode.rb#L143-L136
Helper methods used by #normalize (why are they public...?)
What about Ruby itself?
https://docs.ruby-lang.org/ja/search/
https://docs.ruby-lang.org/ja/search/query:unicode/query:normalize/
Found it!
String#unicode_normalize
[1] pry(main)> '株'.codepoints
=> [26666]
[2] pry(main)> '㈱'.codepoints
=> [12849]
[3] pry(main)> '㈱'.unicode_normalize(:nfc).codepoints
=> [12849]
[4] pry(main)> '㈱'.unicode_normalize(:nfd).codepoints
=> [12849]
[5] pry(main)> '㈱'.unicode_normalize(:nfkc).codepoints
=> [40, 26666, 41]
[6] pry(main)> '㈱'.unicode_normalize(:nfkd).codepoints
=> [40, 26666, 41]
https://github.com/rails/rails/pull/26743/files?diff=split
Ruby is handy!
- Normalize ✔
- Case mapping
- Pack/unpack grapheme
‘A’ ‘a’
‘Ä’ ‘ä’
Case mapping-related methods
- AS::Mb::Unicode#downcase
- AS::Mb::Unicode#upcase
- AS::Mb::Unicode#swapcase
https://github.com/rails/rails/blob/v5.0.0.1/activesupport/lib/active_support/multibyte/unicode.rb#L303-L313
https://github.com/rails/rails/blob/v5.0.0.1/activesupport/lib/active_support/multibyte/unicode.rb#L392-L402
What about Ruby itself?
http://rubykaigi.org/2016/presentations/duerst.html
https://www.ruby-lang.org/en/news/2016/09/08/ruby-2-4-0-preview2-released/
$ docker run -e LANG=C.UTF-8 --rm ruby:2.3 \
    ruby -e "p 'Ä'.downcase == 'ä'"
false
$ docker run -e LANG=C.UTF-8 --rm ruby:2.4 \
    ruby -e "p 'Ä'.downcase == 'ä'"
true
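The same behaviour can be tried directly in a script. This is a minimal sketch that assumes Ruby 2.4 or later, where String#downcase, #upcase and #swapcase handle non-ASCII characters; the :ascii option shown last restricts the mapping to ASCII letters only.

```ruby
# Non-ASCII case mapping with plain String methods (assumes Ruby 2.4+).
p "Ä".downcase  # => "ä"
p "ä".upcase    # => "Ä"
p "Äb".swapcase # => "äB"

# Passing :ascii limits case mapping to A-Z / a-z, so 'Ä' is left untouched.
p "Ä".downcase(:ascii) # => "Ä"
```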
https://github.com/rails/rails/pull/26743/files?diff=split
Ruby is handy!!
https://www.sw.it.aoyama.ac.jp/2016/pub/RubyKaigi/
https://bugs.ruby-lang.org/issues/10084
- Normalize ✔
- Case mapping ✔
- Pack/unpack grapheme
What is a grapheme?
Grapheme ≒ the unit of a written character   あ が ゛
ぎんざ
[‘き’, ‘゛’, ‘ん’, ‘ざ’]
Split by character: [[‘き’], [‘゛’], [‘ん’], [‘ざ’]]
Split by grapheme: [[‘き’, ‘゛’], [‘ん’], [‘ざ’]]
Pack/unpack grapheme-related methods
- AS::Mb::Unicode#pack_graphemes
- AS::Mb::Unicode#unpack_graphemes
https://github.com/rails/rails/blob/v5.0.0.1/activesupport/lib/active_support/multibyte/unicode.rb#L138-L140
https://github.com/rails/rails/blob/v5.0.0.1/activesupport/lib/active_support/multibyte/unicode.rb#L80-L133
What about Ruby itself?
https://docs.ruby-lang.org/ja/search/
https://docs.ruby-lang.org/ja/search/query:grapheme
/\X/
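To make the difference between the two ways of splitting concrete, here is a sketch that splits the example word with String#chars and with the \X regular expression. It assumes the decomposed spelling of ‘ぎ’ (‘き’ plus the combining mark U+3099) and a Ruby version recent enough for \X to follow the Unicode grapheme rules.

```ruby
# 'ぎんざ' written with a decomposed first syllable: 'き' + U+3099.
text = "き\u3099んざ"

# Splitting by character (code point) separates the combining mark.
by_character = text.chars
p by_character.length # => 4

# \X matches one extended grapheme cluster, keeping the mark with its base.
by_grapheme = text.scan(/\X/)
p by_grapheme.length # => 3
p by_grapheme.first.codepoints.map { |c| format("U+%04X", c) }
# => ["U+304D", "U+3099"]
```

Ruby 2.5 and later also provide String#grapheme_clusters, which returns the same kind of split without a regular expression.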
https://github.com/rails/rails/pull/26743/files
Ruby's built-in features are handy!!!
...or so I thought, but the tests do not pass
https://github.com/k-takata/Onigmo/issues/46
https://bugs.ruby-lang.org/issues/12831
The fix landed in Ruby 2.4
https://github.com/rails/rails/pull/26743/files
- Normalize ✔
- Case mapping ✔
- Pack/unpack grapheme ✔
https://github.com/rails/rails/pull/26743
Why can't it be merged?
Rails 5 supports Ruby 2.2.2 and later
- Normalize
  - Since Ruby 2.2
- Case mapping
  - Since Ruby 2.4
- Pack/unpack grapheme
  - Since Ruby 2.0
  - However, the Unicode tests only pass from 2.4
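The version constraints above can be written down as a small lookup. This is only a sketch of the reasoning, not code from Rails or from the pull request: the constant NATIVE_SINCE and the method native_support are hypothetical names, and the grapheme entry uses 2.4 because that is the first version where the Unicode tests pass.

```ruby
# Hypothetical table: the first Ruby version in which each feature can be
# delegated to Ruby itself, based on the list above.
NATIVE_SINCE = {
  normalize: "2.2",
  case_mapping: "2.4",
  graphemes: "2.4" # /\X/ exists since 2.0, but is only reliable from 2.4
}.freeze

# Returns a hash of feature => true/false for the given Ruby version.
def native_support(ruby_version = RUBY_VERSION)
  current = Gem::Version.new(ruby_version)
  NATIVE_SINCE.map { |feature, since| [feature, current >= Gem::Version.new(since)] }.to_h
end

# Rails 5 must still run on Ruby 2.2.2, where only normalization qualifies.
p native_support("2.2.2")
p native_support("2.4.0")
```

With a minimum of Ruby 2.2.2 only one of the three features is covered, which is why the UnicodeDatabase cannot be dropped until the minimum Ruby version is raised.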
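If it gets merged, it will be when the minimum Ruby version is raised ≒ Rails 6?
You do not have to wait for Rails; you can use these features in your own development today
Maybe Ruby itself can already do that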
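Summary
- With Rails 6, it may become possible to remove UnicodeDatabase, bringing us a step closer to 3x Rails
- Thanks to the work of many people, more and more of what used to require gems can now be done by Ruby itself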
Credits
Background pattern from subtlepatterns.com
Emoji artwork provided by Emoji One