Regular expressions basics/正規表現の基本

Kishikawa Katsumi

July 26, 2022

Transcript

 1. Regular expressions basics


 2. Swift Regex


 3. https://swiftregex.com/


 4. What is a regular expression?
  • A general-purpose notation for describing a set of strings (a pattern)

  • [bc]ook matches book or cook

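The "set of strings" view can be checked directly. The following is a minimal sketch using Python's standard `re` module; the choice of Python and the helper name `in_pattern_set` are assumptions made for illustration and do not come from the deck.

```python
import re

# [bc]ook describes the set {"book", "cook"}.
pattern = re.compile(r"[bc]ook")


def in_pattern_set(text: str) -> bool:
    """Return True when the whole string belongs to the set the pattern describes."""
    return pattern.fullmatch(text) is not None


# Membership test for three candidate strings.
results = {word: in_pattern_set(word) for word in ["book", "cook", "look"]}
# results == {"book": True, "cook": True, "look": False}
```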

 5. Literal Characters
  • a

  • Jack is a boy,

  • cat

  • About cats and dogs


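 6. Special Characters (Meta Characters)
  • Twelve characters have a special meaning; to treat one as a literal it must be escaped (e.g. 1\+1=2)

  • Backslash \

  • Caret ^

  • Dollar sign $

  • Dot (period) .

  • Pipe |

  • Question mark ?

  • Asterisk *

  • Plus +

  • Opening parenthesis (

  • Closing parenthesis )

  • Opening square bracket [

  • Opening curly brace {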

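The effect of escaping can be seen with the slide's own example, 1\+1=2. This is a small sketch in Python's `re` module (Python is an assumption here, not the deck's language); `re.escape` is the standard-library helper that escapes literal text.

```python
import re

text = "1+1=2"

# Unescaped, "+" is a quantifier: the pattern means one or more "1", then "1=2".
# The literal "+" in the text is never matched, so the search fails.
unescaped = re.search(r"1+1=2", text)

# Escaped, "\+" matches a literal plus sign.
escaped = re.search(r"1\+1=2", text)

# re.escape builds an escaped pattern from arbitrary literal text.
auto = re.fullmatch(re.escape(text), text)

# The unescaped pattern does match strings made of repeated "1" characters.
repeated = re.search(r"1+1=2", "111=2")
```

Here `unescaped` is `None`, while `escaped`, `auto`, and `repeated` are match objects.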

 7. Non-Printable Characters (Control Characters, Escape Sequences)
  • \t

  • Matches a tab

  • \n

  • Matches a line feed (newline)


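 8. Character Classes (Character Sets)
  • Matches one character out of several candidates

  • To match a or e, write [ae]

  • (Example) gr[ae]y

  • Matches gray or grey

  • A character class matches exactly one character

  • The order of the characters inside the brackets does not matter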

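A short sketch of the two properties above, that a class consumes exactly one character and that the order inside the brackets is irrelevant. It uses Python's `re` module as an assumed host language; the sample text is made up for illustration.

```python
import re

text = "gray, grey, graey, gry"

# The class [ae] consumes exactly one character, so "graey" (two) and "gry" (none) fail.
matches = re.findall(r"gr[ae]y", text)

# Reordering the characters inside the brackets gives the same result.
reordered = re.findall(r"gr[ea]y", text)
# matches == reordered == ['gray', 'grey']
```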

 9. Character Classes (Character Sets)
  • Inside a character class, a hyphen specifies a range

  • [0-9]

  • Matches a single digit from 0 to 9

  • [0-9a-fA-F]

  • Matches a single hexadecimal digit, regardless of case

  • Negated Character Classes

  • [^0-9\r\n]

  • Matches any character that is neither a digit nor a line break

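Ranges and negation can be tried out as follows. This is an illustrative sketch in Python's `re` module (an assumed language); the input strings are invented for the example.

```python
import re

# [0-9a-fA-F] matches one hexadecimal digit in either case; "x" and "g" are skipped.
hex_digits = re.findall(r"[0-9a-fA-F]", "0x1F g")
# hex_digits == ['0', '1', 'F']

# [^0-9\r\n] matches any single character that is not a digit, CR, or LF.
non_digits = re.findall(r"[^0-9\r\n]", "a1\nb2")
# non_digits == ['a', 'b']
```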

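 10. Shorthand Character Classes
  • Predefined notations that make frequently used character classes quick to write

  • \d is shorthand for [0-9]

  • In environments that support Unicode, it also matches digits beyond ASCII, such as kanji numerals and circled numbers (exact coverage depends on the engine)
  • \w "word character": same as [A-Za-z0-9_] (note that the underscore is included)

  • In environments that support Unicode, it matches a wide variety of characters
  • \s "whitespace character": matches whitespace, [ \t\r\n\f]

  • In environments that support Unicode, it matches every character in the Unicode "separator" category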

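The shorthands and their Unicode sensitivity can be observed in Python's `re` module, used here as an assumed example engine. In Python, `\d` on `str` patterns covers Unicode decimal digits (category Nd) unless `re.ASCII` is given; other engines, including the ones the deck targets, may draw the line differently.

```python
import re

sample = "id_42 \t price\n"

digits = re.findall(r"\d", sample)          # ['4', '2']
words = re.findall(r"\w+", sample)          # ['id_42', 'price'] (underscore is a word character)
fields = re.split(r"\s+", sample.strip())   # ['id_42', 'price']

# U+FF13 is FULLWIDTH DIGIT THREE, a non-ASCII decimal digit.
wide = "7\uff13"
unicode_digits = re.findall(r"\d", wide)                 # both characters match
ascii_digits = re.findall(r"\d", wide, flags=re.ASCII)   # only '7' matches
```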

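 11. The Dot Matches (Almost) Any Character
  • Matches any single character except a line break

  • With "dot matches all" or "single line" mode (the name varies by programming language and regex engine), it matches any single character, including line breaks

  • gr.y matches gray, grey, gr%y, and so on

  • The dot matches almost anything, so do not overuse it

  • Use a character class or negated character class instead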

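A sketch of the dot's default behavior, the "dot matches all" mode, and the narrower character-class alternative. Python's `re` module is assumed here; its name for the mode is `re.DOTALL`.

```python
import re

text = "gray grey gr%y gr\ny"

# By default the dot does not match a line break, so "gr\ny" is skipped.
default_matches = re.findall(r"gr.y", text)
# default_matches == ['gray', 'grey', 'gr%y']

# In "dot matches all" mode the dot also matches the newline.
dotall_matches = re.findall(r"gr.y", text, flags=re.DOTALL)
# dotall_matches == ['gray', 'grey', 'gr%y', 'gr\ny']

# A character class states the intent more precisely than the dot.
narrow_matches = re.findall(r"gr[ae]y", text)
# narrow_matches == ['gray', 'grey']
```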

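 12. Anchors
  • Match a position rather than a character

  • ^

  • Matches at the start of the string

  • $

  • Matches at the end of the string

  • Most regex engines have a "multi-line" mode in which ^ also matches right after a line break and $ right before one

  • \b

  • Matches at a word boundary

  • A word boundary is a position between a character that \w can match and one that it cannot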

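The anchors can be compared side by side on a three-line string. This sketch assumes Python's `re` module, where multi-line mode is spelled `re.MULTILINE`; the sample text is invented.

```python
import re

text = "cat\ncatalog\nbobcat"

# Without multi-line mode, ^ matches only at the very start of the string.
start_only = re.findall(r"^cat", text)
# start_only == ['cat']

# With multi-line mode, ^ also matches after each line break ("cat" and "catalog").
line_starts = re.findall(r"^cat", text, flags=re.MULTILINE)
# line_starts == ['cat', 'cat']

# With multi-line mode, $ also matches before each line break ("cat" and "bobcat").
line_ends = re.findall(r"cat$", text, flags=re.MULTILINE)
# line_ends == ['cat', 'cat']

# \b requires a word boundary on both sides, so only the standalone word matches.
whole_words = re.findall(r"\bcat\b", text)
# whole_words == ['cat']
```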

 13. Alternation
  • Logical OR

  • cat|dog

  • About cats and dogs

  • cat|dog|mouse|fish

  • Any number of alternatives can be chained

  • cat|dog food

  • Matches cat or dog food

  • To match cat food or dog food, group the alternation: (cat|dog) food

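The precedence of alternation is easy to get wrong, so here is a small sketch of the difference grouping makes. It uses Python's `re` module as an assumed host; `fullmatch` is used so that each pattern must account for the whole input.

```python
import re

found = re.findall(r"cat|dog", "About cats and dogs")
# found == ['cat', 'dog']

# Alternation binds loosest: this pattern reads as "cat" OR "dog food".
loose = re.compile(r"cat|dog food")
# Grouping limits the alternation to the first word.
grouped = re.compile(r"(cat|dog) food")

loose_cat_food = loose.fullmatch("cat food")        # None
loose_dog_food = loose.fullmatch("dog food")        # match
grouped_cat_food = grouped.fullmatch("cat food")    # match, group(1) == 'cat'
```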

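 14. Repetition
  • Question mark "?"

  • Optional (zero or one occurrence)

  • colou?r matches color or colour

  • Asterisk "*"

  • Zero or more repetitions

  • <[A-Za-z][A-Za-z0-9]*>

  • Matches an HTML tag without attributes

  • Plus "+"

  • One or more repetitions

  • Curly braces "{n,m}"

  • A specified number of repetitions

  • \b[1-9][0-9]{3}\b

  • Matches a number from 1000 to 9999

  • \b[1-9][0-9]{2,4}\b

  • Matches a number from 100 to 99999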

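Each quantifier can be exercised against inputs that sit just inside and just outside its range. This is a sketch in Python's `re` module (an assumed language); the numeric patterns are written with `\b` on both sides.

```python
import re

optional = re.findall(r"colou?r", "color colour colouur")
# optional == ['color', 'colour']

# "*" allows zero or more trailing characters in the tag name; attributes are not allowed.
tags = re.findall(r"<[A-Za-z][A-Za-z0-9]*>", "<p><h1><a href='#'>")
# tags == ['<p>', '<h1>']

# {3} means exactly three more digits: four-digit numbers only.
four_digits = re.findall(r"\b[1-9][0-9]{3}\b", "999 1000 9999 10000")
# four_digits == ['1000', '9999']

# {2,4} means two to four more digits: three to five digits in total.
three_to_five = re.findall(r"\b[1-9][0-9]{2,4}\b", "99 100 99999 100000")
# three_to_five == ['100', '99999']
```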

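 15. Grouping and Capturing
  • Enclosing part of a pattern in parentheses groups it

  • A quantifier can be applied to a group

  • Set(Value)?

  • Matches Set or SetValue

  • Plain parentheses create a capturing group

  • When Set(Value)? matches SetValue, group 1 holds Value

  • If the capture is not needed, Set(?:Value)? creates a non-capturing group

  • Take care not to confuse the question mark right after the opening parenthesis with the question mark quantifier (zero or one repetition)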

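The difference between capturing and non-capturing groups shows up in what the match object exposes. A minimal sketch, assuming Python's `re` module, where group 1 is read with `group(1)` and the number of capturing groups with the `groups` attribute of a compiled pattern:

```python
import re

capturing = re.compile(r"Set(Value)?")
non_capturing = re.compile(r"Set(?:Value)?")

with_value = capturing.fullmatch("SetValue")
without_value = capturing.fullmatch("Set")

value = with_value.group(1)        # 'Value'
missing = without_value.group(1)   # None: the optional group did not take part in the match

# Number of capturing groups in each pattern.
group_counts = (capturing.groups, non_capturing.groups)   # (1, 0)

# Both patterns accept the same strings; only the captures differ.
same_language = non_capturing.fullmatch("SetValue") is not None
```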

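 16. Backreferences
  • Match the same text that a capturing group captured (matched)

  • The text matched by a capturing group can be reused

  • <([A-Z][A-Z0-9]*)\b[^>]*>.*?</\1>

  • Matches an HTML element (the opening tag name matched by the capturing group is reused in the closing tag)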

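A sketch of a backreference rejecting mismatched tags. It assumes Python's `re` module and assumes the slide's pattern ends with the closing-tag form `</\1>`; the sample markup is invented.

```python
import re

# \1 must repeat exactly what group 1 captured (the opening tag name).
element = re.compile(r"<([A-Z][A-Z0-9]*)\b[^>]*>.*?</\1>")

text = "<EM>first</EM> <B>second</I>"

# Only the element whose closing tag repeats the opening tag name matches.
matches = [m.group(0) for m in element.finditer(text)]
# matches == ['<EM>first</EM>']

tag_names = [m.group(1) for m in element.finditer(text)]
# tag_names == ['EM']
```

The pattern accepts uppercase tag names only, as written on the slide; a case-insensitive flag would be needed for lowercase markup.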

 17. Named Groups and Backreferences
  • Managing captures by number is tedious, and the numbers shift when groups are added or removed, so groups can be given names

  • Syntax (named group)

  • (?P<name>group)

  • Syntax (backreference)

  • (?P=name)

  • <(?P<tag>[A-Z][A-Z0-9]*)\b[^>]*>.*?</(?P=tag)>

  • Matches an HTML element (same as <([A-Z][A-Z0-9]*)\b[^>]*>.*?</\1>)

  • Syntax (named capture, .NET)

  • (?<name>group) or (?'name'group)

  • Syntax (reference by name, .NET)

  • \k<name> or \k'name'

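The `(?P<name>...)` and `(?P=name)` forms are the ones Python's `re` module uses, so they can be tried there directly. A small sketch under that assumption, with the group name `tag` and invented sample markup:

```python
import re

# (?P<tag>...) names the group; (?P=tag) requires the same text again.
element = re.compile(r"<(?P<tag>[A-Z][A-Z0-9]*)\b[^>]*>.*?</(?P=tag)>")

m = element.search("<H1 class='x'>Title</H1> <B>unclosed</I>")

whole = m.group(0)          # "<H1 class='x'>Title</H1>"
tag = m.group("tag")        # 'H1'
by_number = m.group(1)      # named groups are still numbered, so this is also 'H1'
```

Referring to the group by name keeps the pattern valid if another group is later inserted in front of it.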

 18. Lookaround (Lookahead/Lookbehind)
  • Special groups that, like anchors, constrain the position at which a match may occur

  • (Example) \d+(?=€)

  • Matches an integer that is followed by "€"

  • Matches the 30 in "1 turkey costs 30€"

  • Syntax (positive lookahead)

  • X(?=Y)

  • Syntax (negative lookahead)

  • X(?!Y)

  • Syntax (positive lookbehind)

  • (?<=Y)X

  • Syntax (negative lookbehind)

  • (?<!Y)X

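All four lookaround forms can be run against one sentence. This sketch assumes Python's `re` module, where negative lookbehind is written `(?<!...)`; the sample text extends the slide's turkey example with invented dollar and coin amounts. The `\b` in the negative patterns is an addition here: without it, backtracking would let `\d+` match a shorter run of digits and sidestep the assertion.

```python
import re

text = "1 turkey costs 30€, shipping $5 or 7 coins"

# Positive lookahead: digits that are followed by "€" (the "€" is not part of the match).
followed_by_euro = re.findall(r"\d+(?=€)", text)
# followed_by_euro == ['30']

# Negative lookahead: whole numbers that are not followed by "€".
not_followed_by_euro = re.findall(r"\b\d+\b(?!€)", text)
# not_followed_by_euro == ['1', '5', '7']

# Positive lookbehind: digits that are preceded by "$".
preceded_by_dollar = re.findall(r"(?<=\$)\d+", text)
# preceded_by_dollar == ['5']

# Negative lookbehind: whole numbers that are not preceded by "$".
not_preceded_by_dollar = re.findall(r"(?<!\$)\b\d+", text)
# not_preceded_by_dollar == ['1', '30', '7']
```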

 19. References
  • Regular-Expressions.info

  https://www.regular-expressions.info/

  • Swift Regex

  https://swiftregex.com/
