Slide 1

Slide 1 text

RE: AWK @wtnabe Kanazawa.rb meetup #18 2014-02-15 (Sat) at IT Plaza MUSASHI

Slide 2

Slide 2 text

Abstract

Slide 3

Slide 3 text

- Regular Expression Basics
- AWK Basics

Slide 4

Slide 4 text

Regular Expression Basics

Slide 5

Slide 5 text

Difficulties of RE
- Many different implementations
- Complex character sequences

Slide 6

Slide 6 text

Rough Classification
- POSIX (Basic / Extended)
- PCRE (Perl Compatible Regular Expressions): PHP, Apache, GNU Grep, ...
- GNU (Basic / Extended)
- and more

Slide 7

Slide 7 text

Ignore
- Minor tools' own original expressions
- What does "grep" in your tools mean?

Slide 8

Slide 8 text

Basic Syntax
- Literal Character
- Meta Character / Escape Sequence
- Character List / Character Class
- Grouping and Back reference

Slide 9

Slide 9 text

Elementary Operators
    . * + ? |
    ()  grouping and back reference

Slide 10

Slide 10 text

Elementary Operators
    ^ $ ( \A \z )
    \r \n \s \xXX  escape sequence
    [] [^]  character list

Slide 11

Slide 11 text

RE Literal and Language Syntax
- Syntax such as escape sequences CONFLICTS with the parent language
- Some languages have RE literals: AWK, Perl, Ruby, JavaScript, ...
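A minimal sketch of the escape-sequence conflict, using awk itself. The sample input lines are made up for illustration. In a regex literal the backslash goes straight to the regex engine; in a string used as a dynamic regex, awk processes string escapes first, so the backslash has to be doubled.

```shell
# Regex literal: /\./ reaches the regex engine as an escaped (literal) dot.
literal=$(printf 'kanazawa.rb\nkanazawaXrb\n' |
  awk '/kanazawa\.rb/ { n++ } END { print n + 0 }')

# Dynamic regex in a string: "\\." becomes \. after string processing,
# and only then is it handed to the regex engine.
dynamic=$(printf 'kanazawa.rb\nkanazawaXrb\n' |
  awk 'BEGIN { re = "kanazawa\\.rb" } $0 ~ re { n++ } END { print n + 0 }')

echo "literal=$literal dynamic=$dynamic"
```

Both forms match only the first line, so each counter ends at 1.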

Slide 12

Slide 12 text

Examples
    /^[0-9]+(-[0-9]+)+$/
    /\s[Kk]anazawa\.rb\s/
    %r{\bhttps?://.*\b}
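As a rough illustration, the first example can be tried directly as an awk pattern. The input lines below are invented test data. The other two examples rely on \s and \b, which are not part of POSIX regular expressions, so they are left out of this sketch.

```shell
# Only lines made of digit groups joined by hyphens should pass.
matched=$(printf '076-123-4567\n0761234567\n076-\nabc-123\n' |
  awk '/^[0-9]+(-[0-9]+)+$/ { print }')

echo "$matched"
```

Only `076-123-4567` survives: the second line has no hyphen, the third ends with a dangling hyphen, and the fourth starts with letters.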

Slide 13

Slide 13 text

Keep away from
- Matching too much
    /[0-9-]+/
- Complex one-shot matches
    q{[^"'<>]*(?:"[^"]*"[^"'<>]*|'[^']*'[^"'<>]*)*(?:>|(?=<)|$(?!\n))}; #'}}}}
  cf. Perlメモ (Perl Memo)
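A small sketch of why /[0-9-]+/ matches too much, compared with the stricter pattern from the Examples slide. The three input lines are made up for illustration.

```shell
input='076-123-4567
---
Kanazawa.rb #18'

# Loose: any line containing at least one digit or hyphen matches.
loose=$(printf '%s\n' "$input" | awk '/[0-9-]+/ { n++ } END { print n + 0 }')

# Strict: the whole line must be digit groups joined by hyphens.
strict=$(printf '%s\n' "$input" |
  awk '/^[0-9]+(-[0-9]+)+$/ { n++ } END { print n + 0 }')

echo "loose=$loose strict=$strict"
```

The loose pattern accepts all three lines (even `---` and the stray `18`), while the strict one accepts only the first.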

Slide 14

Slide 14 text

AWK

Slide 15

Slide 15 text

Named after its famous authors
- Aho
- Weinberger
- Kernighan

Slide 16

Slide 16 text

Filter-oriented Programming Language
    $ awk 'script' srcfile
    $ cat srcfile | awk 'script' > destfile
    $ awk -f script srcfile
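The three invocation styles should give the same result. A minimal sketch, assuming a throwaway directory; the file names srcfile, destfile, and script.awk and the one-line script are placeholders chosen for this example.

```shell
workdir=$(mktemp -d)
printf 'alpha\nbeta\n' > "$workdir/srcfile"
printf '{ print NR ": " $0 }\n' > "$workdir/script.awk"

# 1. Script on the command line, input file as an argument.
from_arg=$(awk '{ print NR ": " $0 }' "$workdir/srcfile")

# 2. Input from a pipe, output redirected to a file.
cat "$workdir/srcfile" | awk '{ print NR ": " $0 }' > "$workdir/destfile"
from_pipe=$(cat "$workdir/destfile")

# 3. Script read from a file with -f.
from_file=$(awk -f "$workdir/script.awk" "$workdir/srcfile")

echo "$from_arg"
rm -r "$workdir"
```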

Slide 17

Slide 17 text

Basic Syntax
- C-like / Shell-like (semicolon-less)
- Patterns and Actions
- No need to write code for reading stdin or splitting fields

Slide 18

Slide 18 text

Patterns and Actions
    pattern { action }
    pattern { action }
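A short sketch of two pattern-action rules. awk reads each line and splits it into $1, $2, ... on its own, so the script contains no input or split code. The input lines are invented sample data.

```shell
out=$(printf 'blankslate 2.1.2.4\nrake 10.1.1\n# comment line\n' | awk '
  /^#/    { next }           # regex pattern: skip comment lines
  NF == 2 { print $2, $1 }   # expression pattern: swap the two fields
')

echo "$out"
```

Each rule is tried against every input line in order; `next` stops processing of the current line.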

Slide 19

Slide 19 text

BEGIN and END rules
    BEGIN { ... }
    END { ... }
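A minimal sketch of BEGIN and END around an ordinary rule, summing a column of made-up numbers.

```shell
summary=$(printf '3\n4\n5\n' | awk '
  BEGIN { total = 0 }                          # runs once, before any input
        { total += $1 }                        # no pattern: runs for every line
  END   { print NR " lines, total " total }    # runs once, after the last line
')

echo "$summary"
```

This prints `3 lines, total 12`. The BEGIN rule is optional here, since awk variables start out empty and behave as zero in arithmetic, but it makes the intent explicit.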

Slide 20

Slide 20 text

Example
Count the number of gems a project depends on, from Gemfile.lock

Slide 21

Slide 21 text

Gemfile.lock
    GEM
      remote: https://rubygems.org/
      specs:
        blankslate (2.1.2.4)
        ...

    PLATFORMS
      ruby

    DEPENDENCIES
      ...

Slide 22

Slide 22 text

    BEGIN { counting = 0 }
    /^$/ { counting = 0 }
    counting == 1 && !/:/ { print $1 }
    /^GEM$/ { counting = 1 }

Slide 23

Slide 23 text

    $ awk -f script.awk Gemfile.lock | \
        sort | uniq | wc -l
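The whole pipeline can be tried end to end as below. This is only a sketch: the Gemfile.lock contents are a small made-up sample (the parslet entry and its dependency line are assumptions added for illustration), and the trailing `tr` is added only to strip the padding some `wc` implementations print.

```shell
workdir=$(mktemp -d)

# Hypothetical sample lock file, not taken from a real project.
cat > "$workdir/Gemfile.lock" <<'EOF'
GEM
  remote: https://rubygems.org/
  specs:
    blankslate (2.1.2.4)
    parslet (1.5.0)
      blankslate (~> 2.0)

PLATFORMS
  ruby

DEPENDENCIES
  parslet
EOF

# The script from the slide: print names between "GEM" and the next blank line.
cat > "$workdir/script.awk" <<'EOF'
BEGIN { counting = 0 }
/^$/ { counting = 0 }
counting == 1 && !/:/ { print $1 }
/^GEM$/ { counting = 1 }
EOF

gem_count=$(awk -f "$workdir/script.awk" "$workdir/Gemfile.lock" |
  sort | uniq | wc -l | tr -d ' ')

echo "$gem_count"
rm -r "$workdir"
```

With this sample, blankslate is printed twice (once as a gem, once as a dependency of parslet), which is why `sort | uniq` is needed before counting. The result is 2.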

Slide 24

Slide 24 text

Same as
    NF == 2 && $2 ~ /\(.*\)/ { print $1 }
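A quick check of the one-liner against a made-up Gemfile.lock sample (the parslet entry is an assumption added for illustration). It keeps only lines with exactly two fields whose second field is parenthesized, that is, the `name (version)` lines.

```shell
names=$(awk 'NF == 2 && $2 ~ /\(.*\)/ { print $1 }' <<'EOF'
GEM
  remote: https://rubygems.org/
  specs:
    blankslate (2.1.2.4)
    parslet (1.5.0)
      blankslate (~> 2.0)

PLATFORMS
  ruby

DEPENDENCIES
  parslet
EOF
)

echo "$names"
```

In this sample the nested `blankslate (~> 2.0)` line has three fields and is skipped, so each gem appears once and the output can be counted without `sort | uniq`.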

Slide 25

Slide 25 text

Notes
- http://www.regular-expressions.info/
- 正規表現メモ (Regular Expression Memo)
- The GNU Awk User's Guide
- The GAWK Manual - Table of Contents