Slide 1

Slide 1 text

Building a Serverless Japanese Morphological Analysis API with AWS API Gateway + Lambda + NEologd

Slide 2

Slide 2 text

About me
• KADOWAKI Satoru
• CTO, BakFoo, Inc.
• PyCon JP 2015 talk
• Handling large volumes of tweets with Tornado/ElasticSearch ...

Slide 3

Slide 3 text

BakFoo, Inc.
• NHK NMAPS: real-time weather visualization

Slide 4

Slide 4 text

BakFoo, Inc.
• SocioMeasure http://socio.bakfoo.com

Slide 5

Slide 5 text

BakFoo, Inc.
• Social analytics: natural language analysis of Twitter

Slide 6

Slide 6 text

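Goals
• An environment with MeCab + a dictionary that includes neologisms
• It should be as easy to use as possible
  • No need to stand up a dedicated server... (serverless)
• Setting up MeCab is surprisingly tedious
  • OS-dependent character encoding problems
• The "which dictionary?" problem
  • What to use, and what kind of dictionary is needed
• A relatively low-cost environment that everyone can use
• An overview from build to deploy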

Slide 7

Slide 7 text

Implementation environment overview
• AWS API Gateway + Lambda + Python MeCab
[Diagram: API Gateway, Lambda, S3]

Slide 8

Slide 8 text

Flow from build to deploy
1. Create the build environment for Lambda
2. Write the Lambda scripts
3. Deploy to AWS

Slide 9

Slide 9 text

1. Create the build environment for Lambda

Slide 10

Slide 10 text

Build environment
• EC2 for AWS Lambda (required)
  • AMI: amzn-ami-hvm-2016.03.3.x86_64-gp2
  • Python 3.6.1 + Miniconda
• A Python 2.7 example
  • "Running MeCab on AWS Lambda"
  • http://dev.classmethod.jp/cloud/improved-aws-lambda-with-mecab/
• Building the NEologd dictionary
  • Ubuntu 16.04
Compile mecab on this AMI!

Slide 11

Slide 11 text

Build environment details

Slide 12

Slide 12 text

1. Build environment setup
• On the EC2 instance for Lambda
  • AMI: amzn-ami-hvm-2016.03.3.x86_64-gp2
• Installation steps
$ sudo yum install gcc
$ sudo yum install gcc-c++
$ sudo yum install git
$ sudo yum install patch
$ sudo rpm -ivh \
> http://packages.groonga.org/centos/groonga-release-1.1.0-1.noarch.rpm

Slide 13

Slide 13 text

1. Build environment setup
• Python environment
$ wget --quiet \
> https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh \
> -O ~/miniconda3.sh
$ /bin/bash miniconda3.sh -b -p $HOME/miniconda3
$ echo 'export PATH="$HOME/miniconda3/bin:$PATH"' >> .bashrc
$ pip install boto3

Slide 14

Slide 14 text

1. Build environment setup
• Install mecab for Lambda
$ mkdir $HOME/lambda_neologd
$ curl -L \
> "https://drive.google.com/uc?export=download&id=0B4y35FiV1wh7cENtOXlicTFaRUE" \
> -o mecab-0.996.tar.gz
$ tar zxvf mecab-0.996.tar.gz && cd mecab-0.996
$ ./configure --prefix=$HOME/lambda_neologd/local --enable-utf8-only
$ make && make install
Don't forget!

Slide 15

Slide 15 text

1. Build environment setup
• Install mecab-ipadic in the same way
$ curl -L \
> "https://drive.google.com/uc?export=download&id=0B4y35FiV1wh7MWVlSDBCSXZMTXM" \
> -o mecab-ipadic-2.7.0-20070801.tar.gz
$ tar xvzf mecab-ipadic-2.7.0-20070801.tar.gz
$ cd mecab-ipadic-2.7.0-20070801/
$ export PATH=$HOME/lambda_neologd/local/bin:$PATH
$ export \
> LD_LIBRARY_PATH=$HOME/lambda_neologd/local/lib:$LD_LIBRARY_PATH
$ ./configure --prefix=$HOME/lambda_neologd/local \
> --enable-utf8-only --with-charset=utf8
$ make && make install
Don't forget the environment variables!

Slide 16

Slide 16 text

1. Build environment setup
• Check that mecab works
$ mecab -F "%m\t%h,%H,%pw,%pc,%pn\n"
mecabテスト
mecab	41,名詞,固有名詞,一般,*,*,*,mecab,メカブ,メカブ,6518,6208,6208
テスト	36,名詞,サ変接続,*,*,*,*,テスト,テスト,テスト,3637,9107,2899
• Install the Python MeCab binding
$ cd $HOME/sandbox/lambda_neologd
$ pip install mecab-python3 -t .
Don't forget!

Slide 17

Slide 17 text

Lambda environment overview
• AWS Lambda: 512MB!
The NEologd dictionary is about 900MB, so it will not run as is!
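A quick way to see whether a built dictionary directory would fit is to add up its file sizes and compare against the 512MB figure from the slide. The following is only a sketch: `dir_size_bytes` and `fits_in_tmp` are hypothetical helper names, and the limit is simply the number quoted above.

```python
import os

TMP_LIMIT_BYTES = 512 * 1024 * 1024  # the 512MB figure quoted on the slide


def dir_size_bytes(path):
    """Sum the sizes of all regular files below path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if os.path.isfile(full):
                total += os.path.getsize(full)
    return total


def fits_in_tmp(path, limit=TMP_LIMIT_BYTES):
    """Return True when the directory is no larger than the limit."""
    return dir_size_bytes(path) <= limit
```

Calling `fits_in_tmp(os.path.expanduser('~/neologd'))` on the build machine would give a yes/no answer before anything is uploaded to S3.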

Slide 18

Slide 18 text

1. Build environment setup
• Generating the NEologd dictionary
  • A standard dictionary build cannot be deployed on Lambda
  • Build the dictionary in its minimal configuration
$ git clone --depth 1 https://github.com/neologd/mecab-ipadic-neologd.git
$ cd mecab-ipadic-neologd
$ ./bin/install-mecab-ipadic-neologd -y \
> -p $HOME/neologd -n --eliminate-redundant-entry
With --eliminate-redundant-entry the dictionary is about 400MB

Slide 19

Slide 19 text

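1. Build environment setup
• --eliminate-redundant-entry: what to do about it?
• The NEologd dictionary registers spelling variants
  • Examples:
    • AMAZON, amazon, Amazon
    • すごい, すごーい, すごーーーーーい (elongated spellings of "amazing")
• So build a normalized dictionary
• What is done concretely
  • Convert alphabetic characters to half-width lowercase
  • AMAZON, amazon, Amazon → "amazon"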
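The "half-width lowercase" step can be illustrated in a few lines. This sketch uses NFKC normalization followed by `lower()`, which is the same pair of calls that seed_normalize.py applies to the seed files. The helper name `normalize_surface` and the sample list are made up for illustration.

```python
import unicodedata


def normalize_surface(text):
    """Fold full-width letters to half-width (NFKC), then lowercase."""
    return unicodedata.normalize('NFKC', text).lower()


# Variants taken from the slide, plus a full-width spelling for illustration.
variants = ['AMAZON', 'amazon', 'Amazon', 'ＡＭＡＺＯＮ']
normalized = {normalize_surface(v) for v in variants}
```

All four spellings collapse to the single entry `amazon`, which is why the same normalization has to be applied to input text before tokenizing.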

Slide 20

Slide 20 text

1. Build environment setup
• Source data for the dictionary
  • mecab-ipadic-neologd/seed/*.csv.xz
  • (Example) mecab-ipadic-neologd/seed/mecab-user-dict-seed.20170828.csv.xz
• The words are registered inside these CSV.XZ files
• Normalize these files, then build the dictionary

Slide 21

Slide 21 text

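1. Build environment setup
- After the first dictionary build
$ python seed_normalize.py
- Rebuild the dictionary (without updating it)
$ ./bin/install-mecab-ipadic-neologd -y \
> -p $HOME/neologd --eliminate-redundant-entry
(1) Normalize the seed files
(2) Rebuild the dictionary without updating its source data, i.e. do not pass the "-n" option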

Slide 22

Slide 22 text

1. Build environment setup
• seed_normalize.py (all it does is normalize)
# coding: utf-8
import lzma
import os
import unicodedata

seeds_dir = 'mecab-ipadic-neologd/seed/'

files = os.listdir(seeds_dir)
for file in files:
    if 'xz' in file:
        print('normalizing mecab seed...', file)
        # Read every line of the compressed seed file.
        with lzma.open(seeds_dir + file, 'rb') as f:
            lines = f.readlines()
        # Write the file back with NFKC-normalized, lowercased lines.
        with lzma.open(seeds_dir + file, 'w') as f:
            for line in lines:
                norm_text = unicodedata.normalize(
                    'NFKC', line.decode('utf-8')).lower()
                f.write(norm_text.encode('utf-8'))

Slide 23

Slide 23 text

1. Build environment setup
• $HOME/lambda_neologd/local/etc/mecabrc
  dicdir = /tmp/neologd
• Change the dictionary path to /tmp
• In the build environment, use a symbolic link
$ ln -s $HOME/neologd /tmp/neologd
The build environment is ready!

Slide 24

Slide 24 text

2. Write the Lambda scripts

Slide 25

Slide 25 text

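2. Write the Lambda scripts
• What the Python scripts do
1. Download the dictionary from S3 when the Lambda instance is initialized
  • Put the dictionary in /tmp/neologd to match mecabrc
2. Preprocess the text before tokenizing
  • Apply the same normalization used when generating the dictionary
  • Sanitize the text
  • Change the implementation depending on the kind of text
3. Add implementations of compound noun generation and phrase extraction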

Slide 26

Slide 26 text

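2. Write the Lambda scripts
• What is sanitizing?
  • In Japanese text processing
  • Replace decorative ("dirty") strings that carry little meaning in the sentence with their ordinary spellings
• Examples:
  • 同点ゴーーーール！ → 同点ゴール！ ("Equalizing goooooal!" → "Equalizing goal!")
  • Collapse runs of the long vowel mark (ー)
  • Collapse runs of punctuation
  • Replace a space between strings with a full stop (。)
  • Remove runs of spaces used only for layout (indentation)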
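A minimal sketch of three of the sanitizing rules listed above (long vowel marks, repeated punctuation, layout whitespace) might look like this. The function name `sanitize` and the exact character classes are assumptions for illustration, not the contents of normalize.py, and the "space becomes a full stop" rule is left out.

```python
import re


def sanitize(text):
    """Collapse decorative repetitions; a simplified, assumed rule set."""
    # Runs of the long vowel mark become a single mark.
    text = re.sub('ー{2,}', 'ー', text)
    # Runs of the same punctuation character become one character.
    text = re.sub(r'([、。!?！？])\1+', r'\1', text)
    # Runs of spaces, tabs and full-width spaces become one space.
    text = re.sub(r'[ \t\u3000]+', ' ', text)
    return text.strip()
```

For example, `sanitize('同点ゴーーーール！')` returns `'同点ゴール！'`, matching the example on the slide.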

Slide 27

Slide 27 text

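2. Write the Lambda scripts
• What is a compound noun?
  • Consecutive nouns that are not in the dictionary are combined and treated as a single word (a neologism?)
• "NHKマイルチャンピオンシップ" (NHK Mile Championship)
  → sometimes abbreviated to "NHKマイル"
  → gets split into "NHK" and "マイル"
  → join "NHK" - "マイル"
• "ストーリー・メイキングが素晴らしい" ("The story-making is wonderful")
  → join "ストーリー" - "・" - "メイキング"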

Slide 28

Slide 28 text

2. Write the Lambda scripts
• The scripts
  • lambda_function.py
    • Main script
  • normalize.py
    • Sanitizing
  • termextract.py
    • Compound nouns
    • Phrase extraction

Slide 29

Slide 29 text

2. Write the Lambda scripts
• Flow of lambda_function.py
1. Imports
2. Initialize the S3 connection (boto3)
3. Download the mecab dictionary
4. Load the MeCab module
5. Tokenize (core processing)

Slide 30

Slide 30 text

2. Write the Lambda scripts
• lambda_function.py
• 1. Imports (libmecab.so is loaded with ctypes)
import os
import logging
import traceback
import unicodedata
import json
import boto3
import ctypes
import normalize
import termextract

# Load the bundled libmecab.so explicitly before importing MeCab.
libdir = os.path.join(os.getcwd(), 'local', 'lib')
libmecab = ctypes.cdll.LoadLibrary(os.path.join(libdir, 'libmecab.so'))

Slide 31

Slide 31 text

2. Write the Lambda scripts
• lambda_function.py
• 2. Initialize S3 (boto3)
# Configuration: AWS
AWS_S3_BUCKET = 'mecabdic'
BOTOCONF = {
    'aws_access_key_id': 'AKI.....',
    'aws_secret_access_key': 'pTWl.....',
    'region_name': 'ap-northeast-1'
}
boto3.setup_default_session(**BOTOCONF)
session = boto3._get_default_session()
session_region = session._session.get_config_variable('region')
s3 = boto3.client('s3')

Slide 32

Slide 32 text

2. Write the Lambda scripts
• lambda_function.py
• 3. Download the mecab dictionary (list the required files, download them to /tmp/neologd)
MECAB_DIC_FILES = [  # files that make up the mecab dictionary
    'char.bin', 'dicrc', 'left-id.def', 'matrix.bin', 'pos-id.def',
    'rewrite.def', 'right-id.def', 'sys.dic', 'unk.dic',
]
DICDIR = '/tmp/neologd/'  # where the dictionary is stored

# Download the dictionary from S3
def prepareMecabDic():
    if not os.path.exists(DICDIR):
        os.mkdir(DICDIR)
    for mdic in MECAB_DIC_FILES:
        dest_dic = DICDIR + mdic
        # Fetch only files that are missing or empty.
        if not os.path.exists(dest_dic) or os.path.getsize(dest_dic) == 0:
            s3.download_file(AWS_S3_BUCKET, mdic, dest_dic)

prepareMecabDic()
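The "download only what is missing" idea can be sketched independently of S3. Here `ensure_files` is a hypothetical helper that takes the download function as a parameter, so it can be tried locally. With boto3, `fetch` could be something like `lambda name, dest: s3.download_file(bucket, name, dest)`.

```python
import os


def ensure_files(dest_dir, names, fetch):
    """Fetch each file that is missing or empty.

    fetch(name, dest_path) is supplied by the caller (an assumption of
    this sketch). Returns the names that were actually fetched.
    """
    os.makedirs(dest_dir, exist_ok=True)
    fetched = []
    for name in names:
        dest = os.path.join(dest_dir, name)
        if not os.path.exists(dest) or os.path.getsize(dest) == 0:
            fetch(name, dest)
            fetched.append(name)
    return fetched
```

Because files left in /tmp can still be there when a warm Lambda container is reused, a second call typically fetches nothing. Only the first, cold invocation pays for the download.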

Slide 33

Slide 33 text

2. Write the Lambda scripts
• lambda_function.py
• 4. Load the MeCab module (specify the mecabrc path, then the sanitizing and compound noun modules)
import MeCab

# init MeCab
MECABRC = os.path.join(os.getcwd(), 'local', 'etc', 'mecabrc')
tagger = MeCab.Tagger("-r %s" % MECABRC)

import normalize    # sanitizing module
import termextract  # compound noun generation and phrase extraction module

Slide 34

Slide 34 text

2. Write the Lambda scripts
• lambda_function.py
• 5. Tokenize (core) part: read the HTTP query string, then normalize / sanitize / tokenize
def lambda_handler(event, context):
    # Read the HTTP query string parameters.
    text = event['queryStringParameters'].get('text', '')
    is_termext = event['queryStringParameters'].get('termextract', '')
    is_phrase = event['queryStringParameters'].get('phrase', '')
    if text:
        # Normalize, sanitize, then tokenize.
        text = unicodedata.normalize('NFKC', text.strip()).lower()
        text = normalize.cleansingText(text)
        node = tagger.parse('')
        node = tagger.parseToNode(text)
        (continued on the next slide....)

Slide 35

Slide 35 text

2. Write the Lambda scripts
• lambda_function.py
• 5. Tokenize (core) part
        node = tagger.parse('')
        node = tagger.parseToNode(text)
        ### continued
        tokens = []
        if is_termext:
            # compound noun generation
            tokens = termextract.tokenize(node)
        elif is_phrase:
            # phrase extraction
            tokens = termextract.phrases(node)
        else:
            # plain mecab output
            while node:
                token = {
                    'surface': node.surface,
                    'posid': node.posid,
                    'feature': node.feature,
                }
                tokens.append(token)
                node = node.next
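The slides stop before the handler returns anything. Assuming the API uses Lambda proxy integration (which the `event['queryStringParameters']` access suggests), the remaining pieces could be sketched as below. `get_params`, `make_response` and the `{'tokens': ...}` body layout are hypothetical names chosen for illustration.

```python
import json


def get_params(event):
    """Return the query parameters as a dict.

    With proxy integration the 'queryStringParameters' value can be None
    when the request carries no query string, so fall back to {}.
    """
    return event.get('queryStringParameters') or {}


def make_response(tokens, status=200):
    """Build a proxy-integration style response with a JSON body."""
    return {
        'statusCode': status,
        'headers': {'Content-Type': 'application/json; charset=utf-8'},
        # ensure_ascii=False keeps Japanese surfaces readable in the body.
        'body': json.dumps({'tokens': tokens}, ensure_ascii=False),
    }
```

Under that assumption the handler could end with `return make_response(tokens)`, and `get_params(event).get('text', '')` would avoid an exception on requests without a query string.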

Slide 36

Slide 36 text

2. Write the Lambda scripts
• normalize.py: sanitizing

Slide 37

Slide 37 text

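Environment setup
• normalize.py: sanitizing the text (a list of [before, after] pairs applied with re.sub)
import re

CLEANSING_PATTERNS = [
    [r'\r\n|\r|\n|\\n', '。'],        # replace line breaks with a full stop
    [r'\t+|\s+', ' '],                # runs of tabs or runs of spaces
    [u'\u30FC+', u'\u30FC'],          # collapse runs of the long vowel mark into one
    [r'([^\w])\s([^\w])', r'\1。\2'],  # a space between double-byte characters becomes a full stop
    [u'・・+', '・'],                  # collapse runs of the middle dot into one
    [u'。。+', '。'],                  # collapse runs of the full stop into one
]

def cleansingText(text):
    # Apply every pattern in order, feeding each result into the next.
    for src, dst in CLEANSING_PATTERNS:
        text = re.sub(src, dst, text)
    return text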

Slide 38

Slide 38 text

2. Write the Lambda scripts
• termextract.py
1. Define the part-of-speech (POS) IDs to use
2. Write the compound noun generation rules
3. Write the phrase generation rules

Slide 39

Slide 39 text

2. Write the Lambda scripts
• termextract.py
• 1. Define the POS IDs to use
import re

# POS IDs eligible for joining
APPLY_IDS = [30, 36, 38, 40, 41, 42, 43, 44, 45, 46, 47,
             48, 49, 50, 52, 53, 54, 56, 57]
APPLY_JOIN_IDS = [10, 13, 18, 20, 24, 25, 31, 32, 33, 51, 58]

# Pairs of a POS ID and an attached word
NOT_ENDOFWORDS = [
    [13, "から"], [18, "が"], [18, "ものの"], [18, "と"], [24, "の"],
]
EX_APPLY_IDS = [
    [18, "て"], [18, "で"], [25, "た"], [25, "ず"], [25, "ない"],
    [25, "ます"], [25, "ん"], [25, "まし"], [25, "ませ"], [31, "い"],
    [54, "そう"]
]
IGNORE_WORDS = [u"そう", u"した", u"あと", u"何度も", u"何か"]

Slide 40

Slide 40 text

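2. Write the Lambda scripts
• termextract.py
• Compound noun generation rules (example)
def wordFilter(token, buf):
    if len(buf) != 0:
        # Join when the token is a middle dot or long vowel mark, the last
        # buffered string is hiragana/katakana, and the last buffered word's
        # POS ID is anything other than 30.
        if token.posid == 4 \
                and re.search(r'[ー・]', token.surface) \
                and re.search(r'^[ぁ-んァ-タダ-ヶー・]+$', buf[-1].surface) \
                and buf[-1].posid != 30:
            return True
        for pid, w in EX_APPLY_IDS:
            if pid == token.posid and w == token.surface:
                return True
    if token.posid in APPLY_IDS:
        # Keep a short run of alphabet letters from being treated as a noun
        # (two patterns exist: ID 45, proper noun/organization, and
        # ID 38, general noun). Depending on the conditions, some tokens
        # are not joined.
        if len(token.surface) < 3 and re.search(r'[a-zA-Z:]+', token.surface):
            return False
        if len(buf) == 0 and token.surface in IGNORE_WORDS:
            return False
        # Assumed: the slide excerpt omits this line; without it a noun
        # in APPLY_IDS would never be accepted.
        return True
    return False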

Slide 41

Slide 41 text

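2. Write the Lambda scripts
• Concretely...
• "ストーリー・メイキングが素晴らしい" ("The story-making is wonderful")
  • ストーリー: POS ID 38, general noun
  • ・: POS ID 4, symbol
  • メイキング: POS ID 41, proper noun
• POS IDs eligible for joining:
  APPLY_IDS = [30, 36, 38, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 52, 53, 54, 56, 57]
• IDs eligible for joining (hiragana, katakana and the middle dot are allowed)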
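The joining in this example can be sketched with a much simpler rule set than termextract.py uses. Everything below is illustrative: the `Token` tuple stands in for a MeCab node, `NOUN_IDS` contains only the two noun IDs named on the slide, and the POS IDs given to the non-noun tokens in the usage note are placeholders.

```python
import re
from collections import namedtuple

Token = namedtuple('Token', ['surface', 'posid'])

NOUN_IDS = {38, 41}  # general noun and proper noun, as named on the slide
SYMBOL_ID = 4        # symbol


def join_compound(tokens):
    """Join consecutive nouns, allowing ・ or ー between them."""
    terms, buf = [], []
    for tok in tokens:
        is_noun = tok.posid in NOUN_IDS
        # A joiner is only accepted after at least one buffered noun.
        is_joiner = (tok.posid == SYMBOL_ID and bool(buf)
                     and re.fullmatch('[ー・]+', tok.surface) is not None)
        if is_noun or is_joiner:
            buf.append(tok.surface)
            continue
        if buf:
            terms.append(''.join(buf))
            buf = []
    if buf:
        terms.append(''.join(buf))
    # Drop a dangling middle dot left at the end of a term.
    return [t.rstrip('・') for t in terms]
```

Given the tokens `ストーリー` (38), `・` (4), `メイキング` (41) followed by two non-noun tokens, the function returns the single term `ストーリー・メイキング`.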

Slide 42

Slide 42 text

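2. Write the Lambda scripts
• termextract.py: phrase extraction
  • Basically the same as compound noun generation
  • Add particles and similar POS IDs to the set of IDs that are joined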

Slide 43

Slide 43 text

3. Deploy to AWS

Slide 44

Slide 44 text

3. Deploy to AWS
• Deployment flow
1. Bundle the program into a zip
2. Configure the AWS user and role
3. Set up the route from Lambda to S3
4. Create the security group
5. Create the Lambda function
6. Configure API Gateway

Slide 45

Slide 45 text

3. Deploy to AWS
• 1. Bundle lambda_neologd into a zip
• List unneeded files in exclude (the zip comes to about 1.5MB)
Create the zip file:
$ zip lambda_mecabneologd.zip -r * -x@exclude
Contents of the exclude file (files left out of the zip):
$ vim lambda_neologd/exclude
*.dist-info/*
*.egg-info
*.pyc
env/*
exclude
lambda_mecabneologd.zip
local/bin/*
local/include/*
local/libexec/*
local/share/*
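The same packaging step can be approximated from Python when the zip command is not at hand. This is a sketch only: `build_zip` is a hypothetical helper, and `fnmatch` patterns merely approximate the matching rules of `zip -x`.

```python
import fnmatch
import os
import zipfile


def build_zip(src_dir, zip_path, exclude_patterns):
    """Zip src_dir, skipping files whose relative path matches a pattern.

    Returns the sorted list of archived relative paths.
    """
    added = []
    zip_abs = os.path.abspath(zip_path)
    with zipfile.ZipFile(zip_path, 'w', zipfile.ZIP_DEFLATED) as zf:
        for root, _dirs, files in os.walk(src_dir):
            for name in files:
                full = os.path.join(root, name)
                if os.path.abspath(full) == zip_abs:
                    continue  # never add the archive to itself
                rel = os.path.relpath(full, src_dir).replace(os.sep, '/')
                if any(fnmatch.fnmatch(rel, pat) for pat in exclude_patterns):
                    continue
                zf.write(full, rel)
                added.append(rel)
    return sorted(added)
```

Passing patterns such as `['*.pyc', 'local/bin/*']` leaves the compiled files and the mecab binaries out while keeping `local/lib` and `local/etc`, which the function needs at run time.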

Slide 46

Slide 46 text

3. Deploy to AWS
• 2. Configure the AWS user and role
  • IAM settings
  • Grant S3 read permission

Slide 47

Slide 47 text

3. Deploy to AWS
• 2. Configure the AWS user and role
  • Set the Lambda execution role
The role required to run Lambda

Slide 48

Slide 48 text

3. Deploy to AWS
• 3. Create the route from Lambda to S3
  • Create a VPC and subnets
  • Create a VPC endpoint
  • Not required: done to avoid getting stuck on permission issues

Slide 49

Slide 49 text

3. Deploy to AWS
• 3. Create the route from Lambda to S3
  • Configure routing

Slide 50

Slide 50 text

3. Deploy to AWS
• 4. Create the security group (IP restriction)

Slide 51

Slide 51 text

3. Deploy to AWS
• 5. Create the Lambda function
  • Upload the zip
  • Python 3.6
  • Default
  • Specify the role

Slide 52

Slide 52 text

3. Deploy to AWS
• 5. Create the Lambda function
  • Memory: 512MB or more
  • Memory size
  • Timeout setting
  • VPC setting

Slide 53

Slide 53 text

3. Deploy to AWS
• 5. Create the Lambda function
  • Subnet settings (two subnets)
  • Specify the security group

Slide 54

Slide 54 text

3. Deploy to AWS
• 6. Configure API Gateway
  • Specify the Lambda function

Slide 55

Slide 55 text

3. Deploy to AWS
• 6. Configure API Gateway
  • Create a GET method

Slide 56

Slide 56 text

3. Deploy to AWS
• 6. Configure API Gateway
  • Run the API Gateway deployment
  • Once the deployment runs, the endpoint is generated

Slide 57

Slide 57 text

Demo (API)
• https://qxhfsso....execute-api.ap-northeast-1.amazonaws.com/stg/mecab/tokenize.json?text=NHK総合テレビでおもしろーーーい番組やってる
  (text: "There's a reeeally funny program on NHK General TV")
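A client has to URL-encode the Japanese text before calling the endpoint. The sketch below builds such a URL with the standard library. `BASE_URL` is a placeholder (the real host is elided on the slide) and `build_url` is a hypothetical helper. The parameter names `text`, `termextract` and `phrase` are the ones read by lambda_handler.

```python
from urllib.parse import urlencode

# Placeholder host: the real endpoint is truncated on the slide.
BASE_URL = ('https://example.execute-api.ap-northeast-1.amazonaws.com'
            '/stg/mecab/tokenize.json')


def build_url(text, termextract=False, phrase=False, base_url=BASE_URL):
    """Return the request URL with a percent-encoded query string."""
    params = {'text': text}
    if termextract:
        params['termextract'] = '1'
    if phrase:
        params['phrase'] = '1'
    return base_url + '?' + urlencode(params)
```

The resulting URL could then be fetched with `urllib.request.urlopen` or any HTTP client. `urlencode` takes care of encoding the Japanese text as UTF-8 percent escapes.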

Slide 58

Slide 58 text

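Demo (API)
• text=赤羽台トンネルは東北新幹線上野駅 - 大宮駅間および埼京線赤羽駅 - 北赤羽駅間にある総延長585メートルの鉄道トンネルである。
  ("The Akabanedai Tunnel is a railway tunnel with a total length of 585 meters, located between Ueno and Omiya stations on the Tohoku Shinkansen and between Akabane and Kita-Akabane stations on the Saikyo Line.")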

Slide 59

Slide 59 text

Demo (API)
• Run with compound noun generation enabled
• /mecab/tokenize.json?termextract=1&text=赤羽台...

Slide 60

Slide 60 text

Demo (API)
• Run with phrase extraction enabled
• /mecab/tokenize.json?phrase=1&text=赤羽台...

Slide 61

Slide 61 text

Demo
As an applied example, here is something built on top of this

Slide 62

Slide 62 text

Demo
• Cabinet Office Monthly Economic Report
  • A report presenting the government's official view on the state of the economy

Slide 63

Slide 63 text

Demo
• Cabinet Office Monthly Economic Report

Slide 64

Slide 64 text

Demo
• Cabinet Office Monthly Economic Report

Slide 65

Slide 65 text

Demo
• Text comparison web app

Slide 66

Slide 66 text

Summary

Slide 67

Slide 67 text

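Summary
• As an environment with MeCab + a dictionary that includes neologisms
  • Used AWS Lambda + API Gateway
• NEologd can be run in the Lambda environment
  • The dictionary has to be kept to a minimum
• Running on Lambda does have the advantage of being easy to use
• The biggest problem on Lambda is that the first invocation is slow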

Slide 68

Slide 68 text

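Summary
• On the "which dictionary?" problem
  • The NEologd dictionary is very well made
  • But it is huge!!
• The drawbacks of minimizing the dictionary matter less than this: normalizing and sanitizing text to absorb the variation in written Japanese before tokenizing is worthwhile whatever the dictionary
• Implementing noun joining makes it possible to build compound nouns
  • Very useful for keyword extraction and similar tasks
• For phrases, implement POS joining as needed
By the way, about the document comparison shown in the demo....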

Slide 69

Slide 69 text

Summary
The dictionary used by the text comparison web app in the demo was built from Wikipedia!!!!!

Slide 70

Slide 70 text

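Summary
The dictionary used by the text comparison web app in the demo was built from Wikipedia!
In other words...
Normalizing, sanitizing and POS joining are what matter
If those are implemented, maybe it does not have to be NEologd!?
Might a Wikipedia dictionary (or a Hatena dictionary) be enough!?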

Slide 71

Slide 71 text

Thank you for your attention