Slide 1

Slide 1 text

Takuya Asano @takuya_b
Introduction to Apache Lucene
Lucene Reading

Slide 2

Slide 2 text

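What is Apache Lucene?

- A full-text search engine library written in Java
- The de facto standard for full-text search at the time of this talk
- Implements nearly every feature that full-text search requires
- Used as the core search component of Elasticsearch and Solr
  - Elasticsearch and Solr expose Lucene to users through a REST API
  - They also implement features that Lucene itself lacks (distributed search, for example); those are not covered here

https://github.com/apache/lucene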

Slide 3

Slide 3 text

Lucene Core Components

Component                        Related Classes
Query parsing                    QueryParser
Analysis                         Analyzer
Search                           IndexSearcher
Queries                          Query
Indexing                         IndexWriter
Index access                     IndexReader
Storage access                   Directory, DirectoryReader
Document representation          Document, Field
Codecs (index file formats)      Codec, PostingsFormat, DocValuesFormat, StoredFieldsFormat,
                                 FieldInfosFormat, SegmentInfoFormat, LiveDocsFormat, PointsFormat, …
Algorithms and data structures   LZ4, LevenshteinAutomata, FST, BKDReader, BKDWriter, PackedInts,
                                 FixedBitSet, PriorityQueue, …

Slide 4

Slide 4 text

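Full-text Search Basics on Lucene
Inverted index: The core data structure of search engines

- In full-text search, the content to be searched is modeled as documents
  - For e-commerce search, a document is a product
  - For web search, a document is a web page
  - In Lucene, a document corresponds to the Document class
- Lucene indexes documents with an inverted index
  - Full-text search based on an inverted index is well suited to searching large document collections
  - In Lucene, each document is identified by a docid

Figure: a table with the columns "Term" and "Postings List" and the rows action, cookbook, in, lucene.
An example of inverted index structure for documents "Lucene in Action" and "Lucene Cookbook". A set of all terms is often referred to as a "term dictionary" or simply "dictionary".

ref: Information Retrieval and Web Search summary: Inverted index (stop-the-world)
https://stop-the-world.hatenablog.com/entry/…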
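
The inverted index described on this slide can be sketched in a few lines of plain Java. This is a toy illustration, not Lucene code: the class name `InvertedIndexSketch`, the lowercase whitespace tokenization, and the zero-based docids are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Toy inverted index: maps each term to the increasing list of docids that contain it.
// Hypothetical sketch; Lucene's real index is a set of on-disk files, not a Java map.
class InvertedIndexSketch {

    static Map<String, List<Integer>> build(List<String> docs) {
        Map<String, List<Integer>> index = new TreeMap<>(); // sorted, like a term dictionary
        for (int docId = 0; docId < docs.size(); docId++) {
            // Assumption: lowercase + whitespace split stands in for real text analysis
            for (String term : docs.get(docId).toLowerCase(Locale.ROOT).split("\\s+")) {
                List<Integer> postings = index.computeIfAbsent(term, t -> new ArrayList<>());
                // docids arrive in increasing order; skip repeats within one document
                if (postings.isEmpty() || postings.get(postings.size() - 1) != docId) {
                    postings.add(docId);
                }
            }
        }
        return index;
    }

    public static void main(String[] args) {
        Map<String, List<Integer>> index = build(List.of("Lucene in Action", "Lucene Cookbook"));
        index.forEach((term, postings) -> System.out.println(term + " -> " + postings));
    }
}
```

With the two titles from the slide, the term "lucene" ends up with both docids in its postings list, while "action", "in", and "cookbook" each point to a single document.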

Slide 5

Slide 5 text

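Lucene's Index
Creating / Searching an index

- Creating an index
  - The index is stored on the file system
  - The index consists of multiple files
    - Term dictionary, postings lists, and so on (described later)
- Searching an index
  - Searches run against the index stored as files
  - A search returns the set of documents that match the query
    - The result includes document IDs, document contents, and so on

Figure: Documents, Lucene Index, SERP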

Slide 6

Slide 6 text

Indexing Process Overview
Core components for indexing

1. Construct a document object
2. Analyze text contents (preprocessing)
3. Build a Lucene index
4. Write an index to a storage

Classes in the diagram: Document, Analyzer, IndexWriter, Directory

Slide 9

Slide 9 text

Document Representation
Model object to be indexed

- A document is represented as an object of the Document class
- A Document consists of multiple Fields
  - A data structure similar to a key-value map
- A Field has a field name, a content, and a field type

    Document doc1 = new Document();
    doc1.add(new Field("title", "Lucene in Action", TextField.TYPE_STORED));

    Document doc2 = new Document();
    doc2.add(new Field("title", "Lucene Cookbook", TextField.TYPE_STORED));

Annotations on the Field constructor arguments:
- "title": Field name
- "Lucene in Action": Field content
- TextField.TYPE_STORED: Field type (stored or not stored)

Stored fields are stored in an index. This allows you to retrieve the field contents at search time.

Slide 10

Slide 10 text

"OBMZ[FS 5FYUQSFQSPDFTTPST w AnalyzerTokenizerFilters w Tokenizer w ςΩετจࣈྻΛ5PLFOͷྻʹ෼ׂ͢Δ w Filter w 5PLFOΛҰఆͷϧʔϧͰআڈ͢ΔʢFHStopFilterʣ w 5PLFOͷจࣈྻΛҰఆͷϧʔϧͰஔ׵͢ΔʢFHLowerCaseFilterʣ w AnalyzerͷྫStandardAnalyzer w StandardAnalyzerStandardTokenizer + StopFilter LowerCaseFilter w 6OJDPEF5FYU4FHNFOUBUJPO ϕʔεͷ Tokenizer "Lucene in Action" "Lucene", "in", "Action" "Lucene", "Action" "lucene", "action" StandardTokenizer StopFilter LowerCaseFilter *GUIFStopFilterIBTlJOzBTBTUPQXPSE

Slide 11

Slide 11 text

Basic Indexing API
A code example to build a Lucene index

    // Create a directory for storing Lucene index
    Path indexDirPath = Files.createDirectory(Path.of("index"));
    Directory directory = FSDirectory.open(indexDirPath);

    // Set up IndexWriter
    Analyzer analyzer = new StandardAnalyzer();
    IndexWriterConfig config = new IndexWriterConfig(analyzer);
    IndexWriter indexWriter = new IndexWriter(directory, config);

    // Index a document: "Lucene in Action"
    Document doc1 = new Document();
    doc1.add(new Field("title", "Lucene in Action", TextField.TYPE_STORED));
    indexWriter.addDocument(doc1);

    // Index a document: "Lucene Cookbook"
    Document doc2 = new Document();
    doc2.add(new Field("title", "Lucene Cookbook", TextField.TYPE_STORED));
    indexWriter.addDocument(doc2);

    // Write index to the directory
    indexWriter.close();

Slide 15

Slide 15 text

Basic Indexing API
A code example to build a Lucene index

- FSDirectory
  - One implementation of Directory (the storage access API)
  - Provides access to the file system
  - Other Directory implementations exist, such as RAMDirectory
- IndexWriter
  - The class that orchestrates index writing
  - Add a Document with the addDocument() method
  - close() writes to the Directory (with the default settings)

    // Create a directory for storing Lucene index
    Path indexDirPath = Files.createDirectory(Path.of("index"));
    Directory directory = FSDirectory.open(indexDirPath);

    // Set up IndexWriter
    Analyzer analyzer = new StandardAnalyzer();
    IndexWriterConfig config = new IndexWriterConfig(analyzer);
    IndexWriter indexWriter = new IndexWriter(directory, config);

    // Index a document: "Lucene in Action"
    Document doc1 = new Document();
    doc1.add(new Field("title", "Lucene in Action", TextField.TYPE_STORED));
    indexWriter.addDocument(doc1);

    // Index a document: "Lucene Cookbook"
    Document doc2 = new Document();
    doc2.add(new Field("title", "Lucene Cookbook", TextField.TYPE_STORED));
    indexWriter.addDocument(doc2);

    // Write index to the directory
    indexWriter.close();

Slide 16

Slide 16 text

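Buffering and Flushing at Indexing
When IndexWriter writes documents

- Calling addDocument() on an IndexWriter does not write to storage immediately
  - The document is written to a buffer area in RAM (the JVM heap)
- IndexWriter flushes to the Directory at a suitable time
  - Triggers include the amount of RAM used by the buffer and the number of buffered documents
  - You can flush explicitly by calling flush() (this does not perform a commit)
  - Calling commit() flushes and then performs the commit

Figure: addDocument() -> IndexWriter (Buffer in RAM) -> Flush -> Directory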
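
The difference between buffering, flushing, and committing can be modeled with a toy writer. This is only a sketch: `BufferedWriterSketch`, its document-count threshold, and its counters are hypothetical, and Lucene's actual triggers and in-memory structures are more involved.

```java
import java.util.ArrayList;
import java.util.List;

// Toy writer: buffers documents in memory, flushes them as a "segment" once a
// document-count threshold is reached, and only marks segments as committed on commit().
// All names and the threshold are assumptions made for illustration.
class BufferedWriterSketch {
    private final int maxBufferedDocs;
    private final List<String> buffer = new ArrayList<>();
    private final List<List<String>> flushedSegments = new ArrayList<>();
    private int committedSegments = 0; // number of segments a reader would be allowed to see

    BufferedWriterSketch(int maxBufferedDocs) {
        this.maxBufferedDocs = maxBufferedDocs;
    }

    void addDocument(String doc) {
        buffer.add(doc); // lands in the in-memory buffer first
        if (buffer.size() >= maxBufferedDocs) {
            flush();     // automatic flush when the threshold is reached
        }
    }

    // Moves buffered documents into a new segment; does not commit.
    void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        flushedSegments.add(new ArrayList<>(buffer));
        buffer.clear();
    }

    // Flushes first, then publishes every flushed segment.
    void commit() {
        flush();
        committedSegments = flushedSegments.size();
    }

    int bufferedDocs() { return buffer.size(); }
    int flushedSegmentCount() { return flushedSegments.size(); }
    int committedSegmentCount() { return committedSegments; }

    public static void main(String[] args) {
        BufferedWriterSketch writer = new BufferedWriterSketch(2);
        writer.addDocument("Lucene in Action");
        System.out.println(writer.bufferedDocs() + " buffered, " + writer.flushedSegmentCount() + " flushed");
        writer.addDocument("Lucene Cookbook"); // reaches the threshold and triggers a flush
        System.out.println(writer.flushedSegmentCount() + " flushed, " + writer.committedSegmentCount() + " committed");
        writer.commit();
        System.out.println(writer.committedSegmentCount() + " committed");
    }
}
```

The point of keeping `committedSegments` separate from `flushedSegments` is that a flushed segment exists in storage but stays invisible to readers until a commit publishes it.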

Slide 18

Slide 18 text

Index Segments
Lucene index files

- An index consists of multiple segments
  - All of them are stored in the same directory
- A segment is a sub-index
  - On its own it functions almost as a complete Lucene index
- A segment consists of multiple files
  - File names have the form _gen.ext or _gen_Lucene84_0.ext
    - E.g. _0.fnm, _0_Lucene84_0.pos, …
    - gen: the generation of the segment (as in _0, _1)
    - ext: the extension for each format (e.g. fnm, pos)
- One segment is created each time IndexWriter flushes
  - It is referenced from segments_N only once it has been committed
    - N = 1, 2, …

Index after 1st commit:
  Segment 0:      _0.fdm _0.fdt _0.fdx _0.fnm _0.nvd _0.nvm _0.si
                  _0_Lucene84_0.doc _0_Lucene84_0.pos _0_Lucene84_0.tim
                  _0_Lucene84_0.tip _0_Lucene84_0.tmd
  Segments File:  segments_1
  Lock File:      write.lock
  segments_1 -> Segment 0

Index after 2nd commit:
  Segment 0:      _0.fdm _0.fdt _0.fdx _0.fnm _0.nvd _0.nvm _0.si
                  _0_Lucene84_0.doc _0_Lucene84_0.pos _0_Lucene84_0.tim
                  _0_Lucene84_0.tip _0_Lucene84_0.tmd
  Segment 1:      _1.fdm _1.fdt _1.fdx _1.fnm _1.nvd _1.nvm _1.si
                  _1_Lucene84_0.doc _1_Lucene84_0.pos _1_Lucene84_0.tim
                  _1_Lucene84_0.tip _1_Lucene84_0.tmd
  Segments File:  segments_2
  Lock File:      write.lock
  segments_2 -> Segment 0, Segment 1
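
The two file-name shapes listed above can be taken apart with a regular expression. The sketch below is an assumption-laden illustration, not a Lucene API: `SegmentFileNameSketch`, `Parsed`, and the pattern itself are made up for this example and only cover the names shown on the slide.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Splits names like "_0.fnm" or "_0_Lucene84_0.pos" into generation, format name, and extension.
// Hypothetical helper for illustration; it is not part of Lucene.
class SegmentFileNameSketch {

    // _<gen>.<ext>   or   _<gen>_<format>_<suffix>.<ext>
    private static final Pattern NAME =
            Pattern.compile("_([0-9a-z]+)(?:_([A-Za-z0-9]+)_([0-9a-z]+))?\\.([a-z0-9]+)");

    static final class Parsed {
        final String gen;    // segment generation, e.g. "0"
        final String format; // e.g. "Lucene84"; null for the short form
        final String ext;    // e.g. "fnm"

        Parsed(String gen, String format, String ext) {
            this.gen = gen;
            this.format = format;
            this.ext = ext;
        }
    }

    static Optional<Parsed> parse(String fileName) {
        Matcher m = NAME.matcher(fileName);
        if (!m.matches()) {
            return Optional.empty(); // e.g. segments_1 or write.lock belong to no single segment
        }
        return Optional.of(new Parsed(m.group(1), m.group(2), m.group(4)));
    }

    public static void main(String[] args) {
        for (String name : new String[] {"_0.fnm", "_1_Lucene84_0.pos", "segments_2", "write.lock"}) {
            String description = parse(name)
                    .map(p -> "gen=" + p.gen + " format=" + p.format + " ext=" + p.ext)
                    .orElse("not a per-segment file");
            System.out.println(name + ": " + description);
        }
    }
}
```

Grouping a directory listing by the parsed `gen` value reproduces the "Segment 0" and "Segment 1" groupings shown in the listing above.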

Slide 21

Slide 21 text

Index Commits & Consistency
How IndexWriter#commit() works

1. Flush all buffered documents, then sync the Directory
   - With FSDirectory, this guarantees the data has been written to the file system
2. Write the segments_N file and sync it
   - The new segments become visible to IndexReader
     (an IndexReader reads only committed segments)
3. Delete the old commit (the old segments_N)

Figure: the index and the IndexReader's view at each stage:
  1. Flush segment, sync file system
  2. Write segments_N, sync file system
  3. Delete the old segments_N

Slide 22

Slide 22 text

Index Segment Merging
IndexWriter & MergePolicy

- IndexWriter merges groups of segments into larger segments according to the MergePolicy
  - TieredMergePolicy (default), LogMergePolicy, etc.
- The MergePolicy decides which segments to merge
  - A large number of small segments slows down search
  - Merging them into larger segments improves performance
  - The MergePolicy lets you tune indexing throughput, overall load, and so on

Figure: several Segments -> merge() -> one larger Segment

ref: Changing Bits: Visualizing Lucene's segment merges
http://blog.mikemccandless.com/2011/02/visualizing-lucenes-segment-merges.html
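
To make "deciding which segments to merge" concrete, here is a deliberately simple policy in plain Java. It is not TieredMergePolicy or LogMergePolicy: `MergeSketch`, the `mergeFactor` parameter, and the "merge the smallest segments" rule are assumptions chosen only to show the shape of the decision.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Toy merge policy over segment sizes (document counts).
// Hypothetical rule: once at least mergeFactor segments exist, merge the mergeFactor smallest.
class MergeSketch {

    static List<Integer> mergeOnce(List<Integer> segmentSizes, int mergeFactor) {
        if (mergeFactor < 2 || segmentSizes.size() < mergeFactor) {
            return new ArrayList<>(segmentSizes); // nothing to merge yet
        }
        List<Integer> sorted = new ArrayList<>(segmentSizes);
        Collections.sort(sorted);

        int mergedSize = 0;
        for (int i = 0; i < mergeFactor; i++) {
            mergedSize += sorted.get(i); // the merged segment holds all documents of its inputs
        }

        List<Integer> result = new ArrayList<>(sorted.subList(mergeFactor, sorted.size()));
        result.add(mergedSize);
        Collections.sort(result);
        return result;
    }

    public static void main(String[] args) {
        // Five segments, three of them small: the small ones collapse into one.
        System.out.println(mergeOnce(List.of(100, 3, 2, 5, 40), 3)); // prints [10, 40, 100]
    }
}
```

Picking the smallest segments keeps each merge cheap while still reducing the segment count a search has to visit, which is the trade-off a real policy tunes in a more refined way.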

Slide 23

Slide 23 text

Index File Formats

Format Name                Extension    Related Class            Description
Segment File               segments_N   SegmentInfos             Holds the commit point
Lock File                  write.lock   N/A                      Lock file for thread safety
Segment Info               .si          SegmentInfoFormat        Segment metadata
Fields                     .fnm         FieldInfosFormat         Holds field information (names, etc.)
Field Index                .fdx         StoredFieldsFormat       Pointers to the field data
Field Data                 .fdt         StoredFieldsFormat       Field data of the documents
Term Dictionary            .tim         PostingsFormat           Term dictionary (holds term information)
Term Dictionary Metadata   .tmd         PostingsFormat           Metadata of the term dictionary
Term Index                 .tip         PostingsFormat           Pointers into the term dictionary
Frequencies                .doc         PostingsFormat           Inverted index and skip lists
Positions                  .pos         PostingsFormat           Holds the positions of terms within documents
Payloads                   .pay         PostingsFormat           Per-position metadata (character offsets, etc.)
Live Documents             .liv         Lucene50LiveDocsFormat   Information on documents that are not deleted (live)
Point Values               .dii, .dim   PointsFormat             Holds numeric data

Slide 24

Slide 24 text

Browse Lucene Index using Luke
Luke: The GUI toolbox for introspecting Lucene index

1. Download Apache Lucene from
   https://lucene.apache.org/core/downloads.html
2. Extract the .tgz file
3. Execute
   - Linux/macOS: lucene-8.8.1/luke/luke.sh
   - Windows: lucene-8.8.1/luke/luke.bat

ref: "Lucene 超入門 with Luke" (a beginner's introduction to Lucene with Luke) by mocobeta
https://mocobeta.medium.com/…

Slide 25

Slide 25 text

Browse Lucene Index using Luke
Luke: The GUI toolbox for introspecting Lucene index

Screenshot annotation: Input index directory path

Slide 26

Slide 26 text

Browse Lucene Index using Luke
Luke: The GUI toolbox for introspecting Lucene index

Screenshot annotations: Index statistics and metadata; Select field name; Term statistics

Slide 27

Slide 27 text

Query Processing Overview
Using IndexSearcher with Query

Figure:
  Query (TermQuery, PhraseQuery, etc.) -> IndexSearcher: pass a Query object
  IndexSearcher -> TopDocs: return top-k results
  IndexSearcher -> IndexReader: read the Lucene index
  IndexReader -> Directory: read files (the actual storage access)

Slide 30

Slide 30 text

Basic Search API
A code example to search a Lucene index

- IndexSearcher
  - Accesses the index through an IndexReader
- TermQuery
  - Searches by Term (the key in the inverted index)
  - Pass the text as it is after analysis
    - Searching for "Lucene" does not hit; searching for "lucene" does
- Pass the query to IndexSearcher#search() to run the search
  - You get TopDocs (the top-k results)

    // Open a directory which stores index
    Directory directory = FSDirectory.open(Path.of("index"));

    // Create an IndexSearcher
    IndexReader indexReader = DirectoryReader.open(directory);
    IndexSearcher indexSearcher = new IndexSearcher(indexReader);

    // Create Query object that searches for "lucene" on "title" field
    Query query = new TermQuery(new Term("title", "lucene"));
    TopDocs results = indexSearcher.search(query, 10);
    ScoreDoc[] hits = results.scoreDocs;

    // Iterate through the results
    for (ScoreDoc hit : hits) {
        Document hitDoc = indexSearcher.doc(hit.doc);
        System.out.println("Hit: " + hitDoc.get("title"));
    }

    // Post-processing
    indexReader.close();
    directory.close();
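
The idea behind a top-k result such as TopDocs can be sketched with a bounded min-heap. This is an illustration of the general technique only: `TopKSketch` and `Hit` are hypothetical names, the scores are supplied by the caller, and nothing here claims to mirror how IndexSearcher collects hits internally.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Keeps only the k highest-scoring hits while scanning candidates once.
// Hypothetical sketch of top-k selection; not Lucene's collector implementation.
class TopKSketch {

    static final class Hit {
        final int doc;     // docid
        final float score; // relevance score, higher is better

        Hit(int doc, float score) {
            this.doc = doc;
            this.score = score;
        }
    }

    static List<Hit> topK(List<Hit> candidates, int k) {
        if (k <= 0) {
            return new ArrayList<>();
        }
        // Min-heap on score: the weakest of the current top k sits at the head.
        PriorityQueue<Hit> heap = new PriorityQueue<>(Comparator.comparingDouble((Hit h) -> h.score));
        for (Hit h : candidates) {
            if (heap.size() < k) {
                heap.add(h);
            } else if (h.score > heap.peek().score) {
                heap.poll(); // evict the weakest hit
                heap.add(h);
            }
        }
        List<Hit> result = new ArrayList<>(heap);
        result.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed()); // best first
        return result;
    }

    public static void main(String[] args) {
        List<Hit> hits = topK(List.of(new Hit(0, 0.5f), new Hit(1, 2.0f), new Hit(2, 1.0f)), 2);
        for (Hit hit : hits) {
            System.out.println("doc=" + hit.doc + " score=" + hit.score);
        }
    }
}
```

Because the heap never grows beyond k entries, memory stays bounded by the requested result size rather than by the number of matching documents.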

Slide 31

Slide 31 text

Lucene Queries

Class                      Description
Query                      Abstract base class of all queries
TermQuery                  Matches a Term
PrefixQuery                Search by prefix
PhraseQuery                Phrase search
PhrasePrefixQuery          Prefix search on phrases
SpanQuery, SpanTermQuery   Search that takes ranges of positions (spans) into account
TermRangeQuery             Range search in the byte order of terms (BytesRef)
NumericRangeQuery          Range search on numeric values
FuzzyQuery                 Fuzzy search
BooleanQuery               Boolean search (a compound query that combines other queries)
FilteredQuery              Filtering search

Slide 32

Slide 32 text

Scoring by Similarities
Customize ranking models

- IndexSearcher has a Similarity
  - It has a BM25Similarity (based on the Okapi BM25 model) as the default similarity
  - A while ago, it was a TFIDFSimilarity (based on the Vector Space Model)
- You can set another similarity with IndexSearcher#setSimilarity()
  - DFRSimilarity, DFISimilarity, IBSimilarity, LMDirichletSimilarity, LMJelinekMercerSimilarity, and so on

ref: org.apache.lucene.search.similarities (Lucene 8.8.1 API)
https://lucene.apache.org/core/8_8_1/core/org/apache/lucene/search/similarities/package-summary.html
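
As a rough picture of what a BM25-style similarity computes, the following plain-Java sketch scores one term against one document. It is a simplified formula under stated assumptions, not the BM25Similarity source: `Bm25Sketch` is a hypothetical class, the parameter values k1 = 1.2 and b = 0.75 are the commonly used ones, document length is taken as an exact number, and the constant (k1 + 1) factor found in textbook variants is left out because it rescales scores without changing the ranking.

```java
// Simplified Okapi BM25 scoring for a single term in a single document.
// Hypothetical sketch; parameter names and structure are chosen for illustration.
class Bm25Sketch {
    static final double K1 = 1.2;  // controls how quickly term frequency saturates
    static final double B = 0.75;  // controls how strongly document length is normalized

    // Rarer terms get a higher weight: idf = ln(1 + (N - n + 0.5) / (n + 0.5))
    static double idf(long docCount, long docFreq) {
        return Math.log(1.0 + (docCount - docFreq + 0.5) / (docFreq + 0.5));
    }

    // termFreq: occurrences of the term in the document
    // docFreq: number of documents containing the term
    // docCount: total number of documents
    // docLength / avgDocLength: length of this document relative to the average
    static double score(double termFreq, long docFreq, long docCount,
                        double docLength, double avgDocLength) {
        double lengthNorm = K1 * (1.0 - B + B * docLength / avgDocLength);
        return idf(docCount, docFreq) * termFreq / (termFreq + lengthNorm);
    }

    public static void main(String[] args) {
        // Two documents; a term that occurs in only one of them, once, in a 3-token document.
        System.out.println(score(1, 1, 2, 3, 2.5));
        // The same term occurring twice scores higher, but less than twice as high.
        System.out.println(score(2, 1, 2, 3, 2.5));
    }
}
```

The saturation in the term-frequency part and the length normalization are the two behaviours that distinguish this family of models from a plain TF-IDF weighting.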

Slide 33

Slide 33 text

Other Topics
And more

- Index File Binary Format
- Deleting Documents / LiveDoc
- Optimizing Index
- Working with Filters / FilterCache
- File System Cache / MMapDirectory
- Near Real-time Search (NRT)

Slide 34

Slide 34 text

Reference

- Official document
  https://lucene.apache.org/core/8_8_1/index.html
- API Javadoc
  https://lucene.apache.org/core/8_8_1/core/index.html
- Changes
  https://lucene.apache.org/core/8_8_1/changes/Changes.html
- Issues
  https://issues.apache.org/jira/projects/LUCENE/issues
- Books
  - Lucene in Action, Second Edition
    https://www.manning.com/books/lucene-in-action-second-edition
- Videos
  - Lucidworks
    https://www.youtube.com/channel/…/search?query=Lucene
  - Lucene/Solr Revolution
    https://www.youtube.com/channel/…/videos