ABYSS engineer. What do I like? I like fast things, especially cars. One of my favorite hobbies is Mini 4WD. That is why my internal Slack icon is a Mini 4WD that I built myself.
bottom." You may know the related adjective abysmal, which means "appallingly bad" (or "way down in the depths," as it were). From vocabulary.com. Image: aflo
packaged and published. • Service personnel install and maintain the application on real servers. • Knowledge is collected within each service. • However, there is no guarantee that this knowledge works well in other services, because the application execution environments differ from one another.
are collected in one place. • This knowledge works well in other services because the application execution environments are the same. • Platform personnel focus on reducing operating costs and incorporating modern technologies. • Service personnel focus on improving their service. ABYSS provides a foundation for carrying successful cases over from one service to another. Roles are clarified.
layer Container layer Data • Fed data is saved on the VM layer, outside the container layer. • ABYSS operators can recreate containers without worrying about data loss. Reduce operating costs with virtualization technology.
lose its data due to physical layer failure. • In ABYSS, each shard has at least 3 replicas for redundancy. • Thanks to Solr’s replication, a shard does not lose data even if one VM loses its data. Reduce operating costs through redundancy.
not safe if some VMs in one shard run on the same physical layer. • ABYSS has a checker component that checks whether shards have a single point of failure (SPOF) on the physical layer. • If a SPOF is found, the ABYSS operator removes it by swapping VMs. Reduce operating costs by removing single points of failure (SPOF).
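The SPOF check described on this slide can be sketched roughly as follows. This is a minimal illustration, not the actual ABYSS checker: the `Replica` record, the `find_spof_shards` function, and all VM and host names are hypothetical, and the sketch assumes that a shard has a SPOF whenever two or more of its replica VMs run on the same physical host.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    """One replica of a shard (hypothetical record, not an ABYSS type)."""
    shard: str          # shard name
    vm: str             # VM that holds this replica
    physical_host: str  # physical machine the VM runs on


def find_spof_shards(replicas):
    """Return {shard: [hosts shared by 2+ replicas]} for shards with a SPOF."""
    hosts_by_shard = {}
    for r in replicas:
        hosts_by_shard.setdefault(r.shard, []).append(r.physical_host)

    spof = {}
    for shard, hosts in hosts_by_shard.items():
        # A host that appears more than once carries several replicas of the shard.
        shared = sorted(h for h, n in Counter(hosts).items() if n > 1)
        if shared:
            spof[shard] = shared
    return spof


replicas = [
    Replica("shard1", "vm-a", "host-1"),
    Replica("shard1", "vm-b", "host-2"),
    Replica("shard1", "vm-c", "host-3"),
    Replica("shard2", "vm-d", "host-1"),
    Replica("shard2", "vm-e", "host-1"),  # same physical host as vm-d
    Replica("shard2", "vm-f", "host-4"),
]
print(find_spof_shards(replicas))
```

In this example only shard2 is reported, because two of its three replicas share host-1; swapping vm-e onto another host would clear the finding.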
requests. The platform team incorporates these extensions into ABYSS, so that service teams can use the requested functions before Solr natively supports them. Development relationship
"guess" the document vectors nearest to a query vector. ANN drastically reduces search latency at the cost of only a little accuracy. The case of the Approximate Nearest Neighbor search (ANN) plugin
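To make the latency/accuracy trade-off concrete, here is a small sketch of one well-known ANN technique, random-hyperplane locality-sensitive hashing. The slides do not say which algorithm the ANN plugin uses, so this is an assumption chosen purely for illustration; the `HyperplaneIndex` class and its parameters are hypothetical.

```python
import math
import random


def signature(vec, planes):
    """Bit tuple recording which side of each random hyperplane vec falls on."""
    return tuple(sum(p * v for p, v in zip(plane, vec)) >= 0 for plane in planes)


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


class HyperplaneIndex:
    """Toy ANN index (hypothetical): vectors are bucketed by hyperplane signature."""

    def __init__(self, dim, n_planes=4, seed=0):
        rng = random.Random(seed)
        self.planes = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_planes)]
        self.buckets = {}   # signature -> list of document ids
        self.vectors = {}   # document id -> vector

    def add(self, doc_id, vec):
        self.vectors[doc_id] = vec
        self.buckets.setdefault(signature(vec, self.planes), []).append(doc_id)

    def search(self, query, k=3):
        # Score only the documents in the query's bucket, not the whole index.
        candidates = self.buckets.get(signature(query, self.planes), [])
        ranked = sorted(candidates,
                        key=lambda d: cosine(query, self.vectors[d]),
                        reverse=True)
        return ranked[:k]


rng = random.Random(1)
vectors = {i: [rng.gauss(0, 1) for _ in range(8)] for i in range(1000)}
index = HyperplaneIndex(dim=8)
for doc_id, vec in vectors.items():
    index.add(doc_id, vec)

print(index.search(vectors[42], k=3))
```

Only the vectors in the query's bucket are scored, which is where the latency saving comes from. A true nearest neighbor that landed in a different bucket is missed, which is the accuracy cost.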
Timeline: 2020, 2021, 2022. Background: In the fields of image search and natural language processing (NLP), vector search is attracting attention.
first service to use the ANN plugin consulted ABYSS in June 2021. A prototype of the ANN plugin was delivered to ABYSS in November 2020. The service switched to using the ANN plugin in October 2021.
vector Search articles with the user vector. Feed articles with their article vectors. A simple vector search runs in ABYSS. Machine learning runs outside of ABYSS. More details: https://www.slideshare.net/techblogyahoo/yjtc-yjtc21-a1-241223218 Image: aflo
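The division of labor on this slide can be sketched as follows. The vectors are assumed to be produced by machine learning outside the search platform, so they are hard-coded here. The function name `search_articles`, the article ids, and the use of a dot product as the similarity score are all assumptions for illustration; the slides do not state which similarity function is used.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def search_articles(user_vector, article_vectors, top_k=2):
    """Rank fed articles by similarity to the user vector (exact, brute force)."""
    ranked = sorted(article_vectors.items(),
                    key=lambda item: dot(user_vector, item[1]),
                    reverse=True)
    return [article_id for article_id, _ in ranked[:top_k]]


# "Feed": article vectors computed elsewhere (hypothetical values).
articles = {
    "sports-1": [0.9, 0.1, 0.0],
    "tech-1": [0.1, 0.9, 0.2],
    "tech-2": [0.0, 0.8, 0.6],
}

# "Search": user vector computed elsewhere (hypothetical values).
user = [0.1, 0.7, 0.3]
print(search_articles(user, articles))
```

The search side only compares vectors, so it stays simple and cheap; all model inference happens before feeding and before querying.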
many requests to be received by one cluster. To distribute them, we use many clusters that hold the same data. We build the clusters with middle spec VMs because only a simple vector search runs on them. VM tiers: High-end spec VM, High spec VM, Middle spec VM (← We provide!), Low spec VM.
result in 2 phases. • 1st phase: score questions using simple morphological analysis, then pass the top N (determined by a tuning parameter) in each shard to the 2nd phase. • 2nd phase: score the N × (number of shards) questions using machine learning, then rerank the questions by these scores and return the top 10. This is faster than using machine learning to score every question that hits the search query.
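The two phases can be sketched as below. This is an illustration under stated assumptions, not the service's implementation: `two_phase_search` is a hypothetical name, the cheap scorer is a plain term-overlap count standing in for morphological analysis, and the "machine learning" scorer is a placeholder that simply prefers shorter questions.

```python
import heapq


def two_phase_search(shards, cheap_score, ml_score, n_per_shard, top_k=10):
    """Keep the top N per shard by cheap score, then rerank them by ML score."""
    candidates = []
    for shard in shards:
        # 1st phase: runs inside each shard over every matching question.
        candidates.extend(heapq.nlargest(n_per_shard, shard, key=cheap_score))
    # 2nd phase: the expensive scorer sees at most N * len(shards) questions.
    return heapq.nlargest(top_k, candidates, key=ml_score)


query_terms = {"solr", "replica"}
scored_by_ml = set()  # records which questions reached the 2nd phase


def cheap_score(question):
    # Stand-in for morphological analysis: count matching query terms.
    return len(query_terms & set(question.split()))


def ml_score(question):
    # Placeholder for a real model: shorter questions score higher.
    scored_by_ml.add(question)
    return -len(question)


shards = [
    ["how to add a solr replica", "solr replica recovery is slow",
     "what is a shard", "best pizza in tokyo"],
    ["solr query syntax", "replica count for solr cloud",
     "mini 4wd motor tips", "how to tune jvm heap"],
]

result = two_phase_search(shards, cheap_score, ml_score, n_per_shard=2, top_k=3)
print(result)
print(len(scored_by_ml), "of 8 questions reached the ML scorer")
```

With N = 2 and 2 shards, the expensive scorer sees only 4 of the 8 questions. Raising N trades latency for a lower chance of dropping a good question in the first phase.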
use high-end spec VMs that have large memory and high-performance vCPUs. Because Solr is a full-text search application, holding all the text in memory greatly improves its performance. Because we use machine learning, we also need high-performance vCPUs. VM tiers: High-end spec VM (← We provide!), High spec VM, Middle spec VM, Low spec VM.