

Lablup Inc.
November 03, 2025

Making Sense of HS Codes: HSense AI System for Automated Tariff Classification

Track 3_1615_1_Lablup Conf 2025_Sergey


Transcript

  1. Making Sense of HS Codes: HSense AI System for Automated Tariff Classification
    Leksikov Sergey
  2. Outline
    • Korea’s Tariff Classification Landscape
    • Solution: HSense
    • HSense Team workflow
      • Tariff Officer
      • GRI Rule Expert
      • Legal Framework Expert
      • HS Code Navigator
    • Technical details
      • Tech stack
      • Core ideas
      • Accuracy performance
      • Challenges
    • Code snippet time
  3. The HS Code Tariff System
    • In Korea, an HS (Harmonized System) code is a 10-digit number that classifies every traded product; the first 6 digits are shared internationally and the remaining digits are national
    • Determines tariffs: 0% vs 8% can mean millions in costs for imported goods
    • Required by law: every import/export needs a correct classification
    • 11,000+ possible codes in Korea's system
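
The structure of a 10-digit code and the cost impact of a tariff rate can be sketched in a few lines. This is only an illustration: the level names, the sample code string, and the customs value are assumptions chosen for the example, not data from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HSCode:
    """A 10-digit code split into its hierarchy prefixes."""
    chapter: str     # first 2 digits
    heading: str     # first 4 digits
    subheading: str  # first 6 digits (internationally shared part)
    national: str    # all 10 digits


def parse_hs_code(raw: str) -> HSCode:
    """Strip common separators and split a 10-digit code into levels."""
    digits = raw.replace(".", "").replace("-", "").replace(" ", "")
    if len(digits) != 10 or not digits.isdigit():
        raise ValueError(f"expected 10 digits, got {raw!r}")
    return HSCode(digits[:2], digits[:4], digits[:6], digits)


def duty(customs_value: float, rate_percent: float) -> float:
    """Duty owed on a shipment at a flat ad valorem rate."""
    return customs_value * rate_percent / 100


# Illustrative code string and shipment value (assumptions).
code = parse_hs_code("8517.13.0000")
extra_cost = duty(50_000_000, 8) - duty(50_000_000, 0)
```

On a shipment valued at 50,000,000, the gap between a 0% and an 8% classification is 4,000,000 in duty, which is why a single wrong digit can be expensive.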
  4. Korea's Classification Landscape
    • Product chapters numbered 01-97 (chapter 77 is reserved and unused)
    • Each chapter branches into hundreds of subcategories
    • From live animals “01” to artwork “97”
    • Complex hierarchy requires systematic navigation
  5. Solution: HSense
    • HSense: AI System for Automated Tariff Classification
    • Multi-agent collaboration
    • Coordinated specialists:
      • Tariff Officer + GRI Rule Expert + Codes Definition Expert + HS Code Navigator
    • Making sense of 11,000+ codes through systematic evidence synthesis
  6. Benefits of Agentic AI Team System
    • Rule-based approach grounded in the legal framework
    • Robustness for new products and inventions:
      • Product-case databases cannot keep up with new products released on the market
      • The UNIPASS product case dataset covers only about 30% of all possible HS codes
    • No need to fine-tune an LLM on the domain or on product cases, which saves cost and time
    • Can explain its decision making, unlike a black-box deep learning approach
  7. Tariff Officer - The Coordinator
    • Coordinates work between team member agents
    • Guides the search
    • Makes the final decision
  8. GRI Rule Expert - The Interpreter
    • Explains the rules for classification
    • Provides interpretation and guidance
    • Rule-based reasoning
  9. Legal Framework Expert - The Scholar
    • Expert in the legal document:
      • 2000+ pages of legal text
      • ~3 million tokens
    • Uses retrieval tools:
      • Retrieval-Augmented Generation (RAG) tool over a vector store for semantic search
      • SQLite database for keyword search over the legal framework
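
The keyword-search half of this tool set can be sketched with SQLite's FTS5 extension, which the tech stack slide also lists. The table name, column names, and sample rows below are placeholders invented for illustration; they are not the actual schema or legal text, and the vector-store half is left out.

```python
import sqlite3

# In-memory database with a full-text index (requires SQLite built with FTS5).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE legal_notes USING fts5(section, body)")
conn.executemany(
    "INSERT INTO legal_notes (section, body) VALUES (?, ?)",
    [
        # Placeholder rows, not real legal text.
        ("Section A", "Placeholder text about electrical machinery and batteries."),
        ("Section B", "Placeholder text about textile fabrics and garments."),
        ("Section C", "Placeholder text about batteries sold together with toys."),
    ],
)


def keyword_search(conn, query, limit=3):
    """Return (section, highlighted snippet) pairs, best match first."""
    return conn.execute(
        "SELECT section, snippet(legal_notes, 1, '[', ']', '...', 8) "
        "FROM legal_notes WHERE legal_notes MATCH ? "
        "ORDER BY rank LIMIT ?",
        (query, limit),
    ).fetchall()


hits = keyword_search(conn, "batteries")
```

Exact keyword lookup of this kind complements semantic retrieval: it finds passages that use a precise legal term even when an embedding search ranks them low.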
  10. HS Code Navigator - The Pathfinder
    • Navigates the code hierarchies from 2 digits to 10 digits
    • Tool:
      • SQLite database of code hierarchies with keyword descriptions
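
One way to picture the navigator's tool is a single table of code prefixes that is walked one level at a time. The sketch below assumes levels of 2, 4, 6, and 10 digits and uses a hypothetical `hs_codes` table with placeholder descriptions; the real database layout is not shown in the talk.

```python
import sqlite3

LEVELS = (2, 4, 6, 10)  # assumed prefix lengths of the hierarchy

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hs_codes (code TEXT PRIMARY KEY, description TEXT)")
conn.executemany(
    "INSERT INTO hs_codes VALUES (?, ?)",
    [
        # Placeholder rows for illustration only.
        ("84", "placeholder chapter"),
        ("85", "placeholder chapter"),
        ("8517", "placeholder heading"),
        ("8518", "placeholder heading"),
        ("851713", "placeholder subheading"),
        ("851714", "placeholder subheading"),
        ("8517130000", "placeholder national code"),
    ],
)


def children(conn, parent=""):
    """List codes one level below `parent`; "" lists the 2-digit chapters."""
    if parent and len(parent) not in LEVELS[:-1]:
        raise ValueError(f"{parent!r} is not an expandable level")
    next_len = LEVELS[0] if not parent else LEVELS[LEVELS.index(len(parent)) + 1]
    rows = conn.execute(
        "SELECT code FROM hs_codes "
        "WHERE length(code) = ? AND code LIKE ? ORDER BY code",
        (next_len, parent + "%"),
    ).fetchall()
    return [code for (code,) in rows]
```

Calling `children(conn)`, then `children(conn, "85")`, then `children(conn, "8517")` narrows the search step by step, so the agent only ever sees the handful of candidates under the branch it has already chosen.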
  11. Core ideas
    • LLM context management:
      • Keep context size small
      • Specialization
      • Ensemble and MoE (Mixture of Experts) methods have proven effective in machine learning
    • Collaborative decision making:
      • Multiple perspectives
      • Cross-validation
      • Consensus building
    • Task decomposition:
      • Divide and conquer
      • Systematic evidence gathering
  12. Tech stack
    • Agentic framework: Agno
    • Semantic database (vector store): FAISS (Facebook AI Similarity Search)
    • Korean text PDF to structured Markdown: https://github.com/datalab-to/marker
      • Tried about 5 options
      • Another tool that produced well-structured Markdown: https://github.com/microsoft/markitdown
    • LangChain for chunking based on Markdown headers
    • SQLite with the FTS5 extension for full-text search
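
Header-based chunking can be illustrated without any dependencies. The function below is a plain-Python stand-in for the idea, not LangChain's splitter: it cuts a Markdown document at each header and attaches the chain of enclosing headers to every chunk. The sample document is invented.

```python
import re

HEADER_RE = re.compile(r"^(#{1,6})\s+(.*)$")


def split_by_headers(markdown: str) -> list[dict]:
    """Split Markdown into chunks, each tagged with its header path."""
    chunks = []
    path = {}    # header level -> title of the currently open header
    buffer = []  # body lines collected since the last header

    def flush():
        text = "\n".join(buffer).strip()
        if text:
            headers = [path[level] for level in sorted(path)]
            chunks.append({"headers": headers, "text": text})
        buffer.clear()

    for line in markdown.splitlines():
        match = HEADER_RE.match(line)
        if match:
            flush()
            level = len(match.group(1))
            # A new header closes any open header at the same or deeper level.
            for stale in [k for k in path if k >= level]:
                del path[stale]
            path[level] = match.group(2).strip()
        else:
            buffer.append(line)
    flush()
    return chunks


doc = "# Section A\nIntro text.\n## Chapter 1\nNote one.\n## Chapter 2\nNote two.\n"
chunks = split_by_headers(doc)
```

Keeping the header path with each chunk preserves some of the document hierarchy that a flat vector store would otherwise lose. The sketch ignores edge cases such as `#` lines inside fenced code.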
  13. Accuracy performance
    • Work in progress; accuracy is within the range of other commercial models
    • “Benchmarking Harmonized Tariff Schedule Classification Models”, Bryce Judy, 11 November 2024
    • Accuracy: 44-89% on 10 digits, top-1
    • Optimization is needed on multiple levels:
      • Retrieval
      • Prompt
      • Team composition
    • Inference scalability:
      • Add a RAG pipeline with product cases
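
Top-1 accuracy "on 10 digits" presumably means the single predicted code must match the reference code exactly. A small sketch, assuming that reading, also shows how the same predictions score at coarser levels of the hierarchy. The predicted and reference codes are made up for the example.

```python
def top1_accuracy(predictions, gold, digits=10):
    """Share of predictions whose first `digits` digits match the reference."""
    if len(predictions) != len(gold) or not gold:
        raise ValueError("need two non-empty lists of equal length")
    hits = sum(p[:digits] == g[:digits] for p, g in zip(predictions, gold))
    return hits / len(gold)


# Invented evaluation data: one code per product.
gold = ["8517130000", "6109100000", "0101210000", "9701910000"]
pred = ["8517130000", "6109109000", "0102210000", "9701910000"]

by_level = {d: top1_accuracy(pred, gold, digits=d) for d in (2, 6, 10)}
```

Reporting accuracy per level helps separate two failure modes: picking the wrong chapter entirely versus landing in the right subheading but missing the national digits.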
  14. Challenges
    • OpenAI API rate limits prevent optimization and repeated evaluation runs
    • Chunking and a vector store are not the right tools for a hierarchical document structure
    • Few open-source LLMs are fast, capable, and good at Korean
    • Transparency and monitoring:
      • Hard to debug
      • If the HSense Manager makes a wrong decision, all members work hard in the wrong direction until the end
      • Need a system to backtrack a decision and see the alternatives
  15. Scalability of Team of Agents
    • Multi-agent systems are naturally slower due to LLM processing time
    • Practical solutions:
      • Overnight batch processing: run multiple agent teams on product of 100 catalogs
      • Quality-first approach: accuracy matters more than real-time speed
    • Market context:
      • Existing slow systems prove their value and are in demand:
        • AI research and report writers (Perplexity, Gemini Deep Research): 20+ minutes per report
        • OpenAI reasoning pro models: PhD-level problem solving takes time
      • Companies in specific industries have a limited set of product categories
        • The domain and search space can be narrowed down
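
Overnight batch processing can be sketched as a bounded pool of workers, each running one classification at a time. The `classify` callable below stands in for a full agent-team run and is hypothetical, as is `fake_classify`; the worker count is an assumption that would be tuned to the API rate limit.

```python
from concurrent.futures import ThreadPoolExecutor


def run_batch(descriptions, classify, max_workers=4):
    """Classify many products concurrently; one failure does not stop the batch."""

    def safe(description):
        try:
            return {"code": classify(description), "error": None}
        except Exception as exc:  # record the failure and keep going
            return {"code": None, "error": str(exc)}

    # max_workers caps how many requests are in flight at once.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(zip(descriptions, pool.map(safe, descriptions)))


def fake_classify(description):
    """Hypothetical stand-in for an agent-team run."""
    if "broken" in description:
        raise RuntimeError("boom")
    return "0000000000"


results = run_batch(["item a", "broken item"], fake_classify, max_workers=2)
```

Failed items stay in the result with their error message, so they can be retried in a later run instead of forcing the whole batch to restart.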
  16. Basic Agent with Reasoning Tool - 1
    • Pydantic schema for structured output
    • Reasoning Tool: the agent decides when to enter reasoning mode. Example: a complex product with different components or materials
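
The slide's code is not part of this transcript. To show what a structured-output schema buys, here is a dependency-free stand-in that uses a standard-library dataclass in place of Pydantic. The field names and validation rules are assumptions made for illustration, not the schema from the talk.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Classification:
    """Hypothetical shape of an agent's final answer."""
    hs_code: str       # 10-digit code kept as a string to preserve leading zeros
    rationale: str     # short explanation of the decision
    confidence: float  # self-reported score between 0 and 1

    def __post_init__(self):
        code = self.hs_code
        if not (isinstance(code, str) and len(code) == 10 and code.isdigit()):
            raise ValueError("hs_code must be a 10-digit string")
        if not isinstance(self.rationale, str) or not self.rationale.strip():
            raise ValueError("rationale must be a non-empty string")
        score = self.confidence
        if not isinstance(score, (int, float)) or not 0.0 <= score <= 1.0:
            raise ValueError("confidence must be between 0 and 1")


def parse_model_output(raw: str) -> Classification:
    """Parse JSON text from the model and validate it against the schema."""
    return Classification(**json.loads(raw))


answer = parse_model_output(
    '{"hs_code": "0101210000", "rationale": "placeholder", "confidence": 0.7}'
)
```

Rejecting malformed answers at this boundary means downstream code never has to handle a 5-digit code or a missing rationale. Pydantic provides the same guarantee with less boilerplate and clearer error reports.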
  17. Basic Agent with Reasoning Tool - 2
    • Continuation of the Agent() object initialization
    • Add memory to keep sessions and conversations
      • Can be used with SQLite or PostgreSQL for persistence
    • Set response_model to the Pydantic schema for structured output
    • Logging options that show detailed tool calls and intermediate steps
    • Multimodal support for images out of the box if the model is multimodal
  18. Team definition snippet with modes
    • Team modes:
      • collaborate
      • coordinate
      • route
    • The Team itself also acts as an Agent with its own prompt
    • Team coordination, decision steps, and iterations are all hidden behind the Agno framework