From No. 1001 to No. 333

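Shandy Yu, a data engineer on the LINE Taiwan data engineering team and a former intern from the previous cohort who successfully converted to a full-time role, presents "From No. 1001 to No. 333".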

LINE Developers Taiwan

March 13, 2024

Transcript

  1. Shandy Yu
    • National Yang Ming Chiao Tung University (NYCU), Arete Honors Program (百川學士學位學程), Class of 111
    • Data Scientist @LINE Taiwan
    • TECH FRESH @LINE Taiwan
    • Consulting Service Intern @Microsoft Taiwan
    • Research Assistant @NYCU CIRDA
    • Software Engineer Intern @ITRI
  2. What does Data Dev do?
    LINE Family Services: LINE TODAY, LINE SHOPPING, LINE SPOT, LINE INVOICE, LINE STICKER, LINE VOOM, LINE TRAVEL, LINE OA
    Data Dev areas: NLP, Generative AI, Data Analytics
    Capabilities: NER, Classifier, Duplication Detector, Auto completion, Keyword Extraction, Related Search, Text Generation, User Tagging, Uplift Modeling, Recommendation, CLV
    Source: Penny
  3. NLPaaS: SmartText
    SmartText 1.0, Automatic NLP: Classifier, Multi-label Classifier, Topic Detection
    SmartText 2.0, Generative NLP: Summarization, Paraphrasing, Question-Answering
    Beyond NLP: Image Search, Image Generation, Audio Interaction
  4. What does Data Dev do?
    LINE Family Services: LINE TODAY, LINE SHOPPING, LINE SPOT, LINE INVOICE, LINE STICKER, LINE VOOM, LINE TRAVEL, LINE OA
    Data Dev areas: NLP, Generative AI, Data Analytics
    Capabilities: NER, Classifier, Duplication Detector, Auto completion, Keyword Extraction, Related Search, Text Generation, User Tagging, Uplift Modeling, Recommendation, CLV
    Source: Penny
  5. Common organizational structure
    Data Engineer (Pipeline)
      Responsibility: • Build and optimize data pipeline architecture • Assemble large, complex data sets that meet requirements
      Skill: Big data infra, SQL, ETL, message queuing
    Data Analyst (Biz)
      Responsibility: • Interpret data, analyze results using statistical techniques • Identify, analyze, and interpret trends or patterns in complex data sets
      Skill: Statistics, Data Visualization, Business Knowledge
    Data Scientist (Model)
      Responsibility: • Select appropriate datasets and data representation methods • Research and implement appropriate ML algorithms
      Skill: Machine learning, deep learning, CV, NLP, Speech
    ML Engineer (Service)
      Responsibility: • Build and scale machine learning infrastructure • Monitor model performance
      Skill: System infrastructure design, DevOps
  6. As a Data Scientist @Data Dev
    Data Scientist: • Build prediction models • Study cutting-edge technologies
    Chart: 30%
    Source: Tomaz Bratanic (2021 Jul 20), Turn a Harry Potter Book into a Knowledge Graph, https://neo4j.com/developer-blog/turn-a-harry-potter-book-into-a-knowledge-graph/
  7. As a Data Scientist @Data Dev
    Data Analyst: • Discover the data • Design BI platform
    Chart: 30%, 30%
  8. As a Data Scientist @Data Dev
    Data Engineer: • Clean the data • Design the ETL • Design the automatic pipeline
    Chart: 30%, 30%, 20%
    Diagram: Source DB, Extract, Transform, Load, Data Warehouse
  9. As a Data Scientist @Data Dev
    Others: • Requirements Discussion • Internal/External Speech • Training
    Chart: 30%, 30%, 20%, 10%
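
The "Clean the data / Design the ETL" items and the Source DB to Data Warehouse diagram on slide 8 can be illustrated with a minimal sketch. This is not the speaker's pipeline: the table names (`raw_orders`, `user_spend`), the columns, and the cleaning rules are all hypothetical, and two in-memory SQLite databases stand in for the source DB and the data warehouse.

```python
import sqlite3


def extract(source):
    """Extract: read raw rows from the (hypothetical) source table."""
    return source.execute(
        "SELECT user_id, country, amount FROM raw_orders"
    ).fetchall()


def transform(rows):
    """Transform: drop incomplete rows, normalise country codes, sum per user."""
    totals = {}
    for user_id, country, amount in rows:
        if user_id is None or amount is None or not country:
            continue  # cleaning step: skip rows with missing fields
        key = (user_id, country.strip().upper())
        totals[key] = totals.get(key, 0.0) + float(amount)
    return [(u, c, t) for (u, c), t in sorted(totals.items())]


def load(warehouse, rows):
    """Load: rewrite the (hypothetical) warehouse table with the clean rows."""
    warehouse.execute(
        "CREATE TABLE IF NOT EXISTS user_spend "
        "(user_id INTEGER, country TEXT, total REAL)"
    )
    warehouse.execute("DELETE FROM user_spend")  # full refresh keeps reruns idempotent
    warehouse.executemany("INSERT INTO user_spend VALUES (?, ?, ?)", rows)
    warehouse.commit()


def run_pipeline(source, warehouse):
    """Run extract, transform, and load in order."""
    load(warehouse, transform(extract(source)))


def make_demo_source():
    """Build an in-memory source DB with a few deliberately messy rows."""
    source = sqlite3.connect(":memory:")
    source.execute(
        "CREATE TABLE raw_orders (user_id INTEGER, country TEXT, amount REAL)"
    )
    source.executemany(
        "INSERT INTO raw_orders VALUES (?, ?, ?)",
        [
            (1, " tw ", 100.0),
            (1, "TW", 50.0),
            (2, "jp", 30.0),
            (None, "TW", 10.0),  # missing user_id, dropped in transform
            (3, "TW", None),     # missing amount, dropped in transform
        ],
    )
    source.commit()
    return source


if __name__ == "__main__":
    src = make_demo_source()
    dwh = sqlite3.connect(":memory:")
    run_pipeline(src, dwh)
    for row in dwh.execute("SELECT * FROM user_spend ORDER BY user_id"):
        print(row)
```

Keeping the three stages as separate functions means each can be tested on its own, and a scheduler could call `run_pipeline` to cover the "automatic pipeline" item; the full-refresh load is just one simple choice among several.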