Applications of Machine Learning in Smart Operations

July 25, 2022 @百業分享堂

Janpu Hou

August 17, 2022

Transcript

  1. SMART OPERATIONS TECHNOLOGY FORUM
     Artificial Intelligence for IT & IoT Operations: Applications of Machine Learning in Smart Operations
     Janpu Hou (侯展璞), July 25, 2022 @百業分享堂
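  2. Capabilities that corporate governance requires
     • The ability to see opportunities in an uncertain environment
       ▪ Identify operational anomalies in time
     • The ability to steer the direction of operations and execute correctly
       ▪ Trace problems back to their root
     • The ability to keep operations agile and under control
       ▪ Correct errors in time to maintain normal operations
     A tool that assists the executors: AIOps
     Diagram labels: leader, strategy, executor, capability, energy, resource-allocation decisions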
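  3. Challenges after a company's digital optimization
     Networks: Internet of Things, Internet of People, relationship network
     Company operations: Product & Services, Business Processes, Company Resources
     Manage IT operational data
     IT Ops challenges:
     • Too complicated to identify potential risk
     • Too long to find the root cause
     • Too late to take corrective action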
  4. The governance tool after a company's digital optimization: AIOps
     Big Operational Data, Smart IT Infrastructure
     AIOps provides operational data analytics from smart IT infrastructure.
     AI solution:
     • Build smart IT infrastructure
     • Provide operational data analytics
     • Automate operational actions
  5. AIOps use cases
     AI-led IT operations, or "AIOps", lets smart cities get a handle on their increasingly complex technology systems. AIOps, including AI bots, can leverage AI/ML for operational automation and to analyze cross-domain context and alerts in seconds.
     • Smart city operations: Las Vegas (2018), San Diego (2020)
     • Smart financial operations: KeyBank (2017), Wells Fargo (2021)
     Digital Transformation at KeyBank (HQ in Cleveland) with AIOps: "The data volumes we process are truly astounding. KeyBank's online banking system alone clocks 35 customer logins per second."
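  6. The standard decision process in operations management
     • Identify the problem: discuss operational bottlenecks
     • Root cause analysis: find the source of the problem
     • Select an approach to resolve the problem: propose a solution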
  7. AIOps product design: product architecture
     • Anomaly Detection
     • Root Cause Analysis
     • Knowledge Graph
     • Take actions based on recommendation
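The architecture on slide 7 can be read as a three-stage pipeline: detect anomalies, locate the root cause, then recommend an action. The sketch below is only an illustration of that flow, not the product's actual design. The service names, the dependency map, the latency threshold, and the playbook are all hypothetical, and simple rules stand in for the machine-learning engines described on later slides.

```python
from dataclasses import dataclass

# Hypothetical dependency map: each service lists the services it depends on.
DEPENDS_ON = {
    "web": ["api"],
    "api": ["db", "cache"],
    "db": [],
    "cache": [],
}

# Hypothetical playbook mapping a root-cause service to a recommended action.
PLAYBOOK = {
    "db": "fail over to the replica database",
    "cache": "restart the cache node",
    "api": "roll back the latest API release",
    "web": "scale out the web tier",
}


@dataclass
class Incident:
    anomalous: list
    root_cause: str
    action: str


def detect_anomalies(latency_ms, threshold_ms=500.0):
    """Stage 1: flag services whose latency exceeds a fixed threshold."""
    return sorted(name for name, value in latency_ms.items() if value > threshold_ms)


def find_root_cause(anomalous):
    """Stage 2: return the first anomalous service with no anomalous dependency."""
    flagged = set(anomalous)
    for service in anomalous:
        if not any(dep in flagged for dep in DEPENDS_ON.get(service, [])):
            return service
    return None


def recommend_action(root_cause):
    """Stage 3: look the root cause up in the playbook."""
    return PLAYBOOK.get(root_cause, "escalate to the on-call engineer")


def run_pipeline(latency_ms):
    anomalous = detect_anomalies(latency_ms)
    if not anomalous:
        return None
    root = find_root_cause(anomalous)
    return Incident(anomalous, root, recommend_action(root))


incident = run_pipeline({"web": 900.0, "api": 850.0, "db": 1200.0, "cache": 20.0})
print(incident)
```

In this toy run the web, API, and database tiers are all slow, but only the database has no slow dependency of its own, so it is reported as the root cause.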
  8. AIOps product design: product architecture
     Wells Fargo Open AIOps Platform
     • Apache Flink, PyFlink
     • Keras, TensorFlow
     • MySQL, Prometheus, Grafana
  9. Wells Fargo Open AIOps Platform
     • Main framework:
       ▪ Apache Flink distributed stream processing framework
       ▪ Python Keras deep learning framework
       ▪ Anomaly detection
       ▪ Root cause analysis
       ▪ Automated action recommendation
     • Alerting & visualization:
       ▪ MySQL database
       ▪ Grafana observability platform
       ▪ Prometheus monitoring
     https://databricks.com/session_na21/effective-aiops-with-open-source-software-in-a-week
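Stream processors such as Apache Flink typically group incoming events into time windows before a model or alert rule looks at them. The sketch below shows the idea of a tumbling (fixed, non-overlapping) window in plain Python. It is a single-process stand-in, not Flink or PyFlink code, and the event format, window size, and alert threshold are assumptions made for illustration.

```python
from collections import defaultdict


def tumbling_window_means(events, window_s=60):
    """Group (timestamp_s, value) events into fixed, non-overlapping windows.

    Returns {window_start_s: mean value of the events in that window}.
    """
    buckets = defaultdict(list)
    for timestamp_s, value in events:
        window_start = int(timestamp_s // window_s) * window_s
        buckets[window_start].append(value)
    return {start: sum(vals) / len(vals) for start, vals in sorted(buckets.items())}


def firing_windows(window_means, threshold):
    """Return the start times of windows whose mean exceeds the threshold."""
    return [start for start, mean in window_means.items() if mean > threshold]


# Hypothetical latency samples in milliseconds: (timestamp in seconds, value).
events = [(0, 100.0), (30, 120.0), (65, 400.0), (90, 600.0), (130, 110.0)]
means = tumbling_window_means(events)
firing = firing_windows(means, threshold=300.0)
print(means)
print(firing)
```

A production stream processor also has to deal with late or out-of-order events and with state spread across machines, which this sketch ignores.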
  10. Anomaly Detection Engine: finding operational anomalies
      • Is data propagation normal?
      • Is M2M communication normal?
      • Is the Internet of People normal?
      • Is the Internet of Things normal?
      Solution: Contrastive Representation Learning
      P. H. Le-Khac et al., "Contrastive Representation Learning: A Framework and Review," IEEE Access, vol. 8, pp. 193907-193934, 2020.
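Contrastive representation learning trains an encoder so that embeddings of similar samples sit close together and embeddings of dissimilar samples sit far apart. A widely used objective for this is the InfoNCE loss, sketched below for a single anchor. The slides do not say which loss or encoder is used, so the choice of InfoNCE, the toy 2-d embeddings, and the prototype-based anomaly score are all assumptions. Vectors are assumed to be non-zero.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE loss for one anchor: negative log-softmax of the positive pair."""
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, neg) / temperature for neg in negatives]
    top = max(logits)  # subtract the max for numerical stability
    log_sum_exp = top + math.log(sum(math.exp(x - top) for x in logits))
    return log_sum_exp - logits[0]


def anomaly_score(embedding, normal_prototype):
    """Hypothetical score: 1 minus cosine similarity to a 'normal' prototype."""
    return 1.0 - cosine(embedding, normal_prototype)


anchor = [1.0, 0.0]
negatives = [[0.0, 1.0], [-1.0, 0.0]]
loss_aligned = info_nce(anchor, [0.9, 0.1], negatives)     # positive near the anchor
loss_misaligned = info_nce(anchor, [0.0, 1.0], negatives)  # positive far from it
print(loss_aligned, loss_misaligned)
print(anomaly_score([0.0, 1.0], [1.0, 0.0]))
```

The loss is small when the positive sample is embedded near the anchor and grows when it is not, which is the signal that drives training. Once an encoder is trained, one possible way to flag anomalies is to score how far a new embedding lies from embeddings of normal operation.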
  11. Root Cause Analysis Engine: finding the root of the problem
      Solution: Graph Neural Network based Root Cause Analysis
      Stages: Time-series KPI data; Graph Structure Construction; Data Classification; Root Cause Evaluation
      Diagram labels: Operational Data, Smart IT Infrastructure, Business Process
      C.-C. Yen et al., "Graph Neural Network based Root Cause Analysis Using Multivariate Time-series KPIs for Wireless Networks," NOMS 2022-2022 IEEE/IFIP Network Operations and Management Symposium, 2022, pp. 1-7.
      Álvaro Brandón et al., "Graph-based root cause analysis for service-oriented and microservice architectures," Journal of Systems and Software, vol. 159, 2020.
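The first two stages named on slide 11, building a graph from time-series KPIs and then reasoning over it, can be sketched without any learned model. The example below links KPIs whose series are strongly correlated, then runs one untrained message-passing step that mixes each node's anomaly score with its neighbours' scores. This is not the method of the cited papers. The KPI names, the correlation threshold, the per-KPI anomaly scores, and the 50/50 mixing weight are hypothetical, and a real graph neural network would learn its weights from labelled incidents.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (dev_x * dev_y) if dev_x and dev_y else 0.0


def build_graph(kpis, threshold=0.8):
    """Connect two KPIs when the absolute correlation reaches the threshold."""
    names = sorted(kpis)
    graph = {name: set() for name in names}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if abs(pearson(kpis[a], kpis[b])) >= threshold:
                graph[a].add(b)
                graph[b].add(a)
    return graph


def propagate(graph, scores, self_weight=0.5):
    """One message-passing step: mix own score with the mean neighbour score."""
    out = {}
    for node, neighbours in graph.items():
        if neighbours:
            neighbour_mean = sum(scores[n] for n in sorted(neighbours)) / len(neighbours)
        else:
            neighbour_mean = 0.0
        out[node] = self_weight * scores[node] + (1.0 - self_weight) * neighbour_mean
    return out


# Hypothetical KPI series and per-KPI anomaly scores.
kpis = {
    "cpu": [1, 2, 3, 4, 5],
    "latency": [2, 4, 6, 8, 11],
    "errors": [1, 2, 3, 4, 6],
    "disk": [5, 1, 4, 2, 3],
}
scores = {"cpu": 0.9, "latency": 0.8, "errors": 0.7, "disk": 0.85}

graph = build_graph(kpis)
ranked = sorted(propagate(graph, scores).items(), key=lambda kv: kv[1], reverse=True)
print(graph)
print(ranked)
```

Here the disk KPI has a high raw score but correlates with nothing else, so after propagation it ranks below the CPU, latency, and error KPIs, which corroborate each other.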
  12. Action Recommendation Engine: finding the solution
      Solution: Knowledge Graph Embedding
      Diagram label: Knowledge Graph
      Q. Guo et al., "A Survey on Knowledge Graph-Based Recommender Systems," IEEE Transactions on Knowledge and Data Engineering, vol. 34, no. 8, pp. 3549-3568, 1 Aug. 2022.
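A knowledge graph embedding maps entities and relations to vectors so that plausible facts score higher than implausible ones. TransE is one well-known model of this kind: it scores a triple (head, relation, tail) by how close head + relation lands to tail. The slides do not name a specific embedding model, so TransE is an assumption here, and the entities, the `resolved_by` relation, and the hand-picked 2-d vectors are hypothetical. In practice the vectors would be learned from a graph of past incidents and fixes.

```python
import math

# Hypothetical 2-d embeddings, hand-picked for illustration (normally learned).
ENTITY = {
    "disk_full": [1.0, 0.0],
    "purge_logs": [1.0, 1.0],
    "restart_service": [0.0, 1.0],
    "add_capacity": [1.2, 0.9],
}
RELATION = {"resolved_by": [0.0, 1.0]}


def transe_score(head, relation, tail):
    """TransE plausibility: negative L2 distance between head + relation and tail."""
    h, r, t = ENTITY[head], RELATION[relation], ENTITY[tail]
    return -math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def recommend(problem, candidates, relation="resolved_by"):
    """Rank candidate actions from most to least plausible for the problem."""
    return sorted(
        candidates,
        key=lambda action: transe_score(problem, relation, action),
        reverse=True,
    )


ranking = recommend("disk_full", ["restart_service", "add_capacity", "purge_logs"])
print(ranking)
```

With these toy vectors, `disk_full` plus `resolved_by` lands exactly on `purge_logs`, so that action ranks first and `restart_service` ranks last.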
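  13. Summary
      • After digital transformation, operational data is massive and IT infrastructure is complex, so AI-based AIOps is needed.
      • AIOps
        ▪ Detect operational anomalies: Contrastive Representation Learning
        ▪ Find the source of the problem: Graph Neural Network based Root Cause Analysis
        ▪ Correct errors in time to maintain normal operations: Knowledge Graph Embedding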