Slide 1

Slide 1 text

An Attempt to Visualize Connected-Car Data Processing with Distributed Tracing
OpenTelemetry Meetup 2025-07, July 22, 2025
Masanori Itoh, InfoTech, TOYOTA MOTOR CORPORATION
Copyright © 2025 TOYOTA MOTOR CORPORATION All rights reserved.

Slide 2

Slide 2 text

About Me
Masanori Itoh
- Affiliation
  - TOYOTA MOTOR CORPORATION
  - Digital Information and Communication Dept., InfoTech Div.
  - Open Source Program Group (Toyota OSPO)
- Works
  - R&D for Connected Vehicle Systems: E2E Observability, Standardization, Diagnostics, ...
  - OSPO Operations
- Keywords
  - Operating System, Cloud Infrastructure, etc.
- https://github.com/thatsdone
- https://www.linkedin.com/in/masanori-itoh-6401603/

Slide 3

Slide 3 text

Contents
- Why is a car maker speaking at a place like this?
- Talks at OSSJ2023/KubeCon NA 2023
- Overview of vehicle data processing
- Trying out observability with OTEL
- Vehicle data processing and mobile networks
- Summary (1)
- Pain points we could not solve back in '23, and what we would do with today's OpenTelemetry
- Summary (2)

Slide 4

Slide 4 text

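Why is a car maker speaking at a place like this?
- Today's vehicles generate large volumes of data (location, plus sensor data from the engine, doors, and so on) and are connected to the center over cellular links.
- Inside the vehicle, a large number (more than 100) of on-board units (ECUs), large and small, form a complex network.
- The systems that process this data, vehicles included, form an extremely complex whole.
- Without a good observability mechanism in place, even normal operation is difficult.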

Slide 5

Slide 5 text

Talks at OSSJ2023/KubeCon NA 2023 and Related Work
- Background
  - Collaboration with KDDI has been under way since 2021 (https://newsroom.kddi.com/news/detail/kddi_pr-1127.html)
- Talk 1 (by Endo-san of KDDI)
  - "E2E Observability for Connected Vehicle Service via Distributed Tracing" (KubeCon NA 2023 @ Chicago)
  - https://kccncna2023.sched.com/event/1R2oh
  - Key point: an attempt to apply OTEL to troubleshooting on the C-Plane side of a 5G network, introducing an extension that associates the on-board communication unit with the vehicle ID on the C-Plane side
- Talk 2 (my talk)
  - "E2E Observability for Connected Vehicle Services Including 5G Cellular Network U-Plane Troubles" (OSSJ 2023 @ Tokyo)
  - https://ossjapan2023.sched.com/event/1Tyrm
  - Key point: an attempt to apply OTEL so that, when trouble occurs on the U-Plane side of the network, the scope of impact can be identified at per-vehicle granularity
- Earlier work
  - Key point: instrumented the path where vehicles upload data to the center with OTEL, visualizing vehicle-to-center transactions

Slide 6

Slide 6 text

Overview of Vehicle Data Processing
- Trends
  - CASE: Connected, Autonomous, Shared/Service, Electric
  - SDV: Software Defined Vehicle
- Processing styles
  - Real-time/batch, data accumulation
- Data types
  - CAN (sensor data), camera, and others
- Center-side architecture
  - Public cloud (legacy systems are on-premise); going forward, a hybrid that also includes MEC/network edge
- Connectivity
  - Cellular, WiFi, V2X, and others
[Figure: vehicles connect over Wi-Fi and cellular through the mobile network and backbone network to edge offload sites and a hybrid cloud (DynamicMap, CAN/camera data processing, data accumulation, OTA, center-driven control), used by internal developers and customers.]

Slide 7

Slide 7 text

Overview of Vehicle Data Processing (Wide-Area System Image)
An example of a POC system we actually built
[Figure: pseudo vehicles (generators, UE#1) attach through gNB#1/gNB#2 and UPF#1/UPF#2 over two slices (Slice#1 for latency, Slice#2 for bandwidth) to Edge#1 and Edge#2 (each with Auth GW, Dispatcher, and Offload Process), on-premise Locations #1 and #2, and Public Clouds #1 and #2 (CAN, camera, DynamicMap, data accumulation, LB, Ops Subsystem, Auth GW for fallback, orchestration), linked by WAN and dedicated lines/VPN with fail-over paths.]

Slide 8

Slide 8 text

Overview of Vehicle Data Processing (Center-Side Processing Image)
- Example: a POC application that runs object detection on in-vehicle camera images
- Application layout: communication between vehicle and center is terminated with mTLS (NGINX) w/ OCSP. Uploads and similar operations are directed by the center. Data is first uploaded to AWS S3, read back asynchronously, fed to Spark via Kafka, split into still images, and then sent to the inference server.
- Observability stack: Prometheus + ElasticSearch + Jaeger (the versions are old)
- Multiple components running in a distributed manner: not so easy to monitor
[Figure: generator VMs (pygen) connect through the mTLS Front (ingress) of the Front Subsystem (k8s) to the Object Detection Subsystem (k8s): object-detection-api, movie-kafka, divide-video, image-kafka, detect-image, detect-service. The Camera Data Processing Center and Accumulation Center are shown alongside the Ops Subsystem (Metrics, Log, Trace). (*) Each box consists of multiple pods/containers.]

Slide 9

Slide 9 text

Vehicle Data Processing Example (with OTEL Applied)
An example of applying OTEL to the POC application that runs object detection on camera image data
- Camera images are uploaded under the control of the center, and object detection runs asynchronously.
[Figure: trace view. TRIP (from engine start to stop) contains the Camera Data Upload Event and the Camera Data Upload Operation; the Camera Data Upload Completion handler invokes the Object Detection Subsystem asynchronously.]

Slide 10

Slide 10 text

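Challenges and Motivation
Challenges
- On the vehicle side or the center side, each can tell what happened on its own... but when trouble occurs in the network, there is little the user side can do.
- We receive many inquiries... but even identifying where the trouble occurred is not easy.
Motivation
- If we can collaborate with mobile network operators (MNOs):
  - Connected services can be made more robust
  - The recovery time of E2E services can be shortened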

Slide 11

Slide 11 text

Requirements from the Vehicle Data Processing Side
What we want to identify when trouble occurs:
- Areas where vehicle communication is affected (where)
- Vehicles that are affected by the trouble (which vehicles)
- What kind of communication trouble it is (what/how), for example intermittent or persistent
- When the trouble occurred and when it is expected to be resolved (when)
Requirements from the Mobile Network Operator side:
- Systems with a large number of constantly connected users
- Provided for IoT devices such as connected cars
- Accountability for the failure impact and the extent to which customers were affected when the failure occurred
(Excerpt from the KubeCon NA 2023 presentation "E2E Observability for Connected Vehicle Service via Distributed Tracing")

Slide 12

Slide 12 text

Problem Setting: Details
- Problems to be addressed
  - Application-layer troubles are already manageable, but network troubles (including U-Plane) need to be addressed
- Assumptions
  - The MNO does health monitoring of C/U-Plane NFs
  - The user (connected vehicle service provider) side knows: identifiers (IMSI (SUPI), ...), current location, route plan (e.g., destination, schedule), ...
- Motivations/Reasons (detailed requirements)
  - If we can get U-Plane trouble information at UE granularity earlier, we can take proactive actions, for example sending or retrieving necessary information and commands in advance.
  - Even if it is difficult to forecast failures, the failure location and the vehicle location are normally apart, so we can still act proactively.

Slide 13

Slide 13 text

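Overview of What Happens When a Terminal Attaches to a 5G Network
- There are many elements on the C-Plane side
  - NRF, AMF, SMF, ...: C-Plane
  - UPF, gNodeB, UE: U-Plane
- Steps until a terminal can communicate
  - (1) Terminal (UE) registration: subscriber authentication and so on; basically C-Plane processing
  - (2) PDU session establishment: U-Plane elements are configured on the relay equipment (UPF) and base stations (gNodeB) (by AMF/SMF)
[Figure: procedure diagram from 3GPP TS 23.502 version 15.3.0 Release 15]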

Slide 14

Slide 14 text

Idea: the Ideal
- What we want to achieve
  - Being able to retrieve PDU session information and terminal identifiers per U-Plane device (relay equipment/base station) would be enough
- Initial ideas
  - (1) Get PDU session information (with SUPI (IMSI)) from NFs via REST API
    - Could this function be regarded as a kind of NEF?
  - (2) Use OpenTelemetry automatic instrumentation
    - Two choices:
      - OpenTelemetry otel-go-instrumentation (https://github.com/open-telemetry/opentelemetry-go-instrumentation)
      - Grafana Labs Beyla automatic instrumentation (https://github.com/grafana/beyla)
    - The KDDI team took a manual instrumentation approach to cover non-HTTP 5GS communication (NGAP/PFCP)

Slide 15

Slide 15 text

Idea: the Reality
- Reality
  - (1) Free5GC NF implementations do not return sufficient information
  - (2) Automatic instrumentation for Golang (in which Free5GC is written) has limitations
    - Even HTTP (let alone NGAP/PFCP) could not be instrumented sufficiently
- What we actually did (workaround)
  - Built a tool that watches the logs written by the NFs (AMF and SMF in particular), extracts the necessary information in real time, and sends it out via OTLP
  - In other words, a custom log parser for Free5GC NFs that emits trace data carrying the SUPI (IMSI), to be correlated later

Slide 16

Slide 16 text

POC System Configuration
- 5GS stack
  - Free5GC v3.3.0 w/ go 1.21.1 + gtp5g
  - UERANSIM git+3a96298 ('23/5/9)
  - Ubuntu 22.04.1 (amd64)
- Deployment
  - 2 slices per Tracking Area
  - 2 Tracking Areas
  - Simple VM deployment
  - No containerization (for now)
[Figure: POC deployment. C-Plane VMs (free5gc-poc-cp-1 at 192.168.0.11, free5gc-poc-cp-2, free5gc-poc-cp-3) run NRF, AMF, SMF#1, and SMF#2; U-Plane VMs free5gc-poc-up-11 and free5gc-poc-up-12 run UPF#1-1 and UPF#1-2 (upfgtp); the UE VM free5gc-poc-ue-1 (192.168.63.1) runs UE#1 with uesimtun0/uesimtun1; gNB#1-1 and gNB#1-2 serve the Tracking Areas. A SNAT boundary (iptables MASQUERADE) and a static NAT boundary (floating IP) separate the networks.]
Interface legend:
- N11: HTTP
- N3: GTP-U (udp 2152)
- N4: PFCP (udp 8805)
- N2: NGAP (sctp 38412)
- Radio link simulation (udp 4997)

Slide 17

Slide 17 text

POC: Procedure
- Environment setup (1)
  - Follow the UERANSIM/Free5GC documentation
- Environment setup (2) (TIP 0)
  - (1) Create UEs without PDU sessions
    - Use a configuration with empty 'sessions:' and 'default-nssai:' sections
  - (2) Initiate a PDU Session Establishment Request
    - Using nr-cli, invoke a PDU Session Establishment request
    - TIP: use double quotation marks (") to group the sub-command (https://github.com/aligungr/UERANSIM/wiki/Usage)

    $ nr-cli imsi-208930000012345 --exec "ps-establish IPv4 0x01 0x010203 internet11"
    # 0x01       : sst (Slice and Service Type)
    # 0x010203   : sd (Slice Differentiator)
    # internet11 : dnn

UE configuration example:

    ---
    supi: 'imsi-208930000000003'
    mcc: '208'
    mnc: '93'
    (snip)
    gnbSearchList:
      - 192.168.1.211
    (snip)
    sessions:              # make this section empty
    configured-nssai:
      - sst: 0x01
        sd: 0x010203
      - sst: 0x01
        sd: 0x112233
    default-nssai:         # make this section empty
    integrity:
    (snip)

Slide 18

Slide 18 text

POC: Analysis
- Analysis: log-based analysis
  - The SMF manages the mapping between relay equipment (UPF) and terminals (UE)

$ grep -E \(PduSess\|GIN\) 20231127-1-free5gc-pdu-session-success-smf.log
time="2023-11-27T16:04:22.923623330+09:00" level="info" msg="Receive Create SM Context Request" CAT="PduSess" NF="SMF"
time="2023-11-27T16:04:22.927806025+09:00" level="info" msg="In HandlePDUSessionSMContextCreate" CAT="PduSess" NF="SMF"
time="2023-11-27T16:04:22.928808304+09:00" level="trace" msg="State[InActive] -> State[InActive]" CAT="PduSess" NF="SMF" pdu_session_id="1" supi="imsi-208930000000003"
time="2023-11-27T16:04:22.930953619+09:00" level="trace" msg="State[InActive] -> State[ActivePending]" CAT="PduSess" NF="SMF" pdu_session_id="1" supi="imsi-208930000000003"
time="2023-11-27T16:04:22.931518587+09:00" level="debug" msg="S-NSSAI[sst: 1, sd: 010203] DNN[internet11]" CAT="PduSess" NF="SMF" pdu_session_id="1" supi="imsi-208930000000003"
time="2023-11-27T16:04:22.938366848+09:00" level="info" msg="Send NF Discovery Serving UDM Successfully" CAT="PduSess" NF="SMF" pdu_session_id="1" supi="imsi-208930000000003"
time="2023-11-27T16:04:22.970142403+09:00" level="trace" msg="Send NF Discovery Serving AMF successfully" CAT="PduSess" NF="SMF" pdu_session_id="1" supi="imsi-208930000000003"
time="2023-11-27T16:04:22.971051987+09:00" level="trace" msg="findPSAandAllocUeIP" CAT="PduSess" NF="SMF" pdu_session_id="1" supi="imsi-208930000000003"
time="2023-11-27T16:04:22.975311711+09:00" level="info" msg="Allocated PDUAdress[10.241.0.1]" CAT="PduSess" NF="SMF" pdu_session_id="1" supi="imsi-208930000000003"
time="2023-11-27T16:04:23.028520457+09:00" level="debug" msg="Install SessionRule[SessRuleId-1]: &{AuthSessAmbr:0xc00005ac80 AuthDefQos:0xc00005aca0 SessRuleId:SessRuleId-1 RefUmData: RefCondData:}" CAT="PduSess" NF="SMF" pdu_session_id="1" supi="imsi-208930000000003"
time="2023-11-27T16:04:23.029753661+09:00" level="info" msg="Has no pre-config route" CAT="PduSess" NF="SMF" pdu_session_id="1" supi="imsi-208930000000003"
time="2023-11-27T16:04:23.030725335+09:00" level="trace" msg="In AllocateLocalSEIDForDataPath" CAT="PduSess" NF="SMF"
time="2023-11-27T16:04:23.031665952+09:00" level="trace" msg="NodeIDtoIP: 192.168.1.111" CAT="PduSess" NF="SMF"
time="2023-11-27T16:04:23.032432749+09:00" level="trace" msg="In ActivateTunnelAndPDR" CAT="PduSess" NF="SMF"
time="2023-11-27T16:04:23.033283688+09:00" level="trace" msg="DataPath Meta Information\nActivated: false\nIsDefault Path: true\nHas Braching Point: false\nDestination IP: \nDestination Port: \nDataPath Routing Information\n1th Node in the Path\nCurrent UPF IP: 192.168.1.111\nPrevious UPF IP: None\nNext UPF IP: None\n" CAT="PduSess" NF="SMF"
time="2023-11-27T16:04:23.035178231+09:00" level="trace" msg="Current DP Node IP: 192.168.1.111" CAT="PduSess" NF="SMF"
time="2023-11-27T16:04:23.037023448+09:00" level="warning" msg="No Create URR" CAT="PduSess" NF="SMF"
time="2023-11-27T16:04:23.038764297+09:00" level="trace" msg="Current DP Node IP: 192.168.1.111" CAT="PduSess" NF="SMF"
time="2023-11-27T16:04:23.039538734+09:00" level="trace" msg="Before DLPDR OuterHeaderCreation" CAT="PduSess" NF="SMF"
time="2023-11-27T16:04:23.041489662+09:00" level="info" msg="| 201 | 192.168.0.11 | POST | /nsmf-pdusession/v1/sm-contexts | " CAT="GIN" NF="SMF"
time="2023-11-27T16:04:23.041964494+09:00" level="info" msg="Sending PFCP Session Establishment Request" CAT="PduSess" NF="SMF"
time="2023-11-27T16:04:23.043417128+09:00" level="trace" msg="[SMF] Send SendPfcpSessionEstablishmentRequest" CAT="PduSess" NF="SMF"
time="2023-11-27T16:04:23.043899656+09:00" level="trace" msg="Send to addr 192.168.1.111:8805" CAT="PduSess" NF="SMF"
time="2023-11-27T16:04:23.066085696+09:00" level="info" msg="Received PFCP Session Establishment Accepted Response" CAT="PduSess" NF="SMF"
time="2023-11-27T16:04:23.099668947+09:00" level="trace" msg="State[ActivePending] -> State[Active]" CAT="PduSess" NF="SMF" pdu_session_id="1" supi="imsi-208930000000003"
time="2023-11-27T16:04:23.112868164+09:00" level="info" msg="Receive Update SM Context Request" CAT="PduSess" NF="SMF"
time="2023-11-27T16:04:23.114583438+09:00" level="trace" msg="State[Active] -> State[ModificationPending]" CAT="PduSess" NF="SMF" pdu_session_id="1" supi="imsi-208930000000003"
time="2023-11-27T16:04:23.116560933+09:00" level="trace" msg="State[ModificationPending] -> State[PFCPModification]" CAT="PduSess" NF="SMF" pdu_session_id="1" supi="imsi-208930000000003"
time="2023-11-27T16:04:23.117479738+09:00" level="trace" msg="In case PFCPModification" CAT="PduSess" NF="SMF" pdu_session_id="1" supi="imsi-208930000000003"
time="2023-11-27T16:04:23.129293784+09:00" level="info" msg="Received PFCP Session Modification Accepted Response from AN UPF" CAT="PduSess" NF="SMF"
time="2023-11-27T16:04:23.130259009+09:00" level="trace" msg="In case SessionUpdateSuccess" CAT="PduSess" NF="SMF" pdu_session_id="1" supi="imsi-208930000000003"
time="2023-11-27T16:04:23.131363584+09:00" level="trace" msg="State[PFCPModification] -> State[Active]" CAT="PduSess" NF="SMF" pdu_session_id="1" supi="imsi-208930000000003"
time="2023-11-27T16:04:23.132523645+09:00" level="info" msg="| 200 | 192.168.0.11 | POST | /nsmf-pdusession/v1/sm-contexts/urn:uuid:474a7f9e-f223-4981-a748-cbe82a1b5a06/modify | " CAT="GIN" NF="SMF"

Slide 19

Slide 19 text

POC: Analysis
- Analysis: tips for log analysis
  - TIP 1: common tags are time, level, msg, CAT (CATegory), and NF (Network Function)
  - TIP 2: log lines with CAT="PduSess" may carry the additional tags pdu_session_id and supi (= IMSI)
  - TIP 3: category "GIN" is the HTTP access log (Gin is a Golang REST API framework)

time="2023-11-27T16:04:22.923623330+09:00" level="info" msg="Receive Create SM Context Request" CAT="PduSess" NF="SMF"
time="2023-11-27T16:04:22.928808304+09:00" level="trace" msg="State[InActive] -> State[InActive]" CAT="PduSess" NF="SMF" pdu_session_id="1" supi="imsi-208930000000003"
time="2023-11-27T16:04:23.041489662+09:00" level="info" msg="| 201 | 192.168.0.11 | POST | /nsmf-pdusession/v1/sm-contexts | " CAT="GIN" NF="SMF"
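As a rough illustration of the tag layout described in these tips, the following sketch splits one log line into a dictionary of tags. The function name `parse_line` and the regular expression are assumptions made for this example; they are not taken from the prototype.

```python
import re

# Matches key="value" pairs; a value may contain backslash-escaped characters.
PAIR = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_line(line):
    """Split one Free5GC-style log line into a {tag: value} dictionary."""
    return dict(PAIR.findall(line))


# Sample line copied from the slide.
line = (
    'time="2023-11-27T16:04:22.928808304+09:00" level="trace" '
    'msg="State[InActive] -> State[InActive]" CAT="PduSess" NF="SMF" '
    'pdu_session_id="1" supi="imsi-208930000000003"'
)
tags = parse_line(line)
print(tags["CAT"], tags["NF"], tags.get("supi"))
```

Lines that lack the optional tags simply produce a dictionary without `pdu_session_id` and `supi`, so callers can test for their presence with `in` or `.get()`.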

Slide 20

Slide 20 text

POC: Analysis
- Analysis: the SMF log lines of interest
  - The correspondence between the UPF facility identifier (name/IP) and the UE identifier (SUPI/IMSI) can be found in the SMF logs

level="trace" msg="findPSAandAllocUeIP" CAT="PduSess" NF="SMF" pdu_session_id="1" supi="imsi-208930000000003"
level="debug" msg="check start UPF: UPF1-1" CAT="CTX" NF="SMF"
level="debug" msg="check start UEIPPool(10.241.0.0/17)" CAT="CTX" NF="SMF"
level="info" msg="Allocated UE IP address: 10.241.0.1" CAT="CTX" NF="SMF"
level="info" msg="Selected UPF: UPF1-1" CAT="CTX" NF="SMF"
level="info" msg="Allocated PDUAdress[10.241.0.1]" CAT="PduSess" NF="SMF" pdu_session_id="1" supi="imsi-208930000000003"
level="trace" msg="NodeIDtoIP: 192.168.1.111" CAT="PduSess" NF="SMF"

What to read from these lines:
- supi: the identifier of the terminal (UE)
- "Allocated UE IP address" / "Allocated PDUAdress": the IP assigned to the terminal
- "Selected UPF": the relay equipment (UPF) selected for the terminal (UE)
- "NodeIDtoIP": (optional) the underlay-side IP address of the relay equipment (UPF)

An excerpt from the SMF log with 'log level = trace'. The timestamp (time="YYMM...") column is omitted for simplicity.
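The extraction step can be sketched as a small state machine over these lines. This is an illustrative reconstruction, not the prototype's code: `extract_mappings` is a hypothetical name, and the sketch assumes that the lines of one session setup are not interleaved with those of another.

```python
import re

PAIR = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')
SELECTED_UPF = re.compile(r"Selected UPF: (\S+)")
# "PDUAdress" is spelled as it appears in the SMF log.
ALLOCATED_IP = re.compile(r"Allocated PDUAdress\[([0-9.]+)\]")


def extract_mappings(lines):
    """Yield one UE-to-UPF mapping per PDU session found in the log lines."""
    pending_upf = None  # UPF named by the most recent "Selected UPF" line
    for line in lines:
        tags = dict(PAIR.findall(line))
        msg = tags.get("msg", "")
        selected = SELECTED_UPF.search(msg)
        if selected and tags.get("CAT") == "CTX":
            pending_upf = selected.group(1)
            continue
        allocated = ALLOCATED_IP.search(msg)
        if allocated and pending_upf and "supi" in tags:
            yield {
                "supi": tags["supi"],
                "pdu_session_id": tags.get("pdu_session_id"),
                "ue_ip": allocated.group(1),
                "selected_upf": pending_upf,
            }
            pending_upf = None  # wait for the next session setup


# Lines copied from the excerpt above.
SMF_LOG = [
    'level="trace" msg="findPSAandAllocUeIP" CAT="PduSess" NF="SMF" pdu_session_id="1" supi="imsi-208930000000003"',
    'level="info" msg="Allocated UE IP address: 10.241.0.1" CAT="CTX" NF="SMF"',
    'level="info" msg="Selected UPF: UPF1-1" CAT="CTX" NF="SMF"',
    'level="info" msg="Allocated PDUAdress[10.241.0.1]" CAT="PduSess" NF="SMF" pdu_session_id="1" supi="imsi-208930000000003"',
]
mappings = list(extract_mappings(SMF_LOG))
print(mappings)
```

If an SMF handles many sessions concurrently, the "Selected UPF" and "Allocated PDUAdress" lines of different sessions could interleave, so a production tool would probably need a stronger join key, for example matching the allocated IP address across both lines.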

Slide 21

Slide 21 text

POC: Monitoring System Prototype
- Prototype: a log monitor and trace-sending agent
  - It watches the SMF log files, extracts the PDU session to UPF mapping, and sends it out as a trace span. On a U-Plane failure, the mapping is queried to identify the blast radius (the affected UEs).
[Figure: sequence over time. The vehicle runs Init Registration and establishes a PDU session through the mobile network (AMF, SMF, ...). The AMF side sends a trace with SUPI; the SMF writes its log, and the Log Watch Tool (Trace Shipper) watches it and sends a trace with UPF + SUPI to the Trace Aggregator/Analyzer. The aggregator correlates traces using SUPI and identifies the vehicle, location, and so on with the help of two databases (VIN to IMEI/IMSI; vehicle location etc.). When keep-alive detects a U-Plane failure, the Trace Aggregator is looked up to extract the list of affected UEs (= vehicles).]
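The blast-radius lookup on the aggregator side might look like the sketch below. It assumes the span attributes have already been collected as plain dictionaries; the function names, the extra records, and the VIN values are all hypothetical and serve only to illustrate the join between the UPF mapping and the VIN database.

```python
from collections import defaultdict


def index_by_upf(span_attributes):
    """Group SUPIs by the UPF that carries their PDU sessions."""
    index = defaultdict(set)
    for attrs in span_attributes:
        index[attrs["selected_upf"]].add(attrs["supi"])
    return index


def affected_vehicles(index, failed_upf, supi_to_vin):
    """Return (SUPI, VIN) pairs for every UE attached to the failed UPF."""
    return [
        (supi, supi_to_vin.get(supi, "unknown"))
        for supi in sorted(index.get(failed_upf, ()))
    ]


# Illustrative records only; the VINs are placeholders, not real data.
spans = [
    {"supi": "imsi-208930000000002", "pdu_session_id": "2",
     "ue_ip": "10.241.0.6", "selected_upf": "UPF1-1"},
    {"supi": "imsi-208930000000003", "pdu_session_id": "1",
     "ue_ip": "10.241.0.1", "selected_upf": "UPF1-1"},
    {"supi": "imsi-208930000000004", "pdu_session_id": "1",
     "ue_ip": "10.241.128.1", "selected_upf": "UPF1-2"},
]
vin_db = {
    "imsi-208930000000002": "VIN-EXAMPLE-A",
    "imsi-208930000000003": "VIN-EXAMPLE-B",
}

hit = affected_vehicles(index_by_upf(spans), "UPF1-1", vin_db)
print(hit)
```

Keeping the index keyed by UPF makes the failure-time query a single dictionary lookup, which matters if the mapping has to cover a very large number of vehicles.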

Slide 22

Slide 22 text

POC: Monitoring System Prototype
- Experiment result (operation log of the home-grown monitoring tool)

Processed (parsed) Free5GC SMF log lines:

{"time": "2023-12-05T07:48:25.411976661+09:00", "level": "info", "msg": "Allocated UE IP address: 10.241.0.6", "NF": "SMF", "CAT": "CTX"}
{"time": "2023-12-05T07:48:25.412720710+09:00", "level": "info", "msg": "Selected UPF: UPF1-1", "NF": "SMF", "CAT": "CTX"}
{"time": "2023-12-05T07:48:25.413436458+09:00", "level": "info", "msg": "Allocated PDUAdress[10.241.0.6]", "NF": "SMF", "CAT": "PduSess", "pdu_session_id": "2", "supi": "imsi-208930000000002"}
# Found U-Plane/PDU Session mapping: UPF: UPF1-1 SUPI: imsi-208930000000002 pdu_session_id: 2 IP(UE): 10.241.0.6

OpenTelemetry trace/span data (printed by ConsoleExporter; normally exported to a remote server). The mapping between UE and UPF is exported as span attributes:

{
    "name": null,
    "context": {
        "trace_id": "0x4419c8cca82561d8b52116276b202ee8",
        "span_id": "0x396829beeaa73eae",
        "trace_state": "[]"
    },
    "kind": "SpanKind.INTERNAL",
    "parent_id": null,
    "start_time": "2023-12-05T02:35:34.698170Z",
    "end_time": "2023-12-05T02:35:34.698250Z",
    "status": {
        "status_code": "OK"
    },
    "attributes": {
        "supi": "imsi-208930000000002",
        "pdu_session_id": "2",
        "ue_ip": "10.241.0.6",
        "selected_upf": "UPF1-1"
    },
    "events": [],
    "links": [],
    "resource": {
        "attributes": {
            "service.name": "free5gc-parse.py"
        },
        "schema_url": ""
    }
}

Slide 23

Slide 23 text

POC: Evaluation
- Results
  - After injecting a simulated failure at a relay device (UPF), we did receive the mapping between the affected relay device (UPF) and terminals (UE)
  - How close to real time the failure can be detected depends on the monitoring configuration of the network infrastructure (roughly, the liveness-check interval)
- Current limitations
  - Cannot handle gNodeB information (at the moment)
  - In the case of Free5GC + UERANSIM:
    - A gNodeB failure can be recovered via gNodeB handover
    - UPF failures require InitRegistration again
  - Cannot handle tiered UPF (w/ I-UPF) topologies (at the moment)

Slide 24

Slide 24 text

Work in Progress & Next Steps
1. (Work in progress) Brush up the design and implementation of the prototype
  - Resolve the restrictions, add a dashboard, integrate with other observability stacks
  - Or try other 5GS stack implementations (OAI etc.)
2. (Middle term) Explore more sophisticated information retrieval (and workflow)
  - The best way is to rely on a standardized 3GPP/ETSI API (if possible)
3. (Middle term) Improve performance and scalability
  - Up to millions of vehicles out of tens of millions of MNO subscribers
  - Failure notification latency (less than a minute)
4. (Long term) Propose to telco equipment vendors, integrators, and MNOs
  - Explore the 3GPP specs and contribute to Free5GC, or try other open source 5GS stacks

Slide 25

Slide 25 text

Summary (1)
- Showed that users can be notified of U-Plane failure information at UE granularity
  - With the notification, connected vehicle service providers can act proactively if the failure lies ahead of the vehicles
  - Without any modifications to the 3GPP protocols
  - This time, we watched the sequence of log lines and extracted the UE to U-Plane function mapping (still to be improved)
- In general
  - Observability via distributed tracing is quite useful and important
  - Contributed to open source 5GS stacks (and others) by reporting issues and posting fixes

Slide 26

Slide 26 text

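Pain Points as of '23 and What We Would Do Now: Auto-Instrumentation Is Painful (1/2)
- Pain point
  - To avoid any impact on code, we would like to recommend auto-instrumentation to the production teams... but commonly used components do not always behave as expected
  - Example: Kafka connector processing in Spark Structured Streaming (https://github.com/open-telemetry/opentelemetry-java-instrumentation/issues/9638)
- Going forward (?)
  - Quality problems often resolve themselves over time as long as they are at least reported, but this case is hard to address on the OTEL side and needs a fix on the other side (the Kafka connector), which is hard to explain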

Slide 27

Slide 27 text

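Pain Points as of '23 and What We Would Do Now: Auto-Instrumentation Is Painful (2/2)
- Pain point
  - For languages with a distinctive structure, such as golang and rust, context propagation does not work even with relatively popular web frameworks
- Now
  - eBPF-based collectors/profilers are advancing remarkably; does that solve it?
  - ...but concerns remain, such as the high privileges they require
- Going forward
  - For golang, support on the language framework side is reportedly in progress (heard from someone at dash0 at KubeCon Japan 2025)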

Slide 28

Slide 28 text

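Pain Points as of '23 and What We Would Do Now: Reducing Traffic
- Pain point
  - We want to shave every possible byte off the traffic leaving the vehicle (the radio segment)
- Option 1: tail-based sampling?
  - Reality: what we want to cut is the traffic as it leaves the vehicle, so this does not solve the problem
- Option 2: embed only minimal context information, plus a little extra, in the payload? (a kind of inline processing)
  - Reality: this would diverge from the OTEL standard, so we would rather avoid it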
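To give a feel for the byte counts behind option 2, the sketch below packs a W3C `traceparent` value into a fixed binary record. The 26-byte layout is purely an assumption for illustration; it is not an OpenTelemetry or W3C format, which is exactly the divergence from the standard that the slide warns about. The trace and span IDs are borrowed from the span shown in the prototype output.

```python
import struct

# Hypothetical compact layout: version (1 byte), trace ID (16 bytes),
# span ID (8 bytes), flags (1 byte), in network byte order.
LAYOUT = struct.Struct("!B16s8sB")


def pack_traceparent(header):
    """Encode a textual traceparent value into the compact binary form."""
    version, trace_id, span_id, flags = header.split("-")
    return LAYOUT.pack(
        int(version, 16),
        bytes.fromhex(trace_id),
        bytes.fromhex(span_id),
        int(flags, 16),
    )


def unpack_traceparent(blob):
    """Decode the compact binary form back into a traceparent value."""
    version, trace_id, span_id, flags = LAYOUT.unpack(blob)
    return f"{version:02x}-{trace_id.hex()}-{span_id.hex()}-{flags:02x}"


HEADER = "00-4419c8cca82561d8b52116276b202ee8-396829beeaa73eae-01"
packed = pack_traceparent(HEADER)
print(len(HEADER), "text bytes ->", len(packed), "binary bytes")
```

The textual header is 55 characters and the packed form is 26 bytes, so the saving per message is modest; any real design would also have to decide where the center re-expands the context into standard spans.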

Slide 29

Slide 29 text

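Pain Points as of '23 and What We Would Do Now: Authentication Is Tricky
- Communication with the center requires vehicle authentication (standards-based, but still). How should the necessary authentication be fitted in?
- Option
  - For TLS-based authentication, terminate and authenticate at the front of the upstream collector?
    - We do not want observability to drag down communication performance
    - This is the outermost front end, so trouble here would hurt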

Slide 30

Slide 30 text

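Reference: an Immediate Problem. Getting otelcol into the In-Vehicle OS (Linux)
- In embedded development, binaries for each target (ARM etc.) are typically built on Intel/AMD CPUs (a.k.a. cross build)
- When we built with Yocto Linux (host: x86_64), one step processes data using a tool (aarch64) that is generated during the build, and that step fails on x86_64
- Options
  - Download the binary distribution and add it to the image
  - Set up an arm64 build environment
  - Both feel a little unsatisfying...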

Slide 31

Slide 31 text

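Summary (2)
- Since then
  - My assignment changed, so I handed this work over to younger colleagues
  - I have now shifted my focus to the vehicle side
- Impressions as of '25
  - The upstream community is advancing day by day (I felt this at KubeCon Japan 2025)
  - ...That said, not all of the pain points from '23 appear to be solved
  - I have high hopes for the future (and would like to contribute myself if possible...)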

Slide 32

Slide 32 text

Thank you very much!

Slide 33

Slide 33 text

Appendix
- Span Links: record correlation across OpenTelemetry spans
  - When a span is not a direct child, Links can be used to record an indirect relationship (such as a causal relationship)
  - https://opentelemetry.io/docs/concepts/signals/traces/#span-links
  - Useful for aggregating traces across system domains (e.g., network and application)
[Figure: Trace (1) and Trace (2); a span points to other trace(s) through its list of Links (references).]