Slide 1

Slide 1 text

Building an ML-powered Android Livestreaming App

Slide 2

Slide 2 text

Building an ML-powered (Android) Livestreaming App

Slide 3

Slide 3 text

No content

Slide 4

Slide 4 text

Etienne Caron Technical Founder [email protected] kanastruk.com Kanastruk Consulting Services

Slide 5

Slide 5 text

kanastruk.com

Slide 6

Slide 6 text

Hardware prototyping • YOLOv8 • Digital Twin • Model Training • Realtime Leaderboard / Out-of-Bounds Detection

Slide 7

Slide 7 text

kanastruk.com

Slide 8

Slide 8 text

PRODUCE DETECTION EXPLAINED
A high-level explanation of past Kanastruk customer work.
AUTHOR: @kanawish DATE: 24/04/20
[Diagram: camera frames (f1) (f2) (f3) (...) at 15fps flow through a BackgroundSubtractor into an Image Classifier]
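The pipeline on this slide runs frames through a background subtractor before classification. As a minimal, hedged sketch of the underlying idea — plain frame differencing against a static background model in NumPy, not the actual customer pipeline:

```python
import numpy as np

def background_mask(frame: np.ndarray, background: np.ndarray, threshold: int = 30) -> np.ndarray:
    """Return a boolean mask of pixels that differ noticeably from the background model."""
    diff = np.abs(frame.astype(np.int16) - background.astype(np.int16))
    return diff.max(axis=-1) > threshold

# Static gray background; a bright "object" occupies one corner of the frame.
background = np.full((4, 4, 3), 128, dtype=np.uint8)
frame = background.copy()
frame[:2, :2] = 255

mask = background_mask(frame, background)
# Only the 2x2 object region is flagged as foreground.
```

A production pipeline would use an adaptive subtractor (e.g. OpenCV's `BackgroundSubtractorMOG2`) so lighting drift does not leak into the foreground mask.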

Slide 9

Slide 9 text

Kanastruk Consulting Services

Slide 10

Slide 10 text

Kanastruk Consulting Services Kanastruk Innovation Lab

Slide 11

Slide 11 text

Design Thinking

Slide 12

Slide 12 text

Empathize Define Ideate Prototype Test 🧪 📝 💡 ❤ 🛠

Slide 13

Slide 13 text

Design Thinking Workshops
❤ Put yourself in the user's shoes
💡 Imagine the future
🛠 Make it real

Slide 14

Slide 14 text

Prototype 🛠

Slide 15

Slide 15 text

No content

Slide 16

Slide 16 text

https://docs.livekit.io/realtime/

Slide 17

Slide 17 text

https://github.com/livekit/livekit

Slide 18

Slide 18 text

No content

Slide 19

Slide 19 text

https://livekit.io/kitt 🛠

Slide 20

Slide 20 text

🛠 https://wattenberger.com/thoughts/boo-chatbots

Slide 21

Slide 21 text

🛠

Slide 22

Slide 22 text

[Room illustration: a Room contains Participants, each publishing Tracks]
Room illustration sourced from: https://tinyurl.com/yj2vkyrf
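The three core concepts on this slide nest naturally: a Room holds Participants, and each Participant publishes Tracks. A toy sketch of that containment — field names here are illustrative, the real LiveKit SDK types are richer:

```python
from dataclasses import dataclass, field

@dataclass
class Track:
    sid: str
    kind: str  # "audio" or "video"

@dataclass
class Participant:
    identity: str
    tracks: list = field(default_factory=list)

@dataclass
class Room:
    name: str
    participants: list = field(default_factory=list)

# A host joins a room and publishes a camera track and a mic track.
room = Room("livestream")
host = Participant("host", [Track("TR_cam", "video"), Track("TR_mic", "audio")])
room.participants.append(host)
```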

Slide 23

Slide 23 text

➜ ~ brew install livekit
OK
➜ ~ livekit-server
one of key-file or keys must be provided
➜ ~ livekit-server --dev
INFO livekit starting in development mode
INFO livekit no keys provided, using placeholder keys {"API Key": "devkey", "API Secret": "secret"}
INFO livekit starting LiveKit server {"portHttp": 7880, "nodeID": "ND_SUoZgzemKouv", "nodeIP": "192.168.50.240", "version": "1.7.2", "bindAddresses": ["127.0.0.1", "::1"], "rtc.portTCP": 7881, "rtc.portUDP": {"Start":7882,"End":0}}

Installing and spinning up a local LiveKit Server.

Slide 24

Slide 24 text

[Diagram: three Client Apps obtain Auth Tokens from Auth Services, then connect to the LiveKit Server. Mascot card — Official Blob ID / Name: Face Listening to You / Status: Not an Emoji yet / Cuteness: Yes]

Slide 25

Slide 25 text

[Diagram: three Client Apps obtain Auth Tokens from Auth Services, then connect to the LiveKit Server's room, track and participant services]
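The dev server above printed placeholder credentials (`devkey` / `secret`). Client auth tokens are JWTs signed with the API secret; a rough, stdlib-only sketch of that shape follows — the exact claim layout here is an assumption, and real apps should mint tokens with the official LiveKit server SDKs:

```python
import base64, hashlib, hmac, json, time

def b64url(data: bytes) -> str:
    """Base64url without padding, as used in JWTs."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def make_access_token(api_key: str, api_secret: str, identity: str, room: str) -> str:
    """Sketch of a LiveKit-style HS256 access token, signed with the API secret."""
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {
        "iss": api_key,                  # the API key identifies the issuer
        "sub": identity,                 # participant identity
        "exp": int(time.time()) + 3600,  # tokens are short-lived credentials
        "video": {"roomJoin": True, "room": room},  # grant claims (shape assumed)
    }
    signing_input = f"{b64url(json.dumps(header).encode())}.{b64url(json.dumps(payload).encode())}"
    signature = hmac.new(api_secret.encode(), signing_input.encode(), hashlib.sha256).digest()
    return f"{signing_input}.{b64url(signature)}"

token = make_access_token("devkey", "secret", "blob-fan-42", "main-stage")
```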

Slide 26

Slide 26 text

➜ livekit git clone [email protected]:livekit-examples/livestream.git
Cloning into 'livestream'...
remote: Enumerating objects: 379, done.
remote: Counting objects: 100% (157/157), done.
remote: Compressing objects: 100% (97/97), done.
remote: Total 379 (delta 76), reused 93 (delta 51), pack-reused 222 (from 1)
Receiving objects: 100% (379/379), 537.75 KiB | 3.11 MiB/s, done.
Resolving deltas: 100% (154/154), done.
➜ livekit cd livestream
➜ livestream git:(main) npm install
added 511 packages, and audited 512 packages in 20s
➜ livestream git:(main) ✗ cp .env.example .env.development
➜ livestream git:(main) ✗ vi .env.development

Installing the Livestream REST / Web App sample.

Slide 27

Slide 27 text

➜ livestream git:(main) npm run dev

> [email protected] dev
> next dev

▲ Next.js 14.0.1
- Local: http://localhost:3000
- Environments: .env.development

/bin/sh: pnpm: command not found
✓ Ready in 2.7s
○ Compiling /page ...
✓ Compiled /page in 2.8s (1025 modules)
✓ Compiled in 481ms (446 modules)
○ Compiling /favicon.ico/route ...
✓ Compiled /favicon.ico/route in 535ms (1030 modules)

Running the Livestream REST / Web App sample.

Slide 28

Slide 28 text

➜ livestream git:(main) ✗ open http://localhost:3000/

Jumping to Chrome to show off the web client.

Slide 29

Slide 29 text

References — Source Code and Documentation
Official LiveKit content:
•https://docs.livekit.io/home/
•https://github.com/livekit/livekit
•https://github.com/orgs/livekit-examples
Upgraded Android client:
•https://github.com/kanawish/control-room

Slide 30

Slide 30 text

[Screen map: Start Screen, Join Screen, Room Container Screen — containing Room Screen, Stream Options Screen, Participant Info Screen, Participant List Screen, Invited Screen]

Slide 31

Slide 31 text

var userName by rememberSaveable(stateSaver = TextFieldValue.Saver) {
    mutableStateOf(TextFieldValue(preferencesManager.getUsername()))
}
var chatEnabled by rememberSaveable { mutableStateOf(true) }
var viewerJoinRequestEnabled by rememberSaveable { mutableStateOf(true) }
var cameraPosition by remember { mutableStateOf(CameraPosition.FRONT) }

Slide 32

Slide 32 text

var response: CreateStreamResponse? = null
try {
    response = livestreamApi.createStream(
        CreateStreamRequest(
            metadata = RoomMetadata(
                creatorIdentity = Participant.Identity(userName.text),
                enableChat = chatEnabled,
                allowParticipation = viewerJoinRequestEnabled,
            )
        )
    ).body()
} catch (e: Exception) {
    Timber.e(e) { "error" }
}

Slide 33

Slide 33 text

if (response != null) {
    Timber.e { "response received: $response" }
    appModel.run {
        connected(
            authToken = response.authToken,
            connectionDetails = response.connectionDetails,
            isHost = true,
            initialCamPos = cameraPosition
        )
        mainNav.mainNavigate(RoomContainerRoute)
    }
} else {
    Timber.e { "response failed!" }
}

Slide 34

Slide 34 text

/**
 * Apis used for the Livestream example.
 */
interface LivestreamApi {
    @POST("/api/create_stream")
    suspend fun createStream(
        @Body body: CreateStreamRequest
    ): Response<CreateStreamResponse>

    @POST("/api/join_stream")
    suspend fun joinStream(
        @Body body: JoinStreamRequest
    ): Response<JoinStreamResponse>
}

Auth + Livestream Back-end & Front-end

Slide 35

Slide 35 text

/**
 * Apis that require an Authentication: Token header
 */
interface AuthenticatedLivestreamApi {
    @POST("/api/invite_to_stage")
    suspend fun inviteToStage(
        @Body body: IdentityRequest
    ): Response<Unit>

    @POST("/api/remove_from_stage")
    suspend fun removeFromStage(
        @Body body: IdentityRequest
    ): Response<Unit>

    @POST("/api/raise_hand")
    suspend fun requestToJoin(): Response<Unit>

    @POST("/api/stop_stream")
    suspend fun stopStream(): Response<Unit>
}

Auth + Livestream Back-end & Front-end

Slide 36

Slide 36 text

Auth + Livestream Back-end & Front-end
[Diagram: Client Apps and a Web App (http://localhost:3000) call the Auth and Livestream APIs; all clients talk to the LiveKit Server through the LiveKit APIs (room, participants, tracks, etc.)]

Slide 37

Slide 37 text

// NOTE: After we successfully authenticated...
mainNav.mainNavigate(RoomContainerRoute)

Slide 38

Slide 38 text

RoomScope(
    url = connectionDetails.wsUrl,
    token = connectionDetails.token,
    audio = rememberEnableMic(enableAudio),
    video = rememberEnableCamera(enableVideo),
    roomOptions = DefaultRoomOptions { options ->
        options.copy(
            videoTrackCaptureDefaults = LocalVideoTrackOptions(
                position = initialCamPos
            )
        )
    },
    liveKitOverrides = DefaultLKOverrides(context),
    onConnected = { Timber.d("RoomScreenContainer -> onConnected") },
    onDisconnected = {
        Toast.makeText(context, "Disconnected from livestream.", Toast.LENGTH_LONG).show()
        mainNav.mainPopBackstack(if (isHost) StartRoute else JoinRoute, false)
    },
    onError = { _, error ->
        if (error is RoomException.ConnectException) {
            Toast.makeText(
                context,
                "Error while joining. Check the code and try again.",
                Toast.LENGTH_LONG
            ).show()
            mainNav.mainPopBackstack(if (isHost) StartRoute else JoinRoute, false)
        }
    }
) { room ->

Slide 39

Slide 39 text

RoomScope(
    // ...
) { room ->
    RoomNavHost(cameraPosition, showOptionsDialogOnce)
}

Slide 40

Slide 40 text

fun RoomNavHost(
    cameraPosition: MutableState<CameraPosition>,
    showOptionsDialogOnce: MutableState<Boolean>,
    roomNav: RoomNav = koinInject()
) {
    // ...
    NavHost(
        navController = roomNavHostController,
        startDestination = RoomRoute
    ) {
        composable<RoomRoute> {
            // Pass in 'view state' that belongs to container.
            RoomScreen(
                cameraPosition = cameraPosition,
                showOptionsDialogOnce = showOptionsDialogOnce
            )
        }
        bottomSheet(StreamOptionsRoute.name) { StreamOptionsScreen() }
        bottomSheet(ParticipantListRoute.name) { ParticipantListScreen() }
        // FIXME -
        bottomSheet(ParticipantInfoRoute.name + "/{sid}") {
            val sid = it.arguments?.getString("sid")
            ParticipantInfoScreen(participantSid = sid)
        }
        bottomSheet(InvitedToStageRoute.name) { InvitedToStageScreen() }
    }
}

Slide 41

Slide 41 text

val tracks = rememberTracks(usePlaceholders = setOf(Track.Source.CAMERA))
val hostParticipant = rememberHostParticipant(roomMetadata.creatorIdentity)
val hostTrack = tracks.firstOrNull { track -> track.participant == hostParticipant }

// Get all the tracks for all the other participants.
val stageParticipants = rememberOnStageParticipants(roomMetadata.creatorIdentity)
val stageTracks = stageParticipants.map { p ->
    tracks.firstOrNull { track -> track.participant == p }
}

// Prioritize the host to the top.
val videoTracks = listOf(hostTrack).plus(stageTracks)

val metadatas = rememberParticipantMetadatas()

Slide 42

Slide 42 text

ParticipantGrid(
    videoTracks = videoTracks,
    isHost = isHost,
    modifier = Modifier
        .constrainAs(hostScreen) {
            width = Dimension.matchParent
            height = Dimension.fillToConstraints
            top.linkTo(parent.top)
            bottom.linkTo(chatBar.top)
        }
)

Slide 43

Slide 43 text

VideoTrackView(
    room = RoomLocal.current,
    trackReference = trackReference,
    mirror = isHost,
    scaleType = ScaleType.Fill,
    modifier = Modifier
        .clip(RoundedCornerShape(8.dp))
        .then(modifier)
)

Slide 44

Slide 44 text

References — Source Code and Documentation
Official LiveKit content:
•https://docs.livekit.io/home/
•https://github.com/livekit/livekit
•https://github.com/orgs/livekit-examples
Upgraded Android client:
•https://github.com/kanawish/control-room

Slide 45

Slide 45 text

Wut?

Connectivity, in order of preference:
•ICE over UDP: ideal connection type, used in the majority of conditions
•TURN with UDP (3478): used when ICE/UDP is unreachable
•ICE over TCP: used when the network disallows UDP (i.e. over VPN or corporate firewalls)
•TURN with TLS: used when the firewall only allows outbound TLS connections

Slide 46

Slide 46 text

ICE Interactive Connectivity Establishment

Slide 47

Slide 47 text

STUN Session Traversal Utilities for NAT

Slide 48

Slide 48 text

TURN Traversal Using Relays around NAT

Slide 49

Slide 49 text

https://developer.mozilla.org/en-US/docs/Web/API/WebRTC_API/Protocols
[Diagram: Peer A asks a STUN server "Who am I?" and receives its public address — "You are: 208.141.55.130:3255" — enabling a direct connection to Peer B]
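The "You are: …" answer arrives in a STUN Binding response as an XOR-MAPPED-ADDRESS attribute (RFC 5389): the server XORs the address and port with a fixed magic cookie before sending them back. A minimal decoder sketch, round-tripping the address from the slide:

```python
import struct

MAGIC_COOKIE = 0x2112A442  # fixed value defined by RFC 5389

def decode_xor_mapped_address(value: bytes) -> tuple:
    """Decode a STUN XOR-MAPPED-ADDRESS attribute value (IPv4 only) into (ip, port)."""
    _, family, xport = struct.unpack("!BBH", value[:4])
    assert family == 0x01, "this sketch handles IPv4 only"
    port = xport ^ (MAGIC_COOKIE >> 16)  # port is XORed with the cookie's top 16 bits
    (xaddr,) = struct.unpack("!I", value[4:8])
    addr = xaddr ^ MAGIC_COOKIE          # address is XORed with the full cookie
    ip = ".".join(str((addr >> shift) & 0xFF) for shift in (24, 16, 8, 0))
    return ip, port

# Encode 208.141.55.130:3255 the way a server would, then decode it back.
ip_int = (208 << 24) | (141 << 16) | (55 << 8) | 130
value = struct.pack("!BBHI", 0, 0x01, 3255 ^ (MAGIC_COOKIE >> 16), ip_int ^ MAGIC_COOKIE)
ip, port = decode_xor_mapped_address(value)
```

The XOR step exists because some NATs rewrite any literal IP address they spot inside packet payloads; obfuscating it keeps the mapping intact.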

Slide 50

Slide 50 text

https://developer.mozilla.org/en-US/docs/Web/API/WebRTC_API/Protocols
[Diagram: Peer A asks a STUN server "Who am I?" and receives "You are: 209.141.55.130:3255"; Peer B is behind a symmetric NAT, so traffic must be relayed through a TURN server]

Slide 51

Slide 51 text

https://developer.mozilla.org/en-US/docs/Web/API/WebRTC_API/ Connectivity#the_entire_exchange_in_a_complicated_diagram SDP

Slide 52

Slide 52 text

https://developer.mozilla.org/en-US/docs/Web/API/WebRTC_API/Connectivity#the_entire_exchange_in_a_complicated_diagram

Slide 53

Slide 53 text

Connectivity, in order of preference:
•ICE over UDP: ideal connection type, used in the majority of conditions
•TURN with UDP (3478): used when ICE/UDP is unreachable
•ICE over TCP: used when the network disallows UDP (i.e. over VPN or corporate firewalls)
•TURN with TLS: used when the firewall only allows outbound TLS connections
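The list above is a preference order: take the first candidate the network allows. An illustrative sketch of that selection logic — in reality the choice is made by ICE negotiation, not application code:

```python
# Candidate transports in the slide's order of preference, each with the
# network capabilities it requires ("turn" = a reachable TURN relay).
CANDIDATES = [
    ("ICE over UDP", {"udp"}),
    ("TURN with UDP (3478)", {"udp", "turn"}),
    ("ICE over TCP", {"tcp"}),
    ("TURN with TLS", {"tls", "turn"}),
]

def pick_transport(allowed: set) -> str:
    """Return the most-preferred transport whose requirements the network satisfies."""
    for name, needs in CANDIDATES:
        if needs <= allowed:
            return name
    raise RuntimeError("no viable transport")

open_network = pick_transport({"udp", "tcp", "tls", "turn"})
corporate_firewall = pick_transport({"tcp", "tls", "turn"})  # UDP blocked
tls_only = pick_transport({"tls", "turn"})                   # only outbound TLS allowed
```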

Slide 54

Slide 54 text

Auth Tokens

Slide 55

Slide 55 text

No content

Slide 56

Slide 56 text

async def draw_color_cycle(output_source: rtc.VideoSource, width, height):
    argb_frame = bytearray(width * height * 4)
    arr = np.frombuffer(argb_frame, dtype=np.uint8)

    framerate = 1 / 30
    hue = 0.0

    while True:
        start_time = asyncio.get_event_loop().time()

        rgb = colorsys.hsv_to_rgb(hue, 1.0, 1.0)
        rgb = [(x * 255) for x in rgb]  # type: ignore

        argb_color = np.array(rgb + [255], dtype=np.uint8)
        arr.flat[::4] = argb_color[0]
        arr.flat[1::4] = argb_color[1]
        arr.flat[2::4] = argb_color[2]
        arr.flat[3::4] = argb_color[3]

        frame = rtc.VideoFrame(width, height, rtc.VideoBufferType.RGBA, argb_frame)
        output_source.capture_frame(frame)

        hue = (hue + framerate / 3) % 1.0
        code_duration = asyncio.get_event_loop().time() - start_time
        await asyncio.sleep(1 / 30 - code_duration)

Slide 57

Slide 57 text

No content

Slide 58

Slide 58 text

async def draw_face_mask_to_video_loop(
    input_stream: rtc.VideoStream,
    output_source: rtc.VideoSource,
    show_window=True
):
    landmarker = FaceLandmarker.create_from_options(options)

    # cv2 commands are only for _local_ window/preview
    if show_window:
        cv2.namedWindow("livekit_video", cv2.WINDOW_NORMAL)
        cv2.startWindowThread()

    async for frame_event in input_stream:
        buffer: VideoFrame = frame_event.frame
        arr = np.frombuffer(buffer.data, dtype=np.uint8)
        arr = arr.reshape((buffer.height, buffer.width, 3))

        mp_image = mp.Image(image_format=mp.ImageFormat.SRGB, data=arr)
        detection_result = landmarker.detect_for_video(
            mp_image, frame_event.timestamp_us
        )
        draw_landmarks_on_image(arr, detection_result)

        frame = rtc.VideoFrame(buffer.width, buffer.height, rtc.VideoBufferType.RGB24, buffer.data)
        output_source.capture_frame(frame)

        if show_window:
            arr = cv2.cvtColor(arr, cv2.COLOR_RGB2BGR)
            cv2.imshow("livekit_video", arr)
            if cv2.waitKey(1) & 0xFF == ord("q"):
                break

    landmarker.close()
    if show_window:
        cv2.destroyAllWindows()

Slide 59

Slide 59 text

🛠

Slide 60

Slide 60 text

No content

Slide 61

Slide 61 text

No content

Slide 62

Slide 62 text

No content

Slide 63

Slide 63 text

No content

Slide 64

Slide 64 text

No content

Slide 65

Slide 65 text

No content

Slide 66

Slide 66 text

No content

Slide 67

Slide 67 text

No content

Slide 68

Slide 68 text

No content

Slide 69

Slide 69 text

No content

Slide 70

Slide 70 text

No content

Slide 71

Slide 71 text

No content

Slide 72

Slide 72 text

async def handle_frame_event(frame_event: VideoFrameEvent, output_source: rtc.VideoSource):
    buffer: VideoFrame = frame_event.frame
    arr = np.frombuffer(buffer.data, dtype=np.uint8)
    arr = arr.reshape((buffer.height, buffer.width, 3))
    src_image = cv2.cvtColor(arr, cv2.COLOR_RGB2BGR)

    gray = cv2.cvtColor(src_image, cv2.COLOR_BGR2GRAY)
    cv2.imshow(windows[0], gray)

    blurred = cv2.GaussianBlur(gray, (7, 7), 0)
    cv2.imshow(windows[1], blurred)

    _, thresh = cv2.threshold(blurred, 120, 255, cv2.THRESH_BINARY_INV)
    cv2.imshow(windows[2], thresh)

    contours, _ = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    dest_image = src_image
    for contour in contours:
        cv2.drawContours(dest_image, [contour], -1, (0, 255, 0), 2)
    cv2.imshow(windows[4], dest_image)

    frame = rtc.VideoFrame(
        buffer.width, buffer.height, rtc.VideoBufferType.RGB24,
        cv2.cvtColor(dest_image, cv2.COLOR_BGR2RGB).data
    )
    output_source.capture_frame(frame)
    cv2.waitKey(1)

Slide 73

Slide 73 text

https://kanastruk.com
https://github.com/kanawish/control-room
Thank you! 💕