Building an ML powered Android Livestreaming App by Etienne Caron

This presentation explores the application of computer vision and machine learning models for real-time video and audio processing. We’ll demonstrate how this technology can enable the creation of a whole new category of live-streaming applications.

Developing a video conferencing application has historically been fairly complex. We will start with a brief overview of the LiveKit open-source APIs, showcasing how to build a simple and intuitive video streaming Android application.

Next, we will explore integrating various ML-powered agents into the experience. We will also illustrate how to use reactive programming techniques to create easily understandable and modifiable multi-stage processing pipelines.
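
To give a concrete feel for what such a multi-stage pipeline can look like, here is a minimal, hypothetical sketch using Python async generators, in the same style as the agent code shown later in the deck. The stage names (source, preprocess, infer) are illustrative only and not taken from the talk.

    import asyncio

    async def source(n=5):
        # Stand-in for a stream of incoming frames.
        for i in range(n):
            yield i
            await asyncio.sleep(0.01)

    async def preprocess(frames):
        # Stand-in for a preprocessing stage (e.g. resize / color conversion).
        async for f in frames:
            yield f * 2

    async def infer(frames):
        # Stand-in for an ML inference stage.
        async for f in frames:
            yield f + 1

    async def main():
        # Stages compose like a pipeline and can be added, removed or reordered independently.
        async for result in infer(preprocess(source())):
            print(result)

    asyncio.run(main())
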

Looking forward to exploring this exciting and innovative topic with you!

https://youtu.be/aUQg_5HEI2M

DevFest Montreal

GDG Montreal

November 14, 2024

Transcript

  1. PRODUCE DETECTION EXPLAINED. A high-level explanation of past Kanastruk customer work. AUTHOR: @kanawish, DATE: 24/04/20. [Pipeline diagram: Image Classifier, BackgroundSubtractor, 15fps, frames (f1) (f2) (f3) (...)]
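
One plausible reading of that diagram is that background subtraction gates which frames are worth handing to a heavier image classifier. A hedged OpenCV sketch of that idea follows; it is not the Kanastruk implementation, and classify() is a hypothetical stand-in for the actual model.

    import cv2

    # Hedged sketch: background subtraction decides when to run the classifier.
    subtractor = cv2.createBackgroundSubtractorMOG2(history=500, varThreshold=16)

    def classify(frame):
        # Hypothetical stand-in for the image classifier shown on the slide.
        return "produce?"

    def process(frame):
        mask = subtractor.apply(frame)                 # foreground mask for this frame
        if cv2.countNonZero(mask) > 0.01 * mask.size:  # something new appeared in view
            return classify(frame)
        return None
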
  2. 💡 ❤ 🛠 Put yourself in the user's shoes. Imagine the future. Make it real. Design Thinking Workshops
  3. Installing and spinning up a local LiveKit Server (LiveKit Server)

     ➜ ~ brew install livekit
     OK
     ➜ ~ livekit-server
     one of key-file or keys must be provided
     ➜ ~ livekit-server --dev
     INFO livekit starting in development mode
     INFO livekit no keys provided, using placeholder keys {"API Key": "devkey", "API Secret": "secret"}
     INFO livekit starting LiveKit server {"portHttp": 7880, "nodeID": "ND_SUoZgzemKouv", "nodeIP": "192.168.50.240", "version": "1.7.2", "bindAddresses": ["127.0.0.1", "::1"], "rtc.portTCP": 7881, "rtc.portUDP": {"Start":7882,"End":0}}

  4. [Diagram: three Client Apps connect to the LiveKit Server; separate Auth Services issue Auth Tokens. Mascot card: Official Blob ID, Name: Face Listening to You, Status: Not an Emoji yet, Cuteness: Yes]
  5. [Same diagram, now also showing the LiveKit Server's room, track and participant services alongside the Auth Services and Auth Tokens]
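
The auth service's main job is to mint a LiveKit access token with the right grants for a given identity and room. As a rough, hedged sketch using the LiveKit Python server SDK (livekit-api) and the "devkey"/"secret" placeholder keys printed by livekit-server --dev earlier; the identity and room name are placeholders, and the exact API should be checked against the SDK docs:

    from livekit import api

    # Hedged sketch: mint a join token with the dev keys from `livekit-server --dev`.
    token = (
        api.AccessToken("devkey", "secret")
        .with_identity("host-user")
        .with_grants(api.VideoGrants(room_join=True, room="my-livestream"))
        .to_jwt()
    )
    print(token)  # handed to the client, which presents it to the LiveKit server
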
  6. Installing the Livestream REST / Web App Sample.

     ➜ livekit git clone git@github.com:livekit-examples/livestream.git
     Cloning into 'livestream'...
     remote: Enumerating objects: 379, done.
     remote: Counting objects: 100% (157/157), done.
     remote: Compressing objects: 100% (97/97), done.
     remote: Total 379 (delta 76), reused 93 (delta 51), pack-reused 222 (from 1)
     Receiving objects: 100% (379/379), 537.75 KiB | 3.11 MiB/s, done.
     Resolving deltas: 100% (154/154), done.
     ➜ livekit cd livestream
     ➜ livestream git:(main) npm install
     added 511 packages, and audited 512 packages in 20s
     ➜ livestream git:(main) ✗ cp .env.example .env.development
     ➜ livestream git:(main) ✗ vi .env.development

  7. Running the Livestream REST / Web App Sample.

     ➜ livestream git:(main) npm run dev
     > [email protected] dev
     > next dev
     ▲ Next.js 14.0.1
     - Local: http://localhost:3000
     - Environments: .env.development
     /bin/sh: pnpm: command not found
     ✓ Ready in 2.7s
     ◦ Compiling /page ...
     ✓ Compiled /page in 2.8s (1025 modules)
     ✓ Compiled in 481ms (446 modules)
     ◦ Compiling /favicon.ico/route ...
     ✓ Compiled /favicon.ico/route in 535ms (1030 modules)

  8. [Screen map: Start Screen, Join Screen, Stream Options Screen, Participant Info Screen, Participant List Screen, Room Screen, Invited Screen, Start Screen, Room Container Screen]
  9. var userName by rememberSaveable(stateSaver = TextFieldValue.Saver) {
         mutableStateOf(TextFieldValue(preferencesManager.getUsername()))
     }
     var chatEnabled by rememberSaveable { mutableStateOf(true) }
     var viewerJoinRequestEnabled by rememberSaveable { mutableStateOf(true) }
     var cameraPosition by remember { mutableStateOf(CameraPosition.FRONT) }

  10. var response: CreateStreamResponse? = null
      try {
          response = livestreamApi.createStream(
              CreateStreamRequest(
                  metadata = RoomMetadata(
                      creatorIdentity = Participant.Identity(userName.text),
                      enableChat = chatEnabled,
                      allowParticipation = viewerJoinRequestEnabled,
                  )
              )
          ).body()
      } catch (e: Exception) {
          Timber.e(e) { "error" }
      }

  11. if (response != null) {
          Timber.e { "response received: $response" }
          appModel.run {
              connected(
                  authToken = response.authToken,
                  connectionDetails = response.connectionDetails,
                  isHost = true,
                  initialCamPos = cameraPosition
              )
              mainNav.mainNavigate(RoomContainerRoute)
          }
      } else {
          Timber.e { "response failed!" }
      }

  12. Auth + Livestream Back-end & Front-end

      /**
       * Apis used for the Livestream example.
       */
      interface LivestreamApi {
          @POST("/api/create_stream")
          suspend fun createStream(
              @Body body: CreateStreamRequest
          ): Response<CreateStreamResponse>

          @POST("/api/join_stream")
          suspend fun joinStream(
              @Body body: JoinStreamRequest
          ): Response<JoinStreamResponse>
      }

  13. Auth + Livestream Back-end & Front-end

      /**
       * Apis that require an Authentication: Token <token> header
       */
      interface AuthenticatedLivestreamApi {
          @POST("/api/invite_to_stage")
          suspend fun inviteToStage(
              @Body body: IdentityRequest
          ): Response<Unit>

          @POST("/api/remove_from_stage")
          suspend fun removeFromStage(
              @Body body: IdentityRequest
          ): Response<Unit>

          @POST("/api/raise_hand")
          suspend fun requestToJoin(): Response<Unit>

          @POST("/api/stop_stream")
          suspend fun stopStream(): Response<Unit>
      }

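
Since these routes expect an "Authentication: Token <token>" header, any client can exercise them once it holds the token returned by /api/create_stream or /api/join_stream. A hedged Python sketch against the local web app from slide 7 (the path and header scheme come from the interfaces above; everything else is a placeholder):

    import requests

    BASE_URL = "http://localhost:3000"  # the Next.js sample app started earlier

    def raise_hand(auth_token: str) -> None:
        # Sketch only: asks the host to be invited on stage.
        resp = requests.post(
            f"{BASE_URL}/api/raise_hand",
            headers={"Authorization": f"Token {auth_token}"},
        )
        resp.raise_for_status()
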
  14. Auth + Livestream Back-end & Front-end. [Diagram: Client Apps talk to the Web App (http://localhost:3000) through the Auth and Livestream APIs, and to the LiveKit Server through the LiveKit APIs (room, participants, tracks, etc.)]
  15. RoomScope(
          url = connectionDetails.wsUrl,
          token = connectionDetails.token,
          audio = rememberEnableMic(enableAudio),
          video = rememberEnableCamera(enableVideo),
          roomOptions = DefaultRoomOptions { options ->
              options.copy(
                  videoTrackCaptureDefaults = LocalVideoTrackOptions(
                      position = initialCamPos
                  )
              )
          },
          liveKitOverrides = DefaultLKOverrides(context),
          onConnected = { Timber.d("RoomScreenContainer -> onConnected") },
          onDisconnected = {
              Toast.makeText(context, "Disconnected from livestream.", Toast.LENGTH_LONG).show()
              mainNav.mainPopBackstack(if (isHost) StartRoute else JoinRoute, false)
          },
          onError = { _, error ->
              if (error is RoomException.ConnectException) {
                  Toast.makeText(
                      context,
                      "Error while joining. Check the code and try again.",
                      Toast.LENGTH_LONG
                  ).show()
                  mainNav.mainPopBackstack(if (isHost) StartRoute else JoinRoute, false)
              }
          }
      ) { room ->

  16. fun RoomNavHost(
          cameraPosition: MutableState<CameraPosition>,
          showOptionsDialogOnce: MutableState<Boolean>,
          roomNav: RoomNav = koinInject()
      ) {
          // ...
          NavHost(
              navController = roomNavHostController,
              startDestination = RoomRoute
          ) {
              composable<RoomRoute> {
                  // Pass in 'view state' that belongs to container.
                  RoomScreen(
                      cameraPosition = cameraPosition,
                      showOptionsDialogOnce = showOptionsDialogOnce
                  )
              }
              bottomSheet(StreamOptionsRoute.name) { StreamOptionsScreen() }
              bottomSheet(ParticipantListRoute.name) { ParticipantListScreen() }
              // FIXME -
              bottomSheet(ParticipantInfoRoute.name + "/{sid}") {
                  val sid = it.arguments?.getString("sid")
                  ParticipantInfoScreen(participantSid = sid)
              }
              bottomSheet(InvitedToStageRoute.name) { InvitedToStageScreen() }
          }
      }

  17. val tracks = rememberTracks(usePlaceholders = setOf(Track.Source.CAMERA))
      val hostParticipant = rememberHostParticipant(roomMetadata.creatorIdentity)
      val hostTrack = tracks.firstOrNull { track -> track.participant == hostParticipant }

      // Get all the tracks for all the other participants.
      val stageParticipants = rememberOnStageParticipants(roomMetadata.creatorIdentity)
      val stageTracks = stageParticipants.map { p ->
          tracks.firstOrNull { track -> track.participant == p }
      }

      // Prioritize the host to the top.
      val videoTracks = listOf(hostTrack).plus(stageTracks)

      val metadatas = rememberParticipantMetadatas()

  18. ParticipantGrid(
          videoTracks = videoTracks,
          isHost = isHost,
          modifier = Modifier
              .constrainAs(hostScreen) {
                  width = Dimension.matchParent
                  height = Dimension.fillToConstraints
                  top.linkTo(parent.top)
                  bottom.linkTo(chatBar.top)
              }
      )

  19. VideoTrackView(
          room = RoomLocal.current,
          trackReference = trackReference,
          mirror = isHost,
          scaleType = ScaleType.Fill,
          modifier = Modifier
              .clip(RoundedCornerShape(8.dp))
              .then(modifier)
      )

  20. Connectivity. Wut? In order of preference:
      • ICE over UDP: ideal connection type, used in majority of conditions
      • TURN with UDP (3478): used when ICE/UDP is unreachable
      • ICE over TCP: used when network disallows UDP (i.e. over VPN or corporate firewalls)
      • TURN with TLS: used when firewall only allows outbound TLS connections
  21. Connectivity. In order of preference:
      • ICE over UDP: ideal connection type, used in majority of conditions
      • TURN with UDP (3478): used when ICE/UDP is unreachable
      • ICE over TCP: used when network disallows UDP (i.e. over VPN or corporate firewalls)
      • TURN with TLS: used when firewall only allows outbound TLS connections
  22. async def draw_color_cycle(output_source: rtc.VideoSource, width, height):
          argb_frame = bytearray(width * height * 4)
          arr = np.frombuffer(argb_frame, dtype=np.uint8)

          framerate = 1 / 30
          hue = 0.0

          while True:
              start_time = asyncio.get_event_loop().time()

              rgb = colorsys.hsv_to_rgb(hue, 1.0, 1.0)
              rgb = [(x * 255) for x in rgb]  # type: ignore
              argb_color = np.array(rgb + [255], dtype=np.uint8)

              arr.flat[::4] = argb_color[0]
              arr.flat[1::4] = argb_color[1]
              arr.flat[2::4] = argb_color[2]
              arr.flat[3::4] = argb_color[3]

              frame = rtc.VideoFrame(width, height, rtc.VideoBufferType.RGBA, argb_frame)
              output_source.capture_frame(frame)
              hue = (hue + framerate / 3) % 1.0

              code_duration = asyncio.get_event_loop().time() - start_time
              await asyncio.sleep(1 / 30 - code_duration)

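
For context, draw_color_cycle needs an rtc.VideoSource that is actually published into a room. A hedged sketch of that wiring with the LiveKit Python SDK follows; ROOM_URL, TOKEN and the track name are placeholders, and the calls should be verified against the SDK's publish examples.

    import asyncio
    from livekit import rtc

    WIDTH, HEIGHT = 1280, 720
    ROOM_URL = "ws://localhost:7880"   # the local dev server from earlier slides
    TOKEN = "<access token>"           # e.g. minted as in the auth sketch above

    async def main():
        room = rtc.Room()
        await room.connect(ROOM_URL, TOKEN)

        # Create a video source, wrap it in a local track and publish it.
        source = rtc.VideoSource(WIDTH, HEIGHT)
        track = rtc.LocalVideoTrack.create_video_track("color-cycle", source)
        await room.local_participant.publish_track(
            track, rtc.TrackPublishOptions(source=rtc.TrackSource.SOURCE_CAMERA)
        )

        # Drive frames into the published track using the slide's function.
        await draw_color_cycle(source, WIDTH, HEIGHT)

    asyncio.run(main())
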
  23. async def draw_face_mask_to_video_loop(
          input_stream: rtc.VideoStream,
          output_source: rtc.VideoSource,
          show_window=True
      ):
          landmarker = FaceLandmarker.create_from_options(options)

          # cv2 commands are only for _local_ window/preview
          if show_window:
              cv2.namedWindow("livekit_video", cv2.WINDOW_NORMAL)
              cv2.startWindowThread()

          async for frame_event in input_stream:
              buffer: VideoFrame = frame_event.frame
              arr = np.frombuffer(buffer.data, dtype=np.uint8)
              arr = arr.reshape((buffer.height, buffer.width, 3))

              mp_image = mp.Image(image_format=mp.ImageFormat.SRGB, data=arr)
              detection_result = landmarker.detect_for_video(
                  mp_image, frame_event.timestamp_us
              )
              draw_landmarks_on_image(arr, detection_result)

              frame = rtc.VideoFrame(buffer.width, buffer.height, rtc.VideoBufferType.RGB24, buffer.data)
              output_source.capture_frame(frame)

              if show_window:
                  arr = cv2.cvtColor(arr, cv2.COLOR_RGB2BGR)
                  cv2.imshow("livekit_video", arr)
                  if cv2.waitKey(1) & 0xFF == ord("q"):
                      break

          landmarker.close()
          if show_window:
              cv2.destroyAllWindows()

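
Similarly, input_stream here is an rtc.VideoStream wrapped around a subscribed remote track. A hedged sketch of how an agent could obtain it (the event name and enum values follow the LiveKit Python SDK; treat the helper as an approximation, not the talk's exact code):

    import asyncio
    from livekit import rtc

    def attach_face_mask_agent(room: rtc.Room, output_source: rtc.VideoSource):
        # When a remote video track shows up, feed it into the landmark loop above.
        @room.on("track_subscribed")
        def on_track_subscribed(track, publication, participant):
            if track.kind == rtc.TrackKind.KIND_VIDEO:
                asyncio.ensure_future(
                    draw_face_mask_to_video_loop(rtc.VideoStream(track), output_source)
                )
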
  24. async def handle_frame_event(frame_event: VideoFrameEvent, output_source: rtc.VideoSource):
          buffer: VideoFrame = frame_event.frame
          arr = np.frombuffer(buffer.data, dtype=np.uint8)
          arr = arr.reshape((buffer.height, buffer.width, 3))

          src_image = cv2.cvtColor(arr, cv2.COLOR_RGB2BGR)

          gray = cv2.cvtColor(src_image, cv2.COLOR_BGR2GRAY)
          cv2.imshow(windows[0], gray)

          blurred = cv2.GaussianBlur(gray, (7, 7), 0)
          cv2.imshow(windows[1], blurred)

          _, thresh = cv2.threshold(blurred, 120, 255, cv2.THRESH_BINARY_INV)
          cv2.imshow(windows[2], thresh)

          contours, _ = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
          dest_image = src_image
          for contour in contours:
              cv2.drawContours(dest_image, [contour], -1, (0, 255, 0), 2)
          cv2.imshow(windows[4], dest_image)

          frame = rtc.VideoFrame(
              buffer.width, buffer.height, rtc.VideoBufferType.RGB24,
              cv2.cvtColor(dest_image, cv2.COLOR_BGR2RGB).data
          )
          output_source.capture_frame(frame)
          cv2.waitKey(1)