Slide 1

Slide 1 text

How we built an AI Code Reviewer with Serverless and Bedrock

Slide 2

Slide 2 text

Yan Cui http://theburningmonk.com @theburningmonk AWS user since 2010

Slide 3

Slide 3 text

Yan Cui http://theburningmonk.com @theburningmonk running serverless in production since 2016

Slide 4

Slide 4 text

Developer Advocate @ Yan Cui http://theburningmonk.com @theburningmonk

Slide 5

Slide 5 text

Yan Cui http://theburningmonk.com @theburningmonk independent consultant

Slide 6

Slide 6 text

No content

Slide 7

Slide 7 text

evolua.io Demo

Slide 8

Slide 8 text

Architecture

Slide 9

Slide 9 text

API Gateway EventBridge Webhook

Slide 10

Slide 10 text

API Gateway DynamoDB Bedrock EventBridge Webhook

Slide 11

Slide 11 text

API Gateway DynamoDB Bedrock EventBridge Webhook

Slide 12

Slide 12 text

API Gateway DynamoDB Bedrock EventBridge Webhook evolua.io

Slide 13

Slide 13 text

No content

Slide 14

Slide 14 text

API Gateway DynamoDB Bedrock EventBridge Webhook AppSync evolua.io

Slide 15

Slide 15 text

API Gateway DynamoDB Bedrock EventBridge Webhook AppSync evolua.io

Slide 16

Slide 16 text

No content

Slide 17

Slide 17 text

API Gateway DynamoDB Bedrock EventBridge Webhook AppSync evolua.io Authoriser

Slide 18

Slide 18 text

API Gateway DynamoDB Bedrock EventBridge Webhook AppSync evolua.io Authoriser

Slide 19

Slide 19 text

API Gateway DynamoDB Bedrock EventBridge Webhook AppSync evolua.io Authoriser

Slide 20

Slide 20 text

API Gateway DynamoDB Bedrock EventBridge Webhook AppSync evolua.io Authoriser

Slide 21

Slide 21 text

API Gateway DynamoDB Bedrock EventBridge Webhook AppSync evolua.io Authoriser

Slide 22

Slide 22 text

Challenges (for an AI code reviewer) Handling sensitive data for customers

Slide 23

Slide 23 text

Challenges (for an AI code reviewer) Large files. Large PRs with many files. Handling sensitive data for customers

Slide 24

Slide 24 text

Why Bedrock?

Slide 25

Slide 25 text

Security

Slide 26

Slide 26 text

Security Data is encrypted at rest.

Slide 27

Slide 27 text

www.wiz.io/blog/wiz-research-uncovers-exposed-deepseek-database-leak

Slide 28

Slide 28 text

aws.amazon.com/bedrock/faqs

Slide 29

Slide 29 text

Security Data is encrypted at rest. Inputs & Outputs are not shared with model providers. Inputs & Outputs are not used to train other models.

Slide 30

Slide 30 text

API Gateway DynamoDB Bedrock EventBridge Webhook AppSync evolua.io Authoriser Fallback Primary

Slide 31

Slide 31 text

privacy.anthropic.com/en/articles/7996885-how-do-you-use-personal-data-in-model-training

Slide 32

Slide 32 text

Serverless

Slide 33

Slide 33 text

Serverless Usage-based AND provisioned throughput pricing

Slide 34

Slide 34 text

No content

Slide 35

Slide 35 text

No content

Slide 36

Slide 36 text

Price per 1M Input Tokens / 1M Output Tokens
v3: $0.14 / $0.28
r1: $0.55 / $2.19
Sonnet: $3.75 / $15.0
Haiku: $0.80 / $4.00
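To make the per-token prices concrete, here is a minimal sketch of the arithmetic for one request. The function name, the `price` object shape and the token counts are assumptions for illustration; the prices are the Haiku figures from this slide ($0.80 per 1M input tokens, $4.00 per 1M output tokens).

```javascript
// Hypothetical helper: estimate the cost of one model call in USD.
// `price` holds per-million-token prices, e.g. taken from the slide.
function estimateCostUsd(inputTokens, outputTokens, price) {
  const inputCost = (inputTokens / 1e6) * price.inputPerMillion;
  const outputCost = (outputTokens / 1e6) * price.outputPerMillion;
  return inputCost + outputCost;
}

// Haiku prices as shown on the slide.
const haiku = { inputPerMillion: 0.8, outputPerMillion: 4.0 };

// Illustrative token counts (assumed): 200k input tokens, 20k output tokens.
const cost = estimateCostUsd(200_000, 20_000, haiku);
console.log(cost.toFixed(2));
```

Output tokens are priced several times higher than input tokens for every model on the slide, so capping response length matters as much as trimming the prompt.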

Slide 37

Slide 37 text

Very cost efficient!

Slide 38

Slide 38 text

Very cost efficient! Data is stored in China.

Slide 39

Slide 39 text

Very cost efficient! Data is stored in China. Data might be used to train other models.

Slide 40

Slide 40 text

www.wiz.io/blog/wiz-research-uncovers-exposed-deepseek-database-leak

Slide 41

Slide 41 text

Very cost efficient! Data is stored in China. Data might be used to train other models. Operationally immature.

Slide 42

Slide 42 text

No content

Slide 43

Slide 43 text

No token-based pricing yet

Slide 44

Slide 44 text

No token-based pricing yet “GPU-based instance type like ml.p5e.48xlarge is recommended”

Slide 45

Slide 45 text

ml.p5e.48xlarge 💰💰💰💰💰💰💰💰💰💰 💰💰💰💰💰💰💰💰💰💰 💰💰💰💰💰💰💰💰💰💰 💰💰💰💰💰💰💰💰💰💰 💰💰💰💰💰💰💰💰

Slide 46

Slide 46 text

Other capabilities Guardrails Knowledge base (managed RAG) Agents Cross-region inference Model evaluations

Slide 47

Slide 47 text

No content

Slide 48

Slide 48 text

No content

Slide 49

Slide 49 text

No content

Slide 50

Slide 50 text

API Gateway DynamoDB Bedrock EventBridge Webhook AppSync evolua.io Authoriser Fallback Primary

Slide 51

Slide 51 text

Lessons

Slide 52

Slide 52 text

Webhook

Slide 53

Slide 53 text

Webhook Analyse changes

Slide 54

Slide 54 text

Webhook Analyse changes Feedback

Slide 55

Slide 55 text

Condensed view…

Slide 56

Slide 56 text

No content

Slide 57

Slide 57 text

Lambda timed out after 15 mins

Slide 58

Slide 58 text

Succeeded on automatic retry

Slide 59

Slide 59 text

Webhook Analyse changes Feedback LLM limits GitHub limits AWS limits

Slide 60

Slide 60 text

Lesson: AI is 10% of the problem

Slide 61

Slide 61 text

No content

Slide 62

Slide 62 text

Reasoning ability

Slide 63

Slide 63 text

Context window Max response tokens API rate limit Reasoning ability

Slide 64

Slide 64 text

Context window Max response tokens API rate limit Reasoning ability Cost Performance

Slide 65

Slide 65 text

Context window Max response tokens API rate limit Reasoning ability Cost Performance Important selection criteria for LLMs

Slide 66

Slide 66 text

Doing cool AI stuff! Working around AI limits

Slide 67

Slide 67 text

Doing cool AI stuff! Working around AI limits Stop playing with my bowl…

Slide 68

Slide 68 text

Context window Max response tokens API rate limit Reasoning ability Cost Performance

Slide 69

Slide 69 text

Claude 3.5 Sonnet’s default throughput is 50 requests per minute

Slide 70

Slide 70 text

Claude 3.5 Sonnet’s default throughput is 50 requests per minute Can be raised to 1,000 per minute

Slide 71

Slide 71 text

Claude 3.5 Sonnet’s default throughput is 50 requests per minute Can be raised to 1,000 per minute Bedrock has cross-region inference

Slide 72

Slide 72 text

Mitigate API rate limit Raise account limits. Use Bedrock cross-region inference.

Slide 73

Slide 73 text

Mitigate API rate limit Raise account limits. Use Bedrock cross-region inference. Limit no. of parallel requests per PR.

Slide 74

Slide 74 text

Mitigate API rate limit Raise account limits. Use Bedrock cross-region inference. Limit no. of parallel requests per PR. Fall back to Anthropic & less powerful models (Claude 3 Sonnet, Claude 3.5 Haiku)
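Two of the mitigations above, limiting parallel requests per PR and falling back to another model when throttled, can be sketched roughly as follows. This is an illustration under assumptions, not the talk's implementation: `mapWithConcurrency`, `invokeWithFallback` and the `model.invoke` interface are hypothetical names, and the sketch assumes throttling errors carry the name `ThrottlingException`.

```javascript
// Run `worker` over `items` with at most `limit` calls in flight at once.
async function mapWithConcurrency(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0;
  // Each runner pulls the next unclaimed index until none are left.
  async function runner() {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index], index);
    }
  }
  const count = Math.min(limit, items.length);
  await Promise.all(Array.from({ length: count }, () => runner()));
  return results;
}

// Try each model in order; move on only when the error looks like throttling.
// `models` is a list of hypothetical clients exposing `invoke(prompt)`.
async function invokeWithFallback(models, prompt) {
  let lastError = new Error("no models configured");
  for (const model of models) {
    try {
      return await model.invoke(prompt);
    } catch (err) {
      if (err.name !== "ThrottlingException") throw err; // not a rate limit: surface it
      lastError = err;
    }
  }
  throw lastError; // every model was throttled
}
```

A per-PR call might then look like `mapWithConcurrency(files, 5, (file) => invokeWithFallback([primary, fallback], promptFor(file)))`, where the limit of 5 and `promptFor` are placeholders.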

Slide 75

Slide 75 text

Future work: incorporate other models (Nova, DeepSeek, etc.)

Slide 76

Slide 76 text

Future work: incorporate other models (Nova, DeepSeek, etc.) Also good for cost control!

Slide 77

Slide 77 text

Lesson: LLMs are still quite expensive

Slide 78

Slide 78 text

No content

Slide 79

Slide 79 text

Difficult to build a sustainable and competitive business

Slide 80

Slide 80 text

Cost control Only analyse changed lines.

Slide 81

Slide 81 text

Cost control Only analyse changed lines. Good for cost control Good for UX
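One way to find the changed lines is to walk the unified diff for each file and record the new-file line numbers of added lines. The sketch below is an assumption about how this could be done, not the talk's code; `changedLineNumbers` is a hypothetical name, and the input is assumed to be a unified-diff patch that starts at the first `@@` hunk header.

```javascript
// Return the new-file line numbers of every added ("+") line in a unified diff.
function changedLineNumbers(patch) {
  const changed = [];
  let newLine = 0;
  let inHunk = false;
  for (const line of patch.split("\n")) {
    const hunk = /^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@/.exec(line);
    if (hunk) {
      newLine = Number(hunk[1]); // first new-file line covered by this hunk
      inHunk = true;
    } else if (!inHunk) {
      continue; // skip any file headers before the first hunk
    } else if (line.startsWith("+")) {
      changed.push(newLine++); // added line: record it and advance
    } else if (line.startsWith(" ")) {
      newLine++; // context line: advance without recording
    }
    // Removed ("-") lines do not exist in the new file, so nothing advances.
  }
  return changed;
}

const patch = ["@@ -1,3 +1,4 @@", " a", "-b", "+B", "+C", " d"].join("\n");
console.log(changedLineNumbers(patch)); // [ 2, 3 ]
```

Only the lines returned here (plus whatever surrounding context the prompt needs) would be sent to the model, which keeps the token count down and keeps comments on code the author actually touched.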

Slide 82

Slide 82 text

Cost control Only analyse changed lines. Limit free users to a few PRs per month.

Slide 83

Slide 83 text

API Gateway DynamoDB Bedrock EventBridge Webhook

Slide 84

Slide 84 text

API Gateway DynamoDB Bedrock EventBridge Webhook Built-in retries & DLQ

Slide 85

Slide 85 text

Lambda timed out after 15 mins

Slide 86

Slide 86 text

Lambda timed out after 15 mins Reprocess files on retry…

Slide 87

Slide 87 text

Lambda timed out after 15 mins Reprocess files on retry… Duplicated side-effects (e.g. GitHub comments)

Slide 88

Slide 88 text

Cost control Only analyse changed lines. Limit free users to a few PRs per month. Use checkpoints to avoid re-processing files on retries

Slide 89

Slide 89 text

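// Checkpoint the analysis so a retry does not call the model again for this file.
const issues = await executeIdempotently(
  `${eventId}-${filename}-analyze`,
  () => analyzeFile(file)
);
...
// Checkpoint the side-effect so a retry does not post a duplicate comment.
await executeIdempotently(
  `${eventId}-${filename}-add-gh-comment`,
  () => addReviewComment(filename, comment)
);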
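The slide does not show how `executeIdempotently` works. A minimal sketch of the idea, assuming a key-value checkpoint store, might look like this. The in-memory `Map` is a stand-in for illustration only: checkpoints must live in a durable store to survive a Lambda retry, and which store the talk's system uses is not stated on the slide.

```javascript
// Hypothetical checkpoint store. An in-memory Map keeps this sketch runnable,
// but it would NOT survive a new Lambda invocation; use a durable store there.
const checkpoints = new Map();

// Run `action` at most once per `key`; later calls reuse the saved result.
async function executeIdempotently(key, action) {
  if (checkpoints.has(key)) {
    return checkpoints.get(key); // step already completed on an earlier attempt
  }
  const result = await action();
  checkpoints.set(key, result); // record the checkpoint only after success
  return result;
}
```

If the function dies after `action()` completes but before the checkpoint is written, the step runs again on retry, so this gives at-least-once execution with far fewer duplicates rather than a strict exactly-once guarantee.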

Slide 90

Slide 90 text

Webhook Analyse changes Feedback Why not Step Functions?

Slide 91

Slide 91 text

Webhook Analyse changes Feedback Why not Step Functions? Checkpoints are just easier 🤷

Slide 92

Slide 92 text

Lesson: Latency is a challenge

Slide 93

Slide 93 text

Models take tens of seconds to analyse each file

Slide 94

Slide 94 text

Wasted CPU cycles in Lambda

Slide 95

Slide 95 text

Future work: try other models

Slide 96

Slide 96 text

Future work: make use of these CPU cycles

Slide 97

Slide 97 text

Lesson: Beware of hallucinations

Slide 98

Slide 98 text

“Give me JSON in this format”

Slide 99

Slide 99 text

No content

Slide 100

Slide 100 text

“Give me JSON in this format” “Nope!”

Slide 101

Slide 101 text

No content

Slide 102

Slide 102 text

Non-existent codes, invalid URLs

Slide 103

Slide 103 text

Non-existent line numbers
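One defence against these failure modes is to treat the model's reply as untrusted input: parse it defensively and drop any issue that points at a line outside the diff. The sketch below assumes a response format that is not shown in the slides (a JSON array of `{ line, comment }` objects), and `parseReview` is a hypothetical name.

```javascript
// Validate a raw model reply against the set of line numbers that really changed.
// Assumed format: a JSON array of objects like { "line": 12, "comment": "..." }.
function parseReview(rawResponse, validLines) {
  let parsed;
  try {
    parsed = JSON.parse(rawResponse);
  } catch {
    return { ok: false, reason: "not valid JSON", issues: [] };
  }
  if (!Array.isArray(parsed)) {
    return { ok: false, reason: "expected a JSON array", issues: [] };
  }
  const allowed = new Set(validLines);
  // Keep only well-formed issues that reference a line present in the diff.
  const issues = parsed.filter(
    (issue) =>
      issue !== null &&
      typeof issue === "object" &&
      Number.isInteger(issue.line) &&
      allowed.has(issue.line) &&
      typeof issue.comment === "string"
  );
  return { ok: true, reason: null, issues };
}
```

A reply that fails parsing can be retried or discarded, and silently dropping issues with invalid line numbers avoids posting comments on lines that do not exist.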

Slide 104

Slide 104 text

Future work

Slide 105

Slide 105 text

Go to evolua.io to try it out. We’d love your feedback!

Slide 106

Slide 106 text

Questions?