Context window
Max response tokens
API rate limit
Reasoning ability
Slide 64
Slide 64 text
Context window
Max response tokens
API rate limit
Reasoning ability
Cost
Performance
Slide 65
Slide 65 text
Context window
Max response tokens
API rate limit
Reasoning ability
Cost
Performance
Important selection criteria for LLMs
Slide 66
Slide 66 text
Doing cool AI stuff!
Working around AI limits
Slide 67
Slide 67 text
Doing cool AI stuff!
Working around AI limits
Stop playing with my bowl…
Slide 68
Slide 68 text
Context window
Max response tokens
API rate limit
Reasoning ability
Cost
Performance
Slide 69
Slide 69 text
Claude 3.5 Sonnet’s default throughput is 50 requests per minute
Slide 70
Slide 70 text
Claude 3.5 Sonnet’s default throughput is 50 requests per minute
Can be raised to 1,000 requests per minute
Slide 71
Slide 71 text
Claude 3.5 Sonnet’s default throughput is 50 requests per minute
Can be raised to 1,000 requests per minute
Bedrock has cross-region inference
Slide 72
Slide 72 text
Mitigate API rate limit
Raise account limits.
Use Bedrock cross-region inference.
Slide 73
Slide 73 text
Mitigate API rate limit
Raise account limits.
Use Bedrock cross-region inference.
Limit the number of parallel requests per PR.
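One way to cap parallel requests per PR is a semaphore around each model call. The sketch below is illustrative only: `review_file` is a hypothetical stand-in for a real LLM call, and the cap of 3 is an assumed value, not a figure from the talk.

```python
import asyncio
from typing import Awaitable, Callable

MAX_PARALLEL_PER_PR = 3  # assumed cap; tune it against the account's rate limit


async def review_file(path: str) -> str:
    """Hypothetical stand-in for one LLM call that reviews a single file."""
    await asyncio.sleep(0.01)  # simulates network latency
    return f"reviewed {path}"


async def review_pr(
    paths: list[str],
    worker: Callable[[str], Awaitable[str]] = review_file,
    max_parallel: int = MAX_PARALLEL_PER_PR,
) -> list[str]:
    """Review every file in a PR with at most `max_parallel` calls in flight."""
    semaphore = asyncio.Semaphore(max_parallel)

    async def bounded(path: str) -> str:
        # Each task waits here until one of the `max_parallel` slots is free.
        async with semaphore:
            return await worker(path)

    # gather returns results in the same order as `paths`.
    return await asyncio.gather(*(bounded(p) for p in paths))


if __name__ == "__main__":
    print(asyncio.run(review_pr(["a.py", "b.py", "c.py", "d.py"])))
```

Because the semaphore is created per call to `review_pr`, the cap applies per PR rather than across the whole account.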
Slide 74
Slide 74 text
Mitigate API rate limit
Raise account limits.
Use Bedrock cross-region inference.
Limit the number of parallel requests per PR.
Fall back to Anthropic and to less powerful models (Claude 3 Sonnet, Claude 3.5 Haiku).
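The fallback idea can be sketched as an ordered list of backends that is walked only when the current one is throttled. Everything named here is an assumption for illustration: `RateLimited` is a hypothetical exception that a client wrapper would raise, and each backend is any callable that takes a prompt and returns text.

```python
from typing import Callable, Optional, Sequence

Backend = tuple[str, Callable[[str], str]]


class RateLimited(Exception):
    """Hypothetical error raised by a client wrapper when a provider throttles."""


def invoke_with_fallback(prompt: str, backends: Sequence[Backend]) -> tuple[str, str]:
    """Try each (name, call) backend in order; move on only when throttled.

    Returns the name of the backend that answered together with its response.
    """
    last_error: Optional[RateLimited] = None
    for name, call in backends:
        try:
            return name, call(prompt)
        except RateLimited as err:
            last_error = err  # throttled: try the next backend in the list
    raise RuntimeError("all backends were rate limited") from last_error
```

Ordering the list from the preferred model to the less powerful ones keeps quality high when capacity allows. Errors other than throttling propagate unchanged, so they are not hidden by a silent downgrade.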
Slide 75
Slide 75 text
Future work: incorporate other models (Nova, DeepSeek, etc.)
Slide 76
Slide 76 text
Future work: incorporate other models (Nova, DeepSeek, etc.)
Also good for cost control!
Slide 77
Slide 77 text
Lesson: LLMs are still quite expensive
Slide 78
Slide 78 text
No content
Slide 79
Slide 79 text
Difficult to build a sustainable and competitive business
Slide 80
Slide 80 text
Cost control
Only analyse changed lines.
Slide 81
Slide 81 text
Cost control
Only analyse changed lines.
Good for cost control
Good for UX
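Analysing only changed lines requires knowing which lines a PR touched. A minimal sketch, assuming the PR is available as a unified diff (the format `git diff` produces); the function name and return shape are illustrative, not taken from the talk.

```python
import re
from typing import Optional

# Matches hunk headers such as "@@ -10,2 +11,3 @@"; a missing length means 1.
HUNK_HEADER = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def changed_lines(diff_text: str) -> dict[Optional[str], list[int]]:
    """Map each file in a unified diff to the new-side line numbers it adds."""
    result: dict[Optional[str], list[int]] = {}
    path: Optional[str] = None
    old_left = new_left = 0  # lines still expected in the current hunk
    new_no = 0               # new-side line number of the next hunk line
    for line in diff_text.splitlines():
        if old_left > 0 or new_left > 0:  # inside a hunk body
            if line.startswith("+"):
                result.setdefault(path, []).append(new_no)
                new_no += 1
                new_left -= 1
            elif line.startswith("-"):
                old_left -= 1
            elif line.startswith("\\"):
                pass  # "\ No newline at end of file" marker
            else:  # context line, present on both sides
                new_no += 1
                old_left -= 1
                new_left -= 1
            continue
        if line.startswith("+++ "):
            target = line[4:].split("\t")[0]
            path = target[2:] if target.startswith("b/") else target
            continue
        match = HUNK_HEADER.match(line)
        if match:
            old_left = int(match.group(2) or 1)
            new_no = int(match.group(3))
            new_left = int(match.group(4) or 1)
    return result
```

Counting down the hunk lengths from each header, rather than only checking prefixes, keeps an added line whose content starts with `++` from being mistaken for a file header.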
Slide 82
Slide 82 text
Cost control
Only analyse changed lines.
Limit free users to a few PRs per month.
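A monthly cap for free users comes down to a counter keyed by user and calendar month. The sketch below keeps the counter in memory purely for illustration; the limit of 5 is an assumed value (the slide only says "a few"), and a deployed version would need a shared store with an atomic increment.

```python
from collections import defaultdict
from datetime import date

FREE_PRS_PER_MONTH = 5  # assumed value; the slide only says "a few"


class MonthlyQuota:
    """In-memory per-user PR counter that resets each calendar month."""

    def __init__(self, limit: int = FREE_PRS_PER_MONTH) -> None:
        self.limit = limit
        self._used: defaultdict[tuple[str, int, int], int] = defaultdict(int)

    def try_consume(self, user_id: str, today: date) -> bool:
        """Record one PR for the user; return False if the month's quota is spent."""
        key = (user_id, today.year, today.month)
        if self._used[key] >= self.limit:
            return False
        self._used[key] += 1
        return True
```

Keying on year and month means the reset needs no scheduled job: a new month simply starts a new counter.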
Slide 83
Slide 83 text
API Gateway
DynamoDB
Bedrock
EventBridge
Webhook
Slide 84
Slide 84 text
API Gateway
DynamoDB
Bedrock
EventBridge
Webhook
Built-in retries & DLQ
Slide 85
Slide 85 text
Lambda timed out
after 15 mins
Slide 86
Slide 86 text
Lambda timed out
after 15 mins
Reprocess files on retry…
Slide 87
Slide 87 text
Lambda timed out
after 15 mins
Reprocess files on retry…
Duplicated side-effects (e.g. GitHub comments)
Slide 88
Slide 88 text
Cost control
Only analyse changed lines.
Limit free users to a few PRs per month.
Use checkpoints to avoid re-processing files on retries.
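The checkpoint idea can be sketched as: record each file as done right after it is processed, and skip recorded files on a retry. This is a minimal illustration under stated assumptions: the checkpoint lives in a local JSON file (a Lambda-based system would need durable storage instead), and `review_file` is a hypothetical callable that performs the review and its side effects.

```python
import json
from pathlib import Path
from typing import Callable, Iterable


def review_with_checkpoints(
    pr_id: str,
    files: Iterable[str],
    review_file: Callable[[str], None],
    checkpoint_path: Path,
) -> list[str]:
    """Review files, saving progress after each so a retry resumes where it stopped.

    Returns the files processed during this attempt.
    """
    done: set[str] = set()
    if checkpoint_path.exists():
        saved = json.loads(checkpoint_path.read_text())
        if saved.get("pr_id") == pr_id:  # ignore checkpoints left by another PR
            done = set(saved["done"])

    processed = []
    for path in files:
        if path in done:
            continue  # reviewed (and commented on) in an earlier attempt
        review_file(path)  # side effects such as posting a comment happen here
        done.add(path)
        checkpoint_path.write_text(json.dumps({"pr_id": pr_id, "done": sorted(done)}))
        processed.append(path)
    return processed
```

A retry then pays only for the files that were not finished, and files already commented on are not commented on again. A small window remains between the side effect and the checkpoint write, so a crash at that exact point can still repeat one file.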