AWS Bites podcast

153. LLM Inference with Bedrock

6.3.2026
43:25

If you’re curious about building with LLMs, but you want to skip the hype and learn what it takes to ship something reliable in production, this episode is for you.

We share our real-world experience building AI-powered apps and the gotchas you hit after the demo: tokens and cost, quotas and throttling, IAM and access friction, marketplace subscriptions, and structured outputs that do not break your JSON parser.

We focus on Amazon Bedrock as AWS’s managed inference layer: how to get started with the current access model, how to choose models, how pricing works, and what to watch for in production.

We also go deep on structured outputs: constrained decoding, schema design that improves output quality, and how to avoid “grammar compilation timed out”.
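As a taste of the structured-outputs topic, here is a minimal sketch (our illustration, not code from the episode) of one way to get schema-shaped JSON out of Bedrock’s Converse API by forcing a tool call, with adaptive client-side retries to soften throttling. The tool name, schema, and model ID are made-up examples; model availability and tool-choice support vary by model, region, and the access you have enabled.

```python
# Sketch: structured output from Bedrock via the Converse API's tool-use
# mechanism, plus adaptive retries for ThrottlingException. Example values
# (model ID, tool name, schema) are illustrative assumptions.
import json

import boto3
from botocore.config import Config

# Adaptive retry mode backs off automatically when the service throttles us.
bedrock = boto3.client(
    "bedrock-runtime",
    config=Config(retries={"max_attempts": 10, "mode": "adaptive"}),
)

# Describing the desired output as a tool's input schema lets Bedrock
# constrain the model to emit JSON matching the schema, instead of hoping
# a "reply only in JSON" prompt survives contact with the model.
extract_tool = {
    "toolSpec": {
        "name": "record_sentiment",  # hypothetical tool name
        "description": "Record the sentiment of a piece of customer feedback.",
        "inputSchema": {
            "json": {
                "type": "object",
                "properties": {
                    "sentiment": {
                        "type": "string",
                        "enum": ["positive", "neutral", "negative"],
                    },
                    "summary": {"type": "string"},
                },
                "required": ["sentiment", "summary"],
            }
        },
    }
}

response = bedrock.converse(
    modelId="anthropic.claude-3-5-sonnet-20240620-v1:0",  # example model ID
    messages=[{
        "role": "user",
        "content": [{"text": "Feedback: 'Setup was painful but support was great.'"}],
    }],
    toolConfig={
        "tools": [extract_tool],
        # Force the model to answer via the tool, i.e. as schema-shaped JSON.
        "toolChoice": {"tool": {"name": "record_sentiment"}},
    },
)

# The structured result comes back as a toolUse content block, already parsed.
for block in response["output"]["message"]["content"]:
    if "toolUse" in block:
        print(json.dumps(block["toolUse"]["input"], indent=2))
```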


In this episode, we mentioned the following resources:


Do you have any AWS questions you would like us to address?

Leave a comment here or connect with us on X/Twitter, BlueSky or LinkedIn:


- https://twitter.com/eoins | https://bsky.app/profile/eoin.sh | https://www.linkedin.com/in/eoins/

- https://twitter.com/loige | https://bsky.app/profile/loige.co | https://www.linkedin.com/in/lucianomammino/
