I've spent a lot of time with #AWS application inference profiles this week. I hope my discoveries are useful, and please correct anything that's egregiously wrong.
Happy Friday!
Application Inference Profiles for AWS Bedrock foundation model cost and access management
One of the gaps identified early on when using Bedrock was the lack of cost control and access management for on-demand inference requests. While CloudWatch records basic token utilization and latency metrics, there was little to no visibility into ad hoc requests made from the Bedrock Playground, or when using Bedrock
