Standard AWS security reviews are good at what they’re designed for. They find misconfigured S3 buckets, overpermissioned IAM policies, disabled GuardDuty, missing CloudTrail regions, unencrypted resources. Those findings matter and they’re worth fixing.
They’re not built for what happens when you add Bedrock.
What a standard AWS security review actually covers
A typical review runs automated tooling against Security Hub, Config, IAM Access Analyzer, and CloudTrail. It produces a list of findings mapped to a framework — Essential Eight, CIS Benchmarks, or AWS Foundational Security Best Practices. The findings are real. The coverage, relative to an AI workload’s actual risk surface, is incomplete.
None of those tools were designed to ask: what can the model do? What inputs does it accept? Can those inputs be used against it? What data does the service role have access to beyond what the inference job requires?
That’s not a criticism of the tools — they were built before AI services were production workloads. It’s a gap that needs to be addressed separately.
The AI risk surface that infrastructure reviews miss
When you add Bedrock to an AWS environment, you introduce a new category of risk that doesn’t map to traditional infrastructure controls:
- The model accepts inputs. A user — or an attacker — can send requests that manipulate the model’s behavior in ways that aren’t governed by IAM or network controls.
- The model role can access things. The IAM role Bedrock uses to invoke inference often has broader permissions than the specific job requires.
- The model processes data. Prompts, system instructions, knowledge base documents, and conversation context are all data that can potentially be extracted through adversarial input.
- Third-party models add integration surface. If you’re using marketplace models (Claude, Llama, Mistral), each integration point is an additional attack surface with its own characteristics.
Prompt injection in a Bedrock context
Prompt injection is the AI equivalent of SQL injection — an attacker supplies input that the model processes as instruction rather than data. In a Bedrock context, a common pattern:
Example scenario
A customer-facing chatbot has a system prompt that defines its role and restricts what it discusses. A user sends a message structured to override those instructions — claiming to be a system administrator, or embedding instruction-like text that the model interprets as a new directive. The model follows the injected instruction instead of the system prompt.
The consequence depends on what the model can do. If it only generates text, the damage is limited to what it says. If it can call tools, write to a database, or retrieve documents from a knowledge base, the consequence scales with those capabilities.
Many Bedrock implementations don’t test for this at all. The application is tested for functionality — does it answer questions correctly? — but not for adversarial behavior under crafted inputs.
IAM for Bedrock: the overpermissioned role problem
Most Bedrock implementations use a service role that’s broader than required. The model role might have s3:GetObject on a wide prefix rather than the specific bucket and key path the inference job needs. Or it can invoke any model in the account rather than the specific model ID in use.
This happens partly because AWS documentation frequently shows broad permission examples, and partly because scoping permissions correctly for a new service requires understanding its specific operation model — which takes time to get right. The result is a role that has more access than required and a security assumption that’s been left unvalidated.
An overpermissioned model role doesn’t necessarily create an exploitable vulnerability on its own. But in combination with a prompt injection vulnerability, it determines the blast radius. A model that can be prompted to retrieve and return data from S3, with a role that can read from sensitive prefixes, is a data exposure risk.
Data exposure through RAG pipelines
Retrieval-Augmented Generation (RAG) pipelines retrieve relevant documents from a knowledge base before generating a response. The assumption is that retrieval returns relevant documents for the user’s query. The risk is that adversarial inputs can retrieve documents the user shouldn’t see.
This is particularly common in implementations where:
- The knowledge base contains documents at different classification or access levels
- The retrieval logic doesn’t filter by the requesting user’s access entitlements
- The system prompt doesn’t explicitly instruct the model not to repeat retrieved content verbatim
A crafted query can retrieve and surface documents that were never intended to be user-accessible. In financial services or healthcare environments, where knowledge bases often contain sensitive operational content, that’s a material data exposure.
Third-party model integration risk
Bedrock’s model marketplace lets you use third-party models — Llama, Mistral, Cohere, and others — through the same API. The integration is convenient. The security consideration is that data sent to third-party models passes outside AWS’s native services, even if it stays within the Bedrock API surface.
The specific risk depends on what data you’re sending. Prompt content that includes conversation history, retrieved documents, or user-provided context may contain sensitive data that you haven’t assessed for transmission outside your control boundary. Most implementations don’t have a data classification layer between the application and the model invocation.
What an AI security review covers that a standard review doesn’t
An AI security review is additive — it doesn’t replace a standard cloud security review, it runs in addition to it. What it covers specifically:
- Attack surface mapping: what models are deployed, what inputs they accept, what tools or data they can access
- Prompt injection surface testing: can system prompt instructions be overridden? Can the model be induced to return restricted data or take unintended actions?
- IAM audit: are model roles scoped to the specific resources and operations each inference job requires?
- Knowledge base access review: is retrieval filtering in place? Can adversarial queries retrieve documents outside the user’s intended access scope?
- Third-party model data flow review: what data is transmitted to external models, under what conditions, and does that match your data classification policy?
- Inference endpoint exposure: are endpoints authenticated and properly scoped, or accessible beyond the intended application layer?
Running Bedrock in production?
We run AI security reviews specifically for AWS AI workloads — separate from, or combined with, a standard cloud governance assessment. If you have Bedrock or SageMaker in staging or production, the review is worth doing before the first incident surfaces the gaps.
Talk to us about a security review