Scoping security when deploying generative AI on AWS

When we deploy generative AI workloads on AWS for customers, the first security question is rarely about controls or tooling.

It is almost always this:

What do we actually own in this system, and what do we not?

That distinction matters more in generative AI than in most other architectures. A public chatbot, a SaaS product with embedded AI features, a retrieval-augmented internal assistant, and a fine-tuned domain model can look similar on the surface. From a security and risk perspective, they are very different systems.

AWS’s Generative AI Security Scoping Matrix is useful because it forces this conversation early. We use it as a practical way to align engineering, security, and stakeholders before diving into threat modelling or implementation details.

This post explains how we apply that thinking when building and deploying generative AI systems on AWS.

Why scoping comes first

Generative AI security often goes wrong when teams skip straight to controls.

They debate prompt injection, data leakage, or IAM policies without agreeing on where the system boundary actually sits. The result is usually confusion, duplicated effort, or controls applied in the wrong place.

The scoping matrix helps by framing the problem around ownership. Once you are clear about how much of the stack you control, security responsibilities become easier to reason about.

The five deployment scopes we work with

Consumer applications

In this scope, you are using a public third-party generative AI application. You do not control the model, the training data, or how outputs are generated.

Security here is mostly about governance. Policies define what data can be used, and enforcement focuses on user behaviour rather than system design. If sensitive or regulated data is involved, this scope is usually inappropriate.

Enterprise applications

Here, generative AI is embedded inside a third-party enterprise product that your organisation has a formal relationship with.

Security effort shifts slightly compared to consumer tools, but it is still vendor-led. The focus is on contracts, data handling terms, auditability, and understanding how the vendor uses or stores your data. You still do not control the model itself.

Pre-trained foundation models

This is where most AWS-based generative AI systems begin.

You build your own application, but invoke an existing foundation model through an API. On AWS, this is often done using Amazon Bedrock.

You control the application logic, prompts, and surrounding services, but not the model weights. This is the first scope where traditional cloud security architecture becomes central.
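
As a concrete illustration, here is a minimal sketch of this scope using boto3 and the Bedrock Converse API. The model ID, region, and inference settings are placeholders rather than recommendations; the point is that the prompt construction and everything around the call belong to you, while the model weights do not.

```python
import boto3

# You own the application layer: the client, how the prompt is built,
# and what happens to the response. The model itself sits behind the API.
bedrock_runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

def ask_model(user_question: str) -> str:
    # Placeholder model ID; any Bedrock-hosted foundation model works the same way.
    response = bedrock_runtime.converse(
        modelId="anthropic.claude-3-haiku-20240307-v1:0",
        messages=[{"role": "user", "content": [{"text": user_question}]}],
        inferenceConfig={"maxTokens": 512, "temperature": 0.2},
    )
    return response["output"]["message"]["content"][0]["text"]
```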

Fine-tuned models

In this scope, an existing foundation model is fine-tuned using your own data to produce a more specialised version.

Security complexity increases here because your data now influences the model itself, not just the prompts. Decisions around data classification, retention, and deletion become much harder to reverse.
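
To make that concrete, here is a small sketch of the kind of housekeeping this scope demands, using boto3 against a hypothetical training-data bucket. Tagging and lifecycle rules keep classification and retention explicit for the dataset itself, but note the asymmetry: once a fine-tuning job has consumed the data, deleting the objects does not remove their influence on the model.

```python
import boto3

s3 = boto3.client("s3")
TRAINING_BUCKET = "example-finetuning-data"  # hypothetical bucket name

# Classify each training object explicitly before it is used for fine-tuning.
s3.put_object_tagging(
    Bucket=TRAINING_BUCKET,
    Key="datasets/support-tickets-2024.jsonl",
    Tagging={"TagSet": [
        {"Key": "classification", "Value": "internal"},
        {"Key": "data-owner", "Value": "support-engineering"},
    ]},
)

# Expire classified training data after a defined retention period.
s3.put_bucket_lifecycle_configuration(
    Bucket=TRAINING_BUCKET,
    LifecycleConfiguration={"Rules": [{
        "ID": "expire-training-data",
        "Filter": {"Tag": {"Key": "classification", "Value": "internal"}},
        "Status": "Enabled",
        "Expiration": {"Days": 365},
    }]},
)
```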

Self-trained models

This is full ownership. You train a model from scratch using data you own or acquire.

This scope brings the greatest flexibility and the greatest responsibility. Everything from training data governance to model lifecycle management sits with you. For most organisations, this is only justified for very specific use cases.

How data usage changes the security picture

One of the most important scoping decisions is how your data is used.

If data is embedded into training or fine-tuning, removing it later can be extremely difficult. In practice, fully removing data from a model usually means retraining without it, which is costly and disruptive.

For this reason, we often recommend retrieval-augmented generation where appropriate. Keeping data external to the model allows access control, auditing, and deletion to remain explicit and manageable.
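
A minimal sketch of that pattern follows, reusing the hypothetical ask_model call from earlier. The in-memory store and keyword matching are stand-ins for a real vector database or search index; what matters is that deletion stays an ordinary data operation on a system you control.

```python
# Stand-in for a document store you control (OpenSearch, a vector database,
# S3 plus metadata, and so on). Nothing here is baked into model weights.
document_store: dict[str, str] = {
    "policy-001": "Annual leave requests must be submitted two weeks in advance.",
    "policy-002": "Contractors are not covered by the internal expenses policy.",
}

def retrieve_documents(query: str, top_k: int = 2) -> list[str]:
    # Naive keyword match as a placeholder for real retrieval.
    words = query.lower().split()
    matches = [text for text in document_store.values()
               if any(word in text.lower() for word in words)]
    return matches[:top_k]

def delete_document(doc_id: str) -> None:
    # Deletion remains explicit and auditable: once a document is removed
    # from the store, it can never appear in a future prompt.
    document_store.pop(doc_id, None)

def answer_with_rag(user_question: str) -> str:
    context = "\n\n".join(retrieve_documents(user_question))
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {user_question}"
    )
    return ask_model(prompt)  # the Bedrock call sketched earlier
```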

Risk management depends on scope

Risk assessment looks very different depending on where you sit in the matrix.

For consumer and enterprise applications, risk management focuses on third-party providers. You assess their controls, incident response posture, and contractual obligations.

For systems built on foundation models, risk management becomes your responsibility. Threat modelling needs to account for issues like prompt injection, misuse, and unexpected behaviour. These are not entirely new problems, but they show up in new places.
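
As a rough illustration of where such checks live, here is a sketch of a pre-model input guard in the application layer, again using the hypothetical ask_model call from earlier. The patterns are purely illustrative, and pattern matching on its own is not a defence against prompt injection; in practice this layer would combine input and output checks, logging, and managed controls.

```python
import re

# Illustrative patterns only; real threat modelling should drive a much
# richer set of input and output checks than a short deny-list.
SUSPICIOUS_PATTERNS = [
    r"ignore (all|previous) instructions",
    r"reveal (the )?system prompt",
]

def looks_like_injection(user_input: str) -> bool:
    return any(re.search(p, user_input, re.IGNORECASE) for p in SUSPICIOUS_PATTERNS)

def guarded_ask(user_question: str) -> str:
    if looks_like_injection(user_question):
        # Fail closed and record the attempt for review instead of
        # forwarding the request to the model.
        return "This request cannot be processed."
    return ask_model(user_question)
```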

Why access control cannot live inside the model

Foundation models do not understand users, roles, or permissions.

Once data is included in a prompt, the model has access to all of it. There is no concept of row-level or document-level access control inside the model itself.

That is why access control must be enforced before data reaches the model. In practice, this means applying identity and authorisation at the application and retrieval layers, not relying on the model to behave securely.
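
A minimal sketch of that principle, with hypothetical document and user types: the authorisation decision is made in code you own, before any text is assembled into a prompt, so the model only ever sees content the caller was already entitled to read.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Document:
    doc_id: str
    text: str
    allowed_groups: frozenset[str]

@dataclass(frozen=True)
class User:
    user_id: str
    groups: frozenset[str]

def authorised(user: User, doc: Document) -> bool:
    # The access decision happens here, at the retrieval layer,
    # not inside the model.
    return bool(user.groups & doc.allowed_groups)

def build_context(user: User, retrieved: list[Document]) -> str:
    # Only documents the user is entitled to read ever reach the prompt.
    return "\n\n".join(doc.text for doc in retrieved if authorised(user, doc))
```

In a real deployment the same filter would usually be pushed down into the retrieval query itself, for example as metadata filters in the search or vector store, so unauthorised content is never even fetched.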

Using scoping to guide design decisions

We use the scoping matrix early in every generative AI engagement.

It helps teams agree on ownership, clarify which risks matter, and avoid designing controls for problems that do not exist in their chosen scope. It also prevents unrealistic expectations about what the model can secure on its own.

Once scope is clear, decisions about resilience, controls, and operations become much easier to make.

Final thoughts

Most security issues in generative AI systems are not caused by exotic attacks.

They come from unclear ownership, misplaced assumptions, and controls applied in the wrong layer. Scoping forces teams to confront those issues early.

When we deploy generative AI systems on AWS, getting the scope right is what sets everything else up to succeed.