Most engineering teams get their workloads running on AWS. Very few are running a platform. The distinction is real, and it matters more than most teams realise until a compliance audit, a production incident, or a new hire forces it into view.
What “on AWS” usually looks like
The common pattern: EC2 or ECS for compute, RDS for the database, S3 for storage, maybe a load balancer and a CloudFront distribution. Deployed by someone who knew what they were doing at the time, using the console or a collection of scripts that grew organically. There may be some CloudFormation that nobody fully understands. Nobody’s entirely sure which resources are in which account, or why some of them exist.
It works. Until it doesn’t. And the moment it stops working — an incident, an audit, a new engineer who can’t get anything done — the absence of a platform becomes apparent quickly.
What a platform actually is
A platform is the layer that makes deploying, operating, and securing workloads consistent and repeatable. It’s not a product. It’s a set of decisions and tooling that means your team doesn’t have to individually solve the same infrastructure problems with every new service they ship.
The components that constitute a production platform:
- Everything that runs is defined in Terraform or CDK. No resources exist outside the IaC definition. Drift is detectable and correctable.
- Code changes go through a defined pipeline: test, build, deploy to staging, promote to production. No manual deployments to production.
- SCPs at the Organizations level prevent certain actions regardless of who attempts them. Config Rules detect and alert on drift. GuardDuty and Macie are enabled and monitored.
- No credentials in code, no credentials in environment variables. Secrets live in Secrets Manager or Parameter Store, with rotation policies configured.
- Structured logging, metrics, and distributed tracing. Not just CloudWatch Logs, but a searchable, alertable baseline that gives you signal when something breaks.
- Workloads are separated by environment and, where appropriate, by sensitivity. Not everything sits in a single account because that was the path of least resistance at launch.
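To make the structured logging point concrete, here is a minimal sketch using only Python's standard `logging` module. It emits one JSON object per log line, which is the shape most log search and alerting tools expect. The logger name, the `context` key, and the example fields (`order_id`, `service_version`) are illustrative assumptions, not a prescribed schema.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": datetime.fromtimestamp(
                record.created, tz=timezone.utc
            ).isoformat(),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Assumption: callers pass searchable fields via
        # extra={"context": {...}}; logging attaches them to the record.
        context = getattr(record, "context", None)
        if isinstance(context, dict):
            payload["context"] = context
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload, default=str)


def build_logger(name: str) -> logging.Logger:
    """Return a logger that writes JSON lines to stderr."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid stacking handlers on repeated calls
        handler = logging.StreamHandler()
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
    logger.propagate = False
    return logger


if __name__ == "__main__":
    log = build_logger("orders-service")  # hypothetical service name
    log.info(
        "payment declined",
        extra={"context": {"order_id": "A-1001", "service_version": "1.4.2"}},
    )
```

Including a version field in every record is one way to answer "what is deployed right now?" from the logs alone, without relying on anyone's memory.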
Where the absence of a platform becomes expensive
The gap between “on AWS” and “running a platform” doesn’t usually hurt immediately. It accumulates — and then surfaces at the worst possible time.
The compliance audit. Someone needs to evidence your security posture. The first discovery is that CloudTrail wasn’t enabled in all regions. Then that the Security Hub findings haven’t been reviewed in months. Then that there’s no audit trail for a configuration change that happened six weeks ago. The evidence package that would take two days to assemble in a well-instrumented environment takes six weeks of retrospective work to reconstruct.
The production incident. Something breaks at 2am. The on-call engineer can’t tell what version is deployed, what changed in the last 24 hours, or what to revert to, because there’s no IaC baseline, no deployment history, and no structured logs that make the failure obvious. Mean time to recovery (MTTR) grows with the environment’s opacity.
The new hire. A senior engineer joins and it takes three weeks to understand what’s actually running, because it’s undocumented and inconsistent. The documentation that exists describes what the environment looked like eighteen months ago, not what it looks like now. Every question requires interrupting someone who was there at the start.
The pattern is consistent: the cost of not having a platform isn’t avoided, it’s deferred. It accrues as engineer hours, incident impact, and remediation work. By the time it’s visible, it’s usually more expensive to fix than it would have been to build correctly.
The DevSecOps piece
Security doesn’t improve by adding a security review at the end of a project. It improves by making secure configuration the default — built into the platform rather than checked against it after the fact.
In practice, that means:
- IaC modules that enforce security group constraints by default, rather than relying on the developer to remember not to open port 22 to the internet
- Pipeline stages that run security scanning (SAST, dependency scanning, IaC linting) before code reaches staging
- S3 Block Public Access enforced at the account level, with SCPs that prevent anyone from switching it off, so the control doesn’t depend on individual developers’ attention
- Macie configured to detect sensitive data exposure before it becomes a breach notification
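As an illustration of the first two points, the sketch below shows the kind of check an IaC linting stage might run: it flags ingress rules that expose administrative ports to the whole internet. The `IngressRule` shape is a simplified, hypothetical stand-in for whatever your Terraform or CDK tooling produces, and including port 3389 (RDP) alongside port 22 is our assumption. In practice an established IaC scanner would do this job; the point is only to show what "secure by default" looks like as code.

```python
from dataclasses import dataclass
from ipaddress import ip_network


@dataclass(frozen=True)
class IngressRule:
    """Simplified, hypothetical view of a security group ingress rule."""
    from_port: int
    to_port: int
    cidr: str


# Assumption: ports that should never be reachable from the public internet.
BLOCKED_PUBLIC_PORTS = frozenset({22, 3389})


def is_world_open(cidr: str) -> bool:
    """True for 0.0.0.0/0 and ::/0, i.e. a prefix length of zero."""
    return ip_network(cidr, strict=False).prefixlen == 0


def find_violations(rules, blocked_ports=BLOCKED_PUBLIC_PORTS):
    """Return one message per blocked port exposed to the internet."""
    violations = []
    for rule in rules:
        if not is_world_open(rule.cidr):
            continue
        for port in sorted(blocked_ports):
            if rule.from_port <= port <= rule.to_port:
                violations.append(f"port {port} open to {rule.cidr}")
    return violations


if __name__ == "__main__":
    rules = [
        IngressRule(22, 22, "0.0.0.0/0"),    # flagged: SSH from anywhere
        IngressRule(443, 443, "0.0.0.0/0"),  # allowed: public HTTPS
        IngressRule(22, 22, "10.0.0.0/8"),   # allowed: internal range only
    ]
    problems = find_violations(rules)
    for problem in problems:
        print(problem)
    # A pipeline stage would fail the build when problems is non-empty.
    raise SystemExit(1 if problems else 0)
```

Running a check like this in the pipeline, rather than in a review meeting, is what moves the control from "remember not to" to "cannot merge".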
DevSecOps isn’t a team or a process. It’s the result of building security into the platform so that the secure path is also the easy path.
When to invest in a platform
The right time to build a platform is before you need it, but after you know roughly what you’re building. Too early, and the platform is speculative — you’re solving problems you don’t have yet. Too late, and you’re retrofitting — refactoring a production system that’s already carrying load.
The inflection points we see most often:
- First production outage where the root cause takes more than an hour to identify
- First compliance requirement: an Essential Eight assessment, a customer security questionnaire, or an APRA CPS 234 audit
- First time bringing on a second engineering team
- First time trying to move fast and discovering the environment won’t let you
If you’re past two of those, the platform build is already overdue. That’s recoverable, but the longer you wait, the more existing workloads have to be migrated into the new structure rather than starting clean.
Running AWS workloads without a platform foundation?
We scope platform builds as defined deliverables — IaC baseline, CI/CD pipeline, security guardrails, observability — with fixed pricing at each milestone. You can stop after any milestone with something complete in hand.