Terraform S3 Locking: Ditching DynamoDB for Simplicity?

State locking in Terraform's S3 backend is changing. The long-standing DynamoDB requirement for state locking is now optional, with S3 handling the job directly. This move promises cost and complexity reductions, but the question remains: can it hold up under real-world, high-demand scenarios?

[Figure: Terraform S3 backend locking flow without DynamoDB]

Key Takeaways

  • Terraform's S3 backend now supports state locking via `.tflock` files, eliminating the need for DynamoDB in many scenarios.
  • This simplification can significantly reduce AWS costs and operational complexity, particularly for small teams and low-concurrency environments.
  • For large teams, high CI/CD activity, or critical infrastructure, DynamoDB's superior consistency and concurrency management still make it the safer choice.
  • A key risk with S3-only locking is stuck `.tflock` files after Terraform crashes, requiring manual cleanup.

Consider this: AWS charges $0.28 per million DynamoDB requests. For a busy DevOps team with hundreds of Terraform runs daily, that cost can inflate faster than you might think. Now, picture that line item disappearing entirely.

That’s the core proposition behind Terraform’s recent S3 backend evolution: ditching DynamoDB for state locking, opting instead for a `.tflock` file within the S3 bucket itself. It’s a move that trims AWS dependencies, slashes operational overhead, and ostensibly simplifies the entire state management puzzle. But is this a genuine cost-saving upgrade, or a recipe for disaster in the wild, unforgiving landscape of production infrastructure management?

Terraform’s state file is, in essence, your infrastructure’s DNA. Every `terraform apply` meticulously reads and rewrites this record. Without strong locking, the potential for data corruption and operational chaos skyrockets. Imagine two engineers (or worse, two CI/CD pipelines) attempting to spin up or modify the same resources concurrently. The state file becomes a battleground, and your infrastructure, a casualty.

Terraform state locking is designed to prevent exactly this. The mechanics are deceptively simple: a process attempts to acquire the lock; if another lock is already active, the operation halts; if not, it takes the lock, proceeds, and releases the lock upon completion. Traditionally, this lock was a database record in DynamoDB. Now, it can be a `.tflock` file in S3.

The Allure of Simplicity: S3-Only Locking

Dropping DynamoDB from the Terraform S3 backend setup is a compelling proposition for many teams. The immediate benefits are clear: fewer AWS services to manage, reduced monthly bills, and a significantly less daunting initial configuration. For startups or teams with modest infrastructure footprints and a controlled deployment cadence, this is undeniably attractive. Onboarding new engineers becomes smoother, too — one less service to explain and configure.

Let’s be blunt: for small teams where infrastructure changes are infrequent and meticulously planned, or where CI/CD pipelines don’t race each other in parallel, S3-only locking is probably sufficient. It’s the lean, mean approach.

When Simplicity Becomes a Risk

The traditional DynamoDB approach isn’t just a default; it’s a hardened choice for a reason. DynamoDB offers strong consistency guarantees and superior handling of high concurrency. It’s engineered for scenarios where multiple operations are genuinely vying for the state file simultaneously. When you’re dealing with critical production systems, large engineering teams, or heavily automated CI/CD workflows that churn out deployments constantly, that extra layer of resilience matters.
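For reference, the traditional setup looks roughly like the sketch below. The bucket, key, region, and table names are hypothetical placeholders; the one fixed detail is that the S3 backend expects the lock table to have a string partition key named `LockID`.

```hcl
# Backend configuration that uses a DynamoDB table for state locking.
# All names below are illustrative placeholders, not values from this article.
terraform {
  backend "s3" {
    bucket         = "example-terraform-state"
    key            = "prod/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "example-terraform-locks"
  }
}

# The lock table itself, usually created in a separate bootstrap configuration
# so that it exists before the backend above is initialized.
resource "aws_dynamodb_table" "terraform_locks" {
  name         = "example-terraform-locks"
  billing_mode = "PAY_PER_REQUEST" # on-demand billing; no capacity planning
  hash_key     = "LockID"          # partition key name the S3 backend looks for

  attribute {
    name = "LockID"
    type = "S"
  }
}
```

This is the extra moving part that S3-only locking removes: one more resource to provision, secure, and pay for.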

My take? The original wisdom of using DynamoDB wasn’t about blind adherence. It was a pragmatic recognition that infrastructure as code, at scale, demands strong concurrency controls. If your team is large, if your pipelines are aggressive, or if the infrastructure itself is mission-critical, relying on S3-only locking is akin to using a garden hose to fight a refinery fire.

“Multiple pipelines running simultaneously can create conflicts. More engineers increase the risk of concurrent operations. Critical infrastructure requires stronger consistency guarantees.”

One of the thorniest issues with `.tflock` files on S3 is what happens when Terraform crashes mid-operation. The `.tflock` file might linger, an invisible barricade preventing any further Terraform runs. This isn’t a hypothetical; it’s a well-documented pain point. The fix? A manual `aws s3 rm s3://your-bucket/path/.tflock` command. This isn’t exactly an automated resilience feature.

Best Practices for the New World (and the Old)

Whether you’re embracing S3-only locking or sticking with DynamoDB, good practices are non-negotiable. S3 versioning is your safety net for state files, KMS encryption is a must for data at rest, and separating state by environment (dev, staging, prod) is foundational. IAM permissions should be ruthlessly pruned to the principle of least privilege. And, critically, monitoring access and changes to your state is essential.
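As a rough illustration of the versioning and encryption advice, the sketch below hardens a state bucket using AWS provider resources. The bucket name and resource labels are hypothetical, and a real setup would typically add a bucket policy and public access block on top.

```hcl
# State bucket; the name is a placeholder chosen for illustration.
resource "aws_s3_bucket" "state" {
  bucket = "example-terraform-state"
}

# Versioning keeps prior copies of the state file, so an overwritten or
# corrupted state can be rolled back to an earlier object version.
resource "aws_s3_bucket_versioning" "state" {
  bucket = aws_s3_bucket.state.id

  versioning_configuration {
    status = "Enabled"
  }
}

# Customer-managed KMS key used to encrypt state at rest.
resource "aws_kms_key" "state" {
  description         = "Encrypts Terraform state objects"
  enable_key_rotation = true
}

# Default server-side encryption for every object written to the bucket.
resource "aws_s3_bucket_server_side_encryption_configuration" "state" {
  bucket = aws_s3_bucket.state.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm     = "aws:kms"
      kms_master_key_id = aws_kms_key.state.arn
    }
  }
}
```

Environment separation can then be handled with distinct keys (for example `dev/`, `staging/`, and `prod/` prefixes) or with separate buckets, depending on how strictly access needs to be partitioned.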

Migrating from DynamoDB to S3-only locking is, configuration-wise, a cinch. It boils down to setting `use_lockfile = true` in your backend configuration and then running `terraform init -migrate-state`. The real test, however, isn’t the init; it’s rigorous testing of concurrent operations before you confidently deploy this to your production environment.
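A minimal sketch of what the migrated backend block might look like, assuming a Terraform release recent enough to support `use_lockfile` (check the S3 backend documentation for your version). The bucket, key, and region values are hypothetical placeholders.

```hcl
# Backend configuration using S3-native locking instead of DynamoDB.
# All names below are illustrative placeholders.
terraform {
  backend "s3" {
    bucket  = "example-terraform-state"
    key     = "prod/terraform.tfstate"
    region  = "us-east-1"
    encrypt = true

    # Lock via a .tflock object in the bucket.
    use_lockfile = true

    # The previous DynamoDB setting is assumed to have been removed:
    # dynamodb_table = "example-terraform-locks"
  }
}
```

After editing the block, re-initialize with the `terraform init -migrate-state` command mentioned above. Teams that want a gradual transition may want to check whether their Terraform version allows both `dynamodb_table` and `use_lockfile` to be set at once, so the new lock can be exercised before the table is retired.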

The Decision Matrix: A Pragmatic Cheat Sheet

Here’s the simplified logic:

  • Small Team? Low Concurrency? S3 locking likely suffices. Think startups, small dev shops.
  • Large Team? High CI/CD Activity? Critical Infrastructure? DynamoDB remains the safer, more robust choice.

Frequently Asked Questions

Will S3-only Terraform locking replace DynamoDB for everyone?

No. While it offers a simpler, cheaper alternative for smaller teams and less demanding workloads, DynamoDB’s advanced consistency and concurrency handling remain vital for large-scale, high-activity environments.

What are the biggest risks with S3-only Terraform state locking?

The primary risks involve concurrency conflicts leading to state corruption and the potential for `.tflock` files to get stuck if Terraform crashes, requiring manual intervention.

How do I migrate my Terraform state from DynamoDB to S3 locking?

Update your Terraform backend configuration to set `use_lockfile = true`. Then, run `terraform init -migrate-state` to migrate your existing state.

Written by Jordan Kim

Cloud and infrastructure correspondent. Covers Kubernetes, DevOps tooling, and platform engineering.

Originally reported by dev.to
