The evolution of Ethereum continues to unfold with a clear vision: long-term sustainability, scalability, and simplicity. In the fifth installment of his series on the future of the Ethereum protocol, Vitalik Buterin introduces The Purge—a critical phase in Ethereum’s roadmap aimed at reducing bloat, simplifying the protocol, and ensuring long-term viability without sacrificing decentralization or persistence.
This article explores the core challenges The Purge seeks to address: historical data accumulation, state growth, and protocol complexity. We’ll dive into proposed solutions, ongoing research, trade-offs, and how these changes integrate with Ethereum’s broader upgrade path.
Understanding The Purge: Core Objectives
“The Purge is not about destruction—it's about refinement.”
As blockchain networks mature, they naturally accumulate technical debt. Ethereum is no exception. Two primary sources of this bloat are:
- Historical Data: Every transaction and block ever created must be stored by full nodes.
- Protocol Complexity: Features are easy to add but hard to remove.
The Purge aims to reverse these trends by introducing mechanisms that reduce storage demands and eliminate obsolete or redundant code—while preserving Ethereum’s core promise: permanence.
You should be able to store an NFT, send a message in a transaction, or lock funds in a smart contract, disappear for a decade, and return to find everything intact. For dApps to confidently decentralize fully—removing upgrade keys—they need assurance that underlying protocols won’t break their dependencies.
Reducing Historical Data Storage
The Problem
As of 2025, a fully synced Ethereum node requires approximately 1.1 TB of disk space for the execution client, plus hundreds of gigabytes more for consensus data. Most of this is historical—blocks, transactions, receipts—much of which dates back years. Even with stable gas limits, node size grows by hundreds of GB annually.
This creates barriers to entry for new validators and threatens decentralization.
The Solution: Distributed Historical Storage
A key insight is that consensus does not require every node to store all history. Thanks to cryptographic proofs (like Merkle proofs), as long as the network agrees on the latest block, any piece of historical data can be checked against it: a single participant supplies the data along with a proof, and anyone can verify that proof for themselves.
This enables a 1-of-N trust model for historical data—unlike the N/2-of-N model required for consensus.
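The 1-of-N property can be illustrated with a minimal Merkle proof sketch. This is not Ethereum's actual tree layout or hash function: SHA-256, the binary tree shape, the duplicate-last-node padding, and the names `merkle_root`, `merkle_proof`, and `verify` are all assumptions chosen for illustration.

```python
import hashlib


def h(data: bytes) -> bytes:
    # SHA-256 stands in for whatever hash the real structure uses (assumption).
    return hashlib.sha256(data).digest()


def _next_level(level):
    # Pad odd-sized levels by duplicating the last node, then hash pairs.
    if len(level) % 2 == 1:
        level = level + [level[-1]]
    return [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def merkle_root(leaves):
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        level = _next_level(level)
    return level[0]


def merkle_proof(leaves, index):
    """Collect the sibling hash at every level on the path from leaf to root."""
    level = [h(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level = level + [level[-1]]
        proof.append(level[index ^ 1])
        level = _next_level(level)
        index //= 2
    return proof


def verify(leaf, index, proof, root):
    """Recompute the root from one leaf and its proof; needs no other data."""
    node = h(leaf)
    for sibling in proof:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node == root


history = [b"block-%d" % i for i in range(5)]
root = merkle_root(history)          # the only value everyone must agree on
proof = merkle_proof(history, 3)     # supplied by any single honest peer
assert verify(b"block-3", 3, proof, root)
```

A verifier holding only `root` can accept `block-3` from an untrusted peer, which is why one honest holder of the data is enough.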
Practical Approaches
- EIP-4444: Proposes limiting historical data retention (e.g., one year) on full nodes.
- Blob Expiry: Already implemented—blobs expire after ~18 days.
- Consensus History: Beacon chain nodes now store only ~6 months of data.
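The "~18 days" figure for blobs can be reproduced from slot timing. The sketch below assumes 12-second slots, 32 slots per epoch, and a 4096-epoch retention window; the constant and function names are illustrative, and the values should be checked against the current consensus specs.

```python
SECONDS_PER_SLOT = 12          # assumed mainnet value
SLOTS_PER_EPOCH = 32           # assumed mainnet value
BLOB_RETENTION_EPOCHS = 4096   # assumed retention window for blob data


def retention_days(epochs: int) -> float:
    """Convert a retention window in epochs into days."""
    seconds = epochs * SLOTS_PER_EPOCH * SECONDS_PER_SLOT
    return seconds / 86_400


print(f"{retention_days(BLOB_RETENTION_EPOCHS):.1f} days")
```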
Long-term, the goal is a harmonized retention period during which every node stores everything, after which older data is held in a distributed way by a peer-to-peer network of Ethereum nodes.
Distribution Models
Two main approaches are being explored:
- Torrent-like Networks: Nodes store random subsets of history. With 100,000 nodes each storing a random 10%, every piece of data is replicated about 10,000 times, the same replication factor as a 10,000-node network in which every node stores everything.
- Portal Network: An Ethereum-native solution for distributed retrieval of SSZ-encoded objects.
Using erasure coding (already used for blob data) can further enhance robustness while minimizing bandwidth usage.
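The replication arithmetic behind the torrent-like model can be checked with a short sketch. It assumes each node picks its subset independently and uniformly at random, which is a simplification; the function names are illustrative.

```python
import math


def expected_replicas(num_nodes: int, fraction_stored: float) -> float:
    """Expected number of copies of any one piece of history."""
    return num_nodes * fraction_stored


def log10_prob_no_copy(num_nodes: int, fraction_stored: float) -> float:
    """log10 of the chance that no node at all holds a given piece.

    Computed in log space because the probability itself underflows to 0.0.
    """
    return num_nodes * math.log10(1.0 - fraction_stored)


# 100,000 nodes storing 10% each vs. 10,000 nodes storing everything.
print(expected_replicas(100_000, 0.10))
print(expected_replicas(10_000, 1.0))
print(log10_prob_no_copy(100_000, 0.10))
```

Under these assumptions the chance that a given piece has no copy anywhere is vanishingly small, so the practical risks lie elsewhere: correlated node behavior, churn, and retrieval latency.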
Remaining Challenges & Trade-offs
- Integration: Need to finalize and deploy either torrent integration or Portal Network for execution history.
- Backward Compatibility: EIP-4444 requires a new network protocol version; all clients must coordinate upgrades.
- Archival Integrity: How aggressively do we ensure old data remains accessible?
Two dimensions define our approach:
- Node Participation: Should we mandate historical storage (e.g., via proof-of-custody), or rely on voluntary standards?
- Protocol Depth: Should Portal Network be integrated into sync processes so archive nodes can bootstrap directly from it?
Managing State Data Growth
The Challenge
Even if historical data is pruned, state data—account balances, contract storage, code—grows by ~50 GB per year. Users pay once but impose perpetual costs on all nodes.
Unlike history, state cannot be easily "expired" because the EVM is built on the assumption that a state object, once created, remains accessible at any later time.
Potential Solutions
While statelessness could shift storage burden to specialized block builders, relying solely on it risks centralization. Hence, state expiry remains essential.
Two leading models are under consideration:
1. Partial State Expiry (e.g., EIP-7736)
- State is divided into chunks ("stems") containing headers, code, and storage slots.
- Data unused for 6 months becomes a 32-byte stub.
- Accessing expired data requires “reviving” it with a cryptographic proof.
- Compatible with Verkle trees and future stateless designs.
This balances efficiency and usability but still allows slow growth due to stubs.
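The expire-then-revive lifecycle described above can be sketched as a toy model. It is not the EIP-7736 design: SHA-256 over the group contents stands in for a Verkle commitment, the "proof" is simply the full group contents, and `EXPIRY_PERIODS`, `PartialExpiryState`, and the method names are hypothetical.

```python
import hashlib

EXPIRY_PERIODS = 2  # hypothetical inactivity threshold, in abstract periods


class ExpiredError(Exception):
    pass


def commit(values: dict) -> bytes:
    # 32-byte stand-in for a real commitment to the group's contents.
    return hashlib.sha256(repr(sorted(values.items())).encode()).digest()


class PartialExpiryState:
    def __init__(self):
        self.groups = {}       # stem -> {key: value} for live groups
        self.last_access = {}  # stem -> period of last read or write
        self.stubs = {}        # stem -> 32-byte stub for expired groups

    def write(self, stem, key, value, now):
        if stem in self.stubs:
            raise ExpiredError("revive the group before writing")
        self.groups.setdefault(stem, {})[key] = value
        self.last_access[stem] = now

    def read(self, stem, key, now):
        if stem in self.stubs:
            raise ExpiredError("revive the group before reading")
        self.last_access[stem] = now
        return self.groups[stem][key]

    def expire(self, now):
        stale = [s for s, t in self.last_access.items()
                 if now - t >= EXPIRY_PERIODS]
        for stem in stale:
            # Drop the data but keep a fixed-size stub, so state still grows slowly.
            self.stubs[stem] = commit(self.groups.pop(stem))
            del self.last_access[stem]

    def revive(self, stem, values, now):
        if commit(values) != self.stubs.get(stem):
            raise ValueError("supplied contents do not match the stub")
        del self.stubs[stem]
        self.groups[stem] = dict(values)
        self.last_access[stem] = now
```

The stub is what lets a node reject forged revivals without keeping the expired data itself.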
2. Address-Cycle-Based State Expiry
- Multiple state trees exist; new ones added per cycle (~1 year).
- Only the latest two trees are stored by full nodes.
- Old state can be read/written with Merkle proofs and gets copied back into active state.
To make this user-friendly, the scheme introduces address cycles.
An address with cycle N can only interact with state during or after cycle N.
This prevents conflicts when old state reappears—but requires expanding address size beyond 20 bytes.
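A toy model may help show how the multi-tree scheme and the address-cycle rule fit together. Everything here is a simplifying assumption: addresses are `(cycle, name)` tuples, a SHA-256 hash of the whole tree stands in for a state root, and the "proof" for resurrection is the entire old tree rather than a Merkle branch. A real design would also have to prove the object was not modified in intermediate trees.

```python
import hashlib


def tree_root(tree: dict) -> bytes:
    # Stand-in for a real state root (assumption).
    return hashlib.sha256(repr(sorted(tree.items())).encode()).digest()


class CycleState:
    def __init__(self):
        self.cycle = 0
        self.trees = {0: {}}  # only the latest two trees are kept locally
        self.old_roots = {}   # cycle -> root of a tree that was dropped

    def new_cycle(self):
        self.cycle += 1
        self.trees[self.cycle] = {}
        dropped = self.cycle - 2
        if dropped in self.trees:
            self.old_roots[dropped] = tree_root(self.trees.pop(dropped))

    def _check_address(self, addr):
        addr_cycle, _name = addr
        if addr_cycle > self.cycle:
            raise ValueError("address cycle N is usable only during or after N")

    def write(self, addr, value):
        self._check_address(addr)
        self.trees[self.cycle][addr] = value

    def read(self, addr):
        self._check_address(addr)
        for c in (self.cycle, self.cycle - 1):
            tree = self.trees.get(c)
            if tree is not None and addr in tree:
                return tree[addr]
        raise KeyError("not in the active trees; a proof against an old tree is needed")

    def resurrect(self, addr, old_cycle, witness_tree):
        """Copy old state back into the active tree after checking the witness."""
        if tree_root(witness_tree) != self.old_roots.get(old_cycle):
            raise ValueError("witness does not match the stored root")
        self.write(addr, witness_tree[addr])
```

Because an address tagged with cycle N cannot have existed in any tree before N, a node never needs older trees to rule out a conflicting earlier value for it.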
Address Space Expansion vs. Contraction
| Approach | Pros | Cons |
|---|---|---|
| Expansion (32-byte addresses) | Preserves collision resistance | Breaks backward compatibility with 20-byte assumptions |
| Contraction (e.g., reserve 0xffffffff prefix) | Simpler transition | Increases risk of hash collisions (~2⁵⁶ vs current ~2⁸⁰) |
Ironically, even without state expiry, Ethereum may eventually need to address collision risks due to advancing hardware (GPUs, ASICs).
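The collision figures in the table follow from the generic birthday bound: finding two inputs that hash to the same b-bit value takes on the order of 2^(b/2) attempts. The sketch below only restates that rule of thumb; the helper names are illustrative, and the 112-bit figure is an inference from the table's ~2⁵⁶ entry rather than a number stated in the source.

```python
def collision_work_bits(hash_bits: int) -> float:
    """log2 of the approximate number of hashes needed for a birthday collision."""
    return hash_bits / 2


def hash_bits_for_work(work_bits: float) -> float:
    """Inverse: effective hash width implied by a given collision cost."""
    return 2 * work_bits


print(collision_work_bits(160))   # today's 20-byte addresses
print(collision_work_bits(256))   # a hypothetical full 32-byte hash
print(hash_bits_for_work(56))     # width implied by a ~2^56 collision cost
```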
Final Decision Paths
- Go Stateless Only: Accept slow state growth; only specialized roles store full state.
- Partial Expiry: Accept low but non-zero permanent growth.
- Full Expiry + Address Expansion: Long-term secure but complex migration.
- Full Expiry + Address Contraction: Faster rollout but higher security trade-off.
State expiry also simplifies future upgrades—new state tree formats can be adopted incrementally without mass conversions.
Simplifying the Protocol: Functionality Cleanup
Why Simplicity Matters
Complexity undermines:
- Security (more bugs)
- Accessibility (harder for devs)
- Neutrality (favoring certain interests)
Default trajectory? More features → more complexity. To avoid this spiral, Ethereum must either:
- Freeze development (not ideal), or
- Actively remove outdated features.
We’re choosing the latter.
Key Cleanup Opportunities
Outside EVM
- RLP → SSZ Migration: Replace legacy RLP encoding with SSZ across all data types for consistency and better hashing.
- Remove Legacy Tx Types: Reduce transaction type sprawl; use account abstraction as fallback.
- Log/Bloom Filter Removal: Clients don’t use bloom filters; replace with off-chain tools using SNARKs.
- Eliminate Sync Committees: Future SNARK-based verification will make them obsolete.
- Unify Data Formats: Align execution, consensus, and blob formats using KZG + SSZ principles.
- Delete Beacon Chain Committees: Obsolete since sharding shifted to L2/blobs.
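- End Mixed Endianness: The EVM is big-endian while the consensus layer is little-endian. Standardize on one across layers, most likely big-endian because the EVM is harder to change.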
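The endianness mismatch in the last item is easy to see in bytes. This sketch encodes the same integer as a 32-byte big-endian word, as the EVM does, and as an 8-byte little-endian integer, as SSZ does for `uint64`; the variable names are illustrative.

```python
value = 1_000_000  # 0x0F4240

# EVM-style: 32-byte word, most significant byte first.
evm_word = value.to_bytes(32, "big")

# SSZ-style uint64: 8 bytes, least significant byte first.
ssz_uint64 = value.to_bytes(8, "little")

print(evm_word.hex())
print(ssz_uint64.hex())

# Decoding with the wrong byte order silently yields a different number.
misread = int.from_bytes(ssz_uint64, "big")
assert misread != value
```

Any code that crosses the execution and consensus boundary has to convert between the two, which is the kind of incidental complexity this cleanup targets.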
Inside EVM
- Simplify Gas Model: Revisit arbitrary storage/memory pricing; adopt unified formulas (see EIP-4762).
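- Remove Rarely Used Precompiles: Either delete them or replace them with equivalent (more expensive) EVM code. The identity precompile is a proposed first step; RIPEMD160, MODEXP, and BLAKE are later candidates.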
- Make Gas Unobservable: EOF already does this; make it mandatory to enable future gas upgrades.
- Eliminate Dynamic Jumps: Improve static analysis and compiler optimization; EOF enforces static jumps.
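To see why dynamic jumps hinder static analysis, consider a rough bytecode scanner. It treats a jump as statically resolvable only when the instruction immediately before it is a `PUSH`, which is a heuristic and not how real analyzers or EOF validation work; the function name is illustrative.

```python
JUMP, JUMPI = 0x56, 0x57
PUSH1, PUSH32 = 0x60, 0x7F


def classify_jumps(code: bytes):
    """Return (static, dynamic) lists of byte offsets of JUMP/JUMPI opcodes."""
    static, dynamic = [], []
    pc = 0
    prev_was_push = False
    while pc < len(code):
        op = code[pc]
        if op in (JUMP, JUMPI):
            # Destination is the top of the stack; a preceding PUSH fixes it.
            (static if prev_was_push else dynamic).append(pc)
        if PUSH1 <= op <= PUSH32:
            # Skip the immediate data so it is never mistaken for an opcode.
            pc += 1 + (op - PUSH1 + 1)
            prev_was_push = True
        else:
            pc += 1
            prev_was_push = False
    return static, dynamic


# PUSH1 0x04, JUMP, STOP, JUMPDEST, PUSH1 0x00, CALLDATALOAD, JUMP
sample = bytes.fromhex("6004560" + "05b6000" + "3556")
print(classify_jumps(sample))
```

The second jump takes its destination from calldata, so no analyzer can know the target without executing the code. Static-only jumps remove that whole class of uncertainty.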
The SELFDESTRUCT Blueprint
The deprecation of SELFDESTRUCT serves as a model:
- Caused DoS risks due to unbounded state changes.
- Rarely used in practice.
- Now restricted to contracts created in the same transaction (EIP-6780); full removal possible later.
This reduced client complexity significantly.
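The restricted behavior can be sketched as follows. This is a simplified model of the rule described above, not client code: the `Account` type, the `created_in_tx` set, and the function signature are assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Account:
    balance: int = 0
    code: bytes = b""
    storage: dict = field(default_factory=dict)


def selfdestruct(accounts: dict, created_in_tx: set, addr: str, beneficiary: str) -> None:
    """Restricted SELFDESTRUCT: always move the balance, delete only new contracts."""
    acct = accounts[addr]
    target = accounts.setdefault(beneficiary, Account())
    amount = acct.balance
    acct.balance = 0
    target.balance += amount
    if addr in created_in_tx:
        # Only a contract created earlier in this same transaction is removed,
        # so the number of storage slots cleared stays bounded.
        del accounts[addr]


accounts = {"old": Account(10, b"\x00", {1: 2}), "new": Account(5, b"\x00")}
selfdestruct(accounts, created_in_tx={"new"}, addr="old", beneficiary="bob")
selfdestruct(accounts, created_in_tx={"new"}, addr="new", beneficiary="bob")
print(sorted(accounts))
```

For a long-lived contract the opcode now behaves like a balance transfer, so clients no longer need special handling for unbounded storage deletion.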
A Standardized Deprecation Process
A four-step framework ensures safe removal:
- Discussion: Propose removing feature X.
- Impact Analysis: Assess breakage; choose path (abort, proceed, minimize harm).
- Deprecation EIP: Formal proposal; ecosystem tools stop supporting it.
- Removal: Final deletion after multi-year transition.
The pipeline from the first step to the last should span years, not months, balancing innovation with stability.
EOF and Its Role in Protocol Simplicity
EVM Object Format (EOF) introduces strict rules:
- No gas observation
- Static jumps only
- No code observability (no CODECOPY)
Benefits:
- Enables safer upgrades
- Encourages migration to more secure execution environments
But unless old EVM versions are eventually deprecated, EOF adds complexity rather than reducing it.
Thus, EOF must be part of a broader simplification strategy—not an end in itself.
Integration with Ethereum’s Broader Roadmap
The Purge synergizes with other upgrades:
- The Surge (scaling via rollups) → Less L1 congestion → Easier state management
- The Verge (Verkle trees) → Enables statelessness → Complements partial expiry
- The Splurge & Scourge → Refine economics and MEV handling → Support cleaner protocol logic
Moreover:
- Switching to single-slot finality allows PoS simplifications.
- Full account abstraction lets us replace legacy transaction logic with default smart contract code.
- Unified hashing (SSZ everywhere) streamlines verification.
Frequently Asked Questions (FAQ)
Q: Does The Purge compromise Ethereum’s immutability?
A: No. Data remains cryptographically verifiable via proofs. The change is where it's stored—not whether it exists.
Q: Will I lose access to my old funds or NFTs?
A: No. Recently used state stays directly accessible; anything that has expired can be revived by supplying a proof, even for long-dormant accounts.
Q: How will developers adapt to address cycle changes?
A: Tools will auto-generate correct addresses. Existing contracts may require wrappers—but core logic stays unchanged.
Q: Can’t we just scale storage instead of pruning?
A: Hardware improves slowly; blockchain growth outpaces it. Decentralization requires lightweight nodes.
Q: Is full state expiry inevitable?
A: Likely—but gradual. The community will choose based on security, usability, and migration feasibility.
Q: What happens if we don’t act?
A: Node requirements become prohibitive, leading to centralization—and increased vulnerability to attacks.
Final Thoughts: Toward a Sustainable Ethereum
The Purge represents Ethereum’s commitment to long-term health—not just growth, but refinement. By tackling historical bloat, curbing state expansion, and pruning obsolete code, Ethereum moves closer to a future where anyone can run a node on consumer hardware—and trust that the system endures.
This isn’t just engineering—it’s stewardship.