Vitalik Buterin has frequently discussed a concept called “statelessness” in recent presentations, including at Korea Blockchain Week, in Singapore, and even during the Ethereum All Core Developers Consensus Call (ACDE). This concept is closely tied to the idea of “state” and comes with several related concepts, such as state expiry, history expiry (EIP-4444), Verkle trees, and even address space expansion/compression. None of this is entirely new: it aligns with the latest Ethereum roadmap that Vitalik outlined in November of the previous year, which includes two critical pathways known as The Verge and The Purge.
This article aims to provide insights into these two major pathways and explore some new challenges, helping us understand Vitalik’s vision for solving the state-related issues.
State in Ethereum
In Ethereum, “state” refers to a comprehensive ledger that includes all externally owned accounts (EOAs), their balances, deployed smart contracts, and related storage. This state is not static; it expands continuously with the addition of new users and the deployment of new smart contracts. Currently, full nodes must store this ever-growing dataset to correctly verify blocks and ensure accurate state transitions, making the verification process inherently stateful. This continuous growth in storage requirements raises hardware demands for running full nodes, contributing to the centralization of validators.
According to data from etherscan.io, running a fast-sync full node currently requires at least 1200 GB (using the Geth client as an example). This is after some state pruning, where older state data is deleted, leaving only the most recent state. For archival nodes that retain all historical states, including the state of each block, the required capacity is around 15,400 GB, and it continues to grow, a phenomenon often referred to as “state bloat.”
This is the central issue Vitalik emphasized during the Korea Blockchain Week: centralization of nodes is one of the biggest problems facing the Ethereum network and should be addressed by making node operation cheaper and easier.
In response to these challenges, the Ethereum community has been actively exploring improvement and optimization methods, including the solutions mentioned at the beginning.
Statelessness: The core idea of statelessness is to externalize state data, eliminating the need for each node to store the complete state. Nodes in this mode only maintain block headers and related transaction information. They verify and reconstruct the state using state proofs. Stateless mode significantly reduces the storage burden on nodes, improves network scalability, and allows more nodes to participate in validation while maintaining Ethereum’s decentralization.
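To make the idea of verifying against a state proof more concrete, here is a minimal sketch in Python. It is a simplification under stated assumptions: it uses a binary Merkle tree and SHA-256 rather than Ethereum's actual tree structure and Keccak-256, and the account strings and the function names build_tree, make_witness, and verify are hypothetical. The point is only that the verifier needs the root and a short witness, not the whole state.

```python
import hashlib


def h(data: bytes) -> bytes:
    """Hash helper. SHA-256 is an assumption; Ethereum itself uses Keccak-256."""
    return hashlib.sha256(data).digest()


def build_tree(leaves: list[bytes]) -> list[list[bytes]]:
    """Return every level of a binary Merkle tree, hashed leaves first, root last."""
    count = len(leaves)
    assert count > 0 and count & (count - 1) == 0, "leaf count must be a power of two"
    levels = [[h(leaf) for leaf in leaves]]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([h(prev[i] + prev[i + 1]) for i in range(0, len(prev), 2)])
    return levels


def make_witness(levels: list[list[bytes]], index: int) -> list[bytes]:
    """Collect the sibling hash at each level on the path from a leaf to the root."""
    witness = []
    for level in levels[:-1]:
        witness.append(level[index ^ 1])  # the sibling differs only in the lowest bit
        index //= 2
    return witness


def verify(root: bytes, leaf: bytes, index: int, witness: list[bytes]) -> bool:
    """Stateless check: recompute the root from one leaf plus its witness."""
    node = h(leaf)
    for sibling in witness:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node == root


# A full node holds the whole (toy) state and can produce witnesses.
accounts = [b"alice:10", b"bob:25", b"carol:7", b"dave:0"]
levels = build_tree(accounts)
state_root = levels[-1][0]
witness = make_witness(levels, 1)

# A stateless node holds only state_root and receives (leaf, index, witness).
print(verify(state_root, b"bob:25", 1, witness))  # True
print(verify(state_root, b"bob:99", 1, witness))  # False
```

In this toy setup the verifier stores 32 bytes (the root) and checks two sibling hashes, while the party producing the witness still needs the full data set.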
Verkle Trees: Ethereum currently relies on Merkle-Patricia trees to hash and compress state data. However, the Merkle proofs this structure produces may be too large to serve as the witnesses a stateless model needs. To address this, Ethereum plans to transition to Verkle trees, a more efficient data structure. Both Merkle-Patricia and Verkle trees share an important capability: generating witnesses, cryptographic proofs that let anyone confirm that a specific piece of data is included in the state committed to by the state root. Verkle trees excel at generating smaller proofs.
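A rough back-of-the-envelope estimate shows why witness size matters. The sketch below is illustrative only: the leaf count, the tree widths (16 for a hexary hash-based tree, 256 for a wide commitment-based tree), and the 200-byte opening proof are assumptions chosen for the example, and it ignores node encoding overhead, path compression, and proof aggregation across many keys. The function names are hypothetical.

```python
HASH_BYTES = 32  # assumed size of one hash or commitment, in bytes


def tree_depth(num_leaves: int, width: int) -> int:
    """Smallest depth d such that width**d >= num_leaves."""
    depth, capacity = 0, 1
    while capacity < num_leaves:
        capacity *= width
        depth += 1
    return depth


def merkle_witness_bytes(num_leaves: int, width: int) -> int:
    """Hash-based tree: each level contributes its (width - 1) sibling hashes."""
    return tree_depth(num_leaves, width) * (width - 1) * HASH_BYTES


def commitment_witness_bytes(num_leaves: int, width: int, opening_proof_bytes: int) -> int:
    """Commitment-based tree: one commitment per level plus one opening proof.

    opening_proof_bytes is a placeholder, not a measured value.
    """
    return tree_depth(num_leaves, width) * HASH_BYTES + opening_proof_bytes


leaves = 2**30  # illustrative leaf count, not a measurement of Ethereum's state
print(merkle_witness_bytes(leaves, 16))           # 3840
print(commitment_witness_bytes(leaves, 256, 200)) # 328
```

Under these assumptions a single-key witness shrinks by roughly an order of magnitude, because a wider tree is shallower and a commitment-based proof does not have to list every sibling at each level.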
History Expiry (EIP-4444): EIP-4444 aims to implement history data expiry, requiring nodes to stop serving historical blocks older than one year on the peer-to-peer network. Removing historical data significantly reduces disk space requirements for node operators and simplifies client software by eliminating the need to accommodate different versions of historical blocks. However, this raises concerns about how historical data will be preserved and recovered.
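The retention rule described above can be sketched as a simple cutoff check. This is a minimal illustration, not client code: the BlockHeader type, the should_serve and prune helpers, and the choice of a 365-day year are all assumptions made for the example.

```python
from dataclasses import dataclass

SECONDS_PER_YEAR = 365 * 24 * 60 * 60
HISTORY_RETENTION_SECONDS = SECONDS_PER_YEAR  # "a year", as described in the text


@dataclass(frozen=True)
class BlockHeader:
    number: int
    timestamp: int  # Unix time in seconds


def should_serve(header: BlockHeader, now: int) -> bool:
    """True if the block is still within the retention window."""
    return now - header.timestamp <= HISTORY_RETENTION_SECONDS


def prune(headers: list[BlockHeader], now: int) -> list[BlockHeader]:
    """Keep only the blocks a node would still serve to peers."""
    return [hdr for hdr in headers if should_serve(hdr, now)]


now = 2 * SECONDS_PER_YEAR
chain = [
    BlockHeader(number=1, timestamp=0),                     # two years old
    BlockHeader(number=2, timestamp=SECONDS_PER_YEAR),      # exactly one year old
    BlockHeader(number=3, timestamp=now - 12),              # recent
]
print([hdr.number for hdr in prune(chain, now)])  # [2, 3]
```

Anything that falls outside the window would have to be fetched from some other source, which is exactly where the preservation concerns come from.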
State Expiry: Stateless mode eliminates the need for validators to hold the complete state when verifying blocks. However, the state doesn’t disappear; its continuous growth remains a long-term challenge for the network. To address this fundamental issue, the community has proposed state expiry. State expiry automatically prunes inactive parts of the state, such as those unchanged for a year, moving them to a separate tree structure and removing them from the primary Ethereum protocol. It’s worth noting that state expiry only becomes feasible after the transition to Verkle trees. Also, if statelessness and Proposer-Builder Separation (PBS) are both implemented, state expiry might become a lower-priority task. Even in a stateless world, block builders still need access to the full state to create blocks, but builders are expected to be able to handle state growth effectively: because the builder role tolerates a degree of centralization, builders can run nodes powerful enough to meet the demand.
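As for the expiry mechanism itself, a toy model may help. The sketch below assumes a one-year threshold and plain dictionaries in place of tree structures; the ExpiringState class and its write, expire, and resurrect methods are hypothetical names. A real design would also require a proof to bring expired entries back, which is omitted here.

```python
from dataclasses import dataclass, field

EXPIRY_SECONDS = 365 * 24 * 60 * 60  # "unchanged for a year", as in the text


@dataclass
class ExpiringState:
    active: dict = field(default_factory=dict)   # key -> (value, last_changed)
    expired: dict = field(default_factory=dict)  # key -> value, outside the main state

    def write(self, key: str, value: int, now: int) -> None:
        """Change an active entry; expired entries must be resurrected first."""
        if key in self.expired:
            raise KeyError(f"{key} is expired; resurrect it before writing")
        self.active[key] = (value, now)

    def expire(self, now: int) -> list[str]:
        """Move entries unchanged for longer than the threshold to the expired store."""
        stale = [k for k, (_, changed) in self.active.items()
                 if now - changed > EXPIRY_SECONDS]
        for k in stale:
            value, _ = self.active.pop(k)
            self.expired[k] = value
        return stale

    def resurrect(self, key: str, now: int) -> None:
        """Bring an expired entry back (a real protocol would demand a proof here)."""
        value = self.expired.pop(key)
        self.active[key] = (value, now)


state = ExpiringState()
state.write("alice", 10, now=0)
state.write("bob", 5, now=EXPIRY_SECONDS)
print(state.expire(now=EXPIRY_SECONDS + 1))  # ['alice']
state.resurrect("alice", now=EXPIRY_SECONDS + 2)
print(sorted(state.active))  # ['alice', 'bob']
```

The active dictionary stands in for the state that nodes must keep, so its size is bounded by recent activity rather than by the chain's entire history.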
Although protocol-level PBS has not yet been included in Ethereum mainnet, the current market distribution of MEV-Boost, an out-of-protocol form of PBS, gives a rough picture of where mainnet is heading. The statistics from mevboost.pics are as follows:
In addition, the implementation of state expiry involves changes in Ethereum’s address format. Currently, there are two proposals: address space extension and address space compression. The former extends the address length to 32 bytes (currently 20 bytes) but requires complex logic for backward compatibility and necessitates updates to existing contracts. The latter retains the 20-byte format but uses the first 6 bytes for prefixes and address cycle identification. However, this reduced address length lacks collision resistance, introducing potential security concerns in address creation, a significant challenge facing the community.
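The collision-resistance concern can be quantified with the birthday bound: finding two inputs that map to the same n-bit value takes on the order of 2^(n/2) attempts. The sketch below applies that bound to the address lengths mentioned above. It assumes the non-reserved bytes behave like the output of an ideal hash function, and the function name collision_work_bits is hypothetical.

```python
def collision_work_bits(address_bytes: int, reserved_bytes: int = 0) -> int:
    """Approximate log2 of the work needed to find an address collision.

    Birthday bound: about 2**(n / 2) attempts for n freely hashed bits.
    Reserved bytes (prefix, period identifier) carry no hash entropy.
    """
    free_bits = 8 * (address_bytes - reserved_bytes)
    return free_bits // 2


print(collision_work_bits(20))                    # 80  (current 20-byte addresses)
print(collision_work_bits(20, reserved_bytes=6))  # 56  (compression: 14 bytes left)
print(collision_work_bits(32))                    # 128 (upper bound for 32 bytes)
```

Dropping from roughly 2^80 to roughly 2^56 attempts is what makes the compressed format a security concern, while the 32-byte figure is an upper bound that would shrink by however many bytes an extended format reserves.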
Currently, we can roughly prioritize these solutions based on their implementation challenges and urgency (items 2, 3, and 4 may be of equal priority):
1. Verkle Trees
2. History Expiry (EIP-4444)
3. Changes in Ethereum address format (compression/expansion)
4. State Expiry
In conclusion, these measures aim to lower the barrier to entry for running nodes, maintain decentralization, and mitigate potential state-related issues while optimizing network traffic load. However, there is still much work ahead to address these challenges fully.