How Load Network uses GCP BigQuery & Reth ExExes to power permanent onchain data

March 28, 2025

Load Network is a high-performance blockchain built to solve the EVM storage dilemma with Arweave and EigenLayer. It gives the coming generation of high-performance chains a place to settle and store onchain data, without worrying about cost, availability, or permanence.

Load Network offers scalable and cost-effective permanent storage by using Arweave as a decentralized hard drive, at both the node and smart contract layers, and EigenLayer to decentralize the stack, offering temporary storage and decentralizing the way data flows in and out of Load. This makes it possible to store large data sets and run web2-like applications without incurring EVM storage fees.

Load Network is built on top of Reth, the Rust-powered Ethereum client developed by Paradigm. It uses Arweave as a permanent hard drive and Google BigQuery as a cloud indexer, both of which are wired into the Load Network full node binary using a feature known as Execution Extensions (ExExes).

In this article, we’ll look at the architecture of Load Network, how it works with Execution Extensions, and how Google Cloud Platform powers the pipeline.

Inside Load Network’s architecture

Load Network is built as a high-performance layer-1 blockchain (L1) that combines EVM compute with the permanent storage guarantees of Arweave and high-throughput data availability (DA).

Data coming into Load Network (whether directly as smart contract interactions, or from an L2 as calldata) is first serialized with Borsh and compressed with Brotli to reduce its size. Then, the block data is posted to Arweave via the ArDrive Turbo Bundler and the resulting Arweave transaction ID is saved as a storage proof inside the JSON-serialized block object. The JSON is committed to Google BigQuery (GBQ) to make it easy for the network or any user building on top of Load Network to run queries on the indexed data.
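To make the shape of this pipeline concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not Load Network's implementation: the block fields are hypothetical, the Borsh encoding is hand-rolled for just the types used here, `zlib` stands in for Brotli (which is not in the Python standard library), and `fake_upload` is a placeholder that only produces an ID shaped like an Arweave transaction ID instead of calling a bundler.

```python
import base64
import hashlib
import json
import struct
import zlib


def borsh_u64(value: int) -> bytes:
    # Borsh encodes integers as fixed-width little-endian.
    return struct.pack("<Q", value)


def borsh_string(value: str) -> bytes:
    # Borsh strings: u32 little-endian byte length, then the UTF-8 bytes.
    raw = value.encode("utf-8")
    return struct.pack("<I", len(raw)) + raw


def borsh_vec_of_strings(items: list[str]) -> bytes:
    # Borsh dynamic arrays: u32 little-endian element count, then each element.
    return struct.pack("<I", len(items)) + b"".join(borsh_string(i) for i in items)


def encode_block(block: dict) -> bytes:
    # Hypothetical block layout; Borsh writes struct fields in declaration order.
    return (
        borsh_u64(block["number"])
        + borsh_string(block["hash"])
        + borsh_vec_of_strings(block["transactions"])
    )


def fake_upload(payload: bytes) -> str:
    # Placeholder for the bundler upload (assumption): derives a 43-character
    # base64url string from the payload, shaped like an Arweave transaction ID.
    digest = hashlib.sha256(payload).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def build_index_row(block: dict) -> str:
    encoded = encode_block(block)
    compressed = zlib.compress(encoded, 9)  # stand-in for Brotli
    arweave_id = fake_upload(compressed)
    # The storage proof travels with the JSON block object that gets indexed.
    return json.dumps({**block, "arweave_id": arweave_id}, sort_keys=True)


block = {"number": 42, "hash": "0xabc", "transactions": ["0x01", "0x02"]}
row = json.loads(build_index_row(block))
```

The resulting `row` is the kind of JSON object that would be committed to the index: the original block fields plus the ID that points back to the permanent copy.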

The result of this pipeline is a way to frictionlessly push data from the L1 or any connected L2 to Arweave’s permanent hard drive and build an index in GBQ by default, all managed inside the node’s binary via Reth ExExes.

How Load Network uses Google BigQuery

When it comes to pulling data from the blockchain for indexing purposes, most people turn to providers like Infura or Alchemy. However, these services can get pricey with demand.

If you use public RPCs instead, pulling large amounts of data quickly is often not possible for free. The overhead of making JSON-RPC calls over the wire adds to the issue: they’re slow, due to things like the TLS handshake, and you might be calling an API located on the other side of the world. Every TCP connection adds latency, making it difficult to scale efficiently.

Relying on centralized paid services or attempting to scale with public RPCs are not the only options. Public RPCs are often unsuitable for high-velocity indexing due to rate limits, slow response times, and infrastructure not designed for sustained demand, so emerging solutions like an ExEx-powered indexer with GBQ can be a more compelling choice from a cost and scalability perspective. The key advantage of ExExes is direct, native access to a Reth node, removing the latency overhead inherent in remote RPC connections. This translates into faster data retrieval and lower latency, which is critical for high-performance use cases like indexing or data pipelines.

From a cost perspective, running your own indexer stack at the ExEx layer becomes a better deal when you start hitting higher requests-per-second (RPS). Public RPCs often struggle under high load or impose rate limits that throttle performance, pushing users towards paid RPC services like Infura or Alchemy. With your own infrastructure, artificial rate limits disappear, and you eliminate recurring API costs entirely. While there is initial setup complexity, the long-term savings on API fees can far outweigh the operational costs, especially as your RPS increases. Additionally, an indexer ExEx removes the failure point of third-party RPCs, which means more stability and control over your node’s performance.
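The break-even reasoning can be written down as a small calculation. This is a sketch only: the per-request pricing model and the figures in the example are made-up placeholders chosen for round arithmetic, not quotes from any provider or from Load Network.

```python
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # 30-day month


def monthly_rpc_cost(requests_per_second: float, price_per_million: float) -> float:
    # Bill for a paid RPC service that charges per million requests.
    return requests_per_second * SECONDS_PER_MONTH / 1_000_000 * price_per_million


def break_even_rps(fixed_monthly_cost: float, price_per_million: float) -> float:
    # Sustained RPS at which the paid API bill equals the fixed cost of
    # running your own node and indexer.
    return fixed_monthly_cost * 1_000_000 / (price_per_million * SECONDS_PER_MONTH)


# Placeholder inputs: 1.00 per million requests, 2,592 per month of fixed cost.
threshold = break_even_rps(fixed_monthly_cost=2592.0, price_per_million=1.0)
bill_at_threshold = monthly_rpc_cost(threshold, price_per_million=1.0)
```

With these placeholder inputs the threshold works out to 1,000 RPS. Above that sustained rate the per-request bill keeps growing while the self-hosted cost stays flat, which is the effect the paragraph above describes.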

The Reth GBQ indexer operates alongside the Reth node by sharing memory for communication, triggered by the execution of new blocks or the reorg of old blocks. This means it runs in real time, with no need for external scheduling, making it a more efficient, more cost-effective, and more scalable solution for indexing data.
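As a language-neutral illustration of that trigger model, the Python sketch below mimics the three kinds of notification an ExEx reacts to: a chain being committed, reverted, or reorged. The class names mirror Reth's notification kinds, but everything here is a simplified model with hypothetical fields, not the Rust API and not the Load indexer itself.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    number: int
    hash: str


@dataclass
class ChainCommitted:
    new: list[Block]


@dataclass
class ChainReverted:
    old: list[Block]


@dataclass
class ChainReorged:
    old: list[Block]
    new: list[Block]


@dataclass
class Indexer:
    # In-memory stand-in for the indexed table, keyed by block hash.
    rows: dict[str, int] = field(default_factory=dict)

    def handle(self, notification) -> None:
        # First drop rows for blocks that left the canonical chain,
        # then add rows for blocks that joined it.
        for block in getattr(notification, "old", []):
            self.rows.pop(block.hash, None)
        for block in getattr(notification, "new", []):
            self.rows[block.hash] = block.number


indexer = Indexer()
indexer.handle(ChainCommitted(new=[Block(1, "0xa"), Block(2, "0xb")]))
indexer.handle(ChainReorged(old=[Block(2, "0xb")], new=[Block(2, "0xc")]))
```

After the reorg, the row for the replaced block `0xb` is gone and `0xc` takes its place at height 2. The indexer never polls: it only does work when the node hands it a notification.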

For generic use in any Reth node outside of the Load Network stack, Load also maintains an ExEx template that can be hooked up to any EVM-compatible network and pipes data to GBQ. Grab the ExEx here and plug it into a network of your choice.

GBQ & ExEx, the Load Network Approach

How do we integrate the Load ExEx with GBQ?

The indexing challenges for growing EVM networks and decentralized storage are often found in metadata bottlenecks. In any data system, it’s important to have identifiers that point to a specific chunk of data at a specific location. Load Network’s 1-second block time introduces a few considerations:

How can we index 1 block in 1 second, and all its transactions?
Can we make transactions indexable with human-readable properties, like tags?

The answers to these foundational questions are found in high-performance, distributed engines such as Google BigQuery. Because GBQ is optimized for data analytics, we can use it to provide users with robust APIs that offer fast, reliable access to Load Network data. We also don’t lose the trustless properties of the network, as any data can be cross-checked from multiple sources.
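To show what "indexable with human-readable tags" can look like in practice, here is a small sketch that uses an in-memory SQLite database as a local stand-in for BigQuery. The two-table schema, the tag names, and the helper functions are all hypothetical and chosen for illustration; the actual Load Network tables may be organized differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE transactions (tx_hash TEXT PRIMARY KEY, block_number INTEGER);
    CREATE TABLE tags (tx_hash TEXT, name TEXT, value TEXT);
    """
)


def index_transaction(tx_hash: str, block_number: int, tags: dict) -> None:
    # One row per transaction, plus one row per tag attached to it.
    conn.execute("INSERT INTO transactions VALUES (?, ?)", (tx_hash, block_number))
    conn.executemany(
        "INSERT INTO tags VALUES (?, ?, ?)",
        [(tx_hash, name, value) for name, value in tags.items()],
    )


def find_by_tag(name: str, value: str) -> list:
    # Look up transactions by a human-readable tag instead of by hash.
    cursor = conn.execute(
        """
        SELECT t.tx_hash, t.block_number
        FROM transactions AS t
        JOIN tags AS g ON g.tx_hash = t.tx_hash
        WHERE g.name = ? AND g.value = ?
        ORDER BY t.block_number
        """,
        (name, value),
    )
    return cursor.fetchall()


index_transaction("0x01", 100, {"Content-Type": "image/png", "App": "demo"})
index_transaction("0x02", 101, {"Content-Type": "text/plain"})
index_transaction("0x03", 102, {"Content-Type": "image/png"})
png_transactions = find_by_tag("Content-Type", "image/png")
```

Keeping tags in their own narrow table means a tag lookup is a single join, in line with the goal of keeping relations simple.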

Architecture and quirks of the indexer

It’s important to mention that our indexing needs to work in a way that makes sense for the user: we try not to index data that we know won’t be queried much, or that could be obtained by another ordinary method, so that we don’t risk higher egress or heavier processing on our servers. Keeping relations simple is key.

More on ExExes

A feature unique to Reth is Execution Extensions (ExExes). ExExes make it possible to extend the base Ethereum client by plugging in modules that run for each new block, without touching the core functionality.

Execution Extensions (or ExExes, for short) allow developers to build their own infrastructure that relies on Reth as a base for driving the chain (be it Ethereum or OP Stack) forward. An Execution Extension is a task that derives its state from changes in Reth’s state. Some examples of such state derivations are rollups, bridges, and indexers. They are called Execution Extensions because the main trigger for them is the execution of new blocks (or reorgs of old blocks) initiated by Reth. – reth Book

One of the key extensions Load Network uses in its Reth stack is an integration with Google BigQuery. This provides an out-of-the-box indexer for onchain data alongside the right tools to construct deep, useful queries.

The Load team has also built a directory of ExExes at exex.rs to make it easy to discover new useful tooling to plug into Reth nodes. Contribute your ExEx to the directory here.

Why Arweave?

Arweave is a dedicated permanent storage chain. Across all nodes, it stores over 300 petabytes of data. It’s been live since 2018, and has protocol-level guarantees for permanent storage of data for at least 200 years.

It ensures this with a mechanism known as the endowment. Any time someone wants to store data on the network, an extra fee gets locked into a reserve reward pool for miners. That reward pool is designed to never dip below the amount of tokens needed to operate the network for 200 years in the unlikely event that no other incentives from new data uploads are dispensed.

This is in contrast to other storage-focused chains like Filecoin, where users must maintain deals with storage providers and pay regularly by actively renewing these deals. Arweave, on the other hand, uses a ‘pay once, store forever’ model, ensuring permanence.

Arweave can also handle data at any scale. It routinely processes blocks multiple gigabytes in size, so it can stretch to meet the demands of every chain simultaneously without hitting capacity.

What Load Network unlocks

Many chains treat storage as an afterthought, optimizing for consensus and execution in the ‘here and now’. This introduces a longevity problem. The ethos of Ethereum is geared towards decentralization via low validator hardware requirements, which means it is unacceptable to force network participants to download and maintain hundreds of gigabytes of old state in order to validate the tip of the chain.

On the flip side, L2s like Arbitrum see much higher transaction throughput and are accruing historical data at a rapid pace, making state growth an even more pressing problem.

The question Load Network answers is: in a world where hardware requirements must be kept low and the rate of historical data being accrued onchain is increasing exponentially, where is all that data going to live?

Transparent permanent storage at any scale

A lot of L2 infrastructure is centralized by design, as a means to scale. A side effect is that key components are black boxes. According to L2Beat risk analyses, many critical vulnerabilities relate to data storage, transparency, and third-party verifiability. Failure in these systems can bring the chain down and wipe user funds.

We’ve already asserted that most chains are not optimized for storage, but one was purpose built for it: Arweave. By using Arweave storage at the root of the stack, Load Network exposes a new storage primitive to EVM chains and allows them to offload storage to a dedicated chain.

This need for reliable storage is particularly relevant for data availability (DA) layers that use Ethereum blobs. While solutions like EigenDA work well for many applications, they present the same challenges to those requiring long-term access to historical data and transparency of L2 data, because blob storage is purged after short windows of time. Load’s integration with EigenDA provides a way to extend the lifespan of critical onchain data by allowing developers to archive EigenDA blobs using the same Sidecar Server Proxy API interface. This ensures that essential data remains accessible and immutable beyond the short availability window. In this way, Load Network serves as an important layer for projects seeking to balance performance with reliable long-term data storage.

High-throughput, cost-effective, long-term DA

At current rates, rollups and EVM networks on rollup.wtf have a total data throughput of 200 KBps. This is 0.315% of Load’s data throughput in Alphanet V4, so Load Network could handle DA for every network and rollup listed and still have roughly 99.7% of its capacity remaining.
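The arithmetic behind that comparison can be checked directly. The sketch below uses only the two figures quoted in the paragraph above (200 KBps and 0.315%); the derived capacity is an implication of those figures, not a separately published benchmark.

```python
rollup_throughput_kbps = 200      # combined throughput quoted above
share_of_load_capacity = 0.00315  # 0.315%, as quoted above

# If 200 KBps is 0.315% of capacity, total capacity is 200 / 0.00315.
implied_load_capacity_kbps = rollup_throughput_kbps / share_of_load_capacity

# Whatever is not used by those rollups is left over.
remaining_share = 1 - share_of_load_capacity
```

Here `implied_load_capacity_kbps` comes out at roughly 63,500 KBps, and `remaining_share` is 0.99685, or about 99.7% of capacity.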

Compared to Ethereum calldata storage, which costs in the range of hundreds of dollars per megabyte, Load Network is at least 200 times cheaper. It is also over 10 times cheaper than temporary storage via blobs, which don’t actually address the long-term storage guarantees we’re so concerned about.

Without having to touch the core consensus or execution code of Reth, Load Network can ensure that any L1 or L2 transaction that passes through it is enshrined permanently on Arweave for a fraction of the price it would cost on another L2 like Base or Arbitrum, or even a dedicated temporary DA layer.

On the DA side, Load has been deployed by EigenDA for blob permanence and Dymension as a DA layer. Storing data on Load is 200 times cheaper than on Base, and Load has 700 times more throughput than Celestia and can handle 800 times the DA demand of all of today’s rollups combined, with all data permanent on Arweave, sidestepping reliance on centralized archive nodes.

Access to Arweave from Solidity smart contracts

Storage on EVM chains has typically been so expensive that there’s basically no hope of building a profitable and usable data-heavy consumer app onchain. But Load doesn’t just push all of its ledger history and data from L2s to Arweave; it also exposes Arweave as a storage primitive inside Solidity contracts.

With Arweave as a native data store inside Solidity contracts, developers can now build dApps fit for data-rich, real world purposes. This integration effectively bridges the gap between Ethereum’s smart contract capabilities and Arweave’s decentralized, permanent storage, allowing developers to store large amounts of data on a dedicated storage chain and interface with that data inside of the EVM environment.

Think fully onchain social media, governance discussion, storage access for autonomous AI, content monetization, document management, and anything that needs to compute with data bigger than the few bytes available today.

Building on Load Network

If you’re building an EVM-compatible L1 or L2 and want to add highly scalable permanent storage and DA, reach out to us on X, Telegram, or Discord. We’re currently running pilots with partners including Dymension, EigenDA, Metis, RSS3, RISE, Phala and GOAT Network.

Check out the Load Network ecosystem to see who’s already using Load and get involved yourself. Reach out if you want to build the future of onchain data.