ProDerivatives

Minimizing Data Storage Cost on the Ethereum Network

Update...

Scroll all the way to the bottom of this post to see how the storage cost has evolved over the past 12 months.

Introduction

One of the key benefits of the Ethereum network is decentralized, tamper-proof data storage. Attempts have been made to calculate the cost of this storage: Hess (2016) and Omaar (2017) estimate the cost of storing 1 GB of data at $76,000 and $4,672,500, respectively. The amounts differ because the gas price and especially the Ether price changed considerably between February 2016 and July 2017. [1] The difference would have been even larger had both calculations been based on the same amount of gas: Omaar estimated that the operation would require 625 billion gas, while Hess calculated 700 billion. [2] Had Omaar used the higher amount, the July 2017 storage cost would have been $5,236,645 per GB.
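The gas amounts behind the two estimates can be recovered from the published costs with the formula in footnote 2. The short sketch below does this using the prices from footnote 1; the function and variable names are illustrative and not taken from either source.

```python
GWEI_IN_ETHER = 1e-9  # 1 Gwei expressed in Ether


def implied_gas(cost_usd, gas_price_gwei, eth_price_usd):
    """Gas usage = cost / (gas price [Ether/gas] * Ether price [$/Ether])."""
    return cost_usd / (gas_price_gwei * GWEI_IN_ETHER * eth_price_usd)


def storage_cost_usd(gas, gas_price_gwei, eth_price_usd):
    """Dollar cost of a given amount of gas at the given prices."""
    return gas * gas_price_gwei * GWEI_IN_ETHER * eth_price_usd


# Hess, February 2016: $76,000 at 50 Gwei and $2.17 per ETH
hess_gas = implied_gas(76_000, 50, 2.17)
# Omaar, July 2017: $4,672,500 at 28 Gwei and $267 per ETH
omaar_gas = implied_gas(4_672_500, 28, 267)
# Omaar's prices applied to the gas amount implied by Hess
omaar_cost_at_hess_gas = storage_cost_usd(hess_gas, 28, 267)

print(f"Hess:  {hess_gas / 1e9:.1f} billion gas")
print(f"Omaar: {omaar_gas / 1e9:.1f} billion gas")
print(f"Omaar prices, Hess gas: ${omaar_cost_at_hess_gas:,.0f}")
```

The implied amounts come out at roughly 700 billion and 625 billion gas, and pricing the larger amount at July 2017 levels gives the $5,236,645 figure quoted above.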

Gas usage depends on the set of instructions executed by a transaction. The amount of gas consumed by each instruction is defined in the Ethereum protocol and does not change unless the protocol changes (see Wood). Therefore, gas usage does not depend on network or market conditions and identical transactions will use the same amount of gas regardless of when they are executed. A difference of 75 billion gas for effectively the same transaction is therefore difficult to explain.

In order to obtain the correct amount of gas it is necessary to map gas usage as a function of the number of bytes sent to and persisted on the Ethereum network.

Gas usage as a function of bytes sent for storage

In order to calculate the exact amount of gas required for sending and storing a given number of non-zero bytes, first obtain the amount of gas required to write the initial 32 bytes. This amount can vary depending on the specific contract implementation and data structure used. [3]

Between 32 and the required number of bytes:

  • Add 64 gas for every byte.

  • Add 134 gas after each block of 32 bytes.

  • Add 20,070 gas (see below) after each block of 32 bytes.

  • Add 64 gas from byte 256, skip every integer multiple of 256.

  • Add 64 gas from byte 65,792, skip for 256 bytes every integer multiple of 65,536.

  • After byte 544, add 1 gas according to the sequence shown in the appendix. One additional sequence is applied every 32 * 256 = 8,192 bytes.

A sample implementation of the gas usage function can be found here.

Only the 20,070 gas after each block of 32 bytes is directly attributable to storage. All other items apply even if no data is persisted on the blockchain.
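As a rough sketch (not the sample implementation referred to above), the first three list items can be written as a simple function. It assumes the representative 65,000 gas from footnote 3 for the initial 32 bytes, counts one block for every full 32 bytes beyond the first, and deliberately omits the two 64-gas corrections and the 1-gas sequences, so it will not reproduce the exact figures in this post. All names are illustrative.

```python
BLOCK_SIZE = 32                  # bytes per storage block
BASE_GAS = 65_000                # assumption: representative cost of the first 32 bytes
GAS_PER_BYTE = 64                # added for every byte beyond the first 32
GAS_PER_BLOCK_OVERHEAD = 134     # added after each further block of 32 bytes
GAS_PER_BLOCK_STORAGE = 20_070   # added after each further block; the storage part


def linear_gas(num_bytes, base_gas=BASE_GAS):
    """Approximate gas for num_bytes non-zero bytes (first three list items only)."""
    if num_bytes < BLOCK_SIZE:
        raise ValueError("this sketch starts at the initial 32 bytes")
    extra_bytes = num_bytes - BLOCK_SIZE
    extra_blocks = extra_bytes // BLOCK_SIZE
    per_block = GAS_PER_BLOCK_OVERHEAD + GAS_PER_BLOCK_STORAGE
    return base_gas + extra_bytes * GAS_PER_BYTE + extra_blocks * per_block


def storage_share(num_bytes, base_gas=BASE_GAS):
    """Fraction of the approximate gas that is directly attributable to storage."""
    extra_blocks = (num_bytes - BLOCK_SIZE) // BLOCK_SIZE
    return extra_blocks * GAS_PER_BLOCK_STORAGE / linear_gas(num_bytes, base_gas)


print(linear_gas(11_424))               # 7986712
print(round(storage_share(11_424), 3))  # 0.895
```

Under these assumptions, the storage item accounts for just under 90% of the gas at this size, with the remainder spent on sending the data.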

The compounding sequences (last item) make the gas usage function superlinear: gas usage increases at a growing rate. As a result, average gas usage (total gas used / number of bytes) first falls, because the initial transaction gas is spread over a larger number of bytes, and then rises again, because marginal gas usage (the gas used by the last byte added) keeps increasing. The number of bytes at which average gas usage is minimized is the cutoff point at which large transactions should be split up. In other words, once the number of bytes to be written is large enough, it is cheaper to send multiple smaller transactions than one big one.
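The shape of this argument can be illustrated with a toy cost model: a fixed amount per transaction, a linear term, and a small quadratic term standing in for the compounding sequences. The coefficients below are hypothetical and are not fitted to the gas usage function described in this post; the point is only that the average is minimized at a finite size and that splitting beyond it is cheaper.

```python
import math

# Hypothetical coefficients, chosen only for illustration
FIXED = 40_000.0    # gas per transaction regardless of size
LINEAR = 700.0      # gas per byte
QUADRATIC = 1e-6    # gas per byte squared (the superlinear part)


def total_gas(n):
    """Toy model: total gas for a transaction carrying n bytes."""
    return FIXED + LINEAR * n + QUADRATIC * n * n


def average_gas(n):
    """Average gas per byte for a transaction carrying n bytes."""
    return total_gas(n) / n


# average_gas(n) = FIXED/n + LINEAR + QUADRATIC*n is minimized where
# FIXED/n**2 == QUADRATIC, i.e. at n = sqrt(FIXED / QUADRATIC).
optimal_bytes = math.sqrt(FIXED / QUADRATIC)

one_big = total_gas(2 * optimal_bytes)      # one transaction of twice the optimum
two_small = 2 * total_gas(optimal_bytes)    # two transactions at the optimum

print(f"optimal size: {optimal_bytes:,.0f} bytes")
print(f"one big: {one_big:,.0f} gas, two small: {two_small:,.0f} gas")
```

In this toy model the saving from splitting at twice the optimal size equals exactly one fixed amount, which is why the cutoff is determined by the balance between the fixed and the superlinear terms.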

The theoretical optimal transaction size is approximately 150 KB [4] and requires a little over 105 million gas. A 1 GB transaction should be split into 6,587 transactions with a total usage of 695 billion gas. This amount is reasonably close to the 700 billion gas underlying the estimate provided by Hess. Since 695 billion gas is the minimal gas amount, 625 billion gas is too low.

However, 105 million gas far exceeds the block gas limit at the time of writing, which means that 695 billion gas is currently not achievable. With a block gas limit of 8 million, the optimal transaction size decreases to 11,424 bytes [5] and total gas usage for storing 1 GB increases to 700 billion gas. This amount is the currently achievable minimum.

Multiplying 700 billion gas by the current gas price and Ether price yields the dollar cost of sending and storing 1 GB.

With a gas price of 12 Gwei and an Ether price of $200 at the time of writing in early November 2018, storing 1 GB on the Ethereum blockchain costs $1.68 million.
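The arithmetic behind these totals can be checked with a few lines of code. The sketch below assumes that 1 GB means 10^9 bytes (which reproduces the transaction count given above), ignores the small remainder left after the last full transaction, and treats each transaction under the 8 million limit as using the full limit. The function name is illustrative.

```python
GIGABYTE = 10**9  # assumption: decimal gigabyte


def split_cost(total_bytes, tx_bytes, gas_per_tx, gas_price_gwei, eth_price_usd):
    """Return (number of transactions, total gas, dollar cost) for a split upload."""
    n_tx = total_bytes // tx_bytes          # remainder ignored in this sketch
    total_gas = n_tx * gas_per_tx
    ether = total_gas * gas_price_gwei / 10**9  # Gwei to Ether
    return n_tx, total_gas, ether * eth_price_usd


# Theoretical optimum from footnote 4: 151,808 bytes at 105,585,348 gas each
theoretical = split_cost(GIGABYTE, 151_808, 105_585_348, 12, 200)
# Achievable optimum: 11,424 bytes, assumed to use the full 8 million gas each
limited = split_cost(GIGABYTE, 11_424, 8_000_000, 12, 200)

for label, (n_tx, gas, usd) in (("theoretical", theoretical), ("8M limit", limited)):
    print(f"{label}: {n_tx:,} transactions, {gas / 1e9:.1f} billion gas, ${usd / 1e6:.2f} million")
```

With these inputs the limited case comes to about 700 billion gas and $1.68 million, in line with the figures above.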

Footnotes

  1. 50 Gwei per unit of gas and $2.17 per ETH in February 2016 vs 28 Gwei per unit of gas and $267 per ETH in July 2017

  2. 700 billion gas vs 625 billion. Gas Usage [units of gas] = Cost [$] / ( Gas Price [Ether / gas] * Network Token Price [$ / Ether]).

  3. 65,000 gas is a representative amount.

  4. Specifically, gas usage is minimized at 151,808 bytes and requires at least 105,585,348 gas. The contract implementation or data structure used should not make much of a difference at this size.

  5. Since storage is allocated in blocks of 32 bytes, it is efficient to size transactions in integer multiples of 32, e.g. 1024 bytes instead of 1000 bytes.

References

Omaar, Jamila. 2017. Forever isn’t free: The cost of storage on a blockchain database. https://medium.com/ipdb-blog/forever-isnt-free-the-cost-of-storage-on-a-blockchain-database-59003f63e01

Hess, Tjaden. 2016. Answer to: What is the cost to store 1KB, 10KB, 100KB worth of data into the ethereum blockchain? https://ethereum.stackexchange.com/questions/872/what-is-the-cost-to-store-1kb-10kb-100kb-worth-of-data-into-the-ethereum-block

Wood, Gavin. Ethereum: A secure decentralised generalised transaction ledger. http://gavwood.com/paper.pdf


Appendix

The sequence below shows the blocks of bytes that each trigger one additional unit of gas. The sequence starts at byte 544 and continues indefinitely, and a new sequence is added every 8,192 bytes. This causes the average gas cost per byte to rise as the amount of data stored per transaction increases.

Add 1 gas after (bytes) | Iterations
288 1
256 1
192 1
160 2
128 5
96 6
64 1
96 2
64 1
96 1
64 2
96 1
64 15
32 1
64 3
32 1
64 2
32 1
64 1
32 1
64 2
32 1
64 1
32 1
64 1
32 1
64 1
32 1
64 1
32 1
64 1
32 1
64 1
32 2
64 1
32 2
64 1
32 2
64 1
32 2
64 1
32 2
64 1
32 3
64 1
32 3
64 1
32 4
64 1
32 6
64 1
32 7
64 1
32 Forever
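One way to read the table is as a run-length encoding of the gaps between successive 1-gas increments. The sketch below expands the first six rows into byte positions. It assumes that each row means "add 1 gas after this many further bytes, repeated this many times" and that counting starts at byte 544; it covers a single sequence only and does not model the additional sequences added every 8,192 bytes.

```python
START_BYTE = 544

# First six rows of the table above: (gap in bytes, iterations)
RUNS = [(288, 1), (256, 1), (192, 1), (160, 2), (128, 5), (96, 6)]


def increment_positions(runs, start=START_BYTE):
    """Yield byte positions after `start` at which one more unit of gas is added."""
    position = start
    for gap, iterations in runs:
        for _ in range(iterations):
            position += gap
            yield position


positions = list(increment_positions(RUNS))
print(positions[:5])  # [832, 1088, 1280, 1440, 1600]
```

Because the gaps shrink from 288 bytes towards 32 bytes, the increments arrive more and more often, which is the source of the rising marginal gas usage.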
[Storage cost over the past 12 months: see this content in the original post.]