proof-of-work

a slightly critical summary of Bitcoin's proof-of-work algorithm written by a student who took one (1) blockchain class

  ·   10 min read

Some background and term definitions #

Bitcoin is a peer-to-peer e-cash system developed by Satoshi Nakamoto and is outlined in this paper. In order to understand what proof-of-work is, we need to define a few terms and ideas from the paper (and terms used in blockchains in general). Let's start from the smallest data structure to the largest data structure when we define things.

A miner is a person who uses their own hardware and software to complete mathematical equations to earn coin rewards. We can redefine this vague definition later once we establish some key concepts.

An unspent transaction output (UTXO) is some coin that can be spent, but has not been spent yet. A set of these represents the coins in a cryptosystem at some given time. Bitcoin's use of UTXOs helps avoid a coin being maliciously spent twice (see: double-spend attack).

A transaction dictates a transfer of some value of coin between two different Bitcoin wallets, like a direct mapping of some inputs to some outputs. A transaction's input is a reference to some output from a previous transaction. Let's say that we have two transactions, A and B. Each of these transactions has some inputs and some outputs. A's output is a UTXO that is spendable by B, meaning that A's output is B's input.

A block is a group of verified transactions. In order to create these blocks, all users (miners) put their proposed (before validation) transactions into a large memory pool/mempool, a storage area for proposed transactions made by some authority node. These proposed transactions sit in the mempool waiting to be validated. These nodes verify these transactions and broadcast them to miners, who take the transactions and try to include them in blocks. This is where proof-of-work becomes important.

A blockchain is a distributed and immutable ledger made of blocks. Ledgers are meant to be a canonical reference of all financial transactions, so distributed ledgers pose this question: how do we achieve consensus across ALL members of a crypto network?

Time to talk about Proof-of-Work (PoW) #

Distributed networks like the Bitcoin network cannot necessarily rely on the fact that every single miner is honest. Therefore, it is evident that there are a lot of intricacies in how difficult consensus algorithms are to create in these kinds of systems. There must be some burden on miners in the network to help ensure that people are incapable of cheating money out of others.

Proof-of-work (PoW) is a consensus algorithm that determines when and where new blocks are added to a blockchain. When you hear people talk about Bitcoin mining being "solving complex mathematical equations to get Bitcoin," this is what that refers to. In this system, the "burden" on miners is the expenditure of significant computational energy.

Miners group validated transactions into blocks. These miners expend this intense computational effort to solve PoW puzzles and produce "proof" for a block. This "proof" is used to help validate blocks and put them onto the blockchain. If these blocks are validated and added to the blockchain, miners are granted a coinbase reward for their efforts. At the time of writing this sentence, the reward is 3.1 BTC per block mined which is $89,407.73.

The actual calculations of PoW involve the header of the block being "mined."

Just kidding, time to talk about block headers #

The header of a block has a lot of different fields: version number, previous block hash, Merkle root, timestamp, difficulty target, and nonce.

The version number dictates the current version of Bitcoin we're working with. This is important because Bitcoin has frequent protocol changes. If one opposes the changes that a protocol change suggests, there may be a fork in the chain (just like the case of Bitcoin Cash).

The previous block hash is important to maintain the blockchain's tree structure.

A Merkle root is something I didn't know before taking my class on blockchain and secure distributed systems. The Merkle root is the root of a Merkle tree, which is a data structure to help prove that some item is in a set of data without giving away the whole dataset. Let's say we have a block that has 10 transactions, numbered 0 through 7 (8 elements total) for simplicity:

flowchart TD
    A[A] -->  B(B)
    A --> C(C)
    B --> D(D)
    B --> E(E)
    C --> F(F)
    C --> G(G)
    D --> H(H)
    D --> I(I)
    E --> J(J)
    E --> K(K)
    H --> 0(0)
    I --> 1(1)
    J --> 2(2)
    K --> 3(3)
    F --> L(L)
    F --> M(M)
    L --> 4(4)
    M --> 5(5)
    G --> N(N)
    G --> P(P)
    N --> 6(6)
    P --> 7(7)
linkStyle default stroke:white

Here's the Merkle tree we'd use. Our Merkle root is node A at the very top of the tree. For each of the elements 0-7, we take the hash of it to become the resulting nodes H-P, i.e. if we had the hash function sha256(), then node J = sha256(2). If we wanted to find an equation to find node D, then we'd have to concatenate the values of nodes H and I. Thus,

$$ \text{node D} = \text{sha256}(H + I) = \text{sha256} (\text{sha256(0)} + \text{sha256(1)})$$

We can continue this series of hashing and concatenating to eventually reach the Merkle root. The Merkle root is stored in the block header.

The timestamp is the approximate creation time of the block.

The difficulty target is a value that we use to measure if we have successfully completed the mining "puzzle."

The nonce is a garbage value used to help the proof-of-work algorithm.

The last two header fields become very important in proof-of-work.

Let's actually talk about PoW now #

Now that we have the background of what is in a block's header, we can talk more about the proof-of-work algorithm. The proof-of-work algorithm is a brute-force guess-and-check algorithm. The work that one must do is ensure that the hash of a block's header falls under a specific threshold -- the difficulty target.

To get the hash of a block's header, we must concatenate the six fields (IN THAT EXACT ORDER) from above. These six field values need to be in byte representation. If we have these fields in byte format, we can get the block hash by running the following:

def getBlockHash(version_num, prev_block_hash, merkle_root, timestamp, target, nonce):
    '''
    function to get the block hash (in bytes) from values (in bytes) in block header fields
    note: order matters, since swapping them completely changes hashed output
    '''
    concat_b = b"".join([version_num, prev_block_hash, merkle_root, timestamp, target, nonce])
    hash1 = hashlib.sha256(concat_b).digest()
    hash2 = hashlib.sha256(hash1).digest()
    return hash2

Note that we hash the concatenated bytes of the header fields twice. This is because of worries that SHA256 is vulnerable to a length extension attack.

Miners must prove that blockHash < target. If blockHash does NOT fall under this target, we must increment the block's nonce value and recalculate blockHash. This is the brute force algorithm. If we get to the point where we run out of possible values for the nonce (which is a 32-bit value, meaning 4,294,967,295 values), we can adjust the time or merkle root (by switching around the transactions that are in the block) instead and recalculate the hash.

Feels very inefficient, right?

...PoW is just guess-and-check? #

Blockchain mining technology was described as "complex mathematical problem solving" all the time when I was younger. I was floored by it just boiling down to a guess-and-check algorithm. How is this even allowed?

This inefficiency and giant energy consumption burden that PoW relies on is reflected in its total electricity usage. Bitcoin (that uses PoW) has an estimated annualised consumption of 213.33 TWh, which is about the same amount as Thailand (203 TWh) in 2022.

Obviously, PoW is meant to be very difficult and strenuous. It's meant to "delay" miners so that no one in the distributed network can act maliciously. Someone having criticism of PoW might be construed as them not understanding the purpose of the algorithm. As such, potential alternatives to PoW would need to be able to solve the same problem that PoW tries to solve.

One notable alternative to PoW is proof-of-stake (PoS). Instead of miners like PoW, validators are the entities who work to add blocks to a blockchain and are the ones chosen to validate the next block. In order have a higher chance of being chosen, validators stake their own cryptocurrency as a form of "collateral", since one's chance of being a validator scales with how much stake one puts in. This collateral is the burden to PoS validators as mining energy is the burden to PoW miners. Similarly, in PoW, where miners get an incentive in the form of a coinbase reward, the incentive for validators is collecting a network fee after validating blocks.

Compared to a PoW system, PoS seems to get rid of a lot of the electricity usage concerns since the burden is based on collateral instead of computational power:

"...Bitcoin's energy consumption exceeds the energy consumption of all PoS-based systems analyzed by at least three orders of magnitude" (Source).

PoS seems to have mostly solved the environment battle between these two different schemes, but is it as effective at preventing malicious actors as PoW? The answer is...not really, depending on who you ask it seems. The aforementioned validators in a PoS network are the ones solely responsible for the security of the distributed network, which could make it more vulnerable to attacks.

Additionally, one could argue that PoS devalues the whole point of a decentralized network since there is centralization regarding validators being the sole entities to validate blocks. Now, we're back to the problem of keeping the network truly decentralized.

Bitcoin is also argued to not even really have environmental friendliness in mind and instead focuses on decentralization and security. Those who emphasize the environmental aspects of Bitcoin are sometimes seen to have "missed the point."

It's a very difficult process to develop these distributed consensus algorithms, and there will always be people who try to emphasize the importance of some ABC things while others emphasize the importance of XYZ things. Many topics of debate involve the security, complexity, scalability, money distribution, and degree of decentralization, to name a few. As such, there will always be disagreement.

In a perfect world, we have purely distributed systems that utilize consensus techniques with securities on par with PoW while maintaining a low electric cost. We talked at length about how there are too many disagreements with the latter, so let's talk about the former and whether it can be actualized or not.

I personally think nothing in the realm of cryptocurrency or blockchain can truly be "decentralized" right now. The technology is too new and there is not enough time or resources for it to take off. Take, for example, the AWS incident on October 20th of this year. Due to a DNS resolution issue of the AWS DynamoDB API endpoint in us-east-1, there were many disruptions to AWS for almost 15 hours, taking down a wide range of different websites. These websites included a lot of blockchain trading and infrastructure providers, like Binance, Kucoin, and Coinbase. There was even a Cloudflare outage last Tuesday that took other crypto-related platforms out as well.

It is obviously not ideal for a decentralized currency and system to be reliant on a centralized one.