Hop On Board the Merkle Patricia Tr(ie)ain

One of my favourite passages from Neal Stephenson's historical fiction classic Cryptonomicon describes Alan Turing going on a bike ride near Princeton in the late 1930s with friends Lawrence Waterhouse and Rudy von Hacklheber.

On this bike ride Alan is trying to bring Lawrence up to speed on the ideas in Bertrand Russell's Principia Mathematica, hoping that Lawrence would be able to join Alan and Rudy in a discussion about the possibility of building a universal machine.

At one point in the conversation Alan admonishes Rudy who was growing impatient during Alan's explanation of concepts which Rudy was already very familiar with:

"Shut up about Leibniz for a moment, Rudy, because look here: You--Rudy--and I are on a train, as it were, sitting in the dining car, having a nice conversation, and that train is being pulled along at a terrific clip by certain locomotives named The Bertrand Russell and Riemann and Euler and others. And our friend Lawrence is running alongside the train, trying to keep up with us--it's not that we're smarter than he is, necessarily, but that he's a farmer who didn't get a ticket. And I, Rudy, am simply reaching out through the open window here, trying to pull him onto the fucking train with us so that the three of us can have a nice little chat about mathematics without having to listen to him panting and gasping for breath the whole way."

A few weeks ago I was evaluating a new network that is taking one of the freshest and most elegant approaches to solving scalability that I have ever seen.

In discussing the scalability properties of this new network with the founders I found myself, like Lawrence, huffing and puffing, unable to keep up with a discussion that I soon realised was being pulled along by a particular locomotive, to borrow Stephenson's analogy, called the Merkle Patricia Trie.

I needed to catch up so I jumped on the internet hoping to find a quick way to get up to speed on Merkle Patricia Tries.

The materials I found on the internet however were either too narrow or too high level for what I needed.

After finally getting a good grasp on Merkle Patricia Tries, I decided to write this post in the hopes of helping anyone else who might need a fast and thorough way of getting up to speed on this concept.

Table of Contents

In order to make this post detailed yet comprehensive, I started from an explanation of basic concepts like hashing and worked my way up to Merkle Patricia Tries and their use.

If you are already familiar with any of the below topics, feel free to skip to the sections that might be most relevant for you.

  1. Hashing
  2. Locking down Bitcoin with Hashes
  3. Hashing at Scale and in the Wild
  4. Merkle Tries
  5. Understanding State in Bitcoin vs. Ethereum
  6. Patricia Tries
  7. Merkle Patricia Tries
  8. Securing the Ethereum "Hard Drive" With Merkle Patricia Tries

1. Hashing

Hashing functions are mathematical formulae that convert any string of text into another, unique string of text.

For example, let's say that we want to hash this quote from Neal Stephenson's 1992 science fiction classic Snow Crash:

No matter how smart we get, there is always this deep irrational part that makes us potential hosts for self-replicating information.

Apparently Amazon are working on a TV adaptation of Snow Crash. No release date announced.

If we run this quote through a common hash function called sha256, the output is the following string of characters, which we call a "hash":

2abc2e3c2cf1d0a3588584dd9326da23fb1a8fde5246225b956bcdd21677cc37

One of the interesting properties of hash functions like sha256 is that every time we run that exact same quote through the hash function, we will get the exact same hash.

Why is this useful?

Well, say we now make a tiny change to the above quote, and include an extra space:

No matter how smart we get, there is always this deep irrational part that makes us potential hosts for self-replicating information.

When we run this modified quote through the same sha256 hash function, we arrive at this hash:

19e0bd53384a095cf540f97312148af395d1b0863a93dc236a2719bfd5004543

Notice that even though the change to the quote is tiny and barely noticeable, the resulting hash is completely different.

Not only is the hash of the quote completely different, it is for all practical purposes impossible to predict how the hash will change without running the new text through the hash function.

This makes hash functions particularly useful for comparing two sets of data and checking if they are identical.

Let's imagine that we wanted to check whether two entire copies of the Snow Crash novel were exactly identical. The paperback version of the novel is 448 pages long.

We could I suppose hire 100 people to comb through 4 – 5 pages of the novel each to try to detect any differences.

Thankfully, this is 2019 and the average laptop could complete this task with no errors in just a few seconds.

All we would need to do is hash both copies of the novel.

Once the hashing is completed, we would compare the two resulting hashes and if the hashes matched, we could then say with near certainty that the two copies of the novel are identical!

Random fact: Fictional interfaces featured in Snow Crash like the networked 3D "Metaverse" as well as the virtual "Earth" software inspired a slew of real life product successes, including the classic role playing game Quake as well as Google Earth. Snow Crash is also required reading for people joining the Xbox development team.

2. Locking Bitcoin Down with Hashes

Using hash functions to check for differences between two copies of a science fiction novel might not seem that useful, especially because a hash function can't point you to where or what those differences are.

In software however, hashing is a critical technique for verifying whether data or code that you have received is true and correct and has not been corrupted or surreptitiously replaced by a bad actor.

For example, when I download a copy of the Bitcoin ledger, I want to know that it is the same Bitcoin ledger that everyone else agrees is the "source of truth" or "real" Bitcoin ledger.

In other words, if my copy of the Bitcoin ledger shows that I own 1 BTC, I want to know that everyone else's copy of the Bitcoin ledger also shows that I own 1 BTC. Otherwise my 1 BTC isn't real.

One of the great innovations of Bitcoin is that it can synchronise thousands of copies of the Bitcoin ledger automatically without any one person or authority (like a bank or a government) being in charge of ensuring that the ledger remains synchronised, secure and correct.

Let's take a look at how hashing helps achieve this incredible feat.

As you might be aware if you're reading this post, to include a new block in the Bitcoin ledger every winning miner needs to include a hash of the header of the previous block in the header of the current block.

We won't be focussing on the mining process itself in this post but if you're interested in understanding how Bitcoin mining works, this video has one of the most concise explanations of the process that I have ever seen.

So in the diagram above the miner that created Block 5 needs to include a hash of the header of Block 4 in the header of Block 5.

Note that Block 4 itself contains a hash of the header of Block 3, which itself contains a hash of the header of Block 2 and so on, all the way back to the very first Bitcoin block.

This daisy chaining of blocks using hashes assures Bitcoin users that if anyone tampers with even a single character in any transaction in the entire history of Bitcoin, it will quickly become apparent.

It might be obvious to you why it would quickly become apparent but just to be sure let's step through it.

Let's say that we're new to the Bitcoin network and wanted to download a copy of the ledger.

To do this, we would reach out to a peer to download a copy of the ledger.

If we were just dealing with our bank we would probably just trust the data to be true and correct.

In Bitcoin though we have no idea who that peer is so our base assumption must be that the data we just downloaded is fake unless we prove it to be otherwise.

To verify whether the ledger is "real", we start by computing the hash of the header of every block for ourselves, starting from Block 1 all the way through to the current block.

Let's say that the latest block in Bitcoin is Block 5. If we calculated the block header hashes from Block 1 through to Block 5, we would arrive at the below hash:

000000009b7262315dbf071787ad3656097b892abffd1f95a1a022f896f533fc

After arriving at this hash, we would reach out to a few peers, and ask for the hash that they calculated for the header of Block 5.

PatriciaTries-compareToPeersDesktop

If the hashes match, then we know with a very high degree of certainty that we are all looking at the same ledger that everyone agrees is the source of truth.

PatriciaTries-corruptedBlocksDesktop

If anyone were to tamper with even a single transaction, the block headers we calculate would not match those reported by our peers and hence we would know that we cannot trust the data we received to be correct.

3. Hashing at Scale and in the Wild

Now that we have a handle on what hashing is and how it helps maintain a single source of truth in Bitcoin, let's look at some of the problems that arise when we try to use hashing to verify the Bitcoin ledger at scale and in the wild.


Let's start by breaking out the six fields included in every Bitcoin block:

Field Comment
1. Version Bitcoin software version number
2. Pervious Block Hash We covered this earlier in this post
3. Merkle Trie Root Secures the transactions in the current block
4. Timestamp Self explanatory
5. Difficulty Target Mining related, see this video if interested
6. Nonce Mining related, see this video if interested

You'll note that rather than securing each block by including a direct hash of transactions in the block header, Bitcoin secures the transaction with a Merkle Trie Root (Field 3 in the table above).

We will cover Merkle Tries in detail in the next section but first let's explore why Satoshi didn't just take a direct hash of each block's transactions and stuck them in the header.

The Glory IV is a fictional boat featured in Neal Stephenson's 1990s classic historical fiction / techno-thriller Cryptonomicon

Let's imagine a transaction between the nerdy but affable software engineer Randy Waterhouse and the no-nonsense marine surveyor Amy Shaftoe, two characters from the historical fiction classic Cryptonomicon (also by Neal Stephenson).

Let's say Randy asks Amy to dispatch the boat Glory IV to discreetly survey the bottom of the Subic Bay, off the west coast of Luzon in the Philippines.

Randy sends Amy 10 BTC as payment for this service.

He also sends Amy an encrypted email saying that he has sent the 10 BTC payment, that the hash of the transaction is 4C463, and that the transaction has been included in Block 5 of the Bitcoin ledger.

As much as Amy likes Randy, she wants to verify this payment before dispatching the Glory IV.

In this scenario, let's assume that to secure Bitcoin's transaction data Satoshi had chosen to just do a direct hash of a simple concatenation of all the transactions in a given block.

Let's also assume for this example that Block 5 contains a total of 8 transactions.

In this scenario, how might Amy verify the existence of Randy's transaction in Block 5?

First, she (or her Bitcoin software to be more precise) would download a copy of all the transactions in Block 5.

She would add all the transactions together and she whould then hash the result, as shown in the diagram below.

alt

Once she had calculated the direct hash of the all the transactions in Block 5 (906f2), she would then download the other five fields of the Block 5 header (i.e. the version number, timestamp, the hash of the Block 4 header, etc.), she would combine these header fields with the hash of the transactions (906f2), and then she would calculate the hash of all these header fields to arrive at the Block 5 header hash.

alt

Finally she would check that this hash of the Block 5 header that she had just calculated matched the hash of the Block 5 header that other peers were reporting.

If the hash of the header that Amy calculated for Block 5 matched what other peers were reporting, then she would have strong proof that Randy's transaction was indeed in Block 5.

Ok cool, that seems to work . . . so what's the problem?

Unfortunately in real life, Bitcoin blocks contain way more than just 8 transactions.

In fact, the typical Bitcoin block today includes more than 2,000 transactions!

Pepe frog worrying about having to hash all these tx's every time he wants to validate a transaction

If every time Amy's wallet tried to verify a transaction, it had to download thousands of transactions and then hash them all, her computer would likely have a stroke.

Fortunately, Satoshi foresaw this problem and came up with a clever solution: including a Merkle Trie Root in the Bitcoin block header.

4. Merkle Tries

Let's have another crack at verifying Randy's transaction for Amy, this time using Merkle Tries.

Recall that there were a total of 8 transactions in Block 5, including Randy's.

This time, instead of calculating a direct hash of the the list of transactions, let's use a Merkle Trie.

We'll walk through setting up the Merkle Trie step by step so we can see what a simple, useful and totally human brain-friendly concept it is.

First, let's go over some terminology that will help us follow along.

Nerd speak Semi-Accurate Plain English Definition
Trie Like a tree but upside down, with the root at the top and the leaves at the bottom
Node A point on the trie
Parent A node that connects to nodes that are further down the trie
Child A node that connects to nodes that are further up the trie
Root Top node with no parents and yet is generally the centre of attention
Leaf Nodes with parents but no children (and yet surprisingly humble and useful)
Binary Trie A trie where each parent has exactly 2 children

Now, we line up our transactions along the bottom of a binary tree.

PatriciaTries-binaryTree

Now we load our transactions into the tree. We do this by hashing each transaction and saving the result into each transaction's corresponding leaf node.

PatriciaTries-binaryStep1

Now we pair each leaf node up with its neighbour, hash each leaf node pair, and load the result into the parent node of each of the leaves.

PatriciaTries-binaryStep2

We now repeat this process all the way up to the root node.

PatriciaTries-merkleTreeComplete

And now we have a Merkle Trie, which is, as I've hopefully demonstrated, simply a binary tree where each node is the hash of its children.

The top node is called the Merkle Root, and this is what miners include in the header of every block they mine in order to secure the transactions in that block.

That wasn't so bad was it?

Umm . . . that's cool . . . but how does this help us verify Randy's transaction?

Recall that the last time we tried to verify Randy's transaction, we had to download 8 transactions including Randy's transaction.

With Merkle Tries, we don't need to download 8 transactions.

Instead, as shown in the diagram below, all we need to do is download 3 Merkle Trie nodes plus Randy's transaction and we can still verify that Randy's transaction was included in Block 5.

PatriciaTries-download8nodes

Let's walk through how we can verify Randy's transactionwith a Merkle Trie in case it's not clear from the chart.

First, we reach out to a full node peer and ask for:

  1. Randy's transaction (4c463)
  2. Leaf Node F (bdc40)
  3. Node GH (3038f)
  4. Node ABCD (4e443)

Then we perform the following hashes:

  1. We hash Randy's transaction into Node E
  2. Then we combine Node E with Node F and hash this into Node EF
  3. Then we combine Node EF and Node GH and hash this into Node EFGH
  4. And finally we combine Node EFGH with ABCD and hash this into the Merkle Root Node.

The resulting Merkle Root from this hashing exercise is 90da4.

After calculating this root, we compare it with the Merkle Roots reported by a few peers, and if they match, we've proven that Randy's transaction is indeed in Block 5.

Crucially, the Merkle Trie allowed us to avoid having to download nodes that do not touch the direct path from Randy's transaction to the root node.

Ok, so rather than downloading 8 transactions to verify Randy's transaction, we only need to download 3 Merkle Trie nodes. I guess that's kind of cool

In case you're still not that impressed by Merkle Tries, let's see what happens if we increase the number of transactions in the block from 8 to 16.

binaryTreeComplete-16nodes

As you can see, we increased the number of transactions by 8, for a total of 16 transactions.

And yet we only need to download 1 additional Merkle Trie node for a total of 4 Merkle Trie nodes.

Still not impressed?

Ok how about this. Let's increase the number of transactions in the block to 2,048.

That's an increase of 2,040 transactions and yet to verify one transaction in this sea of transactions we would only need to download an additional 8 Merkle Trie Nodes for a total of 11 Merkle Trie Nodes.

If, unlike me, you're still all over your high school maths, you might recognise the pattern here.

Number of Transactions Required Merkle Trie Nodes
8 log2(8) = 3
16 log2(16) = 4
32 log2(32) = 5
. . . . . .
2,048 log2(2,048) = 11

The table above goes to the heart of why Merkle Tries are so useful.

As the number of transactions in a block increases, the number of Merkle Trie Nodes required to verify a transaction also increases.

However, relative to the number of transactions in a block, the number of Merkle Trie Nodes required to verify a transaction grows much more slowly.

Specifically it increases logarithmically (e.g. 3, 4, 5 . . . 11) or at the same rate as the exponent over 2 that would equal a given number of transactions (e.g. 2 to the power of 11 equals 2,048).

And if that still doesn't impress you, then just stop reading now, go open an account at Coinbase and buy Ripple :D

5. Understanding State in Bitcoin vs Ethereum

So far we've covered what Merkle Tries are and how they are useful for verifying Bitcoin transactions even as the number of transactions in a block increases.

We're just getting started though and things are about to get a lot more interesting as we turn to Ethereum.

Before we jump into how Merkle Tries are used in Ethereum, let's lay some more groundwork, namely understanding the differences between how Bitcoin and Ethereum maintain state.

If you're not familiar with the term "state" think of it as kind of like a snapshot at a specific point in time of information that a computer is keeping track of.

In finance, the "balance sheet" is analogous to state in a computer, in that for a company the balance sheet represents the state of assets and liablities at a specific point in time, whereas the cash flows represent the "transactions" that modify that state between two periods.

One thing I didn't fully appreciate about Bitcoin until writing this post was that Bitcoin does not store state explicitly.

In other words, there is nowhere on a full node that you can point to and say "that's where the current state of Bitcoin is stored".

This is a direct result of the data structure that Satoshi chose for Bitcoin.

In Bitcoin, unlike say, your bank, there are no accounts, and there are no account balances.

But how does my wallet know how much BTC I have?

Instead of accounts Bitcoin uses UTXO, which stands for Unspent Transaction Outputs.

When I first heard the term UTXO it sounded very intimidating to me but the concept itself is very simple.

The way I think of UTXOs is simply like paper bills.

Using this analogy, your Bitcoin wallet is very much like a real wallet, stuffed with many of these UTXO "bills" which often come in a mess of different denominations, especially after having made several transactions that required change (e.g. like having change in your wallet of two quarters, two $1 dollar bills, one $5 bill and one $10 bill after paying for a $2.50 coffee with a $20 bill).

To spend UTXOs you need to create a transaction that takes one or more of your UTXOs as "inputs" (similar to what you would do in the real world if you pulled out one $5 bill, two $1 bills and one quarter to pay for a $7.25 sandwich).

You then define as "outputs" one or more recipients of the funds plus how much each will receive.

alt

Importantly, the value of the UTXOs used as inputs needs to exactly match the value of the new UTXOs created in the output of the transaction.

You then broadcast the transaction at which point all the UTXOs you included as inputs get marked as spent and can no longer be spent (they become TXOs).

The outputs of the transaction become new UTXOs which can be spent in the future.

alt

Circling back to how state is stored in Bitcoin, when your wallet shows you a "balance", what it is really doing is searching all past blocks for unspent bills for which you own the private keys and adding them up to arrive at your wallet balance.

Because everyone agrees on the information in every bitcoin block, every wallet arrives at the same answer for any particular set of public keys.

As a state model UTXO is very simple and is perfect for cash-like payment systems like Bitcoin.

In cash-like payment systems you don't really need to know anything about other participants or their wallet balances.

When you receive a payment all you care about is that the amount of cash that you receive, i.e. the number and denomination of bills that you receive, matches what you are expecting.

In this respect, UTXO is quite self-centered as a data model. All you really care about is your own state.

In Ethereum, things get a little more complex because you are performing much more complex interactions that require knowledge of other people's states.

For example, lets say you want to play a round of Kitty Race, a game where you pay a fee for entering your Crypto Kitty into a race against other kitties in a winner takes all race.

The winner is determined largely by chance, but having a Kitty with the right genetic makeup can give you an edge.

For this game to work, the Kitty Race contract needs to perform the following "lookups":

  • Looking up whether you own the Crypto Kitty that you want to enter
  • Querying the Crypto Kitty contract to determine the genetic make up of your kitty
  • Performing the same operations for up to 9 other racers

In other words, the Kitty Race contract, which is relatively simple, needs to query your state, the state of the Crypto Kitties contract, as well as the state of up to 9 other racers.

More complex applications, like the Maker Dao set of contracts that underpin the DAI stablecoin, require keeping frequent tabs on the state of many more contracts and accounts.

In theory this is possible to do with UTXO, e.g. you could in theory continually reconstruct the state of the Kitty Race contract, the Crypto Kitties contract, the 10 racers and anyone else you might need to interact with.

I struggle to picture how this would work, but I am aware of a few projects that apparently have managed to implement smart contracts on top of UTXO networks (I am intrigued by this idea so perhaps this is an area I should dig into more at some point).

For the purposes of this post however, suffice to say that the more obvious way of enabling people to query each others' states is for every participant to maintain a "shared state".

You can picture this shared state as very much like a shared hard drive, except that the hard drive does not reside in one place but rather in every machine that is running an Ethereum full node.

PatriciaTries-Page-12

In this shared hard drive, everyone has an identical copy of the drive.

Everyone has an account (or several accounts), which you can picture as a folder inside the drive that only the owner controls.

In that folder, you can store digitial money (ETH) as well as any arbitrary data, like text, images, media files, etc.(albeit in practice space is very limited so generally people only store strings of text).

This "shared drive" approach to structuring state is what people often refer to as Ethereum's "account model".

6. Patricia Tries

The "shared drive" model of storing data is all well and good in theory but for it to work in practice this shared drive needs to:

  • Be easy to query
  • Be easy to secure (i.e. no one can make unauthorised changes to the shared drive)
  • Be easy to synchronise amongst all participants (i.e. everyone is looking at the same state at all times)

This is no small feat and to achieve it Vitalik and his fellow early Ethereum developers turned to Patricia Tries.

A Patricia Trie is a data structure that organises data along prefix paths determined by the keys of a set of key value pairs.

Let's break this definition down so we can see what it actually means.

First, a key value pair is simply a piece of data (a value) that is assigned to a key so that it's easy to find later when needed.

A simple example of a set of key value pairs would be a list of unique names (keys) with a balance against each name (values). For instance:

{
    "Abel": 89,
    "Abigail": 3,
    "Abner": 55,
    "Abraham": 5,
    "Acacia": 34,
    "Ada": 21
}

To organise this data into a Patricia Trie data structure, we would group keys by their prefixes, i.e. we would group the keys based on characters that they share in common.

In this key value set, all keys start with A, so our first node in the Patricia Trie would be A.

PatriciaTries-Step1

Then we make a node for the prefix b, which 4 out of the 6 keys share as a second character.

PatriciaTries-step2

We repeat this process for all keys in the A-b path until the sub-path for each key shares no characters with other keys.

PatriciaTries-step3

At this point we stop and declare the node to be a leaf node. We then save the data in this leaf node.

Now we follow the same process for Acacia and Ada to complete the Patricia Trie.

PatriciaTries-step4

Just to make sure I've conveyed how this works, let's assume that we want to add the key value pair {"Adela" : 2}.

To do this, we first delete the node at the A-da path and replace it with a new node with the path A-d.

PatriciaTries-step5

Then we create two new leaf nodes, a, and ela, and store the respective values in those leaves.

PatriciaTries-step6

So, why store the key value pairs this way instead of just having a table with rows of key value pairs?

Well, a simple table might work for a couple of hundred rows, but as your data grows to thousands or millions of rows, the time it takes to find a speficic key tends to grow linearly with the size of the table.

With a Patricia Trie structure, a computer can look at a key, for example "Ada", and just follow the prefix path on the trie for that key, without wasting time searching down paths that don't lead to "Ada".

PatriciaTries-step7

As the database grows, the number of nodes required to store the data, and consequently the time required to find the data associated with any specific address, grows as well.

However, as we saw with Merkle Tries, the number of Patricia Trie nodes tends to grow much more slowly (specifically, logarithmically) relative to the growth in the number of key value pairs stored in this structure.

This property makes Patricia Tries a great choice for organising data whilst maintaining scalability of operations like searching for data, updating data and securing data, as we'll see in the last sections of this post.

7. Merkle Patricia Tries

Now that we've covered Hashing, Merkle Tries, UTXO vs Account-based state and Patricia Tries, understanding Merkle Patricia Tries is super simple.

Let's imagine that the key value set we were looking at above was the entire state of Ethereum right now.

In Ethereum, rather than using names as keys, we use hashes, so let's replace the keys in our previous data set with hashes.

{
    "0xa711355" : 89, //Abel
    "0xa77d337" : 3, //Abigail
    "0xa7f9365" : 55, //Abner
    "0xa77d397" : 5, //Abraham
    "0xa7733k" : 34, //Acacia
    "0xa7fe7az" : 21 //Ada
}

And now let's organise this state into a new Patricia Trie.

PatriciaTries-EthPatriciaTrie

Now let's hash each child into its parent node, same as we did with Merkle Tries before.

PatriciaTries-MPT

And we now have a Merkle Patricia Trie, with a root of 56515 at the top which we can use to secure the data in the entire trie.

8. Securing the Ethereum "hard drive" with Merkle Patricia Tries

Let's take a more detailed look at how Merkle Patricia Tries secure the Ethereum "hard drive".

Remember that what we mean by "securing state" is ensuring that every person running an Ethereum node agrees what the current state is, while making it as difficult as possible for any person or group of people to make "illegal" changes to the state (e.g. by sending themselves money from another account, creating new money out of thin air, breeding new kitties without paying siring fees etc.).

To keep everyone's copy of the Ethereum "hard drive" in sync, the Ethereum state changes one block at a time, with each block containing a list of transactions or "append" instructions that modify the state of the Ethereum hard drive.

PatriciaTries-stateTransition

At a high level, Ethereum blocks function in a similar way to Bitcoin.

Unlike Bitcoin however, the header in Ethereum blocks include not one but three Merkle Patrica Trie roots:

  1. A transactions root, which secures transactions within each block, very much like Merkle Trie Roots secure transactions in Bitcoin
  2. A receipts root, which we won't focus on in this post but you can find out more about here
  3. A state root, which is used to secure the global state stored by every Ethereum full node

Let's focus on the state root and walk through an example transaction to see how Ethereum uses it to secure data stored on the Ethereum hard drive.

We'll start by imagining that the current Ethereum block is Block 5 and that Block 5's state root is 56515, the root that we calculated in the previous section.

Let's say Ada (whose account key is 0xa7fe7az) wants to send 1 ETH To Acacia (whose account key is 0xa7733k).

First Ada sends her transaction to the Ethereum network.

PatriciaTries-broadcastTx

At some point, a miner picks up Ada's transaction, along with many others.

To keep the example simple, let's assume that Ada's transaction is the only one in the block.

The state of both Ada's and Acacia's accounts will change because of Ada's transaction, so the miner updates the state of these accounts locally on his machine.

Once the miner updates the state of his machine, he knows the state root will completely change, so he recalculates it.

Note however that not all the nodes in the Merkle Patricia Trie will change as a result of the transaction.

PatriciaTries-mptUpdate

So to recalculate the Merkle root, rather than recomputing the entire Merkle Patricia Trie, the miner re-uses old nodes on the trie that are not affected by the change, and recomputes only nodes affected by the change.

This ability to reuse nodes that were not affected by state changes is one of the key scaling benefits of using Merkle Patricia Tries for securing state

Once the miner calculates the new Merkle Patrice Trie state root, b45fd, he includes it in the header of his proposed Block 6.

If he happens to win the right to determine Block 6, he then sends the new block to the network.

Other full nodes will receive the new Block 6 but before updating their state and broadcasting the new Block 6 to other peers, they validate the block by, among other things, checking that they also arrive at b45fd for the state root of Block 6.

PatriciaTries-mptBlockUpdate

Here again the Merkle Patricia Trie structure saves time by allowing full nodes to reuse unaffected trie nodes when validating a new block

As more and more nodes validate the new Block 6, they update their states, and broadcast the block to other peers.

PatriciaTries-updatedPeerRoots

As they do this, the overall state of the Ethereum "hard drive" updates by one block while retaining a high degree of security and resistance to tampering.

Make no mistake, this process is very slow, certainly relative to modern computing standards.

However when Ethereum went live in July 2015, it was the first ever successful attempt at creating a "world computer" that maintained a secure state across an arbitrary number of peers without relying on one entity to coordinate consistency and security of state.

The Merkle Patricia Trie was among the critical innovations that helped realise this incredible feat.

Wrapping up

I certainly have been guilty in the past of getting caught up with the limitations of Bitcoin and Ethereum, particularly with respect to their speed compared to centralised services like AWS or Azure.

However after diving into Merkle Patricia Tries I gained a new appreciation for some of the innovations that allowed both Bitcoin and Ethereum to not only work but also to scale to a point where despite their limitations they are usable and used by tens of thousands of people, if not hundreds of thousands of people, while still maintaining very low reliance on any central coordinating party.

Whether you were seeking to learn about Merkle Patricia Tries for fun, curiosity or for work, I hope this post has helped you grasp the core of the concept.

If you feel others might benefit from reading this post, please feel free to link to it and share it (e.g. on Twitter) to make it easier for others to find.

Also if you have any questions or would like to leave feedback, please email me at alexm@skunk.capital. I would love to hear from you.

Show Comments

Get the latest posts delivered right to your inbox.