This piece is the first entry in a multi-part series explaining Plasma, by Daniel Goldman.
In case you haven't heard, scaling cryptocurrency is hard.
In August of 2017, Vitalik Buterin and Joseph Poon released the Plasma whitepaper, unleashing into the world a new, promising approach to increase crypto transaction throughput and deliver us from blockchain-congestion evil. Seemingly overnight, Plasma became the most hyped-up Layer 2 scaling framework in the Ethereum ecosystem, the hype bringing with it the dizzying chaos we've now come to expect from the crypto space—bold promises, ambitious research, and a plethora of variants/proposals/counter-proposals/optimizations so multitudinous it's basically become a running joke.
Which is all well and good (and exciting), but unfortunately, Plasma's fast-paced evolution, along with its technical complexity, has made it all but impossible for those not directly involved with the research and development to get a handle on things. Peer in from the outside, and you'll likely be left with more questions than concrete answers: what capabilities can we realistically expect Plasma chains to have? What obstacles do we still face before seeing Plasma effectively operating in the wild? What are its trade-offs? How does it work?
And first and foremost: just what the hell is it?
If those questions pique your interest, you've come to the right place! This series aims to provide an overview of Plasma technology — what it is, how it works, and what the current state of the technical research can tell us. Part one will cover:
- the theoretical underpinnings of Plasma as a Layer 2 technology
- the inner workings of the first concrete Plasma specification, 'Minimum Viable Plasma' (MVP)
- Plasma Cash, the variant specification that’s attracted much of the Plasma R&D since 'MVP'
Obligatory disclaimer: while no serious technical background should be necessary to follow along, some basic understanding of Ethereum and smart contracts will be assumed. Plan accordingly.
Background: Layer 2
The family of protocols we call “Plasma” represents a subset of Layer 2 solutions to the blockchain scalability problem, the “problem” here essentially being the limited transaction capacity that an open, unpermissioned blockchain can handle. Layer 2 seeks to circumvent this bottleneck by allowing for (some) transactions to be considered finalized without them ever having to touch the blockchain itself.
If you’d like, you can think of a Layer 2 transaction as a check whose account’s funds you can verify directly, without necessarily having to actually deposit it into your bank account. This check could then effectively be treated as paper currency and directly handed off to another party as payment, provided that the next party also gains the ability to verify the account’s funds for themselves (rough analogy, please no nitpicking yet).
The general pattern of Layer 2 systems is: initially, some capital is locked up on the blockchain’s base layer (we’ll assume Ethereum from here on). Next, some parties (and not necessarily the same parties who made the deposit) can then transact off-chain with this capital via an overlay system, while only interacting with the mainchain occasionally (if ever). At any given point, the proper owner of any capital has assurance in their ability to withdraw all funds they own back onto Layer 1.
The defining property that distinguishes Layer 2 (as we’re using the term) from other off-chain payment systems is that despite avoiding constant base-layer interaction, Layer 2 transactions still preserve all of the decentralized, trustless security guarantees that we expect from Layer 1. By securing your private keys and running the requisite software, you can guarantee custody of your own funds, regardless of the actions or inactions of any counterparties — “counterparties” here being other individuals, institutions, consensus mechanisms, or really anything else that's outside of your control, save for the mainchain itself. Even in a nightmarish, conspiratorial, Truman Show-esque scenario where all other users of the system are secretly colluding to try and steal your money, they'll fail.
What Makes Plasma Plasma
In the past few years of Layer 2 R&D, a taxonomy has slowly but surely emerged that lets us neatly partition Layer 2 mechanisms into one of two categories: “Plasma” or “channels” (as in “state channels” or “payment channels”). Not everyone uses these terms precisely this way, and ultimately neither crypto nor language itself has a high council to officially settle these definitions for us. Still, we’ll hereby assume definitions of “Plasma” and “channels” broad enough that the two together encompass all possible Layer 2 systems.
One way to delineate these two categories is by the minimum on-chain transactions they require: for a channel transaction to be considered finalized, no interaction with the mainchain is strictly necessary; for a Plasma transaction, one interaction with the mainchain is strictly necessary (broadcast by the Plasma operator, not the users, as we’ll see). The reason Plasma still qualifies as a Layer 2 scaling approach (despite requiring regular on-chain transactions) is that each Layer 1 transaction can effectively finalize many transactions in one fell swoop; you can imagine that a bundle of Layer 2 transactions is compressed down into one. Still, needing no on-chain transaction at all is, by itself, a point in favor of channels; no on-chain block confirmations required means (virtually) instant finality, and less on-chain interaction is generally a good thing.
On the flip side, a channel requires full consent of all of its participants for any channel-wide state update, which means that having a single channel with many parties gets highly impractical. Transacting with parties with whom you don't share a channel requires "relaying" transactions through your channel-partners, limiting your financial activity to those with whom you can find these liquidity paths through the channel network's graph. In Plasma, however, only a transaction’s sender needs to give consent, and no liquidity lock-ups / restrictions are required for all parties involved to enter, exit, and freely transact with each other (the question of “who needs to give consent for state updates,” is, it turns out, an equivalent way to delineate the “channels” vs. “Plasma” dichotomy).
So at face value, one could say that channels are the appropriate mechanism for applications that benefit from instant finality and where a small, relatively fixed set of participants can be expected to interact, whereas Plasma is most useful in cases where many parties are involved and high transaction throughput is paramount, with immediate finality being less important.
As we'll see in later installments, there are constructions which utilize both channel and Plasma mechanisms and try to capture as much of the best of both worlds as possible, so we may ultimately not need to compromise as much as it may seem. But let's not get ahead of ourselves — before we start getting too fancy, we first need to grasp how Plasma actually works.
Minimum Viable Plasma
While the original whitepaper introduced the general notions of Plasma, it was also broad and wildly ambitious (and long, frankly); some of the ideas it floats (a tree of nested Plasma chains, for example) are still outside the scope of any current Plasma research and may ultimately not even be possible.
Thus, the first big step towards getting actual working code — and, arguably, the starting point for the way we currently think about Plasma — was a spec known as Minimum Viable Plasma (MVP). As the name suggests, the goal here is to filter out all fancy features and distill things down to the simplest possible working implementation. "Working" here means it must simply satisfy the fundamental requirements of a Layer 2 Plasma system as defined above, and with only minimal functionality — namely, A to B payments of some fungible asset (we'll assume Ether from here on, but it works just the same with any ERC20-compliant token).
For the time being (and only for the time being!), we’ll ignore any other downsides that emerge, even if said downsides deeply suck. And indeed, MVP does qualify as a functional Plasma solution! Although the downsides... well, you decide.
As we saw earlier, the key property of Plasma is that many transactions are compressed down and finalized with only one transaction landing on the mainchain. In MVP, the "compression" is done via a Merkle tree; transactions are grouped together and Merklized down into a root, which is all that needs to be put on Layer 1. The transactions themselves follow a Bitcoin-esque UTXO model; i.e., they spend from inputs of which the sender proves ownership, and create new outputs encumbered with the public address of their new owners. Which is to say: a Plasma chain is itself a blockchain! Using the requisite off-chain Plasma block data and the on-chain Merkle root data, users can verify ownership of what’s rightfully their own, relying on the smart contract on the Ethereum chain to enforce the rules and settle all disputes.
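To make the "compression" step a bit more concrete, here is a minimal sketch of Merklizing a batch of transactions down to a single root. It rests on a few assumptions that are not part of any MVP spec: SHA-256 stands in for Ethereum's Keccak-256 (which isn't in Python's standard library), transactions are treated as opaque byte strings, and an odd-sized level is padded by duplicating its last node. The names `h` and `merkle_root` are purely illustrative.

```python
import hashlib


def h(data: bytes) -> bytes:
    """Hash helper. Assumption: SHA-256 as a stand-in for Keccak-256."""
    return hashlib.sha256(data).digest()


def merkle_root(leaves: list[bytes]) -> bytes:
    """Reduce a non-empty list of encoded transactions to one 32-byte root."""
    if not leaves:
        raise ValueError("cannot Merklize an empty block")
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            # Assumption: pad odd levels by repeating the last node.
            level.append(level[-1])
        # Hash each adjacent pair to form the next level up.
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


# However many transactions go in, only 32 bytes need to land on Layer 1.
block = [f"tx-{n}".encode() for n in range(1000)]
root = merkle_root(block)
print(len(block), "transactions ->", len(root), "byte root")
```

Changing any single transaction changes the root, which is what lets the on-chain contract act as the arbiter of what each Plasma block contained without ever storing the block itself.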
We'll refer to the entity responsible for the procedure described above (i.e., Merklizing the transactions, broadcasting the root, and sharing the data with users) as the Plasma Operator; it is essentially the Plasma "block producer." It's worth noting that the Plasma mechanism itself is completely agnostic as to what form this Operator takes; it could be a single, "centralized" entity, a federated sidechain, a proof-of-stake-based block attestation system, etc. The fundamental goal of the Plasma construction is for all fund management to be non-custodial, and if we can get Plasma to work, this factor should remain the same regardless of who’s producing blocks for us. The more “decentralized” mechanisms might well offer other benefits of the sort we associate with distributed, peer-to-peer systems (censorship resistance, fault tolerance, etc.), but the non-custodial-ness remains the same. Thus, for simplicity’s sake, we'll assume the Operator is simply a single entity, letting us reason more explicitly about the Plasma mechanism itself.
With that, we can start to go through the life-cycle of a typical Plasma transaction, and examine how things are handled in different possible scenarios.
First, Alice deposits Ether into our Plasma chain by sending an on-chain Ether transaction to the contract, which the Operator includes in a Plasma block; this Ether initially belongs to Alice (obviously) in the form of a UTXO. Alice, as usual, wants to pay Bob (note that Bob himself does not necessarily need to have made any on-chain deposits yet, or ever). To do this, she creates a transaction that spends her UTXO and creates a new one for Bob, and sends this transaction to the Operator. The Operator takes this transaction along with a bundle of other (possibly unrelated) transactions, groups them together into one Plasma block, "Merklizes" them down to their Merkle root, and sends this root, and only this root, onto the mainchain.
The Operator then sends this Plasma block to all users (including Alice and Bob). Upon receiving the latest block, Alice and Bob validate it on their end; this validation entails ensuring that the transactions themselves are valid and that the block corresponds to the on-chain Merkle root. If all of this checks out, Alice, Bob, and all of the other users can then go happily on with their lives.
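The two checks a user runs on each new block can be sketched roughly as follows. This is a toy model under loud assumptions: each transaction spends exactly one whole UTXO and creates exactly one output, signatures are replaced by a plain `sender` field, SHA-256 stands in for Keccak-256, and the `Tx` type, its `encode` format, and `validate_block` are all hypothetical names rather than anything from the MVP spec.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Tx:
    spends: str     # id of the UTXO being consumed, e.g. "0:0"
    sender: str     # assumption: stands in for a signature check
    recipient: str
    amount: int

    def encode(self) -> bytes:
        return f"{self.spends}|{self.sender}|{self.recipient}|{self.amount}".encode()


def merkle_root(leaves):
    level = [hashlib.sha256(leaf).digest() for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # pad odd levels (an assumption)
        level = [hashlib.sha256(level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return level[0]


def validate_block(block_number, txs, onchain_root, utxos):
    """Return the updated UTXO set, or raise ValueError if the block is bad.

    utxos maps a UTXO id to an (owner, amount) pair.
    """
    # Check 1: the block we were handed matches the root committed on-chain.
    if not txs or merkle_root([tx.encode() for tx in txs]) != onchain_root:
        raise ValueError("block does not match the on-chain Merkle root")
    # Check 2: every transaction spends an unspent output owned by its sender.
    updated = dict(utxos)
    for position, tx in enumerate(txs):
        if tx.spends not in updated:
            raise ValueError("input is missing or already spent")
        owner, amount = updated.pop(tx.spends)
        if owner != tx.sender or amount != tx.amount:
            raise ValueError("sender does not own the input being spent")
        updated[f"{block_number}:{position}"] = (tx.recipient, tx.amount)
    return updated


# Alice's deposit lives at "0:0"; block 1 moves it to Bob.
utxos = {"0:0": ("alice", 5)}
payment = Tx(spends="0:0", sender="alice", recipient="bob", amount=5)
root = merkle_root([payment.encode()])
print(validate_block(1, [payment], root, utxos))
```

The point of the sketch is that both checks are purely local: a user needs only the block data from the Operator and the root from the mainchain.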
Later on, Alice — feeling she's had herself enough of this crazy Plasma business, say — decides she wants to withdraw her funds back onto the Ethereum chain. She initiates this "withdrawal request" via an on-chain transaction (n.b: this withdrawal request does not require the Operator’s permission.) In her transaction, she includes the Plasma chain’s UTXO she would like to withdraw, along with the Plasma block number it belongs to and the Merkle path proving inclusion. Now, before she has access to her funds, she must wait for the "dispute period" (one week, let's just say) to pass. During this period, other users can challenge her exit if they detect foul play.
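A rough sketch of what the contract does with such a withdrawal request, written in Python rather than Solidity for readability. Everything here is an illustrative assumption: the `owner:amount` leaf encoding, the `ExitContract` class and its method names, the use of SHA-256 instead of Keccak-256, and timestamps passed in by hand. Challenges during the dispute period are left out of this sketch.

```python
import hashlib

DISPUTE_PERIOD = 7 * 24 * 60 * 60  # one week in seconds, as in the text


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_proof(leaves, index):
    """Return (root, path); path lists (sibling_hash, sibling_is_left) pairs."""
    level = [h(leaf) for leaf in leaves]
    path = []
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # pad odd levels (an assumption)
        sibling = index ^ 1
        path.append((level[sibling], sibling < index))
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return level[0], path


def verify_proof(leaf, path, root):
    """Recompute the root from a leaf and its Merkle path."""
    node = h(leaf)
    for sibling, sibling_is_left in path:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root


class ExitContract:
    def __init__(self):
        self.roots = {}  # Plasma block number -> Merkle root from the Operator
        self.exits = {}  # UTXO id -> (owner, amount, earliest finalization time)

    def start_exit(self, utxo_id, block_number, leaf, path, now):
        # No Operator permission needed: only the inclusion proof is checked.
        if not verify_proof(leaf, path, self.roots[block_number]):
            raise ValueError("UTXO is not included in that Plasma block")
        owner, amount = leaf.decode().split(":")  # toy leaf encoding
        self.exits[utxo_id] = (owner, int(amount), now + DISPUTE_PERIOD)

    def finalize_exit(self, utxo_id, now):
        owner, amount, deadline = self.exits[utxo_id]
        if now < deadline:
            raise ValueError("dispute period is still open")
        del self.exits[utxo_id]
        return owner, amount  # funds released on Layer 1


leaves = [b"alice:5", b"bob:3", b"carol:1"]
root, path = merkle_proof(leaves, 0)
contract = ExitContract()
contract.roots[1] = root
contract.start_exit("1:0", 1, leaves[0], path, now=0)
print(contract.finalize_exit("1:0", now=DISPUTE_PERIOD))
```

Note that the proof only shows the UTXO once existed in a committed block; it says nothing about whether it was spent later, which is exactly why the dispute period is needed.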
Which brings us to:
The Happy Case: Everyone Behaves
Once Alice initiates her exit, other users skim through their copy of the Plasma chain to check and confirm that yes, indeed, the UTXO Alice is trying to exit with does in fact still belong to her. They also verify that all of the blocks are valid in every other way (though presumably they’ve already done this). Users can now rest assured that Alice is only departing with money that's rightfully hers, and other users' funds are safe. Life can go on.
The Unhappy Case: Evil Alice
Now let's create an alternate ending to the previous scenario: Alice's exit attempt is "proper enough" to be initially accepted by the smart contract — which is to say, it's a valid transaction, with a Merkle proof that does indeed correspond with an old Merkle root — but it’s actually a double spend; i.e., she tries to exit with the same UTXO she sent Bob earlier. Alas, Alice.
But no matter! Bob (or really any other user, but let's assume Bob) has one week to take action; he’ll check Alice’s UTXO against his copy of the Plasma chain and notice that it’s a double spend. To prove maleficence, he submits a "fraud proof" in the form of the old transaction in which Alice previously spent the UTXO in question, along with a Merkle proof of its inclusion in a Plasma block. Bob has given cryptographic proof that Alice already spent this money; she's been caught in the act, and her attempt to withdraw this money is cancelled.
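Bob's challenge boils down to two checks on the contract side, sketched below. As before, this is a simplified illustration and not the MVP contract: the `input|sender|recipient|amount` transaction encoding, the function name `challenge_exit`, and SHA-256 in place of Keccak-256 are all assumptions, and signature verification on the spending transaction is omitted.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def verify_proof(leaf, path, root):
    """path is a list of (sibling_hash, sibling_is_left) pairs."""
    node = h(leaf)
    for sibling, sibling_is_left in path:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root


def challenge_exit(exits, roots, utxo_id, spend_tx, block_number, path):
    """Cancel a pending exit if spend_tx provably spent the exiting UTXO.

    exits maps UTXO id -> exitor; roots maps block number -> Merkle root.
    Returns True when the exit was cancelled.
    """
    if utxo_id not in exits:
        raise KeyError("no pending exit for that UTXO")
    # Check 1: the transaction really consumes the UTXO being exited.
    spent_input = spend_tx.split(b"|")[0].decode()
    if spent_input != utxo_id:
        return False
    # Check 2: that transaction was included in a block the Operator committed.
    if not verify_proof(spend_tx, path, roots[block_number]):
        return False
    del exits[utxo_id]
    return True


# Block 1 holds Alice's payment to Bob plus one unrelated transaction.
alice_to_bob = b"0:0|alice|bob|5"
unrelated = b"0:1|carol|dave|2"
roots = {1: h(h(alice_to_bob) + h(unrelated))}
path = [(h(unrelated), False)]  # sibling sits to the right of Alice's tx

exits = {"0:0": "alice"}  # Evil Alice tries to exit the UTXO she already spent
print(challenge_exit(exits, roots, "0:0", alice_to_bob, 1, path), exits)
```

Because the proof is checked against a root already stored on-chain, the contract never has to take Bob's word for anything.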
A Note on “Punishments”
At this point, we may want some way to further punish Alice for her attempted crime, creating a credible threat of greater loss for her and ideally disincentivizing this sort of behavior from taking place to begin with. These punishment mechanisms are typical in channel constructions; in Poon-Dryja payment channels, for example (the payment channel construction currently used in Bitcoin’s Lightning Network), being caught trying to cash out outdated transactions results in your counterparty collecting all of your channel’s funds. While this sort of punishment isn't strictly necessary in either Plasma or channels, it’s arguably a stronger requirement in Plasma; without some notion of punishments, Alice could simply repeatedly attempt improper withdrawals, forcing Bob (or someone else) to spend gas fees with every response. Ironically, however, it's also less self-evident how such punishments could be administered in Plasma; to slash some of Alice's funds in the Plasma chain, we’d need to establish which funds are hers, which itself would require its own claim/dispute window mechanism, sending us down a recursive, challenges-all-the-way-down rabbit hole.
Thus, Plasma constructions typically require Alice to post an "exit bond" as she attempts her withdrawal. In essence, she says, "I would like to take out 5 Ether, and here's 1 Ether which you can take from me if my exit proves to be fraudulent." We’re free to set up the contractual terms — i.e., the required size of the bond, as well as the response case of a violation (give the bond to the successful challenger as bounty, slash it into oblivion, cover only the challenger’s gas costs etc.) — and make them as lenient/Draconian as we please.
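The bond terms are simple enough to write down as a small payout function. The sketch below uses the 5 Ether exit and 1 Ether bond from the example above; the policy names, the `Exit` type, and the `settle` function are made up for illustration, and the choice to burn whatever is left of the bond after covering gas is just one possible reading of "cover only the challenger's gas costs."

```python
from dataclasses import dataclass
from enum import Enum

ETH = 10**18  # amounts are in wei


class BondPolicy(Enum):
    BOUNTY = "bounty"      # whole bond goes to the successful challenger
    BURN = "burn"          # bond is destroyed
    GAS_ONLY = "gas_only"  # challenger's gas is refunded, the rest is burned


@dataclass(frozen=True)
class Exit:
    exitor: str
    amount: int  # funds being withdrawn
    bond: int    # extra stake posted alongside the exit request


def settle(exit_, challenged, policy=BondPolicy.BOUNTY, challenger=None, gas_cost=0):
    """Return the Layer 1 payouts as a dict of address -> wei."""
    if not challenged:
        # Honest exit: withdrawn funds plus the returned bond.
        return {exit_.exitor: exit_.amount + exit_.bond}
    # Fraudulent exit: the exited funds stay in the Plasma chain; only the
    # bond is redistributed according to the chosen policy.
    if policy is BondPolicy.BOUNTY:
        return {challenger: exit_.bond}
    if policy is BondPolicy.GAS_ONLY:
        return {challenger: min(gas_cost, exit_.bond)}
    return {}  # BURN


alice_exit = Exit(exitor="alice", amount=5 * ETH, bond=1 * ETH)
print(settle(alice_exit, challenged=False))
print(settle(alice_exit, challenged=True, challenger="bob"))
```

Whatever the policy, the bond should at least exceed the gas a challenger expects to spend; otherwise repeated bad exits remain cheap for Alice and costly for everyone else.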
The Miserable Case: Evil Operator
So far things have gone relatively smoothly, largely because we've made the hugely simplifying assumption that the Operator has all of our best interests at heart. Now it's time to think the unthinkable: what if the Operator is an out-and-out liar and a thief?
Say, for example, Alice and Bob are going about their business, when they one day find that the Operator has sent them a block (with its Merkle root notarized on-chain) that includes a blatantly invalid transaction, one that spends, say, 90% of all of the Ether available in the Plasma chain; recall