Thread by @nutzipper, Post shall be published soon, but for now please be prepared for [...]

Post shall be published soon, but for now please be prepared for a tweet storm.
Thread: why block merge is important milestone and how it brings horizontal scalability.
Spoiler: Some preliminary results of implementation in the end. https://twitter.com/nutzipper/status/1359052450657935361

https://twitter.com/nutzipper/status/1359052450657935361

1\\ Blockchain is effectively a computer, replicated across a number of nodes.
Two basic offerings are cryptographic proof of state transition + l incentive for making sure one cheating is going to be punished.

Now let us talk about state transition.

2\\ Early blockchain application gave birth to a public ledger. This public ledger contains, in the simplest case, a list of balances. But just a list is not a good model for such a ledger, so usually more sophisticated models are used, like UTXO for Bitcoin and some others.

3\\ Ethereum state model (EVM) allows storing Turing complete state machine, enabling executing of arbitrary smart contracts, not just changing balances in a ledger.

4\\ Entities that are allowed to introduce changes to the state, changing balances, making transactions, invoking smart contracts - are miners, or validators of the network.

5\\ In case of PoS networks, once some state change is introduced, it should be verified by other validators, and the Sword of Damocles should punish thee daring cheating (or, in simple words, the cheater will lose its stake).

This is the basic principle of most chains.

6\\ But what if some miners of the network make two state transitions at the same time?

7\\ PoW systems solve this by the longest chain rule - state transitions are stacked upon each other in order at which miners produce proof of work, and in the end the longest chain wins. There can be, and will be orphans, so miners doing useless work is a part of the protocol.

8\\ PoS chains mostly suggest slots for blocks, and the more stake miner has the more blocks it is allowed to create, thus collecting more rewards.

9\\ So, this basic principle of a blockchain assumes that all state transitions happen one by one, and verified one by one. This leads to seeking scaling opportunities in running parallel chains or making some computations off chain and committing the result of a batch at once.

10\\ Why is this, one might ask? Why not have a nice graph, or hypergraph, or hypergraph of hypergraphs? And include all state transitions that have been done by all validators, eliminating orphans at all?

11\\ Here's the problem - when miner creates a new state transition, it has to use some state as a base to work with, one that has been already proven by the network to be valid. And if you have not a chain, but DAG, you have several possible base states to work with.

12\\ Let us have 4 validators proposed 4 blocks with the following transactions:
tx1: Alice bought a toy train,
tx2: Bob bought a ticket to the cinema,
tx3: Carol bought the same toy train that Alice did
tx4: Dave moved all his funds to DeFi contract with open exploit.

13\\ So miner A sees all these and tries to create a new state transition, willing to have all these 4 changes included into the final state which it will offer to peers for verification.

14\\ Here the question arises - how miner A can be sure that tx1 tx2 tx3 and tx4 do not call a double spend (and tx1 and tx3 actually call a double spend)?

15\\ RChain stands on a point that if you don't have a computational model that has an intrinsic concurrency model - you cannot detect that double spend.

16\\ Of course, one is able to create some extensions over the smart contracting language to track some special cases, e.g. have some very strict model for token transfers and check if token transfers do not interfere.

17\\ But - this is a not scalable solution and quite fast stack up limitations. Also, it's an error prone because no one guarantees that this will be a secure solution.

18\\ RChain solves this problem. Thanks to rho-calculus as the foundation for computing language - conflict detection is provided straight out of the box. So miner A easily can detect which state transitions cannot be combined and disqualify them (and only them)

19\\ To understand this more deeply, let's take a look how RChain represents the state of a virtual machine.

20\\ All computations in Rholang are expressed as sends and receives on some channel, as it is a process calculi language. So, roughly speaking, all that is stored - is data pieces that are available on a channel, and continuations that are available on a channel.

21\\ If at some point data and continuation match - comm event happens, which is a basic compute event. All this is stored in a RSpace, which is a storage unit. As a side note, the outcome of this storage model is that RChain also offers transactional data storage.

22\\ There is one basic constraint on RSpace - if one looks at it, there should not be any possible comm events at any time. When some state transition happens, if any comm occurs - all continuations are invoked till the point when there are no more potential comm's left.

23\\ That's why the logic of conflict detection simply boils down to checking if changes produced by transactions, added to RSpace, lead to potential comm events. If no - changes made by these tx’s can be put into RSpace without any code execution.

24\\ And, which is important - this conflict resolution is built-in on a computational model level. So it is available for all user code ever going to be written, and - with mathematical proof of correctness.

25\\ Now, having this knowledge, let's talk about how this brings scalability.
General understanding in blockchain space is that if you add more nodes to the network - you inevitably lower down the throughput, decentralization is not free.

26\\ This sounds reasonable, overhead on routing and data replication increases. But, if miners are allowed to change state in parallel, and there is a way to validate and compose these changes in an efficient way - we might reconsider this statement.

27\\ So, we have two pieces that have to be enabled to make this happen: efficient validation, and efficient merging of state changes. The latter includes conflict detection and actual technical merging of changes.

28\\ The great thing about RChain - validation for all state transitions can happen concurrently. RSpace is an immutable structure. So, when miner A receives several blocks from miners B, C, D - all these state transitions can be validated in parallel, even if they are conflicting

29\\ In extreme cases - time spent on validation of 100 blocks should be the same as for a single block. But this is a matter of optimization, proper engineering and resources available. At the moment of writing the article this is still work in progress.

30\\ Conflict detection can be done with several degrees of efficiency, trading number of conflicts for performance.

31\\ The most performant way is to not analyze the content of the channel but just detect if the same channel used with different polarities in two state transitions - on one data is being put into channel, in another continuation is being put into the same channel.

32\\ In general case, this should not be a common thing, because of how channels are constructed using unforgeable names (which btw enables other awesome things like OCAPs security discipline).

33\\ So it is sufficient for many (if not the most) use cases. In this case conflict detection boils down to the operation of intersection of two sets. This is quite a simplistic explanation, but for one's wanting to get a more in-depth picture, feel free to join Discord and ask.

34\\ And, the last part is the merging of state changes. This operation requires data that is inside quite a narrow scope till the last finalized block, which allows aggressive caching. But even without it, it's just reading from and writing to the database.

35\\ To sum up everything being said, this tweet storm briefly explained what is `block merge` and how it is related to scalability.
As said in the beginning, some preliminary results can be found in this gist https://gist.github.com/nzpr/2465db5210c817324ab7ab4f01e23f38.

gist:2465db5210c817324ab7ab4f01e23f38

GitHub Gist: instantly share code, notes, and snippets.

https://gist.github.com/nzpr/2465db5210c817324ab7ab4f01e23f38

36\\ The expected result which should demonstrate linear scalability is that merging time is a linear function of number of nodes and number of blocks created by the network grows linearly with increasing number of nodes.

37\\ As it can be seen, merging of state changes for 20 blocks takes about twice as for 10 blocks. Which gives some high hopes for achieving almost linear scalability for state merging.
Please stay tuned for next articles with more data and tests. Thank you for reading this!

Latest Threads Unrolled: