Blockchain - Beyond the Hype of Bitcoin and Ethereum

"You can't stop things like Bitcoin. It will be everywhere and the world will have to readjust. World governments will have to readjust" - John McAfee, Founder of McAfee

I am sure that by now everyone has at least heard of Bitcoin in some context, maybe in the news or maybe even in your favorite TV series. For most people it is just an obscure technology (neither the name nor its classification as a cryptocurrency helps much in understanding it) used by suspicious individuals who want to purchase illegal merchandise or by hackers to fund their...well...their hacking or whatever suspicious individuals do.


While that may be one of its uses, the truth is that Bitcoin enables a lot of legitimate uses, with more and more places accepting it as a means of payment these days. But what's most interesting is the underlying mechanism that makes Bitcoin work: the blockchain. And, if you have been paying attention, then you know that blockchain was one of the most hyped technologies of 2016, as can be seen in the picture of Gartner's 2016 Hype Cycle for Emerging Technologies illustrated below. If you decide to read on, you will not only understand how Bitcoin works, but will also get to see how deep the rabbit hole really goes.

Some words about cryptocurrencies

Each blockchain has an associated digital currency (aka cryptocurrency) that is independent and exists only within the world of that blockchain. The best-known example is of course Bitcoin, but there are a lot more than you would expect (more than 700 at the time of this writing). You can get a list of them and their value compared to the US dollar here. Maybe you will say "Wait, that money doesn't really exist, it's just some numbers that exist on a computer". In our day and age this is also the case for most conventional currencies: they are just numbers in the banks' datacenters, with cash becoming more and more rare. In fact, Bitcoin has been compared with currencies based on the gold standard. Without going into much detail, think about the fact that the value of something usually comes from how scarce it is and how big the demand for it is...the same applies to cryptocurrencies.

Blockchain - The data structure

So what is blockchain? The name is pretty revealing: it is a chain of blocks. Each block contains a number of transactions that have taken place. These transactions deal with the associated cryptocurrency of that blockchain (as discussed earlier). Together, all the blocks form the entire history of transactions that took place since the inception of the blockchain; you can think of it as a ledger. As you might expect, there is a block 0 in this chain and it is a very special block. It is here that the initial amount and distribution of the new cryptocurrency is set. So if you decide to launch a new digital coin, make sure to allocate some for yourself in the beginning...who knows, maybe you will stumble upon the next Bitcoin.

Inside the block there is other information, which varies between blockchain implementations (e.g. the timestamp of the block and metadata about the transactions). What is always there is a pointer (a hash pointer, to be more exact) which indicates the previous block. That is how the chain is constructed and how the blocks are tied to each other. To get a little more technical, in Bitcoin each block contains a hash of the previous block's header. A hash function is a one-way mathematical function that, applied to any input, generates a fixed-size output. Since we want our transactions to be secure, what we need in the context of blockchain is a cryptographically secure hash function (in Bitcoin, SHA-256), which has the following 3 extra properties:

  • you can only find the input value corresponding to an output value by random trials
  • a small change to a message should change the hash value so extensively that the new hash value appears uncorrelated with the old hash value
  • you cannot find two inputs with the same output value easily
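To make the hash pointer idea concrete, here is a minimal Python sketch. It is a simplification and not Bitcoin's actual block format: it hashes a whole block serialized as JSON, whereas Bitcoin hashes a binary block header. The names `hash_block`, `make_block` and `chain_is_intact` are made up for this illustration.

```python
import hashlib
import json
import time


def hash_block(block):
    """Return the SHA-256 hex digest of a block's canonical JSON form."""
    encoded = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def make_block(prev_hash, transactions, timestamp=None):
    """Build a block that points to its predecessor via prev_hash."""
    return {
        "prev_hash": prev_hash,
        "timestamp": time.time() if timestamp is None else timestamp,
        "transactions": transactions,
    }


def chain_is_intact(chain):
    """Check that every block stores the hash of the block before it."""
    return all(
        chain[i]["prev_hash"] == hash_block(chain[i - 1])
        for i in range(1, len(chain))
    )


# Block 0 has no predecessor, so it carries a dummy pointer of all zeros.
genesis = make_block("0" * 64, [{"to": "alice", "amount": 50}], timestamp=0)
block1 = make_block(
    hash_block(genesis),
    [{"from": "alice", "to": "bob", "amount": 5}],
    timestamp=1,
)
chain = [genesis, block1]
```

Changing anything in an old block changes its hash, so the pointer stored in the next block no longer matches. That is what makes rewriting history detectable.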

Congratulations, you know the basics of the blockchain data structure. Now let's go distributed!

Blockchain - The network

The whole point of cryptocurrencies is to function on a decentralized infrastructure. That's why we have the blockchain network, a peer-to-peer network where the peers are the so-called miners.

If you were ever intrigued by the term miner, here is where the mystery will be revealed. The term miner refers to the nodes that are part of the network. For public blockchains (like Bitcoin), everyone can join and become a miner (yes, even you), while permissioned blockchains have some sort of access control in place (public vs permissioned vs private blockchains is another entire article by itself).

Miners have three big roles:

  • They all keep a full copy of the current blockchain that they have already reached a consensus on (in the form of the data structure described earlier). In other words, they all have access to the entire history of valid transactions that took place on the blockchain. This can become quite a burden on the miners, with the current Bitcoin chain having a size of over 114 GB (so if you want to become a miner, better free up some space). This makes the entire process quite inefficient, so a lot of research has been done on new blockchains to eliminate this requirement.
  • Each of them independently constructs a new block containing new transactions coming from the users of the blockchain. It is also their duty to validate the transactions and forward them to the other miners. Since they may receive more transactions than fit in a new block, mechanisms exist to prioritize them.
  • They must reach a consensus on what block will be added to the canonical chain.
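As an illustration of the prioritization mentioned in the second role, here is one plausible strategy sketched in Python: sort the pending transactions by fee per byte and fill the block greedily. This is an assumption made for the example, not a description of what any particular miner software does, and `Transaction` and `select_transactions` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    tx_id: str
    size_bytes: int
    fee: int  # in the smallest unit of the currency


def select_transactions(pending, max_block_bytes):
    """Greedily fill a block with the highest fee-per-byte transactions."""
    by_fee_rate = sorted(
        pending, key=lambda tx: tx.fee / tx.size_bytes, reverse=True
    )
    chosen, used = [], 0
    for tx in by_fee_rate:
        # Skip anything that no longer fits; smaller ones may still fit.
        if used + tx.size_bytes <= max_block_bytes:
            chosen.append(tx)
            used += tx.size_bytes
    return chosen


pending = [
    Transaction("a", size_bytes=250, fee=500),   # 2.0 per byte
    Transaction("b", size_bytes=400, fee=2000),  # 5.0 per byte
    Transaction("c", size_bytes=300, fee=300),   # 1.0 per byte
    Transaction("d", size_bytes=500, fee=1500),  # 3.0 per byte
]
block_txs = select_transactions(pending, max_block_bytes=1000)
```

With a 1000-byte limit, only "b" and "d" make it into the block; the cheaper transactions have to wait for a later one.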

How is that consensus reached, and how can we trust this network, given that there is no central authority regulating it and everyone is out for themselves?

To answer this we have to turn away from a purely scientific answer, because this is one of the rare cases where things work better in practice than in theory. But theory is catching up fast, especially with all the hype surrounding this field lately.

Consensus - Beating the odds

This is one of the holy grails of Computer Science: Distributed Consensus, aka Byzantine Fault Tolerance. It is named after the Byzantine Generals Problem, which tells the story of multiple Byzantine generals, stationed in different camps, who have to coordinate a successful attack or retreat under unreliable circumstances (really exciting stuff there, I encourage you to read about it in the link I provided). In general, computer scientists have reached pessimistic results on this topic, with a number of impossibility results proven.

Blockchain works because it departs from classical distributed systems (e.g. distributed databases) in two essential ways:

  • It introduces the idea of offering incentives to miners to play nice and follow the rules of the game. Because a blockchain is associated with a currency, it has a natural mechanism for doing so.
  • It embraces randomness. It acts on probabilities more than on certainties, and with the passing of time these probabilities grow towards 100%.

Let's enumerate the steps in which Bitcoin tries to reach consensus:

  1. A "random" miner is selected by the network
  2. The selected miner sends a local new block to the other miners
  3. The other miners validate the received new block. A basic check is that the hash pointer points to the latest block that is already part of the canonical blockchain. Another important check is that the users have enough balance in their accounts to complete the transactions (information that can be gained from the blockchain that everybody has available)
  4. If it is valid, they discard their local new block and add the received block to the blockchain
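The two checks from step 3 can be sketched in a few lines of Python. This is a toy model with simple account balances, as described in this article; Bitcoin itself tracks unspent transaction outputs rather than balances, and the function name and block layout here are assumptions made for the example.

```python
def validate_block(block, tip_hash, balances):
    """Return the updated balances if the block is valid, otherwise None."""
    # Check 1: the hash pointer must reference the current tip of the chain.
    if block["prev_hash"] != tip_hash:
        return None

    # Check 2: every sender must be able to afford their transactions.
    updated = dict(balances)  # work on a copy, the caller's dict is untouched
    for tx in block["transactions"]:
        sender, receiver, amount = tx["from"], tx["to"], tx["amount"]
        if amount <= 0 or updated.get(sender, 0) < amount:
            return None
        updated[sender] -= amount
        updated[receiver] = updated.get(receiver, 0) + amount
    return updated


balances = {"alice": 10, "bob": 0}
good = {
    "prev_hash": "abc",
    "transactions": [{"from": "alice", "to": "bob", "amount": 4}],
}
overspend = {
    "prev_hash": "abc",
    "transactions": [{"from": "alice", "to": "bob", "amount": 40}],
}
stale = {"prev_hash": "xyz", "transactions": []}
```

A miner would call `validate_block` with the hash of the latest block it knows about, and only replace its own candidate block when the result is not `None`.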

If you think about it carefully, this mechanism prevents a lot of common attacks, as long as there are more than 50% honest nodes:

  • The Double Spend Attack - An attacker might try to spend the same money twice, paying two different parties. This requires a little more explanation, which you can find here. The basic idea is that if you receive a payment in a newly added block and immediately accept it, another block can come in with a conflicting transaction that invalidates your payment. With some probability, the network may choose that second block as the right one, even though morally the block containing your payment is the right one. The conclusion is that if you are receiving a payment, it is wise to wait for a number of additional blocks to be added, because each newly added block increases the probability that your payment stays accepted by the network. For Bitcoin it is recommended that you wait 6 blocks before considering the payment confirmed.
  • Stealing money - if a user tries to spend money they don't have, even if they control some of the miners, the majority of the miners will reject the transactions as invalid.
  • Denial of service - if some miners want to block someone's transactions, they can refuse to add them to their new blocks, but since a "random" miner is selected each time a new block is added, this can only result in a minor delay or inconvenience for the user.
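To put a number on why waiting for more blocks helps against double spending, here is a Python port of the calculation given in the Bitcoin whitepaper (section 11). It estimates the probability that an attacker controlling a share `q` of the computing power ever catches up after the recipient has waited for `z` blocks. The model assumes a constant attacker share, so treat the output as a rough guide rather than a guarantee.

```python
import math


def attacker_success_probability(q, z):
    """Probability that an attacker with power share q overtakes the
    honest chain after the recipient has waited for z blocks."""
    p = 1.0 - q  # share of computing power held by honest miners
    if q >= p:
        return 1.0  # with half the power or more, catching up is certain
    lam = z * (q / p)  # expected attacker progress while z blocks are mined
    total = 1.0
    for k in range(z + 1):
        poisson = math.exp(-lam) * lam ** k / math.factorial(k)
        total -= poisson * (1 - (q / p) ** (z - k))
    return total


# Attacker with 10% of the computing power, 0 to 6 blocks of waiting.
for z in range(7):
    print(z, f"{attacker_success_probability(0.10, z):.7f}")
```

For an attacker with 10% of the power, the probability falls quickly with every extra block and is well under 0.1% by the sixth one, which gives some intuition for the 6 block recommendation.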

This looks fine technically, but it still doesn't answer how the "random" selection works and why we assume more than 50% of the network will do the right thing...

Embracing the randomness

What is the problem? We are trying to get unrelated entities (which in the end are human, because the mining nodes are controlled by humans) to cooperate.
Sure enough, blockchains come up with incentives to promote good behavior. Usually these incentives come in two forms:

  • A block reward - every miner who is "randomly" selected to add a new block receives the so-called block reward. This is a predetermined amount of the currency associated with that blockchain. It is usually pretty important at the start of a new blockchain, helping its bootstrap process. The catch is that your reward is valid only if your block is valid. So you should try your best to make sure that your block has all the correct information and that all the transactions contained within are valid as well. As an example, for Bitcoin the block reward is currently 12.5 bitcoins and it halves roughly every four years. At the time of this writing a bitcoin is valued at $767, so you would get about $9,587 for one mined block...not bad.
  • The transaction fee - Again this works in different ways for different blockchains, but the main idea is that the user of the blockchain network can choose to reserve a small portion of the transaction amount. That money will be earned, as with the block reward, by the node which has the luck of adding the block containing the transaction. This also works as a transaction prioritization system: miners will most likely include the transactions that offer the highest fees first.
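For Bitcoin specifically, the block reward schedule is simple enough to write down: it started at 50 bitcoins and is cut in half every 210,000 blocks, which works out to roughly four years. The Python sketch below mirrors that rule; the function names are mine, and amounts are in satoshis (1 bitcoin = 100,000,000 satoshis) to avoid rounding problems.

```python
COIN = 100_000_000          # satoshis per bitcoin
HALVING_INTERVAL = 210_000  # blocks between halvings (roughly four years)
INITIAL_SUBSIDY = 50 * COIN


def block_subsidy(height):
    """Block reward in satoshis for a block at the given height."""
    halvings = height // HALVING_INTERVAL
    if halvings >= 64:
        return 0  # the reward has been shifted away entirely
    return INITIAL_SUBSIDY >> halvings  # integer halving, rounds down


def miner_income(height, fees):
    """What the winning miner collects: block reward plus all fees."""
    return block_subsidy(height) + sum(fees)


# Summing the reward over every halving period gives the total supply.
total_supply = sum(
    block_subsidy(period * HALVING_INTERVAL) * HALVING_INTERVAL
    for period in range(64)
)
```

Because the reward keeps shrinking, the total supply stays just below 21 million bitcoins, and transaction fees are expected to matter more and more to miners over time.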

This looks pretty neat, but as long as there is something to gain, humans will act in their own interest, even contrary to the common good. If you don't believe me, look it up. There's even an economic theory with a cool name about it, the Tragedy of the Commons.
Image source xkcd
This incentive system creates the "free for all" problem. Everyone would join in the hope of hitting the jackpot. This leads to another problem: attackers could create unlimited nodes to join the network until they control more than 50% of it, which is pretty bad, as discussed earlier. The problem of selecting "random" nodes is also still open.

There are multiple ways of solving these issues, which in the end turn out to be interrelated. For simplicity, they can be expressed as the (famous?) Prisoner's Dilemma: a story about two prisoners, each of whom has the option of either keeping silent or ratting out the other. The essential ideas are that each prisoner is better off choosing to betray regardless of the other's choice, and that the greatest benefit goes to the one who betrays while the other cooperates. The prisoners would both be better off if they both cooperated rather than both betrayed, but since they have no way to ensure cooperation, they will both choose to betray.
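If you want to convince yourself that betraying really is the best choice whatever the other prisoner does, here is a tiny Python check. The sentence lengths are illustrative numbers I picked; any payoffs with the same ordering give the same result.

```python
# Years in prison for (my_choice, other_choice); lower is better.
# The specific numbers are illustrative assumptions.
YEARS = {
    ("silent", "silent"): 1,
    ("silent", "betray"): 3,
    ("betray", "silent"): 0,
    ("betray", "betray"): 2,
}
CHOICES = ("silent", "betray")


def best_response(other_choice):
    """The choice that minimizes my sentence given what the other does."""
    return min(CHOICES, key=lambda mine: YEARS[(mine, other_choice)])


def is_dominant(choice):
    """True if `choice` is the best response to every opponent move."""
    return all(best_response(other) == choice for other in CHOICES)


print("betray is dominant:", is_dominant("betray"))
print("both betray:", YEARS[("betray", "betray")], "years each")
print("both silent:", YEARS[("silent", "silent")], "year each")
```

Betraying wins against both possible moves of the other prisoner, yet when both follow that logic they each end up with a longer sentence than if both had stayed silent.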

Image source thedeclination

To understand how to solve the dilemma we must turn to biology. The idea is that when two animals want and need to cooperate, they must communicate their honest intentions to one another in a believable way. In order to make lying implausible, the signal must impose a cost on the signaler that would make it very costly to cheat. In other words, the signal itself must be a handicap. This is the so-called Handicap Principle. And this can be applied to blockchains in different forms, the most well-known being Proof of Work and Proof of Stake. Bitcoin uses Proof of Work, so we will focus on that.

Here's the scoop: the key idea behind proof-of-work is that we approximate the selection of a random node by instead selecting nodes in proportion to a relatively scarce resource that we hope nobody can monopolize (e.g. Bitcoin uses processor power, while others use computer memory). How, you ask? Well, let's see.

In Bitcoin all miners are required to solve a puzzle (the Hashcash puzzle: basically they have to find an input to a certain hash function whose output falls within a given range). This puzzle can only be solved by brute force, so it is basically only by chance that you find the right answer; one could even say it is almost random. That's why "random" was always between quotes, because we are only doing an approximation (maybe surprisingly, true randomness is hard to come by). Even if a miner has a lot more computing power than the others, it will not always be the one to solve the puzzle; its chance of doing so is only proportional to its share of the total computing power in the network. And it is pretty important to be the first to solve the puzzle, because that miner will be the one adding the new block to the blockchain. So as long as no one controls more than 50% of the total computing power in the network, the network is secure. This is in fact the mining process, and that's why the nodes are called miners.
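Here is a minimal sketch of such a puzzle in Python. It follows the general shape of Bitcoin's (double SHA-256 over the header plus a nonce, compared against a target), but the header is just arbitrary bytes and the difficulty is expressed as a number of leading zero bits, both simplifications chosen for the example. `mine` and `verify` are hypothetical names.

```python
import hashlib


def pow_hash(header: bytes, nonce: int) -> int:
    """Double SHA-256 of header + nonce, read as a big integer."""
    data = header + nonce.to_bytes(4, "little")
    digest = hashlib.sha256(hashlib.sha256(data).digest()).digest()
    return int.from_bytes(digest, "big")


def mine(header: bytes, difficulty_bits: int, max_nonce: int = 2**32):
    """Try nonces one by one until the hash falls below the target.

    Returns the winning nonce, or None if none was found in range.
    """
    target = 1 << (256 - difficulty_bits)  # smaller target = harder puzzle
    for nonce in range(max_nonce):
        if pow_hash(header, nonce) < target:
            return nonce
    return None


def verify(header: bytes, nonce: int, difficulty_bits: int) -> bool:
    """Checking a solution takes one hash, unlike finding one."""
    return pow_hash(header, nonce) < (1 << (256 - difficulty_bits))


nonce = mine(b"example block header", difficulty_bits=12)
print("found nonce:", nonce)
```

With 12 bits of difficulty a solution takes about 4,000 attempts on average, and every extra bit doubles the expected work. Finding the nonce is expensive, while checking it is cheap, which is exactly the asymmetry the network relies on.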

This continuous puzzle solving has a cost for the miners: first in terms of the hardware required to mine and secondly in terms of consumed electricity. This cost provides an entry barrier for attackers, who cannot just create new nodes for free to join the network and increase their chance of winning the "block adding lottery".
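The "block adding lottery" can be simulated in a few lines of Python. The sketch assumes that the time a miner needs to find a block follows an exponential distribution with a rate proportional to its computing power, which is a common way to model brute-force search; the miner names and numbers are made up.

```python
import random
from collections import Counter


def simulate_mining(hash_rates, rounds, seed=0):
    """Simulate `rounds` block races; return each miner's share of wins."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    wins = Counter()
    for _ in range(rounds):
        # Each miner draws a random solving time; more computing power
        # means a higher rate and therefore shorter times on average.
        times = {m: rng.expovariate(rate) for m, rate in hash_rates.items()}
        wins[min(times, key=times.get)] += 1  # fastest miner adds the block
    return {m: wins[m] / rounds for m in hash_rates}


shares = simulate_mining({"big": 60, "medium": 30, "small": 10}, rounds=20_000)
for miner, share in shares.items():
    print(f"{miner}: {share:.3f}")
```

Each miner's share of blocks comes out close to its share of the computing power, so the only way to win more often is to pay for more hardware and electricity.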

The Proof of Stake system is an alternative with the advantage, among others, that it doesn't require the useless energy consumption that Proof of Work entails. In this system, the chance of getting selected to add a new block depends on the amount of stake you have in the currency associated with the blockchain. This system is not trivial, because most of the time it comes with the "nothing at stake" and "monopoly" problems. You can read more about it here.

That's it folks. If you have made it this far, most of the high-level mysteries of blockchain have been revealed, and you probably also have a lot of questions (you can ask in the comments and I will try my best to answer).
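To give a feel for stake-based selection, here is a toy Python sketch. It is not how any specific Proof of Stake blockchain works: it simply assumes that all nodes share some random seed and maps it to a "ticket", so that a participant holding more coins owns more tickets. The function name and the integer stakes are assumptions made for the example.

```python
import hashlib


def select_validator(stakes, seed: bytes):
    """Pick a participant with probability proportional to its stake.

    `seed` stands in for whatever shared randomness the protocol provides.
    """
    total = sum(stakes.values())
    if total <= 0:
        raise ValueError("no stake in the system")
    # Map the seed to a ticket number in [0, total), then find its owner.
    ticket = int.from_bytes(hashlib.sha256(seed).digest(), "big") % total
    running = 0
    for name in sorted(stakes):  # fixed order so that all nodes agree
        running += stakes[name]
        if ticket < running:
            return name
    raise AssertionError("unreachable: ticket is always below total")


stakes = {"alice": 75, "bob": 25}
picks = [select_validator(stakes, str(i).encode()) for i in range(4000)]
print("alice selected:", picks.count("alice") / len(picks))
```

Over many rounds alice, who holds three quarters of the stake, is selected about three quarters of the time, without anyone burning electricity to decide it.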

There is one more thing that was promised since the beginning of this article...

Ethereum - The world computer

As you may have noticed, our clickbait title also contains a reference to Ethereum. Initially I wanted to talk more in depth about Ethereum, but that would turn this article into such a mammoth that even I would be afraid to start reading it (it reminds me of the feeling I had when I wanted to start reading Dostoyevsky's The Brothers Karamazov). So just a few words for now, and maybe I will describe it in more detail in a future post.

Image source coindesk
Ethereum takes blockchain to the next level. You don't use the network and its currency just to transfer money, you use it to transfer, store and execute code. It acts like a trusted and secure distributed computing platform that anyone can join and anyone can use. From here the possibilities are endless. Some applications include:

  • Decentralized investment funding (although the most famous one, The DAO, ended pretty badly)
  • Decentralized options exchange
  • Releasing music as a digital contract
  • Decentralized voting system

This all works by introducing the notion of smart contracts. A smart contract is supposed to be the equivalent of a paper contract, with features like being fully self-executing, self-enforcing and easy to manage. It's mainly because of this that the whole world has gone crazy over blockchain in the last two years. Banks, startups and fintechs are all getting on the hype train. After the dust settles we may really have something that will change how the world works.

Before the closing words, if you really want to know more about the technicalities of Bitcoin and blockchains, I strongly recommend the excellent Coursera course Bitcoin and Cryptocurrency Technologies provided by Princeton University and the accompanying textbook (the draft version is available for free here).
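To give a rough idea of what "self-executing" means, here is a toy escrow written in plain Python. It is only an analogy: real Ethereum contracts are written in languages like Solidity and run on every node of the network, and nothing below is Ethereum's actual API. The class and its rules are invented for the illustration.

```python
class EscrowContract:
    """Toy escrow: holds a buyer's payment until delivery or a deadline."""

    def __init__(self, buyer, seller, amount, deadline):
        self.buyer, self.seller = buyer, seller
        self.amount, self.deadline = amount, deadline
        self.delivered = False
        self.settled = False

    def confirm_delivery(self, caller):
        # Only the buyer may confirm; the code itself enforces the rule.
        if caller != self.buyer:
            raise PermissionError("only the buyer can confirm delivery")
        self.delivered = True

    def settle(self, now, balances):
        """Pay out according to the contract terms; anyone may trigger it."""
        if self.settled:
            raise RuntimeError("contract already settled")
        if self.delivered:
            recipient = self.seller
        elif now >= self.deadline:
            recipient = self.buyer  # refund if nothing arrived in time
        else:
            return None  # terms not met yet, the funds stay locked
        balances[recipient] = balances.get(recipient, 0) + self.amount
        self.settled = True
        return recipient
```

The point is that the payout rules live in code that no single party controls: once the conditions are met, the outcome follows without a bank or a lawyer in between.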

Tune in for the next episode

Next time we will get our hands dirty and try to make some Bitcoin transactions, install our own Ethereum blockchain and build some cool simple stuff on it. Or maybe it will be something completely different...who knows?

If you want more do not forget to subscribe.