The Bitcoin Blockchain Problem
I recently read an article about a Bitcoin problem that I have been trying to understand ever since I first read the article on Bitcoin, the electronic cash system. The problem is pretty straightforward. Transactions in the Bitcoin system are stored in a structure called the blockchain, and they are stored forever. So a basic question pops up: how does the system manage the growth of this information, which by the way depends on the number of transactions?

The blockchain (you can think of it as all the transaction data) is divided into blocks (no surprise there), each one a bundle of transactions. Blocks are kept small to make it easier for the peers of the Bitcoin network to store the information. The drawback is that a small block size also means a small number of transactions per second that the network is able to process. So you have a trade-off here. On the other hand, the fundamental problem of data storage scalability is not solved at all.

If you check the all-time blockchain size chart you will notice the shape of the curve, and it's not pretty. You can see a very sharp increase in the volume needed to store all the data, and at the time of this writing it stands at an astonishing 40 GB. What? Let's pause a bit here. Remember the purpose of Bitcoin? An electronic cash system that should be decentralized by design, right? Well, how the hell can a decentralized system work with exponential growth in the size of the blockchain, which, by the way, is stored locally on every peer of the network?

That's the question I was facing, and I didn't understand how they had solved it. Apparently it was not solved at all. The article I mentioned describes two possible approaches. The first is to assume an equivalent evolution in hardware. I think this first approach is laughable because, as we all know, data storage hardware is not growing at the same rate as the blockchain.
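To put rough numbers on the trade-off between block size and throughput described above, here is a back-of-the-envelope sketch. The 1 MB block limit and the 10-minute block interval are Bitcoin's parameters at the time of writing; the 250-byte average transaction size is only an assumption for illustration, and the function names are made up for this example.

```python
def max_tx_per_second(block_size_bytes, avg_tx_bytes, block_interval_s):
    """Upper bound on throughput: transactions that fit in one block,
    divided by the time between blocks."""
    tx_per_block = block_size_bytes // avg_tx_bytes
    return tx_per_block / block_interval_s


def max_growth_gb_per_year(block_size_bytes, block_interval_s):
    """Upper bound on yearly chain growth, assuming every block is full."""
    seconds_per_year = 365 * 24 * 3600
    blocks_per_year = seconds_per_year / block_interval_s
    return blocks_per_year * block_size_bytes / 1e9


BLOCK_INTERVAL_S = 600   # one block roughly every 10 minutes
AVG_TX_BYTES = 250       # assumed average transaction size

# Compare the 1 MB limit with two hypothetical larger limits.
for block_size in (1_000_000, 2_000_000, 8_000_000):
    tps = max_tx_per_second(block_size, AVG_TX_BYTES, BLOCK_INTERVAL_S)
    growth = max_growth_gb_per_year(block_size, BLOCK_INTERVAL_S)
    print(f"{block_size / 1e6:.0f} MB blocks: "
          f"up to {tps:.1f} tx/s, up to {growth:.1f} GB/year")
```

Under these assumptions both numbers scale together: any increase in block size that buys more transactions per second also raises the ceiling on how fast every peer's copy of the blockchain can grow.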
So assuming that hardware will keep up, in both read/write speed and storage space, is simply ludicrous.

The second approach discussed in the article is a reimplementation of the blockchain in a distributed way, using a mechanism based on Merkle trees. At first it seems a pretty clever idea, and theoretically it may work. Notice that the growth in the number of transactions is related to the number of users of the Bitcoin cash system. It is plausible to assume that these new users are also peers in the network, so it is also plausible to assume we can exploit the combined storage and the combined throughput of the network.

However, I still have a problem with this (much better, I must say) approach. How does the network ensure that all the data remains available when a peer quits the network in this new distributed blockchain? The solution is to have some degree of redundancy between peers, plus a mechanism ensuring that any given piece of data is always held by some minimum number of peers. Or you assume that peers never quit, which in my opinion is not a very clever approach.

This mechanism is not perfect and can be problematic. If the network is unstable, with peers constantly joining and leaving, the amount of data that needs to be redistributed can be huge and consume a large share of the network capacity. On the other hand, if for some reason the peers that suddenly decide to quit all hold the same data, the network can lose that data. And this is a very concerning problem.

So how do we solve it? Another approach that was discussed is called pruning. Of all the alternatives seen so far I think pruning is the best, but it comes at a high price: data will not be stored forever. With pruning you define a maximum depth of transactions to store and discard the older ones. That way the blockchain would grow in a smoother and more manageable way.
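The pruning idea can be illustrated with a toy model. This is only a sketch of the principle, not how any Bitcoin client implements it: the `PrunedChain` class and its fields are hypothetical, and the depth is counted in blocks rather than individual transactions to keep the example short.

```python
from collections import deque


class PrunedChain:
    """Toy chain that keeps only the newest `max_depth` blocks."""

    def __init__(self, max_depth):
        # A deque with maxlen drops its oldest element automatically
        # whenever a new one is appended to a full deque.
        self.blocks = deque(maxlen=max_depth)
        self.height = 0  # total number of blocks ever appended

    def append(self, transactions):
        self.blocks.append({"height": self.height,
                            "transactions": list(transactions)})
        self.height += 1

    def stored_heights(self):
        return [block["height"] for block in self.blocks]


chain = PrunedChain(max_depth=3)
for i in range(10):
    chain.append([f"tx-{i}-a", f"tx-{i}-b"])

print(chain.height)            # 10 blocks were produced in total
print(chain.stored_heights())  # [7, 8, 9]: only the newest three are kept
```

Storage stays bounded by `max_depth` no matter how long the chain runs, which is exactly the appeal. The cost is just as visible: blocks 0 to 6 are gone for good.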
But as I said, pruning would remove the very nice feature of storing the transaction data forever.

For now I don't see a definitive approach capable of solving the problem. I believe that the Merkle tree approach is very clever and can indeed work. Nevertheless, we should be aware of the pitfalls we can encounter and recognise the problems that can arise from it. My hope is that this fundamental problem is tackled quickly, because what is at stake is the future of a very promising monetary system that is on the brink of failure.
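For readers who have not met Merkle trees before, here is a minimal sketch of why they fit a distributed blockchain. It is not the proposal from the article and not Bitcoin's exact format (Bitcoin hashes serialized transactions with double SHA-256); the function names and the sample transactions are invented for illustration. The point is that a peer holding a single transaction plus a handful of sibling hashes can check it against the root without storing everything else.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(leaves):
    """Return every tree level, from the leaf hashes up to the root."""
    if not leaves:
        raise ValueError("need at least one leaf")
    level = [h(leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level = level + [level[-1]]  # odd level: pair the last hash with itself
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def merkle_root(leaves):
    return build_levels(leaves)[-1][0]


def merkle_proof(leaves, index):
    """Sibling hashes needed to recompute the root from leaf `index`."""
    proof = []
    for level in build_levels(leaves)[:-1]:
        if len(level) % 2 == 1:
            level = level + [level[-1]]
        sibling = index ^ 1  # neighbour within the same pair
        proof.append((level[sibling], sibling > index))  # (hash, sibling is on the right)
        index //= 2
    return proof


def verify(leaf, proof, root):
    node = h(leaf)
    for sibling, sibling_on_right in proof:
        node = h(node + sibling) if sibling_on_right else h(sibling + node)
    return node == root


txs = [b"alice->bob:1", b"bob->carol:2", b"carol->dave:3",
       b"dave->erin:4", b"erin->alice:5"]
root = merkle_root(txs)
proof = merkle_proof(txs, 2)

print(verify(txs[2], proof, root))        # True: the transaction belongs to the tree
print(verify(b"tampered", proof, root))   # False: altered data does not match the root
```

The proof grows with the logarithm of the number of transactions, so verification stays cheap. What the sketch does not address is the worry raised earlier: somebody in the network still has to keep each transaction around.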