Dec 14, 2017

## How Bluzelle manages nodes to provide optimal database storage

Bluzelle replicates data within swarms to the minimum extent necessary to achieve 100% reliability, and thereby optimizes node farming resources to achieve maximum scalability.
By Neeraj Murarka

Bluzelle has three metrics that enable it to mitigate key database pain points:

- Performance.

- Reliability.

- Scalability.

Performance is always a critical metric to maximize, but here we discuss how Bluzelle uses farming resources (i.e., nodes) to maximize the latter two metrics specifically. Consider the following setup:

- We have a total of n nodes across all swarms.

- Each node has at least d units of storage space. This per-node minimum is required to become a farmer and is enforced via a proof of resources. Because data is replicated across the nodes of a swarm, d is therefore also the amount of space each swarm can store.

- We need at most x nodes in a swarm to reach a level of redundancy that effectively guarantees that the swarm is indestructible. “Effectively” is used here for mathematical correctness: the probability p of ALL x nodes in a swarm going down at the same time (and bringing the swarm down with them) only reaches zero in the limit as x tends to infinity. Given a high enough value of x, however, p is astronomically low and can be treated as 0 for all practical purposes.

- The number of swarms s is given by the formula s = n/x.

- The amount of space t that Bluzelle as a whole can store is therefore given by the formula t = sd.
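The definitions above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not Bluzelle code: the function names and the sample values of n, x, and d are assumptions, and the swarm count is rounded down to whole swarms, which the formula n/x does not specify.

```python
def swarm_count(n: int, x: int) -> int:
    """Number of swarms s = n / x, rounded down to whole swarms (an assumption)."""
    if n < 0 or x <= 0:
        raise ValueError("n must be non-negative and x must be positive")
    return n // x


def total_capacity(n: int, x: int, d: float) -> float:
    """Total storage t = s * d, where each swarm stores d units of data."""
    return swarm_count(n, x) * d


if __name__ == "__main__":
    # Illustrative values only: 1000 nodes, 10 nodes per swarm, 50 units per node.
    n, x, d = 1000, 10, 50.0
    print(swarm_count(n, x))        # 100 swarms
    print(total_capacity(n, x, d))  # 5000.0 units of total storage
```

Halving x to 5 in this example doubles the number of swarms, and with it the total capacity.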

With these points in mind, we want to minimize x while still having an effective probability p of 0 of any swarm coming down. Minimizing x maximizes t, increasing the supply of inventory (database space) available, and reducing the cost to the consumer.
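One way to make "an effective probability of 0" concrete is to pick a tolerance and find the smallest x that meets it. The sketch below assumes that nodes fail independently, each with the same probability q of being down at a given moment, so that p = q^x. Neither q nor the tolerance epsilon appears in the article; both are hypothetical parameters introduced for illustration.

```python
def swarm_failure_probability(q: float, x: int) -> float:
    """Probability that all x nodes are down at once, assuming independent failures."""
    return q ** x


def min_swarm_size(q: float, epsilon: float) -> int:
    """Smallest x for which q ** x <= epsilon.

    q       -- assumed per-node probability of being down (0 < q < 1)
    epsilon -- assumed tolerance below which p is treated as "effectively 0"
    """
    if not 0 < q < 1:
        raise ValueError("q must be strictly between 0 and 1")
    if not 0 < epsilon < 1:
        raise ValueError("epsilon must be strictly between 0 and 1")
    x = 1
    # q < 1, so q ** x shrinks toward zero and the loop always terminates.
    while swarm_failure_probability(q, x) > epsilon:
        x += 1
    return x


if __name__ == "__main__":
    # Illustrative values only: nodes that are down half the time, tolerance of 0.001.
    print(min_swarm_size(0.5, 0.001))
```

Under these assumptions, more reliable nodes (a smaller q) reach the same tolerance with a smaller x, which in turn leaves more nodes available to form additional swarms.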

On average, we would have approximately x nodes in a swarm. Adding more than x nodes to any swarm is sub-optimal: it does not meaningfully improve redundancy, because the probability p that the swarm will come down is already effectively 0 with the existing x nodes.

Since the number of swarms s is expressed as n/x, by minimizing x for a fixed value of n, we maximize the number s of swarms in Bluzelle.

By maximizing the number s of swarms, we also maximize the amount of space t in Bluzelle as a whole, for a fixed value d representing the minimum required storage space in a node.
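The two steps above can be combined into a single expression. Substituting s = n/x into t = sd shows directly that t falls as x grows. Treating x as a continuous quantity is a simplification made here only so that the derivative can be written down:

```latex
\[
  t = s\,d = \frac{n}{x}\,d = \frac{n d}{x},
  \qquad
  \frac{\partial t}{\partial x} = -\frac{n d}{x^{2}} < 0
  \quad \text{for } n, d, x > 0 .
\]
```

For fixed n and d, every reduction in x therefore increases t, which is why the smallest x that still keeps p effectively at 0 is the optimal choice.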

Compared to a distributed cluster of redundant nodes on a cloud platform, the Bluzelle nodes in a given swarm are geographically widespread and therefore far less vulnerable. There are two reasons for this. First, a geographically isolated outage affects only a small part of the swarm (i.e., the nodes that happen to be in that geography). Second, swarm nodes are spread across many different data centers and infrastructures. This is why the value of x in the context of Bluzelle is significantly lower than it is in a cloud environment.

By reducing the number of nodes in a swarm to the minimum value of x necessary to achieve a probability p of effectively 0 of a swarm going down, we maximize the overall storage capacity t of Bluzelle. This achieves our goal of maximized reliability and scalability. As new nodes join the network, they will ultimately spill over into newly born swarms that dilute out the existing data and increase the overall storage capacity of Bluzelle.
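The spill-over behavior can be pictured with a small sketch. The article does not describe how Bluzelle actually assigns nodes to swarms, so the policy below (fill the first swarm that has fewer than x nodes, otherwise start a new one) and the name `add_node` are purely hypothetical. A real system would also have to decide how to treat a newly born swarm that has not yet reached x nodes.

```python
from typing import List


def add_node(swarms: List[List[str]], node_id: str, x: int) -> int:
    """Place a node in the first swarm with room, or spill over into a new swarm.

    Hypothetical assignment policy for illustration; returns the swarm index used.
    """
    if x <= 0:
        raise ValueError("x must be positive")
    for index, swarm in enumerate(swarms):
        if len(swarm) < x:
            swarm.append(node_id)
            return index
    # Every existing swarm already has x nodes, so a new swarm is born.
    swarms.append([node_id])
    return len(swarms) - 1


if __name__ == "__main__":
    swarms: List[List[str]] = []
    for i in range(7):
        add_node(swarms, f"node-{i}", x=3)
    print([len(swarm) for swarm in swarms])  # [3, 3, 1]
```

Each swarm that fills up to x nodes adds another d units to the total capacity, so the network grows in storage rather than in surplus redundancy.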

In short, Bluzelle replicates data within swarms to the minimum extent necessary to achieve effectively 100% reliability, and thereby optimizes node farming resources to achieve maximum scalability.
