Storage

In the Züs network, storage is provided by specialized entities called blobbers.

Blobber: A blobber is responsible for storing data in exchange for rewards.

Our design relies on the use of signed markers, as described in Züs Token Pools and Markers. Critically for storage, when the blobber redeems these markers on the blockchain, they also serve as a public commitment to store the data provided. Our protocol was first outlined in a DAPPCON paper [8] and in a related technical report [3].

Several distinct measures of storage must be understood for a blobber:

  • capacity: The total storage that a blobber physically offers.

  • staked capacity: The amount of capacity that is also backed by staked tokens, either from the blobber or delegates.

  • free capacity: The amount of staked capacity that has not already been purchased by a client.

  • purchased capacity: Of the staked capacity, the amount that clients have currently purchased, whether they are storing anything or not.

  • used storage: Of the purchased capacity, the amount that clients are currently using.

A stake pool stores tokens backing a specific blobber’s offer of storage. After the offer of storage expires, the stake pool tokens return to the delegates. Note that a stake pool is actually a collection of delegate pools, where each pool represents the tokens belonging to a specific delegate.

A blobber may offer additional capacity at any time. However, capacity can only be lowered if it has not already been staked. Similarly, while a delegate can request to unstake tokens at any time, the request can only be granted when it would not drop the staked capacity below the purchased capacity.

To provide resiliency, Züs uses erasure coding. When confidentiality is also needed, the data can be encrypted as well; in this case, we use proxy re-encryption (Section 4.12), allowing the data to be re-encrypted by the blobber without revealing its contents. Therefore, we use an encode-then-encrypt setup.
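To make the encode-then-encrypt flow concrete, here is a minimal sketch using the klauspost/reedsolomon Go library; the library choice and the 4+2 shard split are assumptions for illustration, not the parameters Züs actually uses.

```go
package main

import (
	"bytes"
	"fmt"

	"github.com/klauspost/reedsolomon" // a common Go erasure-coding library
)

func main() {
	// Hypothetical parameters: 4 data shards plus 2 parity shards, so any
	// 4 of the 6 blobbers suffice to reconstruct the original data.
	enc, err := reedsolomon.New(4, 2)
	if err != nil {
		panic(err)
	}

	data := bytes.Repeat([]byte("example allocation data "), 1024)

	// Split the file into 4 equally sized data shards...
	shards, err := enc.Split(data)
	if err != nil {
		panic(err)
	}
	// ...and compute the 2 parity shards.
	if err := enc.Encode(shards); err != nil {
		panic(err)
	}

	// Encode-then-encrypt: each shard would now be encrypted (e.g., via
	// proxy re-encryption) and uploaded, one shard per blobber.
	for i, shard := range shards {
		fmt.Printf("shard %d: %d bytes\n", i, len(shard))
	}
}
```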

Storage Structure

Before we describe the different actions that can be taken with storage, we first review the organization of the files that a blobber stores. The file system is organized in a Merkle-tree-like structure called a Git tree, patterned after Git’s organization of files. The root of the Git tree is stored in write markers in an AllocationRoot field. Since the write marker is signed by the client, it can be used to verify the state of the blobber’s system. The leaves of the Git tree are metadata files relating to the files being stored, allowing us to validate a variety of properties of a file against the write marker. These aspects are detailed below.

Git Trees

A Git tree is a Merkle-tree-like structure similar to those used by Git. The leaves of the Git tree are the hashes of the metadata files for the different files stored by the blobber. Any information stored in these files can thus be verified against the write marker. Directories are mappings of file names to hash values; directories are likewise named by their own hash values.

Verifying that a metadata file is contained in a Git tree is straightforward. Given the hash of the metadata file, we only need to send the path of hashes up to the root of the tree. This is essentially a Merkle path. To avoid confusion with the Merkle path of a file’s contents, we refer to this path as a Git path.
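A minimal sketch of Git path verification, assuming SHA-256 and a simple left/right sibling encoding (both are assumptions; the actual Züs wire format may differ):

```go
package storage

import (
	"bytes"
	"crypto/sha256"
)

// GitPathNode is one step of a Git path: the sibling hash at that level and
// whether the sibling sits to the left of the running hash.
type GitPathNode struct {
	Sibling []byte
	Left    bool
}

// VerifyGitPath recomputes the root from a metadata file's hash and its Git
// path, then compares it to the AllocationRoot taken from the write marker.
func VerifyGitPath(metadataHash []byte, path []GitPathNode, allocationRoot []byte) bool {
	h := metadataHash
	for _, node := range path {
		var sum [32]byte
		if node.Left {
			sum = sha256.Sum256(append(append([]byte{}, node.Sibling...), h...))
		} else {
			sum = sha256.Sum256(append(append([]byte{}, h...), node.Sibling...))
		}
		h = sum[:]
	}
	return bytes.Equal(h, allocationRoot)
}
```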

Write Marker Format

A write marker is signed by the client paying for storage, allowing us to verify that the blobber is storing the files that the client wanted stored. Write markers contain the following fields:

• AllocationRoot
• AllocationID
• BlobberID
• ClientID
• Size
• Timestamp
• ViewNumber – indicates the “version” of the data.
• Signature – the signature of the client that owns the allocation.

The AllocationRoot serves to verify the agreement between the client and the blobber on the file system contents. Since the client signed the write marker, we can be sure that it agrees. Redeeming the write marker on the blockchain serves as the blobber’s handshake agreeing to the data stored.
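Expressed as a Go struct, a write marker might look like the following; the field types are illustrative assumptions, not the actual wire or on-chain encoding.

```go
package storage

// WriteMarker mirrors the fields listed above.
type WriteMarker struct {
	AllocationRoot string // root hash of the Git tree after this write
	AllocationID   string
	BlobberID      string
	ClientID       string
	Size           int64 // bytes written
	Timestamp      int64
	ViewNumber     string // "version" of the data, e.g. "42.3" (see Uploading Data)
	Signature      string // signature of the client that owns the allocation
}
```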

Metadata Files

The metadata files are the leaves of the Git tree, and may thus be tied to the write marker. The metadata files currently contain:

• AllocationID
• Path
• Size
• ActualFileHash
• ValidationRoot – a standard Merkle tree hash of the file contents, used by a client reading data to verify that the data is correct.
• MerkleRoot – a modified Merkle tree hash used specifically for challenges; also referred to as the challenge hash or the FixedMerkleTree hash.
• ViewNumber
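As with the write marker, a hypothetical Go rendering of a metadata file's fields; types are illustrative assumptions, not the actual blobber schema.

```go
package storage

// FileMetadata mirrors the metadata fields listed above.
type FileMetadata struct {
	AllocationID   string
	Path           string // location of the file within the allocation
	Size           int64
	ActualFileHash string // hash of the original file, checked after download
	ValidationRoot string // standard Merkle root over the file contents
	MerkleRoot     string // fixed-Merkle-tree (challenge) root
	ViewNumber     string
}
```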

Initializing Blobber

When a blobber registers to provide storage, it specifies its total storage capacity, its pricing for both reads and writes, and the duration (max offer duration) for which its pricing is valid. The duration starts from the timestamp of the transaction where the offer of storage was first made.

Note that the blobber cannot offer storage immediately. First, there must be enough tokens staked to guarantee service, as discussed in Service Providers, Staking and Delegates. These tokens do not have to be staked by the blobber itself, although we expect that the blobber will provide at least some of the stake. Other clients may serve as delegates, staking tokens on behalf of the blobber and sharing in the rewards offered.

The blobber goes through the same process when it wishes to expand or reduce the storage that it offers, increasing or decreasing the amount of staked tokens needed. A blobber can specify a capacity of 0 if it wishes to stop providing storage altogether.

It should be noted that a blobber cannot abandon its existing storage agreements. The blobber must maintain those allocations until the user releases them, or until the duration of the storage offer elapses. Currently, all allocation periods are fixed to 1 year.

New Allocation

An allocation is a volume of data associated with a client, and may potentially be stored with many blobbers. To set up a new allocation, a client specifies the price range that they are willing to pay for reads and writes, the size of storage they require, and the duration that they need for the storage (specified as an expire parameter).

For each allocation and geolocation, a client must have two token pools of funds that blobbers can draw on to be rewarded for the work they have done. The pools are:

  • A write pool, used to pay blobbers for any data uploaded and stored.

  • A read pool, used to pay blobbers for any reads of the data stored.

The read pool is associated with the client’s wallet so that the client can read from any blobber. The write pool is tied to the allocation and its specific set of blobbers. When requesting a new allocation, the client must specify:

  • The price ranges that it is willing to pay for both reads and writes.

  • The size of data that it needs to store.

  • The expiration time for when that storage will no longer be required.

  • (Optionally) A list of preferred blobbers.

Uploading Data

To ensure that data stays in sync, a ViewNumber field is included in the write marker. This field is set to 0 for the initialization of the allocation and incremented for each write marker accepted by a blobber.

We also include save points in the ViewNumber field after a ‘.’; for example, “42.3” denotes version 42, save point 3.

We use a two-marker system, meaning that a blobber precommits data when it receives the corresponding write marker. When a subsequent write marker is received, it commits the precommitted data and precommits the next batch of data. As a result, a blobber never needs to store more than 2 versions of state for a file system. (Note that when a session concludes, the client should commit the last batch of precommitted data.)
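The sketch below captures the essence of the two-marker scheme; the type, names, and string-valued roots are illustrative assumptions, not the actual blobber implementation.

```go
package storage

// blobberVersions is a minimal sketch of the two-marker scheme. At most two
// versions of file-system state (committed and precommitted) exist at once.
type blobberVersions struct {
	committed    string // allocation root of the last committed version
	precommitted string // allocation root of the precommitted version, if any
}

// applyWriteMarker commits the previously precommitted version (if any) and
// precommits the state described by the newly received write marker.
func (b *blobberVersions) applyWriteMarker(newAllocationRoot string) {
	if b.precommitted != "" {
		b.committed = b.precommitted // the older version may now be discarded
	}
	b.precommitted = newAllocationRoot
}

// endSession finalizes the last precommitted batch when the client
// concludes the session and releases its lock.
func (b *blobberVersions) endSession() {
	if b.precommitted != "" {
		b.committed = b.precommitted
		b.precommitted = ""
	}
}
```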

Figure 2 shows the process of a client uploading data to its blobbers. Though our diagram only shows two blobbers, the steps are repeated for all blobbers storing the allocation’s data.

Here are the steps of the process, following the sequence diagram: In steps 1 and 3, the client requests locks from the blobbers, who respond with their version information in steps 2 and 4.

If the blobbers are ahead, the client needs to catch up first. If the blobbers are not in sync, we might need to trigger a rollback, discussed in Section 4.11.

Once the client acquires a lock with a blobber, this begins a new session.

In steps 5-8, the client sends some operations to the blobbers. Once completed, the client (steps 9 and 12) sends write markers to the blobbers. The blobbers then (steps 10 and 13) commit the previous version (if needed), and precommit the new version. The blobbers then (steps 11 and 14) reply with an acknowledgement.

At this point, the client still retains the lock and may send more operations. In steps 15-16, the client sends an additional operation to the blobbers. The client and blobber repeat the process of sending and committing write markers (steps 17-22).

Note that write marker write_marker_b1_sp1 replaces write_marker_b1_sp0, so that blobber 1 only needs to commit the second write marker to the blockchain to receive its rewards.

Also, note that the client will not send write_marker_b1_sp1 if it has not received the acknowledgement for write_marker_b1_sp0. As a result, blobber 1 and blobber 2 should never be more than one version/save point apart for their write markers.

Once the client has ensured that all blobbers have moved to the latest version, it (steps 23-24) notifies the blobbers to release the lock and to commit data from the last write marker.

On commit (steps 10, 13, 18, 21), the blobbers can delete 2 revisions back (if available), trusting that the client would not have sent a write marker if the blobbers were not in sync for steps 2 and 4. The previous state of the system should not be deleted yet, since we might need to rollback, but we should be able to assume that we won’t need to roll back 2 versions.

After step 22, the client has verified that its blobbers are in sync. In steps 23-24, the client notifies the blobbers to release the locks. The blobbers may now delete the old data safely.

Challenges

Blobbers and their delegates receive rewards for reads immediately. Writes, however, are paid through challenges, where the blobber must prove that it is storing the data it is paid to store. Token rewards for writes are transferred to a challenge pool. Tokens in this pool are not made immediately available to the blobber or its delegates; they receive those rewards only after passing a challenge proving that they are storing the data that they claim.

Outsourcing attacks, where the blobber stores its data with another storage provider, are of particular concern. Our protocol ensures that the content provided for verification is 64 kB, while the content required to create this verified content is the full file fragment. Our process is illustrated in Figure 3. The file is divided into n 64 kB fragments based on n storage servers. Each of these 64 kB fragments is further divided into 64-byte chunks, so that there are 1024 such chunks in each 64 kB block, addressable by an index of 1 to 1024. The data at each of these indexes across the blocks is treated as a continuous message and hashed. The 1024 resulting hashes serve as the leaf hashes of a Merkle tree. The root of this Merkle tree is used to roll up the file hashes further to the directory/allocation level. The Merkle proof provides the path from the leaf to the file root and from the file root to the allocation level. In this model, in order to pass a challenge for a file at a given index (between 1 and 1024), a dishonest blobber first needs to download all the content and do the chaining to construct the leaf hash. This approach discourages blobbers from outsourcing the content and faking a challenge response.

There are three hashes for a file stored with a blobber. The actual file hash is used by a client to verify the checksum of a downloaded file. This hash is the hash of the original file, and is stored in the ActualFileHash field of the file metadata.
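The following sketch reconstructs the leaf computation described above, assuming SHA-256 (the hash choice is an assumption): chunk i of every 64 kB block is chained into leaf hasher i, so answering a challenge on any one index requires the entire fragment.

```go
package storage

import (
	"crypto/sha256"
	"hash"
)

const (
	fmtBlockSize = 64 * 1024                // 64 kB blocks
	fmtChunkSize = 64                       // 64-byte chunks
	fmtNumLeaves = fmtBlockSize / fmtChunkSize // 1024 leaves
)

// fixedMerkleLeaves computes the 1024 leaf hashes of the fixed Merkle tree
// for one file fragment: chunk i (for i in 0..1023) of every 64 kB block is
// fed, in block order, into leaf hasher i.
func fixedMerkleLeaves(fragment []byte) [][]byte {
	hashers := make([]hash.Hash, fmtNumLeaves)
	for i := range hashers {
		hashers[i] = sha256.New()
	}
	for off := 0; off < len(fragment); off += fmtChunkSize {
		end := off + fmtChunkSize
		if end > len(fragment) {
			end = len(fragment) // the final block may be short
		}
		leaf := (off / fmtChunkSize) % fmtNumLeaves // chunk index within its block
		hashers[leaf].Write(fragment[off:end])
	}
	leaves := make([][]byte, fmtNumLeaves)
	for i, h := range hashers {
		leaves[i] = h.Sum(nil)
	}
	return leaves
}
```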

The validation hash is used by a client who downloads data to verify that the data is correct. It is a Merkle hash, thus allowing segments of data in a file to be validated without needing to download the entire file. This hash is stored in the ValidationRoot field of the file metadata.

Finally, the challenge hash is used to verify challenges, and is stored in the MerkleRoot field of the file metadata. Unlike the ValidationRoot, the files are hashed with a modified version of the Merkle tree called a fixed Merkle tree. This structure is designed to make challenges difficult to pass if a blobber is not storing the file locally.

Figure 4 below shows a high-level view of the payment process. A client must have funds committed to a write pool before uploading data. Then, when uploading files to the blobber, the client must include write markers along with the files. Critically, the challenge hash is used to build the AllocationRoot field of the write marker; thus, when the blobber commits those markers to the blockchain, it serves as the blobber’s commitment to the stored data. Redeeming the markers transfers them to the challenge pool. When the blobber is challenged to prove that the data is stored correctly and successfully passes the challenge, the tokens are transferred from the challenge pool to the blobber. When a new block is produced, miners will slash the stake of blobbers who either failed a challenge or have not responded within the allowed time. Every block also provides a new challenge based on the VRF (discussed in Mining on the Züs Blockchain). There are 10 validators selected from other blobbers to verify the challenge (though the blockchain may be configured to require more or fewer validators). Critically, the validators do not need any pre-existing knowledge of the data stored, since it can be verified against the write marker stored by the challenged blobber.

At a high level, the challenge protocol involves three phases:

  • Using the VRF result, a single block of a file stored by one specific blobber is selected. We refer to this stage as the challenge issuance. How is the blobber selected? Specifically, we use the VRF to randomly select a partition of the blobbers, then randomly select a blobber from that partition, then a random non-empty allocation stored by that blobber, then a random file in that allocation, and finally a random block within that file. At every step, the VRF provides the random seed. (A sketch of this selection follows the list below.)

  • In the justification phase, the blobber broadcasts the data to the validators along with the metadata needed to verify the challenge.

  • Finally, in the judgment phase, the validators share their results.
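The hierarchical selection in the challenge issuance can be sketched as follows; the helper callbacks and the reduction of the VRF output to an int64 seed are hypothetical stand-ins for the real chain-state lookups.

```go
package storage

import "math/rand"

// selectChallenge sketches the hierarchical VRF-seeded selection: partition,
// then blobber, then allocation, then file, then block.
func selectChallenge(
	vrfSeed int64,
	partitions [][]string, // blobber IDs grouped into partitions
	allocations func(blobberID string) []string, // non-empty allocations only
	files func(allocID string) []string,
	numBlocks func(fileID string) int,
) (blobberID, allocID, fileID string, block int) {
	rng := rand.New(rand.NewSource(vrfSeed))

	part := partitions[rng.Intn(len(partitions))] // 1. random partition
	blobberID = part[rng.Intn(len(part))]         // 2. random blobber within it
	allocs := allocations(blobberID)              // 3. random non-empty allocation
	allocID = allocs[rng.Intn(len(allocs))]
	fs := files(allocID)                          // 4. random file in the allocation
	fileID = fs[rng.Intn(len(fs))]
	block = rng.Intn(numBlocks(fileID))           // 5. random block within the file
	return
}
```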

We now detail the justification phase. When a file for a given index (between 1 and 1024) is challenged:

  1. The blobber generates the input to the hash function (designated as hash blocks in Figure 3).

  2. The blobber broadcasts to all validators:

• the hash block
• the Merkle path to the MerkleRoot of the file
• the corresponding file metadata
• the Git path to the file metadata
• the latest signed write marker

  3. The validators verify that the write marker is the latest one committed to the blockchain and that the signature on the write marker is valid.

  4. If the write marker is valid, the validators hash the hash block, and verify that the resulting hash and the Merkle path match the MerkleRoot field of the file metadata. They then verify that the hash of the file metadata and its Git path match the AllocationRoot field in the write marker.

For a more detailed discussion on the challenge protocol, see [3].

Time Allowed for Challenges

The challenge time (CT) is the amount of time allowed for a blobber to respond to a challenge. Since the file size may significantly affect the blobber’s time to respond to the challenge, we factor that into our formula, shown below:

CT = M × FS + K

Since it will take longer to provide the data for larger files, the challenge time equals the size of the file (FS) times a fixed multiplier (M), plus a constant time allotment (K) that represents the outer bound of the time expected for the first block to be transmitted.
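For illustration only (these constants are invented, not protocol values): with M = 0.1 s/MB and K = 60 s, a 400 MB file would be allowed CT = 0.1 × 400 + 60 = 100 seconds to respond.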

Updating Allocation

A client can change the size or expiration of an allocation. If extending the allocation (by increasing either of these values), the client must negotiate new terms. If a client reduces the size of the allocation, they may continue to use the existing terms of the allocation.

Extending Allocation

For a client to extend their allocation, they must have sufficient tokens in their write pool and the blobbers must have sufficient storage capacity. Otherwise the operation will fail.

The client will continue to pay the original rate for their first allocation, but will pay the new rate for the extended period.

Reducing/Closing Allocation

When the client reduces its allocation, it may reclaim some of its tokens. However, there is a delay allowing blobbers to still claim the tokens for the services that they have already provided. Note that any tokens in the challenge pool are not returned to the client; once they leave the write pool, they are considered to have been paid to the blobber and its delegates.

The client may cancel an allocation at any time, though they pay a penalty for doing so. If the client cancels the allocation, then the allocation is finished and the blobbers may stop storing the client’s data.

Adding/Removing Blobbers

Occasionally, a blobber may need to be replaced. This replacement might be triggered by the client who owns the data, or it could be the result of repeated failed challenges (which the client could observe). In either case, this process is initiated by the client.

First, the client writes a transaction to update their allocation to add a new blobber. At this point, the new blobber will accept writes (though it might need to queue them up). However, the new blobber won’t respond to reads until it has been able to sync up the data.

The client must acquire the data to give to the blobber. The client might already have the data cached locally. If not, they must acquire it, either by reading from the old blobber if it is still available, or by reconstructing the data from the other blobbers. The client then uploads the data to the new blobber. Note that while the client must pay for these writes, they may have previously recovered tokens from failed challenges if the old blobber was not performing adequately.

After the new blobber has been able to sync up, it writes a transaction to cash in its write markers, effectively declaring itself online. At this point, the new blobber is available for reads and challenges. Finally, the client writes a transaction to the blockchain to drop the old blobber. The old blobber will no longer be selected for reads or writes, and may safely discard the data. However, it may still redeem outstanding markers.

Reading from Allocation

Similar to how writes are handled, clients write special read markers to pay blobbers for providing data. Token Pools and Markers details the philosophy behind markers in more depth.

Read markers contain the following fields:

• ClientID – the reader of the file.

• ClientPublicKey

• BlobberID

• AllocationID

• OwnerID – the owner of the allocation.

• Timestamp

• ReadCounter – used to prevent the read marker being redeemed multiple times.

• Signature

When the ReadCounter is incremented, the price is determined by multiplying the increase in ReadCounter by the size of the block and the read price. The blobber is paid immediately when the read marker is redeemed.
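A sketch of this pricing rule; parameter names and the float token amount are illustrative, and actual settlement happens in the storage smart contract.

```go
package storage

// readPayment computes the amount owed when a read marker is redeemed:
// the increase in ReadCounter, times the block size, times the read price.
func readPayment(prevCounter, newCounter, blockSize int64, readPricePerByte float64) float64 {
	blocksRead := newCounter - prevCounter       // increase in ReadCounter
	bytesRead := blocksRead * blockSize          // total data served
	return float64(bytesRead) * readPricePerByte // tokens owed to the blobber
}
```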

The client reading the data may elect to validate the data that they receive from the blobber against the ValidationRoot field in the file metadata files. The metadata files themselves can be validated against the latest write marker. Note that the metadata files are signed, and provide some degree of validation; however, a malicious blobber could provide a stale metadata file.

The sequence diagram in Figure 5 outlines how a file can be verified while it is being downloaded. In steps 1 and 2, the client requests the file from the blobbers. In steps 3 and 4, the blobbers respond with the write marker, the Git path to the metadata file, and the metadata file. The client then (step 5) verifies the metadata files against the write markers and Git paths.

If everything is valid, the client is ready to receive data. They may now rely on the ValidationRoot values stored in the metadata files.

In steps 6 and 7, the blobbers send the first block of data from the file along with the corresponding Merkle paths to the root of the file. In step 8, the client verifies the downloaded data, checking that the blocks match the Merkle paths, and that the Merkle paths match the ValidationRoots. If anything is amiss, the misbehaving blobber can be identified.

Steps 9-11 repeat the same process, with one notable difference. Since many of the nodes in the Merkle paths may match the previously sent Merkle tree, the blobbers will only need to send the additional nodes needed for the Merkle path.

Livestreaming and Videos

Züs provides support for livestreaming, allowing a client to upload audio/video data to Züs’s network on a continuous basis so that other clients can watch it continuously. We use the M3U8 format for our livestreaming.

The client providing the data divides the livestream into chunks of a specified duration (configured to one second at the time of this writing) and uploads them to the blobbers. The client viewing the livestream downloads the chunks locally and allows the viewer to watch the livestream.

For other videos, our files may be much larger than the files provided for livestreaming. In order to allow the viewer to jump around in the video file, the client viewing the data can download 64 kB data blocks from within the file without needing to download the entire file. Once downloaded, these are converted into a byte stream.

Blobber Support for General Data Protection Regulation

In an effort to give people greater control over their personal data, the European Union introduced the General Data Protection Regulation (GDPR). The Züs network includes functionality to produce privacy reports about the usage of a customer’s data upon request. With Züs, each blobber stores usage statistics in a local database; the Züs network therefore promises a best effort, relying on the blobbers to report accurate results. This feature is optional, and a gdpr boolean flag allows smart contracts to find blobbers that support it. For blobbers that do support it, the feature is enabled for all users by default. Of course, blobbers might charge a slightly higher price for this service.

Repair Protocol

With Züs, if a client device fails mid-operation, only a minority of blobbers might have received a write marker for any given update. In the worst case, the client device might have lost the data and be unable to complete the change.

For illustration, we refer back to Figure 2. Following the sequence diagram, if blobber 1 commits (step 10), but blobber 2 does not (step 13), it is possible that blobber 1 would be a commit ahead of blobber 2. If there are not enough blobbers to reconstruct the latest state, blobber 1 would need to roll back. The process would be:

  1. The client realizes that a rollback is needed.

  2. The client sends a rollback message to the affected blobbers, indicating which version to revert to.

  3. The blobber reverts to the previous state, following Git’s approach. The newer version of files must still be stored temporarily. Note that the blobber cannot roll back two versions, but as long as the client is following the protocol, that issue should not arise. If the client does not follow the protocol, it is possible that it could corrupt its data.

Challenges and Rollbacks

If a blobber has previously committed a write marker to the blockchain and it is rolled back, it is possible that the blobber could be challenged for a file that is no longer relevant.

The blobber must store these files until a new write marker is committed to the blockchain with a version number that is the same or higher.

Proxy Re-Encryption

Proxy re-encryption (PRE) allows a user to store confidential data in the cloud without having to trust the storage provider. The data is encrypted under the data owner’s public key; when they wish to share their data, they derive a re-encryption key from their own key pair and the receiver’s public key. This re-encryption key allows the data to be re-encrypted for the receiver’s public key without ever decrypting the data. As a result, the cloud provider can convert the data without being given an opportunity to read the confidential data. We use the approach outlined in Selvi et al. [11].

Figure 6 shows how data is uploaded when using proxy re-encryption. The client first (1) erasure codes the data into fragments, with one fragment per blobber. For each blobber, the client then (2,5) generates a public/private key pair (if it does not already have a keypair associated with that storage provider). The client then (3,6) encrypts the corresponding fragment with the public key, and (4,7) sends the encrypted data to the storage provider.

For data transfer, the client requesting the data must first request the data from the client that owns the data. (For convenience, we will refer to the client owning the data as the seller and the client requesting the data as the buyer, even if the seller does not actually request any compensation for allowing access to their data.) Figure 7 shows an overview of this process. The buyer (1) requests data from the seller, specifying its public key and the details of the data desired.

For each storage provider, the seller then (2,4) calculates the re-encryption key from the buyer’s public key and the keypair associated with the blobber. The seller then (3,5) sends the re-encryption key and the ID of the buyer to the corresponding storage provider. The storage provider retains this information.

Once the initial phase is complete, the seller (6) sends a confirmation to the buyer including the list of blobbers. The buyer then (7,10) requests the data from each storage provider, specifying its ID and the data requested. Each storage provider (8,11) re-encrypts the data with the re-encryption key, sending the results to the buyer. The buyer (9,12) decrypts the fragments and (13) reconstructs the original data.
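The operations involved can be summarized by the abstract interface below; this is a hedged sketch of the roles in the flow, not the Selvi et al. [11] construction itself, and all method names and byte-slice types are illustrative.

```go
package storage

// PRE abstracts the proxy re-encryption operations used in the flow above.
type PRE interface {
	// GenerateKeyPair creates the per-blobber keypair held by the data owner.
	GenerateKeyPair() (pub, priv []byte, err error)
	// Encrypt encrypts one erasure-coded fragment before upload.
	Encrypt(pub, fragment []byte) (ciphertext []byte, err error)
	// ReKey derives a re-encryption key from the owner's keypair and the
	// buyer's public key; only the owner can compute it.
	ReKey(ownerPriv, ownerPub, buyerPub []byte) (reKey []byte, err error)
	// ReEncrypt runs on the blobber: it transforms the ciphertext for the
	// buyer without ever seeing the plaintext.
	ReEncrypt(reKey, ciphertext []byte) (reEncrypted []byte, err error)
	// Decrypt runs on the buyer with its own private key.
	Decrypt(buyerPriv, reEncrypted []byte) (plaintext []byte, err error)
}
```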
