
Implementing Data Privacy in Hyperledger Fabric through Private Collections

Hyperledger Fabric is a permissioned blockchain infrastructure that was introduced to solve the problem of establishing accountability in multi-party business scenarios.

Although Hyperledger’s membership model establishes the identity of every participant in the network, in real-world multi-party business scenarios data protection and privacy laws impose requirements that may prevent all data from being shared with all parties in the network.

Prior to version 1.2, privacy in Hyperledger Fabric was addressed through the concept of channels. Companies (peer nodes) that wanted to share private data with selected other organizations were joined in a separate channel. The problem with this approach is duplication of data and a proliferation of channels among groups of organizations.

Version 1.2 was a huge step forward in addressing this issue through the concept of private data collections. A collection can be considered an overlay on a channel that contains many information-sharing groups. A hash of the private data shared in a collection is still maintained in the ledger state to ensure validity and integrity. The transaction data itself is disseminated peer to peer, so that it is not exposed even to the orderer nodes.
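To make the role of the on-ledger hash concrete, here is a minimal sketch of how a party could check private data it received peer to peer against the hash that every channel member can see. It assumes the hash is a plain SHA-256 digest of the value; hashOf and matchesLedgerHash are illustrative names of mine, not Fabric APIs.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"fmt"
)

// hashOf returns the SHA-256 digest of a private value. It stands in for
// the hash recorded on the channel ledger (assumption: plain SHA-256).
func hashOf(privateValue []byte) []byte {
	sum := sha256.Sum256(privateValue)
	return sum[:]
}

// matchesLedgerHash reports whether a private value received off-ledger
// corresponds to the hash visible to all channel members.
func matchesLedgerHash(privateValue, ledgerHash []byte) bool {
	return bytes.Equal(hashOf(privateValue), ledgerHash)
}

func main() {
	private := []byte(`{"item_id":"item1","item_price":"100"}`)
	onLedger := hashOf(private)

	// The untouched value matches; a tampered one does not.
	tampered := []byte(`{"item_id":"item1","item_price":"1"}`)
	fmt.Println(matchesLedgerHash(private, onLedger))
	fmt.Println(matchesLedgerHash(tampered, onLedger))
}
```

Orgs outside the collection only ever see the hash, so they can confirm that something was agreed on without learning what it was.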

Assuming the reader already has a fair understanding of the basic Hyperledger Fabric architecture, I will cut to the chase here and share our viewpoint on implementing “Privacy through SideDBs”.

A single chaincode can define many collections. Data stored in a specific collection is visible only to the orgs that belong to that collection.

Below are the simple steps needed to see private data collections in action:

Step 1. Define collections in collection-config.json

Step 2. Separate the data into Public and Private structs [so that all orgs in the channel can see some parts of the data publicly, while other parts are hidden and shared only among certain parties]

Step 3. Add handlers in the code to save the public part of the data to the world state DB and the private part to collections.


Step 1: Define collection-config.json:

In the example below we have defined two private collections. “collections1” is used to persist data that is shared ONLY between CompanyA and CompanyB (CompanyC has no visibility into this data), while “collections2” is used for data that can be seen ONLY by CompanyA and CompanyC (CompanyB has no visibility into it).

(Note that the peer node CompanyA with its own MSP is named as ‘CompanyAMSP’ in our SAP testnet network, hence the name CompanyAMSP.member in the collections config)

[
  {
    "name": "collections1",
    "policy": "OR('CompanyAMSP.member','CompanyBMSP.member')",
    "requiredPeerCount": 0,
    "maxPeerCount": 3,
    "blockToLive": 0
  },
  {
    "name": "collections2",
    "policy": "OR('CompanyAMSP.member','CompanyCMSP.member')",
    "requiredPeerCount": 0,
    "maxPeerCount": 2,
    "blockToLive": 0
  }
]

The blockToLive parameter defines the lifetime of the data, measured in blocks, after which it is automatically purged (a provision for GDPR compliance requirements). A value of 0 means the data is never purged.
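Since a typo in this file only surfaces at instantiate time, it can help to check the file beforehand. The sketch below decodes the config shown above with the standard library and applies a few sanity checks of my own choosing (non-empty name and policy, unique names, requiredPeerCount not above maxPeerCount). CollectionConfig, parseCollections and neverPurged are hypothetical helpers, not part of Fabric.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// CollectionConfig mirrors the fields used in collection-config.json above.
type CollectionConfig struct {
	Name              string `json:"name"`
	Policy            string `json:"policy"`
	RequiredPeerCount int    `json:"requiredPeerCount"`
	MaxPeerCount      int    `json:"maxPeerCount"`
	BlockToLive       uint64 `json:"blockToLive"`
}

// parseCollections decodes the JSON array and applies basic sanity checks.
func parseCollections(raw []byte) ([]CollectionConfig, error) {
	var cfgs []CollectionConfig
	if err := json.Unmarshal(raw, &cfgs); err != nil {
		return nil, fmt.Errorf("invalid collection config: %w", err)
	}
	seen := map[string]bool{}
	for _, c := range cfgs {
		if c.Name == "" || c.Policy == "" {
			return nil, fmt.Errorf("every collection needs a name and a policy")
		}
		if seen[c.Name] {
			return nil, fmt.Errorf("duplicate collection %q", c.Name)
		}
		seen[c.Name] = true
		if c.RequiredPeerCount < 0 || c.RequiredPeerCount > c.MaxPeerCount {
			return nil, fmt.Errorf("collection %q: requiredPeerCount must be between 0 and maxPeerCount", c.Name)
		}
	}
	return cfgs, nil
}

// neverPurged reports whether the collection keeps its data indefinitely.
func neverPurged(c CollectionConfig) bool { return c.BlockToLive == 0 }

const sample = `[
  {"name": "collections1", "policy": "OR('CompanyAMSP.member','CompanyBMSP.member')",
   "requiredPeerCount": 0, "maxPeerCount": 3, "blockToLive": 0},
  {"name": "collections2", "policy": "OR('CompanyAMSP.member','CompanyCMSP.member')",
   "requiredPeerCount": 0, "maxPeerCount": 2, "blockToLive": 0}
]`

func main() {
	cfgs, err := parseCollections([]byte(sample))
	if err != nil {
		fmt.Println("config error:", err)
		return
	}
	for _, c := range cfgs {
		fmt.Printf("%s: policy=%s neverPurged=%t\n", c.Name, c.Policy, neverPurged(c))
	}
}
```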

NOTE: collection-config.json need not be zipped along with the other chaincode source files during deployment. After the chaincode is deployed to the channel, collection-config.json needs to be uploaded during the “instantiate” stage.


Step 2: Divide the data into Public and Private structs:

Let’s say every peer node belonging to the channel should be able to check the validity and authenticity of an item in the blockchain and read certain basic attributes of the item. If a company wants to share some information about the item, say manufacturer_country and price, only with certain companies based on some contractual agreement, and keep it hidden from the others, then we need to divide and conquer the data struct like so:

// Visible to every org in the channel (stored in the world state DB)
type Item_Public struct {
	ItemID      string `json:"item_id"`
	Description string `json:"description"`
	Owner       string `json:"owner"`
	Status      int    `json:"status"`
	Components  string `json:"components"`
}

// Visible only to the members of a collection (stored in the SideDB)
type Item_Private struct {
	ItemID               string `json:"item_id"`
	Manufacturer_Country string `json:"manufacturer_country"`
	Price                string `json:"item_price"`
}

Please note that since the item_id/key represents the unique item, it is good practice to use the same key in the normal channel data and in the private DB, so that the two halves of the record can always be matched and data integrity is preserved.
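The split can be sketched outside of chaincode with nothing but encoding/json. The types below repeat the same JSON field names under illustrative Go names of my own (itemPublic, itemPrivate, fullItem), and split is a hypothetical helper that produces the two payloads carrying the same item_id.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type itemPublic struct {
	ItemID      string `json:"item_id"`
	Description string `json:"description"`
	Owner       string `json:"owner"`
	Status      int    `json:"status"`
	Components  string `json:"components"`
}

type itemPrivate struct {
	ItemID              string `json:"item_id"`
	ManufacturerCountry string `json:"manufacturer_country"`
	Price               string `json:"item_price"`
}

// fullItem is a hypothetical input holding every attribute of an item.
type fullItem struct {
	ItemID, Description, Owner, Components string
	Status                                 int
	ManufacturerCountry, Price             string
}

// split returns the public and private JSON payloads for one item.
// Both payloads carry the same item_id so they can be matched later.
func split(in fullItem) (pub, priv []byte, err error) {
	pub, err = json.Marshal(itemPublic{
		ItemID: in.ItemID, Description: in.Description,
		Owner: in.Owner, Status: in.Status, Components: in.Components,
	})
	if err != nil {
		return nil, nil, err
	}
	priv, err = json.Marshal(itemPrivate{
		ItemID: in.ItemID, ManufacturerCountry: in.ManufacturerCountry, Price: in.Price,
	})
	if err != nil {
		return nil, nil, err
	}
	return pub, priv, nil
}

func main() {
	pub, priv, err := split(fullItem{
		ItemID: "item1", Description: "Gearbox", Owner: "CompanyA",
		Components: "c1,c2", Status: 1, ManufacturerCountry: "DE", Price: "100",
	})
	if err != nil {
		fmt.Println("split failed:", err)
		return
	}
	fmt.Printf("world state: %s\n", pub)
	fmt.Printf("collection:  %s\n", priv)
}
```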


Step 3: Implement code to handle private collections:

For eg: below is a simple addItem function that stores the public part in the world state DB and the private part in the collection/SideDB.

[Please note, I am just storing the private data in collections1 in this code. We can make the code dynamic so that it determines which collection to store to based on any specific business logic, such as the item-id, the owner etc.]

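// Needs the standard packages net/http, strconv and strings in addition to
// the Fabric shim, peer and msp packages and protobuf's proto package.
func (t *GenericChaincode) addItem(stub shim.ChaincodeStubInterface, args []string) peer.Response {
	// Expected args: id, description, status, components, manufacturer country, price.
	// Note: arguments are recorded in the transaction itself; for truly private values
	// consider passing them via the transient field (stub.GetTransient()) instead.
	if len(args) != 6 {
		return Error(http.StatusBadRequest, "Expected 6 arguments")
	}
	id := strings.ToLower(args[0])

	// Get the identity that submitted the transaction
	creator, err := stub.GetCreator()
	if err != nil {
		return Error(http.StatusInternalServerError, "Could not get the creator, Error "+err.Error())
	}
	// Unmarshal the creator from []byte into a SerializedIdentity
	sId := &msp.SerializedIdentity{}
	if err := proto.Unmarshal(creator, sId); err != nil {
		return Error(http.StatusInternalServerError, "Could not deserialize a SerializedIdentity, Error "+err.Error())
	}
	// Strip the "MSP" suffix from the MSP ID (for eg: 'CompanyAMSP' becomes 'CompanyA')
	nodeId := strings.TrimSuffix(sId.Mspid, "MSP")

	// Status is an int in Item_Public, so convert the string argument
	status, err := strconv.Atoi(args[2])
	if err != nil {
		return Error(http.StatusBadRequest, "Status must be an integer, Error "+err.Error())
	}

	// Form the Public Asset struct
	item := &Item_Public{ItemID: args[0], Description: args[1],
		Owner:      nodeId,
		Status:     status,
		Components: args[3]}

	// Reject the item if it already exists in the collection
	value, err := stub.GetPrivateData("collections1", id)
	if err != nil {
		return shim.Error("Could not read collection: " + err.Error())
	}
	if value != nil {
		return shim.Error("This item already exists in collection: " + id)
	}

	// First save public struct in World State DB
	if err := stub.PutState(id, item.ToJSON()); err != nil {
		return Error(http.StatusInternalServerError, err.Error())
	}

	// Now put private struct in SideDB / private collection
	item_private := &Item_Private{ItemID: args[0], Manufacturer_Country: args[4], Price: args[5]}
	if err := stub.PutPrivateData("collections1", id, item_private.ToJSON_Private()); err != nil {
		return shim.Error(err.Error())
	}

	return Success(http.StatusCreated, "Private Item Added", nil)
}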
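As a rough illustration of making the collection choice dynamic, the sketch below picks the collection shared by the item owner and the partner the data is meant for. The collectionMembers map simply restates the two policies from collection-config.json by hand, and collectionFor is a hypothetical helper; a real chaincode would plug in whatever business rule applies.

```go
package main

import (
	"fmt"
	"sort"
)

// collectionMembers lists, per collection, the orgs allowed to see its data.
// It is a hand-written copy of the policies in collection-config.json.
var collectionMembers = map[string][]string{
	"collections1": {"CompanyA", "CompanyB"},
	"collections2": {"CompanyA", "CompanyC"},
}

// contains reports whether org is in the list.
func contains(orgs []string, org string) bool {
	for _, o := range orgs {
		if o == org {
			return true
		}
	}
	return false
}

// collectionFor returns the first collection (in name order) that both
// owner and partner belong to, or an error if they share none.
func collectionFor(owner, partner string) (string, error) {
	names := make([]string, 0, len(collectionMembers))
	for name := range collectionMembers {
		names = append(names, name)
	}
	// Go map iteration order is random; sort so every peer picks the same collection.
	sort.Strings(names)
	for _, name := range names {
		members := collectionMembers[name]
		if contains(members, owner) && contains(members, partner) {
			return name, nil
		}
	}
	return "", fmt.Errorf("no collection shared by %s and %s", owner, partner)
}

func main() {
	for _, partner := range []string{"CompanyB", "CompanyC"} {
		name, err := collectionFor("CompanyA", partner)
		fmt.Println(partner, name, err)
	}
	_, err := collectionFor("CompanyB", "CompanyC")
	fmt.Println(err)
}
```

The sort is there on purpose: chaincode has to produce the same result on every endorsing peer, so any lookup over a Go map should be made deterministic.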

Most shim API methods that are available for accessing normal channel data are available for private collections too.

Channel data (world state)    Private data collection
GetState(id)                  GetPrivateData(COLLECTION_NAME, id)
PutState(id, value)           PutPrivateData(COLLECTION_NAME, id, value)
GetQueryResult(queryStr)      GetPrivateDataQueryResult(COLLECTION_NAME, queryStr)
DelState(id)                  DelPrivateData(COLLECTION_NAME, id)
GetHistoryForKey(id)          <Not yet available>
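Because the two families of calls mirror each other, handler logic can be exercised without a running network by coding against a small interface. The sketch below is an in-memory stand-in of my own (ledger and fakeLedger are hypothetical, not the shim's mock stub) that copies the get/put/delete signatures and shows that data written to one collection is not visible in another or in the world state.

```go
package main

import (
	"errors"
	"fmt"
)

// ledger is a hypothetical subset of the stub methods listed above.
type ledger interface {
	GetState(key string) ([]byte, error)
	PutState(key string, value []byte) error
	DelState(key string) error
	GetPrivateData(collection, key string) ([]byte, error)
	PutPrivateData(collection, key string, value []byte) error
	DelPrivateData(collection, key string) error
}

// fakeLedger keeps world state and each collection in separate maps.
type fakeLedger struct {
	public  map[string][]byte
	private map[string]map[string][]byte
}

var errEmptyKey = errors.New("key must not be empty")

func newFakeLedger() *fakeLedger {
	return &fakeLedger{public: map[string][]byte{}, private: map[string]map[string][]byte{}}
}

// GetState returns nil for a missing key, without an error.
func (l *fakeLedger) GetState(key string) ([]byte, error) { return l.public[key], nil }

func (l *fakeLedger) PutState(key string, value []byte) error {
	if key == "" {
		return errEmptyKey
	}
	l.public[key] = value
	return nil
}

func (l *fakeLedger) DelState(key string) error {
	delete(l.public, key)
	return nil
}

// GetPrivateData returns nil for an unknown collection or key.
func (l *fakeLedger) GetPrivateData(collection, key string) ([]byte, error) {
	return l.private[collection][key], nil
}

func (l *fakeLedger) PutPrivateData(collection, key string, value []byte) error {
	if key == "" {
		return errEmptyKey
	}
	if l.private[collection] == nil {
		l.private[collection] = map[string][]byte{}
	}
	l.private[collection][key] = value
	return nil
}

func (l *fakeLedger) DelPrivateData(collection, key string) error {
	delete(l.private[collection], key)
	return nil
}

// Compile-time check that fakeLedger satisfies the interface.
var _ ledger = (*fakeLedger)(nil)

func main() {
	var l ledger = newFakeLedger()
	_ = l.PutState("item1", []byte(`{"owner":"CompanyA"}`))
	_ = l.PutPrivateData("collections1", "item1", []byte(`{"item_price":"100"}`))

	pub, _ := l.GetState("item1")
	priv, _ := l.GetPrivateData("collections1", "item1")
	other, _ := l.GetPrivateData("collections2", "item1")
	fmt.Printf("public=%s private=%s other-collection-empty=%t\n", pub, priv, other == nil)
}
```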

Putting the above bits of info together, we have a simple chaincode implementation demonstrating private data usage:

A simple implementation : https://github.wdf.sap.corp/i863312/PrivateCollections 


Challenges in 1.3 and enhancements in 1.4 w.r.t. Private Data:

Up until version 1.3, if more granular access control was needed (for eg: at the transaction level), we had to implement our own access control logic as part of the chaincode. This was needed because peer selection under the channel-wide endorsement policy is non-deterministic.

Fabric version 1.4 supports both chaincode-based and state-based endorsement policies. This makes tighter, transaction-level access control possible.
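To give an idea of what such hand-rolled access control logic can look like, here is a deliberately simplified sketch that checks whether the caller's MSP ID is named in a collection policy string. It ignores the AND/OR structure of the policy and only extracts the principals, which is enough for the plain OR(...) policies used in this post; mspIDs and mayRead are hypothetical helpers, and Fabric's own policy evaluation is considerably richer.

```go
package main

import (
	"fmt"
	"regexp"
)

// principal matches entries such as 'CompanyAMSP.member' in a policy string.
var principal = regexp.MustCompile(`'([^']+)\.(member|admin|client|peer)'`)

// mspIDs extracts the MSP IDs named in a policy string, in order of appearance.
func mspIDs(policy string) []string {
	var ids []string
	for _, m := range principal.FindAllStringSubmatch(policy, -1) {
		ids = append(ids, m[1])
	}
	return ids
}

// mayRead reports whether callerMSP is named anywhere in the policy.
// Simplification: the AND/OR nesting of the policy is not evaluated.
func mayRead(callerMSP, policy string) bool {
	for _, id := range mspIDs(policy) {
		if id == callerMSP {
			return true
		}
	}
	return false
}

func main() {
	policy := "OR('CompanyAMSP.member','CompanyBMSP.member')"
	for _, caller := range []string{"CompanyAMSP", "CompanyBMSP", "CompanyCMSP"} {
		fmt.Printf("%s may read collections1: %t\n", caller, mayRead(caller, policy))
	}
}
```

In chaincode, the caller's MSP ID would come from the SerializedIdentity obtained via GetCreator(), as in the addItem function above.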

 
