Trust Data Sharing and Utilization Infrastructure for Sensitive Data Using Hyperledger Projects
Koshi Ikegawa.
Trust Data Sharing and Utilization Infrastructure for Sensitive Data Using Hyperledger Projects.
Open Source Summit Japan 2021 (OSSJ 2021). December 14-15 2021. Online.
Koshi Ikegawa, Hitachi, Ltd. / Research and Development Group
Wildcard Theater / Wildcard / Blockchain
#ossummit @koshi_ikegawa
❚ Name: Koshi Ikegawa
❚ Company:
  ◆ Hitachi, Ltd. / Research and Development Group
  ◆ Associate Researcher
❚ Activity:
  ◆ He joined Hitachi, Ltd. as a researcher in April 2019 and is researching and developing blockchain platforms and applications.
  ◆ He received a master's degree in engineering from the University of Tsukuba, Japan, in 2019.
  ◆ He began contributing to the Hyperledger projects in 2020.
Background | Data Free Flow with Trust (DFFT)
Data Free Flow with Trust (DFFT) is a top priority for the new economy, according to the World Economic Forum 2019.
❚ DFFT is the free and trustworthy flow of data between nations and/or organizations
  ◆ Blockchain is expected to be used as a means to realize DFFT
    • A feature of blockchains is that transactions and data content are shared and written to a ledger by all participating organizations.
❚ Some data is open and can be shared with all organizations; other data is sensitive, and who it can be shared with must be restricted.
  ◆ Such as medical data, personal information, etc.
❚ When handling sensitive data, privacy protection is required by law
  ◆ Such as the Act on the Protection of Personal Information (Japan) and GDPR (EU)
Background | Blockchain (BC)
Each blockchain platform has a different network type and consensus algorithm.

Platform             Network Type                           Consensus Algorithm
Bitcoin              Public (Permission-less)               PoW
Ethereum             Public (Permission-less)               PoW / PoS
Quorum               Private (Consortium / Permissioned)    Clique PoA / IBFT
Hyperledger Fabric   Private (Consortium / Permissioned)    Endorse & Ordering Service

❚ Network type
  ◆ There are two types of blockchains: public, in which anyone can run a node, and private, in which participation requires permission from a consortium.
❚ Consensus algorithm
  ◆ In a blockchain without a central administrator, consensus must be reached across the participating organizations in order to share the ledger with everyone on the network.
We are contributing to Hyperledger and realizing a variety of use cases.
Hyperledger Fabric | Overview
❚ Hyperledger Fabric is one of the open source private-type blockchain platforms managed by the Hyperledger Foundation
  ◆ A peer is a blockchain node owned by each organization
  ◆ The ledger is a collection of blocks, each of which packs transactions
  ◆ The State DB holds the result of executing the transactions managed by the ledger
  ◆ The orderer decides the order of the transactions and distributes the blocks
[Figure: Fabric blockchain network with Org 1, Org 2, and Org 3, each owning a peer (ledger, State DB, chaincode), an orderer, and a user with a client program]
Hyperledger Fabric | Chaincode (Smart contract)
❚ Hyperledger Fabric uses a smart contract called chaincode to perform processing on the blockchain.
  ◆ Chaincode runs transactions that may modify the data in the State DB.
  ◆ Transactions are written to the ledger after getting consensus from other organizations.
[Figure: a user's client program invokes or queries the chaincode on its organization's peer; the chaincode operates on the State DB, and consensus is requested from the peers of the other organizations and the orderer]
Hyperledger Fabric | Consensus Algorithm
The consensus algorithm of Fabric is the Endorse & Ordering Service
  ◆ ① Endorsement:
    • A peer owned by another organization tentatively executes a transaction on the chaincode, verifies that it is correct, and endorses it.
  ◆ ② Ordering:
    • The orderer collects the transactions that have received endorsement, resolves the ordering relationships, packages them into blocks, and distributes the blocks to peers.
[Figure: the client program invokes the chaincode, peers of the other organizations return ① endorsements, and the orderer performs ② ordering and distributes the block]
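The two phases above can be sketched in a few lines of C++. This is a simplified model, not Fabric code: the types `Transaction` and `Block` and the functions `endorse`, `submit`, `cut_block`, and `commit` are hypothetical names, "executing the chaincode" is reduced to checking for a non-empty key, and the required number of endorsements is an assumed parameter.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical types for illustration only; these are not Fabric APIs.
struct Transaction {
    std::string key;
    int value;
    std::vector<std::string> endorsements;  // organizations that endorsed
};

struct Block {
    std::size_t number;
    std::vector<Transaction> transactions;
};

// (1) Endorsement: a peer of another organization tentatively executes the
// transaction and endorses it if it looks correct. The "execution" here is
// only a stand-in check that the key is not empty.
bool endorse(Transaction& tx, const std::string& org) {
    if (tx.key.empty()) return false;
    tx.endorsements.push_back(org);
    return true;
}

// The client hands a transaction to the orderer only after it has collected
// the required number of endorsements (assumed policy).
bool submit(const Transaction& tx, std::size_t required,
            std::vector<Transaction>& orderer_queue) {
    assert(required > 0);
    if (tx.endorsements.size() < required) return false;
    orderer_queue.push_back(tx);
    return true;
}

// (2) Ordering: the orderer fixes the order (arrival order in this sketch),
// packages the queued transactions into a block, and empties its queue.
Block cut_block(std::vector<Transaction>& orderer_queue, std::size_t number) {
    Block block{number, orderer_queue};
    orderer_queue.clear();
    return block;
}

// Each peer applies a distributed block to its own State DB.
void commit(const Block& block, std::map<std::string, int>& state_db) {
    for (const Transaction& tx : block.transactions) {
        state_db[tx.key] = tx.value;
    }
}
```

A transaction that gathered too few endorsements never reaches the orderer queue, so it never appears in a block or in any State DB.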
Hyperledger Fabric | Channel
❚ A channel in the Hyperledger Fabric blockchain network is a private layer of communication between specific organizations, invisible to organizations that do not belong to that channel
❚ Each channel has a separate ledger that can only be read and written by the organizations participating in that channel
[Figure: Fabric blockchain network with Org 1 to Org 4 and an orderer; channel A is backed by three copies of Ledger A / Data A, and channel B by two copies of Ledger B / Data B]
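As a rough illustration of channel isolation, the sketch below models a channel as a member list plus a ledger that only members can read or write. `Channel`, `is_member`, `channel_write`, and `channel_read` are hypothetical names, and real Fabric enforces membership through its membership services and channel configuration rather than a simple set lookup.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical model of a channel: a member list plus a ledger of its own.
struct Channel {
    std::set<std::string> members;
    std::map<std::string, std::string> ledger;
};

bool is_member(const Channel& ch, const std::string& org) {
    assert(!org.empty());
    return ch.members.count(org) > 0;
}

// Only organizations that belong to the channel may write to its ledger.
bool channel_write(Channel& ch, const std::string& org,
                   const std::string& key, const std::string& value) {
    if (!is_member(ch, org)) return false;
    ch.ledger[key] = value;
    return true;
}

// Only organizations that belong to the channel may read from its ledger.
// Returns false for non-members and for unknown keys.
bool channel_read(const Channel& ch, const std::string& org,
                  const std::string& key, std::string& out) {
    if (!is_member(ch, org)) return false;
    std::map<std::string, std::string>::const_iterator it = ch.ledger.find(key);
    if (it == ch.ledger.end()) return false;
    out = it->second;
    return true;
}
```

With two `Channel` objects, an organization that is a member of both sees both ledgers, while an organization that is a member of only one cannot read or write the other.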
Hyperledger Fabric | Private Data Collection
❚ With Private Data, data to be kept secret is stored in a database outside the ledger, and the data is exchanged through direct communication between peers.
❚ The hash values of the data stored as Private Data are managed in the ledger.
[Figure: one channel with Org 1 to Org 4 and an orderer; every ledger holds the hash (#) of the data, while the private data itself is held only by some organizations and exchanged over a peer-to-peer connection]
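The split between "data off the ledger" and "hash on the ledger" can be sketched as follows. Everything here is illustrative: `Peer`, `put_private`, and `matches_ledger` are hypothetical names, and `toy_hash` is FNV-1a, chosen only because it needs no library. It is not collision resistant and stands in for the cryptographic hash a real deployment would use.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// FNV-1a, a dependency-free stand-in for a cryptographic hash.
// It is NOT collision resistant and must not be used for real integrity checks.
std::uint64_t toy_hash(const std::string& data) {
    std::uint64_t h = 0xcbf29ce484222325ULL;
    for (unsigned char c : data) {
        h ^= c;
        h *= 0x100000001b3ULL;
    }
    return h;
}

// Hypothetical peer: a private store outside the ledger plus the hashes
// that every peer on the channel records in its ledger.
struct Peer {
    bool in_collection;
    std::map<std::string, std::string> private_db;
    std::map<std::string, std::uint64_t> ledger_hashes;
};

// Every peer records the hash; only collection members receive the data itself.
void put_private(std::vector<Peer>& peers, const std::string& key,
                 const std::string& value) {
    assert(!key.empty());
    for (Peer& peer : peers) {
        peer.ledger_hashes[key] = toy_hash(value);
        if (peer.in_collection) peer.private_db[key] = value;
    }
}

// Any peer can check a value it is shown later against the hash in its ledger.
bool matches_ledger(const Peer& peer, const std::string& key,
                    const std::string& claimed) {
    std::map<std::string, std::uint64_t>::const_iterator it =
        peer.ledger_hashes.find(key);
    return it != peer.ledger_hashes.end() && it->second == toy_hash(claimed);
}
```

A peer outside the collection never stores the value, yet it can still verify a value disclosed to it later because its ledger carries the same hash as everyone else's.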
Hyperledger Fabric | Use Cases
❚ There are many logistics traceability use cases using Hyperledger Fabric
  ◆ DLT Labs & Walmart Canada: freight invoice management [1]
  ◆ Honeywell Aerospace: online parts marketplace [2]
  ◆ IBM: TradeLens container logistics solution [3]
❚ Hyperledger Fabric is also expected to be used in areas such as healthcare, where sensitive data is handled
  ◆ The functionality of Hyperledger Fabric itself may not be sufficient.
❚ Hyperledger has projects to enhance the privacy protection of Fabric.
  ◆ e.g., Hyperledger Avalon and Hyperledger Fabric Private Chaincode.
1. DLT Labs & Walmart Canada Transform Freight Invoice Management with Hyperledger Fabric (URL: https://www.hyperledger.org/learn/publications/dltlabs-case-study)
2. Honeywell Aerospace creates online parts marketplace with Hyperledger Fabric (URL: https://www.hyperledger.org/learn/publications/honeywell-case-study)
3. IBM TradeLens container logistics solution (URL: https://www.ibm.com/blockchain/container-logistics)
* These images are taken from the respective references
Hyperledger Avalon
❚ A ledger-independent implementation of the Trusted Compute Specifications published by the Enterprise Ethereum Alliance.
  ◆ Aims to enable the secure movement of blockchain processing off the main chain to dedicated computing resources.
❚ Guarantees trusted execution of a program in a protected area by means of a CPU-native security function called a Trusted Execution Environment (TEE)
[Figure: Fabric blockchain network channel with Org 1 and Org 2 peers (on-chain: ledger, State DB, chaincode) and an orderer; the user's Avalon client reaches a TEE (off-chain) through the Avalon blockchain connector, and the TEE guarantees trust in the processing done by the data processor]
Hyperledger Fabric Private Chaincode (FPC)
❚ Hyperledger Fabric Private Chaincode (FPC) enables the execution of chaincode inside a Trusted Execution Environment (TEE)
Transactions and data are encrypted.
The ledger and DB are encrypted, so users cannot view the data even if they access it directly.
Logic: If a user of a permitted organization requests the data, decrypt it and return it.
Logic: If a user of a denied organization requests the data, return an error.
[Figure: BC network channel with an orderer and the peers of Org 1, Org 2 (permitted), and Org 3 (denied); each peer holds an encrypted ledger, an encrypted State DB, and the private chaincode inside a TEE, and users connect through an FPC client]
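The permit/deny logic described above can be sketched as a small class. This is not FPC code: `PrivateChaincode`, `store`, `query`, and `raw_state` are hypothetical names, and `toy_xor` is a single-byte XOR used purely as a placeholder for the encryption that FPC performs inside the enclave. It offers no real confidentiality.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <stdexcept>
#include <string>

// Placeholder cipher: XOR with a single byte. NOT real encryption; it only
// makes the stored form differ from the plaintext in this sketch.
std::string toy_xor(const std::string& data, char key) {
    std::string out = data;
    for (char& c : out) c = static_cast<char>(c ^ key);
    return out;
}

// Hypothetical stand-in for a private chaincode with an org-level access list.
class PrivateChaincode {
public:
    PrivateChaincode(const std::set<std::string>& permitted_orgs, char key)
        : permitted_(permitted_orgs), key_(key) {
        assert(key != 0);  // a zero key would leave the data unchanged
    }

    // State is only ever stored in encrypted form.
    void store(const std::string& name, const std::string& plaintext) {
        encrypted_state_[name] = toy_xor(plaintext, key_);
    }

    // Permitted organization: decrypt and return. Denied organization: error.
    std::string query(const std::string& org, const std::string& name) const {
        if (permitted_.count(org) == 0) throw std::runtime_error("access denied");
        return toy_xor(encrypted_state_.at(name), key_);
    }

    // What someone reading the State DB directly would see.
    const std::string& raw_state(const std::string& name) const {
        return encrypted_state_.at(name);
    }

private:
    std::set<std::string> permitted_;
    char key_;
    std::map<std::string, std::string> encrypted_state_;
};
```

The point of the sketch is the placement of the check: the decision to decrypt is made by the chaincode logic itself, so direct access to the stored state yields only the encrypted form.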
Trusted Execution Environment (TEE)
A Trusted Execution Environment is a CPU security function
❚ This function is provided by CPU vendors
  ◆ such as Intel Software Guard Extensions (SGX), ARM TrustZone, AMD Secure Encrypted Virtualization (SEV), etc.
❚ Hyperledger Avalon and Fabric Private Chaincode use Intel SGX for their implementations
Intel SGX
Intel SGX [4] is a hardware security enhancement in Intel CPUs
❚ Intel SGX
  ◆ is a CPU feature that creates an encrypted area called an enclave in RAM
  ◆ loads programs and data into the enclave, allowing programs to be executed while protecting sensitive data
[Figure: the OS asks the Intel CPU to create an enclave in RAM; calculation on sensitive data takes place inside the enclave. This figure is based on [5].]
4. Intel® Software Guard Extensions (Intel® SGX). URL: https://www.intel.co.jp/content/www/jp/ja/architecture-and-technology/software-guard-extensions.html
5. Intel SGX入門 - SGX基礎知識編 ("Introduction to Intel SGX: SGX Basics", in Japanese). URL: https://qiita.com/Cliffford/items/2f155f40a1c3eec288cf
Original chaincode and private chaincode
Original Chaincode: All transactions and data are stored in the DB without encryption and can be viewed.
Fabric Private Chaincode: Private Chaincode allows control of which transactions and data can be viewed by other organizations.
[Figure: with the original chaincode, the Org 1 and Org 2 peers hold a plain ledger, State DB, and chaincode; with Fabric Private Chaincode, they hold an encrypted ledger, an encrypted State DB, and the private chaincode inside a TEE, accessed through an FPC client]
❚ Chaincode and Fabric Private Chaincode can coexist on the same channel of the same blockchain network
❚ A single ledger can be used to manage both information that is shared by all organizations participating in a channel of the blockchain network and information that is kept secret and disclosed only to a few organizations.
[Figure: on one channel, the Org 1 peer runs both the chaincode (ledger, State DB) and the private chaincode inside a TEE (encrypted ledger, encrypted State DB), accessed through an FPC client]
Fabric Private Chaincode
Use cases introduced by the Fabric Private Chaincode community
❚ Auctions [6]
  ◆ Applications that implement auctions need to be designed in a way that prevents fraudulent activities such as collusion.
  ◆ FPC can be used to execute chaincode in a TEE while keeping transactions secret.
❚ FPC for Health [7]
  ◆ Training models such as convolutional neural networks to detect precancerous lesions, aneurysms, and other brain abnormalities with high accuracy requires far more data than a single organization such as a hospital has.
  ◆ Due to GDPR, HIPAA, and other regulations, it is impossible to share CT scans and MRI images of the brain taken by radiologists.
  ◆ FPC can be used as a blockchain infrastructure on which data can be shared freely while protecting privacy.
6. New IBM and Intel Blockchain Security Feature Targets 5G Auctions, https://www.ibm.com/blogs/research/2020/03/new-ibm-and-intel-blockchain-security-feature-targets-5g-auctions/
7. FPC for Health use case, https://docs.google.com/document/d/1jbiOY6Eq7OLpM_s3nb-4X4AJXROgfRHOrNLQDLxVnsc/
Demonstration Overview
① Start Blockchain Network for FPC
② Install Simple Private Chaincode
③ Check Ledger Data using GUI Data Viewer
[Figure: ① blockchain network with Org1 (Hospital 1: permitted), Org2 (Hospital 2: data owner, with data processor and genome data storage), and Org3 (Hospital 3: denied); each peer holds an encrypted ledger, an encrypted State DB, and the ② private chaincode; users (doctors) connect through FPC clients]
Prepare a machine whose CPU supports Intel SGX.
❚ Microsoft Azure can create virtual machines (DCsv2 series) with Intel SGX
  ◆ Even if you cannot use Intel SGX, you can try FPC in simulation mode.
❚ Verification environment prepared by the presenter
  ◆ Machine: Microsoft Azure DCsv2 Series Instance (Standard_DC2s_v2)
  ◆ OS: Ubuntu Server 20.04 LTS - Gen2
  ◆ Storage: 256GB
Install the necessary packages to make Intel SGX available
❚ Required software packages
  ◆ docker
  ◆ docker-compose
  ◆ golang
  ◆ make
❚ Build and install the following
  ◆ intel/linux-sgx
    • https://github.com/intel/linux-sgx
  ◆ It is also possible to install a pre-built package.
    • https://download.01.org/intel-sgx/sgx_repo/ubuntu
Register for Intel® SGX Attestation Service Utilizing Enhanced Privacy ID
❚ Create an Intel developer account and issue an Enhanced Privacy ID (EPID)
  ◆ https://api.portal.trustedservices.intel.com/EPID-attestation
Start Blockchain Network for FPC
5. Build all required FPC components and run the integration tests
    # => bash on fpc-development-main container
    cd $FPC_PATH
    make
6. Clone fabric-samples
    cd $FPC_PATH/samples/deployment/test-network
    git clone https://github.com/hyperledger/fabric-samples
    cd $FPC_PATH/samples/deployment/test-network/fabric-samples
    curl -sSL https://bit.ly/2ysbOFE | bash -s -- 2.3.3 1.4.9 -s
7. Rewrite the network configuration
    cd $FPC_PATH/samples/deployment/test-network
    ./setup.sh
8. Start the Fabric blockchain network
    cd $FPC_PATH/samples/deployment/test-network/fabric-samples/test-network
    ./network.sh down
    ./network.sh up
    ./network.sh createChannel -c mychannel
Install Simple Private Chaincode
1. Build the private chaincode
    cd $FPC_PATH/samples/deployment/test-network
    export CC_PATH=${FPC_PATH}/samples/chaincode/$CC_ID
    make build
2. Install the chaincode
    cd $FPC_PATH/samples/deployment/test-network
    ./installFPC.sh
3. Run the chaincode
    make ercc-ecc-start
Install Simple Private Chaincode
Simple chaincode installed in this demonstration

    // storeAsset: store the given value in the State DB under the given key
    std::string storeAsset(std::string asset_name, int value, shim_ctx_ptr_t ctx)
    {
        LOG_DEBUG("HelloworldCC: +++ storeAsset +++");

        // the int is written as its raw sizeof(int) bytes
        put_state(asset_name.c_str(), (uint8_t*)&value, sizeof(int), ctx);
        return OK;
    }

    // retrieveAsset: return the value stored in the State DB under the given key
    std::string retrieveAsset(std::string asset_name, shim_ctx_ptr_t ctx)
    {
        std::string result;
        LOG_DEBUG("HelloworldCC: +++ retrieveAsset +++");

        uint32_t asset_bytes_len = 0;
        uint8_t asset_bytes[MAX_VALUE_SIZE];
        get_state(asset_name.c_str(), asset_bytes, sizeof(asset_bytes), &asset_bytes_len, ctx);

        // check that asset_name exists and holds a complete int
        if (asset_bytes_len >= sizeof(int))
        {
            // decode all sizeof(int) bytes written by storeAsset; casting
            // *asset_bytes alone would read only the first byte
            int value = 0;
            memcpy(&value, asset_bytes, sizeof(int)); // memcpy is declared in <string.h>
            result = asset_name + ":" + std::to_string(value);
        }
        else
        {
            // asset does not exist
            result = NOT_FOUND;
        }
        return result;
    }
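To try the store/retrieve logic outside an enclave, the sketch below replaces the FPC shim with an in-memory map. `StateDb`, `store_asset`, and `retrieve_asset` are hypothetical names, the `"NOT_FOUND"` string is a placeholder for whatever the chaincode's `NOT_FOUND` constant expands to, and no FPC header is used. It encodes the `int` as raw bytes in the same way as the chaincode and decodes all of them on the way back.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <map>
#include <string>
#include <vector>

// In-memory stand-in for the State DB (hypothetical; the FPC shim is not used).
typedef std::map<std::string, std::vector<std::uint8_t> > StateDb;

// Store the value as its raw sizeof(int) bytes, as the chaincode does.
void store_asset(StateDb& db, const std::string& name, int value) {
    assert(!name.empty());
    std::vector<std::uint8_t> bytes(sizeof(int));
    std::memcpy(bytes.data(), &value, sizeof(int));
    db[name] = bytes;
}

// Return "name:value", or a placeholder marker when the asset is missing.
std::string retrieve_asset(const StateDb& db, const std::string& name) {
    StateDb::const_iterator it = db.find(name);
    if (it == db.end() || it->second.size() < sizeof(int)) {
        return "NOT_FOUND";  // placeholder for the chaincode's NOT_FOUND constant
    }
    // Copy all sizeof(int) bytes back; reading only the first byte would
    // truncate any value that does not fit into a single byte.
    int value = 0;
    std::memcpy(&value, it->second.data(), sizeof(int));
    return name + ":" + std::to_string(value);
}
```

Storing 300 under a key and reading it back is a quick check that the full integer survives the round trip, which a first-byte-only decode would not pass.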
Check Ledger Data using GUI Data Viewer
1. Launch Hyperledger Explorer (GUI Data Viewer)
    cd $FPC_PATH/samples/deployment/test-network/blockchain-explorer
    docker-compose up -d
2. Invoke the following data using the "storeAsset" function of the simple private chaincode: {"key": "a", "value": "10"}
    cd $FPC_PATH/samples/application/simple-cli-go
    ./fpcclient init $CORE_PEER_ID
    ./fpcclient invoke storeAsset a 10
3. Access Hyperledger Explorer using a web browser
Fabric Private Chaincode: the invoked value is stored in an encrypted state.
Original Chaincode: the invoked value is not encrypted if the same logic is implemented in the original chaincode.
Related work
We created an infrastructure to manage and utilize genome data across multiple organizations and confirmed it in a PoC [8]
❚ Multiple organizations are participating in a blockchain network for genome data sharing
❚ Raw genome data must not be passed on to other organizations because the data is sensitive
❚ The data is analyzed on the processor of the data owner organization, and only the results are passed to other organizations
[Figure: BC network with Org1 (Hospital 1) and Org2 (Hospital 2), each with a peer (ledger, State DB, chaincode) and a client; patients' genome data is stored in Org2's genome data storage; a doctor's request goes through the chaincode to Org2's data processor, which loads the data, stores the result, and the result is returned to the requesting doctor]
8. Koshi Ikegawa, Nao Nishijima, Yoji Ozawa, Katsuhiro Fukunaka, Hironori Emaru, Masaru Hisada, Akihito Kaneko, Eiichi Araki, Ai Okada and Yuichi Shiraishi. Secure and Traceable System for Genomic Data Sharing Using Hyperledger Fabric Blockchain (in Japanese). IIBMP2020, September 2020.
The problem is that unrelated organizations can view the details of the transactions
[Figure: the same genome data sharing network as in the related work, with an additional organization OrgN that takes no part in the exchange between Org1 (Hospital 1) and Org2 (Hospital 2)]
Keep transactions and data exchange secret from unrelated organizations
[Figure: the same genome data sharing network with Org1 (Hospital 1), Org2 (Hospital 2), and OrgN]
Fabric blockchain network with multiple organizations participating
❚ Install Original Chaincode:
  ◆ ① genome data catalog chaincode
❚ Install Private Chaincode:
  ◆ ② data access permission management chaincode
  ◆ ③ task management chaincode
[Figure: BC network with Org 1 (Hospital 1), Org 2 (Hospital 2, with data processor and genome data storage), and Org N (Hospital N); each peer holds a ledger and State DB for the original chaincode and an encrypted ledger and encrypted State DB for the private chaincode; users (doctors) connect through FPC clients]
* Some components such as users and clients are omitted.
Visualize the contents of the State DB in tables

State DB: for genome data catalog chaincode
Data name    Owner    Hash Value

State DB: for data access permission management chaincode
Data name    Access Request    Access Approval

State DB: for task management by data processor
Data name    Requester    Task    Result

[Figure: BC network with Org 1 (Hospital 1), Org 2 (Hospital 2, with data processor and genome data storage), and Org N (Hospital N); each peer runs the original chaincode and the private chaincode]
* Some components such as users and clients are omitted.
1: Store genome data & metadata
Org 2 stores the raw genome data in its genome data storage and invokes the chaincode with the genome metadata.

State DB: for genome data catalog chaincode
Data name          Owner           Hash Value
Genome Data 001    Org 2 Doctor    00aa11bb22cc...

State DB: for data access permission management chaincode
Data name    Access Request    Access Approval

State DB: for task management by data processor
Data name    Requester    Task    Result

[Figure: BC network with Org 1 (Hospital 1), Org 2 (Hospital 2, with data processor and genome data storage), and Org N (Hospital N); each peer runs the original chaincode and the private chaincode]
* Some components such as users and clients are omitted.
2: Access control (1/2)
Invoke: request access right

State DB: for genome data catalog chaincode
Data name          Owner           Hash Value
Genome Data 001    Org 2 Doctor    00aa11bb22cc...

State DB: for data access permission management chaincode
Data name          Access Request    Access Approval
Genome Data 001    Org 1 Doctor

State DB: for task management by data processor
Data name    Requester    Task    Result

[Figure: BC network with Org 1 (Hospital 1), Org 2 (Hospital 2, with data processor and genome data storage), and Org N (Hospital N); each peer runs the original chaincode and the private chaincode]
* Some components such as users and clients are omitted.
2: Access control (2/2)
Invoke: genome metadata

State DB: for genome data catalog chaincode
Data name          Owner           Hash Value
Genome Data 001    Org 2 Doctor    00aa11bb22cc...

State DB: for data access permission management chaincode
Data name          Access Request    Access Approval
Genome Data 001    Org 1 Doctor      Org 1 Doctor

State DB: for task management by data processor
Data name    Requester    Task    Result

[Figure: BC network with Org 1 (Hospital 1), Org 2 (Hospital 2, with data processor and genome data storage), and Org N (Hospital N); each peer runs the original chaincode and the private chaincode]
* Some components such as users and clients are omitted.
3: Request Task (1/2)
Invoke: request task

State DB: for genome data catalog chaincode
Data name          Owner           Hash Value
Genome Data 001    Org 2 Doctor    00aa11bb22cc...

State DB: for data access permission management chaincode
Data name          Access Request    Access Approval
Genome Data 001    Org 1 Doctor      Org 1 Doctor

State DB: for task management by data processor
Data name          Requester       Task             Result
Genome Data 001    Org 1 Doctor    Analyze xxxxx

[Figure: BC network with Org 1 (Hospital 1), Org 2 (Hospital 2, with data processor and genome data storage), and Org N (Hospital N); each peer runs the original chaincode and the private chaincode]
* Some components such as users and clients are omitted.
3: Request Task (2/2)
The data processor loads the genome data, analyzes the task, and invokes the chaincode with the result.

State DB: for genome data catalog chaincode
Data name          Owner           Hash Value
Genome Data 001    Org 2 Doctor    00aa11bb22cc...

State DB: for data access permission management chaincode
Data name          Access Request    Access Approval
Genome Data 001    Org 1 Doctor      Org 1 Doctor

State DB: for task management by data processor
Data name          Requester       Task             Result
Genome Data 001    Org 1 Doctor    Analyze xxxxx    Result yyyyy

[Figure: BC network with Org 1 (Hospital 1), Org 2 (Hospital 2, with data processor and genome data storage), and Org N (Hospital N); each peer runs the original chaincode and the private chaincode]
* Some components such as users and clients are omitted.
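The three State DB tables and the steps 1 to 3 above can be modelled roughly as follows. This is an illustrative sketch, not the presented chaincode: `GenomeSharing` and its member functions are hypothetical names, each data name holds a single access request as in the tables, and the rules that only the data owner may approve access and store results are assumptions made for the sketch.

```cpp
#include <cassert>
#include <map>
#include <string>

// One row of each State DB table (field names follow the table columns).
struct CatalogEntry { std::string owner; std::string hash; };
struct Permission   { std::string requester; bool approved; };
struct Task         { std::string requester; std::string task; std::string result; };

// Hypothetical model of the three chaincodes' state, keyed by data name.
struct GenomeSharing {
    std::map<std::string, CatalogEntry> catalog;
    std::map<std::string, Permission> permissions;
    std::map<std::string, Task> tasks;

    // Step 1: the owner registers metadata; the raw data stays in its storage.
    void register_data(const std::string& name, const std::string& owner,
                       const std::string& hash) {
        assert(!name.empty());
        catalog[name] = CatalogEntry{owner, hash};
    }

    // Step 2 (1/2): a doctor requests access to cataloged data.
    bool request_access(const std::string& name, const std::string& requester) {
        if (catalog.count(name) == 0) return false;
        permissions[name] = Permission{requester, false};
        return true;
    }

    // Step 2 (2/2): assumed rule, only the data owner can approve the request.
    bool approve_access(const std::string& name, const std::string& approver) {
        std::map<std::string, Permission>::iterator it = permissions.find(name);
        if (it == permissions.end()) return false;
        if (catalog.at(name).owner != approver) return false;
        it->second.approved = true;
        return true;
    }

    // Step 3 (1/2): only the approved requester can file a task.
    bool request_task(const std::string& name, const std::string& requester,
                      const std::string& task) {
        std::map<std::string, Permission>::const_iterator it = permissions.find(name);
        if (it == permissions.end() || !it->second.approved) return false;
        if (it->second.requester != requester) return false;
        tasks[name] = Task{requester, task, ""};
        return true;
    }

    // Step 3 (2/2): the owner's data processor stores only the analysis result.
    bool store_result(const std::string& name, const std::string& owner,
                      const std::string& result) {
        std::map<std::string, Task>::iterator it = tasks.find(name);
        if (it == tasks.end() || catalog.at(name).owner != owner) return false;
        it->second.result = result;
        return true;
    }
};
```

In this model a task cannot be filed before access is approved, and a doctor of an unrelated organization can neither approve access nor file a task, which mirrors the order of the slides.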
  ◆ Explained some background on the market needs
  ◆ Gave an introduction covering how blockchain works and what we are doing with Hyperledger.
❚ Demonstration
  ◆ Showed a demonstration of starting a Fabric network, installing a simple private chaincode, and checking its operation using a GUI viewer.
❚ Motivation
  ◆ Explained the motivation for this presentation by comparing it to previous work.
❚ Design and Approach
  ◆ Explained the design and approach of the system to be implemented in this presentation.
❚ Hyperledger, Hyperledger Fabric, Hyperledger Avalon, and Hyperledger logos and any related marks are trademarks of the Linux Foundation or Hyperledger Foundation
❚ Intel, Intel SGX and Intel logos and any related marks are trademarks of Intel Corp.
❚ Other brand names and product names used in this material are trademarks, registered trademarks, or trade names of their respective holders.