PyTorch PPML Framework Tutorial¶
This tutorial presents a framework for developing PPML (Privacy-Preserving Machine Learning) applications with Intel SGX and Graphene. We use PyTorch as an example ML framework. However, this tutorial can be applied to other ML frameworks like OpenVINO, TensorFlow, etc.
Machine Learning (ML) is increasingly utilized in many real-world applications. ML algorithms are first trained on massive amounts of known past data and then deployed to interpret unknown future data, which allows us to forecast weather, classify images, recommend content, and so on.
As machine learning pervades our daily lives, privacy concerns emerge as one of the key issues about this technology. In this tutorial, we focus on protecting the confidentiality and integrity of the input data when the computation takes place on an untrusted platform such as a public cloud virtual machine. We also protect the model for cases where the model owner is concerned about protecting their IP. In particular, we highlight how to build the PPML framework based on PyTorch in an untrusted cloud using Intel SGX and Graphene.
In general, ML workloads have two phases: training and inference. Both can be viewed as an application that takes inputs and produces an output. Training applications take a training dataset as input and produce a trained model. Inference applications take new data and the trained model as inputs and produce the result (the prediction).
The goal of this tutorial is to show how these applications – PyTorch workloads in particular – can run in an untrusted environment (like a public cloud), while still ensuring the confidentiality and integrity of sensitive input data and the model. To this end, we use Intel SGX enclaves to isolate PyTorch’s execution to protect data confidentiality and integrity, and to provide a cryptographic proof that the program is correctly initialized and running on legitimate hardware with the latest patches. We also use Graphene to simplify the task of porting PyTorch to SGX, without any changes to the ML application and scripts.
In this tutorial, we will show the complete workflow for PyTorch running inside an SGX enclave using Graphene and its features of Secret Provisioning and Protected Files. We rely on the new ECDSA/DCAP remote attestation scheme developed by Intel for untrusted cloud environments.
To run the PyTorch application on a particular SGX platform, the owner of the SGX platform must retrieve the corresponding SGX certificate from the Intel Provisioning Certification Service, along with Certificate Revocation Lists (CRLs) and other SGX-identifying information (1). Typically, this is a part of provisioning the SGX platform in a cloud or a data center environment, and the end user can access it as a service (in other words, the end user doesn’t need to deal with the details of this SGX platform provisioning but instead uses a simpler interface provided by the cloud/data center vendor).
As a second preliminary step, the user must encrypt the input and model files with her cryptographic (wrap) key and send these protected files to the remote storage accessible from the SGX platform (2).
Next, the remote platform starts PyTorch inside of the SGX enclave. Meanwhile, the user starts the secret provisioning application on her own machine. The two machines establish a TLS connection using RA-TLS (3), the user verifies that the remote platform has a genuine up-to-date SGX processor and that the application runs in a genuine SGX enclave (4), and finally provisions the cryptographic wrap key to this remote platform (5). Note that during build time, Graphene informs the user of the expected measurements of the SGX application.
After the cryptographic wrap key is provisioned, the remote platform may start executing the application. Graphene uses Protected FS to transparently decrypt the input and the model files using the provisioned key when the PyTorch application starts (6). The application then proceeds with execution on plaintext files (7). When the PyTorch script is finished, the output file is encrypted with the same cryptographic key and saved to the cloud provider’s file storage (8). At this point, the protected output may be forwarded to the remote user who will decrypt it and analyze its contents.
Ubuntu 18.04. This tutorial should work on other Linux distributions as well, but for simplicity we provide the steps for Ubuntu 18.04 only.
Please install the following dependencies:
sudo apt install libnss-mdns libnss-myhostname
PyTorch (Python3). PyTorch is a framework for machine learning based on Python. Please install PyTorch before you proceed (don’t forget to choose Linux as the target OS). We will use Python3 in this tutorial.
Intel SGX Driver and SDK/PSW. You need a machine that supports Intel SGX and FLC/DCAP. Please follow this guide to install the Intel SGX driver and SDK/PSW. Make sure to install the driver with ECDSA/DCAP attestation.
Graphene. Follow Quick Start to build Graphene. In this tutorial, we will use both non-SGX and SGX-backed versions of Graphene. Make sure you build both Graphene loaders (Runtime/pal-Linux for the non-SGX version and Runtime/pal-Linux-SGX for the SGX version).
Executing Native PyTorch¶
We start with a very simple example script written in Python3 for PyTorch-based ML inferencing. Graphene already provides a minimalistic and insecure PyTorch example which does not have confidentiality guarantees for input/output files and does not use remote attestation. In this tutorial, we will use this existing PyTorch example as a basis and will improve it to protect all user files.
Go to the directory with Graphene’s PyTorch example:
cd <graphene repository>/Examples/pytorch
The directory contains a Python script pytorchexample.py and other relevant files. The script reads a pretrained AlexNet model and an image input.jpg, and infers the class of an object in the image. Then, the script writes the top-5 classification results to a file result.txt.
We first download and save the pre-trained AlexNet model:
This command uses the download-pretrained-model.py script to download a pretrained model and save it as a serialized file alexnet-pretrained.pt. See Saving and Loading Models in PyTorch for more details.
Now simply run the following command to run PyTorch inferencing:
This will execute native PyTorch, which writes the classification results to result.txt. The provided example image is a photo of a dog, so the output file contains “Labrador retriever” as the first result.
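The last step of the script, turning raw model scores into a ranked top-5 list, can be sketched without PyTorch. The following is a minimal illustration and not the code of pytorchexample.py: the function names, the toy scores, and the output format are assumptions made for this sketch.

```python
import math


def top_k(scores, labels, k=5):
    """Return the k most probable (label, probability) pairs."""
    # Numerically stable softmax: shift by the maximum score first.
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Rank labels by descending probability and keep the first k.
    ranked = sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


def format_results(ranked):
    """Render one 'label: probability' line per result."""
    return "".join(f"{label}: {prob:.4f}\n" for label, prob in ranked)


if __name__ == "__main__":
    # Toy scores for illustration only; a real model emits one score per class.
    labels = ["tabby", "Labrador retriever", "golden retriever",
              "beagle", "pug", "kite"]
    scores = [0.1, 6.2, 4.0, 2.5, 1.0, -3.0]
    print(format_results(top_k(scores, labels)), end="")
```

With the toy scores above, the highest-scoring label is listed first, which mirrors how the first line of result.txt names the most likely class.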
In later sections, we will run exactly the same Python script but with Graphene and inside SGX enclaves.
Executing PyTorch with Graphene¶
In the next two sections, we will run the exact same PyTorch example with Graphene. We will first run PyTorch with non-SGX Graphene (for illustrative purposes) and then with SGX-backed Graphene. Note that this part of the tutorial still only shows the non-PPML workflow where Graphene doesn’t protect input/output user files; the end-to-end PPML workflow will be described below.
The porting effort to run PyTorch in Graphene is minimal and boils down to creating a PyTorch-specific Graphene manifest file. When Graphene runs an executable, it reads a manifest file that describes the execution environment including the security posture, environment variables, dynamic libraries, arguments, and so on. In the rest of this tutorial, we will create this manifest file and explain its options and the rationale behind them. Note that the manifest file contains both general non-SGX options for Graphene and SGX-specific ones. Please refer to the Graphene manifest documentation for further details about the manifest syntax.
Executing PyTorch with non-SGX Graphene¶
Let’s run the PyTorch example using Graphene, but without an SGX enclave.
Navigate to the PyTorch example directory we examined in the previous section:
cd <graphene repository>/Examples/pytorch
Let’s take a look at the template manifest file pytorch.manifest.template (recall that PyTorch is a collection of libraries and utilities but it uses Python as the actual executable). For illustrative purposes, we will look at only a few entries of the file. Note that we can simply ignore SGX-specific keys (starting with the sgx. prefix) for our non-SGX run.
Notice that the manifest file is not secure because it propagates untrusted command-line arguments and environment variables into the enclave. We keep these work-arounds in this tutorial for simplicity, but this configuration must not be used in production:
loader.insecure__use_cmdline_argv = 1
loader.insecure__use_host_env = 1
We mount the entire <graphene repository>/Runtime/ host-level directory to the /lib directory seen inside Graphene. This trick allows standard C libraries to be transparently replaced with Graphene-patched libraries:
fs.mount.lib.type = "chroot"
fs.mount.lib.path = "/lib"
fs.mount.lib.uri = "file:$(GRAPHENEDIR)/Runtime/"
We also mount other system directories required by Python and PyTorch (they search for libraries and utility files in these directories).
Finally, we mount the path containing the Python packages installed via pip:
fs.mount.pip.type = "chroot"
fs.mount.pip.path = "$(HOME)/.local/lib"
fs.mount.pip.uri = "file:$(HOME)/.local/lib"
Now we can run make to build/copy all required Graphene files:
This command autogenerates a couple of new files:
- The actual non-SGX Graphene manifest (pytorch.manifest), generated from the template manifest file. Graphene reads the options in this file to decide how to execute PyTorch.
- A symbolic link to the generic Graphene loader (pal_loader). This is just for convenience.
Now, launch Graphene via pal_loader. You can simply append the arguments after the application path. Our example takes pytorchexample.py as an argument:
./pal_loader ./pytorch pytorchexample.py
That’s it. You have run the PyTorch example with Graphene. You can check result.txt to make sure it ran correctly.
Executing PyTorch with Graphene in SGX Enclave¶
In this section, we will learn how to use Graphene to run the same PyTorch example inside an Intel SGX enclave. Let’s go back to the manifest template (recall that the manifest keys starting with sgx. are SGX-specific syntax; these entries are ignored if Graphene runs in non-SGX mode).
Below, we will highlight some of the SGX-specific manifest options in pytorch.manifest.template. The SGX syntax is fully described in the Graphene manifest documentation.
First, consider the following SGX-specific lines in the manifest template:
sgx.trusted_files.ld = "file:$(GRAPHENEDIR)/Runtime/ld-linux-x86-64.so.2"
sgx.trusted_files.libc = "file:$(GRAPHENEDIR)/Runtime/libc.so.6"
...
sgx.trusted_files.<name> specifies a file that will be verified and trusted by the SGX enclave. Note that the key string <name> may be an arbitrary legal string (but without - and other special symbols) and does not have to be the same as the actual file name.
Trusted Files work as follows. Before Graphene runs PyTorch inside the SGX enclave, it generates the final SGX manifest file using a Graphene utility. This utility calculates the hash of each trusted file and appends it as sgx.trusted_checksum.<name> to the final SGX manifest. When running PyTorch with SGX, Graphene reads trusted files, finds their corresponding trusted checksums, and compares the checksum calculated at runtime against the expected value in the manifest.
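The runtime check described above amounts to hashing a file and comparing the digest with a recorded value. The sketch below illustrates the idea in Python. Graphene performs this check inside its own runtime, so this is not its implementation; the use of SHA-256 and the function names are assumptions of this example.

```python
import hashlib
import hmac


def file_sha256(path, chunk_size=65536):
    """Hash a file incrementally so that large files need little memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_trusted_file(path, expected_hex):
    """Return True only if the file's digest equals the recorded checksum."""
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(file_sha256(path), expected_hex.lower())
```

A file that was modified after the checksum was recorded yields a different digest, so the comparison fails and the file is rejected.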
The PyTorch manifest template also contains sgx.allowed_files.<name> entries. They specify files unconditionally allowed by the enclave:
sgx.allowed_files.pythonhome = "file:$(HOME)/.local/lib"
This line unconditionally allows all Python libraries in the path to be loaded into the enclave. Ideally, the developer needs to replace it with sgx.trusted_files for each of the dependent Python libraries.
Allowed files are not cryptographically hashed and verified. Thus, this is insecure and discouraged for production use (unless you are sure that the contents of the files are irrelevant to security of your workload). Here, we use these allowed files only for simplicity. A next tutorial on PyTorch (with Docker integration) replaces all allowed files with trusted/protected files (that tutorial is work in progress).
We have now described what the manifest template looks like and what the SGX-specific manifest entries represent. Let’s prepare all the files needed to run PyTorch in an SGX enclave:
The above command performs the following tasks:
- Generates the final SGX manifest file
- Signs the manifest and generates the SGX signature file containing SIGSTRUCT
- Creates a dummy EINITTOKEN token file pytorch.token (this file is used for backwards compatibility with SGX platforms with EPID and without Flexible Launch Control).
After running this command and building all the required files, we can simply set the SGX=1 environment variable and use pal_loader to launch the PyTorch workload inside an SGX enclave:
SGX=1 ./pal_loader ./pytorch pytorchexample.py
It will run exactly the same Python script but inside the SGX enclave. Again, you can verify that PyTorch ran correctly by examining result.txt.
End-To-End Confidential PyTorch Workflow¶
Background on Remote Attestation, RA-TLS and Secret Provisioning¶
Intel SGX provides a way for the SGX enclave to attest itself to the remote user. This way the user gains trust in the SGX enclave running in an untrusted environment, ships the application code and data, and is sure that the correct application was executed inside a genuine SGX enclave. This process of gaining trust in a remote SGX machine is called Remote Attestation (RA).
Graphene has two features that transparently add SGX RA to the application: (1) RA-TLS augments normal SSL/TLS sessions with an SGX-specific handshake callback, and (2) Secret Provisioning establishes a secure SSL/TLS session between the SGX enclave and the remote user so that the user may gain trust in the remote enclave and provision secrets to it. Secret Provisioning builds on top of RA-TLS and typically runs before the application. Both features are provided as opt-in libraries.
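The trust decision behind Secret Provisioning can be pictured as a gate: the secret is released only if the measurement reported by the enclave matches the value the user expects. The following sketch is purely illustrative. It omits verification of the SGX quote signature, which the real libraries delegate to the attestation infrastructure, and every name in it is hypothetical.

```python
import hmac
from typing import Optional


def release_secret(reported: str, expected: str, secret: bytes) -> Optional[bytes]:
    """Return the secret only when the reported measurement matches."""
    # Normalise the hex strings so that case and padding do not matter.
    reported_bytes = reported.strip().lower().encode()
    expected_bytes = expected.strip().lower().encode()
    # Constant-time comparison of the two measurements.
    if hmac.compare_digest(reported_bytes, expected_bytes):
        return secret
    return None
```

A mismatching measurement means the remote side is not running the expected enclave, so the function returns None and the wrap key never leaves the user's machine.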
The Secret Provisioning library provides a simple non-programmatic API to applications: it transparently initializes the environment variable SECRET_PROVISION_SECRET_STRING with a secret obtained from the remote user during remote attestation. In our PyTorch example, the provisioned secret is the confidential (master, or wrap) key to encrypt/decrypt user files. To inform Graphene that the obtained secret is indeed the key for file encryption, it is enough to set the environment variable SECRET_PROVISION_SET_PF_KEY.
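From the application's point of view, the provisioned secret is just an environment variable. A minimal sketch of reading it in Python follows. The PyTorch example does not need this code because Graphene consumes the key itself; the helper name and the error handling are assumptions of this sketch.

```python
import os


def get_provisioned_secret(env=None):
    """Return the secret placed in the environment by Secret Provisioning."""
    env = os.environ if env is None else env
    secret = env.get("SECRET_PROVISION_SECRET_STRING")
    if not secret:
        # Nothing was provisioned, e.g. when running outside the enclave.
        raise RuntimeError("no secret was provisioned to this process")
    return secret
```

Passing a dictionary as env makes the helper easy to exercise outside an enclave, for example get_provisioned_secret({"SECRET_PROVISION_SECRET_STRING": "test"}).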
Note that RA-TLS and Secret Provisioning work both with the EPID-based and the ECDSA/DCAP schemes of SGX remote attestation. Since this tutorial concentrates on an untrusted-cloud scenario, we use the ECDSA/DCAP attestation framework.
Background on Protected Files¶
Graphene provides a feature of Protected Files, which encrypts files and transparently decrypts them when the application reads or writes them. Integrity- or confidentiality-sensitive files (or whole directories) accessed by the application must be marked as protected files in the Graphene manifest. New files created in a protected directory are automatically treated as protected. The encryption format used for protected files is borrowed from the similar feature of Intel SGX SDK.
This feature can be combined with Secret Provisioning such that the files are encrypted/decrypted using the provisioned wrap key, as explained in the previous section.
Preparing Confidential PyTorch Example¶
In this section, we will transform our native PyTorch application into an end-to-end confidential application. We will encrypt all user files before starting the enclave, mark them as protected, let the enclave communicate with the secret provisioning server to get attested and receive the master wrap key for encryption and decryption of protected files, and finally run the actual PyTorch inference.
We will use the previous non-confidential PyTorch example as a starting point, so copy the entire PyTorch directory:
cd <graphene repository>/Examples
cp -R pytorch pytorch-confidential
We will also use the reference implementation of Secret Provisioning found under the Examples/ra-tls-secret-prov directory, so build and copy all the relevant files from there:
cd <graphene repository>/Examples/ra-tls-secret-prov
make -C ../../Pal/src/host/Linux-SGX/tools/ra-tls dcap
make dcap pf_crypt
The second line in the above snippet creates Graphene-specific DCAP libraries for preparation and verification of SGX quotes (needed for SGX remote attestation). The last line builds the required DCAP binaries and copies relevant Graphene utilities such as pf_crypt to encrypt input files.
The last line also builds the secret provisioning server secret_prov_server_dcap. We will use this server to provision the master wrap key (used to encrypt/decrypt protected input and output files) to the PyTorch enclave. See Secret Provisioning Minimal Examples for more information.
Preparing Input Files¶
The user must encrypt all input files: input.jpg, classes.txt, and alexnet-pretrained.pt. For simplicity, we re-use the already-existing files from the Examples/ra-tls-secret-prov directory. In particular, we re-use the confidential wrap key:
cd <graphene repository>/Examples/pytorch-confidential
mkdir files
cp ../ra-tls-secret-prov/files/wrap-key files/
In real deployments, the user must replace this wrap-key with her own 128-bit encryption key.
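One way to create such a key is to draw 16 random bytes from the operating system. The sketch below assumes that the wrap-key file holds the raw 16-byte key, which is an assumption about the file format rather than something stated in this tutorial; the function name is hypothetical.

```python
import os
import secrets


def write_wrap_key(path, size=16):
    """Create a new key file with `size` random bytes, private to the owner."""
    key = secrets.token_bytes(size)  # 16 bytes = 128 bits
    # O_EXCL refuses to overwrite an existing key; 0o600 keeps it private.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "wb") as f:
        f.write(key)
    return key
```

Refusing to overwrite an existing file is a deliberate choice here: replacing a key that already encrypted files would make those files unreadable.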
We also re-use the pf_crypt utility (with its libsgx_util.so helper library and required mbedTLS libraries) that encrypts/decrypts the files:
cp ../ra-tls-secret-prov/libsgx_util.so .
cp ../ra-tls-secret-prov/libmbed*.so* .
cp ../ra-tls-secret-prov/pf_crypt .
Let’s also make sure that the alexnet-pretrained.pt network-model file exists under our new directory:
Now let’s encrypt the original plaintext files. We first move these files under the plaintext/ directory and then encrypt them using the wrap key:
mkdir plaintext/
mv input.jpg classes.txt alexnet-pretrained.pt plaintext/
LD_LIBRARY_PATH=. ./pf_crypt encrypt -w files/wrap-key -i plaintext/input.jpg -o input.jpg
LD_LIBRARY_PATH=. ./pf_crypt encrypt -w files/wrap-key -i plaintext/classes.txt -o classes.txt
LD_LIBRARY_PATH=. ./pf_crypt encrypt -w files/wrap-key -i plaintext/alexnet-pretrained.pt -o alexnet-pretrained.pt
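When there are many input files, the pf_crypt invocations above can be generated in a loop. The following sketch builds the same command lines in Python using only the options shown above. The helper names are hypothetical, and encrypt_all assumes that pf_crypt and its libraries are present in the current directory.

```python
import os
import subprocess

INPUT_FILES = ["input.jpg", "classes.txt", "alexnet-pretrained.pt"]


def encrypt_command(name, key="files/wrap-key", src_dir="plaintext"):
    """Build the pf_crypt command that encrypts src_dir/name into ./name."""
    return ["./pf_crypt", "encrypt", "-w", key,
            "-i", os.path.join(src_dir, name), "-o", name]


def encrypt_all(names=INPUT_FILES):
    """Run pf_crypt once per file and stop at the first failure."""
    # pf_crypt loads its helper libraries from the current directory.
    env = dict(os.environ, LD_LIBRARY_PATH=".")
    for name in names:
        subprocess.run(encrypt_command(name), env=env, check=True)


if __name__ == "__main__":
    # Only print the commands here; call encrypt_all() to execute them.
    for name in INPUT_FILES:
        print(" ".join(encrypt_command(name)))
```

Passing the arguments as a list avoids shell quoting issues with unusual file names.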
You can verify now that the input files are encrypted. In real deployments, these files must be shipped to the remote untrusted cloud.
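A quick sanity check is that an encrypted image no longer starts with the JPEG signature. The sketch below tests for the standard JPEG start-of-image bytes. It only shows that the file content changed and says nothing about the strength of the encryption; the function name is hypothetical.

```python
import os

# Every JPEG file starts with the start-of-image marker FF D8 FF.
JPEG_MAGIC = b"\xff\xd8\xff"


def looks_like_jpeg(path):
    """Return True if the file begins with the JPEG signature."""
    with open(path, "rb") as f:
        return f.read(len(JPEG_MAGIC)) == JPEG_MAGIC


if __name__ == "__main__":
    for path in ("plaintext/input.jpg", "input.jpg"):
        if os.path.exists(path):
            print(path, "JPEG header present:", looks_like_jpeg(path))
```

The plaintext copy should report a JPEG header, while the encrypted input.jpg should not.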
Preparing Secret Provisioning¶
The user must prepare the secret provisioning server and start it. For this, copy the secret provisioning executable and its helper library from Examples/ra-tls-secret-prov to the current directory:
cp ../ra-tls-secret-prov/libsecret_prov_verify_dcap.so .
cp ../ra-tls-secret-prov/secret_prov_server_dcap .
Also, copy the server-identifying certificates so that the in-Graphene secret provisioning library can verify the provisioning server (via classical X.509 PKI):
cp -R ../ra-tls-secret-prov/certs ./
These certificates are dummy mbedTLS-provided certificates; in production, you would want to generate real certificates for your secret-provisioning server and use them.
Now we can launch the secret provisioning server:
In this tutorial, we run it locally (localhost:4433 as configured in the manifest) for simplicity. In reality, the user must run it on a trusted remote machine. In that case, loader.env.SECRET_PROVISION_SERVERS in the manifest (see below) must point to the address of the remote-user machine. We launch the server in the background.
Preparing Manifest File¶
Finally, let’s modify the manifest file. Open pytorch.manifest.template with your favorite text editor.
Replace trusted_files with protected_files for the input files:
# sgx.trusted_files.classes = "file:classes.txt"
sgx.protected_files.classes = "file:classes.txt"
# sgx.trusted_files.image = "file:input.jpg"
sgx.protected_files.image = "file:input.jpg"
# sgx.trusted_files.model = "file:alexnet-pretrained.pt"
sgx.protected_files.model = "file:alexnet-pretrained.pt"
Also, add result.txt as a protected file so that PyTorch writes the encrypted result into it:
sgx.protected_files.result = "file:result.txt"
Now, let’s add the secret provisioning library to the manifest. Append the current directory ./ to LD_LIBRARY_PATH so that PyTorch and Graphene add-ons search for libraries in the current directory:
# this instructs in-Graphene dynamic loader to search for dependencies in the current directory
loader.env.LD_LIBRARY_PATH = "/lib:/usr/lib:$(ARCH_LIBDIR):/usr/$(ARCH_LIBDIR):./"
Add the following lines to enable remote secret provisioning and allow protected files to be transparently decrypted by the provisioned key. Recall that we launched the secret provisioning server locally on the same machine, so we re-use the same certs/ directory and specify localhost. For more information on the environment variables used and other manifest options, see the Graphene documentation:
sgx.remote_attestation = 1
loader.env.LD_PRELOAD = "libsecret_prov_attest.so"
loader.env.SECRET_PROVISION_CONSTRUCTOR = "1"
loader.env.SECRET_PROVISION_SET_PF_KEY = "1"
loader.env.SECRET_PROVISION_CA_CHAIN_PATH = "certs/test-ca-sha256.crt"
loader.env.SECRET_PROVISION_SERVERS = "localhost:4433"
sgx.trusted_files.libsecretprovattest = "file:libsecret_prov_attest.so"
sgx.trusted_files.cachain = "file:certs/test-ca-sha256.crt"
The libsecret_prov_attest.so library provides the in-enclave logic to attest the SGX enclave, Graphene instance, and the application running in it to the remote secret-provisioning server. Graphene needs to locate this library, so let’s copy it to our working directory:
cp ../ra-tls-secret-prov/libsecret_prov_attest.so ./
Building and Executing End-To-End PyTorch Example¶
Now that we prepared the files and the manifest, let’s re-generate the manifest files, tokens, and signatures:
make clean
make SGX=1
It is also important to remove the file result.txt if it exists. Otherwise the Protected FS will detect the already-existing file and fail. So let’s remove it:
rm -f result.txt
We are ready to run the end-to-end PyTorch example. Notice that we didn’t change a line of code in the Python script. Moreover, we can run it with exactly the same command used in the previous section:
SGX=1 ./pal_loader ./pytorch pytorchexample.py
This should run PyTorch with encrypted input files and generate the encrypted result.txt output file. Note that we already launched the secret provisioning server on the same machine, so secret provisioning will run locally.
Decrypting Output File¶
After our protected PyTorch inference is finished, you’ll see result.txt in the directory. This file is encrypted with the same key that was used to encrypt the input files. To decrypt it, use the following command:
LD_LIBRARY_PATH=. ./pf_crypt decrypt -w files/wrap-key -i result.txt -o plaintext/result.txt
You can check the result written in plaintext/result.txt. It must be the same as in our previous runs.
When done, don’t forget to terminate the secret provisioning server: