Paper - Occlumency: Privacy-preserving Remote Deep-learning Inference Using SGX

  • Metadata:
    • author: Taegyeong Lee, Junehwa Song, Saumay Pushp, Caihua Li, Yunxin Liu, Youngki Lee, Fengyuan Xu, Chenren Xu, Lintao Zhang
    • title: Occlumency: Privacy-preserving Remote Deep-learning Inference Using SGX
    • year: 2019

  • Essay:
    • As we discussed last week, mobile devices can massively benefit from offloading computationally heavy tasks to infrastructure. A typical use case for offloading is resource-intensive inference for deep learning models, but the need to transmit personal data like voice snippets to remote service providers raises privacy concerns.
    • One could solve this problem with Trusted Execution Environments (TEEs), which keep the data from being read by third parties on the server side; however, their tight memory limits cause significant performance degradation for deep-learning inference. The authors of this paper exploit the fact that the models themselves often do not need to be protected (many applications leverage publicly available pre-trained models) and combine this with memory-efficient convolution operations to substantially improve inference performance inside these environments; the first sketch after this essay illustrates the resulting on-demand weight-loading idea. With their implementation, "Occlumency", they heavily reduce the overhead that secure inference incurs for various deep learning models. Doing so, they ultimately show that the use of TEEs can be viable for deep learning applications.
    • I think the paper takes a look at a critical problem in the realm of machine learning that will only get more relevant as consumer-grade AI becomes increasingly prevalent. I missed the discussion of privacy issues in MAUI, so having an entire paper dedicated to the methodologies behind secure remote execution was interesting. In my opinion, the presentation of their core idea was done very well overall. They specifically focused on the central problem TEEs have with deep learning (memory) and on the property they could exploit to address it (the model weights needing no protection), developing their ideas around that. This made following the thought process throughout the paper easy.
    • Lastly, their detailed look at how they reduce the memory usage of convolution also helped my understanding of the way deep learning frameworks commonly handle memory; the second sketch after this essay illustrates the im2col trade-off they target. As my deep learning lecture is almost entirely theoretical, this paper served as a personal introduction to that topic, which I enjoyed.
    • For this paper, I found it hard to come up with much criticism, but two of the points the authors raised struck me as odd. They limited their evaluation to CNNs yet claimed that their approach is readily applicable to other models. However, seeing how they had to develop special methods for convolution to reach the memory efficiency needed to work inside SGX, I would like to see further evidence for that claim. I also found the performance comparison with conventional GPUs concerning: given that GPUs were 125x faster on some models, and assuming that the push towards big data will call for more compute and increased use of GPUs for inference, scaling Occlumency to those needs could present a considerable challenge in the future.
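    • To make the on-demand weight-loading idea concrete, the following is a minimal Python sketch of my own (not the authors' code; the weight store, digest table, and layer shapes are invented for illustration). Only one layer's weights are resident on the trusted side at any time, which bounds peak enclave memory:

      ```python
      import hashlib
      import numpy as np

      # Untrusted side: serialized per-layer weights plus their expected digests.
      # In an Occlumency-like design the digests would be provisioned securely;
      # here they are computed up front purely for the demo.
      rng = np.random.default_rng(0)
      shapes = [(16, 8), (8, 4)]
      untrusted_store = {
          f"layer{i}": rng.standard_normal(s).astype(np.float32).tobytes()
          for i, s in enumerate(shapes)
      }
      expected_digest = {k: hashlib.sha256(v).hexdigest()
                         for k, v in untrusted_store.items()}

      def enclave_inference(x):
          """Conceptually runs inside the enclave: fetch one layer's weights
          from untrusted memory, verify integrity, use them, then drop them."""
          for i, (m, n) in enumerate(shapes):
              name = f"layer{i}"
              blob = untrusted_store[name]  # copy in from untrusted memory
              if hashlib.sha256(blob).hexdigest() != expected_digest[name]:
                  raise ValueError(f"integrity check failed for {name}")
              w = np.frombuffer(blob, dtype=np.float32).reshape(m, n)
              x = np.maximum(x @ w, 0.0)  # linear layer + ReLU
              del w                       # weights no longer resident
          return x

      print(enclave_inference(np.ones((1, 16), dtype=np.float32)))
      ```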
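    • And for the convolution memory discussion, a small NumPy sketch (again my own illustration, assuming stride 1 and no padding) contrasting the standard im2col approach, which materializes the full unfolded matrix, with a partitioned variant in the spirit of Occlumency's memory-efficient convolution that only ever unfolds a few output rows at a time:

      ```python
      import numpy as np

      def conv_im2col(x, w):
          """Standard im2col convolution: builds the whole
          (C*KH*KW, OH*OW) matrix before one large matmul."""
          C, H, W = x.shape
          F, _, KH, KW = w.shape
          OH, OW = H - KH + 1, W - KW + 1
          cols = np.empty((C * KH * KW, OH * OW), dtype=x.dtype)
          for oy in range(OH):
              for ox in range(OW):
                  cols[:, oy * OW + ox] = x[:, oy:oy + KH, ox:ox + KW].ravel()
          return (w.reshape(F, -1) @ cols).reshape(F, OH, OW)

      def conv_partitioned(x, w, rows_per_chunk=4):
          """Partitioned variant: unfold and multiply only a few output
          rows at a time, shrinking the peak buffer accordingly."""
          C, H, W = x.shape
          F, _, KH, KW = w.shape
          OH, OW = H - KH + 1, W - KW + 1
          wm = w.reshape(F, -1)
          out = np.empty((F, OH, OW), dtype=x.dtype)
          for y0 in range(0, OH, rows_per_chunk):
              y1 = min(y0 + rows_per_chunk, OH)
              cols = np.empty((C * KH * KW, (y1 - y0) * OW), dtype=x.dtype)
              for oy in range(y0, y1):
                  for ox in range(OW):
                      cols[:, (oy - y0) * OW + ox] = x[:, oy:oy + KH, ox:ox + KW].ravel()
              out[:, y0:y1, :] = (wm @ cols).reshape(F, y1 - y0, OW)
          return out

      # Both paths compute the same result; only peak memory differs.
      x = np.random.rand(3, 16, 16).astype(np.float32)
      w = np.random.rand(8, 3, 3, 3).astype(np.float32)
      assert np.allclose(conv_im2col(x, w), conv_partitioned(x, w), atol=1e-4)
      ```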