Program Analysis for System Security and Reliability

VVZ: Open in Course Catalogue

Semester: Spring 2021

Number: 263-2925-00L

Lecturer: Prof. Dr. Martin Vechev

Head TA: Maximilian Baader

TAs: Mislav Balunović, Jingxuan He, Matthew Mirman, Luca Beurer-Kellner

Lecture: Tue 16-18, Zoom (Meeting ID: 975-0590-7796)

Exercise: Thu 13-14, Zoom (Meeting ID: 993-3446-7816)

Credits: 7

Overview

Security issues in modern systems (blockchains, datacenters, deep learning, etc.) result in billions in losses due to hacks and system downtime. This course introduces fundamental techniques (including automated analysis, machine learning, synthesis, zero-knowledge proofs, and their combinations) that can be applied in practice to build more secure and reliable modern systems. All of these techniques are heavily used in practice and form the basis of some of the most advanced analysis engines built by successful ETH spin-offs (ChainSecurity, acquired by PwC, and DeepCode, acquired by Snyk, a billion-dollar security company), as well as other world-class systems.

Objectives

Understand the fundamental techniques used to create modern security and reliability analysis engines that are used worldwide.
Understand how symbolic techniques are combined with machine learning (e.g., deep learning, reinforcement learning) to create new kinds of learning-based analyzers.
Understand how to quantify and fix security and reliability issues in modern deep learning models.
Understand open research questions from both theoretical and practical perspectives.

Part I: Fundamentals of Automated Security Analysis with Applications to Smart Contracts

We will introduce fundamental analysis methods: fuzzing (including combinations with reinforcement learning), symbolic execution, predicate abstraction, and Datalog.
We will show how these methods can be used to build some of the most popular, state-of-the-art automated security analysis and verification systems for blockchain smart contracts (e.g., Securify, VerX).
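
To give a flavor of one of the methods listed above, here is a minimal sketch of coverage-guided fuzzing. It is an illustration under stated assumptions, not the approach taught in the lectures: the `target` function, its hand-written coverage instrumentation, and the single-byte `mutate` strategy are all hypothetical and chosen only to keep the example small.

```python
import random


def target(data: bytes, covered: set) -> None:
    """Hypothetical program under test; raises once the input starts with b'BUG'.

    The `covered` set stands in for real coverage instrumentation (an assumption).
    """
    if len(data) > 0 and data[0] == ord("B"):
        covered.add(1)
        if len(data) > 1 and data[1] == ord("U"):
            covered.add(2)
            if len(data) > 2 and data[2] == ord("G"):
                raise RuntimeError("bug reached")


def mutate(data: bytes, rng: random.Random) -> bytes:
    """Overwrite one randomly chosen byte with a random value."""
    buf = bytearray(data)
    buf[rng.randrange(len(buf))] = rng.randrange(256)
    return bytes(buf)


def fuzz(seed: bytes, iterations: int, rng: random.Random):
    """Return a crashing input, or None if none was found within the budget."""
    corpus = [seed]   # inputs worth mutating further
    seen = set()      # branch ids reached by any input so far
    for _ in range(iterations):
        candidate = mutate(rng.choice(corpus), rng)
        covered = set()
        try:
            target(candidate, covered)
        except RuntimeError:
            return candidate
        if not covered <= seen:   # candidate reached a new branch: keep it
            seen |= covered
            corpus.append(candidate)
    return None


if __name__ == "__main__":
    print(fuzz(b"AAAA", 100_000, random.Random(0)))
```

Keeping inputs that reach new branches lets the search solve the nested checks one byte at a time, which blind random mutation would be very unlikely to do within the same budget.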

Part II: Security and Reliability of Datacenter and Network Programs

We will show how to ensure that datacenters and ISPs are secured using declarative reasoning methods (e.g., Datalog) as introduced in Part I.
We will also show how to automatically synthesize secure configurations (e.g. using SyNET and NetComplete) which lead to desirable behaviors, thus automating the job of the network operator and avoiding critical errors.
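
As a rough illustration of declarative reasoning about networks, the sketch below evaluates a two-rule Datalog-style reachability program bottom-up until a fixpoint is reached. The router names and the `link` facts are made up for the example, and a real Datalog engine would use a far more efficient evaluation strategy than this naive loop.

```python
def reachable(links):
    """Naive bottom-up evaluation of the (illustrative) Datalog program:

        reach(X, Y) :- link(X, Y).
        reach(X, Z) :- reach(X, Y), link(Y, Z).
    """
    reach = set(links)
    while True:
        # Apply the recursive rule once to all currently known facts.
        derived = {(x, z) for (x, y) in reach for (y2, z) in links if y == y2}
        if derived <= reach:   # nothing new was derived: fixpoint reached
            return reach
        reach |= derived


# Hypothetical topology: r4 -> r1 -> r2 -> r3
links = {("r4", "r1"), ("r1", "r2"), ("r2", "r3")}
reach = reachable(links)

# Example queries an operator might ask of the derived facts.
can_r4_reach_r3 = ("r4", "r3") in reach
can_r3_reach_r1 = ("r3", "r1") in reach
```

Once the `reach` relation is computed, properties such as isolation or connectivity reduce to membership checks on the derived facts.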

Part III: Machine Learning for Automated Security Analysis and Repair

We will illustrate how to automatically learn interpretable models expressed in Datalog from billions of lines of code and fixes to this code, which form the basis of new kinds of security analyzers.
We will study how to automatically learn to identify security vulnerabilities related to the handling of untrusted inputs (cross-site scripting, SQL injection, path traversal, remote code execution) from large codebases.
We will also cover how to use machine learning models in order to automatically repair software errors (essentially a step towards the machine writing code).

Part IV: Security and Reliability of Deep Learning Models

We will study (black-box) methods to quantify the robustness of large-scale deep learning models.
We will study methods to patch and fix deep learning models.
We will study methods to monitor the online behavior of deep learning models.
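
One simple way to picture black-box robustness quantification is to query the model on random perturbations of an input and count how often the prediction stays the same. The sketch below does this for a hypothetical two-dimensional linear classifier; the model, the L-infinity perturbation ball, and the uniform sampling scheme are assumptions made for illustration and are not necessarily the methods covered in the course. Note that sampling yields only an empirical estimate, not a guarantee.

```python
import random


def classify(x):
    """Hypothetical black-box model: a fixed linear classifier on 2-D inputs."""
    return 1 if 2.0 * x[0] - 1.0 * x[1] + 0.5 > 0 else 0


def empirical_robustness(model, x, eps, samples, rng):
    """Fraction of uniformly sampled points in the L-infinity ball of radius
    eps around x on which the model keeps the label it assigns to x."""
    label = model(x)
    same = 0
    for _ in range(samples):
        perturbed = [xi + rng.uniform(-eps, eps) for xi in x]
        if model(perturbed) == label:
            same += 1
    return same / samples


if __name__ == "__main__":
    rng = random.Random(0)
    x = [1.0, 0.0]
    print(empirical_robustness(classify, x, 0.1, 1000, rng))  # small perturbations
    print(empirical_robustness(classify, x, 2.0, 1000, rng))  # large perturbations
```

The function only calls `model` and never inspects its internals, which is what makes the estimate black-box.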

Course Project

The course involves a hands-on programming project in which the methods studied in class will be applied. You can work on the project in a group of at most 2 students. Registration is closed.

The description of the course project can be found here.

Deadlines:

Group registration	6PM CEST, March 28, 2021
Project announcement	6PM CEST, March 30, 2021
Preliminary deadline (optional)	6PM CEST, May 16, 2021
Preliminary feedback	6PM CEST, May 23, 2021
Final deadline	6PM CEST, June 6, 2021

Lectures

Use your NETHZ account to access the slides.

No.	Date	Content	Exercises
1	Feb 23	Introduction	No Exercise
2	Mar 2	Datalog and static analysis
3	Mar 9	Fuzzing
4	Mar 16	Linear Temporal Logic
5	Mar 23	Safety verification	No Exercise
6	Mar 30	Zkay
7	Apr 13	Network analysis
8	Apr 20	Network synthesis
9	Apr 27	Datalog at DeepCode	No Exercise
10	May 4	Machine Learning for Program Analysis	No Exercise
11	May 11	Machine Learning for Bug Detection & Fix	No Exercise
12	May 18	Black-Box Model Robustness
13	May 25	Blind Spots and Model Patching

Past exams

The exam from last year is available here.