Program Analysis for System Security and Reliability

VVZ: Open in Course Catalogue

Semester: Spring 2022

Number: 263-2925-00L

Lecturer: Prof. Dr. Martin Vechev

Head TA: Maximilian Baader

TAs: Mislav Balunović, Jingxuan He, Nikola Jovanović, Luca Beurer-Kellner

Lecture: Thu 16-18, Zoom (Meeting ID: 621 2169 7688, use your NETHZ Zoom account)

Exercise: Thu 13-14, Zoom (Meeting ID: 650 0671 5439, use your NETHZ Zoom account)

Credits: 7

Overview

Security issues in modern systems (blockchains, datacenters, deep learning, etc.) result in billions in losses due to hacks and system downtime. This course introduces fundamental techniques (ranging over automated analysis, machine learning, synthesis, zero-knowledge, differential privacy, and their combinations) that can be applied in practice to build more secure and reliable modern systems. All of these techniques are heavily used in practice and form the basis of some of the most advanced analysis engines built by successful ETH spin-offs (ChainSecurity, acquired by PwC, and DeepCode, acquired by Snyk, a billion-dollar security company), as well as other world-class systems.

Objectives

Understand the fundamental techniques used to create modern security and reliability analysis engines that are used worldwide.
Understand how symbolic techniques are combined with machine learning (e.g., deep learning, reinforcement learning) to create new kinds of learning-based analyzers.
Understand how to quantify and fix security and reliability issues in modern deep learning models.
Understand open research questions from both theoretical and practical perspectives.

Part I: Fundamentals of Automated Security Analysis with Applications to Smart Contracts

We will introduce fundamental analysis methods: fuzzing (including combinations with reinforcement learning), symbolic execution, predicate abstraction, and Datalog.
We will show how these methods can be used to build some of the most popular, state-of-the-art automated security analysis and verification systems for blockchain smart contracts (e.g., Securify, VerX).
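As a rough illustration of the Datalog-style reasoning mentioned above, the following minimal sketch evaluates a two-rule reachability program bottom-up until a fixpoint is reached. It is not taken from Securify or VerX; the function name `transitive_closure` and the data-flow facts (`user_input`, `transfer_call`, etc.) are hypothetical and chosen only for the example.

```python
def transitive_closure(edges):
    """Naive bottom-up evaluation of the Datalog program
         path(X, Y) :- edge(X, Y).
         path(X, Z) :- path(X, Y), edge(Y, Z).
    Facts are represented as a set of (source, target) tuples.
    """
    path = set(edges)  # first rule: every edge is a path
    while True:
        # second rule: extend each known path by one edge
        derived = {(x, z) for (x, y) in path for (y2, z) in edges if y == y2}
        new_facts = derived - path
        if not new_facts:  # fixpoint: no rule derives anything new
            return path
        path |= new_facts


# Hypothetical data-flow facts: "value of A may flow into B".
edges = {
    ("user_input", "amount"),
    ("amount", "balance"),
    ("balance", "transfer_call"),
}
paths = transitive_closure(edges)

# A simple security query: can untrusted input reach the sensitive call?
tainted_sink = ("user_input", "transfer_call") in paths
```

Production Datalog engines use semi-naive evaluation and indexing rather than this quadratic loop, but the derived facts are the same.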

Part II: Security and Reliability of Datacenter and Network Programs

We will show how to ensure that datacenters and ISPs are secured using declarative reasoning methods (e.g., Datalog) as introduced in Part I.
We will also show how to automatically synthesize secure configurations (e.g., using SyNET and NetComplete) that lead to desirable behaviors, thus automating the job of the network operator and avoiding critical errors.

Part III: Machine Learning for Automated Security Analysis and Repair

We will illustrate how to automatically learn interpretable models expressed in Datalog from billions of lines of code and fixes to this code, which form the basis of new kinds of security analyzers.
We will study how to automatically learn to identify security vulnerabilities related to the handling of untrusted inputs (cross-site scripting, SQL injection, path traversal, remote code execution) from large codebases.
We will also cover how to use machine learning models in order to automatically repair software errors (essentially a step towards the machine writing code).

Part IV: Security and Reliability of Machine Learning Models

We will introduce differential privacy, and systematic ways to find violations of differential privacy.
We will study (black-box) methods to quantify the robustness of large-scale deep learning models.
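To make the notion of differential privacy more concrete, here is a minimal sketch of the standard Laplace mechanism applied to a counting query. It is an illustration under stated assumptions, not course material: the names `laplace_noise` and `private_count` and the example records are hypothetical, and the sketch assumes each individual contributes at most one record, so the query has sensitivity 1.

```python
import random


def laplace_noise(scale, rng):
    """Sample from Laplace(0, scale) as the difference of two
    independent exponential variables with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng=None):
    """Return an epsilon-differentially-private count of matching records.

    Adding or removing one record changes the true count by at most 1
    (sensitivity 1), so Laplace noise with scale 1/epsilon suffices.
    """
    if rng is None:
        rng = random.Random()
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Hypothetical records: ages of individuals in a dataset.
ages = [23, 35, 41, 29, 52, 67, 38]
rng = random.Random(0)  # seeded only to make the example repeatable
noisy = private_count(ages, lambda age: age >= 40, epsilon=1.0, rng=rng)
```

Smaller values of `epsilon` give stronger privacy but noisier answers. Testing for violations of differential privacy amounts to checking empirically whether the output distributions on neighboring datasets differ by more than the claimed bound.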

Course Project

The course involves a hands-on programming project in which the methods studied in class will be applied. You can work on the project in a group of at most 2 students. If you do not have a teammate, you can choose to work alone or be matched with another student. Registration is now closed.

Project description:

Recording of the project announcement session:

Deadlines:

Group registration	6PM CEST, March 29, 2022
Project announcement	6PM CEST, March 31, 2022
Preliminary deadline (optional)	6PM CEST, May 9, 2022
Preliminary feedback	6PM CEST, May 13, 2022
Final deadline	6PM CEST, June 10, 2022

Lectures

Use your NETHZ account to access the slides. The password to access the recordings is sent in a separate email.

No.	Date	Content	Exercises
1	Feb 24	Course Introduction: Topics and Organization	No Exercise
2	Mar 3	Datalog and Static Analysis
3	Mar 10	Fuzzing
4	Mar 17	Linear Temporal Logic
5	Mar 24	Safety Verification	Course project
6	Mar 31	Zero-knowledge Proofs
7	Apr 7	Network Analysis
8	Apr 14	Network Synthesis
9	Apr 28	Datalog at DeepCode	No Exercise
10	May 5	Differential Privacy
11	May 12	Testing for Differential Privacy
12	May 19	Black Box Attacks

Past exams

A previously held exam is available here.