The objective of the seminar is to:

  • Introduce students to the field of Deep Learning for Big Code.
  • Show how machine learning models can solve practical software engineering and programming challenges that are beyond the reach of traditional methods.
  • Highlight the latest research and work opportunities in industry and academia available on this topic.

The seminar consists of student presentations, two per lecture, each on a paper chosen from the list below. The grade is based on the presentation, the handling of questions and answers, and participation.


20.02. Introduction to the seminar (topics, objectives, structure): Veselin Raychev PDF
06.03. Counterfactual Explanations for Models of Code Marco Max
SpreadsheetCoder: Formula Prediction from Semi-structured Context Laine Mislav
13.03. Adversarial Robustness for Code Yuhao Mark
Jigsaw: Large language models meet program synthesis Paula Nikola
20.03. Break-It-Fix-It: Unsupervised Learning for Program Repair Nic Max
Getafix: Learning to fix bugs automatically Ivan Mislav
27.03. TypeWriter: Neural type prediction with search-based validation Ahmet Mislav
Unsupervised translation of programming languages Ionut Veselin
03.04. Competition-level code generation with AlphaCode George Marc
Repair is nearly generation: Multilingual program repair with LLMs Timothe Nikola
24.04. Code Prediction by Feeding Trees to Transformers Alexander Marc
Typilus: Neural Type Hints Cedric Max
08.05. NL2Type: inferring JavaScript function types from natural language information Roxana Stiuca Timon
Synchromesh: Reliable code generation from pre-trained language models Manuel Nikola
15.05. Big Code != Big Vocabulary: Open-vocabulary models for source code Xiaoyuan Timon