Course syllabus

DAT675 Artificial Intelligence for Molecules
LP3 VT26 (7.5 hp)

Course offered by the Department of Computer Science and Engineering


Contact Details

Role                Name                                       Email
Lecturer/Examiner   Assistant Professor Rocío Mercado Oropeza  rocio.mercado@chalmers.se
Teaching Assistant  Stefano Ribes                              ribes@chalmers.se

Course Purpose

This course provides a comprehensive introduction to the transformative role of AI in molecular sciences. Structured into four modules, it covers molecular representations, property prediction, and generative AI techniques, culminating in a hands-on project tackling real-world challenges.

Emphasizing practical skills, the course integrates programming, machine learning frameworks, and data handling best practices. Designed for students with a solid programming foundation, it bridges diverse scientific backgrounds to explore the intersection of AI and molecular science.


Schedule Overview

The course runs for 8 weeks:

  • Lectures: Mondays 08:00–09:45 and Wednesdays 10:00–11:45
  • Recitation sessions: Wednesdays 13:15–16:00

For the complete schedule with rooms and topics, see TimeEdit.

Key Dates

Date      Event
Jan 19    Course begins
Jan 26    Assignment 1 released
Feb 13    Assignment 1 due
Feb 16    Assignment 2 released
Feb 18    No class (CHARM); team registration deadline
Feb 20    Project proposal due
Mar 6     Assignment 2 due
Mar 9–11  Project presentations (attendance required)
Mar 20    Final report due

Course Modules

Module 1: Machine Representations of Molecules

Introduction to the field of AI for molecules. Overview of molecular data structures, string representations, molecular graphs, fingerprints, descriptors, and data preprocessing best practices.

Module 2: Molecular Property Prediction

Classical and deep learning approaches to predicting molecular properties, including classical machine learning methods, deep neural networks, model evaluation, and advanced training paradigms.

Module 3: Molecular Design

Generative models for molecular design, from VAEs, RNNs, and GANs, all the way to diffusion models and LLM-based approaches. Evaluation metrics and molecular optimization.

Module 4: Real-World Challenges

Industry applications, ethics and sustainability, reproducibility, and final project presentations.


Assessment Components

The course comprises five graded components, each worth 5 points (25 points total):

Component           Format                              Due Date
Assignment 1        Individual Jupyter notebook         Fri, Feb 13, 23:59
Assignment 2        Individual Jupyter notebook         Fri, Mar 6, 23:59
Project Proposal    Team PDF (1–2 pages)                Fri, Feb 20, 23:59
Final Presentation  Team presentation (12 + 3 min Q&A)  Mar 9–11 (in class)
Final Report        Team PDF (max 4 pages)              Fri, Mar 20, 23:59

All times are given in local time for Gothenburg, Sweden (CET/CEST).


Grading Scale

Grade     Points Required
5         24–25 points
4         21–23 points
3         17–20 points
U (Fail)  < 17 points

All five components must be submitted to pass the course.

There will be a few opportunities to earn bonus points throughout the course. Bonus points count towards the final grade only if all five assessment components are successfully completed.
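As an illustration only, the grading rules in this section can be sketched in Python. This is not an official grade calculator: the function name `final_grade` is hypothetical, fractional totals are assumed to be compared against the lower bound of each grade band, and "successfully completed" is read here as "submitted". The Examiner's decision always takes precedence.

```python
from typing import Optional, Sequence


def final_grade(component_points: Sequence[Optional[float]],
                bonus_points: float = 0.0) -> str:
    """Map points for the five components to a grade (illustrative sketch).

    Use None for a component that was not submitted.
    """
    # All five components must be submitted to pass the course.
    if len(component_points) != 5 or any(p is None for p in component_points):
        return "U"

    # Bonus points count only when all five components are in
    # (assumption: "successfully completed" means "submitted").
    total = sum(component_points) + bonus_points

    # Assumption: each band's lower bound acts as the threshold.
    if total >= 24:
        return "5"
    if total >= 21:
        return "4"
    if total >= 17:
        return "3"
    return "U"


print(final_grade([5, 5, 4, 4, 4]))                      # 22 points, prints 4
print(final_grade([5, 5, 5, 5, None], bonus_points=2))   # missing component, prints U
```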


Learning Objectives

Knowledge and Understanding

On successful completion of the course, students will be able to:

  • Articulate the role of molecular data in addressing global challenges such as healthcare innovation, sustainability, and biotechnology, and explain its distinct importance compared to other areas of data science.
  • Describe the foundations of cheminformatics and the evolution of AI methods for molecular applications, highlighting historical milestones and technological advancements.
  • Explain the principles of molecular data representation and analysis, including key challenges, pitfalls, and best practices when working with this specialized data.
  • Demonstrate knowledge of state-of-the-art AI methods for molecular property prediction and molecular generation, including their strengths, limitations, and potential for industry applications.
  • Critically evaluate AI models in the context of molecular science, considering data quality, biases, and the broader implications of model design and performance.

Skills and Abilities

On successful completion of the course, students will be able to:

  • Preprocess, analyze, and manage molecular datasets, ensuring data quality and addressing common challenges such as data imbalance and representation biases.
  • Design and implement machine learning pipelines tailored to molecular property prediction and generative tasks, considering real-world constraints such as missing data, data sparsity, computational resources, and model complexity.
  • Apply advanced techniques in optimization, statistics, and algorithm development to molecular AI tasks and analyze the results in a meaningful and reproducible manner.
  • Use modern programming tools and libraries (e.g., PyTorch, RDKit, scikit-learn, HuggingFace) to develop scalable and efficient molecular AI workflows, while adapting to emerging technologies in the field.
  • Communicate findings, insights, and implications of molecular AI research effectively to interdisciplinary audiences, fostering collaboration across fields.

Judgement and Approach

On successful completion of the course, students will be able to:

  • Critically assess molecular AI models for reliability, reproducibility, scalability, and applicability to real-world problems across industries.
  • Reflect on the ethical, societal, and environmental implications of AI-driven solutions in molecular sciences, contributing to the responsible development and application of these technologies.
  • Evaluate the strategic potential of integrating molecular AI in solving future global challenges, identifying opportunities for innovation and cross-disciplinary collaboration.
  • Foster a mindset of continuous learning and critical inquiry, recognizing the evolving nature of AI technologies and their applications in molecular sciences.

The official syllabus is available on Studieportalen.


Course Policies

Attendance

There is no mandatory attendance for lectures or recitation sessions, with one exception:

Project presentations (Mar 9–11): Attendance is required. You are expected to be present for all presentations, not just your own.

Late Submissions

  • Late submissions receive a 1-point deduction per 24 hours
  • Maximum lateness: 5 days (after which the submission receives 0 points)
  • Extensions are granted only for valid reasons (illness, serious personal circumstances)
  • To request an extension, email the Examiner before the deadline
  • Do not assume extension requests will be granted
  • There will be no resubmissions
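
As a rough illustration of how the late-submission rules above combine, the following Python sketch computes the points left after the deduction. The function name `late_adjusted_points` is hypothetical, and counting every started 24-hour period as a full day is an assumption; the syllabus does not state how partial days are counted, so ask the Examiner if it matters for your case.

```python
import math
from datetime import datetime, timedelta


def late_adjusted_points(raw_points: float,
                         deadline: datetime,
                         submitted: datetime) -> float:
    """Apply the 1-point-per-24-hours late deduction (illustrative sketch)."""
    if submitted <= deadline:
        return raw_points

    lateness = submitted - deadline

    # More than 5 days late: the submission receives 0 points.
    if lateness > timedelta(days=5):
        return 0

    # Assumption: each started 24-hour period costs one point.
    days_late = math.ceil(lateness / timedelta(days=1))
    return max(0, raw_points - days_late)


# Assignment 1 deadline, in Gothenburg local time.
deadline = datetime(2026, 2, 13, 23, 59)
print(late_adjusted_points(5, deadline, datetime(2026, 2, 14, 10, 0)))  # prints 4
```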

Team Formation

  • Projects require teams of 2–3 students
  • Register your team on Canvas by Wednesday, February 18, 2026
  • Individual projects require prior approval from the Examiner (email with justification)

Re-examination

If you do not pass a component, you may resubmit or complete it in a later instance of the course. Contact the Examiner to discuss options.

Students with approved accommodations should contact the Examiner to make suitable arrangements.


Academic Integrity

Academic integrity is fundamental to your education and this course. All submitted work must be your own (individual assignments) or your team's own (team assignments).

Generative AI Policy

The use of generative AI tools (e.g., ChatGPT, Claude, GitHub Copilot, DeepSeek) is permitted in this course, subject to the following conditions:

  1. You must fully understand all AI-generated content that you include in your submissions.
  2. You are responsible for correctness. AI tools make mistakes. If AI-generated content contains errors, you are responsible for those errors.
  3. Random verification: For each assignment, 1–2 students will be randomly selected to meet with the Examiner and answer questions about their submission. These will be straightforward questions about your code or analysis. If you cannot adequately explain your own submission, you will receive 0 points for that assignment.

Proper (and encouraged) use of AI tools:

  • Using AI to help explain concepts you're learning
    • Be careful to validate concepts with verified sources, especially if the concept is new to you
  • Using AI to help debug code you wrote
  • Using AI to improve the clarity of text you drafted
  • Using AI as a learning aid alongside your own studies

Improper use of AI tools:

  • Submitting AI-generated work that you do not understand
  • Using AI to complete assignments without engaging with the material
  • Relying on AI instead of developing your own skills

General Academic Integrity Rules

  • Do not copy code or text from other students
  • Do not share your solutions with other students
  • Do not publish your solutions (e.g., public GitHub repositories)
  • Do not publish course material or distribute it outside this course; the materials on this Canvas page took a long time to develop and are for your personal use during this course
  • Cite external sources: If you use code or ideas from papers, tutorials, Stack Overflow, or other sources, cite them clearly
  • Collaboration vs. copying: You may discuss problems and concepts with classmates, but you may not share code or written answers

Consequences

Suspected violations of academic integrity will be reported to the university disciplinary committee and may result in suspension.

When in doubt, ask. It is always better to check with the Examiner than to guess.


Course Literature

There is no required textbook. Course materials (lecture slides, notebooks) and recommended reading will be linked within each module on Canvas.


Changes Since Last Occasion

This is the first instance of DAT675 AI for Molecules.
