Highly Adaptive Lasso and Targeted Learning 📿🎯

Short course

Authors

Mark van der Laan (UC Berkeley) and Sky Qiu (UC Berkeley)

Published

October 17, 2025

Welcome to the short course!

This page contains slides and code examples for the short course on Highly Adaptive Lasso and Targeted Learning at the DahShu Data Science Symposium 2025.

Schedule

Here is the schedule for the day (9am-5pm, October 18th, 2025):

Time Topic
9am-10.20am Introduction to Targeted Learning; The Roadmap; Super Learning; TMLE; L-TMLE.
10.20am-10.30am Coffee/Tea break, Q&A.
10.30am-11.50am Introduction to Highly Adaptive Lasso; Cadlag functions; Zero-order spline HAL, rate of convergence; Higher order splines, examples of 1st and 2nd order HAL, rate of convergence.
11.50am-12.00pm Coffee/Tea break, Q&A.
12.00pm-12.30pm Efficient HAL-TMLE; Undersmoothed HAL-MLE; Nonparametric bootstrap; Profile-TMLE; Adaptive-TMLE.
12.30pm-1.30pm Lunch break.
1.30pm-2.20pm
2.20pm-2.30pm Coffee/Tea break, Q&A.
2.30pm-3.20pm
  • Specifying a diverse super learner library including different specifications (smoothness orders, subspaces) of candidate HAL learners [code example]
  • Estimating the effects of counterfactual treatment regimes in longitudinal data with ltmle and dltmle [code example]
3.20pm-3.30pm Coffee/Tea break, Q&A.
3.30pm-4.30pm HAL-based adaptive-TMLE, applications to RCT-RWD hybrid designs:
  • Adaptive-TMLE for augmenting an RCT with external RWD [code example]
4.30pm-5pm Closing remarks, Q&A
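
For readers who would like a feel for the zero-order and first-order spline bases covered in the 10.30am session, here is a minimal one-dimensional sketch in base R. It only builds the basis matrices: indicators I(x >= knot) for the zero-order case and hinges max(x - knot, 0) for the first-order case. The function names zero_order_basis and first_order_basis are hypothetical and chosen for illustration; this is not how hal9001 constructs its basis internally, and the L1-penalized fit over these columns is not shown.

```r
# Zero-order spline basis: one indicator column I(x >= knot) per knot.
# (Hypothetical helper for illustration; not part of hal9001.)
zero_order_basis <- function(x, knots) {
  outer(x, knots, FUN = function(xi, u) as.numeric(xi >= u))
}

# First-order spline basis: one hinge column max(x - knot, 0) per knot.
# (Hypothetical helper for illustration; not part of hal9001.)
first_order_basis <- function(x, knots) {
  outer(x, knots, FUN = function(xi, u) pmax(xi - u, 0))
}

x <- c(0.1, 0.4, 0.7)  # toy covariate values
knots <- x             # assumption: knots placed at the observed values

# Rows index observations, columns index knots.
zero_order_basis(x, knots)
first_order_basis(x, knots)
```

With knots at the sorted observed values, the zero-order matrix is lower triangular with ones on and below the diagonal. In more than one dimension, HAL also uses tensor products of such univariate basis functions over subsets of the covariates.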

Environment setup

  • R (≥ 4.3) installed.
  • Recommended IDE: RStudio.
  • R packages required:
# install CRAN packages that are not already installed
cran_pkgs <- c("hal9001", "glmnet", "plotly", "haldensify", "origami", "tmle",
               "MASS", "ggplot2", "dbarts", "xgboost", "ltmle", "data.table",
               "DT", "purrr")
missing_cran <- setdiff(cran_pkgs, rownames(installed.packages()))
if (length(missing_cran) > 0) {
  install.packages(missing_cran)
}

# install GitHub packages
if (!requireNamespace("remotes", quietly = TRUE)) {
  install.packages("remotes")
}
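# GitHub repositories in "owner/repo" form
github_pkgs <- c("tlverse/sl3", "tq21/atmle")
for (pkg in github_pkgs) {
  # the package name is the repository name after the slash
  pkg_name <- basename(pkg)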
  if (!requireNamespace(pkg_name, quietly = TRUE)) {
    remotes::install_github(pkg)
  }
}
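
After running the setup script, it can help to confirm that everything is actually loadable before the course starts. The following is a minimal sketch of such a check, using the package names from the setup script above. The helper report_missing is hypothetical and not part of the course code.

```r
# Packages the course materials rely on (names taken from the setup script).
required_pkgs <- c("hal9001", "glmnet", "plotly", "haldensify", "origami", "tmle",
                   "MASS", "ggplot2", "dbarts", "xgboost", "ltmle", "data.table",
                   "DT", "purrr", "sl3", "atmle")

# requireNamespace() returns TRUE when a package can be loaded,
# without attaching it to the search path.
is_available <- vapply(required_pkgs, requireNamespace, logical(1), quietly = TRUE)

# Hypothetical helper (for illustration only): summarise what is still missing.
report_missing <- function(available) {
  missing <- names(available)[!available]
  if (length(missing) == 0) {
    "All required packages are available."
  } else {
    paste("Still missing:", paste(missing, collapse = ", "))
  }
}

message(report_missing(is_available))
```

If any package is reported as missing, rerunning the corresponding install step and reading its console output is usually the quickest way to find the cause.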

Relevant books/papers for this short course

  • van der Laan, Mark and Sherri Rose (2011) Targeted Learning: Causal Inference for Observational and Experimental Data. Springer Science & Business Media. [Link]

  • van der Laan, Mark and Sherri Rose (2018) Targeted Learning in Data Science: Causal Inference for Complex Longitudinal Studies. Springer Science & Business Media. [Link]

  • Petersen, Maya and Mark van der Laan (2014) “Causal models and learning from data: integrating causal modeling and statistical estimation.” Epidemiology 25, no. 3: 418-426. [Link]

  • Dang, Lauren et al. (2023). “A causal roadmap for generating high-quality real-world evidence.” Journal of Clinical and Translational Science, 7(1), p. e212. doi:10.1017/cts.2023.635. [Link]

  • van der Laan, Mark, Eric Polley, and Alan Hubbard (2007). “Super learner.” Statistical Applications in Genetics and Molecular Biology, 6, Article 25. https://doi.org/10.2202/1544-6115.1309 [Link]

  • van der Laan, Mark and Daniel Rubin (2006). “Targeted Maximum Likelihood Learning.” The International Journal of Biostatistics, vol. 2, no. 1. [Link]

  • van der Laan, Mark (2017). “A generally efficient targeted minimum loss based estimator based on the highly adaptive lasso.” The International Journal of Biostatistics. [Link]

  • Benkeser, David, and Mark van der Laan. “The highly adaptive lasso estimator.” 2016 IEEE International Conference on Data Science and Advanced Analytics (DSAA). IEEE, 2016. [Link]

  • van der Laan, Mark. “Higher order spline highly adaptive lasso estimators of functional parameters: Pointwise asymptotic normality and uniform convergence rates.” arXiv preprint arXiv:2301.13354 (2023). [Link]

  • van der Laan, Lars, Marco Carone, Alex Luedtke, and Mark van der Laan. “Adaptive debiased machine learning using data-driven model selection techniques.” arXiv preprint arXiv:2307.12544 (2023). [Link]

  • van der Laan, Mark, Sky Qiu, Jens Tarp, and Lars van der Laan. “Adaptive-TMLE for the average treatment effect based on randomized controlled trial augmented with real-world data.” arXiv preprint arXiv:2405.07186 (2024). [Link]

  • Qiu, Sky, Jens Tarp, Andrew Mertens, and Mark van der Laan. “An estimator-robust design for augmenting randomized controlled trial with external real-world data.” arXiv preprint arXiv:2501.17835 (2025). [Link]

  • Shirakawa, Toru, Yi Li, Yulun Wu, Sky Qiu, Yuxuan Li, Mingduo Zhao, Hiroyasu Iso, and Mark van der Laan. “Longitudinal targeted minimum loss-based estimation with temporal-difference heterogeneous transformer.” Proceedings of the 41st International Conference on Machine Learning, PMLR 235:45097-45113 (2024). [Link]

Additional resources

License

  • Slides and code examples © 2025 by the authors.
  • Code is MIT-licensed unless noted.

Maintainers & contact

For questions or issues, open a GitHub issue or email sky.qiu@berkeley.edu.