Highly Adaptive Lasso and Targeted Learning 📿🎯

Short course

Author

Mark van der Laan (UC Berkeley) and Sky Qiu (UC Berkeley)

Published

October 17, 2025

Welcome to the short course!

This page contains slides and code examples for the short course on Highly Adaptive Lasso and Targeted Learning at the DahShu Data Science Symposium 2025.

Schedule

Here is the schedule for the day (9am-5pm, October 18th, 2025):

Time	Topic
9am-10.20am	Introduction to Targeted Learning; The Roadmap; Super Learning; TMLE; L-TMLE.
10.20am-10.30am	Coffee/Tea break, Q&A.
10.30am-11.50am	Introduction to Highly Adaptive Lasso; Cadlag functions; Zero-order spline HAL, rate of convergence; Higher order splines, examples of 1st and 2nd order HAL, rate of convergence. Examples of the k-th order spline representation: [k=1, d=3], [k=2, d=3]
11.50am-12.00pm	Coffee/Tea break, Q&A.
12.00pm-12.30pm	Efficient HAL-TMLE; Undersmoothed HAL-MLE; Nonparametric bootstrap; Profile-TMLE; Adaptive-TMLE.
12.30pm-1.30pm	Lunch break.
1.30pm-2.20pm	`hal9001` [vignette]; Introduction to `hal9001` [code example]; Estimating conditional average treatment effects [code example]; Estimating conditional densities [code example].
2.20pm-2.30pm	Coffee/Tea break, Q&A.
2.30pm-3.20pm	Specifying a diverse super learner library including different specifications (smoothness orders, subspaces) of candidate HAL learners [code example]; Estimating the effect of counterfactual treatment regimes in longitudinal data with `ltmle` and `dltmle` [code example].
3.20pm-3.30pm	Coffee/Tea break, Q&A.
3.30pm-4.30pm	HAL-based adaptive-TMLE, applications to RCT-RWD hybrid designs: Adaptive-TMLE for augmenting an RCT with external RWD [code example]
4.30pm-5pm	Closing remarks, Q&A

Environment setup

R (≥ 4.3) installed.
Recommended IDE: RStudio.
R packages required:

# install CRAN packages
cran_pkgs <- c("hal9001", "glmnet", "plotly", "haldensify", "origami", "tmle",
               "MASS", "ggplot2", "dbarts", "xgboost", "ltmle", "data.table",
               "DT", "purrr")
missing_cran <- cran_pkgs[!(cran_pkgs %in% installed.packages()[, "Package"])]
if (length(missing_cran) > 0) {
  install.packages(missing_cran)
}

# install GitHub packages
if (!requireNamespace("remotes", quietly = TRUE)) {
  install.packages("remotes")
}
github_pkgs <- c("tlverse/sl3", "tq21/atmle")
for (pkg in github_pkgs) {
  pkg_name <- basename(pkg)
  if (!requireNamespace(pkg_name, quietly = TRUE)) {
    remotes::install_github(pkg)
  }
}
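
Once the installation script above has run, a quick check along the following lines can confirm that the environment is ready before the course. This is only a minimal sketch in base R: the variable names (`required_pkgs`, `is_available`, `r_version_ok`) are illustrative and not part of the course materials, and the package list simply repeats the CRAN and GitHub packages named above.

```r
# Packages named in the install script above (CRAN packages, then GitHub ones).
required_pkgs <- c("hal9001", "glmnet", "plotly", "haldensify", "origami",
                   "tmle", "MASS", "ggplot2", "dbarts", "xgboost", "ltmle",
                   "data.table", "DT", "purrr", "sl3", "atmle")

# TRUE for each package whose namespace can be loaded in this R session.
is_available <- vapply(required_pkgs, requireNamespace, logical(1),
                       quietly = TRUE)

# The setup notes ask for R >= 4.3.
r_version_ok <- getRversion() >= "4.3.0"
if (!r_version_ok) {
  message("R ", as.character(getRversion()),
          " detected; the course recommends R >= 4.3.")
}

# Report anything that still needs to be installed.
if (all(is_available)) {
  message("All course packages are available.")
} else {
  message("Still missing: ",
          paste(names(is_available)[!is_available], collapse = ", "))
}
```

If any package is reported as missing, rerunning the corresponding part of the install script (CRAN or GitHub) should be enough.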

Relevant books/papers for this short course

van der Laan, Mark and Sherri Rose (2011) Targeted Learning: Causal Inference for Observational and Experimental Data. Springer Science & Business Media. [Link]
van der Laan, Mark and Sherri Rose (2018) Targeted Learning in Data Science: Causal Inference for Complex Longitudinal Studies. Springer Science & Business Media. [Link]
Petersen, Maya and Mark van der Laan (2014) “Causal models and learning from data: integrating causal modeling and statistical estimation.” Epidemiology 25, no. 3: 418-426. [Link]
Dang, Lauren et al. (2023). “A causal roadmap for generating high-quality real-world evidence.” Journal of Clinical and Translational Science, 7(1), p. e212. doi:10.1017/cts.2023.635. [Link]
van der Laan, Mark, Eric Polley, and Alan Hubbard (2007). “Super learner.” Statistical Applications in Genetics and Molecular Biology, 6, Article 25. https://doi.org/10.2202/1544-6115.1309 [Link]
van der Laan, Mark and Daniel Rubin (2006). “Targeted Maximum Likelihood Learning.” The International Journal of Biostatistics, vol. 2, no. 1. [Link]
van der Laan, Mark (2017). “A generally efficient targeted minimum loss based estimator based on the highly adaptive lasso.” The International Journal of Biostatistics. [Link]
Benkeser, David, and Mark van der Laan. “The highly adaptive lasso estimator.” 2016 IEEE international conference on data science and advanced analytics (DSAA). IEEE, 2016. [Link]
van der Laan, Mark. “Higher order spline highly adaptive lasso estimators of functional parameters: Pointwise asymptotic normality and uniform convergence rates.” arXiv preprint arXiv:2301.13354 (2023). [Link]
van der Laan, Lars, Marco Carone, Alex Luedtke, and Mark van der Laan. “Adaptive debiased machine learning using data-driven model selection techniques.” arXiv preprint arXiv:2307.12544 (2023). [Link]
van der Laan, Mark, Sky Qiu, Jens Tarp, and Lars van der Laan. “Adaptive-TMLE for the average treatment effect based on randomized controlled trial augmented with real-world data.” arXiv preprint arXiv:2405.07186 (2024). [Link]
Qiu, Sky, Jens Tarp, Andrew Mertens, and Mark van der Laan. “An estimator-robust design for augmenting randomized controlled trial with external real-world data.” arXiv preprint arXiv:2501.17835 (2025). [Link]
Shirakawa, Toru, Yi Li, Yulun Wu, Sky Qiu, Yuxuan Li, Mingduo Zhao, Hiroyasu Iso, and Mark van der Laan. “Longitudinal targeted minimum loss-based estimation with temporal-difference heterogeneous transformer.” Proceedings of the 41st International Conference on Machine Learning, PMLR 235:45097-45113 (2024). [Link]

Additional resources

Center for Targeted Machine Learning and Causal Inference at Berkeley: https://ctml.berkeley.edu/
Introduction to causal inference by Maya Petersen and Laura Balzer: https://ctml.berkeley.edu/introduction-causal-inference
Targeted Learning in R: Causal Data Science with the tlverse Software Ecosystem (tlverse handbook): https://tlverse.org/tlverse-handbook/
Kat Hoffman’s blog and tutorial on Targeted Learning: https://www.khstats.com/blog/tmle/tutorial
Check out TL Revolution’s YouTube channel for the FDA-funded webinar series on Targeted Learning.

License

Code is MIT-licensed unless noted.

Maintainers & contact

For questions or issues, open a GitHub issue or email sky.qiu@berkeley.edu.