
Piece-wise exponential (Additive Mixed) Modeling Tools

ISCB41, 2020




Andreas Bender (@adibender),
Fabian Scheipl, David Rügamer, Philipp Kopper, Bernd Bischl, Helmut Küchenhoff



Department of Statistics, LMU Munich


The framework is general in the sense that

  1. it supports different survival tasks

    • right-censoring, left-truncation
    • time-varying effects, time-varying features
    • cumulative effects (weighted cumulative exposure, distributed lag models)
    • competing risks, multi-state models
  2. it does not require specialized software and can be applied

    • across programming languages and
    • using any algorithm that supports optimization of the Poisson likelihood

(source: Bender, et al. (2020))


Survival Analysis as Poisson Regression


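Consider a setting with right-censored data:

  • we observe $(t_i, \delta_i),\ i = 1, \ldots, n$, where
    • $t_i = \min(T_i, C_i)$; $T_i \sim F$, $C_i \sim G$; $T_i, C_i > 0$
    • $\delta_i = I(T_i \leq C_i) \in \{0, 1\}$

To approximate $\lambda(t; x_i) = \exp(g(x_i(t), t)) \stackrel{PH}{=} \lambda_0(t)\exp(x_i^\top \beta)$

  • split the follow-up into $J$ intervals $(\kappa_{j-1}, \kappa_j],\ j = 1, \ldots, J$

  • assume piece-wise constant hazards: $\lambda(t|x_i(t)) \equiv \exp(g(x_{ij}, t_j)) := \lambda_{ij},\ \forall\, t \in (\kappa_{j-1}, \kappa_j]$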
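
A piece-wise constant hazard makes the cumulative hazard a simple sum of hazard times time spent in each interval. The base R sketch below illustrates this; the function name `pch_surv` and the hazard values are made up for illustration and are not part of the talk or of any package.

```r
# Survival probability under a piece-wise constant hazard.
# `cuts` are the interval borders kappa_0 < ... < kappa_J and `hazards` the
# interval-specific hazards lambda_1, ..., lambda_J (illustrative values only).
pch_surv <- function(t, cuts, hazards) {
  stopifnot(length(hazards) == length(cuts) - 1, t >= cuts[1], t <= max(cuts))
  # time spent in each interval up to t: min(t, kappa_j) - kappa_{j-1}, floored at 0
  exposure <- pmax(pmin(t, cuts[-1]) - cuts[-length(cuts)], 0)
  # S(t) = exp(-cumulative hazard)
  exp(-sum(hazards * exposure))
}

cuts    <- c(0, 1, 1.5, 3)
hazards <- c(0.5, 1.0, 0.2)
pch_surv(2, cuts, hazards)  # equals exp(-(0.5 * 1 + 1.0 * 0.5 + 0.2 * 0.5))
```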

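Data in "standard" time-to-event format
Data in PED format


transform to PED using $\kappa_0 = 0, \kappa_1 = 1, \kappa_2 = 1.5, \kappa_3 = 3$

  • define: $\delta_{ij} = \begin{cases} 1 & t_i \in (\kappa_{j-1}, \kappa_j] \wedge \delta_i = 1 \\ 0 & \text{else} \end{cases}$, $t_{ij} = \begin{cases} t_i - \kappa_{j-1} & t_i \in (\kappa_{j-1}, \kappa_j] \\ \kappa_j - \kappa_{j-1} & \text{else} \end{cases}$, $t_j := \kappa_j$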
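
The transformation can be sketched in a few lines of base R. This is a minimal illustration, not the `pammtools` implementation: the helper `to_ped`, the column `t_ij` and the toy data are made up, while `tend`, `ped_status` and `offset` mirror the names used in the model calls in this deck. The sketch assumes that a subject censored inside an interval contributes only the time up to censoring.

```r
# Transform (t_i, delta_i) into piece-wise exponential data (PED):
# one row per subject and interval in which the subject is at risk.
to_ped <- function(time, status, cuts) {
  rows <- lapply(seq_along(time), function(i) {
    # intervals (kappa_{j-1}, kappa_j] that start before t_i
    j       <- which(cuts[-length(cuts)] < time[i])
    tstart  <- cuts[j]
    tend    <- cuts[j + 1]
    in_last <- time[i] <= tend  # TRUE only for the interval containing t_i
    data.frame(
      id         = i,
      tstart     = tstart,
      tend       = tend,                                  # t_j := kappa_j
      ped_status = as.integer(in_last & status[i] == 1),  # delta_ij
      t_ij       = pmin(time[i], tend) - tstart           # time at risk in interval j
    )
  })
  ped <- do.call(rbind, rows)
  ped$offset <- log(ped$t_ij)  # enters the Poisson model as offset
  ped
}

toy <- data.frame(time = c(0.5, 2.7, 1.2), status = c(1, 0, 1))  # made-up data
ped <- to_ped(toy$time, toy$status, cuts = c(0, 1, 1.5, 3))
ped
```

Summing `t_ij` within each `id` recovers the observed time `t_i`, which is a quick sanity check for any PED transformation.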

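General log-likelihood contribution:

$\ell_i = \log\left(\lambda(t_i; x_i)^{\delta_i} S(t_i; x_i)\right) = \sum_{j=1}^{J_i} \left(\delta_{ij} \log \lambda_{ij} - \lambda_{ij} t_{ij}\right)$

Working assumption $\delta_{ij} \stackrel{iid}{\sim} Po(\mu_{ij} = \lambda_{ij} t_{ij})$:

$\ell_i = \log\left(\prod_{j=1}^{J_i} f(\delta_{ij})\right) = \sum_{j=1}^{J_i} \delta_{ij} \log(\lambda_{ij}) + \delta_{ij} \log(t_{ij}) - \lambda_{ij} t_{ij}$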
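
The two log-likelihoods differ only by the term $\sum_j \delta_{ij} \log(t_{ij})$, which does not depend on $\lambda_{ij}$, so both are maximized by the same parameters. A small numerical check in base R, with made-up values for $\delta_{ij}$, $t_{ij}$ and $\lambda_{ij}$:

```r
# Made-up PED rows: event indicator delta_ij, time at risk t_ij, hazard lambda_ij
delta  <- c(1, 0, 0, 0, 0, 1)
t_ij   <- c(0.5, 1, 0.5, 1.2, 1, 0.2)
lambda <- c(0.4, 0.4, 0.9, 0.3, 0.4, 0.9)

# Piece-wise exponential log-likelihood: sum(delta * log(lambda) - lambda * t_ij)
ll_pem <- sum(delta * log(lambda) - lambda * t_ij)

# Poisson log-likelihood with mean mu_ij = lambda_ij * t_ij
ll_pois <- sum(dpois(delta, lambda = lambda * t_ij, log = TRUE))

# Both agree up to sum(delta * log(t_ij)), a constant with respect to lambda
all.equal(ll_pois, ll_pem + sum(delta * log(t_ij)))  # TRUE
```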

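Competing risks setting with event types $k \in \{1, 2\}$

Data in "standard" format
Data in PED format


transform to PED using $\kappa_0 = 0, \kappa_1 = 1, \kappa_2 = 1.5, \kappa_3 = 3$

estimate $\lambda(t|x,k) = \exp(f(x(t_j), t_j, k)),\ \forall\, t \in (\kappa_{j-1}, \kappa_j],\ k \in \{1, 2\}$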
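
One way to obtain such data is to build the PED once and stack one copy per event type, with a cause-specific event indicator. The base R sketch below assumes this stacking approach; the toy data and the coding of `status` (0 = censored, 1 or 2 = event type) are made up, and the names `ped_stacked` and `cause` follow the model call used later in this deck.

```r
# Made-up competing risks data: status 0 = censored, 1/2 = event of type k
cr   <- data.frame(id = 1:3, time = c(0.5, 2.7, 1.2), status = c(1, 0, 2))
cuts <- c(0, 1, 1.5, 3)

# One row per subject and interval at risk (ids equal row numbers of `cr`)
ped <- do.call(rbind, lapply(seq_len(nrow(cr)), function(i) {
  j <- which(cuts[-length(cuts)] < cr$time[i])
  data.frame(id = cr$id[i], tstart = cuts[j], tend = cuts[j + 1],
             offset = log(pmin(cr$time[i], cuts[j + 1]) - cuts[j]))
}))

# Stack one copy of the PED per cause; the event indicator is cause-specific
ped_stacked <- do.call(rbind, lapply(1:2, function(k) {
  last_row <- cr$time[ped$id] <= ped$tend  # interval that contains t_i
  transform(ped,
            cause      = factor(k, levels = 1:2),
            ped_status = as.integer(last_row & cr$status[ped$id] == k))
}))

ped_stacked
```

Each subject appears once per cause, and an event of type 2 is treated as censoring in the rows for cause 1 (and vice versa).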


PEM/GLM: $\lambda(t) = \lambda_{0j} = \exp(\beta_{0j}),\ t \in (\kappa_{j-1}, \kappa_j],\ j = 1, \ldots, J$

  • trade-off w.r.t. the number of split points (less flexible/more robust vs. more flexible/less robust)

  • computationally inefficient (one parameter for each interval), especially when considering time-varying effects

  • results sensitive to number and placement of interval cut points
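
As a minimal sketch, a PEM without covariates can be fitted with `stats::glm`, using one dummy per interval and $\log(t_{ij})$ as offset. The simulated data, the cut points and all object names below are assumptions for illustration. In this covariate-free case the fitted hazards coincide with the closed-form estimate "events divided by time at risk" per interval.

```r
set.seed(1)
n      <- 200
time   <- pmin(rexp(n, rate = 0.5), 3)  # simulated times, administratively censored at t = 3
status <- as.integer(time < 3)
cuts   <- c(0, 1, 1.5, 3)

# PED: one row per subject and interval at risk
ped <- do.call(rbind, lapply(seq_len(n), function(i) {
  j <- which(cuts[-length(cuts)] < time[i])
  data.frame(interval   = factor(j, levels = 1:3),
             ped_status = as.integer(time[i] <= cuts[j + 1] & status[i] == 1),
             t_ij       = pmin(time[i], cuts[j + 1]) - cuts[j])
}))

# PEM: one parameter beta_0j per interval, fitted as Poisson GLM with offset log(t_ij)
pem <- glm(ped_status ~ 0 + interval + offset(log(t_ij)),
           family = poisson(), data = ped)
hazard_glm <- unname(exp(coef(pem)))

# Closed-form ML estimate: number of events / total time at risk per interval
hazard_mle <- as.numeric(tapply(ped$ped_status, ped$interval, sum) /
                         tapply(ped$t_ij, ped$interval, sum))

cbind(hazard_glm, hazard_mle)
```

With more cut points the number of coefficients grows one-to-one with $J$, which is the inefficiency noted in the list above.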


PAMM/GAMM: $\lambda(t) = \lambda_{0j} = \exp(f_0(t_j)),\ t \in (\kappa_{j-1}, \kappa_j],\ j = 1, \ldots, J$; $f_0(t_j) = \sum_{q=1}^{Q} \beta_{0q} B_{0q}(t_j)$

  • large differences between neighboring coefficients/baseline hazards of neighboring intervals are penalized

  • insensitive to number and placement of split points

  • number of parameters to estimate determined by basis dimension Q, not number of intervals J
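
A rough sketch of the same idea with a penalized spline, assuming `mgcv` is installed (it ships with standard R distributions as a recommended package). The simulated data and the choices $J = 40$ and $Q = 10$ are arbitrary and only serve to show that the number of coefficients follows the basis dimension rather than the number of intervals.

```r
library(mgcv)

set.seed(1)
n      <- 300
time   <- pmin(rweibull(n, shape = 1.5, scale = 2), 4)  # simulated, censored at t = 4
status <- as.integer(time < 4)
cuts   <- seq(0, 4, by = 0.1)                           # J = 40 intervals

# PED: one row per subject and interval at risk
ped <- do.call(rbind, lapply(seq_len(n), function(i) {
  j <- which(cuts[-length(cuts)] < time[i])
  data.frame(tend       = cuts[j + 1],
             ped_status = as.integer(time[i] <= cuts[j + 1] & status[i] == 1),
             offset     = log(pmin(time[i], cuts[j + 1]) - cuts[j]))
}))

# PAMM: smooth log-baseline hazard f_0(t_j) with basis dimension k = Q = 10
pam <- gam(ped_status ~ s(tend, k = 10), data = ped,
           family = poisson(), offset = offset)

length(unique(ped$tend))  # number of intervals J
length(coef(pam))         # intercept plus Q - 1 spline coefficients
```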


Time-varying effects

In the PEM/PAMM framework, time-varying effects are simply interactions of time $t_j$ and other covariates.

$\log(\lambda(t|x)) = f_{01}(t_j) I(\text{complications} = \text{yes}) + f_{02}(t_j) I(\text{complications} = \text{no})$

pam_tumor <- mgcv::gam(formula=ped_status~s(tend, by=complications), data=ped_tumor, family=poisson(), offset=offset)


# "Regular" GAM
mgcv::gam(formula=ped_status~s(tend, by=complications), data=ped_tumor, family=poisson(), offset=offset)
# GAM with monotonicity constraints
scam::scam(formula=ped_status~s(tend, by=complications, bs = "mpd"), data=ped_tumor, family=poisson(), offset=offset)
# Bayesian GAM
brms::brm(formula=ped_status~s(tend, by=complications) + offset(offset), data=ped_tumor, family=poisson())

Competing Risks

$\log(\lambda(t|x)) = f_{01}(t_j) I(k = 1) + f_{02}(t_j) I(k = 2)$

Cause-specific hazards are time-varying effects, i.e. interactions of time $t_j$ and the covariate "event type" $k$.

pam_cr <- mgcv::gam(formula = ped_status ~ s(tend, by = cause), data = ped_stacked, family = poisson(), offset = offset)


Tree-based methods

Figure panels: time-varying effects; shared vs. cause-specific effects (in CR)

(source: Bender, et al. (2020))


The pammtools package



PEMs/PAMMs are a powerful framework for survival analysis, but cumbersome to work with

pammtools facilitates

  • data transformation (as_ped):

    • right-censoring
    • cumulative effects
    • competing risks
  • post-processing:

    • prediction (add_hazard, add_surv_prob, add_cif),
    • model evaluation (integrated Brier score via pec)
  • convenience functions for visualisation, ...
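
A short sketch of this workflow, assuming the packages `pammtools`, `mgcv` and `survival` are installed and that the `tumor` data shipped with `pammtools` is used. The cut points are an arbitrary choice for illustration, and the calls reflect the package interface as understood at the time of the talk, so they may differ in other versions.

```r
library(survival)   # Surv()
library(mgcv)
library(pammtools)

data("tumor", package = "pammtools")

# Data transformation: one row per subject and interval
ped_tumor <- as_ped(tumor, Surv(days, status) ~ complications,
                    cut = seq(0, 3000, by = 100))

# Time-varying effect of complications, as in the model shown earlier
pam_tumor <- gam(ped_status ~ s(tend, by = complications),
                 data = ped_tumor, family = poisson(), offset = offset)

# Post-processing: hazard (with confidence interval) per interval end point
# and level of complications
ndf <- make_newdata(ped_tumor, tend = unique(tend),
                    complications = unique(complications))
ndf <- add_hazard(ndf, pam_tumor)
head(ndf[, c("tend", "complications", "hazard", "ci_lower", "ci_upper")])
```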


Outlook




  • support for multi-state models

  • facilitate extensions: S3 functions for calculation of the hazard for other packages (e.g. mboost, brms)

  • Prototype for PEMs using xgboost available: https://github.com/adibender/pem.xgb

  • However, ML algorithms need a different infrastructure (resampling, tuning, benchmarking).
    Development will probably continue in mlr3 and mlr3proba (Lang, et al. (2019); Sonabend, et al. (2020))


References

Argyropoulos, C. et al. (2015). "Analysis of Time to Event Outcomes in Randomized Controlled Trials by Generalized Additive Models". In: PLoS ONE 10.4, p. e0123784. DOI: 10.1371/journal.pone.0123784. URL: http://dx.doi.org/10.1371/journal.pone.0123784.

Bender, A. et al. (2018). "A generalized additive model approach to time-to-event analysis". In: Statistical Modelling 18.3-4, pp. 299-321. ISSN: 1471-082X. DOI: 10.1177/1471082X17748083.

Bender, A. et al. (2020). "A General Machine Learning Framework for Survival Analysis". In: arXiv:2006.15442 [cs, stat]. arXiv: 2006.15442.

Cai, T. et al. (2002). "Mixed Model-Based Hazard Estimation". In: Journal of Computational and Graphical Statistics 11.4, pp. 784-798. ISSN: 1061-8600. DOI: 10.1198/106186002862. URL: http://dx.doi.org/10.1198/106186002862.

Carstensen, B. et al. (2011). "Using Lexis Objects for Multi-State Models in R". In: Journal of Statistical Software 38.6, pp. 1-18. ISSN: 1548-7660. DOI: 10.18637/jss.v038.i06. URL: https://www.jstatsoft.org/index.php/jss/article/view/v038i06.

Friedman, M. (1982). "Piecewise Exponential Models for Survival Data with Covariates". In: The Annals of Statistics 10.1, pp. 101-113. ISSN: 00905364. URL: http://www.jstor.org/stable/2240502.

Kauermann, G. (2005). "Penalized spline smoothing in multivariable survival models with varying coefficients". In: Computational Statistics & Data Analysis 49.1, pp. 169-186. ISSN: 0167-9473. DOI: 10.1016/j.csda.2004.05.006. URL: http://www.sciencedirect.com/science/article/pii/S0167947304001240.

Laird, N. et al. (1981). "Covariance Analysis of Censored Survival Data Using Log-Linear Analysis Techniques". In: Journal of the American Statistical Association 76.374, pp. 231-240. DOI: 10.2307/2287816. URL: http://www.jstor.org/stable/2287816.

Lang, M. et al. (2019). "mlr3: A modern object-oriented machine learning framework in R". In: Journal of Open Source Software. DOI: 10.21105/joss.01903. URL: https://joss.theoj.org/papers/10.21105/joss.01903.

Sonabend, R. et al. (2020). "mlr3proba: Machine Learning Survival Analysis in R". In: arXiv:2008.08080 [cs, stat]. arXiv: 2008.08080. URL: http://arxiv.org/abs/2008.08080 (visited on Aug. 20, 2020).
