A General Machine Learning
Framework for Survival Analysis

ECML PKDD 2020


Andreas Bender (@adibender),
David Rügamer, Fabian Scheipl, Bernd Bischl


Department of Statistics, LMU Munich


The framework is general in the sense that

  1. It supports different Survival Tasks

    • right-censoring, left-truncation
    • time-varying effects, time-varying features
    • competing risks, multi-state models
  2. It does not require specialized software: it can be applied across programming languages, using any algorithm that supports optimization of the Poisson likelihood


Survival Task as Poisson Task


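Consider a setting with right-censored data:

  • we observe $(t_i, \delta_i),\ i = 1, \ldots, n$, where
    • $t_i = \min(T_i, C_i)$; $T_i \sim F \perp C_i \sim G$; $T_i, C_i > 0$
    • $\delta_i = I(T_i \leq C_i) \in \{0, 1\}$

To approximate $\lambda(t; x_i) = \exp(g(x_i(t), t)) \overset{PH}{=} \lambda_0(t)\exp(x_i^\top \beta)$

  • split the follow-up into $J$ intervals $(\kappa_{j-1}, \kappa_j],\ j = 1, \ldots, J$

  • assume piece-wise constant hazards: $\lambda(t \mid x_i(t)) \equiv \exp(g(x_{ij}, t_j)) := \lambda_{ij},\ \forall\, t \in (\kappa_{j-1}, \kappa_j]$

  • estimate using the piece-wise exponential model (PEM; e.g. Friedman (1982)), i.e. Poisson regression with transformed data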
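The step from piece-wise constant hazards to a sum over intervals can be sketched as follows. The notation is an assumption made for this sketch: $J_i$ denotes the interval that contains $t_i$, and $t_{ij}$ the time subject $i$ spends at risk in interval $j$.

```latex
S(t_i \mid x_i)
  = \exp\left(-\int_0^{t_i} \lambda(s \mid x_i(s))\, ds\right)
  = \exp\left(-\sum_{j=1}^{J_i} \lambda_{ij}\, t_{ij}\right),
\qquad
\log S(t_i \mid x_i) = -\sum_{j=1}^{J_i} \lambda_{ij}\, t_{ij}
```

Because the hazard is constant within each interval, the integral reduces to hazard times time at risk, summed over the intervals the subject has visited.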

Data in "standard" time-to-event format
Data in PED format


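transform to PED using $\kappa_0 = 0, \kappa_1 = 1, \kappa_2 = 1.5, \kappa_3 = 3$

  • define: $\delta_{ij} = \begin{cases} 1 & t_i \in (\kappa_{j-1}, \kappa_j] \wedge \delta_i = 1 \\ 0 & \text{else} \end{cases}$, $t_{ij} = \begin{cases} t_i - \kappa_{j-1} & t_i \in (\kappa_{j-1}, \kappa_j] \\ \kappa_j - \kappa_{j-1} & \text{else} \end{cases}$, $t_j := \kappa_j$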
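A minimal sketch of the transformation in Python, assuming the split points given above. The three subjects are hypothetical (the original data table is not reproduced here), and the function and column names are illustrative only. A subject censored inside an interval is given the time actually spent at risk in that interval.

```python
def to_ped(subjects, cuts):
    """Split each (id, time, status) record into one row per visited interval.

    subjects: iterable of (id, time, status), status 1 = event, 0 = censored
    cuts:     increasing split points kappa_0 < kappa_1 < ... < kappa_J
    """
    rows = []
    for sid, time, status in subjects:
        for j in range(1, len(cuts)):
            lo, hi = cuts[j - 1], cuts[j]
            if time <= lo:
                break  # subject is no longer at risk in interval j
            in_interval = time <= hi  # t_i lies in (kappa_{j-1}, kappa_j]
            rows.append({
                "id": sid,
                "interval": j,
                "t_j": hi,  # interval end point, used as the time feature
                "delta_ij": int(in_interval and status == 1),
                "t_ij": (time - lo) if in_interval else (hi - lo),
            })
    return rows


# Hypothetical subjects, not taken from the slides.
subjects = [(1, 1.3, 1), (2, 0.5, 0), (3, 2.7, 1)]
cuts = [0.0, 1.0, 1.5, 3.0]
ped = to_ped(subjects, cuts)
for row in ped:
    print(row)
```

Each subject contributes one row per interval in which it was at risk, so the summed `t_ij` per subject equals its observed time. Observed times beyond the last split point are truncated at that point in this sketch.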

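General log-likelihood contribution:

$\ell_i = \log\left(\lambda(t_i; x_i)^{\delta_i} S(t_i; x_i)\right) = \sum_{j=1}^{J_i} \left(\delta_{ij} \log \lambda_{ij} - \lambda_{ij} t_{ij}\right)$

Working assumption $\delta_{ij} \overset{iid}{\sim} Po(\mu_{ij} = \lambda_{ij} t_{ij})$:

$\ell_i = \log\left(\prod_{j=1}^{J_i} f(\delta_{ij})\right) = \sum_{j=1}^{J_i} \left(\delta_{ij} \log(\lambda_{ij}) + \delta_{ij} \log(t_{ij}) - \lambda_{ij} t_{ij}\right)$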
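The two log-likelihood expressions differ only by the term that does not involve the hazards, which is why maximizing the Poisson likelihood with offset log(t_ij) gives the same estimates. A small numerical check in Python, using hypothetical rows and hazard values chosen only for illustration:

```python
import math

# Hypothetical PED rows (delta_ij, t_ij, interval j) for one subject with an
# event at t_i = 1.3 and split points 0, 1, 1.5, 3.
rows = [(0, 1.0, 1), (1, 0.3, 2)]
# Hypothetical interval-specific hazards lambda_ij.
hazard = {1: 0.4, 2: 0.9}


def survival_loglik(rows, hazard):
    # sum_j (delta_ij * log(lambda_ij) - lambda_ij * t_ij)
    return sum(d * math.log(hazard[j]) - hazard[j] * t for d, t, j in rows)


def poisson_loglik(rows, hazard):
    # Log-density of Po(mu_ij = lambda_ij * t_ij) evaluated at delta_ij;
    # log(delta_ij!) is 0 because delta_ij is either 0 or 1.
    total = 0.0
    for d, t, j in rows:
        mu = hazard[j] * t
        total += d * math.log(mu) - mu
    return total


# The difference is sum_j delta_ij * log(t_ij), which is free of lambda_ij.
offset_term = sum(d * math.log(t) for d, t, _ in rows)
gap = poisson_loglik(rows, hazard) - survival_loglik(rows, hazard)
print(gap, offset_term)
```

Changing the values in `hazard` changes both log-likelihoods but leaves `gap` unchanged.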

Consider 3 subjects in a competing risks setting with event types $k \in \{1, 2\}$

  • $i = 1$: $(t_1 = 1.3, \delta_1 = 2)$
  • $i = 2$: $(t_2 = 0.5, \delta_2 = 0)$
  • $i = 3$: $(t_3 = 2.7, \delta_3 = 1)$

Data in PED format

estimate $\lambda(t \mid x, k) = \exp(f(x(t), t, k)),\ k \in \{1, 2\}$
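One way to build such a data set is to stack one PED copy per event type, with the event type as an additional feature. The Python sketch below uses the three subjects listed above; the split points 0, 1, 1.5, 3 and all function and column names are assumptions made for illustration.

```python
def to_cr_ped(subjects, cuts, causes):
    """Stack one piece-wise exponential data set per event type k."""
    rows = []
    for cause in causes:
        for sid, time, status in subjects:
            for j in range(1, len(cuts)):
                lo, hi = cuts[j - 1], cuts[j]
                if time <= lo:
                    break  # no longer at risk in interval j
                in_interval = time <= hi
                rows.append({
                    "id": sid,
                    "cause": cause,
                    "interval": j,
                    "t_j": hi,
                    # Event indicator for this cause only; events of other
                    # causes are treated like censoring.
                    "delta_ijk": int(in_interval and status == cause),
                    "t_ij": (time - lo) if in_interval else (hi - lo),
                })
    return rows


# Subjects from the slide: status 0 = censored, 1 or 2 = event type.
subjects = [(1, 1.3, 2), (2, 0.5, 0), (3, 2.7, 1)]
cuts = [0.0, 1.0, 1.5, 3.0]  # assumed split points
cr_ped = to_cr_ped(subjects, cuts, causes=(1, 2))
for row in cr_ped:
    print(row)
```

A single Poisson learner fitted on the stacked data can then use `cause` as a feature, either interacting with other features (cause-specific effects) or not (shared effects).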

Time-varying effects
Shared vs. cause-specific effects (in CR)

Experimental Results



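We use gradient boosted trees (GBT) as the computing engine for PEMs (more specifically XGBoost (Chen and Guestrin, 2016)) and compare them to oblique random survival forests (ORSF; Jaeger, Long, Long, et al., 2019) and DeepHit (Lee, Zame, Yoon, et al., 2018)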

Single Event and competing risks data sets

  • Standard data sets (directly available)
  • Synthetic data with time-varying effects (TVE)

For each data set

  • 20 subsamples, each split into train (70%) and test (30%) data
  • tuning on the training data (random search with fixed budget)
  • evaluation on the test data; performance is measured by the Brier Score at different time points (25%, 50% and 75% quantiles of the event times in the test data)
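For orientation, a Brier score at a single time point under right-censoring can be sketched in Python with inverse-probability-of-censoring weights. This is a generic textbook-style version, not necessarily the exact variant used in the experiments. The data are hypothetical, and ties between event and censoring times are not handled specially.

```python
def censoring_km(times, status):
    """Kaplan-Meier estimate G of the censoring survival function."""
    steps = []  # (time, value of G just after that time)
    g = 1.0
    for u in sorted(set(times)):
        at_risk = sum(1 for t in times if t >= u)
        censored = sum(1 for t, d in zip(times, status) if t == u and d == 0)
        if censored:
            g *= 1.0 - censored / at_risk
            steps.append((u, g))

    def G(t, left_limit=False):
        value = 1.0
        for u, g_u in steps:
            if u < t or (u == t and not left_limit):
                value = g_u
        return value

    return G


def brier_score(t_star, times, status, surv_pred):
    """Weighted Brier score at t_star; surv_pred[i] is S(t_star | x_i)."""
    G = censoring_km(times, status)
    total = 0.0
    for t, d, s in zip(times, status, surv_pred):
        if t <= t_star and d == 1:
            # Event before t_star: the true survival status is 0.
            total += (0.0 - s) ** 2 / G(t, left_limit=True)
        elif t > t_star:
            # Still at risk at t_star: the true survival status is 1.
            total += (1.0 - s) ** 2 / G(t_star)
        # Subjects censored before t_star enter only through the weights.
    return total / len(times)


# Hypothetical test data and predicted survival probabilities at t = 2.5.
times = [1.0, 2.0, 3.0, 4.0]
status = [1, 0, 1, 1]  # the second subject is censored at t = 2
surv_pred = [0.2, 0.3, 0.8, 0.9]
score = brier_score(2.5, times, status, surv_pred)
print(score)
```

Without censoring all weights equal 1 and the score reduces to the mean squared difference between the survival status at `t_star` and the predicted survival probability.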

Comparison with ORSF (single-event, right-censoring)
Evaluation w.r.t. Integrated Brier Score


Comparison with DeepHit (single-event and competing risks, right-censoring)
Evaluation w.r.t. weighted Brier Score


Choice of interval split points

  • Number and placement of interval split points could potentially be a tuning parameter

  • In our experience, setting split points at the observed event times results in good performance (many split points where many events are observed)

  • For large data sets, select a subset of the unique event times as split points
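A small Python sketch of this heuristic. How the subset is chosen for large data sets is not specified here, so the rank-based thinning below is an assumption, and the function name and the `max_points` parameter are hypothetical.

```python
def split_points(times, status, max_points=None):
    """Use the unique observed event times as interval split points.

    If max_points is given and there are more unique event times, keep a
    subset that is evenly spaced in rank order (always including the largest),
    so that regions with many events still receive many split points.
    """
    event_times = sorted({t for t, d in zip(times, status) if d == 1})
    n = len(event_times)
    if max_points is None or n <= max_points:
        return event_times
    # Integer arithmetic gives strictly increasing ranks because n > max_points.
    ranks = [((k + 1) * n) // max_points - 1 for k in range(max_points)]
    return [event_times[r] for r in ranks]


# Hypothetical observed times and event indicators (1 = event, 0 = censored).
times = [0.5, 1.0, 1.0, 2.0, 3.0, 4.0]
status = [1, 1, 0, 1, 0, 1]
all_points = split_points(times, status)
thinned = split_points(times, status, max_points=2)
print(all_points, thinned)
```

The returned values are interval end points; the lower bound $\kappa_0 = 0$ would be prepended when building the intervals.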

Concluding Remarks



  • General ML Framework for Survival Analysis (Bender, Rügamer, Scheipl, et al., 2020)

    • supports many survival tasks: time-varying effects (TVE), time-varying features (TVF), competing risks (CR), multi-state models (MSM)
    • does not require specialized software/algorithms
  • No assumptions w.r.t. the distribution of event times (the Poisson assumption is just a computational vehicle)

  • Framework for continuous-time survival analysis (the exact time enters via the offset; hazards and survival probabilities can be predicted for any time $t$)
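To illustrate the last point, a survival probability at an arbitrary time can be obtained from interval-specific hazards by accumulating hazard times time at risk. The Python sketch below uses hypothetical hazard values and split points; the function name is illustrative.

```python
import math


def survival_probability(t, cuts, hazards):
    """S(t) = exp(-sum_j lambda_j * time spent in interval j up to t).

    cuts:    split points kappa_0 < ... < kappa_J
    hazards: predicted hazard for each interval (kappa_{j-1}, kappa_j]
    """
    cum_hazard = 0.0
    for j in range(1, len(cuts)):
        lo, hi = cuts[j - 1], cuts[j]
        if t <= lo:
            break
        cum_hazard += hazards[j - 1] * (min(t, hi) - lo)
    return math.exp(-cum_hazard)


cuts = [0.0, 1.0, 1.5, 3.0]
hazards = [0.4, 0.9, 0.2]  # hypothetical predicted hazards for one subject
for t in (0.5, 1.3, 2.7):
    print(t, survival_probability(t, cuts, hazards))
```

The hazards are only defined up to the last split point, so in this sketch the cumulative hazard stops growing for times beyond it.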

References

Bender, A., D. Rügamer, F. Scheipl, et al. (2020). "A General Machine Learning Framework for Survival Analysis". arXiv: 2006.15442 [cs, stat].

Chen, T. and C. Guestrin (2016). "XGBoost: A Scalable Tree Boosting System". In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '16), pp. 785-794. DOI: 10.1145/2939672.2939785. arXiv: 1603.02754.

Friedman, M. (1982). "Piecewise Exponential Models for Survival Data with Covariates". In: The Annals of Statistics 10.1, pp. 101-113. ISSN: 0090-5364. URL: http://www.jstor.org/stable/2240502.

Jaeger, B. C., D. L. Long, D. M. Long, et al. (2019). "Oblique random survival forests". In: The Annals of Applied Statistics 13.3, pp. 1847-1883. ISSN: 1932-6157, 1941-7330. DOI: 10.1214/19-AOAS1261.

Lee, C., W. R. Zame, J. Yoon, et al. (2018). "DeepHit: A Deep Learning Approach to Survival Analysis With Competing Risks". In: Thirty-Second AAAI Conference on Artificial Intelligence. URL: https://www.aaai.org/ocs/index.php/AAAI/AAAI18/paper/view/16160.
