Estimate a survival function under current status sampling
Arguments
- time
n x 1 numeric vector of observed monitoring times. For individuals that were never monitored, this can be set to any arbitrary value, including NA, as long as the corresponding event variable is NA.
- event
n x 1 numeric vector of status indicators of whether an event was observed prior to the monitoring time. This value must be NA for individuals that were never monitored.
- X
n x p data frame of observed covariate values.
- SL_control
List of SuperLearner control parameters. This should be a named list; see the SuperLearner documentation for further information.
- HAL_control
List of haldensify control parameters. This should be a named list; see the haldensify documentation for further information.
- deriv_method
Method for computing the derivative. Options are "m-spline" (the default; fit a smoothing spline to the estimated function and differentiate the smooth approximation), "linear" (linearly interpolate the estimated function and use the slope of that line), and "line" (use the slope of the line connecting the endpoints of the estimated function).
- eval_region
Region over which to estimate the survival function, given as a numeric vector of its two endpoints (for example, c(0.05, 1.5)).
- n_eval_pts
Number of points in the grid on which to evaluate the survival function. The points will be evenly spaced, on the quantile scale, between the endpoints of eval_region.
- alpha
The level at which to compute confidence intervals and hypothesis tests. Defaults to 0.05.
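The phrase "evenly spaced, on the quantile scale" in the description of n_eval_pts can be illustrated with a minimal base R sketch. This is one plausible reading, not the package's actual implementation: the simulated time vector and the use of ecdf() and quantile() with type = 1 are assumptions made for illustration only.

```r
# Illustrative sketch only: one way to build a grid that is evenly spaced
# on the quantile scale. This is NOT taken from the survML source code.
set.seed(1)
time <- rexp(200)            # hypothetical monitoring times
eval_region <- c(0.05, 1.5)  # endpoints of the evaluation region
n_eval_pts <- 11

# Probabilities (empirical CDF values) at the two endpoints of the region
F_n <- ecdf(time)
p_range <- F_n(eval_region)

# Evenly spaced probabilities between the two endpoint probabilities
probs <- seq(p_range[1], p_range[2], length.out = n_eval_pts)

# Map the probabilities back to the time scale; type = 1 returns observed values
grid <- quantile(time, probs = probs, type = 1, names = FALSE)
```

Under this reading, grid points are dense where many monitoring times were observed and sparse where few were, in contrast to a grid that is evenly spaced on the time scale.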
Value
Data frame giving results, with columns:
- t
Time at which survival function is estimated
- S_hat_est
Survival function estimate
- S_hat_cil
Lower bound of confidence interval
- S_hat_ciu
Upper bound of confidence interval
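As a rough sketch of how the returned columns can be used, the code below builds a step function from the estimates and plots them with their confidence limits. The data frame res here is a hypothetical stand-in with made-up numbers and the documented column names; it is not output from the function.

```r
# Hypothetical stand-in for the returned data frame (values are made up)
res <- data.frame(
  t         = c(0.1, 0.5, 1.0, 1.5),
  S_hat_est = c(0.90, 0.70, 0.50, 0.35),
  S_hat_cil = c(0.82, 0.60, 0.40, 0.24),
  S_hat_ciu = c(0.98, 0.80, 0.60, 0.46)
)

# Right-continuous step function through the estimates; the first estimate
# is reused to the left of the first grid point
S_hat <- stepfun(res$t, c(res$S_hat_est[1], res$S_hat_est))

# Estimated survival probability at a time between grid points
s_075 <- S_hat(0.75)

# Plot the estimate (solid) with confidence limits (dashed)
plot(res$t, res$S_hat_est, type = "s", ylim = c(0, 1),
     xlab = "t", ylab = "Estimated survival")
lines(res$t, res$S_hat_cil, type = "s", lty = 2)
lines(res$t, res$S_hat_ciu, type = "s", lty = 2)
```

The step-function lookup mirrors the plotting code in the Examples section, which also treats the estimate as piecewise constant between grid points.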
Examples
# This is a small simulation example (not run)
set.seed(123)
n <- 300
x <- cbind(2*rbinom(n, size = 1, prob = 0.5)-1,
           2*rbinom(n, size = 1, prob = 0.5)-1)
t <- rweibull(n,
              shape = 0.75,
              scale = exp(0.4*x[,1] - 0.2*x[,2]))
y <- rweibull(n,
              shape = 0.75,
              scale = exp(0.4*x[,1] - 0.2*x[,2]))
# round each y to the nearest quantile of y, to reduce the number of unique values
quants <- quantile(y, probs = seq(0, 1, by = 0.05), type = 1)
for (i in seq_along(y)) {
  y[i] <- quants[which.min(abs(y[i] - quants))]
}
delta <- as.numeric(t <= y)
dat <- data.frame(y = y, delta = delta, x1 = x[,1], x2 = x[,2])
dat$delta[dat$y > 1.8] <- NA
dat$y[dat$y > 1.8] <- NA
eval_region <- c(0.05, 1.5)
res <- survML::currstatCIR(time = dat$y,
                           event = dat$delta,
                           X = dat[,3:4],
                           SL_control = list(SL.library = c("SL.mean", "SL.glm"),
                                             V = 3),
                           HAL_control = list(n_bins = c(5),
                                              grid_type = c("equal_mass"),
                                              V = 3),
                           eval_region = eval_region)
#> Warning: Some fit_control arguments are neither default nor glmnet/cv.glmnet arguments: n_folds;
#> They will be removed from fit_control
xvals <- res$t
yvals <- res$S_hat_est
fn <- stepfun(xvals, c(yvals[1], yvals))
plot.function(fn, from = min(xvals), to = max(xvals))
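Before calling the function on real data, it can help to confirm that the inputs follow the NA coding convention described under Arguments. The helper below is a hypothetical sketch and not part of survML. The name check_currstat_inputs and the specific checks are assumptions; in particular, the 0/1 coding of event is inferred from the example above rather than stated in the argument description.

```r
# Hypothetical helper (NOT part of survML): checks the documented input
# conventions and returns the number of monitored individuals.
check_currstat_inputs <- function(time, event, X) {
  n <- length(time)
  stopifnot(
    length(event) == n,
    nrow(X) == n,
    # assumed coding: event is 0, 1, or NA (NA = never monitored)
    all(event %in% c(0, 1, NA)),
    # a missing monitoring time must come with a missing event indicator
    all(is.na(event[is.na(time)]))
  )
  sum(!is.na(event))
}

# Small made-up data set: the third individual was never monitored
time  <- c(0.4, 1.2, NA, 0.9)
event <- c(0, 1, NA, 1)
X     <- data.frame(x1 = c(-1, 1, 1, -1), x2 = c(1, 1, -1, -1))

n_monitored <- check_currstat_inputs(time, event, X)
```

The check is deliberately one-directional: a non-missing time paired with an NA event is accepted, since the time argument may hold any arbitrary value for individuals who were never monitored.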