Title: | Variable Selection in High-Dimensional Logistic Regression Models using a Whitening Approach |
---|---|
Description: | It proposes a novel variable selection approach in classification problem that takes into account the correlations that may exist between the predictors of the design matrix in a high-dimensional logistic model. Our approach consists in rewriting the initial high-dimensional logistic model to remove the correlation between the predictors and in applying the generalized Lasso criterion. |
Authors: | Wencan Zhu |
Maintainer: | Wencan Zhu <[email protected]> |
License: | GPL-2 |
Version: | 2.1 |
Built: | 2025-03-08 03:31:45 UTC |
Source: | https://github.com/cran/WLogit |
It proposes a novel variable selection approach in classification problem that takes into account the correlations that may exist between the predictors of the design matrix in a high-dimensional logistic model. Our approach consists in rewriting the initial high-dimensional logistic model to remove the correlation between the predictors and in applying the generalized Lasso criterion.
The DESCRIPTION file:
Package: | WLogit |
Type: | Package |
Title: | Variable Selection in High-Dimensional Logistic Regression Models using a Whitening Approach |
Version: | 2.1 |
Date: | 2023-07-17 |
Author: | Wencan Zhu |
Maintainer: | Wencan Zhu <[email protected]> |
Description: | It proposes a novel variable selection approach in classification problem that takes into account the correlations that may exist between the predictors of the design matrix in a high-dimensional logistic model. Our approach consists in rewriting the initial high-dimensional logistic model to remove the correlation between the predictors and in applying the generalized Lasso criterion. |
License: | GPL-2 |
Imports: | cvCovEst, genlasso, tibble, MASS, ggplot2, Matrix, glmnet, corpcor |
VignetteBuilder: | knitr |
Suggests: | knitr |
Depends: | R (>= 3.5.0) |
NeedsCompilation: | no |
Packaged: | 2023-07-17 06:44:04 UTC; chenxi |
Date/Publication: | 2023-07-17 07:10:06 UTC |
Config/pak/sysreqs: | cmake libglpk-dev make libicu-dev libxml2-dev |
Repository: | https://clavie3009.r-universe.dev |
RemoteUrl: | https://github.com/cran/WLogit |
RemoteRef: | HEAD |
RemoteSha: | 921ec1a7a4b440b3a8258257f2638f9f34b6f947 |
Index of help topics:
CalculPx Calculate the class-conditional probabilities. CalculWeight Calculate the weight Refit_glm Refit the logistic regression with chosen variables Thresholding Thresholding on a vector WLogit-package Variable Selection in High-Dimensional Logistic Regression Models using a Whitening Approach WhiteningLogit Variable selection in high-dimensional logistic regression models using a whitening approach WorkingResp Calculate the working response X Example of a design matrix of a logistic model beta True coefficients in the esample. test WLogit output top Thresholding to zero of the smallest values top_thresh Thresholding to a given threshold of the smallest values y Example of a binary response variable of a logistic model.
Further information is available in the following vignettes:
Vignettes |
WLogit package (source, pdf) |
This package consists of functions: "WhiteningLogit", "CalculPx", "CalculWeight", "Refit_glm", "top", "top_thresh", "WorkingResp", and "Thresholding". For further information on how to use these functions, we refer the reader to the vignette of the package.
Wencan Zhu
Maintainer: Wencan Zhu <[email protected]>
W. Zhu, C. Levy-Leduc, N. Ternes. "Variable selection in high-dimensional logistic regression models using a whitening approach". (2022)
True coefficients in the esample given in the vignette.
data("beta")
data("beta")
The format is: num [1:500] 1 1 1 1 1 1 1 1 1 1 ...
data(beta) plot(beta)
data(beta) plot(beta)
Calculate the probability for a repsonse to be 1 in the logistic regression model.
CalculPx(X, beta, intercept = 0)
CalculPx(X, beta, intercept = 0)
X |
Design matrix of the logistic model considered. |
beta |
Vector of coefficients of the logistic model considered. |
intercept |
Whether there is the intercept |
prob |
the probability for a repsonse to be 1 |
Wencan Zhu, Celine Levy-Leduc, Nils Ternes
Please read https://hastie.su.domains/Papers/glmnet.pdf for more details
data(X) data(beta) CalculPx(X=X, beta=beta) ##---- Should be DIRECTLY executable !! ---- ##-- ==> Define data, use random, ##-- or do help(data=index) for the standard data sets. ## The function is currently defined as function (X, beta, intercept = 0) { prob <- 1/(1 + exp(-(X %*% beta + intercept))) return(prob) }
data(X) data(beta) CalculPx(X=X, beta=beta) ##---- Should be DIRECTLY executable !! ---- ##-- ==> Define data, use random, ##-- or do help(data=index) for the standard data sets. ## The function is currently defined as function (X, beta, intercept = 0) { prob <- 1/(1 + exp(-(X %*% beta + intercept))) return(prob) }
Calculate the weight in the penalized weighted- least-squares problem
CalculWeight(Px)
CalculWeight(Px)
Px |
The vector of estimated probability for each response to be 1. |
Wencan Zhu, Celine Levy-Leduc, Nils Ternes
Please read https://hastie.su.domains/Papers/glmnet.pdf for more details
data(X) data(beta) px <- CalculPx(X=X, beta=beta) CalculWeight(px) ##---- Should be DIRECTLY executable !! ---- ##-- ==> Define data, use random, ##-- or do help(data=index) for the standard data sets. ## The function is currently defined as function (Px) { return(Px * (1 - Px)) }
data(X) data(beta) px <- CalculPx(X=X, beta=beta) CalculWeight(px) ##---- Should be DIRECTLY executable !! ---- ##-- ==> Define data, use random, ##-- or do help(data=index) for the standard data sets. ## The function is currently defined as function (Px) { return(Px * (1 - Px)) }
Refit the logistic regression with chosen variables.
Refit_glm(X, beta_pred, y)
Refit_glm(X, beta_pred, y)
X |
Design matrix of the logistic model considered. |
beta_pred |
Predicted coefficients to be refited. |
y |
Binary response |
beta_refit |
The new estimated coefficients |
Wencan Zhu, Celine Levy-Leduc, Nils Ternes
data(X) data(y) data(beta) Refit_glm(X=X, beta_pred=beta, y=y) ##---- Should be DIRECTLY executable !! ---- ##-- ==> Define data, use random, ##-- or do help(data=index) for the standard data sets. ## The function is currently defined as function (X, beta_pred, y) { X_temp <- X[, which(beta_pred != 0)] if (length(which(beta_pred != 0)) == 0) { coef_est <- beta_pred } else if (is.null(ncol(X_temp))) { mydata <- data.frame(Y = y, X_temp) colnames(mydata) <- c("Y", "X") formula <- paste0("Y~-1 +", paste0(colnames(mydata)[-which(colnames(mydata) == "Y")], collapse = " + ")) myform <- as.formula(formula) mod_lm <- glm(myform, data = mydata, family = "binomial") coef_est <- mod_lm$coefficients } else { mydata <- data.frame(Y = y, as.matrix(X_temp)) formula <- paste0("Y~-1 +", paste0(colnames(mydata)[-which(colnames(mydata) == "Y")], collapse = " + ")) myform <- as.formula(formula) if (length(which(beta_pred != 0)) >= length(y)) { mod_ridge <- cv.glmnet(x = as.matrix(X_temp), y = y, alpha = 0, intercept = FALSE, family = "binomial") opt_lambda <- mod_ridge$lambda[which.min(mod_ridge$cvm)] coef_est <- as.vector(glmnet(x = as.matrix(X), y = y, alpha = 0, intercept = FALSE, family = "binomial", lambda = opt_lambda)$beta) } else { mod_lm <- glm(myform, data = mydata, family = "binomial") coef_est <- mod_lm$coefficients } } beta_refit <- rep(0, length(beta_pred)) beta_refit[which(beta_pred != 0)] <- coef_est return(beta_refit) }
data(X) data(y) data(beta) Refit_glm(X=X, beta_pred=beta, y=y) ##---- Should be DIRECTLY executable !! ---- ##-- ==> Define data, use random, ##-- or do help(data=index) for the standard data sets. ## The function is currently defined as function (X, beta_pred, y) { X_temp <- X[, which(beta_pred != 0)] if (length(which(beta_pred != 0)) == 0) { coef_est <- beta_pred } else if (is.null(ncol(X_temp))) { mydata <- data.frame(Y = y, X_temp) colnames(mydata) <- c("Y", "X") formula <- paste0("Y~-1 +", paste0(colnames(mydata)[-which(colnames(mydata) == "Y")], collapse = " + ")) myform <- as.formula(formula) mod_lm <- glm(myform, data = mydata, family = "binomial") coef_est <- mod_lm$coefficients } else { mydata <- data.frame(Y = y, as.matrix(X_temp)) formula <- paste0("Y~-1 +", paste0(colnames(mydata)[-which(colnames(mydata) == "Y")], collapse = " + ")) myform <- as.formula(formula) if (length(which(beta_pred != 0)) >= length(y)) { mod_ridge <- cv.glmnet(x = as.matrix(X_temp), y = y, alpha = 0, intercept = FALSE, family = "binomial") opt_lambda <- mod_ridge$lambda[which.min(mod_ridge$cvm)] coef_est <- as.vector(glmnet(x = as.matrix(X), y = y, alpha = 0, intercept = FALSE, family = "binomial", lambda = opt_lambda)$beta) } else { mod_lm <- glm(myform, data = mydata, family = "binomial") coef_est <- mod_lm$coefficients } } beta_refit <- rep(0, length(beta_pred)) beta_refit[which(beta_pred != 0)] <- coef_est return(beta_refit) }
The output of WLogit in the example given in the vignette.
data("test")
data("test")
The format is: List of 4 $ beta : num [1:50, 1:500] 0 0 0 0 0 ... $ lambda : num [1:50] 100.8 80 73 58.9 56.7 ... $ beta.min : num [1:500] 0.0194 0.0348 0.0259 0.0287 0.0385 ... $ log.likelihood: num [1:50] 57.7 57.7 57.7 57.7 57.7 ...
data(test) str(test)
data(test) str(test)
This function provides the thresholding (correction) given a vector. It calls the function top
or top_thresh
in the same package, and the output is the vector after correction with the optimal threshold parameter.
Thresholding(X, y, coef, TOP)
Thresholding(X, y, coef, TOP)
X |
Design matrix of the logistic model considered. |
y |
Binary response |
coef |
Candidate vector to be corrected |
TOP |
The grill of thresholding |
opt_top |
The optimal threshold |
auc |
the log-likelihood for each grill of thresholding |
Wencan Zhu, Celine Levy-Leduc, Nils Ternes
This function keeps only the K largest values of the vector sorted_vect
and sets the others to zero.
top(vect, thresh)
top(vect, thresh)
vect |
vector to threshold |
thresh |
threshold |
This function returns the thresholded vector.
Wencan Zhu, Celine Levy-Leduc, Nils Ternes
x=sample(1:10,10) thresh=3 top(x,thresh) ##---- Should be DIRECTLY executable !! ---- ##-- ==> Define data, use random, ##-- or do help(data=index) for the standard data sets. ## The function is currently defined as function (vect, thresh) { sorted_vect <- sort(abs(vect), decreasing = TRUE) v = sorted_vect[thresh] ifelse(abs(vect) >= v, vect, 0) }
x=sample(1:10,10) thresh=3 top(x,thresh) ##---- Should be DIRECTLY executable !! ---- ##-- ==> Define data, use random, ##-- or do help(data=index) for the standard data sets. ## The function is currently defined as function (vect, thresh) { sorted_vect <- sort(abs(vect), decreasing = TRUE) v = sorted_vect[thresh] ifelse(abs(vect) >= v, vect, 0) }
This function keeps only the K largest values of the vector vect
and sets the others to the smallest value among the K largest.
top_thresh(vect, thresh)
top_thresh(vect, thresh)
vect |
vector to threshold |
thresh |
threshold |
This function returns the thresholded vector.
Wencan Zhu, Celine Levy-Leduc, Nils Ternes
x=sample(1:10,10) sorted_vect=sort(x,decreasing=TRUE) thresh=3 top_thresh(x,thresh) ##---- Should be DIRECTLY executable !! ---- ##-- ==> Define data, use random, ##-- or do help(data=index) for the standard data sets. ## The function is currently defined as function (vect, thresh) { sorted_vect <- sort(vect, decreasing = TRUE) v = sorted_vect[thresh] ifelse(vect >= v, vect, v) }
x=sample(1:10,10) sorted_vect=sort(x,decreasing=TRUE) thresh=3 top_thresh(x,thresh) ##---- Should be DIRECTLY executable !! ---- ##-- ==> Define data, use random, ##-- or do help(data=index) for the standard data sets. ## The function is currently defined as function (vect, thresh) { sorted_vect <- sort(vect, decreasing = TRUE) v = sorted_vect[thresh] ifelse(vect >= v, vect, v) }
Variable selection in high-dimensional logistic regression models using a whitening approach
WhiteningLogit(X = X, y = y, nlambda = 50, maxit = 100, gamma = 0.9999, top_grill=c(1:100))
WhiteningLogit(X = X, y = y, nlambda = 50, maxit = 100, gamma = 0.9999, top_grill=c(1:100))
X |
Design matrix of the logistic model considered. |
y |
Binary response of the logistic model considered. |
nlambda |
Number of lambda |
maxit |
Integer specifying the maximum number of steps for the generalized Lasso algorithm. It should not be smaller than nlambda. |
gamma |
Parameter |
top_grill |
A grill of provided for the thresholding |
Returns a list with the following components
lambda |
different values of the parameter |
beta |
matrix of the estimations of |
beta.min |
estimation of |
log.likelihood |
Log-likelihood for all the |
Wencan Zhu, Celine Levy-Leduc, Nils Ternes
W. Zhu, C. Levy-Leduc, N. Ternes. "Variable selection in high-dimensional logistic regression models using a whitening approach". (2022)
X0 <- matrix( rnorm(50*10,mean=0,sd=1), 50, 10) y0 <- c(rep(1,25), rep(0,25)) mod <- WhiteningLogit(X=X0, y=y0) plot(mod$beta.min) ##---- Should be DIRECTLY executable !! ---- ##-- ==> Define data, use random, ##-- or do help(data=index) for the standard data sets. ## The function is currently defined as function(X=X, y=y, nlambda=50, maxit=100, gamma=0.9999, top_grill=c(1:100)){ p=ncol(X) n=nrow(X) mod_ridge <- cv.glmnet(x=as.matrix(X), y=y, alpha=0.5, intercept=FALSE, family="binomial") pr_est <- predict(mod_ridge, as.matrix(X), s = "lambda.min", type="response") beta_ini <- predict(mod_ridge, as.matrix(X), s = "lambda.min", type="coefficients")[-1] diag_w <- pr_est*(1-pr_est) square_root_w <- diag(sqrt(as.vector(diag_w)), nrow=n) X_new <- square_root_w Cov_est <- cvCovEst( dat = X_new, estimators = c( linearShrinkLWEst, thresholdingEst, sampleCovEst ), estimator_params = list( thresholdingEst = list(gamma = seq(0.1, 0.3, 0.1)) ), center = TRUE, scale = TRUE ) Sigma_est <- Cov_est$estimate SVD_new <- fast.svd(Sigma_est) U_sigma_new <- SVD_new$u D_sigma_new <- SVD_new$d inv_transmat <- U_sigma_new inv_diag_new <- ifelse(D_sigma_new<0.000001, 0, 1/sqrt(D_sigma_new)) trans_mat <- U_sigma_new if (p <= 50) { top_grill <- seq(1, p, 2) }else if (p <= 200) { top_grill <- c(1:50, seq(52, p, 2)) }else if (p <= 300) { top_grill <- c(1:50, seq(52, 100, 2), seq(105, 200, 5), seq(210, p, 10)) }else { top_grill <- c(1:50, seq(52, 100, 2), seq(105, 200, 5), seq(210, 300, 10)) } X_tilde <- X beta_tilde_ini <- inv_transmat Px <- CalculPx(X_tilde, beta=beta_tilde_ini) wt <- CalculWeight(Px) # wt <- ifelse(wt0==0, 0.0001, wt0) ystar <- WorkingResp(y=y, Px=Px, X=X_tilde, beta=beta_tilde_ini) X_tilde_weighted <- sweep(X, MARGIN=1, sqrt(wt), `*`) ystar_weighted <- sqrt(wt)*ystar gen.model0 <- genlasso(y=ystar_weighted, X=X_tilde_weighted, D=trans_mat, maxsteps = 50) parameter_tmp <- beta_tilde_ini beta_final <- matrix(NA, length(gen.model0$lambda), p) skip_i <- TRUE eval_final <- c() defaultW <- getOption("warn") options(warn = -1) for(i in 1:length(gen.model0$lambda)){ #inner loop epsilon=10 j=0 if(skip_i){parameter_tmp <- beta_tilde_ini } else {parameter_tmp <- parameter_current} skip_i <-FALSE while(epsilon > 0.001){ j=j+1 parameter_current <- parameter_tmp Px <- CalculPx(X_tilde, beta=parameter_current) wt0 <- CalculWeight(Px) wt <- ifelse(round(wt0,4)==0, 0.0001, wt) ystar <- WorkingResp(y=y, Px=Px, X=X_tilde, beta=parameter_current) X_tilde_weighted <- sweep(X, MARGIN=1, sqrt(wt), `*`) ystar_weighted <- sqrt(wt)*ystar gen.model <- genlasso(y=ystar_weighted, X=X_tilde_weighted, D=trans_mat, maxsteps = maxit) if(gen.model0$lambda[i] < min(gen.model$lambda)){ parameter_tmp <- parameter_current break } else { parameter_tmp <- coef(gen.model, lambda=gen.model0$lambda[i], type = "primal")$beta beta_current <- parameter_tmp if(sum(is.na(parameter_tmp))>0){ skip_i <-TRUE parameter_tmp <- rep(0,p) break} epsilon <- max(abs(parameter_current-parameter_tmp)) if(epsilon >=100){ skip_i <-TRUE break} if (j==maxit){ skip_i <-TRUE break} } } if(skip_i){ beta_final[i, ] <- rep(NA, p) eval_final[i] <- NA } else{ correction <- Thresholding(X_tilde, y, coef=parameter_tmp, TOP=top_grill) opt_top_tilde <- correction$opt_top beta_tilde_opt <- top_thresh(vect=parameter_tmp, thresh = opt_top_tilde) beta_final0 <- trans_mat correction <- Thresholding(X, y, coef=beta_final0, TOP=top_grill) opt_top_final <- correction$opt_top beta_final[i, ] <- beta_opt_final <- top(vect=beta_final0, thresh = opt_top_final) beta_refit <- Refit_glm(X=X, beta_pred = beta_opt_final, y=y) pr_est <- CalculPx(X, beta_refit) ll <- pr_est^y*(1-pr_est)^(1-y) #ll <- ifelse(ll<0.000001, 1, ll) eval_final[i] <- -log(prod(ll)) } beta.min <- beta_final[which.min(eval_final), ] } options(warn = defaultW) return(list(beta=beta_final, lambda=gen.model0$lambda, beta.min=beta.min, log.likelihood=eval_final)) }
X0 <- matrix( rnorm(50*10,mean=0,sd=1), 50, 10) y0 <- c(rep(1,25), rep(0,25)) mod <- WhiteningLogit(X=X0, y=y0) plot(mod$beta.min) ##---- Should be DIRECTLY executable !! ---- ##-- ==> Define data, use random, ##-- or do help(data=index) for the standard data sets. ## The function is currently defined as function(X=X, y=y, nlambda=50, maxit=100, gamma=0.9999, top_grill=c(1:100)){ p=ncol(X) n=nrow(X) mod_ridge <- cv.glmnet(x=as.matrix(X), y=y, alpha=0.5, intercept=FALSE, family="binomial") pr_est <- predict(mod_ridge, as.matrix(X), s = "lambda.min", type="response") beta_ini <- predict(mod_ridge, as.matrix(X), s = "lambda.min", type="coefficients")[-1] diag_w <- pr_est*(1-pr_est) square_root_w <- diag(sqrt(as.vector(diag_w)), nrow=n) X_new <- square_root_w Cov_est <- cvCovEst( dat = X_new, estimators = c( linearShrinkLWEst, thresholdingEst, sampleCovEst ), estimator_params = list( thresholdingEst = list(gamma = seq(0.1, 0.3, 0.1)) ), center = TRUE, scale = TRUE ) Sigma_est <- Cov_est$estimate SVD_new <- fast.svd(Sigma_est) U_sigma_new <- SVD_new$u D_sigma_new <- SVD_new$d inv_transmat <- U_sigma_new inv_diag_new <- ifelse(D_sigma_new<0.000001, 0, 1/sqrt(D_sigma_new)) trans_mat <- U_sigma_new if (p <= 50) { top_grill <- seq(1, p, 2) }else if (p <= 200) { top_grill <- c(1:50, seq(52, p, 2)) }else if (p <= 300) { top_grill <- c(1:50, seq(52, 100, 2), seq(105, 200, 5), seq(210, p, 10)) }else { top_grill <- c(1:50, seq(52, 100, 2), seq(105, 200, 5), seq(210, 300, 10)) } X_tilde <- X beta_tilde_ini <- inv_transmat Px <- CalculPx(X_tilde, beta=beta_tilde_ini) wt <- CalculWeight(Px) # wt <- ifelse(wt0==0, 0.0001, wt0) ystar <- WorkingResp(y=y, Px=Px, X=X_tilde, beta=beta_tilde_ini) X_tilde_weighted <- sweep(X, MARGIN=1, sqrt(wt), `*`) ystar_weighted <- sqrt(wt)*ystar gen.model0 <- genlasso(y=ystar_weighted, X=X_tilde_weighted, D=trans_mat, maxsteps = 50) parameter_tmp <- beta_tilde_ini beta_final <- matrix(NA, length(gen.model0$lambda), p) skip_i <- TRUE eval_final <- c() defaultW <- getOption("warn") options(warn = -1) for(i in 1:length(gen.model0$lambda)){ #inner loop epsilon=10 j=0 if(skip_i){parameter_tmp <- beta_tilde_ini } else {parameter_tmp <- parameter_current} skip_i <-FALSE while(epsilon > 0.001){ j=j+1 parameter_current <- parameter_tmp Px <- CalculPx(X_tilde, beta=parameter_current) wt0 <- CalculWeight(Px) wt <- ifelse(round(wt0,4)==0, 0.0001, wt) ystar <- WorkingResp(y=y, Px=Px, X=X_tilde, beta=parameter_current) X_tilde_weighted <- sweep(X, MARGIN=1, sqrt(wt), `*`) ystar_weighted <- sqrt(wt)*ystar gen.model <- genlasso(y=ystar_weighted, X=X_tilde_weighted, D=trans_mat, maxsteps = maxit) if(gen.model0$lambda[i] < min(gen.model$lambda)){ parameter_tmp <- parameter_current break } else { parameter_tmp <- coef(gen.model, lambda=gen.model0$lambda[i], type = "primal")$beta beta_current <- parameter_tmp if(sum(is.na(parameter_tmp))>0){ skip_i <-TRUE parameter_tmp <- rep(0,p) break} epsilon <- max(abs(parameter_current-parameter_tmp)) if(epsilon >=100){ skip_i <-TRUE break} if (j==maxit){ skip_i <-TRUE break} } } if(skip_i){ beta_final[i, ] <- rep(NA, p) eval_final[i] <- NA } else{ correction <- Thresholding(X_tilde, y, coef=parameter_tmp, TOP=top_grill) opt_top_tilde <- correction$opt_top beta_tilde_opt <- top_thresh(vect=parameter_tmp, thresh = opt_top_tilde) beta_final0 <- trans_mat correction <- Thresholding(X, y, coef=beta_final0, TOP=top_grill) opt_top_final <- correction$opt_top beta_final[i, ] <- beta_opt_final <- top(vect=beta_final0, thresh = opt_top_final) beta_refit <- Refit_glm(X=X, beta_pred = beta_opt_final, y=y) pr_est <- CalculPx(X, beta_refit) ll <- pr_est^y*(1-pr_est)^(1-y) #ll <- ifelse(ll<0.000001, 1, ll) eval_final[i] <- -log(prod(ll)) } beta.min <- beta_final[which.min(eval_final), ] } options(warn = defaultW) return(list(beta=beta_final, lambda=gen.model0$lambda, beta.min=beta.min, log.likelihood=eval_final)) }
Calculate the working response in the iterative least square regression
WorkingResp(y, Px, X, beta, intercept = 0)
WorkingResp(y, Px, X, beta, intercept = 0)
X |
Design matrix of the logistic model considered. |
y |
Binary response of the logistic model considered. |
Px |
The probability of the reponse to be 1 |
beta |
Vector of coefficients |
intercept |
If there is an intercept |
This function returns the vector of working response.
Wencan Zhu, Celine Levy-Leduc, Nils Ternes
Please read https://hastie.su.domains/Papers/glmnet.pdf for more details
It contains an example of a design matrix of a logistic model.
data("X")
data("X")
The format is: num [1:100, 1:500] -1.576 -0.476 -0.237 -0.398 0.284 ...
data(X)
data(X)
It contains an example of a binary response variable of a logistic model.
data("y")
data("y")
The format is: int [1:100] 0 1 0 1 1 0 0 0 1 1 ...
data(y)
data(y)