Package 'WLogit' reference manual

Title:	Variable Selection in High-Dimensional Logistic Regression Models using a Whitening Approach
Description:	It proposes a novel variable selection approach in classification problem that takes into account the correlations that may exist between the predictors of the design matrix in a high-dimensional logistic model. Our approach consists in rewriting the initial high-dimensional logistic model to remove the correlation between the predictors and in applying the generalized Lasso criterion.
Authors:	Wencan Zhu
Maintainer:	Wencan Zhu <[email protected]>
License:	GPL-2
Version:	2.1
Built:	2025-03-08 03:31:45 UTC
Source:	https://github.com/cran/WLogit

Variable Selection in High-Dimensional Logistic Regression Models using a Whitening Approach

Description

It proposes a novel variable selection approach in classification problem that takes into account the correlations that may exist between the predictors of the design matrix in a high-dimensional logistic model. Our approach consists in rewriting the initial high-dimensional logistic model to remove the correlation between the predictors and in applying the generalized Lasso criterion.

Details

The DESCRIPTION file:

Package:	WLogit
Type:	Package
Title:	Variable Selection in High-Dimensional Logistic Regression Models using a Whitening Approach
Version:	2.1
Date:	2023-07-17
Author:	Wencan Zhu
Maintainer:	Wencan Zhu <[email protected]>
Description:	It proposes a novel variable selection approach in classification problem that takes into account the correlations that may exist between the predictors of the design matrix in a high-dimensional logistic model. Our approach consists in rewriting the initial high-dimensional logistic model to remove the correlation between the predictors and in applying the generalized Lasso criterion.
License:	GPL-2
Imports:	cvCovEst, genlasso, tibble, MASS, ggplot2, Matrix, glmnet, corpcor
VignetteBuilder:	knitr
Suggests:	knitr
Depends:	R (>= 3.5.0)
NeedsCompilation:	no
Packaged:	2023-07-17 06:44:04 UTC; chenxi
Date/Publication:	2023-07-17 07:10:06 UTC
Config/pak/sysreqs:	cmake libglpk-dev make libicu-dev libxml2-dev
Repository:	https://clavie3009.r-universe.dev
RemoteUrl:	https://github.com/cran/WLogit
RemoteRef:	HEAD
RemoteSha:	921ec1a7a4b440b3a8258257f2638f9f34b6f947

Index of help topics:

CalculPx                Calculate the class-conditional probabilities.
CalculWeight            Calculate the weight
Refit_glm               Refit the logistic regression with chosen
                        variables
Thresholding            Thresholding on a vector
WLogit-package          Variable Selection in High-Dimensional Logistic
                        Regression Models using a Whitening Approach
WhiteningLogit          Variable selection in high-dimensional logistic
                        regression models using a whitening approach
WorkingResp             Calculate the working response
X                       Example of a design matrix of a logistic model
beta                    True coefficients in the esample.
test                    WLogit output
top                     Thresholding to zero of the smallest values
top_thresh              Thresholding to a given threshold of the
                        smallest values
y                       Example of a binary response variable of a
                        logistic model.

Further information is available in the following vignettes:

`Vignettes`	WLogit package (source, pdf)

This package consists of functions: "WhiteningLogit", "CalculPx", "CalculWeight", "Refit_glm", "top", "top_thresh", "WorkingResp", and "Thresholding". For further information on how to use these functions, we refer the reader to the vignette of the package.

Author(s)

Wencan Zhu

Maintainer: Wencan Zhu <[email protected]>

References

W. Zhu, C. Levy-Leduc, N. Ternes. "Variable selection in high-dimensional logistic regression models using a whitening approach". (2022)

True coefficients in the esample.

Description

True coefficients in the esample given in the vignette.

Usage

data("beta")data("beta")

Format

The format is: num [1:500] 1 1 1 1 1 1 1 1 1 1 ...

Examples

data(beta)
plot(beta)
data(beta)
plot(beta)

Calculate the class-conditional probabilities.

Description

Calculate the probability for a repsonse to be 1 in the logistic regression model.

Usage

CalculPx(X, beta, intercept = 0)
CalculPx(X, beta, intercept = 0)

Arguments

`X`	Design matrix of the logistic model considered.
`beta`	Vector of coefficients of the logistic model considered.
`intercept`	Whether there is the intercept

Value

prob

the probability for a repsonse to be 1

Author(s)

Wencan Zhu, Celine Levy-Leduc, Nils Ternes

Examples

data(X)
data(beta)
CalculPx(X=X, beta=beta)


##---- Should be DIRECTLY executable !! ----
##-- ==>  Define data, use random,
##--	or do  help(data=index)  for the standard data sets.

## The function is currently defined as
function (X, beta, intercept = 0) 
{
    prob <- 1/(1 + exp(-(X %*% beta + intercept)))
    return(prob)
  }
data(X)
data(beta)
CalculPx(X=X, beta=beta)


##---- Should be DIRECTLY executable !! ----
##-- ==>  Define data, use random,
##--	or do  help(data=index)  for the standard data sets.

## The function is currently defined as
function (X, beta, intercept = 0) 
{
    prob <- 1/(1 + exp(-(X %*% beta + intercept)))
    return(prob)
  }

Calculate the weight

Description

Calculate the weight in the penalized weighted- least-squares problem

Usage

CalculWeight(Px)
CalculWeight(Px)

Arguments

`Px`	The vector of estimated probability for each response to be 1.

Author(s)

Wencan Zhu, Celine Levy-Leduc, Nils Ternes

Examples

data(X)
data(beta)
px <- CalculPx(X=X, beta=beta)
CalculWeight(px)
##---- Should be DIRECTLY executable !! ----
##-- ==>  Define data, use random,
##--	or do  help(data=index)  for the standard data sets.

## The function is currently defined as
function (Px) 
{
    return(Px * (1 - Px))
  }
data(X)
data(beta)
px <- CalculPx(X=X, beta=beta)
CalculWeight(px)
##---- Should be DIRECTLY executable !! ----
##-- ==>  Define data, use random,
##--	or do  help(data=index)  for the standard data sets.

## The function is currently defined as
function (Px) 
{
    return(Px * (1 - Px))
  }

Refit the logistic regression with chosen variables

Description

Refit the logistic regression with chosen variables.

Usage

Refit_glm(X, beta_pred, y)
Refit_glm(X, beta_pred, y)

Arguments

`X`	Design matrix of the logistic model considered.
`beta_pred`	Predicted coefficients to be refited.
`y`	Binary response

Value

beta_refit

The new estimated coefficients

Author(s)

Wencan Zhu, Celine Levy-Leduc, Nils Ternes

Examples

data(X)
data(y)
data(beta)
Refit_glm(X=X, beta_pred=beta, y=y)
##---- Should be DIRECTLY executable !! ----
##-- ==>  Define data, use random,
##--	or do  help(data=index)  for the standard data sets.

## The function is currently defined as
function (X, beta_pred, y) 
{
    X_temp <- X[, which(beta_pred != 0)]
    if (length(which(beta_pred != 0)) == 0) {
        coef_est <- beta_pred
    }
    else if (is.null(ncol(X_temp))) {
        mydata <- data.frame(Y = y, X_temp)
        colnames(mydata) <- c("Y", "X")
        formula <- paste0("Y~-1 +", paste0(colnames(mydata)[-which(colnames(mydata) == 
            "Y")], collapse = " + "))
        myform <- as.formula(formula)
        mod_lm <- glm(myform, data = mydata, family = "binomial")
        coef_est <- mod_lm$coefficients
    }
    else {
        mydata <- data.frame(Y = y, as.matrix(X_temp))
        formula <- paste0("Y~-1 +", paste0(colnames(mydata)[-which(colnames(mydata) == 
            "Y")], collapse = " + "))
        myform <- as.formula(formula)
        if (length(which(beta_pred != 0)) >= length(y)) {
            mod_ridge <- cv.glmnet(x = as.matrix(X_temp), y = y, 
                alpha = 0, intercept = FALSE, family = "binomial")
            opt_lambda <- mod_ridge$lambda[which.min(mod_ridge$cvm)]
            coef_est <- as.vector(glmnet(x = as.matrix(X), y = y, 
                alpha = 0, intercept = FALSE, family = "binomial", 
                lambda = opt_lambda)$beta)
        }
        else {
            mod_lm <- glm(myform, data = mydata, family = "binomial")
            coef_est <- mod_lm$coefficients
        }
    }
    beta_refit <- rep(0, length(beta_pred))
    beta_refit[which(beta_pred != 0)] <- coef_est
    return(beta_refit)
  }
data(X)
data(y)
data(beta)
Refit_glm(X=X, beta_pred=beta, y=y)
##---- Should be DIRECTLY executable !! ----
##-- ==>  Define data, use random,
##--	or do  help(data=index)  for the standard data sets.

## The function is currently defined as
function (X, beta_pred, y) 
{
    X_temp <- X[, which(beta_pred != 0)]
    if (length(which(beta_pred != 0)) == 0) {
        coef_est <- beta_pred
    }
    else if (is.null(ncol(X_temp))) {
        mydata <- data.frame(Y = y, X_temp)
        colnames(mydata) <- c("Y", "X")
        formula <- paste0("Y~-1 +", paste0(colnames(mydata)[-which(colnames(mydata) == 
            "Y")], collapse = " + "))
        myform <- as.formula(formula)
        mod_lm <- glm(myform, data = mydata, family = "binomial")
        coef_est <- mod_lm$coefficients
    }
    else {
        mydata <- data.frame(Y = y, as.matrix(X_temp))
        formula <- paste0("Y~-1 +", paste0(colnames(mydata)[-which(colnames(mydata) == 
            "Y")], collapse = " + "))
        myform <- as.formula(formula)
        if (length(which(beta_pred != 0)) >= length(y)) {
            mod_ridge <- cv.glmnet(x = as.matrix(X_temp), y = y, 
                alpha = 0, intercept = FALSE, family = "binomial")
            opt_lambda <- mod_ridge$lambda[which.min(mod_ridge$cvm)]
            coef_est <- as.vector(glmnet(x = as.matrix(X), y = y, 
                alpha = 0, intercept = FALSE, family = "binomial", 
                lambda = opt_lambda)$beta)
        }
        else {
            mod_lm <- glm(myform, data = mydata, family = "binomial")
            coef_est <- mod_lm$coefficients
        }
    }
    beta_refit <- rep(0, length(beta_pred))
    beta_refit[which(beta_pred != 0)] <- coef_est
    return(beta_refit)
  }

WLogit output

Description

The output of WLogit in the example given in the vignette.

Usage

data("test")data("test")

Format

The format is: List of 4 $ beta : num [1:50, 1:500] 0 0 0 0 0 ... $ lambda : num [1:50] 100.8 80 73 58.9 56.7 ... $ beta.min : num [1:500] 0.0194 0.0348 0.0259 0.0287 0.0385 ... $ log.likelihood: num [1:50] 57.7 57.7 57.7 57.7 57.7 ...

Examples

data(test)
str(test)
data(test)
str(test)

Thresholding on a vector

Description

This function provides the thresholding (correction) given a vector. It calls the function top or top_thresh in the same package, and the output is the vector after correction with the optimal threshold parameter.

Usage

Thresholding(X, y, coef, TOP)
Thresholding(X, y, coef, TOP)

Arguments

`X`	Design matrix of the logistic model considered.
`y`	Binary response
`coef`	Candidate vector to be corrected
`TOP`	The grill of thresholding

Value

`opt_top`	The optimal threshold
`auc`	the log-likelihood for each grill of thresholding

Author(s)

Wencan Zhu, Celine Levy-Leduc, Nils Ternes

Thresholding to zero of the smallest values

Description

This function keeps only the K largest values of the vector sorted_vect and sets the others to zero.

Usage

top(vect, thresh)
top(vect, thresh)

Arguments

`vect`	vector to threshold
`thresh`	threshold

Value

This function returns the thresholded vector.

Author(s)

Wencan Zhu, Celine Levy-Leduc, Nils Ternes

Examples

x=sample(1:10,10)
thresh=3
top(x,thresh)
##---- Should be DIRECTLY executable !! ----
##-- ==>  Define data, use random,
##--	or do  help(data=index)  for the standard data sets.

## The function is currently defined as
function (vect, thresh) 
{
    sorted_vect <- sort(abs(vect), decreasing = TRUE)
    v = sorted_vect[thresh]
    ifelse(abs(vect) >= v, vect, 0)
  }
x=sample(1:10,10)
thresh=3
top(x,thresh)
##---- Should be DIRECTLY executable !! ----
##-- ==>  Define data, use random,
##--	or do  help(data=index)  for the standard data sets.

## The function is currently defined as
function (vect, thresh) 
{
    sorted_vect <- sort(abs(vect), decreasing = TRUE)
    v = sorted_vect[thresh]
    ifelse(abs(vect) >= v, vect, 0)
  }

Thresholding to a given threshold of the smallest values

Description

This function keeps only the K largest values of the vector vect and sets the others to the smallest value among the K largest.

Usage

top_thresh(vect, thresh)
top_thresh(vect, thresh)

Arguments

`vect`	vector to threshold
`thresh`	threshold

Value

This function returns the thresholded vector.

Author(s)

Wencan Zhu, Celine Levy-Leduc, Nils Ternes

Examples

x=sample(1:10,10)
sorted_vect=sort(x,decreasing=TRUE)
thresh=3
top_thresh(x,thresh)

##---- Should be DIRECTLY executable !! ----
##-- ==>  Define data, use random,
##--	or do  help(data=index)  for the standard data sets.

## The function is currently defined as
function (vect, thresh) 
{
    sorted_vect <- sort(vect, decreasing = TRUE)
    v = sorted_vect[thresh]
    ifelse(vect >= v, vect, v)
  }
x=sample(1:10,10)
sorted_vect=sort(x,decreasing=TRUE)
thresh=3
top_thresh(x,thresh)

##---- Should be DIRECTLY executable !! ----
##-- ==>  Define data, use random,
##--	or do  help(data=index)  for the standard data sets.

## The function is currently defined as
function (vect, thresh) 
{
    sorted_vect <- sort(vect, decreasing = TRUE)
    v = sorted_vect[thresh]
    ifelse(vect >= v, vect, v)
  }

Variable selection in high-dimensional logistic regression models using a whitening approach

Description

Variable selection in high-dimensional logistic regression models using a whitening approach

Usage

WhiteningLogit(X = X, y = y, nlambda = 50, maxit = 100, gamma = 0.9999, 
top_grill=c(1:100))
WhiteningLogit(X = X, y = y, nlambda = 50, maxit = 100, gamma = 0.9999, 
top_grill=c(1:100))

Arguments

`X`	Design matrix of the logistic model considered.
`y`	Binary response of the logistic model considered.
`nlambda`	Number of lambda
`maxit`	Integer specifying the maximum number of steps for the generalized Lasso algorithm. It should not be smaller than nlambda.
`gamma`	Parameter $\gamma$ defined in the paper Zhu et al. (2022) given in the references. Its default value is 0.95.
`top_grill`	A grill of provided for the thresholding

Value

Returns a list with the following components

`lambda`	different values of the parameter $\lambda$ considered.
`beta`	matrix of the estimations of $\beta$ for all the $\lambda$ considered.
`beta.min`	estimation of $\beta$ which minimize the MSE.
`log.likelihood`	Log-likelihood for all the $\lambda$ considered.

Author(s)

Wencan Zhu, Celine Levy-Leduc, Nils Ternes

References

W. Zhu, C. Levy-Leduc, N. Ternes. "Variable selection in high-dimensional logistic regression models using a whitening approach". (2022)

Examples

X0 <- matrix( rnorm(50*10,mean=0,sd=1), 50, 10)  
y0 <- c(rep(1,25), rep(0,25))
mod <- WhiteningLogit(X=X0, y=y0)
plot(mod$beta.min)

##---- Should be DIRECTLY executable !! ----
##-- ==>  Define data, use random,
##--	or do  help(data=index)  for the standard data sets.

## The function is currently defined as
function(X=X, y=y,
                   nlambda=50,
                   maxit=100,
                   gamma=0.9999,
                   top_grill=c(1:100)){
  
  p=ncol(X)
  n=nrow(X)
  
  mod_ridge <- cv.glmnet(x=as.matrix(X), y=y, alpha=0.5, intercept=FALSE, family="binomial")
  pr_est <- predict(mod_ridge, as.matrix(X), s = "lambda.min", type="response")
  beta_ini <- predict(mod_ridge, as.matrix(X), s = "lambda.min", type="coefficients")[-1]
  diag_w <- pr_est*(1-pr_est)
  square_root_w <- diag(sqrt(as.vector(diag_w)), nrow=n)
  X_new <- square_root_w
  
  Cov_est <- cvCovEst(
    dat = X_new,
    estimators = c(
      linearShrinkLWEst, thresholdingEst, sampleCovEst
    ),
    estimator_params = list(
      thresholdingEst = list(gamma = seq(0.1, 0.3, 0.1))
    ),
    center = TRUE,
    scale = TRUE
  )
  
  Sigma_est <- Cov_est$estimate
  
  SVD_new <- fast.svd(Sigma_est)
  U_sigma_new <- SVD_new$u
  D_sigma_new <- SVD_new$d
  inv_transmat <- U_sigma_new
  inv_diag_new <- ifelse(D_sigma_new<0.000001, 0, 1/sqrt(D_sigma_new))
  trans_mat <- U_sigma_new
  
  
  if (p <= 50) {
    top_grill <- seq(1, p, 2)
  }else if (p <= 200) {
    top_grill <- c(1:50, seq(52, p, 2))
  }else if (p <= 300) {
    top_grill <- c(1:50, seq(52, 100, 2), seq(105, 200, 5), 
                   seq(210, p, 10))
  }else {
    top_grill <- c(1:50, seq(52, 100, 2), seq(105, 200, 5), 
                   seq(210, 300, 10))
  }
  
  X_tilde <- X
  
  beta_tilde_ini <-  inv_transmat
  Px <- CalculPx(X_tilde, beta=beta_tilde_ini)
  wt <- CalculWeight(Px)
  # wt <- ifelse(wt0==0, 0.0001, wt0)
  ystar <- WorkingResp(y=y, Px=Px, X=X_tilde, beta=beta_tilde_ini)
  X_tilde_weighted <- sweep(X, MARGIN=1, sqrt(wt), `*`)
  ystar_weighted <- sqrt(wt)*ystar
  
  gen.model0 <- genlasso(y=ystar_weighted, X=X_tilde_weighted, 
                         D=trans_mat, maxsteps = 50)
  parameter_tmp <- beta_tilde_ini
  beta_final <- matrix(NA, length(gen.model0$lambda), p)
  skip_i <- TRUE
  eval_final <- c()
  defaultW <- getOption("warn") 
  
  options(warn = -1) 
  
  
  for(i in 1:length(gen.model0$lambda)){
    #inner loop
    epsilon=10
    j=0
    if(skip_i){parameter_tmp <- beta_tilde_ini
    } else {parameter_tmp <- parameter_current}
    skip_i <-FALSE
    
    while(epsilon > 0.001){
      j=j+1
      parameter_current <- parameter_tmp
      Px <- CalculPx(X_tilde, beta=parameter_current)
      wt0 <- CalculWeight(Px)
      wt <- ifelse(round(wt0,4)==0, 0.0001, wt)
      ystar <- WorkingResp(y=y, Px=Px, X=X_tilde, beta=parameter_current)
      X_tilde_weighted <- sweep(X, MARGIN=1, sqrt(wt), `*`)
      ystar_weighted <- sqrt(wt)*ystar
      
      gen.model <- genlasso(y=ystar_weighted, X=X_tilde_weighted, D=trans_mat, maxsteps =   maxit)
      
      if(gen.model0$lambda[i] < min(gen.model$lambda)){
        parameter_tmp <- parameter_current
        break
      } else {
        parameter_tmp <- coef(gen.model, lambda=gen.model0$lambda[i],
                              type = "primal")$beta
        beta_current <- parameter_tmp
        if(sum(is.na(parameter_tmp))>0){
          skip_i <-TRUE 
          parameter_tmp <- rep(0,p)
          break}
        epsilon <- max(abs(parameter_current-parameter_tmp))
        if(epsilon >=100){
          skip_i <-TRUE 
          break}
        if (j==maxit){
          skip_i <-TRUE 
          break}
      }
    }
    
    if(skip_i){
      beta_final[i, ] <- rep(NA, p)
      eval_final[i] <- NA
    } else{
      
      correction <- Thresholding(X_tilde, y, coef=parameter_tmp, TOP=top_grill)
      opt_top_tilde <- correction$opt_top
      beta_tilde_opt <- top_thresh(vect=parameter_tmp, thresh = opt_top_tilde)
      beta_final0 <- trans_mat
      
      correction <- Thresholding(X, y, coef=beta_final0, TOP=top_grill)
      opt_top_final <- correction$opt_top
      beta_final[i, ] <- beta_opt_final <- top(vect=beta_final0, thresh = opt_top_final)
      
      beta_refit <- Refit_glm(X=X, beta_pred = beta_opt_final, y=y)
      pr_est <- CalculPx(X, beta_refit)
      ll <- pr_est^y*(1-pr_est)^(1-y)
      #ll <- ifelse(ll<0.000001, 1, ll)
      eval_final[i] <- -log(prod(ll))
      
    }
    beta.min <- beta_final[which.min(eval_final), ]
  }
  options(warn = defaultW)
  return(list(beta=beta_final, lambda=gen.model0$lambda, beta.min=beta.min, 
              log.likelihood=eval_final))
}

X0 <- matrix( rnorm(50*10,mean=0,sd=1), 50, 10)  
y0 <- c(rep(1,25), rep(0,25))
mod <- WhiteningLogit(X=X0, y=y0)
plot(mod$beta.min)

##---- Should be DIRECTLY executable !! ----
##-- ==>  Define data, use random,
##--	or do  help(data=index)  for the standard data sets.

## The function is currently defined as
function(X=X, y=y,
                   nlambda=50,
                   maxit=100,
                   gamma=0.9999,
                   top_grill=c(1:100)){
  
  p=ncol(X)
  n=nrow(X)
  
  mod_ridge <- cv.glmnet(x=as.matrix(X), y=y, alpha=0.5, intercept=FALSE, family="binomial")
  pr_est <- predict(mod_ridge, as.matrix(X), s = "lambda.min", type="response")
  beta_ini <- predict(mod_ridge, as.matrix(X), s = "lambda.min", type="coefficients")[-1]
  diag_w <- pr_est*(1-pr_est)
  square_root_w <- diag(sqrt(as.vector(diag_w)), nrow=n)
  X_new <- square_root_w
  
  Cov_est <- cvCovEst(
    dat = X_new,
    estimators = c(
      linearShrinkLWEst, thresholdingEst, sampleCovEst
    ),
    estimator_params = list(
      thresholdingEst = list(gamma = seq(0.1, 0.3, 0.1))
    ),
    center = TRUE,
    scale = TRUE
  )
  
  Sigma_est <- Cov_est$estimate
  
  SVD_new <- fast.svd(Sigma_est)
  U_sigma_new <- SVD_new$u
  D_sigma_new <- SVD_new$d
  inv_transmat <- U_sigma_new
  inv_diag_new <- ifelse(D_sigma_new<0.000001, 0, 1/sqrt(D_sigma_new))
  trans_mat <- U_sigma_new
  
  
  if (p <= 50) {
    top_grill <- seq(1, p, 2)
  }else if (p <= 200) {
    top_grill <- c(1:50, seq(52, p, 2))
  }else if (p <= 300) {
    top_grill <- c(1:50, seq(52, 100, 2), seq(105, 200, 5), 
                   seq(210, p, 10))
  }else {
    top_grill <- c(1:50, seq(52, 100, 2), seq(105, 200, 5), 
                   seq(210, 300, 10))
  }
  
  X_tilde <- X
  
  beta_tilde_ini <-  inv_transmat
  Px <- CalculPx(X_tilde, beta=beta_tilde_ini)
  wt <- CalculWeight(Px)
  # wt <- ifelse(wt0==0, 0.0001, wt0)
  ystar <- WorkingResp(y=y, Px=Px, X=X_tilde, beta=beta_tilde_ini)
  X_tilde_weighted <- sweep(X, MARGIN=1, sqrt(wt), `*`)
  ystar_weighted <- sqrt(wt)*ystar
  
  gen.model0 <- genlasso(y=ystar_weighted, X=X_tilde_weighted, 
                         D=trans_mat, maxsteps = 50)
  parameter_tmp <- beta_tilde_ini
  beta_final <- matrix(NA, length(gen.model0$lambda), p)
  skip_i <- TRUE
  eval_final <- c()
  defaultW <- getOption("warn") 
  
  options(warn = -1) 
  
  
  for(i in 1:length(gen.model0$lambda)){
    #inner loop
    epsilon=10
    j=0
    if(skip_i){parameter_tmp <- beta_tilde_ini
    } else {parameter_tmp <- parameter_current}
    skip_i <-FALSE
    
    while(epsilon > 0.001){
      j=j+1
      parameter_current <- parameter_tmp
      Px <- CalculPx(X_tilde, beta=parameter_current)
      wt0 <- CalculWeight(Px)
      wt <- ifelse(round(wt0,4)==0, 0.0001, wt)
      ystar <- WorkingResp(y=y, Px=Px, X=X_tilde, beta=parameter_current)
      X_tilde_weighted <- sweep(X, MARGIN=1, sqrt(wt), `*`)
      ystar_weighted <- sqrt(wt)*ystar
      
      gen.model <- genlasso(y=ystar_weighted, X=X_tilde_weighted, D=trans_mat, maxsteps =   maxit)
      
      if(gen.model0$lambda[i] < min(gen.model$lambda)){
        parameter_tmp <- parameter_current
        break
      } else {
        parameter_tmp <- coef(gen.model, lambda=gen.model0$lambda[i],
                              type = "primal")$beta
        beta_current <- parameter_tmp
        if(sum(is.na(parameter_tmp))>0){
          skip_i <-TRUE 
          parameter_tmp <- rep(0,p)
          break}
        epsilon <- max(abs(parameter_current-parameter_tmp))
        if(epsilon >=100){
          skip_i <-TRUE 
          break}
        if (j==maxit){
          skip_i <-TRUE 
          break}
      }
    }
    
    if(skip_i){
      beta_final[i, ] <- rep(NA, p)
      eval_final[i] <- NA
    } else{
      
      correction <- Thresholding(X_tilde, y, coef=parameter_tmp, TOP=top_grill)
      opt_top_tilde <- correction$opt_top
      beta_tilde_opt <- top_thresh(vect=parameter_tmp, thresh = opt_top_tilde)
      beta_final0 <- trans_mat
      
      correction <- Thresholding(X, y, coef=beta_final0, TOP=top_grill)
      opt_top_final <- correction$opt_top
      beta_final[i, ] <- beta_opt_final <- top(vect=beta_final0, thresh = opt_top_final)
      
      beta_refit <- Refit_glm(X=X, beta_pred = beta_opt_final, y=y)
      pr_est <- CalculPx(X, beta_refit)
      ll <- pr_est^y*(1-pr_est)^(1-y)
      #ll <- ifelse(ll<0.000001, 1, ll)
      eval_final[i] <- -log(prod(ll))
      
    }
    beta.min <- beta_final[which.min(eval_final), ]
  }
  options(warn = defaultW)
  return(list(beta=beta_final, lambda=gen.model0$lambda, beta.min=beta.min, 
              log.likelihood=eval_final))
}

Calculate the working response

Description

Calculate the working response in the iterative least square regression

Usage

WorkingResp(y, Px, X, beta, intercept = 0)
WorkingResp(y, Px, X, beta, intercept = 0)

Arguments

`X`	Design matrix of the logistic model considered.
`y`	Binary response of the logistic model considered.
`Px`	The probability of the reponse to be 1
`beta`	Vector of coefficients
`intercept`	If there is an intercept

Value

This function returns the vector of working response.

Author(s)

Wencan Zhu, Celine Levy-Leduc, Nils Ternes

Example of a design matrix of a logistic model

Description

It contains an example of a design matrix of a logistic model.

Usage

data("X")data("X")

Format

The format is: num [1:100, 1:500] -1.576 -0.476 -0.237 -0.398 0.284 ...

Examples

data(X)
data(X)

Example of a binary response variable of a logistic model.

Description

It contains an example of a binary response variable of a logistic model.

Usage

data("y")data("y")

Format

The format is: int [1:100] 0 1 0 1 1 0 0 0 1 1 ...

Examples

data(y)
data(y)

Package 'WLogit'

Help Index

Variable Selection in High-Dimensional Logistic Regression Models using a Whitening Approach

Description

Details

Author(s)

References

True coefficients in the esample.

Description

Usage

Format

Examples

Calculate the class-conditional probabilities.

Description

Usage

Arguments

Value

Author(s)

See Also

Examples

Calculate the weight

Description

Usage

Arguments

Author(s)

See Also

Examples

Refit the logistic regression with chosen variables

Description

Usage

Arguments

Value

Author(s)

Examples

WLogit output

Description

Usage

Format

Examples

Thresholding on a vector

Description

Usage

Arguments

Value

Author(s)

Thresholding to zero of the smallest values

Description

Usage

Arguments

Value

Author(s)

Examples

Thresholding to a given threshold of the smallest values

Description

Usage

Arguments

Value

Author(s)

Examples

Variable selection in high-dimensional logistic regression models using a whitening approach

Description

Usage

Arguments

Value

Author(s)

References

Examples

Calculate the working response

Description

Usage

Arguments

Value

Author(s)

See Also

Example of a design matrix of a logistic model

Description

Usage

Format

Examples

Example of a binary response variable of a logistic model.