R/mcmcRocPrcGen.R
mcmcRocPrcGen.Rd
This function generates ROC and precision-recall curves
after fitting a Bayesian logit or probit regression. For fast calculation
from an "rjags" object, use mcmcRocPrc.
mcmcRocPrcGen(
  modelmatrix,
  mcmcout,
  modelframe,
  curves = FALSE,
  link = "logit",
  fullsims = FALSE
)
Argument | Description |
---|---|
modelmatrix | model matrix, including intercept (if the intercept is among the parameters estimated in the model). Create with model.matrix(formula, data). Note: the order of columns in the model matrix must correspond to the order of columns in the matrix of posterior draws in the |
mcmcout | posterior distributions of all logit coefficients, in matrix form. This can be created from rstan, MCMCpack, R2jags, etc. and transformed into a matrix using the function as.mcmc() from the coda package for |
modelframe | model frame in matrix form. Can be created using as.matrix(model.frame(formula, data)) |
curves | logical indicator of whether or not to return values to plot the ROC or precision-recall curves. If set to |
link | type of generalized linear model; a character vector set to |
fullsims | logical indicator of whether full object (based on all MCMC draws rather than their average) will be returned. Default is |
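The shapes of the three inputs, and the column-order requirement noted for modelmatrix, can be illustrated with a minimal base R sketch. It does not call mcmcRocPrcGen or fit a model: the matrix `draws` is a hypothetical stand-in for real posterior draws, and mapping `link` to `plogis()` (logit) or `pnorm()` (probit) is an assumption about the usual inverse link functions, not a description of the package internals.

```r
set.seed(1)
n  <- 200
df <- data.frame(X1 = runif(n, -1, 1), X2 = runif(n, -1, 1))
df$Y <- rbinom(n, 1, plogis(0.2 + 0.5 * df$X1 + 0.7 * df$X2))

# Inputs in the shapes described in the argument table
mm     <- model.matrix(Y ~ X1 + X2, data = df)            # columns: (Intercept), X1, X2
xframe <- as.matrix(model.frame(Y ~ X1 + X2, data = df))  # columns: Y, X1, X2

# Hypothetical stand-in for posterior draws (NOT real MCMC output):
# one row per draw, one column per coefficient, ordered like colnames(mm)
draws <- cbind(rnorm(1000, 0.2, 0.1),
               rnorm(1000, 0.5, 0.1),
               rnorm(1000, 0.7, 0.1))
colnames(draws) <- colnames(mm)

# Guard against a mismatch between model matrix and draws
stopifnot(ncol(draws) == ncol(mm),
          identical(colnames(draws), colnames(mm)))

# Linear predictor for every observation (rows) and every draw (columns)
eta <- mm %*% t(draws)

# Predicted probabilities under the two inverse links (assumed mapping)
p_logit  <- plogis(eta)  # logit link
p_probit <- pnorm(eta)   # probit link
```

If the columns of the draws matrix were in a different order than the model matrix, the matrix product would still run but would pair coefficients with the wrong predictors, which is why an explicit check is useful before passing both objects on.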
This function returns a list with 4 elements:
area_under_roc: area under ROC curve (scalar)
area_under_prc: area under precision-recall curve (scalar)
prc_dat: data to plot precision-recall curve (data frame)
roc_dat: data to plot ROC curve (data frame)
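To make the four returned elements concrete, the following base R sketch builds a list with the same element names from a vector of 0/1 outcomes and a vector of predicted probabilities. The helpers `trapz_area()` and `curve_summary()` are hypothetical, the data frame column names `x` and `y` are assumptions, and the trapezoidal rule and the starting point of the precision-recall curve are one common convention; the package may compute these quantities differently.

```r
# Hypothetical helpers for illustration; not the package's internal code.

# Trapezoidal area under a curve given x and y coordinates
trapz_area <- function(x, y) {
  sum(diff(x) * (head(y, -1) + tail(y, -1)) / 2)
}

curve_summary <- function(y, p) {
  ord  <- order(p, decreasing = TRUE)
  y_s  <- y[ord]
  # Keep one point per distinct threshold so tied predictions are handled together
  keep <- !duplicated(p[ord], fromLast = TRUE)

  tp <- cumsum(y_s)[keep]      # true positives at each threshold
  fp <- cumsum(1 - y_s)[keep]  # false positives at each threshold

  tpr       <- tp / sum(y)      # true positive rate (recall)
  fpr       <- fp / sum(1 - y)  # false positive rate
  precision <- tp / (tp + fp)

  # ROC curve starts at (0, 0); the PR curve is started at recall 0 with the
  # first observed precision (an assumed convention)
  roc_dat <- data.frame(x = c(0, fpr), y = c(0, tpr))
  prc_dat <- data.frame(x = c(0, tpr), y = c(precision[1], precision))

  list(area_under_roc = trapz_area(roc_dat$x, roc_dat$y),
       area_under_prc = trapz_area(prc_dat$x, prc_dat$y),
       prc_dat = prc_dat,
       roc_dat = roc_dat)
}

# Toy example: one of the four positive/negative pairs is ranked incorrectly
res <- curve_summary(y = c(1, 0, 1, 0), p = c(0.9, 0.8, 0.7, 0.1))
res$area_under_roc  # 0.75
```

With a list of this form, a curve can be drawn directly from the data frames, for example with `plot(res$roc_dat$x, res$roc_dat$y, type = "l")`.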
This function generates ROC and precision-recall curves after fitting a Bayesian logit or probit model.
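The difference between a summary based on averaged draws and one based on all MCMC draws (the distinction behind fullsims) can be sketched in base R. This is an illustration under stated assumptions, not the package's code: `draws` is a hypothetical stand-in for posterior draws, `auc_rank()` is a hypothetical helper using the rank-based (Mann-Whitney) formula for the area under the ROC curve, and averaging the predicted probabilities across draws is only one possible reading of "their average".

```r
set.seed(42)
n <- 300
X <- cbind(1, runif(n, -1, 1), runif(n, -1, 1))  # intercept plus two predictors
y <- rbinom(n, 1, plogis(X %*% c(0.2, 0.5, 0.7)))

# Hypothetical stand-in for posterior draws (NOT real MCMC output)
n_draws <- 500
draws <- cbind(rnorm(n_draws, 0.2, 0.10),
               rnorm(n_draws, 0.5, 0.15),
               rnorm(n_draws, 0.7, 0.15))

# Hypothetical helper: rank-based (Mann-Whitney) area under the ROC curve
auc_rank <- function(y, p) {
  n1 <- sum(y == 1)
  n0 <- sum(y == 0)
  (sum(rank(p)[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

# Predicted probabilities: one row per observation, one column per draw
p_all <- plogis(X %*% t(draws))

# Single summary: average the predicted probabilities over draws first
auc_avg <- auc_rank(y, rowMeans(p_all))

# Full simulations: one area per draw, giving a posterior distribution
auc_draws <- apply(p_all, 2, auc_rank, y = y)
quantile(auc_draws, c(0.025, 0.5, 0.975))
```

The per-draw version carries the posterior uncertainty of the coefficients through to the area under the curve, at the cost of one curve computation per draw.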
Beger, Andreas. 2016. “Precision-Recall Curves.” Available at SSRN: https://ssrn.com/Abstract=2765419. http://dx.doi.org/10.2139/ssrn.2765419.
.old_wd <- setwd(tempdir())
# \donttest{
if (interactive()) {
  # simulating data
  set.seed(123456)
  b0 <- 0.2 # true value for the intercept
  b1 <- 0.5 # true value for first beta
  b2 <- 0.7 # true value for second beta
  n <- 500 # sample size
  X1 <- runif(n, -1, 1)
  X2 <- runif(n, -1, 1)
  Z <- b0 + b1 * X1 + b2 * X2
  pr <- 1 / (1 + exp(-Z)) # inv logit function
  Y <- rbinom(n, 1, pr)
  df <- data.frame(cbind(X1, X2, Y))

  # formatting the data for jags
  datjags <- as.list(df)
  datjags$N <- length(datjags$Y)

  # creating jags model
  model <- function() {
    for(i in 1:N){
      Y[i] ~ dbern(p[i]) ## Bernoulli distribution of y_i
      logit(p[i]) <- mu[i] ## Logit link function
      mu[i] <- b[1] + b[2] * X1[i] + b[3] * X2[i]
    }
    for(j in 1:3){
      b[j] ~ dnorm(0, 0.001) ## Use a coefficient vector for simplicity
    }
  }

  params <- c("b")
  inits1 <- list("b" = rep(0, 3))
  inits2 <- list("b" = rep(0, 3))
  inits <- list(inits1, inits2)

  ## fitting the model with R2jags
  set.seed(123)
  fit <- R2jags::jags(data = datjags, inits = inits,
                      parameters.to.save = params,
                      n.chains = 2, n.iter = 2000, n.burnin = 1000,
                      model.file = model)

  # processing the data
  mm <- model.matrix(Y ~ X1 + X2, data = df)
  xframe <- as.matrix(model.frame(Y ~ X1 + X2, data = df))
  mcmc <- coda::as.mcmc(fit)
  mcmc_mat <- as.matrix(mcmc)[, 1:ncol(xframe)]

  # using mcmcRocPrcGen
  fit_sum <- mcmcRocPrcGen(modelmatrix = mm,
                           modelframe = xframe,
                           mcmcout = mcmc_mat,
                           curves = TRUE,
                           fullsims = FALSE)
}
# }
setwd(.old_wd)