Package 'calibrationband'

Title: Calibration Bands
Description: Package to assess the calibration of probabilistic classifiers using confidence bands for monotonic functions. Besides testing the classical goodness-of-fit null hypothesis of perfect calibration, the confidence bands calculated within this package facilitate inverted goodness-of-fit tests whose rejection allows for a sought-after conclusion of a sufficiently well-calibrated model. The package creates flexible graphical tools to perform these tests. For construction details see also Dimitriadis, Dümbgen, Henzi, Puke, Ziegel (2022) <arXiv:2203.04065>.
Authors: Timo Dimitriadis [aut], Alexander Henzi [aut], Marius Puke [aut, cre]
Maintainer: Marius Puke <[email protected]>
License: GPL-3
Version: 0.1.1
Built: 2024-11-11 03:59:13 UTC
Source: https://github.com/marius-cp/calibrationband


Confidence bands for monotone probabilities

Description

Confidence bands for monotone probabilities

Usage

calibration_bands(
  x,
  y,
  alpha = 0.05,
  method = "standard",
  digits = NULL,
  nc = FALSE
)

Arguments

x

covariate.

y

response variable (binary, coded as 0 or 1).

alpha

type one error probability (1 minus the confidence level).

method

"standard" for the original method proposed in the article, "round" for rounding the covariate, or "YB" for the bounds by Yang & Barber (2019).

digits

number of digits for method "round". Default is 2. Has no effect for the other methods.

nc

use non-crossing bands for method "standard" or "round". Has no effect for method "YB". Default is FALSE. See also "summary(...,iso_test=TRUE)" in this context. Crossings of the bands allow rejecting the null hypothesis of monotonicity in the calibration curve.

Value

An object of class calibrationband, which is a list containing the following entries:

bands a tibble with columns x, lwr, upr, holding the lower and upper bound for each value of x. The upper bound extends to the left and the lower bound to the right, that is, the upper bound for x[i]<s<x[i+1] is upr[i+1], and the lower bound for x[i]<s<x[i+1] is lwr[i].
cal a tibble holding the areas/segments of calibration (out=0) and miscalibration (out=1).
bins a tibble of the characteristics of the isotonic bins.
cases a tibble of all predictions and observations. In addition, it holds the column isoy, which is the isotonic regression of y at points x.
alpha the given type one error probability (1 minus the nominal coverage of the band).
method the selected method for computing the band.
nc the selected method for non-crossing.
digits the given digits for method "round" (or NULL for method "standard").
time time to compute the upper and lower band.
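
The isoy column in cases is described above as the isotonic regression of y at points x. As a rough illustration of what such a fit looks like, the following sketch uses stats::isoreg from base R on made-up data. Whether the package computes isoy in exactly this way is an assumption; the data and object names are hypothetical.

```r
# Made-up forecasts and binary outcomes (not package output).
x <- c(0.1, 0.2, 0.3, 0.4, 0.5, 0.6)
y <- c(0, 1, 0, 0, 1, 1)

# Isotonic (nondecreasing) least-squares fit of y on x.
fit <- stats::isoreg(x, y)

# Fitted values at the sorted x. The violating block (1, 0, 0)
# is pooled to its mean 1/3, so the fit is 0, 1/3, 1/3, 1/3, 1, 1.
isoy <- fit$yf
isoy
```

The fitted values are constant on blocks of neighbouring x values, which is the kind of grouping that the bins entry refers to as isotonic bins.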

Plotting monotone confidence bands

Description

Uses the ggplot2 package to illustrate monotone confidence bands to assess calibration of prediction methods that issue probability forecasts.

Usage

## S3 method for class 'calibrationband'
autoplot(
  object,
  ...,
  approx.equi = NULL,
  cut.bands = FALSE,
  p_ribbon = NULL,
  p_isoreg = NULL,
  p_diag = NULL
)

## S3 method for class 'calibrationband'
autolayer(
  object,
  ...,
  approx.equi = NULL,
  cut.bands = FALSE,
  p_diag = NA,
  p_isoreg = NA,
  p_ribbon = NA
)

## S3 method for class 'calibrationband'
plot(x, ...)

Arguments

object

object of class calibrationband

...

Further arguments to be passed to or from methods.

approx.equi

If NULL, the bands are drawn for each prediction-realization pair. If it is a scalar, say z, the bounds are approximated at z equidistant points on the x-axis. Also see the effect of cut.bands if a scalar is specified.

In large data sets, approx.equi = NULL might result in capacity-consuming plots. In these cases, we recommend setting approx.equi to a value of at least 200.

Note that important additional points are added to the equidistant grid defined by approx.equi to ensure accurate transition areas (changes between miscalibrated and calibrated areas).

cut.bands

Cut the bands at the most extreme prediction values. If set to TRUE, the bands are not extended to 0 and 1, respectively.

p_ribbon

If non-NULL, a ribbon is drawn. Contains a list of arguments for ggplot2::geom_polygon. See details for default list settings.

p_isoreg

If non-NULL, the isotonic regression curve is drawn. Contains a list of arguments for ggplot2::geom_line. See details for default list settings.

p_diag

If non-NULL, the diagonal line is drawn. Contains a list of arguments for ggplot2::geom_segment.

x

object of class calibrationband

Details

When plotting the monotone confidence band, the upper bound should be extended to the left, that is, the bound at x[i] is valid on the interval (x[i-1],x[i]]. The lower bound should be extended to the right, i.e. the bound at x[i] is extended to the interval [x[i],x[i + 1]). This function creates x and y values for correct plotting of these bounds.
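
The extension rule above can be sketched with step functions from base R. This is only an illustration on made-up band values, not the package's plotting code; the names lower and upper, and the trivial bounds 0 and 1 outside the grid, are assumptions.

```r
# Hypothetical band values on three grid points (made up, not package output).
x   <- c(0.2, 0.5, 0.8)
lwr <- c(0.05, 0.30, 0.60)
upr <- c(0.40, 0.70, 0.95)

# Lower bound: lwr[i] holds on [x[i], x[i + 1]), i.e. it extends to the right.
# Left of x[1] the trivial bound 0 is used (assumption).
lower <- stats::stepfun(x, c(0, lwr), right = FALSE)

# Upper bound: upr[i] holds on (x[i - 1], x[i]], i.e. it extends to the left.
# Right of x[n] the trivial bound 1 is used (assumption).
upper <- stats::stepfun(x, c(upr, 1), right = TRUE)

# Evaluate at a grid point and at a point strictly between two grid points.
s <- c(0.5, 0.6)
data.frame(s = s, lwr = lower(s), upr = upper(s))
```

At s = 0.6, which lies between x[2] and x[3], the sketch returns lwr[2] = 0.30 and upr[3] = 0.95, in line with the rule stated above.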

autoplot behaves like any ggplot() + layer() combination. Customized plots should therefore be created by combining autoplot and autolayer.

Setting any of the p_* arguments to NA disables that layer.

Default parameter values for p_*

p_isoreg list(color = "darkgray")
p_diag list(color = "black", fill="blue", alpha = .1)
p_ribbon list(low = "gray", high = "red", guide = "none", limits=c(0,1))

Value

An object inheriting from class 'ggplot'.

Examples

set.seed(123)
s <- 0.8
n <- 10000
x <- sort(runif(n))

# true event probability; p(x, 0) = x, so s = 0.8 makes the forecasts x miscalibrated
p <- function(x, s) 1 / (1 + ((1 - x) / x)^(s + 1))
dat <- data.frame(pr = x, y = rbinom(n, 1, p(x, s)))

cb <- calibration_bands(x = dat$pr, y = dat$y, alpha = 0.05, method = "round", digits = 3)

# simple plotting
plot(cb)
autoplot(cb)

# customize the plot using ggplot2::autolayer:
# switch off the default ribbon and add a restyled one
autoplot(
  cb,
  approx.equi = NULL,
  p_ribbon = NA
) +
  ggplot2::autolayer(
    cb,
    p_ribbon = list(alpha = .3, fill = "gray", colour = "blue")
  )

Print monotone confidence bands

Description

Printing methods for 'calibrationband' and 'summary.calibrationband' objects.

Usage

## S3 method for class 'calibrationband'
print(x, ...)

## S3 method for class 'summary.calibrationband'
print(x, ...)

Arguments

x

object of class calibrationband

...

Further arguments to be passed to or from methods; in particular those passed to autoplot.calibrationband

Details

print.calibrationband always sends an autoplot object to the current graphics device and prints a summary to the console.

Value

Invisibly returns x.

See Also

autoplot.calibrationband, summary.calibrationband


Summarize calibration band object

Description

An object of class calibrationband contains the calibration band coordinates, the pairs of original observation and forecast values, and the recalibrated forecasts obtained by isotonic regression. The function summary.calibrationband calculates the areas of miscalibration.

Usage

## S3 method for class 'calibrationband'
summary(object, ..., iso_test = FALSE, n = 3)

Arguments

object

object of class calibrationband

...

Further arguments to be passed to or from methods.

iso_test

Defaults to FALSE. If TRUE, the decision of the isotonicity test is reported alongside the crossings of the band. If the calibrationband object was computed with nc=TRUE, the bands are re-estimated with nc=FALSE using digits=3. The alpha from the calibrationband object is used.

n

number of rows in output table.

Value

A 'summary.calibrationband' object, which is also a tibble (see tibble::tibble()) with columns:

min_x minimal x-coordinate of miscalibration segment (ordered by length).
max_x maximal x-coordinate of miscalibration segment (ordered by length).
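
To make the meaning of min_x and max_x concrete, the following base R sketch extracts miscalibration segments from a made-up band on a coarse grid. It is not the package's implementation: the data frame grid, the rule "the diagonal value x lies outside [lwr, upr]" for flagging miscalibration, and the ordering with the longest segment first are assumptions made for illustration.

```r
# Hypothetical band on a coarse grid; all values are made up.
grid <- data.frame(
  x   = c(0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9),
  lwr = c(0.00, 0.05, 0.35, 0.45, 0.45, 0.45, 0.50, 0.60, 0.95),
  upr = c(0.30, 0.40, 0.60, 0.70, 0.70, 0.75, 0.85, 0.95, 1.00)
)

# Flag grid points where the diagonal (value x) leaves the band.
out <- as.integer(grid$x < grid$lwr | grid$x > grid$upr)

# Find runs of consecutive flags and their first/last positions.
runs  <- rle(out)
end   <- cumsum(runs$lengths)
start <- end - runs$lengths + 1
mis   <- runs$values == 1

# One row per miscalibration segment, longest segment first.
segments <- data.frame(min_x = grid$x[start[mis]], max_x = grid$x[end[mis]])
segments <- segments[order(segments$max_x - segments$min_x, decreasing = TRUE), ]
head(segments, n = 3)
```

With these values, two segments are found: one from 0.3 to 0.4 and a single-point segment at 0.9. The call to head mirrors the role of the n argument, which limits the number of rows in the output table.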

Examples

set.seed(123)
s <- 0.8
n <- 10000
x <- sort(runif(n))

# true event probability; p(x, 0) = x, so s = 0.8 makes the forecasts x miscalibrated
p <- function(x, s) 1 / (1 + ((1 - x) / x)^(s + 1))
dat <- data.frame(pr = x, y = rbinom(n, 1, p(x, s)))

cb <- calibration_bands(x = dat$pr, y = dat$y, alpha = 0.05, method = "round", digits = 3)

summary(cb)
print(summary(cb), n=5)