Direct Interface to Information Gain.

.information_gain(
  x,
  y,
  type = c("infogain", "gainratio", "symuncert"),
  equal = FALSE,
  discIntegers = TRUE,
  nbins = 5,
  threads = 1
)

Arguments

x

A data.frame, sparse matrix or formula with attributes.

y

A vector with the response variable, or a data.frame if a formula is used.

type

Method name. One of "infogain", "gainratio" or "symuncert".

equal

A logical. Whether to discretize the dependent variable using equal frequency binning.

discIntegers

A logical. If TRUE (default), integers are treated as numeric vectors and are discretized. If FALSE, integers are treated as factors and left as is.

nbins

Number of bins used for discretization. Only used if `equal = TRUE` and the response is numeric.

threads

Defunct. Number of threads for the parallel backend; now turned off for safety reasons.
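
For a numeric response, `equal = TRUE` together with `nbins` requests equal frequency binning. The following is a minimal base R sketch of that idea, not the package's internal code. The helper name `equal_freq_bins` is hypothetical, and the sketch assumes the quantile breakpoints are distinct.

```r
# Hypothetical helper (illustration only): split a numeric vector into
# `nbins` bins that hold roughly the same number of observations.
equal_freq_bins <- function(y, nbins = 5) {
  # Breakpoints at evenly spaced quantiles of y
  breaks <- quantile(y, probs = seq(0, 1, length.out = nbins + 1), names = FALSE)
  # Assumes the breakpoints are distinct; heavy ties would make cut() fail
  cut(y, breaks = breaks, include.lowest = TRUE)
}

y <- seq(0.5, 10, by = 0.5)            # 20 distinct numeric values
bins <- equal_freq_bins(y, nbins = 5)
table(bins)                            # each of the 5 bins holds 4 observations
```

With tied values the bin counts can differ, which is why the bins are described as only roughly equal in size.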

Details

In principle, using `information_gain` is safer than calling this direct interface.
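
The three `type` values name entropy-based measures. As a rough illustration, the sketch below computes them in base R from their usual textbook definitions for two discrete vectors. It assumes natural-log Shannon entropy and no missing values; the helper names `entropy` and `measures` are hypothetical, and the code is not the package's implementation.

```r
# Shannon entropy (natural log) of a discrete vector; assumes no NAs
entropy <- function(v) {
  p <- table(v) / length(v)
  p <- p[p > 0]
  -sum(p * log(p))
}

# Hypothetical helper: the three measures for one attribute and one class
measures <- function(attr, class) {
  h_attr  <- entropy(attr)
  h_class <- entropy(class)
  # Joint entropy over the observed (attribute, class) combinations
  h_joint <- entropy(interaction(attr, class, drop = TRUE))
  ig <- h_class + h_attr - h_joint
  c(
    infogain  = ig,
    gainratio = ig / h_attr,
    symuncert = 2 * ig / (h_attr + h_class)
  )
}

attr  <- c("a", "a", "b", "b")
class <- c("x", "x", "y", "y")
measures(attr, class)   # attr determines class: infogain = log(2), ratios = 1
```

Numeric attributes would have to be discretized before a calculation of this kind, which is the role of the `equal`, `discIntegers` and `nbins` arguments.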

Value

data.frame with the following columns:

  • attributes - variable names.

  • importance - worth of the attributes.
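
As a hedged usage sketch, the call below goes through the exported wrapper `information_gain` and not the direct interface. It assumes the FSelectorRcpp package is installed and that the wrapper accepts the same `x`, `y` and `type` arguments described above. The built-in `iris` data set is used purely for illustration.

```r
# Assumes the FSelectorRcpp package is installed; skipped otherwise.
res <- NULL
if (requireNamespace("FSelectorRcpp", quietly = TRUE)) {
  res <- FSelectorRcpp::information_gain(
    x = iris[, -5],       # attributes as a data.frame
    y = iris$Species,     # response vector
    type = "gainratio"
  )
  # Expected shape: one row per attribute, columns `attributes` and `importance`
  print(res)
}
```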