Skip to contents

This function returns and plots frequency of missing values for each feature.

Usage

plot_missing(
  data,
  group = list(Good = 0.05, OK = 0.4, Bad = 0.8, Remove = 1),
  group_color = list(Good = "#1B9E77", OK = "#E6AB02", Bad = "#D95F02", Remove =
    "#E41A1C"),
  missing_only = FALSE,
  geom_label_args = list(),
  title = NULL,
  ggtheme = theme_gray(),
  theme_config = list(legend.position = c("bottom"))
)

Arguments

data

input data

group

missing profile band taking a list of group name and group upper bounds. Default is list("Good" = 0.05, "OK" = 0.4, "Bad" = 0.8, "Remove" = 1).

group_color

bar color list for each of the group. Default is list("Good" = "#1B9E77", "OK" = "#E6AB02", "Bad" = "#D95F02", "Remove" = "#E41A1C").

missing_only

plot features with missing values only? Default is FALSE.

geom_label_args

a list of other arguments to geom_label

title

plot title

ggtheme

complete ggplot2 themes. The default is theme_gray.

theme_config

a list of configurations to be passed to theme.

Value

invisibly return the ggplot object

See also

Examples

plot_missing(airquality)

plot_missing(airquality, missing_only = TRUE)


## Customize band
plot_missing(airquality, group = list("B1" = 0, "B2" = 0.06, "B3" = 1))

plot_missing(airquality, group = list("Awesome!" = 0.2, "Oh no!" = 1),
group_color = list("Awesome!" = "green", "Oh no!" = "red"))


## Shrink geom_label size
library(ggplot2)
plot_missing(airquality, geom_label_args = list("size" = 2,
"label.padding" = unit(0.1, "lines")))