
This function creates a data profiling report.

Usage

create_report(
  data,
  output_format = html_document(toc = TRUE, toc_depth = 6, theme = "yeti"),
  output_file = "report.html",
  output_dir = getwd(),
  y = NULL,
  plotly = FALSE,
  config = configure_report(),
  report_title = "Data Profiling Report",
  ...
)

Arguments

data

input data

output_format

output format in render. Default is html_document(toc = TRUE, toc_depth = 6, theme = "yeti").

output_file

output file name in render. Default is "report.html".

output_dir

output directory for report in render. Default is the current working directory, getwd().

y

name of response variable if any. Response variables will be passed to appropriate plotting functions automatically.

plotly

if TRUE, use interactive plotly charts in the report (requires the plotly package). Default is FALSE. Only applies to HTML output; PDF reports use static plots.

config

report configuration generated by configure_report.

report_title

report title. Default is "Data Profiling Report".

...

other arguments to be passed to render.

Details

config is a named list to be evaluated by create_report. Each name must exactly match a function name; that function and its corresponding content are then added to the report. To exclude a function or its content, leave it out of config.

configure_report generates the default template. You may customize the content using that function.

All function arguments will be passed to do.call as a list.
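The name-to-function dispatch described above can be illustrated with a small base R sketch. This is not the package's internal code: summarise_rows, summarise_cols, toy_config and run_config are hypothetical stand-ins, used only to show how a named list of argument lists can be evaluated with do.call.

```r
# Illustrative sketch only: NOT DataExplorer's internal implementation.
# Two hypothetical stand-in functions play the role of report sections.
summarise_rows <- function(data) nrow(data)
summarise_cols <- function(data, numeric_only = FALSE) {
  if (numeric_only) sum(vapply(data, is.numeric, logical(1))) else ncol(data)
}

# A config of the same shape as the one described above:
# each name is a function name, each value is a list of arguments.
toy_config <- list(
  "summarise_rows" = list(),
  "summarise_cols" = list(numeric_only = TRUE)
)

# Evaluate every entry: look the function up by name and pass the data
# plus the configured arguments to it as a single list via do.call.
run_config <- function(data, config) {
  fn_names <- stats::setNames(names(config), names(config))
  lapply(fn_names, function(fn_name) {
    do.call(fn_name, c(list(data = data), config[[fn_name]]))
  })
}

results <- run_config(iris, toy_config)
```

In this sketch, a function whose name is absent from toy_config is never called, which mirrors how leaving a name out of config keeps that content out of the report.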

Note

If both y and plot_prcomp are present, y will be removed from plot_prcomp.

If there are multiple options for the same function, all of them will be plotted. For example, create_report(..., y = "a", config = list("plot_bar" = list("with" = "b"))) will create 3 bar charts:

  • regular frequency bar chart

  • bar chart aggregated by response variable "a"

  • bar chart aggregated by `with` variable "b"

See also

Examples

if (FALSE) { # \dontrun{
# Create report
create_report(iris)
create_report(airquality, y = "Ozone")

# Create report with plotly
# Note: It is a known issue that some facet panels may not show up in plotly.
# More details in the following issues:
# * https://github.com/plotly/plotly.R/issues/1243
# * https://github.com/plotly/plotly.R/issues/1962
create_report(airquality, y = "Ozone", plotly = TRUE)

# Load libraries
library(ggplot2)
library(data.table)
library(rmarkdown)

# Set some missing values
diamonds2 <- data.table(diamonds)
for (j in 5:ncol(diamonds2)) {
  set(diamonds2,
      i = sample.int(nrow(diamonds2), sample.int(nrow(diamonds2), 1)),
      j,
      value = NA_integer_)
}

# Create customized report for diamonds2 dataset
create_report(
  data = diamonds2,
  output_format = html_document(toc = TRUE, toc_depth = 6, theme = "flatly"),
  output_file = "report.html",
  output_dir = getwd(),
  y = "price",
  config = configure_report(
    add_plot_prcomp = TRUE,
    plot_qq_args = list("by" = "cut", sampled_rows = 1000L),
    plot_bar_args = list("with" = "carat"),
    plot_correlation_args = list("cor_args" = list("use" = "pairwise.complete.obs")),
    plot_boxplot_args = list("by" = "cut"),
    global_ggtheme = quote(theme_light())
  )
)

## Configure report without `configure_report`
config <- list(
  "introduce" = list(),
  "plot_intro" = list(),
  "plot_str" = list(
    "type" = "diagonal",
    "fontSize" = 35,
    "width" = 1000,
    "margin" = list("left" = 350, "right" = 250)
  ),
  "plot_missing" = list(),
  "plot_histogram" = list(),
  "plot_density" = list(),
  "plot_qq" = list(sampled_rows = 1000L),
  "plot_bar" = list(),
  "plot_correlation" = list("cor_args" = list("use" = "pairwise.complete.obs")),
  "plot_prcomp" = list(),
  "plot_boxplot" = list(),
  "plot_scatterplot" = list(sampled_rows = 1000L)
)
} # }