This function creates a data profiling report.
    create_report(
      data,
      output_format = html_document(toc = TRUE, toc_depth = 6, theme = "yeti"),
      output_file = "report.html",
      output_dir = getwd(),
      y = NULL,
      config = configure_report(),
      report_title = "Data Profiling Report",
      ...
    )
Argument | Description |
---|---|
data | input data |
output_format | output format in render. Default is `html_document(toc = TRUE, toc_depth = 6, theme = "yeti")`. |
output_file | output file name in render. Default is "report.html". |
output_dir | output directory for report in render. Default is user's current directory. |
y | name of response variable if any. Response variables will be passed to appropriate plotting functions automatically. |
config | report configuration generated by configure_report. |
report_title | report title. Default is "Data Profiling Report". |
... | other arguments to be passed to render. |
config is a named list to be evaluated by create_report.
Each name should exactly match a function name.
By doing so, that function and corresponding content will be added to the report.
If you do not want to include certain functions or content, leave them out of config.
configure_report generates the default template. You may customize the content using that function.
All function arguments will be passed to do.call as a list.
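The named-list convention above can be sketched in base R. This is only an illustration of the pattern (names looked up as functions, arguments passed through do.call), not the package's actual implementation. `summarise_rows`, `summarise_cols` and `run_config` are hypothetical names introduced for this example.

```r
# Hypothetical stand-ins for profiling functions (not part of any package).
summarise_rows <- function(data, label = "rows") {
  paste(label, nrow(data))
}
summarise_cols <- function(data, label = "columns") {
  paste(label, ncol(data))
}

# Hypothetical helper: treat each config name as a function name and call it
# with `data` plus that entry's arguments, passed to do.call as a list.
run_config <- function(data, config) {
  lapply(names(config), function(fn_name) {
    do.call(fn_name, c(list(data = data), config[[fn_name]]))
  })
}

# An empty list means "call with defaults"; leaving a name out skips it.
config <- list(
  "summarise_rows" = list(),
  "summarise_cols" = list(label = "n_col")
)
results <- run_config(iris, config)
```

Because the lookup is by name, a misspelled entry would fail at call time rather than being silently ignored, which is one reason the names need to match exactly.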
If both y and plot_prcomp are present, y will be removed from plot_prcomp.
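One way to read this rule is that the response column is excluded from the data handed to the principal component step. The sketch below shows that idea with base R only. `drop_response` is a hypothetical helper, and `stats::prcomp` stands in for the plotting function, so this should not be taken as the package's own code.

```r
# Hypothetical helper: return `data` without the response column `y`.
drop_response <- function(data, y = NULL) {
  if (is.null(y)) {
    return(data)
  }
  data[, setdiff(names(data), y), drop = FALSE]
}

# Exclude the response before running PCA on the remaining columns.
pca_input <- drop_response(airquality, y = "Ozone")

# prcomp does not accept missing values, so drop incomplete rows first.
pca <- prcomp(na.omit(pca_input), scale. = TRUE)
```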
If there are multiple options for the same function, all of them will be plotted.
For example, `create_report(..., y = "a", config = list("plot_bar" = list("with" = "b")))` will create 3 bar charts:
- regular frequency bar chart
- bar chart aggregated by response variable "a"
- bar chart aggregated by `with` variable "b"
    if (FALSE) {
    # Create report
    create_report(iris)
    create_report(airquality, y = "Ozone")

    # Load library
    library(ggplot2)
    library(data.table)
    library(rmarkdown)

    # Set some missing values
    diamonds2 <- data.table(diamonds)
    for (j in 5:ncol(diamonds2)) {
      set(diamonds2,
          i = sample.int(nrow(diamonds2), sample.int(nrow(diamonds2), 1)),
          j,
          value = NA_integer_)
    }

    # Create customized report for diamonds2 dataset
    create_report(
      data = diamonds2,
      output_format = html_document(toc = TRUE, toc_depth = 6, theme = "flatly"),
      output_file = "report.html",
      output_dir = getwd(),
      y = "price",
      config = configure_report(
        add_plot_prcomp = TRUE,
        plot_qq_args = list("by" = "cut", sampled_rows = 1000L),
        plot_bar_args = list("with" = "carat"),
        plot_correlation_args = list("cor_args" = list("use" = "pairwise.complete.obs")),
        plot_boxplot_args = list("by" = "cut"),
        global_ggtheme = quote(theme_light())
      )
    )

    ## Configure report without `configure_report`
    config <- list(
      "introduce" = list(),
      "plot_intro" = list(),
      "plot_str" = list(
        "type" = "diagonal",
        "fontSize" = 35,
        "width" = 1000,
        "margin" = list("left" = 350, "right" = 250)
      ),
      "plot_missing" = list(),
      "plot_histogram" = list(),
      "plot_density" = list(),
      "plot_qq" = list(sampled_rows = 1000L),
      "plot_bar" = list(),
      "plot_correlation" = list("cor_args" = list("use" = "pairwise.complete.obs")),
      "plot_prcomp" = list(),
      "plot_boxplot" = list(),
      "plot_scatterplot" = list(sampled_rows = 1000L)
    )
    }