DataExplorer 0.8.3

Enhancements

  • #154 PR: Added YAML option to allow HTML elements when choosing PDF report.
  • #165: Added geom_jitter option to plot_boxplot and plot_scatterplot.
  • #176 PR: Improved legend ordering in plot_missing.
  • #177 PR: Added group color customization in plot_missing.

DataExplorer 0.8.2

CRAN release: 2020-12-15

Enhancements

  • #139: Added by argument to plot_bar.

Bug Fixes

  • #148: Addressed CRAN removal due to vignette build failure.

DataExplorer 0.8.1

CRAN release: 2020-01-07

Enhancements

  • #111: Continuous distributions can now be plotted with different scales, i.e., histogram, density, boxplot, scatterplot.
  • #126: Cleaned up labels in legend guide.
  • #127 PR: Added option to plot columns with missing values only in plot_missing.
  • Cleaned up code for create_report.

Bug Fixes

  • #109: Fixed a bug causing unordered bar charts.
  • #114: Removed redundant message in dummify.
  • #116: Fixed pandoc document conversion error 99.
  • #120: Fixed type logical being parsed as symbol in configure_report.
  • #121: Fixed missing value bug when split_columns(..., binary_as_factor = TRUE).
  • #130 PR: plot_prcomp now drops columns with zero variance.

DataExplorer 0.8.0

CRAN release: 2019-03-17

New Features

  • #92: Added update_columns to transform any selected columns.

Enhancements

  • #87: Added configure_report function to customize report content.
  • #89: Added option to customize geom_text and geom_label arguments.
  • #91: create_report now displays full report directory after completion.
  • #95: Added better exception handling for plot_bar.
  • #98: Added band customization to plot_missing.
  • #100: Switched geom_text to geom_label.
  • #103: Report title can now be customized in create_report.
  • #108: Added option to treat binary features as discrete in plot_bar, plot_histogram, plot_density and plot_boxplot.
  • Updated d3.min.js to v5.9.2.

Bug Fixes

  • #88: Added plot_intro to report config.
  • #90: Added first plot in plot_prcomp to output and page_0.
  • #94: Fixed typo for PCA.

DataExplorer 0.7.1

Enhancements

  • #86: Replaced gridExtra::grid.arrange with facets.
  • Added seeds to vignette and README for reproducible examples.
  • Hid all internal functions.

DataExplorer 0.7.0

CRAN release: 2018-10-19

New Features

  • #72: Added plot_qq for QQ plot.
  • #76: Added plot_intro to visualize results of introduce.

Enhancements

  • #42: Applied S3 methods for plotting functions.
  • #77: dummify now works on selected columns.
  • #78: All ggplot objects from plot_* are now invisibly returned. As a result, profile_missing was extracted from plot_missing to return missing value profiles.
  • #83: Removed all deprecated functions.
  • #85: Users can now specify number of rows/columns for plot page layout.
  • plot_prcomp now passes scale. = TRUE to prcomp by default.
  • Added sampled_rows argument to plot_scatterplot.
  • Added option to parallelize plot object construction.
  • Updated default config for create_report.

Bug Fixes

  • #74: Fixed a bug causing create_report failure due to zero complete rows.
  • #75: Fixed a bug in plot_str when plotting data.frame with more than 100 columns.
  • #82: Removed hard-coded scales from all plot functions.
  • Fixed a bug causing wrong column indices in split_columns.
  • Fixed a bug using standard deviation instead of variance in plot_prcomp.

DataExplorer 0.6.1

Enhancements

  • Updated vignette for better clarity.
  • #71: Added better error handler for plot_prcomp.

Bug Fixes

  • #69: Fixed bug causing create_report failure (specifically from plot_prcomp) when y is specified.
  • Added more unit tests for create_report and plot_prcomp.

DataExplorer 0.6.0

CRAN release: 2018-05-30

New Features

  • #15: Added plot_prcomp to visualize principal component analysis.
  • #54: Extracted dummify from plot_correlation as a new function.
  • #59: Added introduce for basic metadata.

Enhancements

  • #41: create_report can now be customized.
  • #53: Added page number for plots that span multiple pages.
  • #56: Added support for theme and customization for individual components.
  • #62: plot_bar now supports optional measures (in addition to categorical frequency) using argument with.
  • #66: Feature engineering functions now work on other classes in addition to data.table.
  • plot_missing:
    • Percentage text labels in the output plot now have 2 decimals to prevent small percentages from being truncated to 0%.
    • Added example to quickly drop columns with too many missing values.
  • Added .ignoreCat and .getAllMissing to helper.

Bug Fixes

  • #55: Fixed bugs and updated vignette with latest functions.
  • #57: Fixed plot_str not supporting S4 objects.
  • #63: Fixed plot_histogram and plot_density not working with column names containing spaces.

DataExplorer 0.5.0

CRAN release: 2018-01-10

New Features

  • #48: Added plot_scatterplot to visualize the relationship of one feature against all others.
  • #50: Added plot_boxplot to visualize continuous distributions broken down by another feature.

Enhancements

  • #44: Added option to exclude categories in group_category.
  • #45: Added title option for all plots.
  • #46: Added option to exclude columns in set_missing.
  • #49 [Breaking Change]: Switched package to tidyverse style. All old functions are in .Deprecated mode. List of name changes in alphabetical order:
    • BarDiscrete -> plot_bar
    • CollapseCategory -> group_category
    • CorrelationContinuous -> plot_correlation(..., type = "continuous")
    • CorrelationDiscrete -> plot_correlation(..., type = "discrete")
    • DensityContinuous -> plot_density
    • DropVar -> drop_columns
    • GenerateReport -> create_report
    • HistogramContinuous -> plot_histogram
    • PlotMissing -> plot_missing
    • PlotStr -> plot_str
    • SetNaTo -> set_missing
    • SplitColType -> split_columns
  • #52: Combined CorrelationContinuous and CorrelationDiscrete into one function, and added option to view correlation of all features at once.
  • Optimized layout for multiple plots.

Bug Fixes

  • #47: Fixed color scale for correlation heatmap.

DataExplorer 0.4.0

CRAN release: 2017-01-26

New Features

  • #33: Added PlotStr to visualize data structure.
  • #40: Added network graph to GenerateReport.

Bug Fixes

  • #32: Fixed pandoc requirement error in unit test on CRAN.
  • #34: Fixed error message when quiet is not supplied. In addition, the report directory is now printed through message() instead of cat().
  • #35: Fixed rprojroot not found error.

Enhancements

  • #12: Added vignette: dataexplorer-intro.
  • #36: Fixed warnings from data.table in DropVar.
  • #37: Changed all cat() to message().
  • #38: Added option to order bars in BarDiscrete.
  • #39: Extended SetNaTo to discrete features.
  • Added more examples to README.md.

DataExplorer 0.3.0

CRAN release: 2016-11-19

New Features

  • #25: Added SetNaTo to quickly reset missing numerical values.
  • #29: Added DropVar to quickly drop variables by either name or column position.

Bug Fixes

  • #24: CorrelationDiscrete now displays all factor levels instead of full rank matrix from model.matrix.

Enhancements

  • #11: Functions with return values now return an object of the same class as the input.
  • #22: Added documentation for num_all_missing in SplitColType.
  • #23: Added additional measures (in addition to frequency) to CollapseCategory.
  • #26: Removed density estimation section from report template.
  • #31: Added flexibility to name the new category in CollapseCategory.

Other notes

  • #30: In CollapseCategory, update = TRUE only works when the input data is a data.table. However, it is still possible to view the frequency distribution with any input data class, as long as update = FALSE.

DataExplorer 0.2.6

CRAN release: 2016-05-08

Bug Fixes

  • #20: Fixed permission denied bug due to intermediates_dir argument in rmarkdown::render.

Enhancements

  • #16: Improved handling of missing values.

DataExplorer 0.2.5

Bug Fixes

  • #18: GenerateReport now handles data without discrete or continuous features.

Enhancements

  • #14: Updated rmarkdown template for GenerateReport.
  • #1: Features with all NA values will be ignored in BarDiscrete.

DataExplorer 0.2.4

CRAN release: 2016-03-02

Bug Fixes

  • Fixed a major bug in GenerateReport function due to package renaming.

Enhancements

  • GenerateReport will now print the directory of the report to console.

DataExplorer 0.2.3

CRAN release: 2016-03-01

New Features

  • Added function CollapseCategory to collapse sparse categories for discrete features.
  • Added correlation heatmap for both continuous and discrete features.
  • Added density plot for continuous features.

Bug Fixes

  • Fixed a bug in BarDiscrete and CorrelationDiscrete where non-factor classes were not plotted.
  • Minor changes for CRAN re-submission.

Enhancements

  • Changed grid layout for BarDiscrete and HistogramContinuous.
  • Features with all missing values will be ignored.
  • Switched position between continuous and discrete features in report template.
  • Renamed package to DataExplorer.
  • Added NEWS.md.
  • Removed BoxplotContinuous.