plot_*
function.R/plot.R
plot_var_all.Rd
Easily generate ggplot2 graphs for all (or a named vector of) variables in a data frame using a class-appropriate geometry via other elucidate plot_* functions with a restricted set of customization options and some modified defaults. The var2 argument also allows you to plot all variables against a specific named secondary variable. The collection of generated graphs will be combined into a single lattice-style figure with either the patchwork package or the trelliscopejs package. See the "Arguments" section for details and this blog post for an introduction to ggplot2. To obtain a plot of a single variable or vector, use plot_var instead. To obtain pairwise plots of all bivariate combinations of variables, use plot_var_pairs instead.
plot_var_all(
  data,
  var2 = NULL,
  group_var = NULL,
  cols = NULL,
  var2_lab = ggplot2::waiver(),
  title = ggplot2::waiver(),
  caption = ggplot2::waiver(),
  fill = "blue2",
  colour = "black",
  palette = c("plasma", "C", "magma", "A", "inferno", "B", "viridis", "D",
    "cividis", "E"),
  palette_direction = c("d2l", "l2d"),
  palette_begin = 0,
  palette_end = 0.8,
  alpha = 0.75,
  greyscale = FALSE,
  line_size = 1,
  theme = c("bw", "classic", "grey", "light", "dark", "minimal"),
  text_size = 14,
  font = c("sans", "serif", "mono"),
  legend_position = c("right", "left", "top", "bottom"),
  omit_legend = FALSE,
  dnorm = TRUE,
  violin = TRUE,
  var1_log10 = FALSE,
  var2_log10 = FALSE,
  point_size = 2,
  point_shape = c("circle", "square", "diamond", "triangle up", "triangle down"),
  regression_line = TRUE,
  regression_method = c("gam", "loess", "lm"),
  regression_se = TRUE,
  bar_position = c("dodge", "fill", "stack"),
  bar_width = 0.9,
  basic = FALSE,
  interactive = FALSE,
  trelliscope = FALSE,
  nrow = NULL,
  ncol = NULL,
  guides = "collect"
)
data: A data frame containing variables to be plotted.
var2: The (quoted or unquoted) name of a secondary variable to plot against all other variables in the input data (or a subset of them if the cols argument is used), where the latter set of "primary" variables will be automatically assigned to the var1 argument of plot_var. var2 is usually assigned to the x-axis. However, if the primary variable (i.e. var1) is a categorical (factor, character, or logical) variable and var2 is a numeric, integer, or date variable, var2 will be assigned to the y-axis and var1 will be assigned to the x-axis. If var1 and var2 are both categorical variables, var1 will be assigned to the x-axis and var2 will be assigned to facet_var.
group_var: Use if you want to map a grouping variable to fill colour and/or outline colour, e.g. group_var = "grouping_variable" or group_var = grouping_variable. Whether the grouping variable is mapped to fill, colour, or both depends upon which plot_* function is used (see the "Value" section). For density plots, both fill and colour are used for consistency across the main density plots and added normal density curve lines (if dnorm = TRUE). For bar graphs and box-and-whisker plots, the variable will be assigned to fill. For scatter plots, the variable will be assigned to colour. See aes for details.
cols: A character (or integer) vector of column names (or indices) which allows you to plot only a subset of the columns in the input data frame, where each of these primary variable columns will be automatically assigned to the var1 argument of plot_var. Note that a variable which has been assigned to var2 or group_var does not also need to be listed here.
var2_lab: A character string used to change the axis label for the variable assigned to var2. Ignored if var2 and the primary variable are both categorical variables (since var2 will be used for faceting in such cases).
title: A character string to add as a title at the top of the combined multiple-panel patchwork graph or trelliscopejs display.
caption: Add a figure caption to the bottom of the plot using a character string.
fill: Fill colour to use for density plots, bar graphs, and box plots. Ignored if a variable that has been assigned to group_var is mapped on to fill_var (see group_var argument information above). Default is "blue2". Use colour_options to see colour option examples.
colour: Outline colour to use for density plots, bar graphs, box plots, and scatter plots. Ignored if a variable that has been assigned to group_var is mapped on to colour_var (see group_var argument information above). Default is "black". Use colour_options to see colour option examples.
palette: If a variable is assigned to group_var, this determines which viridis colour palette to use. Options include "plasma" or "C" (default), "magma" or "A", "inferno" or "B", "viridis" or "D", and "cividis" or "E". See this link for examples.
palette_direction: Choose "d2l" for dark to light (default) or "l2d" for light to dark.
palette_begin: Value between 0 and 1 that determines where along the full range of the chosen colour palette's spectrum to begin sampling colours. See scale_fill_viridis_d for details.
palette_end: Value between 0 and 1 that determines where along the full range of the chosen colour palette's spectrum to end sampling colours. See scale_fill_viridis_d for details.
alpha: Adjusts the transparency/opacity of the main geometric objects in the generated plot, with acceptable values ranging from 0 (100% transparent) to 1 (100% opaque).
greyscale: Set to TRUE if you want the plot converted to greyscale.
line_size: Controls the thickness of plotted lines.
theme: Adjusts the theme using one of six predefined "complete" theme templates provided by ggplot2. Currently supported options are: "classic", "bw" (the elucidate default), "grey" (the ggplot2 default), "light", "dark", and "minimal". See theme_bw for more information.
text_size: Controls the size of all plot text. Default = 14.
font: Controls the font of all plot text. Default = "sans" (Arial). Other options include "serif" (Times New Roman) and "mono" (Courier New).
legend_position: Allows you to modify the legend position if a variable is assigned to group_var. Options include "right" (the default), "left", "top", and "bottom".
omit_legend: Set to TRUE if you want to remove/omit the legend(s). Ignored if group_var is unspecified.
dnorm: When TRUE (default), this adds a dashed line representing a normal/Gaussian density curve to density plots, which are rendered for plots of single numeric variables. Disabled if var1 is a date vector, var1_log10 = TRUE, or basic = TRUE.
violin: When TRUE (default), this adds violin plot outlines to box plots, which are rendered in cases where a mixture of numeric and categorical variables are assigned to var1 and var2. Disabled if basic = TRUE.
var1_log10: If TRUE, applies a base-10 logarithmic transformation to a numeric variable that has been assigned to var1. Ignored if var1 is a categorical variable.
var2_log10: If TRUE, applies a base-10 logarithmic transformation to a numeric variable that has been assigned to var2. Ignored if var2 is a categorical variable.
point_size: Controls the size of points used in scatter plots, which are rendered in cases where var1 and var2 are both numeric, integer, or date variables.
point_shape: Point shape to use in scatter plots, which are rendered in cases where var1 and var2 are both numeric, integer, or date variables.
regression_line: If TRUE (the default), adds a regression line to scatter plots, which are rendered in cases where var1 and var2 are both numeric, integer, or date variables. Disabled if basic = TRUE.
regression_method: If regression_line = TRUE, this determines the type of regression line to use. Currently available options are "gam", "loess", and "lm". "gam" is the default, which fits a generalized additive model using a smoothing term for x. This method has a longer run time, but typically provides a better fit to the data than other options and uses an optimization algorithm to determine the optimal wiggliness of the line. If the relationship between y and x is linear, the output will be equivalent to fitting a linear model. "loess" may be preferable to "gam" for small sample sizes. See stat_smooth and gam for details.
regression_se: If TRUE (the default), adds a 95% confidence envelope for the regression line. Ignored if regression_line = FALSE.
bar_position: In bar plots, which are rendered for one or more categorical variables, this determines how bars are arranged relative to one another when a grouping variable is assigned to group_var. The default, "dodge", uses position_dodge to arrange bars side-by-side; "stack" places the bars on top of each other; "fill" also stacks bars but additionally converts the y-axis from counts to proportions.
bar_width: In bar plots, which are rendered for one or more categorical variables, this adjusts the width of the bars (default = 0.9).
basic: This is a shortcut argument that allows you to simultaneously disable the dnorm, violin, and regression_line arguments to produce a basic version of a density, box, or scatter plot (depending on var1/var2 variable class(es)) without any of those additional layers. Dropping these extra layers may noticeably reduce rendering time and memory utilization, especially for larger sample sizes and/or when interactive = TRUE.
interactive: Determines whether a static ggplot object or an interactive HTML plotly object is returned. Interactive/plotly mode for multiple plots should only be used in conjunction with trelliscope = TRUE. See ggplotly for details. Note that in cases where a box plot is generated (for a mix of numeric and categorical variables) and a variable is also assigned to group_var, activating interactive/plotly mode will cause a spurious warning message about 'layout' objects not having a 'boxmode' attribute to be printed to the console. This is a documented bug with plotly that can be safely ignored, although unfortunately the message cannot currently be suppressed.
trelliscope: If changed to TRUE, plots will be combined into an interactive trelliscope display rather than a static patchwork graph grid. See trelliscope for more information.
nrow: Controls the number of rows to use when arranging plots in the combined patchwork or trelliscopejs display.
ncol: Controls the number of columns to use when arranging plots in the combined patchwork or trelliscopejs display.
guides: Controls the pooling of group_var legends/guides across plot panels if a categorical variable has been assigned to group_var and trelliscope = FALSE. See wrap_plots for details.
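As a rough illustration of what the three bar_position options change, the base R sketch below computes the bar heights that a grouped bar graph of the mtcars data would display when cyl is the primary variable and am is the grouping variable. It does not call elucidate or ggplot2; using table() and prop.table() here is an assumption made purely for illustration, and the object names (counts, dodge_heights, stacked_heights, fill_heights) are hypothetical.

```r
# Illustrative sketch only: base R arithmetic behind "dodge", "stack", and "fill".
# Primary variable: cyl (x-axis); grouping variable: am (fill colour).
counts <- table(am = mtcars$am, cyl = mtcars$cyl)

# "dodge": one bar per group within each cyl level; heights are the raw counts.
dodge_heights <- counts

# "stack": group bars are stacked, so each column's total height is the cyl count.
stacked_heights <- colSums(counts)

# "fill": stacked bars rescaled so that every column sums to 1 (proportions).
fill_heights <- prop.table(counts, margin = 2)

print(dodge_heights)
print(stacked_heights)
print(round(fill_heights, 2))
```

The only difference between "stack" and "fill" is the rescaling step, which is why "fill" is the option to choose when group composition matters more than group size.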
A static "patchwork" or dynamic "trelliscope" multi-panel graphical display of ggplot2 or plotly graphs depending upon the values of the trelliscope and interactive arguments. The type of graph (i.e. ggplot2::geom* layers) that is rendered in each panel will depend upon the classes of the chosen variables, as follows:
One numeric (classes numeric/integer/date) variable will be graphed with plot_density.
One or two categorical (classes factor/character/logical) variable(s) will be graphed with plot_bar.
Two numeric variables will be graphed with plot_scatter.
A mixture of numeric and categorical variables will be graphed with plot_box.
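The class-based rules above, together with the axis assignments described for var2, can be sketched in base R as follows. This is not the package's internal code: var_kind() and describe_panel() are hypothetical helpers written only to make the rules concrete, and treating "date" as class Date alone is an assumption.

```r
# Illustrative sketch only; not elucidate's implementation.
# Hypothetical helper: classify a vector using the class groupings listed above.
var_kind <- function(x) {
  if (is.numeric(x) || inherits(x, "Date")) {  # assumption: "date" means class Date
    "numeric"
  } else if (is.factor(x) || is.character(x) || is.logical(x)) {
    "categorical"
  } else {
    stop("unsupported class: ", class(x)[1])
  }
}

# Hypothetical helper: report which plot_* function would draw a panel and
# which role (x, y, facet) the primary (var1) and secondary (var2) variables get.
describe_panel <- function(var1, var2 = NULL) {
  k1 <- var_kind(var1)
  if (is.null(var2)) {
    # Single variable: density plot if numeric, bar graph if categorical.
    fn <- if (k1 == "numeric") "plot_density" else "plot_bar"
    return(list(fn = fn, x = "var1", y = NA_character_, facet = NA_character_))
  }
  k2 <- var_kind(var2)
  if (k1 == "numeric" && k2 == "numeric") {
    list(fn = "plot_scatter", x = "var2", y = "var1", facet = NA_character_)
  } else if (k1 == "categorical" && k2 == "categorical") {
    list(fn = "plot_bar", x = "var1", y = NA_character_, facet = "var2")
  } else if (k1 == "categorical") {
    # Categorical var1 with numeric var2: var2 moves to the y-axis.
    list(fn = "plot_box", x = "var1", y = "var2", facet = NA_character_)
  } else {
    # Numeric var1 with categorical var2: var2 stays on the x-axis.
    list(fn = "plot_box", x = "var2", y = "var1", facet = NA_character_)
  }
}

cyl <- as.factor(mtcars$cyl)
describe_panel(mtcars$hp)$fn        # "plot_density"
describe_panel(cyl, mtcars$mpg)$fn  # "plot_box"
```

Returning the roles as a small list keeps the sketch easy to inspect; the real function passes the variables on to the relevant plot_* function instead.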
Wickham, H. (2016). ggplot2: elegant graphics for data analysis. New York, N.Y.: Springer-Verlag.
data(mtcars) # load the mtcars data
# convert variable "cyl" to a factor
mtcars$cyl <- as.factor(mtcars$cyl)
# plot variables "hp", "wt", and "cyl" from the mtcars data frame
plot_var_all(mtcars, cols = c("hp", "wt", "cyl"))
# plot each of the same variables against column "mpg"
plot_var_all(mtcars, var2 = mpg, cols = c("hp", "wt", "cyl"))
# plot "hp" and "wt" against "mpg", grouped by "cyl"
plot_var_all(mtcars, var2 = mpg, group_var = cyl, cols = c("hp", "wt"),
             basic = TRUE, # disable regression lines/CIs
             ncol = 1, nrow = 2) # change the layout