R/compute_frequency_analysis.R
compute_frequency_analysis.Rd
Performs a volume frequency analysis on custom data. Defaults to ranking by minimums; set use_max = TRUE to
rank by maximum flows. Calculates the statistics from the events and flow values provided. The data must contain
columns of events (e.g. years), their values (minimums or maximums), and identifiers (low-flows, high-flows, etc.).
The function uses all values in the provided data (no grouped analysis). The analysis methodology replicates that of
HEC-SSP. Returns a list of tibbles and plots.
compute_frequency_analysis(
  data,
  events = Year,
  values = Value,
  measures = Measure,
  use_max = FALSE,
  use_log = FALSE,
  prob_plot_position = c("weibull", "median", "hazen"),
  prob_scale_points = c(0.9999, 0.999, 0.99, 0.9, 0.5, 0.2, 0.1, 0.02, 0.01, 0.001, 1e-04),
  compute_fitting = TRUE,
  fit_distr = c("PIII", "weibull"),
  fit_distr_method = ifelse(fit_distr == "PIII", "MOM", "MLE"),
  fit_quantiles = c(0.975, 0.99, 0.98, 0.95, 0.9, 0.8, 0.5, 0.2, 0.1, 0.05, 0.01),
  plot_curve = TRUE,
  plot_axis_title = "Discharge (cms)"
)
data: A data frame that contains columns of events, flow values, and measures (data type).
events: Column in data that contains event identifiers, typically year values. Default 'Year'.
values: Column in data that contains numeric flow values, in units of cubic metres per second. Default 'Value'.
measures: Column in data that contains measure identifiers (for example, '7-day low' or 'Annual Max'). The column can hold
multiple measures (e.g. '7-day low' and '30-day low') if multiple statistics are desired. Default 'Measure'.
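As a rough illustration of the expected input layout, the sketch below stacks two measures into one data frame with the default column names. The flow values are made up for illustration only, and the object names (low_7, low_30, flow_data) are hypothetical.

```r
# Made-up flow values, for illustration only
low_7 <- data.frame(Year = 1980:1984,
                    Value = c(1.2, 0.9, 1.5, 1.1, 1.0),
                    Measure = "7-day low")
low_30 <- data.frame(Year = 1980:1984,
                     Value = c(1.6, 1.3, 1.9, 1.4, 1.2),
                     Measure = "30-day low")

# One row per event and measure; the Measure column tells the two series apart
flow_data <- rbind(low_7, low_30)
table(flow_data$Measure)
```

A data frame shaped like flow_data could then be passed as the data argument, with both measures held in the same column.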
use_max: Logical value to indicate using maximums rather than minimums for analysis. Default FALSE.
use_log: Logical value to indicate log-scale transforming of flow data before analysis. Default FALSE.
prob_plot_position: Character string indicating the plotting positions used in the frequency plots, one of 'weibull',
'median', or 'hazen'. Points are plotted against (i - a) / (n + 1 - a - b), where i is the rank of the value, n is the
sample size, and a and b are defined as: (a = 0, b = 0) for Weibull plotting positions; (a = 0.2, b = 0.3) for Median
plotting positions; and (a = 0.5, b = 0.5) for Hazen plotting positions. Default 'weibull'.
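The plotting-position formula can be sketched in a few lines of base R. This is a minimal illustration, not the package's internal code: plotting_position() is a hypothetical helper, the flow values are made up, and ranking in ascending order is an assumption.

```r
# Hypothetical helper implementing (i - a) / (n + 1 - a - b)
plotting_position <- function(i, n, a = 0, b = 0) {
  (i - a) / (n + 1 - a - b)
}

flows <- c(12.4, 8.1, 15.3, 9.7, 11.0)  # made-up values for illustration
n <- length(flows)
i <- rank(flows)                         # assumption: rank 1 is the smallest value

weibull <- plotting_position(i, n, a = 0,   b = 0)    # i / (n + 1)
hazen   <- plotting_position(i, n, a = 0.5, b = 0.5)  # (i - 0.5) / n
```

With a = 0 and b = 0 the expression reduces to i / (n + 1), and with a = 0.5 and b = 0.5 it reduces to (i - 0.5) / n.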
prob_scale_points: Numeric vector of probabilities to be plotted along the X axis in the frequency plot. Inverse of
return period. Default c(.9999, .999, .99, .9, .5, .2, .1, .02, .01, .001, .0001).
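Since each probability is the inverse of a return period, the default scale points can be converted with a single division. The sketch below is only an illustration; return_period and scale_table are hypothetical names, and the return periods are expressed in events (typically years).

```r
prob_scale_points <- c(0.9999, 0.999, 0.99, 0.9, 0.5, 0.2, 0.1, 0.02, 0.01, 0.001, 1e-04)

# Return period is the reciprocal of the probability
return_period <- 1 / prob_scale_points

scale_table <- data.frame(probability = prob_scale_points,
                          return_period = return_period)
```

For example, a probability of 0.01 corresponds to a return period of 100 events, and 0.5 to a return period of 2.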
compute_fitting: Logical value to indicate whether to fit plotting positions to a distribution. If FALSE, the output
contains only the data, plotting positions, and plot. Default TRUE.
fit_distr: Character string identifying the distribution to fit annual data, one of 'PIII' (Log Pearson Type III)
or 'weibull' (Weibull) distributions. Default 'PIII'.
fit_distr_method: Character string identifying the method used to fit the distribution, one of 'MOM' (method of
moments) or 'MLE' (maximum likelihood estimation). Defaults to 'MOM' if fit_distr = 'PIII' (the default) or
'MLE' if fit_distr = 'weibull'.
fit_quantiles: Numeric vector of quantiles to be estimated from the fitted distribution.
Default c(.975, .99, .98, .95, .90, .80, .50, .20, .10, .05, .01).
plot_curve: Logical value to indicate plotting the computed curve on the probability plot. Default TRUE.
plot_axis_title: Character string of the plot y-axis title. Default 'Discharge (cms)'.
A list with the following elements:
Data frame with provided data for analysis.
Data frame with plotting positions used in frequency plot.
ggplot2 object with plotting positions and (optional) fitted curve.
List of fitted objects from fitdistrplus.
Data frame with fitted quantiles.
if (FALSE) {
  # Working example:
  # Calculate some values to use for a frequency analysis
  # (requires years, values for those years, and the name of the measure/metric)
  low_flows <- calc_annual_lowflows(station_number = "08NM116",
                                    start_year = 1980,
                                    end_year = 2000,
                                    roll_days = 7)
  low_flows <- dplyr::select(low_flows, Year, Value = Min_7_Day)
  low_flows <- dplyr::mutate(low_flows, Measure = "7-Day")

  # Compute the frequency analysis using the default parameters
  results <- compute_frequency_analysis(data = low_flows,
                                        events = Year,
                                        values = Value,
                                        measures = Measure)
}