`R/compute_annual_frequencies.R`

`compute_annual_frequencies.Rd`

Performs a flow volume frequency analysis on annual statistics from a daily streamflow data set. Defaults to a low-flow frequency analysis using annual minimums; set `use_max = TRUE` for an annual high-flow frequency analysis. Statistics are calculated from all values in the `values` column unless otherwise specified (no grouped analysis). The analysis methodology replicates that of HEC-SSP. Returns a list of tibbles and plots.
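As a rough illustration of what "annual statistics" means here, the sketch below computes a right-aligned 7-day rolling mean from a synthetic daily series and then takes each calendar year's minimum, using base R only. It is not the package's implementation: the helper `roll_mean_right()`, the column names `min_1day` and `min_7day`, and the synthetic flows are all assumptions made for the example.

```r
# Hypothetical helper: right-aligned n-day rolling mean. Each result averages
# the current day and the n - 1 days before it; the first n - 1 results are NA.
roll_mean_right <- function(x, n) {
  as.numeric(stats::filter(x, rep(1 / n, n), sides = 1))
}

# Synthetic daily flows for two calendar years (illustrative values only)
dates <- seq(as.Date("2001-01-01"), as.Date("2002-12-31"), by = "day")
flow  <- 10 + 5 * sin(2 * pi * as.numeric(format(dates, "%j")) / 365)

daily <- data.frame(
  Year  = as.integer(format(dates, "%Y")),
  Q1Day = flow,
  Q7Day = roll_mean_right(flow, 7)
)

# Annual minimum of each series; swap min for max to mimic a high-flow analysis
annual_min <- data.frame(
  Year     = sort(unique(daily$Year)),
  min_1day = as.numeric(tapply(daily$Q1Day, daily$Year, min, na.rm = TRUE)),
  min_7day = as.numeric(tapply(daily$Q7Day, daily$Year, min, na.rm = TRUE))
)
```

The sketch assumes calendar years (the equivalent of `water_year_start = 1`) and complete data; the function's own handling of water years and missing dates is described under the arguments.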

```
compute_annual_frequencies(
  data,
  dates = Date,
  values = Value,
  station_number,
  roll_days = c(1, 3, 7, 30),
  roll_align = "right",
  use_max = FALSE,
  use_log = FALSE,
  prob_plot_position = c("weibull", "median", "hazen"),
  prob_scale_points = c(0.9999, 0.999, 0.99, 0.9, 0.5, 0.2, 0.1, 0.02, 0.01, 0.001,
    1e-04),
  fit_distr = c("PIII", "weibull"),
  fit_distr_method = ifelse(fit_distr == "PIII", "MOM", "MLE"),
  fit_quantiles = c(0.975, 0.99, 0.98, 0.95, 0.9, 0.8, 0.5, 0.2, 0.1, 0.05, 0.01),
  plot_curve = TRUE,
  water_year_start = 1,
  start_year,
  end_year,
  exclude_years,
  months = 1:12,
  complete_years = FALSE,
  ignore_missing = FALSE,
  allowed_missing = ifelse(ignore_missing, 100, 0)
)
```

- data
A data frame of daily data that contains columns of dates and flow values. Groupings (e.g. station numbers) and the `groups` argument are not used by this function. Leave blank or set to `NULL` if using the `station_number` argument.

- dates
Name of the column in `data` that contains dates formatted YYYY-MM-DD. Only required if the dates column name is not 'Date' (default). Leave blank or set to `NULL` if using the `station_number` argument.

- values
Name of the column in `data` that contains numeric flow values, in units of cubic metres per second. Only required if the values column name is not 'Value' (default). Leave blank if using the `station_number` argument.

- station_number
Character string vector of seven-digit Water Survey of Canada station numbers (e.g. `"08NM116"`) for which to extract daily streamflow data from a HYDAT database. Requires the `tidyhydat` package and a HYDAT database. Leave blank if using the `data` argument.

- roll_days
Numeric vector of the numbers of days over which to apply a rolling mean. Default `c(1, 3, 7, 30)`.

- roll_align
Character string identifying the direction of the rolling mean from the specified date, either by the first (`'left'`), last (`'right'`), or middle (`'center'`) day of the rolling n-day group of observations. Default `'right'`.

- use_max
Logical value to indicate using maximums rather than the minimums for analysis. Default `FALSE`.

- use_log
Logical value to indicate log-scale transforming of flow data before analysis. Default `FALSE`.

- prob_plot_position
Character string indicating the plotting positions used in the frequency plots, one of `'weibull'`, `'median'`, or `'hazen'`. Points are plotted against (i-a)/(n+1-a-b) where `i` is the rank of the value; `n` is the sample size and `a` and `b` are defined as: (a=0, b=0) for Weibull plotting positions; (a=.2; b=.3) for Median plotting positions; and (a=.5; b=.5) for Hazen plotting positions. Default `'weibull'`.

- prob_scale_points
Numeric vector of probabilities to be plotted along the X axis in the frequency plot. Inverse of return period. Default `c(.9999, .999, .99, .9, .5, .2, .1, .02, .01, .001, .0001)`.

- fit_distr
Character string identifying the distribution to fit annual data, one of `'PIII'` (Log Pearson Type III) or `'weibull'` (Weibull) distributions. Default `'PIII'`.

- fit_distr_method
Character string identifying the method used to fit the distribution, one of `'MOM'` (method of moments) or `'MLE'` (maximum likelihood estimation). Selected as `'MOM'` if `fit_distr = 'PIII'` (default) or `'MLE'` if `fit_distr = 'weibull'`.

- fit_quantiles
Numeric vector of quantiles to be estimated from the fitted distribution. Default `c(.975, .99, .98, .95, .90, .80, .50, .20, .10, .05, .01)`.

- plot_curve
Logical value to indicate plotting the computed curve on the probability plot. Default `TRUE`.

- water_year_start
Numeric value indicating the month (`1` through `12`) of the start of water year for analysis. Default `1`.

- start_year
Numeric value of the first year to consider for analysis. Leave blank or set well before start date (e.g. `1800`) to use from the first year of the source data.

- end_year
Numeric value of the last year to consider for analysis. Leave blank or set well after end date (e.g. `2100`) to use up to the last year of the source data.

- exclude_years
Numeric vector of years to exclude from analysis. Leave blank or set to `NULL` to include all years.

- months
Numeric vector of months to include in analysis. For example, `3` for March, `6:8` for Jun-Aug or `c(10:12,1)` for first four months (Oct-Jan) when `water_year_start = 10` (Oct). Default summarizes all months (`1:12`).

- complete_years
Logical value indicating whether to include only years with complete data in analysis. Default `FALSE`.

- ignore_missing
Logical value indicating whether dates with missing values should be included in the calculation. If `TRUE` then a statistic will be calculated regardless of missing dates. If `FALSE` then only those statistics from time periods with no missing dates will be returned. Default `FALSE`.

- allowed_missing
Numeric value between 0 and 100 indicating the **percentage** of missing dates allowed to be included to calculate a statistic (0 to 100 percent). If `ignore_missing = FALSE` then it defaults to `0` (zero missing dates allowed); if `ignore_missing = TRUE` then it defaults to `100` (any missing dates allowed), consistent with `ignore_missing` usage. Supersedes `ignore_missing` when used.
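The plotting-position formula given under `prob_plot_position`, (i-a)/(n+1-a-b), can be sketched in a few lines of base R. The helper name `plotting_position()` and the sample size are assumptions for illustration only; this is not the package's internal code. Only the Weibull and Hazen constants listed above are shown.

```r
# Hypothetical helper: plotting position for rank i in a sample of size n,
# following the formula (i - a) / (n + 1 - a - b) from the argument description
plotting_position <- function(i, n, a = 0, b = 0) {
  (i - a) / (n + 1 - a - b)
}

n     <- 9            # illustrative sample size (e.g. nine annual values)
ranks <- seq_len(n)   # ranks 1..n

# Weibull: a = 0, b = 0, which reduces to i / (n + 1)
weibull <- plotting_position(ranks, n, a = 0, b = 0)

# Hazen: a = 0.5, b = 0.5, which reduces to (i - 0.5) / n
hazen <- plotting_position(ranks, n, a = 0.5, b = 0.5)
```

Both sets of positions lie strictly between 0 and 1, so no observation is assigned a probability of exactly 0 or 1 on the frequency plot.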

A list with the following elements:

- Freq_Analysis_Data
Data frame with computed annual summary statistics used in analysis.

- Freq_Plot_Data
Data frame with co-ordinates used in frequency plot.

- Freq_Plot
ggplot2 object with frequency plot.

- Freq_Fitting
List of fitted objects from fitdistrplus.

- Freq_Fitted_Quantiles
Data frame with fitted quantiles.

```
if (FALSE) {
  # Working examples (see arguments for further analysis options):

  # Compute an annual frequency analysis using default arguments
  results <- compute_annual_frequencies(station_number = "08NM116",
                                        start_year = 1980,
                                        end_year = 2010)

  # Compute an annual frequency analysis using default arguments (as listed)
  results <- compute_annual_frequencies(station_number = "08NM116",
                                        roll_days = c(1,3,7,30),
                                        start_year = 1980,
                                        end_year = 2010,
                                        prob_plot_position = "weibull",
                                        prob_scale_points = c(.9999, .999, .99, .9, .5,
                                                              .2, .1, .02, .01, .001, .0001),
                                        fit_distr = "PIII",
                                        fit_distr_method = "MOM")

  # Compute a 7-day annual frequency analysis with "median" plotting positions
  # and fitting the data to a weibull distribution (not default PIII)
  results <- compute_annual_frequencies(station_number = "08NM116",
                                        roll_days = 7,
                                        start_year = 1980,
                                        end_year = 2010,
                                        prob_plot_position = "median",
                                        fit_distr = "weibull")
}
```