`R/compute_annual_trends.R`

`compute_annual_trends.Rd`

Calculates prewhitened nonlinear trends on annual streamflow data using the `zyp` package; review the `zyp` documentation for more information. Statistics are calculated from all values unless specified otherwise. Returns a list of tibbles and plots. All annual statistics are calculated using the `calc_all_annual_stats()` function, which in turn uses the `fasstr` functions named in the argument descriptions below.

```
compute_annual_trends(
  data,
  dates = Date,
  values = Value,
  groups = STATION_NUMBER,
  station_number,
  zyp_method,
  basin_area,
  water_year_start = 1,
  start_year,
  end_year,
  exclude_years,
  months = 1:12,
  annual_percentiles = c(10, 90),
  monthly_percentiles = c(10, 20),
  stats_days = 1,
  stats_align = "right",
  lowflow_days = c(1, 3, 7, 30),
  lowflow_align = "right",
  timing_percent = c(25, 33, 50, 75),
  normal_percentiles = c(25, 75),
  complete_years = FALSE,
  ignore_missing = FALSE,
  allowed_missing_annual = ifelse(ignore_missing, 100, 0),
  allowed_missing_monthly = ifelse(ignore_missing, 100, 0),
  include_plots = TRUE,
  zyp_alpha
)
```
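The two `allowed_missing_*` defaults in the usage above are written in terms of `ignore_missing`. This works because R evaluates default arguments lazily, inside the function body. The following is a minimal base-R sketch of that pattern only; `missing_threshold()` is a hypothetical stand-in written for illustration and is not part of `fasstr`.

```r
# Hypothetical helper (not a fasstr function) that mimics how
# allowed_missing_annual takes its default from ignore_missing.
missing_threshold <- function(ignore_missing = FALSE,
                              allowed_missing = ifelse(ignore_missing, 100, 0)) {
  # The default expression is evaluated here, on first use, so it sees
  # whatever ignore_missing was set to in the call.
  allowed_missing
}

# Default follows ignore_missing = FALSE
default_strict <- missing_threshold()
# Default follows ignore_missing = TRUE
default_lenient <- missing_threshold(ignore_missing = TRUE)
# An explicitly supplied value takes precedence over the default
explicit_value <- missing_threshold(ignore_missing = TRUE, allowed_missing = 25)

print(c(default_strict, default_lenient, explicit_value))
```

The three calls return 0, 100 and 25 respectively, which mirrors the documented behaviour that an explicit `allowed_missing_*` value supersedes `ignore_missing`.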

- data
Data frame of daily data that contains columns of dates, flow values, and (optional) groups (e.g. station numbers). Leave blank or set to `NULL` if using the `station_number` argument.

- dates
Name of column in `data` that contains dates formatted YYYY-MM-DD. Only required if the dates column name is not 'Date' (default). Leave blank or set to `NULL` if using the `station_number` argument.

- values
Name of column in `data` that contains numeric flow values, in units of cubic metres per second. Only required if the values column name is not 'Value' (default). Leave blank if using the `station_number` argument.

- groups
Name of column in `data` that contains unique identifiers for different data sets, if applicable. Only required if the groups column name is not 'STATION_NUMBER'. The function will automatically group by a column named 'STATION_NUMBER' if present. Remove the 'STATION_NUMBER' column beforehand to remove this grouping. Leave blank if using the `station_number` argument.

- station_number
Character string vector of seven-character Water Survey of Canada station numbers (e.g. `"08NM116"`) for which to extract daily streamflow data from a HYDAT database. Requires the `tidyhydat` package and a HYDAT database. Leave blank if using the `data` argument.

- zyp_method
Character string identifying the prewhitened trend method to use from `zyp`, either `'zhang'` or `'yuepilon'`. `'zhang'` is recommended over `'yuepilon'` for hydrologic applications (Bürger 2017; Zhang and Zwiers 2004). Required.

- basin_area
Upstream drainage basin area, in square kilometres, to apply to observations. Three options:

(1) Leave blank if `groups` is STATION_NUMBER with HYDAT station numbers, to extract basin areas from HYDAT.

(2) A single numeric value to apply to all observations.

(3) A named vector giving the basin area for each group/station in `groups` (overriding the HYDAT value if listed), such as `c("08NM116" = 795, "08NM242" = 10)`. If a group is not listed, the HYDAT area is applied if it exists; otherwise it is `NA`.

- water_year_start
Numeric value indicating the month (`1` through `12`) of the start of the water year for analysis. Default `1`.

- start_year
Numeric value of the first year to consider for analysis. Leave blank or set well before the start date (e.g. `1800`) to use from the first year of the source data.

- end_year
Numeric value of the last year to consider for analysis. Leave blank or set well after the end date (e.g. `2100`) to use up to the last year of the source data.

- exclude_years
Numeric vector of years to exclude from analysis. Leave blank or set to `NULL` to include all years.

- months
Numeric vector of months to include in analysis. For example, `3` for March, `6:8` for Jun-Aug, or `c(10:12,1)` for the first four months (Oct-Jan) when `water_year_start = 10` (Oct). Default summarizes all months (`1:12`). If not all months are selected, seasonal total yield and volumetric flows will not be included.

- annual_percentiles
Numeric vector of percentiles to calculate annually. Set to `NA` if none required. Used for the `calc_annual_stats()` function. Default `c(10,90)`.

- monthly_percentiles
Numeric vector of percentiles to calculate monthly for each year. Set to `NA` if none required. Used for the `calc_monthly_stats()` function. Default `c(10,20)`.

- stats_days
Numeric vector of the number of days to apply a rolling mean on basic stats. Default `1`. Used for the `calc_annual_stats()` and `calc_monthly_stats()` functions.

- stats_align
Character string identifying the direction of the rolling mean on basic stats from the specified date, either by the first (`'left'`), last (`'right'`), or middle (`'center'`) day of the rolling n-day group of observations. Default `'right'`. Used for the `calc_annual_stats()`, `calc_monthly_stats()`, and `calc_annual_normal_days()` functions.

- lowflow_days
Numeric vector of the number of days to apply a rolling mean on low flow stats. Default `c(1,3,7,30)`. Used for the `calc_lowflow_stats()` function.

- lowflow_align
Character string identifying the direction of the rolling mean on low flow stats from the specified date, either by the first (`'left'`), last (`'right'`), or middle (`'center'`) day of the rolling n-day group of observations. Default `'right'`. Used for the `calc_lowflow_stats()` function.

- timing_percent
Numeric vector of percentages of annual total flow for which to determine dates. Used for the `calc_annual_flow_timing()` function. Default `c(25,33,50,75)`.

- normal_percentiles
Numeric vector of two values, lower and upper percentiles, respectively, indicating the limits of the normal range. Default `c(25,75)`.

- complete_years
Logical value indicating whether to include only years with complete data in analysis. Default `FALSE`.

- ignore_missing
Logical value indicating whether dates with missing values should be included in the calculation. If `TRUE` then a statistic will be calculated regardless of missing dates. If `FALSE` then only those statistics from time periods with no missing dates will be returned. Default `FALSE`.

- allowed_missing_annual
Numeric value between 0 and 100 indicating the **percentage** of missing dates allowed when calculating an annual statistic. If `ignore_missing = FALSE` it defaults to `0` (zero missing dates allowed); if `ignore_missing = TRUE` it defaults to `100` (any missing dates allowed), consistent with `ignore_missing` usage. Supersedes `ignore_missing` when used. Applies only to annual means, percentiles, minimums, and maximums.

- allowed_missing_monthly
Numeric value between 0 and 100 indicating the **percentage** of missing dates allowed when calculating a monthly statistic. If `ignore_missing = FALSE` it defaults to `0` (zero missing dates allowed); if `ignore_missing = TRUE` it defaults to `100` (any missing dates allowed), consistent with `ignore_missing` usage. Supersedes `ignore_missing` when used. Applies only to monthly means, percentiles, minimums, and maximums.

- include_plots
Logical value indicating whether annual trending plots should be included. Default `TRUE`.

- zyp_alpha
Numeric value of the significance level (e.g. `0.05`) below which a trend line is plotted. Leave blank for no line.
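The `stats_align` and `lowflow_align` arguments describe where the reported date sits within each rolling n-day window. As a rough illustration of the three alignments, the base-R sketch below uses `stats::filter()`. The `roll_mean()` helper and the flow values are hypothetical and written only for this example; `fasstr` may compute its rolling means differently.

```r
# Hypothetical helper (not part of fasstr): rolling n-day mean with a
# choice of alignment, using only base R.
roll_mean <- function(x, n, align = c("right", "left", "center")) {
  align <- match.arg(align)
  weights <- rep(1 / n, n)
  out <- switch(align,
    # sides = 1 averages the current value and the n - 1 values before it,
    # so the result is reported on the last day of the window
    right = stats::filter(x, weights, sides = 1),
    # sides = 2 centres the window on the current value
    center = stats::filter(x, weights, sides = 2),
    # reversing the series turns "previous values" into "following values",
    # so the result is reported on the first day of the window
    left = rev(stats::filter(rev(x), weights, sides = 1))
  )
  as.numeric(out)
}

# Made-up daily flows, for illustration only
flow <- c(10, 12, 14, 16, 18, 20, 22)

print(roll_mean(flow, 3, "right"))
print(roll_mean(flow, 3, "left"))
print(roll_mean(flow, 3, "center"))
```

With a 3-day window, the right-aligned series is `NA` for the first two days, the left-aligned series is `NA` for the last two, and the centred series is `NA` at each end.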

A list of tibbles and optional plots from the trending analysis including:

- Annual_Trends_Data
a tibble of the annual statistics used for trending

- Annual_Trends_Results
a tibble of the results of the zyp trending analysis

- Annual_*
a ggplot2 object for each annual trended statistic
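The results in `Annual_Trends_Results` come from `zyp`, whose trend estimates are built on Sen's slope (Sen 1968, in the reference list below). As a rough illustration of that estimator alone, without the prewhitening step that `zyp` adds, the sketch below computes the median of all pairwise slopes. The `sen_slope()` helper and the flow values are invented for this example and are not taken from `zyp` or `fasstr`.

```r
# Hypothetical helper (not the zyp implementation): Sen's slope as the
# median of the slopes between every pair of observations.
sen_slope <- function(y, x = seq_along(y)) {
  stopifnot(length(y) == length(x), length(y) >= 2)
  pairs <- utils::combn(length(y), 2)   # each column is one pair i < j
  slopes <- (y[pairs[2, ]] - y[pairs[1, ]]) / (x[pairs[2, ]] - x[pairs[1, ]])
  stats::median(slopes, na.rm = TRUE)
}

# Synthetic annual mean flows (cms), for illustration only
years <- 2001:2010
mean_flow <- c(5.1, 4.8, 5.6, 5.9, 5.4, 6.2, 6.0, 6.8, 6.5, 7.1)

# Estimated change in annual mean flow per year
print(sen_slope(mean_flow, years))

# A single extreme year barely moves the estimate, unlike a least-squares fit
print(sen_slope(c(1, 3, 5, 70, 9)))
```

Because the estimate is a median, one extreme year has little effect on it, which is one reason the approach is common for hydrologic series.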

References:

Bürger, G. 2017. On trend detection. Hydrological Processes 31, 4039–4042. https://doi.org/10.1002/hyp.11280.

Sen, P.K., 1968. Estimates of the Regression Coefficient Based on Kendall's Tau. Journal of the American Statistical Association Vol. 63, No. 324: 1379-1389.

Wang, X.L. and Swail, V.R., 2001. Changes in extreme wave heights in northern hemisphere oceans and related atmospheric circulation regimes. Journal of Climate, 14: 2204-2221.

Yue, S., P. Pilon, B. Phinney and G. Cavadias, 2002. The influence of autocorrelation on the ability to detect trend in hydrological series. Hydrological Processes, 16: 1807-1829.

Zhang, X., Vincent, L.A., Hogg, W.D. and Niitsoo, A., 2000. Temperature and Precipitation Trends in Canada during the 20th Century. Atmosphere-Ocean 38(3): 395-429.

Zhang, X., Zwiers, F.W., 2004. Comment on “Applicability of prewhitening to eliminate the influence of serial correlation on the Mann-Kendall test” by Sheng Yue and Chun Yuan Wang. Water Resources Research 40. https://doi.org/10.1029/2003WR002073.

```
if (FALSE) {
  # Working examples:

  # Compute trends statistics using a data frame and data argument with defaults
  flow_data <- tidyhydat::hy_daily_flows(station_number = "08NM116")
  trends <- compute_annual_trends(data = flow_data,
                                  zyp_method = "zhang")

  # Compute trends statistics using station_number with defaults
  trends <- compute_annual_trends(station_number = "08NM116",
                                  zyp_method = "zhang")

  # Compute trends statistics and plot a trend line if the significance is less than 0.05
  trends <- compute_annual_trends(station_number = "08NM116",
                                  zyp_method = "zhang",
                                  zyp_alpha = 0.05)

  # Compute trends statistics and do not plot the results
  trends <- compute_annual_trends(station_number = "08NM116",
                                  zyp_method = "zhang",
                                  include_plots = FALSE)
}
```
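The examples above all use the default calendar year (`water_year_start = 1`) and all months. To see why `months = c(10:12,1)` selects the first four months of a water year starting in October, the base-R sketch below lists the calendar months in water-year order. `water_year_months()` is a hypothetical helper written for illustration and is not a `fasstr` function.

```r
# Hypothetical helper (not part of fasstr): calendar months listed in the
# order they occur within a water year starting in water_year_start.
water_year_months <- function(water_year_start = 1) {
  stopifnot(water_year_start %in% 1:12)
  # Shift the sequence 0..11 by the start month and wrap around at 12
  ((water_year_start - 1 + 0:11) %% 12) + 1
}

october_year <- water_year_months(10)

# Months in order for a water year starting in October
print(october_year)

# The first four months of that water year
print(head(october_year, 4))
```

For an October start the order is 10, 11, 12, 1, 2, ..., 9, so the first four months are October through January, matching the `months` example in the argument description.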