Calculates means, medians, maximums, minimums, and percentiles for each day of the year of flow values
from a daily streamflow data set. Can determine statistics of rolling mean days (e.g. 7-day flows) using the roll_days
argument. Note that statistics are based on the numeric days of year (1-365) and not the date of year (Jan 1 - Dec 31).
Calculates statistics from all values, unless specified. Returns a tibble with statistics.
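The difference between the numeric day of year and the calendar date matters in leap years. The following base R sketch is only an illustration on synthetic data and is not how calc_daily_stats() is implemented; the flow values and the object names flow and daily are hypothetical.

```r
# Synthetic daily flows for two years (hypothetical values, not real station data)
dates <- seq(as.Date("2019-01-01"), as.Date("2020-12-31"), by = "day")
set.seed(1)
flow <- data.frame(
  Date = dates,
  Value = round(runif(length(dates), min = 1, max = 50), 2)
)

# Numeric day of year: Mar 1 is day 60 in 2019 but day 61 in leap year 2020
flow$DayofYear <- as.integer(format(flow$Date, "%j"))

# Summarise all values that share the same numeric day of year
daily <- aggregate(
  Value ~ DayofYear, data = flow,
  FUN = function(v) c(Mean = mean(v), Median = median(v),
                      Minimum = min(v), Maximum = max(v))
)

# aggregate() returns the statistics as a matrix column; flatten it
daily <- cbind(DayofYear = daily$DayofYear, as.data.frame(daily$Value))
head(daily)
```

Because grouping is by day number, the statistics for day 60 in this sketch mix Mar 1 (2019) with Feb 29 (2020).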
calc_daily_stats(
  data,
  dates = Date,
  values = Value,
  groups = STATION_NUMBER,
  station_number,
  percentiles = c(5, 25, 75, 95),
  roll_days = 1,
  roll_align = "right",
  water_year_start = 1,
  start_year,
  end_year,
  exclude_years,
  complete_years = FALSE,
  months = 1:12,
  transpose = FALSE,
  ignore_missing = FALSE
)
data  Data frame of daily data that contains columns of dates, flow values, and (optional) groups (e.g. station numbers).
Leave blank if using 

dates  Name of column in 
values  Name of column in 
groups  Name of column in 
station_number  Character string vector of seven digit Water Survey of Canada station numbers (e.g. 
percentiles  Numeric vector of percentiles to calculate. Set to 
roll_days  Numeric value of the number of days to apply a rolling mean. Default 1.
roll_align  Character string identifying the direction of the rolling mean from the specified date, either by the first
( 
water_year_start  Numeric value indicating the month ( 
start_year  Numeric value of the first year to consider for analysis. Leave blank to use the first year of the source data. 
end_year  Numeric value of the last year to consider for analysis. Leave blank to use the last year of the source data. 
exclude_years  Numeric vector of years to exclude from analysis. Leave blank to include all years. 
complete_years  Logical value indicating whether to include only years with complete data in analysis. Default FALSE.
months  Numeric vector of months to include in analysis (e.g. 
transpose  Logical value indicating whether to transpose rows and columns of results. Default FALSE.
ignore_missing  Logical value indicating whether dates with missing values should be included in the calculation. If
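
To make the roll_days and roll_align = "right" arguments concrete, here is a minimal base R sketch of a right-aligned rolling mean, where the value for a given day averages that day and the days before it. The helper roll_mean_right() and the flows vector are hypothetical and written for illustration; the package may compute rolling means differently.

```r
# Right-aligned rolling mean: the value for day i averages days (i - k + 1) to i.
# Assumes k is a positive whole number.
roll_mean_right <- function(x, k) {
  n <- length(x)
  out <- rep(NA_real_, n)
  if (k <= n) {
    for (i in k:n) out[i] <- mean(x[(i - k + 1):i])
  }
  out
}

flows <- c(10, 12, 14, 16, 18, 20, 22, 24)  # hypothetical daily flows
rolled <- roll_mean_right(flows, k = 7)
rolled
```

With a 7-day window, the first six days have no complete window and stay NA in this sketch; a missing value inside a window also yields NA for that day.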

A tibble data frame with the following columns:
Date  date (MMMDD) of daily statistics
DayofYear  day of year of daily statistics
Mean  daily mean of all flows for a given day of the year
Median  daily median of all flows for a given day of the year
Minimum  daily minimum of all flows for a given day of the year
Maximum  daily maximum of all flows for a given day of the year
P'n'  each daily n-th percentile selected of all flows for a given day of the year
Default percentile columns:
P5  daily 5th percentile of all flows for a given day of the year
P25  daily 25th percentile of all flows for a given day of the year
P75  daily 75th percentile of all flows for a given day of the year
P95  daily 95th percentile of all flows for a given day of the year
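
As an illustration of how the percentile columns relate to the percentiles argument, the sketch below computes percentiles for the flows of a single day of the year with base R quantile(). The values in day_flows are hypothetical, and the use of quantile()'s default interpolation method is an assumption; the package's exact method is not stated here.

```r
# Hypothetical flows observed on one day of the year across ten years
day_flows <- c(6.1, 7.4, 9.8, 11.0, 12.3, 14.0, 18.6, 22.5, 30.2, 39.6)
percentiles <- c(5, 25, 75, 95)

# quantile() expects probabilities in [0, 1], so divide the percentiles by 100
p <- quantile(day_flows, probs = percentiles / 100, na.rm = TRUE, names = FALSE)

# Label each result in the same style as the output columns (P5, P25, ...)
names(p) <- paste0("P", percentiles)
p
```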
# Run if HYDAT database has been downloaded (using tidyhydat::download_hydat())
if (file.exists(tidyhydat::hy_downloaded_db())) {

# Calculate daily statistics using data argument with defaults
flow_data <- tidyhydat::hy_daily_flows(station_number = "08NM116")
calc_daily_stats(data = flow_data,
                 start_year = 1980)

# Calculate daily statistics using station_number argument with defaults
calc_daily_stats(station_number = "08NM116",
                 start_year = 1980)

# Calculate daily statistics regardless if there is missing data for a given day of year
calc_daily_stats(station_number = "08NM116",
                 ignore_missing = TRUE)

# Calculate daily statistics using only years with no missing data
calc_daily_stats(station_number = "08NM116",
                 complete_years = TRUE)

# Calculate daily statistics for water years starting in October between 1980 and 2010
calc_daily_stats(station_number = "08NM116",
                 start_year = 1980,
                 end_year = 2010,
                 water_year_start = 10)

# Calculate daily statistics with custom years and removing certain years
calc_daily_stats(station_number = "08NM116",
                 start_year = 1981,
                 end_year = 2010,
                 exclude_years = c(1991, 1993:1995))

# Calculate daily statistics for 7-day flows for July-September months only,
# with 25 and 75th percentiles starting in 1980
calc_daily_stats(station_number = "08NM116",
                 start_year = 1980,
                 roll_days = 7,
                 months = 7:9,
                 percentiles = c(25, 75))

}
#> # A tibble: 92 x 9
#>    STATION_NUMBER Date  DayofYear  Mean Median Minimum Maximum   P25   P75
#>    <chr>          <chr>     <dbl> <dbl>  <dbl>   <dbl>   <dbl> <dbl> <dbl>
#>  1 08NM116        Jul01       182 14.0   11.0    0.717    39.6  6.79 18.6
#>  2 08NM116        Jul02       183 13.3   10.3    0.730    40.1  6.29 17.2
#>  3 08NM116        Jul03       184 12.7    9.84   0.932    39.6  6.20 15.8
#>  4 08NM116        Jul04       185 12.2    9.15   1.35     40.4  6.34 14.7
#>  5 08NM116        Jul05       186 11.7    8.70   1.41     41.4  6.02 13.9
#>  6 08NM116        Jul06       187 11.2    8.13   1.31     44.5  5.73 13.2
#>  7 08NM116        Jul07       188 10.7    7.49   1.10     44.6  5.31 11.9
#>  8 08NM116        Jul08       189 10.2    6.62   0.914    44.8  5.26 11.1
#>  9 08NM116        Jul09       190  9.49   6.05   0.747    42.3  4.93 10.8
#> 10 08NM116        Jul10       191  8.96   5.83   0.628    40.1  4.53  9.96
#> # ... with 82 more rows