Calculates the mean, median, maximum, minimum, and percentiles of flow values for each day of the year
from a daily streamflow data set. Statistics of rolling mean days (e.g. 7-day flows) can be calculated using the `roll_days`
argument. Note that statistics are based on the numeric day of year (1-365) and not the calendar date (Jan 1 - Dec 31).
Calculates statistics from all values, unless specified. Returns a tibble with statistics.
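
To make the day-of-year idea concrete, the following is a minimal base-R sketch of the kind of calculation described above. It is not the function's implementation: the data are synthetic, the names `flow`, `Roll7`, `daily_mean` and `daily_p25` are hypothetical, and the use of `stats::filter()` for the rolling mean is an assumption chosen for illustration.

```r
# Synthetic daily flows for two non-leap years (hypothetical data, not from HYDAT)
set.seed(1)
dates <- seq(as.Date("2001-01-01"), as.Date("2002-12-31"), by = "day")
flow  <- data.frame(Date = dates, Value = runif(length(dates), min = 1, max = 10))

# Right-aligned 7-day rolling mean: each value averages that day and the 6 days before it,
# so the first 6 values of the series are NA
flow$Roll7 <- as.numeric(stats::filter(flow$Value, rep(1 / 7, 7), sides = 1))

# Numeric day of year (1-365) rather than the calendar date
flow$DayofYear <- as.integer(format(flow$Date, "%j"))

# Summarise across years for each day of year.
# mean() keeps NA when any year is missing for that day;
# the percentile line drops NA values instead.
daily_mean <- tapply(flow$Roll7, flow$DayofYear, mean)
daily_p25  <- tapply(flow$Roll7, flow$DayofYear, quantile, probs = 0.25, na.rm = TRUE)

head(daily_mean, 8)
```

In this sketch the first six days of the year have an `NA` mean because the rolling window is incomplete at the start of the series, which is loosely analogous to leaving `ignore_missing = FALSE`.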

```
calc_daily_stats(
  data,
  dates = Date,
  values = Value,
  groups = STATION_NUMBER,
  station_number,
  percentiles = c(5, 25, 75, 95),
  roll_days = 1,
  roll_align = "right",
  water_year_start = 1,
  start_year,
  end_year,
  exclude_years,
  months = 1:12,
  transpose = FALSE,
  complete_years = FALSE,
  ignore_missing = FALSE
)
```

- data
Data frame of daily data that contains columns of dates, flow values, and (optional) groups (e.g. station numbers). Leave blank or set to `NULL` if using the `station_number` argument.

- dates
Name of the column in `data` that contains dates formatted YYYY-MM-DD. Only required if the dates column name is not 'Date' (default). Leave blank or set to `NULL` if using the `station_number` argument.

- values
Name of the column in `data` that contains numeric flow values, in units of cubic metres per second. Only required if the values column name is not 'Value' (default). Leave blank if using the `station_number` argument.

- groups
Name of the column in `data` that contains unique identifiers for different data sets, if applicable. Only required if the groups column name is not 'STATION_NUMBER'. The function will automatically group by a column named 'STATION_NUMBER' if present. Remove the 'STATION_NUMBER' column beforehand to remove this grouping. Leave blank if using the `station_number` argument.

- station_number
Character string vector of seven-character Water Survey of Canada station numbers (e.g. `"08NM116"`) for which to extract daily streamflow data from a HYDAT database. Requires the `tidyhydat` package and a HYDAT database. Leave blank if using the `data` argument.

- percentiles
Numeric vector of percentiles to calculate. Set to `NA` if none are required. Default `c(5,25,75,95)`.

- roll_days
Numeric value of the number of days over which to apply a rolling mean. Default `1`.

- roll_align
Character string identifying the direction of the rolling mean from the specified date, either by the first (`'left'`), last (`'right'`), or middle (`'center'`) day of the rolling n-day group of observations. Default `'right'`.

- water_year_start
Numeric value indicating the month (`1` through `12`) of the start of the water year for analysis. Default `1`.

- start_year
Numeric value of the first year to consider for analysis. Leave blank or set well before the start date (e.g. `1800`) to use from the first year of the source data.

- end_year
Numeric value of the last year to consider for analysis. Leave blank or set well after the end date (e.g. `2100`) to use up to the last year of the source data.

- exclude_years
Numeric vector of years to exclude from analysis. Leave blank or set to `NULL` to include all years.

- months
Numeric vector of months to include in analysis. For example, `3` for March, `6:8` for Jun-Aug, or `c(10:12,1)` for the first four months (Oct-Jan) when `water_year_start = 10` (Oct). Default summarizes all months (`1:12`).

- transpose
Logical value indicating whether to transpose rows and columns of results. Default `FALSE`.

- complete_years
Logical value indicating whether to include only years with complete data in analysis. Default `FALSE`.

- ignore_missing
Logical value indicating whether dates with missing values should be included in the calculation. If `TRUE` then a statistic will be calculated regardless of missing dates. If `FALSE` then only those statistics from time periods with no missing dates will be returned. Default `FALSE`.
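
As an illustration of the `data`, `dates`, `values` and `groups` arguments, here is a hedged sketch that passes a data frame whose column names differ from the defaults. It assumes `calc_daily_stats()` is provided by the fasstr package (the package is not named in this page); the data frame `my_flows` and its columns `site_id`, `obs_date` and `discharge` are hypothetical, and the values are synthetic.

```r
# Hypothetical data frame whose column names differ from the defaults
set.seed(42)
obs_dates <- seq(as.Date("2001-01-01"), as.Date("2003-12-31"), by = "day")
my_flows <- data.frame(
  site_id   = "SITE_A",                                    # grouping column
  obs_date  = obs_dates,                                   # dates as YYYY-MM-DD
  discharge = round(runif(length(obs_dates), 0.5, 20), 2)  # synthetic flows, m^3/s
)

# Assumption: calc_daily_stats() comes from the fasstr package.
# Only run when it is installed; a HYDAT database is not used on this path.
if (requireNamespace("fasstr", quietly = TRUE)) {
  daily_stats <- fasstr::calc_daily_stats(
    data        = my_flows,
    dates       = obs_date,
    values      = discharge,
    groups      = site_id,
    percentiles = c(10, 90)
  )
  print(head(daily_stats))
}
```

Because the column names are supplied explicitly, the `station_number` argument is left blank, as the argument descriptions above require.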

A tibble data frame with the following columns:

- Date
date (MMM-DD) of daily statistics

- DayofYear
day of year of daily statistics

- Mean
daily mean of all flows for a given day of the year

- Median
daily median of all flows for a given day of the year

- Maximum
daily maximum of all flows for a given day of the year

- Minimum
daily minimum of all flows for a given day of the year

- P'n'
each daily n-th percentile selected of all flows for a given day of the year

Default percentile columns:

- P5
daily 5th percentile of all flows for a given day of the year

- P25
daily 25th percentile of all flows for a given day of the year

- P75
daily 75th percentile of all flows for a given day of the year

- P95
daily 95th percentile of all flows for a given day of the year

Transposing data creates a column of "Statistics" and subsequent columns for each year selected.

```
# Run if HYDAT database has been downloaded (using tidyhydat::download_hydat())
if (file.exists(tidyhydat::hy_downloaded_db())) {

  # Calculate daily statistics using the station_number argument, starting in 1980
  calc_daily_stats(station_number = "08NM116",
                   start_year = 1980)

  # Calculate daily statistics regardless of whether data are missing for a given day of year
  calc_daily_stats(station_number = "08NM116",
                   ignore_missing = TRUE)

  # Calculate daily statistics using only years with no missing data
  calc_daily_stats(station_number = "08NM116",
                   complete_years = TRUE)

  # Calculate daily statistics for water years starting in October between 1980 and 2010
  calc_daily_stats(station_number = "08NM116",
                   start_year = 1980,
                   end_year = 2010,
                   water_year_start = 10)

}
#> # A tibble: 365 × 11
#>    STATION_…¹ Date  Dayof…²  Mean Median Minimum Maximum    P5   P25   P75   P95
#>    <chr>      <chr>   <dbl> <dbl>  <dbl>   <dbl>   <dbl> <dbl> <dbl> <dbl> <dbl>
#>  1 08NM116    Oct-…       1  1.91   1.52   0.519    6.12 0.646 0.826  2.37  4.66
#>  2 08NM116    Oct-…       2  1.98   1.46   0.501    7.93 0.704 0.943  2.17  4.38
#>  3 08NM116    Oct-…       3  1.97   1.51   0.455    8.53 0.641 1.06   2.33  4.50
#>  4 08NM116    Oct-…       4  1.87   1.5    0.464    7.60 0.626 1.05   2.31  3.90
#>  5 08NM116    Oct-…       5  2.02   1.30   0.554   10.6  0.623 0.958  2.36  4.23
#>  6 08NM116    Oct-…       6  1.96   1.34   0.549    9.41 0.591 0.986  2.29  3.95
#>  7 08NM116    Oct-…       7  1.96   1.48   0.455    6.87 0.604 1.04   2.30  4.60
#>  8 08NM116    Oct-…       8  1.98   1.57   0.530    5.75 0.583 1.01   2.57  4.67
#>  9 08NM116    Oct-…       9  1.96   1.43   0.457    5.99 0.551 0.911  2.27  5.15
#> 10 08NM116    Oct-…      10  1.89   1.53   0.546    5.90 0.593 0.961  2.25  4.66
#> # … with 355 more rows, and abbreviated variable names ¹STATION_NUMBER,
#> #   ²DayofYear
```