R/calc_longterm_daily_stats.R
calc_longterm_daily_stats.Rd
Calculates the long-term mean, median, maximum, minimum, and percentiles of daily flow values for each month and over all data (Long-term) from a daily streamflow data set. Calculates statistics from all values, unless otherwise specified. Returns a tibble with statistics.
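As a rough illustration of what "long-term daily statistics" means here, the following base R sketch computes the same kinds of summaries on synthetic data. It is not the package's implementation: the data are randomly generated, and `summarise_flows()` is a hypothetical helper introduced only for this example.

```r
# Synthetic daily flows (hypothetical values, not HYDAT data)
set.seed(1)
flow <- data.frame(
  Date = seq(as.Date("2000-01-01"), as.Date("2002-12-31"), by = "day")
)
flow$Value <- round(runif(nrow(flow), min = 0.5, max = 20), 2)
flow$Month <- factor(month.abb[as.integer(format(flow$Date, "%m"))],
                     levels = month.abb)

# Hypothetical helper: summary statistics for one vector of daily values
summarise_flows <- function(x, percentiles = c(10, 90)) {
  p <- stats::quantile(x, probs = percentiles / 100, names = FALSE)
  names(p) <- paste0("P", percentiles)
  c(Mean = mean(x), Median = stats::median(x),
    Maximum = max(x), Minimum = min(x), p)
}

# One row per month (all years pooled), plus one row over all data
monthly  <- t(sapply(split(flow$Value, flow$Month), summarise_flows))
longterm <- summarise_flows(flow$Value)
stats    <- rbind(monthly, "Long-term" = longterm)

print(round(stats, 2))
```

Each monthly row pools every daily value that falls in that month across all years, while the "Long-term" row uses every daily value in the record.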
calc_longterm_daily_stats(
  data,
  dates = Date,
  values = Value,
  groups = STATION_NUMBER,
  station_number,
  percentiles = c(10, 90),
  roll_days = 1,
  roll_align = "right",
  water_year_start = 1,
  start_year,
  end_year,
  exclude_years,
  months = 1:12,
  complete_years = FALSE,
  include_longterm = TRUE,
  custom_months,
  custom_months_label,
  transpose = FALSE,
  ignore_missing = FALSE
)
data | Data frame of daily data that contains columns of dates, flow values, and (optional) groups (e.g. station numbers).
Leave blank if using |
---|---|
dates | Name of column in |
values | Name of column in |
groups | Name of column in |
station_number | Character string vector of seven digit Water Survey of Canada station numbers (e.g. |
percentiles | Numeric vector of percentiles to calculate. Set to |
roll_days | Numeric value of the number of days to apply a rolling mean. Default |
roll_align | Character string identifying the direction of the rolling mean from the specified date, either by the first
( |
water_year_start | Numeric value indicating the month ( |
start_year | Numeric value of the first year to consider for analysis. Leave blank to use the first year of the source data. |
end_year | Numeric value of the last year to consider for analysis. Leave blank to use the last year of the source data. |
exclude_years | Numeric vector of years to exclude from analysis. Leave blank to include all years. |
months | Numeric vector of months to include in analysis (e.g. |
complete_years | Logical value indicating whether to include only years with complete data in analysis. Default |
include_longterm | Logical value indicating whether to include long-term calculation of all data. Default |
custom_months | Numeric vector of months to combine and summarize (e.g. |
custom_months_label | Character string to label custom months. For example, if |
transpose | Logical value indicating whether to transpose rows and columns of results. Default |
ignore_missing | Logical value indicating whether dates with missing values should be included in the calculation. If
|
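The `roll_days` and `roll_align` arguments describe a rolling mean applied to the daily values before the statistics are calculated. The sketch below shows a right-aligned 3-day rolling mean in base R, assuming that "right" means each window ends on the date it is assigned to (the usual convention for rolling functions). It uses `stats::filter()` purely for illustration and is not necessarily how the package computes it.

```r
# Hypothetical daily values for six consecutive days
values    <- c(2, 4, 6, 8, 10, 12)
roll_days <- 3

# sides = 1 uses only the current and preceding values, so each result
# is the mean of the window that ends on that day (right-aligned)
rolled <- as.numeric(
  stats::filter(values, rep(1 / roll_days, roll_days), sides = 1)
)

# The first (roll_days - 1) entries are NA because the window is incomplete
print(data.frame(day = seq_along(values), value = values, rolled = rolled))
```

With a right-aligned window the first `roll_days - 1` days have no rolling value, which is one way missing values can enter the later statistics.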
A tibble data frame with the following columns:
Month: month of the year, including 'Long-term' for all months, and 'Custom-Months' if selected
Mean: mean of all daily data for a given month and long-term over all years
Median: median of all daily data for a given month and long-term over all years
Maximum: maximum of all daily data for a given month and long-term over all years
Minimum: minimum of all daily data for a given month and long-term over all years
P'n': each n-th percentile selected for a given month and long-term over all years
P10: annual 10th percentile selected for a given month and long-term over all years
P90: annual 90th percentile selected for a given month and long-term over all years
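The 'Custom-Months' row can be pictured as the same statistics computed on the pooled daily values from the chosen months. The base R sketch below illustrates that idea on synthetic data. It assumes that values from the selected months are pooled across all years, and the data and variable names are hypothetical rather than taken from the package.

```r
# Synthetic daily flows for two full years (hypothetical values)
set.seed(42)
flow <- data.frame(
  Date = seq(as.Date("2001-01-01"), as.Date("2002-12-31"), by = "day")
)
flow$Value    <- round(runif(nrow(flow), min = 1, max = 30), 2)
flow$MonthNum <- as.integer(format(flow$Date, "%m"))

# Months to combine and the label for the resulting row
custom_months       <- 7:9
custom_months_label <- "Summer"

# Pool all daily values from the selected months across all years
pooled <- flow$Value[flow$MonthNum %in% custom_months]

custom_row <- data.frame(
  Month   = custom_months_label,
  Mean    = mean(pooled),
  Median  = stats::median(pooled),
  Maximum = max(pooled),
  Minimum = min(pooled),
  P10     = unname(stats::quantile(pooled, 0.10)),
  P90     = unname(stats::quantile(pooled, 0.90))
)

print(custom_row)
```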
# Run if HYDAT database has been downloaded (using tidyhydat::download_hydat())
if (file.exists(tidyhydat::hy_downloaded_db())) {

  # Calculate long-term statistics using data argument with defaults
  flow_data <- tidyhydat::hy_daily_flows(station_number = "08NM116")
  calc_longterm_daily_stats(data = flow_data,
                            start_year = 1980)

  # Calculate long-term statistics using station_number argument with defaults
  calc_longterm_daily_stats(station_number = "08NM116",
                            start_year = 1980)

  # Calculate long-term statistics regardless if there is missing data for a given year
  calc_longterm_daily_stats(station_number = "08NM116",
                            ignore_missing = TRUE)

  # Calculate long-term statistics for water years starting in October
  calc_longterm_daily_stats(station_number = "08NM116",
                            start_year = 1980,
                            water_year_start = 10)

  # Calculate long-term statistics with custom years
  calc_longterm_daily_stats(station_number = "08NM116",
                            start_year = 1981,
                            end_year = 2010,
                            exclude_years = c(1991,1993:1995))

  # Calculate long-term statistics for 7-day flows for July-September months only,
  # with 25 and 75th percentiles
  calc_longterm_daily_stats(station_number = "08NM116",
                            roll_days = 7,
                            months = 7:9,
                            percentiles = c(25,75),
                            ignore_missing = TRUE,
                            include_longterm = FALSE) # removes the Long-term numbers

  # Calculate long-term statistics and add custom stats for July-September
  calc_longterm_daily_stats(station_number = "08NM116",
                            start_year = 1980,
                            custom_months = 7:9,
                            custom_months_label = "Summer")

}
#> Warning: One or more calculations included missing values and NA's were produced.
#> Filter data for complete years or months, or use to ignore_missing = TRUE to ignore missing values.
#> # A tibble: 14 x 8
#>    STATION_NUMBER Month      Mean Median Maximum Minimum   P10   P90
#>    <chr>          <fct>     <dbl>  <dbl>   <dbl>   <dbl> <dbl> <dbl>
#>  1 08NM116        Jan        1.16  0.940    9.5    0.160 0.570  1.78
#>  2 08NM116        Feb        1.19  0.971    5.81   0.140 0.561  2.00
#>  3 08NM116        Mar        1.89  1.36    17.5    0.380 0.704  3.71
#>  4 08NM116        Apr        8.65  6.51    53.5    0.505 1.45  18.5
#>  5 08NM116        May       24.6  22.4     80.8    2.55  9.73  42.7
#>  6 08NM116        Jun       22.0  19.7     86.2    0.450 6.10  39.7
#>  7 08NM116        Jul        6.28  3.90    76.8    0.332 1.12  14.0
#>  8 08NM116        Aug        2.03  1.54    13.3    0.427 0.836  3.84
#>  9 08NM116        Sep        2.10  1.58    14.6    0.364 0.770  4.11
#> 10 08NM116        Oct        2.06  1.64    15.2    0.267 0.841  3.82
#> 11 08NM116        Nov        2.01  1.62    11.7    0.260 0.590  3.99
#> 12 08NM116        Dec        1.29  1.06     7.30   0.244 0.528  2.27
#> 13 08NM116        Long-term  6.28  1.83    86.2    0.140 0.705 20
#> 14 08NM116        Summer     3.48  1.90    76.8    0.332 0.863  7.20