Calculates means, medians, maximums, minimums, and percentiles for each month of all years of flow values from a daily streamflow data set. Calculates statistics from all values, unless otherwise specified. Returns a tibble with statistics.
calc_monthly_stats(
  data,
  dates = Date,
  values = Value,
  groups = STATION_NUMBER,
  station_number,
  percentiles = c(10, 90),
  roll_days = 1,
  roll_align = "right",
  water_year_start = 1,
  start_year,
  end_year,
  exclude_years,
  months = 1:12,
  transpose = FALSE,
  spread = FALSE,
  ignore_missing = FALSE
)
Argument | Description |
---|---|
data | Data frame of daily data that contains columns of dates, flow values, and (optional) groups (e.g. station numbers). Leave blank if using the station_number argument. |
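dates | Name of column in data that contains dates. Default Date. |
values | Name of column in data that contains flow values. Default Value. |
groups | Name of column in data that contains group identifiers (e.g. station numbers). Default STATION_NUMBER. |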
station_number | Character string vector of seven digit Water Survey of Canada station numbers (e.g. "08NM116"). |
percentiles | Numeric vector of percentiles to calculate. Set to NA if none required. Default c(10, 90). |
roll_days | Numeric value of the number of days to apply a rolling mean. Default 1. |
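roll_align | Character string identifying the direction of the rolling mean from the specified date, either by the first ("left"), last ("right"), or middle ("center") day of the rolling n-day group of observations. Default "right". |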
water_year_start | Numeric value indicating the month (1 through 12) of the start of the water year for analysis. Default 1. |
start_year | Numeric value of the first year to consider for analysis. Leave blank to use the first year of the source data. |
end_year | Numeric value of the last year to consider for analysis. Leave blank to use the last year of the source data. |
exclude_years | Numeric vector of years to exclude from analysis. Leave blank to include all years. |
months | Numeric vector of months to include in analysis (e.g. 6:8 for June through August). Default 1:12. |
transpose | Logical value indicating if each month statistic should be individual rows. Default FALSE. |
spread | Logical value indicating if each month statistic should be the column name. Default FALSE. |
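ignore_missing | Logical value indicating whether dates with missing values should be included in the calculation. If TRUE, statistics are calculated regardless of missing values. If FALSE, statistics for periods with missing values are returned as NA. Default FALSE. |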
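
The interaction between roll_days and roll_align can be illustrated with a small base R sketch. The helper rolling_mean() below is hypothetical and is not the package's implementation; it also assumes the conventional alignment labels "left", "right", and "center". It only shows where the n-day window sits relative to each date.

```r
# Hypothetical helper (not part of the package): n-day rolling mean.
# align = "right":  window ends on the given date
# align = "left":   window starts on the given date
# align = "center": window is centred on the given date
rolling_mean <- function(x, n = 1, align = c("right", "left", "center")) {
  align <- match.arg(align)
  if (n == 1) return(x)
  # sides = 1 averages the current value and the n - 1 values before it
  right <- as.numeric(stats::filter(x, rep(1 / n, n), sides = 1))
  # Shift the right-aligned result earlier to obtain the other alignments
  shift <- switch(align, right = 0, left = n - 1, center = (n - 1) %/% 2)
  if (shift == 0) return(right)
  c(right[-seq_len(shift)], rep(NA_real_, shift))
}

flows <- c(2, 4, 6, 8, 10, 12, 14)
rolling_mean(flows, n = 3, align = "right")  # first two values are NA
rolling_mean(flows, n = 3, align = "left")   # last two values are NA
```

With a right-aligned window, the first n - 1 days of a record have no rolling value, which is one way missing values can enter the monthly summaries.
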
A tibble data frame with the following columns:
Column | Description |
---|---|
Year | calendar or water year selected |
Month | month of the year |
Mean | mean of all daily flows for a given month and year |
Median | median of all daily flows for a given month and year |
Maximum | maximum of all daily flows for a given month and year |
Minimum | minimum of all daily flows for a given month and year |
Pn | each n-th percentile selected for a given month and year (e.g. P25) |
P10 | 10th percentile of all daily flows for a given month and year (default) |
P90 | 90th percentile of all daily flows for a given month and year (default) |
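
As a rough illustration of what these columns contain, the following base R sketch computes the same kinds of statistics per year and month from synthetic daily values. It is not the package's implementation: summarise_flows() is a hypothetical helper, the data are random, and the percentile interpolation (the stats::quantile() default) is an assumption.

```r
# Synthetic daily flows for two calendar years (illustrative values only)
set.seed(1)
daily <- data.frame(
  Date = seq(as.Date("2001-01-01"), as.Date("2002-12-31"), by = "day")
)
daily$Value <- round(runif(nrow(daily), min = 1, max = 20), 2)
daily$Year  <- as.integer(format(daily$Date, "%Y"))
daily$Month <- as.integer(format(daily$Date, "%m"))

# Hypothetical helper: summary statistics for one month of daily flows
summarise_flows <- function(v, percentiles = c(10, 90), ignore_missing = FALSE) {
  if (anyNA(v) && !ignore_missing) {
    # Missing days make every statistic NA unless they are ignored
    out <- rep(NA_real_, 4 + length(percentiles))
  } else {
    out <- c(mean(v, na.rm = TRUE), stats::median(v, na.rm = TRUE),
             max(v, na.rm = TRUE), min(v, na.rm = TRUE),
             stats::quantile(v, probs = percentiles / 100,
                             na.rm = TRUE, names = FALSE))
  }
  stats::setNames(out, c("Mean", "Median", "Maximum", "Minimum",
                         paste0("P", percentiles)))
}

# Apply the helper to each year-month group of daily values
groups <- split(daily$Value, list(daily$Year, daily$Month), drop = TRUE)
stats  <- t(vapply(groups, summarise_flows, numeric(6)))
keys   <- do.call(rbind, strsplit(rownames(stats), ".", fixed = TRUE))

result <- data.frame(Year = as.integer(keys[, 1]),
                     Month = month.abb[as.integer(keys[, 2])],
                     stats, row.names = NULL)
result <- result[order(result$Year, match(result$Month, month.abb)), ]
head(result)
```

Each row of result corresponds to one month of one year, mirroring the layout described in the table above.
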
# Run if HYDAT database has been downloaded (using tidyhydat::download_hydat())
if (file.exists(tidyhydat::hy_downloaded_db())) {

# Calculate statistics using a data frame and data argument with defaults
flow_data <- tidyhydat::hy_daily_flows(station_number = "08NM116")
calc_monthly_stats(data = flow_data,
                   start_year = 1980)

# Calculate statistics using station_number argument with defaults
calc_monthly_stats(station_number = "08NM116",
                   start_year = 1980)

# Calculate statistics regardless if there is missing data for a given year
calc_monthly_stats(station_number = "08NM116",
                   ignore_missing = TRUE)

# Calculate statistics for water years starting in October
calc_monthly_stats(station_number = "08NM116",
                   start_year = 1980,
                   water_year_start = 10)

# Calculate statistics with custom years
calc_monthly_stats(station_number = "08NM116",
                   start_year = 1981,
                   end_year = 2010,
                   exclude_years = c(1991, 1993:1995))

# Calculate statistics for 7-day flows, with 25 and 75th percentiles
calc_monthly_stats(station_number = "08NM116",
                   roll_days = 7,
                   percentiles = c(25, 75),
                   ignore_missing = TRUE)

}
#> Warning: One or more calculations included missing values and NA's were produced. Some months in some years have no data to summarize.
#> Warning: One or more calculations included missing values and NA's were produced. Filter data for complete years or months, or use to ignore_missing = TRUE to ignore missing values.
#> Warning: One or more calculations included missing values and NA's were produced. Some months in some years have no data to summarize.
#> # A tibble: 828 x 9
#>    STATION_NUMBER  Year Month  Mean Median Maximum Minimum   P25   P75
#>    <chr>          <dbl> <fct> <dbl>  <dbl>   <dbl>   <dbl> <dbl> <dbl>
#>  1 08NM116         1949 Jan   NA     NA      NA     NA     NA    NA
#>  2 08NM116         1949 Feb   NA     NA      NA     NA     NA    NA
#>  3 08NM116         1949 Mar   NA     NA      NA     NA     NA    NA
#>  4 08NM116         1949 Apr    7.27   6.64   13.6    1.93   3.51 11.2
#>  5 08NM116         1949 May   26.2   27.8    41.1   13.1   15.1  34.2
#>  6 08NM116         1949 Jun    9.63   7.45   20.8    3.59   4.08 14.9
#>  7 08NM116         1949 Jul    1.93   1.56    4.29   0.950  1.24  2.00
#>  8 08NM116         1949 Aug    1.25   1.23    1.68   0.882  1.09  1.35
#>  9 08NM116         1949 Sep    1.37   1.39    1.97   0.838  1.01  1.64
#> 10 08NM116         1949 Oct   NA     NA      NA     NA     NA    NA
#> # ... with 818 more rows
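
The warnings in the example output suggest filtering for complete months. One possible way to do this after the fact, sketched in base R, is to drop result rows whose statistics are NA. The object name monthly is hypothetical, and its rows are a hand-copied subset of the printed output rather than a real function result.

```r
# A few rows copied by hand from the printed example output
monthly <- data.frame(
  STATION_NUMBER = "08NM116",
  Year   = 1949,
  Month  = factor(c("Mar", "Apr", "May", "Oct"), levels = month.abb),
  Mean   = c(NA, 7.27, 26.2, NA),
  Median = c(NA, 6.64, 27.8, NA)
)

# Keep only the months for which a mean could be calculated
complete <- monthly[!is.na(monthly$Mean), ]
complete

# Years that contain at least one month without a calculated mean
unique(monthly$Year[is.na(monthly$Mean)])
```

This only hides months that could not be summarized; it does not change how the remaining statistics were calculated.
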