
Functions for QA and validation of Camera Trap data

For full documentation, see the package website, in particular the function reference section.

This document will walk through the main points of installing and using the core functionality of the package, including the bundled Shiny app.

  1. Installation
  2. Example Usage
    1. Project and station metadata
    2. Project Metadata: Field Form CSV files
    3. Image data
    4. Built-in plotting functions
    5. Sampling sessions
    6. Analysis data
    7. Write Data to SPI template
  3. Shiny

Installation

You can install the development version of bccamtrap from GitHub using the devtools package.

If you don’t have devtools installed, install it first:

install.packages("devtools")

Then you can install the bccamtrap package:

devtools::install_github("bcgov/bccamtrap")

Example Usage

This package is being developed for camera trap studies in the West Coast Region, BC Ministry of Water, Land, and Resource Stewardship (WLRS).

The functions in this package currently assume your project and session-level data are stored in a BC Government Wildlife Data Submission Template for Camera Trap Data, augmented with additional fields.

The image data is expected to be in multiple csv files, one folder per project, generated by reviewing the images in TimeLapse software using the v20230518 template.

Note that example data has been obfuscated to protect the location of the projects.

To begin, set the paths to the project metadata file, and the folder containing the TimeLapse image files:

library(bccamtrap)

metadata_path <- "~/data/project-files/project_1_RISC_WCR_Database_Template_v20230518.xlsm"
data_path <- "~/data/wc-wlrs-cam-data/camera-data/project_1/"

Project and station metadata

Read in project metadata from the SPI worksheet. There are functions to read the relevant tabs:

Project Information

proj <- read_project_info(metadata_path)
proj
#> # A tibble: 1 × 4
#>   spi_project_id project_name                  survey_name survey_intensity_code
#> * <chr>          <chr>                         <chr>       <chr>                
#> 1 <NA>           2022 - ongoing - Roosevelt E… 2022-2023 … <NA>

Sample station information

Read the sample station information. This creates a spatial data frame of class "sf", from the sf package. This format lets us work with it as a regular data frame while also supporting spatial operations.

sample_stations <- read_sample_station_info(metadata_path)
sample_stations
#> Simple feature collection with 21 features and 29 fields
#> Geometry type: POINT
#> Dimension:     XY
#> Bounding box:  xmin: -125.8534 ymin: 53.6787 xmax: -125.0136 ymax: 53.73782
#> Geodetic CRS:  WGS 84
#> # A tibble: 21 × 24
#>    wlrs_project_name      study_area_name study_area_photos sample_station_label
#>    <chr>                  <chr>           <chr>             <chr>               
#>  1 2022 - ongoing - Roos… Test Project    Y                 19_1                
#>  2 2022 - ongoing - Roos… Test Project    Y                 19_2                
#>  3 2022 - ongoing - Roos… Test Project    Y                 20                  
#>  4 2022 - ongoing - Roos… Test Project    Y                 21_1                
#>  5 2022 - ongoing - Roos… Test Project    Y                 21_2                
#>  6 2022 - ongoing - Roos… Test Project    Y                 24                  
#>  7 2022 - ongoing - Roos… Test Project    Y                 25                  
#>  8 2022 - ongoing - Roos… Test Project    Y                 26                  
#>  9 2022 - ongoing - Roos… Test Project    Y                 27                  
#> 10 2022 - ongoing - Roos… Test Project    Y                 28                  
#> # ℹ 11 more rows
#> # ℹ 20 more variables: station_status <chr>, number_of_cameras <dbl>,
#> #   set_date <dttm>, general_location <chr>, elevation_m <dbl>,
#> #   slope_percent <dbl>, aspect_degrees <dbl>, crown_closure_percent <dbl>,
#> #   camera_bearing_degrees <dbl>, camera_height_cm <dbl>,
#> #   distance_to_feature_m <dbl>, visible_range_m <dbl>, habitat_feature <chr>,
#> #   lock <chr>, code <chr>, sample_station_comments <chr>, …
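
For example, we could use the sf package directly (a quick sketch, not a bccamtrap function) to reproject the stations to BC Albers and compute pairwise distances between them:

library(sf)

# Reproject to BC Albers (EPSG:3005) and compute pairwise distances in metres
stations_albers <- st_transform(sample_stations, 3005)
station_distances <- st_distance(stations_albers)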

Use the qa_stations_spatial() function to run some basic spatial validation on the data - namely checking for spatial outliers:

sample_stations <- qa_stations_spatial(sample_stations)
#> ! Station 19_1 appears to be very far away from other stations. Please check
#> its coordinates.

Use the summary() method for Sample Station Info for basic descriptive stats:

summary(sample_stations)
#> ┌ Test Project ───────┐
#> │                     │
#> │   Sample Stations   │
#> │                     │
#> └─────────────────────┘
#> ℹ 18 sample stations in 21 locations.
#> ℹ Summary of station distances (m):
#>     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
#>     5.08  4247.81  8712.00 11935.96 14562.39 55516.29
#> ✖ Detected 1 potential spatial outlier.
#> ℹ Station status summary:
#> Camera Active  Camera Moved 
#>            18             3
#> ℹ Set dates: Between 2022-11-07 and 2023-11-16
#> ! Run `map_stations(object)` to view stations on a map.

Use the map_stations() function to create an interactive map of the stations. This will show any potential outlying stations, indicating possible data errors:

map_stations(sample_stations)

Camera Information

Read camera information using read_camera_info():

camera_info <- read_camera_info(metadata_path)
camera_info
#> Simple feature collection with 21 features and 14 fields
#> Geometry type: POINT
#> Dimension:     XY
#> Bounding box:  xmin: -125.5219 ymin: 49.28703 xmax: -125.2213 ymax: 49.34728
#> Geodetic CRS:  WGS 84
#> # A tibble: 21 × 9
#>    wlrs_project_name         study_area_name parent_sample_statio…¹ camera_label
#>    <chr>                     <chr>           <chr>                  <chr>       
#>  1 2022 - ongoing - Rooseve… Test Project    19_1                   UBC 2       
#>  2 2022 - ongoing - Rooseve… Test Project    19_2                   UBC 1       
#>  3 2022 - ongoing - Rooseve… Test Project    20                     Eco1059     
#>  4 2022 - ongoing - Rooseve… Test Project    21_1                   Eco1057     
#>  5 2022 - ongoing - Rooseve… Test Project    21_2                   Eco7998     
#>  6 2022 - ongoing - Rooseve… Test Project    24                     Eco1050     
#>  7 2022 - ongoing - Rooseve… Test Project    25                     Eco1040     
#>  8 2022 - ongoing - Rooseve… Test Project    26                     Eco7939     
#>  9 2022 - ongoing - Rooseve… Test Project    27                     Eco6690     
#> 10 2022 - ongoing - Rooseve… Test Project    28                     UBC 4       
#> # ℹ 11 more rows
#> # ℹ abbreviated name: ¹​parent_sample_station_label
#> # ℹ 5 more variables: make_of_camera_code <chr>, model_of_camera <chr>,
#> #   camera_comments <chr>, site_description_comments <chr>,
#> #   site_description_date <dttm>

Camera Setup and Checks

camera_setup_checks <- read_cam_setup_checks(metadata_path)
camera_setup_checks
#> ! Data is missing expected column: "data_qc_complete"
#> # A tibble: 43 × 31
#>    wlrs_project_name       study_area_name sample_station_label deployment_label
#>    <chr>                   <chr>           <chr>                <chr>           
#>  1 2022 - ongoing - Roose… Test Project    19_1                 <NA>            
#>  2 2022 - ongoing - Roose… Test Project    19_2                 <NA>            
#>  3 2022 - ongoing - Roose… Test Project    20                   <NA>            
#>  4 2022 - ongoing - Roose… Test Project    21_1                 <NA>            
#>  5 2022 - ongoing - Roose… Test Project    21_2                 <NA>            
#>  6 2022 - ongoing - Roose… Test Project    24                   <NA>            
#>  7 2022 - ongoing - Roose… Test Project    25                   <NA>            
#>  8 2022 - ongoing - Roose… Test Project    26                   <NA>            
#>  9 2022 - ongoing - Roose… Test Project    27                   <NA>            
#> 10 2022 - ongoing - Roose… Test Project    28                   <NA>            
#> # ℹ 33 more rows
#> # ℹ 27 more variables: camera_label <chr>, surveyors <chr>,
#> #   date_time_checked <dttm>, sampling_start <dttm>, sampling_end <dttm>,
#> #   total_visit_or_deployment_time <dbl>, unit_of_total_time_code <chr>,
#> #   visit_type <chr>, camera_status_on_arrival <chr>, battery_level <chr>,
#> #   batteries_changed <chr>, number_of_photos <dbl>, quiet_period_s <chr>,
#> #   trigger_sensitivity <chr>, trigger_timing_s <dbl>, …

Deployments

Rather than working with the raw camera setup and checks or the stations on their own, it is more useful to assemble sampling deployments by combining the sample station information with the camera setup and checks. Do this with the make_deployments() function.

deployments <- make_deployments(metadata_path)
deployments
#> ! Data is missing expected column: "data_qc_complete"
#> Simple feature collection with 28 features and 62 fields
#> Geometry type: POINT
#> Dimension:     XY
#> Bounding box:  xmin: -130.3966 ymin: 52.92087 xmax: -129.569 ymax: 52.96732
#> Geodetic CRS:  WGS 84
#> # A tibble: 28 × 57
#>    wlrs_project_name       study_area_name sample_station_label deployment_label
#>    <chr>                   <chr>           <chr>                <chr>           
#>  1 2022 - ongoing - Roose… Test Project    19_1                 19_1_20230605   
#>  2 2022 - ongoing - Roose… Test Project    19_2                 19_2_20230605   
#>  3 2022 - ongoing - Roose… Test Project    20                   20_20230605     
#>  4 2022 - ongoing - Roose… Test Project    21_1                 21_20230605     
#>  5 2022 - ongoing - Roose… Test Project    21_2                 21_2_20230605   
#>  6 2022 - ongoing - Roose… Test Project    24                   24_20230708     
#>  7 2022 - ongoing - Roose… Test Project    25                   25_20230710     
#>  8 2022 - ongoing - Roose… Test Project    26                   26_20230710     
#>  9 2022 - ongoing - Roose… Test Project    27                   27_20230605     
#> 10 2022 - ongoing - Roose… Test Project    28                   28_20230605     
#> # ℹ 18 more rows
#> # ℹ 53 more variables: camera_label <chr>, surveyors <chr>,
#> #   date_time_checked <dttm>, sampling_end <dttm>,
#> #   total_visit_or_deployment_time <dbl>, unit_of_total_time_code <chr>,
#> #   visit_type <chr>, camera_status_on_arrival <chr>, battery_level <chr>,
#> #   batteries_changed <chr>, number_of_photos <dbl>, quiet_period_s <chr>,
#> #   trigger_sensitivity <chr>, trigger_timing_s <dbl>, …

There is a handy summary() method for this as well:

summary(deployments)
#> ┌ Test Project ───┐
#> │                 │
#> │   Deployments   │
#> │                 │
#> └─────────────────┘
#> ℹ 15 sample stations in 28 deployments.
#> ℹ Deployment lengths range between 113 and 245 days.
#> ℹ Camera status on arrival summary:
#> Active 
#>     28
#> ℹ There are 19201 images. Photos per deployment range between 189 and 3554.

We can use the mapview package to quickly visualize this, setting the zcol argument to the name of the column you’d like to colour the points by. Clicking on a point will give you the details of that deployment.

library(mapview)
mapview(deployments, zcol = "sample_station_label")

Project Metadata: Field Form CSV files

There are also two functions for reading in the csv outputs from the field forms: one for sample stations and one for deployments:

sample_station_info <- read_sample_station_csv("path-to-sample-stations.csv")
deployments <- read_deployments_csv("path-to-deployments.csv")

Image data

We can read in an entire directory of image data from multiple csv files, as long as they all follow the same TimeLapse template (currently the v20230518 template).

image_data <- read_image_data(data_path)
image_data
#> # A tibble: 11,833 × 43
#>    root_folder study_area_name sample_station_label deployment_label
#>    <chr>       <chr>           <chr>                <chr>           
#>  1 100RECNX    Test Project    19_1                 19_1_20230605   
#>  2 100RECNX    Test Project    19_1                 19_1_20230605   
#>  3 100RECNX    Test Project    19_1                 19_1_20230605   
#>  4 100RECNX    Test Project    19_1                 19_1_20230605   
#>  5 100RECNX    Test Project    19_1                 19_1_20230605   
#>  6 100RECNX    Test Project    19_1                 19_1_20230605   
#>  7 100RECNX    Test Project    19_1                 19_1_20230605   
#>  8 100RECNX    Test Project    19_1                 19_1_20230605   
#>  9 100RECNX    Test Project    19_1                 19_1_20230605   
#> 10 100RECNX    Test Project    19_1                 19_1_20230605   
#> # ℹ 11,823 more rows
#> # ℹ 39 more variables: date_time <dttm>, episode <chr>, species <chr>,
#> #   total_count_episode <dbl>, obj_count_image <int>, adult_male <int>,
#> #   adult_female <int>, adult_unclassified_sex <int>, yearling_male <int>,
#> #   yearling_female <int>, yearling_unclassified_sex <int>,
#> #   young_of_year_unclassified_sex <int>, juvenile_unclassified_sex <int>,
#> #   male_unclassified_age <int>, female_unclassified_age <int>, …

Again, we can use the summary() method to get an overview of the image data.

summary(image_data)
#> ┌ Test Project ─────┐
#> │                   │
#> │   Image summary   │
#> │                   │
#> └───────────────────┘
#> ℹ 11833 images in 17 deployments at 15 sample stations.
#> ℹ Image counts by trigger mode:
#> Motion Detection       Time Lapse 
#>             8657             3176
#> ℹ 117 images with lens obscured.
#> ℹ 4 images starred.
#> ! 4 images flagged for review.
#> ℹ Dates are between 2022-11-07 and 2023-07-10.
#> ℹ Temperatures are between -10 and 37 C.
#> ℹ Snow depths are between 0 and 135 cm.
#> ℹ Species counts:
#> Avian (comments)       Black Bear           Cougar        Mule Deer 
#>                6              130               12              155 
#> Other (comments)   Pacific Marten     Red Squirrel    Roosevelt Elk 
#>                1                2                1               88 
#>             <NA> 
#>            11438
#> ! Run `qa_deployment_images()` to crosscheck images with deployments.
#> ! Run `qa_image_data()` to run various QA checks.

Use the qa_deployment_images() function to find deployment labels that are in the deployment data but not in the image data, and vice versa. Deployment labels in the deployment data that are missing from the image data are common when not all of the images have been processed yet. Deployment labels that are present in the image data but not in the deployment data indicate a potential problem.

qa_deployment_images(deployments, image_data)
#> ! The following deployment labels are present in `image_data` but not `deployments`: "21_1_20230605", "2022-11-10", and "2023-01-10"
#> ! The following deployment labels are present in `deployments` but not `image_data`: "21_20230605", "19_1_20231107", "19_2_20231107", "20_20231107", "21_20231107", "21_2_20231108", "25_20231031", "26_20231031", "27_20231031", "28_20231031", "29_1_20231031", "29_2_20231031", "29_3_20231107", and "31_20231107"

Use merge_deployments_images() to join the deployment metadata to the images:

images_with_metadata <- merge_deployments_images(deployments, image_data)
#> ! The following deployment labels are present in `image_data` but not `deployments`: "21_1_20230605", "2022-11-10", and "2023-01-10"
#> ! The following deployment labels are present in `deployments` but not `image_data`: "21_20230605", "19_1_20231107", "19_2_20231107", "20_20231107", "21_20231107", "21_2_20231108", "25_20231031", "26_20231031", "27_20231031", "28_20231031", "29_1_20231031", "29_2_20231031", "29_3_20231107", and "31_20231107"
images_with_metadata
#> Simple feature collection with 11833 features and 102 fields (with 1208 geometries empty)
#> Geometry type: POINT
#> Dimension:     XY
#> Bounding box:  xmin: -130.3966 ymin: 52.92237 xmax: -129.569 ymax: 52.96732
#> Geodetic CRS:  WGS 84
#> # A tibble: 11,833 × 97
#>    root_folder deployment_label date_time           episode species
#>    <chr>       <chr>            <dttm>              <chr>   <chr>  
#>  1 100RECNX    19_1_20230605    2022-11-10 15:15:53 1:1|5   <NA>   
#>  2 100RECNX    19_1_20230605    2022-11-10 15:15:58 1:2|5   <NA>   
#>  3 100RECNX    19_1_20230605    2022-11-10 15:16:25 1:3|5   <NA>   
#>  4 100RECNX    19_1_20230605    2022-11-10 15:16:27 1:4|5   <NA>   
#>  5 100RECNX    19_1_20230605    2022-11-10 15:16:31 1:5|5   <NA>   
#>  6 100RECNX    19_1_20230605    2022-11-11 12:00:00 2:1|1   <NA>   
#>  7 100RECNX    19_1_20230605    2022-11-12 12:00:00 3:1|1   <NA>   
#>  8 100RECNX    19_1_20230605    2022-11-13 12:00:00 4:1|1   <NA>   
#>  9 100RECNX    19_1_20230605    2022-11-14 12:00:00 5:1|1   <NA>   
#> 10 100RECNX    19_1_20230605    2022-11-15 12:00:00 6:1|1   <NA>   
#> # ℹ 11,823 more rows
#> # ℹ 92 more variables: total_count_episode <dbl>, obj_count_image <int>,
#> #   adult_male <int>, adult_female <int>, adult_unclassified_sex <int>,
#> #   yearling_male <int>, yearling_female <int>,
#> #   yearling_unclassified_sex <int>, young_of_year_unclassified_sex <int>,
#> #   juvenile_unclassified_sex <int>, male_unclassified_age <int>,
#> #   female_unclassified_age <int>, unclassified_life_stage_and_sex <int>, …

Image Data QA

There are a number of common data quality issues that we can check for in the image data itself, aside from those addressed above when reconciling deployments and images.

We can use the qa_image_data() function to check for the following problems:

  • Blanks in key fields: study area, station label, deployment date, surveyor, trigger mode, temperature, episode
  • Species detected with no count data
  • Count data with no species
  • Sum of the individual count fields not matching the total count for the episode
  • Multiple entries under the same episode number (indicating possible double entry)
  • Dates of timelapse images that are not continuous and in order
  • Snow data issues:
    • Blank snow values when the lens is not obscured
    • Snow depth outliers (e.g., 10, 10, 110, 10, 15, 20)

Run the qa_image_data() function:

image_data_qa <- qa_image_data(image_data)
dim(image_data_qa)
#> [1] 24 49

We can see that this has identified 24 records with potential problems. The returned data has a number of fields starting with QA_ which indicate which images we should look at more closely. All of the original fields are returned, plus any QA_ fields that have at least one TRUE value:

# Print the names of the columns, just to see what we're working with
names(image_data_qa)
#>  [1] "root_folder"                             
#>  [2] "study_area_name"                         
#>  [3] "sample_station_label"                    
#>  [4] "deployment_label"                        
#>  [5] "date_time"                               
#>  [6] "episode_num"                             
#>  [7] "episode"                                 
#>  [8] "species"                                 
#>  [9] "total_count_episode"                     
#> [10] "obj_count_image"                         
#> [11] "adult_male"                              
#> [12] "adult_female"                            
#> [13] "adult_unclassified_sex"                  
#> [14] "yearling_male"                           
#> [15] "yearling_female"                         
#> [16] "yearling_unclassified_sex"               
#> [17] "young_of_year_unclassified_sex"          
#> [18] "juvenile_unclassified_sex"               
#> [19] "male_unclassified_age"                   
#> [20] "female_unclassified_age"                 
#> [21] "unclassified_life_stage_and_sex"         
#> [22] "antler_class"                            
#> [23] "animal_identifiable"                     
#> [24] "animal_tagged"                           
#> [25] "behaviour_1"                             
#> [26] "behaviour_2"                             
#> [27] "behaviour_3"                             
#> [28] "human_use_type"                          
#> [29] "human_transport_mode"                    
#> [30] "temperature"                             
#> [31] "snow_depth"                              
#> [32] "snow_index"                              
#> [33] "snow_is_est"                             
#> [34] "snow_depth_lower"                        
#> [35] "snow_depth_upper"                        
#> [36] "lens_obscured"                           
#> [37] "starred"                                 
#> [38] "needs_review"                            
#> [39] "comment"                                 
#> [40] "surveyor"                                
#> [41] "trigger_mode"                            
#> [42] "file"                                    
#> [43] "relative_path"                           
#> [44] "delete_flag"                             
#> [45] "QA_species_UNMATCHED_total_count_episode"
#> [46] "QA_total_count_episode_UNMATCHED_species"
#> [47] "QA_sum_counts"                           
#> [48] "QA_dup_episode"                          
#> [49] "QA_snow_blank"

We can use functions from the dplyr package to select and view just the QA columns. dplyr is a dependency of bccamtrap, so it will already be installed on your system, though it does need to be loaded explicitly.

library(dplyr)

select(image_data_qa, root_folder, file, starts_with("QA_"))
#> # A tibble: 24 × 7
#>    root_folder file  QA_species_UNMATCHED…¹ QA_total_count_episo…² QA_sum_counts
#>    <chr>       <chr> <lgl>                  <lgl>                  <lgl>        
#>  1 27_20230605 RCNX… TRUE                   FALSE                  TRUE         
#>  2 100RECNX    RCNX… FALSE                  FALSE                  FALSE        
#>  3 100RECNX    RCNX… FALSE                  FALSE                  FALSE        
#>  4 21_2_20230… RCNX… FALSE                  FALSE                  TRUE         
#>  5 100RECNX    RCNX… FALSE                  FALSE                  FALSE        
#>  6 100RECNX    RCNX… FALSE                  FALSE                  FALSE        
#>  7 100RECNX    RCNX… FALSE                  FALSE                  FALSE        
#>  8 29_2_20230… IMG_… FALSE                  FALSE                  TRUE         
#>  9 100RECNX    IMG_… FALSE                  FALSE                  TRUE         
#> 10 102RECNX    RCNX… FALSE                  FALSE                  FALSE        
#> # ℹ 14 more rows
#> # ℹ abbreviated names: ¹​QA_species_UNMATCHED_total_count_episode,
#> #   ²​QA_total_count_episode_UNMATCHED_species
#> # ℹ 2 more variables: QA_dup_episode <lgl>, QA_snow_blank <lgl>
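
Since the QA_ columns are logical flags, plain dplyr can also tally how many images each check flagged (a quick sketch, not a bccamtrap function):

# Count the number of images flagged by each QA_ check
summarise(image_data_qa, across(starts_with("QA_"), ~ sum(.x, na.rm = TRUE)))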

Built-in plotting functions

There are several plotting functions available to help you visualize your data and spot any potential problems. By default, all plots render as static images, but can be created as interactive plots by setting interactive = TRUE. Interactive plots are not shown here as they don’t render in the README.

Deployment plot

We can plot deployments to check that the starts and ends of our deployments are as expected, and flag any “invalid” deployments (i.e., where we don’t know the end time because a camera was stolen, bumped, or ran out of batteries, etc.). You can make static or interactive plots:

plot_deployments(deployments, date_breaks = "2 months")

# plot_deployments(deployments, interactive = TRUE, date_breaks = "2 months")

Snow depth plot

We can plot the recorded snow depths across deployments using the plot_snow() function with our image data:

plot_snow(image_data, date_breaks = "2 months")
#> Warning: Removed 109 rows containing missing values or values outside the scale range
#> (`geom_interactive_point()`).

# plot_snow(image_data, date_breaks = "2 months", interactive = TRUE)

Detection plot

We can also plot image timestamps over the deployment durations to alert us to potential time mismatches between the session data and the image timestamps. Mismatches could indicate wrong time settings on cameras, errors in deployment labels (as the plot below indicates), or any number of data entry errors.

plot_deployment_detections(deployments, image_data, date_breaks = "2 months")

# plot_deployment_detections(deployments, image_data, interactive = TRUE, date_breaks = "2 months")

Daily detection patterns

We can plot the patterns of daily detections by species:

plot_diel_activity(image_data)

# plot_diel_activity(image_data, interactive = TRUE)

Sampling sessions

Define sampling sessions based on image data using the make_sample_sessions() function. This function will:

  • Set sampling_start as deployment_start
  • Note the dates of the first and last photos of each deployment
  • Count photos (total, and motion-detection)
  • Determine whether the sampling period is shorter than the deployment period
  • Determine gaps in the sampling period due to an obscured lens
  • Determine the total length of the sample period (last photo date - first photo date - number of days with lens obscured)
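
Running it with the defaults on our image data:

make_sample_sessions(image_data)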
#> # A tibble: 17 × 13
#>    deployment_label sample_start_date sample_end_date n_photos n_photos_spp_id
#>    <chr>            <date>            <date>             <int>           <int>
#>  1 19_1_20230605    2022-11-10        2023-06-05           285              11
#>  2 19_2_20230605    2022-11-15        2023-06-05           411              26
#>  3 20_20230605      2022-11-10        2023-06-05          1857             101
#>  4 21_1_20230605    2022-11-10        2023-06-05           959              78
#>  5 21_2_20230605    2022-11-18        2023-06-05           569              23
#>  6 24_20230708      2022-11-18        2023-07-08           521              18
#>  7 25_20230710      2022-11-18        2023-07-10          3554              24
#>  8 26_20230710      2022-11-07        2023-07-10           621              50
#>  9 27_20230605      2022-11-07        2023-06-05           259               7
#> 10 28_20230605      2022-11-15        2023-06-05           566              14
#> 11 29_1_20230605    2022-11-07        2023-06-05           390              10
#> 12 2022-11-10       2022-11-07        2022-11-10            52               5
#> 13 29_2_20230605    2022-11-10        2023-06-05           530              34
#> 14 29_3_20230605    2022-11-18        2023-06-05           419              15
#> 15 2023-01-10       2022-11-15        2023-01-10           197               8
#> 16 31_20230605      2023-01-25        2023-06-05           382              15
#> 17 35_20230708      2022-11-18        2023-07-08           261               3
#> # ℹ 8 more variables: n_species <int>, n_individuals <dbl>,
#> #   n_motion_photos <int>, n_motion_photos_lens_obscured <int>,
#> #   n_tl_photos <int>, n_tl_photos_lens_obscured <int>, sample_gaps <lgl>,
#> #   trap_days <int>

You can set custom start and end dates for your sample session as well:

make_sample_sessions(
  image_data, 
  sample_start_date = "2022-12-01", 
  sample_end_date = "2023-04-30"
)
#> # A tibble: 16 × 13
#>    deployment_label sample_start_date sample_end_date n_photos n_photos_spp_id
#>    <chr>            <date>            <date>             <int>           <int>
#>  1 19_1_20230605    2022-12-01        2023-04-30           191               6
#>  2 19_2_20230605    2022-12-01        2023-04-30           290              16
#>  3 20_20230605      2022-12-01        2023-04-30           662              31
#>  4 21_1_20230605    2022-12-01        2023-04-30           684              62
#>  5 21_2_20230605    2022-12-01        2023-04-30           450              15
#>  6 24_20230708      2022-12-01        2023-04-30           166               3
#>  7 25_20230710      2022-12-01        2023-04-30           165               0
#>  8 26_20230710      2022-12-01        2023-04-30           163               1
#>  9 27_20230605      2022-12-01        2023-04-30           152               1
#> 10 28_20230605      2022-12-01        2023-04-30           162               3
#> 11 29_1_20230605    2022-12-01        2023-04-30           241               5
#> 12 29_2_20230605    2022-12-01        2023-04-30           415              18
#> 13 29_3_20230605    2022-12-01        2023-04-30           226               5
#> 14 2023-01-10       2022-12-01        2023-01-10            95               4
#> 15 31_20230605      2023-01-25        2023-04-30           311               8
#> 16 35_20230708      2022-12-01        2023-04-30           152               0
#> # ℹ 8 more variables: n_species <int>, n_individuals <dbl>,
#> #   n_motion_photos <int>, n_motion_photos_lens_obscured <int>,
#> #   n_tl_photos <int>, n_tl_photos_lens_obscured <int>, sample_gaps <lgl>,
#> #   trap_days <int>

Analysis data

Relative Activity Index (RAI)

Calculate Relative Activity Index for sample sessions using sample_rai(). By default, it calculates RAI per species using the sample start and end dates in the data for each deployment:

sample_rai(image_data)
#> # A tibble: 53 × 8
#>    deployment_label sample_start_date sample_end_date trap_days species      
#>    <chr>            <date>            <date>              <int> <chr>        
#>  1 19_1_20230605    2022-11-10        2023-06-05            207 Black Bear   
#>  2 19_1_20230605    2022-11-10        2023-06-05            207 Cougar       
#>  3 19_1_20230605    2022-11-10        2023-06-05            207 Roosevelt Elk
#>  4 19_2_20230605    2022-11-15        2023-06-05            201 Black Bear   
#>  5 19_2_20230605    2022-11-15        2023-06-05            201 Mule Deer    
#>  6 19_2_20230605    2022-11-15        2023-06-05            201 Roosevelt Elk
#>  7 2022-11-10       2022-11-07        2022-11-10              3 Mule Deer    
#>  8 2022-11-10       2022-11-07        2022-11-10              3 Roosevelt Elk
#>  9 2023-01-10       2022-11-15        2023-01-10             56 Cougar       
#> 10 2023-01-10       2022-11-15        2023-01-10             56 Roosevelt Elk
#> # ℹ 43 more rows
#> # ℹ 3 more variables: n_detections <int>, total_count <dbl>, rai <dbl>

You can restrict it to a subset of species and/or deployment labels and, similar to make_sample_sessions(), set custom session start and end dates:

sample_rai(
  image_data, 
  species = "Roosevelt Elk", 
  deployment_label = c("19_2_20230605", "29_1_20230605"),
  sample_start_date = "2022-12-01", 
  sample_end_date = "2023-04-30"
)
#> # A tibble: 2 × 8
#>   deployment_label sample_start_date sample_end_date trap_days species      
#>   <chr>            <date>            <date>              <int> <chr>        
#> 1 19_2_20230605    2022-12-01        2023-04-30            151 Roosevelt Elk
#> 2 29_1_20230605    2022-12-01        2023-04-30            148 Roosevelt Elk
#> # ℹ 3 more variables: n_detections <int>, total_count <dbl>, rai <dbl>

You can also calculate RAI across all deployments by setting by_deployment = FALSE:

sample_rai(
  image_data, 
  species = "Roosevelt Elk", 
  by_deployment = FALSE,
  sample_start_date = "2022-12-01", 
  sample_end_date = "2023-04-30"
)
#> # A tibble: 1 × 7
#>   sample_start_date sample_end_date trap_days species   n_detections total_count
#>   <date>            <date>              <int> <chr>            <int>       <dbl>
#> 1 2022-12-01        2023-04-30            151 Roosevel…           51         146
#> # ℹ 1 more variable: rai <dbl>

We can compare total count and RAI across species:

spp_comp <- sample_rai(
  image_data, 
  by_deployment = FALSE,
  by_species = TRUE,
  sample_start_date = "2022-12-01", 
  sample_end_date = "2023-04-30"
)
spp_comp
#> # A tibble: 7 × 7
#>   sample_start_date sample_end_date trap_days species   n_detections total_count
#>   <date>            <date>              <int> <chr>            <int>       <dbl>
#> 1 2022-12-01        2023-04-30            131 Avian (c…            1           1
#> 2 2022-12-01        2023-04-30            151 Black Be…           28          28
#> 3 2022-12-01        2023-04-30            151 Cougar               8           8
#> 4 2022-12-01        2023-04-30            151 Mule Deer           77         105
#> 5 2022-12-01        2023-04-30            131 Other (c…            1           1
#> 6 2022-12-01        2023-04-30            151 Pacific …            2           2
#> 7 2022-12-01        2023-04-30            151 Roosevel…           51         146
#> # ℹ 1 more variable: rai <dbl>

Using the ggplot2 package, we can plot this:

library(ggplot2)

ggplot(spp_comp, aes(x = rai, y = species)) + 
  geom_point(colour = "darkgreen") + 
  geom_text(aes(label = total_count), nudge_x = 0.05, nudge_y = 0.1) +
  theme_classic() + 
  labs(title = "RAI of all species detected, across all deployments", 
       caption = "Numbers beside points represent total number of individuals detected",
       x = "Relative Activity Index", y = "Species")

We can group by deployment to compare across deployments:

spp_comp_by_dep <- sample_rai(
  image_data, 
  by_deployment = TRUE,
  by_species = TRUE,
  sample_start_date = "2022-12-01", 
  sample_end_date = "2023-04-30"
)

ggplot(spp_comp_by_dep, aes(x = rai, y = species, colour = deployment_label)) + 
  geom_point() + 
  geom_text(aes(label = total_count), nudge_x = 0.01, nudge_y = 0.1) +
  theme_classic() + 
  labs(title = "RAI of all species detected, across all deployments", 
       caption = "Numbers beside points represent total number of individuals detected",
       x = "Relative Activity Index", y = "Species")

Relative Activity Index (RAI) over time

Use rai_by_time() to calculate RAI over a time window, optionally calculating statistics using a moving window aggregation. You can calculate daily statistics, or aggregate by week, month, or year. By default, it calculates daily metrics, aggregating across deployments.

rai_by_time(image_data)
#> # A tibble: 2,205 × 9
#>    species          study_area_name date       max_snow_index mean_temperature
#>    <chr>            <chr>           <date>              <dbl>            <dbl>
#>  1 Avian (comments) Test Project    2022-11-07              2             6   
#>  2 Avian (comments) Test Project    2022-11-08              2            -1.75
#>  3 Avian (comments) Test Project    2022-11-09              2            -3.25
#>  4 Avian (comments) Test Project    2022-11-10              2             1.2 
#>  5 Avian (comments) Test Project    2022-11-11              2             1   
#>  6 Avian (comments) Test Project    2022-11-12              2             1.57
#>  7 Avian (comments) Test Project    2022-11-13              2             2.57
#>  8 Avian (comments) Test Project    2022-11-14              2             1.71
#>  9 Avian (comments) Test Project    2022-11-15              2             0.5 
#> 10 Avian (comments) Test Project    2022-11-16              2            -1   
#> # ℹ 2,195 more rows
#> # ℹ 4 more variables: n_detections <int>, total_count <dbl>, trap_days <int>,
#> #   rai <dbl>

We can select a single species and calculate daily rolling values. The default window size is 7 days, but it can be changed with the k parameter.

elk_roll_avg <- rai_by_time(
  image_data, 
  by = "date",
  species = "Roosevelt Elk",
  roll = TRUE
)
elk_roll_avg
#> # A tibble: 245 × 15
#>    species       study_area_name date       max_snow_index mean_temperature
#>    <chr>         <chr>           <date>              <dbl>            <dbl>
#>  1 Roosevelt Elk Test Project    2022-11-07              2             6   
#>  2 Roosevelt Elk Test Project    2022-11-08              2            -1.75
#>  3 Roosevelt Elk Test Project    2022-11-09              2            -3.25
#>  4 Roosevelt Elk Test Project    2022-11-10              2             1.2 
#>  5 Roosevelt Elk Test Project    2022-11-11              2             1   
#>  6 Roosevelt Elk Test Project    2022-11-12              2             1.57
#>  7 Roosevelt Elk Test Project    2022-11-13              2             2.57
#>  8 Roosevelt Elk Test Project    2022-11-14              2             1.71
#>  9 Roosevelt Elk Test Project    2022-11-15              2             0.5 
#> 10 Roosevelt Elk Test Project    2022-11-16              2            -1   
#> # ℹ 235 more rows
#> # ℹ 10 more variables: n_detections <int>, total_count <dbl>, trap_days <int>,
#> #   rai <dbl>, roll_mean_max_snow <dbl>, roll_mean_temp <dbl>,
#> #   roll_trap_days <int>, roll_detections <int>, roll_count <dbl>,
#> #   roll_rai <dbl>

ggplot(elk_roll_avg, aes(x = date, y = roll_rai)) + 
  geom_line(colour = "darkgreen") + 
  theme_classic() + 
  labs(
    title = "Rolling seven day average of Elk RAI", 
    x = "Date", 
    y = "7 day rolling average RAI"
  )
#> Warning: Removed 6 rows containing missing values or values outside the scale range
#> (`geom_line()`).
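
A longer window could be requested via the k argument mentioned above; for example, a 14-day window (a sketch, assuming k is passed through as described):

elk_roll_14 <- rai_by_time(
  image_data, 
  by = "date",
  species = "Roosevelt Elk",
  roll = TRUE,
  k = 14
)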

Since the data returned by rai_by_time() also includes snow and temperature data, we can plot these and then compare RAI to these environmental variables:

ggplot(elk_roll_avg, aes(x = date, y = roll_mean_max_snow)) + 
  geom_line(colour = "darkblue") + 
  theme_classic() + 
  labs(
    title = "Rolling seven day average of average maximum snow index across sites", 
    x = "Date", 
    y = "7 day rolling average of maximum snow index"
  )
#> Warning: Removed 6 rows containing missing values or values outside the scale range
#> (`geom_line()`).

We can change the way snow measurements are aggregated across sites when by_deployment = FALSE. By default it uses max, but we can set it to any aggregation function, like mean:

elk_roll_avg <- rai_by_time(
  image_data, 
  by = "date",
  species = "Roosevelt Elk",
  roll = TRUE,
  snow_agg = "mean"
)

ggplot(elk_roll_avg, aes(x = date, y = roll_mean_mean_snow)) + 
  geom_line(colour = "darkblue") + 
  theme_classic() + 
  labs(
    title = "Rolling seven day average of mean snow index across sites", 
    x = "Date", 
    y = "7 day rolling average of mean snow index"
  )
#> Warning: Removed 6 rows containing missing values or values outside the scale range
#> (`geom_line()`).

And we can compare Elk activity to snow levels:

ggplot(elk_roll_avg, aes(x = roll_mean_mean_snow, y = roll_rai, colour = mean_temperature)) + 
  geom_point() + 
  scale_colour_viridis_c(option = "inferno") + 
  theme_classic() + 
  labs(
    title = "Rolling seven day average of Elk RAI compared to Snow Index", 
    x = "7 day rolling average of mean Snow Index across sites", 
    y = "7 day rolling average RAI",
    colour = "Temperature"
  )
#> Warning: Removed 6 rows containing missing values or values outside the scale range
#> (`geom_point()`).

And temperature:

ggplot(elk_roll_avg, aes(x = roll_mean_temp, y = roll_rai)) + 
  geom_point() + 
  theme_classic() + 
  labs(
    title = "Rolling seven day average of Elk RAI compared to Temperature", 
    x = "7 day rolling average of mean temperature across sites", 
    y = "7 day rolling average RAI"
  )
#> Warning: Removed 6 rows containing missing values or values outside the scale range
#> (`geom_point()`).

We can compare raw counts vs snow depth across deployments. Note that for daily counts (by = "date") when by_deployment = TRUE, the “trap_days” in each row is equal to 1, so RAI is not very meaningful and we can just compare raw counts:

elk_rai_by_dep <- rai_by_time(
  image_data, 
  by = "date",
  species = "Roosevelt Elk",
  by_deployment = TRUE
)

ggplot(elk_rai_by_dep, aes(x = snow_index, y = total_count, colour = deployment_label)) + 
  facet_wrap(vars(deployment_label)) + 
  geom_point()

If we want to compare the RAI of two species, we can specify them in the species argument and colour the plot by species (if we left the species argument blank we would get a line per species, which looks very busy).

all_spp_roll_avg <- rai_by_time(
  image_data, 
  by = "date",
  species = c("Roosevelt Elk", "Cougar"),
  by_species = TRUE,
  roll = TRUE
  )

ggplot(all_spp_roll_avg, aes(x = date, y = roll_rai, colour = species)) + 
  geom_line() + 
  theme_classic() + 
  labs(
    title = "Rolling seven day average of RAI for Cougar and Elk", 
    x = "Date", 
    y = "7 day rolling average RAI"
  )
#> Warning: Removed 12 rows containing missing values or values outside the scale range
#> (`geom_line()`).

Here we use it to compare the total monthly activity by all species among all deployments:

total_rai_by_month <- rai_by_time(
  image_data, 
  by = "month",
  by_species = FALSE,
  by_deployment = TRUE
  )

ggplot(total_rai_by_month, aes(x = month, y = rai, fill = deployment_label)) + 
  geom_col(position = "dodge") + 
  theme_classic() + 
  labs(
    title = "Monthly RAI of all species", 
    x = "Month", 
    y = "RAI"
  )

Write Data to SPI template

bccamtrap also has functionality to write out data to a SPI template for submission.

Use fill_spi_template() to write all of the data to a SPI template, filling in just the default required fields. This will fill in all of the tabs except for the Project Info sheet, which you must fill in manually.

fill_spi_template(
  sample_stations,
  camera_info, 
  camera_setup_checks,
  image_data,
  file = "~/Desktop/SPI_output.xlsx"
)

If you want more control, such as adding data to other fields in the SPI template, use write_to_spi_sheet().

If you want to write to an existing file, specify the same file name in both the file and the template parameters. To write columns other than the default columns, specify paired column names in the form `Destination Column` = data_column. If the left-hand side is a syntactically valid name it can be provided as-is, but if it contains spaces it must be wrapped in backticks or quotes.

write_to_spi_sheet(
  sample_stations,
  file = "~/Desktop/SPI_output.xlsx",
  `Number of Cameras` = number_of_cameras,
  template = "~/Desktop/SPI_output.xlsx"
)

Writing to SPI template using field form data

To write data imported from field forms, use the fill_spi_template_ff() function, passing in both the sample_station_info and deployments, as well as the image_data.

If you want to only write to the metadata tabs and not the Sequence Image Data, you can leave the image_data argument as NULL, and write to the file another time with write_to_spi_sheet().

sample_station_info <- read_sample_station_csv("path-to-sample-stations.csv")
deployments <- read_deployments_csv("path-to-deployments.csv")

fill_spi_template_ff(
  sample_station_info,
  deployments,
  image_data,
  file = "~/Desktop/SPI_output_from_ff.xlsx"
)
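
For a metadata-only export, the same call could be made with image_data left as NULL (a sketch based on the description above), and the Sequence Image Data written later with write_to_spi_sheet():

fill_spi_template_ff(
  sample_station_info,
  deployments,
  image_data = NULL,
  file = "~/Desktop/SPI_output_from_ff.xlsx"
)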

bccamtrapp() Shiny App

The package contains a Shiny App for interactive use of most of the package’s functionality.

Run the app with:
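
bccamtrapp()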

Data is loaded and exported via inputs on the left-hand side. You can use metadata from a SPI worksheet, or from a combination of csv-based field forms (sample stations and deployments).

Image data is loaded by selecting all of the image files in the file dialogue, or by drag-and-drop.

The various tabs are useful for data summaries, QA, and generation of analysis data:

  • “Project Metadata” and “Deployments” rely only on having input the metadata files.

  • “QA Deployments vs Images” requires both metadata and image data files, and makes sure that they are compatible.

  • “Image Data QA”, “Sample Sessions”, and “Analysis Data” all require the image data to be loaded, but don’t require the metadata.

To export to a SPI template for submission, you need to have loaded metadata and image data. This will write only the required fields to the current SPI template included in the package.

If you experience the error “Maximum upload size exceeded”, you can change the max_upload_size_mb parameter of bccamtrapp() to a value greater than the default of 50:

bccamtrapp(max_upload_size_mb = 100)

The default of 50 MB should be sufficient for most use cases. If you find you need more than that, the performance of the app may degrade and you should consider using bccamtrap functions directly in R.

Project Status

Getting Help or Reporting an Issue

To report bugs/issues/feature requests, please file an issue.

How to Contribute

If you would like to contribute, please see our CONTRIBUTING guidelines.

Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.

License

Copyright 2024 Province of British Columbia

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and limitations under the License.