Checks a tree measurements dataset for missing data and formatting that could lead to errors when applying other functions in this package. Outputs a table of the trees with data issues and a table summarizing the number of trees in the mapping dataset that have data issues.
tree_check(tree_data, map_data)
tree_data | Data frame containing tree measurement data where each row represents a single measurement of a single tree. Should contain the columns |
---|---|
map_data | Data frame containing tree mapping data. Should contain the columns |
A list containing two elements:
problem_trees is a data frame containing the tree ids found to have data issues and a description of the issue
issue_summary is a data frame that shows the number and percentage of trees with at least one issue and with each of the specific issues
The data issues checked for are: presence of required columns, single tree
ids referring to multiple trees, trees having no associated mapping data, and
missing dbh, stand id, species, or measurement year. The provided tree_data
is also checked for the presence of a mort column containing mortality data;
if this column is found, a check for missing mortality data is also performed.
This function does not check for misspelled stand ids or species, which should
be checked independently. The common issue of negative growth rates resulting
from measurement error is not checked here, but is checked by growth_summary.
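To make the kinds of checks described above more concrete, here is a minimal base R sketch of two of them: missing values in key columns and trees with no mapping data. It is not the implementation used by tree_check. The toy data, the column names (tree_id, stand_id, species, year, dbh), and the object names problem_sketch, n_problem, and pct_problem are all assumptions made for illustration. The check for single tree ids referring to multiple trees is not covered.

```r
# Toy measurement and mapping data. All column names here are assumptions
# made for this sketch, not necessarily those required by tree_check.
tree_data <- data.frame(
  tree_id  = c("A1", "A2", "A3", "A4"),
  stand_id = c("S1", "S1", NA, "S2"),
  species  = c("PSME", NA, "TSHE", "TSHE"),
  year     = c(2010, 2010, 2015, NA),
  dbh      = c(12.5, NA, 30.1, 8.2),
  stringsAsFactors = FALSE
)
map_data <- data.frame(tree_id = c("A1", "A2", "A3"), stringsAsFactors = FALSE)

# One data frame per checked column, listing the tree ids with a missing value
checked_cols <- c("dbh", "stand_id", "species", "year")
issue_list <- lapply(checked_cols, function(col) {
  ids <- tree_data$tree_id[is.na(tree_data[[col]])]
  data.frame(
    tree_id = ids,
    issue = rep(paste("missing", col), length(ids)),
    stringsAsFactors = FALSE
  )
})

# Trees that appear in the measurement data but not in the mapping data
unmapped <- setdiff(tree_data$tree_id, map_data$tree_id)
issue_list[[length(issue_list) + 1]] <- data.frame(
  tree_id = unmapped,
  issue = rep("no mapping data", length(unmapped)),
  stringsAsFactors = FALSE
)

# Combine into a single table with one row per tree and issue
problem_sketch <- unique(do.call(rbind, issue_list))

# Number and percentage of trees with at least one issue
n_problem <- length(unique(problem_sketch$tree_id))
pct_problem <- 100 * n_problem / length(unique(tree_data$tree_id))

print(problem_sketch)
```

Keeping one row per tree and issue, rather than one row per tree, means a tree with several problems shows up once for each of them, which makes it easier to decide which problems can be repaired.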
Tree ids indicated to have data issues according to this function are not necessarily unusable. For instance, missing year data could be inferred from knowledge of when certain stands were measured. The tree ids with data issues should therefore be investigated further rather than being excluded from further analyses right away.
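As an illustration of repairing rather than discarding problem trees, the sketch below fills in missing measurement years from a lookup of when each stand was measured. The lookup stand_years, the toy data, and the column names are hypothetical. The approach also assumes each stand was measured in a single known year, which will often not hold for stands that were measured repeatedly.

```r
# Hypothetical lookup: the year in which each stand was measured
stand_years <- c(S1 = 2010, S2 = 2015)

# Toy data with one missing year; column names are assumptions for this sketch
tree_data <- data.frame(
  tree_id  = c("A1", "A2", "A3"),
  stand_id = c("S1", "S2", "S2"),
  year     = c(2010, NA, 2015),
  stringsAsFactors = FALSE
)

# Rows where the year is missing but the stand is known
needs_year <- is.na(tree_data$year) & !is.na(tree_data$stand_id)

# Fill those rows from the lookup; stands absent from the lookup stay NA
tree_data$year[needs_year] <- unname(stand_years[tree_data$stand_id[needs_year]])

print(tree_data)
```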
tree_check_test <- tree_check(messy_tree, mapping)
#> [1] "Potential formatting problems detected: please review output and correct errors or remove problem trees if necessary before continuing analysis"