Structural bias in aggregated species-level variables driven by repeated species co-occurrences: a pervasive problem in community and assemblage data

Publication type: Journal Article
Year of publication: 2017
Authors: Hawkins, BA, Leroy, B, Rodríguez, MÁ, Singer, A, Vilela, B, Villalobos, F, Wang, X, Zelený, D
Journal: Journal of Biogeography
Date published: 02/2017
Keywords: community structure, community weighted means, geographical ecology, intrinsic variables, spatial analysis, species co-occurrence, species composition, species richness gradients, trait analysis


Aim

Species attributes are often used to explain diversity patterns across assemblages/communities. However, repeated species co-occurrences can generate spatial pattern and strong statistical relationships between aggregated attributes and richness in the absence of biological information. Our aim is to increase awareness of this problem.

Location

North America.

Methods

We generated empirical species richness patterns using two data structures: (1) birds gridded from range maps and (2) tree communities from the US Forest Service's Forest Inventory and Analysis. We analysed richness using linear regression, regression trees, generalized additive models, geographically weighted regression and simultaneous autoregression, with ‘random intrinsic variables’ as predictors; these were generated by assigning random numbers to species and calculating their averages within assemblages. We then generated simulations in which species with cohesive or patchy distributions were placed with respect to the North American temperature gradient, with or without a broad-scale richness gradient. Random intrinsic variables were again used as predictors of richness. Finally, we analysed one simulated scenario with random intrinsic variables as both response and predictor variables.
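
The construction of a ‘random intrinsic variable’ can be illustrated with a minimal sketch, assuming a binary site-by-species presence matrix. The function names, the toy one-dimensional range simulation and the use of NumPy are illustrative assumptions, not the authors' code. The study itself used real bird and tree data, temperature-structured simulations and several model types beyond the simple linear fit shown here.

```python
import numpy as np


def assemblage_mean(presence, species_values):
    """Average a species-level value over the species present at each site.

    presence: (n_sites, n_species) 0/1 array; every site needs >= 1 species.
    species_values: one value per species (random numbers in this context).
    """
    presence = np.asarray(presence, dtype=float)
    richness = presence.sum(axis=1)  # number of species per site
    return presence @ np.asarray(species_values, dtype=float) / richness


def r_squared(x, y):
    """R^2 of a simple linear regression of y on x (squared Pearson r)."""
    return float(np.corrcoef(x, y)[0, 1] ** 2)


def simulate_ranges(n_sites, n_species, rng):
    """Hypothetical toy data: each species occupies one contiguous interval
    of sites along a 1-D gradient (a crude stand-in for cohesive ranges)."""
    sites = np.arange(n_sites)
    centres = rng.uniform(0, n_sites, n_species)
    half_widths = rng.uniform(0.05 * n_sites, 0.4 * n_sites, n_species)
    inside = np.abs(sites[:, None] - centres[None, :]) <= half_widths[None, :]
    return inside.astype(int)


def demo(n_sites=200, n_species=100, n_draws=20, seed=0):
    """Regress richness on repeated random intrinsic variables; return R^2s."""
    rng = np.random.default_rng(seed)
    presence = simulate_ranges(n_sites, n_species, rng)
    presence = presence[presence.sum(axis=1) > 0]  # drop empty sites
    richness = presence.sum(axis=1)
    results = []
    for _ in range(n_draws):
        random_values = rng.normal(size=n_species)  # no biological meaning
        predictor = assemblage_mean(presence, random_values)
        results.append(r_squared(predictor, richness))
    return results
```

Calling `demo()` returns one R² per random draw. Because the species values carry no biological information, any non-trivial R² in such a sketch arises only from the way species co-occur across sites.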

Results

The models of bird and tree richness often explained moderate to large proportions of the variance. Regression trees, geographically weighted regression and simultaneous autoregression were very sensitive to the problem; generalized additive models were moderately affected, as was multiple regression to a lesser extent. In the virtual data, the variance explained increased with increasing species co-occurrences, but neither range cohesion, a richness gradient nor spatial autocorrelation in predictors had major impacts on the variance explained. The problem persisted when the response variable was also a random intrinsic variable.

Main conclusions

Repeated species co-occurrences can generate strong spurious relationships between richness and aggregated species attributes. It is important to realize that models utilizing assemblage variables aggregated from species-level values, as well as maps illustrating their spatial patterns, cannot be taken at face value.