Taxonomic sufficiency in freshwater ecosystems: effects of taxonomic resolution, functional traits, and data transformation
Taxonomic sufficiency (TS) has been proposed for assessing community composition and environmental impacts as a way to balance the need to indicate the biology of the organisms present with time and effort needed for species identification. TS has been applied most often to marine and freshwater macroinvertebrates, but tests of its usefulness are lacking for other freshwater groups. We analyzed the effects of taxonomic resolution, functional groupings, and data transformation on multivariate community patterns in periphyton, macrophytes, macroinvertebrates, and fishes, and on the quantification of biodiversity and environmental gradients. The applicability of TS differed strongly among taxonomic groups, depending on the average taxonomic breadth of the species sets. Numerical data resolution had more pronounced effects on community patterns than taxonomic resolution. Richness was strongly affected by data aggregation, but diversity indices were statistically reliable up to order level. Taxonomic aggregation had no significant influence on ability to detect environmental gradients. Functional surrogates based on biological traits, such as feeding type, reproductive strategy, and trophic state, were strongly correlated (ρ = 0.64–0.85) with taxonomic community composition. However, environmental correlations were generally lower with data aggregated to functional traits rather than to species. TS was universally applicable within taxonomic groups for different habitats in one biogeographic region. Aggregation to family or order was suitable for quantifying biodiversity and environmental gradients, but multivariate community analyses required finer resolution in fishes and macrophytes than in periphyton and macroinvertebrates. Sampling effort in environmental-impact studies and monitoring programs would be better invested in quantitative data and number of spatial and temporal replicates than in taxonomic detail.
Loss of biodiversity is proceeding faster in fresh water than in any other major biome (Dudgeon et al. 2006, Strayer and Dudgeon 2010, Geist 2011). Time- and cost-effective methods for the quantification of changes in ecosystem community structure are needed in the context of biodiversity conservation and for assessing and monitoring human impacts (Bevilacqua et al. 2012). On the other hand, a comprehensive scientific picture of their current status is essential to provide for the conservation of freshwater ecosystems (Millennium Ecosystem Assessment 2005, Bellier et al. 2012). Several authors and monitoring protocols regard species-level identification (Maurer 2000, Giagrande 2003, Drew 2011), identification of subspecies (Schaumburg et al. 2007), or DNA-based identification methods (e.g., Sweeney et al. 2011) as most appropriate in this context. However, identification of many freshwater species, e.g., algae or macroinvertebrates, and excessive quantitative sampling can be very difficult and time consuming (Johnson et al. 2006). Consequently, several attempts have been made to increase the time and cost efficiency of monitoring efforts by minimizing sampling or laboratory effort by reducing taxonomic or numerical resolution.
Ellis (1985) proposed taxonomic sufficiency (TS) as a concept for assessment of marine pollution that balances the need to interpret the biology of the organisms present against the time and effort needed for species identification. TS involves identifying taxa to the coarsest taxonomic level possible without losing significant ecological detail (e.g., differences in multivariate community patterns or diversity). In the last 2 decades, an increasing number of studies addressing the applicability of identification levels coarser than species has been published. These studies mainly covered marine ecosystems (42%) and especially marine macroinvertebrates (36% of all studies; reviewed by Bevilacqua et al. 2012). To our knowledge, in freshwater ecosystems, TS has been applied systematically only to macroinvertebrates (reviewed by Jones 2008), and its suitability for other groups is only known from single case studies (plankton: Hansson et al. 2004, diatoms: Heino and Soininen 2007). Especially for vertebrates, plants, and algae, few studies have been done to test the applicability of TS for all ecosystem types (Bevilacqua et al. 2012). Multivariate community patterns of marine and freshwater macroinvertebrates seem to be consistent from species at least up to family level (Bevilacqua et al. 2012), but biodiversity measures, such as richness, evenness, Shannon Index, or Simpson’s Index have the potential to be strongly underestimated by the use of coarser taxonomic levels (Maurer 2000). Multivariate community patterns and biodiversity measures are used often in freshwater science to assess effects of anthropogenic disturbance and to determine areas of conservation priority (e.g., Balmford et al. 1996), so such information would be of high value for freshwater ecologists.
The degree to which the results of ecological analyses change if a coarser level of taxonomic resolution is used is largely influenced by the number of species that are being condensed (Bevilacqua et al. 2012). Furthermore, the degree of taxonomic relatedness among the investigated species and the distribution of species to coarser taxonomic levels (even or uneven distribution) also may influence the applicability of TS. These characteristics can differ strongly among taxonomic groups and ecosystem types. Several indices are available for measuring taxonomic diversity (e.g., average taxonomic breadth; Clarke and Warwick 2001), but their usefulness for predicting the appropriate taxonomic resolution before undertaking time- and cost-intensive identification work has never been tested systematically in freshwater ecosystems. Moreover, ecological similarity of species is not necessarily correlated with taxonomic relatedness (Losos et al. 2003, Poff et al. 2006). Alternative groupings, such as feeding type, reproductive strategy, locomotion type, and habitat preference, can provide valuable information about ecosystem functions and processes (Usseglio-Polatera et al. 2000, Mouillot et al. 2005, Siefert 2012). However, the effects of aggregation to functional guilds on multivariate community patterns have not yet been investigated comprehensively for all relevant taxonomic groups in freshwater systems.
Quantification of the impact of environmental factors on biological communities plays an important role in assessments of human disturbance and natural variability that goes beyond simple description of changes in multivariate community patterns or diversity. To date, few studies of marine macroinvertebrates have considered the capability of detecting environmental gradients by using coarser taxonomic levels than species or other classification types like functional groupings (e.g., Olsgard et al. 1997, 1998).
The outcome of ecological analyses also can be influenced by the numerical resolution of the data (Clarke and Warwick 2001). Differences in numerical resolution can be founded on the degree of quantitative detail in the sampling strategy (quantitative, % abundance, or presence–absence data) or on post hoc data transformation. Especially in applied freshwater science, presence–absence data or % abundance data often are used subsequent to nonquantitative sampling methods (e.g., Barbour et al. 1999, Schaumburg et al. 2007). Various degrees of transformation are used commonly in multivariate analyses of ecological data. These techniques include √(x)-transformation to allow the intermediate abundant species to play a part, log(x)- or 4√(x)-transformation to increase consideration of rarer species, or use of presence–absence data to down-weight the effects of common and abundant species. However, choice of the numerical resolution of the data is more a biological than a statistical question. This choice can affect the conclusions of an analysis more than the choices of similarity measure or ordination method (Clarke and Warwick 2001) and may affect the applicability of TS.
Knowledge of the applicability of different taxonomic and numerical resolutions and functional surrogates currently is limited to single-case studies (e.g., algae: Hansson et al. 2004) or remains untested (e.g., macrophytes, fishes: none of 678 publications reviewed by Bevilacqua et al. 2012) for most freshwater groups. Furthermore, most studies are based on only one data set, do not consider taxonomic levels coarser than family level (see Bevilacqua et al. 2012), compare different taxonomic groups, include effects of numerical resolution and alternative groupings, or test the detected threshold levels statistically. These constraints limit direct comparisons between data sets and taxonomic groups within one ecosystem type. Application of TS and other taxonomic surrogates in freshwater ecosystems requires identification of uses for which TS or functional surrogates could be advantageous and situations in which the lack of taxonomic information might severely limit the quality of assessments for all major taxonomic groups. This knowledge will help find an optimal balance between level of detail of the results and effort. To our knowledge, our study is the first to analyze comprehensively the applicability of TS (up to phylum level) and functional surrogates for multivariate analyses of community patterns, univariate analyses of diversity indices, and detection of environmental gradients with 3 freshwater data sets, each including abundance data for several taxonomic groups (periphyton, macrophytes, macroinvertebrates, and fishes).
The following hypotheses were tested: 1) Taxonomic resolution of periphyton, macrophyte, macroinvertebrate, and fish community data affects the outcome of ecological analyses (multivariate community pattern, diversity measures, capability to detect environmental gradients). The extent of loss of information is expected to depend on the average taxonomic breadth of the investigated taxonomic group and set of species. 2) Classification into functional groups resolves a different outcome of ecological analyses (multivariate pattern, diversity) than grouping individuals according to Linnean taxonomy and can improve the capability to detect environmental gradients. 3) Numerical resolution of the data (e.g., relative abundance or presence–absence data instead of quantitative data) has a stronger influence on the results of community analyses than taxonomic resolution (e.g., genus or family instead of species level).
The hypotheses were tested using 3 large, full-resolution data sets (quantitative species abundance data) from lentic and lotic freshwater habitats. Data set 1 was focused on a pairwise comparison of sites upstream and downstream of weirs in 5 different rivers (10 sampling locations; Mueller et al. 2011). Data set 2 was aimed at a comparison of biodiversity and abiotic habitat variables in different floodplain habitats (river stretches, oxbow sections, small ponds). Data were collected in the River Danube floodplain in Bavaria, Germany (42 sampling locations; Stammel et al. 2012). Data set 3 was collected to compare abiotic habitat characteristics and community composition in 3 calciferous and 3 siliceous rivers distributed throughout Bavaria, Germany (30 sampling locations; JP, MM, and JG, unpublished data). Each data set included the taxonomic groups periphyton, macroinvertebrates, macrophytes, and fishes and a set of environmental variables (water temperature, dissolved O2, specific conductance, pH, water depth, current speed), but the data sets differed in data structure (sampling methods, number of sampled river stretches, and treatments; Table S1; available online from: http://dx.doi.org/10.1899/12-212.1.s1). In our study, periphyton refers to all groups of periphytic algae, including diatoms. The level of taxonomic identification was species for macrophytes and fishes. Periphyton and macroinvertebrates were identified to species level as far as possible. Taxonomically difficult groups (chironomids, Oligochaeta, mites, chlorophyceae <5 µm) and small juveniles were identified to genus or lowest possible level.
Fine-resolution data sets are needed to study the applicability of different taxonomic or numerical data resolutions and functional groupings for ecological analyses. The resolution of these data sets can then be modified by summarizing the data to coarser levels. This procedure is referred to as data aggregation (taxonomic resolution) in the following text. Understanding the effects of data aggregation is essential to ensure that the most suitable classification system and data resolution can be chosen before undertaking the effort involved in species identification in future studies.
Taxonomic resolution (hypothesis 1) was modified by aggregating the full-resolution species abundance data to coarser levels of taxonomic resolution. In our study, species level is the finest level of taxonomic resolution, whereas phylum level is the coarsest level of taxonomic resolution. Species abundance data for periphyton, macroinvertebrates, and macrophytes from each data set were aggregated to genus, family, order, class, and phylum level by calculating the sum of all individuals from the respective level per sample with the aggregation tool in PRIMER (version 6; Clarke and Gorley 2006). All freshwater fishes were from the class Osteichthyes and the phylum Chordata, so species abundance data were aggregated only to order for this group.
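The aggregation step described above can be illustrated with a minimal sketch. It is not the PRIMER aggregation tool; the taxonomy lookup, the function name `aggregate`, the third species, and the abundances are hypothetical and serve only to show how individuals are summed per coarser taxon within one sample.

```python
from collections import defaultdict

# Hypothetical taxonomy lookup (species -> rank -> taxon); illustration only.
TAXONOMY = {
    "Barbus barbus": {"genus": "Barbus", "family": "Cyprinidae", "order": "Cypriniformes"},
    "Chondrostoma nasus": {"genus": "Chondrostoma", "family": "Cyprinidae", "order": "Cypriniformes"},
    "Salmo trutta": {"genus": "Salmo", "family": "Salmonidae", "order": "Salmoniformes"},
}


def aggregate(sample, rank, taxonomy=TAXONOMY):
    """Sum the species abundances of one sample within each taxon of the given rank."""
    totals = defaultdict(float)
    for species, abundance in sample.items():
        totals[taxonomy[species][rank]] += abundance
    return dict(totals)


# Invented abundances for a single sample.
sample = {"Barbus barbus": 12, "Chondrostoma nasus": 5, "Salmo trutta": 3}
family_counts = aggregate(sample, "family")
order_counts = aggregate(sample, "order")
```

In this invented sample the two cyprinids collapse into one family total, which is the kind of merging of ecologically different species that the analyses below evaluate.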
The effects of functional groupings on the multivariate community patterns of periphyton, macrophytes, macroinvertebrates, and fishes (hypothesis 2) were tested by aggregating data to commonly used functional traits (representing a mixture of biological traits and ecological requirements; Table S2; available online from: http://dx.doi.org/10.1899/12-212.1.s1). Fishes and macroinvertebrates were assigned to groups commonly used in assessments in the context of the European Water Framework Directive (see Table S2). The use of functional traits to assess macrophytes and periphyton is less established in standard evaluation. Therefore, commonly applied functional classifications were selected from different literature sources for these groups. For each taxonomic group, all traits were summarized in a matrix (All Traits) containing the number of specimens from each trait state per sample (e.g., 61 fishes with reproduction type rheophilic, 12 fishes with reproduction type indifferent, 10 with trophic status omnivore; cf. functional trait niche, Poff et al. 2006). Details about the selected functional traits and the respective literature sources are provided in Table S2.
The numerical resolution (hypothesis 3) of each data set, taxonomic group, and level of taxonomic resolution was modified by data transformation using the pre-treatment transformation (overall) procedure in PRIMER (Clarke and Gorley 2006). Numerical resolution in our study reached from untransformed quantitative data (finest level) over √(x)-transformed and % abundance data to presence–absence data (coarsest level).
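The four levels of numerical resolution can be sketched as simple per-sample transformations. This is an illustrative approximation of the transformations named above, not the PRIMER pre-treatment procedure; the function names and the example counts are assumptions.

```python
import math


def sqrt_transform(sample):
    """Square-root transformation: down-weights the most abundant taxa."""
    return {taxon: math.sqrt(n) for taxon, n in sample.items()}


def percent_abundance(sample):
    """Relative abundance in percent of the sample total."""
    total = sum(sample.values())
    if total == 0:
        return {taxon: 0.0 for taxon in sample}
    return {taxon: 100.0 * n / total for taxon, n in sample.items()}


def presence_absence(sample):
    """Coarsest numerical resolution: 1 if the taxon occurs, else 0."""
    return {taxon: 1 if n > 0 else 0 for taxon, n in sample.items()}


# Invented counts for one sample.
counts = {"taxon_a": 16, "taxon_b": 4, "taxon_c": 0}
as_sqrt = sqrt_transform(counts)
as_percent = percent_abundance(counts)
as_presence = presence_absence(counts)
```

Each step discards quantitative detail: the 4:1 ratio between the first two taxa shrinks to 2:1 after the square-root transformation and disappears entirely in the presence–absence data.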
Comparison of resemblance matrices
After data aggregation, resemblance matrices (Bray–Curtis similarity) were calculated for all taxonomic groups, respective levels of taxonomic resolution, levels of numerical resolution (note that Bray–Curtis similarity is equal to Sørensen Index for presence–absence data), and functional traits (taxonomic and numerical resolution: 3 data sets × 4 taxonomic groups × 6 taxonomic levels × 4 numerical resolutions = 288 Bray–Curtis matrices; functional groupings: 3 data sets × 18 traits = 54 Bray–Curtis matrices). The 2nd-stage approach in the PRIMER package was used to analyze differences among multivariate community patterns derived from different data-aggregation modes (taxonomic resolution, numerical resolution, functional groupings). This procedure uses multivariate Spearman rank correlation (ρ) to compare resemblance matrices based on Bray–Curtis similarity and is commonly applied in assessments of TS (Somerfield and Clarke 1995). Nonmetric multidimensional scaling (NMDS) was run on the resulting 2nd-stage matrices (ρ as resemblance measure) to visualize similarities and dissimilarities among Bray–Curtis matrices. To test hypotheses 1 and 3, Bray–Curtis matrices derived from different taxonomic and numerical resolutions were compared in a 2nd-stage analysis, resulting in twelve 2nd-stage matrices and NMDS plots (3 data sets × 4 taxonomic groups). To test hypothesis 2, Bray–Curtis matrices derived from functional groupings were compared with those from the taxonomic groupings (species, genus, family, order, class, and phylum) in a separate 2nd-stage analysis, resulting in twelve 2nd-stage matrices and NMDS plots (3 data sets × 4 taxonomic groups).
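The core of the 2nd-stage comparison, Bray–Curtis similarity followed by a Spearman rank correlation between two resemblance matrices, can be sketched as follows. This is a simplified stand-in for the PRIMER routine, assuming samples are stored as taxon-to-abundance dictionaries; all function names and the example data are hypothetical.

```python
from itertools import combinations
import math


def bray_curtis_similarity(a, b):
    """Bray-Curtis similarity (0-100) between two samples (taxon -> abundance)."""
    taxa = set(a) | set(b)
    diff = sum(abs(a.get(t, 0) - b.get(t, 0)) for t in taxa)
    total = sum(a.get(t, 0) + b.get(t, 0) for t in taxa)
    return 100.0 * (1 - diff / total) if total else 0.0


def resemblance_vector(samples):
    """Lower triangle of the sample-by-sample similarity matrix as a flat list."""
    return [bray_curtis_similarity(samples[i], samples[j])
            for i, j in combinations(range(len(samples)), 2)]


def average_ranks(values):
    """Ranks starting at 1; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation; NaN if one vector has no variation."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((p - mx) * (q - my) for p, q in zip(rx, ry))
    sx = math.sqrt(sum((p - mx) ** 2 for p in rx))
    sy = math.sqrt(sum((q - my) ** 2 for q in ry))
    if sx == 0 or sy == 0:
        return float("nan")
    return cov / (sx * sy)


# Invented data: three samples at species level and the same samples after
# aggregating sp1 and sp2 into one family (famA); sp3 belongs to famB.
species_level = [{"sp1": 10, "sp3": 5}, {"sp1": 8, "sp2": 2}, {"sp2": 9, "sp3": 1}]
family_level = [{"famA": 10, "famB": 5}, {"famA": 10}, {"famA": 9, "famB": 1}]
rho = spearman_rho(resemblance_vector(species_level), resemblance_vector(family_level))
```

A ρ close to 1 would indicate that the coarser level preserves the among-sample pattern of the species-level data; repeating the calculation for every pair of resemblance matrices would yield a 2nd-stage matrix of the kind ordinated by NMDS in the study.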
Permutational multivariate analysis of variance (PERMANOVA; Anderson et al. 2008) was run in PRIMER to compare the effects of numerical resolution on multivariate community patterns with those of taxonomic resolution (hypothesis 3). PERMANOVA is a routine for testing the multivariate response to one or more factors on the basis of any resemblance measure. The values in the matrix are not treated as independent of one another (Anderson et al. 2008), which enables comparison of matrices derived from the same data (e.g., different taxonomic or numerical levels). For each taxonomic group and data set, 2 PERMANOVA analyses were run on the respective 2nd-stage matrices. Two separate 1-way PERMANOVA designs were applied. Taxonomic resolution (6 factor levels: species, genus, family, order, class, and phylum) was used as a fixed factor in the 1st design and numerical resolution (4 factor levels: untransformed quantitative, √(x)-transformed, % abundance, and presence–absence data) was used as the fixed factor in the 2nd design. Pseudo-F values and permutational p-values were used to compare the effect strength of taxonomic vs numerical resolution.
Taxonomic resolution and environmental gradients
The ability to recover ecological patterns of periphyton, macrophytes, macroinvertebrates, and fishes at different taxonomic resolutions (hypothesis 1) and with functional groupings (hypothesis 2) was tested using Biota-Environmental-Stepwise matching (BEST) analyses in PRIMER (Clarke and Warwick 2001). Taxa data sets were used as response variables (all 342 Bray–Curtis matrices generated in the previous analyses), and environmental variables (water temperature, dissolved O2, specific conductance, pH, water depth, current speed) were used as predictors. The BEST procedure uses a stepwise search and Spearman rank correlation to find a minimum combination of environmental variables that maximizes correlation with the biotic data. The taxonomic level and transformation type resulting in the maximum Spearman correlation coefficient between environmental variables and biotic data (r2) was used to identify the best description of community patterns (following Olsgard et al. 1997).
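The idea behind matching biotic and environmental patterns can be sketched with a small exhaustive search. Note the assumptions: PRIMER's BEST uses a stepwise search, whereas this sketch simply tries every subset of variables; environmental values are assumed to be standardized already; biotic resemblance is supplied as dissimilarity (100 minus Bray–Curtis similarity) so that a good match gives a positive correlation. All names and numbers are hypothetical.

```python
from itertools import combinations
import math


def ranks(values):
    """Ranks starting at 1; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out


def spearman(x, y):
    """Spearman rank correlation; 0.0 if one vector has no variation."""
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((p - mx) * (q - my) for p, q in zip(rx, ry))
    spread = math.sqrt(sum((p - mx) ** 2 for p in rx) * sum((q - my) ** 2 for q in ry))
    return cov / spread if spread else 0.0


def env_distances(env, variables):
    """Euclidean distances between all site pairs over the chosen variables."""
    n_sites = len(next(iter(env.values())))
    return [math.sqrt(sum((env[v][i] - env[v][j]) ** 2 for v in variables))
            for i, j in combinations(range(n_sites), 2)]


def best_subset(biotic_dissimilarity, env):
    """Return (rho, variables) for the subset with the highest rank correlation."""
    best = (float("-inf"), ())
    for size in range(1, len(env) + 1):
        for subset in combinations(sorted(env), size):
            rho = spearman(biotic_dissimilarity, env_distances(env, subset))
            if rho > best[0]:
                best = (rho, subset)
    return best


# Invented example with 4 sites; pairs are ordered (0,1), (0,2), (0,3), (1,2), (1,3), (2,3).
env = {"depth": [0.0, 1.0, 2.0, 3.0], "pH": [5.0, 1.0, 4.0, 2.0]}
biotic = [10.0, 20.0, 30.0, 10.0, 20.0, 10.0]
best_rho, best_vars = best_subset(biotic, env)
```

In this constructed case the biotic dissimilarities follow the depth gradient exactly, so depth alone is selected; comparing the maximum correlation obtained at each taxonomic level corresponds to the comparison reported in the study.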
Univariate analyses of Spearman rank correlation coefficients and diversity measures
To identify the threshold of significant loss of information for each taxonomic group, we used nonparametric univariate statistical analysis with ρ-values, r2-values, and diversity indices as the response variables and taxonomic resolution as the factor (with factor levels: species, genus, family, order, class, and phylum). This approach is applied similarly in microarray analyses (Listgarten and Emili 2005) and machine learning (Demsar 2006) to test the validity of classification algorithms, and Demsar (2006) proposed the use of nonparametric tests for comparisons across data sets. A significant loss of information resulting from coarsening taxonomic resolution (hypothesis 1) in our study is defined as a statistically significant drop of ρ-values (indicating a change in multivariate community pattern; 2nd-stage analysis), r2-values (indicating a change in the capability to detect environmental gradients; BEST analysis), or diversity indices (indicating a change in richness, evenness, Shannon Index, or Simpson’s Index) from a finer level of taxonomic resolution to the next coarser level (e.g., from species to genus). Richness, evenness (Pielou 1975), Shannon Index (Shannon and Weaver 1949), and Simpson’s Index (Simpson 1949) were calculated for each taxonomic level using the DIVERSE procedure in PRIMER. In addition, we calculated the functional diversity (measured as richness, evenness, Shannon Index, Simpson’s Index) for each functional trait (hypothesis 2). Diversity values were pooled over all data sets, whereas ρ-values and r2-values were pooled over all data sets and levels of numerical resolution. All data were tested for normality using the Shapiro–Wilk test and for homogeneity of variances using the Levene test. Because correlation coefficients and diversity indices were not normally distributed, the Kruskal–Wallis analysis of variance (ANOVA) and post hoc pairwise Mann–Whitney U-test were used to test for differences between aggregation levels.
Bonferroni correction was applied to correct for multiple comparisons. All univariate statistics were carried out in the software program R (version 3.0.0; R Development Core Team, Vienna, Austria; www.r-project.org).
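The four diversity measures can be sketched with their common textbook forms. The study used the DIVERSE procedure in PRIMER; the natural logarithm for the Shannon Index and the 1 − Σp² form of Simpson's Index used here are assumptions, as are the function names and the example abundances.

```python
import math


def richness(sample):
    """Number of taxa with non-zero abundance."""
    return sum(1 for n in sample.values() if n > 0)


def shannon(sample):
    """Shannon Index H' = -sum(p * ln p) over taxa present (non-empty sample)."""
    total = sum(sample.values())
    return -sum((n / total) * math.log(n / total) for n in sample.values() if n > 0)


def simpson(sample):
    """Simpson's Index in the form 1 - sum(p^2) (non-empty sample)."""
    total = sum(sample.values())
    return 1 - sum((n / total) ** 2 for n in sample.values())


def pielou_evenness(sample):
    """Pielou's evenness J' = H' / ln(S); defined as 0.0 when fewer than 2 taxa occur."""
    s = richness(sample)
    return shannon(sample) / math.log(s) if s > 1 else 0.0


# Invented sample at species level and the same sample aggregated to family.
species_sample = {"Barbus barbus": 12, "Chondrostoma nasus": 5, "Salmo trutta": 3}
family_sample = {"Cyprinidae": 17, "Salmonidae": 3}

species_indices = (richness(species_sample), shannon(species_sample), simpson(species_sample))
family_indices = (richness(family_sample), shannon(family_sample), simpson(family_sample))
```

Calculating the indices at each aggregation level, as in this pair of samples, gives the values that were then compared among levels with the nonparametric tests described above.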
To test if the extent of information loss depends on taxonomic diversity (higher-taxa/species ratio [Φ] and the distribution of species to higher taxa) of the investigated set of species (hypothesis 1), the average taxonomic breadth (Δ+) for each aggregation level was correlated with the ρ-values between the respective resemblance matrices using linear regression and Spearman rank correlation. According to Clarke and Warwick (2001), the average taxonomic breadth is defined as $\Delta^{+} = \left[\sum\sum_{i<j} \omega_{ij}\right] / \left[S(S-1)/2\right]$, where S is the observed number of species in the sample, the double summation ranges over all pairs i and j of these species, and $\omega_{ij}$ represents the taxonomic distance through the classification tree between species i and j. It was calculated for each data set, taxonomic group, and aggregation level using the function DIVERSE in PRIMER. Linear regressions and correlation analyses were carried out in the software program R.
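The average taxonomic breadth, i.e., the mean taxonomic distance over all pairs of species in a list, can be sketched as follows. The weighting scheme (one step per rank until two species share a taxon), the truncated three-rank hierarchy, and all names are assumptions for illustration; PRIMER's DIVERSE routine may scale the distances differently.

```python
from itertools import combinations

# Hypothetical, truncated hierarchy and taxonomy lookup; illustration only.
RANKS = ["genus", "family", "order"]
TAXONOMY = {
    "Barbus barbus": {"genus": "Barbus", "family": "Cyprinidae", "order": "Cypriniformes"},
    "Chondrostoma nasus": {"genus": "Chondrostoma", "family": "Cyprinidae", "order": "Cypriniformes"},
    "Salmo trutta": {"genus": "Salmo", "family": "Salmonidae", "order": "Salmoniformes"},
}


def taxonomic_distance(a, b, taxonomy=TAXONOMY, ranks=RANKS):
    """Steps up the hierarchy until species a and b share a taxon (assumed weights)."""
    for step, rank in enumerate(ranks, start=1):
        if taxonomy[a][rank] == taxonomy[b][rank]:
            return float(step)
    return float(len(ranks) + 1)  # no shared taxon within the listed ranks


def average_taxonomic_breadth(species, taxonomy=TAXONOMY):
    """Mean pairwise taxonomic distance over the S(S-1)/2 species pairs."""
    pairs = list(combinations(species, 2))
    if not pairs:
        return 0.0  # undefined for fewer than 2 species; 0.0 by convention here
    return sum(taxonomic_distance(a, b, taxonomy) for a, b in pairs) / len(pairs)


delta_plus = average_taxonomic_breadth(list(TAXONOMY))
```

Because only a species list and its classification are needed, a value of this kind can be obtained before any quantitative sampling, which is what makes it attractive as a predictor of the required taxonomic level.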
Effects of taxonomic resolution on multivariate community patterns
Effects of taxonomic resolution on multivariate community patterns strongly differed among the taxonomic groups periphyton, macrophytes, macroinvertebrates, and fishes (Figs 1, 2A–D). The lowest effects of coarsening taxonomic resolution were detected by 2nd-stage analysis in periphyton, followed by macroinvertebrates. A significant loss of information for macrophytes and fishes occurred at the genus (Mann–Whitney U-test, macrophytes, p < 0.001; Figs 1, 2B) and family levels (Mann–Whitney U-test, fishes, p < 0.01; Figs 1, 2D), respectively. In contrast, macroinvertebrate and periphyton community structure changed significantly from species to order (Mann–Whitney U-test, macroinvertebrates, p < 0.01; Figs 1, 2C) and from species to class level (Mann–Whitney U-test, periphyton, p < 0.01; Figs 1, 2A). A strong aggregation up to phylum or class level revealed a significantly different community composition for each taxonomic group (Figs 1, 2A–D). This structure was constant across data sets, but as a comparison of ρ-values from the 3 data sets indicates, the correlations among taxonomic levels were higher for anthropogenic disturbance (effects of weirs, data set 1, mean ρ = 0.79) than for natural variability both for large-scale (different rivers, data set 3, mean ρ = 0.67) and small-scale data (different habitats within one river system, data set 2, mean ρ = 0.66). Regression and correlation analysis of Δ+ and ρ-values between resemblance matrices of different taxonomic levels revealed a strong relationship between both parameters for all taxonomic groups with ρ ranging between 0.70 for fishes and 0.87 for periphyton (Fig. 3). In contrast, a comparison of Φ and the decline in ρ for periphyton and macrophytes indicates a less pronounced relationship in these groups. Φ between species and genus level was lower for periphyton (Φ = 0.56) with ρ staying constant, whereas for macrophytes, Φ was higher (Φ = 0.68), but ρ decreased significantly from species to genus level. 
At the same time, the Δ+ was higher for periphyton (Δ+ species–genus = 0.98) than for macrophytes (Δ+ species–genus = 0.89).
Effects of taxonomic resolution on the quantification of biodiversity
The univariate comparisons of richness, evenness, Shannon Index, and Simpson’s Index from different taxonomic levels suggest that diversity measures are generally affected by taxonomic resolution in a very similar way as the multivariate community patterns. As expected, increasing aggregation resulted in a decrease of richness and diversity indices, but with differences in the extent of decrease among different taxonomic groups. In the groups of macroinvertebrates and macrophytes, the quantification of biodiversity applying TS did not strongly differ from using species-level data as a baseline. The detected decrease with coarser taxonomic level was strongest for richness (Table 1), moderate and identical for Shannon Index and Simpson’s Index (Table 1), and less pronounced for evenness. Significant differences in evenness occurred only by shifting from species to class and phylum level for periphyton (Mann–Whitney U-test, p < 0.01), macrophytes (Mann–Whitney U-test, p < 0.001), and macroinvertebrates (Mann–Whitney U-test, p < 0.001), and to order level for fishes (Mann–Whitney U-test, p < 0.05). Shannon Index and Simpson’s Index did not change significantly up to the same or even at coarser taxonomic level as observed for the respective community pattern in each taxonomic group. Evenness was significantly lower only on class and phylum level for all taxonomic groups except for fishes, where it decreased significantly on order level.
Taxonomic resolution and environmental gradients
Because there were almost no differences in the results of BEST analyses between taxonomic levels (Table 2), the applied taxonomic resolution generally had low effects on the capability to detect environmental gradients. Significant differences between taxonomic levels occurred only for macroinvertebrates, with the correlation of environmental variables being significantly lower on family, order, class, and phylum level than on species level (Table 2). For macrophytes, the BEST analysis revealed slightly higher correlation on species level, but there were no significant differences and the standard deviation of BEST correlation coefficients was very high (Table 2). For periphyton, a slight decrease in values from family to phylum level was detected, whereas for fishes only the standard deviation increased from family level to higher taxonomic classifications (Table 2). Numerical resolution had no significant effects on BEST results for all investigated taxonomic groups.
Effects of numerical resolution
PERMANOVA pseudo-F values were consistently higher and p-values lower for the factor numerical resolution than for taxonomic resolution, so different types of data transformation (none, √(x), % abundance data, presence–absence data) had stronger effects on community patterns than taxonomic levels (PERMANOVA taxonomic resolution: pseudo-F = 1.01–68.87, p = 0.001–0.5; numerical resolution: pseudo-F = 1.01–252.32, p = 0.001–0.49). Matrices from √(x)-transformed data and untransformed data strongly clustered in most cases (except for the periphyton data from data set 1, see Fig. 1). Percent abundance data and presence–absence data had more pronounced effects on the multivariate community pattern across data sets and taxonomic groups. The effects of coarsening taxonomic resolution on multivariate community patterns increased with coarsening numerical resolution, which also differed between taxonomic groups (Fig. 1). Evaluating the arrangement of matrices in the 2nd-stage NMDS, the strongest change of community patterns resulting from numerical data resolution was detected for periphyton (PERMANOVA data set 1: pseudo-F = 252.32, p < 0.001; data set 2: pseudo-F = 198.94, p < 0.001; data set 3: pseudo-F = 1.01, p = 0.35). A moderate impact was found in macrophytes (PERMANOVA data set 1: pseudo-F = 133.14, p < 0.001; data set 2: pseudo-F = 15.56, p < 0.01; data set 3: pseudo-F = 1.01, p = 0.49), and fishes (PERMANOVA data set 1: pseudo-F = 65.42, p < 0.01; data set 2: pseudo-F = 12.69, p < 0.001; data set 3: pseudo-F = 41.63, p < 0.001). The least pronounced effects were found for macroinvertebrates (PERMANOVA data set 1: pseudo-F = 12.73, p < 0.001; data set 2: pseudo-F = 17.76, p = 0.001; data set 3: pseudo-F = 17.35, p < 0.001; Fig. 1).
Functional traits as alternative grouping
A multivariate comparison of Bray–Curtis matrices from alternative groupings according to functional characteristics of species with those from taxonomic groupings indicates strong differences between taxonomic groups and the applied functional traits (Fig. 4). Some functional groupings, e.g., feeding types of fishes and macroinvertebrates and the trophic state of macrophytes, revealed community patterns that were very similar to those on species or other taxonomic levels in 2nd-stage analysis, but other traits, such as reproductive strategies and habitat preferences of fishes and the zonation of macroinvertebrates, resulted in a clustering that differed from taxonomic groupings (Fig. 4). Similarity between functional groupings and species level was highest for periphyton (ρ = 0.71–0.82) and lowest for fishes (ρ = 0.25–0.65) (Fig. 5). Correlations with environmental variables were lower or in the same range for functional groupings as for species-level data and declined with decreasing similarity to species level (Fig. 5).
Functional diversity, measured by Shannon Index and Simpson’s Index, summarized for all traits per taxonomic group was always higher than species diversity, no matter if richness was increased or decreased in comparison to species richness by summarizing all traits. Richness was generally strongly reduced by functional grouping (Table 3) because it was limited by the maximum number of trait states. Nevertheless, functional diversity (especially when measured by evenness) of single traits often reached similar values as species diversity (e.g., habitat preferences of periphyton, functional feeding groups of macroinvertebrates, substrate preferences, and trophic state of macrophytes and reproductive strategies of fishes; Table 3, Fig. 4).
Our study provides new baseline data about the effectiveness of taxonomic surrogates in freshwater ecosystems, including taxonomic resolution coarser than species, functional groups, diversity measures, and effects of numerical data resolution in periphyton, macrophytes, macroinvertebrates, and fishes. This information is crucial for assessing the applicability of the concept of TS in freshwater ecosystems, e.g., for understanding advantages and limits of using coarser taxonomic resolution than species (Bevilacqua et al. 2012) or functional surrogates instead of classical species data.
The applicability of taxonomic sufficiency for ecological analyses
The post hoc univariate comparison of 2nd-stage correlation coefficients clearly demonstrates that the threshold of losing statistically significant information when applying TS strongly differs between taxonomic groups. In addition, the applicability of TS is influenced by the scale of the investigated effects. The influence of effect scale is evident by the higher 2nd-stage correlation between finer and coarser taxonomic levels throughout all investigated taxonomic groups for data sets considering very pronounced differences between treatments (e.g., upstream and downstream sides of weirs in data set 1 or different rivers in data set 3) than for data sets considering small-scale natural variability (e.g., natural variation between habitat types within one river in data set 2). These findings indicate that the required taxonomic resolution depends more on the investigated taxonomic group and the extent of the studied effects than on the ecosystem type. The differences in the applicability of TS between taxonomic groups are probably founded in the complexity of the systematic classification of the respective groups in a certain geographic region (Heino and Soininen 2007), which in turn results in differences in taxonomic diversity. In groups with a relatively low taxonomic diversity, the application of coarser taxonomic levels for community analyses can cause significant loss of information because of the aggregation of species with differing ecological requirements. For instance, in the group of freshwater fishes, species with contrasting specialization are aggregated on family level. In the data sets investigated in our study, species having very different ecological requirements but belonging to the same family (cyprinids) co-occurred (e.g., high current preference: Chondrostoma nasus L. or Barbus barbus L. and preference for lentic habitats: Scardinius erythrophthalmus L. or Rhodeus amarus A.).
This grouping of ecologically different species may have limited the habitat-type separation in the multivariate community pattern analysis, resulting in a low correlation between species and family level. The high discrepancy between genus and family level limits the practical use of TS for fishes, e.g., in European freshwater ecosystems, because many genera are species-poor and genus identification requires the same expert knowledge as species identification and is therefore no more efficient (Mandelik et al. 2007). However, the practical use of TS for fishes may differ in regions where fish species diversity is much higher than in our data sets (e.g., Amazon basin, Congo basin, Southern Asia; Rosenzweig and Sandlin 1997). In contrast to fishes, comparison of our results for macroinvertebrates with previous studies from other types of habitats (freshwater: Jones 2008, Buss and Vitorino 2010; marine: Olsgard et al. 1997, Chainho et al. 2007, Sajan et al. 2010; terrestrial: Blanche et al. 2001, Cagnolo et al. 2002, Landeiro et al. 2012) generally suggests that multivariate community analyses are robust up to family level. A similar result was obtained for periphyton. Because of the high numbers of families and orders in macroinvertebrates and periphyton, ecological differences are more likely to be conserved at coarser taxonomic levels in these groups than in fishes. This explanation is supported by Bevilacqua et al. (2012), who found a significant relationship between Φ and the correlations between species-level and coarser-level community patterns. Bevilacqua et al. (2012) also detected this relationship for randomly aggregated data. For this reason, the tendency of higher Φ to increase the similarity between species-level and coarser-level patterns is likely a purely stochastic relationship.
However, using Φ to predict the effectiveness of TS for a given set of species does not account for the possible effects of an uneven distribution of species among higher taxa. This influence is evident from comparing Φ and Δ+ of periphyton and macrophytes in relation to the change in multivariate community patterns between genus and species level. The lower predictive power of Φ is probably caused by the more even distribution of periphyton species among genera compared with macrophyte species. Taxonomic diversity measured by Δ+ (Clarke and Warwick 2001) includes these effects and was suitable for predicting the applicability of different taxonomic levels, as evident from the high correlation coefficients. Thus, the average taxonomic breadth can be used as a surrogate to estimate in advance the minimum necessary taxonomic level. This measure can easily be calculated from species lists obtained from pre-assessments in all geographic regions and ecosystem types. Second-stage NMDS is a universally applicable and powerful method for selecting a combination of functional traits with minimal phylogenetic correlation and autocorrelation that maximizes information content.
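As an illustration of how Φ and Δ+ could be pre-estimated from a species list, the following minimal Python sketch may help. It assumes that Φ denotes the ratio of the number of higher taxa to the number of species (this section does not define it), that Δ+ is the mean path length through the taxonomic hierarchy over all species pairs, and that every step between ranks has equal weight with path lengths scaled to 0–1. The species list, the set of ranks, and all function names are hypothetical and do not reproduce the data or software used in the study.

```python
from itertools import combinations

# Hypothetical species list: (species, genus, family, order).
# Salmo trutta is an illustrative addition, not taken from the data sets.
TAXA = [
    ("Barbus barbus", "Barbus", "Cyprinidae", "Cypriniformes"),
    ("Chondrostoma nasus", "Chondrostoma", "Cyprinidae", "Cypriniformes"),
    ("Scardinius erythrophthalmus", "Scardinius", "Cyprinidae", "Cypriniformes"),
    ("Salmo trutta", "Salmo", "Salmonidae", "Salmoniformes"),
]


def path_length(a, b):
    """Steps up the hierarchy until a and b share a taxon, scaled to 0..1.

    Assumption: equal step weights; species sharing no listed rank get 1.0.
    """
    ranks_a, ranks_b = a[1:], b[1:]
    n_ranks = len(ranks_a)
    for step, (x, y) in enumerate(zip(ranks_a, ranks_b), start=1):
        if x == y:
            return step / (n_ranks + 1)
    return 1.0


def avg_taxonomic_distinctness(taxa):
    """Mean pairwise path length over all species pairs (a Delta+ analogue)."""
    pairs = list(combinations(taxa, 2))
    return sum(path_length(a, b) for a, b in pairs) / len(pairs)


def higher_taxon_ratio(taxa, rank_index):
    """Number of distinct taxa at the given rank divided by number of species."""
    return len({t[rank_index] for t in taxa}) / len(taxa)


if __name__ == "__main__":
    print("Delta+ (0..1 scale):", avg_taxonomic_distinctness(TAXA))
    print("Phi at family level:", higher_taxon_ratio(TAXA, 2))
```

In this toy list the three cyprinids lie close together in the hierarchy, so the mean path length is pulled down even though every genus holds a single species; a simple taxon-to-species ratio cannot register such unevenness, whereas a path-length measure does.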
The capability to detect environmental gradients using coarser levels than species
Previous studies on macrobenthic communities in marine and freshwater environments suggested that coarser taxonomic levels may be more appropriate than the species level for quantifying environmental changes (Ferraro and Cole 1990, Warwick 1993, Bailey et al. 2001). This finding was explained by the high noise caused by the individual response of each species to natural environmental gradients, which can disguise the effects of anthropogenic disturbance (Warwick 1993). Based on our results, this hypothesis has to be rejected for freshwater systems. Aggregation level had no significant influence on the capability to detect environmental gradients in our data sets, and correlation values decreased rather than increased at levels coarser than species. This finding is also supported by results from marine meiofauna (Olsgard et al. 1997). Consequently, identifying taxa only to intermediate taxonomic levels may be justifiable if financial resources are limited. However, as indicated by the decreasing trend of correlation values and rising standard deviations, the reliability of the results suffers from aggregation, at least at class and phylum level. Subtle environmental changes, which may affect rare or threatened species, can be overlooked if TS is applied.
The applicability of higher-taxon diversity as surrogate for species diversity
Besides the effects of TS on multivariate community patterns, there were also effects on the quantification of biodiversity, depending on the diversity measure applied. Although Heino and Soininen (2007) found a strong correlation between species richness and higher-taxon richness for stream macroinvertebrates and diatoms, the strong decrease in richness that already occurred with aggregation to genus or family level in our study indicates that biodiversity can be strongly underestimated if richness is used as the only diversity measure. In contrast, evenness, Shannon Index, and Simpson’s Index were even less affected by taxonomic data aggregation than multivariate community patterns. However, the most pronounced loss of information was detected from order to class level for all data sets, taxonomic groups, and diversity indices. Consequently, the order level seems to be a critical threshold of taxonomic resolution for aquatic ecology, beyond which (i.e., at class level and coarser) the explanatory power of biodiversity measures strongly decreases. If the concept of TS is applied to new systems or habitats, an initial combination of several measures of diversity (Heino et al. 2008) and multivariate community analyses, calculated at genus, family, or order level (adapted to the specific taxonomic group and the available resources), can serve as a reliable surrogate for species diversity for periphyton and macroinvertebrates. In contrast, the strong changes in the outcome of the analysis from species to family level observed for macrophytes and fishes suggest that genus- or species-level identification is necessary for these groups.
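The contrast between richness and the other diversity measures under aggregation can be made concrete with a small Python sketch. It assumes the Gini-Simpson form of Simpson’s Index (1 − Σp²), the Shannon Index with natural logarithms, and Pielou’s evenness (H′/ln S); which variants were used in the study is not stated in this section. The abundances, taxon names, and function names are hypothetical.

```python
import math
from collections import defaultdict


def aggregate(counts, lookup):
    """Sum species abundances within the higher taxon given by lookup."""
    pooled = defaultdict(float)
    for species, n in counts.items():
        pooled[lookup[species]] += n
    return dict(pooled)


def diversity(counts):
    """Richness, Shannon, Gini-Simpson, and Pielou evenness for one sample."""
    total = sum(counts.values())
    p = [n / total for n in counts.values() if n > 0]
    richness = len(p)
    shannon = -sum(x * math.log(x) for x in p)
    simpson = 1.0 - sum(x * x for x in p)
    evenness = shannon / math.log(richness) if richness > 1 else 0.0
    return {"richness": richness, "shannon": shannon,
            "simpson": simpson, "evenness": evenness}


# Hypothetical sample: four species in two families (illustrative only).
SPECIES_COUNTS = {"sp_a": 40, "sp_b": 10, "sp_c": 30, "sp_d": 20}
FAMILY_OF = {"sp_a": "fam_1", "sp_b": "fam_1", "sp_c": "fam_2", "sp_d": "fam_2"}

if __name__ == "__main__":
    print("species:", diversity(SPECIES_COUNTS))
    print("family: ", diversity(aggregate(SPECIES_COUNTS, FAMILY_OF)))
```

In this toy sample richness is halved by aggregation to family level, while the Gini-Simpson value falls only from 0.70 to 0.50, which is the kind of asymmetry described above.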
The applicability of functional surrogates for multivariate analyses
Alternative groupings according to functional traits can potentially reveal additional information concerning ecosystem properties beyond taxonomic composition (Usseglio-Polatera et al. 2000, Poff et al. 2006). Low statistical and phylogenetic correlations among traits, e.g., as detected in our study for migration types of fishes and life form of macrophytes (ρ species level ≤ 0.31), are desirable to ensure statistical independence and to maximize information content (Townsend and Hildrew 1994, Poff et al. 2006, Cadotte et al. 2011). For the faunal groups in our study, functional traits related to feeding types of fishes and macroinvertebrates were more strongly correlated with species-level data than traits referring to habitat use (e.g., migration type and habitat preferences of fishes, saprobic state, and zonation of macroinvertebrates). This finding is in line with the assumption that some functional traits are phylogenetically more conserved than others (Usseglio-Polatera et al. 2000). In turn, a similar clustering of species into groups derived from both taxonomy and functional traits can be explained by the phylogenetic conservation of traits, e.g., through similar morphological and physiological characteristics related to the feeding type within one genus or family.
The applicability of functional diversity as surrogate for species diversity
Functional diversity (richness and Shannon Index) that can be calculated from species-level data aggregated to functional groups strongly depends on the number of trait states. For instance, the functional diversity of migration types of fishes (only 3 categories) is lower than species diversity, whereas other traits including many categories (e.g., >10 feeding types of macroinvertebrates) typically reveal higher diversity values. This finding is supported by Bêche and Statzner (2009), who pointed out the weakness of trait richness as a measure of functional diversity in stream macroinvertebrates. In contrast to richness, evenness and Simpson’s Index for single traits and the combination of all investigated functional traits per taxonomic group resulted in high diversity values for all data sets investigated herein. The functional diversity of All Traits (representing the functional trait niches of the investigated taxa according to Poff et al. 2006) measured by Simpson’s Index sometimes even exceeded species diversity, regardless of whether the combination of all traits caused a decrease or an increase in richness. Consequently, this approach appears to be more comprehensive for analyzing functional diversity than the consideration of single traits.
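The dependence on the number of trait states follows directly from the upper bound of the Shannon Index. As a short worked example, assuming natural logarithms and writing k for the number of trait states and p_i for the proportion of individuals in state i (notation introduced here for illustration):

```latex
H' = -\sum_{i=1}^{k} p_i \ln p_i \;\le\; \ln k,
\qquad
k = 3 \;\Rightarrow\; H'_{\max} = \ln 3 \approx 1.10,
\qquad
k = 10 \;\Rightarrow\; H'_{\max} = \ln 10 \approx 2.30
```

A trait with 3 states therefore cannot exceed H′ ≈ 1.10 regardless of how many species are present, whereas a trait with 10 states can reach values of about 2.30 when individuals are spread evenly among states.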
The detection of environmental gradients applying functional surrogates
Because other authors hypothesized that functional traits represent evolutionary responses to environmental selective forces (e.g., Southwood 1977, Poff et al. 2006, Heino 2008b), alternative groupings according to functional characteristics of species were expected to be more suitable for detection of environmental gradients than species data. Moreover, testing the effects of the environment on the environmental relations of taxa (functional traits) can theoretically be prone to circular reasoning, resulting in high correlation values. Surprisingly, environmental correlations with data aggregated to functional groups, taxonomic groups, and functional traits were generally lower than for species-level data. This low correlation may be a result of the combination of traits and environmental variables investigated or the fact that functional traits are strongly affected by biotic interactions (Tonn 1990, Poff 1997, Carey and Wahl 2011) and habitat complexity (Heino 2008b). Consequently, it seems to be most reasonable to choose a combination of functional traits from different trophic levels that should preferably reveal low statistical and phylogenetic correlations (Mouillot et al. 2005). Following the results of our study, this set of functional traits could, for instance, be a combination of trophic state of periphyton and life form of macrophytes (primary producers), saprobic state of macroinvertebrates (primary and secondary consumers), and migration type or habitat preferences of fishes (secondary consumers). This approach assures the inclusion of all important foodweb components but avoids autocorrelation.
Effects of numerical resolution
Data transformation had strong effects on the results of ecological analyses, and these effects even exceeded those of taxonomic resolution for almost all data sets and taxonomic groups herein. The same observation has been made in aquatic environments by other authors (Olsgard et al. 1997, Anderson et al. 2005, Heino 2008a), so it is likely a general phenomenon, at least in aquatic ecosystems. In particular, reductions of quantitative information (to relative abundances or presence–absence data) strongly altered multivariate community patterns. This finding suggests that quantitative information can be more important than taxonomic detail if there are gradients in the productivity of the system under study, e.g., in weir-influenced river stretches (data set 1). Untransformed or √(x)-transformed data, in contrast, contain the most information about productivity of habitats. However, classical monitoring protocols are often based on nonquantitative sampling techniques (e.g., periphyton sampling for the European Water Framework Directive, Schaumburg et al. 2007; or the Rapid Bioassessment Protocols of the US Environmental Protection Agency, Barbour et al. 1999). A lack of standardization in sample size makes % abundance or presence–absence data the only choice for data analyses. Consequently, a rethinking of currently applied monitoring techniques may be required. Because financial resources for monitoring programs are typically limited, effort expended on taxonomic detail often could be better spent on quantitative sampling techniques and the consideration of multiple taxonomic groups, at least if the main objective is the monitoring of ecosystem changes rather than the conservation of specific rare species. Moreover, the effects of numerical resolution on the applicability of TS have to be considered. Because of the more pronounced effects of taxonomic aggregation on presence–absence data and % abundance data (Fig. 1), the application of TS may be less appropriate if quantitative data are not available.
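How strongly a transformation changes a community pattern can be quantified in the spirit of a 2nd-stage comparison: compute a dissimilarity matrix under each transformation and rank-correlate the two matrices. The sketch below assumes Bray–Curtis dissimilarity and Spearman rank correlation, neither of which is named in this section, and uses a hypothetical abundance table; it illustrates the principle and is not the analysis pipeline of the study.

```python
import math
from itertools import combinations


def bray_curtis(x, y):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    num = sum(abs(a - b) for a, b in zip(x, y))
    den = sum(a + b for a, b in zip(x, y))
    return num / den if den else 0.0


def dissimilarities(samples):
    """Lower-triangle dissimilarities for all sample pairs, as a flat list."""
    return [bray_curtis(a, b) for a, b in combinations(samples, 2)]


def ranks(values):
    """Average ranks; tied values share the mean of their rank positions."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(u, v):
    """Spearman rank correlation; returns nan if either input is constant."""
    ru, rv = ranks(u), ranks(v)
    mu, mv = sum(ru) / len(ru), sum(rv) / len(rv)
    cov = sum((a - mu) * (b - mv) for a, b in zip(ru, rv))
    var_u = sum((a - mu) ** 2 for a in ru)
    var_v = sum((b - mv) ** 2 for b in rv)
    if var_u == 0 or var_v == 0:
        return float("nan")
    return cov / math.sqrt(var_u * var_v)


TRANSFORMS = {
    "raw": lambda s: list(s),
    "sqrt": lambda s: [math.sqrt(v) for v in s],
    "relative": lambda s: [v / sum(s) for v in s],
    "presence_absence": lambda s: [1.0 if v > 0 else 0.0 for v in s],
}


def second_stage(samples, name_a, name_b):
    """Rank correlation between dissimilarity matrices of two transformations."""
    da = dissimilarities([TRANSFORMS[name_a](s) for s in samples])
    db = dissimilarities([TRANSFORMS[name_b](s) for s in samples])
    return spearman(da, db)


# Hypothetical abundance table: rows are samples, columns are taxa.
SAMPLES = [[120, 30, 0, 5], [100, 25, 2, 0], [10, 3, 0, 1], [8, 0, 4, 2]]

if __name__ == "__main__":
    for name in TRANSFORMS:
        print(name, second_stage(SAMPLES, "raw", name))
```

The first two samples in the toy table differ from the last two mainly in total abundance, so transformations that discard totals (relative abundance, presence–absence) would be expected to correlate less well with the untransformed pattern than the √(x) transformation does.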
Our study for the first time provides statistical threshold levels for the application of TS for ecological analyses in 4 different freshwater groups. The results of our study suggest that TS can be applied up to family or order level for macroinvertebrates and periphyton (Δ+ species–phylum > 0.77), whereas fishes and macrophytes (Δ+ species–phylum < 0.68) should be identified to genus and species level. However, for investigating the effects of environmental changes based on species-specific tolerances (e.g., water-quality determination; Lenat and Resh 2001), the use of species-level data appears generally advantageous. Because the applicability of TS was higher in data sets from impaired systems and with large spatial scale in our and in other studies (e.g., Bevilacqua et al. 2012), TS may be of great relevance for environmental-impact assessments, monitoring, and efficiency control of restoration measures. The strong impact of numerical data resolution on the outcome of ecological analysis suggests investing effort in quantitative data and number of spatial and temporal replicates rather than in taxonomic detail. The consideration of functional traits as additional descriptive variables, e.g., plotted on taxonomic data as vectors in multivariate statistics (Mueller et al. 2011), is a more integrative approach to analyze interactions between taxonomic composition, environmental conditions, and ecosystem functions than using functional traits as input variables for NMDS.
We acknowledge support to MM by the Technische Universität München Graduate School and a doctoral scholarship of UniBayern e.V.
- Anderson, M. J, S. D. Connell, B. M Gillanders, C. E Diebel, W. M Blom, J. E Saunders, and T. J Landers. 2005. Relationships between taxonomic resolution and spatial scales of multivariate variation. Journal of Animal Ecology 74:636–646.
- Anderson, M. J, R. N Gorley, and K. R Clarke. 2008. PERMANOVA+ for PRIMER: guide to software and statistical methods. PRIMER-E, Plymouth, UK.
- Bailey, R. C, R. H Norris, and T. B Reynoldson. 2001. Taxonomic resolution of benthic macroinvertebrate communities in bioassessments. Journal of the North American Benthological Society 20:280–286.
- Balmford, A, A. H. M Jayasuriya, and M. J. B Green. 1996. Using higher taxon richness as a surrogate for species richness: II. Local applications. Proceedings of the Royal Society of London Series B: Biological Sciences 263:1571–1575.
- Barbour, M. T, J Gerritsen, B. D Snyder, and J. B Stribling. 1999. Rapid bioassessment protocols for use in streams and wadeable rivers: periphyton, benthic macroinvertebrates, and fish. 2nd edition. EPA 841-B-99-002. US Environmental Protection Agency, Washington, DC.
- Bêche, L. A, and B Statzner. 2009. Richness gradients of stream invertebrates across the USA: taxonomy- and trait-based approaches. Biodiversity and Conservation 18:3909–3930.
- Bellier, E, V Grotan, S Engen, A. K Schartau, O. H Diserud, and A. G Finstad. 2012. Combining counts and incidence data: an efficient approach for estimating the log-normal species abundance distribution and diversity indices. Oecologia (Berlin) 170:477–488.
- Bevilacqua, S, A Terlizzi, J Claudet, S Fraschetti, and F Boero. 2012. Taxonomic relatedness does not matter for species surrogacy in the assessment of community responses to environmental drivers. Journal of Applied Ecology 49:357–366.
- Blanche, K. R, A. N Andersen, and J. A Ludwig. 2001. Rainfall-contingent detection of fire impacts: responses of beetles to experimental fire regimes. Ecological Applications 11:86–96.
- Buss, D. F, and A. S Vitorino. 2010. Rapid bioassessment protocols using benthic macroinvertebrates in Brazil: evaluation of taxonomic sufficiency. Journal of the North American Benthological Society 29:562–571.
- Cadotte, M. W, K Carscadden, and N Mirotchnick. 2011. Beyond species: functional diversity and the maintenance of ecological processes and services. Journal of Applied Ecology 48:1079–1087.
- Cagnolo, L, S. I Molina, and G. R Valladares. 2002. Diversity and guild structure of insect assemblages under grazing and exclusion regimes in a montane grassland from Central Argentina. Biodiversity and Conservation 11:407–420.
- Carey, M. P, and D. H Wahl. 2011. Determining the mechanism by which fish diversity influences production. Oecologia (Berlin) 167:189–198.
- Chainho, P, M. F Lane, M. L Chaves, J. L Costa, M. J Costa, and D. M Dauer. 2007. Taxonomic sufficiency as a useful tool for typology in a poikilohaline estuary. Hydrobiologia 587:63–78.
- Clarke, K. R, and R. N Gorley. 2006. PRIMER v6: user manual/tutorial. 2nd edition. PRIMER-E, Plymouth, UK.
- Clarke, K. R, and R. M Warwick. 2001. Change in marine communities: an approach to statistical analysis and interpretation. PRIMER-E, Plymouth, UK.
- Demsar, J. 2006. Statistical comparison of classifiers over multiple data sets. Journal of Machine Learning Research 7:1–30.
- Drew, L. W. 2011. Are we losing the science of taxonomy? BioScience 61:942–946.
- Dudgeon, D, A. H Arthington, M. O Gessner, Z Kawabata, D. J Knowler, C Lévêque, R. J Naiman, A Prieur-Richard, D Soto, M. L. J Stiassny, and C. A Sullivan. 2006. Freshwater biodiversity: importance, threats, status and conservation challenges. Biological Reviews 81:163–182.
- Ellis, D. 1985. Taxonomic sufficiency on pollution assessment. Marine Pollution Bulletin 16:459.
- Ferraro, S. P, and F. A Cole. 1990. Taxonomic level and sample size sufficient for assessing pollution impacts on the Southern California Bight macrobenthos. Marine Ecology Progress Series 67:251–262.
- Geist, J. 2011. Integrative freshwater ecology and biodiversity conservation. Ecological Indicators 11:1507–1516.
- Giangrande, A. 2003. Biodiversity, conservation, and the “taxonomic impediment”. Aquatic Conservation: Marine and Freshwater Ecosystems 13:451–459.
- Hansson, L, M Gyllström, A Ståhl-Delbanco, and M Svensson. 2004. Responses to fish predation and nutrients by plankton at different levels of taxonomic resolution. Freshwater Biology 49:1538–1550.
- Heino, J. 2008a. Influence of taxonomic resolution and data transformation on biotic matrix concordance and assemblage-environment relationships in stream macroinvertebrates. Boreal Environment Research 13:359–369.
- Heino, J. 2008b. Patterns of functional biodiversity and function–environment relationships in littoral macroinvertebrates. Limnology and Oceanography 53:1446–1455.
- Heino, J, H Mykrä, and J Kotanen. 2008. Weak relationships between landscape characteristics and multiple facets of stream macroinvertebrate biodiversity in a boreal drainage basin. Landscape Ecology 23:417–426.
- Heino, J, and J Soininen. 2007. Are higher taxa adequate surrogates for species-level assemblage patterns and species richness in stream organisms? Biological Conservation 137:78–89.
- Johnson, R. K, D Hering, M Furse, and R. T Clarke. 2006. Detection of ecological change using multiple organism groups: metrics and uncertainty. Hydrobiologia 566:115–137.
- Jones, F. C. 2008. Taxonomic sufficiency: the influence of taxonomic resolution on freshwater bioassessments using benthic macroinvertebrates. Environmental Reviews 16:45–69.
- Landeiro, V. L, L. M Bini, F. R. C Costa, E Franklin, A Nogueira, J. L. P Souza, J Moraes, and W. E Magnusson. 2012. How far can we go in simplifying biomonitoring assessments? An integrated analysis of taxonomic surrogacy, taxonomic sufficiency and numerical resolution in a megadiverse region. Ecological Indicators 23:366–373.
- Lenat, D. R, and V. H Resh. 2001. Taxonomy and stream ecology: the benefits of genus- and species-level identifications. Journal of the North American Benthological Society 20:287–298.
- Listgarten, J, and A Emili. 2005. Statistical and computational methods for comparative proteomic profiling using liquid chromatography-tandem mass spectrometry. Molecular and Cellular Proteomics 4:419–434.
- Losos, J. B, M Leal, R. E Glor, K de Queiroz, P. E Hertz, S. L Rodriguez, L. A Chamizo, T. R Jackman, and A Larson. 2003. Niche lability in the evolution of a Caribbean lizard community. Nature 424:542–545.
- Mandelik, Y, T Dayan, V Chikatunov, and V Kravchenko. 2007. Reliability of a higher-taxon approach to richness, rarity, and composition assessments at the local scale. Conservation Biology 21:1506–1515.
- Maurer, D. 2000. The dark side of taxonomic sufficiency (TS). Marine Pollution Bulletin 40:98–101.
- Millennium Ecosystem Assessment. 2005. Ecosystems and human well-being: health synthesis. Washington, DC.
- Mouillot, D, W. H. N Mason, O Dumay, and J. B Wilson. 2005. Functional regularity: a neglected aspect of functional diversity. Oecologia (Berlin) 142:353–359.
- Mueller, M, J Pander, and J Geist. 2011. The effects of weirs on structural stream habitat and biological communities. Journal of Applied Ecology 48:1450–1461.
- Olsgard, F, P. J Somerfield, and M. R Carr. 1997. Relationships between taxonomic resolution and data transformations in analyses of a macrobenthic community along an established pollution gradient. Marine Ecology Progress Series 149:173–181.
- Olsgard, F, P. J Somerfield, and M. R Carr. 1998. Relationships between taxonomic resolution, macrobenthic community patterns and disturbance. Marine Ecology Progress Series 172:25–36.
- Pielou, E. C. 1975. Ecological diversity. John Wiley and Sons, New York.
- Poff, N. L. 1997. Landscape filters and species traits: toward mechanistic understanding and prediction in stream ecology. Journal of the North American Benthological Society 16:391–409.
- Poff, N. L, J. D Olden, N. K. M Vieira, D. S Finn, M. P Simmons, and B Kondratieff. 2006. Functional trait niches of North American lotic insects: trait-based ecological applications in light of phylogenetic relationships. Journal of the North American Benthological Society 25:730–755.
- Rosenzweig, M. L, and E. A Sandlin. 1997. Species diversity and latitudes: listening to area’s signal. Oikos 80:172–176.
- Sajan, S, T. V Joydas, and R Damodaran. 2010. Depth-related patterns of meiofauna on the Indian continental shelf are conserved at reduced taxonomic resolution. Hydrobiologia 652:39–47.
- Schaumburg, J, C Schranz, D Stelzer, and G Hofmann. 2007. Handlungsanweisung für die ökologische Bewertung von Seen zur Umsetzung der EU-Wasserrahmenrichtlinie: Makrophyten und Phytobenthos. Bayerisches Landesamt für Umwelt, Augsburg, Germany.
- Shannon, C. E, and W Weaver. 1949. The mathematical theory of communication. University of Illinois Press, Urbana, Illinois.
- Siefert, A. 2012. Incorporating intraspecific variation in tests of trait-based community assembly. Oecologia (Berlin) 170:767–775.
- Simpson, E. H. 1949. Measurement of diversity. Nature 163:688.
- Somerfield, P. J, and K. R Clarke. 1995. Taxonomic levels, in marine community studies, revisited. Marine Ecology Progress Series 127:113–119.
- Southwood, T. R. E. 1977. Habitat, the templet for ecological strategies. Journal of Animal Ecology 46:337–365.
- Stammel, B, B Cyffka, J Geist, M Müller, J Pander, G Blasch, P Fischer, A Gruppe, F Haas, M Kilg, P Lang, R Schopf, A Schwab, H Utschick, and M Weißbrod. 2012. Floodplain restoration on the Upper Danube (Germany) by re-establishing water and sediment dynamics: a scientific monitoring as part of the implementation. River Systems 20/1–2:55–70.
- Strayer, D. L, and D Dudgeon. 2010. Freshwater biodiversity conservation: recent progress and future challenges. Journal of the North American Benthological Society 29:344–358.
- Sweeney, B. W, J. M Battle, J. K Jackson, and T Dapkey. 2011. Can DNA barcodes of stream macroinvertebrates improve descriptions of community structure and water quality? Journal of the North American Benthological Society 30:195–216.
- Tonn, W. M. 1990. Climate change and fish communities: a conceptual framework. Transactions of the American Fisheries Society 119:337–352.
- Townsend, C. R, and A. G Hildrew. 1994. Species traits in relation to a habitat templet for river systems. Freshwater Biology 31:265–275.
- Usseglio-Polatera, P, M Bournaud, P Richoux, and H Tachet. 2000. Biological and ecological traits of benthic freshwater macroinvertebrates: relationships and definition of groups with similar traits. Freshwater Biology 43:175–205.
- Warwick, R. M. 1993. Environmental impact studies on marine communities – pragmatic considerations. Australian Journal of Ecology 18:63–80.