In brief: Who is in for creating an Open Trait Data repository?
At this very moment, at least 10 researchers (mostly undergrads) are compiling trait data for some exciting analysis. That includes me. In fact, most trait analyses are hampered by the quality of the trait data: values are often lumped at the species level and hence do not capture natural variation, or the information for a species comes from just one population, in the hope that it is representative. Paradoxically, I think this trait data is very abundant, just not available. Thousands of researchers have measurements of, for example, body size for a bunch of specimens of their preferred taxa. This data is simply not accessible, or is scattered across the net.
There are some databases (some open, some not) with traits for some groups (plants, birds and mammals), but there is no joint effort to capture all this knowledge in the way the GenBank initiative does. So I propose to create a TraitBank. The technology is easy to implement (anything from a SQL server linked to a website to a simple Google spreadsheet), but the key would be to get the community on board, so that trait data deposition is encouraged upon manuscript acceptance. Do you think the leading journals would ask authors to deposit any morphological or life history measurement reported in a paper? It will also be important that a well-known independent organisation hosts the data. Any ideas on who to contact? Would Figshare be an option?
The fields should be tightly defined to allow easy searching and compilation of information; as a first pass I would propose:
– Publication associated with the data and/or author
– Species taxonomy (full taxonomy can be retrieved from ITIS)
– Whether the measurement comes from wild or captive populations
– Region and Lat/Long of the measurement
– Category (morphological, life history, or ecological trait)
– Subcategory (e.g. body mass; clutch size; survival; phenology…)
– Mean value, SE and n: Units should be fixed by the subcategory.
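To make the field list a bit more concrete, here is a minimal sketch of what a single record could look like in Python. Everything in it is an assumption for illustration: the class name `TraitRecord`, the field names, and the small `UNITS` table that ties each subcategory to one unit are not an agreed standard.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical lookup: each subcategory fixes its own unit.
UNITS = {"body mass": "g", "clutch size": "count"}

# The three categories proposed in the list above.
CATEGORIES = {"morphological", "life history", "ecological"}


@dataclass
class TraitRecord:
    source: str          # publication and/or author
    species: str         # binomial; full taxonomy could be looked up elsewhere
    captive: bool        # False = wild population
    region: str
    lat: float
    lon: float
    category: str
    subcategory: str
    mean: float
    se: Optional[float]  # may be unknown
    n: int

    def __post_init__(self):
        # Reject records that do not fit the controlled vocabulary or ranges.
        if self.category not in CATEGORIES:
            raise ValueError(f"unknown category: {self.category!r}")
        if self.subcategory not in UNITS:
            raise ValueError(f"unknown subcategory: {self.subcategory!r}")
        if not (-90 <= self.lat <= 90 and -180 <= self.lon <= 180):
            raise ValueError("latitude/longitude out of range")
        if self.n < 1:
            raise ValueError("n must be at least 1")
        if self.se is not None and self.se < 0:
            raise ValueError("SE cannot be negative")

    @property
    def unit(self) -> str:
        # The unit is derived from the subcategory, never typed by hand.
        return UNITS[self.subcategory]
```

Deriving the unit from the subcategory, rather than storing it per record, is one way to keep values comparable across contributors.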
A form and an option to upload a large CSV should be enough. An API that allows connecting from R would be a blast. So how can we move this idea forward?
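As one possible starting point for the CSV upload, the sketch below checks an uploaded file before anything is stored. The function name `check_upload` and the column names are hypothetical and simply mirror the field list above; a real service would need stricter checks.

```python
import csv
import io

# Assumed column names, one per proposed field.
REQUIRED = ["source", "species", "captive", "region", "lat", "lon",
            "category", "subcategory", "mean", "se", "n"]


def check_upload(text):
    """Return (good_rows, errors) for the text of an uploaded CSV."""
    reader = csv.DictReader(io.StringIO(text))
    missing = [c for c in REQUIRED if c not in (reader.fieldnames or [])]
    if missing:
        return [], ["missing columns: " + ", ".join(missing)]

    good, errors = [], []
    for row in reader:
        try:
            # These fields must parse as numbers; SE is allowed to be empty.
            float(row["lat"])
            float(row["lon"])
            float(row["mean"])
            int(row["n"])
        except (TypeError, ValueError):
            # TypeError covers short rows, where DictReader fills in None.
            errors.append(f"line {reader.line_num}: non-numeric value")
            continue
        good.append(row)
    return good, errors
```

Returning the rejected lines together with the accepted rows lets a contributor fix a large file in one pass instead of re-uploading after every error.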