In brief: Who is in to create an Open Trait Data repository?

At this very moment at least 10 researchers (mostly undergrads) are compiling trait data for some exciting analysis. That includes me. In fact, most trait analyses are hampered by the quality of the trait data: values are often lumped to the species level, and hence do not capture natural variation, or the information for a species is based on just one population in the hope that it is representative. Paradoxically, I think this trait data is very abundant, but not available. Thousands of researchers have measurements of, for example, body size for a bunch of specimens of their preferred taxa. This data is just not accessible, or it is scattered across the net.

There are some databases (some open, some not) with traits for some groups (plants, birds and mammals), but there is no joint effort to capture all this knowledge like the GenBank initiative. So I propose to create a TraitBank. The technology is easy to implement (anything from a SQL server linked to a website to a simple Google spreadsheet), but the key aspect would be to get the community on board so that trait data deposition is encouraged upon manuscript acceptance. Do you think that the leading journals would ask authors to deposit any morphological or life history measurement reported in the paper? It will also be important that a well-known independent organisation hosts the data. Any ideas on who to contact? Would Figshare be an option?

The fields should be tightly defined to allow easy searching and compilation of information; as a first pass I would propose:

– Publication associated with the data and/or author
– Species taxonomy (full taxonomy can be retrieved from ITIS)
– Whether the measurement is from wild or captive populations
– Region and Lat/Long of the measurement
– Category (morphological; life history; or ecological trait)
– Subcategory (e.g. body mass; clutch size; survival; phenology…)
– Mean value, SE and n: Units should be fixed by the subcategory.
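As a rough sketch of how those fields could look as a single record, here is one possible layout in Python. Everything named here (`TraitRecord`, `UNITS`, `CATEGORIES`, the field names and the example values) is my own assumption for illustration, not an agreed standard; the point is only that each subcategory fixes its unit and that obviously wrong entries are rejected early.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical mapping: each subcategory fixes its own unit.
UNITS = {
    "body mass": "g",
    "clutch size": "count",
    "survival": "proportion",
}

# The three top-level categories from the list above.
CATEGORIES = {"morphological", "life history", "ecological"}


@dataclass(frozen=True)
class TraitRecord:
    publication: str      # citation or DOI, and/or author
    species: str          # binomial; full taxonomy can be looked up elsewhere
    wild: bool            # True = wild population, False = captive
    region: str
    latitude: float       # decimal degrees
    longitude: float      # decimal degrees
    category: str
    subcategory: str
    mean: float
    se: Optional[float]   # None when no standard error was reported
    n: int

    def __post_init__(self):
        # Reject records that cannot be valid, whatever the taxon.
        if self.category not in CATEGORIES:
            raise ValueError(f"unknown category: {self.category!r}")
        if self.subcategory not in UNITS:
            raise ValueError(f"unknown subcategory: {self.subcategory!r}")
        if not -90 <= self.latitude <= 90:
            raise ValueError("latitude must be between -90 and 90")
        if not -180 <= self.longitude <= 180:
            raise ValueError("longitude must be between -180 and 180")
        if self.n < 1:
            raise ValueError("n must be at least 1")
        if self.se is not None and self.se < 0:
            raise ValueError("SE cannot be negative")

    @property
    def unit(self) -> str:
        # The unit is never typed in by the depositor; it follows from the subcategory.
        return UNITS[self.subcategory]


# Made-up values, only to show the shape of a record.
example = TraitRecord(
    publication="Author et al. (placeholder citation)",
    species="Parus major",
    wild=True,
    region="Iberian Peninsula",
    latitude=41.4,
    longitude=2.2,
    category="morphological",
    subcategory="body mass",
    mean=17.5,
    se=0.4,
    n=12,
)
print(example.unit)
```

Keeping the unit out of the record itself is a deliberate choice in this sketch: if depositors cannot type a unit, they cannot type the wrong one.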

A form and an option to upload a large CSV should be enough. An API that allows connecting from R would be a blast. So how can we move this idea forward?
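To give an idea of what the bulk CSV upload could involve, here is a minimal sketch using only the Python standard library. The column names in `REQUIRED_COLUMNS` and the function `read_upload` are hypothetical; a real service would have to agree on them first. The sketch checks that the header is complete, converts the numeric fields, and reports bad rows by line number instead of silently dropping them.

```python
import csv
import io

# Hypothetical column names for an uploaded file.
REQUIRED_COLUMNS = [
    "publication", "species", "wild", "region", "latitude", "longitude",
    "category", "subcategory", "mean", "se", "n",
]


def read_upload(text):
    """Parse uploaded CSV text into (rows, errors).

    rows   : list of dicts with mean, se and n converted to numbers
    errors : list of (line_number, message) for rows that failed conversion
    """
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []
    missing = [col for col in REQUIRED_COLUMNS if col not in header]
    if missing:
        raise ValueError("missing columns: " + ", ".join(missing))

    rows, errors = [], []
    for row in reader:
        line_no = reader.line_num  # line of the file the row was read from
        try:
            row["mean"] = float(row["mean"])
            row["n"] = int(row["n"])
            # An empty SE cell means "not reported".
            row["se"] = float(row["se"]) if row["se"] else None
        except (TypeError, ValueError) as exc:
            errors.append((line_no, str(exc)))
            continue
        rows.append(row)
    return rows, errors


# Made-up upload: one good row and one with a non-numeric mean.
sample = (
    "publication,species,wild,region,latitude,longitude,"
    "category,subcategory,mean,se,n\n"
    "placeholder,Parus major,wild,Iberia,41.4,2.2,"
    "morphological,body mass,17.5,0.4,12\n"
    "placeholder,Parus major,wild,Iberia,41.4,2.2,"
    "morphological,body mass,heavy,,3\n"
)
rows, errors = read_upload(sample)
print(len(rows), "rows accepted;", len(errors), "rejected")
```

Returning the rejected rows with their line numbers means the depositor can fix the file and upload it again, rather than guessing what went wrong.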


6 thoughts on “TraitBank”

  1. Hey Ignasi, I like the idea! Just a few comments:

    -Database: MySQL or some other SQL variant is a good idea; perhaps PostgreSQL would be even better (totally open source).
    -Hosting: I’m not sure Figshare is the right place. It would be great for folks just bulk downloading, but not for building in any type of input or output interface. Hosting the database either on university resources or on something like Amazon’s EC2 or Heroku would be best.
    -Data input: this could take a significant amount of time, since people tend to type data in very differently. I guess you could require a standard format and then, after upload but before adding to the database, run some Python/R scripts on the data to make sure it is cleaned up.
    -API: This wouldn’t be too hard. I am learning how to do this now, in Ruby on Rails. However, maintaining the API and making it easy to use, with documentation, would require more than just a few hours here and there.

    Anyway, great idea. I think if there was some money to support this all these things could be built.


  2. I think that doing something like this for traits is an awesome idea. As you say there are a number of the pieces out there already, but many of them are not meaningfully open and they are generally divided along taxonomic lines. Our group would definitely be interested in helping to contribute to and use a database like this one.

  3. Hey, we’re working on this already at the Encyclopedia of Life! We’re even calling it TraitBank ;-) We’re building on our existing infrastructure (Ruby on Rails, SQL) but adding a Virtuoso triple store. We’ll enhance our existing APIs (note there’s already an R package, Reol, that should tap into this for R users) and connect (I hope) with crowd-friendly platforms like Wikidata & Freebase, but also with data repositories like Dryad. Let me know if you want to be invited to the private beta, which should be released in the next couple of months.

  4. Yes Please!

    I have had more thoughts on TraitBank lately, and in fact I was thinking of putting together an application for the next EOL grant, so it is great that you are already on it. I would be pleased to contribute if possible.

