About Data

I recently ran a workshop on data for PhD students. It was a great opportunity to order my thoughts and put together a lot of good resources. All the material used (with lots of links) is available on GitHub: https://github.com/ibartomeus/Data. Topics range from hardcore R code to read, clean and transform data, to discussions about data sharing and reproducibility. Few of the PhD students had thought about sharing their data before the workshop; I hope I convinced some of them, though I am not sure about that. However, I clearly convinced them to adopt reproducible workflows and ditch Excel, which is a win.

In the last few years I have worked with lots of different data (mine and from others, big and small), and I have to say most datasets were poorly formatted, maintained and documented. I think we do not give enough importance to data curation practices, even though they are central to the scientific process. The feedback from the students was that the workshop was very enlightening and that they had never received formal or informal advice on how to deal with data. If things are going to change, we need to make an active effort to train the new generation of scientists in an open data culture.


