Preferring a preference index II: null models

This is a guest post by my PhD student Miguel Ángel Collado. My last post on preferring a preference index was not satisfactory to us, and we now have better options. Read Miguel Ángel's solution below.


We are working on the ecological value of various habitats or sites. In addition to the classical biodiversity indexes, we want to know whether some sites that are not especially diverse nevertheless have ecologically important species attached to them. We could measure this through preference analyses, using null models to compare with our data.

We can define “preference” for a species when its presence at a given site is higher than expected at random. One way to test this is to compare the data with null models, establishing an upper threshold for preference and a lower one for avoidance. This way we would know whether some species of interest have affinity for some sites or just use them as expected.
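The idea of upper and lower thresholds can be sketched in a few lines. This is only a minimal illustration in Python, not the analysis from the full post: the function name `classify_preference`, the site names, the counts, and the choice of a null model that spreads records across sites in proportion to a site weight (for example sampling effort) are all assumptions made for the example.

```python
import random

def classify_preference(observed, site_weights, n_sims=2000, alpha=0.05, seed=1):
    """Compare one species' counts per site with a simple null model.

    observed: dict site -> number of records of the species (hypothetical data)
    site_weights: dict site -> relative sampling effort or overall abundance
    """
    rng = random.Random(seed)
    sites = list(observed)
    weights = [site_weights[s] for s in sites]
    total = sum(observed.values())

    # Null model (an assumption): each record lands on a site with
    # probability proportional to the site's weight.
    sims = {s: [] for s in sites}
    for _ in range(n_sims):
        draw = rng.choices(sites, weights=weights, k=total)
        for s in sites:
            sims[s].append(draw.count(s))

    result = {}
    for s in sites:
        null = sorted(sims[s])
        # Empirical lower and upper thresholds of the null distribution.
        lower = null[int((alpha / 2) * n_sims)]
        upper = null[int((1 - alpha / 2) * n_sims) - 1]
        if observed[s] > upper:
            result[s] = "preference"
        elif observed[s] < lower:
            result[s] = "avoidance"
        else:
            result[s] = "as expected"
    return result

# Hypothetical counts of one species across three equally sampled habitats.
observed = {"meadow": 40, "forest": 5, "orchard": 15}
effort = {"meadow": 1, "forest": 1, "orchard": 1}
result = classify_preference(observed, effort)
print(result)
```

With equal effort, each site is expected to receive about a third of the 60 records, so a count far above or below that falls outside the null thresholds. A different null model (for instance one that shuffles whole community matrices) would change the thresholds but not the logic.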

To see an example of this


Food consumption and global change

A conversation today at lunch time made me think about some notes I took on this topic, which I reproduce here:

Jonathan Foley gave a pretty convincing talk at ESA 2013 showing that meat consumption is unsustainable for the environment (i.e. land use + CO2 emissions). This was “the straw that broke the camel’s back”* for me and since then I have reduced my meat consumption quite drastically.

However, a few days ago I read this paper showing that swapping meat for vegetables and fruits can be even worse if you also take into account water footprint and energy use (e.g. transport and storage). I skip the details, but the bottom line is that the story is complicated and the best way to save the world is to reduce calorie intake and eat lots of grains. Here is Figure 2 from the paper (the paper style and figures are quite poor, by the way).

Tom_2015flexiterian_pdf__page_5_of_12_.png

It’s hard because even if you want to do the best thing, it is not easy to know what that is. Is it better for the environment to use bacon or eggplant with my pasta? No idea!**. If I knew the Y axis of the following graph, things would be easier.

Blank.png

*This is what Google suggests for translating “la gota que colma el vaso”.

**Is the bacon from pigs next door? Is the eggplant from Nicaragua?

Climate change, phenology match and the big unknown

This year was crazy in Seville, with plants flowering 2-3 months earlier than last year. So we went to sample, and guess what: bees were there too. Although expectations about phenological “mismatch” are raised here and there, we don’t find a big phenological mismatch between plants and pollinators*. I am not talking here about specific species, but taking a community approach. However, this is not the end of the story. It is good that plants and pollinators are in sync, but this alone doesn’t guarantee healthy ecosystem functioning.

Why not? My main worry is that after a mild January and beginning of February, we now have “normal cold days” again. Consequently, we also find little bee activity (today we are sampling at 14ºC just to make sure this is true). Hence, both plants and bees are likely to suffer. The demographic implications of this are hard to predict. Maybe it is not a big deal if it happens only one year, but if it happens often, I presume it can be quite bad. All in all it’s hard to quantify, but I suspect that we need to go back to population dynamics if we want to understand climate change impacts beyond phenological overlaps.

*Don’t take my word for it: there are plenty of good papers showing it (here and here), including my own (here and here), and very few showing a clear mismatch, most of those on specialized systems.

Ecoflor 2016

Ecoflor is an annual Spanish meeting on everything related to flowers (from evolution to pollinators). The level is amazingly high for a small “unorganized” local meeting, and the most important part is that it is a fun forum to discuss crazy ideas, and not just finished work. Here are some of the things I learnt this year, in no particular order:

  • You can do biogeography using Arabidopsis thaliana. Moreover, flowering time can be regulated by photoperiod or vernalization and you can map the responsible genes across regions (by X. Picò).
  • Plants can cooperate or be selfish depending on their genotype (by R. Torices).
  • The coolest talk was on epigenetics, which can redirect the course of evolution, with experimental data on radish exposed to herbivory (by M. Sobral).
  • Invasive Oxalis pes-caprae was thought to have only one morph in its invasive range and hence to reproduce only vegetatively, but the second morph has arrived (and it’s here to stay) (by S. Castro).
  • Plant-pollinator networks can be plotted better than with bipartite (by J. Galeano).
  • And it was the first time one of my students talked in public. Definitely a great talk by Miguel Angel Collado on pollinator habitat preferences.

Next year it will be in Seville, join us*!

*You probably need to know some Spanish, but some talks are always in English and all slides are in English.

Fun Data for teaching R

I’ll be running an R course soon and I am looking for fun (public) datasets to use in data manipulation and visualization. I would like to use a single dataset that has some easy variables for the first days, but also some more challenging ones for the final days. And when I set exercises, I want the students* to be curious about finding out the answer.

[*in this case students are not ecologists]

Ideas:

-Movies. How many movies has Woody Allen made? Is the number of movies per year increasing linearly or exponentially? That is a good theme with lots of options. IMDB releases some data, and processing their terribly formatted txt files and assembling them would be an excellent exercise for an advanced class, but not for beginners. OMDB has an API to make searches and if you donate you can get the full database. And of course, there is an R package to use the API. This is a better option for beginners.

-Music. Everyone likes music and there are 300 GB of data here. You can also get just a chunk, but even 2 GB of data is probably too much for beginners.

-Football: I discarded this one for me because I know nothing about it, but I am sure it will be highly popular in Spain. An open database here.

-Kaggle datasets are also awesome. To download them you just have to register. I may use the baby names per year and US state. Everyone is curious about the most popular name in the year they were born, for example.

-Earthquakes: This one also needs some parsing of the txt files (easier than IMDB) and will make for pretty visualizations.

-Datasets already in R: Along with the classic datasets on Iris flowers (used by Fisher!) or the cars dataset, there are cooler options. For example, there are lots of datasets for econometrics (some are curious), and RStudio also released some cool ones recently (e.g. flights).

-Other: The Internet is full of data like real time series, lots of small data examples, M&M’s colors by bag, Jeopardy questions, Marvel social networks, Dolphins social networks, …

Please, add your ideas in the comments, especially if you have used them with success for teaching R. Thanks!

 

 

Where are the kids born in December?

This is the question Xavier Sala i Martín asked on a Catalan TV show about economics (yes, pretty cool that you can talk about that on TV!). In a nutshell, he described the relative age effect: a pattern in which most elite football and hockey players are born in the first 6 months of the year, because kids born in the same year are put to compete together and the older ones are bigger and stronger. Coaches then dedicate more time to them, and by the time physical capabilities have evened out among all kids born the same year, kids born in January have trained more, received more positive reinforcement, etc…

But he did not answer where the kids born in December are. I speculated that those “bad” at sports would have more time to do arts, like playing music. Let’s test the hypothesis! I found a list of Musicians by birthday in Wikipedia and @vgaltes scraped it for me*. Amazing! 57% of musicians in Wikipedia were born in the last 6 months of the year (yes, a chi-square test is highly significant with this sample size), and January is the only month that goes against our prediction.

Rplot03

Each bar is the number of musicians per month, starting in January. The black line is the expected number. Sorry for the terrible graph with no axes.

We should have stopped here. Publish it and be famous. Unfortunately, we got excited. @vgaltes found this web page with lots of birthday summaries by profession, and by eyeballing the numbers there is no clear pattern for musicians. Then @dukjb started pointing out that we should correct for the number of days that each month has and, more importantly, for the natural birth rate per month, which is likely not uniform. Then we lost momentum, we got distracted by other things, and the conversation faded out. But at least we got some fun, no excuse for being bad at sports**, and this post!
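For the record, the first correction (days per month) is easy to sketch. The snippet below is a Python illustration under stated assumptions, not what we actually ran: the monthly counts are made up (they are NOT the Wikipedia data), and the helper names `expected_by_days` and `chi_square` are hypothetical. Correcting for the natural birth rate would mean replacing the day-based expectation with baseline birth proportions per month, which we did not have at hand.

```python
# Days per month in a non-leap year, January to December.
DAYS = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]

def expected_by_days(total, days=DAYS):
    """Expected births per month if every day of the year were equally likely."""
    year = sum(days)
    return [total * d / year for d in days]

def chi_square(observed, expected):
    """Pearson chi-square statistic for observed vs expected counts."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical monthly counts (January to December), for illustration only.
counts = [80, 70, 75, 78, 82, 79, 95, 98, 92, 97, 90, 96]
expected = expected_by_days(sum(counts))
stat = chi_square(counts, expected)

# Critical value of the chi-square distribution with 11 degrees of freedom
# at the 0.05 level; the statistic must exceed it to reject uniformity.
CRITICAL_11_DF = 19.675
second_half_share = sum(counts[6:]) / sum(counts)
print(round(second_half_share, 3), round(stat, 2), stat > CRITICAL_11_DF)
```

Note that the share of births in the second half of the year and the month-by-month chi-square answer slightly different questions, so it is worth looking at both.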


*I am ashamed, but it would be too time consuming for me to do in R for a side, side, side, side project.

**I was born in early April.