is_invasive( ) got a new best friend: is_native( )

A while ago I wrote a function to check if a species is among the worst invaders. I am really happy this function was used, and rOpenSci put it in a package. However, the function does not tell you if a species is exotic in a particular place, which is what most people want to know. Scott Chamberlain found a couple of other resources for invasive species lists, and we were discussing where to find reliable, complete "lists of exotic species per region", but we were approaching the problem from the wrong angle.

Exotic species lists will always be incomplete and hard to maintain. For most research questions the reverse question works just as well: instead of "is it exotic?", you can ask "is it not native?" Lists of native species are easier to get and more stable through time. Moreover, this framing conveys the message that any non-native species is potentially harmful, rather than restricting attention to "worst invaders" or "known exotic species".

So here it is. It is implemented for animals and plants in the US (using the ITIS database) and for plants in Europe (using Flora Europaea)*. You can use it with the following R code:

install.packages("devtools")
library(devtools)
install_github("ibartomeus/traits")
library("traits")

#make a species list
sp <- c("Lavandula stoechas", "Carpobrotus edulis", "Rhododendron ponticum", "Alkanna lutea", "Anchusa arvensis")

#ask in which countries the first species is native by querying Flora Europaea
fe_native(sp[1])
?fe_native #to see the help page.

#use sapply for querying all species at once
sapply(sp, fe_native, simplify = FALSE)

#ask if the first species is native in a particular region
is_native(sp[1], where = "Islas_Baleares", region = "europe")
?is_native #to see the help page and country names used

#or all species at once
sapply(sp, is_native, where = "Continental US", region = "america")
sapply(sp, is_native, where = "Islas_Baleares", region = "europe")

#for America, is_native() calls the itis_native() function from the taxize package.

The function will be available from ropensci/traits soon, and Scott will probably make it faster and more usable. Let me know if it breaks for any species or if the help pages need clarification, and I will try to fix it.

*If you know/have a list of all native species for your region of interest, we can add it.
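
Just to illustrate what such a check boils down to, here is a minimal sketch using a made-up native list and a hypothetical helper (is_native_custom() is not part of the package; the real function queries ITIS and Flora Europaea):

#hypothetical helper: check species against a user-supplied native list
my_native_list <- c("Lavandula stoechas", "Alkanna lutea", "Anchusa arvensis") #made-up regional list
is_native_custom <- function(species, native_list) {
  #TRUE if the species appears in the regional native list
  species %in% native_list
}
is_native_custom("Carpobrotus edulis", my_native_list) #FALSE
sapply(sp, is_native_custom, native_list = my_native_list)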

Getting the most out of mixed models with random slopes

I use mixed models as a way to find general patterns while integrating different levels of information (i.e. the random effects). Sometimes you only care about the general (fixed) effects, but at other times the variation among levels is also of interest. In that case a random slope model is pretty cool, but making sense of the lmer output is not trivial. Here is code to get the random slopes and CIs of such a model using the iris dataset in R (mainly because I am sure I will need to check this entry myself in the future).

#This shows how to get the random slopes and CI's for each level in a hierarchical model

Is there a general relationship between petal and sepal width? And how does it differ by species?

plot(iris$Sepal.Width ~ iris$Petal.Width, col = iris$Species, las = 1)
#Our model with random slope and intercept
library(lme4)
m2 <- lmer(data = iris, Sepal.Width ~ Petal.Width + (1 + Petal.Width|Species))
#extract fixed effects
a <- fixef(m2)
#extract random effects
b <- ranef(m2, condVar = TRUE)
# Extract the variances of the random effects
qq <- attr(b[[1]], "postVar")
e <- sqrt(qq)
e <- e[2, 2, ] #we want the Petal.Width slope, which is stored in column 2 of b[[1]], hence the [2, 2, ]
#calculate approximate 95% CI's (conditional mean +/- 2 SD)
liminf <- (b[[1]][2] + a[2]) - (e * 2)
mean_ <- (b[[1]][2] + a[2])
limsup <- (b[[1]][2] + a[2]) + (e * 2)
#Plot betas and its errors
dotchart(mean_$Petal.Width, labels = rownames(mean_), cex = 0.5, xlim = c(0.4, 1.4), xlab = "betas")
#add CI's...
for (i in 1:nrow(mean_)){
     lines(x = c(liminf[i,1], limsup[i,1]), y = c(i,i)) 
}
#make final plot
plot(iris$Sepal.Width ~ iris$Petal.Width, col = iris$Species, las = 1)
#and plot each random slope
abline(a = b[[1]][1, 1] + a[1], b = mean_$Petal.Width[1], col = "black")
abline(a = b[[1]][2, 1] + a[1], b = mean_$Petal.Width[2], col = "red")
abline(a = b[[1]][3, 1] + a[1], b = mean_$Petal.Width[3], col = "green")
#and the general (fixed-effect) response
abline(a = a[1], b = a[2], lty = 2)

The code is quick and dirty, I know I could be more consistent, and the plot could be nicer (clip the slopes to each species' range, add SEs), but it should get the idea across.
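
For instance, if you wanted to clip each fitted line to the range of Petal.Width observed for that species, something along these lines should work (a rough sketch reusing the a and b objects created above):

#sketch: redraw the scatterplot with each slope clipped to its species' observed range
plot(iris$Sepal.Width ~ iris$Petal.Width, col = iris$Species, las = 1)
cols <- c("black", "red", "green")
for (i in 1:3) {
  sp_i <- levels(iris$Species)[i]
  xr <- range(iris$Petal.Width[iris$Species == sp_i])
  intercept <- b[[1]][i, 1] + a[1]
  slope <- b[[1]][i, 2] + a[2]
  #draw the fitted line only within the observed x range of this species
  segments(x0 = xr[1], y0 = intercept + slope * xr[1],
           x1 = xr[2], y1 = intercept + slope * xr[2], col = cols[i])
}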

& thanks to Lucas Garibaldi for sharing much of this knowledge with me.

And the gist: https://gist.github.com/ibartomeus/f493bf142073c704391b

About Data

I recently ran a workshop about data with PhD students. It was a great way to order my thoughts and put together a lot of good resources. All the material used (with lots of links) is available on GitHub: https://github.com/ibartomeus/Data. Topics range from hardcore R code to read, clean and transform data, to discussions about data sharing and reproducibility. Few PhD students had thought about sharing their data before the workshop, and I hope I convinced some of them, though I am not sure about that. However, I clearly convinced them to use reproducible workflows and ditch Excel, which is a win.
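
To give a flavour of the R part, here is a toy read-clean-transform snippet of the kind covered in the workshop (the file name and columns are made up for illustration):

#toy example of a scripted read-clean-transform workflow (file and columns are made up)
dat <- read.csv("pollinator_visits.csv", stringsAsFactors = FALSE)
#clean: drop rows with missing species and standardise the species names
dat <- dat[!is.na(dat$species), ]
dat$species <- trimws(tolower(dat$species))
#transform: total visits per species and site
agg <- aggregate(visits ~ species + site, data = dat, FUN = sum)
head(agg)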

In the last years I worked with lots of different data (mine and from others, big and small) and I have to say most datasets were poorly formated, maintained and documented. I think we do not give enough importance to data curating practices, but they are central to the scientific process. The feedback of the students is that the workshop was very enlightening for them and that they never received formal or informal advice on how to deal with data. If things are going to change, we need to do an active effort to form the new generation of scientists in a open data culture.