Mining eBird Data – Comparing Species Count Records

Back in January we looked at the species counts by county from available eBird records.  How does that data compare to other available records?

I’m not sure there is an “official” data set for this information, but the best I could come up with comes from the East Cascades Audubon Society (ECAS).  Drilling into their Birding Sites page, one will find County Checklists, and it is from these checklists that a data set was created.  Thankfully there are only 36 counties in Oregon, because stripping data from a .pdf is notoriously difficult. (Note to editors: a .csv file would make this data more accessible to data junkies.  Heck, even an .xls(x) file would be preferable.)

The official record keeper for Oregon is the Oregon Field Ornithologists, now the Oregon Birding Association.  As far as I am aware they do not keep records at the County level, and none of their data is accessible in any meaningful way for our purposes; again, it is all .pdf files.

The comparison between ECAS records and eBird records (as of January 2013) is presented below.  There are three choropleths (one for ECAS records, one for eBird records and one for the difference between the two) and one bar graph with all three metrics.  The choropleths use a purple – blue – green color scale, low to high, running from 180 (just under the eBird minimum of 182) to 410 (the ECAS maximum).  The choropleths were generated using ggplot2 in ‘R’ and the color scheme is from Color Brewer.  I threw in a fourth map with the County names.
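
For anyone curious how a count lands on a color scale like that, here is a minimal sketch in Python (not the R/ggplot2 code used for the actual maps).  The 180 and 410 end points come from the text above; the function names and the choice of nine color classes are assumptions made for illustration only, and the real maps may well use a continuous gradient instead.

```python
# Scale end points taken from the post; everything else here is illustrative.
SCALE_MIN, SCALE_MAX = 180, 410


def scale_position(count, lo=SCALE_MIN, hi=SCALE_MAX):
    """Return where a species count falls on the scale, from 0.0 (low) to 1.0 (high)."""
    clamped = max(lo, min(hi, count))  # keep out-of-range counts on the scale
    return (clamped - lo) / (hi - lo)


def color_class(count, n_classes=9):
    """Bucket a count into one of n_classes color steps (0 = lowest).

    n_classes=9 is an assumed value, not something stated in the post.
    """
    pos = scale_position(count)
    return min(int(pos * n_classes), n_classes - 1)


if __name__ == "__main__":
    for count in (182, 254, 318, 410):
        print(count, round(scale_position(count), 2), color_class(count))
```

A county sitting at the eBird minimum falls in the lowest class and one at the ECAS maximum in the highest, which is why both data sets can share one legend.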

Summary Statistics:

ECAS Records:

  • Min = 249
  • Mean = 318
  • Max = 410

eBird Records:

  • Min = 182
  • Mean = 254
  • Max = 349

ECAS – eBird (delta):

  • Min = 17
  • Mean = 64
  • Max = 105
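
The three summaries above can be reproduced with a few lines of code once the checklist totals are in a table.  The sketch below is in Python rather than the R used for the maps, and the county names and counts in it are hypothetical placeholders, not the actual ECAS or eBird figures; `summarize` is a helper name made up for this example.

```python
from statistics import mean

# Hypothetical counts for illustration only; NOT the real ECAS or eBird numbers.
ecas = {"County A": 300, "County B": 350, "County C": 410}
ebird = {"County A": 250, "County B": 330, "County C": 320}


def summarize(values):
    """Return the min, rounded mean and max of a collection of counts."""
    values = list(values)
    return {"min": min(values), "mean": round(mean(values)), "max": max(values)}


# The delta is computed county by county before it is summarized.
delta = {county: ecas[county] - ebird[county] for county in ecas}

ecas_stats = summarize(ecas.values())
ebird_stats = summarize(ebird.values())
delta_stats = summarize(delta.values())

if __name__ == "__main__":
    print("ECAS ", ecas_stats)
    print("eBird", ebird_stats)
    print("delta", delta_stats)
```

Because the delta is taken per county, its mean equals the difference of the two means (318 - 254 = 64 in the figures above), but its min and max are not simply the differences of the mins and maxes (249 - 182 = 67, while the smallest delta is 17).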

Of course this leads one to speculate on the causes for the discrepancies.

The most obvious is the time frame embedded in the two data sets.  I assume the ECAS records are based on historical records.  I don’t see California Condor listed in any of the County records, though, so I’m not sure how far back they go, and I could not find any attribution for the source of the data set.  eBird is a relatively new data set.  Some effort has been made to enter historic records into eBird, but I’m not sure if Oregon has been part of that effort.  Some existing eBird users have entered their data from the past, but I’m pretty sure that is not universal among users with Oregon data; I know that I haven’t done it.  So there’s that.

The other thing to ponder is the variation in the delta statistic.  I suspect some of the same factors considered back in January are in play:

    1. Number of eBirders in the County
    2. eBirder interest in specific locations within a County
    3. Accessibility of the County
    4. Proximity of the County to large population centers
    5. Interest within a County by “hyper-active” eBirders
    6. Amount of eBird recorded observation time spent vs ECAS
    7. others?

So here are the data representations: