Opendata.eol.org questions and requests

Any questions or requests pertaining to https://opendata.eol.org/ ? Can’t find what you’re looking for? Wondering what you just downloaded? We can help!

I’m interested in downloading the entire EOL dynamic hierarchy in its current form on beta.eol.org, with all EOL IDs in place. I downloaded “EOL Dynamic Hierarchy Version 1 with EOL page” but lots of records didn’t have an EOL ID.

On a related topic, where are the APIs for getting hierarchy information from the EOL dynamic hierarchy, like the “hierarchy_entries” API in EOL v2? Are they coming soon?

Hello! The most up-to-date file is EOL Dynamic Hierarchy with landmark taxa. You will still find taxa without EOL IDs. These files are meant for preserving IDs that were already in EOL v2, mapping them to the same taxa in v3. Any taxa without IDs in these files weren’t represented in v2; new IDs have been minted for them in v3, but we haven’t got a download of that yet.

API migration is underway. The data and media content APIs and the search APIs are going first. Since hierarchies are handled so differently in v3, we will need to rebuild the more detailed hierarchy searches for v3 from scratch. However, there is a Cypher service available in beta which contains the Dynamic Hierarchy as well as all attribute and interactions data. Testers are very welcome! The opendata downloads are recommended for large amounts of data; the Cypher service has not been stress tested yet.

Let me know if any of this is cryptic or anything is still unanswered.

Jen

Hello! I am not sure if what I want already exists in the data portal, but I have not been able to find it. What I am looking for is a list of common names for each page. Something like a file where each line has an id for the page, language and the common name would be great. Since some pages might have multiple names, there would be more than one row for some page ids. The only way I have been able to get anything like this is by calling the classic API, but it would take too long to query all pages this way. Any help is appreciated.

Hello, @Dwarfley,

That’s a good idea. We usually make downloadable files of large exports like that which may be needed by multiple users. I’m surprised nobody has asked for this one before. I’ll put it on the to-do list. You may need to wait a while; I’ll post in here when it’s ready.

Jen

@Dwarfley this task made it up the to-do list at last. The file is in our open data portal. Thanks for requesting!
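
For anyone working with a file of that shape, here is a minimal Python sketch of grouping the names by page. The column names `page_id`, `language`, and `name` are assumptions based on the request above, not the confirmed header of the download, and the sample rows are made up; check the actual file before relying on any of it.

```python
import csv
import io
from collections import defaultdict


def names_by_page(rows):
    """Group (language, name) pairs under each page id.

    `rows` is any iterable of dicts with the ASSUMED keys
    'page_id', 'language' and 'name'.
    """
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["page_id"]].append((row["language"], row["name"]))
    return dict(grouped)


# Made-up sample standing in for the real download.
sample = io.StringIO(
    "page_id,language,name\n"
    "101,en,first name\n"
    "101,nl,eerste naam\n"
    "202,en,second name\n"
)
names = names_by_page(csv.DictReader(sample))
for page_id, entries in names.items():
    print(page_id, entries)
```

To run it on the real file, pass `csv.DictReader(open(path, newline="", encoding="utf-8"))` instead of the sample, adjusting the keys to the real header.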

Thank you very much!

Hello! For a research project on the distribution of taxonomic effort, I’d like to correlate taxonomic effort (measured as treatments or new names per order or family) with some of the traits in TraitBank (e.g. carbon biomass, size of geographic range). Should I use the API to get those trait data (i.e. the taxon name and trait value for every taxon for which that trait is available), or do you recommend downloading the whole dataset? And is, for example, range size included in the ‘all trait data’ files for the species that have it? Thanks!

Sounds like fun! Either should work, but the all traits export is probably easier. That file is updated monthly, so you won’t be missing anything. Please holler for help if anything is cryptic. And remember: there will be gaps, and sometimes multiple records per taxon.
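
As a rough illustration of handling those gaps and duplicates, the sketch below collapses several records per taxon into one value and drops taxa with no usable value. Taking the median is just one possible choice, and the `(page_id, value)` input shape is an assumption for illustration, not the layout of the export.

```python
from collections import defaultdict
from statistics import median


def one_value_per_taxon(records):
    """Collapse (page_id, value) pairs to a single median value per page.

    Records whose value is None are treated as gaps and skipped, so a
    taxon with only gaps does not appear in the result at all.
    """
    values = defaultdict(list)
    for page_id, value in records:
        if value is None:
            continue
        values[page_id].append(value)
    return {page_id: median(vals) for page_id, vals in values.items()}


# Made-up records: taxon "a" has two values, "b" only a gap, "c" one value.
print(one_value_per_taxon([("a", 1.0), ("a", 3.0), ("b", None), ("c", 5.0)]))
```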

Let us know how it goes!

Jen

Okay, thanks! I’ll try using the all traits export then, and will let you know how it goes.

Stijn

Hi Jen. I had one question after downloading the file: Can you give more information about the ‘inferred_traits’ file? That file only has the column ‘inferred_trait’ (I assume this is eol_pk?) and the page_id, and no predicates or values. I checked for two random inferred_traits whether the eol_pk was also in the ‘traits’ file, but couldn’t find them.

best,
Stijn

Thanks for pointing that out, @Stijn_Conix. We need to update the documentation on this download file.

Inferred traits are inherited by descendants of a larger clade. For example, there are no separate trait records for each bird species that has locomotory mode = flight. Instead, a single record appears on Aves, with metadata that alerts our system to associate that trait with all descendants as well, or in this case, all descendants except for a number of flightless subclades that should be excluded: https://eol.org/data/R969-PK150614994

The “inferred trait” relationship is what connects that flight record to each of the applicable descendant taxa. It would greatly inflate the export file to list the full record repeatedly for each of these instances, so the full record is listed once, in the traits file, and any additional taxa to which it applies are listed in the inferred file, by trait record ID and taxon ID.
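
A minimal Python sketch of that lookup, for anyone who wants the full record repeated per taxon after all. It assumes the traits file has `eol_pk` and `page_id` columns and that the `inferred_trait` column holds the same `eol_pk` values; the `predicate` and `value` columns in the sample and the `inferred_from_page_id` field are made up for illustration.

```python
def expand_inferred(traits_rows, inferred_rows):
    """Attach the full trait record to every taxon listed in the inferred file.

    Returns (expanded_rows, missing_ids), where missing_ids are trait
    record IDs from the inferred file that were not found in the traits file.
    """
    # Index the full records once by their ID (assumed column: 'eol_pk').
    traits_by_pk = {row["eol_pk"]: row for row in traits_rows}
    expanded, missing = [], []
    for link in inferred_rows:
        record = traits_by_pk.get(link["inferred_trait"])
        if record is None:
            missing.append(link["inferred_trait"])
            continue
        expanded.append({
            **record,
            "page_id": link["page_id"],                 # the descendant taxon
            "inferred_from_page_id": record["page_id"],  # where the record lives
        })
    return expanded, missing


# Made-up sample rows standing in for the two files.
traits = [{"eol_pk": "T1", "page_id": "10", "predicate": "p", "value": "v"}]
inferred = [{"inferred_trait": "T1", "page_id": "11"},
            {"inferred_trait": "T9", "page_id": "12"}]
print(expand_inferred(traits, inferred))
```

A non-empty `missing` list is worth checking: it can mean the traits file in hand is incomplete. Holding the whole traits file in a dict is fine for a sample but may need a database or chunked processing for the full export.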

Let me know if that helps,

Jen

Thank you, that makes sense! I should have figured that out myself. One of the reasons I didn’t is that the inferred trait that I looked for in the traits file wasn’t there. I suspect this was because my ‘traits’ file was incomplete: when I unzip it (both with the Windows tool and 7-Zip), I get a CRC error saying that the file is broken. I could still open it, but it had only about 3.5 million rows. Have you heard from others having the same problem, or is it something on my end? I tried recovering the full file, but there are still traits missing when I compare it with the numbers I find in the online trait search.

Oh, yes, that happened to last month’s file, but I thought we’d re-run it and gotten a complete copy. But the current one is only showing me ~3.5M records also. Stand by, we weren’t expecting this problem to repeat…

Okay, thanks, I’ll wait for another file then.

Hi Jen! Just so you know: I’ve downloaded a few of the earlier traits_all files (from 2019 and 2020), and seem to get an error for all of them. Best, Stijn

Sorry for the delay, @Stijn_Conix. Yes, apparently this is not a problem with file generation but possibly with corruption or something else. Some of us can see the whole thing (e.g., traits.csv has 15,806,673 rows) and some of us (like me!) can only see 3.4M rows. Hang in there; I can’t believe it’ll take much longer to track down…
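
One way to check which situation a given download is in is to verify the archive’s checksums and count the rows without unzipping to disk. This is a minimal sketch using Python’s standard `zipfile` and `csv` modules; the member name `traits.csv` comes from this thread, while the UTF-8 encoding and the archive file name in the usage note are assumptions.

```python
import csv
import io
import zipfile


def check_archive(archive_path, member="traits.csv"):
    """Return (first_corrupt_member_or_None, row_count_or_None)."""
    with zipfile.ZipFile(archive_path) as archive:
        # testzip() reads every member and verifies its CRC-32 checksum;
        # it returns the name of the first bad member, or None if all pass.
        corrupt = archive.testzip()
        if corrupt is not None:
            return corrupt, None
        with archive.open(member) as raw:
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            # csv.reader keeps a quoted field that spans lines as one row.
            row_count = sum(1 for _ in csv.reader(text))
    return None, row_count
```

Usage would look like `check_archive("traits_all.zip")`, with the file name being hypothetical. The row count includes the header line, and a badly truncated archive may fail to open at all with `zipfile.BadZipFile`.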

Thanks, the help is much appreciated!

[quote=“Dwarfley, post:4, topic:170”]
Something like a file where each line has an id for the page, language and the common name would be great.
[/quote]
What is the source for the common names, and should they be translated as well?
I read somewhere that Wikidata was also a source for common names…

Soortenregister August 26, 2020

This is what we have currently. Will it do?

Yes, Wikidata is our largest source of common names.