Accessing Tumblr Data
Mostly as a reference for myself, I’ve summarised the types of data we extract from Tumblr, and highlighted the few areas where support is inconsistent across the methods provided.
I’ve noted previously, for example, that while we can get content by post type using the Tumblr API, the same can’t be achieved using a formatted URL. Conversely, there is now a URL format for displaying content by tag, while the API does not allow this directly.
The chart below depicts the five key ways in which posts are filtered (including the new native search), along with other commonly accessed data: tags, feeds and avatar images. Tumblr has announced changes to its theme engine, which may in turn affect the API and URL schemes, so I’ll keep the chart updated where possible.
| | API | Theme | URL |
|---|---|---|---|
| Posts | Yes | Yes | Yes |
| — by Type | Yes | Soon | No |
| — by Tag | No ¹ | Soon | Yes |
| — by Keyword | No | Yes | Yes |
| — by Date | No | No | Yes |
| — by Quantity | Yes | Soon | No |
| Tags (per post) | Yes | Yes | Yes |
| Tags (all) | No ¹ | No | N/A |
| Reblog Source | No | Yes | N/A |
| Reblog Notes | No | No | No |
| Feed | Yes ² | Yes | Yes |
| Avatar | No | Yes | No |
Yes = Supported · No = Not Supported · Soon = Announced · N/A = Not Applicable
¹ It’s technically possible to request every post in a Tumblr site using multiple API calls, and then filter or collate by tag, but this isn’t ideal.
² The feed URL is not specifically included in the API response, but it’s trivial to construct it given the username.
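For reference, here is a rough Python sketch of the two workarounds in the footnotes. It is only an illustration under stated assumptions, not Tumblr's actual API: `fetch_page` is a hypothetical callable standing in for one paged API request, each post is assumed to be a dict with a `tags` list, and the `/rss` path in `feed_url` is an assumption about where the feed lives.

```python
from collections import defaultdict
from typing import Callable, Dict, List

# Assumed post shape (hypothetical): {"id": 1, "tags": ["photo", "travel"]}
Post = dict


def fetch_all_posts(fetch_page: Callable[[int, int], List[Post]],
                    page_size: int = 50) -> List[Post]:
    """Request every post by calling fetch_page(start, num) until a page is empty.

    fetch_page is a stand-in for a single API call; it is not a real
    Tumblr function.
    """
    posts: List[Post] = []
    start = 0
    while True:
        page = fetch_page(start, page_size)
        if not page:
            break
        posts.extend(page)
        start += len(page)
    return posts


def collate_by_tag(posts: List[Post]) -> Dict[str, List[Post]]:
    """Group posts under each of their tags (footnote 1)."""
    by_tag: Dict[str, List[Post]] = defaultdict(list)
    for post in posts:
        for tag in post.get("tags", []):
            by_tag[tag].append(post)
    return dict(by_tag)


def feed_url(username: str) -> str:
    """Build a feed URL from a username (footnote 2); the /rss path is assumed."""
    return f"http://{username}.tumblr.com/rss"
```

The keys of the dict returned by `collate_by_tag` also give a list of all tags in use, which covers the "Tags (all)" row at the same cost: one API call per page of posts.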
Update: At the suggestion of Scott Lamb, I’ve added reblog data to the chart. Reblog Source refers to the URL of the post from which yours is reblogged, while Reblog Notes is the list of users who’ve reblogged your post (and other notable appearances such as on the Radar). Tumblr’s David Karp displays notes for each post on his tumblelog, so I guess this is coming at some point.