justforeverus asked: HI, I'm a web developer and the co-founder of Essence Labs Creative Agency. I just wanted to give you a little idea that popped into my head when reading the post about the tumblr api. You should try making a static page that automatically fetches posts using ajax and the v1 tumblr api, and then have that be placed in a random order. That might work better than the cron job so this way each user gets a different page. Don't know if it will work too well, but its worth a shot! What do you think?
The issue with AJAX and the Tumblr API is the cross-site domain restrictions in browsers. The main page is at “foo.tumblr.com” and the information is at “api.tumblr.com”. Firefox and Chrome won’t allow cross-subdomain AJAX calls like this.
You *can* use JSONP as a workaround for the cross-origin restriction, but I find it inelegant and ugly.
But more to the point, when you need to load possibly hundreds of posts, the Tumblr API server is just too slow to get the results on the page in a reasonable amount of time.
Experiments from my webserver (which has a super duper fast pipe) show an average of over 2.0 seconds to load 250 posts from the Tumblr API. This slows to over 3.5 seconds on my personal computer and home internet line. This is just too slow.
On the other hand, I can pre-emptively download the information from the API, extract what I need, cache the result in a “static” JavaScript file on my webserver and deliver it to the end client in just hundreds of milliseconds. And so this is how I plan to solve this problem, using the Tumblr static “Pages” to load a specially generated JavaScript file (or JSON or whatever) with the required “shuffled post information” and display it with a tiny amount of JavaScript.
This offers a better end experience to users.
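The caching step could be sketched roughly as follows. This is only an illustration, assuming the posts have already been downloaded and parsed into dicts; the field names (`url`, `title`, `body`), the variable name `shuffledPosts`, and the helper `build_shuffled_js` are all hypothetical.

```python
import json
import random


def build_shuffled_js(posts, var_name="shuffledPosts", seed=None):
    """Return JavaScript source that assigns a shuffled, trimmed post list."""
    # Keep only the fields the page needs (these field names are assumptions).
    trimmed = [{"url": p["url"], "title": p.get("title", "")} for p in posts]
    random.Random(seed).shuffle(trimmed)
    return "var %s = %s;" % (var_name, json.dumps(trimmed))


# Hypothetical sample data standing in for posts fetched from the API.
sample_posts = [
    {"url": "http://example.tumblr.com/post/1", "title": "First", "body": "..."},
    {"url": "http://example.tumblr.com/post/2", "title": "Second", "body": "..."},
    {"url": "http://example.tumblr.com/post/3", "title": "Third", "body": "..."},
]

js_source = build_shuffled_js(sample_posts, seed=42)
```

A cron job would write `js_source` to a file on the webserver, and the static Page would load that file with an ordinary script tag.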
Thoughts?
So, in writing my theme, I wanted a way to put a “Reblog” and “Like” link on each post on the main page of the blog without users having to “click-through” to the “/post” page, and use Tumblr’s silly <iframe> in the upper right corner.
Interesting. I discovered a similar thing, but in the end I ended up including the iframe for each single post, setting opacity: 0, and aligning/scaling it so that it matches my button positions. It had the desired effect, until I realized I cannot highlight my heart button when the post is liked, because I don’t and can’t know if the post is liked. After some research, I found out that there IS a way to do it, only I don’t know how.
Edit: no, there seems to be no way. At least nobody has done it.
Right. The “reblog” is easy. The “Like” button is tricky: not to “make it work”, but to show whether the Post is already “Liked”.
Reading the contents of Tumblr’s own iframe to determine, say, which image they are using could give the result, but most browsers today prohibit inspecting the embedded iframe’s HTML with JavaScript for security reasons. Similarly, you cannot load the iframe’s content with AJAX because of security restrictions. Perhaps with JSONP? I don’t know, haven’t tried to hack it out further. And then yes, any “solution” would still be an ugly hack, likely to break whenever Tumblr’s whimsical programmers decided to mix things up a bit.
What really needs to happen is to just incorporate these functions into the Tumblr Theme Engine. It would probably be like, a few hours work for some Tumblr employee to implement this.
I also thought about scanning the notes of every post to check if you have liked it. Terribly inefficient, but it should happen fast enough for posts with 0-20 notes. It would have to be disabled for posts with more than 100 notes (or at least scan only the first 100 ones). Also, it would put a lot of queries on the Tumblr servers.
Yes, this is not a very robust technique. And Tumblr doesn’t look kindly on “page scraping” in general. Nor is there a way in the Tumblr API to fetch more than the most recent 50 notes for a post — at all!
But who cares, if they don’t want this they better improve their support - I have never heard a story of a theme designer who got in touch with the tumblr programmers.
Oh, well, I have at least tried to get in touch with the Tumblr folks, on several occasions.
And at least on the Tumblr API Google Group, there is one Tumblr Employee (John Bunting) who monitors the discussion in the group and is able to provide some feedback. However, he seems fairly powerless in the overall Tumblr structure to get any *real* changes made, although he has been helpful in fixing some bugs, etc. (The Tumblr Theme Google Group seems to be completely unmoderated by any Tumblr employees).
I’ve also sent emails to the Tumblr staff several times about bugs in the API and Theme engine, and requests for features, etc. I usually get what looks to be a “real response from a real human” back, but nobody ever follows through to fix any of the problems I mention.
On the whole, Tumblr seriously needs to improve their support for developers. This means listening to feedback, answering questions, and fixing bugs and problems promptly.
For now - it’s really a giant headache.
On Tuesday Night, my house caught fire. Everyone in my family is OK but a lot of our stuff is totally ruined. As you can imagine, this is very stressful. We’re staying with my uncle & aunt for now, but I have very limited internet access here. I will update everyone as soon as I can.
I added a few new updates / features to my Basic Tumblr Tag Cloud:
- Added the option “case=” to allow you to transform the “case” of your tags to Lower-case, Upper-case, and Title-case for display purposes.
- Improved the default “Alphabetical” sorting code to better handle non-English Latin-script languages.
- Added the option “lang=” to allow you to specify a particular “language” locale, which greatly improves Alphabetical sorting and Case transforming on Tags for many languages, including non-Latin-script languages.
- About 70% of users were using the “promo=false” to get rid of the link to “get a tag cloud”. OK, I hear the message. I’ve disabled that link for everyone, from now on. C’est la vie.
These additions (and some other neat new things) will also be added to the “Pretty” Tag Cloud in the next few days.
Also, I can always use more feedback from people. Please?
Fantastic to see this job up, I really wish I had the skills because I can’t imagine a more engaging job right now!
Yep! We’re looking for someone to help us hack on the API! Check it out! It’s gonna be fun ;)
I’d apply if I wasn’t 17 lol :P
Recently I’ve been working on writing some “back end” code for a Tumblr blog that wants to do some non-standard things using Tags.
I’m all about trying to hack the Tumblr platform: hacking with the Tumblr API & doing unorthodox things in Tumblr themes.
The project is called Lensblr (http://lensblr.com). I’m not directly involved with creating the site. I’m just writing some devious back-end stuff to create added functionality.
The project caught my attention because navigation & view of the Tumblr is done exclusively through use of the /tagged/foobar pages. There is no view of the “main post feed”. It’s an interesting idea towards creating a custom Tumblr website.
Magical Post Tags
One goal of the designer was to automatically add/remove specific tags from posts, based on set criteria. In this case, the criteria is rather simple:
- Based on the number of “notes” a post gets, add/remove tags that “move” the post to/from different pages on the Tumblr.
For example, after a post reaches 50 notes, we want to remove the tag that displays the post on one page, and move it to another page called “Featured Posts”.
This is easily implemented using a cron job that runs a script to analyze the posts on the blog, and uses the Tumblr API to “retag” the posts as needed.
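The decision part of such a script might look something like the sketch below. It leaves out the actual Tumblr API calls. The tag names `new` and `featured`, the helper `retag`, and the treatment of "reaches 50 notes" as `>= 50` are assumptions made for illustration.

```python
def retag(tags, note_count, threshold=50,
          listing_tag="new", featured_tag="featured"):
    """Return the tag list a post should carry for its current note count."""
    # Drop both managed tags, then add back whichever one applies now.
    kept = [t for t in tags if t not in (listing_tag, featured_tag)]
    managed = featured_tag if note_count >= threshold else listing_tag
    return kept + [managed]


def needs_update(old_tags, new_tags):
    """True when the post has to be re-saved through the API."""
    return sorted(old_tags) != sorted(new_tags)


promoted = retag(["new", "landscape"], note_count=50)
unchanged = retag(["landscape", "new"], note_count=12)
```

Only posts for which `needs_update` returns true would need an edit request, which keeps the number of API calls per cron run small.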
Of course, this kind of tag modification for changing how a site is presented could be extended and extrapolated to do lots of interesting things, based on any number of factors. It’s a promising idea.
Magical Post Shuffling and Randomization
A second goal of the designer behind Lensblr was to implement a way so that posts on a particular /tagged/ Page could be randomly shuffled, such that older posts would sometimes reach the first page.
- The principle is more equal exposure of content, regardless of whether that content was posted 2 months ago or an hour ago.
- Not all content is Timely; and most people never make it past the 1st or 2nd page of a blog.
Fortunately, you can modify the Published Date on posts using the Tumblr API. And when viewing posts through the standard “Posts” interface (mounted at / ), this works rather well. You are able to shuffle posts around so that older and newer posts get equal chance of exposure on the 1st or 2nd page of the blog.
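One way to pick the new Published Dates is to reuse the dates the posts already have and simply hand them out in random order, so the blog's overall timeline stays the same. That choice is an assumption for this sketch, as are the `id` and `date` field names; writing the new dates back through the Tumblr API is left out.

```python
import random
from datetime import datetime


def shuffle_publish_dates(posts, seed=None):
    """Return {post_id: new_date}, reusing the existing dates in random order."""
    dates = [p["date"] for p in posts]
    random.Random(seed).shuffle(dates)
    return {p["id"]: d for p, d in zip(posts, dates)}


# Hypothetical sample posts.
sample = [
    {"id": 101, "date": datetime(2012, 1, 5, 9, 0)},
    {"id": 102, "date": datetime(2012, 2, 11, 18, 30)},
    {"id": 103, "date": datetime(2012, 3, 2, 7, 15)},
    {"id": 104, "date": datetime(2012, 3, 14, 12, 0)},
]

new_dates = shuffle_publish_dates(sample, seed=7)
```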
Unfortunately, Tumblr seems to have a bug/limitation with sorting posts by “Time” on the /tagged/foobar pages:
- If a post is on page 3 of /tagged/foobar, even modifying the post’s Publish Date is not enough to ever bounce it up to page 1. The post is “stuck” on page 3 forever. (or page 4, 5… etc as more posts are added).
Interestingly, on the individual pages /tagged/foobar/page/X the posts *are* sorted chronologically — but only relative to the other posts on that page.
Based on this, I’ve concluded that while Tumblr “sorts by Time” it only does so per page; the actual posts that are put on a given page come from “sorting by Post ID”. It’s a bit inconsistent.
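The behaviour described above can be modelled in a few lines. This is a sketch of the hypothesis only, not Tumblr's actual implementation; the page size, the newest-first ordering, and the integer stand-ins for dates are assumptions.

```python
def tagged_pages(posts, per_page=2):
    """Model: paginate by post ID, then sort each page by publish date."""
    by_id = sorted(posts, key=lambda p: p["id"], reverse=True)
    pages = [by_id[i:i + per_page] for i in range(0, len(by_id), per_page)]
    return [sorted(page, key=lambda p: p["date"], reverse=True)
            for page in pages]


# Six posts whose dates (plain integers here) follow their IDs.
posts = [{"id": n, "date": n} for n in range(1, 7)]

# Give the oldest post the newest publish date.
posts[0]["date"] = 100

pages = tagged_pages(posts)
page_ids = [[p["id"] for p in page] for page in pages]
```

Under this model the re-dated post rises to the top of its own page, but it never leaves page 3, which matches what was observed on the /tagged/foobar pages.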
So although we had a sound principle and strategy, implemented the code to “shuffle” the posts around — it just doesn’t work within Tumblr’s buggy / limited system.
Working within Tumblr’s Limitations
Tumblr does not provide Developers with many good means for representing blog Content with different “views” or in different ways. Using Tags seems to be the only practical way to achieve these ends, for now.
Things that are not particularly useful:
- The “Static Pages”
While these can contain arbitrary HTML completely separate from the main “Theme” they are rather useless from the perspective of creating / loading dynamic content, or content that changes with high frequency (say once a day).
“Pages” are not accessible through the Tumblr API, so making automatic changes to these Pages with code is not possible.
- The Tumblr API
The API is OK if you are developing an application for the desktop or a mobile device. It lets you do *most* things you might want to do to create a “Tumblr Experience” on a mobile device, or say a “Tumblr Posting / Editing” program on the Desktop.
Where the API is not particularly useful is in trying to create dynamically generated content on web pages.
- Tumblr only supports OAuth 1.0 (they should really upgrade to an OAuth 2.0 interface).
- Tumblr does not support CORS, a technology that allows controlled “cross-site” AJAX calls. Without CORS, as a developer you are limited to using rather ugly JSONP callbacks, and then only HTTP “GET” commands.
- The available methods in the Tumblr API in many cases do not provide sufficient means to request the specific information you want: first, there are not enough “filters” to select the data you want, and second, there is literally no means to control the “output” received from the API. For example, if all I want is the list of “Tags” from posts on a blog - I still have to download *all that other data* which has no use for me. That adds up rather quickly to a *lot* of transferred data.
The API needs better methods of “selecting” the types of posts to return, and then needs to implement a way to filter down the data returned to only what you require in order to make the API useful in the context of creating dynamic web pages.
- The API Server is just plain slow. Response times I observe are often in the range of 600-800 ms for simple requests (say “grab 50 posts”), and up to 2000-3000 ms for larger requests (say, “grab 200 posts”).
Part of the problem is the aforementioned lack of ability to “filter down” to just the data you need.
Another part of the problem is that the API Servers refuse to use HTTP Keep-Alive, which would save connection setup time when you request multiple pieces of data in quick succession. The API Server should enable Keep-Alive.
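Until the API can filter its own output, the trimming has to happen on the client after the full download. A minimal sketch of the “Tags only” case, assuming each post has already been parsed into a dict with a `tags` list; the sample data and the helper name `tag_counts` are hypothetical.

```python
from collections import Counter


def tag_counts(posts):
    """Count tag usage across posts, ignoring every other field."""
    counts = Counter()
    for post in posts:
        counts.update(post.get("tags", []))
    return counts


# Hypothetical posts: everything except "tags" is dead weight here.
downloaded = [
    {"id": 1, "body": "<p>long post body</p>", "tags": ["python", "tumblr"]},
    {"id": 2, "body": "<p>another long body</p>", "tags": ["tumblr"]},
    {"id": 3, "body": "<p>no tags at all</p>"},
]

counts = tag_counts(downloaded)
```

The result is tiny compared with the downloaded payload, which is the point: nearly all of the transferred data is thrown away.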
So Why Bother?
Because Tumblr is a great social platform. You bother trying to create custom/specialized Tumblr pages to create a unique “Web Experience” while at the same time being able to take advantage of the social aspects that Tumblr can provide to your Page and your Content.
As a Developer however — this is all rather frustrating.
Anyway, Tumblr just put up a Job Listing for API Lead. Perhaps that is a sign that things will get better in the near future.
But for now, it’s all pretty foobar.
This issue came up on a recent web project I’ve been working on.
In the project, I allow users to choose from a subset of fonts provided by the Google WebFonts API to style their Text in a pretty font.
The Problem - International Support
I get a message from a Czech user, requesting support for fonts that display the Czech language properly. (The Czech script uses various characters found in the “Extended Latin” Unicode blocks)
Luckily, I find on Google Webfonts that you can specifically request fonts with various script subsets.
For example, suppose you want the font “Poiret One” with the Latin-Extended character set, you can request:
fonts.googleapis.com/css?family=Poiret+One&subset=latin,latin-ext
Splendid - sorta.
A naive approach to offering better “international support”
In my naive approach:
- I examined the user’s text to determine if any characters were outside the “Standard Latin Block” (e.g., if any character’s ordinal is > 127).
- If so, I disabled the fonts having only “Latin” support, and only enabled fonts supporting “Latin Extended”
- For supported fonts, I request the font from Google using the “subset=latin,latin-ext” call, as above.
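The naive approach above could be sketched like this. The helper names are made up for illustration, and the `https://` scheme is an assumption, since the request shown earlier omits it.

```python
from urllib.parse import urlencode


def needs_latin_ext(text):
    """True if any character falls outside the 7-bit ASCII range."""
    return any(ord(c) > 127 for c in text)


def font_css_url(family, text):
    """Build the WebFonts CSS request for a family, based on the user's text."""
    subsets = "latin,latin-ext" if needs_latin_ext(text) else "latin"
    # safe="," keeps the comma between subset names unescaped.
    query = urlencode([("family", family), ("subset", subsets)], safe=",")
    return "https://fonts.googleapis.com/css?" + query


czech_url = font_css_url("Poiret One", "Příliš žluťoučký kůň")
plain_url = font_css_url("Poiret One", "Hello world")
```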
But that didn’t quite work…
… Because it turns out (I suppose unsurprisingly) that many of the fonts Google has classified as supporting “Latin Extended” do not support many of the characters required to display Czech properly (and doubtless many other languages as well)
Now, I don’t know who at Google determines how / why a font gets classified as supporting the “Latin Extended” block, what characters must be supported, etc.
But — their classification system is quite simply broken. In more ways than this - as you will see.
Designing a “Smart” Approach
I decided that I could do better. I really didn’t want to offer fonts if they would be unable to display the user’s language properly. And since Google’s “classification” was not thorough enough, I decided to do it myself.
- I downloaded the relevant TrueType Font files from Google for the fonts supported in my app.
- I wrote a small Python program utilizing the TTX/FontTools package to examine the TrueType files.
It was relatively simple to search the ‘cmap’ table within the TTF File to see if the Font contained a Glyph for a given Unicode ordinal.
- I dumped the list of the supported Glyph Ordinals of the font into a Python set object, let’s call it fontCharSet
- I took the text the user was trying to render, and dumped the Unicode ordinals of the text into another Python set object, let’s call this userCharSet
- Now, determining if a font supported the user’s language was as simple as calling one line of Python code:
fontSupported = userCharSet.issubset(fontCharSet)
- Wow, that was easy. With just about 100 fonts to check, the process of validating the user’s text against the supported fonts for my application took less than a millisecond.
- (See appendix for code example)
Now I had a great way to only show fonts to users if those fonts could properly display their language. And then I could retrieve those fonts with the extra needed characters, by specifying the subset when calling the WebFonts API.
I easily extended this to include support for Cyrillic and Greek text as well, by separating the user’s Text into the relevant Unicode blocks to determine which subsets to request from Google.
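A rough sketch of that block separation is below. The mapping from code point ranges to subset names is an assumption for illustration (it keeps the earlier "> 127 means Latin Extended" rule and uses the Greek and Cyrillic Unicode blocks), and always requesting `latin` is likewise an assumption.

```python
# (subset name, first code point, last code point); an assumed mapping.
SUBSET_RANGES = [
    ("latin", 0x0000, 0x007F),
    ("latin-ext", 0x0080, 0x024F),
    ("greek", 0x0370, 0x03FF),
    ("cyrillic", 0x0400, 0x04FF),
]


def required_subsets(text):
    """Return the comma-separated subset list needed to render the text."""
    needed = {"latin"}  # the base set is always requested
    for ch in text:
        cp = ord(ch)
        for name, lo, hi in SUBSET_RANGES:
            if lo <= cp <= hi:
                needed.add(name)
                break
    # Emit names in the fixed order of SUBSET_RANGES.
    return ",".join(name for name, _, _ in SUBSET_RANGES if name in needed)


czech = required_subsets("Příliš žluťoučký kůň")
russian = required_subsets("Привет")
greek = required_subsets("αβγ")
```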
And now, when a user visits my page, only fonts I was certain could display their language were presented as options.
Awesome. Great. … Or so I thought.
It turns out that “subset classification” of fonts that Google offers is not just for your information — it is also limiting.
Take the WebFont “PT Sans Narrow” — a nice sans-serif font with narrow characters, and slightly tall Capital letters. It has good support for Latin, Latin-extended, and Cyrillic.
Running my “language checker” code revealed this font could correctly display all the characters in my Czech user’s Text, and so this font was duly added to his list of “Supported Fonts”
Except it isn’t supported - not really, not by Google.
If you examine the PT Sans Narrow Font Specimen on Google WebFonts you will see that “Latin Extended” is not listed as one of the supported character sets. Although, in fact, the font supports much of the Latin-Extended set — enough at least to render my Czech user’s text.
And requesting the font with
fonts.googleapis.com/css?family=PT+Sans+Narrow&subset=latin,latin-ext
Does not return any characters in the Latin-Extended block at all. At all. Not even the characters that do exist. You just get plain Latin. Sorry.
And it turns out that there are a bunch of fonts like this on Google WebFonts. Too many to list.
Fonts that, by examining the actual font file for Glyphs, you find are able to support a given user’s language needs, but which you cannot retrieve from Google with the additional characters needed.
The bottom line is - if it doesn’t say “Yes this font supports Latin-Extended” then you can’t retrieve those characters from Google WebFonts, even if they do exist.
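In practice this means the glyph check needs a second condition: the subsets the user's text requires must also be ones the directory declares for that font. A minimal sketch, in which the font names, glyph sets, and declared subsets are all made-up illustration data rather than real directory listings:

```python
def offerable_fonts(user_chars, needed_subsets, fonts):
    """fonts maps name -> (glyph code point set, declared subset set)."""
    return sorted(
        name
        for name, (glyphs, declared) in fonts.items()
        # The font must contain the glyphs AND declare the subsets.
        if user_chars <= glyphs and needed_subsets <= declared
    )


user_chars = {ord(c) for c in "čaj"}
needed = {"latin", "latin-ext"}

# Hypothetical fonts, not actual Google WebFonts data.
fonts = {
    "ExampleFontA": ({ord(c) for c in "abcčj"}, {"latin", "cyrillic"}),
    "ExampleFontB": ({ord(c) for c in "abcčj"}, {"latin", "latin-ext"}),
    "ExampleFontC": ({ord(c) for c in "abcj"}, {"latin", "latin-ext"}),
}

usable = offerable_fonts(user_chars, needed, fonts)
```

Here `ExampleFontA` has every glyph the text needs but does not declare `latin-ext`, so it has to be dropped, which is the situation described above.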
Ugh.
Once again - too smart for my own damn good.
I have filed a bug report with the Google WebFonts project on Google Code for some of the fonts I have noted behave in this way, although I do not expect it will do much good, judging by the “Issue tracking” page for the project. (It seems there is little activity, and virtually no response to any issues reported). And there seems to be no “Support” page or link anywhere on the WebFonts page or the WebFonts API page.
Sigh. Technology sucks.
EDIT - March 15, 12:00 AM GMT
To my amazement, by complaining on the Google Font Directory Discussion group I actually got some feedback from a guy at Google who seems to be in charge of issues like this.
Basically, this is what I learned from him:
- If a font is not listed as supporting a script subset on the WebFonts directory (e.g. “latin-ext”) then it is not possible to retrieve those extra characters by using “subset=latin-ext”, even if the font in question does indeed contain characters in the Latin-Extended blocks.
- He cited one of the primary reasons for this limitation is that some fonts support too many Glyphs in the Latin-Extended block, and thus the resulting file size for the font would be too big.
Gee Google, thanks for shafting people who don’t use an ASCII Alphabet.
- He was willing to “fix” the Encoding information for a few fonts that I complained about in particular, but again not those whose resulting file size would be too big.
Appendix: Python Code Sample: (requires the TTX/fonttools library)
from fontTools import ttLib

fontfile = 'PT_Sans-Narrow-Web-Regular.ttf'
font = ttLib.TTFont(fontfile)
# cmap subtable for platform ID 3 (Windows), encoding ID 1 (Unicode BMP)
cmaptable = font['cmap'].getcmap(3, 1)
fontCharSet = set(cmaptable.cmap)  # Unicode ordinals of the glyphs in the font

usertextfile = 'text-sample.txt'
with open(usertextfile, 'rb') as F:
    unicodeText = F.read().decode('utf-8')

# create set of Unicode ordinals in the text
# (ord(c) > 32 skips control characters and the space character)
userCharSet = set(ord(c) for c in unicodeText if ord(c) > 32)

fontSupported = userCharSet.issubset(fontCharSet)