Two-Line URLs Broken in Snow Leopard’s Preview
While going through final testing of the PDF of his “Take Control of Exploring & Customizing Snow Leopard,” Matt Neuburg ran across a bug related to clicking long Web URLs. The problem is not related to any particular PDF; it’s purely a bug in Preview, and one that could cause fairly significant confusion (and in our case, customer support headaches).
Here’s the problem. Let’s say you’re using Snow Leopard and thus Preview 5.0 (501), and you open one of our Take Control ebooks – the 2.1 version of “Take Control of Mac OS X Backups” is a good example. At some point while reading, you’ll run across a URL to a Web page that is too long to fit on a single line (search for “http” to find one quickly).
Now, click the bottom line of the URL, and your Web browser will load the proper destination page – whatever the full URL is. But click the top line of the URL, and Preview instead sends you to whatever visually appears on that top line, not the full URL.
We put some effort into making our split URLs look decent, so most of the time that top line will go somewhere (even if only to an error page at the appropriate site), but if it were just “http://www.”, Preview would happily send that fragment to your Web browser.
Again, there is nothing wrong with the PDF. Both lines of the URL contain PDF link boxes with the full URL in them. Clicking either line in Adobe Reader works fine. And of course, the versions of Preview in Leopard and earlier versions of Mac OS X work as expected.
We assume that Preview is attempting to treat text that looks like a URL as a link, but unfortunately does so in a way that ignores the actual PDF link box that sits on top of the text.
This is a new feature in Preview – the Leopard version sees all URLs as just text – and one that has existed for some time in Adobe Acrobat Pro and Adobe Reader. The difference is that Adobe’s programs honor the PDF link box in preference to the automatic URL recognition; they too can see only one line of a URL when faced with a split URL that lacks a true PDF link box.
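To make the difference concrete, here is a minimal Python sketch of the two behaviors. It is only a model of what we can observe from the outside, not Apple's or Adobe's actual code. The Line type, the URL_LIKE pattern, both function names, and the example.com URL are all made up for illustration.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical test for "text that looks like a URL"; the real heuristics are unknown.
URL_LIKE = re.compile(r"^(https?://|www\.)\S*$")


@dataclass
class Line:
    text: str                       # what is visibly printed on this line
    link_box: Optional[str] = None  # target of the PDF link box over the line, if any


def resolve_text_first(line: Line) -> Optional[str]:
    """Model of the buggy behavior: URL-looking text wins over the link box."""
    if URL_LIKE.match(line.text):
        return line.text
    return line.link_box


def resolve_link_box_first(line: Line) -> Optional[str]:
    """Model of the Adobe-style behavior: the link box wins when present."""
    if line.link_box is not None:
        return line.link_box
    if URL_LIKE.match(line.text):
        return line.text
    return None


# A made-up URL split across two lines, each covered by a link box with the full URL.
FULL = "http://www.example.com/ebooks/backups/long-page.html"
top = Line("http://www.example.com/ebooks/", link_box=FULL)
bottom = Line("backups/long-page.html", link_box=FULL)

for name, line in (("top", top), ("bottom", bottom)):
    print(name, resolve_text_first(line), resolve_link_box_first(line))
```

In this model the top line goes to the truncated address under the text-first rule because its visible text looks like a URL, while the bottom line falls back to the link box and reaches the full URL, which matches what we see in Snow Leopard's Preview. Under the link-box-first rule, both lines reach the full URL.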
The only workaround for users is to be sure to click the bottom line instead of the top one, or to use Adobe Reader (which has its own pluses and minuses). On our end, we could start using a URL shortening service, and we have in fact done that on a couple of particularly awkward URLs in recently released ebooks to avoid customer support questions, but we can’t change all the ebooks we’ve already published. In general, though, we prefer to use actual URLs since they often convey useful information.
Although I don’t have a large collection of PDFs from other publishers, it’s entirely likely that PDFs from other sources will run afoul of this bug too, so consider yourself forewarned.
I’ve reported this bug to Apple, so we can hope it will be fixed in a new version of Preview in Mac OS X 10.6.1. Historically, it’s taken only about two weeks for Apple to release the first update to a major version of Mac OS X, so it’s likely that we’ll see 10.6.1 in the very near future.
Preview is even stranger when there is a three-line URL in a PDF, such as one in the fall Washington State Ferry schedule PDF.
Clicking the first line sends just that line to the browser. Clicking the second line sends the first and second lines (fortunately, a working URL in this instance, although one must then navigate to the desired page).
And the third line is not clickable.
It will be nice to get this fixed--I really don't want to suffer through Reader again.
This actually makes sense. In the example you provided in TidBITS Talk (http://www.wsdot.wa.gov/ferries/pdf/2009Fall.pdf - the link is on page 2 under Water Taxi) there's no PDF link box anywhere. So Preview is guessing at each line independently; the first one fails because it's just a fragment, the second one "works" but goes to the wrong place because it has a domain in it, and the third one simply doesn't look like a URL.
A bit to one side of the topic, but relevant to a comment made in the story:
I'm wondering if a full-on electronic publisher such as TidBITS should be relying on someone else's URL-shortening service. It strikes me that, unlike a public service, a private service (a subdomain hosted on your own server) used just for ventures such as "Take Control" would have a limited universe of URL listings, short equivalents, and redirection.
The bonus would be that you could assign alias URLs that convey the useful information you see in the original without the alphanumeric soup found in the originals.
The Preview bug has exposed once again the problems inherent in publishing URLs in other media, but TidBITS could easily take control of those URLs in its own publishing process.
That's an excellent point, Matt, and something we've certainly thought about over the years. Glenn has said it would take only minutes to set up, but I worry more about the ongoing maintenance and general addition to the complexity of our server infrastructure.
But it would be nice to have short URLs in some situations, and particularly short URLs to which we could assign some useful alias information. Plus, when they broke (because they always do eventually), our server could provide some useful information about what was there at some previous point in time (a cached version of the page when the URL was created, for instance).
Of course, another downside is that it adds more work to making links.
I'll add this to our list of possible projects, but it's far below some other things we really want to get done in the near term.
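For anyone curious what such a private short-link table might look like, here is a rough Python sketch. It is purely illustrative and assumes a simple in-memory table. The ShortLink type, the LINKS table, the resolve function, the "backups-guide" alias, and the example.com target are all hypothetical, and a real setup would sit behind a Web server and need persistent storage.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class ShortLink:
    target: str           # the long URL the alias stands for
    note: str             # what was at the target when the link was created
    broken: bool = False  # set once the target is known to have gone away


# Hypothetical table; a readable alias replaces the alphanumeric soup.
LINKS: Dict[str, ShortLink] = {
    "backups-guide": ShortLink(
        target="http://www.example.com/support/articles/2009/08/guide.html?id=48213",
        note="vendor backup guide, captured August 2009",
    ),
}


def resolve(alias: str) -> Tuple[int, str]:
    """Return an HTTP status code plus a redirect target or an explanatory message."""
    link = LINKS.get(alias)
    if link is None:
        return 404, "Unknown short link"
    if link.broken:
        # Rather than a dead end, tell the reader what used to be there.
        return 410, f"This link used to point to {link.target} ({link.note})"
    return 302, link.target


print(resolve("backups-guide"))
```

The point of the broken flag is the one made above: when a target eventually disappears, the server can still say something useful about what the link once pointed to.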
Alas, this bug is not fixed in Mac OS X 10.6.1 - the version of Preview remains unchanged in the update.