I’ve just found a neat little program that Macintosh webmasters might appreciate immensely. Written by Stephen Turner and called Analog, it’s a free Web log analysis program. That’s not unusual, but Analog stands out from others I’ve tried (WebStat and its newer cousin ServerStat Lite) by being blindingly fast. On a Power Mac 6150 that was running all my standard Internet servers, Analog processed a 17 MB log file in 2 minutes, 24 seconds.
For those who don’t run a Web server or haven’t bothered to analyze their logs, a Web log analysis program scans through your Web log and provides basic reports, including (at least for Analog) the following:
- Number of pages requested per month
- Number of pages requested each week
- Number of pages requested each day of the week
- Number of pages requested each hour of the day
- Number of pages requested by each top-level domain
- Number of times each directory was hit
- Number of times each file type was hit
- Number of times each file was hit
- List of "referrers" – the URLs of pages that contain links to your pages
- A count of how many hits came from different Web browsers
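To make those reports concrete, here is a minimal Python sketch of the kind of tallying a log analyzer performs. It is not Analog's code, and the log layout is an assumption: a hypothetical tab-separated line holding date, time, result, hostname, and URL. Real MacHTTP and WebSTAR logs may order or name their fields differently, and the hostnames and paths are invented for illustration.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample in an ASSUMED tab-separated layout:
# date, time, result, hostname, URL. Real logs may differ.
SAMPLE_LOG = (
    "08/05/96\t02:10:44\tOK\thost1.example.com\t/index.html\n"
    "08/06/96\t09:15:02\tOK\thost2.example.edu\t/issues/latest.html\n"
    "08/06/96\t09:47:30\tOK\thost3.example.com\t/index.html\n"
    "08/06/96\t14:03:11\tERR!\thost4.example.org\t/missing.html\n"
)

# Literal names keep the output independent of the system locale.
WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def tally(log_text):
    """Count successful requests per weekday, per hour, and per file."""
    by_weekday, by_hour, by_file = Counter(), Counter(), Counter()
    for line in log_text.splitlines():
        fields = line.split("\t")
        if len(fields) < 5 or fields[2] != "OK":
            continue  # skip malformed lines and failed requests
        when = datetime.strptime(fields[0] + " " + fields[1],
                                 "%m/%d/%y %H:%M:%S")
        by_weekday[WEEKDAYS[when.weekday()]] += 1
        by_hour[when.hour] += 1
        by_file[fields[4]] += 1
    return by_weekday, by_hour, by_file


by_weekday, by_hour, by_file = tally(SAMPLE_LOG)
busiest_day = by_weekday.most_common(1)[0][0]
```

The same counters answer questions like which day is busiest or which hour is quietest. A real analyzer would add the weekly, monthly, domain, and file-type breakdowns in the same pass over the log.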
You can see what Analog’s HTML output looks like (including some nice bar graphs) by checking out the statistics for May through August on our Web server at the URL below. We used WebStat for the statistics for April and before, so check out those earlier months if you want to compare the output formats.
The utility of most of these reports should be evident, but I find some results especially interesting. For example, our Web site receives more hits on Tuesdays than any other day of the week. Since we publish on Monday night, that’s not surprising, but it’s good to know. Similarly, looking at the report of which hours of the day get the most hits, I can see that the low point is about 2:00 AM, so that’s when I have the server reboot itself each night. I like scanning through the list of files on the Web site along with the number of times they were hit, since it gives me an idea of what people are doing when they come to the site. Since we’d been planning to rework our Web site (see "Rethinking a Web" below), seeing which pages are popular and which aren’t is educational.
August was the first month I turned on the REFERRER and AGENT options in WebSTAR’s log, so I was intrigued to see what Analog would tell me. The REFERRER option logs the URL of the site from which someone has come to your site, and the AGENT option records the specific Web browser they’re using. Unfortunately, the AGENT option isn’t particularly reliable, because a number of Web browsers (Microsoft Internet Explorer in particular) identify themselves as Netscape Navigator to ensure that they’re fed Netscape-compatible HTML from Web servers that can spit out different HTML to different browsers.
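As a rough illustration of why the AGENT data is slippery, the Python sketch below sorts agent strings into browser families. It is a guess at a workable heuristic, not the way Analog classifies browsers, and the sample strings are illustrative rather than copied from a real log. The one thing it relies on is that browsers posing as Netscape typically begin with "Mozilla/" and put "compatible" inside the parentheses.

```python
def classify_agent(agent):
    """Return a rough browser family for a logged agent string."""
    if agent.startswith("Mozilla/"):
        if "(compatible" in agent:
            # A browser claiming Netscape compatibility; check for the
            # MSIE token that Internet Explorer includes.
            if "MSIE" in agent:
                return "Microsoft Internet Explorer"
            return "Netscape (compatible)"
        return "Netscape Navigator"
    # Otherwise treat the text before the first slash as the product name.
    return agent.split("/", 1)[0] or "Unknown"


# Illustrative agent strings (assumed, not taken from an actual log).
samples = [
    "Mozilla/3.0 (Macintosh; I; PPC)",
    "Mozilla/2.0 (compatible; MSIE 3.0; Windows 95)",
    "MacWeb/1.00ALPHA3 libwww/2.17",
]
families = [classify_agent(s) for s in samples]
```

Any browser can send whatever string it likes, so a classifier like this is only as trustworthy as the strings it is given.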
It struck me as interesting that only a few thousand of the 85,000 page requests on the TidBITS site in August came from external links. To an extent, that’s understandable, because the site takes a lot of hits from people who use the home pages from Internet Starter Kit for Macintosh, Second Edition, and Internet Starter Kit for Windows, Second Edition (those books also account for the disproportionate number of requests from people using MacWeb and NetManage Chameleon WebSurfer for Windows). Even so, I wonder what percentage of hits on other sites come from external referrers versus the number that are typed in directly.
Also interesting was the browser report, which told me that about 60 percent of the pages are requested via some version of Netscape Navigator. MacWeb came in at 24 percent, and "Netscape (compatible)," which is probably mostly Microsoft Internet Explorer, came in third at 5 percent. The vast majority of the MacWeb hits come from the home page for Internet Starter Kit for Macintosh, Second Edition, so it seems that many people who bought that book haven’t switched to new Web browsers, nor have they changed their home pages. Otherwise I wouldn’t expect nearly as many hits from the now-moribund MacWeb. Despite Netscape Navigator’s clear lead, I feel no inclination to use any Netscape extensions to HTML or other abominations like frames. It is nice to know that HTML 3.2 tags such as those for tables aren’t much of a problem, though.
When I ran Analog on my log file from August, I saw that all the referrers listed were internal to my site, so I set Analog to ignore all REFERRERs from my domain, thus restricting the list of REFERRERs to external sites. I tried, unsuccessfully, to better identify the different types of Web browsers as well, but I may have to spend some more time figuring out Analog’s configuration files.
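The idea behind ignoring your own domain can be sketched in a few lines of Python. This is not Analog's configuration syntax or its implementation, just a minimal illustration of the filtering step. The function name, the `example.com` domain, and the sample URLs are all hypothetical.

```python
from collections import Counter
from urllib.parse import urlsplit


def external_referrers(referrers, own_domain):
    """Count referrer URLs whose host lies outside own_domain."""
    own = own_domain.lower()
    counts = Counter()
    for url in referrers:
        host = (urlsplit(url).hostname or "").lower()
        if not host:
            continue  # no usable host, e.g. an empty referrer field
        if host == own or host.endswith("." + own):
            continue  # internal link from our own site
        counts[url] += 1
    return counts


# Hypothetical referrer values; example.com stands in for the site's domain.
logged = [
    "http://www.example.com/index.html",
    "http://www.example.com/issues/latest.html",
    "http://search.example.org/results?q=mac",
    "http://search.example.org/results?q=mac",
    "",
]
external = external_referrers(logged, "example.com")
```

Matching on the host name rather than on the whole URL means that a page elsewhere whose path happens to contain your domain name still counts as external.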
Although Analog generally works fine with no additional configuration, you’ll probably want to tweak some parts of its report, and that’s the only place it falls down. Analog was ported to the Macintosh by Jason Linhart, and although it retains all the functionality and blazing speed of the original, it doesn’t yet have a Macintosh interface. Functionally, that’s not a problem, but it does make Analog more irritating to configure. I suspect you’ll fiddle with the two main configuration text files for a while the first time, and – once you’ve made them work – you won’t change them often. The only other problem I’ve had with Analog is that it wants a lot of memory if you feed it large log files. If it doesn’t have enough RAM, it quits with an error 25, which is an out-of-memory error. Each time I’ve run into that, I’ve given it more RAM and tried again, and it has worked (the documentation notes this problem and suggests the solution).
The Macintosh version of Analog supports the MacHTTP and WebSTAR log formats, so if you use a different Web server, Analog may not work on your logs. Analog is worth a look, if only for its sheer speed in these days when so many programs trade performance for interface. If Analog doesn’t meet your needs, there are several other Macintosh utilities that analyze Web log files – check the page below for a list.