A brief review of blog traffic for the past year

I don’t pay as much attention to blog traffic for FML as I probably should. I know there are a lot of things I could improve if I paid more attention to the various details. Instead, I tend to look for trends and broad numbers and that’s about it.

This evening I checked summary statistics from Google Analytics for the past year. Here is what I found:

  • There were 6,713 unique visitors to the site, which averages out to about 18.4 visitors per day
  • Visitors tend to spend only about a minute on the site each visit
  • The browser used by visitors breaks down as follows:
    • Internet Explorer – 46.51%
    • Firefox – 41.53%
    • Safari – 9.65%
    • Mozilla – 1.02%
    • Netscape – 0.48%
  • Traffic sources: 38.36% of visitors find FML via search engines; 31.68% go directly to the site (in other words, the site is bookmarked or the URL is typed in directly); and 27.42% come from referring sites. Of the 38.36% who find FML via a search engine, the vast majority (over 80%) use Google.
  • The vast majority of visitors use Windows as their operating system (80.45%); 17.93% use Mac OS X and 1.38% use Linux.

I am especially pleased with the good showing for non-IE browsers. Also of interest are the keywords people type into a search engine that lead them to FML. Here are some of the top keywords, aside from the obvious ones such as “family man librarian”: “portable browsers”, “everyone has a double”, “library related wordpress theme” and “praise you in the storm.”

[tags]blog traffic, google analytics[/tags]

A quick conference trip to Washington, D.C.

For the past few days I’ve been on a quick trip to a meeting in the Washington, D.C. area. The meeting was organized by NISO and was entitled “From Discovery to Delivery: Solutions to Put Your Content Where the Users Are.”

While there was nothing new or startlingly different about the content of the meeting, for me, at least, I think it was a worthwhile trip overall. The best part of the whole workshop was attending Dan Chudnov’s presentation on “COinS, unAPI, and a Plan for Zero Configuration Service Discovery.” Dan is a great speaker: humorous yet thorough, with an ability to explain some pretty technical stuff in a way that most people can understand. I was not surprised to see that he uses a Mac (way to go, Mac lovers!) and I liked his use of Keynote for his presentation. The transition theme he used seemed to bother a few people, and one person loudly remarked with a sneer, “Looks like a Mac application.” (Get a life, Windows lovers.)

What I particularly liked about the approach Dan took with his talk was that he made it Lego-like, that is, piece built upon piece built upon piece, until he reached the (pardon the pun) pièce de résistance, zero configuration service discovery. His vision for making things completely simple for users, with no configuration necessary for them and no need for them to know about the technical magic that lies behind the user experience, is truly invigorating. His basic focus was on using OpenURL and combining it with several other “off-the-shelf” standards to make it dead easy for users to navigate to the resources they need. One of the technologies he highlighted was Apple’s excellent Bonjour, which handles auto-discovery of networked resources such as websites or printers. He also brought up the example of Apple’s iTunes and how it easily allows users on the same network to discover and then play shared music libraries.

Overall, this was a great presentation and I am very thankful we have someone of Dan’s caliber to push the technological boundaries in our profession. I wanted to introduce myself to him but didn’t get to do that before the end of the meeting.

Andrew Pace, who writes the Technically Speaking column in American Libraries and is the author of the Hectic Pace blog, was also in attendance. It was the first time I had seen him in person and heard his by now well-travelled talk about what NCSU has done with its Endeca-powered online catalog. Andrew is also an engaging speaker. I didn’t learn much that I didn’t already know about the work he and others have done, but it was interesting to have it presented in person anyway. I wish that I could have spoken with him and others there about the work I am involved in regarding integration of my library’s online catalog with another commercial search engine, work that I think might be interesting to others because it makes new uses of library data that are different from what I have heard is being done anywhere else.

A third highlight of the event was a presentation by Michael Jon Jensen, Director of Web Communications for the National Academies and Director of Publishing Technologies for National Academies Press. He talked about the challenges they have faced and the changes they have implemented in providing improved resource discovery for the materials they publish. Under his direction this entity has done some really interesting experimentation and development of ways to improve access to the 3,600 books they publish, including development of their own clustering results. One of the things he said that most stood out to me was that National Academies Press provides their books for free in HTML form but charges for PDF versions. The reason for charging for PDF is that, as he put it, our society still values and treasures the framework and “ethos” of the printed book. Those aren’t his exact words but I think they capture the idea he put forward. He said that a printed book is worth more than the individual pieces; it is bigger and better as a whole collection contained in one package. I thought this to be a very interesting perspective that has important ramifications for how we present and deliver information in an increasingly e-only world.

Jane Burke, former CEO at Endeavor and someone with whom I have always gotten along, was also there as a presenter and it was nice to chat with her for a while and to hear how she is doing in her job leading Serials Solutions.

Finally, what made the trip special was the chance to catch up with old friends, Janet Lee-Smeltzer and Tom Wilson. Janet works at UMBC and Tom worked until recently at the University of Maryland, College Park. Each night they picked me up from my hotel and we had dinner together and talked far into the evening about librarianship, Web/Library 2.0, library politics, and many other topics.

Flock beta version released

Last week (or maybe the week before, I forget), the first public beta of my favorite web browser, Flock, was released. Naturally I was eager to put it through its paces. I’m glad to say that this is an even better browser than before, with one or two exceptions. In my view Flock has made the social web experience even easier and better because of big improvements in photo website integration (Photobucket and Flickr), blogging capabilities, and RSS.

This isn’t going to be a full-blown or scientific review but instead a list of observations, likes and dislikes, etc.:

  • The photo integration is really nice. Now I have the option in the topbar to browse my photos or anyone else’s on a particular topic (tag) if those photos are on Photobucket or Flickr. More than that, I now have the ability to browse these photos in small OR large sizes, and I have easy drag and drop capability to add photos into other applications or a blog entry. For example, just this morning I decided to see what photos folks have posted on Flickr from the American Library Association annual conference being held right now in New Orleans. I simply input the tag ‘ala2006’ and was able to quickly call up new and recent photos taken by librarian colleagues. Pretty nice!
  • The blog integration is handled in a better way. Before, I was able to post to my blog from a topbar element. Now, with a simple keystroke (Ctrl+B) I can call up a separate, smaller window and immediately begin blogging. After clicking on the Publish button I am then presented with further choices such as what categories I want to assign and what Technorati tags I want to use. While this whole process took a little getting used to at first (because in the previous iteration, choices for tags and categories were on the main blog posting window) I like this new way of doing things much better.
  • The RSS feed capabilities are nice but they are the weakest feature at this point. I keep getting script errors and/or funky results whenever I try to use the RSS aggregator sidebar. Hopefully this will work itself out soon. When it works, though, the sidebar arrangement and functionality are nice.
  • A big drawback of Flock for me was that there weren’t many native extensions available for it. (You couldn’t just use Firefox extensions, for example, of which there seem to be hundreds.) This is no longer a problem because with this beta release there is now a whole host of extensions available that can be readily used with Flock. I’ve had no problems with the ones I like to use except for FasterFox. It is great now to be able to use the ones I like the most in Firefox.
  • There is a new Conversations topbar plugin available that works much better than the previous Technorati topbar ever did. It’s basically the same as the old Technorati topbar but seemingly reengineered and renamed. I find this a very useful feature when I want to have some sense of what others might be saying about a particular website I’m interested in. When used in combination with the Google Web Comments plugin, I feel like I am able to get a pretty comprehensive sense of the “conversations” that are going on about that website.
  • The del.icio.us integration is also much smoother than before.
  • A really big, important new feature in this beta release is the Quick Search functionality, which integrates several areas into one truly quick search, such as your favorites, your web history, the top five hits from Yahoo!, and a quick way to pick other search engines to search in as well as whatever default search engine you’ve chosen. Again it takes a little getting used to but I am quite impressed with how it works thus far.

I am still surprised that there doesn’t seem to be that much use of or experimentation with this browser among librarian colleagues. Maybe there is stuff going on and I don’t realize it. I’ve used Flock (even the alpha releases) as my default browser for many months now and I have no problem recommending it to anyone. When the students in my course this summer saw me using it and talking about it, some of them decided to try it out, too. One of them found a thorough review on ExtremeTech and posted about it to the class blog.

I also should point out that I use different flavors of Flock. On my Windows laptop from work, I installed Flock on my portable USB drive and it works great. On my PowerBook at home, it also works great.

So…bottom line: If you blog, use photo sharing sites, or just appreciate a functional web browser, try Flock. I think you’ll like it.

Library online catalogs and relevancy ranking [Updated]

Karen Schneider’s post on the ALA Techsource blog, “How OPACs Suck, Part 1: Relevance Rank (Or the Lack of It),” is a rant by a librarian who either presents a foregone conclusion based on incomplete research or reaches a conclusion out of misunderstanding. Unfortunately such rants are fairly common. Karen complains about the lack of relevancy ranking in most online catalogs, something that most search engines routinely employ. She sums up the result of her research with the following statement:

“Relevance ranking is just one of many basic search-engine functionalities missing from online catalogs.”

Be sure to read the post as well as all of the comments (28 so far).

So why do I find this post problematic? Well, first of all, Karen makes a blanket statement like the one quoted above, without qualification. The fact is that library online catalogs do include relevancy ranking, and they have for years. The online catalog for Endeavor, for example, called WebVoyage, has had relevancy ranking for just about all of its existence (about nine years). It has never been “perfect” but it has been there. No, it doesn’t work in the same manner as, say, Google’s PageRank algorithm. (It predates that technology, anyway.) And I don’t think it should be expected to, either. I agree that the ease of use and the transparency of the results for library online catalogs should be close or very similar to Google’s, but comparing library online catalogs to Google in this way is like comparing apples to oranges. For one thing, the underlying data and databases for library online catalogs are almost entirely different from the data and database(s) underlying a major search engine. See screen shots here that illustrate this capability in WebVoyage.

Another problem I have with this post is that it blames vendors of library online catalogs for the fact that relevancy ranking apparently isn’t present in many instances. There is no consideration given by Karen to the possibility that relevancy ranking may not appear to be available because libraries themselves have chosen not to implement it or make it readily available to their users. The perspective here is very one-sided. Let’s all blame the vendors for inhibiting us librarians from properly serving our users and meeting their expectations. Vendors are by no means blameless, but neither are librarians. Just once, I’d like to see Karen and others of her ilk acknowledge that situations like these are not as black and white as they may like to believe. Sometimes I think it’s a matter of convenience because many librarians have long since cast “the vendor” as the bogeyman (“how dare they actually care about making money?!”). I say, look at both sides of the issue and especially do not be so quick to lay blame without truly understanding the reality of what vendors provide and what vendors do. Here is another quote from Karen’s post:

“But the interesting questions are: Why don’t online catalog vendors offer true search in the first place? and Why we don’t demand it? Save the time of the reader!”

OK, so what is “true search,” Karen?! (I don’t believe that is defined anywhere in the post.) What you define as “true search” isn’t necessarily how another person might define it. This is just common sense. If “true search” is meant as relevancy ranking, as I’ve already pointed out, vendors HAVE offered and DO offer “true search.”

But I’m beginning to see that that kind of answer doesn’t fit the simplistic, librarians-as-hapless-victims paradigm Karen has preconstructed so it wouldn’t count. It wouldn’t be relevant.

P.S. In one of her comments responding to another person’s comment, Karen talks about how vendors don’t offer field-weighted searching in online catalogs, either. I can’t wait to read “the facts” she will present. [Updated 3/20/2006: Especially since Endeavor's WebVoyage does already provide field-weighted searching.]
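
For readers unfamiliar with the term, field-weighted searching simply means that a match in one field of a record (say, the title) counts for more than a match in another (say, a notes field). The Python sketch below is a toy illustration of that idea only. It is not how WebVoyage or any other vendor product implements it; the field names, the weights, the sample records, and the function names score and rank are all made up for the example.

```python
from typing import Dict, List, Tuple

# Assumed weights, for illustration only: a hit in the title counts more than
# a hit in the subject headings, which counts more than a hit in the notes.
FIELD_WEIGHTS: Dict[str, float] = {"title": 3.0, "subject": 2.0, "notes": 1.0}

Record = Dict[str, str]


def score(record: Record, query: str) -> float:
    """Sum, over all fields, the field weight times the number of
    whole-word occurrences of each query term in that field."""
    terms = query.lower().split()
    total = 0.0
    for field, weight in FIELD_WEIGHTS.items():
        words = record.get(field, "").lower().split()
        total += weight * sum(words.count(term) for term in terms)
    return total


def rank(records: List[Record], query: str) -> List[Tuple[float, Record]]:
    """Return (score, record) pairs for matching records, best first."""
    scored = [(score(record, query), record) for record in records]
    hits = [pair for pair in scored if pair[0] > 0]
    return sorted(hits, key=lambda pair: pair[0], reverse=True)


# Hypothetical sample records, not real catalog data.
catalog: List[Record] = [
    {"title": "Cataloging basics", "subject": "libraries",
     "notes": "mentions search engines"},
    {"title": "Search engines explained", "subject": "search", "notes": ""},
    {"title": "Gardening", "subject": "plants", "notes": ""},
]

for points, record in rank(catalog, "search"):
    print(points, record["title"])
```

With these made-up weights, the record that has the query word in its title and subject sorts ahead of the one that only mentions it in a note, and the record with no match is dropped. Real systems add much more (stemming, phrase matching, term rarity), so treat this as a picture of the concept rather than of any product.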

Answers.com added to Firefox

Earlier today I upgraded to the latest beta release of Firefox (1.5 beta 2). One of the new features in the latest release is the addition of Answers.com as a search engine choice. Nice. This weekend, my son, Keegan, asked me what Columbus Day is for. I honestly couldn’t remember. (Me: dumb librarian.) Was it to commemorate Columbus’s birthday, or his discovery of America, or something else? A quick search on Answers.com gave me the answer.

How to redirect from my old blog?

I am being driven nuts by the fact that I haven’t figured out how to properly redirect traffic (and search engine bots) from my old blog site to the new one. The old blog is at http://homepage.mac.com/murphymoose/iblog/ and, as may be obvious from the URL, was created using iBlog software. A long time ago I decided that a.) iBlog as a software platform was too limiting for my tastes and b.) I wanted my own domain name. That’s when I set up this new blog at http://www.familymanlibrarian.org/ using WordPress. It seems that the proper way to redirect everyone is to use a 301 HTTP Redirect (see, e.g., Google’s advice here). My understanding, however, is that this requires editing an existing (or creating a new) .htaccess file. OK, I got that. But wait a minute, how on earth does this work when the old blog was hosted using .Mac (a WebDAV service)? I am stumped. Any help or ideas would be very much appreciated. Just send me an email at s t e v e ( a t ) o b e r g s . n e t.
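
For reference, a 301 redirect is nothing more than an HTTP response with status 301 and a Location header pointing at the new address. On an Apache host that honors .htaccess files, this is usually a single mod_alias line of the form Redirect 301 /old-path http://new-site/. The Python sketch below shows the same mechanism using only the standard library, on the assumption that one controls a server able to run it (which is exactly what .Mac does not offer). OLD_PREFIX, NEW_ROOT, redirect_target, PermanentRedirect, and serve are names made up for illustration, and sending every old page to the new front page is a simplifying assumption.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Optional

# Illustrative values: the path the old iBlog pages lived under and the
# root of the new WordPress site.
OLD_PREFIX = "/murphymoose/iblog/"
NEW_ROOT = "http://www.familymanlibrarian.org/"


def redirect_target(old_path: str) -> Optional[str]:
    """Return the new URL for a path on the old site, or None if the
    path is not part of the old blog.

    Assumption: old and new permalinks do not line up, so every old
    page is sent to the new front page.
    """
    if old_path.startswith(OLD_PREFIX):
        return NEW_ROOT
    return None


class PermanentRedirect(BaseHTTPRequestHandler):
    """Answer requests for old blog pages with a 301 and a Location header."""

    def do_GET(self) -> None:
        target = redirect_target(self.path)
        if target is None:
            self.send_error(404)
            return
        self.send_response(301)  # "Moved Permanently"
        self.send_header("Location", target)
        self.end_headers()

    # HEAD requests get the same status and headers.
    do_HEAD = do_GET


def serve(port: int = 8080) -> None:
    """Run the redirecting server until interrupted."""
    HTTPServer(("", port), PermanentRedirect).serve_forever()
```

Calling serve() and then requesting a path under OLD_PREFIX should return the 301 response, which browsers follow and which search engines treat as a permanent move. None of this solves the underlying problem that a plain file-hosting service gives no place to run server-side code or an .htaccess file.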

Radically restructured database architectures

ACM Queue – A Call to Arms – Long anticipated, the arrival of radically restructured database architectures is now finally at hand.

This article on the need for further research and development into new database architectures is pretty interesting. Although somewhat technical in parts, I think I got the gist of it. I found this point interesting:

One interesting development worth noting, however, has to do with the integration of database systems and file systems. Individuals who keep thousands of e-mail messages, documents, photos, and music files on their own personal systems are hard-pressed to find much of anything anymore. Scale up to the enterprise level, where the number of files is in the billions, and you’ve got the same problem on steroids. Traditional folder hierarchy schemes and filing practices are simply no match for the information tsunami we all face today. Thus, a fully indexed, semistructured object database is called for to enable search capabilities that offer us decent precision and recall. What does this all signify? Paradoxically enough, it seems that file systems are evolving into database systems…

I wonder if this is how Apple’s new search technology, Spotlight, works? I haven’t really read that much about the technical underpinnings of it, so this is just a dumb guess. I know that supposedly, Longhorn (the next major version of Windows) also has a revolutionary search engine built into it.

Regardless, any talk about new database architectures will surely have significant ramifications for libraries that are still heavily reliant upon integrated library systems, as well as for libraries that increasingly rely upon web-based searching capabilities and web services.

Grokker, a new type of web search engine

Some time ago, I read about Grokker, a new type of web search engine that presents results visually in cluster maps (think something like Venn diagrams), rather than in a long series of search results to page through one by one. Put in simple terms, the idea behind Grokker is to enable the searcher to more readily find the desired information that might be buried in web pages on the umpteenth page of search results from a standard search engine. More recently, I downloaded a free 30-day trial version for Mac OS X (they also offer a Windows version) and used it to find relevant information on a particular topic in Google that I was struggling to find using the regular Google interface. I am quite impressed with it, although I am not sure yet whether or not I want to fork over the $49 they charge for a production version of the software. If you want to see the future of search engines, or at least one model for that future, I suggest you download a copy yourself and play around with it. It takes a bit of getting used to, but I think you’ll like it.

Apple just keeps getting better

I was reading about Apple’s newest OS X release, nicknamed Tiger, this a.m. There are so many goodies that it is hard to know where to start when describing the things I am excited about. How about a new version of Safari that incorporates an RSS reader? How about an upgraded iChat AV that allows for videoconferencing with up to three additional people? How about a nifty new, system-wide search engine, called Spotlight? How about a Konfabulator-like widget module, called Dashboard? All of these things are amazing. I can’t wait. (If that sounds like I’m a true Mac addict, well, so what.)