Turning the catalog inside out

I’ve mentioned before that I am annoyed when universities don’t link to their libraries on their home page, forcing me to hunt for the link through some other link (e.g. to “Academics”) or dropdown menu.

What if, on a library’s home page, there was no link to its online catalog? What then? It may sound odd but that’s exactly what has happened in the case of the library where I work. Perhaps even more bizarre is the fact that I don’t mind it at all :-)

The reason for my lack of concern is simple: Our library catalog data is deeply integrated into our website in various forms of search, browseable lists, subject lists, and so forth. Catalog data underpins just about everything. The online catalog still exists and is still maintained, but instead of forcing our users to use its interface to find stuff, we have turned the catalog inside out and integrated its data into our library portal in numerous ways.

I’ve worked on this for months. I knew that this new way of thinking was rather unusual or different but it didn’t really hit me how different it would be until our new portal went live a few weeks ago. I remember looking at the homepage and instinctively hunting for the link to the catalog before realizing, “Oh yeah, it’s not there. And oh yeah, that actually makes sense!” There are links to the catalog deeper into the portal, just not on the home page.

I’m still working through the implications of what, for me anyway, is a radical shift in thinking and approach.

A quick conference trip to Washington, D.C.

For the past few days I’ve been on a quick conference trip to a meeting in the Washington, D.C. area. The meeting was organized by NISO and was entitled “From Discovery to Delivery: Solutions to Put Your Content Where the Users Are.”

While there was nothing new or startlingly different about the content of the meeting, for me, at least, I think it was a worthwhile trip overall. The best part of the whole workshop was attending Dan Chudnov’s presentation on “COinS, unAPI, and a Plan for Zero Configuration Service Discovery.” Dan is a great speaker; humorous yet thorough, with an ability to easily explain some pretty technical stuff in a way that most people can understand. I was not surprised to see that he uses a Mac (way to go Mac lovers!) and I liked his use of Keynote for his presentation. The transition theme he used seemed to bother a few people and one person loudly remarked with a sneer, “Looks like a Mac application.” (Get a life, Windows lovers.) What I particularly liked about the approach Dan took with his talk was that he made it Lego-like, that is, piece built upon piece built upon piece, until he reached the (pardon the pun) piece-de-resistance, zero configuration service discovery. His vision for making things completely simple for users, with no configuration necessary for them and no need for them to know about the technical magic that lies behind the user experience, is truly invigorating. The basic focus he had was on using OpenURL and combining it with several other “off-the-shelf” standards to make it dead easy for users to navigate to resources they need. One of the technologies he highlighted was Apple’s excellent Bonjour application for auto-discovery of networked resources such as websites or printers. He also brought up the example of Apple’s iTunes and how it easily allows users on the same network to discover and then play shared music libraries. Overall, this was a great presentation and I am very thankful we have someone of Dan’s caliber to push the technological boundaries in our profession. I wanted to introduce myself to him but didn’t get to do that before the end of the meeting.
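
For anyone who hasn’t seen COinS before, here is a rough sketch in Python of the first Lego piece: an OpenURL ContextObject tucked into the title attribute of an empty HTML span, where a browser extension or link resolver can find it. The helper name and the sample book are my own inventions for illustration and are not taken from Dan’s slides.

```python
from html import escape
from urllib.parse import urlencode


def make_coins_span(title, author, isbn):
    """Build a COinS span for a book (hypothetical helper, illustration only)."""
    fields = [
        ("ctx_ver", "Z39.88-2004"),                    # OpenURL 1.0 version tag
        ("rft_val_fmt", "info:ofi/fmt:kev:mtx:book"),  # "book" metadata format
        ("rft.btitle", title),
        ("rft.au", author),
        ("rft.isbn", isbn),
    ]
    # URL-encode the key/value pairs, then HTML-escape the result so the
    # ampersands are safe inside an attribute value.
    query = urlencode(fields)
    return '<span class="Z3988" title="{}"></span>'.format(escape(query, quote=True))


# Made-up sample data, not a real book.
print(make_coins_span("Example Book", "Doe, Jane", "0000000000"))
```

The span itself is invisible on the page, which is the point: the user never sees the plumbing.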

Andrew Pace, who writes the Technically Speaking column in American Libraries and the Hectic Pace blog, was also in attendance. It was the first time I had seen him in person and heard his by now well-travelled talk about what NCSU has done with its Endeca-powered online catalog. Andrew is also an engaging speaker. I didn’t learn much that I didn’t already know about the work he and others have done, but it was interesting to have it presented in person anyway. I wish that I could have spoken with him and others there about the work I am involved in regarding integration of my library’s online catalog with another commercial search engine. I think that work might be interesting to others because it makes new uses of library data that differ from what I have heard is being done anywhere else.

A third highlight of the event was a presentation from the National Academies Press about the challenges they have faced and the changes they have implemented in providing improved resource discovery for the materials they publish. The presenter was Michael Jon Jensen, Director of Web Communications for the National Academies and Director of Publishing Technologies for National Academies Press. Under his direction the press has done some really interesting experimentation and development of ways to improve access to the 3,600 books they publish, including development of their own clustering results. One of the things he said that most stood out to me was that National Academies Press provides their books for free in HTML form but charges for PDF versions. The reason for charging for PDF is that, as he put it, our society still values and treasures the framework and “ethos” of the printed book. Those aren’t his exact words but I think they capture the idea he put forward. He said that a printed book is worth more than its individual pieces; it is bigger and better as a whole collection contained in one package. I thought this a very interesting perspective, one with important ramifications for how we present and deliver information in an increasingly e-only world.

Jane Burke, former CEO at Endeavor and someone with whom I have always gotten along, was also there as a presenter and it was nice to chat with her for a while and to hear how she is doing in her job leading Serials Solutions.

Finally what made the trip special was the chance to catch up with old friends, Janet Lee-Smeltzer and Tom Wilson. Janet works at UMBC and Tom worked until recently at University of Maryland, College Park. Each night they picked me up from my hotel and we had dinner together and talked far into the evening about librarianship, Web/Library 2.0, library politics, and many other topics.

Social web stuff at UIUC

I’ve mentioned many times that I have close ties to my alma mater, the University of Illinois at Urbana-Champaign, so forgive me for yet another mention of them. I have been meaning for a while to mention here that they are providing some nice social web functionality. Included in this is a library-specific toolbar for Firefox or IE (first heard about on ACRLog). My only complaint about this feature is that it joins a rather crowded group of customized toolbars such as those from OCLC and other kinds of toolbars available to everyone. What I mean is, I for one don’t like toolbars in the first place, and I particularly don’t like too much web browser real estate taken up by multiple toolbars.

Another nice application UIUC has had in place for a while is a webpage for easily creating RSS feeds from their online catalog, so that as new books or resources are added to the catalog in areas of interest to users, they are able to be automatically notified about them. I’ve put a library and information science-focused RSS link, created via this webpage, directly into my RefWorks account because this allows me to more easily import relevant citations.
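
To give a feel for what is on the consuming end of such a feed, here is a minimal sketch in Python that pulls titles and links out of an RSS 2.0 document. The sample feed is entirely made up (it is not UIUC’s actual output, and the example.edu URLs are placeholders), and the function name is my own.

```python
import xml.etree.ElementTree as ET

# Made-up sample feed, standing in for whatever a catalog might produce.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>New titles: library science (made-up sample)</title>
    <item>
      <title>Example Book One</title>
      <link>http://catalog.example.edu/record/1</link>
    </item>
    <item>
      <title>Example Book Two</title>
      <link>http://catalog.example.edu/record/2</link>
    </item>
  </channel>
</rss>"""


def new_titles(feed_xml):
    """Return (title, link) pairs for every item in an RSS 2.0 feed."""
    root = ET.fromstring(feed_xml)
    return [(item.findtext("title"), item.findtext("link"))
            for item in root.iter("item")]


for title, link in new_titles(SAMPLE_FEED):
    print(title, link)
```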

Visibility of library on organizational websites

It has always bothered me when a link to the library of a particular organization is not prominently featured on the home page of its website. This is particularly bothersome for educational institutions given the de facto role of the library as a centerpiece of learning. In fact, when I browse the web or go directly to a known institution and do not see a prominent link to the library, it gives me a bad impression of that institution. In a previous job when I was responsible for library websites, the placement of the link to the library was a battle that I had to fight with non-library campus IT folks, and fight fairly aggressively. In the campus website that existed when I came into that position, the link to the library was buried somewhere in a category for Academics, if I recall. No one could find it. This, in spite of the fact that the library site was one of the most heavily used in the entire campus web structure. Fortunately, after a campus website redesign, the link to the library was placed prominently on the home page for the institution.

So it was with a lot of interest that I read Steven Bell’s summary, posted to ACRLog, of a discussion on the COLLIB-L discussion list regarding this issue. One portion of Bell’s post particularly caught my attention:

Tom Kirk, library director at Earlham College, also brought up the value of examining web site data, but made the observation that data alone would hardly yield the information we need about student behavior in using institutional and library web sites. Until we do know more about how students use our web sites, Tom said, we may be unjustified in arguing for what belongs on a home page. As for alternatives, Tom suggested that many of our institutions have specialized portals for communicating with current students and faculty, where a more prominent library link could be placed. He also suggested that having the library under “academics” has “become a de facto standard alternative to a link on the home page?” So if they do move your library link from the home page to academics, don’t take it too badly.

This statement from Tom Kirk frankly astounds me, especially the part about having the library under “academics” being the “de facto standard.” Not true! And even if it is fairly common, I vehemently disagree that we should be satisfied with that! Furthermore, we should and often do have the data to back up the assertion that the link to the library belongs on the institution’s home page. And we should and do have data on how our students are using our sites. I would ask the question, are other campus wide sites being asked to adhere to this same requirement? Maybe, but in many cases, I doubt it, based upon personal experience.

One more point I’d make is that the library is not just for students; it’s for the whole institution, including faculty, staff, and alumni. Even more than that, it is for the broader worldwide academic community. In other words, library websites, especially for educational institutions, have a worldwide audience, and this is often overlooked. I mention this because one of the arguments I faced when in charge of library websites was over whether to keep the library websites publicly available or to put them behind a firewall, accessible only via an intranet. The argument for this restriction (made by non-library IT people) was that library resources and information were only for existing students, faculty, and staff, and therefore needn’t be available to anyone else. Of course this is true when we think of licensed e-resources, but this approach would make the library’s online catalog and other freely available resources invisible to everyone else.

I am not arguing that the library website deserves high visibility “just because.” But I find it troubling that the library’s online presence needs to be defended so often, and that there is frequently an assumption that the link to the library should be buried somewhere within an institution’s site.

My big news

My big news is that I will shortly begin a new job. Yesterday I accepted a job offer from one of the world’s largest pharmaceutical companies, to manage their library’s systems as well as their technical services operation. I am really excited about this opportunity. I’ve worked for two large, academic research libraries, for a small college library, and for a library systems vendor. Now I will find out what it’s like to work in a corporate library environment. I submitted my resignation today at Endeavor Information Systems, Inc., and my last day there will be May 17. I start my new job on May 22. In the new job I will still be tied somewhat to Endeavor but in a new and different way. This corporate library uses most of Endeavor’s software products, including Voyager (a traditional integrated library system consisting of an online catalog and other stuff), Meridian (their electronic resources management system, or ERMS), Discovery: Finder (formerly, ENCompass for Resource Access, which is a federated search tool), and Discovery: Resolver (formerly, LinkFinderPlus, Endeavor’s OpenURL service).

This opportunity is a real answer to prayer. Now my family and I have a sense of direction, of where we’re going in the coming months. It’s going to be pretty stressful because we will be looking for a new home, a new community to live in, and moving again. At the same time I will be starting a new job, teaching a graduate course, and finishing up a book chapter.

The catalog is the library: a perspective

Today I found an interesting post via Pubsub, written by an ex-librarian, discussing the role of the online catalog for libraries in the (near) future. Basically he posits that the online catalog IS the library, and goes on to describe ways in which that promise or ideal can be reached. These include combining tags, facets, and flexible hierarchies, as well as combining book lending with book buying. It’s a somewhat provocative proposal but one of the things I find interesting about it is that the writer is obviously not at all in the mainstream of current library discussions; in fact, he seems to be in an entirely different arena. Yet his proposals are akin to what others have already been proposing in the library world. E.g. Paul Miller of Talis just wrote the other day about the idea of bookstores combining with libraries.

Library online catalogs and relevancy ranking [Updated]

Karen Schneider’s post on the ALA Techsource blog, “How OPACs Suck, Part 1: Relevance Rank (Or the Lack of It),” is a rant by a librarian who either presents a foregone conclusion based on incomplete research or reaches a conclusion out of misunderstanding. Unfortunately such rants are fairly common. Karen complains about the lack of relevancy ranking in most online catalogs, something that most search engines routinely employ. She sums up the result of her research with the following statement:

“Relevance ranking is just one of many basic search-engine functionalities missing from online catalogs.”

Be sure to read the post as well as all of the comments (28 so far).

So why do I find this post problematic? Well, first of all, Karen makes a blanket statement like the one quoted above, without qualification. The fact is that library online catalogs do include relevancy ranking, and they have for years. Endeavor’s online catalog, WebVoyage, for example, has had relevancy ranking for just about all of its existence (about nine years). See screen shots here that illustrate this capability in WebVoyage. It has never been “perfect” but it has been there. No, it doesn’t work in the same manner as, say, Google’s Pagerank algorithm. (It predates that technology, anyway.) And I don’t think it should be expected to, either. I agree that the ease of use and the transparency of the results for library online catalogs should be close or very similar to Google’s, but comparing library online catalogs to Google in this way is like comparing apples to oranges. For one thing, the underlying data and databases for library online catalogs are almost entirely different from the data and database(s) underlying a major search engine.

Another problem I have with this post is that it blames vendors of library online catalogs for the fact that relevancy ranking isn’t apparently present in many instances. There is no consideration given by Karen to the possibility that relevancy ranking may not appear to be available because libraries themselves have chosen not to implement it or make it readily available to their users. The perspective here is very one-sided. Let’s all blame the vendors for inhibiting us librarians from properly serving our users and meeting their expectations. Vendors are by no means blameless, but neither are librarians. Just once, I’d like to see Karen and others of her ilk acknowledge that situations like these are not as black and white as they may like to believe. Sometimes I think it’s a matter of convenience because many librarians have long since cast “the vendor” as the bogeyman (“how dare they actually care about making money?!”). I say, look at both sides of the issue and especially do not be so quick to lay blame without truly understanding the reality of what vendors provide and what vendors do. Here is another quote from Karen’s post:

“But the interesting questions are: Why don’t online catalog vendors offer true search in the first place? and Why we don’t demand it? Save the time of the reader!”

OK, so what is “true search,” Karen?! (I don’t believe that is defined anywhere in the post.) What you define as “true search” isn’t necessarily how another person might define it. This is just common sense. If “true search” is meant as relevancy ranking, as I’ve already pointed out, vendors HAVE offered and DO offer “true search.”

But I’m beginning to see that that kind of answer doesn’t fit the simplistic, librarians-as-hapless-victims paradigm Karen has preconstructed so it wouldn’t count. It wouldn’t be relevant.

P.S. In one of her comments responding to another person’s comment, Karen talks about how vendors don’t offer field-weighted searching in online catalogs, either. I can’t wait to read “the facts” she will present. [Updated 3/20/2006: Especially since Endeavor's WebVoyage does already provide field-weighted searching.]
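
For readers who haven’t run into the term, field-weighted searching simply means that a match in one field (say, the title) counts for more than a match in another (say, a note). Here is a toy sketch in Python. The weights, field names, and function names are all invented for illustration; this is emphatically not how WebVoyage or any other product actually computes relevance.

```python
# Made-up weights: a hit in the title counts three times as much as in a note.
FIELD_WEIGHTS = {"title": 3.0, "subject": 2.0, "note": 1.0}


def score(record, query):
    """Sum weighted counts of query terms across the weighted fields."""
    terms = query.lower().split()
    total = 0.0
    for field, weight in FIELD_WEIGHTS.items():
        words = record.get(field, "").lower().split()
        total += weight * sum(words.count(term) for term in terms)
    return total


def rank(records, query):
    """Return matching records, highest score first; non-matches are dropped."""
    scored = [(score(record, query), record) for record in records]
    hits = [(s, record) for s, record in scored if s > 0]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [record for _, record in hits]
```

With this scheme a record that has the search term in its title and subject heading sorts ahead of one that only mentions it in a note, which is the behavior people usually mean when they ask for field weighting.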

Some Thoughts on RDA and ILS vendors [Updated]

Some time ago I noted here that an acquaintance of mine had snagged an interesting job at ALA as RDA Project Manager. Yesterday I sat down and read more about RDA, which stands for Resource Description and Access. In particular I read through the RDA Prospectus, published by an international group called the Joint Steering Committee for Revision of AACR, or JSC for short. This group is responsible for implementing changes to the cataloging code of practice in use by the majority of libraries in the U.S., the U.K., and Canada. The current cataloging code is known as the Anglo-American Cataloging Rules (AACR) and this has been the standard code for cataloging since the 1960s when the first edition of AACR was published. Having taken all of the cataloging coursework in library school, started out in the profession as a serials cataloger at the University of Chicago Library, and then managed a large cataloging unit there for quite a while, I have “grown up” on AACR and have been actively involved in the cataloging community, particularly the serials cataloging part, in the past. I’ve since moved away from that professional focus somewhat and am no longer as current in my knowledge as I used to be. I had heard about RDA but didn’t really pay much attention to it. So it was a big surprise to me to read yesterday that RDA will be replacing AACR (or rather, AACR2R, which is the 2nd, rev. ed. of AACR that is currently in use). I decided to delve into RDA in more detail.

What I learned from the prospectus and from some of the discussion surrounding RDA that I could find is very intriguing. This is a very big change, and, in my view, a positive one. It is a big change on many levels but since I work for a major ILS (integrated library systems) vendor, I focused on what this new standard might mean for them. Here are some thoughts or impressions that came to mind:

  • Acceleration of the end of MARC, or at least, the lessening of emphasis on MARC. MARC (which stands for MAchine Readable Cataloging) is not directly tied to AACR2R or RDA in theory but nevertheless the two are closely entwined in practice. While AACR2R (and soon, RDA) describes cataloging rules such as how to choose the title of a book, MARC is the standard for how to record and transmit cataloging information electronically. MARC also drives or controls much of what cataloging information gets displayed to users in online catalogs. My reading of the prospectus makes it seem very clear that RDA will not assume the use of MARC but instead will be designed to be of use in a variety of metadata formats, of which MARC will be one of many. Of course there are already many other metadata formats in use by libraries other than MARC (e.g. EAD, Dublin Core, etc.), but this kind of emphasis by RDA on multiplicity of formats has far-reaching implications and solidifies or adds weight to the trend toward multiplicity of formats that’s been underway for several years. Why does this matter to ILS vendors? It matters because the core record or basis for just about every major ILS system is the MARC record. Expansion of multiplicity of metadata formats supported by an ILS calls for radical system redesign, assuming, of course (which I personally do not), that the need for an integrated (some say, monolithic) library system continues to exist.
  • The prospectus makes it clear that RDA will be predicated on FRBR (Functional Requirements for Bibliographic Records) and FRAR (Functional Requirements for Authority Records), conceptual models developed under the auspices of IFLA (the International Federation of Library Associations and Institutions). These models have been around for quite a while yet very few ILS vendors have made their systems compatible with them as of yet. Implementation of RDA, as it is currently proposed, anyway, will change that from “it would be nice, but…” to “must be capable of…” In other words, it will no longer be desirable, but required. That’s a big difference. Those ILS vendors who have maintained the status quo on this one won’t be able to do so for much longer.
  • According to the prospectus, “RDA is being developed to provide a better fit with emerging database technologies, and to take advantage of efficiencies and flexibility that such technologies offer with respect to data capture, storage, retrieval, and display.” This could mean all kinds of things for ILS vendors and I am not certain really of what JSC has in mind. However, database design and maintenance is perhaps the most integral, complicated, and proprietary aspect of modern library systems. Any changes in that aspect of ILS work will be of huge significance for vendors.
  • Perhaps if RDA is successfully implemented, the idea of an ILS will enjoy a renaissance if/when vendors and/or libraries develop a system that can readily ingest, output, and manipulate library data no matter how it is encoded. Rather than component-izing (a made-up word) the disparate pieces of traditional ILS functionality as seems to be the general trend nowadays, maybe RDA, with its inherent tolerance for a multiplicity of metadata formats, will result in one central system that can handle those formats in one place with the flexibility that libraries need. Who knows?
  • One major portion of RDA will be dedicated to relationships. I find this interesting and a good thing. One of the biggest failings of ILS systems is that they have largely failed to readily help librarians piece together disparate works so that the user of the online catalog can readily see relationships among them.
  • One thing not mentioned at all in the prospectus is the whole concept of user-supplied metadata, e.g. tagging, and how that will play a role in the future for online catalogs and bibliographic utilities. I believe that tagging as a phenomenon is here to stay, even if I have my doubts about its efficacy right now. How can or should ILS vendors enable user-supplied metadata in conjunction with library-supplied cataloging?
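
To illustrate the first point above about a multiplicity of formats, here is a toy sketch in Python of the kind of crosswalk a system has to perform once MARC is no longer the only game in town: mapping a handful of MARC fields onto Dublin Core elements. The mapping is deliberately simplified, the record is represented as plain (tag, subfield, value) tuples rather than real MARC, and the function name is my own; a production system would need far more than this.

```python
# Simplified crosswalk: (MARC tag, subfield code) -> Dublin Core element.
MARC_TO_DC = {
    ("245", "a"): "title",
    ("100", "a"): "creator",
    ("650", "a"): "subject",
    ("260", "b"): "publisher",
    ("020", "a"): "identifier",
}


def marc_to_dublin_core(fields):
    """Convert (tag, subfield, value) tuples into a dict of Dublin Core lists.

    Fields with no entry in the crosswalk are silently dropped, which is
    exactly the kind of information loss a real conversion has to worry about.
    """
    dc = {}
    for tag, code, value in fields:
        element = MARC_TO_DC.get((tag, code))
        if element is not None:
            dc.setdefault(element, []).append(value)
    return dc


# Made-up record for illustration.
record = [
    ("100", "a", "Doe, Jane"),
    ("245", "a", "Example Book"),
    ("650", "a", "Cataloging"),
    ("650", "a", "Metadata"),
    ("300", "a", "200 p."),  # physical description: not in the toy crosswalk
]
print(marc_to_dublin_core(record))
```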

I admit that I don’t know as much as I should know about RDA and surrounding issues, and I may have misinterpreted some of what I’ve read. Or maybe there are even more radical implications for ILS vendors than what I can think of right now. Regardless, I am fairly confident that RDA’s progressive approach will mean a lot of upheaval for a lot of stakeholders. I’m going to pay a lot more attention to it than I have heretofore!

Library of Congress goes Unicode

Within the last month or so, the Library of Congress’s online catalog received an upgrade that allows users to view and search for records using non-Roman (Unicode) characters in Japanese, Arabic, Chinese, Korean, Persian, Hebrew, and Yiddish. See more information about it on their What’s New for the online catalog help pages. I think this is a big step forward for users and libraries who rely upon LC. For one thing, as far as I know, LC’s is the largest library catalog (for a single library) in the world; and it may also be correct to say that LC produces more cataloging records each year than just about any other library. People all over the world use this resource every day. (Full disclosure: I happen to work for the vendor that provides LC’s online catalog software, Endeavor Information Systems, Inc.)

Use of the term ‘card catalog’ by Google

Am I the only one who finds it incredibly irritating to hear or read Adam Smith of Google Print constantly refer to their digitization work as building a ‘card catalog’ of books? HELLO! Card catalog, it isn’t. Please do not refer to it that way, folks. Modern online catalog systems are nothing like the card catalog and, in my opinion, should not be referred to in that way. Not to demean the card catalog; it actually is/was a very useful tool that could/can be better used to find things in certain ways than an online catalog can. (E.g. I think a card catalog is a much more user-friendly tool for browsing than an online catalog will ever be.) I just think that those who continue to use the term ‘card catalog’ to describe an entirely different technology are displaying an irritating ignorance of the state of library technology these days. Most people have no clue as to how technologically advanced most libraries are. The use of the term ‘card catalog’ by people outside of our profession tends, in my mind, to reinforce the incorrect stereotype of libraries as backward and only concerned with print materials. This is just as negative in its own way as the eternal stereotype of librarians as spinsters with their hair in a bun, shushing patrons.

Also let it be known that I am not at all anti-Google Print. I heartily welcome this development and do not share the suspicious and/or distrustful attitudes of some librarian colleagues toward Google in relation to their work in this area. It will be quite interesting to see how their interpretation of fair use in copyright will play out. I think their pursuit of this digitization effort, including copyrighted works, will play a central role in redefining copyright law.