Comparing the Library of Congress to Wal-Mart

Surely the news from last week about some Congressmen unfavorably comparing the Library of Congress to the likes of Wal-Mart and UPS was one of the stupidest things I have ever read. I was genuinely shocked by the level of ignorance and, well, stupidity…shown by Rep. Vernon Ehlers (R-Mich.) and Rep. Dan Lungren (R-Calif.), when they made statements like the following:

“You might be well advised to consult with Wal-Mart or Target who track inventory every day.”

and

“If UPS can track millions of items a day and not have a 10 percent loss, why can’t you?”

I mean, are these people for real???? Here is a link to an excellent post by Matt Raymond at the Library of Congress that thoroughly and completely exposes the whole tempest-in-a-teapot for the farce that it was: http://www.loc.gov/blog/?p=207.

In that same blog post there is discussion about the current ALA president, James Rettig, and his comments to Congress about what he sees as major deficiencies with recent changes in cataloging and so forth. Here is part of what he stated that the Library of Congress needed to do:

“…require the Library of Congress to consult broadly and meaningfully with the library community, including organizations central to bibliographic control, regarding all future decisions to substantively modify the character and quantity of bibliographic records”

Good grief. What on earth does he think LC has been doing? And has always done?! I can (barely) grasp that Congressmen might be ignorant but for someone at the highest levels of librarianship in this country to make such dumb statements is, in my opinion, inexcusable. Even worse to my mind was that many of my librarian colleagues cheered him on. It’s yet another reason I am so thankful that I no longer belong to the American Library Association (as if I needed any more reasons).

Some crystal ball reflections

Recently my mentee at UIUC GSLIS asked me to look into my crystal ball and articulate some thoughts about what lies in store for technical services librarianship. What follows is my response. I publish this here because although my points aren’t polished and well-defined, what I wrote to my mentee expresses some of what I personally think about library-related topics that are popular right now.

Where do I start?! Those who work in tech. svcs. are in need, more than ever, of a management mindset. Not necessarily management responsibilities, but a management mindset. By this I mean that we need to understand the broader pressures and trends that managers, especially upper-level managers, have to cope with and prepare for. We can no longer be (if we ever were) narrowly focused on, say, cataloging of print books and only print books. This luxury only exists in a handful of really large or special libraries. We need to be very aware of user-oriented trends such as the whole “social web” or Web 2.0 discussion, and how that might alter user expectations of what we provide to them in terms of access paths to information.

In terms of concerns and anxieties, well…I am reminded of a quote I always see in the signature of emails written by a friend of mine: “Delete: Bathwater. Undelete: Baby.” This causes a smile to come to me every time I see it. Put simply, I worry that in the rush toward new technologies, new ways of interacting with and meeting the needs of users, too many of my colleagues find it easy to forget or ignore what is in the past. In many ways I do believe the Bible verse that states something like this: “There is nothing new under the sun.” I believe this has application in libraries. We are not to be bound (pardon the small pun) by the past, necessarily, but we at least need to a.) acknowledge that there is a past and b.) understand at least some of that past to put the present and future into a right perspective. I’ve said this to people over and over again and I’ll repeat it here as an example of this point. About 10 years ago, when I was new to the profession, one of the really hot topics was outsourcing of technical services work. People were either up in arms against this trend or actively applauding it as revolutionary and innovative. Truth is, it was neither. Outsourcing has existed for a very long time in libraries and one big example of this is in the realm of shared cataloging. The Library of Congress distributed its cataloging records on 3×5 cards to other libraries throughout the U.S. and (maybe) the world, as long ago as the early 1900s. That is outsourcing!

Particularly in this era of the “social web” I am worried by so many librarians who are leading “the revolution” and proclaiming how wonderful and how great everything is that relates to blogs, wikis, instant messaging, etc. Those things ARE great but please, folks, get some perspective on them! Understand that libraries have ALWAYS striven to be social and interactive and patron-oriented. The way some of the library technorati talk these days, you’d think that libraries have been forbidding prisons until the social web came about. That’s ridiculous. Most of what is new is actually evolutionary, not revolutionary.

Don’t get me wrong: I am heartily in favor of trying new things, of experimenting, of innovating, etc. My wife calls me a technogeek and I guess that’s an accurate made-up word. My problem is just that new developments need to be understood and perceived through the lens of historical perspective.

A discussion with Karen Calhoun

Tomorrow during class, Karen Calhoun, Associate University Librarian for Technical Services at Cornell, will be a guest to discuss the report she authored for the Library of Congress, entitled “The Changing Nature of the Catalog and its Integration with Other Discovery Tools.” I am really thankful that Karen so graciously agreed to meet online with my students and others from the UIUC GSLIS community and this final class session is one I have been anticipating with excitement for a long time. Karen’s report was part of required reading for class and the themes and issues it contains have cropped up time and again throughout this entire semester. I remember, for instance, with what passion the UIUC technical services librarians who met with the class on the first day discussed the report and its implications. This made a big impression on the class.

I have no idea how many people will tune in to join the online discussion but I suspect it will be quite a few, perhaps as many as 40 or 50 people. The report and Karen’s visit to the class tomorrow have generated a lot of interest among other faculty in the school as well as from other parts of the extended GSLIS community. Mark Lindner will do his sterling job as usual in terms of broadcasting the session and I know that he, too, is really looking forward to it.

Start of class

Last week I met face-to-face with the students who signed up for LIS578LE: Technical Services Functions at UIUC GSLIS. Overall it was a great time, including an informative meeting with librarians and staff who work in technical services areas at the Main Library at UIUC, followed by a guided tour. This invariably serves to put some immediate context to the topics covered in the course and all of the students enjoyed it a lot. Many remarked on the energy and passion for their jobs that our hosts at UIUC demonstrated. One student described his impression of their work at UIUC as “drinking from the fire hose.” I thought that was a pretty apt description :-) A major focus for discussion during the visit and tour was Karen Calhoun’s recently released report on rethinking the role of the OPAC, commissioned by the Library of Congress. This is required reading for the course section on cataloging (a few weeks away yet) and we will discuss it more in depth at that time, but the basic themes contained in the report, and debated by the students and the UIUC librarians and staff, are ones about which it is hard to remain neutral.

There are 22 people in the class and as Mark Lindner remarked on his blog, this has the makings of an excellent group, with varied backgrounds and interests. (By the way, it was great to finally meet Mark in person! A great guy, and one whom I am pleased to work with.) Several students in the class work in public libraries; one works in a school library. As usual there are other students who have no library (let alone technical services) experience, and then there are those who have worked for several years in this area already. All of them bring valuable insights to class discussions.

This year I departed from the norm by having the class meet for part of a second day (usually one day is all we get; the rest of the semester is conducted entirely online). The main focus of this portion of the time together was on discussion about the tour and visit with UIUC technical services folks, followed by a crash course in setting up blogs and the class wiki. There may be some who found this new stuff a bit overwhelming, and that is to be expected. However I tried to point out the importance of getting involved, personally, in investigating these new forms of communication and collaboration. GSLIS has a technology platform for conducting online courses that has stood the test of time — 10 years, to be exact — very well, but there are many aspects that need to be updated. The tech support folks are wonderful, incredible people. They already have begun investigating and testing a new platform for conducting classes, called Moodle (it’s open source to boot). Two of the LEEP courses this summer are using this new platform, which contains built-in support for wikis, blogging capabilities, RSS, etc. For the technical services course I teach, I have had to go outside of the bounds of the LEEP technology to integrate blogs. I chose WordPress.com as the best overall platform for a balance of ease of setup and use as well as a rich set of features (and of course, it is free). A general class blog is now operational and most students have successfully set up individual blogs as well. One of the main assignments of this course is what I’ve termed a reflective journal. It struck me that this assignment would make a perfect match with blogging technology, and it would have the secondary benefit of helping to generate and sustain conversations about themes in the course in ways that a generic bulletin board setup could not. This is all somewhat experimental of course, and we’ll see how people take to the new stuff. So far, I am really pleased.

Oh, one other new tool that I am excited about in terms of teaching this course is the new ability I have to do application sharing via another open source software called Web Huddle. This will make introducing students to ERMS, for instance, much more fruitful than, say, a PowerPoint presentation.

Now I’ve got to prepare more for tomorrow’s first online “live” session, on the topic of acquisitions and collection development. Unfortunately I have had serious problems with connecting to the GSLIS server from my workplace, such that I am forced to conduct tomorrow’s session from home, where there aren’t such tight restrictions on network traffic!

EndUser 2006 notes on opening session [Updated]

[Through a series of missteps that I won't go into here, I discovered that I had accidentally deleted this post, first published a few weeks ago. I feel pretty dumb. When I figured out what happened, I sat here, stunned, wondering what to do. Then I remembered Google's good ol' caching capability, did a quick search to call up the cached version of this post, did a quick copy and paste, and voila, problem solved. Well, almost. My error wiped out the original post entirely, meaning that it automatically broke the link to that post, as well. There's nothing I can do about that. In the process of reconstituting the content, I decided on some editorial tweaks throughout.]

(Warning, this is a pretty lengthy post.)

Yesterday was the start of EndUser 2006, Endeavor’s customer conference. Somewhere around 1,000 customers have shown up for this event, some coming from as far away as Australia, New Zealand, several European countries, as well as Canada, Latin America, and of course, the U.S. As I’ve noted before, there are several conference sessions dealing with topics of interest, but yesterday’s highlight was the opening general session featuring a representative from Google who spoke in depth about Google’s Book Search project. Tom Turvey, Head, Google Book Search Partnerships, gave a brief overview of Google and how it makes money, defined the elements of Google Book Search, described the Google Book Search Partner Program (which he oversees), and finally discussed the Library Program portion of Google Book Search. Tom has a long history of working with online content, serving in numerous roles in the publishing industry relating to online delivery, including launching Barnes & Noble’s ebook offerings and most recently holding a senior post at HarperCollins.

Tom began by describing Google’s business. He mentioned that Google now provides 59% of all Internet search referrals. Google’s oft-repeated mission is “to organize the world’s information and make it universally accessible and useful.” Its core business, i.e. how the company makes money, is advertising revenue generated via paid search ads using Google AdSense. Tom also mentioned that Google is the leader, by far, in referrals to book sites (currently it processes about 60% of all such referrals). In describing Google’s business, Tom pointed out some interesting statistics about book purchasing. Thirteen percent of all book purchases are now done online; schools/libraries make up about 24% of the book buying market; direct-to-consumer purchasing (direct from publishers) is about 2%; and the biggest growth area recently has been in non-bookstore retail (books being purchased in Costco, Sam’s Club, Wal-Mart, etc.).

The next portion of the presentation focused on an explanation of Google Book Search. Tom pointed out that in his experience, never has there been so much misinformation about a product as there has been with Google Book Search (GBS). He made some comment that 90% of what has been published in the news media is false, thus the importance of explaining exactly what it’s about. GBS, at its heart, is an attempt to associate book content with what searchers are looking for in search engines. There are two main parts to GBS: the Partner Program, and the Library Program. The Partner Program involves relationships and agreements between Google and publishers. GBS launched in October 2004 at the Frankfurt Book Fair. As of now there are literally thousands of publisher partners spanning seven languages. One of the most frequent questions publishers ask Google is, what books are good choices for discovery via GBS? One of Tom’s funnier statements was “we don’t need to help Harry Potter find an audience.” What Google is mostly interested in is the arcane, the obscure, and bringing this material to light via searching GBS. Every page is searchable; users are searching books from cover to cover. There are two ways of providing search on book content: a dedicated search (books.google.com), and integrating book content within the general Google search. The main intent of working with publishers is to drive book sales. Content is protected in a variety of ways (Tom mentioned that as you can imagine, this element of agreements with publishers often gets “into the weeds”). Only 20% of a book is viewable by one user during the course of a month. Print, copy, and save are disabled. Scanned images are purposely low resolution. Publishers can add/remove their material at any time. There is page level security as well. A percentage of pages is never visible at one time.
Google’s process for receiving publisher content is pretty straightforward: the publisher usually sends either a PDF or a print copy. If the latter, Google digitizes it. As an interesting aside to closing out this portion of the talk, Tom mentioned “Oh by the way, the five publishers who are suing Google over the Library Project are actually members of the Partner Program.”

In turning to the third and last portion of the presentation, Tom outlined the elements of the Library Project. Partner libraries, as most people are aware by now, include Stanford, NYPL, Oxford, Michigan, and Harvard. In researching and comparing collections from each partner library, Google discovered that 60% of books are held in only one of the partner libraries. For legal and other reasons, Google began the project by focusing on public domain books. However, public domain books make up only about 20% of a typical library collection. Ten percent of a typical collection is made up of books that are still in print (i.e. the stuff that is handled via the Partner Program). Most books, 90%, are not in print and sit in a fuzzy area in which they may be out of print but still in copyright, or perhaps out of copyright. Seventy percent of collections were published after 1923 and fall into three categories: in copyright, in public domain, or the rights may have reverted. Obviously Google needed to figure out how to solve or address these complexities. Their solution was to offer to scan everything but provide three views: sample pages (partner view), snippet view (book under copyright w/out agreement with a publisher partner), and full book view (book is in public domain). The snippet view means that the full text of each book is indexed; users can only view three snippets from the book; there are links to “buy this book” as well as “find in a library”; different categories of books are handled in different ways; and copyright holders may opt out of display and/or scanning.
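
The three-view rule is simple enough to write down as a small decision function. What follows is only my own sketch in Python of the rule as I understood it from the talk, not anything Google showed or described in code. The names (View, choose_view, the parameters) are hypothetical, and the order in which the checks are applied is my assumption.

```python
from enum import Enum
from typing import Optional


class View(Enum):
    SAMPLE_PAGES = "sample pages (partner view)"
    SNIPPET = "snippet view"
    FULL = "full book view"


def choose_view(public_domain: bool,
                has_partner_agreement: bool,
                opted_out_of_display: bool = False) -> Optional[View]:
    """Pick a display view for one scanned book.

    Assumption: an opt-out by the copyright holder wins over everything,
    and public domain status is checked before the partner agreement.
    The talk did not spell out any precedence.
    """
    if opted_out_of_display:
        return None  # nothing is displayed
    if public_domain:
        return View.FULL
    if has_partner_agreement:
        return View.SAMPLE_PAGES
    # Under copyright, no agreement with a publisher partner.
    return View.SNIPPET


if __name__ == "__main__":
    print(choose_view(public_domain=False, has_partner_agreement=False).value)
```

The real difficulty, of course, is not this rule but deciding which category a given book belongs to in the first place, which is exactly the fuzzy area Tom described.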

Obviously a critical factor for Google is optimizing and streamlining the workflow. For example, a key consideration was figuring out how long it takes to scan a typical book. Tom mentioned that in the early days of the project, founder Larry Page and another staff member would use a metronome to time each other over and over again as they tried to figure out how best to scan a book. (Why a metronome? I have no idea and neither did Tom.) Books are scanned as is, including scribbles, marginalia, notes, whatever. Google is aiming to build a comprehensive collection of indexed books but has a long way to go yet on achieving that goal. Some of the challenges they face on a daily basis are achieving 100% OCR accuracy and 100% image quality; search and integration with web search; the accuracy of any affiliated metadata; the existence of lots of “edge cases” in terms of how to process and display the scanned results; how to address books that contain multiple languages and/or scripts; and how best to achieve a good level of speed/automation of the entire process. As with their much vaunted (and top secret) search algorithms, Google is constantly tweaking the process to try to improve the quality. How do they handle math formulas, spelling correction (Tom used the example of vernacular language that is meant to be spelled a certain way but which looks wrong to a typical spell checker), etc.? What is the best way to deal with automated metadata extraction? Can they figure out an automated way to detect (and appropriately handle) different languages and/or scripts?

Tom made a big point of the fact that Google is actively engaging the library community. Librarians tell Google the good and the bad about GBS (e.g. of bad: too overwhelming for users, hard to know which stuff is authoritative and what is junk, desire to know exactly how the process for scanning and indexing works). Google wants to ensure that GBS works for libraries by making information more discoverable, driving more library usage, and supporting a worldwide community, which is especially relevant for remote and distributed library users. Google has no desire whatsoever to put libraries out of business; in fact, Tom claims that the opposite is true.

[One of the things that I thought was particularly striking was that at one point during the session, Mr. Turvey asked for a show of hands from the audience of those people who were aware of the facts and details he had provided about Google Book Search. To my astonishment, I was one of the few people to raise their hands. Maybe this was just due to some people not fully understanding the question or to some people's innate shyness, who knows. But if it was an indicator of professional ignorance of these matters, then we're in big trouble.]

After concluding his prepared remarks, Tom invited the audience to pose questions. This was perhaps the most interesting portion of the session and Tom handled the questions with aplomb and a dose of wit. Below are my notes of the substance of some of the questions posed, followed by the substance of what I could jot down of Tom’s answers.

Question: When a user sees a link to “find in a library” which leads to Open WorldCat, what librarians want is to have that user come to us rather than use Google and/or buy the book from the publisher. What is your view on this?
Answer: It appears that this is in fact what is happening. Logs show that adding the “find in a library” link, directed to Open WorldCat, has driven a tremendous growth in traffic to WorldCat. Presumably this leads to higher library use.

Question: I’d like to see much more powerful search options, including things like truncation, proximity searching, and boolean capabilities. Is this something Google is considering?
Answer: That’s a very good question, what I’d expect from a librarian <laughter from the audience>. Some of these capabilities are things we are indeed working on, while some of them are already available via the Advanced Search option.

Question: I believe that in search results from publisher content, there is no link to “find in a library” when there is such a link provided in the library search. Why is that?
Answer: Good question. Remember that the goal of GBS is to have a relevant search. The vast majority of books available in GBS at this time are from publishers. Over the next few years, that proportion will flip to emphasize library-owned material. Honestly there is a constant tug and pull between publishers and Google over this issue of how to direct users. Publishers, obviously, participate in GBS to sell more books.

Question: Is there any plan to include Library of Congress Subject Headings (LCSH) as part of the GBS search?
Answer: LCSH and other taxonomies are already used to some extent behind the scenes to assist with determining relevance as well as identifying relationships between books (linking from one book to a related book).

Question: Can you speak about why you are being sued by some of your publisher partners?
Answer: Attorneys love it when you talk publicly about their litigation <much laughter from audience>. Seriously, though, no, I can’t answer that.

Question: Are you indexing each book cover to cover (i.e. full text)? How do you determine relevancy? [Editorial aside: Was this person paying attention? This question was clearly answered in the context of the presentation.]
Answer: Yes, we are doing full text. The ranking/relevancy algorithms used in GBS are pretty much the same as those used in the regular Google search. Some tweaking is of course necessary to make the algorithms relevant for book search. We do user interface testing every month and as a result, we constantly tweak/change the algorithms.

Question: Do you have a formal digital preservation strategy?
Answer: We have agreements with our library partners that cover preservation to whatever degree they have specified in their legal agreements. It really depends on what partner libraries want. Other than that, no, we do not have a formal preservation strategy and do not feel that that is a role we should assume.

Question: Elaborate on how relevant metadata is in GBS.
Answer: Well, first of all, metadata does play a role in GBS but our bias is always toward full text, with metadata/abstracts thought of as secondary. This is probably the opposite of how most libraries would prioritize things.

Question: I have a question on the issue of fair use. Are you working to expand the concept of fair use in terms of scholarly material in particular?
Answer: We feel that our stance on fair use and GBS is very, very significant. We do not have any formal focus on scholarly material in GBS, though.

Question: What is Google’s stance toward the Open Content Alliance? Does Google view them as partners, or competitors?
Answer: We have an open door, a desire to partner and share in digitizing material. We believe that initiatives such as the Open Content Alliance are worthy of our support. However, as you can imagine, there are certain complexities and a lot of politics involved in this kind of interaction. We want to participate in initiatives like this in as open a way as possible.

Question: “Find in a library” links only to WorldCat at present. Does Google have any plans for directing traffic to other bibliographic (i.e. library) databases (this is particularly important for those libraries who aren’t linked from WorldCat)?
Answer: We’d be interested in any other worthwhile bibliographic databases, but WorldCat is it for now.

Question: A single search box is very attractive, but when you expand your data sources (as Google is doing), the simplicity and relevance of this one search become more difficult to maintain. How do you handle this?
Answer: We constantly reevaluate the one box concept and it is an ongoing problem to solve. There is no ready answer.

Question: How do you handle materials from publishers once those materials have gone out of print?
Answer: Good question. Once a publisher’s book goes out of print, they request that it be removed from the index and then it no longer appears in the search. The exception to this would be if there happens to be a copy of that same book that has been scanned and indexed as part of the Library Project. In that case, the book would remain in the index.

Question: Do you have plans for providing regional Google book searches (e.g. one for New Zealand imprints)? This is important for those outside of the U.S. because currently there is such a predominance of U.S. imprints in GBS.
Answer: We already do this, e.g. currently we have 65 regional book searches.

Question: The exposure from GBS for libraries is great, but it needs to be more two way, e.g. to direct users looking for material in a local library catalog to GBS and/or elsewhere. Are there any plans to extend the Google API to be used by libraries for integration into their online catalogs?
Answer: Something like this functionality is present in Google Scholar. We are very happy with this integration with library services and we want to figure out ways to extend this further.

Question: What’s your view on libraries’ development of customized Greasemonkey scripts to integrate library results in with GBS?
Answer: Anything that doesn’t violate copyright, we’re all for.

Question: GBS is very exciting. What about developing Google Journals?
Answer: <tongue in cheek> …So we have this thing called Google Scholar…Actually we are working on ways to better integrate or link between GBS and Google Scholar.

Question: There is clearly a balance of power issue relating to the premise that allowing Google to do all this scanning and digitizing of book content puts the burden of proof on the content creator rather than the user. What are your thoughts about this?
Answer: We believe that this is a very important issue and our stance on this hinges on the belief that we are simply being consistent between the indexing of website content and indexing the content of books.

Question: What about working to include government documents, because they do not present a copyright problem?
Answer: Yes, we have a team devoted to this very issue. It is a bigger challenge to do this than it may at first appear because in order to do it we need to work out who is responsible (i.e. the publisher) for each of the multitude of gov docs. Expect progress on this front.

ONIX for Serials and MARC21 for Holdings

2006-05: Changes to Holdings data fields to accommodate ONIX for Serials in the MARC 21 Holdings Format (Network Development and MARC Standards Office, Library of Congress)

Just came across an announcement of proposals for changes to MARC that will be discussed by MARBI (Machine-Readable Bibliographic Information Committee) at ALA Midwinter. One proposal, referenced above, is of particular interest to serialists because it suggests a means to add ONIX for Serials information, specifically for two message types, Serials Release Notice and Serials Online Holdings, into the MARC Holdings record. This should be interesting…

Library of Congress goes Unicode

Within the last month or so, the Library of Congress’s online catalog received an upgrade that allows users to view and search for records using non-Roman (Unicode) characters in Japanese, Arabic, Chinese, Korean, Persian, Hebrew, and Yiddish. See more information about it on their What’s New for the online catalog help pages. I think this is a big step forward for users and libraries who rely upon LC. For one thing, as far as I know, LC’s is the largest library catalog (for a single library) in the world; and it may also be correct to say that LC produces more cataloging records each year than just about any other library. People all over the world use this resource every day. (Full disclosure: I happen to work for the vendor that provides LC’s online catalog software, Endeavor Information Systems, Inc.)

E-Archiving tools the next big thing? [Updated]

Some recent developments and announcements make me think that e-archiving solutions may be the next big thing in the world of information technology and libraries. Certainly, things are heating up in this area. Several weeks ago the National Archives of the U.S. announced a contract with Lockheed Martin to develop a tool known as the Electronic Records Archive (ERA). More recently, the Library of Congress gave $3 million to support development of an e-archive solution named Portico, being developed by a non-profit organization called Ithaka Harbors, Inc., which appears to be a spinoff of JSTOR and the Mellon Foundation. Just today, Endeavor Information Systems, Inc. and Sun Microsystems announced a partnership to develop their own e-archiving solution(s). (Full disclosure: Endeavor Information Systems, Inc. is my employer.)

Some thoughts on tagging [Updated]

The other day I was listening to a podcast (NPR: Technology) on the way home. The topic was tagging, which is a hot topic right now in the blogosphere (see “‘Tagging’ Lets Ordinary Users Organize the Internet”). Some time ago I wrote about tagging with the basic reaction of “Duh, this is the same as library cataloging, or a very lite version thereof.” Now I’m not so sure of my judgment. I am also not so sure about the “goodness” or “badness” of this phenomenon. But there are some things that bother me about it. Maybe it bothers me mostly because of my library cataloging background. I am not aware of other library cataloging “experts” who have already weighed in on this phenomenon. If others out there who are active in the cataloging community have written anything about this, I’d like to hear about it.

Here are some (rather incoherent) thoughts for now:

One of the things that was discussed in the podcast mentioned above was the fact that the concept of “aboutness” was no longer narrowly defined or assigned by someone else, e.g. a librarian. With tagging, you can label something what it means to you, and another person can label the same object something else that means something to them. It allows each person to identify what is important to them. This was viewed as a very good thing. I’m not so sure. I’m not sure exactly how to articulate what bothers me about this except that it seems to assume, incorrectly, that librarians or, more specifically, catalogers, do not already have a system that does much the same thing, but in a different way, with an authority reference structure. I’m not arguing that the Library of Congress Subject Headings, e.g., works well at all. Or that library online catalog systems do a good job of demonstrating a reference structure to their users. All I’m saying is that libraries have developed, and used for years, a system that attempts to identify or tag objects in multiple ways to suit the perspectives of different users. Those who are so big on tagging as something new and different, I suspect, are largely unaware (or dismissive) of this fact.
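
To make the comparison a little more concrete, here is a minimal sketch in Python of what I mean by a reference structure. It is only an illustration: the terms, book titles, and function names are all made up for the example, and it is not how any real library system or tagging site is implemented.

```python
# Hypothetical "see" references: each variant term points to the one
# authorized heading. The entries are made up for illustration and are
# not taken from an actual authority file.
SEE_REFERENCES = {
    "cars": "Automobiles",
    "autos": "Automobiles",
    "motor cars": "Automobiles",
}

# The same two (made-up) books, described two ways.
CATALOG = {"Book A": {"Automobiles"}, "Book B": {"Automobiles"}}
TAGGED = {"Book A": {"cars"}, "Book B": {"autos"}}


def resolve(term):
    """Map a user's term to an authorized heading, or None if unknown."""
    key = term.strip().lower()
    if key in SEE_REFERENCES:
        return SEE_REFERENCES[key]
    for heading in set(SEE_REFERENCES.values()):
        if heading.lower() == key:
            return heading
    return None


def search_by_heading(term, catalog):
    """Find titles via the authority structure: variants collocate."""
    heading = resolve(term)
    if heading is None:
        return []
    return sorted(t for t, headings in catalog.items() if heading in headings)


def search_by_tag(term, tagged):
    """Find titles by exact free-form tag: only literal matches."""
    key = term.strip().lower()
    return sorted(t for t, tags in tagged.items() if key in tags)


if __name__ == "__main__":
    print(search_by_heading("cars", CATALOG))  # both books
    print(search_by_tag("cars", TAGGED))       # only the book tagged "cars"
```

The point of the sketch is simply that the reference structure lets several user vocabularies lead to the same set of works, whereas free tags only match what each tagger happened to type.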

Another part of this point about the freedom of users to identify something the way they see it that bothers me is that it seems predicated on the general moral philosophy so prevalent today. That is, “what is good for me may not be good for you” or “each to his own” and tolerate all views. No one is wrong, there is no absolute truth. Put another way, tolerance is the value that is most admired in today’s society. I may be spinning this way out of bounds, who knows. But I think that there is something to be said for an authoritative judgment of “aboutness” as a way to bring together all related works. (And I happen to believe in absolute truth.)

I’m wondering whether, as the tagging phenomenon continues on for a while longer, anyone will begin to think differently about its value. Does tagging an object with whatever terms you want really make it easier to find? Does tagging really help “cut through the clutter” or does it instead perpetuate and feed into a scattershot approach to categorization? I’m not sure what the answer will be.

I do know that tagging does help me to discover new ways of looking for or identifying an object, and that is helpful. And from a user perspective I much prefer tagging things in ways that I find relevant rather than having to refer to a standard list of tags, simply because it is easier for me and more meaningful.

I don’t know if anything I’ve noted down here makes sense, because I am not sure yet what to make of tagging in my own mind. I have played around with tagging in places like Flickr and LibraryThing and will continue to follow the development of tagging with interest.

[UPDATE: Just did a little searching around and found something that I had read before but had forgotten about, written by Gary Price @ SearchEngineWatch.com that directly relates to this very garbled attempt at writing down some of my thoughts about tagging. Gary does a better job at describing some of the negatives. I also recommend reading this posting from Clay Shirky's Writings About the Internet.]

LibraryThing is cool

Just wanted to point out a new site that I learned about last week, called LibraryThing (www.librarything.com), that allows users to register to post their book collections online. That in and of itself doesn’t sound radical but what is interesting is that users can then readily see the contents of others’ libraries, how they have categorized (or tagged) their entries, what the most popular books or authors are, etc. It includes a simple-to-use interface to search the Library of Congress’s web-based catalog via Z39.50, as well as Amazon, so that the user can then import the bibliographic description from these sources without rekeying the information.