Zotero on steroids

I’m a big Zotero fan.  Anyone who has access to the Firefox web browser and needs or wants to keep track of bibliographies, websites, and other material used for research should use it instead of the bloated, overly expensive, difficult to use commercial citation managers such as Reference Manager, RefWorks, ProCite, and the like.  Zotero does everything you need and more, and does it better than the competition.  Did I mention it’s free?!

Today in the Chronicle of Higher Education there is an article (it’s freely available, you don’t need a subscription to access it) about a proposal made by the creators of Zotero to put it on steroids.  Put simply, they intend to make Zotero the tool of choice for researchers, scientists, professors, and others to load their research works into a shared database hosted by the Internet Archive.  The article notes the general failure (oh wait, I mean, lack of success) that libraries have had with doing this on an institution-by-institution basis using tools such as DSpace.  Dorothea Salo, whom I’d consider one of the foremost experts on using DSpace, has written a scathing review of so-called institutional repositories and how libraries use them. It is well worth reading.

Anyway, I think this whole idea (actually, it’s more than an idea, because the Mellon Foundation just dished out hundreds of thousands of dollars to make it a reality) has many beneficial implications, and I hope Dan Cohen and his team at the Center for History and New Media at George Mason University have all the success they hope for with it.  (Thanks to Wally Grotophorst for the mention.)

Roadblock to full OpenURLness [Updated]

This week I encountered a significant roadblock when trying to use OpenURL in a situation where it is a natural fit. Let me explain the scenario. A scientific researcher at the company where I work built an extensive bibliography of journal articles on a particular subject, and wants to publish that bibliography on the company intranet, complete with hypertext links to the full text. This person initially thought it’d be ok to simply mount the full text articles that he had downloaded in the same webspace as the bibliography, and simply link to the files. Of course, that idea was quickly shot down. Instead, we thought, why can’t we take this bibliography, check it against our SFX KnowledgeBase to see what articles we have available in full text, and then output the complete OpenURL for each of those articles for this researcher to use when marking up and publishing his bibliography?

The use case sounds straightforward, right? Turns out that it is anything but. I was provided with a text file of citations and was asked to come up with appropriate SFX links for each. Of course I could have manually rekeyed the citations one by one into a search form querying our SFX KB, but that would take quite a long time and quite a bit of effort. I tried to think of how this whole process could be automated.

On the advice of Dan Chudnov I downloaded an open source application written in Perl called Biblio-Citation-Parser, which on the face of it seemed to be exactly what I needed. I needed a way to automatically parse the whole list of citations into the necessary chunks of metadata, and then automatically generate an OpenURL for each citation. After trying unsuccessfully to get Biblio-Citation-Parser to work (this isn’t a limitation of the software but of my Perl expertise), I sent queries out to other SFX users as well as to the Code4Lib discussion list. There were several responses from members of the Code4Lib discussion list, some of whom mentioned the application that I already knew about. But it turns out that pretty much nobody in that community [at least among those who responded] had ever used it, and that nobody in that community had come up with a good solution to this parsing problem themselves.
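To make the second half of that wish concrete, here is a rough Python sketch of what "generate an OpenURL for each citation" might look like once the parsing is already done. It is not what I actually ran. The base URL is a made-up placeholder, the function name and dictionary layout are my own invention, and I am assuming the resolver accepts the simple OpenURL 0.1 key names (genre, issn, volume, spage, and so on).

```python
from urllib.parse import urlencode

# Hypothetical resolver address; substitute your own institution's SFX base URL.
SFX_BASE_URL = "https://sfx.example.org/sfx_local"

# OpenURL 0.1-style keys we try to send, in a fixed order.
OPENURL_KEYS = ("issn", "title", "atitle", "aulast", "date", "volume", "issue", "spage")


def build_openurl(citation, base_url=SFX_BASE_URL):
    """Return a link for one already-parsed journal-article citation (a dict)."""
    params = {"genre": "article"}
    for key in OPENURL_KEYS:
        value = citation.get(key)
        if value:  # skip fields the parser could not find
            params[key] = value
    # urlencode percent-escapes the values and joins them with "&".
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    # Made-up citation, for illustration only.
    example = {"issn": "1234-5678", "date": "2006", "volume": "12", "spage": "34"}
    print(build_openurl(example))
```

Of course, this only covers the easy part. Getting from a free-text citation to that tidy dictionary is exactly the parsing problem nobody seemed to have solved.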

Since the original citations were stored in Reference Manager, one of the more common citation management software applications, I wrote back to the colleague who first asked me to help with this situation, asking him if he could provide me with the Reference Manager files. He did, and I downloaded a free trial version of the software, imported the references, then exported them in RIS format. Next, I imported the RIS output file into Zotero, and then exported the whole bibliography from Zotero into a readymade HTML bibliography. Because of Zotero’s built-in COinS functionality, the readymade HTML bibliography is automatically populated with OpenURLs. But I wasn’t done yet. I had to go through each citation by hand and test whether we did indeed have the article in full text, and also, to edit the HTML coding to substitute our company’s specific SFX base URL in each link.
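In hindsight, the base URL substitution is the one step that could probably have been scripted. The sketch below is only that, a sketch: it assumes the exported HTML carries each citation's COinS data in the usual way, as a span with class Z3988 whose title attribute holds the OpenURL query string. The class and function names and the base URL are placeholders I made up. It does not check whether we actually hold the full text; that part would still be manual.

```python
from html.parser import HTMLParser

# Hypothetical resolver address; substitute your own institution's SFX base URL.
SFX_BASE_URL = "https://sfx.example.org/sfx_local"


class CoinsCollector(HTMLParser):
    """Collect the title attribute of every <span class="Z3988"> element."""

    def __init__(self):
        super().__init__()
        self.context_objects = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        classes = (attributes.get("class") or "").split()
        if tag == "span" and "Z3988" in classes and attributes.get("title"):
            # HTMLParser has already turned "&amp;" back into "&" here.
            self.context_objects.append(attributes["title"])


def coins_to_links(html_text, base_url=SFX_BASE_URL):
    """Return one resolver link per COinS span found in html_text."""
    collector = CoinsCollector()
    collector.feed(html_text)
    collector.close()
    return [base_url + "?" + ctx for ctx in collector.context_objects]


if __name__ == "__main__":
    # Made-up fragment, for illustration only.
    sample = '<p>Some citation <span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft.volume=12"></span></p>'
    for link in coins_to_links(sample):
        print(link)
```

Each returned link could then be wrapped around the matching citation in the published page, after confirming by hand that the article really is available.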

In the end, I achieved what the user wanted — a list of bibliographic references with SFX links as the hypertext links. But it was a huge amount of work, and I kept asking myself, surely there is a better, easier way to do this?! Surely, someone, somewhere has already solved this problem of how to readily parse bibliographic citations in a text file and run them through a process to check for which articles are available in full text?

Maybe there is a much simpler solution, and if you know of it, please comment on this post to let me know. I’m left thinking that this whole OpenURL stuff still has a ways to go in terms of ease of implementation for situations like the one I described.