
Museum collection meets library catalogue: Powerhouse collection now integrated into Trove

The National Library of Australia’s Trove is one of those projects whose importance you only come to understand after it is built and ‘live in the world’. At its most basic, Trove provides a meta-search of disparate library collections across Australia as well as the cultural collections of the National Library itself. As an aggregator, it brings together under the one Trove banner a number of National Library products that used to exist independently, such as the very popular Picture Australia.

Not only that, Trove has a lovely (and sizeable) user community of historians, genealogists and enthusiasts that diligently goes about helping transcribe scanned newspapers, connect up catalogue records, and add descriptive tags and extra research to them.

Last week Trove ingested the entirety of the Powerhouse’s digitised object collection. Trove had the collection of the Museum’s Research Library for a while but now they have the Museum’s objects too.

So if you are researching Annette Kellerman in Trove, you now come across the Powerhouse objects in your search results too – not just books about Kellerman but also her mermaid costume and other objects.

The Powerhouse is the first big museum object collection to have been ingested by Trove. This is important because over the past 12 months Trove has quickly become the first choice of the academic and research communities, not to mention those family historians and genealogists. As one of the most popular Australian Government-run websites, Trove has become the default start point for these types of researchers, so it makes sense that museum collections need to be well represented in it.

The Powerhouse had been talking about integrating with Trove and its predecessor sub-projects for at least the last five years. Back in the early days the talk was mainly about exposing our object records using OAI, but Trove has instead used the Powerhouse Collection API to ingest them. The benefits of this have been significant – and surprising. Much richer records have been able to be ingested, and Trove has been able to merge and adapt fields using the API as well as infer structure to extract additional metadata from the Powerhouse records. Whilst this approach doesn’t scale to other institutions (unless others model their API query structure on that of the Powerhouse), it does give end-users access to much richer records on Trove.
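To make the "merge and adapt fields" idea a little more concrete, here is a minimal sketch of how an aggregator might fold several source fields into its own fields without an agreed-upon schema. Everything in it is an assumption for illustration: the field names, the `FIELD_MAP` table and the `map_record` helper are hypothetical, and are not the Powerhouse API's real response format or Trove's real ingest code.

```python
# Hypothetical mapping: aggregator field -> source fields to merge into it.
# None of these names come from the real Powerhouse API or from Trove.
FIELD_MAP = {
    "title": ["title"],
    "description": ["summary", "description"],
    "subjects": ["categories", "subjects"],
}


def map_record(source: dict) -> dict:
    """Merge a source record's fields into the aggregator's own fields."""
    target = {}
    for target_field, source_fields in FIELD_MAP.items():
        values = []
        for name in source_fields:
            value = source.get(name)
            if value is None or value == "":
                continue  # skip fields the source record does not fill in
            if isinstance(value, list):
                values.extend(value)
            else:
                values.append(value)
        if values:
            target[target_field] = values
    return target


# Invented sample record, purely for illustration.
sample = {
    "title": "Example object",
    "summary": "Short statement",
    "description": "Longer description",
    "categories": ["Costume"],
    "subjects": ["Swimming"],
    "internal_note": "not mapped, so it is dropped",
}
mapped = map_record(sample)
```

The point of the sketch is only that the mapping lives on the aggregator's side, so the source institution does not have to reshape its records first.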

After Trove integration quietly went live last week there was an immediately noticeable flow of new visitors to collection records from Trove. And because Trove has used the API, these visitors can be accurately attributed to Trove as their origin. The Powerhouse will be keeping an eye on how these numbers grow and what sorts of collection areas Trove is bringing new interest to – and whether these interests differ from those of visitors arriving at collection records on the Powerhouse site through organic search, onsite search, or from other places that have integrated the Powerhouse collection, such as Digital NZ.

Stage two of Trove integration – soon – is planned to allow the Powerhouse to ingest any user generated metadata back into the Powerhouse’s own site – much in the way it had ingested Flickr tags for photographs that are also in the Commons on Flickr.

This integration also signals the irreversible blending of museum and library practice in the digital space.

Only time will tell if this delivers more value to end users than expecting researchers to come to institutional websites. But this sort of merging – much like the expanding operations of Europeana – does suggest that in the near future museum collections will need to start offering far more than a ‘rich catalogue record’ online to pull visitors in from aggregator products (and ‘communities of practice’) like Trove to individual institutional websites.

13 replies on “Museum collection meets library catalogue: Powerhouse collection now integrated into Trove”

Thnx Seb. Quick question from our informatics guys please. Is your collection API EMu based or did you develop it yourselves? Is it something that other museums can use?
Cheers,

The Powerhouse API was developed in-house and is not part of the Powerhouse EMu installation – and thus is not replicable to other institutions.

However, the query/response structure can easily be replicated – and in fact the Powerhouse modelled its query/response on that of the V&A, Brooklyn Museum and Digital NZ – which in turn seem to have been inspired by Flickr. It was designed like this so that developers who know how to write code for one API can adapt quickly to another one. See more info at http://www.freshandnew.org/2010/10/18/launch-of-the-powerhouse-museum-collection-api-v1-at-amped/

Following up on this: have you considered using SRW/SRU instead of the proprietary API? If so, what made you decide against using the standard? If not, are you able to list pros and cons now?
I am asking because here in The Netherlands we propagate a minimum set of standards for digital heritage (we call it DE BASIS: http://www.den.nl/debasis, in Dutch).
SRW/SRU is part of the set but we are currently reviewing it. Although I am not aware of any other competing standard for federated search, SRW/SRU doesn’t seem to be used. However, the Dutch national aggregator for digital heritage (in development) is using it: http://data.digitalecollectie.nl
So what’s your opinion about SRU and its use at the PowerHouse and/or in the Australian heritage sector?

Hi Marco.

There’s a timeline of implementations that might be of interest as background here. We’ve actually had Opensearch on our collection since 2006, then the CSV/ZIP download in 2009, API in 2010. And we’d looked at other options in between.

I guess the real benefit of the API is that Trove has been able to ingest the collection (caching it using the API), merge and parse our fields into their fields (not needing an agreed upon schema), and then, most important of all, index our objects using their technologies.

What happened/happens with Opensearch, SRU etc is that the aggregator is entirely dependent upon the result ranking of the source – and can’t (easily) apply its own indexing (although we did think about caching results when we were building our own aggregators).
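A toy sketch of the difference, assuming the aggregator already holds cached copies of the records: once the text is local, the aggregator can build its own index and rank records from every source by its own scoring, rather than accepting whatever order each source returns. The `build_index` and `search` helpers, the record ids and the term-count scoring are all hypothetical and far simpler than anything a real aggregator would run.

```python
from collections import Counter, defaultdict


def build_index(records):
    """Build a token -> {record_id: count} inverted index over cached text."""
    index = defaultdict(Counter)
    for record_id, text in records.items():
        for token in text.lower().split():
            index[token][record_id] += 1
    return index


def search(index, query):
    """Rank record ids by the aggregator's own score (summed term counts)."""
    scores = Counter()
    for token in query.lower().split():
        for record_id, count in index.get(token, {}).items():
            scores[record_id] += count
    # Highest score first; ties broken by id so the order is deterministic.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [record_id for record_id, _ in ranked]


# Invented cached records from two imaginary sources.
cached = {
    "library-1": "kellerman biography",
    "museum-1": "kellerman costume kellerman",
}
index = build_index(cached)
results = search(index, "kellerman")
```

With a federated search over Opensearch or SRU there is no local `cached` collection to index, which is why the aggregator ends up dependent on each source's own ranking.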

Obviously proprietary APIs don’t scale across institutions but as I said in the post itself – 

Much richer records have been able to be ingested and Trove has been able to merge and adapt fields using the API as well as infer structure to extract additional metadata from the Powerhouse records. Whilst this approach doesn’t scale to other institutions (unless others model their API query structure on that of the Powerhouse), it does give end-users access to much richer records on Trove.

Of course, we still offer Opensearch access as well but it is telling that the best implementations/ingestions of our collection have chosen to either use the full data dump OR the API.

I see. So the API is not (only) used to fetch and merge data of the PowerHouse on the fly by Trove but also to aggregate data (ingest as you call it).
Then it’s interesting to know how you keep the data in Trove up-to-date and if Trove considered using OAI-PMH to ingest your descriptions.
The example I referred to uses OAI-PMH to harvest core data for indexing and SRW/SRU to fetch, merge and present the rich descriptions on the fly.

Of course one always has to weigh richer records and more functionality against higher maintenance costs and failure risk, and it’s good to learn about the arguments and experiences behind choosing a standards-based or a proprietary solution. That’s my interest.

Because museum collection data doesn’t change (very) rapidly, and what changes is only a small(ish) number of records at one time, Trove queries our API then caches the results every month. These results are then merged into the Trove index and then delivered to end users as they request them.
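As a rough sketch of that query, cache and merge cycle: the code below decides whether a refresh is due and then merges a fresh harvest into a cache, touching only the records that actually changed. The 30-day interval stands in for "every month", and `refresh_due`, `merge_harvest` and the record shape are hypothetical names invented here, not Trove's actual harvesting code.

```python
from datetime import datetime, timedelta

# Assumption: "every month" approximated as a fixed 30-day interval.
REFRESH_INTERVAL = timedelta(days=30)


def refresh_due(last_harvest: datetime, now: datetime) -> bool:
    """Return True once the refresh interval has elapsed since the last harvest."""
    return now - last_harvest >= REFRESH_INTERVAL


def merge_harvest(cache: dict, harvested: list) -> list:
    """Merge harvested records into the cache; return ids that were new or changed."""
    changed = []
    for record in harvested:
        record_id = record["id"]
        if cache.get(record_id) != record:
            cache[record_id] = record
            changed.append(record_id)
    return changed


# Invented records, purely for illustration.
cache = {
    "a": {"id": "a", "title": "Old title"},
    "b": {"id": "b", "title": "Unchanged"},
}
harvest = [
    {"id": "a", "title": "New title"},
    {"id": "b", "title": "Unchanged"},
    {"id": "c", "title": "Newly digitised"},
]
changed_ids = merge_harvest(cache, harvest)
```

Because only a small(ish) number of records change between harvests, the list of changed ids is what would need re-indexing, not the whole collection.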

Our original plan many years back was to do OAI-PMH (which we’ve done for museumex.org) but it never got off the ground. For some reason the API has attracted more interest – whether that is warranted or not I’m not quite sure.

Given we use the API for internal product development we have to maintain the API in any case so it isn’t really incurring extra costs. In fact it saves us plenty in development for mobile and in-gallery things. OAI-PMH etc wouldn’t solve internal needs at all so in fact for us the equation is switched.


Writing from Queens, NYC. Do you know any other examples worldwide of a library integrating museum objects into its public catalog? Feel free to email me offline.
