API Collection databases Search

Museum collection meets library catalogue: Powerhouse collection now integrated into Trove

The National Library of Australia’s Trove is one of those projects whose importance you only come to understand once it is built and ‘live in the world’. At its most basic, Trove provides a meta-search of disparate library collections across Australia as well as the cultural collections of the National Library itself. As an aggregator, it brings together under the one Trove banner a number of National Library products that used to exist independently, such as the very popular Picture Australia.

Not only that, Trove has a lovely (and sizeable) user community of historians, genealogists and enthusiasts that diligently goes about helping transcribe scanned newspapers, connect up catalogue records, and add descriptive tags to them along with extra research.

Last week Trove ingested the entirety of the Powerhouse’s digitised object collection. Trove has had the collection of the Museum’s Research Library for a while, but now it has the Museum’s objects too.

So this now means that if, in Trove, you are researching Annette Kellerman, you come across all the Powerhouse objects in your search results too – not just books about Kellerman but also her mermaid costume and other objects.

The Powerhouse is the first big museum object collection to have been ingested by Trove. This is important because over the past 12 months Trove has quickly become the first choice of the academic and research communities, not to mention family historians and genealogists. As one of the most popular Australian Government-run websites, Trove has become the default start point for these types of researchers, so it makes sense that museum collections need to be well represented in it.

The Powerhouse had been talking about integrating with Trove and its predecessor sub-projects for at least the last five years. Back in the early days the talk was mainly about exposing our object records using OAI, but Trove has instead used the Powerhouse Collection API to ingest. The benefits of this have been significant – and surprising. Much richer records have been able to be ingested, and Trove has been able to merge and adapt fields using the API as well as infer structure to extract additional metadata from the Powerhouse records. Whilst this approach doesn’t scale to other institutions (unless others model their API query structure on that of the Powerhouse), it does give end-users access to much richer records on Trove.
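To make "merge and adapt fields" and "infer structure" a little more concrete, here is a minimal sketch of what an aggregator-side adapter could look like. Everything in it is an assumption for illustration: the field names, the `FIELD_MAP` table and the `to_aggregator_record` function are hypothetical, and do not reflect the real Powerhouse Collection API or Trove's actual schema or code.

```python
from typing import Any, Dict

# Hypothetical mapping from source API field names to aggregator field names.
# These names are illustrative only, not the real Powerhouse or Trove schemas.
FIELD_MAP = {
    "title": "title",
    "summary": "abstract",
    "production_date": "date",
}


def to_aggregator_record(source: Dict[str, Any]) -> Dict[str, Any]:
    """Adapt one source record (a dict) into a flat aggregator record."""
    # Adapt: copy across the fields that have a direct equivalent,
    # skipping any that are missing or empty in the source.
    record = {
        target: source[src]
        for src, target in FIELD_MAP.items()
        if source.get(src)
    }

    # Merge: combine several descriptive fields into one description.
    parts = [source.get(key) or "" for key in ("description", "significance")]
    merged = "\n\n".join(p.strip() for p in parts if p.strip())
    if merged:
        record["description"] = merged

    # Infer structure: split a semicolon-delimited subject string into a list.
    subjects = [s.strip() for s in (source.get("subjects") or "").split(";")]
    subjects = [s for s in subjects if s]
    if subjects:
        record["subjects"] = subjects

    return record
```

The point of the sketch is the trade-off described above: an adapter like this is written against one institution's field names, which is why the approach gives richer records but does not carry over to other institutions unchanged.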

After Trove integration quietly went live last week there was an immediately noticeable flow of new visitors to collection records from Trove. And as Trove has used the API, these visitors are able to be accurately attributed to Trove for their origin. The Powerhouse will be keeping an eye on how these numbers grow and what sorts of collection areas Trove is bringing new interest to – and whether these interests differ from those of visitors arriving at collection records on the Powerhouse site through organic search, onsite search, or from other places that have integrated the Powerhouse collection as well, such as Digital NZ.
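How the attribution is actually done isn't spelled out here, so the following is only a rough sketch of one common alternative: bucketing visits by the hostname of the referring page. The hostnames, the `SOURCES` table and the `classify` and `tally` functions are all placeholders invented for the example, not the Powerhouse's real analytics setup.

```python
from collections import Counter
from urllib.parse import urlparse

# Placeholder hostnames standing in for an aggregator and a search engine.
SOURCES = {
    "aggregator.example": "aggregator",
    "search.example": "organic search",
}


def classify(referrer, own_host="collection.example.org"):
    """Return a coarse source label for one referrer URL."""
    host = urlparse(referrer).hostname or ""
    if not host:
        return "direct"   # no referrer recorded
    if host == own_host:
        return "onsite"   # arrived from elsewhere on our own site
    return SOURCES.get(host, "other")


def tally(referrers):
    """Count visits per source label."""
    return Counter(classify(r) for r in referrers)
```

Running `tally` over the referrers logged for a set of object pages gives the per-source counts needed to compare, say, aggregator visitors with organic search visitors.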

Stage two of Trove integration – soon – is planned to allow the Powerhouse to ingest any user-generated metadata back into the Powerhouse’s own site – much in the way it has ingested Flickr tags for photographs that are also in the Commons on Flickr.

This integration also signals the irreversible blending of museum and library practice in the digital space.

Only time will tell if this delivers more value to end users than expecting researchers to come to institutional websites. But I expect that this sort of merging – much like the expanding operations of Europeana – does suggest that in the near future museum collections will need to start offering far more than a ‘rich catalogue record’ online to pull visitors in from aggregator products (and ‘communities of practice’) like Trove to individual institutional websites.


Fiddling with Wolfram Alpha

Well, Wolfram Alpha is another nail in the coffin of the value of ‘raw data’ on the internet. And another reason why museums (and everyone else) need to emphasise interpretation, value add, and the ‘experience’ (Max Anderson’s ‘the visceral’). The raw materials will increasingly be free, easy to find, and ready for recombination and building upon. (Another reason why, if you are not seriously cataloguing, documenting and digitising, you are going to become invisible.)

I’m impressed with my initial fiddling around.

Once upon a time you would have found it best to visit the Sydney Observatory to find out where Beta Centauri is in the sky. They would have given you a sky chart – which you can now download monthly from our site with accompanying podcast, or buy the annual Sky Guide book.

Of course, you’ll still find the Observatory a great place for a nerdy date or to get a go on the big telescope, and savour the experience of the historic building and unique location.

Now for the sky and factual data I can just go to Wolfram Alpha and do this search. Notice it has given the result relative to my geographical position and the time in my location. Equally impressive is the ability to see the sources used to generate the information (critical in establishing trust), and the ability to download the result as a PDF.

Now go and try it with people, places and things . . . .

You’ve probably noticed Google has also done some nifty new enhancements to their search.

Here’s the Wonder Wheel

And the Timeline

Collection databases Search Web metrics

OPAC2.0 – Examining Delta Goodrem’s dress again / more on search

The most popular object in our online collection database is still a dress worn by Delta Goodrem.

I’ve previously written about how the popularity of this dress was driven in part by coverage on a number of Delta Goodrem fan forums. But this neglects the criticality of search. Google has always driven traffic to this object. In last month’s analytics, where Google search represented 86% of referrers to the object, the top 5 keywords used to discover this dress were these –

1. lisa ho – 11.24%
2. evening dresses – 4.55%
3. lisa ho dresses – 2.71%
4. formal dress – 2.13%
5. chiffon dress – 1.07%
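For anyone wanting to produce a similar list from their own logs, a keyword-share table can be sketched in a few lines. This is an illustration only: the `keyword_shares` function is hypothetical, and it assumes the denominator is all search-referred visits to the object that carried a keyword, which may not match how the analytics package behind the figures above calculates its percentages.

```python
from collections import Counter


def keyword_shares(keywords, top=5):
    """Return the top keywords with their percentage share of all visits.

    `keywords` holds one search phrase per search-referred visit.
    """
    # Normalise case and whitespace so 'Lisa Ho' and 'lisa ho' count together.
    counts = Counter(k.strip().lower() for k in keywords if k.strip())
    total = sum(counts.values())
    return [
        (phrase, round(100 * n / total, 2))
        for phrase, n in counts.most_common(top)
    ]
```

Calling `keyword_shares(phrases)` on a month's worth of logged search phrases returns (keyword, percentage) pairs ordered from most to least frequent.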

Because of the frequency of the keywords ‘lisa ho’ in the title, description and body text of the object record, and the trusted PageRank of the Powerhouse Museum domain, we rank 11th in Google search results for ‘lisa ho’; 2nd for ‘lisa ho dress’; and 4th for ‘lisa ho dresses’.

Fortunately for us, this external traffic isn’t fleeting. Visitors to this object view almost double the average number of pages viewed by others on our site; and they spend more time on the site too.

Looking at the internal search terms for that same object the results are very different.

1. Australian fashion (also a subject classification)
2. tennis (user tag)
3. lisa ho
4. delta goodrem
5. elegant (user tag)

External search has effectively driven nearly 10 times the traffic of internal users to this object. It has also brought audiences to the object who have very few behavioural similarities to those who search within the context of our own site (internal search). This creates many new challenges in terms of usability and user experience.

Over the entire collection there are pockets of objects for which the difference between internal and external search is not as great; however, this needs much greater data analysis (and may be the subject of a future post or paper).

Search User experience

SEO (search engine optimisation) basics and museums

One of the most common questions asked over the past few years has been “how do I get the best out of SEO for my museum?”. This comes up in casual conversations and without fail at conferences. We are all becoming increasingly aware of the higher and higher proportion of our traffic coming via search, and that as content on the web grows exponentially the chance of our content lying buried deep in search engine results increases.

Often the problem for museums with search relates to the diversity of their web presence. Other than our brand name, our content, especially that held in collections, is often very diverse and our exhibitions equally so. I’ve previously written about the need to tackle exhibition naming so that at least on the web exhibition titles are more ‘search-friendly’, but this is very tricky to apply to collection and education content.

The news media have taken to rewriting headlines for search – knowing that timeliness and findability are crucial to the success of their content – and Scott Gledhill’s fantastic SEO presentation from Web Directions South 2007 is an eye-opening look at how News Limited journalists in Australia are maximising the reach of their articles (link is to a full Slidecast).

Is this possible with museum content?

Should (and can) curators, education staff, marketing staff, get a quick dashboard that reports the web performance of the content they are creating? Should (and can) they iterate their content, improving it, guided by real world performance? If museums are ‘slow media’, then is performance-guided content creation even a desirable outcome? (Update: do we really want to get to a situation like this parodied in the Slate?)

Maybe you need to tackle the basics first – getting your key content more visible. So where do you start?

Fortunately there are plenty of great SEO resources on the web and plenty of ways of testing SEO performance for free or very low cost. Last month Web Designers Wall posted a simple introduction to SEO which is worthwhile reading for the very basics. This along with Scott’s presentation should provide a good start point.

Search User experience

User experience is all that matters – a reminder about content, search and users

Scott Karp over at Publishing 2.0 has been griping about his experience using his local newspaper website, which just so happens to be the Washington Post. Driven by a desire to find out about power cuts as a result of a storm, Karp was unable to quickly find what he wanted, and thus turned to other websites, finding them through Google.

Collection databases Geotagging & mapping MW2008 Search Semantic Web

MW2008 – Data shanty towns, cross-search and combinatory approaches

One of the popular sessions at MW2008 in Montreal was a double header featuring Frankie Roberto and myself talking about different approaches to data combining across multiple institutions.

Data combining was a bit of a theme this year with Mike Ellis, Brian Kelly and others talking mashups; Ross Parry, Eric Miller and Brian Sletten all talking ‘semantic web’; and Terry Makewell and Carolyn Royston demonstrating the early prototype of the NMOLP cross search.