API Collection databases Developer tools Museum blogging Tools

Powerhouse Museum collection WordPress plugin goes live!

Today the first public beta of our WordPress collection plugin was released into the wild.

With it and a free API key, anyone can now embed customised collection objects in grids in their WordPress blog. Object grids can be placed in posts and pages, or even as a sidebar widget – and each grid can have different display parameters and contents. It even has a friendly backend for customising, and because we’re hosting it through WordPress, the plugin can be auto-upgraded through your blog’s control panel whenever new features are added!

Here it is in action.

So, if you have a WordPress blog and feel like embedding some objects, download it, read the online documentation, and go for it.

(Update 22/1/11: I’ve added a new post explaining the backstory and rationale for those who are interested)

Collection databases

Quick Wikipedia citation code added to collection

Another of the many incremental changes slowly being added to the Museum’s collection database went live today – Wikipedia citation code.

You can now find this at the bottom of each object record (for example this Lawrence Hargrave Photographic Print) and if you happen to be editing an article in Wikipedia and need to reference one of the Powerhouse objects you can now just grab the code and paste it directly into Wikipedia’s editing interface.
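The snippet itself is just a pre-filled wikitext citation template. As a rough sketch of how such a snippet could be generated (the `build_wikipedia_citation` helper, the field values and the object URL here are illustrative, not the museum’s actual code):

```python
from datetime import date

def build_wikipedia_citation(title, object_id, url, publisher="Powerhouse Museum"):
    """Return a pre-filled {{cite web}} wikitext snippet for an object record."""
    today = date.today().isoformat()
    return ("{{cite web"
            f" |title={title}"
            f" |id={object_id}"
            f" |url={url}"
            f" |publisher={publisher}"
            f" |accessdate={today}}}}}")

snippet = build_wikipedia_citation(
    "Photographic print, Lawrence Hargrave",
    "85/112",
    "https://example.org/collection/object/85112",  # hypothetical URL
)
print(snippet)
```

The editor then pastes the printed snippet straight into Wikipedia’s edit box as a reference.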

Nothing too exciting but having seen the National Library of Australia do this a few months ago in their Australian Newspapers project we felt it was worthwhile doing too.

API Collection databases Conceptual Interviews Metadata

Making use of the Powerhouse Museum API – interview with Jeremy Ottevanger

As part of a series of ‘things people do with APIs’ here is an interview I conducted with Jeremy Ottevanger from the Imperial War Museum in London. Jeremy was one of the first people to sign up for an API key for the Powerhouse Museum API – even though he was on the other side of the world.

He plugged the Powerhouse collection into a project he’s been doing in his spare time called Mashificator, which combines several other cultural heritage APIs.

Over to Jeremy.

Q – What is Mashificator?

It’s an experiment that got out of hand. More specifically, it’s a script that takes a bit of content and pulls back “cultural” goodies from museums and the like. It does this by using a content analysis service to categorise the original text or pull out some key words, and then using some of these as search terms to query one of a number of cultural heritage APIs. The idea is to offer something interesting and in some way contextually relevant – although whether it’s really relevant or very tangential varies a lot! I rather like the serendipitous nature of some of the stuff you get back but it depends very much on the content that’s analysed and the quirks of each cultural heritage API.
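The pipeline Jeremy describes – analyse some text, pull out key terms, then use them to query a collection API – can be sketched roughly as follows. The keyword extraction here is a naive stand-in for services like Zemanta or Yahoo!’s term extractor, and the search endpoint is a hypothetical example, not a real API:

```python
import re
from collections import Counter
from urllib.parse import urlencode

STOPWORDS = {"the", "a", "an", "and", "of", "in", "to", "is", "it", "on", "for"}

def extract_keywords(text, n=3):
    """Naive term extraction: the most frequent non-stopword tokens."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS and len(w) > 3)
    return [word for word, _ in counts.most_common(n)]

def build_search_url(keywords, api_key="YOUR_KEY"):
    """Build a query against a (hypothetical) collection search endpoint."""
    params = urlencode({"q": " ".join(keywords), "api_key": api_key})
    return f"https://api.example.org/search?{params}"

page_text = "The steam engine transformed industry; early steam locomotives..."
terms = extract_keywords(page_text)
print(terms, build_search_url(terms))
```

A real implementation would call the analysis service over HTTP and then render whichever objects the collection API returns.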

There are various outputs but my first ideas were around a bookmarklet, which I thought would be fun, and I still really like that way of using it. You could also embed it in a blog, where it will show you some content that is somehow related to the post. There’s a WordPress plugin from OpenCalais that seems to do something like this: it tags and categorises your post and pulls in images from Flickr, apparently. I should give it a go! Zemanta and Adaptive Blue also do widgets, browser extensions and so on that offer contextually relevant suggestions (which tend to be e-commerce related) but I’d never seen anything doing it with museum collections. It seemed an obvious mashup, and it evolved as I realised that it’s a good way to test-bed lots of different APIs.

What I like about the bookmarklet is that you can take it wherever you go, so whatever site you’re looking at that has content that intrigues you, you can select a bit of a page, click the bookmarklet and see what the Mashificator churns out.

Mashificator uses a couple of analysis/enrichment APIs at the moment (Zemanta and Yahoo! Terms Extractor) and several CH APIs (including the Powerhouse Museum of course!) One could go on and on but I’m not sure it’s worthwhile: at some point, if this is helpful to anyone, it will be done a whole lot better. It’s tempting to try to put a contextually relevant Wolfram Alpha into an overlay, but that’s not really my job, so although it would be quite trivial to do geographical entity extraction and show a map of the results, for example, it’s going too far beyond what I meant to do in the first place so I might draw the line there. On the other hand, if the telly sucks on Saturday night, as it usually does, I may just do it anyway.

Beside the bookmarklet, my favourite aspect is that I can rapidly see the characteristics of the enrichment and content web services.

Q – Why did you build it?

I built it because I’m involved with the Europeana project, and for the past few years I’ve been banging the drum for an API there. When they had an alpha API ready for testing this summer they asked people like me to come up with some pilots to show off at the Open Culture conference in October. I was a bit late with mine, but since I’d built up some momentum with it I thought I may as well see if people liked the idea. So here you go…

There’s another reason, actually, which is that since May (when I started at the Imperial War Museum) it’s been all planning and no programming, so I was up for keeping my hand in a bit. Plus I’ve done very little PHP and jQuery in the past, so this project has given me a focussed intro to both. We’ll shortly be starting serious build work on our new Drupal-based websites so I need all the practice I can get! I’m still no PHP guru but at least I know how to make an array now…

Q – Most big institutions have had data feeds – OAI etc – for a long time now, so why do you think APIs are needed?

Aggregation (OAI-PMH‘s raison d’etre) is great, and in many ways I prefer to see things in one place – Europeana is an example. For me as a user it means one search rather than many, similarly for me as a developer. Individual institutions offering separate OPACs and APIs doesn’t solve that problem, it just makes life complicated for human or machine users (ungrateful, aren’t I?).

But aggregation has its disadvantages too: data is resolved to the lowest common denominator (though this is not inevitable in theory); there’s the political challenge of getting institutions to give up some control over “their” IP; the loss of context as links to other content and data assets are reduced. I guess OAI doesn’t just mean aggregation: it’s a way for developers to get hold of datasets directly too. But for hobbyists and for quick development, having the entirety of a dataset (or having to set up an OAI harvester) is not nearly as useful or viable as having a simple REST service to programme against, which handles all the logic and the heavy lifting. And conversely for those cases where the data is aggregated, that doesn’t necessarily mean there’ll be an API to the aggregation itself.

For institutions, having your own API enables you to offer more to the developer community than if you just hand over your collections data to an aggregator. You can include the sort of data an aggregator couldn’t handle. You can offer the methods that you want as well as the regular “search” and “record” interfaces, maybe “show related exhibitions” or “relate two items” (I really, really want to see someone do this!) You can enrich it with the context you see fit – take Dan Pett’s web service for the Portable Antiquities Scheme in the UK, where all the enrichment he’s done with various third party services feeds back into the API. Whether it’s worthwhile doing these things just for the sake of third party developers is an open question, but really an API is just good architecture anyway, and if you build what serves your needs it shouldn’t cost that much to offer it to other developers too – financially, at least. Politically, it may be a different story.

Q – You have spent the past while working in various museums. Seeing things from the inside, do you think we are nearing a tipping point for museum content sharing and syndication?

I am an inveterate optimist, for better or worse – that’s why I got involved with Europeana despite a degree of scepticism from more seasoned heads whose judgement I respect. As that optimist I would say yes, a tipping point is near, though I’m not yet clear whether it will be at the level of individual organisations or through massive aggregations. More and more stuff is ending up in the latter, and that includes content from small museums. For these guys, the technical barriers are sometimes high but even they are overshadowed by the “what’s the point?” barriers. And frankly, what is the point for a little museum? Even the national museum behemoths struggle to encourage many developers to build with their stuff, though there are honourable exceptions and it’s early days still – the point is that the difficulty a small museum might have in setting up an API is unlikely to be rewarded with lots of developers making them free iPhone apps. But through an aggregator they can get it in with the price.

One of my big hopes for Europeana was that it would give little organisations a path to get their collections online for the first time.
Unfortunately it’s not going to do that – they will still have to have their stuff online somewhere else first – but nevertheless it does give them easy access both to audiences and (through the API) to third party developers that otherwise would pay them no attention. The other thing that CHIN, Collections Australia, Digital NZ, Europeana and the like do is offer someone big enough for Google and the like to talk to. Perhaps this in itself will end up with us settling on some de facto standards for machine-readable data so we can play in that pool and see our stuff more widely distributed.

As for individual museums, we are certainly seeing more and more APIs appearing, which is fantastic. Barriers are lowering, there’s arguably some convergence or some patterns emerging for how to “do” APIs, we’re seeing bold moves in licensing (the boldest of which will always be in advance of what aggregators can manage) and the more it happens the more it seems like normal behaviour that will hopefully give others the confidence to follow suit. I think as ever it’s a matter of doing things in a way that makes each little step have a payoff. There are gaps in the data and services out there that make it tricky to stitch together lots of the things people would like to do with CH content at the moment – for example, a paucity of easy, free-to-use web services for authority records, few CH thesauri, no historical gazetteers. As those gaps get filled in, the use of museum APIs will gather pace.

Ever the optimist…

Q – What is needed to take ‘hobby prototypes’ like Mashificator to the next level? How can the cultural sector help this process?

Well in the case of the Mashificator, I don’t plan a next level. If anyone finds it useful I suggest they ask me for the code or do it themselves – in a couple of days most geeks would have something way better than this. It’s on my free hosting and API rate limits wouldn’t support it if it ever became popular, so it’s probably only ever going to live in my own browser toolbar and maybe my own super-low-traffic blog! But in that answer you have a couple of things that we as a sector could do: firstly, make sure our rate limits are high enough to support popular applications, which may need to make several API calls per page request; secondly, it would be great to have a sandbox that a community of CH data devotees could gather around/play in. And thirdly, in our community we can spread the word and learn lessons from any mashups that are made. I think actually that we do a pretty good job of this with mailing lists, blogs, conferences and so on.

As I said before, one thing I really found interesting with this experiment was how it let me quickly compare the APIs I used. From the development point of view some were simpler than others, but some had lovely subtleties that weren’t really used by the Mashificator. At the content end, it’s plain that the V&A has lovely images and I think their crowd-sourcing has played its part there, but on the other hand if your search term is treated as a set of keywords rather than a phrase you may get unexpected results… YTE and Zemanta each have their own characters, too, which quickly become apparent through this. So that test-bed thing is really quite a nice side benefit.

Q – Are you tracking use of Mashificator? If so, how and why? Is this important?

Yes I am, with Google Analytics, just to see if anyone’s using it, and if when they come to the site they do more than just look at the pages of guff I wrote – do they actually use the bookmarklet? The answer is generally no, though there have been a few people giving it a bit of a work-out. Not much sign of people making custom bookmarklets though, so that perhaps wasn’t worthwhile! Hey, lessons learnt.

Q – I know you, like me, like interesting music. What is your favourite new music to code-by?

Damn right, nothing works without music! (at least, not me.) For working, I like to tune into WFMU, often catching up on archive shows by Irene Trudel, Brian Turner & various others. That gives me a steady stream of quality music familiar and new. As for recent discoveries I’ve been playing a lot (not necessarily new music, mind), Sharon van Etten (new), Blind Blake (very not new), Chris Connor (I was knocked out by her version of Ornette Coleman’s “Lonely Woman”, look out for her gig with Maynard Ferguson too). I discovered Sabicas (flamenco legend) a while back, and that’s a pretty good soundtrack for coding, though it can be a bit of a rollercoaster. Too much to mention really but lots of the time I’m listening to things to learn on guitar. Lots of Nic Jones… it goes on.

Go give Mashificator a try!

Collection databases User behaviour Web metrics

Actual use data from integrating collection objects into Digital NZ

Two months ago the New Zealand cultural aggregator Digital NZ ingested metadata from roughly 250 NZ-related objects from the Powerhouse collection and started serving them through their network.

When our objects were ingested into Digital NZ they became accessible not just through the Digital NZ site but also through all manner of widgets, mashups and institutional websites that had integrated Digital NZ’s data feeds.

So, in order to strengthen the case for further content sharing in this way, we used Google Analytics’ campaign tracking functionality to quickly and easily see whether users of our content in Digital NZ actually came back to the Powerhouse Museum website for more information on the objects beyond their basic metadata.
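Campaign tracking of this sort just means decorating the outbound links with Google Analytics `utm_*` parameters so that inbound visits can be segmented by source. A minimal sketch (the parameter values are illustrative, not our actual campaign names):

```python
from urllib.parse import urlencode

def tag_for_campaign(url, source, medium="referral", campaign="collection-syndication"):
    """Append Google Analytics campaign parameters to an outbound link."""
    params = urlencode({
        "utm_source": source,
        "utm_medium": medium,
        "utm_campaign": campaign,
    })
    separator = "&" if "?" in url else "?"
    return f"{url}{separator}{params}"

tagged = tag_for_campaign(
    "https://www.powerhousemuseum.com/collection/database/?irn=29017",
    source="digitalnz",
)
print(tagged)
```

Any visit arriving via a tagged link then shows up in Analytics under that source/medium/campaign combination, which is what makes the per-partner breakdowns below possible.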

Here’s the results for the last two months.

Total collection visits from Digital NZ – 98 (55 from New Zealand)
Total unique collection objects viewed – 66
Avg pages per visit – 2.87
True time on site per visit (excluding single page visits) – 11:57min
Repeat visits – 37%

From our perspective these 55 NZ visitors are entirely new visitors (well, except for the 8 visits we spotted from the National Library of NZ who run Digital NZ!) who probably would never have otherwise come across this content, so that’s a good thing – and very much in keeping with our institutional goals of ‘findability’.

For the same period, here are the top 6 sources for NZ-only visitors to the museum’s collection (not the website as a whole) –

(click for larger)

Remember that the Digital NZ figure is for around only 250 discrete objects and so we are looking at just under 1 new NZ visitor a day to them via Digital NZ, whereas the other sources are for any of the ~80,000 collection objects.

However, I don’t have access to the overall usage data for Digital NZ so I can’t make a call on whether these figures are higher, lower, or average. But maybe one of the Digital NZ team can comment?

Collection databases Developer tools

Launch of the Powerhouse Museum Collection API v1 at Amped

Powerhouse API - Amped

This weekend just gone we launched the Powerhouse Collection API v1.

For the uninitiated the API provides programmatic access to the collection records for objects that are on the Powerhouse website.

For the technically minded, Version 1 returns JSON, JSONP, YAML and XML through a RESTful interface – chosen mainly so that interested people can “make something useful inside an hour”. Upcoming versions of the API are planned to return RDFa. (Already Allan Shone has independently added YQL!)
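As a sketch of what “make something useful inside an hour” looks like against a REST interface of this kind – noting that the endpoint path and parameter names below are illustrative guesses, not the documented Powerhouse API signature:

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

def collection_request_url(method, fmt="json", api_key="YOUR_KEY", **params):
    """Build a request URL for a hypothetical RESTful collection API."""
    query = urlencode({"api_key": api_key, "format": fmt, **params})
    return f"https://api.example.org/v1/{method}?{query}"

url = collection_request_url("search", q="lace", limit=10)
print(url)

# To actually fetch (requires a real endpoint and a valid key):
# records = json.load(urlopen(url))
```

Because the same URL pattern can return JSON, YAML or XML by switching the format parameter, the barrier to a first working prototype is very low.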

Now you may be asking why this matters, given we’ve been offering a static dataset for download for nearly a year already?

Well, the API gives access to roughly three times the volume of content for each object record – as well as structure and much more. Vitally, the API also makes internal Powerhouse web development much easier and opens up a plethora of new opportunities for our own internal products.

The main problem with APIs from the cultural sector thus far has been that they are under-promoted, and, like the cultural sector in general, rather invisible to those who are best placed to make good use of them. Having had experience with our dataset being used for GovHack, Mashup Australia (one of the highly commended entries was a Powerhouse browser) and Apps4NSW last year, we rushed the launch to coincide with Amped – the Web Directions free ‘hack day’ that was being held at the Powerhouse.

And, despite the stress of a quick turnaround (hence the minimal documentation right now!), we could not have had better timing.

Amped Sydney

Amped provided the perfect road test of the API. Carlos and Luke were able to see people using the product of their work and talk to them about their problems and suggestions. Nothing like combining user testing and stress testing all in one go!

Amped Sydney

Of the 250 people that attended Amped, 24 teams submitted prototype projects. 13 of these projects used the new Powerhouse API!

So, what did people do?

Amped Sydney

The winning project for the Powerhouse challenges was a collection interface which pivoted around an individual visitor’s interests and existing personal data – aimed at being deployed as an entry experience to the Museum – and developed by Cake & Jar (Andrea Lau & Jack Zhao).

Honourable mentions and runners up went to a Where In The World Is Carmen San Diego?-style game using the museum objects as the key elements in a detective story built with multiple APIs and entirely without a backend; a quite spectacular social browsing game/chat client built using the Go language; an accessibility-enhanced collection browser for the visually impaired; a collection navigator that emphasised provenance over time and space; and an 80s dungeon crawl-style graphical adventure collection organiser loosely inspired by (the magical) Minecraft.

Amongst the others were a very entertaining ‘story generator’ that produced Brion Gysin-esque ‘automatic writing’ using the collection documentation written by curators; a lovely mobile collection suggester using ‘plain English’ sentences as an entry point; and several collection navigators optimised for iPads using different types of interface and interaction design models (including My Powerhouse).


Now over to you.

Register for a free account and then create your access keys.

Read the (ever-growing) documentation. Then make stuff!

We’ll be watching what you do with great interest. And if you have any suggestions then email api [at] phm [dot] gov [dot] au.

Thanks especially to the inspired and pioneering work of our friends at the Brooklyn Museum, Digital NZ, and Museum Victoria. Their work has been instrumental in informing our decisions around the API.

(All photos by Jean-Jacques Halans, CC-BY-NC.)

Collection databases open content

Crossing the ditch – integrating our New Zealand objects with Digital NZ

If you regularly read this blog then it probably seems like things have been quiet here, but in fact we’re in one of our busiest periods ever. Today, though, some light through the clouds.

Our friends at Digital NZ (run by the National Library of New Zealand) switched on New Zealand-related Powerhouse objects in their federated meta-search. Now our wool samples and a stack of other objects can be found through any of the many institutions that have embedded the Digital NZ search in their own sites, as well as in mashups built on Digital NZ.

Here’s our wool samples appearing in the sidebar of Te Papa’s collection search, or in one of the nice mashups using the Digital NZ search called NZ Picture Show.

The integration with Digital NZ offers far greater (and more sensible) exposure to our New Zealand objects than expecting New Zealanders to find them initially through our own site. After all it is probably New Zealanders who will be best able to help us document them better. See Rule 1 – findable (was ‘discoverable’) content.

There’s a couple of things I’d like to point out about this.

Firstly, we (still) haven’t made a public API to feed our collection to Digital NZ. Instead they took our regularly updated collection zips, parsed them, and ingested the relevant records, pruning them as needed. Whilst it probably would have been nice if we had had an API for them, I get the feeling that being able to suck the whole data file down and play with it first made the ingestion process easier – even if it comes at the expense of immediate update-ability. Of course this will be addressed once our API goes live.
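Working from a downloadable dump rather than an API amounts to something like the following sketch. The real Digital NZ ingestion code and the dump’s actual column names are unknown to me; this just illustrates the “pull the whole file down and filter it” approach:

```python
import csv
import io

def nz_related_records(tsv_text, fields=("title", "description")):
    """Filter a tab-separated collection dump down to NZ-related records."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        haystack = " ".join(row.get(f, "") or "" for f in fields).lower()
        if "new zealand" in haystack:
            yield row

# A tiny inline stand-in for the real multi-megabyte dump:
sample = (
    "id\ttitle\tdescription\n"
    "1\tWool sample\tMerino wool from New Zealand\n"
    "2\tSteam engine\tBoulton and Watt engine\n"
)
print([r["id"] for r in nz_related_records(sample)])  # ['1']
```

The appeal for an aggregator is that everything can be inspected, pruned and re-run locally before anything is committed to their index.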

Second, I love how feeding this data to Digital NZ has immediately had a public benefit in that it is available through all the existing Digital NZ partners and mashups. The work that Digital NZ has done since launch is really remarkable and everyone who now contributes content to them builds upon all their work to date. Contrast this with the innumerable projects with which data is shared and then sits idle waiting for others to build things with it.

Third, there’s so much additional possibility now with our NZ-related data. Digital NZ users – you even – can go and suggest geo-locations for photos of ours like this one with a nice UI. And then we can, in the future, harvest that data back “across the ditch”. Effectively this data hasn’t just gone to an aggregation and presentation service, it has gone to an ‘enhancement’ service.

Fourth, you’ll probably notice that we’re using Google Analytics’ campaign tracking capabilities to have some rudimentary URL-based tracking of federated usage. This gives us the ability to segment out traffic to our collection records that comes via those records that are now visible through Digital NZ. Such use data is critical to building the ongoing business case to federate and release our collection metadata.

Huge thanks to Fiona Rigby, Andy Neale, Elliott Young and the rest of the team at Digital NZ for making this happen, and to Virginia Gow (now at Auckland Museum) and Courtney Johnson (now gone commercial) who kicked this idea off with us way back in September 2009. They more than deserve their Chocolate Fish now.

(Declaration of interest – I and several others of the digital teams at the Powerhouse are Kiwis!)

Collection databases Imaging

Full screen zooms and image tweaks in our collection

If you are a regular user of our collection database you might have noticed some very minor tweaks recently. One of the most obvious is a change to how we show object images.

For objects with small and low-quality images we’ve turned off zooming (example). Instead these images now explain why they are not available at higher resolution (because they haven’t been moved and rephotographed in recent times).
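The logic behind that change is essentially a threshold check on the stored image dimensions. A sketch of how such a rule might look (the threshold value is an illustrative guess, not our actual cut-off):

```python
def zoom_enabled(width_px, height_px, min_dimension=1024):
    """Only offer the zoom overlay when the master image is large enough
    to reveal more detail than the inline view already shows."""
    return min(width_px, height_px) >= min_dimension

print(zoom_enabled(3000, 2400))  # large master image: zoomable
print(zoom_enabled(640, 480))    # small legacy scan: zoom turned off
```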

For those that do zoom, we’ve popped them up in a larger overlay allowing for bigger views, partially in response to the ever increasing trend we are noticing in our analytics for bigger screen sizes.

We’ve also moved away from using Zoomify. As a result we can now support full screen zooms – just click the full screen icon once you’re in the zoomer. (Shortly we will have 3D object views too!) The full screen is a lovely effect and is going to, eventually, force us to up the resolution of a lot of the images in the collection!

(full screen zoom of H4052, Ship model, HMS Sirius)

We’re working on some new options for bigger images on the mobile web version of our collection too – which may even zoom on touch interface devices . . . stay tuned.

Collection databases open content

Malcolm Tredinnick on some problems with working with our collection dataset

Down at the recent Pycon we were excited to hear that Malcolm Tredinnick had taken the downloadable collection dataset from the Powerhouse and was using it to demonstrate some of the issues with working with (semi-)open datasets.

His presentation reveals what every museum knows – the datasets that exist in our collection databases are inherently messy. But we’re always working to improve the quality and structure of these datasets. Without them being publicly available to be worked on in new ways by non-museum people we’d never discover many of the flaws in them.
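The kinds of fixes involved are usually mundane. A sketch of a typical first-pass clean-up step over a record (not Malcolm’s actual code, which is on Github):

```python
import re

def clean_record(record):
    """Normalise whitespace, drop empty fields, strip stray markup remnants."""
    cleaned = {}
    for key, value in record.items():
        if value is None:
            continue
        value = re.sub(r"<[^>]+>", "", value)        # strip leftover HTML tags
        value = re.sub(r"\s+", " ", value).strip()   # collapse whitespace
        if value:
            cleaned[key] = value
    return cleaned

messy = {"title": "  Steam   engine ", "maker": "", "notes": "<b>Rusty</b>\n case"}
print(clean_record(messy))  # {'title': 'Steam engine', 'notes': 'Rusty case'}
```

It is exactly these unglamorous steps, applied at scale by outside developers, that surface the flaws we would otherwise never find.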

Here’s his presentation which is well worth watching if you are a developer or museum technologist and thinking of making your raw data available.

There’s some modifications and improvements coming to our downloadable data very soon – data release projects can’t just be a ‘set and forget’ arrangement.

Malcolm’s code for cleaning up our data is up on Github.

Collection databases User experience

Will schools use collection content? The Learning Federation Pilot Report

Over the last 12 months the Powerhouse, along with the National Museum of Australia and Museum Victoria, has been involved in supplying collection data to a joint pilot project between the Le@rning Federation (TLF) and the Council of Australasian Museum Directors (CAMD), which ran from March 2008 to May 2009.

Museums have always had difficulty preparing material to serve education audiences, and there hasn’t been a great deal of specific work done looking at how schools actually end up using museum materials. Nor has there been an emphasis on developing ways of speeding up the process of delivering collection records to schools in usable formats, (re)written appropriately for classroom integration. Instead, museums have tended to focus on developing separate areas of their websites holding bespoke content made for schools and aligned with State and National curricula – in many ways mirroring the often divisive split in museums between curatorial and research areas and ‘education’ areas.

This pilot project looked at changing this. First it trialled programmatic ways of integrating existing collection content into the everyday teaching in school environments and then evaluated the relevance and use of museum collection records in these scenarios.

Each institution selected a bundle of collection records (643 in total – 2300 were initially envisioned) for the trial and then supplied them using the ANZ-LOM schema. These records were quality checked by Learning Federation specialists and then integrated into their Scootle platform where they could be mixed with other learning assets, tagged, shared, remixed and brought into lesson plans.
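ANZ-LOM is an XML application profile of IEEE LOM, so “supplying records” here means emitting structured XML per object. A heavily simplified sketch of building such a record programmatically – the element names below are abbreviated and illustrative, as the real profile is far richer:

```python
import xml.etree.ElementTree as ET

def build_lom_record(title, description, url):
    """Assemble a minimal LOM-style metadata record for one object."""
    lom = ET.Element("lom")
    general = ET.SubElement(lom, "general")
    ET.SubElement(general, "title").text = title
    ET.SubElement(general, "description").text = description
    technical = ET.SubElement(lom, "technical")
    # The persistent URL that lets Scootle users drill down to the museum record:
    ET.SubElement(technical, "location").text = url
    return ET.tostring(lom, encoding="unicode")

xml = build_lom_record(
    "Wool sample",
    "Merino wool sample, New South Wales.",
    "https://example.org/object/12345",  # hypothetical persistent URL
)
print(xml)
```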

Schools, teachers and students discovered the objects with an ‘educational value statement’ through the Scootle portal and then could visit the museums’ own records directly (via persistent URLs) for further drilldown. This added a useful layer of contextualisation, discoverability, and syllabus mapping rarely found on the museums’ own websites (and never in collection databases).

Focus groups were then held with schools who were using the materials to look at exactly how museum objects were being used, and more importantly how teachers and students evaluated their usefulness.

The obvious hurdles of copyright, content suitability and writing style at the museums’ end, and teacher training at the schools’ end, were far greater than any of the technical data supply issues.

Tellingly –

Of the 643 digital resources provided to schools as part of the new model of collaboration between TLF and the three museums, 55 digital resources were selected by schools to include in collaborative learning activities. Of this number, six resources were used more than once. (p. 40)

. . .

Even though only a limited number of digital resources were available for the Trial, teachers were generally positive about the quality of these materials. While 73 per cent of teachers believed that the museum content was comparable in quality to other TLF resources, 100 per cent believed that it provided important background information and was well described for their purposes. (p. 41)

The report is available as a PDF from the Learning Federation directly (2mb).

Whilst the report is huge, it is important reading for everyone involved in trying to ensure museum content is written and delivered appropriately for the education sector.

Collection databases Web 2.0

Another OPAC discovery – the Gambey dip circle (or the value of minimal tombstone data)

New discoveries as a result of putting our incomplete collection database online are pretty commonplace – almost every week we are advised of corrections – but here’s another lovely story of an object whose provenance has been significantly enhanced by a member of the public – a story that made the local newspapers!

Here’s the original collection record as it was in our public database.

Now take a look at the same record a week later.

If your organisation is still having doubts about the value of making available un-edited, un-verified, ageing tombstone data then it is worth showing examples like these.