API Collection databases

More on museum datasets, un-comprehensive-ness, data mining

(Another short response post)

Thus far we’ve not had much luck with museum datasets.

Sure, some of us have made our own internal lives easier by developing APIs for our collection datasets, or generated some good PR by releasing them without restrictions. In a few cases enthusiasts have made mobile apps for us, or made some quirky web mashups. These are fine and good.

But the truth is that our data sucks. And by ‘our’ I mean the whole sector.

Earlier in the year when Cooper-Hewitt released their collection data on GitHub under a Creative Commons Zero license, we were the first in the Smithsonian family to do so. But as PhD researcher Mia Ridge found after spending a week in our offices trying to wrangle it, the data itself was not very good.

As I said at the time of release,

Philosophically, too, the public release of collection metadata asserts, clearly, that such metadata is the raw material on which interpretation through exhibitions, catalogues, public programmes, and experiences are built. On its own, unrefined, it is of minimal ‘value’ except as a tool for discovery. It also helps remind us that collection metadata is not the collection itself.

One of the reasons for releasing the metadata was simply to get past the idea that it was somehow magically ‘valuable’ in its own right. Curators and researchers know this already – they’d never ‘just rely on metadata’, they always insist on ‘seeing the real thing’.

Last week Jasper Visser pointed to one of the recent SIGGRAPH 2012 presentations which had developed an algorithm to look at similarities in millions of Google Street View images to determine ‘what architectural elements of a city made it unique’. I and many others (see Suse Cairns) loved the idea and immediately started to think about how this might work with museum collections – surely something must be hidden amongst those enormous collections that might be revealed with mass digitisation and documentation?

I was interested a little more than most because one of our curators at Cooper-Hewitt had just blogged about a piece of balcony grille in the collection from Paris. In the blogpost the curator wrote about the grille but, as one commenter quickly pointed out, didn’t provide a photo of the piece in its original location. Funnily enough, a quick Google search for the street address in Paris from which the grille had been obtained quickly revealed not only Google Street View of the building but also a number of photos on Flickr of the building specifically discussing the same architectural features that our curator had written about. Whilst Cooper-Hewitt had the ‘object’ and the ‘metadata’, the ‘amateur web’ held all the most interesting context (and discussion).

So then I began thinking about the possibilities for matching all the architectural features from our collections to those in the Google Street View corpus . . .

But the problem with museum collections is that they aren’t comprehensive – even if their data quality was better and everything was digitised.

As far as ‘memory institutions’ go, they are certainly no match for library holdings or archival collections. Museums don’t try to be comprehensive, and at least historically they haven’t been able to even consider being so. Or, as I’ve remarked before, it is telling that the memory institution that ‘acquired’ the Twitter archive was the Library of Congress and not a social history museum.

Collection databases Conceptual

Metadata as ‘cultural source code’

A quick thought.

Last week I wrote about collection data being ‘cultural source code’ in the context of the upload of the Cooper-Hewitt collection to GitHub.

As I wrote over there,

Philosophically, too, the public release of collection metadata asserts, clearly, that such metadata is the raw material on which interpretation through exhibitions, catalogues, public programmes, and experiences are built. On its own, unrefined, it is of minimal ‘value’ except as a tool for discovery. It also helps remind us that collection metadata is not the collection itself.

If you look at the software development world, you’ll see plenty of examples of tools for ‘collaborative coding’ and some very robust platforms for supporting communities of practice like Stack Overflow.

Yet where are their equivalents in collection management? Or in our exhibition and publishing management systems?

(I’ll be cross-posting a few ideas over the next little while as I try to figure out ‘what goes where’. But if you haven’t already signed up to the Cooper-Hewitt Labs blog, here’s another reminder to do so).

API Collection databases Search

Museum collection meets library catalogue: Powerhouse collection now integrated into Trove

The National Library of Australia’s Trove is one of those projects whose importance you only come to understand once it is built and ‘live in the world’. At its most basic, Trove provides a meta-search of disparate library collections across Australia as well as the cultural collections of the National Library itself. Being an aggregator, it brings together under the one Trove banner a number of different National Library products that used to exist independently, such as the very popular Picture Australia.

Not only that, Trove has a lovely (and sizeable) user community of historians, genealogists and enthusiasts that diligently goes about helping transcribe scanned newspapers, connect up catalogue records, and add descriptive tags to them along with extra research.

Last week Trove ingested the entirety of the Powerhouse’s digitised object collection. Trove had the collection of the Museum’s Research Library for a while but now they have the Museum’s objects too.

So this now means that if, in Trove, you are researching Annette Kellerman you also come across all the Powerhouse objects in your search results too – not just books about Kellerman but also her mermaid costume and other objects.

The Powerhouse is the first big museum object collection to have been ingested by Trove. This is important because over the past 12 months Trove has quickly become the first choice of the academic and research communities, not to mention family historians and genealogists. As one of the most popular Australian Government-run websites, Trove has become the default start point for these types of researchers, so it makes sense that museum collections need to be well represented in it.

The Powerhouse had been talking about integrating with Trove and its predecessor sub-projects for at least the last five years. Back in the early days the talk was mainly about exposing our object records using OAI, but Trove has used the Powerhouse Collection API to ingest. The benefits of this have been significant – and surprising. Much richer records have been able to be ingested and Trove has been able to merge and adapt fields using the API as well as infer structure to extract additional metadata from the Powerhouse records. Whilst this approach doesn’t scale to other institutions (unless others model their API query structure on that of the Powerhouse), it does give end-users access to much richer records on Trove.

After Trove integration quietly went live last week there was an immediately noticeable flow of new visitors to collection records from Trove. And as Trove has used the API, these visitors can be accurately attributed to Trove as their origin. The Powerhouse will be keeping an eye on how these numbers grow and what sorts of collection areas Trove is bringing new interest to – and if these interests differ to those arriving at collection records on the Powerhouse site through organic search, onsite search, or from other places that have integrated the Powerhouse collection as well such as Digital NZ.

Stage two of Trove integration – soon – is planned to allow the Powerhouse to ingest any user generated metadata back into the Powerhouse’s own site – much in the way it had ingested Flickr tags for photographs that are also in the Commons on Flickr.

This integration also signals the irreversible blending of museum and library practice in the digital space.

Only time will tell if this delivers more value to end users than expecting researchers to come to institutional websites. But I expect that this sort of merging – much like the expanding operations of Europeana – does suggest that in the near future museum collections will need to start offering far more than a ‘rich catalogue record’ online to pull visitors in from aggregator products (and ‘communities of practice’) like Trove to individual institutional websites.

Collection databases Interviews User behaviour

“Do curators dream of electric collection records?” Exploring how the Powerhouse online collection is used

As one of the first of a ‘new style’ of museum online collections, launching several internet generations ago in 2006, the Powerhouse Museum’s collection database has been undergoing a rethink in recent times. Five years is a very long time on the web and not only has the landscape of online museum collections radically changed, but so too has the way researchers, including curators, use these online collections as part of their own research practices.

Digging through five years of data has revealed a number of key patterns in usage which, when combined with user research, paint a very different picture of the value and usefulness of online collections. Susan Cairns, a doctoral candidate at the University of Newcastle, has been working with us to trawl through oodles of data, and interviewing users to help us think about what the next iteration of an online museum collection might need to look like.

I asked Susan a number of questions about what she’s been discovering.

F&N – You’ve been looking over the last few years of data for the Powerhouse’s collection database. Can you tell me about the different types of users you’ve identified?

Based on the Google Analytics, there seem to be four main types of OPAC users. I’ve given each of them a nickname, in order to better identify them.

The first group is the FAMILIARS, composed of people who access the OPAC intentionally. FAMILIARS know of the collection through either experience (having used the online collection previously, or from visiting the museum), or via reputation (ie GLAM professionals, researchers or amateur collectors). FAMILIARS come to OPAC with the highest level of expectations and have the most invested in the experience. Trust and authority are hugely important for the people in this segment.

The second group, I’ve called the SEEKERS. Like FAMILIARS, SEEKERS are driven by a desire for information they can trust. However, unlike FAMILIARS, SEEKERS do not yet know about the museum and/or its collection. This group includes people who are new to collecting communities, or student researchers etc. If they find what they are looking for on the OPAC, SEEKERS have the potential to become FAMILIARS.

The final group for whom authority and trust in information is important are the UTILISERS. These visitors, primarily education users (like school students), have specific and particular research needs, which are externally defined (ie they might be looking for answers to set questions). This group is task-oriented.

The last group that comes to the OPAC is the WANDERERS. These are casual browsers who seek fast and convenient information, but don’t necessarily need depth in their answers. Seb once nicknamed them “pub trivia” users, and that seems pretty apt.

F&N – What sort of proportions do each of these make up?

By far the greatest number of OPAC visitors are WANDERERS. More than 80% of all OPAC users – whether in a two-year period, or a six-month timeframe – visited the collection online once. Obviously not all of these will be WANDERERS, but a significant proportion of OPAC users are clearly coming to meet short-term information needs.

At the opposite end of the scale, around 5% of OPAC users visited the collection five times or more during the last six months. These visitors have the most invested in the current OPAC, having spent time learning to negotiate it.

F&N – Have these users changed over time? (As other collections have come online etc)

The actual make up over time doesn’t seem to have changed that much, although the numbers of visitors dropped a little after a peak in early 2010.

Having said that, there are seasonal trends in the users. The search terms that UTILISERS often use to find the collections (such as “gold license”) are more popular during the school year than at other times. Similarly search terms go through peaks, depending on media interest, such as a high number of searchers who come to the OPAC looking for Australian media personality Claudia Chan Shaw, whose dress is in the collection.

Some search terms are just weird. One of the most popular search terms ever was “blue fur felt” which skyrocketed to popularity in January – July 2010, but has not been used to bring visitors to the OPAC since.

F&N – Are overseas users different from Australian ones?

During the last six months, the OPAC actually had more international users than domestic ones, with the top ten countries of origin for international visitors including the USA, UK, Canada, New Zealand, India, Germany, France, the Netherlands and the Philippines. The search terms that lead international users to the OPAC are very different from those used within Australia. After all, many of the most searched-for items are those that link up with the school curriculum, and that is very Australia-specific. These items also make up a significant proportion of the most-looked-at references.

The search terms overseas users use to access the collection are often far more specific – such as particular clock brands etc, which would indicate a higher proportion of amateur collectors (SEEKERS and FAMILIARS) than WANDERERS.

Australian users spend longer on the site, and have a far lower bounce rate, so once on site they engage more.

F&N – You’ve been speaking to our curators about how they use our and others’ collection databases. What are some of the things you’ve learned from this?

Talking to the curators has been absolutely fascinating. Every single curator that I have spoken to has his or her own ways of researching and gathering collection information. Some curators rely heavily on books, while others spend a significant amount of time conducting face-to-face interviews. Others use websites like Trove, or conduct community consultation online, using wikis and blogs. However, every researcher utilises Google and the Web in some way in their search for information.

No matter how a curator conducts collection research however, all are looking for two main types of information. The first is the broad contextual information for an object that places it into an historical and social framing. This includes the broader history or biography of the creator or manufacturer, and information on the social period in which it is or was used.

The second type of information is specific to the object itself, and includes information about maker’s marks, the object’s history (including provenance, such as how, when and why it came into the collection, why it was owned and used), and any stories that relate specifically to the object.

In order to find this information however, very few of our curators use museum collection databases – even those curators who conduct a significant amount of their research online. The reasons for this varied, but emerging themes included a difficulty navigating online collections (once it could be located on the institution website in the first place), a sense of frustration at being unable to find relevant information/objects, and most important, a lack of trust in online collection databases.

Not one curator that I spoke to trusted either our own OPAC or other online collections as a resource that could provide complete and authoritative information. Where a number of curators did find online collections useful however, was in providing immediate access to images of objects and to get a sense of whether another institution held objects that might be important to their own search. Knowledge about what was in a collection was useful, but not necessarily the collection knowledge that was included in the online record.

A number of curators did use our own OPAC to see what information was being communicated to the public, and to answer public enquiries. However, it was very clear that there are ongoing issues with trust and authority.

Two things that did increase trust for curators however were good quality images (through which they could get a visual sense of the object), and PDFs of original documents. Curators trust that which they can see themselves. For most curators, their expertise is such that they will have an intuitive sense when information they come across is likely to be correct.

Following Susan’s initial work we started looking at the SEEKERS in more detail. Why were they coming to the site? And, more importantly, were they satisfied with what they found?

We’ve had a pop up survey running for the last two months – again using Kiss Insights – and the numbers have started coming in.

In order to survey only the SEEKERS we have set the survey to only show to visitors who’ve arrived via organic search, have visited at least three pages, and, obviously, are in the museum’s online collection. The survey, thus, has quite a limited reach and has been triggered by only 3900 visitors in the time – and has been completed by 229 respondents.
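The targeting rule and the response figures above can be sketched in a few lines of Python. This is only an illustration: the `Visit` fields and the `should_show_survey` function are hypothetical names and do not reflect how Kiss Insights itself is configured. Only the three conditions and the 3900 and 229 figures come from the survey described above.

```python
from dataclasses import dataclass


@dataclass
class Visit:
    # Hypothetical fields, used only for this illustration.
    referrer_type: str      # e.g. "organic_search", "direct", "referral"
    pages_viewed: int       # pages seen so far in this visit
    in_collection: bool     # True when the visitor is in the online collection


def should_show_survey(visit: Visit) -> bool:
    """Apply the three conditions described in the post."""
    return (
        visit.referrer_type == "organic_search"
        and visit.pages_viewed >= 3
        and visit.in_collection
    )


def completion_rate(completed: int, triggered: int) -> float:
    """Share of triggered surveys that were actually completed."""
    return completed / triggered


rate = completion_rate(229, 3900)
print(f"{rate:.1%}")  # prints 5.9%
```

At just under 6% completion, reaching 1000 responses would need the survey to be triggered many more times than it has been so far.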

It is somewhat heartening to find that the largest subgroup of Seekers – those doing ‘amateur research, hobbyist and collectors’ – feel the content they find is ‘good’, and that the lowest positive ratings are for the ‘other’ group. This is especially interesting if we look by object and see which object records are being rated as ‘poor’. Here we find a mix of well documented (at least according to us) and very scantily documented (no image, metadata last copied from a paper stock book entry in the 1980s).

Once we get to a critical mass of respondents – 1000 or more – in this group we should have some more actionable findings. Then we move on to looking at the other groupings.

API Collection databases Metadata open content Semantic Web

Things clever people do with your data #65535: Introducing ‘Free Your Metadata’

Last year Seth van Hooland at the Free University Brussels (ULB) approached us to look at how people used and navigated our online collection.

A few days ago Seth and his colleague Ruben Verborgh from the University Ghent launched Free Your Metadata – a demonstrator site for showing how even irregular metadata can have value to others and how, if it is released rather than clutched tightly onto (until that mythical day when it is ‘perfect’), it can be cleaned up and improved using new software tools.

What’s awesome is that Seth & Ruben used the Powerhouse’s downloadable collection datafile as the test data for the project.

Here’s Seth and his team talking about the project.

F&N: What made the Powerhouse collection attractive for use as a data source?

Number one, it’s available for everyone and therefore our experiment can be repeated by others. Otherwise, the records are very representative for the sector.

F&N: Was the data dump more useful than the Collection API we have available?

This was purely due to the way Google Refine works: on large amounts of data at once. But also, it enables other views on the data, e.g., to work in a column-based way (to make clusters). We’re currently also working on a second paper which will explain the disadvantages of APIs.

F&N: What sort of problems did you find with our collection?

Sometimes really broad categories. Other inconveniences could be solved in the cleaning step (small textual variations, different units of measurement). All issues are explained in detail in the paper (which will be published shortly). But on the whole, the quality is really good.
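To give a feel for how ‘small textual variations’ can be grouped in a cleaning step, here is a minimal Python sketch of key-based clustering. It is loosely modelled on the general idea of collapsing each value to a normalised key and grouping values that share one. It is not Google Refine’s own code, and the sample values are made up for illustration.

```python
import re
from collections import defaultdict


def fingerprint(value: str) -> str:
    """Collapse a value to a normalised key: trim, lowercase,
    drop punctuation, then sort the unique words."""
    cleaned = re.sub(r"[^\w\s]", "", value.strip().lower())
    tokens = sorted(set(cleaned.split()))
    return " ".join(tokens)


def cluster(values):
    """Group values sharing a fingerprint; keep only groups with variants."""
    groups = defaultdict(list)
    for value in values:
        groups[fingerprint(value)].append(value)
    return [group for group in groups.values() if len(group) > 1]


# Hypothetical sample values, not taken from the Powerhouse data file.
sample = ["Clothes brushes", "clothes brushes ", "Brushes, clothes", "Hat brushes"]
clusters = cluster(sample)
print(clusters)
# [['Clothes brushes', 'clothes brushes ', 'Brushes, clothes']]
```

Each cluster is a candidate for merging into one preferred spelling, which is a decision a person still has to make.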

F&N: Why do you think museums (and other organisations) have such difficulties doing simple things like making their metadata available? Is there a confusion between metadata and ‘images’ maybe?

There is a lot of confusion about what the best way is to make metadata available. One of the goals of the Free Your Metadata initiative, is to put forward best practices to do this. Institutions such as libraries and museums have a tradition to only publish information which is 100% complete and correct, which is more or less impossible in the case of metadata.

F&N: What sorts of things can now be done with this cleaned up metadata?

We plan to clean up, reconcile, and link several other collections to the Linked Data Cloud. That way, collections are no longer islands, but become part of the interlinked Web. This enables applications that cross the boundaries of a single collection. For example: browse the collection of one museum and find related objects in others.

F&N: How do we get the cleaned up metadata back into our collection management system?

We can export the result back as TSV (like the original result) and e-mail it. Then, you can match the records with your collection management system using record IDs.
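As a rough sketch of that matching step, the Python below merges a cleaned TSV back onto the original records by ID. The column names (`record_id`, `title`, `category`) and the sample rows are assumptions made for illustration. A real collection management system would have its own import route.

```python
import csv
import io


def load_tsv(text: str, id_column: str) -> dict:
    """Read TSV text into a dict keyed by the record ID column."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return {row[id_column]: row for row in reader}


def merge_cleaned(original: dict, cleaned: dict) -> dict:
    """Overlay cleaned fields onto the original rows, matching on record ID.
    Records with no cleaned counterpart are kept unchanged."""
    return {
        record_id: {**row, **cleaned.get(record_id, {})}
        for record_id, row in original.items()
    }


# Hypothetical sample data with assumed column names.
original_tsv = (
    "record_id\ttitle\tcategory\n"
    "1001\tHat brush\tbrushes \n"
    "1002\tClothes beater\tLaundry\n"
)
cleaned_tsv = "record_id\tcategory\n1001\tBrushes\n"

merged = merge_cleaned(
    load_tsv(original_tsv, "record_id"),
    load_tsv(cleaned_tsv, "record_id"),
)
print(merged["1001"]["category"])  # prints Brushes
```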

Go and explore Free Your Metadata and play with Google Refine on your own ‘messy data’.

If you’re more nerdy you probably want to watch their ‘cleanup’ screencast where they process the Powerhouse dataset with Google Refine.

Collection databases Mobile QR codes User experience

Making Love Lace – a cross device exhibition catalogue & the return of the QR

Estee Wah has been busy bringing Love Lace, our upcoming contemporary art exhibition, online. She’s been wrangling content and ensuring that the website is able to act as a fully fledged (and expanding) catalogue for the show as well as revealing much of the individual artists’ processes in a behind the scenes section.

This exhibition sees the return of QR codes to the Museum (as well as, later, the trial of the tracking pilot).

To solve one of the big problems with QR codes – that people just can’t be bothered downloading a QR code reading application (or firing it up if they do have one) – our internal developer Carlos Arroyo has built the exhibition iPhone and Android App with the QR code scanner built in! This means anyone who downloads the exhibition App – itself a full catalogue of the exhibition designed for in-gallery supplementary browsing – now also has their QR scanner at their fingertips.

As the QRs are scanned – from within the App – the relevant exhibition object immediately launches in the App. Carlos has also managed to nail down error correction and the scanning is now really good even in low light and on low resolution cameras.

Like for Sydney Design and the Go Play Apps we’re using Flurry to track in-App actions so we can see which objects get scanned and viewed.

(There’s even a mobile website that mimics the App – without the scanner – if you don’t choose to install the App!).

We’ll keep you updated on how it goes and when the automated tracking goes live. Carlos is already working on a v1.1 version of the App to roll out shortly with some new interaction options.

Try out the iOS App in the AppStore. And the Android version of the App is on the Android market also.

API Collection databases

Powerhouse Object Name Thesaurus now available via our API!

Luke Dearnley is at LOD-LAM this week and he and Carlos Arroyo are pleased to publicly announce that the Powerhouse Object Name Thesaurus is now available through our API.

The Object Name Thesaurus was developed by the Powerhouse Museum to standardise the terms used to describe its own collection. It was first published in 1995 as the Powerhouse Museum Collection Thesaurus. Since then, many new terms have been added to the thesaurus within the Powerhouse’s collection information and management system. The print version has long been popular with collecting institutions to assist in the documentation of their own collections.

Whilst you have been able to download the thesaurus as a PDF for a fair while, the API now makes it possible to build applications on top of the thesaurus to do things like explain terms or even expand the search on your own website to show results from ‘related or child terms’. And of course, if you’ve built applications using the Powerhouse Collection you can now show related parent and child objects. The thesaurus, like the rest of the API defaults to a CC-BY-NC license although you can approach the Museum for a variation on request.

The hierarchical structure of the thesaurus assists in searching. By organising object names, the relationships between objects can be made explicit. Object names are organised according to their hierarchical, associative or equivalence relationships. The object name thesaurus allows for more than one broader term for each object name: for example, ‘Bubble pipes’ has the broader terms of ‘Pipes’ and ‘Toys’. There is no single hierarchy in which an object name is located, enabling it to be found by searchers approaching with different concepts in mind.
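One way to picture a thesaurus with multiple broader terms is as a mapping from each term to a list of parents, rather than a single parent. The Python sketch below is an assumption about how one might model this on the client side, not a description of how the Powerhouse stores it. Only the ‘Bubble pipes’ example comes from the thesaurus itself.

```python
def all_broader_terms(term: str, broader: dict) -> list:
    """Walk upwards through every broader term, breadth-first,
    visiting each term once even if several paths lead to it."""
    found = []
    queue = list(broader.get(term, []))
    while queue:
        current = queue.pop(0)
        if current not in found:
            found.append(current)
            queue.extend(broader.get(current, []))
    return found


# Each term maps to a list of broader terms, so a term can sit in
# more than one hierarchy at once.
broader = {
    "Bubble pipes": ["Pipes", "Toys"],
    "Pipes": [],
    "Toys": [],
}

print(all_broader_terms("Bubble pipes", broader))  # ['Pipes', 'Toys']
```

A searcher starting from either ‘Pipes’ or ‘Toys’ can then arrive at ‘Bubble pipes’.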

Here’s an example of the sort of return you can now get from the API.

    {
        "status": 200,
        "end": 50,
        "start": 0,
        "result": 50,
        "terms": [
            {
                "status": "APPROVED",
                "scope_notes": "Any of a variety of brushes used to remove dirt and lint from clothing.",
                "term": "Clothes brushes",
                "num_items": 4,
                "num_narrower_items": 0,
                "relations": {
                    "narrower": [
                        {
                            "status": "APPROVED",
                            "scope_notes": null,
                            "term": "Hat brushes",
                            "num_items": 2,
                            "num_narrower_items": 23,
                            "id": 5104
                        }
                    ],
                    "broader": {
                        "status": "APPROVED",
                        "scope_notes": null,
                        "term": "Laundry equipment",
                        "num_items": 11,
                        "num_narrower_items": 0,
                        "id": 1189
                    },
                    "related": {
                        "status": "APPROVED",
                        "scope_notes": "Used to remove dust and dirt from clothing by beating.",
                        "term": "Clothes beaters",
                        "num_items": 0,
                        "num_narrower_items": 0,
                        "id": 2802
                    }
                }
            }
        ]
    }

The code snippet above shows the usage of terms (sometimes a bit like a definition) and the broader/narrower relationships between the terms themselves.

Laundry equipment is a broader term under which Clothes brushes sits. Clothes brushes are described as “Any of a variety of brushes used to remove dirt and lint from clothing.” and they have a single narrower term, Hat brushes.

Not only that, but Clothes brushes are related to Clothes beaters which are “Used to remove dust and dirt from clothing by beating”.

If you were, say, running a collection search (or even an ecommerce system) for old washing machines and related equipment, your application could use the Thesaurus in the API to make recommendations on your own site using the broader/narrower terms from our system. In that sense a search for “hat brushes” on your website could also be expanded to show results for “clothes brushes” and “clothes beaters”.
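As a minimal sketch of that kind of search expansion, the Python below takes one term record shaped like the API return shown earlier and collects the term plus its broader, narrower and related terms. The `expand_terms` helper is a hypothetical name, and fetching the record over HTTP is left out because the endpoint details are in the API documentation rather than this post. Note that in the sample return `narrower` is a list while `broader` and `related` are single objects, so the sketch accepts both shapes.

```python
def as_list(value):
    """Normalise a relation that may be missing, a single object, or a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def expand_terms(term_record: dict) -> list:
    """Return the term itself followed by its broader, narrower and
    related terms, without duplicates."""
    expanded = [term_record["term"]]
    relations = term_record.get("relations", {})
    for kind in ("broader", "narrower", "related"):
        for relation in as_list(relations.get(kind)):
            if relation["term"] not in expanded:
                expanded.append(relation["term"])
    return expanded


# A trimmed copy of the term record from the sample return above.
clothes_brushes = {
    "term": "Clothes brushes",
    "relations": {
        "narrower": [{"term": "Hat brushes", "id": 5104}],
        "broader": {"term": "Laundry equipment", "id": 1189},
        "related": {"term": "Clothes beaters", "id": 2802},
    },
}

print(expand_terms(clothes_brushes))
# ['Clothes brushes', 'Laundry equipment', 'Hat brushes', 'Clothes beaters']
```

The expanded list could then be fed into your own site search as extra terms, perhaps with lower weighting than the term the user actually typed.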

And of course, you can also get the Powerhouse objects under each of these categories.

Rough documentation is available (with better documentation coming soon).

We’ll be adding to this over the coming months and we’d love your thoughts on how this might be useful to you in your own applications.

Collection databases open content Powerhouse Museum websites

Introducing the alpha of the Museum Metadata Exchange

The Museum Metadata Exchange (MME) is a project that started mid last year (2010) as a collaboration between the Council of Australasian Museum Directors (CAMD) and Museums Australia (MA). Funded by the Australian National Data Service (ANDS), the project is key infrastructure to deliver museum collection-level descriptive (CLD) metadata to the Australian Research Data Commons (ARDC).

That’s acronym city. So here’s the human-readable version.

The MME takes a different approach to collections. Instead of focussing at the object or item level, it moves up a notch to ‘collection level’. This has the benefit of providing an overview, a meaning and a scope that can be hard to ‘see’ at object level – especially if you were, say, looking for which museums had shoes made in the 1950s and worn in Australia. The other benefit of collection level descriptions is that the objects grouped in this way don’t necessarily need to be online or digitised (yet) in order to be discovered.

The project is funded by ANDS in order to ensure that these descriptors of museum collections are added to the Research Data Commons to be used and explored by academic researchers. In many ways this makes a lot of sense – academic researchers are far more likely than general web users to need to come and see the ‘real’ objects and make long term connections with staff at the host museum to conduct their research. And so, by exposing collection level descriptions especially for ‘yet to be digitised’ collections, the project is pulling back the curtain on those hidden gems held by museums across Australia. In fact, several of the staff working on the project who deal with objects everyday were regularly surprised by what they were finding in other people’s collections – “oh I had no idea that they had some of those too!”.

Collection level descriptions have provenance and descriptive metadata along with semi-structured subject keywords, temporal, spatial and relational metadata. (Here’s a list of 66 Powerhouse collections and a single record on our rather excellent Electronic Music Collection.)

The first public iteration pulls together nearly 700 collections from 16 museums across Australia and future iterations will add more – primarily major regional collections, I would expect.

But . . .

The site itself is really a simple public front-end for a data transformation service. It isn’t supposed to be the primary place for anyone, not even researchers, to search or browse these collection level descriptions. It is a transformation and transport mechanism that acts as broker between the individual museums and the Research Data Commons. To this end anyone can download the XML feed of the collection level data from the site – this is the same data that gets passed on to the Commons.

Of course, we’ve tried to ‘pretty-up’ the rawness of the site a bit. The first iteration has lovely identity work done by emerging Newcastle-based designer Heath Killen. But the search is very rudimentary and there is currently no way to pivot by keywords or do the temporal or spatial searching – this sort of functionality is supposed to be handled by the various academic interfaces for the data once it reaches the Research Data Commons. We will add this to the MME site itself over time.

Go and have a bit of an explore – the best way of understanding the project is by taking a look at the sort of data that is already in it. If you’d like some more detailed background information the project also has a microsite for contributing institutions.

Oh, and, we’re expecting to release the Powerhouse Object Name Thesaurus (already downloadable as a PDF) as a data service shortly as part of this project too. This thesaurus has been used by the project to start to normalise the data to a degree and it is expected that by making the thesaurus available as a data service, that there will be both read and write opportunities . . . .


Australian Dress Register – public site goes live

The first iteration of the public front end of the Australian Dress Register went live a few weeks back. This release makes visible much of the long data gathering process with regional communities that began in 2008 and continues as more garments are added to the Register over time.

The ADR is a good example of a distributed collection – brought together through regional partnerships. Many of the garments on the site are held by small regional museums or, in some cases, private collectors and families. It is only through their rigorous documentation and then aggregation that it becomes possible to tell the national stories that relate to changes in clothing over the last 200 years.

The ADR extends the standard collection metadata schema that we use for documentation at the Powerhouse with a large range of specific data fields for garment measurements and the quality of preservation. These have been added to allow costume and social history researchers to explore the data in greater detail and granularity. A good way to see the extra level of detail in the ADR is to compare a record on ADR with the same object record in the host institution’s own collection (where it is available online).

Here’s the child’s fancy dress costume from 1938 on the Powerhouse site, side by side with the same object on the ADR. (Click to view the full records)

The Resources section of the site helps volunteers and contributors who lack the capacity of the major capital city museums to better understand best-practice methods of preserving, documenting and digitising their garments, and includes a range of simple how-to videos.

The Browse and Search uses Solr on the backend and offers extensive faceting (Here’s just the discoloured garments with buttons). There are multiple views for search results with configurable list and grid views, and relevance, recency and alphabetical result ordering.
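To give a feel for what faceted browsing looks like at the Solr end, here is a rough sketch of building such a query. The parameters (`q`, `fq`, `facet`, `facet.field`, `sort`, `rows`, `wt`) are standard Solr request parameters, but the endpoint URL and every field name (`condition`, `fastening`, `decade`, `date_added`, `title_sort`) are assumptions made up for this example, not the ADR's actual schema.

```python
from urllib.parse import urlencode

# Assumed endpoint and field names, for illustration only.
SOLR_SELECT_URL = "http://localhost:8983/solr/garments/select"

# The three result orderings mentioned above, mapped to Solr sort clauses.
# "date_added" and "title_sort" are hypothetical field names.
SORT_ORDERS = {
    "relevance": "score desc",
    "recency": "date_added desc",
    "alphabetical": "title_sort asc",
}


def build_search_url(text="*:*", filters=None, facet_fields=(), order="relevance", rows=20):
    """Build a Solr select URL with filter queries and facet counts."""
    params = [
        ("q", text),
        ("sort", SORT_ORDERS[order]),
        ("rows", str(rows)),
        ("wt", "json"),
    ]
    # Each selected facet value narrows the result set via its own fq.
    for field, value in (filters or {}).items():
        params.append(("fq", f'{field}:"{value}"'))
    # Ask Solr to return counts for the facets offered to the user.
    if facet_fields:
        params.append(("facet", "true"))
        for field in facet_fields:
            params.append(("facet.field", field))
    return SOLR_SELECT_URL + "?" + urlencode(params)
```

Something like `build_search_url(filters={"condition": "discoloured", "fastening": "buttons"}, facet_fields=["decade"])` would then correspond to the 'discoloured garments with buttons' view, with each facet selection kept as its own filter query so it can be added or removed independently.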

The Timeline is one of the visual highlights of the site, and is rather cool from a technical perspective too. As the collection grows, the Timeline and Browsing features will become more valuable ways to traverse the rich content.

There’s a lot more to go with this site and you’ll be seeing many more records contributed from around the country over the coming months.


Behind the Powerhouse collection WordPress plugin

Yesterday we went live with the first version of the Powerhouse Museum collection WordPress plugin. Rather than clutter that launch blogpost up with the backstory and some of its implications, here’s the why and how, and what next.

The germination of the WordPress plugin was the aftermath of the Amped Hack Day run by Web Directions at the Powerhouse where we launched the Museum’s collection API.

Whilst the API launch had been a success, Luke (Web Manager/Developer) and Carlos (Developer) and I were a little disappointed that although we’d launched a REST API, we had actually made it more difficult for the ‘average interested person’ to do simple programmatic things with our collection data.

Of course, we’d built the API primarily to make our own lives easier in developing in-museum applications and the next wave of online and mobile collection projects you will be hearing about over the coming 12 months. But we’d also aimed to have the API broaden the external use of our collection data and solve some of the ‘problems’ with our existing ‘download the database‘ approach.

In fact, ‘download the database’ had worked well for us. Apart from the data being used in several projects – notably Digital NZ and one of the highly commended entries in 2009’s Mashup Australia contest – we’d found that the database as a whole item was being used to teach data visualisation and computer science in various universities both in Australia and overseas. We’d also found that people in the digital humanities were interested in seeing the ‘whole view’ that that data dump provided.

None of these groups were well catered for by the API and one of our team, Ingrid Mason, ended up convincing us to retain the ‘download the database’ option alongside the API, rather than forcing everyone through the API. Her argument revolved around the greater, and hitherto underestimated, value of being able to ‘see the whole thing’.

At the same time, WordPress had become a de facto quick and dirty CMS for most of the Museum’s web projects. We’ve run annual festival websites (Sydney Design), whole venue websites (Sydney Observatory), exhibition microsites (The 80s are back), and experimental pilots (Suburb Labs) on WordPress over the past few years. Along the way we’ve built up both internal skills and external relationships, to the point where the graphic designers we work with supply designs conscious of the limitations of WordPress. In each of these sites we’ve had a need to integrate collection objects and this has usually meant ugly PHP code in text widgets.

(Don’t be concerned – for larger and complex projects we have been migrating to Django)

[phm-grid cols=4 rows=1 v_space=1 h_space=1 thumb_width=120 thumb_height=120 random=true parameters="title:computer"]

So in the weeks after Amped, Carlos spent time developing up a WordPress plugin based entirely on the API. This, it was seen, would serve two purposes – firstly, allow us to embed the collection quickly into our own WordPress websites; and secondly, to give interested non-programmers a simple way to start using our API in their own sites.
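For the curious, a shortcode like the `phm-grid` one is really just a tag plus a bag of key/value settings. The sketch below shows one way such a string could be pulled apart. It is written in Python purely for illustration and is not the plugin's own code (which is PHP and relies on WordPress's shortcode handling); the function name and the parsing rules are assumptions.

```python
import re

# Matches key=value pairs where the value is either wrapped in straight
# double quotes or is a bare run of non-space characters.
ATTR_PATTERN = re.compile(r'(\w+)=(?:"([^"]*)"|(\S+))')


def parse_shortcode(text):
    """Split a shortcode such as [phm-grid cols=4 rows=1] into (tag, attrs)."""
    inner = text.strip()
    if not (inner.startswith("[") and inner.endswith("]")):
        raise ValueError("not a shortcode: %r" % text)
    # The tag is everything up to the first space; the rest is attributes.
    tag, _, rest = inner[1:-1].partition(" ")
    attrs = {}
    for key, quoted, bare in ATTR_PATTERN.findall(rest):
        attrs[key] = quoted if quoted else bare
    return tag, attrs


def thumbnails_requested(attrs):
    """A grid of cols x rows needs this many objects (1 x 1 if unspecified)."""
    return int(attrs.get("cols", 1)) * int(attrs.get("rows", 1))
```

Fed a four-column, one-row `phm-grid` shortcode, `thumbnails_requested` comes back with 4, and the `parameters` value is presumably what a plugin would hand on to the API as its search filter.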

Late last year we sent the alpha version out to some museum web people we knew around the world for feedback, and then Carlos tweaked the plugin in between working on other projects, before its first public outing in the WordPress plugin repository.

So where now?

The WordPress plugin is definitely a work-in-progress.

We’re keeping a keen eye out for people implementing it on their blogs and WordPress sites. (If you’ve implemented it in something you’ve done then tell us!)

Carlos has several features and fixes already on his radar that have come out of our own uses of the plugin – some of these are tied to limitations in the data currently available through the API.

If you’ve got feature requests then we’d love to hear them – and we’re secretly hoping that those of you who are deeply into Drupal or Expression Engine might port the plugin to those platforms too.

Send your feedback to api [at]

(Luke is also presenting a paper on the API experience at Museums and the Web in Philadelphia this year)