Collection databases Folksonomies Web 2.0

OPAC2.0 – Collection bulk tagging application launched

Today we finished our long-awaited ‘bulk tagging’ application.

I’d encourage you to give it a go and send us some feedback.

We are particularly interested in museum professionals and amateur collecting organisations adding tags in volume to our collection. The application currently targets the user tagging of objects in our collection that have not been formally catalogued, or whose formal cataloguing data is not visible in the online database for various reasons.

Bulk Tagger is an experimental application that gives quick access to tag multiple objects in our collection database from a single webpage. One of the key problems we have identified with social tagging of our collection is that there just isn’t enough tagging going on. The tags that are added do have significant benefit in making certain collection records more easily discoverable, but only about 3000 records have been tagged so far.

Bulk Tagger is currently being targeted at specialist user communities as a way of rapidly increasing our pool of user tags.

We are tracking tagging behaviour, and tags added via Bulk Tagger are identified as such so that they can be quarantined from the mass public tagging if needed in future research.

Each screen shows five objects which have not yet been tagged. Users can add multiple comma-separated tags to these objects and then submit them. Upon submission, another five objects will appear. Clicking on an object thumbnail will pull up more information about the object.
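For the technically curious, here is a minimal sketch of the screen-by-screen flow described above. It is not the Bulk Tagger source: the function names, the in-memory dictionary, the lower-casing of tags and the `bulk_tagger` source label are all assumptions made for illustration.

```python
def parse_tags(raw):
    """Split a comma-separated string into clean, de-duplicated tags."""
    seen = set()
    tags = []
    for piece in raw.split(","):
        # Trim and collapse whitespace; lower-casing is an assumption.
        tag = " ".join(piece.split()).lower()
        if tag and tag not in seen:
            seen.add(tag)
            tags.append(tag)
    return tags


def next_batch(object_ids, tags_by_object, size=5):
    """Return up to `size` object ids that have no tags yet."""
    batch = []
    for object_id in object_ids:
        if not tags_by_object.get(object_id):
            batch.append(object_id)
            if len(batch) == size:
                break
    return batch


def submit(tags_by_object, submission, source="bulk_tagger"):
    """Store the tags from one screen, recording where they came from."""
    for object_id, raw in submission.items():
        for tag in parse_tags(raw):
            # Keeping the source with each tag lets bulk tags be
            # separated from general public tags later on.
            tags_by_object.setdefault(object_id, []).append((tag, source))
```

Calling `next_batch` again after `submit` skips anything that now has tags, so the next screen shows a fresh set of untagged objects.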

This is an early release experimental product only.

Concept and programming Luke Dearnley & Sebastian Chan, Powerhouse Museum.

Folksonomies UKMW07 Web 2.0 Young people & museums

A reminder about user incentives

Since Friday at UK Museums and the Web 2007 I keep being asked about my scepticism over explicit tagging in museums. “Why do I think that users don’t really have much natural incentive to tag our collections or content?”

Over at Bokardo there is a post dating back to 2006, titled The Del.icio.us Lesson, which looks at why del.icio.us has been successful.

The one major idea behind the Del.icio.us Lesson is that personal value precedes network value. What this means is that if we are to build networks of value, then each person on the network needs to find value for themselves before they can contribute value to the network. In the case of del.icio.us, people find value saving their personal bookmarks first and foremost. All other usage is secondary.

As people use del.icio.us more, and in order to gain more personal value, they use tags to be able to find their bookmarks later. Tagging isn’t even the primary function of del.icio.us. Most of the tagging done on del.icio.us is done secondarily, and for personal use.

The social value of tags on del.icio.us is only a happy side-effect. Even though most of the ink spilled about del.icio.us is about the social value, it’s really not the reason why people use it.

Now this is again a case of strategy first, technology second – those who attended my recent workshops will know clearly what I mean. If Forrester is correct and about 15% of US internet users have tagged something in the preceding month, then we need to be careful not to leap to the conclusion that 15% tag frequently, let alone tag on all sites that offer tagging. Situational relevance and motivation also play a big part in the choice of which services people use.

If tagging is about engaging users and “bridging the semantic gap” then what other strategies might achieve the same end result?

We cannot give the same user incentives as the tagger who tags their images in Flickr or the tagger who tags their bookmarks in Delicious. We can, however, target our committed volunteers and amateur and affiliated societies, but the user needs and UI design may be very different for those communities.

Collection databases Folksonomies Museum blogging Web 2.0 Wikis

A reminder about ‘participation inequality’

I’m busy preparing a couple of new and remixed presentations for delivery in the northern hemisphere in the next few weeks and Tony Walker over at the ABC reminded me about this excellent summary of Participation Inequality by usability evangelist Jakob Nielsen.

How to Overcome Participation Inequality

You can’t.
The first step to dealing with participation inequality is to recognize that it will always be with us. It’s existed in every online community and multi-user service that has ever been studied.

Your only real choice here is in how you shape the inequality curve’s angle. Are you going to have the “usual” 90-9-1 distribution, or the more radical 99-1-0.1 distribution common in some social websites? Can you achieve a more equitable distribution of, say, 80-16-4? (That is, only 80% lurkers, with 16% contributing some and 4% contributing the most.)

Although participation will always be somewhat unequal, there are ways to better equalize it.

In our collection database, tagging represents less than 0.01% of activity on the site. But, because we also do some neat search tracking, we can combine a very low level of tagging (folksonomy) with our existing rich taxonomies and the ‘read wear’ trails left by users in browsing the site to enhance the user experience for everybody.

Others ask me – “I have a blog but no-one ever posts comments, why?”. The answer to which is usually, “are you writing your posts in a way that leaves space open for people to respond simply and quickly?”.

The danger in all this quick uptake of social media amongst the cultural sector is that we often overestimate how much our audiences want to participate. Sure, in our physical spaces we see them interacting with our on-floor interactive experiences but we then make the mistake of thinking that this will transfer over to the online space. Participation is not the same as interaction – interaction is a much more transient activity whereas participation generally requires effort over time. My advice in the online space is to implement solutions that require, as Nielsen writes, “zero effort” to participate – this is why we do so much work around user tracking and making that tracking simultaneously transparent and, paradoxically, invisible.

Try it.

Here’s my well-trotted-out example – search for ‘cricket’ in our collection database.

What does it recommend as ‘related searches’? Other sports and some other words as well usually – it changes dynamically over time which reflects the different patterns of usage and association over time.

Why? Because other users like yourself have told it that these words are related to ‘cricket’.

Have they done so explicitly? No. They just browse the site and their behaviour tells our system that certain terms are related. There is ‘zero effort’ on the part of the user.

How? Ahhh, that’d be telling . . . come to one of my future presentations and find out.

Collection databases Folksonomies Metadata Web 2.0

M&W07 – Day two: Rijksmuseum & CHIP

The Rijksmuseum in Amsterdam and several Dutch universities have been working on an exciting collection project which uses ratings and user profiles to recommend art to users. Whilst I was a little sceptical of their ‘ratings’ (1 to 5 stars) as a means of describing art, the recommendation tools and prototype interface were fascinating. Also exciting was the means by which they exposed the ‘recommendations’ – ‘You are recommended these because . . .’ is very reminiscent of Amazon’s additions of the last few years.

The most striking thing about CHIP, though, was the ability for the user to generate a printable/downloadable map customised to show them their favourite and recommended artworks. This high level of integration between the online recommendations and the gallery floor is something we are thinking a lot about at the Powerhouse Museum in our OPAC project – especially for use at our Castle Hill open storage facility.

Collection databases Folksonomies MW2007 Web 2.0

M&W07 – Day two: Tagging & Tracking / OPAC2.2

Thanks to all who came to my paper presentation.

The paper is online over at Archimuse or if you are attending it is also in the printed proceedings (which is a little easier to read on public transport). You can also download my slides but bear in mind they need to be viewed in conjunction with the paper itself.

Apologies to the questioner who asked why we don’t allow logins to let people keep track of the tags they have added. It was a good question which I rather abruptly passed over. The problem with logins is that they raise another barrier to participation – at least at this early stage. Whilst I understand that some power users would then get the ability to create a ‘MyTags personalisation’, the risk of deterring other users is high – I’d put the ratio of power users to casual users at probably 1 in 100, if not more. At the moment I think we have the balance right with tagging and we are still analysing the usage – remembering that our tags are more for navigation and discovery than for descriptive purposes (unlike, say, in an art museum). We might add logins at a later stage, however.

Thanks to Ian Johnson for the great suggestion about adding a ‘do you really want to delete that tag’ dialog to the tag deletion to prevent accidental deletion. We will implement that pretty much straight away I think.

Folksonomies Web 2.0 Web metrics

Pew Research on tagging

Pew Research Center: Tagging Play is a short report looking at who tags content on sites like Flickr, YouTube, Del.Icio.Us and the others.

I’d strongly recommend reading the brief report. There are some basic demographics in the report and a short piece on what tagging means.

A December 2006 survey by the Pew Internet & American Life Project found that 28% of internet users — and 7% on any typical day — have tagged or categorized online content such as photos, news stories or blog posts.

(via Russ Weakley)

Collection databases Folksonomies Web 2.0

OPAC2.0 – Multiple images and new acquisitions added

A few minor additions to our collection database have been implemented today. These have been on the ‘to-do’ list for a long time!

Multiple images

Ever since OPAC2.0 launched we have been hiding multiple images of objects. Now they are all publicly accessible by clicking the numbers on the bottom right below the zoomable image. If no numbers appear then there is only the main image available.

Here are a few examples where you can now get different views of the same object record.

+ Hedda Morrison’s camera and accessories
+ Bleriot XI monoplane
+ 1969 Australian one cent coin

There are plenty more.

We have also implemented captions for these images where they exist.

The impetus, other than the availability of some spare time in which to do it, was a new internal kiosk for the Transport Gallery that uses the same backend database as the OPAC and required multiple images. The OPAC kiosk launched at the Museum on December 20 as part of a sound and light show called Further, Faster, Higher.

New acquisitions

Also as part of creating a simple image grid layout for the kiosk we were able to quickly implement a visual object browser based on date of acquisition.

Users can now view our latest acquisitions as they are catalogued by year. This gives a quick entry point into the collection.

Latest statistics

By the end of the month we will have served up 6 million object records since launch in June.

Of these –

~915,000 have been discovered via text searches (23,000 unique search terms),
~947,000 via tag cloud/user keywords (3,500 keywords added),
~330,000 via subject keywords,
~200 via OpenSearch.

This leaves 3.8 million records (63%) found by direct discovery – either via hyperlinking from other parts of our website (or other websites), or (probably primarily) via Google and other search engines.
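For anyone wanting to check the arithmetic, the breakdown can be reproduced in a few lines. The figures are the rounded ones quoted in this post; the variable names are just for illustration.

```python
# Object records served since launch, using the rounded figures above.
total_served = 6_000_000
via = {
    "text searches": 915_000,
    "tag cloud / user keywords": 947_000,
    "subject keywords": 330_000,
    "OpenSearch": 200,
}

attributed = sum(via.values())        # records reached through tracked routes
direct = total_served - attributed    # everything else: links, Google, etc.
direct_share = direct / total_served

print(f"{direct:,} records ({direct_share:.1%}) by direct discovery")
# prints "3,807,800 records (63.5%) by direct discovery"
```

That is the 3.8 million, or roughly 63%, quoted above.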

Our specialist design portal, Design Hub, which uses the same backend object database, has also served up 185,000 design-related objects via searches on its site since its launch in August.

Folksonomies Web 2.0 Web metrics

OPAC2.0 – Search term frequency and the influence of interface

I’ve started preparing some work on search term frequency in our collection database.

The system is set up to track only successful searches – which we define as those that result in a user selecting an item from the search results. Taking figures generated last week, the database has served up over 1.87 million successful searches since launch (June 2006), whilst nearly 5 million objects have been viewed. Obviously users are getting to objects via direct links or using third party searches (Google etc., or our OpenSearch feed) to get directly to records.
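To make the ‘successful search’ definition concrete, here is a rough sketch of how such counting could work over a simple event log. The log format, the session ids and the function name are all assumptions for illustration; this is not how our tracking is actually implemented.

```python
from collections import Counter


def successful_searches(events):
    """Count search terms that led to an object being selected.

    `events` is an iterable of (session_id, action, value) tuples in time
    order, where action is "search" (value is the term) or "view" (value
    is an object id). This log format is hypothetical.
    """
    pending = {}        # session_id -> search term still awaiting a click
    counts = Counter()
    for session_id, action, value in events:
        if action == "search":
            pending[session_id] = value.strip().lower()
        elif action == "view":
            # A view with no pending search is a direct arrival
            # (hyperlink, Google, etc.) and is not counted.
            term = pending.pop(session_id, None)
            if term is not None:
                counts[term] += 1
    return counts
```

The total of the counts gives the number of successful searches, and the number of keys gives the number of unique terms.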

Of these 1.87 million searches there are only 19,352 unique terms.

Obviously there are a few factors at play here. Firstly, there will always be clusters of popular terms – see Google Trends.

But what about the influence of interface?

Our current search and objects pages are set up with multiple (perhaps maximal) pivot points, or ways to get to other results and parts of the collection. The search/home page features a large randomised tag cloud which displays user-entered keywords. Clicking on one of these will result in a search result for that term.

The search result page now shows ‘related’ search terms as hyperlinks to searches for those terms.

The object record page shows (if they exist), user keywords with hyperlinks to a search for that word/phrase; the top three search terms related to that object (if the object has been viewed more than 30 times); as well as subject and object categories.

Each of these sets of hyperlinks encourages users to click them – probably before they manually type another search term in the large search box. Why type when the site you are using is making suggestions for you?

This requires further examination, a cross-referencing of search terms against user keywords, and some heat tracking with a set of test users.

Here are the top 20 search terms as of November 2006 (excluding object numbers).



Rich serendipity and Vander Wal

Jonathan at the AGNSW pointed me towards this rather excellent piece on folksonomies which really resonates with our own experiences with our collection database.

What Vander Wal realized is that socially exposed tagging for personal use introduces another organizing agent that compensates for the ambiguity of its vocabulary with high-value serendipity: people. We are much better at picking up information and knowledge cues based on perceived similarities and differences compared to other people, than we are at picking up clues from a people-free environment. If people-free environments gives us weak serendipity, person-mediated serendipity is much richer.

People are a useful organizing agent because they are natural knowledge attractors and aggregators of meaning. People habitually collect and arrange for themselves what Vander Wal calls “personal infoclouds” and these arrangements reflect a meaningful perspective on knowledge (Vander Wal 2006).

Folksonomies Web 2.0

Synonymiser Beta – proof of concept

Synonymiser is an experimental micro-application that returns related words from search data relationships held in the Powerhouse Museum’s collection database. These ‘synonyms’ are dynamically generated from realtime user interaction with the collection database.

On the Synonymiser site you can enter any search word or phrase and it will return a list of ‘related’ words or phrases and a measure of relationship.

Of course, the results are not synonyms in the dictionary sense of the word, but instead show meaning relationships specific to the way in which users use our collection database.
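As a purely illustrative sketch of the general idea, and not the Synonymiser’s actual method (which is not described here), one simple way to derive ‘related’ terms from usage is to treat two search terms as related when they have led users to the same objects. The function name, the input format and the scoring by number of shared objects are all assumptions.

```python
from collections import defaultdict


def related_terms(term_object_pairs, query, limit=5):
    """Rank terms by how many objects they share with `query`.

    `term_object_pairs` is an iterable of (search_term, object_id) pairs,
    one for each time a search for that term led to that object.
    """
    objects_by_term = defaultdict(set)
    for term, object_id in term_object_pairs:
        objects_by_term[term].add(object_id)

    target = objects_by_term.get(query, set())
    scores = {}
    for term, objects in objects_by_term.items():
        if term == query:
            continue
        shared = len(target & objects)
        if shared:
            scores[term] = shared

    # Strongest relationship first; ties are broken alphabetically.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))[:limit]
```

With this kind of measure the results drift as usage changes, which is why the relationships are specific to how people use one particular collection.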

The idea is that these word relationships can then be used to query other data sources. In this case we retrieve images from Flickr™ to demonstrate the concept. It is possible to merge terms and/or offer alternative terms to improve results using this.

There is a proposal to make these synonym relationships available via an API to allow other museums to use and build upon our usage data to improve their own search tools.

Is this useful? Would you like to be involved or help with this?

What is ‘synonym promiscuity’?

‘Synonym promiscuity’ is our term for describing the uniqueness of a relationship of one word to another. If the value is low (less than 10) then the synonym has a very close relationship with the word entered. If the value is high then the synonym is related to many other words (high promiscuity). We are currently refining the mathematics behind the calculation of these values – but the current figures should provide a means for comparing words.
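Since the exact calculation is still being refined and is not given here, the following is only an assumed stand-in to show the intuition: it counts how many distinct other words each word is related to. The function name and the pair-based input are hypothetical.

```python
from collections import defaultdict


def promiscuity(relationships):
    """Map each word to the number of distinct words it is related to.

    `relationships` is an iterable of (word_a, word_b) pairs. The simple
    count used here is an assumption, not the Synonymiser's formula.
    """
    neighbours = defaultdict(set)
    for a, b in relationships:
        if a != b:
            # Treat the relationship as two-way.
            neighbours[a].add(b)
            neighbours[b].add(a)
    return {word: len(others) for word, others in neighbours.items()}
```

Under this stand-in, a word like ‘sport’ that turns up alongside many different terms scores high, while a word tied to only one or two terms scores low.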