Another plug for OpenSearch

As I’ve been speaking to other institutions, both here in Australia and overseas, I’ve started to realise that more of us should be using OpenSearch to allow others (or ourselves) to aggregate our deep content – whilst still retaining full control of said content.

I blogged about this ages ago but I think everyone was caught up in getting their collections online and searchable to begin with.

The library sector has been debating its implementation for a while, and the arguments for and against OpenSearch are covered here.

OpenSearch is . . . a discovery mechanism. It allows a site to quickly expose vast amounts of data to end users in a detailed enough format that it elicits click-throughs. It is a way for end users to search a variety of sources, and source types, and to quickly grab the useful bits from each source, and to dig deeper for more detail when they find something of interest.

More to the point, though, since everyone must implement their opensearch results in exactly the same way every OpenSearch source is guaranteed to work with every OpenSearch client. Instant interoperability.

Now that both Firefox 2.0 and IE7 support OpenSearch, there really is no reason not to.

Imagine if your collection or your deep/dark web databases that you have already connected up to your website could be easily searched by a centralised search portal? And any interested searchers who clicked on a result would be redirected immediately to your site? And you didn’t need to implement anything complicated to make this possible?

Here is a very simple tutorial for a standard website.

Here is the Powerhouse Museum’s collection search for ‘chair’ delivered via the A9 portal.

And here is the raw XML result which anyone can aggregate to their site (allowing others to deliver traffic back to us).
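To give a feel for how little work aggregation takes, here is a minimal Python sketch that reads an OpenSearch-style RSS response. The sample feed, its titles and its URLs are made up for illustration, and the namespace assumes the OpenSearch 1.1 response elements (older feeds declare a different namespace URI), so treat it as a starting point rather than a description of our actual feed.

```python
import xml.etree.ElementTree as ET

# OpenSearch 1.1 response-element namespace (an assumption; older feeds differ).
OPENSEARCH_NS = {"os": "http://a9.com/-/spec/opensearch/1.1/"}

# Hypothetical sample response: titles and URLs are invented for this example.
SAMPLE_FEED = """<rss version="2.0"
     xmlns:opensearch="http://a9.com/-/spec/opensearch/1.1/">
  <channel>
    <title>Example collection search: chair</title>
    <opensearch:totalResults>2</opensearch:totalResults>
    <opensearch:startIndex>1</opensearch:startIndex>
    <opensearch:itemsPerPage>10</opensearch:itemsPerPage>
    <item>
      <title>Bentwood chair</title>
      <link>http://collection.example.org/item/1</link>
    </item>
    <item>
      <title>Folding chair</title>
      <link>http://collection.example.org/item/2</link>
    </item>
  </channel>
</rss>"""


def parse_opensearch_rss(xml_text):
    """Return (total_results, [(title, link), ...]) from an RSS response."""
    root = ET.fromstring(xml_text)
    channel = root.find("channel")
    # totalResults lives in the OpenSearch namespace, not plain RSS.
    total_text = channel.findtext(
        "os:totalResults", default="0", namespaces=OPENSEARCH_NS
    )
    items = [
        (item.findtext("title", default=""), item.findtext("link", default=""))
        for item in channel.findall("item")
    ]
    return int(total_text), items


total, items = parse_opensearch_rss(SAMPLE_FEED)
for title, link in items:
    print(title, "->", link)
```

Each link points back to the source site, which is what sends the traffic home.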

If you have multiple databases on your site that all have their own esoteric search engines, then you could create your own cross-database search simply by creating an OpenSearch feed for each and then a search page that aggregates the feeds.
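The aggregation step can be sketched in a few lines of Python. This assumes each feed has already been fetched and parsed into a list of (title, link) pairs; the source names, the URLs and the simple round-robin ordering are all hypothetical choices for the example, not a recommendation for how results should be ranked.

```python
from itertools import zip_longest


def merge_feeds(feeds):
    """Interleave results from several feeds, dropping repeated links.

    `feeds` maps a source name to a list of (title, link) tuples, already
    parsed from each database's OpenSearch response.
    """
    tagged = [
        [(source, title, link) for title, link in results]
        for source, results in feeds.items()
    ]
    merged = []
    seen_links = set()
    # Take the first result from every source, then the second, and so on.
    for rank_row in zip_longest(*tagged):
        for entry in rank_row:
            if entry is None:  # this source has run out of results
                continue
            if entry[2] in seen_links:  # same link already listed
                continue
            seen_links.add(entry[2])
            merged.append(entry)
    return merged


# Made-up results from two databases on the same site.
feeds = {
    "collection": [
        ("Bentwood chair", "http://example.org/collection/1"),
        ("Folding chair", "http://example.org/collection/2"),
    ],
    "library": [
        ("Chairs of the 1960s", "http://example.org/library/9"),
    ],
}

for source, title, link in merge_feeds(feeds):
    print(f"[{source}] {title} -> {link}")
```

Interleaving keeps any one database from crowding out the others on the results page; a real search page would probably want something smarter.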

If you DO add OpenSearch to your site then please tell us!

Young people & museums

Dragon & The Pearl – a blog for children

The Museum has launched another public facing blog called The Dragon & The Pearl.

This blog supports a public program that is running until March 2007.

A mysterious crate arrives at the museum containing what is thought to be a dragon’s egg! The blog allows children of all ages to keep pace with what the Museum and the two specialists it invites in – a cryptozoologist and a dragonologist – think it might be and what develops when and if the egg hatches. Already a faint pulse can be heard by visitors from inside the egg!


Rich serendipity and Vander Wal

Jonathan at the AGNSW pointed me towards this rather excellent piece on folksonomies which really resonates with our own experiences with our collection database.

What Vander Wal realized is that socially exposed tagging for personal use introduces another organizing agent that compensates for the ambiguity of its vocabulary with high-value serendipity: people. We are much better at picking up information and knowledge cues based on perceived similarities and differences compared to other people, than we are at picking up clues from a people-free environment. If people-free environments gives us weak serendipity, person-mediated serendipity is much richer.

People are a useful organizing agent because they are natural knowledge attractors and aggregators of meaning. People habitually collect and arrange for themselves what Vander Wal calls “personal infoclouds” and these arrangements reflect a meaningful perspective on knowledge (Vander Wal 2006).

Peer production and the ‘laws of quality’

Interesting reading from Paul Duguid in his paper on First Monday, Limits of self-organization: Peer production & the laws of quality. (via Nicholas Carr)

First, protagonists of the sorts of peer production projects discussed here should reflect on the extent to which, explicitly or implicitly, they rely on the laws of quality. If they don’t, they should ask themselves what they do rely on. Second, projects should be mature enough now for participants to admit their limitations. Project Gutenberg and Wikipedia are tremendous achievements. That does not entitle them to a free pass. Both, because free, tend to get some of the condescending praise given a bake sale, where it’s deemed inappropriate to criticize the cakes that didn’t rise. Third, they should draw closer to their roots in Open Source software. Software projects do not generally let anyone contribute code at random. Many have an open process for bug submission, but most are wisely more cautious about code. Making a distinction between the two (diagnosis and cure) is important because it would suggest that defensive energies might be misplaced.

Computer game history

Fascinating archive project from venerable US gaming magazine Computer Gaming World puts archives of issue 1 (1981) through to 1992 online as PDFs. It has obviously been an enormous scanning and digitisation project.

This is a great trip down memory lane and is an insight into not only how games have developed, but also how computer game audiences and advertising have changed, along with criticism and review.

Issue 1 has an amusing piece on the future of gaming – will 16K of memory be enough?

I hope they continue to release back issues for 1992-2006 at a later date.

Synonymiser Beta – proof of concept

Synonymiser is an experimental micro-application that returns related words from search data relationships held in the Powerhouse Museum’s collection database. These ‘synonyms’ are dynamically generated from realtime user interaction with the collection database.

On the Synonymiser site you can enter any search word or phrase and it will return a list of ‘related’ words or phrases and a measure of relationship.

Of course, the results are not synonyms in the dictionary sense of the word, but instead show meaning relationships specific to the way in which users use our collection database.

The idea is that these word relationships can then be used to query other data sources. In this case we retrieve images from Flickr™ to demonstrate the concept. It is possible to merge terms and/or offer alternative terms to improve results using this.

There is a proposal to make these synonym relationships available via an API to allow other museums to use and build upon our usage data to improve their own search tools.

Is this useful? Would you like to be involved or help with this?

What is ‘synonym promiscuity’?

‘Synonym promiscuity’ is our term for describing the uniqueness of a relationship of one word to another. If the value is low (less than 10) then the synonym has a very close relationship with the word entered. If the value is high then the synonym is related to many other words (high promiscuity). We are currently refining the mathematics behind the calculation of these values – but the current figures should provide a means for comparing words.
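As a rough illustration only: the real calculation is still being refined and is not spelled out here, so the Python sketch below simply takes the loosest reading of the idea and counts how many distinct other words each term is related to. The function name, the sample pairs and the counting rule are all assumptions made for the example.

```python
from collections import defaultdict


def promiscuity(related_pairs):
    """Count the distinct partners of each term (hypothetical measure).

    `related_pairs` is an iterable of (term_a, term_b) relationships.
    A low count means the term keeps to a few close relations; a high
    count means it is related to many other words.
    """
    partners = defaultdict(set)
    for term_a, term_b in related_pairs:
        if term_a == term_b:  # a word is not its own synonym
            continue
        partners[term_a].add(term_b)
        partners[term_b].add(term_a)
    return {term: len(others) for term, others in partners.items()}


# Made-up relationships for illustration.
pairs = [
    ("glass", "vase"),
    ("glass", "bottle"),
    ("glass", "bowl"),
    ("vase", "urn"),
]

scores = promiscuity(pairs)
for term in sorted(scores, key=scores.get):
    print(term, scores[term])
```

In this toy data ‘urn’ relates only to ‘vase’, so it scores low, while ‘glass’ relates to three words and scores highest.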

Stutzman on YouTube

Fred Stutzman’s blog is quickly becoming a must read.

Here he writes on YouTube from the perspective of YouTube as a social networking service rather than just a video hosting site. As he says,

The social architecture that enabled conversation in YouTube was built in, perhaps subconsciously, from the beginning. The founders built a site so they could share party videos with friends. The founders, while they probably have more friends now, likely had a relatively small social network. It was the millions of users like the founders, using the service in a similar fashion, that drove the value of YouTube. The fact the site also became the perfect home for viral videos and pirated video was completely secondary – they simply had the infrastructure to support the long-tail, hence the capacity to support non-long-tail uses. Other video sites that aren’t targeting the long tail are missing out on the social forces that drove YouTube – while people like viral videos, it is the long-tail of peer-produced content that keeps people coming back. It is the peer-production that enables conversation, and the iterative process that drives value back into the site. Without this value, a video sharing site is just expensive infrastructure built on a house of cards.

He also begins to hint at the other value in YouTube – that by visiting, watching, tagging, sharing and accumulating metadata around videos, users are effectively helping to classify and categorise video, which (like any time-based media) is notoriously difficult to create descriptive metadata for (anyone use SMIL?).

OPAC2.0 – New feature – more similar searches

Continuing on the additions from yesterday.

Today we added the top three search terms for each object to the USER KEYWORDS section of an object page. This section is where folksonomy tags can be added and deleted.

Here is a 1960s raincoat for example.

Why did we add this to the USER KEYWORDS section?

What we are finding through our ongoing analysis of folksonomy tagging behaviour is that users are generally adding synonyms. What folksonomies are doing in this instance is effectively ‘crowd-sourcing’ synonym generation. Now when a user searches for a term and selects an object from a list of results, they are making an association between the object and that term. In many instances these may be tentative associations and generally full of false drops, but when aggregated, patterns begin to emerge (see Chan 2006 forthcoming). Effectively on our site we are noticing that the search terms are beginning to offer the same kind of synonym behaviour that tags do.

We are presenting them together as a way of making explicit the terms that are already associated with an object (automatically). This way we hope to improve the diversity of user tags (why tag with a word/term that already appears?). This is just a trial though and if we see little or no change then we may move the search terms off to a separate section.

I’d welcome any comments or thoughts on this as we’re experimenting here.

OPAC2.0 – New feature – similar searches

Take a look at the new ‘related searches’ feature on our collection search.

Now a search for ‘glass’ will return these ‘similar searches’ –

glassware vase bottles bowl bowls

This result will change over time. Hopefully we will implement a timescale simulator in our upcoming ‘experimental’ browsing section which will allow users to view the changing search language over time.

How does it work?

Because we keep a large store of relationships between search terms and clicked objects, we are able to reverse query terms such as ‘glass’ and see what terms have been used to find similar objects. We currently show only the top 5 terms – aggregated – which lessens the probability of ‘false drops’. False drops are most likely to occur for uncommon search terms although this will change over time, too.
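For the curious, the reverse query can be sketched in Python along these lines. This is a simplified illustration under stated assumptions, not our production code: it supposes the store is just a log of (search term, clicked object) pairs, and the function name, the sample log and the plain click-count ranking are all hypothetical.

```python
from collections import Counter, defaultdict


def similar_searches(click_log, query, top_n=5):
    """Return up to `top_n` other terms used to reach the same objects.

    `click_log` is an iterable of (search_term, object_id) pairs, one per
    click on a search result.
    """
    objects_by_term = defaultdict(set)
    terms_by_object = defaultdict(Counter)
    for term, object_id in click_log:
        objects_by_term[term].add(object_id)
        terms_by_object[object_id][term] += 1

    # Reverse the query: which terms led people to the objects that
    # this query also led to? Aggregate the counts across those objects.
    totals = Counter()
    for object_id in objects_by_term.get(query, ()):
        totals.update(terms_by_object[object_id])
    totals.pop(query, None)  # do not suggest the query itself
    return [term for term, _ in totals.most_common(top_n)]


# Made-up click log for illustration.
log = [
    ("glass", 1), ("glassware", 1), ("vase", 1), ("vase", 1),
    ("glass", 2), ("vase", 2), ("glassware", 2), ("bottles", 2),
    ("chair", 3), ("vase", 3),
]

print(similar_searches(log, "glass"))
```

Because the counts are summed over every object the query has reached, a term clicked once by accident is unlikely to make the top five, which is the effect aggregation has on false drops.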

Does this use folksonomy tags?

Yes. We allow user tags to be included in the search terms and from time to time certain objects will be most visited as a result of their user tag.

Outsourcing video hosting to YouTube may mean losing users

In a rather sensational piece on unsavoury content on YouTube in the Sydney Morning Herald today there is this little tidbit of note.

The site was impossible to access at public schools, an Education Department spokesman said.

“[The department] urges parents to monitor their children’s use of the internet at home as this is the most likely place from which students view and download material posted to these types of internet sites,” the spokesman said.

This has interesting implications for organisations considering hosting their streaming video on YouTube. Now you might consider using YouTube (or its competitors) to host video for you because –

a) your user base is already using and is familiar with YouTube
b) it solves (by outsourcing) some of the hosting issues around video in terms of bandwidth and delivery formats

But it is worth bearing in mind the consequences in terms of specific audience groups’ ability to access your content.