Folksonomies Web 2.0 Web metrics

OPAC2.0 – Search term frequency and the influence of interface

I’ve started preparing some work on search term frequency in our collection database.

The system is set up to track only successful searches – which we define as those that result in a user selecting an item from the search results. Taking figures generated last week, the database has served up over 1.87 million successful searches since launch (June 2006), whilst nearly 5 million objects have been viewed. The gap suggests that many users are getting to objects via direct links, or using third-party searches (Google etc., or our OpenSearch feed) to go straight to records.

Of these 1.87 million searches there are only 19,352 unique terms.

Obviously there are a few factors at play here. Firstly, there will always be clusters of popular terms – see Google Trends.
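To make the scale of that clustering concrete, here is a minimal Python sketch of how term frequency could be counted from a search log. The `term_stats` function, the sample log and the case/whitespace normalisation are all assumptions for illustration; this is not how our database actually stores or counts terms. The final line simply divides the two figures quoted above.

```python
from collections import Counter

def term_stats(search_log, top_n=3):
    """Return (total searches, unique terms, most common terms) for a list of search strings."""
    # Assumption: terms are compared case-insensitively with whitespace collapsed,
    # so "Frank Gehry" and "frank  gehry" count as the same term.
    normalised = [" ".join(term.lower().split()) for term in search_log]
    counts = Counter(normalised)
    return len(normalised), len(counts), counts.most_common(top_n)

# Hypothetical sample log, not real data from the collection database.
sample_log = ["chair", "Frank Gehry", "frank gehry", "chair", "wiggle chair", "chair"]
total, unique, top = term_stats(sample_log)
print(total, unique, top)

# Figures quoted in the post: about 1.87 million searches over 19,352 unique terms.
searches_per_term = round(1_870_000 / 19_352, 1)
print(searches_per_term)
```

On the quoted figures each unique term has been used, on average, a little under a hundred times, which is the kind of heavy reuse you would expect if people are clicking suggested terms rather than typing their own.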

But what about the influence of interface?

Our current search and objects pages are set up with multiple (perhaps maximal) pivot points, or ways to get to other results and parts of the collection. The search/home page features a large randomised tag cloud which displays user-entered keywords. Clicking on one of these will result in a search result for that term.

The search result page now shows ‘related’ search terms as hyperlinks to searches for those terms.

The object record page shows user keywords (if any exist), each hyperlinked to a search for that word or phrase; the top three search terms related to that object (if the object has been viewed more than 30 times); and subject and object categories.
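The "top three terms once an object passes 30 views" rule is simple enough to sketch. This is an illustrative Python version only; the function name, its parameters and the sample data are hypothetical, and the real implementation may well differ.

```python
from collections import Counter

MIN_VIEWS = 30  # threshold from the post: shown only when viewed more than 30 times
TOP_N = 3       # the record page shows the top three terms

def related_terms(view_count, discovery_terms, min_views=MIN_VIEWS, top_n=TOP_N):
    """Return the most frequent discovery terms for an object, or [] below the threshold."""
    if view_count <= min_views:
        return []
    return [term for term, _ in Counter(discovery_terms).most_common(top_n)]

# Hypothetical discovery terms for a single object.
terms = (["gehry"] * 4) + (["wiggle chair"] * 3) + (["chair"] * 2) + ["cardboard"]
print(related_terms(31, terms))
print(related_terms(30, terms))
```

The threshold keeps rarely viewed objects from advertising a "related" term that only one or two visitors ever used.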

Each of these sets of hyperlinks encourages users to click – probably before they manually type another search term in the large search box. Why type when the site you are using is making suggestions for you?

This requires further examination: a cross-referencing of search terms against user keywords, and some heat tracking with a set of test users.

Here are the top 20 search terms as of November 2006 (excluding object numbers).


Web metrics

Usability tracking

A new product has emerged from the UK – Clickdensity.

Clickdensity adds an extra layer above what you are probably already doing with Google Analytics or log analysis, that is, tracking where and how people use your website. What is different about Clickdensity is that it tracks exactly where users click on each page of your site, by mouse coordinate.

It uses a little bit of JavaScript code to log activity to the Clickdensity server, and then you can log in and collect reports. It works on a sliding monthly pay scale depending on what you want to track and for how long.
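To give a feel for what can be done with click coordinates once they are logged, here is a rough Python sketch that groups clicks into a grid so the busiest regions of a page stand out. It is not Clickdensity's code or API; the `bin_clicks` function, the 50-pixel cell size and the sample coordinates are all assumptions.

```python
from collections import Counter

def bin_clicks(clicks, cell_size=50):
    """Group (x, y) click coordinates into square grid cells of cell_size pixels."""
    grid = Counter()
    for x, y in clicks:
        # Integer division maps each pixel coordinate to its grid cell.
        grid[(x // cell_size, y // cell_size)] += 1
    return grid

# Hypothetical click coordinates for one page.
clicks = [(12, 40), (30, 45), (210, 58), (215, 60), (220, 55), (400, 300)]
heat = bin_clicks(clicks)
hottest_cell, hits = heat.most_common(1)[0]
print(hottest_cell, hits)
```

Each cell count can then be drawn as a colour intensity over a screenshot of the page, which is essentially what a click heat map is.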

Certainly if you are wanting to check out or test your information architecture, navigation and menu structures then it would be worth investigating.

Social networking Web 2.0 Web metrics

Is MySpace really greying?

A lot of blogosphere energy has been spent discussing the recent figures from ComScore in the USA about a rapid aging of the average MySpace user.

Boyd and Stutzman have been doing some digging and comparing the results to their own research, which problematises the apparent age rise. As they point out, there is a vast difference between visitors (content readers) and users (content creators/social networking participants).

[ComScore] have found that the unique VISITORS have gotten older. This is _not_ the same thing as USERS. A year ago, most adults hadn’t heard about MySpace. The moral panic has made it such that many US adults have now heard of it. This means that they _visit_ the site. Do they all have accounts? Probably not. Furthermore, MySpace has attracted numerous bands in the last year. If you Google most bands, their MySpace page is either first or second; you can visit these without an account. People of all ages look for bands through search.

Interactive Media Web 2.0 Web metrics

OPAC2.0 New features – locations and search terms

Another week, and another set of new features is now live on our Collection search.

While most local readers of the blog were at Web Directions South (congratulations to Museum Victoria for taking out the Web Excellence award – perhaps the most strict clean code award around!), we’ve been working to get two key new features out on the search.

The first is publicly visible and is an exhibition location field. For objects on public display, users can now see which exhibition they are in. In the next few weeks this will operate in reverse as well, allowing visitors to each exhibition microsite to quickly view a list of collection objects on display within the exhibition, and pull up more details on each. Filtering to show only objects on display is coming soon too.

Examples (scroll to bottom of object record to see the change) –

+ Shou Lao figure on display in Other Histories

+ ‘Wiggle Chair’ by Frank Gehry on display in Inspired

The second key feature is an internal backend tool which allows querying of search terms by object. This queries the ever-growing database of search terms and object views and allows us to quickly examine, by object, the terms used to discover it. This is the start of an internal visualisation project looking at ways of displaying, and more importantly, revealing patterns in the search data.
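For the curious, the idea of querying terms by object can be sketched with a tiny in-memory SQLite table. The table layout, column names and object ids below are hypothetical and chosen only for illustration; our actual backend schema is not described here. Note that the sketch stores no visitor identifiers at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one row per successful search that led to an object view.
conn.execute(
    "CREATE TABLE search_log ("
    "object_id INTEGER, term TEXT, searched_on TEXT, from_tag_cloud INTEGER)"
)
rows = [
    (1, "gehry", "2006-09-29", 0),
    (1, "wiggle chair", "2006-09-29", 1),
    (1, "wiggle chair", "2006-09-28", 0),
    (2, "marc newson", "2006-07-22", 0),
]
conn.executemany("INSERT INTO search_log VALUES (?, ?, ?, ?)", rows)

def terms_for_object(conn, object_id):
    """Return (term, uses) pairs for one object, most used first."""
    cur = conn.execute(
        "SELECT term, COUNT(*) AS uses FROM search_log "
        "WHERE object_id = ? GROUP BY term ORDER BY uses DESC, term",
        (object_id,),
    )
    return cur.fetchall()

print(terms_for_object(conn, 1))
```

Grouping by term rather than listing raw rows is one possible starting point for the pattern-finding work; the raw dated rows remain available for anything time-based.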

(Remember, too, all this data is anonymous. We do not gather any identifying data and these terms are not correlated against IP addresses. There is no need for us to do this – the anonymous data provides enough to help other users discover and browse.)

Here’s a sample of the search terms used to discover the aforementioned Wiggle Chair. As you can see, there are some interesting (and, on the surface, unrelated) terms used to discover this object. Now, this data can then be compared against the discovery terms used to find other chairs in the collection, to build a more effective search thesaurus, suggested search terms, or, as I was calling it today – a ‘searchsonomy’.

(c) indicates a tag cloud click through.

Search terms used

+ 29/09/2006 – gehry
+ 29/09/2006 – wiggle chair (c)
+ 28/09/2006 – wiggle chair
+ 28/09/2006 – wiggle chair
+ 28/09/2006 – cardboard (c)
+ 28/09/2006 – wiggle AND chair (c)
+ 28/09/2006 – wiggle chair (c)
+ 27/09/2006 – gehry
+ 27/09/2006 – frank gehry (c)
+ 27/09/2006 – frank AND gehry (c)
+ 27/09/2006 – wiggle AND chair (c)
+ 26/09/2006 – Architecture
+ 25/09/2006 – wiggle chair (c)
+ 24/09/2006 – frank AND gehry (c)
+ 24/09/2006 – frank gehry (c)
+ 23/09/2006 – plastic (c)
+ 20/09/2006 – frank gehry (c)
+ 20/09/2006 – frank gehry (c)
+ 20/09/2006 – frank gehry (c)
+ 20/09/2006 – frank gehry (c)
+ 20/09/2006 – frank gehry (c)
+ 20/09/2006 – frank gehry (c)
+ 19/09/2006 – modern (c)
+ 18/09/2006 – frank AND gehry (c)
+ 17/09/2006 – fibreglass (c)
+ 17/09/2006 – frank gehry (c)
+ 16/09/2006 – wiggle AND chair (c)
+ 16/09/2006 – wiggle AND chair (c)
+ 14/09/2006 – gehry AND chair AND wiggle
+ 14/09/2006 – wiggle AND chair (c)
+ 14/09/2006 – wiggle chair (c)
+ 14/09/2006 – frank AND gehry
+ 14/09/2006 – cardboard (c)
+ 12/09/2006 – Architecture
+ 11/09/2006 – fibreglass (c)
+ 11/09/2006 – frank gehry (c)
+ 11/09/2006 – frank AND gehry (c)
+ 11/09/2006 – frank AND gehry (c)
+ 10/09/2006 – wiggle
+ 10/09/2006 – wiggle
+ 09/09/2006 – wiggle
+ 07/09/2006 – wiggle chair
+ 07/09/2006 – wiggle chair
+ 07/09/2006 – wiggle
+ 07/09/2006 – gehry AND chair
+ 06/09/2006 – cardboard (c)
+ 05/09/2006 – gehry AND chair
+ 05/09/2006 – gehry AND chair
+ 05/09/2006 – wiggle AND gehry
+ 03/09/2006 – fibreglass (c)
+ 02/09/2006 – wiggle chair
+ 01/09/2006 – wiggle chair (c)
+ 31/08/2006 – wiggle chair (c)
+ 30/08/2006 – University of California
+ 30/08/2006 – University of California
+ 30/08/2006 – frank gehry (c)
+ 30/08/2006 – frank gehry (c)
+ 30/08/2006 – wiggle chair (c)
+ 30/08/2006 – wiggle chair (c)
+ 29/08/2006 – frank gehry (c)
+ 29/08/2006 – frank gehry (c)
+ 28/08/2006 – chair
+ 28/08/2006 – chair
+ 28/08/2006 – wiggle chair
+ 23/08/2006 – double (c)
+ 22/08/2006 – frankgehry (c)
+ 22/08/2006 – frank gehry (c)
+ 21/08/2006 – gehry
+ 19/08/2006 – cardboard (c)
+ 18/08/2006 – wiggle chair (c)
+ 17/08/2006 – University of California
+ 17/08/2006 – University of California
+ 16/08/2006 – Frank Gehry
+ 16/08/2006 – Frank Gehry
+ 16/08/2006 – chair
+ 14/08/2006 – wiggle chair (c)
+ 12/08/2006 – frank gehry (c)
+ 12/08/2006 – frank gehry (c)
+ 11/08/2006 – frank gehry (c)
+ 11/08/2006 – wiggle chair (c)
+ 11/08/2006 – wiggle chair (c)
+ 11/08/2006 – frank gehry (c)
+ 10/08/2006 – wiggle chair
+ 10/08/2006 – chair
+ 01/08/2006 – gehry
+ 30/07/2006 – furniture
+ 30/07/2006 – gehry
+ 30/07/2006 – chair
+ 29/07/2006 – architecture
+ 25/07/2006 – marylin sofa
+ 25/07/2006 – marylin sofa
+ 25/07/2006 – frank gehry
+ 23/07/2006 – marc newson (c)
+ 22/07/2006 – marc newson
+ 21/07/2006 – newton (c)
+ 13/07/2006 – chair
+ 13/07/2006 – chair
+ 12/07/2006 – weil
+ 07/07/2006 – chair
+ 07/07/2006 – chair
+ 07/07/2006 – chair
+ 13/06/2006 – chairs
+ 08/06/2006 – wiggle chair
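A list like the one above is easy to summarise. The following Python sketch parses lines in the same "date – term (c)" layout and works out what share of discoveries came from tag cloud clicks. The regular expression and function names are my own assumptions about the layout, and only the first few lines of the list are used as sample input.

```python
import re
from collections import Counter

# Matches lines such as "+ 29/09/2006 – wiggle chair (c)"; the "(c)" marker is optional.
LINE = re.compile(r"^\+\s*(\d{2}/\d{2}/\d{4})\s+–\s+(.*?)(\s+\(c\))?$")

def parse_entries(lines):
    """Return (date, lower-cased term, was_tag_cloud_click) tuples for matching lines."""
    entries = []
    for line in lines:
        match = LINE.match(line.strip())
        if match:
            date, term, clicked = match.groups()
            entries.append((date, term.lower(), clicked is not None))
    return entries

sample_lines = [
    "+ 29/09/2006 – gehry",
    "+ 29/09/2006 – wiggle chair (c)",
    "+ 28/09/2006 – wiggle chair",
    "+ 28/09/2006 – cardboard (c)",
    "+ 26/09/2006 – Architecture",
]
entries = parse_entries(sample_lines)
click_share = sum(1 for _, _, clicked in entries if clicked) / len(entries)
term_counts = Counter(term for _, term, _ in entries)
print(len(entries), click_share, term_counts.most_common(1))
```

Run over the full list, the same counts would show how much of this object's discovery is driven by the interface's suggestions rather than typed searches. Variants such as "wiggle AND chair" and "wiggle chair" are counted separately here; merging them would be a further design decision.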

Web 2.0 Web metrics

Reviewing web metrics

Evan Williams (one of the makers of Blogger) posts a strong argument for why organisations should be moving away from using page views as a metric much in the same way we all moved away from hits in the late 90s.

Looking at MySpace, he compares page views with ‘reach’ (effectively unique visitors) and maps the results against the same measures for another site. On reach, MySpace suddenly doesn’t look as far ahead as it did when judged solely on page views. He draws on Mike Davidson’s argument that MySpace has such enormous metrics largely as a result of poor architecture, which requires users to sit through many more page refreshes than would be necessary if MySpace were redesigned from the ground up with usability in mind.

Ajax is only part of the reason pageviews are obsolete. Another one is RSS. About half the readers of this blog do so via RSS. I can know how many subscribers I have to my feed, thanks to Feedburner. And I can know how many times my feed is downloaded, if I wanted to dig into my server logs. But I don’t get to count pageviews for every view in Google Reader or Bloglines or LiveJournal or anywhere else I’m syndicated.

Another reason: Widgets. The web is becoming increasingly widgetized: little bits of functionality from one site are displayed on many others. The purveyors of a widget can track how many times their javascript or flash file is loaded elsewhere, but what does that mean? If you get a widget loaded in a sidebar of a blog without anyone paying attention to it, that’s not worth anything. But if you’re YouTube, and someone’s watching a whole video and perhaps even an ad you’re getting paid for, that’s something else entirely. But is it a pageview?

Pageviews were never a great measure of popularity. A simple javascript form validation can easily cut down on pageviews (and save users time), while a useless frameset can pump up your numbers. But with the proliferation of Ajax, RSS, and widgets, pageviews are even more silly to pay much attention to—even as we’re all obsessed with them.
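The gap between page views and reach is easy to see with a toy example. The Python sketch below counts both from a list of (site, visitor) requests; the site names, visitor ids and the `page_views_and_reach` function are all hypothetical, and real analytics tools identify visitors in more involved ways.

```python
from collections import defaultdict

def page_views_and_reach(requests):
    """Return {site: (page views, unique visitors)} from (site, visitor_id) pairs."""
    views = defaultdict(int)
    visitors = defaultdict(set)
    for site, visitor_id in requests:
        views[site] += 1                # every request counts as a page view
        visitors[site].add(visitor_id)  # each visitor counts once towards reach
    return {site: (views[site], len(visitors[site])) for site in views}

# Hypothetical traffic: site_a makes two visitors load many pages,
# while site_b serves three visitors one page each.
requests = [
    ("site_a", "v1"), ("site_a", "v1"), ("site_a", "v1"),
    ("site_a", "v2"), ("site_a", "v2"), ("site_a", "v2"),
    ("site_b", "v1"), ("site_b", "v2"), ("site_b", "v3"),
]
stats = page_views_and_reach(requests)
print(stats)
```

On page views the first site looks twice as popular; on reach the second site is ahead. Which figure matters depends on what you are trying to measure.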