The long and the short of it

This is the text of a talk I gave at A Very Informal Lightning Series on Digital Culture, organised by members and friends of the National Digital Forum. It was a good evening (disrupted only by repeated fire alarm tests in the building), with a lot of crossover and agreement across all the talks. My talk’s original title was ‘The Long and the Short of it: a future for Te Ara and NZHistory’.

———

I have to start with a confession and that’s that I came up with the title for this talk in a hurry and before thinking about what I’d talk about. I’m quite happy with the first bit, ‘The long and the short of it’, but I’m not sure about the subtitle even with the rider that it’s only ‘a’ future. Instead I’ll talk about this: Some Things to Talk About When Thinking About the Future of Te Ara and NZHistory.

Long and short

Te Ara is the online encyclopedia of New Zealand – a landmark born digital encyclopedia that’s been drawn together over the last 10 years. NZHistory is its slightly older sibling, or perhaps rotten uncle. Together they’re some of the most heavily used websites in the cultural and heritage sector.

Someone asked me last week what the difference between them is. In reply I drew them a picture a bit like this one.

Te Ara and NZ History content

In some ways they’re very similar – they cover New Zealand’s culture and history in its many forms.

But Te Ara is incredibly broad. It covers the full range of New Zealand subjects: politics, social history, Māori culture, natural history, creativity, and more.

NZHistory – as the name suggests – focuses more on history: political, cultural, and a lot of coverage of war history.

Typically Te Ara entries are highly structured and relatively short. NZHistory on the other hand is almost anarchic and has the freedom to cover subjects in more depth. It’s a gross generalisation but one that’s useful to point to the complementary nature of short and long content. Depending on our users’ level of interest, it allows us to serve up the right amount of content for them.

That’s one thing to think about: short and long content is complementary.

Content development

I said it’s a generalisation – both sites have content that doesn’t conform to such an easy statement. One of the things that NZHistory does, for example, is publish tiny but useful pieces of content that sit on top of the deep content and act as hooks to draw people into deeper content.

Nuggets

It’s something I think we need to do a lot more of, while bearing in mind that it’s probably a huge task – literally summarising all our content. But it has interesting spin-off uses. For one thing it’s more mobile friendly for users looking for a quick answer or fact. And as Virginia Gow pointed out, it’s also far more useful for sharing and posting to social media by us and our users.

At the other end of the scale we’ve got all these gaps in our long content. There’s a huge potential here for us to develop content that fills these gaps or work with other groups to do so. It could even be the content exists and we just need to look at mechanisms for connecting it up.

Nuggets and data

I heard the scientist Hamish Campbell talk recently about the ebook he released in Bridget Williams Books’ BWBText series. As books they’re short, but they’re decent-length essays that Hamish Campbell felt gave him the space to develop an argument in a way that newspaper or magazine publishing doesn’t.

That’s a format I’d like to see explored on our websites – not so much ebooks but more in-depth content that responds to contemporary issues and draws on the breadth of our content. That establishes a cycle between short and long content: short content provides a foundation to build arguments which in turn establish new facts that inform the foundation.

That’s the second thing: short and long content inform and develop each other.

A couple of other quickish things to mention…

Project calabash

Something I talked about at NDF last year was a small pilot project that we’re working on with Te Papa to create better links between Te Ara and Collections Online. We’re still working through it but at its simplest form it links images of Te Papa objects that appear on Te Ara to the record for the object on Collections Online and vice versa. Basically it means people can get more information about the object through a simple hyperlink.

Two calabashes

It’s a hardwired connection between two sites. Once we get it working with Te Papa we’ll extend it to other collections. At that point we start making connections not just between two sites but by association across the network – where our stories use objects and items from different collections, we effectively create a network of subject based links between collections.

That’s the third thing: stories link objects into a wider context.

Dynamic connections

We’re currently redesigning NZHistory. That’s how it started out anyway, but then our lead designer and NZHistory’s product manager got to talking behind our developers’ backs and decided to redevelop it at the same time.

Where they’re going with it is to add a new navigation that’s driven off keywords, with the keywords broken down into People, Places, Events and Subjects. We’re using the keywords to generate what we’re calling dynamic pages – effectively pulling together all the content from across NZHistory that has that keyword and then displaying it in the same sorts of content groupings.

David Lange on the new NZHistory

Here’s a mockup based on a People keyword, so it’s got a hero image and story – the biography – as well as related media items, events, and articles. I think this is exciting, but it’s only the start – next we’ll look at pulling in Te Ara content and use DigitalNZ to pull in content from other websites.
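To make the idea of a dynamic page concrete, here is a minimal sketch of grouping keyword-tagged content by type. The records, field names and the `build_dynamic_page` function are all made up for illustration; this is not how NZHistory is actually built.

```python
from collections import defaultdict

# Made-up records for illustration; field names and titles are assumptions,
# not NZHistory's actual content model.
CONTENT = [
    {"title": "Biography (placeholder)", "type": "biography", "keywords": {"David Lange"}},
    {"title": "Photo (placeholder)", "type": "media", "keywords": {"David Lange"}},
    {"title": "Event (placeholder)", "type": "event", "keywords": {"David Lange", "Another keyword"}},
    {"title": "Unrelated article (placeholder)", "type": "article", "keywords": {"Another keyword"}},
]


def build_dynamic_page(keyword, content):
    """Collect every item tagged with `keyword`, grouped by content type."""
    groups = defaultdict(list)
    for item in content:
        if keyword in item["keywords"]:
            groups[item["type"]].append(item["title"])
    return dict(groups)


page = build_dynamic_page("David Lange", CONTENT)
```

In a sketch like this, pulling in content from another site would just mean adding that site's tagged records to the list before grouping, provided both sites agree on the keyword.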

At this point it really starts to become a model that the digital heritage sector can experiment with as well – pulling content from multiple sources that fits a particular interest. This is just a New Zealand history take on it, but the possibilities for combining content around different subjects are endless.

That’s my last thing: dynamic connections will unlock the digital heritage sector’s collective potential.

A future?

I’ve failed to present a future, but I think these ideas point in the right direction. I’ve drifted away from the original idea of the Long and the Short of it, but it’s content in its many forms that we do well and that we contribute to the sector.

Recognising our internal strengths and playing to them while seeing how and where we can fit into the wider sector – providing context, providing links – and collaborating with the sector to help all our users create their own stories and collections has to be something to which we should aspire.

Your item is my story; my story is your item

This is the text, more or less, of the talk I gave at NDF2013. It draws heavily on earlier posts, especially Project calabash and the pompously titled Open letter to cultural collecting organisations.

I was tempted not to give this talk after seeing Chris McDowall’s talk on cross-walking collections, but in a way it supports that idea and also plays a bit with what Ed Summers had to say about small data and collections needing to be ‘regions of stability’ in his keynote talk, The web as a preservation medium. And more on that from Michael Lascarides in his talk and this pertinent tweet:

But I gave the talk anyway and this is what I said…

———

I have to preface this talk by saying that I’ll be talking about one very simple idea – linking a couple of sites to each other – and some more complicated ideas. They’re not really fully formed and aren’t really mine.

But they’re ideas that are floating around – what Virginia Gow refers to as shared ideas, or just ideas whose time has come and that we’re all kind of thinking about. And they’re not ideas I have solutions for, but they seem like problems that are staring us in the face and we should try to solve them.

The Web Team at the Ministry for Culture and Heritage (MCH) looks after all the Ministry’s websites, including our two largest sites, Te Ara – the encyclopedia of New Zealand, and NZ History – New Zealand History Online.

Unlike a lot of the organisations involved with NDF, MCH isn’t a collecting institution. We write text, a lot of it, both in print and online. Te Ara has maybe 3.5 million words; NZ History at least another million.

Where we intersect with collecting institutions is in the thousands of images that illustrate our stories.

Google image site search on Te Ara for 'Museum of New Zealand Te Papa Tongarewa'

We’re probably one of the country’s biggest collection users – Te Ara has 25,000 to 30,000 resources, most from collecting institutions; NZ History has maybe 4,000 or 5,000.

When it comes to collections items, like this gourd, or calabash, or hue from Te Papa, our sites explain their significance and place them in the context of stories that demonstrate their relevance to other items.

I’ll come back to that idea of relevance later, but for now I want to talk about a very simple linking project that we’re working on with Te Papa.

We source images and other media from institutions like Te Papa. Most of these items are available on Te Papa’s website. What we’re doing isn’t rocket science: we’ll be providing links between the two sites based on the item. So if you see it on Te Papa’s site you can go and read more on Te Ara, and if you’re on Te Ara you can find the original version on Te Papa’s site.

Before I get to why we’re doing this, I should say why we’re using Te Papa as a test case.

A big part of it is down to people, and I have to name check Adrian Kingston for coming up with this idea, and seeing this as something that was possible, relatively easy, and worth the effort.

Partly it’s also the synergy between what the two organisations and our websites do: we both take a national view, and in the case of Te Ara, we also take an encyclopedic view and try to cover all aspects of New Zealand culture and history in much the same way that Te Papa’s collections span history, culture, natural sciences and so on.

It’s also partly that the number of Te Papa images used on Te Ara is relatively small, about 500, so we’ve got a manageable set to work with manually. And Te Papa uses persistent identifiers for items on Collections Online. We can’t do this without persistent IDs that will be there forever.

The process is currently manual. It’s basically a spreadsheet of Te Papa images and where they appear on Te Ara. Te Papa staff are going through the spreadsheet and identifying the corresponding IDs and URLs.
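As a rough sketch of what that spreadsheet makes possible, the snippet below turns rows of page-to-ID matches into links in both directions. The column names, the URL pattern and the example.org addresses are placeholders I have invented; they are not Te Ara's or Collections Online's real URLs.

```python
import csv
import io
from collections import defaultdict

# Made-up rows standing in for the spreadsheet; column names are assumptions.
SPREADSHEET = """te_ara_url,te_papa_id
https://example.org/te-ara/story-a/image-1,100001
https://example.org/te-ara/story-b/image-7,100001
https://example.org/te-ara/story-c/image-3,100002
"""

# Placeholder pattern for a persistent record URL built from the item ID.
RECORD_URL = "https://example.org/collections/object/{id}"


def build_links(csv_text):
    """Return (story page -> record URL, record URL -> list of story pages)."""
    to_record = {}
    to_stories = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        story_url = row["te_ara_url"].strip()
        record_url = RECORD_URL.format(id=row["te_papa_id"].strip())
        to_record[story_url] = record_url
        to_stories[record_url].append(story_url)
    return to_record, dict(to_stories)


to_record, to_stories = build_links(SPREADSHEET)
```

The reverse mapping holds a list because one object can illustrate more than one story. The whole thing only works while the IDs stay stable, which is why persistent identifiers matter so much.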

With 500 items that’s not a major hassle, but at some point I hope we’re going to have to think about how to scale it beyond 500 images and automate. It’s a pilot so maybe we’ll never have to ask that question, but I’d love to be applying this to some of the larger collections we use like the several thousand images from the Alexander Turnbull Library.

Two calabashes

So we’re basically making item-to-item links. Well so what? You can look at this image on Te Ara or you can look at it on Te Papa. Surely if you’ve seen one image you’ve seen them all. But not quite.

Hue on Te Ara

What you’re getting from Te Ara is information on the item’s significance. It’s in the story on traditional Māori warfare; it illustrates the page about preparations and entering into battle; and from the caption you learn about its use in this context. Te Ara provides the context that signals why this item is important.

Taha huahua from Te Papa

From Te Papa on the other hand you can find out that it’s also called a taha huahua, or calabash, it’s made of harakeke, muka, gourd, dye and was purchased in 1905. You can see the collection it belongs to and what it was influenced by. And all those underlined words link to more items that share those classifications.

So what we’re doing is helping people find the information that’s relevant to them. If they’re interested in the story behind something, get it in one place; if they’re interested in the detail about it, get it from another. What you get in each place is what’s most relevant to where you are but you can easily find other information if that’s what you’re interested in.

I just want to mention a couple of other examples that probably bring this into sharper relief.

Abel Tasman

This is Abel Tasman. Eric Ketelaar from the University of Amsterdam spoke earlier this year at the GLAM symposium held at Victoria University. I’m hazy on the details but he was talking about the many copies of Tasman’s journals that exist around the world.

Some exist as simple scans; others as transcripts of the Dutch; others as translations. Duplication isn’t the issue; as archivists tell us, lots of copies keep stuff safe. The issue is that none of these copies are linked to the others.

The copy you happen to stumble across directly affects your experience and ability to use the materials.

If you can’t read long-hand the scans are no good to you; if you can’t read Dutch, the transcripts won’t help; if you can’t read English, you might be better off with the Dutch. It wouldn’t be hard to link them together – they’re on the web so it’s just hyperlinks that are needed – so if you find one copy you can get to the copy that works for you.

H series

Another example is images like this one from what’s called the H series – a collection of World War One photographs commissioned by the New Zealand government and taken by Henry Armytage Sanders. The Alexander Turnbull Library holds the original glass plate negatives and copies are held by other organisations around the country.

Auckland Museum for example has copies in photo albums, and that points to the story of how the images were originally used. They were put into albums with captions, and the albums were distributed around the country so that soldiers and their families could order prints. This is why they’re all numbered, so people knew which photo to order.

Again linking the originals to the albums gives people the chance to experience and use the images in different ways. If you want a hi-res copy, the best place to go is the Turnbull; if you want to experience what it was like for a nation to see what the war had been like in page after page of photos, you can view the albums. Making that simple link makes those things possible.

Back to Te Ara and the relevance of items to other items. I’ve written a bit over the last few years about the way that something like Te Ara – but any publication really that uses collection items – is kind of like the meat in the sandwich between collections. Where a publication uses items from different collections it’s effectively creating an inferred relationship between the items and the collections.

I keep coming back to the calabash. Its many names are part of the complexity, as are its many uses. As we’ve seen, it illustrates the story about traditional Māori warfare, but it also illustrates the story about rongoā, or the medicinal use of plants.

Another hue from Te Ara

Within that story it’s suggesting relationships with items ranging from other plants to an engraving of a Māori warrior, a Lindauer portrait of the tohunga, Tūhoto Ariki, and a cartoon of Māui.

Hue and friends

Through that one story we have inferred relationships forming between Te Papa, Turnbull, Auckland Art Gallery, the Department of Conservation and Godwit Publishing.
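One way to picture those inferred relationships is as pairs of holders that turn up in the same story. The sketch below assumes we already know which holder each item in a story came from; the data structure and the function name are mine, invented for illustration.

```python
from collections import defaultdict
from itertools import combinations

# Holders whose items illustrate one story, as listed in this talk.
# The dictionary layout is an assumption made for this sketch.
STORIES = {
    "Rongoā story": [
        "Te Papa",
        "Turnbull",
        "Auckland Art Gallery",
        "Department of Conservation",
        "Godwit Publishing",
    ],
}


def inferred_relationships(stories):
    """Map each pair of holders to the set of stories that use items from both."""
    pairs = defaultdict(set)
    for story, holders in stories.items():
        # Sort so each pair is always stored in the same order.
        for a, b in combinations(sorted(set(holders)), 2):
            pairs[(a, b)].add(story)
    return dict(pairs)


relationships = inferred_relationships(STORIES)
```

Five holders in one story give ten pairs. Add more stories and some pairs start to accumulate several stories each, which is a crude measure of how strongly two collections are related by subject.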

Inferred relationships

Some of these links even start to get a little playful, in the way that Cath Styles talked about in her game Sembl last year, where you let your user make the connections between items.

Collecting on Te Ara

Just to get a little meta about it, Te Ara’s story on collecting brings together a really wonderful assortment of subjects from Turnbull himself and his book plates to firearms and Barbie dolls. It starts to coalesce around a subject that an institution might not think of, but once it’s in a user’s hands, those connections start to form.

———

Where else can we go with this? At a simple level item-to-item linking opens up a few options. We can potentially share our content more easily with other organisations if they want it – that saves them the effort of writing new content about their items.

We can also use it as a hook to update copyright or other information when the institution changes their record. Or we could look at pulling their descriptions of items in as alt text for screen readers to use.

We’re also interested in sharing our content with third parties to build new publications, websites or apps. Currently we can only share the text as that’s our copyright, but if we have a direct link to an item, then it’s easier for a third party to find sources of images and be able to negotiate re-use rights directly with the holding institution. Services like Digital NZ could also use the information and map our stories to institution records and expose those relationships through their API.

But more than all that, we could as Adrian Kingston suggests start to use the items to catalogue the stories they illustrate.

Te Ara subjects are at a very high level – the story title and page title is in effect the main subject. That’s fair enough, it’s an encyclopedia after all, so the title is a headword, and a headword by default is a subject. But what if we used the items to infer more specific subjects that the story might relate to?

From that we might see that the story about Māori warfare and this image of a taiaha…

Taiaha on Te Ara

…is also a story about woodcarving and the use of materials like feathers, dog hair and flax in Māori society through Te Papa’s catalogue record.

Taiaha kura from Te Papa

And through that connection is related to thousands of items in their collection. Not all the Te Papa subjects will be directly relevant to Te Ara’s story, but by being able to choose which ones are, we can make direct links from a story to a much larger pool of related items on Te Papa’s site.

And where this starts to head is towards an idea that Virginia Gow threw at me recently which picks up on some work in the Netherlands. The National History Museum there joined up with some other websites and created what’s basically a trusted network of sites. When one site links to another site in the network, the link gets reciprocated automatically. It’s the sort of thing you could start letting your users do for you.

It shouldn’t be impossible. It’s the kind of thing Facebook does when it lets you tag someone in a post or photo, or that WordPress does when you allow pingbacks to a blog post. What they’re doing is building a system that’s aware of the network it’s part of and letting users take advantage of the network.
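A toy version of a network-aware system might look like the sketch below: a link between two member sites gets a return link automatically, while a link to a site outside the network does not. The class, its methods and the host names are hypothetical, and this is not a description of how the Dutch network, Facebook tagging or WordPress pingbacks actually work.

```python
from collections import defaultdict
from urllib.parse import urlparse


class TrustedNetwork:
    """Hypothetical registry that reciprocates links between member sites."""

    def __init__(self, member_hosts):
        self.members = set(member_hosts)
        self.links = defaultdict(set)  # page URL -> URLs it links to

    def is_member(self, url):
        return urlparse(url).hostname in self.members

    def add_link(self, source, target):
        """Record source -> target, plus target -> source if both are members."""
        self.links[source].add(target)
        if self.is_member(source) and self.is_member(target):
            self.links[target].add(source)


# Placeholder host names, not real sites.
network = TrustedNetwork({"stories.example", "collections.example"})
network.add_link("https://stories.example/warfare", "https://collections.example/object/1")
network.add_link("https://stories.example/warfare", "https://elsewhere.example/page")
```

The design choice that matters is the membership check: reciprocation is only safe because the sites have agreed to trust each other's links.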

Simply linking items to items is going to take some work, and it’s obvious we’ll need to work out ways of doing it automatically when we look at a set larger than the Te Ara Te Papa set. Could we let our users do it? Could we just start sharing our data with each other in such a way that machines can start making the matches for us?

It plays into the work that Chris McDowall demoed yesterday – making matches across collections based on people. That sort of thing can be done automatically, by the right person with the right tools. All we need to do is let people like Chris use our websites and collections and see what they can do.

People are easy – ish; so are places, and even some events. They’re hooks that can connect our websites, and connect our content. Subjects and classifications are potentially no different. A little more ambiguous at times but not impossible.

———

Historical accidents

One of the things you notice when you look at collections is they’re never as comprehensive as you’d hope. They’re riddled with historical accidents. No institution has everything related to a subject or person or place. Or think of the Treaty of Waitangi – held by Archives New Zealand, soon to be housed in the building of the National Library, but arguably as relevant to Te Papa’s collection.

Look at any significant artist and see how their works are scattered across museums and galleries around the country (if not the world).

That’s history – different things get picked up by different institutions at different times, and we can’t change it. But for the user it’s infuriating. Why can’t I see all of someone’s work in one place? Or everything on a particular event all together?

National stories

The beauty of linking all our content together is that it creates a layer of meaning and use that sits above individual collections. It lets us all play to our strengths. Organisations can maintain their own web presence that talks to their mission, their collection, and their community, but it lets a much wider community tell their own stories that cross-walk all the separate institutions and collections. Through that we and our users could create truly national stories using all the different parts held in different institutions around the country.

That’s taken us away from the simple idea of linking items to items. That network is hard but we need to do it. At the same time, let’s not forget about doing the simple stuff. If there’s stuff you can connect your collection items to, just do that. It’s a start, and if it gives more use and meaning to your users then you’re doing something right.

But keep the hard stuff in mind – agitate for it, remind people why it’s worth doing, do it if you can and share the results with as much of the network as possible. That way we build a richer digital ecosystem for developers and our users.

Open letter to cultural collecting organisations

Last week I spent two days at the NDF conference in Wellington. This is the lightning talk I wish I’d thought to give.

I work on web projects at the Ministry for Culture and Heritage.* We run some wonderful websites, sites like Te Ara, NZHistory, Vietnam War, 28th Maori Battalion, and others. They’re popular, especially the first two. Te Ara gets about a quarter of a million visits a month, NZHistory a bit under 200k. That’s not bad for government websites, in fact, it’s pretty bloody good.

There are a couple of really obvious things going on on our sites, and neither is unique. I won’t ask you to guess. The first…

1. Text


There’s a lot of it. It’s one of the main things our area of the Ministry does. We research, write and edit text, whether for books or the web. Text is something we’ve been doing since our origins in the historical branch of the Department of Internal Affairs in the 1930s.

I’ll come back to the second thing that’s happening soon, but first take a look at this:


It’s the Creative Commons licence from Te Ara. A similar one appears on NZHistory. It’s a complicated statement that attempts to say “you’re welcome to use the text – non-commercially – and some of the images”. Which leads me nicely along to the second thing happening on the site.

2. Images (etc)


Our sites use a ton of images, like the one above from Auckland City Libraries – Tāmaki Pātaka Kōrero (Reference: NZ Map 2664), video and audio files, interactives and diagrams. Images are the biggest group. We own some of them, maybe 10%, maybe as little as 5%. The vast majority of them, and this goes for video and audio too, come from organizations like National Library, the Turnbull, Archives NZ, museums, galleries and other organizations up and down the country.

To get a sense of how many images we use from big collecting institutions, try a Google image search of Te Ara for Museum of New Zealand Te Papa Tongarewa, Auckland War Memorial Museum, or the Alexander Turnbull Library.

What’s my point?

There was a lot of talk at NDF about the need for narrative around organizations’ collections, ways to connect the dots for users, as well as finding new ways to present collections, making interaction more about giving rather than expecting the user to ask for something. And that’s pretty much what we do – our text is the context for all the collection content that we re-use. Our content is context for that content.

Back to that Creative Commons licence and what it’s effectively saying. As a user of our websites you can use our text for non-commercial purposes like research, study, mixing and mashing, etc. If you want to use it commercially, get in touch and we can talk. Of the images, where we own them you can use those as well. Our view is simple: New Zealand taxpayers funded the creation of this content and continue to fund the websites. It was created with the public good in mind, and sharing it widely contributes to that public good.

So what about the bulk of the images and other media files on the sites, can you use them too? No. They’re not ours and we can’t share them. You can use their captions – they’re ours – but sorry, not the pictures themselves.

The pitch

So here’s my pitch to the holders of our cultural collections: how can we work together to share our text and your images? How can we build on the narrative that Te Ara and NZHistory provide about your collections? How do we collaborate to make shared content and narratives available for re-use? And what could we do with that shared pool of content ourselves?

These seem like obvious questions to answer. Like our content, your collections are paid for and maintained by the taxpayers of New Zealand, and it’s on you to share and make this content available. Surely it would help your organization if you could easily find and re-use descriptions about your collection items? Or how about a shared application like the beautiful Biblion iPad app from New York Public Library?

Like someone said at the conference, it’s just programming. This stuff should be easy. We’ve found the images, written about them, created a narrative, all we need now is permission and willingness.

* Full disclosure: as well as working at the Ministry, I’m also joining the NDF Board for 2012 and 2013. The views expressed here are mine and not necessarily those of the Ministry or the NDF Board.