Mining Trove


Gavin Fordyce, 7 August 2012

We all know that the Trove newspaper database is a rich resource for historical research, but how can we make best use of it for this project? A quick search on ‘Mosman’ for the years 1914-1918 returns more than 20,000 articles — clearly we need a more strategic approach!

I’ve spent a lot of time over the last couple of years looking at ways of analysing large quantities of newspaper articles from Trove. You can play with some of the possibilities yourself, by entering search terms into QueryPic. But while some ‘big picture’ analysis might be useful down the track, I suspect we’re going to have to rely mostly on human brainpower.

What types of searches are likely to be useful? Obviously we can start with names of people and organisations that we already know. We could also search for additional names by combining keywords like ‘Mosman’ and ‘casualty’. What about the broader war effort? A search combining ‘Mosman’ and ‘recruiting’ returns a lot of interesting information about local war-related events.

There are many possibilities and I’m hoping that someone at the Build-a-thon might be interested in exploring them to help us develop some useful strategies. As well as outlining a search plan, it’s also important for us to have a picture of the types of articles available and the sorts of information they contain. Reports of events, for example, will include dates and places — structured information that we might be able to use in developing rich contexts and snazzy visualisations.

To enable us to start building this picture, I suggest we use Trove’s tagging features. As a start we can use:

Just click on the ‘Add tag’ link in Trove and type in one of these tags, making sure to keep it ‘public’. Once the articles are tagged we can easily retrieve them for further analysis.

What do you think? What other tags might be useful? How might we use the articles once we’ve identified them? Come along to the Build-a-thon to discuss this all further!


Comments

Bernard D · 8 August 2012

While you’re tagging, you might do a little text correcting. Do you keep the hyphen when a word wraps? I found the text correcting guidelines handy last night.


Tim · 8 August 2012

Indeed! And by using the Trove API not only will we be able to retrieve the tagged articles, we’ll be able to generate an automatic ‘to-do’ list of the uncorrected ones.
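
As a rough illustration of that ‘to-do’ list idea, here is a minimal Python sketch. It assumes the tagged articles have already been retrieved and converted to dictionaries; the field names `correctionCount`, `date` and `heading`, and the sample records themselves, are assumptions made for the example rather than confirmed details of what the Trove API returns.

```python
from typing import Dict, Iterable, List


def uncorrected_todo(articles: Iterable[Dict]) -> List[Dict]:
    """Return the articles with no recorded text corrections, oldest first.

    Assumes each article is a dict with an optional 'correctionCount'
    field (hypothetical name) and an ISO-style 'date' string.
    A missing count is treated as zero corrections.
    """
    todo = [a for a in articles if int(a.get("correctionCount", 0)) == 0]
    return sorted(todo, key=lambda a: a.get("date", ""))


# Made-up sample records standing in for tagged articles.
sample = [
    {"id": "1", "heading": "Recruiting rally", "date": "1915-07-03", "correctionCount": 4},
    {"id": "2", "heading": "Casualty list", "date": "1916-08-12", "correctionCount": 0},
    {"id": "3", "heading": "Red Cross fete", "date": "1915-02-20"},
]

for article in uncorrected_todo(sample):
    print(article["date"], article["heading"])
```

Sorting by date is just one choice; the same list could as easily be grouped by newspaper title or by tag.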


Marg · 8 August 2012

Good luck with the project. Earlier in the year we started a similar one, “The Ryde goes to War Project”, for service men and women from the Ryde Municipality! When we started, the ‘Cumberland Argus’ on Trove was only a dream, but in the last month or so it has been added, covering the WWI years, and has made searching for local stories so much easier than scanning microfilm!
The tagging of relevant items in Trove is a great idea.


Bernard D · 9 August 2012

Thanks Marg, looks great! Will have a good look through the website tonight. Join us on Saturday if you can.


Bernard D · 9 August 2012

The tempo that web publishing makes possible has given us some interesting day by day retellings. One of the best is the Gallipoli real-time Twitter project from a Turkish history magazine – @gallipoli_live. It added another dimension to Anzac Day for me this year, seeing the reports coming in from the beach in real time. The University of Oxford did something similar recently with the Battle of Arras and @Arras95 but they added opportunities for the audience to participate in the telling of the story. A day by day view for mosman1418 content might be worth exploring.