Search And Mining Tools with Linguistic Analysis: domain specific search through statistical language modelling

Dr Dan Levene and Mr Martyn Harris will present ‘Samtla’, a collaborative Digital Humanities research project between the University of Southampton and Birkbeck, University of London.

Samtla provides a search facility based on statistical language modelling, together with document comparison tools for identifying both local and global patterns of language use and change over time.

Samtla is language-agnostic, so the tools can be applied to any language corpus or collection of documents. The current implementation provides search and comparison over a corpus of Aramaic Magic Texts from Late Antiquity (AMTLA), and an English version based on the Book of Genesis as it appears in a collection of Bibles ranging from the 13th century to 2010.
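
To give a flavour of what search through statistical language modelling can look like, here is a minimal sketch in Python. It is not Samtla's actual implementation: the character trigram model, the smoothing weight `lam`, and the function names `char_ngrams` and `rank` are all assumptions made for illustration. Each document is scored by the smoothed log-probability that its character n-gram model would generate the query. Working on characters rather than words is one way to avoid language-specific tokenisation.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Return the overlapping character n-grams of a lowercased string."""
    text = text.lower()
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def rank(query, docs, n=3, lam=0.5):
    """Rank document ids by smoothed query log-likelihood (illustrative only).

    docs maps a document id to its text. lam is a hypothetical interpolation
    weight between the document model and the whole-collection model.
    """
    doc_counts = {d: Counter(char_ngrams(t, n)) for d, t in docs.items()}

    # Collection-wide counts act as the background model for smoothing.
    coll = Counter()
    for counts in doc_counts.values():
        coll.update(counts)
    coll_total = sum(coll.values())
    vocab = len(coll) + 1  # add-one smoothing keeps unseen n-grams non-zero

    scores = {}
    for d, counts in doc_counts.items():
        total = sum(counts.values())
        score = 0.0
        for gram in char_ngrams(query, n):
            p_doc = counts[gram] / total if total else 0.0
            p_coll = (coll[gram] + 1) / (coll_total + vocab)
            score += math.log(lam * p_doc + (1 - lam) * p_coll)
        scores[d] = score
    return sorted(scores, key=scores.get, reverse=True)


docs = {
    "a": "In the beginning God created the heaven and the earth",
    "b": "And the serpent said unto the woman",
}
best = rank("serpent", docs)[0]
```

In this toy example the query "serpent" ranks document "b" first, because only that document contains the query's character trigrams. Nothing in the code depends on the language or script of the texts.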

As well as presenting the history of Samtla and two working versions of it, we hope to attract other users in the Faculty of Humanities at the University of Southampton.

The seminar was live streamed and is now available below. You can also view the SAMTLA seminar on Panopto.