Digital Humanities & Hindu Studies

Directors

Bjarne Wernicke-Olesen
Lucian Wong

Data Science Lead

Ulrik Lyngs

Researchers

Rajan Khatiwoda (Śākta database)
Silje Lyngar Einarsen (Śākta database)

Digital curators

Siddharth Chhabra (Bengali database)
Michael Elison (Śākta database)
Prema Goet (Śākta and Geldblum Collection)

Project outline

This project explores the potential of combining computational methods with traditional scholarly analysis in Hindu Studies. Compared with traditional workflows, in which scholars manually collate, compare and critically edit manuscripts into edited volumes, new computing tools hold substantial promise. For example, many time-consuming tasks can now be automated, and the analysis of large amounts of data can yield insights that were previously out of reach.

With programming languages such as R or Python it is now possible to read entire corpora of texts into a computer and readily obtain answers to questions such as: How does the frequency of specific words differ between texts? Which words commonly co-occur? Which texts are unusual or interesting by some criterion, such as the occurrence of specific words or phrases, the length of verses, and so on? Answering these sorts of questions through the analysis of large corpora of texts spanning a longue durée opens up entirely new possibilities and fields of investigation (e.g. in relation to discourse analysis, conceptual history and statistics). In this way, automated methods give scholars new ways to efficiently understand and interact with large bodies of texts, which may then be combined with more traditional, in-depth manual analysis and interpretation.
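As a rough illustration of the first two questions above, the following Python sketch counts word frequencies per text and word pairs that co-occur within a verse. The corpus, the text names and the function names are hypothetical placeholders, not project data or project code, and splitting on whitespace is a simplifying assumption; real Sanskrit or Bengali material would need proper tokenisation and normalisation.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy corpus: text name -> list of verses (placeholder words only).
corpus = {
    "text_a": ["devi shakti devi", "shakti mantra"],
    "text_b": ["mantra yantra mantra", "devi mantra"],
}


def tokenize(verse):
    """Lower-case a verse and split it on whitespace (a simplifying assumption)."""
    return verse.lower().split()


def word_frequencies(verses):
    """Count how often each word occurs across all verses of one text."""
    counts = Counter()
    for verse in verses:
        counts.update(tokenize(verse))
    return counts


def cooccurrences(verses):
    """Count unordered pairs of distinct words that appear in the same verse."""
    pairs = Counter()
    for verse in verses:
        words = sorted(set(tokenize(verse)))
        pairs.update(combinations(words, 2))
    return pairs


# Word frequencies per text, so that texts can be compared with one another.
freqs = {name: word_frequencies(verses) for name, verses in corpus.items()}

# Co-occurrence counts over the whole corpus.
all_verses = [verse for verses in corpus.values() for verse in verses]
pair_counts = cooccurrences(all_verses)

for name, counts in freqs.items():
    print(name, counts.most_common(2))
print(pair_counts.most_common(3))
```

Treating the verse as the unit of co-occurrence is one choice among several; a sliding window of a fixed number of words, or a whole chapter, could be substituted in `cooccurrences` depending on the research question.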

New digital tools also allow scholars to easily share data and analyses in scripts and open-access databases, or to build online visualisation tools that allow others to interact with digitised content in new ways.

Project outputs

  • Online lecture series and seminars on basic workflows for textual analysis in data science, including methods for text mining and visualisation of large corpora (YouTube)
  • Establishing a state-of-the-art OCHS open-access database (https://ochs-database-demo.netlify.app/), providing a new user interface for browsing and interacting with primary research materials
    • The database establishes the primary research material for Śākta traditions in South Asia as an emerging field of study. It makes previously unknown material widely available and searchable for the first time.
    • The database aims to include tens of thousands of manuscripts drawn from the OCHS Kathmandu digitisation project, the National Archives of Nepal, the ASA archives, the Kaiser Library and metadata from the NGMCP as well as a large number of Bengali texts and the Geldblum Collection.
    • Compared to existing major manuscript databases such as the Cambridge Digital Library and the NGMCP, the OCHS database offers a more advanced interface allowing users to
      • see transliterated and translated texts side-by-side with images of the original manuscripts, and download specific views of text data in structured form (e.g. CSV)
      • overlay text on the manuscript image for comparison (e.g. a transliteration or translation against the original Sanskrit text)
      • add comments or suggest corrections for text or image material
  • The OCHS database offers new workflows for use of computational tools in Hindu Studies, including
    • automatic generation of formatted HTML, PDF, or Word files with customised content from specific manuscripts (e.g. choosing to include the original Sanskrit, a transliteration and a translation in the language of choice)
    • textual analysis and concordancing (e.g. counting and comparing the frequency of specific words or phrases across manuscripts, including the identification of parallel passages)
    • automated transliteration of hand-written manuscripts
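
The concordance and parallel-passage workflow mentioned in the list above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the OCHS database's implementation: the manuscript identifiers, the placeholder texts and the function names are all hypothetical, words are split on whitespace, and a "parallel passage" is approximated as any exact sequence of n words shared by more than one manuscript.

```python
from collections import defaultdict

# Hypothetical toy data: manuscript id -> transliterated text (placeholder words only).
manuscripts = {
    "ms_1": "om namah devi shakti rupa namah",
    "ms_2": "jaya devi shakti rupa jaya",
}


def concordance(text, keyword, window=2):
    """Return keyword-in-context lines with `window` words on either side of each hit."""
    words = text.split()
    hits = []
    for i, word in enumerate(words):
        if word == keyword:
            left = words[max(0, i - window):i]
            right = words[i + 1:i + 1 + window]
            hits.append(" ".join(left + [f"[{word}]"] + right))
    return hits


def shared_ngrams(texts, n=3):
    """Map each n-word sequence found in more than one manuscript to the ids containing it."""
    index = defaultdict(set)
    for ms_id, text in texts.items():
        words = text.split()
        for i in range(len(words) - n + 1):
            index[tuple(words[i:i + n])].add(ms_id)
    return {gram: ids for gram, ids in index.items() if len(ids) > 1}


kwic = concordance(manuscripts["ms_1"], "devi")
parallels = shared_ngrams(manuscripts)

print(kwic)
for gram, ids in parallels.items():
    print(" ".join(gram), sorted(ids))
```

Exact n-gram matching misses passages that differ through scribal variants or sandhi; fuzzy matching on normalised text would be a natural extension, at the cost of more false positives.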