Part I: The data science pipeline
Part II: Basic text mining
Part III: Practical example with RStudio and texts from GRETIL
Part IV: Reproducible work with R Markdown
In recent years, the number of Sanskrit texts available in digital formats has grown rapidly. This provides new opportunities to improve and revise traditional scholarly understandings of Hindu traditions based on primary sources. In addition to the digital application of traditional methods of textual criticism and the editing of Sanskrit texts, Digital Humanities opens the door to entirely new types of analysis based on methods from statistics and computer science, such as automated analysis of word frequencies, topic modelling, and visualisation of key features in vast textual corpora. In this lecture, I introduce recent developments in the Digital Humanities, with a focus on Oxford, and the opportunities this field provides for the Study of Religions and for Hindu Studies in particular. Using data from sources such as the Göttingen Register of Electronic Texts in Indian Languages (GRETIL) and the Muktabodha database, I illustrate how the programming language R can be used to analyse and visualise Tantric texts and to generate reproducible, publication-quality outputs in multiple formats, such as traditional journal articles or interactive websites.
Dr Ulrik Lyngs recently defended his DPhil thesis in Computer Science at the University of Oxford. He has an interdisciplinary background, with an MA in the Study of Religion and Psychology from Aarhus University and an MSc in Cognitive and Evolutionary Anthropology from the University of Oxford. He was awarded the EPSRC Doctoral Prize in 2019 for his thesis research and has received multiple prizes for research communication and impact, including the 2017 DOMUS Prize from Linacre College and the 2020 MPLS Impact Awards. He is a Data Scientist and Digital Humanities Consultant at the Śākta Traditions research programme, where he provides support and consultancy on natural language processing methods for the analysis of large text corpora, using reproducible workflows in R Markdown. At the Department of Computer Science, he researches design strategies for supporting self-control over digital device use, using a mix of quantitative and qualitative methods, including large-scale web scraping, automated textual analysis, experimental studies, and participant interviews. His MA thesis at Aarhus University analysed religion as a culturally evolved set of beliefs and practices for scaffolding self-regulation.