AI in Humanities: Teaching Machines to Read and Interpret Classical Texts

Can AI go beyond math and science? Here is how digital humanities is paving the way.

Can AI impact the humanities? The connection may not be clear at first, since most of these fields involve little science or math. They are more concerned with developing theories than with testing hypotheses through experimentation. Subjects like literature, philosophy, history, languages, and religious studies (and theology) rely heavily on the interpretation and qualitative analysis of texts. In such an environment, how could mathematical algorithms be of any use?

Before addressing that question, we must first look at digital humanities, the field that built a bridge from ancient texts to modern computation. The field dates back to the 1940s, before the emergence of Artificial Intelligence as a discipline. Ironically, the first project in this area was a collaboration between an English professor, a Jesuit priest, and IBM to create a concordance for Thomas Aquinas’ writings.

As digital technology advanced and texts were digitized, the field continued to grow in importance. Its primary purpose is both to apply digital methods to the humanities and to reflect on their use. That is, its practitioners are interested not only in digitizing books but also in evaluating how the digital medium affects human understanding of these texts.

Building on the foundation of digital humanities, the connection with AI becomes clear. Once computers can ingest these texts, text mining and natural language processing become possible. With recent advances in machine learning algorithms, cheaper computing power, and the availability of open source tools, the conditions are ripe for an AI revolution in the humanities.

How can that happen? Machine learning combined with Natural Language Processing can open avenues of meaning that were not accessible before. For centuries, these academic subjects have relied on the accumulated analysis of texts performed by humans. Yet the human capacity to interpret, analyze, and absorb texts is finite. Humans do a great job of capturing meaning and nuance in texts of hundreds or even a few thousand pages. As the volume increases, however, machine learning can detect patterns that are not apparent to a human reader. This can be especially valuable in applications such as author attribution (determining who the writer was when that information is unclear or in question) and the analysis of cultural trends, semantics, tone, and relationships between disparate texts.
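To make author attribution a little more concrete, here is a minimal sketch of one simple approach: comparing how often each text uses common "function words" such as "the" and "of", which writers tend to use habitually. The word list, the sample sentences, and the author labels are all invented for illustration, and real studies rely on far longer texts and more careful statistics.

```python
import math
import re
from collections import Counter

# A tiny, illustrative list of English function words (an assumption;
# real stylometric studies use much longer lists).
FUNCTION_WORDS = ["the", "of", "and", "to", "in", "that", "but", "with"]


def profile(text):
    """Return the relative frequency of each function word in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid dividing by zero on empty input
    return [counts[word] / total for word in FUNCTION_WORDS]


def distance(p, q):
    """Euclidean distance between two frequency profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def attribute(disputed, candidates):
    """Return the candidate whose profile is closest to the disputed text."""
    target = profile(disputed)
    return min(candidates, key=lambda name: distance(target, profile(candidates[name])))


# Hypothetical writing samples, made up for this example.
candidates = {
    "Author A": "the king of the land and the queen of the sea "
                "spoke of the war and of the peace",
    "Author B": "she went to town to see him but he had to go "
                "to work with her",
}
disputed = "the lord of the hills and the lady of the lake"

best_guess = attribute(disputed, candidates)
print(best_guess)
```

In this toy case the disputed sentence leans on "the", "of", and "and" much as the first sample does, so the sketch picks Author A. With texts this short the result means very little; the point is only that habits of style can be counted and compared.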

Therefore, as NLP and machine learning are combined to mine large bodies of text, many new kinds of analysis become possible. One example is using word2vec algorithms to find similarities in disparate texts. Word2vec maps each word to a point in a multi-dimensional space, learned from the contexts in which the word appears, so that words with related meanings end up close to each other. In this way, synonyms sit near one another, along with words of the same category (e.g., boat, car, and airplane as motorized vehicles). By looking at texts this way, the algorithm can mathematically measure how close in meaning one text is to another, making connections that would take readers a very long time to find.
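The comparison step can be sketched in a few lines of Python. The example below does not train word2vec; it assumes the word vectors already exist and uses hand-written, three-dimensional toy vectors (real models learn vectors with hundreds of dimensions from a large corpus, for instance with the Word2Vec class in the gensim library). Averaging the word vectors of each text and comparing the averages with cosine similarity is one common, simple way to compare texts, not the only one.

```python
import math

# Toy word vectors written by hand for illustration only.
# A trained word2vec model would supply these values instead.
TOY_VECTORS = {
    "boat":     [0.9, 0.1, 0.0],
    "car":      [0.8, 0.2, 0.1],
    "airplane": [0.9, 0.0, 0.2],
    "prayer":   [0.0, 0.9, 0.1],
    "hymn":     [0.1, 0.8, 0.2],
    "psalm":    [0.0, 0.9, 0.2],
}


def text_vector(words):
    """Average the vectors of all known words in a text."""
    known = [TOY_VECTORS[w] for w in words if w in TOY_VECTORS]
    if not known:
        raise ValueError("none of the words have a vector")
    dims = len(known[0])
    return [sum(vec[i] for vec in known) / len(known) for i in range(dims)]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (closer to 1.0 means more similar)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Three hypothetical "texts", reduced to a couple of words each.
text_a = ["boat", "car"]
text_b = ["airplane", "car"]
text_c = ["hymn", "psalm"]

sim_ab = cosine_similarity(text_vector(text_a), text_vector(text_b))
sim_ac = cosine_similarity(text_vector(text_a), text_vector(text_c))

print(f"vehicles vs. vehicles: {sim_ab:.2f}")
print(f"vehicles vs. worship:  {sim_ac:.2f}")
```

The two texts about vehicles score much closer to each other than either does to the text about worship, which is the kind of signal a researcher could then use to surface related passages across a large collection.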

Reality Changing Observations:

Q1. What field of humanities would most benefit from AI algorithms? Why?

Q2. Can machines extract meaning from texts better than humans? Why or why not?

Q3. As AI starts interpreting text, what ethical concerns does it raise?