AGU RESEARCH

Columns that reveal the world
- Getting up close and personal with the researchers -

The world we live in is full of problems,
from issues close to home to issues that affect all of humanity.
Our faculty members offer fresh insights
into the current situation
and the surprisingly little-known truths behind these problems.

  • Faculty of Letters, Department of Japanese Literature
  • Utilizing cutting-edge computer technology
    Understanding with Data Science
    The mysterious world of classical languages
  • Professor Yasuhiro Kondo

Starting computer-based research on the Japanese language in the 1970s

It was in 1973 that I entered the College of Literature at the University of Tokyo and began studying Japanese linguistics. Judging from overseas trends, I thought even then that computers would become essential to future research in Japanese linguistics, and I saw great potential in studying the language through what is now called data science. Personal computers were not yet common in Japan, but large computers were available at the university, and computer classes for humanities students were already being offered in the second year. It was the dawn of computing in Japan, and although I had to learn much of it on my own, I continued to use computers to study Japanese linguistics in graduate school. The world of computers evolves constantly, so old knowledge has to be updated all the time. Since I began teaching at the College of Literature at our university in 1991, I have continued to study the latest computer technology and Japanese linguistics in parallel, up to the present day.

 

One of my research themes is interpreting classical Japanese poetry and literature, such as the Kokinshu and The Tale of Genji, through data science. The meanings of words in Japanese are vast, and the nuances of each word are extremely diverse and complex. For example, "cold" and "cool" both express a low temperature, but "cold" carries a negative image while "cool" carries a comfortable one. Even when two words describe the same low temperature, the impressions they convey are very different. Meanings also shift over time: "utsukushii" (beautiful) now means "aesthetically pleasing," but in the past it meant "cute." Until now, linguistics has had no method for capturing such differences clearly and formally. Nor can dictionaries settle the matter, because the explanation of a word given in a Japanese dictionary is only one interpretation by its editor.

 

To address this, a technique was devised in which computers accumulate and examine tens of millions of word samples, which can reveal the meaning and function of a word in a given context as well as its original meaning (Figure 1). With a computer, it becomes easy to see which other words a word "co-occurs" with and in what contexts. Representing a word's meaning by these co-occurrence patterns is called a "distributed representation of word meaning," and when such a representation is assigned (embedded) to each word, the word becomes a sequence of numbers. Once words are converted into numerical values (vectors) in this way, comparing the numbers shows how similar one word is to another.
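The idea of comparing words through the company they keep can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the four English sentences, the window size, and the function names `cooccurrence_vectors` and `cosine` are all invented for this example, and real research works with millions of samples and learned embeddings rather than raw counts.

```python
import math
from collections import Counter


def cooccurrence_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` positions of it."""
    vectors = {}
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            context = tokens[lo:i] + tokens[i + 1:hi]
            vectors.setdefault(word, Counter()).update(context)
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (1.0 = same direction)."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical toy corpus, already split into words.
SENTENCES = [
    ["the", "cold", "wind", "blows"],
    ["the", "cool", "wind", "blows"],
    ["the", "cold", "rain", "falls"],
    ["a", "beautiful", "flower", "blooms"],
]

VECTORS = cooccurrence_vectors(SENTENCES)
print(cosine(VECTORS["cold"], VECTORS["cool"]))       # shared contexts
print(cosine(VECTORS["cold"], VECTORS["beautiful"]))  # no shared contexts
```

In this toy corpus "cold" and "cool" appear next to the same words and so score high, while "cold" and "beautiful" share no context words at all. The score measures similarity of usage only; it does not by itself say which word is negative and which is comfortable.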

 

Furthermore, this makes it possible to "calculate" with the meanings of words. For example, taking the vector for "king" and adding the difference between the vectors for "woman" and "man" yields a vector close to that of "queen." Because such addition and subtraction are possible, the method can be used to understand more complex systems of language. In recent years, the development of "Word2vec," software that uses "deep learning" to teach computers processes that humans perform naturally, has made such values technically easy to calculate. Doing this work by hand, as in traditional linguistics, is impractical, because the number of words to consult and the relationships among them are simply too enormous.
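The "king" and "queen" calculation can be reproduced with a toy example. The three-dimensional vectors below are hand-made for illustration (the dimensions loosely stand for royalty, maleness, and femaleness) and are not output from Word2vec; the helper name `analogy` is likewise an assumption of this sketch.

```python
import math

# Hypothetical hand-made vectors: (royalty, maleness, femaleness).
TOY_VECTORS = {
    "king": (1.0, 1.0, 0.0),
    "queen": (1.0, 0.0, 1.0),
    "man": (0.0, 1.0, 0.0),
    "woman": (0.0, 0.0, 1.0),
    "prince": (0.5, 1.0, 0.0),
}


def cosine(u, v):
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def analogy(start, minus, plus, vectors):
    """Return the word whose vector is closest to start - minus + plus."""
    target = tuple(
        s - m + p
        for s, m, p in zip(vectors[start], vectors[minus], vectors[plus])
    )
    # The three input words are excluded, as is usual for this kind of query.
    candidates = [w for w in vectors if w not in (start, minus, plus)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))


print(analogy("king", "man", "woman", TOY_VECTORS))
```

With these invented values the result is "queen". Embeddings learned from real text are noisier, so the nearest word is usually close to the computed vector rather than exactly equal to it.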

 

 

Figure 1

A computer-generated classification diagram of "Shiku conjugation adjectives" from the Heian period.

The adjectives on the left are subjective, and the adjectives on the right are somewhat objective.

Using computer technology to unravel the mysteries of Heian period culture

I thought that applying such cutting-edge computer technology to the study of classical languages would lead to an understanding of their essence. This is because modern people who try to understand the meaning of a classical language can grasp only a very narrow part of it. In a modern language, take the word "nomi" in the phrase "nomi ni ikou" (let's go drinking): we can tell immediately that it is not merely the conjunctive form of the verb "nomu" (to drink) but means "to drink sake (enjoyably) together." Everyone knows that it does not mean "drinking water," because we share the context and background in which the word is used. Unfortunately, we share very little of the lives and culture of the characters who appear in The Tale of Genji, so we cannot correctly grasp such subtle differences. The intuition about language that is shaped by our own senses and experience is called "introspection," and it works for modern languages but not for classical ones. However, as mentioned above, analyzing and verifying a huge volume of classical text by computer makes the essence of the words clear. In this sense, cutting-edge computer technology gives us a substitute for introspection in classical languages.

 

In classical Japanese, we do not really know how word use differed according to what we would today call gender. It is true that men and women used different words, as can be seen in the poems by men and by women in the Kokinshu, but the mechanism behind this, such as the rules and conventions involved, is still shrouded in mystery. Similarly, waka poetry contains no honorific language, yet there is no clear answer as to why. There must have been an obvious reason to the people of that time, and if we could ask Murasaki Shikibu directly she would answer right away, but we today have no way of knowing. Computers, however, can do a great deal to help solve these mysteries. I think we are still only standing at the doorway to deeper research.

Computer analysis of the Meiji-era Japanese translation of the Bible

Another research topic that interests me is the Japanese translations of the Bible made during the Meiji period. The American missionary James Curtis Hepburn and others played a central role in translating the Bible into Japanese, and I wanted to use computer technology to trace how the original Greek Bible was rendered into Japanese. I served as director of our university's library until 2019, and during that time I was reminded of the richness of the university's Bible collection, which I saw as a great opportunity to use as source material.

 

To investigate how the Japanese translation of the Bible was made during the Meiji period, we examined not only the original Greek but also the English and Chinese versions, and how each of them influenced the Japanese translation. Until now, such research has been done by hand, comparing the documents one by one, but with a computer it is possible to see at a glance how a single word is expressed in each language. The Meiji Japanese translation has a complex form with ruby (furigana), and the Chinese version carries kunten (reading marks), but when all of this text is fed into a morphological analysis tool developed by the National Institute for Japanese Language and Linguistics, the parts of speech are determined instantly. Similar software is available for Greek and English, so by performing the same analysis we can build a "corpus," a database of accumulated written and spoken language. This makes it easy to compare and verify the texts at the word level, and it also provides detailed information such as how often a certain word is used.
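Once every text has been analyzed into words with parts of speech, the word-level comparison described above reduces to counting. The following is a minimal sketch, assuming the morphological analysis has already been done: the three "verses" and their romanized lemmas are invented placeholders rather than text from any actual Bible edition, the record layout is hypothetical, and the code does not call the institute's analysis tool.

```python
from collections import Counter

# Hypothetical pre-analyzed records: (lemma, part of speech) per language.
CORPUS = [
    {"id": "v1",
     "greek": [("theos", "NOUN"), ("lego", "VERB")],
     "english": [("God", "NOUN"), ("say", "VERB")],
     "japanese": [("kami", "NOUN"), ("iu", "VERB")]},
    {"id": "v2",
     "greek": [("theos", "NOUN"), ("agapao", "VERB")],
     "english": [("God", "NOUN"), ("love", "VERB")],
     "japanese": [("kami", "NOUN"), ("aisuru", "VERB")]},
    {"id": "v3",
     "greek": [("anthropos", "NOUN"), ("lego", "VERB")],
     "english": [("man", "NOUN"), ("say", "VERB")],
     "japanese": [("hito", "NOUN"), ("iu", "VERB")]},
]


def lemma_frequency(corpus, language, pos=None):
    """Count how often each lemma occurs in one language, optionally for one POS."""
    counts = Counter()
    for verse in corpus:
        for lemma, tag in verse[language]:
            if pos is None or tag == pos:
                counts[lemma] += 1
    return counts


def renderings(corpus, source_lang, lemma, target_lang, pos):
    """Count target-language lemmas of a given POS in verses containing the source lemma."""
    counts = Counter()
    for verse in corpus:
        if any(l == lemma for l, _ in verse[source_lang]):
            counts.update(l for l, tag in verse[target_lang] if tag == pos)
    return counts


print(lemma_frequency(CORPUS, "greek"))
print(renderings(CORPUS, "greek", "theos", "japanese", "NOUN"))
```

Filtering by part of speech is what makes questions such as "how often does this translation use honorific forms?" answerable across a whole text, provided the analysis tool tags those forms.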

 

 

Acts

A Chinese-language Bible of the Acts of the Apostles published in Yokohama in 1884.

The Korean reading "吐" is printed on it.

 

 

When I used this method to study the Meiji-era Japanese translations of the Bible, I found that they use honorific language in a distinctive way. You might think it natural for the Bible to contain honorific language, since it is the word of God, but Meiji-era prose of the kind found in novels basically did not use it. Moreover, the Greek Bible contains no honorific language, so a translation faithful to the original would not need any. Why, then, did Hepburn and his colleagues use honorific language when translating into Japanese? In fact, there is another type of Meiji-era text that used honorific language: textbooks, the so-called Japanese language readers. Textbooks are read aloud by teachers in classrooms, so it would not do for them to lack honorific language. My hypothesis is that, because the Bible is likewise read aloud by pastors in churches, the translators judged a certain degree of honorific language to be necessary. This finding was possible because computers can now analyze every word by its part of speech. In other words, without computers this kind of research could not be done in the first place. The tools already exist, so what matters is coming up with the idea of applying them, for example, "Let's study the Japanese translation of the Bible." Apply the same method to Natsume Soseki or Akutagawa Ryunosuke, and it becomes a study of modern literature. There are countless other interesting themes in the history of language and literature. The key is deciding which theme to take up.

 

 

The opening of the Greek Bible "The Gospel of Matthew" that Hepburn likely used

(Published in the United States in 1822)

The need for researchers who transcend the fields of humanities and science

Today, it is unclear what the goal of language research should be. In the past, language research was said to be useful for machine translation technology and Japanese language education, but machine translation has evolved independently of linguistics, and Japanese language education has recently placed less emphasis on grammar, with the trend being to expose students to a great deal of language through experience, as in English education. Does this mean that language research is no longer necessary? Not at all. I believe that language research will become an academic field that studies the nature of human culture from a broader perspective, rather than one judged by whether it is immediately useful. To achieve this, I believe we must restructure the discipline so that it allows us to understand human culture more deeply.

 

Some people say that once automatic translation becomes widespread, it will no longer be necessary to study foreign languages, and thanks to the improved accuracy of translation software it has indeed become quite easy to read texts on overseas websites. There are also apps that will proofread the Japanese you have written. However, the essential act of thinking with your own head and speaking in your own words is something computers can never do for you. When people speak or write in their native language, their education and their thoughts and feelings about words naturally show. If anything, I think that in the computer age it will become extremely important to express who you are in words. When that happens, it will matter a great deal to have, as part of your education, an understanding of fundamental questions such as "What are words?" and "What is honorific language?" For example, if you know the grammar behind the honorific expression "sasemaseru" and how it is meant to be used, you can use it correctly and with confidence; if you do not, you will simply use "sasemaseru" wherever you please. You will often see such examples in daily life.

 

I strongly feel that future Japanese language research will need to be conducted in collaboration with information engineering specialists. In recent years, the idea of "integration of humanities and sciences," which means learning across academic fields beyond the boundary between the two, has emerged in education. However, this is quite a challenge. In my research field, a Japanese language specialist who does not know what a computer can do cannot ask an information engineering specialist for help, and conversely, an information engineering specialist without linguistic knowledge will not know what to do. The conversation breaks down, just as it does between people who speak different languages. An interpreter is therefore needed to bridge the gap, that is, a person who is an expert in both fields at the same level. For example, such a person would be an expert in The Tale of Genji and also an expert in deep learning, but at the moment there are not many such people in Japan.

 

However, I believe that a large proportion of young researchers will have to be like that in the future. In the humanities in Europe, machine learning and deep learning methods have been adopted and are already becoming quite common. Even in areas of science in Japan that have had no connection to information science until now, there are young researchers incorporating deep learning into their research, for example in astronomy. Information methods can be applied in every field, but for those in humanities research in particular, an infinite, untouched blue ocean (an unexplored area with no competitors) lies right in front of them. I think that young humanities researchers who have knowledge of information engineering will be able to become world leaders in no time.

Programming has now become a required subject in high schools, so even liberal arts students will need basic programming skills from now on. However, it is true that some people are better at it than others. I am not good at sports, for example, so being forced to exercise is simply painful for me. I think it is important to create a diverse, multi-track curriculum that lets students learn with interest rather than forcing them. Also, if students with excellent liberal arts abilities are denied the opportunity to study because they lack science knowledge, it will be a great loss for education. To prevent this, I believe we need to come up with various ideas to diversify research.

(Published October 2022)

Related articles

  • "Grammar and Pragmatics of Honorific Language" edited by Yasuhiro Kondo and Jun Sawada (Kaitakusha: 2022)
  • "Corpus and Japanese Language History Research" edited by Yasuhiro Kondo, Makiro Tanaka, and Toshinobu Ogiso (Hitsuji Shobo: 2015)
  • "Methods of Research on Imperial Waka Poetry" by Miyuki Kondo (Kasama Shoin: 2015)

Study this topic at Aoyama Gakuin University

College of Literature Department of Japanese Language and Literature

  • Faculty of Letters, Department of Japanese Literature
  • Professor Yasuhiro Kondo