In the world we live in, there are many different problems, from issues close to home to issues that affect all of humanity. Here, our faculty members offer engaging insights that reveal the surprising, little-known realities behind them.
My specialty, stochastic optimal control, is a field that fuses optimal control theory with probability theory. Optimal control theory is a theory of dynamic optimization for systems that change over time. A "system that changes over time" could be the trajectory of an object moving according to the laws of physics, or an indicator representing some economic situation. A representative example of the former is the "moon landing problem": the problem of soft-landing a rocket approaching the moon's surface while consuming as little fuel as possible. The aim of optimal control theory is to derive the theoretically optimal policy for when, and how strongly, to fire the rocket's engine in reverse thrust.
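To make this concrete, here is one standard textbook formulation of the one-dimensional soft-landing problem; the notation is illustrative and not taken from any specific source. The state is the rocket's height h, velocity v, and mass m, and the control is the thrust u:

```latex
\begin{align*}
  \dot h(t) &= v(t), &
  \dot v(t) &= -g + \frac{u(t)}{m(t)}, &
  \dot m(t) &= -k\,u(t), \qquad 0 \le u(t) \le u_{\max},\\[4pt]
  &\text{minimize fuel } \int_0^{T} k\,u(t)\,\mathrm{d}t
  \quad\text{subject to}\quad h(T) = 0,\ v(T) = 0 .
\end{align*}
```

For this classical version, the optimal control turns out to be of "bang-bang" type: let the rocket fall freely, then fire the engine at full thrust at the last possible moment.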
Pioneering research on optimal control was already under way in both the United States and the Soviet Union shortly after the start of the Cold War, but it was not until the late 1950s that optimal control theory was established as a mathematical theory, in work by Pontryagin and his collaborators at the Steklov Mathematical Institute in the Soviet Union. At roughly the same time, and almost independently of the Soviet work, optimal control was also being studied in the United States at the RAND Corporation, where the applied mathematician Bellman and others were based. The 1950s and 1960s were the era of the US-Soviet space race, and optimal control theory made great advances through research on artificial satellites and rockets. The "moon landing problem" in particular can be said to symbolize the optimal control theory of this era.
From the perspective of pure mathematics, the origins of optimal control theory can be traced back to a branch of mathematics known as the calculus of variations. The calculus of variations originated with the "brachistochrone" problem posed by the mathematician Johann Bernoulli in 1696: among all curves joining two separated points at different heights, find the one along which a ball, acted on by no force other than gravity, descends in the shortest time. Of course, there were no artificial satellites or rockets in Bernoulli's time, but the ideas behind the calculus of variations evolved into optimal control theory in the 20th century. This reminds us that mathematical ideas that get at the essence of things can make great advances that transcend both time and academic fields.
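In modern notation, taking the ball to start at rest at the origin with y measured downward, energy conservation gives the speed $v = \sqrt{2gy}$, so the descent time along a curve $y(x)$ is the functional

```latex
T[y] \;=\; \int_0^{x_1} \sqrt{\frac{1 + y'(x)^2}{2\,g\,y(x)}}\;\mathrm{d}x ,
```

and the minimizing curve is not the straight line but a cycloid, $x = a(\theta - \sin\theta)$, $y = a(1 - \cos\theta)$. Finding the function that minimizes such an integral, rather than the point that minimizes an ordinary function, is exactly the kind of question the calculus of variations was created to answer.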
Next, let me explain what stochastic optimal control is. In short, stochastic optimal control theory is the stochastic version of optimal control theory. When the state of a system is uncertain, as with observed data containing noise or the movement of stock prices, stochastic effects must be built into the mathematical model from the start. Stochastic optimal control theory deals with the optimal control of such "systems containing randomness". A typical example is the "optimal investment and consumption problem" in economics. This is a stochastic optimal control problem studied by the American economist Merton around 1970: the problem of finding a strategy that maximizes the expected utility derived from wealth and from consumption, while rebalancing over time between risky assets such as stocks and safe assets such as deposits. In this problem, the ratio of investment between risky and safe assets and the fraction of funds devoted to consumption are the parameters to be controlled.
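In a common textbook form of Merton's model (again, my notation, as a sketch rather than the full statement), wealth $W_t$ evolves as

```latex
\mathrm{d}W_t \;=\; \bigl[\, r W_t + \pi_t(\mu - r)W_t - c_t \,\bigr]\mathrm{d}t
              \;+\; \sigma\,\pi_t W_t\,\mathrm{d}B_t ,
\qquad
\pi^{*} \;=\; \frac{\mu - r}{\gamma\,\sigma^{2}} ,
```

where $r$ is the safe interest rate, $\mu$ and $\sigma$ are the drift and volatility of the risky asset, $\pi_t$ is the fraction of wealth held in the risky asset, $c_t$ is the consumption rate, and $B_t$ is a Brownian motion. The formula on the right is Merton's celebrated answer for power utility with relative risk aversion $\gamma$: the optimal fraction invested in the risky asset is a constant, independent of time and of current wealth.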
As you can see, the range of subjects of optimal control and stochastic optimal control is very broad, ranging from space engineering to economics, but the research subjects I am working on are slightly more abstract problems. Rather than presenting solutions to individual concrete problems, it may be more accurate to say that I am studying the mathematical properties of idealized models that extract essential parts from these problems. Below, I would like to provide a somewhat more detailed introduction to some of the research I have conducted so far.
Stochastic optimal control deals with stochastic systems that change over time, and the most fundamental mathematical model describing such systems is the random walk. My current research subject is stochastic optimal control related to the random walk.
A random walk is a mathematical model of a particle that moves randomly across the vertices of a graph. More precisely, in a one-dimensional random walk, a particle on the equally spaced points of a line moves one step at each time step, to the left or to the right with equal probability. A two-dimensional random walk is the analogous random motion on a plane rather than a line, and a three-dimensional random walk is the analogue in space. That is, in a two-dimensional random walk the particle moves with equal probability in one of four directions (up, down, left, right) on the plane, and in a three-dimensional random walk it moves with equal probability in one of six directions (up, down, left, right, forward, backward) in space. Mathematically, random walks in four or more dimensions can also be considered.
[Figure: one-dimensional and two-dimensional random walks]
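As a minimal sketch, such a walk on the d-dimensional integer lattice can be simulated in a few lines; the function name and parameters below are mine, for illustration only:

```python
import random

def simple_random_walk(dim, n_steps, seed=0):
    """Simulate a simple symmetric random walk on the dim-dimensional integer lattice."""
    rng = random.Random(seed)
    pos = [0] * dim
    path = [tuple(pos)]
    for _ in range(n_steps):
        axis = rng.randrange(dim)          # pick one of the dim coordinate axes
        pos[axis] += rng.choice((-1, +1))  # step -1 or +1 along it: 2*dim directions, all equally likely
        path.append(tuple(pos))
    return path

print(simple_random_walk(dim=2, n_steps=10))
```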
The most fundamental question about random walks concerns "recurrence", that is, whether the particle returns to its starting point. Specifically, if the particle returns to the starting point infinitely many times, the walk is called recurrent; if it returns only finitely many times, it is called transient (non-recurrent). By Pólya's theorem, a basic result of random walk theory, one- and two-dimensional random walks are recurrent, while random walks in three or more dimensions are transient. Intuitively, the higher the dimension, the more places the walk can move to in the next step, and therefore the harder it is to return to a point it has left. Pólya's theorem shows that the boundary lies exactly between two and three dimensions.
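Pólya's theorem can be glimpsed numerically. A quick Monte Carlo experiment (the step and trial counts below are arbitrary choices) shows the average number of returns to the origin growing with the walk's length in one and two dimensions, but staying small in three:

```python
import random

def count_returns(dim, n_steps, rng):
    """Count visits to the origin (after time 0) in one simple random walk."""
    pos = [0] * dim
    returns = 0
    for _ in range(n_steps):
        axis = rng.randrange(dim)
        pos[axis] += rng.choice((-1, +1))
        if not any(pos):                   # all coordinates zero: back at the origin
            returns += 1
    return returns

rng = random.Random(0)
for dim in (1, 2, 3):
    trials = 500
    avg = sum(count_returns(dim, 5_000, rng) for _ in range(trials)) / trials
    print(f"d={dim}: average returns in 5000 steps ~ {avg:.2f}")
```

As the number of steps grows, the d = 1 and d = 2 averages keep increasing without bound, while the d = 3 average converges to a finite constant, in line with the theorem.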
We consider "controlling" this random walk. A particle following the original random walk moves to adjacent points with equal probability, but we can change this probability to any value. However, when changing the probability, we assume that the greater the deviation from equal probability, the higher the cost. On the other hand, we assume that the particle receives a certain amount of reward every time it returns to the starting point. Our goal is to find a strategy that maximizes the number of times the particle returns to the starting point while minimizing the cost of changing the probability when observing the particle's movement over a long period of time. Mathematically, this can be formulated as a stochastic optimal control problem that maximizes the profit (= reward minus cost) that can be obtained per unit step.
The key point is that controlling the random walk involves a trade-off. A particle that is not controlled at all (even in one or two dimensions) does not return to the starting point very often, so not much reward can be expected. Conversely, if the particle is controlled too strongly, the cost grows until it exceeds the reward, which is counterproductive. My research aims to analyze this delicate trade-off between reward and cost in detail and to investigate the behavior of an optimally controlled random walk. To summarize the results: for random walks in three or more dimensions, one can prove that as the reward is gradually increased, the particle's behavior changes drastically from transient to recurrent at a certain critical value. Since the way the behavior changes resembles a phase transition in physics, this phenomenon is called a phase transition of the random walk.
The "random walk phase transition" introduced here is not a research topic that is directly linked to solving real-world problems like the moon landing problem or optimal investment and consumption problems, but it is not a complete armchair theory. For example, stochastic optimal control of random walks is also related to physical models of polymer chains and reinforcement learning, a field of machine learning. Even if a mathematical model is studied simply because it is mathematically interesting, if it is a "good" model that captures the fundamental nature of things, it is often applied to fields that were not initially expected in the long term in mathematics. In my own research, I work on my daily research with the goal of constructing an interesting mathematical model that is simple but gets to the essence.
When I was a university student, I had the opportunity to study a field called "measure theory." This is a somewhat abstract theory that mathematically examines questions such as "what exactly are length and area?", and I was struck to learn that it is in fact the foundation on which probability theory rests. At the same time, I was deeply impressed by how the vague unease I had felt about probability theory became clearer once it was discussed within the framework of measure theory.
I think one of the most fascinating things about probability theory is what are known as limit theorems. When you roll a fair die, it is impossible to predict which number will come up. But when you roll it hundreds or thousands of times, the relative frequency of each number approaches 1/6. This holds no matter who rolls the die, as long as it is rolled many times. I found it intriguing that even though each individual roll is unpredictable, when many rolls are taken together a regular, non-random law emerges. This was one of the reasons I decided to enter this field of research.
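This stabilization of relative frequencies, the law of large numbers, is easy to see in simulation; the roll count below is an arbitrary choice:

```python
import random

rng = random.Random(0)
n_rolls = 100_000
counts = [0] * 6
for _ in range(n_rolls):
    counts[rng.randrange(6)] += 1        # one roll of a fair die

for face, c in enumerate(counts, start=1):
    # each relative frequency lands close to 1/6 ~ 0.1667
    print(f"face {face}: {c / n_rolls:.4f}")
```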
Sometimes I conduct research alone, and sometimes it is collaborative. In collaborative research, ideas often come out of casual conversations with my co-researchers. We start with only a vague idea of what we want to do, and in the end the work heads off in a completely different direction; that is the best part of collaborative research. My co-researchers and I have completely different research backgrounds, so another pleasure of collaboration is discovering perspectives I would never have arrived at alone.
When I research alone, I tend to pursue my topic somewhat more stoically. Sometimes I read earlier papers and think, "What a great idea!", but in my case the research tends to become more interesting when I feel unsatisfied or uncomfortable with something in a paper. As a student I thought it was important to understand everything, but recently I have come to think that the time spent not understanding is also quite important.
Until you enter university, you will no doubt study hard with the goal of passing your entrance exams; but once you are there, please try studying "without a purpose." I think the true nature of scholarship is to study as your intellectual curiosity dictates, rather than aiming at a fixed goal. We live in an age in which the world as a whole is losing its leisure, but I hope that at least while you are a university student you will study without worrying too much about the future. I also hope that universities will continue to exist as places that can provide such leisure, in both time and spirit.
One of the features of this department is the close relationship between students and faculty. Furthermore, with the reorganization of the department next spring, we will have an even more diverse faculty. If you have any questions, please feel free to ask. There is sure to be a faculty member who can answer them. (Published October 2020)