In the world we live in, there are many different problems, from issues close to home to issues that affect all of humanity.
Our proud faculty members reveal the surprisingly little-known realities and truths behind them, offering interesting insights along the way.
The essence of data science is to solve problems at hand through data analysis. Specifically, problem solving can be said to be deciding what action to take next. If the problem in your business is that sales are not increasing, you will decide on measures such as spending money on advertising and increasing your sales staff, and then take action. In that case, data science helps you make the right decision on what choice to make. The more important the decision, the more likely it is that you will be unsure of whether it is really the right choice. It is also not uncommon for discussions to fail to reach a consensus. In such a situation, if you have objective data, it becomes easier to visualize and share the fact that "this is indeed the right choice to make." I believe that the value of data science is that it gives you the courage to make decisions to solve problems and helps build consensus.
My research is the development of methods for data analysis, a field known as statistical modeling. Statistical modeling is used in all kinds of data analysis situations. What kind of structure lies behind the data obtained from a certain phenomenon? If we replace that structure with a mathematical formula, that is, a mathematical model, we can predict the next value, in other words the future. To build statistical models with higher predictive accuracy and versatility, we are developing new methods that apply existing theories such as sparse modeling and Bayesian modeling. This is theoretical research: the methods do not predict actual phenomena themselves, but estimate what kind of model is valid for making such predictions. For that reason, it is hard to point to a concrete way in which my current research is useful or has been useful. However, experiments have shown that the newly developed method improves the accuracy of selecting the important components of a model. This can be counted as one research result in model improvement.
Deriving from data a mathematical formula (mathematical model) that expresses the phenomenon behind the data. All we have is data, and in reality we cannot fully understand the phenomenon. This is why a method for obtaining a highly accurate and versatile mathematical model from data (statistical modeling) is important.
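To make "selecting important components of the model" concrete, here is a minimal sketch of a standard sparse modeling technique, the lasso, fitted by coordinate descent. It is a textbook method, not the newly developed method described above. The synthetic data, the penalty value `lam`, and all function names are assumptions chosen for illustration.

```python
import numpy as np

def soft_threshold(z, t):
    """Shrink z toward zero by t; values within [-t, t] become exactly 0."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimize (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by cyclic coordinate descent."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for j in range(p):
            # Partial residual: add back the contribution of feature j only.
            r_j = y - X @ beta + X[:, j] * beta[j]
            rho = X[:, j] @ r_j / n
            beta[j] = soft_threshold(rho, lam) / col_sq[j]
    return beta

# Synthetic data (an assumption): 10 candidate components, only 2 truly matter.
rng = np.random.default_rng(0)
n, p = 200, 10
X = rng.normal(size=(n, p))
true_beta = np.zeros(p)
true_beta[[0, 3]] = [2.0, -1.5]
y = X @ true_beta + 0.1 * rng.normal(size=n)

beta_hat = lasso_coordinate_descent(X, y, lam=0.1)
selected = np.flatnonzero(np.abs(beta_hat) > 1e-8)
print("selected components:", selected)
```

The L1 penalty sets the coefficients of unimportant components exactly to zero, so the model reports which components it considers relevant as well as how large their effects are.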
The fun of research is multifaceted, but I especially enjoy the fact that it gives me hints about "how the world works." Simply put, statistics is about "gaining knowledge from data," but in fact it can also be said to be about using mathematical formulas and data to reproduce what everyone does in their heads. When we see an image, we can instantly determine whether it is a human face or a landscape or an object, because we have learned from past experience (i.e. data) that "human faces are like this." When we use data analysis to determine whether a human face is in the image, we build a model using a similar mechanism. When I learn, improve, or develop new data analysis methods and realize that "this is how this phenomenon occurs," I feel a sense of accomplishment in getting a glimpse into the workings of the world. As an aside, the most fun phenomenon I was able to explain through statistical modeling was that "as we get older, our minds become more rigid."
Recently, we used the newly developed method to evaluate the performance of basketball players. Basketball is a 5-on-5 game, and in professional leagues each team has about 10 players who are constantly substituted in and out during the game. For example, suppose five specific players from team A face five specific players from team B on the court, attack 10 times, and score 6 points. Basketball is normally evaluated by points scored per 100 attacks, so this corresponds to 60 points. We therefore expressed this cause-and-effect relationship in a mathematical formula of the form "this combination of five players = 60 points on average." In this way, we used points scored and conceded as variables and applied data from a full season, such as player combinations and whether the game was played at home or away, to measure each player's contribution to offense and defense.
[Figure: a formula (mathematical model) expressing the points per 100 attacks when five players from a seven-player team (A, B, C, D, E, F, G) and five players from another seven-player team (a, b, c, d, e, f, g) are on the court.]
By expressing the phenomenon in a mathematical model as in the figure above and estimating the model's coefficients from data, it becomes possible to evaluate the offensive and defensive abilities of each player.
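The idea of tying lineups to points per 100 attacks can be sketched as a regression with one offensive and one defensive coefficient per player. This is a generic ridge-regression illustration on made-up data, not the formula from the figure or the newly developed method. The additive form, the home-court term, the penalty `lam`, and every number below are assumptions.

```python
import numpy as np

# The conversion from the example above: 6 points in 10 attacks.
points_per_100 = 100 * 6 / 10

rng = np.random.default_rng(1)
n_per_team, on_court, n_stints = 7, 5, 4000
n_players = 2 * n_per_team  # players 0-6: team A, players 7-13: team B

# Hypothetical "true" contributions in points per 100 attacks (demo values).
true_off = rng.normal(0.0, 3.0, n_players)
true_def = rng.normal(0.0, 3.0, n_players)
base, home_adv = 100.0, 2.0

rows, y = [], []
for _ in range(n_stints):
    a_attacks = rng.random() < 0.5
    team_a = rng.choice(n_per_team, on_court, replace=False)
    team_b = rng.choice(n_per_team, on_court, replace=False) + n_per_team
    offense, defense = (team_a, team_b) if a_attacks else (team_b, team_a)
    home = float(rng.random() < 0.5)  # 1 if the attacking team plays at home

    x = np.zeros(2 + 2 * n_players)
    x[0] = 1.0                          # intercept
    x[1] = home                         # home/away indicator
    x[2 + offense] = 1.0                # attackers on the court
    x[2 + n_players + defense] = -1.0   # defenders lower the opponent's score
    rows.append(x)
    mean_pts = base + home_adv * home + true_off[offense].sum() - true_def[defense].sum()
    y.append(mean_pts + rng.normal(0.0, 5.0))

X, y = np.array(rows), np.array(y)

# Ridge penalty on player coefficients only (intercept and home term unpenalized).
lam = 1.0
penalty = lam * np.eye(X.shape[1])
penalty[0, 0] = penalty[1, 1] = 0.0
coef = np.linalg.solve(X.T @ X + penalty, X.T @ y)
off_hat = coef[2:2 + n_players]
def_hat = coef[2 + n_players:]

print("estimated offensive contributions, team A:", np.round(off_hat[:n_per_team], 1))
```

Because team A's attackers always face team B's defenders in this setup, only the differences between players on the same team are well determined; the penalty settles the overall level. Real game data are also far noisier than this demo.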
The key point of statistical modeling is how well one can grasp the overall trend of a phenomenon from the limited data available. This study assessed the average performance of players by grasping the overall trend of the relationship between the players currently on the court and the points scored and conceded. It is possible that good or bad luck and the physical condition of players have a greater impact on the points scored and conceded in a game than the factors listed here. However, being able to obtain more objective information from the data currently at hand will be useful for team management when deciding on players' annual salaries or scouting players from other teams.
In addition to theoretical research, I am also involved in joint projects that solve business problems through data analysis. As one example, I worked on accurately estimating from data how much fuel remains in a huge underground tank at a gas station. Normally, the remaining amount in an underground tank is evaluated from the height of the liquid level above the bottom, but because the tank is huge, even a slight tilt or dent can make the evaluated value inaccurate. Therefore, I used a statistical modeling method to estimate the amount of fuel in the tank based on how the evaluated value changed in the past when fuel was drawn from or added to the tank. In this research, nobody knew how much fuel actually remained in the underground tank, so for a long time it was unclear whether the method I devised could solve the problem. However, the partner company in the joint research conducted a verification experiment for me, and it showed that my method had a certain degree of accuracy. I also received words of gratitude from the person in charge, and the sense of accomplishment was different from what I had felt in my theoretical research.
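The tank problem can be illustrated with a small calibration sketch: the true volume is never observed, but each delivery or sale gives a known change in volume together with the gauge readings before and after. The sketch below is not the method used in the project. It assumes a metered amount is available for each transaction, that the true volume is a cubic polynomial of the gauge reading, and that an empty tank reads zero; the data are simulated.

```python
import numpy as np

rng = np.random.default_rng(2)

def true_volume(reading):
    """Hypothetical distortion between gauge reading and real volume (kL)."""
    return 0.92 * reading + 0.003 * reading ** 2

def basis(reading):
    """Polynomial features without a constant term, so volume(0) = 0."""
    r = np.atleast_1d(np.asarray(reading, dtype=float))
    return np.column_stack([r, r ** 2, r ** 3])

# Simulated transactions: gauge reading before and after fuel is drawn.
n = 500
before = rng.uniform(5.0, 30.0, n)
after = before - rng.uniform(0.5, 4.0, n)
# Metered amount drawn, with a small measurement error (assumed 0.01 kL).
metered = true_volume(before) - true_volume(after) + rng.normal(0.0, 0.01, n)

# Fit volume(before) - volume(after) = metered by least squares on differences.
D = basis(before) - basis(after)
coef, *_ = np.linalg.lstsq(D, metered, rcond=None)

def estimated_volume(reading):
    return basis(reading) @ coef

print("gauge reading 20.0 -> estimated volume:", estimated_volume(20.0)[0])
```

Only differences in volume enter the fit, which is why the constant term has to be fixed by an extra assumption. In practice the final check is an independent verification experiment like the one described above.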
In joint projects with companies, I often face problems that are beyond my field of expertise, but when I change the angle from which I look at a problem, I sometimes find that it can be solved with statistical modeling techniques. I am currently involved in analyzing data held by companies and, in one unusual case, in developing a method that lets company employees solve problems through data analysis themselves. As I teach the methods I have developed to the general public, I continue to search for approaches that are easier to understand. I find great satisfaction in making the people in front of me happy by using the knowledge I have cultivated through research.
In my research, I pursue themes that I find interesting and believe to be valuable. Theoretical research has no clear goal, and even if you make some progress, you may end up back at the starting point or be overtaken by other researchers. For this reason, I try to make the research itself enjoyable while I am doing it. One of the joys of statistical science research is being useful to the world, and the fact that my own happiness and my contribution to society are aligned is what makes me glad that I chose this field. In joint projects with companies, on the other hand, I think thoroughly about what is needed to solve the other party's problem and what will be achieved once it is solved. This is because no matter how advanced the data analysis, it is almost impossible to solve a problem if you ignore the actual site where it occurs. I value creating a situation in which I treat the other party's problem as my own and can tackle it proactively.
Teaching data science at Aoyama Gakuin University's School of Business is very meaningful. Society is currently paying attention to the training of data scientists, but I believe that the role of returning specialized knowledge to society as a benefit will be needed most. These are the occupations called business translators and data strategists. Particularly important is the ability to correctly understand the results of data analysis. For example, suppose a data scientist chooses the data analysis method I developed from the many available and uses it to obtain results. The results of the analysis are still just a list of numbers, and someone has to read the meaning from them. Without a certain understanding of what the method is and why it is used, there is a high risk of misinterpretation. The people I am training are those who can translate, "Because this method is used, these numbers can be interpreted in this way," and who can also propose the next step.
This year's seminar students in the School of Business can already handle advanced techniques and, for example, participate in data analysis competitions. They are familiar with data analysis to a degree that is unimaginable from the traditional image of a School of Business student, and they also have expertise in business administration. I am confident that students like these, who have the foundation to collaborate with data science experts, will play an active role in society in the near future.
Although they are students in the School of Business, which is considered a liberal arts field, they study statistics and data analysis methods in seminars and conduct research on how data science can be used to solve problems in corporate management.