Full Name
Jun'ichi Takeuchi
Job Title
Professor
Company
Kyushu University
Speaker Bio
Jun'ichi Takeuchi was born in Tokyo, Japan, in 1964. He graduated from the University of Tokyo with a major in physics in 1989 and received the Dr. Eng. degree in mathematical engineering from the University of Tokyo in 1996. From 1989 to 2006, he worked for NEC Corporation, Japan. In 2006, he moved to Kyushu University, Fukuoka, Japan, where he is a Professor. From 1996 to 1997, he was a Visiting Research Scholar at the Department of Statistics, Yale University, New Haven, CT, USA. He was involved in launching the annual workshop series on Information-Based Induction Sciences (IBIS, a domestic workshop in Japan) in 1998. His research interests include information theory and machine learning. He is a member of IEEE, IEICE, and JSIAM.
Speaking At
Abstract
We analyze Minimum Description Length (MDL) estimation of two-layer neural networks with ReLU activation. The target model has d-dimensional input, one-dimensional output, and p hidden nodes. We assume that the weight matrix W from the input to the hidden nodes is fixed and consider the estimation of the weight vector v from the hidden nodes to the output, so the target parameter is p-dimensional. The obtained risk bound is independent of p and of order d^2. This bound is achieved by designing a two-stage code based on an approximate eigendecomposition of the Fisher information matrix, which was recently shown by Takeishi et al. The decomposition shows that the eigenvalue distribution concentrates on about d^2 eigenvalues and that the sum of the eigenvalues nearly equals d/2. It also reveals how the eigenvectors depend on W. The heart of the design of our two-stage code is the use of eigenvectors with large eigenvalues. The risk bound is obtained via the theorem on MDL estimation (Barron and Cover 1991). This is joint work with Yoshinari Takeishi.
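
The eigenvalue concentration described in the abstract can be explored numerically. The sketch below is only an illustration under assumptions that the abstract does not state: inputs x drawn from N(0, I_d), entries of W drawn i.i.d. from N(0, 1), unit-variance Gaussian output noise, and a 1/sqrt(p) scaling of the output. Under these assumptions the model is linear in v, so the Fisher information for v is E[phi(x) phi(x)^T] with phi(x) = relu(W x) / sqrt(p), and it does not depend on v. The function name `empirical_fisher` and the cutoff `k` are hypothetical choices made for this example, not taken from the talk.

```python
import numpy as np

def empirical_fisher(W, n_samples=20000, seed=0):
    """Monte Carlo estimate of the Fisher information for v with W fixed.

    Assumed model (not from the abstract): y = v . relu(W x) / sqrt(p) + noise,
    with noise ~ N(0, 1) and x ~ N(0, I_d).
    """
    p, d = W.shape
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n_samples, d))        # sampled inputs, shape (n, d)
    Phi = np.maximum(X @ W.T, 0.0) / np.sqrt(p)    # hidden features, shape (n, p)
    return Phi.T @ Phi / n_samples                 # average of phi phi^T, shape (p, p)

d, p = 5, 200
W = np.random.default_rng(1).standard_normal((p, d))  # assumed i.i.d. N(0, 1) weights

J = empirical_fisher(W)
eigvals = np.linalg.eigvalsh(J)[::-1]   # eigenvalues in descending order
trace = eigvals.sum()

# Illustrative cutoff of order d^2 (an assumption made for this sketch).
k = 1 + d + d * (d + 1) // 2
top_share = eigvals[:k].sum() / trace

print(f"trace = {trace:.3f}, d/2 = {d / 2}")
print(f"share of trace in top {k} of {p} eigenvalues = {top_share:.3f}")
```

Changing p while keeping d fixed is one way to check that, in this setup, the number of non-negligible eigenvalues is governed by d rather than by p.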