Title: Frequency Principle in Deep Neural Networks
Speaker: Zhiqin Xu (New York University)
Time: 3:00-4:00 PM, Wednesday, May 8, 2019
Venue: Room 1218, Management Building
Abstract:
It remains a puzzle why deep neural networks, which often have more parameters than training samples, can generalize well. One attempt to understand this puzzle is to identify an implicit bias in the training process. However, without an explicit mathematical description, it is unclear how such an implicit bias functions during training.
In this work, we first show the universality of the F-Principle, namely that DNNs initialized with small parameters often fit target functions from low to high frequencies, by demonstrating the phenomenon on high-dimensional benchmark datasets such as MNIST and CIFAR10. We also give a mathematical proof of the F-Principle. We then consider a neural network of extremely large width. In this regime, the F-Principle is found to be equivalent to an explicitly regularized optimization problem. Using this equivalent explicit regularization, we estimate an a priori generalization error bound and show that a non-zero initial output can damage generalization. Our work shows that the F-Principle can lead a neural network trained without explicit regularization to good generalization performance.
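For readers unfamiliar with the phenomenon, the following is a minimal illustrative sketch (not code from the talk) of how the F-Principle can be observed on a 1D toy problem: a small tanh network with small initialization is trained on f(x) = sin(x) + sin(5x), and the relative DFT error of the residual is tracked at the low (k=1) and high (k=5) frequencies. The target function, network size, and hyperparameters below are all assumptions for illustration; exact convergence rates depend on them, but the k=1 error typically shrinks first.

# Minimal sketch of the F-Principle on a 1D toy problem (illustrative only).
import numpy as np

rng = np.random.default_rng(0)
n, h, lr = 256, 200, 0.05                          # samples, hidden width, step size
x = np.linspace(-np.pi, np.pi, n, endpoint=False).reshape(-1, 1)
y = np.sin(x) + np.sin(5 * x)                      # low (k=1) + high (k=5) frequency
Y = np.fft.rfft(y.ravel())                         # target spectrum

W1 = rng.normal(0, 0.1, (1, h)); b1 = np.zeros(h)  # small initialization,
W2 = rng.normal(0, 0.1, (h, 1)); b2 = np.zeros(1)  # the regime of the F-Principle

for step in range(20001):
    a = np.tanh(x @ W1 + b1)                       # hidden activations
    pred = a @ W2 + b2
    g = 2.0 * (pred - y) / n                       # dMSE/dpred
    gW2 = a.T @ g; gb2 = g.sum(0)
    ga = (g @ W2.T) * (1.0 - a ** 2)               # backprop through tanh
    gW1 = x.T @ ga; gb1 = ga.sum(0)
    W1 -= lr * gW1; b1 -= lr * gb1                 # full-batch gradient descent
    W2 -= lr * gW2; b2 -= lr * gb2
    if step % 4000 == 0:
        E = np.fft.rfft((pred - y).ravel())        # spectrum of the residual
        rel = [abs(E[k]) / abs(Y[k]) for k in (1, 5)]
        print(f"step {step:6d}  rel. error k=1: {rel[0]:.3f}  k=5: {rel[1]:.3f}")

Printing the per-frequency relative error, rather than the total loss, is what makes the low-to-high ordering visible: the total loss decreases monotonically either way, while the spectrum shows which components the network has captured so far.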
Speaker Bio:
Zhiqin Xu is currently a Visiting Member at the Courant Institute of Mathematical Sciences, New York University, and a Postdoctoral Associate at New York University Abu Dhabi, working with Prof. David W. McLaughlin on computational neuroscience and deep learning. He obtained a B.S. in Physics (Zhiyuan College) and a Ph.D. in Mathematics from Shanghai Jiao Tong University in China under the supervision of Profs. David Cai and Douglas Zhou.