Is Vanilla MLP in Neural Radiance Field Enough for Few-shot View Synthesis?

  • Hanxin Zhu,
  • Xin Li,
  • Bingchen Li,
  • Zhibo Chen

CVPR 2024

Neural Radiance Field (NeRF) has achieved superior performance for novel view synthesis by modeling the scene with a Multi-Layer Perceptron (MLP) and a volume rendering procedure. However, when fewer known views are given (i.e., few-shot view synthesis), the model is prone to overfitting the given views. To handle this issue, previous efforts have leveraged learned priors or introduced additional regularizations. In contrast, in this paper we provide, for the first time, an orthogonal method from the perspective of network structure. Given the observation that trivially reducing the number of model parameters alleviates overfitting but at the cost of missing details, we propose the multi-input MLP (mi-MLP), which incorporates the inputs (i.e., location and viewing direction) of the vanilla MLP into each layer to prevent overfitting without harming detailed synthesis. To further reduce artifacts, we propose to model colors and volume density separately and present two regularization terms. Extensive experiments on multiple datasets demonstrate that: 1) although the proposed mi-MLP is easy to implement, it is surprisingly effective, boosting the PSNR of the baseline from 14.73 to 24.23; and 2) the overall framework achieves state-of-the-art results on a wide range of benchmarks. We will release the code upon publication.
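
The core idea of mi-MLP, re-injecting the network inputs at every layer rather than only at the first, can be sketched in a few lines. The snippet below is a minimal illustration, assuming a PyTorch-style NeRF MLP; the layer widths, activations, fusion by concatenation, and the single combined color/density head are our own simplifying assumptions and not the authors' exact architecture (which, as noted above, models colors and volume density separately).

```python
# Minimal sketch of the mi-MLP idea: feed the (encoded) location and viewing
# direction into every layer of the MLP, not just the first one.
# All hyperparameters here are illustrative assumptions.
import torch
import torch.nn as nn


class MiMLP(nn.Module):
    def __init__(self, in_dim, hidden_dim=256, num_layers=8, out_dim=4):
        super().__init__()
        layers = []
        for i in range(num_layers):
            # Every layer after the first sees its features concatenated with the inputs.
            prev = in_dim if i == 0 else hidden_dim + in_dim
            layers.append(nn.Linear(prev, hidden_dim))
        self.layers = nn.ModuleList(layers)
        self.out = nn.Linear(hidden_dim, out_dim)

    def forward(self, x):
        h = x
        for i, layer in enumerate(self.layers):
            if i > 0:
                # mi-MLP: re-inject the raw inputs at this layer.
                h = torch.cat([h, x], dim=-1)
            h = torch.relu(layer(h))
        return self.out(h)


# Example usage with hypothetical dimensions: a batch of positionally encoded
# (location, viewing direction) samples mapped to raw color + density.
feats = torch.randn(1024, 90)
rgb_sigma = MiMLP(in_dim=90)(feats)  # shape: [1024, 4]
```

Because the inputs are available to every layer, deeper layers do not have to preserve them through the feature stream, which is the intuition behind mi-MLP's reduced tendency to overfit sparse views while keeping detail.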