Reference clip for inter prediction in video coding
- Changyue Ma
- Dong Liu
- Xiulian Peng
- Feng Wu
- Houqiang Li
- Tingting Wang
IEEE Transactions on Circuits and Systems for Video Technology
Inter prediction is a fundamental technology in video coding that removes the temporal redundancy between video frames. Traditionally, reconstructed frames are placed directly into a reference frame buffer to serve as references for inter prediction. Using multiple reference frames increases the accuracy of inter prediction, but it also wastes buffer memory because the content of the reference frames is highly similar. To address this problem, we propose to organize the references at the clip level in addition to the frame level, i.e., the reference buffer stores not only reference frames but also reference clips, which are cropped regions selected from the reconstructed frames. With clip-level references, we can manage the reference content more economically, since the content of multiple reference frames is divided into the singular content of each frame and the repetitive content that appears in multiple frames. For the repetitive content, only one copy is stored in the reference clips so as to avoid duplication. Moreover, reference clips facilitate bit-rate allocation among reference content, i.e., the quality of each clip can be decided adaptively to achieve rate-distortion optimization. In this paper, we propose a complete video coding framework using reference clips and systematically investigate how to generate reference clips as either singular content clips or repetitive content clips, how to manage the clips, how to use the clips for inter prediction, and how to allocate bit-rate among clips. The proposed framework is implemented on top of the state-of-the-art video coding scheme, High Efficiency Video Coding (HEVC). Experimental results show that our scheme achieves average BD-rate reductions of 5.1% and 5.0% relative to the HEVC anchor in the low-delay B and low-delay P settings, respectively.
We believe that reference clips open up a new dimension for optimizing inter prediction in video coding and are thus worthy of further study.
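To make the storage argument concrete, the following is a toy sketch of a buffer that keeps a single copy of content shared by several frames. It is not the method of the paper: the `ClipBuffer` class, the fixed block grid, and the use of exact pixel equality to detect repetition are all assumptions made for illustration, and a real codec working on lossy reconstructions would need a similarity criterion instead.

```python
from dataclasses import dataclass, field


def crop(frame, top, left, size):
    """Return the size x size region of `frame` at (top, left) as a hashable tuple."""
    return tuple(tuple(row[left:left + size]) for row in frame[top:top + size])


@dataclass
class ClipBuffer:
    """Hypothetical buffer that stores each distinct cropped region only once."""
    # Maps clip content -> list of (frame_idx, top, left) positions that use it.
    clips: dict = field(default_factory=dict)

    def add_frame(self, frame_idx, frame, size):
        """Split a frame into size x size blocks and register each block."""
        for top in range(0, len(frame), size):
            for left in range(0, len(frame[0]), size):
                block = crop(frame, top, left, size)
                self.clips.setdefault(block, []).append((frame_idx, top, left))

    def _frames_using(self, uses):
        return {frame_idx for frame_idx, _, _ in uses}

    def repetitive(self):
        """Clips whose content appears in more than one frame."""
        return [c for c, u in self.clips.items() if len(self._frames_using(u)) > 1]

    def singular(self):
        """Clips whose content appears in exactly one frame."""
        return [c for c, u in self.clips.items() if len(self._frames_using(u)) == 1]


# Two 4x4 frames that differ only in the bottom-right 2x2 block.
frame0 = [[1, 1, 2, 2], [1, 1, 2, 2], [3, 3, 4, 4], [3, 3, 4, 4]]
frame1 = [[1, 1, 2, 2], [1, 1, 2, 2], [3, 3, 9, 9], [3, 3, 9, 9]]

buf = ClipBuffer()
buf.add_frame(0, frame0, size=2)
buf.add_frame(1, frame1, size=2)

stored = len(buf.clips)          # distinct clips kept in the buffer
frame_level = 2 * 4              # blocks stored if both frames were kept whole
print(stored, frame_level, len(buf.repetitive()), len(buf.singular()))
```

In this toy case the buffer holds 5 clips instead of the 8 blocks that two full frames would occupy: 3 repetitive clips shared by both frames and 2 singular clips, one per frame.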