Scene-Based Video Super-Resolution with Minimum
Mean Square Error Estimation
Cao Bui-Thu
Department of Electronic Technology
Ho Chi Minh City University of Industry (HUI)
Ho Chi Minh City, Vietnam
E-mail: [email protected]
Tuan Do-Hong
Division of Electronic-Telecommunication
Ho Chi Minh University of Technology
Ho Chi Minh City, Vietnam
E-mail: [email protected]
Thuong Le-Tien
Department of Electric-Electronic Engineering
Ho Chi Minh University of Technology (HCMUT)
Ho Chi Minh City, Vietnam
E-mail: [email protected]
Hoang Nguyen-Duc
Science and Technology Research Center BRAC
Vietnam Television VTV
Ho Chi Minh City, Vietnam
E-mail: [email protected]
Abstract— Motion estimation is a key problem in video
super-resolution (SR). When the estimation is highly accurate, the
reconstructed high-resolution (HR) frames are of better quality.
Conversely, even small estimation errors cause noticeable
degradation in the reconstructed HR frames. In many recent
studies, motion estimation is applied to every block of pixels.
Each block provides too little input data for the estimation
process, so high accuracy is hard to achieve. This paper presents
a new method for SR video reconstruction built on two main
ideas. First, video frames are separated into two parts: the scene
and the moving objects. The motion of the scene is the same
everywhere and uniform, so much more data is available for
estimation and the result can be more accurate. Second, the
motion estimation is based on three parameters: rotation,
vertical shift, and horizontal shift. This model closely represents
the real motion of the camera while it captures video frames.
Based on these ideas, an efficient algorithm is proposed that
combines a block-matching search with minimum mean square
error estimation. The results of the proposed algorithm are more
accurate than those of other recent algorithms. The difference is
easy to see when the reconstructed HR video frames are compared
visually with those reconstructed by other algorithms.
I. INTRODUCTION
There are two types of super-resolution methods: SR image
reconstruction from a single frame and SR image reconstruction
from multiple frames. The first uses interpolation, smoothing, and
shaping to reconstruct HR video frames. The second uses
multiple frames, so that information lost in the sampling
process can be recovered by combining the input low-resolution
(LR) video frames, yielding HR video frames of higher quality
than the original frames.
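The multi-frame idea above can be illustrated with a deliberately idealized sketch. It assumes no blur, no noise, and LR frames whose offsets on the HR grid are known exactly and are whole HR pixels; real sequences need the registration step discussed in this paper. The function names `decimate` and `interleave` are hypothetical and chosen only for this illustration.

```python
import numpy as np

def decimate(hr, factor, dy, dx):
    """Keep every `factor`-th pixel of `hr`, starting at offset (dy, dx)."""
    return hr[dy::factor, dx::factor]

def interleave(lr_frames, factor):
    """Place each LR frame back on the HR grid at its known offset.

    lr_frames maps an offset (dy, dx) to the LR frame sampled at that offset.
    """
    h, w = next(iter(lr_frames.values())).shape
    hr = np.zeros((h * factor, w * factor), dtype=float)
    for (dy, dx), frame in lr_frames.items():
        hr[dy::factor, dx::factor] = frame
    return hr

rng = np.random.default_rng(0)
hr_true = rng.random((8, 8))          # stand-in for the unknown HR frame
factor = 2

# Four LR frames, each seeing a different quarter of the HR samples
lr_frames = {(dy, dx): decimate(hr_true, factor, dy, dx)
             for dy in range(factor) for dx in range(factor)}

hr_rec = interleave(lr_frames, factor)
```

Each LR frame alone holds only a quarter of the samples; together, under these assumptions, they cover the HR grid completely. With unknown or sub-pixel offsets the placement step is no longer trivial, which is why registration accuracy matters.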
There are two main steps in SR image reconstruction.
First, registration (motion estimation) is performed to find
the exact shifts and rotations of pixels between the
reference frames and the context frame. Motion estimation is a
key problem in SR image reconstruction. It is also a very
difficult challenge, because a small error in the motion
estimation translates almost directly into a large degradation
of the results. Second, the frames are aligned in a common
coordinate system, and interpolation methods are used to combine
the detail information of the images and reconstruct HR images.
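As an illustration of the registration step, the sketch below estimates a rotation angle and a two-component shift from matched point pairs by least squares. It is a generic closed-form 2-D rigid fit, not the algorithm proposed in this paper. It assumes the matched pairs are already available (for example, from some block-matching stage), and the function name `estimate_rigid` and all test values are hypothetical.

```python
import numpy as np

def estimate_rigid(src, dst):
    """Least-squares rotation angle and shift mapping src points onto dst.

    src, dst: arrays of shape (N, 2) with matched (x, y) coordinates.
    Returns (theta, shift) such that dst is approximately
    R(theta) @ src + shift.
    """
    src_c = src - src.mean(axis=0)
    dst_c = dst - dst.mean(axis=0)
    # Closed-form 2-D solution: angle from summed cross and dot products
    cross = np.sum(src_c[:, 0] * dst_c[:, 1] - src_c[:, 1] * dst_c[:, 0])
    dot = np.sum(src_c[:, 0] * dst_c[:, 0] + src_c[:, 1] * dst_c[:, 1])
    theta = np.arctan2(cross, dot)
    rot = np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta),  np.cos(theta)]])
    shift = dst.mean(axis=0) - rot @ src.mean(axis=0)
    return theta, shift

# Synthetic check with assumed values: 1.5 degree rotation, sub-pixel shift
rng = np.random.default_rng(1)
src = rng.uniform(-50, 50, size=(200, 2))
theta_true = np.deg2rad(1.5)
shift_true = np.array([0.4, -0.7])
rot_true = np.array([[np.cos(theta_true), -np.sin(theta_true)],
                     [np.sin(theta_true),  np.cos(theta_true)]])
# Noisy matches stand in for imperfect correspondences
dst = src @ rot_true.T + shift_true + rng.normal(0, 0.05, size=src.shape)

theta_est, shift_est = estimate_rigid(src, dst)
```

Because all 200 pairs share the same three parameters, the noise in individual matches averages out. A fit from only a few pairs, as with a single small block, has far less data to average over.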
Many authors have proposed methods for SR image
reconstruction, as described in the technical overview by
Park [7] in 2003. In general, there are two main approaches to
SR reconstruction. The first is the frequency domain approach, as
presented in [1]-[3]. Most frequency domain registration
methods, such as the typical work of Cao [2] in 2009 and
Vandewalle [3] in 2006, are based on the fact that two shifted
images differ in the frequency domain only by a phase shift, which
can be found from their correlation in the Fourier transform. The
second is the spatial domain approach, as presented in [4]-[6].
Most spatial domain methods, such as the typical method of Keren
[6] in 1988, are based on algebra and statistics. Images are
represented as matrices of grey-level pixels. The relation between
the reference frame and the other frames is described by a
combination of blur, shift, and rotation matrices, which is then
solved with algebraic methods. In recent studies, Mallat [1]
in 2010 used sparse mixing estimators (SME) to define
interpolation coefficients in the wavelet transform domain.
Following the same idea as Mallat, W. Dong [4] in 2011 used adaptive
sparse domain selection and adaptive regularization (ASDS) to
interpolate HR images in the spatial domain. Takeda [5] in 2007
developed a framework for SR imaging using multi-dimensional
kernel regression (KRI), where each pixel in the video frame
sequence is approximated with a 3-D local Taylor series.
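The frequency-domain principle mentioned above (two shifted images differ only by a phase shift) can be illustrated with basic phase correlation. This is a minimal sketch of the principle only: it assumes integer, circular shifts and no rotation, whereas the cited methods [2], [3] address sub-pixel shifts and rotation. The function name `phase_correlation_shift` is hypothetical.

```python
import numpy as np

def phase_correlation_shift(ref, moved):
    """Estimate the integer circular shift (dy, dx) such that
    moved == np.roll(ref, (dy, dx), axis=(0, 1))."""
    f_ref = np.fft.fft2(ref)
    f_mov = np.fft.fft2(moved)
    cross = f_mov * np.conj(f_ref)
    # Discard the magnitude so that only the phase difference remains
    cross /= np.maximum(np.abs(cross), 1e-12)
    # The inverse transform of a pure phase ramp is a peak at the shift
    corr = np.fft.ifft2(cross).real
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    # Indices beyond half the image size correspond to negative shifts
    h, w = ref.shape
    if dy > h // 2:
        dy -= h
    if dx > w // 2:
        dx -= w
    return int(dy), int(dx)

rng = np.random.default_rng(2)
ref = rng.random((32, 32))
moved = np.roll(ref, (3, -5), axis=(0, 1))   # shift by 3 rows, -5 columns
estimated = phase_correlation_shift(ref, moved)
```

Normalizing the cross-power spectrum makes the estimate depend only on the phase difference, which is what the shift property of the Fourier transform predicts.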
Although there are many studies on image SR, with advanced
results in recent years, it is very difficult to apply these
methods to video SR reconstruction. The reason is the
complexity of motion in video. Basically,
there are two types of motion in video frames: global and local.
Global motion is the motion of the camera during capture;
it shifts and rotates the whole frame. Local motion is the
arbitrary motion of objects in the scene. The registration
process is applied on every block of pixels. Some other reasons