Motion estimation is the process of determining motion vectors that describe the transformation from one 2D image to another, usually from adjacent frames in a video sequence. It is an ill-posed problem because the motion occurs in three dimensions, whereas the images are a projection of the 3D scene onto a 2D plane. The motion vectors may relate to the whole image (global motion estimation) or to specific parts, such as rectangular blocks, arbitrarily shaped patches, or even individual pixels. The motion vectors may be represented by a translational model or by many other models that can approximate the motion of a real video camera, such as rotation and translation in all three dimensions and zoom.
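As an illustration of the block-based, translational case described above, the following is a minimal sketch of exhaustive block matching. The matching criterion (sum of absolute differences, SAD), the search strategy, and the names `sad`, `estimate_block_motion`, `block` and `search` are assumptions chosen for this example and are not prescribed by the text. Frames are plain lists of rows of integer intensities, and each vector (dx, dy) points from a block in the current frame to its best match in the reference frame.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between the block at (bx, by) in `cur`
    and the block displaced by (dx, dy) in `ref`."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def estimate_block_motion(ref, cur, block=4, search=2):
    """Return {(bx, by): (dx, dy)} for every full block of the current frame.

    (bx, by) is the top-left corner of a block in `cur`; (dx, dy) is the
    displacement to the best-matching block in `ref`, found by testing every
    candidate within +/- `search` pixels (exhaustive search).
    """
    height, width = len(cur), len(cur[0])
    vectors = {}
    for by in range(0, height - block + 1, block):
        for bx in range(0, width - block + 1, block):
            best = None
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    # Skip candidates that fall partly outside the reference frame.
                    if by + dy < 0 or by + dy + block > height:
                        continue
                    if bx + dx < 0 or bx + dx + block > width:
                        continue
                    cost = sad(cur, ref, bx, by, dx, dy, block)
                    # Ties are broken in favour of the shorter vector.
                    candidate = (cost, abs(dx) + abs(dy), dx, dy)
                    if best is None or candidate < best:
                        best = candidate
            vectors[(bx, by)] = (best[2], best[3])
    return vectors
```

Calling `estimate_block_motion(ref, cur)` on two equally sized frames yields one translational vector per block. A global estimate or a richer camera model would need a different parameterisation than this per-block translation.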
Closely related to motion estimation is optical flow, where the vectors correspond to the perceived movement of pixels. In motion estimation, an exact 1:1 correspondence of pixel positions is not a requirement.
Applying the motion vectors to an image to synthesise the transformation to the next image is called motion compensation. The combination of motion estimation and motion compensation is a key part of video compression as used by MPEG-1, MPEG-2 and MPEG-4, as well as many other video codecs.
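The idea of motion compensation can be sketched as follows, assuming block-wise translational vectors stored as a dictionary that maps each block's top-left corner (bx, by) in the current frame to a displacement (dx, dy) into the reference frame. The names `motion_compensate` and `residual`, the dictionary layout, and the clamping of out-of-frame coordinates are assumptions made for this illustration. Real codecs define their own prediction and boundary rules.

```python
def motion_compensate(ref, vectors, block=4):
    """Build a prediction of the current frame by copying displaced blocks
    from the reference frame `ref`.

    `vectors` maps (bx, by) -> (dx, dy); source coordinates outside the
    frame are clamped to the nearest edge pixel.
    """
    height, width = len(ref), len(ref[0])
    predicted = [[0] * width for _ in range(height)]
    for (bx, by), (dx, dy) in vectors.items():
        for y in range(block):
            for x in range(block):
                sy = min(max(by + y + dy, 0), height - 1)
                sx = min(max(bx + x + dx, 0), width - 1)
                predicted[by + y][bx + x] = ref[sy][sx]
    return predicted


def residual(cur, predicted):
    """Per-pixel difference between the actual frame and its prediction."""
    return [[c - p for c, p in zip(cur_row, pred_row)]
            for cur_row, pred_row in zip(cur, predicted)]
```

In a compression setting, the motion vectors together with the residual describe the current frame relative to the reference. When the prediction is good, the residual is mostly near zero and can be represented compactly.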
The methods for finding motion vectors can be categorised into pixel-based methods ("direct") and feature-based methods ("indirect"). A famous debate resulted in two papers from the opposing factions being produced to try to establish a conclusion. Philip H.S. Torr and Andrew Zisserman:...