High-speed SIFT matching
The Scale-Invariant Feature Transform (SIFT) algorithm is widely used in computer vision and a variety of multimedia tasks. It has been applied to vision problems ranging from finding objects in images and videos to the reconstruction of 3D point clouds of scenes. Although it has existed for many years, it still frequently comes out as the winner in comparisons with other algorithms.
Description of topic
The biggest problem with SIFT is that it is computationally complex and therefore very time-consuming (the second biggest being that it is patent-protected in the US). We have recently released PopSift, a real-time GPU implementation of SIFT, which can be found here.
This opens a range of opportunities for a master's thesis to expand on it:
(1) Extend our PopSift implementation of SIFT to its derivatives such as Affine SIFT (ASIFT), which increases the number of matches that can be found between images. The power of ASIFT is illustrated by its inventors in the following YouTube video.
An overview of SIFT relatives can be found in this survey.
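To give a feel for what ASIFT adds on top of SIFT: it simulates many camera viewpoints by warping the image with a set of tilt and rotation parameters, and runs SIFT on each warped copy. The sketch below only enumerates a plausible sampling of those (tilt, rotation) pairs, in the spirit of the sampling proposed by the ASIFT authors (tilts growing as powers of sqrt(2), rotation steps shrinking with tilt); the exact constants here are an assumption, not the PopSift or ASIFT reference values.

```python
import math

def asift_camera_samples(max_tilt_power=5):
    """Enumerate (tilt, rotation-in-degrees) pairs to simulate.

    Assumed sampling, loosely following the ASIFT paper: tilts
    t = sqrt(2)**k for k = 0..max_tilt_power, and for each tilt t > 1
    rotations phi from 0 up to (but not including) 180 degrees in
    steps of 72/t degrees. SIFT itself is rotation-invariant, so only
    the tilt direction needs to be sampled, not full 360 degrees.
    """
    samples = []
    for k in range(max_tilt_power + 1):
        t = math.sqrt(2) ** k
        if k == 0:
            samples.append((t, 0.0))  # the original, untilted image
            continue
        phi = 0.0
        while phi < 180.0:
            samples.append((t, phi))
            phi += 72.0 / t
    return samples
```

Each returned pair would drive one affine warp of the input image; running SIFT on every warp is why ASIFT finds more matches, and also why it is so much more expensive than plain SIFT.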
(2) Develop real-time identification of camera positions in a large environment that has been reconstructed from photos. In this scenario, SIFT features are extracted in real time from the frames of a video taken by a camera, for example using our PopSift implementation. These features are then looked up in a database that has been constructed offline from thousands of photos, and the camera position is found by identifying the spatial positions of the stored SIFT features. The main challenge is to understand SIFT matching at its lowest level, and to take inspiration from computer architecture, assembly language, compiler construction, formal languages and databases to develop a new real-time matching system. The current state-of-the-art method is the VocTree method.
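As a starting point for understanding SIFT matching "at its lowest level", the baseline that VocTree-style methods accelerate is brute-force nearest-neighbour search with Lowe's ratio test: a query descriptor is matched to its nearest database descriptor only if that neighbour is clearly closer than the second nearest. The sketch below is a minimal illustration with plain Python lists standing in for 128-dimensional SIFT descriptors; the 0.8 ratio is the threshold commonly quoted from Lowe's paper, but any real system would tune it.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def ratio_test_match(query_descs, db_descs, ratio=0.8):
    """Brute-force matching with Lowe's ratio test.

    For each query descriptor, find its two nearest neighbours in the
    database; accept the match only if the nearest is clearly closer
    than the second nearest. Returns (query_index, db_index) pairs.
    """
    matches = []
    for qi, q in enumerate(query_descs):
        dists = sorted((euclidean(q, d), di) for di, d in enumerate(db_descs))
        if len(dists) >= 2 and dists[0][0] < ratio * dists[1][0]:
            matches.append((qi, dists[0][1]))
    return matches

# Toy usage: one query descriptor, a close database entry and a far one.
print(ratio_test_match([[1.0, 0, 0, 0]],
                       [[1.0, 0, 0, 0.1], [0, 1.0, 0, 0]]))
```

This baseline is O(query x database) distance computations, which is exactly what makes it hopeless against a database built from thousands of photos; hierarchical quantization schemes such as the vocabulary tree trade a small loss in match quality for logarithmic lookup cost.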
- Pål Halvorsen
- Carsten Griwodz