Saturday, December 29, 2012

Marker-less (General Image Recognition) based tracking

There's a nice presentation by Matt Mills on TED, where he demos a technology that uses image recognition as a reference point for Augmented Reality videos and models.

The demo looks nice, but I can't help wondering what would happen when the number of reference images on the server reaches critical mass...

The idea is that when your set of pictograms is small, or when each one can easily be decoded into a text string (or some other key), it should be relatively easy to locate the appropriate data or model to show to the user.

However, once we allow arbitrary images, some form of fast image search needs to happen. Since the reference point could potentially be any part of the scene (although higher priority would probably be given to the focus point of the scene), it takes some pretty powerful encoding or abstraction to find a match.

I guess that there should be optimizations for detecting "frames" (i.e. rectangles), and then using the length/width ratio of each detected rectangle to narrow down the number of possible images. However, if the image is a 4x6 (or any 2:3 ratio image), the ratio alone does not help much: the phone has no data regarding the distance to the image due to lack of stereoscopy, so it cannot tell the physical size of the image either. I would assume that for such images the number of potential hits in the database would be quite large.
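To make the aspect-ratio idea concrete, here is a minimal sketch of how such a pre-filter might look. It is only my guess at one way to do it, not how any real service works: the `AspectIndex` class, the `ratio_key` helper and the bucket width `step` are all made up for illustration, and it assumes the rectangle has already been detected and measured.

```python
from collections import defaultdict


def ratio_key(width, height, step=0.05):
    """Quantize an orientation-independent aspect ratio into a bucket number.

    `step` is a hypothetical bucket width; a real system would tune it.
    """
    long_side, short_side = max(width, height), min(width, height)
    return round((long_side / short_side) / step)


class AspectIndex:
    """Hypothetical index that groups reference images by aspect ratio."""

    def __init__(self, step=0.05):
        self.step = step
        self.buckets = defaultdict(list)

    def add(self, image_id, width, height):
        self.buckets[ratio_key(width, height, self.step)].append(image_id)

    def candidates(self, width, height):
        """Return image ids whose ratio falls in the same or a neighbouring bucket."""
        key = ratio_key(width, height, self.step)
        hits = []
        for k in (key - 1, key, key + 1):  # tolerate small measurement errors
            hits.extend(self.buckets.get(k, []))
        return hits


index = AspectIndex()
index.add("photo_a", 4, 6)       # a 4x6 print
index.add("photo_b", 600, 400)   # same 2:3 ratio, landscape
index.add("square", 500, 500)

# A detected rectangle measuring 300x200 pixels on screen
print(index.candidates(300, 200))
```

Note that both 2:3 images come back as candidates, which is exactly the problem described above: the ratio cuts out the square image, but every 2:3 image in the database is still a potential hit.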

Other tricks, such as relying on geolocation, or abstracting parts of the image further (e.g. by recognizing elements in the encoded image such as rectangles, faces, other body parts, etc.) and doing the same on the input image, would probably allow faster lookup. But then again, if the database is full of portraits, that would still be a hard problem (even with parallelization). Other elements like colour coding, de-rezing the image (into a much lower resolution), and some statistics on the image could probably be used in combination with fuzzy logic to allow yet faster image lookup.
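As a rough illustration of the de-rezing idea, here is a small sketch of an "average hash": shrink the image to 8x8, compare each cell to the mean brightness, and keep the resulting 64 bits as a compact fingerprint. I am swapping in a plain Hamming-distance comparison in place of fuzzy logic, and the function names, the 8x8 size and the `max_distance` threshold are my own assumptions. Images are taken to be grayscale, given as lists of rows, and at least 8x8 pixels.

```python
def shrink(pixels, size=8):
    """Down-sample a grayscale image (list of rows) to size x size by block averaging."""
    h, w = len(pixels), len(pixels[0])
    small = []
    for by in range(size):
        y0 = by * h // size
        y1 = max((by + 1) * h // size, y0 + 1)
        row = []
        for bx in range(size):
            x0 = bx * w // size
            x1 = max((bx + 1) * w // size, x0 + 1)
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            row.append(sum(block) / len(block))
        small.append(row)
    return small


def average_hash(pixels, size=8):
    """Return an integer fingerprint: one bit per cell, set if brighter than the mean."""
    flat = [v for row in shrink(pixels, size) for v in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for v in flat:
        bits = (bits << 1) | (1 if v > mean else 0)
    return bits


def hamming(a, b):
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


def nearest(query_hash, index, max_distance=10):
    """Linear scan over {image_id: hash}; returns the closest id, or None if too far."""
    best = min(index.items(), key=lambda kv: hamming(query_hash, kv[1]), default=None)
    if best is None or hamming(query_hash, best[1]) > max_distance:
        return None
    return best[0]


# Two synthetic 16x16 reference images: a horizontal and a vertical gradient
horizontal = [[x * 16 for x in range(16)] for _ in range(16)]
vertical = [[y * 16 for _ in range(16)] for y in range(16)]
index = {"horizontal": average_hash(horizontal), "vertical": average_hash(vertical)}

# The "camera" sees the horizontal gradient, slightly brighter than the original
query = [[v + 5 for v in row] for row in horizontal]
print(nearest(average_hash(query), index))
```

The scan in `nearest` is still linear in the number of reference images, so on its own this only makes each comparison cheap; it would presumably have to be combined with something like the geolocation or aspect-ratio filtering above to keep the candidate set small.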

If anyone has more details on how the performance is achieved in services such as Layar (as well as many others) that support general image recognition for Augmented Reality, I would love to hear about it.

On a side note: +Augmented Reality has some good intro material and links for those new to AR, as well as links to lots of good demo videos.
