Saturday, December 29, 2012

Holo-tables ("Avatar" style) are cool and seem very useful, but is the technology there to make them real today?

Like many others, I loved the holo-table concept presented in the film Avatar (2009).

I think showing terrain in 3D could be very useful in both civilian and military applications.
For example, a quick look at such a 3D-enabled map could help determine whether line-of-sight exists between two points, or whether some part of the terrain would be harder to traverse, a potential ambush spot, and so on. I believe it opens possibilities for strategic planning beyond what 2D terrain maps enable.
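As a toy illustration of the kind of query such a map could answer, here's a minimal line-of-sight check over a terrain heightmap; the grid, the sampling density, and the observer height are all assumptions, and a real system would interpolate rather than snap to grid cells:

```python
import numpy as np

def has_line_of_sight(heightmap, a, b, eye_height=1.7, samples=200):
    """Return True if point b is visible from point a over the terrain.

    heightmap: 2D array of terrain elevations (metres)
    a, b: (row, col) grid coordinates of the observer and target
    eye_height: observer/target height above the ground (an assumption)
    """
    (r0, c0), (r1, c1) = a, b
    z0 = heightmap[r0, c0] + eye_height
    z1 = heightmap[r1, c1] + eye_height
    for t in np.linspace(0.0, 1.0, samples)[1:-1]:
        # Walk along the straight sight line and compare it to the terrain.
        r = int(round(r0 + t * (r1 - r0)))
        c = int(round(c0 + t * (c1 - c0)))
        sight_z = z0 + t * (z1 - z0)
        if heightmap[r, c] > sight_z:
            return False  # terrain blocks the sight line
    return True

# Toy example: a 30 m ridge across the map blocks visibility.
terrain = np.zeros((50, 50))
terrain[25, :] = 30.0
print(has_line_of_sight(terrain, (5, 25), (45, 25)))  # False
```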

One of the key aspects of such a "holo table" is the ability to have multiple viewers collaborate, each watching the map from a different angle.

A 3D display may be useful here, although it would require a lot of computation to produce a large number of potential viewpoints. However, most 3D displays today use vertically aligned lenticular lenses, which enable multiple views from different horizontal positions.
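To make "multiple views from different horizontal positions" concrete: each lenticular lens fans a handful of pre-rendered views across a horizontal viewing zone, and which view reaches your eye depends on your angle to the screen. A minimal sketch, where the view count and zone width are arbitrary assumptions:

```python
def view_index(viewer_angle_deg, num_views=8, zone_deg=40.0):
    """Map a viewer's horizontal angle off the screen normal to one of
    num_views pre-rendered views fanned across the viewing zone.
    (num_views and zone_deg are illustrative assumptions.)"""
    half = zone_deg / 2
    # Clamp to the viewing zone, then bin the angle into a view slot.
    angle = max(-half, min(half, viewer_angle_deg))
    slot = int((angle + half) / zone_deg * num_views)
    return min(slot, num_views - 1)

# Two viewers standing a few degrees apart see different views: parallax.
print(view_index(-5.0), view_index(5.0))  # 3 5
```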

Using concentric ellipses, one would get the parallax effect on up-down motion, but not while walking around the table.

Therefore, one would need to lay the lenticular lenses along lines perpendicular to these concentric ellipses. That would give parallax on horizontal movement, which might be sufficient.
However, there is still a problem with the density of the lenticular lenses close to the centre of the display. Obviously, such lenses cannot cross each other's paths.
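To make the layout concrete: the ellipses can be written as level curves of f(x, y) = x²/a² + y²/b², and the gradient of f is perpendicular to those curves, so it gives the lens direction at any point on the table. A small sketch, where the semi-axes a and b are arbitrary assumptions:

```python
import math

def lens_direction(x, y, a=3.0, b=2.0):
    """Unit vector perpendicular to the concentric ellipses
    x^2/a^2 + y^2/b^2 = const at table point (x, y).

    The gradient of f(x, y) = x^2/a^2 + y^2/b^2 is normal to the
    level curves, so it points along the desired lens axis.
    (a and b, the ellipse semi-axes, are arbitrary assumptions.)
    """
    gx, gy = 2 * x / a**2, 2 * y / b**2
    norm = math.hypot(gx, gy)
    if norm == 0:
        raise ValueError("direction is undefined at the table centre")
    return gx / norm, gy / norm

print(lens_direction(1.5, 0.0))  # (1.0, 0.0): lenses run radially on the x-axis
```

Note that the direction is undefined at the exact centre, which is another way of seeing the density problem above.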
The solution (hardware, software, and compute power) seems pretty expensive. But price aside, an elliptic 3D display used horizontally as a "holo table" seems doable with today's technology, unless I'm missing something.

Another interesting implementation idea, one that shows more promise although I believe it is at least a few years away from commercial use, is a laser plasma volumetric display.

By setting up an overhead "projector" (an infrared laser that ionizes small pockets of air into glowing plasma at its focal points above the table), it should be possible to display a 3D image over the table resembling the holo-table scene from Avatar. Currently the technology is limited to a monochromatic, bluish colour, but as it develops further we might see such displays become practical.

As a third, and perhaps simpler, alternative for implementing something similar to the holo table above, we might want to take a look at an old and familiar toy: the pin-art toy, a bed of sliding pins that takes the shape of whatever is pressed into it.

Using the same idea horizontally: if we could effectively and electronically control the height of the pins, using a combination of elastic and electromagnetic forces as in this pin-array tactile module, or even mechanical alternatives, one could overlay a reflective material over the pins (or just paint their heads white), use an overhead projector to project the surface map onto them, and the result should be a fairly realistic 3D representation of the terrain. Of course, the density of the pins would determine the resolution, and the elastic/electromagnetic combination would probably also allow some degree of animation (zoom, rotate, slide, etc.).
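Driving such a pin array from elevation data would be straightforward. A minimal sketch, where the pin-grid resolution, the travel range, and the send_heights driver call are all hypothetical:

```python
import numpy as np

def terrain_to_pin_heights(heightmap, pins=(32, 32), max_travel_mm=40.0):
    """Downsample a terrain heightmap to a pin grid and scale
    elevations into the pins' mechanical travel range.

    pins: pin-array resolution (an assumption)
    max_travel_mm: maximum pin extension (an assumption)
    """
    rows = np.linspace(0, heightmap.shape[0] - 1, pins[0]).astype(int)
    cols = np.linspace(0, heightmap.shape[1] - 1, pins[1]).astype(int)
    sampled = heightmap[np.ix_(rows, cols)]
    lo, hi = sampled.min(), sampled.max()
    scale = max_travel_mm / (hi - lo) if hi > lo else 0.0
    return (sampled - lo) * scale  # per-pin extension in mm

# Hypothetical usage: send_heights() stands in for whatever interface
# the electromagnetic pin driver would actually expose.
# send_heights(terrain_to_pin_heights(terrain))
```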
I can't help imagining how noisy this will be ;-) But at least it is a start in the right direction, until laser plasma volumetric displays are ready for prime time.

Update:

Zebra Imaging, a long-time producer of 3D holographic prints, was awarded a contract by DARPA back in 2005 to develop a real-time, interactive holographic display map. The Urban Photonic Sandtable Display (UPSD) is the result. It supports up to 20 participants, 360-degree viewpoints, 12 inches of depth, and displays that scale up to 6 feet in length, enabling full parallax without requiring special glasses or goggles.

Personally, I hadn't heard of Zebra Imaging before searching for holo-table implementations, even with all my research into augmented reality and 3D display technology. I can only assume that the limit on the number of participants means the display does some face and eye tracking, in order to limit the amount of computation needed to create the holograms.

360-degree holograms take a lot of compute power. Basically, as I understand it, one would need to calculate the interference patterns for viewpoints from all angles to create the full parallax effect (otherwise the hologram may be invisible, or look incoherent, from some angles).
By tracking each viewer's eyes, the display can calculate only the part of the holographic model relevant to that viewer. Adding viewers makes the patterns more complex (super-linearly, I assume, although I'm not familiar with hologram-generation algorithms), so if the system at its current compute power had to handle participant 21, the real-time effect would be lost, since the frame rate would drop significantly.
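To get a feel for why this is expensive, here's a naive computer-generated hologram sketch that superposes a spherical wave from every scene point at every hologram pixel. I don't know what algorithm Zebra Imaging actually uses; this is just the textbook point-source approach, with illustrative wavelength, pitch, and resolution. The cost is O(pixels × points) per computed region, repeated for every tracked viewer:

```python
import numpy as np

def hologram_pattern(points, grid=256, pitch_um=10.0, wavelength_um=0.532):
    """Naive CGH: superpose spherical waves from scene points onto a
    hologram plane at z = 0. Cost is O(grid^2 * len(points)).

    points: array of (x, y, z) scene points in micrometres
    pitch_um: hologram pixel pitch; wavelength_um: laser wavelength
    (all values here are illustrative assumptions)
    """
    k = 2 * np.pi / wavelength_um  # wavenumber
    xs = (np.arange(grid) - grid / 2) * pitch_um
    X, Y = np.meshgrid(xs, xs)
    field = np.zeros((grid, grid), dtype=complex)
    for px, py, pz in points:
        r = np.sqrt((X - px) ** 2 + (Y - py) ** 2 + pz ** 2)
        field += np.exp(1j * k * r) / r  # spherical wave from one point
    return np.abs(field) ** 2  # intensity pattern to write to the display

# Even this toy case is 256 x 256 pixels x 1000 points,
# roughly 65 million complex evaluations per frame.
pattern = hologram_pattern(np.random.rand(1000, 3) * 1e4 + [0, 0, 5e4])
```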

I can only assume that with improved compute power, the UPSD should be able to deal with a larger number of participants.

Conclusion:
I believe it is safe to tag "Holographic map tables" as Science-Fact.

Marker-less (General Image Recognition) based tracking

There's a nice presentation by Matt Mills on TED, where he demos a technology that uses image recognition as a reference point for Augmented Reality videos and models.

The demo looks nice, but I can't help wondering what will happen when the number of image reference points on the server reaches critical mass...

The idea is that when your set of pictograms is small, or decodes easily into a text string (or some other key), it should be relatively easy to locate the appropriate data or model to show the user.

However, when we allow arbitrary images, some form of fast image search needs to happen. Since the reference point could potentially be any part of the scene (although higher priority would probably be given to the focal point of the scene), finding it requires some pretty powerful encoding or abstraction.

I would guess there are optimizations for detecting "frames" (i.e. rectangles), then cutting the candidates down by the length/width ratio. However, if the image is a 4x6 print (or any 2:3-ratio image), and since the phone has no data about the distance to the image due to the lack of stereoscopy, I would assume the number of potential hits in the database for such images would be quite large.

Other tricks, such as relying on geolocation, or abstracting parts of the image further (i.e. recognizing elements in the encoded image such as rectangles, faces, other body parts, etc.) and doing the same on the input image, would probably allow faster lookup. But then again, if the database is full of portraits, that would still be a hard problem (even with parallelisation). Other signals, like colour coding, de-rezzing the image (into a much lower resolution), and some statistics on the image, could probably be combined with fuzzy logic to allow yet faster image lookup.
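One standard trick along these lines is a perceptual hash: de-rez each reference image into a tiny fingerprint, so a lookup compares short bit strings instead of pixels and the expensive matcher runs only on the few surviving candidates. Here's a minimal difference-hash (dHash) sketch; the 8x8 size and the distance threshold are conventional but arbitrary choices, and I have no idea whether Layar does anything like this:

```python
from PIL import Image

def dhash(image, size=8):
    """Difference hash: de-rez the image to (size+1) x size greyscale,
    then record whether each pixel is brighter than its right neighbour.
    Similar images yield hashes that differ in only a few bits."""
    img = image.convert("L").resize((size + 1, size))
    px = list(img.getdata())
    bits = 0
    for row in range(size):
        for col in range(size):
            i = row * (size + 1) + col
            bits = (bits << 1) | (px[i] > px[i + 1])
    return bits

def hamming(h1, h2):
    """Number of differing bits between two hashes."""
    return bin(h1 ^ h2).count("1")

# Lookup: precompute dhash() for every reference image; at query time,
# keep candidates whose Hamming distance is below a small threshold
# (e.g. <= 10 of 64 bits), then verify only those with a full matcher.
```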

If anyone has more details on how the performance is achieved in services such as Layar (as well as many others) that support general image recognition for Augmented Reality, I would love to hear about it.

On a side note:  +Augmented Reality has some good intro material and links for those new to AR, as well as links to lots of good demo videos.

Friday, December 28, 2012

From Science-Fiction to Science Fact: An overview of technologies for integrating AR in our daily lives

I gave a talk today at the IMAGinE "unconference":

From Science-Fiction to Science Fact:
An overview of technologies for integrating AR in our daily lives
By: Tal Arie
Description: A common tool in SciFi movies, AR is presented as an ideal way of interacting with certain types of information. The film "Iron Man", for example, presents a Human Machine Interface that seems to open up possibilities for designers and researchers that were never possible before (and are still not possible to that degree). The film "Minority Report" presents a Human Machine Interface for dealing with large quantities of visual data (which is already Science Fact). The talk gives a quick review of current and up-and-coming technologies for enabling Augmented Reality in our daily lives, including 3D displays, holographic projectors (enclosed and mid-air), and AR glasses (such as Microsoft's and Google's), as well as technologies for interacting with AR (if time permits).

The presentation is available here.

Following my presentation, one of the participants asked me whether I had a blog or some similar public presence, which is why I decided to create this blog.

Since this is the first post on this blog, I'll use the opportunity to define the blog's purpose:

I track technology (being an Inventor, Software Engineer, and a Technology Enthusiast).

A particular set of technologies that I'm interested in has to do with NUI, which stands for Natural User Interfaces. These include motion-based user interfaces (think Kinect, LeapMotion, MUV, etc.), holographic or volumetric display technology (and 3D display technology in general), and voice activation, as well as Artificial Intelligence in general and around user interaction in particular (think "Jarvis" from the film "Iron Man").

Take a look at the presentation; I spent literally hours and hours researching the material. There's an endless amount of work being done on this topic (and lots of YouTube video demonstrations as well) that I wasn't able to fit into it. If you're enthusiastic about Augmented Reality like me, and also believe it to be The Next Big Thing, then you may have come to the right place.