The AI in AR

We hear about Artificial Intelligence (AI) all the time now. It seems to be coming from everywhere. NPR is sponsored by consulting firms specializing in it. Google seems to have an AI breakthrough once a month. They’ve got AI writing AI! Startups with AI in their name seem to get funded on that basis alone. Elon Musk and Peter Thiel are siding with Stephen Hawking in worrying about AI, all the while pushing the boundaries of its capabilities.

I think it’s fair to say it’s all the rage.

It seems like VR has been struggling to catch on for quite some time now. Some of the biggest companies you can think of are pushing VR: Facebook with the Oculus Rift, HTC with the Vive, and now Microsoft with its “Mixed Reality” (which is essentially rebranded VR), plus all the OEM headsets coming out from Asus, Samsung, and you-name-it. Even so, you still need a fairly powerful, PC-only computer to run it, so it just hasn’t had the consumer explosion we’d hoped for. But this AR thing? What’s going on there?

More recently, we have been hearing about AR. Much like VR, AR has been evolving in the background for quite a few years, but relegated to extremely expensive systems used by huge companies like Boeing and, of course, the government. As the latest entrant of “personal CGI” into the mainstream, AR poked its head into the public consciousness with a wildly successful app called Pokémon Go. Of course, anything Pokémon is going to have some automatic lift.

Most people, after only a few minutes of thought, begin to see that AR has a bigger, brighter future than VR. After all, it doesn’t require you to completely immerse and isolate your primary senses in an entirely virtual world. Instead, it lets you “augment” the world around you. Clearly this has a wider addressable market. So why the adoption lag?

It’s simple, really. AR is hard. Let me say that again, louder. AR is HARD. Pulling off the blending of digital reality with physical reality in any believable (or better yet, useful) fashion requires many disciplines to all come together. To name just a few:

  • Graphic Designers
  • 3D Modelers
  • Animators
  • Hardware Specialists (and the innovative new hardware they build)
  • Computer Vision engineers
  • Machine Learning experts

Each of those has niches within niches, and each of these people needs to be aware of how what they are building will interact with every other piece. Harder still, each of these disciplines seems to have its own obtuse vocabulary. After all, how do you place a digital 3D model of a beach ball on the coffee table? How does the device/screen doing the placement (head-mounted, handset, monocle…) know “where” the table is? Know that it is flat? Know that the light in the room comes from the west, so the shadow falls dark to the east below the ball? And so on.

The blending of the physics of the real world and the physics of your digital model is no trivial feat.
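As a taste of how the SDKs discussed next take the lighting question off your plate, here’s a minimal sketch using Apple’s ARKit: every camera frame carries an estimate of the ambient light, which you can use to shade that virtual beach ball to match the room.

```swift
import ARKit

// A minimal sketch: read ARKit's per-frame ambient light estimate so
// virtual objects can be shaded to match the real room.
class LightWatcher: NSObject, ARSessionDelegate {
    func session(_ session: ARSession, didUpdate frame: ARFrame) {
        guard let estimate = frame.lightEstimate else { return }
        // Roughly 1000 lumens means "well lit"; use these values to
        // scale the lights in your rendered scene.
        print("Ambient intensity: \(estimate.ambientIntensity)")
        print("Color temperature: \(estimate.ambientColorTemperature) K")
    }
}
```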

This is where the AI in AR comes in. The advances in Augmented Reality over the last year, namely the release of software development kits (SDKs) like Google’s ARCore and Apple’s ARKit, are the product of not months but years of advanced software engineering, as well as work on custom hardware, to bring all this together and make it available to “everyday” application developers. The Augmented Reality developer of 2016 needed to know a whole lot of physics, lighting calculations, and computer vision algorithms to pull off AR. In 2018 we can get started more readily and need to know a lot less to get it done.
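To make that concrete, here’s a hedged sketch of how ARKit answers the “where is the table, and is it flat?” questions from earlier. One configuration flag asks the SDK to look for horizontal planes, and the session calls you back as its vision pipeline finds them (ARCore exposes a very similar API on Android):

```swift
import ARKit

// A minimal ARKit sketch: ask the session to find horizontal planes
// (a coffee table, say), then get called back as they are detected.
class TableFinder: NSObject, ARSessionDelegate {
    let session = ARSession()

    func start() {
        let config = ARWorldTrackingConfiguration()
        config.planeDetection = [.horizontal] // "find flat surfaces for me"
        session.delegate = self
        session.run(config)
    }

    func session(_ session: ARSession, didAdd anchors: [ARAnchor]) {
        for case let plane as ARPlaneAnchor in anchors {
            // The anchor carries position and size, enough information
            // to place a digital beach ball on the real table.
            print("Found a plane with extent \(plane.extent)")
        }
    }
}
```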

Let’s take just the last two in our list: Computer Vision (CV) and Machine Learning (ML). CV embodies the physicality of how the computer perceives images. Different mechanisms, filters, and preprocessors do things like edge detection, corner detection, and the like, even face detection. ML is typically a layer on top of that, very often using those recent entrants into the common vernacular, “neural networks,” to create a sophisticated pattern-recognition system allowing the machine to see and remember faces, detect tables (planes), and perceive depth. Heck, the HoloLens has 16 processing chipsets for these purposes! (Which is probably why it’s the most amazing device I’ve seen.)
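To see how thin that layer feels to an app developer today, here’s a sketch (assuming an iOS app) of face detection using Apple’s Vision framework; the filters, feature detectors, and trained models all hide behind a single request:

```swift
import Vision
import UIKit

// A sketch of on-device CV: Apple's Vision framework wraps the
// filters, detectors, and trained models behind one request type.
func detectFaces(in image: UIImage) {
    guard let cgImage = image.cgImage else { return }

    let request = VNDetectFaceRectanglesRequest { request, _ in
        let faces = request.results as? [VNFaceObservation] ?? []
        print("Found \(faces.count) face(s)")
        for face in faces {
            // boundingBox is in normalized image coordinates (0...1).
            print("Face at \(face.boundingBox)")
        }
    }

    let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])
    try? handler.perform([request])
}
```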

You might think from that description that the work is all done. Far from it. We’re just scratching the surface. The hardware makers are scrambling to build faster custom processors just to handle the CV and ML pieces. The software engineers are developing more sophisticated learning algorithms to run on that new more sophisticated hardware.

But you don’t want to be stuck with the “one size fits all” solutions the vendors are providing. They are a fantastic starting point, and in many cases really are enough to build the app you have in mind. Often, though, you have AR ideas that require more sophisticated AI to achieve. In these cases you still have a variety of fallback plans. The “easy button” (relative to hand-coding all of this yourself over half a decade) is to leverage one or more of the incredible services offered by Google, Microsoft, and Amazon that let you train your own ML models and query them from your app. Sometimes, though, this doesn’t perform as well as you’d like, or you need your app to work offline. In that case Apple provides Core ML and Google has TensorFlow Lite. Caffe is also available on mobile, and it seems like new vendors appear monthly.
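For the offline path, here’s a rough sketch of what Core ML looks like in practice. `BeachBallClassifier` is a hypothetical model you would train elsewhere (say, with one of those cloud services) and drop into your Xcode project; Xcode generates the Swift class for it automatically:

```swift
import CoreML
import Vision

// A sketch of the offline fallback: run a trained model on-device with
// Core ML + Vision, no network round trip required.
// `BeachBallClassifier` is a hypothetical .mlmodel added to the project;
// Xcode auto-generates this Swift class from the model file.
func classify(_ cgImage: CGImage) throws {
    let coreMLModel = try BeachBallClassifier(configuration: MLModelConfiguration()).model
    let visionModel = try VNCoreMLModel(for: coreMLModel)

    let request = VNCoreMLRequest(model: visionModel) { request, _ in
        if let top = (request.results as? [VNClassificationObservation])?.first {
            print("\(top.identifier) (confidence: \(top.confidence))")
        }
    }

    try VNImageRequestHandler(cgImage: cgImage, options: [:]).perform([request])
}
```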

Anyway, we’re getting pretty deep into the tech there. The point of all this is that it’s useful to be aware that:

  • AR is the Future
  • AR is HARD
  • AI makes AR better and easier (think ‘viable’)
  • A surprising amount of AI for AR is “boxed and ready” for your use
  • It’s time to start building that AR app
  • 2018/2019 are going to be the dawn of handset-driven AR
  • Headset AR is on the horizon, and it will eventually supersede the handset. Imagine not having to stare down at your phone while figuring out where you are going…

Let’s start Augmenting our Reality.
