The Humane AI Pin and the Future of Computer Interfaces

Are smartphones undisruptable?
2024-04-22

Smartphones are boring. They all look the same and do the same things. They’ve been around for a while. Although they’re very capable, they’re by no means perfect.

A lot of people have been asking: “What’s next? Where do we go from here?”

There’s the foldable category that aims to make smartphones even more versatile. The spatial computing folks imagine a more immersive future where technology augments our perception of reality. And some want us to ditch the screen. It started with voice-assistant smart speakers back in 2014. Now Humane has made a pretty bold move in that direction.

Depending on how you like to augment your experience of the world, one of these can make a lot more sense to you than the others. Some people want a foldable phone with two screens. Some want lasers projecting the weather forecast directly to their retinas. Others would prefer a device with no screen at all.

Could any of these replace the smartphone?

Can a Technology Be Undisruptable?

Over the past few decades, the tech industry has made it seem like there’s always a next step. Each new development is only transitory until something even better arrives. Don’t get too comfortable. The next innovation is always right around the corner.

Sure, no technology is perfect. But is nothing ever really good enough?

Take shoes, for example. There was a time when no one wore shoes. Then, around 10,000 years ago, some folks had the idea to tie some bark to their feet. 5,000 years ago, people started making shoes out of leather. These days, virtually everyone wears shoes.

New shapes, colours, sizes and materials come out all the time. But the concept of a shoe hasn’t changed much since the bark sandal days of 10,000 years ago. Inline skates, giant jumping stilts or tiny conveyor belts for your feet may be better for some use cases, but shoes are here to stay.

The Problem with the Humane AI Pin

Smartphones are an extension of the human brain. You could say that about the personal computers from the 80s as well. Before that, people achieved some of the same goals through writing on paper.

Just as Air Jordans are an evolution of ‘shoe technology’, smartphones are an iteration of this brain-enhancing line of tech.

We communicate with our devices to achieve a variety of goals. Curiously, the interface has changed little over the years. The grey boxes running Windows 95 also had screens, keyboards and speakers. Touchscreens were a breakthrough. Speech interfaces have been a bit of a slow burn. Computer vision has improved a lot over the last few years. LLMs look promising.

Everything you do with your smartphone goes through this multi-modal interface. You can navigate the UI with your hands or type something on the virtual keyboard. You can use the voice assistant or the vision capability of the camera. The device can respond with voice, but it displays something on the screen most of the time.

The interface limits how well smartphones can extend our brains. It’s a bottleneck. Humane is right to be trying to integrate computers into our lives in a more seamless way.

Anything you can do to reduce friction or increase the throughput and fidelity of the interface between the human and the device is worth it. Where Humane went wrong with the AI Pin was to reduce the fidelity.

Replacing the high-resolution touchscreen nearly all smartphones have with a projector aimed at your palm limits the throughput significantly. The gestures are interesting, but can they ever become as intuitive as pointing at something on the screen? Toddlers as young as two can learn to operate a touchscreen on their own. It’s pretty hard to beat.

Even if Humane can run the whole AI stack locally and reduce the latency of the voice assistant to zero, voice is still slow and imprecise. We rely on tons of subtle social norms and non-verbal cues when speaking. We stumble and mispronounce things. Speech is a noisy channel.

Most people can read faster than they can speak. We can scan through a ton of text on a screen. Voice-enabled interfaces are great, but voice-only is a step backwards.

The Future of Computer Interfaces

The smartphone most certainly can be disrupted. First of all, it could be unbundled. We might end up with a suite of more specialised devices — like the AI Pin. Don’t expect anything that would be universally better anytime soon, though.

Let’s zoom out and forget smartphones for a bit. What would be the absolute best extension of the brain? Something that actually extends the brain. A device that knows what you’re thinking and helps you achieve those things.

We’ll need to learn a bit more about how the brain works to make two-way brain-to-computer interfaces possible. Progress is being made on one-way interfaces, though.

If something could read your thoughts and forward them to your device, that would bypass one side of the interface bottleneck. Wanna remember something later? Cool, your device made a note already. Need to know what the traffic looks like this morning? It’s there when you pick up your phone.

Safe to say, devices that can read your mind are still pretty far out. But humans are creatures of habit. We all also do somewhat similar things. Could these devices predict what’s on your mind based on the context and your past behaviour?

The recent AI boom gave us several new predictive interfaces — coding copilots, writing assistants and chatbots to name a few. AI-driven interfaces have been around for a while, though. The feed algorithms on social media are a type of predictive interface. Given how much time people spend on TikTok, it’s a powerful one, too.

The iPhone tries to predict what app you’ll need when you open Spotlight. The suggestions are time-based and perhaps even location-based. It’s a small thing, but it can be pretty handy.
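
To make the idea concrete, here is a minimal sketch of a time- and location-based suggestion engine. It is not how Apple does it (that isn't public); it simply counts which apps were opened in each context and suggests the most frequent ones. The AppPredictor class, its method names and the sample data are all made up for illustration.

```python
from collections import Counter, defaultdict


class AppPredictor:
    """Hypothetical predictor: counts app launches per (hour, place) context."""

    def __init__(self):
        self.by_context = defaultdict(Counter)  # (hour, place) -> app counts
        self.by_hour = defaultdict(Counter)     # hour -> app counts (fallback)

    def record(self, hour, place, app):
        """Log one app launch in the given context."""
        self.by_context[(hour, place)][app] += 1
        self.by_hour[hour][app] += 1

    def suggest(self, hour, place, k=3):
        """Return up to k apps, most frequently used first."""
        # Prefer the exact context; fall back to the hour alone if unseen.
        counts = self.by_context.get((hour, place)) or self.by_hour.get(hour)
        if not counts:
            return []
        return [app for app, _ in counts.most_common(k)]


# Made-up usage history: (hour of day, place, app opened).
predictor = AppPredictor()
for hour, place, app in [
    (8, "home", "weather"),
    (8, "home", "maps"),
    (8, "home", "maps"),
    (13, "office", "mail"),
    (22, "home", "podcasts"),
]:
    predictor.record(hour, place, app)

print(predictor.suggest(8, "home"))  # ['maps', 'weather']
print(predictor.suggest(8, "gym"))   # unseen place, falls back to the hour
```

Even something this crude captures the "creatures of habit" effect: a handful of counters keyed on context already surfaces the traffic app in the morning. Real systems presumably use far richer signals, but the shape of the problem is the same.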

Over the next few years, these predictive engines will be integrated deeper into operating systems and apps. I’m really excited to see how close they will get us to feeling as if they’re reading our minds.

Final Thoughts

To bring this back to Humane: they’re doing all those things. They store all your interactions with the device. AI is at the centre of their OS, learning how you use the device so that it can better assist you in the future.

Over the past five years, they have clearly spent a lot of time thinking about how predictive AI can improve our interactions with our devices. Undoubtedly, their vision extends well beyond the pin. Maybe the next device they build has a screen? Or perhaps a pivot towards integrating with phones could be on the cards?

The AI Pin gives us a glimpse into the future of human-computer interaction. A future that is multi-modal and AI-driven. A future that most certainly includes a screen (until we figure out how to pipe information directly into people’s brains).