The coming era of ubiquitous household chatbots


Back in 2005-2006 my friend Liesl told me about the coming age of chat bots.  I had a hard time imagining how people would embrace products that simulated human voice communication but were less “intelligent”.  She ended up building a company that let people deploy polite automated service agents, each programmed with a specific area of intelligence.  Upon launch she found that people spent a lot more time conversing with the bots than they did with the average human service agent.  I wondered if this was because it was harder to get questions answered, or if people just enjoyed conversing with the bots more than they enjoyed talking to people.  Perhaps when we know the customer service agent is paid hourly, we don't gab in excess.  But if it's a chat bot we're talking to, perhaps we don't feel the need to be hasty?

Fast forward more than a decade, and IBM has acquired her company into the Watson group.  During a dinner party we talked about the Amazon Echo sitting on her porch.  She and her husband would occasionally make DJ requests to “Alexa” (the name for Echo’s internal chat bot) as if it were a person attending the party.  It definitely seemed that the age of more intelligent bots was upon us.  Most folk who have experimented with speech-input products of the last decade have become accustomed to talking to bots in a robotic monotone devoid of accent, because of the somewhat random speech capture mistakes that early technology was burdened with.  If the bots don't adapt to us, it seems we go to them, mimicking the robotic voices depicted in the science fiction films of the '50s and '60s.

This month both Microsoft and Facebook have announced open bot APIs for their respective platforms.  Microsoft’s platform for integration is an open source "Bot Framework" that allows any web developer to re-purpose the code to inject new actions or content tools into the active discussion flow of their conversational chat bot, Cortana, which is built into the search box of every Windows 10 operating system they license.  They also demonstrated how the new bot framework allows their Skype messenger to respond to queries intelligently if it has the right libraries loaded.  Amazon refers to the app-sockets for the Echo platform as "skills", whereby you load a specific field of intelligence into the speech engine to allow Alexa to query the external sources you wish.  I noticed that the Alexa and Cortana teams both seem to be focusing on pizza ordering in their product demos.  But one day we'll be able to query beyond the basic necessities.  In an early demonstration back in 2005 of the technology Liesl and Dr. Zakos (her cofounder) built, they had their chat bot ingest all my blog writings about folk percussion, then answer questions about topics covered in my personal blog.  When a bot is narrowed to a single subject matter, its answers can be uncannily accurate within that field!

Facebook’s plan is to inject bot-intelligence into the main Facebook Messenger app.  Their announcement seems to follow quite closely the concept Microsoft announced, in which developers can port new capabilities into each platform vendor's chat engine.  It may be that both Microsoft and Facebook are planning for the social capabilities of their collaboration on the launch of Oculus, Facebook's headset-based immersive virtual world platform, which runs on Windows 10 machines.

The outliers in this era of chat bot openness are Apple's Siri and Google's "Ok Google" speech tools, each of which works like a centrally managed brain.  (Siri may query the web using specific sources like Wolfram Alpha, but most of the answers you get from either will be consistent with the answers others receive for similar questions.)  What I find very elegant about the approaches Amazon, Microsoft and Facebook are taking is that they make the knowledge engine of the core platform extensible in ways that a single company could not.  The approach also allows customers to personalize their experience of the platform by adding newly ported services to the tools.  What interests me here is that the speech platforms could become much more like the Internet of today, where we are used to having very diverse content experiences based on our personal preferences.

It is very exciting to see that speech is becoming a useful interface for interacting with computers.  While the content of the web is already one of the knowledge ports of these speech tools, the open APIs of Cortana, Alexa and Facebook Messenger will usher in a very exciting new means to create compelling internet experiences.  My hope is that there will be a bit of standardization, so that a merchant like Domino's doesn't have to keep rebuilding its chat bot tools for each platform.

Each of these innovative companies is dealing with the hard questions of how to get us out of our stereotypes of robot behavior and get us back to acting like people again, returning to the main interface that humans have used for eons to interact with each other.  Ideally the technology will fade into the background and we'll start acting normally again instead of staring at screens and tapping fingers.
