The iPhone, Siri and Conversation
When Steve Jobs introduced the Apple iPhone 10 years ago, the world could only guess at the extent to which it (and other smartphones) would become indispensable to our lives.
A decade later, these handheld computers, more powerful than those aboard Apollo 11, have given us enormous reach. On the fly we can deposit checks and transfer funds, keep up with email, watch movies, peruse Facebook or read the news, and navigate in real time by way of GPS. With our iPhones we can find dates, check in for airline flights and even control the appliances in our homes.
But one of the most intriguing technologies the iPhone brought was Siri, the occasionally snarky voice assistant introduced on the 4S version, who, among her other talents, can help you communicate, navigate, query the internet and activate various apps. Lost your house key? Siri can find a locksmith in your area. Need a caffeine fix? She will guide you to the nearest Starbucks. Not sure what to wear? Ask her for a weather report. She can even give you the title of the obscure novel or poem that’s on the tip of your tongue.
Such human-computer interaction is not new. For decades we have been on speaking terms with our machines, and we have certainly fantasized about engaging them in dialogue (think HAL of “2001: A Space Odyssey” or KITT, the self-driving smart-alecky car in the 1980s TV show “Knight Rider”). But our real-life computers have tended to be uninteresting one-trick ponies and poor conversationalists.
That, however, was before research in artificial intelligence, particularly natural language processing (NLP), really took off. Thanks to, among other things, a surge in computational power and better algorithms, the push to enable our computers not only to talk but to hold conversations has also intensified.
“Natural language processing is a way to look at how machines understand text and speech,” said William Wang, an assistant professor in UC Santa Barbara’s Department of Computer Science. “Machines need to not just understand human language, but learn how to generate human language.”
Wang, who joined UCSB in fall 2016, set up the campus’s first course in the red-hot field of natural language processing. Last quarter, he introduced a graduate-level course on deep learning for NLP and had a cap of 25 students in his class. Sixty-five students showed up.
“There’s a trend in industry that requires our graduates to have more data analytics skills,” he said. And so the shift is toward finding ways for computers to look at and interpret massive sets of data.
What comes to us naturally — we start picking up context and speech early in life — is not so easy for a machine. There are hundreds of ways to say the same thing, and in our conversations we often refer to things contextually, stringing phrases and incomplete statements together without losing meaning. Even when we hear words in different accents we can parse out the meanings without skipping a beat. The voice assistants we’re most familiar with, including Siri, Amazon’s Alexa, Microsoft’s Cortana and Google Assistant, are just scratching the surface of that ability now.
Early versions of Siri, the first smartphone voice assistant, could only understand isolated statements and commands, Wang said. You couldn’t ask follow-up questions, where the subject may be inferred from the previous question, but not stated outright. As the minds behind Siri and other voice assistants race to evolve their technologies using artificial intelligence elements such as machine learning and neural networks, the assistants are bound to become more nuanced in their communications.
Another popular topic in natural language processing that Wang teaches is machine translation, in which the computer translates words and phrases from one language into another. Facebook has made attempts at this, with results ranging from fairly accurate to unintentionally hilarious to downright confusing.
“Right now the accuracy is improving but it’s still not satisfactory,” Wang said, “so there’s still a lot of work to be done.”
The work, according to Wang, involves a lot of data and several layers of simultaneous processing. Research in voice recognition and other areas of artificial intelligence has been going on for years, but it’s only since the hardware has become capable of handling the multilayered and iterative processes of deep learning that NLP has become truly widespread.
As neural network models get more sophisticated in their training, the potential uses for NLP will grow, Wang said, and versions could be deployed to smartphones everywhere. One student in his class is working on a deep learning problem that could allow for language processing offline; an ideal application would involve individuals in foreign countries who need to communicate with the locals, but don’t have an internet connection to access information.
In the meantime, Wang is taking his work a step further by conducting research not only into teaching machines to learn, but, in some sense, teaching machines to teach themselves.
“In reinforcement learning, the idea is how can we teach the machine to make incremental decisions, without having a lot of human annotations,” Wang said. With this method of machine learning, computers are made to explore options and take actions in order to maximize a reward, with minimal human supervision.
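For readers curious what "explore options and take actions in order to maximize a reward" can look like in practice, here is a minimal sketch of one textbook technique, an epsilon-greedy agent on a multi-armed bandit. It is an illustration only and is not drawn from Wang's research; the function name run_bandit, the reward probabilities and the epsilon value are all hypothetical choices made for this example.

```python
import random


def run_bandit(reward_probs, steps=2000, epsilon=0.1, seed=0):
    """Toy epsilon-greedy agent (illustrative assumption, not Wang's method).

    reward_probs[i] is the hidden chance that action i pays a reward of 1.
    The agent never sees these numbers; it only observes rewards.
    """
    rng = random.Random(seed)
    n_actions = len(reward_probs)
    counts = [0] * n_actions    # how many times each action was tried
    values = [0.0] * n_actions  # running average reward per action
    total_reward = 0.0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: occasionally try a random action.
            action = rng.randrange(n_actions)
        else:
            # Exploit: pick the action with the best average so far.
            action = max(range(n_actions), key=lambda a: values[a])

        # The environment hands back a reward; no human labels are involved.
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0

        # Incrementally update the running average for the chosen action.
        counts[action] += 1
        values[action] += (reward - values[action]) / counts[action]
        total_reward += reward

    return values, counts, total_reward


if __name__ == "__main__":
    values, counts, total = run_bandit([0.2, 0.5, 0.8])
    print("estimated values:", [round(v, 2) for v in values])
    print("times each action was chosen:", counts)
    print("total reward:", total)
```

The only feedback the agent receives is the reward signal itself, which is the sense in which this style of learning needs few human annotations. Real systems for language tasks are far more elaborate, but they follow the same loop of act, observe a reward and update.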
Wang will offer his courses again during the upcoming academic year. His students — undergraduate and graduate — tend also to specialize in mathematics, geography and physics, all fields that will benefit greatly from machine learning and its capacity for processing and analyzing large amounts of data.
“In general I would say there’s a surge of interest in NLP and machine learning and data analytics in general,” he said. Natural language processing is particularly useful because we communicate with words, whether by text or voice, in a variety of applications and scenarios.
Added Wang, “How we actually design more intelligent machines that can understand humans and also facilitate natural language generation — I think that will be really useful for the future generations of technologists.”