Monday, December 28, 2015

Smart Interaction Beyond the Smartphone

Jennifer Winter, in a blog post for UserTesting, expressed an opinion similar to the one in my last post, Smartphones as Explicit Devices do not meet Weiser's Vision of Ubiquitous Computing. In her post Will AI Replace Your Smartphone? she bemoaned the poor user experience of smartphones, which require their users to pick up the phone and start an application before they can get to what they actually want.

She sees more user-friendliness in interactions that disappear (similar to Weiser's vision). And there is some potential in this, despite the fact that users have become addicted to their smartphones: Jennifer Winter mentions a statement from the Mobile World Congress in Barcelona in March 2015 that 79 percent of smartphone users have their phones within arm's reach for all but three hours of the day.

However, users do see alternatives, as a study by Ericsson's Consumer Lab revealed. The study states: "1 in 2 smartphone users now thinks that smartphones will be a thing of the past, and that this will happen in just 5 years." Users want smart interaction with objects, but it does not need to be mediated by the smartphone. The study reveals even more potential for the use of artificial intelligence in our lives:
  • 85 % think wearable electronic assistants will be common within 5 years
  • 50 % believe they will be able to talk to household appliances, as they do to people
  • ...
More statements can be found in the following figure from that study.
Consumers who think using artificial intelligence (AI) would be a good idea 


It is unfortunate that the study does not differentiate between AI and its interface, and neither does Jennifer Winter. The technology behind what we already have with smartphones today is certainly artificial intelligence, cognitive computing, and the like. The difference lies in the interface, which should (and will) disappear in the next few years.

Monday, December 21, 2015

Smartphones as Explicit Devices do not meet Weiser's Vision of Ubiquitous Computing

Mark Weiser formulated the vision of the invisible computer as follows: "The most profound technologies are those that disappear. They weave themselves into the fabric of everyday life until they are indistinguishable from it." He mainly based his vision on the observation that the cardinality of the human-computer relation was changing, as shown in the following figure.

Human-Computer relation over time (from http://tuprints.ulb.tu-darmstadt.de/5184/)

However, we are not there yet. Currently, I see people holding their smartphones in front of their faces to initiate, e.g., voice queries starting with "OK Google" or similar. So, we are not interacting with the everyday objects we encounter in our daily lives, but use a vision-centric device as our key to them. As a result, screen sizes have kept increasing over the past years.
One of the drawbacks of this vision-centric key to smart spaces is that it is by no means hands- and eyes-free. Google and Apple are continuously improving the voice capabilities of their personal agents, but these still rely on manual interaction and force users to look at the screen. It appears as if we have forgotten about the "invisible" attribute of Weiser's vision, invisible meaning that we do not perceive it as a device. Today, the smartphone is still an explicit device.
One day, in the very near future, we will have to decide whether this is the way to go. Do we want to reuse this one device everywhere: while we are walking the streets, in our cars, ...?

Maybe this also(!) motivates the wish behind Android Auto and Apple CarPlay: to have the car as just another cradle for your smartphone.

Scenarios like those described in http://www.fastcodesign.com/3054733/the-new-story-of-computing-invisible-and-smarter-than-you are still far away. A video demonstrates their Room E.



Prototypes like this already exist in the labs, and maybe it is time for them to leave the labs.

Amazon Echo is maybe a first step in this direction. As a consequence, it became the best-selling item above $100: http://www.theverge.com/2015/12/1/9826168/amazon-echo-fire-black-friday-sales

In contrast to the scenario in the video above, users do not need to speak up. It can be used for voice querying and for controlling devices: http://www.cnet.com/news/amazon-echo-and-alexa-the-most-impressive-new-technology-of-2015/. So, let's see how this one evolves with regard to Weiser's vision. Maybe we will see comparable approaches soon.

Wednesday, December 16, 2015

Nuance opens their NLU to developers

Nuance has just opened their NLU platform to developers as a beta: https://developer.nuance.com/public/index.php?task=mix

It is more than just NLU: it is a full stack, including speech recognition, that can be used in one's own applications, as shown in their promotional video.


Similar to the efforts of the NLU startups residing under .ai, Nuance Mix is able to detect an intent and user-defined entities in entered sentences. The possibility to also employ Nuance's ASR, however, makes it more complete than those efforts. Maybe this has to be seen as an attempt to strengthen Nuance's approach to a virtual assistant, which they call Nina. Nina has been out for a while but has not received much attention so far.
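To give an idea of what such intent and entity detection delivers, here is a minimal sketch. The smart-home intent, the entity names, and the hand-written rules are invented for illustration; they are not taken from the Nuance Mix API, and real platforms learn this mapping from annotated sample sentences rather than from rules.

```python
# Toy illustration of intent and entity detection: map an utterance to an
# intent plus extracted entities. All names and rules are made up for this
# sketch; NLU platforms train such models from annotated example sentences.
import re

def understand(utterance: str) -> dict:
    """Return an intent and the entities found in the utterance."""
    entities = {}
    match = re.search(r"(\d+)\s*degrees", utterance, re.IGNORECASE)
    if match:
        entities["temperature"] = int(match.group(1))
    for room in ("living room", "kitchen", "bedroom"):
        if room in utterance.lower():
            entities["room"] = room
    intent = "SET_TEMPERATURE" if "degrees" in utterance.lower() else "UNKNOWN"
    return {"intent": intent, "entities": entities}

print(understand("Set the living room to 21 degrees"))
# {'intent': 'SET_TEMPERATURE', 'entities': {'temperature': 21, 'room': 'living room'}}
```
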
The market for virtual assistants is already somewhat populated. Google Now and Apple's Siri are well known and established. Others, like Microsoft's Cortana, are also trying to gain traction. Recently, Microsoft opened Project Oxford as a cloud-based tool for the creation of smart (voice-centric) applications. A comparable, but maybe more advanced, offer is IBM Watson, which has been available for some time. Another one is Amazon Echo, whose platform has also been opened to developers.

It appears that spoken language technology is mature enough to be really useful. This is good news for developers who want to play around with voice interaction to control applications in the internet of things. Currently, there is a plethora of SDKs available that can be used for free. The question is not whether we will see more spoken interaction with everyday things in our lives, but who will win the race for a sufficient number of users and their data. Maybe Nuance is already too late with Nuance Mix to enter that market. Maybe they can step in nevertheless, relying on their long-standing dominance in speech recognition.

Friday, December 11, 2015

NLU vs Dialog Management

Recently, I stumbled across a blog post from api.ai announcing that their system now supports slot filling: https://api.ai/blog/2015/11/09/SlotFilling/. Note that my goal is not to blame their system.

Currently, I observe that efforts towards spoken interaction coming from cognitive computing are still not fully aware of what has been done in dialog management research over the past decades, and vice versa. Both parties come from different centers in the processing chain of spoken dialog systems.
While the AI community usually focuses on natural language understanding (linguistic analysis), the spoken dialog community regards the dialog manager as the central point in this chain.
Both have good reasons for their attitude and are able to deliver convincing results.

Cognitive computing sees the central point in the semantics, which should also be grounded in previous utterances or external content. Speech input and output are, in this view, reduced to mere input to and output from the system. Dialog management can be really dumb in this case. The resulting user interfaces are currently more or less query-based, as in the sketch below.
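As a minimal sketch of this query-centric view, each utterance is interpreted and answered on its own, without any memory of previous turns. The tiny knowledge base and the parsing are invented for illustration.

```python
# Sketch of the query-centric (NLU-first) view: every utterance is answered
# in isolation, with no dialog state. Knowledge base and parsing are made up.
KNOWLEDGE = {"capital of france": "Paris", "capital of spain": "Madrid"}

def answer(utterance: str) -> str:
    """One-shot query answering: no memory of previous turns."""
    key = utterance.lower().strip("?. ")
    key = key.replace("what is the ", "")
    return KNOWLEDGE.get(key, "Sorry, I do not know.")

print(answer("What is the capital of France?"))  # -> "Paris"
print(answer("And of Spain?"))                   # -> "Sorry, I do not know." (no grounding in context)
```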

The dialog-manager-focused view regards the NLU as mere input to the system, while decisions about the subsequent interaction are handled in this component. The resulting user interfaces range from rigid state-based approaches over the information state update approach up to statistically motivated dialog managers like POMDPs. A sketch of the simplest, state-based variant follows below.
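For contrast, here is a minimal sketch of the dialog-manager-centric view in its simplest, state-based form: a slot-filling loop that decides what to ask next. The flight-booking domain, slot names, and prompts are invented for illustration; information state update or POMDP-based managers replace this hand-written policy with richer state and learned decisions.

```python
# Minimal slot-filling dialog manager sketch. The "book a flight" task and its
# slots are invented for illustration; a real dialog manager would add
# confirmation, grounding against context, and error handling.
REQUIRED_SLOTS = {"origin": "Where do you want to fly from?",
                  "destination": "Where do you want to fly to?",
                  "date": "On which date?"}

def next_action(filled_slots: dict) -> str:
    """Decide the next system move: ask for a missing slot or finish."""
    for slot, prompt in REQUIRED_SLOTS.items():
        if slot not in filled_slots:
            return prompt                       # ask for the first missing slot
    return f"Booking a flight: {filled_slots}"  # all slots filled -> act

# Example turn sequence: the NLU delivers slot values, the dialog manager
# decides what to ask next.
state = {}
print(next_action(state))                       # -> "Where do you want to fly from?"
state["origin"] = "Frankfurt"
print(next_action(state))                       # -> "Where do you want to fly to?"
state.update({"destination": "Barcelona", "date": "2015-12-24"})
print(next_action(state))                       # -> "Booking a flight: {...}"
```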

My hope is that both communities start talking to each other to better incorporate the convincing results of "the other component" and arrive at a convincing user experience.



Saturday, December 5, 2015

Microsoft researchers expect human-like capabilities of spoken systems in a few years

Researchers at Microsoft believe that we are only a few years away from machines understanding spoken language as well as humans do. Although many advances have been made in the past years, there are still many challenges to be solved. This is especially true for distant speech recognition, which we need to cope with in daily situations. Maybe their statement is still a bit too optimistic. However, as systems are already available and people are starting to use them, they are right in their assumption that these systems will make progress. We just have to make sure that voice-based assistants like Cortana are used at all. Currently, some of these systems seem to be more of a gimmick to play with until users become bored of them. Hence, they are actually doomed to improve fast in order to also be helpful.