When the black box talks to us
Thanks to increasingly sophisticated voice control, communication between man and machine is making great strides. Will the dialog with intelligent systems change our society?
Voice command devices are on the rise, and the little boxes can already be found in many households. But because they answer with ready-made, programmed text modules, dialog with the smart devices is still rather clumsy. Experts, however, predict that this type of communication will be part of our everyday lives within just a few years – not only at home, but also in the car or at work. Even now, though, more and more people are asking critical questions: How can we ensure that the contents of recorded conversations are not misused? Or: How can we prevent children and young people from revealing too many personal details about themselves?
For the social psychologist Nicole Krämer from the University of Duisburg-Essen, there is also a much more fundamental question: How is daily contact with intelligent machines likely to change communication in our society? Over the next four years, she will be trying to answer this question as part of an interdisciplinary research project – together with colleagues from the fields of ethics, computer science and law. "When talking to people, we have at least an idea of how the other person thinks," she says, "but when we communicate with a technical device, we are talking to a black box for which we have no such mental conception. We don't know how their answers or decisions actually come about."
The project is also expected to provide insights for the further development of voice command devices. The computer scientists in the team are working on self-learning devices that, instead of providing standard answers, gradually get to know the user and can therefore respond to his or her preferences and needs. For this purpose, the project will use machine learning methods that are able to recognize patterns and regularities in spoken language. The legal experts in the project focus on ensuring from the outset that the technologies meet all requirements, for example with regard to data protection.
Research design and test environment
For computer scientist Professor Barbara Hammer from the University of Bielefeld and her 'Machine Learning Group', this project is special because it is so wide-ranging: not only because of the interdisciplinary approach, but also because it focuses on three different user groups, namely children, middle-aged adults and retired people. "This provides us with a unique test environment in which we can comprehensively explore new technologies before they later become more widely used," says Barbara Hammer.
The project participants are jointly concerned with three main aspects: the type of communication, the relationship between humans and machines, and the question of how transparent a machine is, in other words how or why it gives a certain answer. A different focus is set depending on the age group and user. "Children quickly build up a relationship with things: take cuddly toys, for instance," explains Prof. Dr. Nicole Krämer. "In this part of the project, our main focus is on what type of relationship is formed." The researchers will investigate how communication between child and machine takes place, and whether the child's behavior toward the machine is also reflected in their interaction with people. "In the US, for example, it has been found that children who are used to giving commands to voice command devices forget to say 'please' and 'thank you'," says Nicole Krämer. Since many parents today already put a smart speaker in their children's rooms, the question arises as to what effect this has on the children.
For one of their studies, the researchers want to observe the behavior of 20 children over several months – in the course of home visits, but also by recording their communication on camera. To ensure that the children actually communicate with the devices often enough, the researchers are developing incentives to integrate the devices into the children's everyday lives – for example by having the children call up interesting audiobooks.
Barbara Hammer also wants to use this opportunity to teach the computer to learn in real time. The system should not only understand the preferences of the young users, but also be able to analyze why they sometimes reject an offer. "This is very difficult for a technical system to cope with," says Barbara Hammer. "Children tend to say just what they think: for example, they might simply say something is 'daft'." In such a case, an intelligent system must be able to recognize why the child rejects something: whether the child finds radio plays "daft" in general, or just finds a certain episode too boring. "You can solve this by letting a system ask intelligent questions," says Barbara Hammer, "in this case using child-friendly language that is easy to understand."
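The idea of asking a follow-up question can be illustrated with a small sketch. It is not the project's system, only a minimal example under stated assumptions: the names `Preferences`, `clarifying_question` and `record_rejection`, and the two fixed answers "all" and "this one", are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Preferences:
    """Hypothetical store of what a child has rejected so far."""
    disliked_genres: set = field(default_factory=set)
    disliked_items: set = field(default_factory=set)


def clarifying_question(genre: str) -> str:
    """Build a simple follow-up question in child-friendly wording."""
    return (f"Do you not like {genre} at all, "
            f"or was it just this one that was boring?")


def record_rejection(prefs: Preferences, genre: str, item: str,
                     answer: str) -> Preferences:
    """Update the preferences from the answer to the follow-up question.

    Assumption: the answer has already been reduced to 'all' or 'this one'.
    Any other answer leaves the preferences unchanged.
    """
    text = answer.strip().lower()
    if text == "all":
        prefs.disliked_genres.add(genre)   # the whole genre is rejected
    elif text == "this one":
        prefs.disliked_items.add(item)     # only this episode is rejected
    return prefs


prefs = Preferences()
print(clarifying_question("radio plays"))
record_rejection(prefs, "radio plays", "episode 3", "this one")
```

In this sketch a rejection of one episode leaves the genre as a whole available for later suggestions, which is the distinction the paragraph above describes.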
Adopting the machine's language
The question of how to develop a 'good' relationship between man and machine also arises in the case of senior citizens – especially with people who are alone a lot and only seldom have the opportunity to talk to others. It is well known that people adapt to the person they are talking to in the course of a conversation – adopting their accent, choice of words or speed of speech. Experts call this type of adaptation 'alignment'. "We assume that in intensive communication with the voice command device, an alignment occurs in a relatively short period of time, and that the user adapts by adopting a command-style language or answering in short and simple sentences," says Nicole Krämer. What happens in the long term, however, is still unclear.
The project will initially use voice command devices. Later on, virtual 'agents' – figures on a screen – will speak to the users. This is the focus of the research done by Prof. Dr.-Ing. Stefan Kopp from the University of Bielefeld. His 'Sociable Agents Group' was involved in developing the avatar 'Billie' for the Federal Ministry of Education and Research, and Billie is also to be used in this project. Until now, such agents have responded according to rigid decision trees, in which ready-made answers are given according to a yes-no scheme. In the current project, the avatar is to become more flexible by getting to know the needs of the users. In the future, it could give individual tips on how to spend one's leisure time, remind users of appointments or suggest medicines.
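To make the contrast concrete, a rigid yes-no decision tree of the kind described above can be sketched in a few lines. This is an illustration only, not how 'Billie' is implemented: the `Node` class, the `run_dialog` function and the example questions are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Node:
    """A question (inner node) or a ready-made answer (leaf)."""
    text: str
    yes: Optional["Node"] = None
    no: Optional["Node"] = None


def run_dialog(root: Node, answers: Sequence[bool]) -> str:
    """Follow the user's yes/no answers down the tree.

    Returns the text of the node where the walk stops. The same answers
    always lead to the same text, whoever the user is.
    """
    node = root
    for said_yes in answers:
        nxt = node.yes if said_yes else node.no
        if nxt is None:        # reached a leaf: nothing further to ask
            break
        node = nxt
    return node.text


# Hypothetical example tree for leisure-time tips.
tree = Node(
    "Would you like a suggestion for this afternoon?",
    yes=Node(
        "Do you want to go outside?",
        yes=Node("How about a walk in the park?"),
        no=Node("How about listening to an audiobook?"),
    ),
    no=Node("All right, just ask me later."),
)

print(run_dialog(tree, [True, False]))
```

Because every path through such a tree is fixed in advance, the agent cannot take individual habits into account, which is the limitation the project aims to overcome.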
Information on collected data
The third main aspect, transparency, is examined in the project by means of a health app for adults. The app will support users in the treatment of back problems, give tips and monitor the training process. At the same time, it will have an interface through which the stored data can be presented. This is crucial, because nowadays things are usually done differently – with websites collecting vast amounts of data about users without them even noticing or knowing how it is processed.
Among other things, the app is designed to determine what type of data the user is interested in and how best to present it. To this end, functions will be implemented that explain and display the data in an easily understandable way. Different versions will be tested in order to find out precisely how detailed and how transparent this presentation should be. This is a great challenge, since people's needs and preferences differ widely in this respect. "We know from preliminary studies that some users prefer not to know what is stored about them," says Nicole Krämer. Others expect more detailed insights into the data. The app must therefore be designed in such a way that it can adapt to the user's needs. Barbara Hammer adds: "We want to move away from the 'black box' to a system that takes people with it, where they feel comfortable using artificial intelligence, where they see added value. And above all, where they retain control over their data."
For this reason, the project will also involve laypersons in the further development of the research questions. They will be recruited via newspaper advertisements, and in an ethics workshop they can express their wishes and fears regarding future voice command devices and dialog systems. "We hope this will open up new perspectives that we as experts might not even be aware of," says Nicole Krämer. The workshop participants are also to develop questionnaires for interviews with the study participants and evaluate them together with the project team – so that in the end the foundations are actually laid for voice command devices that are not only eloquent and reliable, but are genuinely accepted by people as real support.