2017 | Original Paper | Book Chapter

Is Spoken Language All-or-Nothing? Implications for Future Speech-Based Human-Machine Interaction

Written by: Roger K. Moore

Published in: Dialogues with Social Robots

Publisher: Springer Singapore

Abstract

Recent years have seen significant market penetration for voice-based personal assistants such as Apple’s Siri. However, despite this success, user take-up is frustratingly low. This article argues that there is a habitability gap caused by the inevitable mismatch between the capabilities and expectations of human users and the features and benefits provided by contemporary technology. Suggestions are made as to how such problems might be mitigated, but a more worrisome question emerges: “is spoken language all-or-nothing”? The answer, based on contemporary views on the special nature of (spoken) language, is that there may indeed be a fundamental limit to the interaction that can take place between mismatched interlocutors (such as humans and machines). However, it is concluded that interactions between native and non-native speakers, or between adults and children, or even between humans and dogs, might provide critical inspiration for the design of future speech-based human-machine interaction.

Footnotes
1
See [1] for a comprehensive review of the history of speech technology R&D up to, and including, the release of Siri.
 
2
It is often argued that such an approach is unimportant as users will habituate. However, habituation only occurs after sustained exposure, and a key issue here is how to increase the effectiveness of first encounters (since that has a direct impact on the likelihood of further usage).
 
3
Interestingly, these ideas do appear to be having some impact on the design of contemporary autonomous social agents such as Jibo (which has a childlike and mildly robotic voice) [28].
 
4
Members of the same species.
 
5
Interestingly, Nass and Brave [8] noted that people speak to poor automatic speech recognition systems as if they were non-native listeners.
 
6
Unfortunately, this term has already been coined to refer to a robot’s natural language abilities in robot-robot and robot-human communication [54].
 
References
1.
Pieraccini, R.: The Voice in the Machine. MIT Press, Cambridge (2012)
2.
Liao, S.-H.: Awareness and Usage of Speech Technology. Masters thesis, Dept. Computer Science, University of Sheffield (2015)
3.
Deng, L., Huang, X.: Challenges in adopting speech recognition. Commun. ACM 47(1), 69–75 (2004)
4.
Minker, W., Pittermann, J., Pittermann, A., Strauß, P.-M., Bühler, D.: Challenges in speech-based human-computer interfaces. Int. J. Speech Technol. 10(2–3), 109–119 (2007)
5.
Gales, M., Young, S.J.: The application of hidden Markov models in speech recognition. Found. Trends Signal Process. 1(3), 195–304 (2007)
6.
Hinton, G., Deng, L., Yu, D., Dahl, G.E., Mohamed, A., Jaitly, N., Senior, A., Vanhoucke, V., Nguyen, P., Sainath, T.N., Kingsbury, B.: Deep neural networks for acoustic modeling in speech recognition: the shared views of four research groups. IEEE Signal Process. Mag. (2012)
7.
Moore, R.K.: Modelling data entry rates for ASR and alternative input methods. In: Proceedings of the INTERSPEECH-ICSLP, Jeju, Korea (2004)
8.
Nass, C., Brave, S.: Wired for Speech: How Voice Activates and Advances the Human-computer Relationship. MIT Press, Cambridge (2005)
9.
Moore, R.K.: From talking and listening robots to intelligent communicative machines. In: Markowitz, J. (ed.) Robots That Talk and Listen, pp. 317–335. De Gruyter, Boston (2015)
10.
Bernsen, N.O., Dybkjaer, H., Dybkjaer, L.: Designing Interactive Speech Systems: From First Ideas to User Testing. Springer, London (1998)
11.
McTear, M.F.: Spoken Dialogue Technology: Towards the Conversational User Interface. Springer, London (2004)
12.
Lopez Cozar Delgado, R.: Spoken, Multilingual and Multimodal Dialogue Systems: Development and Assessment. Wiley (2005)
13.
Philips, M.: Applications of spoken language technology and systems. In: Gilbert, M., Ney, H. (eds.) IEEE/ACL Workshop on Spoken Language Technology (SLT) (2006)
14.
Tomko, S., Harris, T.K., Toth, A., Sanders, J., Rudnicky, A., Rosenfeld, R.: Towards efficient human machine speech communication. ACM Trans. Speech Lang. Process. 2(1), 1–27 (2005)
15.
Tomko, S.L.: Improving User Interaction with Spoken Dialog Systems via Shaping. Ph.D. Thesis, Carnegie Mellon University (2006)
16.
Komatani, K., Fukubayashi, Y., Ogata, T., Okuno, H.G.: Introducing utterance verification in spoken dialogue system to improve dynamic Help generation for novice users. In: Proceedings of the 8th SIGdial Workshop on Discourse and Dialogue, pp. 202–205 (2007)
17.
Schlangen, D., Skantze, G.: A general, abstract model of incremental dialogue processing. In: Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics (EACL-09), Athens, Greece (2009)
18.
Hastie, H., Lemon, O., Dethlefs, N.: Incremental spoken dialogue systems: tools and data. In: Proceedings of NAACL-HLT Workshop on Future Directions and Needs in the Spoken Dialog Community, pp. 15–16, Montreal, Canada (2012)
19.
Williams, J.D., Young, S.J.: Partially observable Markov decision processes for spoken dialog systems. Comput. Speech Lang. 21(2), 231–422 (2007)
20.
Gašić, M., Breslin, C., Henderson, M., Kim, D., Szummer, M., Thomson, B., Tsiakoulis, P., Young, S.J.: POMDP-based dialogue manager adaptation to extended domains. In: Proceedings of 14th SIGdial Meeting on Discourse and Dialogue, pp. 214–222, Metz, France (2013)
21.
Mori, M.: Bukimi no tani (the uncanny valley). Energy 7, 33–35 (1970)
22.
Moore, R.K.: A Bayesian explanation of the “Uncanny Valley” effect and related psychological phenomena. Nat. Sci. Rep. 2(864) (2012)
23.
Moore, R.K., Maier, V.: Visual, vocal and behavioural affordances: some effects of consistency. In: Proceedings of the 5th International Conference on Cognitive Systems (CogSys 2012), Vienna (2012)
24.
Gibson, J.J.: The theory of affordances. In: Shaw, R., Bransford, J. (eds.) Perceiving, Acting, and Knowing: Toward an Ecological Psychology, pp. 67–82. Lawrence Erlbaum, Hillsdale (1977)
25.
Worgan, S., Moore, R.K.: Speech as the perception of affordances. Ecolog. Psychol. 22(4), 327–343 (2010)
26.
Balentine, B.: It’s Better to Be a Good Machine Than a Bad Person: Speech Recognition and Other Exotic User Interfaces at the Twilight of the Jetsonian Age. ICMI Press, Annapolis (2007)
27.
Moore, R.K., Morris, A.: Experiences collecting genuine spoken enquiries using WOZ techniques. In: Proceedings of the 5th DARPA Workshop on Speech and Natural Language, New York (1992)
29.
Jokinen, K., Hurtig, T.: User expectations and real experience on a multimodal interactive system. In: Proceedings of the INTERSPEECH-ICSLP Ninth International Conference on Spoken Language Processing, Pittsburgh, PA, USA (2006)
30.
Gardiner, A.H.: The Theory of Speech and Language. Oxford University Press, Oxford (1932)
31.
Bickerton, D.: Language and Human Behavior. University of Washington Press, Seattle (1995)
32.
Hauser, M.D.: The Evolution of Communication. The MIT Press (1997)
33.
Hauser, M.D., Chomsky, N., Fitch, W.T.: The faculty of language: what is it, who has it, and how did it evolve? Science 298, 1569–1579 (2002)
34.
Everett, D.: Language: The Cultural Tool. Profile Books, London (2012)
35.
36.
Maturana, H.R., Varela, F.J.: The Tree of Knowledge: The Biological Roots of Human Understanding. New Science Library/Shambhala Publications, Boston (1987)
37.
Cummins, F.: Voice, (inter-)subjectivity, and real time recurrent interaction. Front. Psychol. 5, 760 (2014)
38.
Bickhard, M.H.: Language as an interaction system. New Ideas Psychol. 25(2), 171–187 (2007)
39.
Cowley, S.J. (ed.): Distributed Language. John Benjamins Publishing Company (2011)
40.
Fusaroli, R., Raczaszek-Leonardi, J., Tylén, K.: Dialog as interpersonal synergy. New Ideas Psychol. 32, 147–157 (2014)
41.
Scott-Phillips, T.: Speaking Our Minds: Why Human Communication Is Different, and How Language Evolved to Make It Special. Palgrave MacMillan (2015)
42.
Baron-Cohen, S.: Evolution of a theory of mind? In: Corballis, M., Lea, S. (eds.) The Descent of Mind: Psychological Perspectives on Hominid Evolution. Oxford University Press (1999)
43.
Malle, B.F.: The relation between language and theory of mind in development and evolution. In: Givón, T., Malle, B.F. (eds.) The Evolution of Language out of Pre-Language, pp. 265–284. Benjamins, Amsterdam (2002)
44.
Lakoff, G., Johnson, M.: Metaphors We Live By. University of Chicago Press, Chicago (1980)
45.
Feldman, J.A.: From Molecules to Metaphor: A Neural Theory of Language. Bradford Books (2008)
46.
Levinson, S.C.: Pragmatics. Cambridge University Press, Cambridge (1983)
47.
Friston, K., Kiebel, S.: Predictive coding under the free-energy principle. Phil. Trans. R. Soc. B 364(1521), 1211–1221 (2009)
48.
Rizzolatti, G., Craighero, L.: The mirror-neuron system. Annu. Rev. Neurosci. 27, 169–192 (2004)
49.
Wilson, M., Knoblich, G.: The case for motor involvement in perceiving conspecifics. Psychol. Bull. 131(3), 460–473 (2005)
50.
Pickering, M.J., Garrod, S.: Do people use language production to make predictions during comprehension? Trends Cogn. Sci. 11(3), 105–110 (2007)
51.
Garrod, S., Gambi, C., Pickering, M.J.: Prediction at all levels: forward model predictions can enhance comprehension. Lang. Cogn. Neurosci. 29(1), 46–48 (2013)
52.
Moore, R.K.: Introducing a pictographic language for envisioning a rich variety of enactive systems with different degrees of complexity. Int. J. Adv. Robot. Syst. 13(74) (2016)
53.
Fernald, A.: Four-month-old infants prefer to listen to Motherese. Infant Behav. Dev. 8, 181–195 (1985)
54.
Matson, E.T., Taylor, J., Raskin, V., Min, B.-C., Wilson, E.C.: A natural language exchange model for enabling human, agent, robot and machine interaction. In: Proceedings of the 5th International Conference on Automation, Robotics and Applications, pp. 340–345. IEEE (2011)
55.
Serpell, J.: The Domestic Dog: Its Evolution, Behaviour and Interactions with People. Cambridge University Press (1995)
Metadata
Title
Is Spoken Language All-or-Nothing? Implications for Future Speech-Based Human-Machine Interaction
Written by
Roger K. Moore
Copyright Year
2017
Publisher
Springer Singapore
DOI
https://doi.org/10.1007/978-981-10-2585-3_22