2017 | Book

Beginning Microsoft Kinect for Windows SDK 2.0

Motion and Depth Sensing for Natural User Interfaces

Written by: Mansib Rahman

Publisher: Apress

Included in: Springer Professional "Wirtschaft+Technik", Springer Professional "Technik", Springer Professional "Wirtschaft"

About this book

Develop applications in Microsoft Kinect 2 using gesture and speech recognition, scanning of objects in 3D, and body tracking. Create motion-sensing applications for entertainment and practical uses, including for commercial products and industrial applications.

Beginning Microsoft Kinect for Windows SDK 2.0 is dense with code and examples to ensure that you understand how to build Kinect applications that can be used in the real world. Techniques and ideas are presented to facilitate incorporation of the Kinect with other technologies.

What You Will Learn

Set up Kinect 2 and a workspace for Kinect application development

Access audio, color, infrared, and skeletal data streams from Kinect

Use gesture and speech recognition

Perform computer vision manipulations on image data streams

Develop Windows Store apps and Unity3D applications with Kinect 2

Take advantage of Kinect Fusion (3D object mapping technology) and Kinect Ripple (Kinect projector infotainment system)

Who This Book Is For

Developers who want to incorporate the simple but powerful Kinect technology into their projects, including amateurs, hobbyists, and professional developers

Table of Contents

Frontmatter

Chapter 1. Getting Started

Abstract

It would be nice if we could just plug the Kinect in, hash out a quick script on Vim, and execute it on a command line, but, alas, seeing as the Kinect for Windows SDK 2.0 is deeply integrated into the Microsoft developer stack and as there are good development tools available for the Kinect, we’ll make a short initial time investment to set up our Kinect properly.

Mansib Rahman

Chapter 2. Understanding How the Kinect Works

Abstract

Although it is conceivable that we can learn to build a house without an education in physics, learn to be a chef without taking a course in food chemistry, and learn about programming without learning how a computer fundamentally relies on transistors and machine code, there is a reason that engineering schools, cooking academies, and computer science programs typically teach the theory that begot the profession before they teach how to actually take part in the profession. We may know to sear steak above a certain temperature for a certain time to hit medium rare, but if we know the Maillard reaction, we are on our way toward knowing how to cook hundreds of foods with perfect browning and flavor, even ones we have never before encountered. Similarly, in this chapter we will cover how the Kinect fundamentally works from an engineering perspective, followed by an overview of its software interface.

Mansib Rahman

Chapter 3. Working with Image Data Sources

Abstract

Above all else, the Kinect is an overpriced camera (half kidding), and it is imperative that we learn how to work with its image data before we can understand how other features like gestures work. For most amateur applications, you will probably want to give the user some visual feedback anyway. In this chapter, we will explore the peculiarities of working with the various image data sources.

Mansib Rahman

Chapter 4. Audio & Speech

Abstract

In the previous chapter, we covered the Kinect’s depth camera. In conjunction with its gesture-recognition capabilities, the depth camera tends to be the Kinect’s most touted feature. The Kinect’s audio and speech abilities, on the other hand, are typically overlooked. This is partly to do with marketability. From a video-gaming perspective, these features just do not help sell the Kinect nearly as much as the more immersive aspects do, such as gestures. The other important reason, though, is that the paradigm for audio input is somewhat misunderstood. We have all witnessed the “Xbox...” commands used to manipulate our video game consoles, but let us be completely honest: it is often easier to rely on the controller. To best realize the potential of the Kinect’s audio capabilities, we have to be sincere with ourselves. Audio is not a replacement for hand input, whether that be with a mouse, an Xbox One controller, or a touchpad. It only takes a discreet 2 mm translation of our thumb to confirm a selection using a gamepad. To do the same with a microphone, we have to clear our throat, talk awkwardly to our device, and wait out the latency of the voice-recognition technology.

Mansib Rahman

Chapter 5. Body & Face Tracking

Abstract

At the heart of a user’s desire to interact with the Kinect is the ability to physically manipulate a digital reality. It is an experience that is nigh on magical for most people.

Mansib Rahman

Chapter 6. Computer Vision & Image Processing

Abstract

Hopefully, by this point some of the Kinect’s magic has worn off and you can see the machine for what it really is: two cameras with varying degrees of sophistication and a laser pointer. Barring the exorbitant cost of fielding a time of flight (ToF) camera, recreating a device that is conceptually similar to the Kinect in your own garage is not impossible. Getting the color, depth, and infrared streams from it could be technically challenging, but they are essentially a solved problem. What sets the Kinect apart from such a hobbyist device, however, other than its precision manufacturing and marginally superior components, is its capability to look at the depth feed and extract bodies and faces from it.

Mansib Rahman

Chapter 7. Game Development with Unity

Abstract

The original use case for the Kinect, and perhaps still the most popular, is game development. It is difficult to have a discussion on game development these days without bringing up Unity. Most readers will already be familiar with Unity. For those who are not acquainted, know that Unity is a cross-platform game engine that targets various APIs. These include the ever-fashionable Direct3D as well as the pretender to the throne, OpenGL. It is not limited to PCs, however. It also supports mobile, Windows Store, VR/AR, websites, and consoles. Unity apps are primarily developed in C# (though there is also JavaScript support), thus Unity does not require too much of a context switch from typical Kinect programming. It is free, is easy to get started with, and, importantly, has third-party support for the Kinect. In this chapter, we will cover the basics of integrating the Kinect with Unity.

Mansib Rahman

Chapter 8. Miscellaneous Tools

Abstract

Just consider the undertaking of detecting gestures. The most rudimentary manner to approach it would involve some form of heuristics; for example, if the hand joint is higher than the head joint, the person has their hand raised. It is not difficult to see how this increases in complexity for even the simplest of gestures. A hand-waving gesture would require some way to track the periodicity of the wave to check whether the person is actually waving, just saying stop, or perhaps just has their hand raised idly. Given that each joint has three position properties, our conditional statements stand to be very long and undesirably nested. The situation worsens dramatically if we start considering velocity and acceleration.
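The heuristic approach described above can be sketched in a few lines. This is only an illustration, not code from the book or the Kinect SDK: the `Joint` class, the function names, and the thresholds (`min_move`, `min_reversals`) are all hypothetical, and the sketch assumes joint positions in meters with the y axis pointing up.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Joint:
    """Hypothetical 3D joint position (assumed: meters, y axis points up)."""
    x: float
    y: float
    z: float


def is_hand_raised(hand: Joint, head: Joint) -> bool:
    """The heuristic from the text: the hand joint is higher than the head joint."""
    return hand.y > head.y


def count_direction_changes(xs: Sequence[float], min_move: float = 0.05) -> int:
    """Count left/right reversals of the hand, ignoring jitter below min_move."""
    if not xs:
        return 0
    changes = 0
    direction = 0      # 0 = not moving yet, 1 = moving right, -1 = moving left
    anchor = xs[0]     # last position that counted as a real movement
    for x in xs[1:]:
        delta = x - anchor
        if abs(delta) < min_move:
            continue   # too small to count; assumed sensor noise
        new_direction = 1 if delta > 0 else -1
        if direction != 0 and new_direction != direction:
            changes += 1
        direction = new_direction
        anchor = x
    return changes


def is_waving(hands: Sequence[Joint], heads: Sequence[Joint],
              min_reversals: int = 2) -> bool:
    """Waving = hand stays raised in every frame AND reverses direction often enough."""
    raised = all(is_hand_raised(hand, head) for hand, head in zip(hands, heads))
    reversals = count_direction_changes([hand.x for hand in hands])
    return raised and reversals >= min_reversals
```

Even this toy version needs per-frame history and two tuning thresholds to separate a wave from an idly raised hand, and it still looks at only one axis of one joint, which hints at how quickly hand-written rules grow.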

Mansib Rahman

Appendix A. Windows 10 & Universal Windows Platform

Abstract

It may not be a surprise to many that the current state of affairs for the Kinect is in flux. While the core APIs and drivers are relatively stable, Microsoft has somewhat dropped the ball on further development. That is not necessarily a bad thing. They have taken their learnings and applied them to the development of newer technologies, such as HoloLens and other mixed-reality headsets. The Kinect for Windows v2 still offers a strong entry point for hobbyists, developers on a budget, and research/commercial developers looking for a proven depth-sensing technology.

Mansib Rahman

Backmatter

Title: Beginning Microsoft Kinect for Windows SDK 2.0
Written by: Mansib Rahman
Publisher: Apress
Electronic ISBN: 978-1-4842-2316-1
Print ISBN: 978-1-4842-2315-4
DOI: https://doi.org/10.1007/978-1-4842-2316-1
