The Xbox Kinect and Paranormal Investigation

Kenny Biddle

Ghost hunters absolutely love their gadgets. Simple observations show that technology attracts ghost hunters like moths to the blue glow of a bug zapper. It’s the “sciencey” aspect of it; gadgets make it look like one is being all scientific, with blinking lights and noises that reach excited peaks. Sadly, the majority of gadgets that are being used by ghost hunting teams are about as useful as a car muffler in the shower. In other words, they are completely useless. But that doesn’t stop groups from adapting any and every piece of cool tech to the cause of “proving” ghosts float among us.

In the past couple of years, I’ve noticed a new addition to the ghost hunting arsenal—the Xbox Kinect. Paranormal teams were plugging them into laptop computers and claiming the devices were able to track spirit entities that happened to appear in a room that was being monitored. As of this writing, portable versions of the device—basically a Kinect, a power source, and a tablet on a handle—have become extremely popular with paranormalists due to the freedom they provide and their various appearances on popular TV shows.

What is the reason behind this? Why do paranormalists get excited about this device? Well, it seems to have followed in the “footsteps” of the Ouija Board by gaining a reputation based on a movie. In October of 2012, the film Paranormal Activity 4 was released and gave the ghost hunting community the idea that the Kinect could “see” ghosts, spirits, demons, etc. Just as with other paranormal gadgets, the idea was based on an unsupported belief (and fiction); the Kinect uses infrared light (paranormalists believe ghosts can be seen with this), and it produces dots much like the laser grid (also known as a “Shadow Detector”). To paranormalists, it was a natural addition to their pile of Pelican Cases.


The infrared speckle pattern that is spread out onto the playing area.

However, the paranormalists using the Kinect were not getting the creepy alien-like figures portrayed in Paranormal Activity; instead they were getting random “skeletons.” More like stick figures, these “skeletons” are how the Kinect identifies what it thinks is a player (human) entering the playing area. When paranormalists began seeing these skeletons appear in empty rooms—in what they believed were haunted houses—they immediately came to the conclusion that they were capturing ghosts. This is an irrational conclusion that I have observed dozens of times and have been told about (by paranormalists) dozens more. Why don’t these paranormal groups ever test their equipment in “non-haunted” locations? False positives can happen anywhere with these pieces of equipment, and honestly, false positives are what drive the para-gadget industry.

So, are they really “capturing” ghosts with this device? To answer that, we must first understand how the Kinect works. This is something that is most often overlooked by paranormalists in their over-eager race to use the “latest technology” and land the next annoying paranormal TV show. We’re not going to rush; we’re going to learn a few things—knowledge is key.

The Xbox Kinect (for Xbox 360) was launched on November 4, 2010 (in North America), to the delight of video game geeks everywhere. It allowed players to put down the controller, get their butts off the couch, and reach a level of immersion within their games like never before. The Kinect packed a lot of technology into a small, cheap package.

The Kinect comes in a sleek black bar, measuring 2.8” x 11” x 2.8” and weighing in at 1.3 pounds. Housed within the bar are an RGB (red, green, blue) camera used for video chatting, which has a standard resolution of 640×480 at 30 frames per second (Wilson 2010; see note 1), a monochrome infrared depth sensor, an infrared light emitter, and four microphones. It has a power/USB cable that allows connection to the Xbox game console or any computer with a USB port.


The Kinect plugged into my laptop. It has assigned two distorted stick figures to my entertainment center and archway yet has ignored me standing there.

The process behind the motion sensing ability of the Kinect is a complicated one, but I’ll do my best to break it down. There are two main components that make this happen—software (casually referred to as “the brain”) and hardware. Working together, they produce some pretty cool results.

The Kinect has an infrared (IR) emitter that projects a speckle (as in dots) pattern of IR light into the playing area (room), which can be seen by IR cameras. This speckle pattern is not random; the software knows where each dot is within the pattern. An infrared depth sensor containing a CMOS sensor takes a look at the speckles reflected off of the various objects in the room and measures something called “time of flight”—the time it takes for the IR light to travel from the emitter, reflect off an object, and return to the depth sensor. This gives the Kinect the ability to know where an object is in space (Cong and Winters 2017).
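As a rough illustration of the time-of-flight idea described above, the distance to a surface is half the round-trip travel time of the light multiplied by the speed of light. The function name and the sample distance below are illustrative only and are not taken from any Kinect software.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # exact, by definition of the metre


def distance_from_round_trip(seconds: float) -> float:
    """Distance (metres) to a reflecting surface, given the round-trip time of light."""
    # The light travels out and back, so only half the path is the distance.
    return SPEED_OF_LIGHT_M_PER_S * seconds / 2


# Round-trip time for a surface 2 m away (an arbitrary example distance).
round_trip = 2 * 2.0 / SPEED_OF_LIGHT_M_PER_S
print(f"{round_trip * 1e9:.1f} ns -> {distance_from_round_trip(round_trip):.2f} m")
```

The times involved are tiny: light covers a two-meter round trip in about 13 nanoseconds, so any device working this way has to resolve extremely small timing differences.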

In addition, there is more information encoded within the infrared light of the speckle pattern. When the light returns to the depth sensor, some of it is deformed by the various angles of the objects within a room, including the players. When the angles change due to a player moving forward, backward, right, left, up, or down, the speckle angle of reflection is changed as well. The technology and software of the Kinect can use these subtle or drastic deformations of light to calculate an object’s depth to within 1 cm and its height to within 3 mm (Carmody 2010). This is how the Kinect builds its 3D view of the room and tracks a player’s movements within it.

Both processes—reading the time of flight and reading the angle deformation—are done thirty times a second while tracking forty-eight different points of the human body, such as the head, torso, arms, elbows, hips, knees, and feet (Cong and Winters 2017).

So how does it see people and distinguish them from other objects? Well, that took a ton of work and started with millions of images of everyday people. Developers went into homes all over the world (seriously, they did) and gathered images that represented people of different heights, weights, genders, body types, clothing styles, hair styles, ages, and more. And they had to get images of each of these types in various poses in order for the Kinect to recognize “real-life” scenarios, i.e., how people really move rather than how programmers think they move. Programmers then had to go into each frame and label every body part (Duffy 2010; Crawford 2010).

In addition, professional motion-capture video was also collected and incorporated into the required data (Duffy 2010). All of this information—literally millions of images and seconds of motion-capture video—was fed into a machine-learning algorithm developed by Jamie Shotton, a researcher at Microsoft Research Cambridge in England. The data are mapped to the various types of the human form and “teach” the system how to recognize joints and skeletal movements.


A full stick figure is assigned to my lower half, while another stick figure has been assigned to my entertainment center.

As you, the player, stand in front of the Kinect, the speckle pattern reflecting off of you allows the Kinect to calculate the distances to different parts of your body. This creates what is called a “point cloud,” which looks much like those pin frame toys that we push our hand or face into to create a 3D image from the metal pins. Based on the millions of reference images and video it has, the “brain” (the software) guesses which parts of your body are the arms, legs, head, etc. That’s right, I said “guesses.” Since it is comparing your figure and pose with references, accuracy depends on how close the two match.
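The “point cloud” idea can be sketched with a simple pinhole camera model: each pixel of a depth image, together with its measured distance, becomes one point in 3D space. This is only a sketch; the focal length, image center, and the tiny 3×3 depth grid below are made-up values for illustration, not the Kinect’s real calibration or output.

```python
def depth_image_to_point_cloud(depth, focal_px, center_x, center_y):
    """Turn a 2D grid of depths (metres) into a list of (x, y, z) points.

    Pixels with a depth of 0 are treated as "no reading" and skipped.
    """
    points = []
    for row, line in enumerate(depth):
        for col, z in enumerate(line):
            if z <= 0:
                continue
            # Pinhole model: the offset from the image centre scales with depth.
            x = (col - center_x) * z / focal_px
            y = (row - center_y) * z / focal_px
            points.append((x, y, z))
    return points


# A near object (1.5 m) in front of a wall (3.0 m); one pixel has no reading.
depth = [
    [3.0, 3.0, 3.0],
    [3.0, 1.5, 3.0],
    [0.0, 1.5, 3.0],
]
cloud = depth_image_to_point_cloud(depth, focal_px=2.0, center_x=1, center_y=1)
print(len(cloud), cloud[4])  # number of points, and the point for the centre pixel
```

Notice that nothing in this step knows what a person is. The cloud is just a set of distances, and the decision about which points belong to a “player” comes later, from the software’s comparison with its reference material.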

The software then applies a wire “skeleton” to you (the player), based on reference images and video, probabilities assigned to various areas of the body, formal kinematics (mechanics concerned with the motion of objects without reference to the forces that cause the motion), and guessing. Yes, the software can make assumptions on where it thinks a hidden body part (say, an arm that is swung behind the player) is or the path of travel an appendage might take (Crawford 2010).  

The result of this is a wire frame that, in the game system, eventually gets layered with skin, clothes, armor, weapons, etc., and displayed in your favorite game in glorious HD. Thirty times every second the Kinect performs this process, frame-by-frame, sending the most likely skeletal structure to the gaming system. This all takes just a few milliseconds to accomplish. It truly is some amazing technology.

OK, so that’s the breakdown of how the Xbox Kinect works. How does this apply to the mysterious and often freaky-looking stick figures that ghost hunters are capturing?

The first mistake is the most common one that I see ghost hunters make: they simply don’t test the equipment they put so much faith in. They see an anomaly that they cannot explain, and rather than investigating what might be the cause, they settle on the conclusion that it must be paranormal. This is, has been, and always will be bullshit.

The most common claim I see is that ghost hunters are seeing stick figures suddenly appear on the screen. These figures seem to interact with a living person that moves toward them, perhaps extending a hand to “touch” the figure, which seems to react—either by moving away or seeming to be reaching out. The final conclusion is usually that these figures are ghosts.

My first objective was to see this for myself and not just on YouTube videos. I wanted to go out to a location with a group that was using one of these things and see what was going on. Luckily, as I was working on this project, a friend (and ghost hunter) contacted me about this topic. Matt Haas invited me out with his group in order to observe the Kinect in use under the conditions that most ghost hunters were using. So, in the basement of an old mansion, we set up the Kinect and pointed it toward a room that had some shelves, an old sink, and some packed trash bags. One of the bags had a stick figure on it. We sat and watched it for a little over ten minutes doing nothing more than wiggling its “arms” and “legs” until we moved the trash bag. Poof—it disappeared. We got it to come back by moving the trash bag around until the stick figure popped back on screen. We also turned the Kinect to face the stucco wall two feet in front of it. Guess what? We got a stick figure there too.

Over the course of several weeks I watched (so you wouldn’t have to) over 200 videos that claimed to have caught a ghost using the Kinect (see note 2). I found several common factors throughout all of them.

  1. The stick figures usually resemble something out of a horror/monster/alien movie with appendages coming from weird angles. Many seem to have extra joints and severely deformed limbs that are either too short or elongated (think of the “open arms” alien from Close Encounters of the Third Kind).
  2. The stick figures remain locked in one place. They move the “limbs” about like a dancing queen, but the body does not move about the scene; it is locked in one spot. In a few cases, the “torso” stick would shift slightly to one side, but then return locked to a specific item (see next point). I have yet to see one of these stick figure ghosts casually walk across the scene as the living people do.
  3. There is an object that the stick figure is locked to. The most common culprit is a couch or sofa. Other objects include (but are not limited to) chairs, jackets hung over chairs, potted plants, water heaters, camera bags, vacuum cleaners, tripods, pillows, pedestal fans, curtains, and even part of an actual person (when the entire body of the person cannot be seen). I’ve posted several of my own videos showing stick figures on a vacuum cleaner, a jacket, and a camera bag.
  4. There were a few times when a stick figure would disappear and another would appear in a different location in the scene. The reaction of the ghost hunters was that it was the same entity and it simply moved, blinking out and in style. This most often occurred when another player (investigator) walked into the scene. Other times (not as often) it was due to the Kinect locking onto another object and assigning a stick figure to it.

All of these can be traced back to a common source: user error. It is not a glitch in the system, because the system is working as it was designed to work. These false positives are due to ghost hunters ignoring how the device was intended to be used and ignoring the recommendations set forth by the manufacturer that would seriously lower the likelihood of a stick figure being assigned to an inanimate object. Let’s look at a more in-depth explanation of this.

The Kinect was designed to be placed above or below a television “between 2 feet (0.6 m) and 6 feet (1.8 m) from the floor” for the Xbox 360, according to Microsoft. The Xbox One guidance gives the same measurements but includes the note “the higher the better.” It was also designed to be placed on a stable, stationary surface, not hand-held as we see in portable versions used on paranormal shows such as Ghost Adventures. But wait, isn’t the Kinect used for applications other than games? Yes, from robots using it for navigation (Tanz 2011) to a scientist using it to make a 3D scan of a cave floor (Mann 2011). But in these cases, the software has been adapted for specific uses by scientists, engineers, and programmers who are detecting or mapping things that are known to exist (rock, metals, land masses, etc.), all of which have known qualities. In the hobby of ghost hunting, paranormalists are using the same software the gamers are using, which is designed to detect players that come into the play area.


Tape measure showing a distance of ten feet from the Kinect. This is the distance it was still picking me up as a player, as well as the entertainment center and vacuum cleaner.

The play area, as per Microsoft, should be at least six feet wide, but no wider or longer than twelve feet. A solo player should be about six feet from the sensor to be picked up, and two players need to be at least eight feet from the Kinect. Keep in mind that this play area is supposed to be for the players to move around (dance, jump, step to the side, forward and backward, etc.), which means no furniture should be in the way. This is not the case when we see video after video from ghost hunters using the Kinect and getting an extra stick figure on the monitor.
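The placement guidance quoted above can be written down as a simple checklist. The sketch below uses only the figures given in this article (sensor 2 to 6 feet off the floor, on a stationary surface; play area 6 to 12 feet wide; about 6 feet of distance for one player and at least 8 feet for two; no furniture). The function and parameter names are hypothetical and are not part of any Microsoft software.

```python
def check_kinect_setup(sensor_height_ft, area_width_ft, player_distance_ft,
                       players=1, furniture_in_area=False, stationary=True):
    """Return a list of ways a setup departs from the recommendations above."""
    problems = []
    if not 2 <= sensor_height_ft <= 6:
        problems.append("sensor should sit 2 to 6 feet above the floor")
    if not stationary:
        problems.append("sensor should rest on a stable, stationary surface")
    if not 6 <= area_width_ft <= 12:
        problems.append("play area should be 6 to 12 feet wide")
    # Roughly 6 feet for a solo player, at least 8 feet for two players.
    min_distance = 6 if players <= 1 else 8
    if player_distance_ft < min_distance:
        problems.append(f"subject should be at least {min_distance} feet away")
    if furniture_in_area:
        problems.append("play area should be clear of furniture")
    return problems


# A cleared living room versus a hand-held sensor in a cluttered basement.
living_room = check_kinect_setup(3, 8, 6.5)
basement = check_kinect_setup(4, 5, 2, furniture_in_area=True, stationary=False)
print(len(living_room), len(basement))
for problem in basement:
    print("-", problem)
```

A typical ghost hunt fails several of these checks at once, and each failure makes a stray stick figure more likely.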

The Kinect has a large reference library to compare with human shapes, but it doesn’t have any filters against other objects that resemble that shape. Keep in mind that the Kinect is relying on the depth sensor for recognition of a player. It is reading the distance of the infrared speckle pattern to determine depths. Also keep in mind that the software will make guesses on where missing limbs might be based on probabilities. This is why you’ll see “hands” or “feet” wiggle around when you’re in front of the Kinect with your hands held in front of you or your feet just off screen. When the differences in depth among various objects match images in the Kinect’s reference library, the Kinect will get confused and assign a stick figure to those objects.

When the Kinect is used at allegedly haunted locations, paranormalists are simply not setting up the device as it is supposed to be. They are positioning the Kinect in rooms filled with furniture and objects of various shapes and sizes. The surface of these various objects in the “play area” is being read by the depth sensor. The data is similar, but not an exact match, to references the Kinect has of player figures and poses. The Kinect, seeing only partial matches, extrapolates what the rest of the body is most likely doing. Because it is guessing thirty times a second for minutes at a time, it creates a waving/dancing/freaky stick figure.
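A toy example shows how a matcher that accepts “close enough” will happily label furniture as a player. Everything here is invented for illustration: the four-number depth profiles, the pose names, the tolerance, and the scoring are stand-ins and are not the Kinect’s actual algorithm, which is far more sophisticated.

```python
# Reference "poses": coarse depth profiles (head, torso, hips, legs), given as
# metres nearer than the background. All values are made up for illustration.
REFERENCE_POSES = {
    "standing adult": (0.3, 0.5, 0.5, 0.4),
    "seated child": (0.3, 0.4, 0.6, 0.2),
}


def best_pose(profile, tolerance=0.25):
    """Return (pose name or None, mismatch) for the closest reference pose."""
    scored = []
    for name, reference in REFERENCE_POSES.items():
        # Total absolute difference between the observed and reference profile.
        mismatch = sum(abs(a - b) for a, b in zip(profile, reference))
        scored.append((mismatch, name))
    mismatch, name = min(scored)
    # A partial match within the tolerance is accepted as a "player".
    return (name if mismatch <= tolerance else None), mismatch


jacket_on_chair = (0.3, 0.4, 0.6, 0.3)  # similar to a pose, but not exact
open_space = (0.0, 0.0, 0.0, 0.0)       # nothing in front of the background

print(best_pose(jacket_on_chair)[0])
print(best_pose(open_space)[0])
```

The jacket is not a person, but it is close enough to one of the references to earn a skeleton. At thirty frames a second, the ten minutes we spent watching that trash bag in the basement amounts to about 18,000 separate guesses, which is plenty of opportunity for the “limbs” to wave around.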

There’s also an issue with people sitting down. In an article by Stephen Totilo (2010) titled “Xbox Kinect Does Not Play Well with Couch Potatoes,” the question of whether the Kinect would work only if someone was standing was brought up to Microsoft and several game developers by the author. The question was avoided or statements such as games were “optimized for standing” were given. However, the author finally got somewhat of an answer: “Sitting is something we’re still calibrating for.” This was followed with a little more: “One of the Microsoft people with whom I was discussing the ‘sitting question’ said the chair stuff is just more complicated. You could be sitting far away, at an angle” (Totilo 2010).

The Kinect can get confused when it doesn’t see the entire player. This is obvious by watching any number of videos that show the skeletal view and have people/players moving about. The stick figure limbs will not always track correctly when limbs are hidden or out of the frame. When players sit down, the Kinect gets confused; it simply doesn’t know how to interpret the depth data correctly. We blend into the couch and either disappear in the eye of the Kinect or it makes a guess based on what it does know (from its reference material). Since it has data from all types of players, it can mistake a sitting adult for the figure of a child.

But what about when the “ghost” stick figure interacts with a living person who enters the scene? There are a few things to consider when answering this. First off, tracking is still dependent on the depth sensor. So, if the end of a couch is causing a stick figure and you’re standing next to the couch and reach out your arm, the depth sensor is going to read you as either part of the couch (thus extending the stick figure created by the couch) or the Kinect will see you as a second player and attempt to keep the stick figures of the two players (as the Kinect sees things) separated by bunching up the couch stick figure. A third option is that the Kinect will realize its guess about the couch was wrong and correct itself and the stick figure will disappear. Oh, and another thing is you’re applying human traits to a computer error caused by you. Stop it.

Here’s another thing to consider: the Kinect is using infrared light to create the speckle pattern and calculate depth. This is what it uses to find a player and assign a stick figure. If a ghost was indeed the cause of these extra stick figures showing up, a few things would be quite clear. First, you’d see the darn thing with the infrared camera. You wouldn’t just see a stick figure; you would see the object causing the stick figure (the ghost). Why? Because if the infrared speckles (which are what would supposedly be reflecting off the ghost) can be seen by your night-vision camera, then the object reflecting them would also be seen by the same camera. And there’s always a solid figure of some form behind a stick figure. Second, since the only way to cause a stick figure to appear is to change the angle and location of the speckle dots, you would easily see not only a disturbance in the speckle pattern but the speckles change distances (like the pin art toy). The IR speckles could be observed changing distances while viewing through a night-vision camera.

In addition, I was able to borrow a few IR lights from a friend for a simple experiment. With paranormalists using IR lights as standard attachments to their video cameras and other toys, I wanted to see if the presence of these lights would interfere with the Kinect. I set up the system and (quite easily) got the Kinect to lock onto my vacuum cleaner and assign it a stick figure (skeleton).


A stick figure assigned to my vacuum cleaner, as well as another very distorted stick figure assigned to a set of shelves.

When I take everything I’ve learned over the few weeks I’ve been researching this topic, I am left with the conclusion that the cause for the “stick figure ghosts” is multi-layered. The first layer is a limitation in the software, caused by a machine being forced to make guesses. Another layer involves people not paying attention to the simple rule that the “play area” must be clear of obstacles except for the players. And the last layer falls heavily on the shoulders of ghost hunters, who too easily accept strange things they can’t understand as paranormal without good reason or a decent effort at researching the topic.


Acknowledgments

Special thanks to the following for their involvement with this article:

Mitch Silverstein—review and discussion

Matt Haas—for questions on this topic and supplying various video clips

Notes

  1. The RGB camera can achieve 1280×1024 but at a slower frame rate.
  2. This does not include video that appeared to be obvious hoaxes.

References

Kenny Biddle

Kenny Biddle is a science enthusiast who investigates claims of paranormal experiences, equipment, photos, and video. He promotes science, critical thinking, and skepticism through his blog I Am Kenny Biddle. He frequently hosts workshops on how to deconstruct and explain paranormal photography. Email – parainvestigator@comcast.net