Charging at the Challenges of Kinect

June 25, 2012

While the Xbox 360's Kinect has proven popular with the mass consumer, developing games that accurately reflect player movement, and really take advantage of the 3D motion-sensing capabilities, has been a major challenge.

Here, David Quinn, who works at Microsoft's Rare studio in the UK as a Kinect engineer, details the challenges he has encountered developing for the system and how he's handled them over the course of making Kinect Sports and its sequel.

How do you do a game like darts, where most of the player's arm is occluded by the body? How do you handle golf, when the Kinect camera loses track of the player's arms during the swing? How do you handle different accents across the UK and the U.S.?

Since Rare is a Microsoft first party, does the stuff you write end up going back into the Kinect SDK?

DQ: There are a couple of things that Rare has done that have gone into the SDK. The avateering system, we did that at Rare; that's where you take the 20-joint skeleton and turn it into the 70-joint avatar. And this machine learning system that we've recently built with the platform team for Kinect Sports 2; we helped out with that as well. They did the more mathematical side, and we worked on the tools.
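
(As a purely illustrative sketch of what such a retargeting step might look like: a sparse tracked skeleton drives a denser avatar rig by copying the joints that match and interpolating the extras. The joint names, Vec3 type, and 20-to-70 mapping below are assumptions for illustration, not Rare's actual avateering code.)

```python
from dataclasses import dataclass

@dataclass
class Vec3:
    x: float
    y: float
    z: float

def lerp(a, b, t):
    """Linearly interpolate between two joint positions."""
    return Vec3(a.x + (b.x - a.x) * t,
                a.y + (b.y - a.y) * t,
                a.z + (b.z - a.z) * t)

def retarget(tracked):
    """Map a sparse tracked skeleton (name -> Vec3) onto a denser avatar rig."""
    avatar = dict(tracked)  # joints the camera provides map across one-to-one
    # Extra avatar joints are synthesized between tracked anchors.
    avatar["spine_mid_1"] = lerp(tracked["hip_center"], tracked["shoulder_center"], 1 / 3)
    avatar["spine_mid_2"] = lerp(tracked["hip_center"], tracked["shoulder_center"], 2 / 3)
    avatar["left_forearm_twist"] = lerp(tracked["left_elbow"], tracked["left_wrist"], 0.5)
    return avatar
```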

Have you seen implementations of Kinect in third party games that have impressed you or that do things that you weren't expecting?

DQ: Sure. What the Mass Effect team has recently done with Kinect's speech system is an excellent use of speech. We pushed speech in Sports 2; that was always going to be a huge thing for us, a key differentiator from Sports 1. But what the Mass Effect guys have done is bring it into a core title, showing it can be used alongside a controller. It doesn't have to be the "get up and dance" kind of experience. You can use Kinect speech in a more core title, and Mass Effect really demonstrated that. I think from here on you'll see a lot of speech in core games.

Are you primarily concentrating on the skeleton and the visual tracking, or do you work a lot with speech as well?

DQ: I work with both of them, yeah. It's odd; Kinect is like a brand, but it's actually a group of technologies, really. I'm kind of the Kinect rep at the studio, so I kind of touch both. I did all the speech work for Sports 2, basically by myself, and then quite a bit of gesture work as well. The machine learning system in golf was kind of my responsibility as well.

Can you describe what that accomplishes?

DQ: For golf, the major problem is that the player's side faces the camera, so we don't actually get a great feed from the skeleton tracking system, because the back half of the body is completely occluded. All those joints are basically inferred. The tracker gives a good guess of where it thinks each joint is, but it's not reliable data.
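
(A rough illustration of what this means for gameplay code: a game might only trust joints the tracker reports as confidently tracked, and hold the last good value for the rest. The TrackingState enum, joint dictionary shape, and hold-last-value policy below are assumptions for the sketch, not the Kinect SDK's actual API.)

```python
from enum import Enum

class TrackingState(Enum):
    NOT_TRACKED = 0
    INFERRED = 1   # the tracker is guessing, e.g. the far side of the body
    TRACKED = 2    # high-confidence position

def filter_frame(frame, last_good):
    """frame maps joint name -> (position, TrackingState).
    Returns usable positions, holding the last confidently tracked value
    for joints that are merely inferred this frame."""
    usable = {}
    for name, (pos, state) in frame.items():
        if state == TrackingState.TRACKED:
            usable[name] = pos
            last_good[name] = pos
        elif name in last_good:
            usable[name] = last_good[name]  # hold rather than trust a guess
    return usable
```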

So when the player does a backswing, the tracking cuts their hands off a little, and we detect when they do a forward swing. We worked out a hand-coded, hacky job -- "hacky" is a bad word -- an unscientific way of running the animation. But when the player actually hits the ball and it flies off into the air, that has to be very reliable, because getting it wrong is so detrimental to gameplay. Obviously, that's the entire game: hitting the ball.

So, in the early days of golf, you could do a full backswing and then just kind of drop your hands, because you didn't want the ball to go, but our hand-coded system would actually release the ball.

That's when we went to the ATG guys, the advanced technology group at Microsoft: "This is kind of what we're seeing. We've got a problem with the golf swing; do you have any recommendations?" They came back with the idea of creating a machine learning system for gestures.

What we basically ended up doing was recording about 1,600 clips of people doing golf swings in front of Kinect, tagging the point in each clip where the ball should release, and then getting the computer itself to work out what's consistent among all those clips.

Then what happens is it creates a trainer and a classifier, and we carry that classifier around at runtime, so we can pipe a live feed into it and it can go, "Yes, the ball should release now," because it's been trained on a load of clips. It knows when it should happen. When the golf ball flies off in golf, it's done in that system; there's no hand-written code. It's all mathematical.
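
(A minimal sketch of the general approach described, not the system Rare and the platform team actually shipped: every frame of every labelled clip becomes one training example for a binary "release now / not yet" classifier, which is then queried on the live joint feed. The hand-relative features, scikit-learn random forest, and confidence threshold are assumptions for illustration.)

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def frame_features(hand, shoulder, prev_hand, dt):
    """Simple per-frame features: hand position relative to the shoulder,
    plus hand velocity estimated from the previous frame."""
    rel = np.asarray(hand) - np.asarray(shoulder)
    vel = (np.asarray(hand) - np.asarray(prev_hand)) / dt
    return np.concatenate([rel, vel])

def train(clips):
    """Offline step: pool every frame of every recorded clip into one training
    set, labelled 1 on the tagged release frame and 0 everywhere else."""
    X, y = [], []
    for clip in clips:                      # clip = list of (features, label)
        for features, label in clip:
            X.append(features)
            y.append(label)
    model = RandomForestClassifier(n_estimators=100)
    model.fit(np.array(X), np.array(y))
    return model

def should_release(model, features, threshold=0.8):
    """Runtime step: ask the trained classifier whether the live frame looks
    like a release, and fire once its confidence crosses the threshold."""
    prob = model.predict_proba(features.reshape(1, -1))[0][1]
    return prob >= threshold
```

The appeal of this setup is exactly what Quinn describes: the release decision is learned from the tagged clips rather than hand-written rules, so edge cases are handled by recording and tagging more data instead of writing more code.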


Comments


Michael DeFazio
Dear Kinect:
At some point, I'd love to have "semi-intelligent conversations" with NPCs by using my voice (as opposed to the hackneyed dialog wheel/tree where none of the choices presented are things I would ever do or say).

Dynamically interacting with NPCs has been around since the old King's Quest and Ultima games (where you could type in keywords and have characters respond to queries), and now that the technology is here, can't we improve upon it?

I'm not asking for a completely revolutionary, artificially intelligent avatar system (a la Milo); I'd just like to be able to interact in a way that is less "mechanical" (static dialog trees) and more natural (in a way that resembles a "conversation").

Wouldn't it be cool to play an open world detective game (a la LA Noire) where one component of the game would be interviewing people (witnesses, suspects) to find clues using your voice, and NPCs might respond to specific queries/keywords (e.g. "Where were you Friday night?", "What do you know about Fredrick Pierce?")?
...Or have a way of "bartering" with NPCs over price in an RPG? ("I'll give you $500... How about $580... $520 is my final offer.")

It seems to me the voice aspect of Kinect has the most potential and is the one most criminally underutilized.

Robert Green
I think that kind of thing is still many years off. Right now, the closest we have are systems like Siri, which aren't especially reliable, need to be tuned for each language/accent and, most importantly, have to send each request to the cloud for processing, which rules out most gaming scenarios. Imagine your LA Noire example with a few seconds' delay while each question is processed online, occasionally coming back with "can you rephrase that?", and suddenly choosing questions from a menu doesn't sound so bad.

On a different note, it's interesting that articles like this one and the one a few days ago from Harmonix are coming out just as Steel Battalion is released to terrible reviews that call it flat-out broken and a black mark on Kinect itself.

Addison Siemko
I'm there with you. A man can dream...

Michael DeFazio
@Robert
You might be right (I sure hope not). Google does have a developer API that transcribes as you talk:
http://android-developers.blogspot.com/2011/12/add-voice-typing-to-your-ime.html
(but similar to Siri, it does require internet access). The use of speech on Kinect that I've seen seems half-baked most of the time. (I still laugh at the memory of "Lightsaber... ON!" as presented at E3.) That being said, I've been very pleased with Google's recognition accuracy (even without a fixed grammar of "options" to choose from, like Kinect uses).

I'm not sure if the previous Kinect voice-enabled games (i.e. Mass Effect 3) suffer from a "limited grammar" due to technical reasons (i.e. the recognition accuracy is just not up to par for anything advanced) or a lack of "inspiration" (BioWare didn't want to invest much time adapting the experience for Kinect and, in effect, making two separate games).

I'd just like to see something from Microsoft moving toward the Natal/Kinect vision they sold us three or four years ago. (I'm not talking full-fledged "Milo" here; I'm just asking for some rudimentary, non-critical character interactions.) It's one of those things where, if you can show us something compelling (and provide us the tools), we'd jump at the chance to offer new and interesting experiences with the tech.

TC Weidner
I think Kinect's future may be tied to things like John Carmack's virtual reality work:

http://www.xbitlabs.com/news/multimedia/display/20120620215832_John_Carmack_Virtual_Reality_Gaming_Is_The_Next_Big_Step.html

which I think was by far the best thing at E3.

Joshua Darlington
I didn't understand why natural language speech was such a hard problem - until I started checking out these linguistics lectures.

http://itunes.apple.com/us/itunes-u/linguistics-lectures/id425738097

On top of context, there's a chaotic pattern of cadence, speed, tonal sweeps, etc., that humans use to understand each other.

If you listen to isolated words clipped out of natural speech, it's freakin' hilarious.

That said, if Siri was a DARPA project licensed to private industry, I wonder what the military is using right now to monitor phone conversations? I wonder when they will allow private industry to license that? Did IBM's Watson use a speech recognition system, or did they fake it? I wonder if they are reaching out to the game developer community?

Colin Sanders
It's interesting to hear about the effort they put into their technology, but I can't help but worry about this path. What happens when the novelty of the Kinect wears off on audiences? Look what happened to the Wii. Once this happens, I fear they may meet the same fate as Microsoft subsidiary Ensemble. In my conversations with both former and current employees, Rare had a fantastic development culture that was all but destroyed under Microsoft's ownership. As a result, they've met with disappointment far more than they have success. Banjo-Kazooie: Nuts and Bolts (an excellent reinvention, I might add) and Grabbed by the Ghoulies are two of the most sobering examples of the studio's fall from chart-topping winner to pioneer of bargain bins across the world.

Gaming studios have been disappearing over the course of this generation: it's quite startling, and I can't help but fear that Rare is next. In an ideal world, I feel Rare should go the way of Bungie. With Scott Henson at the helm, I doubt it's possible at this point, but in my eyes, it's probably their best bet to survive.

