Posted on September 5, 2017 at 12:51
Prof. Padir was recently able to acquire a new robot for our lab, one he named Frasier. I have no idea how he came up with such a fantastic acronym, but he says he was inspired by the TV series Frasier, since that’s how he learned English when he first arrived in the States. Frasier is Toyota Research Institute’s (TRI) home research platform, the Human Support Robot (HSR). The robot is leased for free, but in exchange our lab has to compete in the RoboCup@Home Domestic Standard Platform League (DSPL). This league was created solely for teams with the HSR robot, and our lab was one of five labs chosen nationwide. Our team was the smallest of all, consisting of Prof. Padir; Tarik Kelestemur, a PhD student; and me.
Prior to receiving the robot, we went over to California for formal training from TRI and to meet the other teams that received a robot. It was an awesome three days of meeting fellow roboticists from all over the USA; TRI’s goal was to cultivate a robotics/HSR community. We also attended the first “Bits & Bots” meeting, an informal gathering of roboticists in the Bay Area hosted by TRI. It was a great experience, and I got to meet Tully and some of the founders of ROS, along with other high-profile individuals!
Part of our training was a hackathon using the robot. We ‘hacked’ together Google’s speech-to-text and Facebook’s Wit.ai NLP platform to get a speech module working, Tarik created a grasping framework to pick up cups, and we threw the two together. It wasn’t much of a success, but it laid the groundwork for our approach to RoboCup in Nagoya, where I decided to focus on the speech & person recognition task and Tarik on the storing groceries task.
After a month or so of dabbling with NLP APIs to find which one would be most useful, I settled on Google’s Dialogflow since it was actively maintained, reliable, and let me send raw audio for analysis. Everything worked well and we had a great speech framework. Our issue was cross-module communication: we had no formal way for the speech module to talk to grasping, grasping to navigation, navigation to recognition…
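The Dialogflow piece looked roughly like the sketch below. This is a reconstruction under assumptions, not our actual code: raw 16-bit PCM audio goes up with the request, and Dialogflow returns both a transcript and a matched intent. The project ID, session ID, and confidence threshold are hypothetical, and the SDK import is kept inside the function so the pure helper at the bottom works without Google Cloud credentials installed.

```python
def detect_intent_from_audio(project_id, session_id, audio_bytes,
                             language_code="en-US", sample_rate=16000):
    """Send raw 16-bit PCM audio to Dialogflow and return its query result.

    project_id and session_id are placeholders; this call needs Google
    Cloud credentials, so the SDK is imported lazily.
    """
    from google.cloud import dialogflow_v2 as dialogflow

    client = dialogflow.SessionsClient()
    session = client.session_path(project_id, session_id)
    audio_config = dialogflow.InputAudioConfig(
        audio_encoding=dialogflow.AudioEncoding.AUDIO_ENCODING_LINEAR_16,
        sample_rate_hertz=sample_rate,
        language_code=language_code,
    )
    query_input = dialogflow.QueryInput(audio_config=audio_config)
    response = client.detect_intent(
        request={"session": session,
                 "query_input": query_input,
                 "input_audio": audio_bytes})
    return response.query_result  # carries transcript, intent, parameters


def intent_to_command(intent_name, confidence, threshold=0.6):
    """Gate a recognized intent behind a confidence threshold.

    Returns None when we are not confident enough -- exactly the kind of
    filtering a noisy onboard mic makes necessary.
    """
    if confidence < threshold:
        return None
    return intent_name
```

With a proper external microphone the confidence gate rarely triggers; on HSR's own mic, it is what keeps the robot from acting on garbage recognitions (or, as we learned, on its own speech).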
Since we had limited time, our solution was one big file with a bunch of functions, each tailored to a specific request, whether that meant making a service call or publishing a message. This worked well enough but obviously was not maintainable for future use; for RoboCup@Home 2018 we used state machines and behavior trees, but that’s for another post. We tested everything on the robot and were ready to go…! Only to find out that we wouldn’t be getting a specialized mic in the arena. This broke a big assumption in our implementation, since we had relied on a professional microphone as the audio input; HSR’s onboard mic was lacking at best, and that’s putting it mildly. The result was HSR trying to hear the committee operator say a command, failing to recognize it properly, and then recognizing its own speech instead.
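That “one big file” boiled down to a dispatch table: a parsed command name mapped to the function handling it. A toy sketch of the pattern is below; the handler names and bodies are stand-ins (on the robot, each one wrapped a rospy service call or publisher), so treat it as an illustration of the structure rather than our real code.

```python
# Stand-in handlers -- the real versions wrapped ROS calls via rospy.
def go_to(location):
    return f"navigating to {location}"   # real code: publish a nav goal

def grasp(obj):
    return f"grasping {obj}"             # real code: call the grasp service

def say(text):
    return f"saying '{text}'"            # real code: call HSR's TTS service

# One entry per request the speech module could produce.
HANDLERS = {"go_to": go_to, "grasp": grasp, "say": say}

def dispatch(request, *args):
    """Route a parsed command to its handler; reject unknown requests."""
    if request not in HANDLERS:
        raise ValueError(f"unknown request: {request}")
    return HANDLERS[request](*args)
```

The appeal is that adding a capability is one function plus one dict entry; the downside, which bit us, is that all sequencing logic ends up tangled inside the handlers, which is exactly what state machines and behavior trees fix.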
We scored 0 points in speech & person recognition, but at least we got points for the storing groceries task thanks to Tarik! On the bright side, not making it to the next stage gave us time to explore Japan.
Nevertheless, this was an important experience, as it laid the groundwork for our success the next year at RoboCup@Home 2018.