PtoPA’s Ideal Conversation Robot
Most humanoid robots currently under development have achieved mobility and the ability to transport objects. PtoPA possesses neither that technical know-how nor a research team devoted to it.
Compared to existing humanoid robots, however, PtoPA’s competitive advantage lies in its core communication technology. Our ideal robot can hold a natural language conversation. By combining PtoPA’s CAIWA system with a humanoid robot, we hope to create a robot that is capable not only of speaking, but of having fascinating conversations.
To create such a robot, strong speech recognition technology is paramount. If the robot fails to recognize a user’s speech, or misunderstands it, the user will quickly lose interest.
Further, we must avoid a problem seen in conventional humanoid robots: if the user’s actual utterance does not contain an expected keyword, it cannot be matched against any of the anticipated utterances. Nor do we want to create a robot that simply responds to a user’s input and then stops. We want a robot that can not only respond once but also sustain a natural conversation.
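The keyword problem described above can be illustrated with a minimal sketch. This is not the CAIWA system's method, which this text does not describe; the anticipated sentences, the function names, and the use of Python's `difflib` similarity as a stand-in for a more flexible matcher are all assumptions made for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical anticipated utterances, each keyed by a required keyword.
ANTICIPATED = {
    "weather": "what is the weather like today",
    "name": "what is your name",
}


def keyword_match(utterance):
    """Conventional approach: succeed only if a required keyword appears."""
    words = utterance.lower().split()
    for keyword, sentence in ANTICIPATED.items():
        if keyword in words:
            return sentence
    return None


def similarity_match(utterance, threshold=0.5):
    """Illustrative alternative: choose the most similar anticipated sentence.

    The threshold value is an arbitrary assumption, not a tuned parameter.
    """
    best, best_score = None, 0.0
    for sentence in ANTICIPATED.values():
        score = SequenceMatcher(None, utterance.lower(), sentence).ratio()
        if score > best_score:
            best, best_score = sentence, score
    return best if best_score >= threshold else None


# A misrecognized keyword ("wether") defeats strict keyword matching,
# while the similarity-based matcher can still find the intended sentence.
heard = "what is the wether like today"
strict_result = keyword_match(heard)
flexible_result = similarity_match(heard)
```

Here the strict matcher finds nothing because the keyword itself was misheard, whereas the similarity-based matcher tolerates the error. A real system would need something considerably more robust than character-level similarity.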
Beyond speech recognition and conversational ability, we want to build a robot that can recognize and remember a user’s face and produce facial expressions of its own, since these abilities are also very important to communication.
Wouldn’t it be exciting to have a robot that could recognize you and say with a smile, “Hello, (your name)! How are you today?”
In addition to facial expressions, other aspects of communication, such as tone of voice and speech content, affect emotion and can drastically change the feeling of a conversation from pleasant to unpleasant. If a robot can “react” to a user’s words or facial expressions when deciding its own demeanor, even deeper communication becomes possible. For instance, if a robot is “angry,” it can speak more loudly or forcefully, and if it is “happy,” it can smile while speaking.
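The idea of an emotion deciding the robot's demeanor can be sketched roughly as follows. This is a toy illustration under stated assumptions, not PtoPA's emotion model: the emotion labels, cue words, volume values, and function names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Demeanor:
    volume: float     # relative loudness, where 1.0 is normal speech
    expression: str   # label for the facial expression to display


# Hypothetical mapping from the robot's emotion to how it speaks and looks.
DEMEANOR_BY_EMOTION = {
    "angry": Demeanor(volume=1.5, expression="frown"),
    "happy": Demeanor(volume=1.0, expression="smile"),
    "neutral": Demeanor(volume=1.0, expression="neutral"),
}

# Hypothetical cue words; a real system would use far richer signals.
NEGATIVE_CUES = {"stupid", "hate"}
POSITIVE_CUES = {"thanks", "great"}


def react(user_words, user_expression="neutral"):
    """Choose the robot's emotion from the user's words and facial expression."""
    words = set(user_words.lower().split())
    if words & NEGATIVE_CUES or user_expression == "frown":
        return "angry"
    if words & POSITIVE_CUES or user_expression == "smile":
        return "happy"
    return "neutral"


def speak(text, emotion):
    """Attach the volume and facial expression implied by the emotion."""
    demeanor = DEMEANOR_BY_EMOTION[emotion]
    return {
        "text": text,
        "volume": demeanor.volume,
        "expression": demeanor.expression,
    }


# The user's words select an emotion, which then shapes the reply.
emotion = react("thanks a lot")
reply = speak("You are welcome", emotion)
```

Keeping the emotion separate from the reply text means the same sentence can be delivered loudly with a frown or at normal volume with a smile, depending on how the conversation has gone.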
PtoPA aims to create a robot with all of these characteristics by using the following technologies:
- A strong speech recognition technology with a flexible conversation structure (CAIWA system)
- A face recognition function (image processing system)
- A facial expression creation function (eyebrow, eyelid, mouth and neck movements)
- An emotional expression control function (emotion model)


