Pepper dialogue translator

Making your Pepper a translation agent with the Watson Language Translator API

Wouldn’t it be great if your robot allowed you to talk to anyone who speaks a different language without needing the help of a third person to understand each other? Well, let’s make it real with the help of the Watson Language Translator API.

This article is part of the API Challenge: Quentin, an intern at SBRE, is plugging various third-party technologies (web services or IoT devices) into Pepper. For each technology, he has one week to test it, make a demo on Pepper, and write up his experience. This is a way of checking how easy it is to integrate technology with Pepper.

My goal here was to create a Pepper application that lets two people speak face to face in different languages while the robot translates using a translation service (Watson Language Translator by IBM™), acting like a human interpreter who translates speech orally.

Here is how I wanted the application to work: the two interlocutors speak to each other sentence by sentence, and the robot alternately translates each sentence for the other interlocutor.

The challenge here was more about using QiChat than about accessing the API.

Use case diagram of the app Pepper Dialogue Translator

1. The Watson Language Translator API, why and how

The first thing I needed was a way for my Pepper to translate from one language to another using, of course, an API service. I chose the Watson Language Translator API. You might ask, “Why not use the better-known, more trusted Google one?” Well, the two APIs work in much the same way and, as a matter of fact, the Google Cloud API that gives access to Google’s translation services had issues with Android support when I tried to use it.

Watson is an IBM product that gathers many services for different uses, such as chatbots, banking services and AI, plus the translation service that interests us today. I created an IBM account and subscribed to the free version of the translation service (which offers more than enough for what I needed) to get credentials for the API. IBM also publishes an official Java library on GitHub; all the functions needed to get the translation service working are in there and are well documented. I put all the variables and the translation function inside a class whose constructor arguments are simply the input and output languages of the translation.

class TranslationService(var inputLanguage: String, var outputLanguage: String) {

   // Credentials for the IBM Cloud service ("my_api_key" is a placeholder)
   val iamOptions = IamOptions.Builder()
       .apiKey("my_api_key")
       .build()
   val service = LanguageTranslator("2018-05-01", iamOptions)

   init {
       service.setEndPoint("https://gateway-lon.watsonplatform.net/language-translator/api")
   }

   // Translates inputText from inputLanguage to outputLanguage
   fun translateText(inputText: String): String {
       val translateOptions = TranslateOptions.Builder()
           .addText(inputText)
           .source(inputLanguage)
           .target(outputLanguage)
           .build()
       val translationResult = service.translate(translateOptions).execute().getResult()
       return translationResult.translations[0].translationOutput
   }
}

As always, I made sure this was working correctly using an Android unit test class. This was the easiest API access I ever had to handle, really.
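
The check above ran against the real service. As a complement, here is a minimal sketch of how code that depends on the translation could be tested offline. The `Translator` interface and `FakeTranslator` class are hypothetical names introduced for illustration; they are not part of the Watson library or of the original app.

```kotlin
// Hypothetical abstraction: anything that can translate one sentence.
fun interface Translator {
    fun translateText(inputText: String): String
}

// Hypothetical test double: returns canned translations instead of calling Watson.
class FakeTranslator(private val canned: Map<String, String>) : Translator {
    override fun translateText(inputText: String): String =
        canned[inputText] ?: error("No canned translation for \"$inputText\"")
}

fun main() {
    val frenchToEnglish = FakeTranslator(mapOf("bonjour" to "Hello"))
    // Fails with an IllegalStateException if the translation is not the expected one.
    check(frenchToEnglish.translateText("bonjour") == "Hello")
    println("ok")
}
```

A real TranslationService could implement the same interface, so the rest of the app would not need to know which one it is talking to.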

2. Making Pepper talk and listen in 4 different languages

Having access to the Watson API, my Pepper was able to translate any sentence to and from any language, but continued to talk with a thick English accent and didn’t understand anything I said in other languages.

Now comes the harder part of this mission: integrating the translation service into my dialogue app with QiChat. As I was advised, I created a hashmap containing 4 different Chat objects (English, Français, Español, Deutsch), one per language, each with its own topic and chatbot, the latter being created with the locale corresponding to the chat’s language.

private fun getLocale(language: String) : Locale {
   return when (language) {
       "english" -> Locale(Language.ENGLISH, Region.UNITED_KINGDOM)
       //all the other languages used
       [...]
   }
}
private fun buildChat(language: String) {

   val locale = getLocale(language)

   val topic = TopicBuilder
       .with(qiContext)
       .withResource(R.raw.translation_chat)
       .build()
   val qiChatbot = QiChatbotBuilder
       .with(qiContext)
       .withTopic(topic)
       .withLocale(locale)
       .build()

   chats[language] = ChatBuilder
       .with(qiContext)
       .withLocale(locale)
       .withChatbot(qiChatbot)
       .build()
}

2.1 Switching languages

OK, so we have our 4 different chats, but we still need to switch between them by starting and stopping them at the right moment. To that end, I created a one-line function that runs a selected chat asynchronously, meaning that the chat is active without blocking the execution of other parts of the code:

private fun listenInLanguage(language : String) {
   currentChatFuture = chats[language]?.async()?.run()
}

The 4 chats contain only a few lines of QiChat, their role being limited to listening and passing the heard sentence on so that the translation can be said in the appropriate language:

topic: ~translation_chat()
u:(_"*")
$input=$1
^execute(switchLanguage,$input)

And the SwitchLanguage executor calls the translation, makes Pepper say it in the right language, closes the current chat and calls the listenInLanguage function with the other language selected as parameter:

private inner class SwitchLanguageExecutor(qiContext: QiContext) : BaseQiChatExecutor(qiContext) {
   override fun runWith(params: MutableList<String>?) {
       baseText = params!!.get(0)
       if(currentLanguage==language1){
           translation = translate(language1,language2,baseText)
           currentLanguage=language2
       } else {
           translation = translate(language2,language1,baseText)
           currentLanguage=language1
       }
       currentChatFuture?.cancel(true)
       SayBuilder.with(qiContext).withLocale(getLocale(currentLanguage)).withText(translation).buildAsync()
           .andThenCompose {say ->
               say.async().run()
           }
           .thenConsume {
               listenInLanguage(currentLanguage)
           }
   }
   override fun stop() {}
}
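
To make the alternation easier to follow, here is a minimal, SDK-free sketch of the same turn-taking logic. The `TurnTaker` class and the shape of its `translate` parameter are assumptions made for illustration; the real executor above also cancels the current chat, makes Pepper speak, and restarts listening, all of which are left out here.

```kotlin
// Hypothetical model of the language switch done by the executor (no Pepper SDK involved).
class TurnTaker(
    private val language1: String,
    private val language2: String,
    // Assumed signature: (source language, target language, text) -> translated text
    private val translate: (String, String, String) -> String
) {
    // Language Pepper is currently listening in.
    var currentLanguage: String = language1
        private set

    // Translates one heard sentence, then flips the listening language.
    fun onSentence(heard: String): String {
        val source = currentLanguage
        val target = if (source == language1) language2 else language1
        val translation = translate(source, target, heard)
        currentLanguage = target
        return translation
    }
}

fun main() {
    // Stub translator that only tags the text with the translation direction.
    val turns = TurnTaker("french", "english") { from, to, text -> "[$from->$to] $text" }
    println(turns.onSentence("bonjour"))  // translated from french to english
    println(turns.currentLanguage)        // the next sentence is expected in english
    println(turns.onSentence("hello"))    // translated from english to french
}
```

Keeping this state apart from the speech and chat calls is only one possible design; it makes the alternation testable without a robot.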

After that, when the app is launched, we get to talk in one language, listen to the translation in another one and respond in that second language.

We are getting close to what we wanted at first but there is still some work left to be able to choose the language on Pepper’s tablet and make the app a little more usable.

3. Creating a more practical and user-friendly app

3.1 A bit of UI

Since accessing the Watson API was pretty easy, I had some time left to create a more practical and cleaner app than I usually do. I had to learn to create buttons and text displays with the Android Studio IDE. It is actually very easy and intuitive. I created a first layer that shows when the app is created, containing 4 buttons (English, Français, Español, Deutsch) for the person using the app to select the languages in which Pepper should translate.

Screenshot of the app’s first layer with available languages

I then made the app switch to a second layer once the languages are selected. This second layer contains 3 text boxes: the first shows the sentence to be translated (e.g. “bonjour”), the second shows from and to which languages Pepper is translating (e.g. “french -> english”) and the last shows the translation as Pepper says it (e.g. “Hello”).

Screenshot of the app’s second layer with translated sentences
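
As a rough illustration of what the second layer displays, the three text boxes can be seen as one small piece of state. The `TranslationScreenState` class below is a hypothetical name, not taken from the original app, and the sketch leaves out the Android views themselves.

```kotlin
// Hypothetical holder for the content of the three text boxes.
data class TranslationScreenState(
    val sourceText: String,
    val sourceLanguage: String,
    val targetLanguage: String,
    val translatedText: String
) {
    // Text for the middle box, e.g. "french -> english".
    val directionLabel: String
        get() = "$sourceLanguage -> $targetLanguage"
}

fun main() {
    val state = TranslationScreenState("bonjour", "french", "english", "Hello")
    println(state.sourceText)      // bonjour
    println(state.directionLabel)  // french -> english
    println(state.translatedText)  // Hello
}
```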

3.2 Minor limitations

In theory, my app should now allow two interlocutors to speak together, each in their own language. However, a few issues keep the app from being fully operational. In particular, Pepper can only listen in one language at a time, which makes the conversation less dynamic and pleasant than it should be. Overall, even if the final result isn’t perfect, I am pretty satisfied with it: it meets the specifications given at the beginning, it is functional, and it is more user-friendly than the apps I developed for the previous articles.

To summarise, the app allows Pepper to listen to and translate each sentence of a conversation between two people speaking different languages, one by one, switching language at every sentence. The Watson Language Translator API and its Java library from GitHub provided the tools needed to translate from one language to another. Creating multiple chats with different locales and switching between them allowed the robot to understand and respond correctly in every supported language. Finally, by creating and editing layouts, I added a language selection menu and an informative screen shown while Pepper is translating. We could think of a few improvements, such as being able to tell Pepper when to start and stop listening in order to say more than one sentence at a time.

Quentin SERDEL
Developer Intern
