Goal - Make Pepper listen to a speaking human in order to recognize some predefined chunks of sentences.

// Create a phrase set.
PhraseSet phraseSet = PhraseSetBuilder.with(qiContext)
                                      .withTexts("Hello", "Hi")
                                      .build();

// Build the action.
Listen listen = ListenBuilder.with(qiContext)
                             .withPhraseSets(phraseSet)
                             .build();

// Run the action synchronously.
listen.run();

Typical usage - Pepper asks something and waits for an answer drawn from a set of short, predictable answers.

How it works

Defining phrase sets

Use Phrase or PhraseSet objects to define the chunks of sentences to be recognized.

While a Phrase defines a single form of answer, a PhraseSet allows you to define a list of variants you would like to treat as synonyms.

For example, you can make Pepper listen to a “hello” concept:

PhraseSet phraseSet = PhraseSetBuilder.with(qiContext)
                                      .withTexts("Hello", "Hi")
                                      .build();

Listen listen = ListenBuilder.with(qiContext)
                             .withPhraseSets(phraseSet)
                             .build();

Disabling the body language

By default, Pepper does not stay motionless while listening: he moves slightly to let you know he is listening.

If necessary, you can disable the body language:

Listen listen = ListenBuilder.with(qiContext).withPhraseSets(phraseSet)
        .withBodyLanguageOption(BodyLanguageOption.DISABLED).build();

Modifying the language

By default, Pepper uses his Preferred Language.

To set a different language, use a Locale.

For example, to make Pepper listen in French, use Language.FRENCH and Region.FRANCE:

Locale locale = new Locale(Language.FRENCH, Region.FRANCE);
Listen listen = ListenBuilder.with(qiContext).withPhraseSets(phraseSet)
        .withLocale(locale).build();

See also javadoc: Locale.

Use Case

Voice commands

Imagine we want to create an application where we control Pepper's movements using voice commands.

Steps

Define Phrases or PhraseSets

We could define the phrases Pepper should recognize.

Pseudo code:

forwards  = "move forwards"
backwards = "move backwards"
stop      = "stop moving"

However, the user may not say exactly the phrase we expect, so we should allow some variants.

Pseudo code:

forwards  = ["move forwards", "forwards"]
backwards = ["move backwards", "backwards"]
stop      = ["stop moving", "stop"]
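
The variant lists above can be modeled in plain Java to show what grouping synonyms achieves. This is a minimal sketch that does not use the QiSDK: the class VoiceCommandMatcher and its match method are hypothetical names, and the case-insensitive exact comparison is an assumption made for illustration, not a description of how Pepper's speech recognition works.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for phrase sets: each command name maps to its accepted variants.
final class VoiceCommandMatcher {

    private static final Map<String, List<String>> VARIANTS = new LinkedHashMap<>();

    static {
        VARIANTS.put("forwards", Arrays.asList("move forwards", "forwards"));
        VARIANTS.put("backwards", Arrays.asList("move backwards", "backwards"));
        VARIANTS.put("stop", Arrays.asList("stop moving", "stop"));
    }

    // Returns the command whose variant list contains the heard text, if any.
    static Optional<String> match(String heard) {
        // Assumption: trimming and lower-casing is enough normalization for this sketch.
        String normalized = heard.trim().toLowerCase(Locale.ROOT);
        for (Map.Entry<String, List<String>> entry : VARIANTS.entrySet()) {
            if (entry.getValue().contains(normalized)) {
                return Optional.of(entry.getKey());
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(match("Move forwards")); // Optional[forwards]
        System.out.println(match("stop"));          // Optional[stop]
        System.out.println(match("turn left"));     // Optional.empty
    }
}
```

Whichever variant is heard, the caller only has to deal with three outcomes, which is the reason to group variants into one set per command.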

Run the Listen object

Listen listen = ListenBuilder.with(qiContext)
                             .withPhraseSets(forwards, backwards, stop)
                             .build();

ListenResult listenResult = listen.run();

Retrieve the heard phrase and the corresponding PhraseSet

If the user says “forwards”:

Log.i(TAG, "Heard phrase: " + listenResult.getHeardPhrase().getText()); // Prints "Heard phrase: forwards".
PhraseSet matchedPhraseSet = listenResult.getMatchedPhraseSet(); // Equal to the forwards PhraseSet.

Using these results, we can make Pepper move accordingly.
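
One way to turn the matched set into a movement is a simple dispatch on the set that matched. The sketch below is plain Java and does not call the QiSDK: MoveDispatcher, distanceFor, the string keys and the distances in meters are all hypothetical, chosen only to illustrate the idea.

```java
// Hypothetical model of acting on a listen result; not part of the QiSDK.
final class MoveDispatcher {

    // Signed distance in meters for each command. The values are illustrative assumptions.
    static double distanceFor(String matchedSetName) {
        switch (matchedSetName) {
            case "forwards":
                return 0.5;
            case "backwards":
                return -0.5;
            case "stop":
                return 0.0;
            default:
                // An unknown set means the caller passed something we never listened for.
                throw new IllegalArgumentException("Unknown phrase set: " + matchedSetName);
        }
    }

    public static void main(String[] args) {
        System.out.println(distanceFor("forwards"));  // 0.5
        System.out.println(distanceFor("backwards")); // -0.5
        System.out.println(distanceFor("stop"));      // 0.0
    }
}
```

In a real application the key would be the matched PhraseSet itself rather than a string, and the result would feed whatever movement action the application uses.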

Performance & Limitations

Listen or Chat?

Listen is suitable when an application requires a short vocal interaction. Prefer Chat to create complex question & answer sequences between Pepper and humans.

Exclusions with other actions

Do not start a Listen while a Say, a Chat or a Discuss is running: the Listen would fail.
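
In practice this means starting the Listen only after the previous speech action has finished. The sketch below illustrates that ordering with the standard java.util.concurrent.CompletableFuture; it is an assumption-laden model rather than QiSDK code (the QiSDK has its own asynchronous API, which is not used here), and SequentialSpeech and askThenListen are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical model: the "listen" step is chained so it can only start once "say" is done.
final class SequentialSpeech {

    // Returns the recorded events in the order they happened.
    static List<String> askThenListen(String question) {
        List<String> events = Collections.synchronizedList(new ArrayList<>());

        // Stand-in for a speech action running asynchronously.
        CompletableFuture<Void> say =
                CompletableFuture.runAsync(() -> events.add("say: " + question));

        // thenRun schedules the listen step only after the say step has completed.
        CompletableFuture<Void> listen = say.thenRun(() -> events.add("listen"));

        listen.join(); // Wait for the whole chain before returning.
        return events;
    }

    public static void main(String[] args) {
        System.out.println(askThenListen("Which way?")); // [say: Which way?, listen]
    }
}
```

Chaining the two steps, instead of launching both at once, guarantees they never overlap.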


The microphones on Pepper are unidirectional, so Pepper can only hear sounds coming from in front of him. Anyone who wants to talk with Pepper should therefore stand in front of him.

Also, Pepper may not be able to hear a human in a noisy environment.