Chatbot and Personality

Natural language processing is one of the biggest challenges in artificial intelligence, and the problem is even harder for Portuguese. Few countries have Portuguese as an official language, and the language is complex in several respects, verb conjugation among them. As a result, the scarcity of datasets for training machine learning models is the major bottleneck in the area.

Because learning comes from experience, a large and, above all, diverse set of texts is essential to “teach” the machine to interact naturally with humans. The machine also needs to learn synonyms, abbreviations, regionalisms, and the growing number of neologisms that arise as the language is continually readapted and rebuilt.

Samuel Pordeus, who holds a BSc in Computer Engineering, wrote his Bachelor's thesis with the goal of creating conversational models with different personalities. An intelligent agent based on deep learning was implemented, and two instances were trained: the first on a dataset of 15,000 comedy movies, and the second on the same number of horror movies. After training, the conversational neural models answered a series of generic questions. Participants then received a questionnaire, divided into several sections, asking whether the sentences appeared to have been produced by humans and whether the answers suggested the comedy genre, the horror genre, or neither. More than half of the participants judged the chatbot's answers to be human-written, and in all eight dialogue excerpts the rate of correct genre identification was higher than the error rate.

The idea is that the machine learns speech patterns from word frequency: it notices which words tend to appear close to each other and associates questions with their answers. Questions are easy to detect because they end with a question mark, and the answer is taken to be the text that comes right after it. The method's parameters include the vocabulary size and the sentence length considered when searching for patterns, along with the usual hyperparameters of neural network (and deep learning) architectures.
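The question-and-answer pairing described above can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not the implementation used in the thesis: the function names (`tokenize`, `extract_pairs`, `build_vocab`), the parameters `max_len` and `vocab_size`, and the rule that the answer is simply the next line of dialogue are all hypothetical choices made for the example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (accented letters are kept)."""
    return re.findall(r"\w+", text.lower())


def extract_pairs(lines, max_len=10):
    """Pair every line ending in '?' with the line that follows it.

    Assumption for this sketch: the answer is the very next line of dialogue.
    Pairs whose question or answer exceeds max_len tokens are skipped.
    """
    pairs = []
    for question, answer in zip(lines, lines[1:]):
        if not question.strip().endswith("?"):
            continue
        q_tokens, a_tokens = tokenize(question), tokenize(answer)
        if q_tokens and a_tokens and max(len(q_tokens), len(a_tokens)) <= max_len:
            pairs.append((q_tokens, a_tokens))
    return pairs


def build_vocab(pairs, vocab_size=1000):
    """Keep at most vocab_size words, chosen by frequency across all pairs."""
    counts = Counter(tok for q, a in pairs for tok in q + a)
    return [word for word, _ in counts.most_common(vocab_size)]


# Hypothetical input: consecutive lines of dialogue.
lines = [
    "Qual o seu nome?",
    "Henry.",
    "Oi!",
    "Você está consciente?",
    "Louca!",
]

pairs = extract_pairs(lines)
vocab = build_vocab(pairs, vocab_size=3)
print(pairs)
print(vocab)
```

Here `max_len` plays the role of the sentence-length limit and `vocab_size` the vocabulary limit mentioned in the text; a real training pipeline would go on to map the tokens to integer ids and feed the pairs to a neural model.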

Currently, chatbot developers and researchers focus almost exclusively on text generation and interpretation, which are extremely complex tasks in themselves. As more chatbot applications emerge, it becomes more important to give them personalities and to tackle challenges such as making a machine understand sarcasm.

Excerpt from a dialogue with a chatbot (in Portuguese):

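Humano: Oi! (Human: Hi!)
Chatbot: Oi! (Chatbot: Hi!)
Humano: Qual o seu nome? (Human: What is your name?)
Chatbot: Henry. (Chatbot: Henry.)
Humano: Você está consciente? (Human: Are you conscious?)
Chatbot: Louca! (Chatbot: Crazy!)
Humano: Como é? (Human: What?)
Chatbot: Você é minha. (Chatbot: You are mine.)
Humano: Eu vou te matar! (Human: I'm going to kill you!)
Chatbot: Vou pegar você! (Chatbot: I'm going to get you!)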

If you can read Portuguese, does the dialogue seem to be between two people? Which genre was the network trained on: horror or comedy?