What is TFIDF?

Since a computer cannot analyze text in its raw form, the text must first be converted into a numerical format – a process called vectorization. One way to vectorize text data is TFIDF.

When dealing with textual data, it’s important to know which words are most important in a given document. For instance, if you’re trying to retrieve textual data on a particular topic, certain unique words may be more informative than generic words that occur very frequently.

While a straightforward count vectorizer will provide insight into how frequently a term occurs in a given document, a TFIDF (Term Frequency Inverse Document Frequency) approach will tell you whether or not to prioritize a word in a given document.

TFIDF is equal to: term frequency * inverse document frequency.

In simple terms:

Term frequency (TF) refers to how often a term occurs in a given document, divided by the total number of words in that particular document.

Inverse document frequency (IDF) tells us which terms or words occur frequently across all documents and which ones occur rarely. Terms that are very common have a lower IDF and vice versa.

Our TFIDF score assigns each word in a given document a weight, which provides insight into which words in the text are most and least informative.

The most informative words are those with a higher score in a given document, while those with a lower score (commonly used words) are less informative. In other words, TFIDF assigns a score rather than a raw frequency.

Doc 1: “I think that the purple sweater is the best choice for the event”

Doc 2: “She thought that the pink jeans were the best for the event.”

Doc 3: “I think that the best choice for the event is the red dress”

The word “the” occurs in all three documents and has a high frequency count.

But words like “purple”, “sweater” and “jeans” occur rarely across the documents and provide more information about the person’s clothing choices. That’s the magic of TFIDF.
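In code, the two formulas above can be sketched in plain Python (a minimal sketch using the simple, unsmoothed idf = log(N / df) definition; libraries such as scikit-learn add smoothing, so their scores will differ slightly):

```python
import math

def tfidf(docs):
    """Compute TFIDF scores for a list of tokenized documents."""
    n_docs = len(docs)
    # Document frequency: how many documents each term appears in
    df = {}
    for doc in docs:
        for term in set(doc):
            df[term] = df.get(term, 0) + 1
    scores = []
    for doc in docs:
        # Raw counts per term in this document
        counts = {}
        for term in doc:
            counts[term] = counts.get(term, 0) + 1
        doc_scores = {}
        for term, count in counts.items():
            tf = count / len(doc)              # term frequency
            idf = math.log(n_docs / df[term])  # inverse document frequency
            doc_scores[term] = tf * idf
        scores.append(doc_scores)
    return scores

# Lowercased versions of the three example documents
docs = [
    "i think that the purple sweater is the best choice for the event".split(),
    "she thought that the pink jeans were the best for the event".split(),
    "i think that the best choice for the event is the red dress".split(),
]
scores = tfidf(docs)
# "the" appears in every document, so its IDF (and hence its TFIDF) is 0,
# while "purple" appears in only one document and scores higher there.
```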

Do you have any favorite resources on this topic?

Politeness Research in Conversation Design

“Will you have a cup of tea?” – “No, no, no…” – “Go on, you will” – “No really, I’m fine!” – “You will, just a half-cup” – “Oh okay, just a half cup”.

This is a typical conversation you might have witnessed upon entering an Irish home when I was growing up: the three- to four-turn polite decline, the modification of the offer, the giving in of the half-cup tea drinker. But in some cultures, this up-front refusal of an offer may be interpreted as rude or ungrateful.

When designing virtual agents, be they chatbots or voice bots, an awareness of cultural context and social norms around areas such as politeness can make all the difference in delivering a natural, seamless and comfortable user experience. Existing research in fields such as sociolinguistics can inform these design choices. Of course, nuance applies and not all generalizations will hold in every situation, but valuable insights are available to us if we consult the research!

For example, research by Hass & Wächter (2014) found that Japanese and German cultures present two opposite poles of a continuum when it comes to directness/indirectness of speech. The German communicative style favored the former, being more task-oriented; the Japanese style favored the latter, valuing group orientation. Research on politeness has also been carried out in healthcare settings. Backhaus (2009) presents a cross-cultural comparative study of interactions in Japanese elderly care homes alongside those in a range of other cultural and linguistic contexts. It was found, for instance, that praise, if applied out of context and in too exaggerated a manner, can be interpreted as another expression of the unequal power relations between residents and staff that characterize everyday life in the institution.

Consulting the research on topics like power dynamics and culturally specific values surrounding politeness is a valuable tool in the conversation designer’s kit.

What unique politeness principles are present in your culture?

Resources:

Hass & Wächter (2014). Culture and the Question of Impoliteness in Computer-Mediated Communication: a research gap. DOI: 10.18247/1983-2664/educaonline.v8n1p1-12

Backhaus (2009). Politeness in institutional elderly care in Japan: A cross-cultural comparison. Journal of Politeness Research, 5. DOI: 10.1515/JPLR.2009.004

What are Intents and Slots in Alexa Skill Building?

Intents and slots are central to the Alexa skill building process, but what are they exactly?

𝐈𝐧𝐭𝐞𝐧𝐭𝐬

Intents consist of names and a list of “utterances”. The latter are the various ways in which a user might ask Alexa a question.
For example,
Name: “RestaurantIntent” Utterances: “Where can I find a good restaurant” or “What’s a good place to eat”.
Alexa’s machine learning will then cater for many more ways in which customers might phrase the request, based on the sample utterances you add.
Each intent is “handled” at the backend – using an AWS Lambda function, for instance – which returns an appropriate response for that intent.

𝐒𝐥𝐨𝐭𝐬

Words that express variable information, such as names and locations, can be allocated as slots.
Such words are highlighted in the sample utterance using curly braces, e.g. {StreetName}.
You can then create a new slot name, such as StreetName, and assign it a slot type, such as dates or place names.
These slot types can be built-in or custom-made.
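In the skill’s interaction model JSON, intents and slots come together roughly like this (a minimal sketch based on the examples above – the invocation name is made up, and the AMAZON.StreetName slot type stands in for whichever built-in or custom type fits your use case):

```json
{
  "interactionModel": {
    "languageModel": {
      "invocationName": "restaurant finder",
      "intents": [
        {
          "name": "RestaurantIntent",
          "slots": [
            {
              "name": "StreetName",
              "type": "AMAZON.StreetName"
            }
          ],
          "samples": [
            "where can I find a good restaurant",
            "what's a good place to eat on {StreetName}"
          ]
        }
      ]
    }
  }
}
```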

Hope these snippets are helpful 🙂

What is SSML?

You’ve probably heard of HTML but possibly not SSML. Where HTML is used to describe the structure of a web page, SSML (Speech Synthesis Markup Language) is an XML-based markup language used in speech synthesis applications. It controls aspects of synthesized speech such as pronunciation, emphasis, pitch and rate. The Alexa Skills Kit supports a subset of SSML tags to make your Alexa skill more personable and customizable. Cool features include things like adding emotions such as “excited”, or the addition of audio files to your app. Note – if you’re using the Alexa Skills Kit SDK for Node.js or Java, you don’t need to add the <speak> tags yourself!
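As a sketch, a response combining a few of the Alexa-supported SSML tags mentioned above might look like this (the audio URL is a placeholder):

```xml
<speak>
    <amazon:emotion name="excited" intensity="medium">
        Congratulations, you won the quiz!
    </amazon:emotion>
    <audio src="https://example.com/sounds/applause.mp3"/>
    You answered <emphasis level="strong">every</emphasis> question correctly,
    <prosody rate="slow" pitch="low">well done</prosody>.
</speak>
```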


Have you used SSML before?

For more info on using SSML with your Alexa App check out this documentation: https://developer.amazon.com/en-US/docs/alexa/custom-skills/speech-synthesis-markup-language-ssml-reference.html