Custom data and machine learning best practices

This chapter provides guidelines and best practices that Conversational AI developers can use when working with their custom data, leveraging Omilia’s machine learning technology.

NLU model development and evaluation best practices

This section provides guidelines and best practices for tuning, training, and evaluating your NLU model to get optimal performance.

Creating your own intents

Follow a top-down approach when you build your intents. Define generic intents first, then work your way toward more specific ones. For example, if you want to build intents around Bank Account Balance inquiries, you can follow the methodology below. Keep in mind that it all comes down to the level of understanding you want the model to reach and the effort you want to invest in its preparation.

Use at least 20 utterances per intent to get decent accuracy.

Create a generic Account intent with the following utterances:
- account
- it's about my account
- account inquiry
- I want to ask something about my account
- question about an account
- inquiry about my account
Move to the more specific Account.Balance intent and use phrases like the following:
- account balance
- an inquiry about my account’s balance
- can I get my balance, please
- I want the balance of my account
- account remaining balance
- whats my account remaining balance

Training your model

The best guidelines and practices for training an NLU model are given below.

Each intent you create should have at least 20 utterances.
- The more utterances you use per intent, the better.
- Keep your intents balanced. The intent with the most utterances should not have more than double the utterance count of the intent with the fewest.
Use a variety of different utterances for each intent. It increases the model’s generalization.
Adjust the number of utterances and their corresponding generalization according to the granularity of your intents.
- Creating similar intents requires a more careful selection of utterances.
- Similar utterances for different intents can lead to poor performance.
If needed, create OOScope (out-of-scope) intents.
- These are in-domain intents that you are aware of but have intentionally chosen not to service.
- Delete my account? → Account-Deletion → could be out-of-scope for your banking agent.
Once your solution goes live, it may be exposed to out-of-domain utterances that the model mistakes for in-domain ones. Periodically populate an OODomain or Unknown intent with these confusing incoming utterances.
- Work/life balance → OODomain instead of Account.Balance.
Avoid special characters like {, _, #, and so on.
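
Several of the guidelines above are mechanical enough to check before training. The sketch below is one possible way to do that, assuming the training data is available as a plain Python dictionary that maps intent names to lists of utterances. The function name `check_training_set` and the data layout are hypothetical and are not part of any Omilia API. The duplicate check only catches identical utterances (ignoring case) under different intents, which is a narrower test than "similar" utterances.

```python
MIN_UTTERANCES = 20         # minimum utterances per intent, per the guideline above
MAX_BALANCE_RATIO = 2       # largest intent should have at most double the smallest
SPECIAL_CHARS = set("{_#")  # only the characters named above; extend as needed


def check_training_set(training_set):
    """Return warnings for a {intent_name: [utterance, ...]} mapping."""
    warnings = []
    counts = {intent: len(utts) for intent, utts in training_set.items()}

    # Guideline: at least 20 utterances per intent.
    for intent, count in counts.items():
        if count < MIN_UTTERANCES:
            warnings.append(
                f"{intent}: only {count} utterances (minimum is {MIN_UTTERANCES})"
            )

    # Guideline: the largest intent has at most double the smallest one.
    if counts:
        largest = max(counts, key=counts.get)
        smallest = min(counts, key=counts.get)
        if counts[largest] > MAX_BALANCE_RATIO * counts[smallest]:
            warnings.append(
                f"imbalanced: {largest} has {counts[largest]} utterances, "
                f"{smallest} has {counts[smallest]}"
            )

    # Guidelines: no special characters, no identical text under two intents.
    first_intent = {}  # normalized utterance -> first intent it appeared under
    for intent, utterances in training_set.items():
        for utterance in utterances:
            if SPECIAL_CHARS & set(utterance):
                warnings.append(f"{intent}: special character in {utterance!r}")
            key = utterance.strip().lower()
            owner = first_intent.setdefault(key, intent)
            if owner != intent:
                warnings.append(
                    f"{utterance!r} appears under both {owner} and {intent}"
                )
    return warnings


# Usage with a deliberately small, illustrative training set.
training_set = {
    "Account": ["account", "it's about my account", "account inquiry"],
    "Account.Balance": ["account balance", "can I get my balance, please"],
}
for warning in check_training_set(training_set):
    print(warning)
```

The checks return warnings rather than raising errors because the guidelines are recommendations, and how strictly to apply them depends on the granularity of your intents.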

Disambiguation

In the case of ambiguity, the system processes both the previous and the current user utterance to identify the intent. In the sample dialog below, the final utterance that the ML system decodes is the following: "About my card I wanna activate it".

System: How may I help you?
Customer: About my card.
System: How can I help you with your card?
Customer: I wanna activate it.
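
The concatenation in this dialog can be sketched as follows. This is only an illustration: the function name `concatenate_turns` is hypothetical, and stripping trailing punctuation before joining is an assumption made to reproduce the decoded utterance shown above, not a description of how the system normalizes text internally.

```python
import string


def concatenate_turns(previous, current):
    """Join the previous and current user utterances into one string.

    Trailing punctuation is stripped from each turn (an assumption made
    for this sketch) and empty turns are skipped.
    """
    parts = [
        turn.strip().rstrip(string.punctuation).strip()
        for turn in (previous, current)
    ]
    return " ".join(part for part in parts if part)


# The two customer turns from the sample dialog.
print(concatenate_turns("About my card.", "I wanna activate it."))
# prints: About my card I wanna activate it
```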

After the design is completed and the scripting for disambiguation is in place, augmenting the corpus with concatenated examples, as shown in the table below, helps the model's performance. If the model has not seen such concatenated utterances in training, it might yield the wrong intent, or the correct intent with low confidence.

Original	Augmented
uh fraudulent charges on my card	my card, uh fraudulent charges on it
i would like to remove late payments from my account	late payments, i would like to remove them from my account
were my payments on time	my payments, were they on time
transactions that I don’t use	transactions, those that I don’t use

Evaluation of a trained model

The evaluation of a trained model requires an evaluation set. The following best practices can help you draft a proper one:

Avoid using the same utterances that you already used to train your model.
Your evaluation set must include all the intents you built. Do NOT include xPack intents: they are not part of the model’s training process because they are already pre-tuned and ready to be used at runtime.
Your evaluation set must be as balanced as possible.
Adopt the same text formatting for both training and evaluation utterances (upper/lower case, punctuation, and so on).
The following cases illustrate what can happen when you do NOT follow the best practices discussed in this section:
- Your evaluation set utterances are identical to the ones you used for training your model. → Accuracy 100%
- Your evaluation set only includes a single intent, the model’s favored one (the one with the most training utterances). → Accuracy 100%
- Your evaluation set contains all the training set intents, but it is heavily imbalanced, for example, 990 utterances for the model’s favored intent and one utterance for each of the other 10 intents. → Accuracy 99% (even though the model fails on 10 intents and succeeds on only 1).
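
The imbalanced case can be reproduced with a few lines of arithmetic. The sketch below is an illustration only: the intent names `Intent1` to `Intent10` are made up, and the "model" is simulated as one that always predicts its favored intent. It also includes a small helper, `train_eval_overlap` (a hypothetical name), for spotting evaluation utterances that were already used in training.

```python
from collections import defaultdict


def accuracy(gold, predicted):
    """Fraction of evaluation utterances whose predicted intent is correct."""
    correct = sum(g == p for g, p in zip(gold, predicted))
    return correct / len(gold)


def per_intent_recall(gold, predicted):
    """For each gold intent, the fraction of its utterances predicted correctly."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for g, p in zip(gold, predicted):
        totals[g] += 1
        hits[g] += (g == p)
    return {intent: hits[intent] / totals[intent] for intent in totals}


def train_eval_overlap(train_utterances, eval_utterances):
    """Utterances (case-insensitive, trimmed) present in both sets."""
    normalize = lambda text: text.strip().lower()
    return {normalize(u) for u in train_utterances} & {
        normalize(u) for u in eval_utterances
    }


# Imbalanced evaluation set: 990 utterances for the favored intent,
# one utterance for each of 10 other (made-up) intents.
gold = ["Account.Balance"] * 990 + [f"Intent{i}" for i in range(1, 11)]
# Simulated model that always answers with its favored intent.
predicted = ["Account.Balance"] * len(gold)

print(accuracy(gold, predicted))  # 0.99

recall = per_intent_recall(gold, predicted)
failed = [intent for intent, value in recall.items() if value == 0.0]
print(len(failed))  # 10
```

Overall accuracy hides the ten failing intents, while the per-intent breakdown exposes them immediately, which is why a balanced evaluation set, or at least a per-intent report, gives a more honest picture.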