Testing Studio User Guide

Logging in

To enter Testing Studio, follow the steps below:

  1. Navigate to https://ocp.ai/ and click Console, which is a unified entry point for managing all OCP® services.

  2. Select the regional OCP Console URL that you are using.

  3. Enter your email or OCP® username and click Sign In.

  4. Enter your password and click Sign In.

  5. If the credentials are correct, you are forwarded to the OCP Console® landing page. Click Testing Studio in the sidebar on the left.

Managing Projects

The Projects page shows a list of all created projects. A project is a remote repository where test cases reside.

At your first login to Testing Studio, the Projects page is empty and looks as follows:

Create a Project

To create a new project:

  1. Click + Add Project.

2. Fill in the project information in the dialog box.

By default, the project is stored in Cloud storage. With Cloud storage, Test Suites are uploaded directly to the cloud. With GitLab, Test Suites are stored in the GitLab repository.

  • Name: The name of your project.

  • Group: The group the project belongs to.

  • Orchestrator Application: The application created in Orchestrator you will use for your project.

The list of available Orchestrator Applications depends on the selected group, so select the group first.

If you want to store the project with a Git provider (GitLab or GitHub), switch the toggle accordingly and fill in the following information:

  • Git provider: Select GitLab or GitHub as your Git provider.

  • Git URL: The URL of the remote repository with tests. Omilia currently uses GitLab.

  • Git Token: An access token to GitLab.

For further instructions on how to generate an access token, refer to the official GitLab documentation.

Cloud storage

GitLab storage

3. Click Create. Your created projects are listed accordingly.

Delete a Project

To delete a project, follow the procedure below:

  1. Open the Projects page.

  2. Select a project to delete and click the three dots icon.

3. Once clicked, confirm the deletion as requested below.

The project is deleted from the projects list.

Project Overview

To open a project, just click on it. An opened project looks as shown below:

When you open a project, you can navigate through the following tabs:

General Tab

The General tab displays the general information about the selected project, such as its name, the remote repository URL, and the access token. This is the information you entered when creating the project.

You can modify the general project information at any time. Click Save to save your changes.

Cloud storage

GitLab storage

Test Suites

Test Suites are collections of test cases that are grouped for test execution purposes. In Testing Studio, Test Suites can be uploaded to the Cloud from your computer as a .zip file by dragging and dropping the file onto the drop zone or by clicking the Upload button. Once uploaded, Test Suites are listed under the tab.

If you want to edit a Test Suite, click the Download button, edit the Test Suite, and upload it again.

If you are using macOS, compress the Test Suite folder to a .zip file from the terminal before uploading it back to Testing Studio; otherwise, the file may be rejected.

To compress from the terminal, use the following command:
zip -r {output_file} {folder}
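
Where a scripted alternative to the command above is preferred, the same kind of archive can be produced with Python's standard library. This is a minimal sketch, not part of Testing Studio: the function name `zip_test_suite` is hypothetical, and the sketch keeps the folder itself as the top-level entry of the archive, which is the layout `zip -r {output_file} {folder}` produces.

```python
import shutil
from pathlib import Path


def zip_test_suite(folder: str, output_file: str) -> Path:
    """Compress `folder` into a .zip archive, keeping the folder name as
    the top-level entry (the same layout `zip -r` produces)."""
    src = Path(folder).resolve()
    if not src.is_dir():
        raise NotADirectoryError(f"{src} is not a folder")

    # make_archive appends ".zip" itself, so drop it if the caller added it.
    if output_file.lower().endswith(".zip"):
        output_file = output_file[:-4]
    base_name = str(Path(output_file).resolve())

    # root_dir is the parent and base_dir is the folder name, so entries
    # are stored as "<folder>/..." rather than as bare file names.
    archive = shutil.make_archive(
        base_name=base_name,
        format="zip",
        root_dir=str(src.parent),
        base_dir=src.name,
    )
    return Path(archive)
```

Calling `zip_test_suite("my_suite", "my_suite.zip")` (both names are placeholders) writes `my_suite.zip` next to the current working directory and returns its path.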

Run History Tab

The Run History tab displays the current and historical test runs of the project.

  • Run ID: Test case unique identifier

  • Status: Test case run status. The following statuses are possible:

    • Started: The test case is currently running

    • Pending: The test case is pending. This might happen if there are no free workers for the test execution.

    • Revoked: The test case has been cancelled

    • Success: The test execution was successful

    • Failure: The test case failed to run

  • Run Type: Defines the type of the test case: Classic for regular test cases or Augmented for test cases that use an LLM.

  • Created: The test case creation timestamp (starting date)
    dd/MM/yyyy - HH:mm, e.g. 21/07/2020 - 17:32

  • Finished: The test case completion timestamp (ending date)
    dd/MM/yyyy - HH:mm, e.g. 21/07/2020 - 17:39

Run Test Cases

To run test cases, follow the guidelines below:

  1. Open a project and click Run Test.

2. Select the test cases to run by filtering them based on the following attributes.

  • Group: A group the test cases belong to. A group essentially serves as a tag.

  • Utterance: Utterances the test case contains.

  • Description: Test case description.

Filtering supports the % symbol as a filter wildcard. For example:

  • balance%: matches all test cases starting with the word balance

  • %card%: matches all test cases containing the word card

  • %description: matches all test cases ending with the word description

To run the whole project (which is basically the entire list of test cases), leave all the filter fields blank.

Filtering is case-insensitive. The filter value desc% matches Desc, desc, and DESC.

When multiple filters are provided, they are combined as a boolean AND expression: only the test cases that meet all the filter criteria are selected.
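
The filtering rules above (the % wildcard, case insensitivity, blank fields, and AND combination) can be sketched in a few lines of Python. This is an illustration of the described behavior, not Testing Studio's implementation. The `Case` class and the function names are hypothetical, each test case is simplified to a single utterance string, and the sketch assumes that a filter value without % must match the whole field.

```python
import re
from dataclasses import dataclass


@dataclass
class Case:
    """Hypothetical, simplified test case with the three filterable fields."""
    group: str
    utterance: str
    description: str


def wildcard_to_regex(pattern: str) -> re.Pattern:
    """Translate a filter value into a regex: '%' matches any run of
    characters, everything else is literal, and case is ignored."""
    parts = (re.escape(part) for part in pattern.split("%"))
    return re.compile(".*".join(parts), re.IGNORECASE | re.DOTALL)


def matches(case: Case, **filters: str) -> bool:
    """Blank filters are skipped; the remaining ones are combined with AND."""
    return all(
        wildcard_to_regex(value).fullmatch(getattr(case, field)) is not None
        for field, value in filters.items()
        if value
    )


case = Case(group="billing", utterance="balance inquiry",
            description="Desc of the balance flow")

print(matches(case, description="desc%"))                    # True
print(matches(case, utterance="%card%"))                     # False
print(matches(case, group="billing", utterance="balance%"))  # True
print(matches(case, group="", utterance="", description="")) # True
```

The last call shows the blank-filter case: with no filter values set, every test case is selected.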

Run Augmented Test Cases

The Run Augmented Test Cases feature expands the set of test case utterances with variations, so that existing test cases cover more ways of phrasing the same request. The variations are produced by an LLM, which takes the original utterance, creates different versions of it, and returns the new utterances to the dialog. For instance, if your original utterance is "What is my balance?", the LLM can formulate different versions such as:

  • "Could you please tell me the amount in my account?"

  • "Could you please provide my current account balance?"

  • "What amount do I have in my account?"

  • "What's the current status of my account?"

To run the augmented test cases, proceed as follows:
1. Open a project and click the Run Augmented Test Cases button.

  2. Fill in the form that provides context for the LLM to make your request more precise, and click the Run button when finished.

The attributes below are the same as in the Run Test feature and are also optional:

  • Group: A group the test cases belong to. A group essentially serves as a tag.

  • Utterance: Utterances the test case contains.

  • Description: Test case description.

Read more about the Group, Utterance, and Description attributes and how to filter by them in the Run Test Cases section.

Apart from the regular Run Test attributes, the specific Run Augmented Test Cases attributes are the following:

  • Location: Select the location from the dropdown list.

  • Age: Define the age threshold from the dropdown list.

  • Adjust Formality: Choose how formal the utterances should be.

  • Custom Instructions: Add any extra information the LLM should take into account while creating new utterances.

  3. Once created, the augmented test case appears in the Run History board with the corresponding run type.

Test Execution Description

Testing Studio asynchronously runs the test cases located in the /tests folder of the master branch of your Git repository.

Make sure that the /tests folder is located directly under the root folder of your Git repository or of the uploaded ZIP file.
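
Before pushing or uploading, the location of the tests folder can be checked locally. The following is a minimal sketch under stated assumptions: the function name `has_root_tests_folder` is hypothetical, and the check assumes that "under the root folder" means a folder named `tests` at the top level of the repository checkout or of the ZIP archive.

```python
import zipfile
from pathlib import Path


def has_root_tests_folder(source: str) -> bool:
    """Return True if `source` (a checkout folder or a .zip file) has a
    `tests` folder directly under its root."""
    path = Path(source)
    if path.is_dir():
        return (path / "tests").is_dir()
    with zipfile.ZipFile(path) as archive:
        # Archive entries use forward slashes, e.g. "tests/test_cases.yml".
        return any(name.startswith("tests/") for name in archive.namelist())
```

An archive whose entries all sit one level deeper (for example `project/tests/...`) fails this check, which is the situation the note above warns about.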

The way that the test cases are executed during a run is the following:

  1. All test cases in a test suite are loaded and set to run.

  2. All test cases both with and without a golden dialog start executing:

  • The test cases with golden dialogs are asserted and validated normally.

  • Test cases without golden dialogs are not asserted and are marked as unavailable.

When loading test cases, Testing Studio does not distinguish between scenarios created by users inside scenarios.yml, individual test cases or test cases in test_cases.yml. If a generated test case is missing a golden dialog file (no matter if moved accidentally or deleted), it is treated as unavailable.

If the execution of an unavailable test case fails because of the other side (DiaManT closed the dialog unexpectedly), the test case is considered failed.

3. After execution has finished:

  • Unavailable test cases are stored in the output/Unavailable folder under the test suite directory.

  • Failed test cases are stored in the output/Failed folder under the test suite directory.

  • The dialog that has executed the most steps is the one that gets written in the file. The equivalent validation error message is reported.

  • Report files contain successful, failed as well as unavailable test cases. All of them can be downloaded as Artifacts.

Download Test Run Artifacts

The test run details created after the run has been completed are called Artifacts. Testing Studio allows for downloading the artifacts to check the outcome of the run in detail.

To download artifacts of a selected test run, follow the steps below:

  1. Open a project.

  2. Click on a test run to open the run page.

3. Click Download Artifacts. The download starts.

Artifacts are downloaded as a .ZIP file containing a results folder with two CSV files.

  • test_suites_report.csv: results report for each test suite. For example:

CODE
suite name,     total,  passed, failed, unavailable,  status
Test suite 1,   10,     10,     0,      0,             pass
Test suite 2,   20,     0,      10,     10,            fail
....
total,          30,     10,     10,     10,            fail

  • test_cases_report.csv: results report for each test case. For example:

CODE
test suite, test case,  status,        dialog id,  golden dialog id,  message
suite 1,    dialog 1,   pass,          dialogid1,  1234,
suite 1,    dialog 2,   unavailable,   dialogid2,  Unavailable,
suite 2,    dialog 3,   fail,          dialogid3,  123,               error message
...
suite x,    dialog x,   fail,          dialogidx,  12345,             error message

  • Output folder: created after each test run for every test suite. The output folder contains candidate golden dialog JSON files for failed test cases and for unavailable ones (those that have no golden dialogs):

CODE
test_suite
└── output
   ├── Failed
   └── Unavailable

Before every new run, the output folder gets deleted. Make sure you handle generated data before starting a new test case execution.
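
Because the reports are plain CSV, they can be post-processed with a short script, for example to list the test cases that need attention. The sketch below is an illustration only. It uses the column names from the test_cases_report.csv sample above and an inline sample string; the function names are hypothetical, and a real script would read the file from the extracted artifacts instead.

```python
import csv
import io
from collections import Counter

# Hypothetical sample in the documented column layout.
SAMPLE = """\
test suite, test case, status, dialog id, golden dialog id, message
suite 1, dialog 1, pass, dialogid1, 1234,
suite 1, dialog 2, unavailable, dialogid2, Unavailable,
suite 2, dialog 3, fail, dialogid3, 123, error message
"""


def load_cases(text: str) -> list[dict[str, str]]:
    """Parse the report; padding spaces after commas are ignored."""
    reader = csv.DictReader(io.StringIO(text), skipinitialspace=True)
    return [
        {key.strip(): (value or "").strip()
         for key, value in row.items() if key is not None}
        for row in reader
    ]


def summarize(cases: list[dict[str, str]]):
    """Count statuses per suite and collect every case that did not pass."""
    per_suite: dict[str, Counter] = {}
    follow_up = []
    for case in cases:
        per_suite.setdefault(case["test suite"], Counter())[case["status"]] += 1
        if case["status"] != "pass":
            follow_up.append(
                (case["test suite"], case["test case"], case["message"])
            )
    return per_suite, follow_up


per_suite, follow_up = summarize(load_cases(SAMPLE))
for suite, counts in per_suite.items():
    print(suite, dict(counts))
for suite, name, message in follow_up:
    print(f"check {suite} / {name}: {message or 'no golden dialog'}")
```

Running such a script before starting the next run is one way to handle the generated data while the output folder still exists.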

Generation History Tab

Generate Golden Dialog

A Golden Dialog is a dialog from the application that is generated as a JSON file containing all of its utterances and can be used for testing purposes. Once generated, a Golden Dialog can be used to generate test cases. After a test runs, the Golden Dialog is compared with the dialog produced by the run to identify the differences.

To generate a Golden Dialog, proceed as follows:
1. Select the dialog you want to generate a Golden Dialog for, and copy its ID. For example, you can find it in the Insights tab.

2. Paste it into the Generate Golden Dialog field, and click the +Generate Golden button to generate and download the Golden Dialog’s JSON file.

Generate Test Cases

The Generation History displays the current and historical test case generations of your project. It allows you to download your generated test cases and add them to your test suites.

To generate test cases, follow the steps below:

  1. Open the Generation History tab and click the +Generate button.

2. Set the necessary parameters by filling in the fields.

  • Group: Group which serves as a tag. It is assigned to all the test cases being generated.

  • Simulation Data: Simulation data which is included in the golden dialog. It will be automatically inserted in the generated test cases. Available values are:

    • Ani

    • Dnis

  • Overwrite existing test case file: By default, the existing test cases are not replaced by new ones. To replace the existing test cases with the newly generated ones, switch the toggle on.

3. Click Generate. The test case generation starts. The generated test cases are displayed in the Generation History tab.

  • Generation ID: Test case unique identifier

  • Status: Test case generation status. Available options:

    • Success: The test case generation was successful

    • Running: The test case generation is currently running

    • Pending: The test case generation is pending

    • Revoked: The test case has been cancelled

    • Failed: The test case generation has failed to run

  • Created: The creation timestamp (starting date)
    dd/MM/yyyy - HH:mm, e.g. 21/07/2020 - 17:32

  • Finished: The completion timestamp (ending date)
    dd/MM/yyyy - HH:mm, e.g. 21/07/2020 - 17:39

4. To download generated test cases and add them to your repository, select a test case by clicking on it and click the Download Test Cases button. The test cases are downloaded locally as a ZIP file.

Insights Tab

The Insights tab offers an intuitive representation of the test results of your project.

By default, the results of the 10 latest runs are shown. You can change this option by selecting 20 or 30 latest runs from the dropdown list:

Hover over the graph to see percentage information:

By default, the following test cases are included in the graph results:

  • Passed: Successfully executed test cases

  • Failed: Test cases that failed to run

  • Unavailable: Test cases generated by Testing Studio and having no golden dialogs

Pending and Revoked test cases are not included in these statistics.

The graph gives you a visual idea of the test execution trend.

The goal to strive for is to have all test runs green (successfully passed) and none grey or red.

You can exclude test cases with a particular status from the results by clicking that status, as shown below:

As you can see, the Passed test cases are not included.

Under the graph you can see the test run list. To get detailed information about a test run, click on it and the corresponding Test Run page opens.

Test Run Details

The page header contains detailed information about the selected test run.

  • Run number: Test run ID number

  • Created at: Test run creation timestamp

  • Test Duration: Test run duration

  • Finished at: Test run completion timestamp

  • Status: Test run status

  • Download Artifacts: This button downloads the Artifacts. For details, see the Download Test Run Artifacts section.

The graph in the header shows brief statistics on the executed test cases:

  • Passed: The percentage of successfully executed test cases

  • Failed: The percentage of test cases that failed to run

  • Unavailable: The percentage of test cases having no golden dialogs
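
As a worked illustration of these percentages, the sketch below derives them from the passed, failed, and unavailable counts of a run. The function name is hypothetical, and the rounding to one decimal place is an assumption; the source does not state how Testing Studio rounds the displayed values.

```python
def status_percentages(passed: int, failed: int, unavailable: int) -> dict[str, float]:
    """Share of each status among the executed test cases, in percent."""
    total = passed + failed + unavailable
    if total == 0:
        # No executed test cases: report zeros instead of dividing by zero.
        return {"Passed": 0.0, "Failed": 0.0, "Unavailable": 0.0}
    return {
        "Passed": round(100 * passed / total, 1),
        "Failed": round(100 * failed / total, 1),
        "Unavailable": round(100 * unavailable / total, 1),
    }


# Counts taken from the "Test suite 2" row of the sample suites report.
print(status_percentages(passed=0, failed=10, unavailable=10))
```

For the counts of the "Test suite 2" sample row (0 passed, 10 failed, 10 unavailable), this gives 0% passed, 50% failed, and 50% unavailable.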

Tasks Tab

The Tasks tab displays the list of tasks performed during a test run.

  • Name: Names of the tasks performed during a test run. Each task has a detailed log, which is available in the Logs tab. The following tasks are usually executed:

    • Clone repository

    • Load configuration

    • Load test suites

    • Run test suites

    • Aggregate results

  • Status: Status of a task

  • Created: Task creation timestamp

  • Finished: Task completion timestamp

Results Tab

The Results tab displays information about all test cases (Succeeded, Failed, Unavailable) for each test suite.

The same information is contained in the output folder of artifacts.

Click on a selected test suite to unfold the information about the test cases. Then click on a test case to get the test result information:

The test result pop-up window may contain the following information:

  • Dialog ID: The dialog ID number

  • Golden Dialog ID: The golden dialog ID number

  • Assertion Errors: Errors that occurred while comparing the expected outcome with the returned outcome. The following error information is available:

    • Step: The step where the error was detected

    • Assertion: The assertion type

    • Found: The returned result (the result produced by the system)

    • Expected: The expected result (the result found in the golden dialog)

  • General Errors: Errors that occurred due to system failures, e.g. the dialog ended unexpectedly. The following error information is available:

    • Step: The step where the error was detected

    • Assertion: The assertion type

    • Found: The returned result (the result produced by the system)

    • Expected: The expected result (the result found in the golden dialog)

  • Warnings: May contain warnings, e.g. a delay warning due to high system load

Logs Tab

Tests are run outside Testing Studio by distributed workers. The Logs tab allows you to see detailed logs of each task executed during a test run.

The tasks' names are marked in green. The task logs are written in white.

The information in the Logs tab becomes available right after the test run starts and is updated in real time.

Explore

The Explore feature lets you simulate a conversation between your dialog application and an LLM to explore your application’s boundaries and identify possible issues or areas that are not yet supported. This helps you prevent potential problems before the dialog application is used in a real conversation.

Start Exploration

To create an exploration session, follow the steps below:

  1. Click the New Exploration button in the upper right.

  2. Fill in the form and click the Explore button when finished:

  • Select Domain: choose the domain from the dropdown list. For example, pick Banking (mandatory).

  • Adjust Formality: select the formality level from the dropdown list. The formality level can be Informal, Normal, or Formal (mandatory).

  • Key-Value: add a Key-Value pair to provide additional information (optional).

  • Scenarios: define the number of scenarios the LLM will run to check the dialog application.

  3. After you click the Explore button, it might take some time for the LLM and the dialog application to communicate.

  4. Once finished, you are redirected to the Exploration Results chart.

Exploration Results

The Exploration Results provide a summary of the exploration sessions, offering detailed information about the conversation between the LLM and your application in a chart view. It allows you to assess the comprehensiveness of the dialog application and identify any potential failure reasons.

Dialogs List

The Dialogs List comprises the conversations that have taken place between your dialog application and the LLM.

In this section you can perform the following actions:

  • Click the View Conversation Details button to review the actual conversational steps between your dialog application and the LLM, see the conversation status (Success or Failure), and read the detailed failure description with suggestions.

  • Download the conversation by hitting the Download Conversation button.

Areas of Improvement

The Areas of Improvement chart delivers a report on areas where enhancements can be made to enrich your conversational experience. In the future, the user will be able to deploy an application that supports the areas of improvement that the tool has identified.

The Deploy App button is currently disabled.

Log out from Testing Studio

To log out from Testing Studio, click the User icon and select Log out:
