Overview of what datasets are and how they can be used in chatbot conversations. Learn how to add contextual information to your chatbot.

A dataset is a structured collection of data that can be used to provide additional context and information to a chatbot. It is a way for chatbots to access relevant data and use it to generate responses based on user input. A dataset can include information on a variety of topics, such as product information, customer service queries, or general knowledge.

Chatbots access datasets as needed during a conversation. The chatbot can retrieve specific data points or use the data to generate responses based on user input and the data. For example, if a user asks a chatbot about the price of a product, the chatbot can use data from a dataset to provide the correct price.

To use a dataset, specify the dataset ID when starting a conversation with a chatbot. Only one dataset is allowed per conversation. The number of datasets you can have is determined by your monthly membership or subscription plan. If you need more datasets, you can upgrade your plan or contact customer service for more information.
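
As a rough illustration of the one-dataset-per-conversation rule, the sketch below builds a request body for starting a conversation. The field name `datasetId`, the function name, and the payload shape are assumptions made for illustration, not the documented API; check the API reference for the actual request format.

```python
from typing import Optional


def build_conversation_payload(dataset_id: Optional[str] = None) -> dict:
    """Build a hypothetical request body for starting a conversation.

    The "datasetId" key is an assumed field name, not a confirmed API field.
    """
    payload: dict = {}
    if dataset_id is not None:
        if not isinstance(dataset_id, str) or not dataset_id.strip():
            raise ValueError("dataset_id must be a non-empty string")
        # A conversation accepts a single dataset, so the payload holds
        # one ID, never a list of IDs.
        payload["datasetId"] = dataset_id.strip()
    return payload


# "my-dataset-id" is a placeholder, not a real dataset ID.
payload = build_conversation_payload("my-dataset-id")
print(payload)
```

Keeping the dataset as a single optional value, rather than a list, mirrors the limit of one dataset per conversation.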

How to create a Dataset

Follow these instructions to create a new dataset.

  1. Go to "Datasets" from the navigation bar.
  2. Click the "Create Dataset" button.
  3. Name your dataset and provide a description.
  4. Save the dataset by clicking on the "Create" button.

Advanced Options

There are several advanced options you can configure.

Option | Description
Match Instruction | Optional bot instruction to use when a suitable dataset record match is found.
Mismatch Instruction | Optional bot instruction to use when no suitable dataset records are found.
Dataset Visibility | Specify whether to make your dataset public or keep it private. Public datasets can be found and used by the community.
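
To make the Match Instruction and Mismatch Instruction options more concrete, here is a minimal sketch of how one of the two might be selected. The keyword-overlap search, the function names, and the sample records are all assumptions made for illustration; the platform's actual matching logic is not described in this document.

```python
import re
from typing import List, Optional


def find_matching_records(query: str, records: List[str], min_overlap: int = 2) -> List[str]:
    """Toy search: keep records that share at least min_overlap words with the query."""
    query_words = set(re.findall(r"\w+", query.lower()))
    matches = []
    for record in records:
        record_words = set(re.findall(r"\w+", record.lower()))
        if len(query_words & record_words) >= min_overlap:
            matches.append(record)
    return matches


def choose_instruction(
    matches: List[str],
    match_instruction: Optional[str],
    mismatch_instruction: Optional[str],
) -> Optional[str]:
    """Pick the match instruction when records were found, else the mismatch one.

    Both instructions are optional, so the result may be None.
    """
    return match_instruction if matches else mismatch_instruction


# Made-up sample records, for illustration only.
records = [
    "The Acme kettle costs $25.",
    "Support is open Monday to Friday.",
]
matches = find_matching_records("What is the price of the Acme kettle?", records)
instruction = choose_instruction(
    matches,
    match_instruction="Answer using only the matched records.",
    mismatch_instruction="Say that you do not have that information.",
)
print(instruction)
```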

How to create a Dataset Record

You now have an empty dataset with no records. To add a record, follow these steps.

  1. With your dataset selected, click on the "Create Record" button.
  2. Enter the record text, keeping an eye on the total token count (see Limits).
  3. Save the new dataset record by clicking on the "Create" button.

Dataset Record Splitting

If you have more than one paragraph in your dataset record, you may wish to split it into multiple records. This is not always necessary, but it can help keep your dataset organized.

To do so, press the "Create N Records" button. The record will be split into multiple records based on the paragraph breaks in the original record.
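
If you prepare record text outside the app, the same idea can be sketched in a few lines. This is only an approximation of the behavior described above: it assumes a paragraph break means one or more blank lines, and the function name is hypothetical.

```python
import re
from typing import List


def split_into_records(text: str) -> List[str]:
    """Split record text into one record per paragraph.

    A paragraph break is assumed to be one or more blank lines.
    """
    paragraphs = re.split(r"\n\s*\n", text.strip())
    # Drop empty pieces and trim stray whitespace around each paragraph.
    return [p.strip() for p in paragraphs if p.strip()]


text = "First paragraph.\n\nSecond paragraph.\n\n\nThird paragraph."
for record in split_into_records(text):
    print(record)
```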

Dataset Record Autocomplete

Populating a dataset can be hard, especially when you do not have readily available data. The Record Autocomplete feature helps with this: as you type, press CTRL+Enter (or ⌘+Enter on Mac) to complete the text using the same models that power your chatbot.

Dataset Record Importing

You can import a dataset record from a web page or a document. To do so, press the "Import" button. To import a web page, type in its address. To import a document, select it from your file system. Then click the "Import" button.

Note that large web pages and documents are split into multiple records. This is necessary because a record cannot exceed 400 tokens.
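
The splitting described above can be pictured with a short sketch. It assumes a token is roughly one whitespace-separated word, which is only a stand-in for the real tokenizer, and the function names are hypothetical; the importer's actual splitting strategy is not documented here.

```python
from typing import List

MAX_TOKENS = 400  # record size limit stated in this document


def count_tokens(text: str) -> int:
    """Rough stand-in for a tokenizer: counts whitespace-separated words."""
    return len(text.split())


def chunk_text(text: str, max_tokens: int = MAX_TOKENS) -> List[str]:
    """Split text into consecutive chunks of at most max_tokens words each."""
    if max_tokens < 1:
        raise ValueError("max_tokens must be at least 1")
    words = text.split()
    return [
        " ".join(words[i:i + max_tokens])
        for i in range(0, len(words), max_tokens)
    ]


document = " ".join(["word"] * 950)  # synthetic 950-word document
chunks = chunk_text(document)
print([count_tokens(chunk) for chunk in chunks])
```

Real tokenizers often produce more tokens than there are words, so a lower `max_tokens` value leaves a safety margin when preparing text by hand.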

Limits

The following limits apply.

Record Size

The maximum record size is 400 tokens. As you approach this limit, the token count changes color from amber to red. Keep individual dataset records small and on topic, as this significantly improves the overall chat quality. We reserve the right to change this limit in the future.

Summary

In summary, datasets are structured collections of data that can be used to provide additional context and information to a chatbot. Chatbots can use datasets to retrieve specific data points or generate responses based on user input and the data. You can create and customize your own datasets to suit the needs of your chatbot and your users, and you can use one in a conversation by specifying its dataset ID when the conversation starts. The number of datasets you can have is determined by your monthly membership or subscription plan.