Overview of what datasets are and how they can be used in chatbot conversations. Learn how to add contextual information to your chatbot.

A dataset is a structured collection of data that can be used to provide additional context and information to your AI bot. It is a way for bots to access relevant data and use it to generate responses based on user input. A dataset can include information on a variety of topics, such as product information, customer service queries, or general knowledge.

Bots access datasets as needed during a conversation. A bot can retrieve specific data points or combine the data with the user's input to generate a response. For example, if a user asks about the price of a product, the bot can use data from a dataset to provide the correct price.

To access a dataset, you must specify the dataset ID when starting a conversation with a bot. Only one dataset is allowed per conversation. The number of datasets you can have is determined by your monthly membership or subscription plan. If you need more datasets, you can upgrade your plan or contact customer service for more information.
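
As a rough illustration of the one-dataset-per-conversation rule, the sketch below builds a request body for starting a conversation. The function name `build_conversation_request` and the field names `botId` and `datasetId` are assumptions made for this example, not confirmed API names. Check the API reference for the actual request shape.

```python
from typing import Optional


def build_conversation_request(bot_id: str, dataset_id: Optional[str] = None) -> dict:
    """Build a hypothetical request body for starting a conversation.

    The field names "botId" and "datasetId" are illustrative assumptions.
    """
    if not bot_id:
        raise ValueError("bot_id is required")

    body = {"botId": bot_id}
    # Only one dataset can be attached to a conversation, so the parameter
    # is a single ID rather than a list of IDs.
    if dataset_id is not None:
        body["datasetId"] = dataset_id
    return body
```

Modelling the dataset as a single optional value, rather than a list, makes the one-dataset limit visible in the function signature.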

How to create a Dataset

Follow these instructions to create a new dataset.

  1. Go to "Datasets" from the navigation bar.
  2. Click the "Create Dataset" button.
  3. Name your dataset and provide a description.
  4. Save the dataset by clicking on the "Create" button.

Advanced Options

There are several advanced options you can configure.

Option | Description
Record Max Tokens | The maximum number of tokens to use for new records. This value is only taken into account when importing data from files and integrations.
Search Max Records | The maximum number of records to return for each dataset search.
Search Max Tokens | The maximum number of tokens to use for all found dataset records combined. This value should be at least as large as Record Max Tokens so that a single record can fit.
Match Instruction | Optional bot instruction to use when a suitable dataset record match is found.
Mismatch Instruction | Optional bot instruction to use when no suitable dataset records are found.
Dataset Visibility | Specify whether you want to make your dataset public or keep it private. Public datasets can be found and used by the community.
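
To make the interaction between the search limits and the two instructions more concrete, here is a minimal sketch of how they might be applied to a ranked list of search results. It is not the product's actual implementation: the `SearchOptions` class, the `select_records` function, the whitespace-based `count_tokens` helper, and the choice to stop at the first record that does not fit are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class SearchOptions:
    search_max_records: int
    search_max_tokens: int
    match_instruction: Optional[str] = None
    mismatch_instruction: Optional[str] = None


def count_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: one token per whitespace-separated word.
    return len(text.split())


def select_records(
    ranked: List[str], options: SearchOptions
) -> Tuple[List[str], Optional[str]]:
    """Take records in ranked order until either search limit is reached."""
    selected: List[str] = []
    used = 0
    for record in ranked:
        if len(selected) >= options.search_max_records:
            break
        cost = count_tokens(record)
        # Assumption: stop at the first record that would exceed the token budget.
        if used + cost > options.search_max_tokens:
            break
        selected.append(record)
        used += cost

    # Use the match instruction when something was found, otherwise the mismatch one.
    instruction = options.match_instruction if selected else options.mismatch_instruction
    return selected, instruction
```

Under these assumptions, a record that is larger than Search Max Tokens is never selected, which is why Search Max Tokens should be at least as large as Record Max Tokens.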

Files

Datasets can have attached files, which can provide additional information and context to the chatbot. These files are automatically split into records, ensuring that the dataset stays organized and up to date. Whenever the files change, the corresponding dataset records are kept in sync, ensuring that the chatbot's responses are always based on the most recent information.

The following file types are supported.

File Type | Description
text (.txt) | Plain text file
markdown (.md) | Markdown formatted file
csv (.csv) | Comma-separated values file
JSON (.json) | JavaScript Object Notation file
JSONL (.jsonl) | JSON Lines file
DOCX (.docx) | Microsoft Word document file
PDF (.pdf) | Portable Document Format file
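
If you prepare files with a script before attaching them, a small check against the list above can catch unsupported types early. The helper below is a sketch: `is_supported_file` is a hypothetical name, and the case-insensitive comparison is an assumption about how extensions are treated.

```python
from pathlib import Path

# Extensions taken from the supported file types listed above.
SUPPORTED_EXTENSIONS = {".txt", ".md", ".csv", ".json", ".jsonl", ".docx", ".pdf"}


def is_supported_file(filename: str) -> bool:
    """Return True if the file's extension is in the supported list."""
    # Compare case-insensitively; whether the product does the same is an assumption.
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS
```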

How to create a Dataset Record

You now have a dataset, but it does not contain any records yet. Creating records is just as easy.

  1. With your dataset selected, click on the "Create Record" button.
  2. Specify the record text, keeping an eye on the total token count.
  3. Save the new dataset record by clicking on the "Create" button.

Dataset Record Splitting

If a dataset record contains more than one paragraph, you may wish to split it into multiple records. This is not always necessary, but it can help keep your dataset organized. Splitting is done automatically for you based on your dataset parameters.

If you use URL importing or you wish to enter the record manually, there are some additional options. Simply enter or import the record, then click the "Create N Records" button. The record will be split into multiple records based on the paragraph breaks in the original record.
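
The paragraph-based splitting described above can be approximated in a few lines. This is a sketch rather than the product's actual logic: `split_into_records` is a hypothetical helper, and it assumes that a paragraph break means one or more blank lines.

```python
import re
from typing import List


def split_into_records(text: str) -> List[str]:
    """Split text into one record per paragraph."""
    # Assumption: a paragraph break is one or more blank lines.
    paragraphs = re.split(r"\n\s*\n", text.strip())
    # Drop empty fragments and trim surrounding whitespace.
    return [p.strip() for p in paragraphs if p.strip()]
```

Single line breaks inside a paragraph are preserved, so a multi-line paragraph still becomes one record.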

Dataset Record Autocomplete

We know that populating your dataset can be hard, especially when you do not have readily available data. This is why we introduced the Record Autocomplete feature. As you type, press CTRL+Enter (or ⌘+Enter on a Mac) to complete the text using the same generative AI models that power your chatbot.

Dataset Record Importing

You can import a dataset record from a web page or a document. To do so, click the "Import" button. To import a web page, type in its address. To import a document, select it from your file system. Then click the "Import" button.

Summary

In summary, datasets are structured collections of data that can be used to provide additional context and information to a chatbot. Chatbots can use datasets to retrieve specific data points or generate responses based on user input and the data. You can create and customize your own datasets to suit the needs of your chatbot and your users, and you can access a dataset by specifying its ID when starting a conversation with a chatbot. The number of datasets you can have is limited by your monthly membership or subscription plan.