Learn how to import a website's information into your dataset with ChatBotKit's Sitemap feature. Automatically summarise long pages using AI, and access important information easily from your chatbot.

With ChatBotKit's Sitemap feature, you can easily import a website's information into your dataset by simply providing the website's URL.

Step-by-step Guide

To integrate ChatBotKit's Sitemap feature into your dataset, follow these simple steps:

  1. Navigate to "Integrations" in ChatBotKit and click Website Importer.
  2. Enter a name and optional description for this integration.
  3. Select the dataset you want to import information into.
  4. Enter the website URL. You can also provide a sitemap.xml URL to be more specific.
  5. Save the integration by clicking the "Create" button.

There are several advanced options worth considering. You can find them under "Advanced Options".

  1. Glob Patterns: Glob patterns allow you to target specific pages for integration. For instance, if you aim to synchronize only the documentation located at /docs, you should set the glob pattern to /docs/**. This field supports multiple entries, enabling the inclusion of various patterns or the exclusion of specific ones using negative globs (prefixed with !). Negative globs override positive ones, offering a way to exclude particular URL patterns and refine your selection criteria.
  2. Selectors: Use CSS selectors to narrow the importer's focus to designated areas of your website, ensuring only the relevant sections are captured. Additionally, you can enter special selectors such as jsonld (to extract structured data) and skiphtml (to skip HTML) to further fine-tune your import process.
  3. JavaScript Rendering: Activating this option enables the importer to operate with the capabilities of a full browser, essential for capturing content from websites rich in dynamic elements and scripts. This ensures comprehensive content capture, including AJAX-loaded content and other dynamic interactions.
  4. Sync Schedule: This setting controls how often your data is synchronized. You can set the schedule to never, hourly, daily, weekly, or monthly, according to your needs. Choosing "never" pauses automatic syncing, while the time-based options keep your data regularly updated, aligning the integration with your content update cycles and data freshness requirements.
  5. Expires In: This setting allows for the automatic expiration of outdated records, which is particularly beneficial for frequently updated websites. By specifying an expiration period, you can ensure that only the most current records are retained, enhancing data accuracy and relevance.
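To build intuition for how positive and negative globs interact, here is a minimal sketch of the matching logic described above, written in Python with the standard-library fnmatch module. This is an illustrative approximation: ChatBotKit's actual glob matcher may differ in its handling of `**` and other edge cases.

```python
from fnmatch import fnmatch

def url_allowed(path, patterns):
    """Return True if path matches at least one positive glob
    and no negative glob (patterns prefixed with '!')."""
    positives = [p for p in patterns if not p.startswith("!")]
    negatives = [p[1:] for p in patterns if p.startswith("!")]
    # Negative globs override positive ones.
    if any(fnmatch(path, n) for n in negatives):
        return False
    return any(fnmatch(path, p) for p in positives)

patterns = ["/docs/**", "!/docs/internal/**"]
print(url_allowed("/docs/getting-started", patterns))  # True
print(url_allowed("/docs/internal/notes", patterns))   # False
print(url_allowed("/blog/post", patterns))             # False
```

With this pattern set, only pages under /docs are synchronized, except anything under /docs/internal, which the negative glob excludes.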

Once the Sitemap integration is created, ChatBotKit will automatically import the information from the website into your selected dataset.

Imported Dataset Records

To access the information imported from the website, navigate to the dataset you selected during Step 3. This dataset is the repository where all the imported content is stored, organized and structured so it is easy to browse. The imported records, whether text or any other kind of data extracted from the website, can be used to train your chatbot, enabling it to respond to user queries more effectively and accurately.

Structured Data Importing

When you opt to use the jsonld selector, the sitemap importer will also bring in structured data. This is particularly beneficial when importing e-commerce websites, where product information is key, because all the relevant structured data is captured in one pass. The importer is also versatile and customizable: combining jsonld with the skiphtml selector lets you bypass the regular HTML content and concentrate solely on the structured data, further enhancing the specificity and efficiency of your import.
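Structured data of this kind typically lives in script tags of type application/ld+json embedded in the page. The sketch below, using only Python's standard library, shows what extracting that data involves; the page markup and product fields are hypothetical, and ChatBotKit's own jsonld extraction may work differently.

```python
import json
from html.parser import HTMLParser

class JsonLdExtractor(HTMLParser):
    """Collect the JSON bodies of <script type="application/ld+json"> tags."""
    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            # Parse the accumulated script body as JSON-LD.
            self.items.append(json.loads("".join(self._buffer)))
            self._buffer = []
            self._in_jsonld = False

# A hypothetical e-commerce page with schema.org Product markup.
html = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Product",
 "name": "Example Widget",
 "offers": {"price": "19.99", "priceCurrency": "USD"}}
</script>
</head><body><p>Regular page content</p></body></html>
"""

parser = JsonLdExtractor()
parser.feed(html)
print(parser.items[0]["name"])  # Example Widget
```

The jsonld selector gives you records like the one above (product name, price, currency) rather than the surrounding page prose, which is why it pairs well with skiphtml for catalog-style sites.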

Events Tracking

Within each integration configuration page, you'll find the "Sitemap Integration Events" section. This area provides a comprehensive overview of all events related to your integration, including detailed information on the URLs covered by the crawler. This feature is invaluable for monitoring and analyzing the scope and success of your integration efforts, offering insights into the extent of your website's content that has been successfully captured and integrated.


Please note that there are some limitations to the Sitemap feature. Currently, a crawl is limited to a maximum of 15 minutes and the maximum number of URLs that can be crawled is 1000. If you need to crawl more than 1000 URLs or require a longer crawl time, please contact our customer support team for advice on how to create a custom solution.