Extract
The ChatBotKit platform provides a Data Extraction integration that lets you pull contextually relevant information from conversations based on a predefined JSON schema. The integration populates the conversation metadata so the data can be used more efficiently in subsequent steps, such as customer support, transcription, and data analytics.
This integration empowers AI chatbots to not only interact autonomously with users but also to extract key pieces of information from the conversation. After the conversation ends or goes idle, the bot uses the provided JSON schema to extract data, consequently enriching the conversation metadata.
How to Use the Data Extraction Integration
- Log in to your ChatBotKit account and navigate to the "Integrations" tab.
- Expand "More Integrations" and select the "Data Extraction" integration.
- Specify a name and optional description for the integration.
- Provide a custom JSON schema that your chatbot will use for data extraction.
Once the integration is set up, your AI chatbot will automatically extract data from conversations according to the specified JSON schema. This data will be used to populate the conversation metadata.
Example Schema
Consider a scenario where you're running an e-commerce platform that sells various types of electronics. You want your chatbot to extract the customer's name, email, the product they are interested in, and any specific questions or issues they have about the product.
Here is an example of a JSON schema that could be used for this purpose:
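As a minimal sketch, assuming the integration accepts standard JSON Schema keywords (type, properties, description), a schema for this scenario can be built and printed in Python. The property names are illustrative assumptions, not fields defined by ChatBotKit.

```python
import json

# Hypothetical schema for the e-commerce scenario. The property names
# are illustrative assumptions, not ChatBotKit-defined fields.
schema = {
    "type": "object",
    "properties": {
        "customerName": {
            "type": "string",
            "description": "The customer's full name",
        },
        "customerEmail": {
            "type": "string",
            "description": "The customer's email address",
        },
        "product": {
            "type": "string",
            "description": "The product the customer is interested in",
        },
        "question": {
            "type": "string",
            "description": "The customer's specific question or issue about the product",
        },
    },
}

# Serialize to JSON so it can be pasted into the integration's schema field.
schema_json = json.dumps(schema, indent=2)
print(schema_json)
```

Descriptive "description" values are included because they give the extraction step more context about what each field should contain.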
This schema instructs the chatbot to extract the customer's name, email, the product they are interested in, and their specific question or issue. Remember, the chatbot's backstory and conversation flow need to be designed in such a way that these pieces of information are naturally collected during the conversation.
Advanced Features
The advanced features section lets you configure how extracted data is delivered. You can specify either a simple URL or a more detailed request with custom headers; this configuration determines the destination for the extracted data. Once the chatbot has extracted the relevant information from the conversation according to your JSON schema, it automatically sends the data to the webhook you specified in the request configuration. This lets you integrate extraction with your existing systems and workflows and process the data in real time.
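On the receiving side, a webhook endpoint might look like the following sketch, which uses only the Python standard library. Several details are assumptions: that the data arrives as a JSON object in a POST body, that an Authorization header with a bearer token has been set as a custom header in the request configuration, and the names EXPECTED_TOKEN, parse_extraction_payload, is_authorized, and serve, which are all hypothetical.

```python
import hmac
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical token, assumed to be sent as a custom "Authorization" header.
EXPECTED_TOKEN = "replace-me"


def is_authorized(auth_header):
    """Compare the received header with the expected bearer token."""
    expected = f"Bearer {EXPECTED_TOKEN}".encode("utf-8")
    received = (auth_header or "").encode("utf-8")
    return hmac.compare_digest(received, expected)


def parse_extraction_payload(body: bytes) -> dict:
    """Decode the request body; a JSON object payload is an assumption."""
    data = json.loads(body.decode("utf-8"))
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    return data


class ExtractionWebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if not is_authorized(self.headers.get("Authorization")):
            self.send_response(401)
            self.end_headers()
            return
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = parse_extraction_payload(self.rfile.read(length))
        except ValueError:
            # Covers malformed JSON, bad encoding, and non-object payloads.
            self.send_response(400)
            self.end_headers()
            return
        print("received extracted data:", payload)
        self.send_response(204)
        self.end_headers()


def serve(port: int = 8080) -> None:
    """Run the receiver locally until interrupted."""
    HTTPServer(("127.0.0.1", port), ExtractionWebhookHandler).serve_forever()
```

Calling serve() starts the receiver on 127.0.0.1:8080. Returning a 2xx status only after the payload has been parsed means a malformed delivery is reported as a failure rather than silently accepted.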
Numeric Value Metrics Collection
The Extract integration can automatically track and analyze numeric values from your conversations. This feature provides valuable insights into the quantitative data being extracted from customer interactions.
Enabling Metrics Collection
To enable automatic metrics collection for specific numeric fields, add the collect: true property to those fields in your extraction schema:
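A minimal sketch of such a schema, built in Python and printed as JSON. Only the collect: true property comes from the description above; the field names and descriptions are illustrative assumptions for an order-related chatbot.

```python
import json

# Hypothetical order schema. Field names are illustrative assumptions;
# "collect": True marks the numeric fields to track as metrics.
schema = {
    "type": "object",
    "properties": {
        "customerName": {
            "type": "string",
            "description": "The customer's name",
        },
        "orderAmount": {
            "type": "number",
            "description": "Total order amount",
            "collect": True,
        },
        "quantity": {
            "type": "number",
            "description": "Number of items ordered",
            "collect": True,
        },
        "discountPercentage": {
            "type": "number",
            "description": "Discount applied, as a percentage",
            "collect": True,
        },
    },
}

# json.dumps renders Python's True as JSON's true.
schema_json = json.dumps(schema, indent=2)
print(schema_json)
```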
What Gets Tracked
The system automatically identifies and tracks numeric values from fields marked with collect: true:
- Monetary values: prices, amounts, costs
- Quantities: item counts, measurements, percentages
- Ratings and scores: customer satisfaction ratings, product scores
- Performance metrics: response times, conversion rates
Example
For an e-commerce support chatbot with the schema above:
Extracted Data:
Tracked Metrics:
- Order amount: 299.99 (tracked because collect: true)
- Quantity: 5 (tracked because collect: true)
- Discount percentage: 15.5 (tracked because collect: true)
The customer name is not tracked as a metric since it doesn't have collect: true in the schema.
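The selection rule described here can be illustrated with a short sketch. It is not ChatBotKit's implementation: collect_metrics is a hypothetical helper, and the field names and the customer name "Jane Doe" are placeholder assumptions. Only the three numeric values come from the example above.

```python
def collect_metrics(schema: dict, extracted: dict) -> dict:
    """Return numeric values for fields whose schema entry sets collect: true."""
    metrics = {}
    for name, spec in schema.get("properties", {}).items():
        value = extracted.get(name)
        # bool is a subclass of int in Python, so exclude it explicitly.
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if spec.get("collect") is True and is_number:
            metrics[name] = value
    return metrics


# Hypothetical schema and extracted data mirroring the example values.
schema = {
    "type": "object",
    "properties": {
        "customerName": {"type": "string"},
        "orderAmount": {"type": "number", "collect": True},
        "quantity": {"type": "number", "collect": True},
        "discountPercentage": {"type": "number", "collect": True},
    },
}
extracted = {
    "customerName": "Jane Doe",
    "orderAmount": 299.99,
    "quantity": 5,
    "discountPercentage": 15.5,
}

print(collect_metrics(schema, extracted))
# {'orderAmount': 299.99, 'quantity': 5, 'discountPercentage': 15.5}
```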
Analytics and Insights
The collected metrics enable you to:
- Track trends: Monitor changes in order values, quantities, or ratings over time
- Identify patterns: Discover peak ordering periods or common discount amounts
- Generate reports: Create business intelligence reports from conversation data
- Monitor performance: Track key business metrics directly from customer interactions
Benefits
- Business Intelligence: Turn conversation data into actionable business insights
- Trend Analysis: Identify patterns in customer behavior and preferences
- Performance Monitoring: Track key metrics automatically from customer interactions
- Data-Driven Decisions: Make informed decisions based on conversation analytics
This feature seamlessly integrates with your existing Extract integration workflow and requires no changes to your current setup.
Triggering Extraction on Historic Conversations
The Extract integration provides the ability to retroactively apply extraction to existing conversations. This is useful when you want to extract data from conversations that occurred before the integration was configured, or when you've updated your extraction schema and want to reprocess previous conversations.
Using the Trigger Feature
On your Extract integration page, you'll find a "Trigger on Last 20 Conversations" button above the metrics chart. This button allows you to:
- Process Recent Conversations: Apply your current extraction schema to the most recent 20 conversations
- Refresh Analytics: Update your metrics charts with newly extracted data
- Test Schema Changes: Validate schema modifications against existing conversation data
How It Works
When you trigger extraction on historic conversations:
- Conversation Selection: The system selects up to 20 of your most recent conversations
- Bot Filtering: If your integration is linked to a specific bot, only conversations from that bot are processed
- Queue Processing: Each conversation is queued for extraction using the same pipeline as real-time processing
- Automatic Updates: The metrics chart automatically refreshes to display newly extracted data
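The selection and queueing steps above can be sketched as follows. This is an illustration rather than the platform's actual code: the Conversation type and select_for_extraction function are hypothetical, and the sketch assumes the bot filter is applied before the limit of 20, which the description does not specify.

```python
from collections import deque
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Conversation:
    id: str
    bot_id: str
    created_at: int  # Unix timestamp


def select_for_extraction(
    conversations: List[Conversation],
    bot_id: Optional[str] = None,
    limit: int = 20,
) -> List[Conversation]:
    """Pick up to `limit` most recent conversations, optionally for one bot."""
    if bot_id is not None:
        # Assumption: filtering by bot happens before the limit is applied.
        conversations = [c for c in conversations if c.bot_id == bot_id]
    newest_first = sorted(conversations, key=lambda c: c.created_at, reverse=True)
    return newest_first[:limit]


# Fifty sample conversations alternating between two bots.
conversations = [
    Conversation(f"conv-{i}", "bot-a" if i % 2 == 0 else "bot-b", 1_700_000_000 + i)
    for i in range(50)
]

# Queue the selected conversations for extraction, newest first.
queue = deque(select_for_extraction(conversations, bot_id="bot-a"))
print(len(queue), queue[0].id)
# 20 conv-48
```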
Benefits
- Historical Data Recovery: Extract valuable data from conversations that predate your integration setup
- Schema Testing: Validate new extraction schemas against real conversation data
- Analytics Refresh: Update your metrics and charts after making schema changes
- Data Completeness: Ensure comprehensive data extraction across all your conversations
Usage Notes
- The trigger processes conversations using your current extraction schema configuration
- Each conversation will have its metadata updated with the newly extracted data
- The feature respects your integration's bot filtering settings
- Chart data refreshes automatically within 30 seconds of processing completion
This capability enables you to maintain complete historical data while continuously improving your extraction schemas.
Caveats
While the Data Extraction integration is powerful, it's important to design your JSON schema carefully. An inaccurate or inappropriate schema can lead to incomplete or incorrect data extraction. Test your JSON schema thoroughly against a variety of conversation scenarios to make sure it extracts the intended data accurately.
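One way to support this kind of testing is to check each extracted record against the schema and list anything missing or of the wrong type. The sketch below is a simplified check, not a full JSON Schema validator; find_problems and the sample field names are hypothetical.

```python
# Maps a few JSON Schema type names to Python types (not exhaustive).
JSON_TYPES = {
    "string": str,
    "number": (int, float),
    "integer": int,
    "boolean": bool,
}


def find_problems(schema: dict, extracted: dict) -> list:
    """List fields that are missing or whose value has an unexpected type."""
    problems = []
    for name, spec in schema.get("properties", {}).items():
        if name not in extracted:
            problems.append(f"{name}: missing")
            continue
        json_type = spec.get("type")
        expected = JSON_TYPES.get(json_type)
        if expected is None:
            continue  # type not covered by this sketch
        value = extracted[name]
        # bool is a subclass of int, so reject it unless a boolean is expected.
        wrong_bool = isinstance(value, bool) and json_type != "boolean"
        if wrong_bool or not isinstance(value, expected):
            problems.append(
                f"{name}: expected {json_type}, got {type(value).__name__}"
            )
    return problems


# Hypothetical schema and one extracted record from a test conversation.
schema = {
    "type": "object",
    "properties": {
        "customerName": {"type": "string"},
        "customerEmail": {"type": "string"},
        "quantity": {"type": "number"},
    },
}
extracted = {"customerName": "Jane Doe", "quantity": "five"}

print(find_problems(schema, extracted))
# ['customerEmail: missing', 'quantity: expected number, got str']
```

Running a check like this over records from several test conversations shows which fields the schema fails to capture reliably.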
FAQ
Do you retry failed requests?
Yes. All failed requests are retried up to 5 times. Request attempts are recorded in the integration logs. The delay between each retry is calculated based on the following formula: