ChatBotKit supports a variety of models for generating responses, including base OpenAI models from the GPT-4 and GPT-3.5 families and Anthropic's Claude models. Additionally, ChatBotKit offers several in-house models, such as text-algo-002 and text-algo-003, for its general-purpose assistant.

Below is a table summarizing the different models, including their name, a short description, token ratio, and context size (the maximum number of tokens).

| Model Name | Short Description | Token Ratio | Context Size |
| --- | --- | --- | --- |
| gpt-4-turbo (beta) | Similar to gpt-4, this model belongs to the GPT-4 family. It has the same advantages as gpt-4, but it is also faster and cheaper. | 1 | 135168 |
| gpt-4-next | Similar to gpt-4, this model belongs to the GPT-4 family. It has the same advantages and disadvantages as gpt-4, but it also serves as a proxy to the 'gpt-4-0613' model. | 3 | 8192 |
| gpt-4 | Newest model with enhanced capabilities for conversational use. | 3 | 8192 |
| gpt-3.5-turbo-next | This model is similar to gpt-3.5-turbo and serves as a proxy to the 'gpt-3.5-turbo-0613' model. | 0.3 | 4096 |
| gpt-3.5-turbo-16k | This model is based on the GPT-3.5 architecture and has a larger token limit of 16,000. It is more cost-effective with a lower token ratio but sacrifices some quality compared to higher-cost models. | 0.6 | 16394 |
| gpt-3.5-turbo-instruct | This is an instructGPT-style model, trained similarly to text-davinci-003. | 0.3 | 4096 |
| gpt-3.5-turbo | Newest model with enhanced capabilities for conversational use. | 0.3 | 4096 |
| claude-instant-v1 (beta) | Claude Instant is Anthropic's faster, lower-priced yet very capable LLM. | 0.08 | 100000 |
| claude-v2.1 (beta) | Claude 2.1 is Anthropic's latest large language model (LLM), with an industry-leading 200K-token context window, reduced hallucination rates, and improved accuracy over long documents. | 0.8 | 200000 |
| claude-v2 (beta) | Claude 2.0 is a leading LLM from Anthropic that enables a wide range of tasks, from sophisticated dialogue and creative content generation to detailed instruction. | 0.8 | 100000 |
| text-algo-003 | In-house model optimized for general assistant use. | 3 | 8192 |
| text-algo-002 | In-house model optimized for general assistant use. | 0.3 | 4096 |
| text-qaa-001 | In-house model optimized for question-and-answer conversations. | 0.3 | 4096 |

The complete list of active models and their configurations can be found here.

The token ratio is a multiplier that ChatBotKit uses to calculate the actual number of tokens consumed by the model. Every token from the model is multiplied by the token ratio to produce the actual number of tokens that ChatBotKit records. This allows ChatBotKit to accurately track the amount of resources used by each model and ensure that users are billed correctly for their usage.
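As a rough sketch of the arithmetic involved, the helper below multiplies raw model tokens by the model's token ratio to get the recorded total. The function name, the rounding choice, and the ratio map are illustrative assumptions; the ratios themselves come from the table above.

```typescript
// Illustrative token ratios, mirroring a few rows of the table above.
const TOKEN_RATIOS: Record<string, number> = {
  "gpt-4": 3,
  "gpt-3.5-turbo": 0.3,
  "claude-instant-v1": 0.08,
};

// Hypothetical helper: every raw token is multiplied by the model's
// token ratio to produce the number of tokens ChatBotKit records.
// Rounding up is an assumption for this sketch.
function billedTokens(model: string, rawTokens: number): number {
  const ratio = TOKEN_RATIOS[model];
  if (ratio === undefined) {
    throw new Error(`Unknown model: ${model}`);
  }
  return Math.ceil(rawTokens * ratio);
}
```

For example, a 1000-token gpt-4 completion would be recorded as 3000 tokens, while the same completion on gpt-3.5-turbo would be recorded as 300.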

The context size refers to the maximum number of tokens (words or symbols) that the model can take into account when generating a response. A larger context size allows the model to consider more information in its response, potentially resulting in more accurate and relevant responses.
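To make the constraint concrete, here is a minimal sketch (not ChatBotKit's actual algorithm) of what a context limit implies: older messages must be dropped once a conversation no longer fits within the model's context size. The `Message` shape and precomputed token counts are assumptions for illustration.

```typescript
interface Message {
  text: string;
  tokens: number; // assume token counts are precomputed
}

// Illustrative sketch: keep the newest messages that fit within the
// model's context size, dropping the oldest ones first.
function fitToContext(messages: Message[], contextSize: number): Message[] {
  const kept: Message[] = [];
  let used = 0;
  // Walk from newest to oldest so recent context survives.
  for (let i = messages.length - 1; i >= 0; i--) {
    if (used + messages[i].tokens > contextSize) break;
    used += messages[i].tokens;
    kept.unshift(messages[i]);
  }
  return kept;
}
```

With a 4096-token context, a conversation of 3000 + 2000 + 1000 token messages would keep only the newest two; a 200000-token Claude model could retain all of them.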

Choose the appropriate model depending on your specific use case and desired performance.

If you're looking for the most advanced and capable model, gpt-4 is your best bet. On the other hand, if you're looking for a capable model with a smaller footprint, gpt-3.5-turbo might be more suitable.

Customizing Model Settings

To customize a model's settings, click on the settings icon next to the model name.

There are six main properties that can be customized: Max Tokens, Temperature, Frequency Penalty, Presence Penalty, Interaction Max Messages, and Region.

Max Tokens: This property determines the maximum number of tokens that the model can consume when generating a response. By default, this is set to the maximum context size for the model, but you can reduce it to limit the amount of resources used by the model. This can help save token cost but may also reduce the ability of the chatbot to keep up with the conversation.

Temperature: This property determines the level of randomness or creativity in the model's responses. A higher temperature value will result in more diverse and creative responses, while a lower value will result in more conservative and predictable responses.
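Conceptually, temperature works by scaling the model's logits before they are turned into a probability distribution. The sketch below is a generic illustration of that mechanism, not ChatBotKit-specific code; the logit values are made up.

```typescript
// Generic illustration of temperature: divide logits by the temperature
// before softmax. Low temperature sharpens the distribution (more
// predictable); high temperature flattens it (more varied).
function softmaxWithTemperature(logits: number[], temperature: number): number[] {
  const scaled = logits.map((l) => l / temperature);
  const max = Math.max(...scaled); // subtract max for numerical stability
  const exps = scaled.map((l) => Math.exp(l - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}
```

For logits `[2, 1, 0]`, a temperature of 0.5 concentrates most of the probability on the top token, while a temperature of 2 spreads it much more evenly across all three.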

Frequency Penalty: This property determines how much the model penalizes the repetition of certain words or phrases in its responses. A higher frequency penalty value will result in responses that are more varied and less repetitive.

Presence Penalty: This property determines how much the model penalizes the use of certain words or phrases in its responses. A higher presence penalty value will result in responses that are less likely to contain specific words or phrases.
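The two penalties can be sketched with the logit adjustment OpenAI describes in its API documentation: the frequency penalty scales with how many times a token has already appeared, while the presence penalty applies a flat deduction once the token has appeared at all. The function below is an illustration of that formula, not ChatBotKit's internal code.

```typescript
// Sketch of OpenAI-style penalties applied to a candidate token's logit
// before sampling. `count` is how many times the token has already
// appeared in the generated text so far.
function penalizedLogit(
  logit: number,
  count: number,
  frequencyPenalty: number,
  presencePenalty: number
): number {
  return (
    logit -
    count * frequencyPenalty - // grows with each repetition
    (count > 0 ? 1 : 0) * presencePenalty // flat, once the token has appeared
  );
}
```

A token that has never appeared is unaffected; one that has appeared twice with both penalties set to 0.5 loses 1.5 from its logit, making it less likely to be chosen again.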

Interaction Max Messages: The maximum number of messages to use per model interaction. Setting this value low will make the model more deterministic, while increasing it will result in more creativity. For Q&A-style conversations, it is recommended to keep the value at 2.

Region: The region property allows you to specify the geographical region for the model. This can be particularly useful for services that have specific regional requirements or restrictions. However, it's important to note that the availability of certain models may vary depending on the region.
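Taken together, the six properties can be pictured as a single settings object. The shape below is a hypothetical illustration whose field names simply mirror the properties above; it is not ChatBotKit's actual API payload.

```typescript
// Hypothetical settings shape; field names mirror the properties above.
interface ModelSettings {
  maxTokens: number;              // cap on tokens consumed per response
  temperature: number;            // lower = predictable, higher = creative
  frequencyPenalty: number;       // discourage repeating words/phrases
  presencePenalty: number;        // discourage reusing words at all
  interactionMaxMessages: number; // messages used per model interaction
  region?: string;                // optional geographical region
}

// Example: a conservative configuration for a Q&A-style bot.
const qaSettings: ModelSettings = {
  maxTokens: 512,
  temperature: 0.2,
  frequencyPenalty: 0,
  presencePenalty: 0,
  interactionMaxMessages: 2, // the recommended value for Q&A conversations
};
```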

By customizing these properties, you can fine-tune the behavior of the model to better suit your specific use case and requirements. However, it's important to note that changing these properties can have a significant impact on the model's performance and accuracy, so it's recommended to experiment with different settings to find the best balance between performance and creativity.