Models
ChatBotKit supports a variety of models for generating responses, including base OpenAI models such as the Davinci and Curie families, the OpenAI instruct family, and the newest models such as gpt-3.5-turbo (the model behind ChatGPT) and gpt-4. Additionally, ChatBotKit offers several in-house models, such as text-algo-001, text-algo-002, and text-algo-003, which power our general-purpose assistant.
Below is a table summarizing the available models, including their name, a short description, their token ratio, and their context size (the maximum number of tokens).
Model Name | Short Description | Token Ratio | Context Size |
---|---|---|---|
gpt-4 (beta) | Newest model with enhanced capabilities for conversational use. | 3 | 8192 |
gpt-4-next (beta) | Similar to gpt-4, this model belongs to the GPT-4 family. It has the same advantages and disadvantages as gpt-4, but it also serves as a proxy to the 'gpt-4-0613' model. | 3 | 8192 |
gpt-3.5-turbo | Newest model with enhanced capabilities for conversational use. | 0.3 | 4096 |
gpt-3.5-turbo-next | This model is similar to gpt-3.5-turbo and serves as a proxy to the 'gpt-3.5-turbo-0613' model. | 0.3 | 4096 |
gpt-3.5-turbo-16k | This model is based on the GPT-3.5 architecture and has a larger token limit of 16384. It is more cost-effective with a lower token ratio but sacrifices some quality compared to higher-cost models. | 0.6 | 16384 |
gpt-3.5-turbo-instruct | This is an InstructGPT-style model, trained similarly to text-davinci-003. | 0.3 | 4096 |
text-davinci-003 | An advanced and capable model. | 1 | 4096 |
text-curie-001 | Capable model with a smaller footprint. | 1 | 2048 |
davinci-instruct-beta | Family of models optimized for specific use cases. | 1 | 2048 |
code-davinci-002 | Code-generation model | 1 | 2048 |
text-algo-003 | In-house model optimized for general assistant use. | 3 | 8192 |
text-algo-002 | In-house model optimized for general assistant use. | 1 | 4096 |
text-algo-001 | In-house model optimized for general assistant use. | 1 | 4096 |
The token ratio is a multiplier that ChatBotKit uses to calculate the actual number of tokens consumed by the model. Every token from the model is multiplied by the token ratio to produce the actual number of tokens that ChatBotKit records. This allows ChatBotKit to accurately track the amount of resources used by each model and ensure that users are billed correctly for their usage.
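As an illustration, the billed-token calculation described above can be sketched in Python. The ratios are taken from the table; the function name and the rounding behavior are our assumptions, not part of the ChatBotKit API:

```python
import math

# Token ratios from the table above (illustrative subset; see the table for all models).
TOKEN_RATIOS = {
    "gpt-4": 3.0,
    "gpt-3.5-turbo": 0.3,
    "gpt-3.5-turbo-16k": 0.6,
    "text-davinci-003": 1.0,
}

def billed_tokens(model: str, raw_tokens: int) -> int:
    """Multiply the raw token count by the model's token ratio.

    Rounding up to a whole token is an assumption for this sketch.
    """
    return math.ceil(raw_tokens * TOKEN_RATIOS[model])

# A 1000-token gpt-4 exchange is recorded as 3000 tokens,
# while the same exchange on gpt-3.5-turbo is recorded as 300.
print(billed_tokens("gpt-4", 1000))         # 3000
print(billed_tokens("gpt-3.5-turbo", 1000)) # 300
```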
The context size refers to the maximum number of tokens (words or symbols) that the model can take into account when generating a response. A larger context size allows the model to consider more information in its response, potentially resulting in more accurate and relevant responses.
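A quick way to reason about context size is to estimate whether a prompt, plus room for the reply, fits in the window. The 4-characters-per-token heuristic below is a rough rule of thumb for English text (a real tokenizer gives exact counts); the helper names are ours:

```python
def rough_token_count(text: str) -> int:
    # Crude heuristic: roughly 4 characters per token for English text.
    # Use a proper tokenizer for exact counts.
    return max(1, len(text) // 4)

def fits_in_context(prompt: str, context_size: int, reply_budget: int = 500) -> bool:
    """Check whether a prompt leaves `reply_budget` tokens free for the response."""
    return rough_token_count(prompt) + reply_budget <= context_size

# A short prompt easily fits a 4096-token window:
print(fits_in_context("Hello, world!", 4096))  # True
```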
Choose the appropriate model depending on your specific use case and desired performance.
If you're looking for the most advanced and capable model, gpt-4 is your best bet, though it is currently in beta and billed at a higher token ratio. If you need strong quality at a lower cost, gpt-3.5-turbo is a good default. For capable models with a smaller footprint, text-davinci-003 and text-curie-001 might be more suitable.
Customizing Model Settings
To customize a model's settings, click on the settings icon next to the model name.
There are five main properties that can be customized: Max Tokens, Temperature, Frequency Penalty, Presence Penalty, and Interaction Max Messages.
Max Tokens: This property determines the maximum number of tokens that the model can consume when generating a response. By default, this is set to the maximum context size for the model, but you can reduce it to limit the amount of resources used by the model. This can help save token cost but may also reduce the ability of the chatbot to keep up with the conversation.
Temperature: This property determines the level of randomness or creativity in the model's responses. A higher temperature value will result in more diverse and creative responses, while a lower value will result in more conservative and predictable responses.
Frequency Penalty: This property determines how much the model penalizes the repetition of certain words or phrases in its responses. A higher frequency penalty value will result in responses that are more varied and less repetitive.
Presence Penalty: This property determines how much the model penalizes words or phrases that have already appeared in the conversation. A higher presence penalty value makes the model more likely to introduce new topics rather than repeat ones it has already mentioned.
Interaction Max Messages: The maximum number of messages to include per model interaction. Setting this value low makes the model more deterministic; increasing it allows for more creativity. For Q&A-style conversations it is recommended to keep the value at 2.
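To make the valid ranges concrete, here is a hedged sketch that clamps a settings object to typical bounds before applying it. The ranges mirror common OpenAI-style limits and are an assumption, not documented ChatBotKit bounds; the helper names are ours:

```python
def clamp(value, lo, hi):
    """Restrict a value to the inclusive range [lo, hi]."""
    return max(lo, min(hi, value))

def normalize_settings(settings: dict, context_size: int = 4096) -> dict:
    """Clamp each model setting to a typical valid range (assumed bounds)."""
    return {
        # Max tokens cannot exceed the model's context size.
        "maxTokens": clamp(settings.get("maxTokens", context_size), 1, context_size),
        # Temperature and penalty ranges follow common OpenAI-style limits.
        "temperature": clamp(settings.get("temperature", 0.7), 0.0, 2.0),
        "frequencyPenalty": clamp(settings.get("frequencyPenalty", 0.0), -2.0, 2.0),
        "presencePenalty": clamp(settings.get("presencePenalty", 0.0), -2.0, 2.0),
        # 2 is the recommended default for Q&A-style conversations.
        "interactionMaxMessages": clamp(settings.get("interactionMaxMessages", 2), 1, 100),
    }

# Out-of-range values are pulled back into bounds:
print(normalize_settings({"temperature": 3.5, "maxTokens": 999999}))
```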
By customizing these properties, you can fine-tune the behavior of the model to better suit your specific use case and requirements. However, it's important to note that changing these properties can have a significant impact on the model's performance and accuracy, so it's recommended to experiment with different settings to find the best balance between performance and creativity.
FAQ
Q: Why are gpt-4 and text-algo-003 billed at three times the token ratio of other models?
The gpt-4 and text-algo-003 models are currently in beta and only available in limited supply. We have set the token ratio at three to ensure that users are billed appropriately for the limited availability of these models. We may adjust the token ratio as the model becomes more widely available in the future.
Q: How does context length affect token consumption?
The context length refers to the maximum amount of information that a model can take into account when generating a response. Models with higher context length may use more tokens when the conversation approaches the context length window. This is because the model needs to retain more readily accessible information to generate a response that is relevant to the ongoing conversation. As a result, models with higher context length may consume more tokens than models with lower context length, depending on the length and complexity of the conversation. It is important to choose a model with an appropriate context length for your specific use case to ensure optimal performance and efficient use of resources.
You can adjust how many tokens the model can consume by customizing the model max tokens setting.
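One common way to stay within a reduced max tokens budget is to keep only the most recent messages that fit. The sketch below illustrates that sliding-window idea in Python; the 4-characters-per-token estimate and the helper name are our illustration, not ChatBotKit functionality:

```python
def trim_history(messages, max_tokens,
                 estimate=lambda m: max(1, len(m["text"]) // 4)):
    """Keep the most recent messages whose combined token estimate fits the budget."""
    kept, total = [], 0
    # Walk backwards from the newest message, stopping when the budget is exhausted.
    for msg in reversed(messages):
        cost = estimate(msg)
        if total + cost > max_tokens:
            break
        kept.append(msg)
        total += cost
    # Restore chronological order before sending the trimmed history.
    return list(reversed(kept))

# With a 250-token budget and three ~100-token messages,
# only the two most recent messages survive:
history = [{"text": "a" * 400}, {"text": "b" * 400}, {"text": "c" * 400}]
print(trim_history(history, 250) == history[1:])  # True
```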
Q: How do the text-algo models differ from other models?
The text-algo models, including text-algo-001, text-algo-002, and text-algo-003, are in-house models developed by ChatBotKit. They are based on the standard GPT-3.5-turbo and GPT-4 models but are customized with proprietary settings and technology that make them suitable for use as general-purpose assistants.