Files are platform resources for storing and managing content such as documents, images, and data files. The file management API lets you create, upload, retrieve, update, and delete files, with security controls and flexible storage options.

Files can serve multiple purposes within the platform: they can be attached to datasets as data sources, used for bot training materials, or serve as general storage for application content. The platform handles file storage securely with support for both public and private visibility settings, allowing you to control access to your files based on your specific needs.

Listing Files

Retrieving a list of your files is essential for managing your stored content and understanding what resources are available in your account. The list endpoint provides powerful filtering and pagination capabilities to help you efficiently navigate through your file collection.

To list files, make a GET request to the files endpoint. The API supports cursor-based pagination, which handles large collections of files without performance degradation.

You can control the pagination behavior using query parameters. The take parameter specifies how many files to retrieve per request (the default and maximum may vary based on your account limits), while the order parameter controls whether files are returned in ascending (asc) or descending (desc) order of creation time.
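As a minimal sketch of the take and order parameters, the helper below builds a list URL and rejects values the text above rules out. The base URL and the /files path are placeholders invented for illustration, not the platform's real endpoint; only the parameter names take and order come from this page.

```python
from urllib.parse import urlencode

# Placeholder values for illustration only; substitute your real endpoint.
BASE_URL = "https://api.example.com"
LIST_PATH = "/files"


def build_list_url(take=None, order=None):
    """Build the list URL with optional take/order query parameters."""
    params = {}
    if take is not None:
        if take < 1:
            raise ValueError("take must be a positive integer")
        params["take"] = take
    if order is not None:
        if order not in ("asc", "desc"):
            raise ValueError("order must be 'asc' or 'desc'")
        params["order"] = order
    query = urlencode(params)
    return BASE_URL + LIST_PATH + ("?" + query if query else "")
```

Calling build_list_url(take=10, order="desc") yields a URL ending in ?take=10&order=desc, while calling it with no arguments leaves the query string off entirely so the server-side defaults apply.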

For subsequent pages, pass the cursor provided in the previous response.

You can also filter files by blueprint ID to retrieve only files associated with a specific blueprint.
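Cursor following and blueprint filtering can be sketched as a generator. This is an illustration under stated assumptions: the query parameter names cursor and blueprintId, and the response fields items and cursor, are guesses rather than documented names. The fetch_page argument is a function you supply that sends the GET request and returns the decoded JSON.

```python
from typing import Callable, Dict, Iterator, Optional


def iter_files(
    fetch_page: Callable[[Dict[str, str]], dict],
    take: int = 50,
    blueprint_id: Optional[str] = None,
) -> Iterator[dict]:
    """Yield every file by following cursors until the response has none."""
    params: Dict[str, str] = {"take": str(take)}
    if blueprint_id is not None:
        params["blueprintId"] = blueprint_id  # assumed parameter name
    while True:
        # Pass a copy so the caller never sees later mutations.
        page = fetch_page(dict(params))
        yield from page.get("items", [])  # assumed response field
        cursor = page.get("cursor")  # assumed response field
        if not cursor:
            break
        params["cursor"] = cursor  # assumed parameter name
```

Because the transport is injected, the same loop works with any HTTP client and can be exercised in tests without network access.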

The response includes comprehensive information about each file including its ID, name, description, visibility settings, blueprint association, metadata, and timestamps. This information enables you to understand file properties and make informed decisions about file management operations.

Note: The list endpoint only returns files that belong to your user account, ensuring proper data isolation and security. Files are automatically filtered based on your authentication context.

Creating Files

Creating a file is the foundational step in managing file resources within the platform. File creation establishes a file record in the system with metadata and configuration settings, after which you can upload actual content to the file resource.

To create a new file, make a POST request to the create endpoint with the file's metadata and configuration.

The create operation returns a file ID that you'll use for all subsequent operations including content upload, updates, and downloads. All fields in the request body are optional, allowing you to create a minimal file record and populate details later.

Available Configuration Options

When creating a file, you can specify the following properties:

  • name: A descriptive name for the file (optional, can be set later)
  • description: Detailed information about the file's purpose and contents (optional)
  • visibility: Access control setting - either private (default) or public
  • blueprintId: Associate the file with a blueprint resource for organizational purposes (optional)
  • meta: Custom metadata object for storing additional file-specific information (optional)

The visibility setting is particularly important for security and access control. Setting a file to private ensures that only you can access it, while public visibility allows anyone with the file ID to download the content. Choose the appropriate visibility based on your security requirements and use case.
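The configuration options above map naturally onto a small request-body builder. The sketch below uses the property names listed on this page (name, description, visibility, blueprintId, meta); the function itself is a hypothetical helper, and it assumes the endpoint accepts a JSON object in which unset fields are simply omitted.

```python
import json
from typing import Any, Dict, Optional

VALID_VISIBILITY = ("private", "public")


def build_create_body(
    name: Optional[str] = None,
    description: Optional[str] = None,
    visibility: Optional[str] = None,
    blueprint_id: Optional[str] = None,
    meta: Optional[Dict[str, Any]] = None,
) -> str:
    """Return a JSON request body containing only the fields that were set."""
    if visibility is not None and visibility not in VALID_VISIBILITY:
        raise ValueError("visibility must be 'private' or 'public'")
    fields = {
        "name": name,
        "description": description,
        "visibility": visibility,
        "blueprintId": blueprint_id,
        "meta": meta,
    }
    # Every field is optional, so drop anything the caller left unset.
    return json.dumps({k: v for k, v in fields.items() if v is not None})
```

With no arguments the helper returns an empty JSON object, which matches the statement that a minimal file record can be created and filled in later.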

Creation Workflow

The typical workflow for working with files involves two distinct steps:

  1. Create the file record: Use this endpoint to establish the file resource with metadata
  2. Upload content: Use the upload endpoint with the returned file ID to add actual file content

This two-step process provides flexibility, allowing you to configure file metadata before or after content upload, and enables you to replace file content while maintaining the same file ID and metadata.
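The two steps can be sketched as one function that creates the record and then uploads content using the returned ID. Everything endpoint-specific here is an assumption made for illustration: the paths /files/create and /files/{id}/upload, and the id field in the create response. The post argument is a function you supply that sends an authenticated POST request and returns the decoded JSON.

```python
from typing import Callable


def create_and_upload(
    post: Callable[[str, bytes, str], dict],
    create_body: bytes,
    content: bytes,
    content_type: str,
) -> str:
    """Create a file record, upload content to it, and return the file ID."""
    # Step 1: create the record (hypothetical path and response field).
    created = post("/files/create", create_body, "application/json")
    file_id = created["id"]
    # Step 2: upload the content as a raw request body (hypothetical path).
    post(f"/files/{file_id}/upload", content, content_type)
    return file_id
```

Keeping the two calls separate mirrors the workflow: the same upload call can be repeated later to replace the content while the file ID and metadata stay the same.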

Example Response:

The response includes the unique file ID that you'll use for uploading content and performing other file operations. Store this ID for future reference as it serves as the primary identifier for the file resource.

Resource Limits: File creation is subject to database resource limits based on your account tier. If you reach your file creation limit, you'll need to delete existing files or upgrade your account to create additional files.

Uploading File Content

Uploading actual content to a file is a critical operation that stores your data in the platform's secure storage system. The upload endpoint provides multiple flexible methods to accommodate different file sizes, source locations, and client capabilities, ensuring you can efficiently upload content regardless of your specific requirements.

After creating a file record using the create endpoint, you use the file ID to upload content. The platform supports several upload methods, each optimized for different scenarios ranging from small embedded files to large multi-gigabyte resources.

Upload Methods Overview

The upload endpoint intelligently handles different content types and sources based on the request format:

URL-Based Upload: Provide an HTTP or HTTPS URL to a publicly accessible file, and the platform will fetch and store the content. This method is ideal for importing files from external sources or content delivery networks.

Data URL Upload: Embed small files directly in the request as base64-encoded data URLs, suitable for files under 4.5MB like images, documents, or configuration files.

Multipart Form Upload: Use standard multipart/form-data encoding, which is the traditional method supported by web browsers and most HTTP clients, also limited to 4.5MB.

Raw Stream Upload: Send the file content directly in the request body with appropriate Content-Type header, ideal for programmatic uploads up to 4.5MB.

Direct-to-Storage Upload: For files larger than 4.5MB, request pre-signed upload credentials and upload directly to the storage service, bypassing API size limits and improving performance for large files.

Method 1: URL-Based Upload

The platform fetches the file from the provided URL and stores it securely. This method works with any publicly accessible HTTP or HTTPS URL.

Method 2: Data URL Upload

Encode your file content as a base64 data URL with the appropriate MIME type. This method is convenient for small files and embedded content.
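A data URL can be produced with the standard library alone. The sketch below follows the generic data URL format (data:<mime>;base64,<payload>). It assumes that 4.5MB means 4.5 * 1024 * 1024 bytes and that the limit applies to the raw file size rather than the encoded string; neither detail is stated on this page.

```python
import base64

# Assumption: "4.5MB" is read as 4.5 * 1024 * 1024 bytes.
MAX_API_UPLOAD_BYTES = int(4.5 * 1024 * 1024)


def to_data_url(content: bytes, mime_type: str) -> str:
    """Encode raw bytes as a base64 data URL."""
    if len(content) > MAX_API_UPLOAD_BYTES:
        raise ValueError("content is too large for a data URL upload")
    encoded = base64.b64encode(content).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"
```

For example, to_data_url(b"hello", "text/plain") returns data:text/plain;base64,aGVsbG8=. Note that base64 inflates the payload by roughly a third, so the request is larger than the file itself.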

Method 3: Multipart Form Data Upload

Standard multipart upload supported by all major HTTP clients and browsers.
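Most HTTP clients build multipart bodies for you, but the encoding is simple enough to sketch by hand. The helper below produces a single-part multipart/form-data body and the matching Content-Type header. The form field name is left to the caller because this page does not say which name the endpoint expects; filenames containing quotes are not escaped in this simplified version.

```python
import uuid
from typing import Tuple


def encode_multipart(
    field: str, filename: str, content: bytes, mime_type: str
) -> Tuple[bytes, str]:
    """Return (body, content_type_header) for a single-file multipart request."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {mime_type}\r\n"
        "\r\n"
    ).encode("utf-8")
    # The closing delimiter carries two extra trailing hyphens.
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + content + tail, f"multipart/form-data; boundary={boundary}"
```

The returned header value must be sent as the request's Content-Type so the server can find the boundary.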

Method 4: Raw Stream Upload

Send the file content directly as the request body with the appropriate Content-Type header. This is the simplest method for programmatic uploads.
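A raw stream upload amounts to a POST whose body is the file bytes. The sketch below builds such a request with urllib without sending it. The host and the /files/{id}/upload path are placeholders rather than documented values, and authentication headers are omitted because this page does not describe the scheme.

```python
import urllib.request

BASE_URL = "https://api.example.com"  # placeholder host for illustration


def build_raw_upload_request(
    file_id: str, content: bytes, content_type: str
) -> urllib.request.Request:
    """Build (but do not send) a POST request whose body is the file content."""
    return urllib.request.Request(
        url=f"{BASE_URL}/files/{file_id}/upload",  # hypothetical path
        data=content,
        method="POST",
        headers={"Content-Type": content_type},
    )
```

Passing the result to urllib.request.urlopen would send it; add your authentication header to the request first.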

Method 5: Direct-to-Storage Upload (For Large Files)

First, request upload credentials by providing file metadata. The response includes pre-signed upload credentials, which you then use to upload the content directly to storage.

File Size Limits

File size limits vary based on the upload method and your account tier:

  • API-Based Methods (URL, data URL, multipart, raw stream): Up to 4.5MB
  • Direct-to-Storage Method: Size limits based on your account tier, typically much larger

If you exceed the size limit for a method, you'll receive a limits-reached error. For large files, use the direct-to-storage method.
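A client can pick the upload path up front instead of waiting for the error. This sketch assumes 4.5MB means 4.5 * 1024 * 1024 bytes; the returned labels are arbitrary names chosen for illustration.

```python
# Assumption: "4.5MB" is read as 4.5 * 1024 * 1024 bytes.
MAX_API_UPLOAD_BYTES = int(4.5 * 1024 * 1024)


def choose_upload_method(size_bytes: int) -> str:
    """Return 'api' for files within the API limit, else 'direct-to-storage'."""
    if size_bytes < 0:
        raise ValueError("size_bytes must not be negative")
    if size_bytes <= MAX_API_UPLOAD_BYTES:
        return "api"
    return "direct-to-storage"
```

Direct-to-storage uploads are still bounded by your account tier, so a large file can be rejected even on that path.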

Important Notes:

  • Uploading new content to a file replaces any existing content
  • The file's metadata (content type) is automatically updated based on the uploaded content
  • All uploads are performed securely with appropriate authentication and authorization checks
  • Upload operations are atomic - if an upload fails, the previous content (if any) remains unchanged

Deleting Files

Deleting files permanently removes both the file metadata and the actual file content from storage. This operation is irreversible and should be used with caution, particularly for files that may be referenced by other resources in your application.

To delete a file, make a POST request to the delete endpoint, supplying the ID of the file you want to delete in place of {fileId}. Even though this endpoint doesn't require any body parameters, you must still send a POST request with an empty JSON object and the appropriate Content-Type header.
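The empty-body requirement is easy to miss, so here is a minimal sketch of the request. The host and the /files/{id}/delete path are placeholders invented for illustration; the empty JSON object and the Content-Type header follow the description above. Authentication headers are omitted.

```python
import urllib.request

BASE_URL = "https://api.example.com"  # placeholder host for illustration


def build_delete_request(file_id: str) -> urllib.request.Request:
    """Build (but do not send) the delete request for one file."""
    return urllib.request.Request(
        url=f"{BASE_URL}/files/{file_id}/delete",  # hypothetical path
        data=b"{}",  # empty JSON object, required even without parameters
        method="POST",
        headers={"Content-Type": "application/json"},
    )
```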

When you delete a file, the following actions occur:

  1. Storage Cleanup: The actual file content is removed from the storage service, freeing up storage space in your account
  2. Metadata Removal: The file record and all associated metadata are deleted from the database
  3. Reference Breaking: Any references to this file from other resources (such as dataset attachments) become invalid

The delete operation includes comprehensive security checks to ensure you have permission to delete the file. Only files that belong to your user account can be deleted through this endpoint, preventing accidental or malicious deletion of other users' files.

Important Considerations:

  • Dataset Attachments: If the file is attached to one or more datasets, deleting the file will not automatically remove dataset records created from this file. You should detach the file from datasets first using the dataset file detachment endpoint if you want to remove associated records.

  • Blueprint Dependencies: If the file is associated with a blueprint, ensure that removing it won't break any blueprint functionality or dependent resources.

  • No Undo: File deletion is permanent and cannot be reversed. Make sure you have backups of important files before deletion, or verify that the file is no longer needed.

Example Response:

The response confirms successful deletion by returning the ID of the deleted file. If the file doesn't exist or you don't have permission to delete it, the API will return an appropriate error response.

Best Practice: Before deleting a file, consider using the fetch endpoint to retrieve and verify its details, ensuring you're deleting the correct file and understanding any potential impacts on your application.

Downloading Files

Downloading file content is a core operation that allows you to retrieve the actual data stored in a file resource. The download endpoint provides flexible access options including direct binary streaming and URL redirection, accommodating different client requirements and use cases.

To download a file's content, make a GET request to the download endpoint, supplying the ID of the file you want to download in place of {fileId}. The behavior of this endpoint depends on the file's visibility setting and your authentication status:

Private Files: If the file visibility is set to private, you must be authenticated and be the owner of the file to download it. The API performs security checks to verify ownership before allowing access to the content.

Public Files: If the file visibility is set to public, anyone can download the file without authentication, making it suitable for publicly shared resources like documentation, images, or datasets that should be accessible to all users.

Response Formats

The download endpoint supports two response formats based on the Accept header:

Direct Binary Download (default):

This returns the file content directly as a binary stream with appropriate Content-Type and Content-Disposition headers. The filename in the response will match the file's name field or default to a generated name.

URL Response:

This returns a JSON object containing a pre-signed URL that you can use to download the file.

Important Note: The download endpoint streams content from secure storage services and handles all necessary authentication and authorization. For large files or high-traffic scenarios, consider using the JSON response format to obtain a direct storage URL, which can reduce load on your application servers.
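Choosing between the two response formats comes down to the Accept header. The sketch below assumes that Accept: application/json selects the URL response and that omitting the header gives the default binary stream; this page names the header but not its values. The host and path are placeholders.

```python
import urllib.request

BASE_URL = "https://api.example.com"  # placeholder host for illustration


def build_download_request(file_id: str, as_url: bool = False) -> urllib.request.Request:
    """Build a download request; as_url=True asks for the JSON URL response."""
    # Assumed header value for the URL response format.
    headers = {"Accept": "application/json"} if as_url else {}
    return urllib.request.Request(
        url=f"{BASE_URL}/files/{file_id}/download",  # hypothetical path
        method="GET",
        headers=headers,
    )
```

For private files, add your authentication header before sending the request.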

Retrieving File Details

Fetching detailed information about a specific file is essential when you need to inspect file properties, verify metadata, or understand the current state of a file resource before performing operations on it.

To retrieve complete details for a specific file, make a GET request to the fetch endpoint, supplying the file's ID in place of {fileId}. The file ID is typically obtained when creating a file or from the list endpoint response.

The response includes comprehensive information about the file:

  • Basic Information: File ID, name, and description that identify and describe the file
  • Ownership: User ID indicating who owns the file
  • Blueprint Association: Blueprint ID if the file is associated with a blueprint resource
  • Visibility Settings: Whether the file is private or public, controlling access permissions
  • Metadata: Custom metadata stored in the meta field, which may include content type information and other file-specific properties
  • Timestamps: Creation and last update times for tracking file lifecycle

The fetch operation performs security checks to ensure you have permission to access the requested file. Only files that belong to your user account can be retrieved through this endpoint, protecting data privacy and preventing unauthorized access.
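The fields described above can be modeled as a small typed record on the client. In this sketch the keys visibility, blueprintId, and meta match property names used elsewhere on this page, while id, userId, createdAt, and updatedAt are assumed spellings that may differ in the real response.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class FileDetails:
    id: str
    name: Optional[str] = None
    description: Optional[str] = None
    user_id: Optional[str] = None
    blueprint_id: Optional[str] = None
    visibility: str = "private"
    meta: Dict[str, Any] = field(default_factory=dict)
    created_at: Optional[str] = None
    updated_at: Optional[str] = None


def parse_file_details(payload: Dict[str, Any]) -> FileDetails:
    """Convert a decoded JSON response into a FileDetails record."""
    return FileDetails(
        id=payload["id"],  # assumed key
        name=payload.get("name"),
        description=payload.get("description"),
        user_id=payload.get("userId"),  # assumed key
        blueprint_id=payload.get("blueprintId"),
        visibility=payload.get("visibility", "private"),
        meta=payload.get("meta") or {},
        created_at=payload.get("createdAt"),  # assumed key
        updated_at=payload.get("updatedAt"),  # assumed key
    )
```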

Example Response:

Syncing Files to Datasets

File synchronization is a critical operation when using files as data sources for datasets. The sync endpoint triggers the import process that reads the file content and creates or updates records in all datasets where the file is attached.

Files are not automatically processed when uploaded or attached to datasets. This design gives you explicit control over when file content is imported, allowing you to prepare and verify files before triggering potentially expensive processing operations. The sync operation must be explicitly requested when you're ready to import the file data.

To sync a file to its associated datasets, make a POST request to the sync endpoint, supplying the ID of the file you want to sync in place of {fileId}. Even though this endpoint doesn't require any body parameters, you must still send a POST request with an empty JSON object.

How File Sync Works

When you trigger a file sync:

  1. Dataset Discovery: The system identifies all datasets where this file is currently attached
  2. Queue Processing: Sync events are queued for each associated dataset, ensuring reliable processing even under high load
  3. Asynchronous Import: The file content is processed asynchronously in the background, parsing the file and creating dataset records
  4. Event Logging: Progress and results are recorded in the dataset event log, which you can monitor to track sync status

The sync operation returns immediately after queuing the sync events, rather than waiting for processing to complete. This prevents timeouts for large files and allows you to continue with other operations while the import runs in the background.

Rate Limiting

The sync endpoint has rate limiting to prevent excessive resource usage. You can trigger a sync for a specific file once every 2 minutes. This prevents duplicate processing and ensures system stability.

If you attempt to sync a file more frequently than allowed, you'll receive a rate limit error. Wait for the rate limit window to expire before triggering another sync.
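To avoid hitting the rate limit at all, a client can track when it last synced each file. The class below is a hypothetical client-side guard, not part of the platform; it only knows about syncs made through the same process, so the server remains the authority.

```python
import time
from typing import Callable, Dict

SYNC_INTERVAL_SECONDS = 120  # one sync per file every 2 minutes


class SyncThrottle:
    """Client-side guard that remembers when each file was last synced."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._last_sync: Dict[str, float] = {}

    def seconds_until_allowed(self, file_id: str) -> float:
        """Return how long to wait before syncing this file again (0 if ready)."""
        last = self._last_sync.get(file_id)
        if last is None:
            return 0.0
        elapsed = self._clock() - last
        return max(0.0, SYNC_INTERVAL_SECONDS - elapsed)

    def record_sync(self, file_id: str) -> None:
        """Note that a sync for this file was just triggered."""
        self._last_sync[file_id] = self._clock()
```

Call seconds_until_allowed before each sync request and record_sync after a successful one. The limit is per file, so syncing a different file is never delayed by this guard.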

Use Cases

  • Initial Import: After uploading and attaching a file to a dataset, sync to import the initial data
  • Content Updates: When you re-upload a file with updated content, sync to refresh dataset records
  • Multi-Dataset Import: Trigger import across all datasets where the file is attached with a single sync operation

Important Note: If a file is not attached to any datasets, the sync operation will succeed but won't perform any processing. Ensure your file is properly attached to datasets before syncing.

Updating File Metadata

Updating file metadata allows you to modify the descriptive information and configuration settings associated with a file without affecting the actual file content. This is useful for organizing files, updating descriptions, changing access controls, or modifying blueprint associations as your application requirements evolve.

To update a file's metadata, make a POST request to the update endpoint with the file ID and the fields you want to modify.

All fields in the update request are optional, allowing you to modify only the properties you need to change. The fields you can update include:

  • name: A descriptive name for the file that helps identify its purpose
  • description: Detailed information about the file's content and purpose
  • visibility: Access control setting that can be either private or public, controlling who can access the file
  • blueprintId: Associate the file with a blueprint resource, or set to null to remove blueprint association
  • meta: Custom metadata object for storing additional file-specific information

The visibility setting is particularly important for security. When set to private, only the file owner can download or access the file content. When set to public, the file becomes accessible to anyone with the file ID, which is useful for publicly shared resources.

The update operation preserves any fields not included in the request, performing a partial update rather than replacing the entire file resource. The actual file content and upload status remain unchanged - this endpoint only modifies metadata.

Example: Changing File Visibility
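A partial update needs to distinguish a field that was left out from one explicitly set to null, since null is how a blueprint association is removed. The sketch below handles that with a sentinel value. The helper is hypothetical and assumes the endpoint accepts a JSON object; the field names are the ones listed above.

```python
import json
from typing import Any

_UNSET: Any = object()  # sentinel: "not provided", distinct from an explicit None


def build_update_body(
    name: Any = _UNSET,
    description: Any = _UNSET,
    visibility: Any = _UNSET,
    blueprint_id: Any = _UNSET,
    meta: Any = _UNSET,
) -> str:
    """Return a JSON body containing only the fields the caller passed."""
    if visibility is not _UNSET and visibility not in ("private", "public"):
        raise ValueError("visibility must be 'private' or 'public'")
    fields = {
        "name": name,
        "description": description,
        "visibility": visibility,
        "blueprintId": blueprint_id,
        "meta": meta,
    }
    return json.dumps({k: v for k, v in fields.items() if v is not _UNSET})


# Change only the visibility; every other field keeps its current value.
visibility_body = build_update_body(visibility="public")
```

Passing blueprint_id=None produces a body with blueprintId set to null, which removes the association, whereas omitting the argument leaves the association untouched.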

Security Note: You can only update files that belong to your user account. The API validates ownership before allowing any modifications, ensuring proper access control and preventing unauthorized changes to other users' files.