Preview Chunks from Knowledgebase
Retrieve all text chunks that were extracted from a specific dataset in a knowledgebase. This is useful for previewing how your content was processed and chunked.
Endpoint
Path Parameters
The unique identifier of the knowledgebase containing the dataset
The unique identifier of the dataset to retrieve chunks from
Headers
Your project API key for authentication
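As a minimal sketch of how the two path parameters and the API key header fit together, the helper below assembles a request URL and header set. The endpoint path and the header name are not reproduced in this section, so both are passed in as arguments; `url_template`, `api_key_header`, and the function name are hypothetical, and the values in the usage note are placeholders only.

```python
from urllib.parse import quote


def build_chunks_request(
    url_template: str,
    knowledgebase_id: str,
    dataset_id: str,
    api_key: str,
    api_key_header: str,
) -> tuple[str, dict[str, str]]:
    """Return (url, headers) for a chunk preview request.

    url_template must contain {knowledgebase_id} and {dataset_id}
    placeholders. The real path and header name are assumptions the
    caller has to supply from the API reference.
    """
    if not knowledgebase_id or not dataset_id:
        # Mirrors the documented 400 case: both IDs are required.
        raise ValueError("knowledgebase_id and dataset_id are required")
    url = url_template.format(
        # Percent-encode the IDs so they cannot alter the path structure.
        knowledgebase_id=quote(knowledgebase_id, safe=""),
        dataset_id=quote(dataset_id, safe=""),
    )
    return url, {api_key_header: api_key}
```

The returned pair can be handed to any HTTP client; substitute the documented path for the template and the documented header name for `api_key_header`.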
Response
Returns an array of text chunks extracted from the specified dataset.
Unique identifier for the chunk
The text content of the chunk
Length of the chunk content in characters
Whether the chunk is enabled for search (can be disabled to exclude from results)
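The four response fields described above could be modeled on the client side roughly as follows. The JSON key names used here (`id`, `content`, `length`, `enabled`) are assumptions, since the field names are not shown in this section; adjust them to match the actual response.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Chunk:
    id: str        # unique identifier for the chunk
    content: str   # text content of the chunk
    length: int    # length of the content in characters
    enabled: bool  # whether the chunk is enabled for search


def parse_chunks(payload: list[dict[str, Any]]) -> list[Chunk]:
    """Convert a decoded JSON array into Chunk objects.

    Key names are hypothetical; a missing key raises KeyError.
    """
    return [
        Chunk(
            id=str(item["id"]),
            content=str(item["content"]),
            length=int(item["length"]),
            enabled=bool(item["enabled"]),
        )
        for item in payload
    ]
```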
Error Responses
Status Code | Description |
---|---|
400 | Missing knowledgebase ID or dataset ID |
401 | Invalid API key |
403 | Dataset does not belong to the specified knowledgebase |
404 | Knowledgebase, dataset, or chunks not found |
500 | Internal server error |
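
A client might translate the status codes in the table above into readable messages before surfacing them. This is a small sketch; the function and constant names are illustrative and not part of the API.

```python
# Descriptions copied from the error table above.
ERROR_DESCRIPTIONS = {
    400: "Missing knowledgebase ID or dataset ID",
    401: "Invalid API key",
    403: "Dataset does not belong to the specified knowledgebase",
    404: "Knowledgebase, dataset, or chunks not found",
    500: "Internal server error",
}


def describe_status(status: int) -> str:
    """Return a human-readable description for an HTTP status code."""
    if 200 <= status < 300:
        return "OK"
    # Codes outside the documented table get a generic fallback.
    return ERROR_DESCRIPTIONS.get(status, f"Unexpected status {status}")
```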
Understanding Chunks
Chunking Process
- Documents are automatically split into smaller, searchable pieces
- Chunk size is determined by the knowledgebase configuration
- Overlap between chunks ensures context continuity
- Processing preserves semantic meaning across chunk boundaries
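To illustrate what overlap between chunks means, here is a simple character-based sliding window. It is not the service's actual chunker, whose algorithm is not described here, and it does not model the semantic-boundary handling mentioned above; `chunk_size` and `overlap` are hypothetical parameters.

```python
def split_with_overlap(text: str, chunk_size: int, overlap: int) -> list[str]:
    """Split text into fixed-size pieces that share `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    step = chunk_size - overlap  # how far the window advances each time
    chunks: list[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks
```

With a chunk size of 4 and an overlap of 2, each piece repeats the last two characters of the previous one, which is how context carries across a boundary.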
Chunk Properties
- Content: The actual text extracted from the document
- Length: The character count of the content, useful for gauging chunk size
- Status: Enabled chunks participate in search, disabled ones don’t
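A quick way to get a feel for these properties is to summarize a returned chunk list. The sketch below assumes each chunk is a dict with hypothetical `content` and `enabled` keys, and it recomputes lengths from the content rather than trusting a length field.

```python
from statistics import mean
from typing import Any


def summarize(chunks: list[dict[str, Any]]) -> dict[str, float]:
    """Summarize chunk counts and content lengths (in characters)."""
    lengths = [len(c["content"]) for c in chunks]
    enabled = sum(1 for c in chunks if c["enabled"])
    return {
        "count": len(chunks),
        "enabled": enabled,
        "disabled": len(chunks) - enabled,
        "min_length": min(lengths, default=0),
        "max_length": max(lengths, default=0),
        "mean_length": mean(lengths) if lengths else 0.0,
    }
```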
Search Integration
- Each chunk becomes a searchable unit in semantic search
- Chunks are converted to embeddings for similarity matching
- Search queries return the most relevant chunks across all datasets
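Similarity matching over embeddings is commonly done with cosine similarity. The sketch below ranks chunks against a query using that measure with toy vectors; the similarity metric and embedding model the service actually uses are not stated here, so treat both as assumptions.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0  # zero vectors match nothing


def rank_chunks(
    query_vec: list[float],
    chunk_vecs: dict[str, list[float]],
    top_k: int = 3,
) -> list[str]:
    """Return the IDs of the top_k chunks most similar to the query."""
    scored = sorted(
        chunk_vecs.items(),
        key=lambda kv: cosine_similarity(query_vec, kv[1]),
        reverse=True,
    )
    return [chunk_id for chunk_id, _ in scored[:top_k]]
```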
Use Cases
Content Review
- Preview how your documents were processed
- Verify important information was extracted correctly
- Check for any processing errors or formatting issues
Search Optimization
- Understand how content is structured for search
- Identify chunks that might need better context
- Optimize document structure for better chunking
Troubleshooting
- Debug why certain content isn’t appearing in search results
- Verify chunk content matches expectations
- Check if chunks are properly enabled
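When a phrase is missing from search results, one approach is to check whether it appears in any chunk at all and whether the matching chunks are enabled. This sketch assumes chunks are dicts with hypothetical `id`, `content`, and `enabled` keys.

```python
from typing import Any


def diagnose(chunks: list[dict[str, Any]], phrase: str) -> dict[str, Any]:
    """Report which chunks contain `phrase` and whether they are enabled."""
    needle = phrase.lower()
    matches = [c for c in chunks if needle in c["content"].lower()]
    return {
        # False means the phrase was lost or altered during processing.
        "found": bool(matches),
        "enabled_matches": [c["id"] for c in matches if c["enabled"]],
        # Disabled matches exist but are excluded from search results.
        "disabled_matches": [c["id"] for c in matches if not c["enabled"]],
    }
```

If `found` is true but `enabled_matches` is empty, the content was extracted and the chunks only need to be enabled; if `found` is false, the issue lies in extraction or chunking.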
Usage Notes
- Only chunks from the specified dataset are returned
- Results are ordered by their position in the original document
- Large datasets may return many chunks
- Chunk content reflects the processed and cleaned text, not the raw file content
Migration from Old Endpoint
If you’re migrating from the deprecated /api/datasets/{filename}/chunks endpoint:
- Get your knowledgebase ID using Get Knowledgebases
- Get the dataset ID from Get Datasets in Knowledgebase
- Update your API calls to use both IDs in the URL path
- The response format remains the same