Document Ingestion
Ingestion is the process of reading your documents, splitting them into chunks, embedding them as vectors, and storing them in Qdrant for search.
Web UI
Open the Web UI at http://127.0.0.1:3235/ui/ingest. You can:
- Upload individual files directly through the browser
- Specify a directory path on your machine to ingest recursively
Documents are placed in data/documents/ and code files in data/code/ within the RagGo directory.
MCP tools
You can also ingest documents via MCP tools from any connected client:
// Ingest a single file
{
  "tool": "ingest_file",
  "input": {
    "file_path": "/path/to/document.pdf"
  }
}
// Ingest an entire directory
{
  "tool": "ingest_documents_directory",
  "input": {
    "directory_path": "/path/to/my-documents/"
  }
}
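If you generate these calls from a script, a small helper can pick the right tool for a given path. The following is a minimal sketch, not part of RagGo: the function name `build_ingest_call` is hypothetical, and it only builds the payload shown above. How the payload is sent depends on your MCP client.

```python
import json
import os


def build_ingest_call(path: str) -> dict:
    """Build a tool-call payload in the shape shown above.

    Hypothetical helper: it chooses ingest_documents_directory for an
    existing directory and ingest_file for anything else.
    """
    if os.path.isdir(path):
        return {
            "tool": "ingest_documents_directory",
            "input": {"directory_path": path},
        }
    return {"tool": "ingest_file", "input": {"file_path": path}}


if __name__ == "__main__":
    # Print the payload for a single file; sending it is left to the MCP client.
    print(json.dumps(build_ingest_call("/path/to/document.pdf"), indent=2))
```

A path that does not exist is treated as a file here, so any validation error would come from the tool itself rather than from this helper.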
Supported file types
| Type | Extensions | Free | Paid |
|---|---|---|---|
| Plain text | .txt | ✓ | ✓ |
| Markdown | .md, .markdown | ✓ | ✓ |
| PDF | .pdf | — | ✓ |
| Word documents | .docx, .doc | — | ✓ |
| Spreadsheets | .xlsx, .xls, .csv | — | ✓ |
| Presentations | .pptx, .ppt | — | ✓ |
| HTML | .html, .htm | — | ✓ |
| Python | .py | — | ✓ |
| Go | .go | — | ✓ |
| JavaScript / TypeScript | .js, .ts, .jsx, .tsx | — | ✓ |
| Rust | .rs | — | ✓ |
| Java | .java | — | ✓ |
| C / C++ | .c, .cpp, .h | — | ✓ |
| C# | .cs | — | ✓ |
| Ruby | .rb | — | ✓ |
| PHP | .php | — | ✓ |
| Swift | .swift | — | ✓ |
| Kotlin | .kt | — | ✓ |
| Fortran | .f, .f90 | — | ✓ |
| Shell | .sh | — | ✓ |
| SQL | .sql | — | ✓ |
| Other code | .lua, .r, .jl, .dart, .scala, .proto, and more | — | ✓ |
Individual and Teams plans also include OCR for scanned PDFs and data-at-rest encryption (AES-256-GCM).
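Before ingesting a large directory on the Free tier, it can help to check which files will be accepted. The sketch below is illustrative only: it hard-codes the extensions the table lists under Plain text and Markdown, and the names `FREE_EXTENSIONS` and `is_free_tier_file` are hypothetical. Case-insensitive matching is an assumption, since the source does not say how RagGo compares extensions.

```python
from pathlib import Path

# Extensions the table above marks as available on the Free tier
# (Plain text and Markdown rows).
FREE_EXTENSIONS = {".txt", ".md", ".markdown"}


def is_free_tier_file(path: str) -> bool:
    """Return True if the file extension is listed as Free in the table.

    Assumption: extensions are compared case-insensitively.
    """
    return Path(path).suffix.lower() in FREE_EXTENSIONS


if __name__ == "__main__":
    for name in ["notes.md", "report.docx", "README.txt"]:
        print(name, is_free_tier_file(name))
```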
Chunking
RagGo splits documents into chunks of up to 1000 tokens. Chunks of this size are small enough for precise retrieval while still carrying enough surrounding context for most document types.
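To make the size limit concrete, here is a minimal sketch of fixed-size chunking. It is not RagGo's chunker: whitespace splitting stands in for the real tokenizer, which the source does not specify, and a production chunker may also respect sentence or section boundaries. The function name `chunk_tokens` is hypothetical.

```python
def chunk_tokens(tokens: list[str], max_tokens: int = 1000) -> list[list[str]]:
    """Split a token list into consecutive chunks of at most max_tokens."""
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    return [tokens[i:i + max_tokens] for i in range(0, len(tokens), max_tokens)]


if __name__ == "__main__":
    # Assumption: whitespace-separated words approximate tokens.
    tokens = ("word " * 2500).split()
    chunks = chunk_tokens(tokens)
    print([len(c) for c in chunks])  # prints [1000, 1000, 500]
```

A 2500-token document therefore yields three chunks, with only the last one shorter than the limit.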
Re-ingestion
Ingesting the same path again updates changed files and skips unchanged ones (by content hash).
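The skip-by-hash behaviour can be pictured with the following sketch. It is an illustration under stated assumptions, not RagGo's implementation: SHA-256 is an assumed hash function (the source only says "content hash"), the in-memory `seen` dictionary stands in for whatever store RagGo uses, and `plan_reingest` is a hypothetical name.

```python
import hashlib


def content_hash(data: bytes) -> str:
    """Hash file contents. SHA-256 is an assumption, not confirmed by RagGo."""
    return hashlib.sha256(data).hexdigest()


def plan_reingest(files: dict[str, bytes], seen: dict[str, str]) -> list[str]:
    """Return the paths that are new or whose content hash has changed.

    `seen` maps path -> last ingested hash and is updated in place.
    """
    changed = []
    for path, data in files.items():
        digest = content_hash(data)
        if seen.get(path) != digest:
            changed.append(path)
            seen[path] = digest
    return changed


if __name__ == "__main__":
    seen: dict[str, str] = {}
    print(plan_reingest({"a.md": b"one", "b.md": b"two"}, seen))      # first run: both files
    print(plan_reingest({"a.md": b"one", "b.md": b"changed"}, seen))  # second run: only b.md
```

Because the comparison uses file contents rather than timestamps, touching a file without editing it would not trigger re-ingestion in this sketch.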
Ingestion speed depends on your CPU. Expect roughly 50–200 documents per minute on a modern 8-core machine.