Projects & uploads

How to organise documents into projects, which formats are supported, and how to update or remove files safely.

What a project is (and isn't)

A project is a private container for a set of related documents. Two things hang off projects:

Projects are not a folder hierarchy — there's no nesting. If you find yourself wanting sub-projects, that's usually a sign two separate projects are the right answer.

How to name them

Good project names describe what's inside, not where or when:

Supported file types

Type             | What gets extracted                                    | Citation metadata
PDF              | Text via pdftotext; OCR via Tesseract for scanned PDFs | Page number
DOCX             | Paragraphs and headings                                | Heading section
XLSX             | Every cell, sheet by sheet                             | Sheet name + row index
PPTX             | Slide titles and body text                             | Slide number
CSV / TSV        | Every row                                              | Row index
TXT / MD         | Paragraphs                                             |
PNG / JPG / WEBP | OCR text via Tesseract                                 |

Default size limit per file is 64 MB (configurable per-tenant on Business / Enterprise plans).
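
A client-side pre-check can catch unsupported types and oversized files before an upload is queued. The sketch below is only an illustration: the function name check_upload is hypothetical, the extension list is transcribed from the supported file types above (with .jpeg assumed to be an alias of JPG), and 64 MB is assumed to mean 64 × 1024 × 1024 bytes, which the product may define differently.

```python
from pathlib import Path

# Extensions transcribed from the supported file types; ".jpeg" is an assumed alias.
SUPPORTED_EXTENSIONS = {
    ".pdf", ".docx", ".xlsx", ".pptx", ".csv", ".tsv",
    ".txt", ".md", ".png", ".jpg", ".jpeg", ".webp",
}

DEFAULT_MAX_MB = 64  # default per-file limit; tenants may configure another value


def check_upload(filename: str, size_bytes: int, max_mb: int = DEFAULT_MAX_MB) -> list[str]:
    """Return a list of problems; an empty list means the file looks uploadable."""
    problems = []
    ext = Path(filename).suffix.lower()
    if ext not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported file type: {ext or '(no extension)'}")
    if size_bytes > max_mb * 1024 * 1024:  # assumes 1 MB = 1024 * 1024 bytes
        problems.append(f"file exceeds {max_mb} MB limit")
    return problems
```

Returning a list rather than raising on the first problem lets a caller show every issue with a file at once.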

About OCR

Scanned PDFs and standalone images run through Tesseract OCR. The default language pack is English + French; ask support if you need more (German, Spanish, Polish, Russian, Ukrainian, and most European languages are installable on shared hosting in a few minutes).

OCR is slower than plain text extraction — a 50-page scanned PDF can take 1-3 minutes versus 10 seconds for a text PDF of the same size. Once processed, search performance is identical.

Uploading

Open the Upload tab in the sidebar. You can:

Each upload becomes a row in the Jobs view. Status progresses queued → extracting → embedding → completed. A 10-page PDF typically takes 5-10 seconds end-to-end. Failed jobs stay visible so you can read the error and retry.
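
If you script around uploads, the status progression above can be treated as a small state machine that you poll until it reaches a terminal state. This is a minimal sketch under stated assumptions: fetch_status is a hypothetical callable you would supply (no API for reading job status is described here), and "failed" is an assumed name for the error state.

```python
import time
from typing import Callable

# Stages named in the text, in order. "failed" is an assumed name for the
# error state; the text only says that failed jobs stay visible.
STAGES = ["queued", "extracting", "embedding", "completed"]
TERMINAL = {"completed", "failed"}


def wait_for_job(
    fetch_status: Callable[[], str],
    poll_seconds: float = 1.0,
    timeout_seconds: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll fetch_status() until the job reaches a terminal state or times out."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = fetch_status()
        if status in TERMINAL:
            return status
        if status not in STAGES:
            raise ValueError(f"unexpected job status: {status!r}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still {status!r} after {timeout_seconds} s")
        sleep(poll_seconds)
```

The sleep function is injectable so the loop can be exercised in tests without real waiting. The generous default timeout reflects that OCR jobs can run for minutes.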

Updating a document

By design, Knowledge has no in-place "replace this file" button: with silent replacement, versions can drift and citations get confusing. The honest pattern is:

  1. Upload the new version with a name that distinguishes it (employee_handbook_v3.pdf rather than overwriting employee_handbook.pdf).
  2. Wait for it to finish processing.
  3. Delete the old version from the Library view. The delete cascades: the old file's chunks and embeddings are removed too, so the old text never appears in future search results.
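
The ordering of these three steps matters: the old version should only be deleted once the new one has finished processing. The sketch below shows that ordering in code. LibraryClient and its upload, status, and delete methods are hypothetical stand-ins, not a documented API, and "completed" is the final job status described under Uploading.

```python
from typing import Protocol


class LibraryClient(Protocol):
    """Hypothetical client interface; these methods are not a documented API."""

    def upload(self, project: str, filename: str) -> str: ...  # returns a document id
    def status(self, doc_id: str) -> str: ...                  # e.g. "completed"
    def delete(self, doc_id: str) -> None: ...


def replace_document(client: LibraryClient, project: str,
                     old_doc_id: str, new_filename: str) -> str:
    """Upload the new version; delete the old one only after processing succeeds."""
    new_doc_id = client.upload(project, new_filename)  # step 1
    status = client.status(new_doc_id)                 # step 2 (a real caller would poll)
    if status != "completed":
        # Keep the old version so the project is never left without the document.
        raise RuntimeError(f"new version not ready (status={status!r}); old version kept")
    client.delete(old_doc_id)                          # step 3
    return new_doc_id
```

If the new upload fails, the function raises before the delete, so the project still holds the old, searchable version.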

If two versions of the same document are in the same project, searches may pull passages from either — exactly the confusion we're trying to avoid. Keep one canonical version live at a time unless you're deliberately doing a comparison.

Deleting documents

Library tab → tick the row → Delete selected. Or open one document → trash icon. Deletion is immediate and cascades:

Deletion cannot be undone. There is no trash bin or 30-day recovery window: the bytes are gone the moment the worker processes the delete. If you need to archive before deleting, download the file first from the document detail view.

Moving a document between projects

Library tab → open a document → Edit projects button → tick / untick. The document itself stays put; only its project memberships change. Useful for documents that outgrow their original project ("this used to be HR-only, now Compliance needs it too").
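
One way to picture this is to treat project membership as a set attached to each document. The toy model below is purely illustrative: the memberships dictionary and the edit_projects function are hypothetical and say nothing about how the product stores this internally.

```python
from collections.abc import Iterable

# Hypothetical in-memory model: each document id maps to the set of projects it belongs to.
memberships: dict[str, set[str]] = {"doc-42": {"HR"}}


def edit_projects(doc_id: str, tick: Iterable[str] = (), untick: Iterable[str] = ()) -> set[str]:
    """Add and remove project memberships without touching the document itself."""
    current = memberships[doc_id]
    updated = (current | set(tick)) - set(untick)
    memberships[doc_id] = updated
    return updated
```

Ticking Compliance for doc-42 leaves it in both HR and Compliance. The document entry itself is never copied or moved, only its set of projects changes.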

What admins see vs members

This is enforced at the database level — not just hidden in the UI — so it's safe to expose the workspace to people who shouldn't see everything.