Create embeddings
To create embeddings from documents, follow these steps:
Open the command line interface.
Run the following command:
llmsearch index create -c /path/to/config.yaml
The default vector database for dense embeddings is ChromaDB, and the default embedding model is e5-large-v2, which is known for strong performance on embedding benchmarks. A different model can be specified in the embedding_model section of the config, as in the example above.
You can find more information about this and other embedding models on the MTEB Leaderboard.
In addition to dense embeddings, sparse embeddings will be generated in /path/to/embedding/folder/splade using the SPLADE algorithm.
Both dense and sparse embeddings are used for context search, and the retrieved results are ranked using an offline re-ranker.
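To illustrate how results from two retrievers can feed a single re-ranking step, here is a minimal sketch. It is not the tool's implementation: `merge_and_rerank` and `toy_overlap_score` are hypothetical names, and the word-overlap scorer is only a stand-in for a real re-ranking model.

```python
from typing import Callable, List


def merge_and_rerank(
    query: str,
    dense_hits: List[str],
    sparse_hits: List[str],
    score: Callable[[str, str], float],
    top_k: int = 3,
) -> List[str]:
    """Pool candidates from both retrievers, then order them with one scorer."""
    # dict.fromkeys removes duplicates while keeping first-seen order.
    candidates = list(dict.fromkeys(dense_hits + sparse_hits))
    # Score every candidate against the query with the same re-ranker.
    ranked = sorted(candidates, key=lambda doc: score(query, doc), reverse=True)
    return ranked[:top_k]


def toy_overlap_score(query: str, doc: str) -> float:
    """Stand-in for a real re-ranker: fraction of query words found in the doc."""
    query_words = set(query.lower().split())
    doc_words = set(doc.lower().split())
    return len(query_words & doc_words) / max(len(query_words), 1)


# Hypothetical hits, as a dense and a sparse retriever might return them.
dense = ["chroma stores dense vectors", "splade builds sparse vectors"]
sparse = ["splade builds sparse vectors", "md5 hashes detect changed files"]

results = merge_and_rerank("sparse vectors splade", dense, sparse, toy_overlap_score)
print(results)
```

Scoring the pooled candidates with one function avoids having to compare dense similarity scores with sparse ones, which are on different scales.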
Update embeddings (optional)
When new files are added or existing documents are changed, run the following command to update the embeddings:
llmsearch index update -c /path/to/config.yaml
This command detects new or changed files by comparing MD5 hashes and updates the embeddings incrementally, so the documents do not need to be rescanned from scratch.
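The idea of hash-based change detection can be sketched in a few lines. This is an illustration under assumptions, not the tool's actual code: `md5_of_file` and `find_changed_files` are hypothetical helpers, and the stored hashes are kept in a plain dictionary keyed by file path.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Dict, List


def md5_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, read in chunks to bound memory."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_changed_files(folder: Path, known_hashes: Dict[str, str]) -> List[Path]:
    """List files that are new or whose MD5 differs from the stored value."""
    changed = []
    for path in sorted(p for p in folder.rglob("*") if p.is_file()):
        # A missing entry (new file) or a different digest both count as changed.
        if known_hashes.get(str(path)) != md5_of_file(path):
            changed.append(path)
    return changed


with tempfile.TemporaryDirectory() as tmp:
    docs = Path(tmp)
    (docs / "a.txt").write_text("first version")
    (docs / "b.txt").write_text("unchanged")
    # Snapshot the hashes, as an earlier indexing run might have recorded them.
    snapshot = {str(p): md5_of_file(p) for p in docs.iterdir()}
    (docs / "a.txt").write_text("second version")  # edit an existing file
    (docs / "c.txt").write_text("brand new")       # add a new file
    changed_names = [p.name for p in find_changed_files(docs, snapshot)]

print(changed_names)
```

Only the files in the returned list would need to be re-embedded; everything else keeps its existing embeddings.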
Interact with documents
To interact with the documents using one of the supported LLMs, follow these steps:
Open the command line interface.
Launch the web interface:
llmsearch interact webapp -c /path/to/config_folder -m /path/to/model_config.yaml
Here /path/to/config_folder points to a folder containing one or more document config files. The tool scans the configs and lets you switch between them.
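As a rough picture of what scanning a config folder involves, the sketch below collects the YAML files in a directory. It is an assumption-laden illustration: `discover_configs` is a hypothetical helper, and the choice of the `.yaml` and `.yml` extensions is assumed rather than taken from the tool.

```python
import tempfile
from pathlib import Path
from typing import Dict


def discover_configs(config_folder: Path) -> Dict[str, Path]:
    """Map each config's file stem to its path, for .yaml and .yml files."""
    found = {}
    for path in sorted(config_folder.iterdir()):
        # Assumption: document configs are YAML files directly in the folder.
        if path.is_file() and path.suffix.lower() in {".yaml", ".yml"}:
            found[path.stem] = path
    return found


with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    (folder / "papers.yaml").write_text("# placeholder config\n")
    (folder / "wiki.yml").write_text("# placeholder config\n")
    (folder / "notes.txt").write_text("not a config\n")
    config_names = list(discover_configs(folder))

print(config_names)
```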
API and MCP Server
To launch the FastAPI/MCP server, set the FASTAPI_RAG_CONFIG environment variable to the path of the semantic search config file and FASTAPI_LLM_CONFIG to the path of the LLM config file, then run llmsearchapi:
FASTAPI_RAG_CONFIG="/path/to/config.yaml" FASTAPI_LLM_CONFIG="/path/to/llm.yaml" llmsearchapi
The API server will be available at http://localhost:8000/docs and can be used to interact with the documents using the LLMs.
The MCP server will be available at http://localhost:8000/mcp and can be used from any MCP client that supports SSE servers; configure the client to point to the same URL.
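The launch command above can also be issued from a script. The sketch below only assumes what the text states: the two environment variable names and the llmsearchapi executable. `build_server_env` and `launch_server` are hypothetical helper names, and `launch_server` is defined but not called here because it needs the tool installed.

```python
import os
import subprocess
from typing import Dict, Optional


def build_server_env(
    rag_config: str,
    llm_config: str,
    base_env: Optional[Dict[str, str]] = None,
) -> Dict[str, str]:
    """Copy an environment and add the two variables the server reads."""
    env = dict(os.environ if base_env is None else base_env)
    env["FASTAPI_RAG_CONFIG"] = rag_config
    env["FASTAPI_LLM_CONFIG"] = llm_config
    return env


def launch_server(rag_config: str, llm_config: str) -> subprocess.Popen:
    """Start llmsearchapi as a child process (requires llmsearchapi on PATH)."""
    return subprocess.Popen(
        ["llmsearchapi"],
        env=build_server_env(rag_config, llm_config),
    )


# Build the environment without starting anything, to inspect what would be set.
server_env = build_server_env("/path/to/config.yaml", "/path/to/llm.yaml", base_env={})
print(server_env)
```

Copying the environment instead of mutating os.environ keeps the settings local to the child process.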