Scroll Vector Database
sense.scroll.team is a vector database that stores vector embeddings of the latest articles published by Scroll. New articles are added as they are published.
Tech:
- Typesense as the search engine and vector database.
- ts/all-MiniLM-L12-v2 model for generating vector embeddings.
- FastAPI for backend API.
- Vue.js with vue-instantsearch/vue3/es for frontend.
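A minimal sketch of what a Typesense collection schema for this stack might look like, using Typesense's built-in embedding generation with the ts/all-MiniLM-L12-v2 model. The collection name and the field names (title, content, section, tags, published_at, embedding) are assumptions for illustration, not the actual schema behind sense.scroll.team.

```python
import json


def build_articles_schema(collection_name: str = "articles") -> dict:
    """Build a Typesense collection schema with an auto-embedding field.

    All field names here are hypothetical; adjust them to the real article data.
    """
    return {
        "name": collection_name,
        "fields": [
            {"name": "title", "type": "string"},
            {"name": "content", "type": "string"},
            # Fields meant for faceting must be marked as facets in the schema.
            {"name": "section", "type": "string", "facet": True},
            {"name": "tags", "type": "string[]", "facet": True},
            # Unix timestamp of publication (assumed representation).
            {"name": "published_at", "type": "int64"},
            # Typesense generates this vector from the listed source fields
            # at indexing time, using the built-in model named below.
            {
                "name": "embedding",
                "type": "float[]",
                "embed": {
                    "from": ["title", "content"],
                    "model_config": {"model_name": "ts/all-MiniLM-L12-v2"},
                },
            },
        ],
    }


if __name__ == "__main__":
    # Print the schema as JSON, the form in which it is sent to the server.
    print(json.dumps(build_articles_schema(), indent=2))
```

With the official typesense Python client, a dict like this would typically be passed to client.collections.create(schema) before any documents are imported.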
Learnings:
- Generating embeddings locally is straightforward, and semantic search with Typesense is accurate and fast. It scales well with the number of articles, even on limited hardware.
- Alongside semantic search, Typesense provides good support for keyword search, filtering, and faceting. Remember to set "facet": true on eligible fields in the schema before populating data.
- Typesense also offers other powerful features, including hybrid search, geo search, image search, natural language search, conversational search, and AI agent modes.
- For semantic search, long paragraphs are better than short phrases.
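To illustrate the keyword, semantic, and hybrid modes mentioned above, here is a rough sketch of how the search parameters could be assembled on the FastAPI side. The helper name build_search_params and the field names (title, content, section, embedding) are hypothetical and assume a collection with an auto-embedding field called embedding and a facet-enabled section field.

```python
from typing import Optional


def build_search_params(
    query: str,
    mode: str = "hybrid",
    section: Optional[str] = None,
    per_page: int = 10,
) -> dict:
    """Assemble Typesense search parameters for one of three search modes.

    Field names are assumptions; they must match the real collection schema.
    """
    if mode == "keyword":
        # Plain text match on the text fields only.
        query_by = "title,content"
    elif mode == "semantic":
        # Querying only the auto-embedding field gives a vector search.
        query_by = "embedding"
    elif mode == "hybrid":
        # Text fields plus the embedding field combine both result sets.
        query_by = "title,content,embedding"
    else:
        raise ValueError(f"unknown search mode: {mode!r}")

    params = {
        "q": query,
        "query_by": query_by,
        # Only works if 'section' was declared as a facet in the schema.
        "facet_by": "section",
        # Keep the large vectors out of the response payload.
        "exclude_fields": "embedding",
        "per_page": per_page,
    }
    if section is not None:
        # Exact-match filter; assumes the value needs no escaping.
        params["filter_by"] = f"section:={section}"
    return params


if __name__ == "__main__":
    print(build_search_params("rising sea levels in coastal cities", mode="semantic"))
```

With the official typesense Python client, the resulting dict would typically be passed to client.collections["articles"].documents.search(params), where "articles" is again an assumed collection name.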