Scroll Vector Database

sense.scroll.team is a vector database that stores vector embeddings of the latest articles published by Scroll. New articles are added as they are published.

Tech:

  • Typesense as the search engine and vector database.
  • The ts/all-MiniLM-L12-v2 model for generating vector embeddings.
  • FastAPI for the backend API.
  • Vue.js with vue-instantsearch/vue3/es for the frontend.
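
To show how these pieces can fit together, here is a minimal sketch of a Typesense collection schema that asks Typesense to generate embeddings itself with the ts/all-MiniLM-L12-v2 model. The collection name and every field name (title, body, section, author, published_at, embedding) are assumptions made for illustration; the schema actually used by sense.scroll.team is not documented here.

```python
from typing import Any

# Built-in Typesense model named in the tech list above.
EMBEDDING_MODEL = "ts/all-MiniLM-L12-v2"


def build_articles_schema(collection_name: str = "articles") -> dict[str, Any]:
    """Return a Typesense collection schema with an auto-embedding field.

    All field names are hypothetical; adapt them to the real article data.
    """
    return {
        "name": collection_name,
        "fields": [
            {"name": "title", "type": "string"},
            {"name": "body", "type": "string"},
            # Faceting has to be enabled per field in the schema.
            {"name": "section", "type": "string", "facet": True},
            {"name": "author", "type": "string", "facet": True},
            # Unix timestamp, used as the default sort order.
            {"name": "published_at", "type": "int64"},
            {
                # Typesense fills this field by embedding the "from" fields.
                "name": "embedding",
                "type": "float[]",
                "embed": {
                    "from": ["title", "body"],
                    "model_config": {"model_name": EMBEDDING_MODEL},
                },
            },
        ],
        "default_sorting_field": "published_at",
    }


if __name__ == "__main__":
    import json

    print(json.dumps(build_articles_schema(), indent=2))
```

With the official typesense Python client, a dict like this would typically be passed to client.collections.create(...). The function only builds the dict, so it can be inspected without a running server.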

Learnings:

  • It’s pretty easy to generate embeddings locally, and semantic search with Typesense is accurate and fast. It scales well with the number of articles, even on limited hardware.
  • Alongside semantic search, Typesense has good support for keyword search, filtering, and faceting. Don’t forget to add facet=True to eligible fields in the schema before populating data.
  • Typesense also offers other powerful features, such as hybrid search, geo search, image search, natural language search, conversational search, and AI agent modes.
  • For semantic search, long paragraphs are better than short phrases.
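
The search modes mentioned above differ mainly in which fields the query runs against. The sketch below builds Typesense search parameters for keyword, semantic, and hybrid search, with optional filtering and faceting. It assumes a hypothetical collection with title, body, section, and author fields plus an auto-embedding field named embedding; none of these names come from the original project.

```python
from typing import Any, Optional


def build_search_params(
    query: str,
    mode: str = "semantic",
    section: Optional[str] = None,
    per_page: int = 10,
) -> dict[str, Any]:
    """Build Typesense search parameters for one of three search modes.

    Field names (title, body, section, author, embedding) are hypothetical.
    """
    if mode == "keyword":
        query_by = "title,body"
    elif mode == "semantic":
        # Querying only the embedding field gives a pure vector search.
        query_by = "embedding"
    elif mode == "hybrid":
        # Mixing text fields with the embedding field gives hybrid search.
        query_by = "title,body,embedding"
    else:
        raise ValueError(f"unknown search mode: {mode!r}")

    params: dict[str, Any] = {
        "q": query,
        "query_by": query_by,
        # Only fields declared with facet=True in the schema can be listed here.
        "facet_by": "section,author",
        # Keep the large vectors out of the response payload.
        "exclude_fields": "embedding",
        "per_page": per_page,
    }
    if section is not None:
        # ":=" is an exact match; values with special characters need escaping.
        params["filter_by"] = f"section:={section}"
    return params


if __name__ == "__main__":
    print(build_search_params("monsoon flooding in Assam", mode="hybrid"))
```

With the official typesense Python client, the resulting dict would typically be passed to client.collections["articles"].documents.search(...). Keeping parameter construction in a plain function makes it easy to expose the mode as a query parameter on a FastAPI endpoint.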