LangChain
LangChain is a popular framework for working with AI, vectors, and embeddings. It supports using Supabase as a vector store through the pgvector extension.
Initializing your database#
Prepare your database with the relevant tables:
```sql
-- Enable the pgvector extension to work with embedding vectors
create extension vector;

-- Create a table to store your documents
create table documents (
  id bigserial primary key,
  content text, -- corresponds to Document.pageContent
  metadata jsonb, -- corresponds to Document.metadata
  embedding vector(1536) -- 1536 works for OpenAI embeddings, change if needed
);

-- Create a function to search for documents
create function match_documents (
  query_embedding vector(1536),
  match_count int default null,
  filter jsonb DEFAULT '{}'
) returns table (
  id bigint,
  content text,
  metadata jsonb,
  similarity float
)
language plpgsql
as $$
#variable_conflict use_column
begin
  return query
  select
    id,
    content,
    metadata,
    1 - (documents.embedding <=> query_embedding) as similarity
  from documents
  where metadata @> filter
  order by documents.embedding <=> query_embedding
  limit match_count;
end;
$$;
```
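The similarity column returned by match_documents is 1 minus the pgvector cosine distance (the <=> operator), which is the cosine similarity of the two vectors. The sketch below reproduces that arithmetic in plain TypeScript purely for illustration. The function names are hypothetical and are not part of pgvector, Supabase, or LangChain.

```typescript
// Hypothetical helpers, for illustration only: they mirror the arithmetic of
// `1 - (embedding <=> query_embedding)` but are not part of any library.

// Cosine distance between two vectors of equal dimension.
// Note: a zero-length vector yields NaN here because of the division by zero.
function cosineDistance(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same dimension')
  }
  let dot = 0
  let normA = 0
  let normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB))
}

// Similarity as computed in the select list of match_documents.
function similarity(a: number[], b: number[]): number {
  return 1 - cosineDistance(a, b)
}
```

Under this definition, vectors pointing in the same direction score close to 1, orthogonal vectors score close to 0, and opposite vectors score close to -1, which is why the function orders by distance ascending to return the best matches first.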
Usage#
You can now search your documents using any Node.js application. This is intended to be run on a secure server route.
```typescript
import { SupabaseVectorStore } from 'langchain/vectorstores/supabase'
import { OpenAIEmbeddings } from 'langchain/embeddings/openai'
import { createClient } from '@supabase/supabase-js'

const supabaseKey = process.env.SUPABASE_SERVICE_ROLE_KEY
if (!supabaseKey) throw new Error(`Expected SUPABASE_SERVICE_ROLE_KEY`)

const url = process.env.SUPABASE_URL
if (!url) throw new Error(`Expected env var SUPABASE_URL`)

export const run = async () => {
  const client = createClient(url, supabaseKey)

  const vectorStore = await SupabaseVectorStore.fromTexts(
    ['Hello world', 'Bye bye', "What's this?"],
    [{ id: 2 }, { id: 1 }, { id: 3 }],
    new OpenAIEmbeddings(),
    {
      client,
      tableName: 'documents',
      queryName: 'match_documents',
    }
  )

  const resultOne = await vectorStore.similaritySearch('Hello world', 1)

  console.log(resultOne)
}
```
Simple Metadata Filtering#
Given the above match_documents Postgres function, you can also pass a filter parameter to return only documents with a specific metadata field value. This filter parameter is a JSON object, and the match_documents function uses the Postgres JSONB containment operator @> to filter documents by the metadata field values you specify. See the Postgres documentation on the JSONB containment operator for more information.
```typescript
import { SupabaseVectorStore } from 'langchain/vectorstores/supabase'
import { OpenAIEmbeddings } from 'langchain/embeddings/openai'
import { createClient } from '@supabase/supabase-js'

// First, follow set-up instructions above

const privateKey = process.env.SUPABASE_SERVICE_ROLE_KEY
if (!privateKey) throw new Error(`Expected env var SUPABASE_SERVICE_ROLE_KEY`)

const url = process.env.SUPABASE_URL
if (!url) throw new Error(`Expected env var SUPABASE_URL`)

export const run = async () => {
  const client = createClient(url, privateKey)

  const vectorStore = await SupabaseVectorStore.fromTexts(
    ['Hello world', 'Hello world', 'Hello world'],
    [{ user_id: 2 }, { user_id: 1 }, { user_id: 3 }],
    new OpenAIEmbeddings(),
    {
      client,
      tableName: 'documents',
      queryName: 'match_documents',
    }
  )

  const result = await vectorStore.similaritySearch('Hello world', 1, {
    user_id: 3,
  })

  console.log(result)
}
```
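To make the containment semantics concrete, here is a simplified TypeScript model of how metadata @> filter decides whether a row matches. It is an approximation for illustration only: the Json type and the contains function are hypothetical names, and the sketch ignores Postgres's special case where an array is allowed to contain a bare primitive value.

```typescript
// Simplified, hypothetical model of the Postgres JSONB `@>` operator.
// It is NOT how Postgres implements containment; it only mirrors the rules
// for objects, arrays, and scalars so the filter behaviour is easier to see.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json }

function contains(doc: Json, filter: Json): boolean {
  // Arrays: every element of the filter must be contained in some element of the document array.
  if (Array.isArray(filter)) {
    if (!Array.isArray(doc)) return false
    const docItems: Json[] = doc
    return filter.every((wanted) => docItems.some((item) => contains(item, wanted)))
  }
  // Objects: every key in the filter must exist in the document with a contained value.
  if (filter !== null && typeof filter === 'object') {
    if (doc === null || typeof doc !== 'object' || Array.isArray(doc)) return false
    const docObj: { [key: string]: Json } = doc
    const filterObj: { [key: string]: Json } = filter
    return Object.keys(filterObj).every(
      (key) =>
        Object.prototype.hasOwnProperty.call(docObj, key) &&
        contains(docObj[key], filterObj[key])
    )
  }
  // Scalars: plain equality.
  return doc === filter
}
```

With this model, contains({ user_id: 3, lang: 'en' }, { user_id: 3 }) is true because extra keys in the metadata are ignored, and an empty filter {} matches every document, which is why the match_documents function above defaults filter to '{}'.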
Advanced Metadata Filtering#
You can also use query builder-style filtering (similar to how the Supabase JavaScript library works) instead of passing an object. Because the filter properties live in the metadata column, you need to use arrow operators (-> for integer or ->> for text) as defined in the PostgREST API documentation and specify the data type of the property (for example, the column should look something like metadata->some_int_value::int).
```typescript
import { SupabaseFilterRPCCall, SupabaseVectorStore } from 'langchain/vectorstores/supabase'
import { OpenAIEmbeddings } from 'langchain/embeddings/openai'
import { createClient } from '@supabase/supabase-js'

// First, follow set-up instructions above

const privateKey = process.env.SUPABASE_SERVICE_ROLE_KEY
if (!privateKey) throw new Error(`Expected env var SUPABASE_SERVICE_ROLE_KEY`)

const url = process.env.SUPABASE_URL
if (!url) throw new Error(`Expected env var SUPABASE_URL`)

export const run = async () => {
  const client = createClient(url, privateKey)

  const embeddings = new OpenAIEmbeddings()

  const store = new SupabaseVectorStore(embeddings, {
    client,
    tableName: 'documents',
  })

  const docs = [
    {
      pageContent:
        'This is a long text, but it actually means something because vector database does not understand Lorem Ipsum. So I would need to expand upon the notion of quantum fluff, a theoretical concept where subatomic particles coalesce to form transient multidimensional spaces. Yet, this abstraction holds no real-world application or comprehensible meaning, reflecting a cosmic puzzle.',
      metadata: { b: 1, c: 10, stuff: 'right' },
    },
    {
      pageContent:
        'This is a long text, but it actually means something because vector database does not understand Lorem Ipsum. So I would need to proceed by discussing the echo of virtual tweets in the binary corridors of the digital universe. Each tweet, like a pixelated canary, hums in an unseen frequency, a fascinatingly perplexing phenomenon that, while conjuring vivid imagery, lacks any concrete implication or real-world relevance, portraying a paradox of multidimensional spaces in the age of cyber folklore.',
      metadata: { b: 2, c: 9, stuff: 'right' },
    },
    { pageContent: 'hello', metadata: { b: 1, c: 9, stuff: 'right' } },
    { pageContent: 'hello', metadata: { b: 1, c: 9, stuff: 'wrong' } },
    { pageContent: 'hi', metadata: { b: 2, c: 8, stuff: 'right' } },
    { pageContent: 'bye', metadata: { b: 3, c: 7, stuff: 'right' } },
    { pageContent: "what's this", metadata: { b: 4, c: 6, stuff: 'right' } },
  ]

  await store.addDocuments(docs)

  // Combine metadata filters with a full-text search on the content column
  const funcFilterA: SupabaseFilterRPCCall = (rpc) =>
    rpc
      .filter('metadata->b::int', 'lt', 3)
      .filter('metadata->c::int', 'gt', 7)
      .textSearch('content', `'multidimensional' & 'spaces'`, {
        config: 'english',
      })

  const resultA = await store.similaritySearch('quantum', 4, funcFilterA)

  // Filter on integer properties (->) and a text property (->>)
  const funcFilterB: SupabaseFilterRPCCall = (rpc) =>
    rpc
      .filter('metadata->b::int', 'lt', 3)
      .filter('metadata->c::int', 'gt', 7)
      .filter('metadata->>stuff', 'eq', 'right')

  const resultB = await store.similaritySearch('hello', 2, funcFilterB)

  console.log(resultA, resultB)
}
```
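If you build many of these filters, the column expressions can be generated rather than typed by hand. The helper below is a hypothetical convenience, not part of LangChain or the Supabase client; it only assembles the same strings used in the example above (-> with an ::int cast for integers, ->> for text).

```typescript
// Hypothetical helper: builds the column expression for a property stored in
// the metadata JSONB column, following the pattern shown in the example above.
type MetadataType = 'int' | 'text'

function metadataColumn(key: string, type: MetadataType): string {
  // `->` keeps the value as JSON, so integers need an explicit cast;
  // `->>` extracts the value as text and needs no cast.
  return type === 'int' ? `metadata->${key}::int` : `metadata->>${key}`
}
```

A filter could then be written as rpc.filter(metadataColumn('b', 'int'), 'lt', 3), which keeps the operator and cast choice in one place.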
Hybrid search#
LangChain supports hybrid search, which combines similarity search with full-text search. Read the official docs to get started: Supabase Hybrid Search.
You can install the LangChain Hybrid Search function through our database.dev package manager.
Resources#
- Official LangChain site.
- Official LangChain docs.
- Supabase Hybrid Search.