AI and Vector Search
This guide covers AYB's shipped vector-search surface: embedding wiring, vector index admin APIs, nearest-neighbor queries, semantic queries, and hybrid search.
It is based on:
- internal/api/vector_query.go, internal/api/semantic_query.go, internal/api/hybrid_search.go, internal/api/handler_list.go
- internal/server/vector_admin_handler.go, internal/server/routes_admin.go
- internal/cli/start_services_wiring_support.go::wireAIEmbedding
- internal/config/config_types.go::AIConfig
- Tests: vector_query_test.go, semantic_query_test.go, hybrid_search_test.go, vector_admin_handler_test.go, embedding_test.go, config_ai_test.go
AI assistant endpoints under /api/admin/ai/assistant* are documented separately. This page focuses on vector search.
Prerequisites
- A table with at least one vector column (vector(N) is recommended so dimensions are enforced).
- pgvector available in the connected database.
- An embedding-capable AI provider wired via [ai] config.
If you are running AYB in managed PostgreSQL mode, verify that the managed PostgreSQL build you are using includes pgvector. If it does not, point AYB at an external PostgreSQL instance with pgvector installed.
AI embedding config
AIConfig keys used by vector search:
- default_provider
- default_model
- embedding_provider (optional, falls back to default_provider)
- embedding_model (optional)
- embedding_dimensions (map key format: provider:model, case-insensitive)
- timeout
- max_retries
- breaker.failure_threshold
- breaker.open_seconds
- breaker.half_open_probe_limit
- providers.<name>.api_key
- providers.<name>.base_url
- providers.<name>.default_model
Provider/model resolution in wireAIEmbedding:
- Provider: embedding_provider -> default_provider
- Model: embedding_model -> providers.<provider>.default_model -> default_model
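The fallback order can be sketched in a few lines of Python. This is an illustration of the documented order only, not the Go code in wireAIEmbedding. The function name resolve_embedding_target and the treatment of missing or empty values as "unset" are assumptions.

```python
from typing import Tuple


def resolve_embedding_target(ai: dict) -> Tuple[str, str]:
    """Return (provider, model) following the documented fallback order.

    `ai` mirrors the [ai] config table as a plain dict. Empty or missing
    values are treated as unset (an assumption of this sketch).
    """
    # Provider: embedding_provider -> default_provider
    provider = ai.get("embedding_provider") or ai.get("default_provider") or ""

    # Model: embedding_model -> providers.<provider>.default_model -> default_model
    provider_cfg = ai.get("providers", {}).get(provider, {})
    model = (
        ai.get("embedding_model")
        or provider_cfg.get("default_model")
        or ai.get("default_model")
        or ""
    )
    return provider, model
```

With only embedding_provider = "ollama" set and providers.ollama.default_model = "nomic-embed-text", this sketch resolves to the Ollama provider and its per-provider default model rather than the global default_model.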
Configured dimensions:
- If embedding_dimensions[provider:model] exists, AYB stores that expected dimension.
- On semantic queries, AYB fails fast if that configured dimension does not match the target vector column dimension.
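A minimal sketch of the case-insensitive lookup and the fail-fast check described above. The helper names expected_dimension and check_column_dimension are hypothetical, and skipping the check when either side is unknown is an assumption rather than confirmed AYB behavior.

```python
from typing import Dict, Optional


def expected_dimension(dimensions: Dict[str, int], provider: str, model: str) -> Optional[int]:
    """Look up the configured dimension for provider:model, ignoring case."""
    wanted = f"{provider}:{model}".lower()
    for key, dim in dimensions.items():
        if key.lower() == wanted:
            return dim
    return None


def check_column_dimension(expected: Optional[int], column_dim: Optional[int]) -> None:
    """Raise before any embedding call if the configured and column dimensions differ."""
    if expected is None or column_dim is None:
        # Nothing configured, or an unsized vector column: nothing to compare.
        return
    if expected != column_dim:
        raise ValueError(
            f"configured embedding dimension {expected} does not match vector({column_dim}) column"
        )
```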
Example:
[ai]
default_provider = "openai"
default_model = "gpt-4o-mini"
embedding_provider = "openai"
embedding_model = "text-embedding-3-small"
timeout = 30
max_retries = 2
[ai.embedding_dimensions]
"openai:text-embedding-3-small" = 1536
[ai.providers.openai]
api_key = "${OPENAI_API_KEY}"
default_model = "gpt-4o-mini"
[ai.providers.ollama]
base_url = "http://localhost:11434"
default_model = "nomic-embed-text"
Admin vector indexes
All endpoints require admin auth (Authorization: Bearer <admin-token>).
- POST /api/admin/vector/indexes
- GET /api/admin/vector/indexes
Create index
Request JSON:
- table (required)
- column (required, must be a vector column)
- method (required: hnsw or ivfflat)
- metric (required: cosine, l2, inner_product)
- schema (optional, defaults to public unless schema cache resolves table schema)
- index_name (optional, auto-generated as idx_<table>_<column>_<method>)
- lists (optional, used for ivfflat)
Example:
curl -X POST http://localhost:8090/api/admin/vector/indexes \
-H "Authorization: Bearer $AYB_ADMIN_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"table": "documents",
"column": "embedding",
"method": "ivfflat",
"metric": "cosine",
"lists": 100
}'
AYB executes CREATE INDEX CONCURRENTLY IF NOT EXISTS ....
Response fields on success:
- index_name
- method
- metric
- table
- column
AYB returns 409 when another concurrent index build is already in progress.
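A client can catch obviously invalid requests before sending them. The sketch below builds the create-index body and mirrors the documented auto-generated name. The function names are hypothetical, and the checks only restate the field rules listed above; the server remains the authority on validation.

```python
from typing import Optional

ALLOWED_METHODS = {"hnsw", "ivfflat"}
ALLOWED_METRICS = {"cosine", "l2", "inner_product"}


def default_index_name(table: str, column: str, method: str) -> str:
    """Mirror the documented auto-generated name idx_<table>_<column>_<method>."""
    return f"idx_{table}_{column}_{method}"


def build_index_request(table: str, column: str, method: str, metric: str,
                        lists: Optional[int] = None,
                        index_name: Optional[str] = None) -> dict:
    """Build the JSON body for POST /api/admin/vector/indexes."""
    if not table or not column:
        raise ValueError("table and column are required")
    if method not in ALLOWED_METHODS:
        raise ValueError(f"method must be one of {sorted(ALLOWED_METHODS)}")
    if metric not in ALLOWED_METRICS:
        raise ValueError(f"metric must be one of {sorted(ALLOWED_METRICS)}")

    body = {"table": table, "column": column, "method": method, "metric": metric}
    if index_name is not None:
        body["index_name"] = index_name  # otherwise the server generates one
    if lists is not None:
        body["lists"] = lists  # documented as used for ivfflat
    return body
```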
List indexes
GET /api/admin/vector/indexes returns:
{
"indexes": [
{
"name": "idx_documents_embedding_hnsw",
"schema": "public",
"table": "documents",
"method": "hnsw",
"definition": "CREATE INDEX ..."
}
]
}
Only hnsw and ivfflat indexes are included.
Query modes
All query modes are on collection list endpoints:
GET /api/collections/{table}
Common vector params:
- vector_column (optional unless table has multiple vector columns)
- distance (optional; defaults to cosine; allowed: cosine, l2, inner_product)
- perPage controls result limit
- filter is applied before ranking
1) Nearest-neighbor (nearest)
Use a JSON array in the query string. The -g flag stops curl from interpreting the square brackets as a URL glob range:
curl -g "http://localhost:8090/api/collections/documents?nearest=[0.12,0.34,0.56]&distance=cosine&perPage=10"
Validation enforced:
- nearest must be a JSON array of numbers
- vector cannot be empty
- if column is vector(N), query vector must have dimension N
Response rows include _distance.
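When building the request programmatically, it is safer to URL-encode the JSON array than to paste it into the URL. A minimal sketch, assuming a hypothetical helper nearest_url; the local checks restate the validation rules for nearest and do not replace the server's own checks.

```python
import json
from typing import List, Optional
from urllib.parse import urlencode


def nearest_url(base: str, table: str, vector: List[float],
                distance: str = "cosine", per_page: int = 10,
                expected_dim: Optional[int] = None) -> str:
    """Build a nearest-neighbor list URL with a properly encoded vector."""
    if not vector:
        raise ValueError("vector cannot be empty")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not all(isinstance(x, (int, float)) and not isinstance(x, bool) for x in vector):
        raise ValueError("nearest must be a JSON array of numbers")
    if expected_dim is not None and len(vector) != expected_dim:
        raise ValueError(f"vector has {len(vector)} dimensions, column expects {expected_dim}")

    query = urlencode({
        "nearest": json.dumps(vector, separators=(",", ":")),
        "distance": distance,
        "perPage": per_page,
    })
    return f"{base}/api/collections/{table}?{query}"
```

The resulting URL contains no raw brackets or commas, so it can be passed to any HTTP client without special quoting.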
2) Semantic query (semantic_query)
AYB embeds the text, then runs nearest-neighbor search.
curl "http://localhost:8090/api/collections/documents?semantic_query=find+similar+docs&distance=l2&perPage=10"
Error mapping includes:
- 400 when semantic search is not configured or pgvector is unavailable
- 500 when configured embedding dimensions do not match the target vector column, or when the provider returns an embedding with the wrong dimension
- 504 for embed timeout/cancel
- 429 for provider rate limit (ProviderError 429)
- 502 for provider auth/other upstream errors, including empty embedding results
- 503 for breaker-open conditions
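On the client side, these statuses can drive a simple retry decision. The table below restates the documented mapping. The choice of which statuses to retry (429, 503, 504) is an assumption of this sketch, not something AYB prescribes.

```python
from typing import Tuple

# Documented status codes for semantic_query failures.
SEMANTIC_QUERY_ERRORS = {
    400: "semantic search not configured, or pgvector unavailable",
    429: "provider rate limit",
    500: "embedding dimension mismatch",
    502: "provider auth or other upstream error (including empty embeddings)",
    503: "circuit breaker open",
    504: "embedding timeout or cancellation",
}

# Assumption: transient conditions worth retrying with backoff.
RETRYABLE = {429, 503, 504}


def classify(status: int) -> Tuple[str, bool]:
    """Return (reason, should_retry) for a semantic_query error status."""
    reason = SEMANTIC_QUERY_ERRORS.get(status, "unmapped status")
    return reason, status in RETRYABLE
```

A 400 or 500 here points at configuration (missing provider, dimension mismatch), so retrying the same request unchanged is unlikely to help.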
3) Hybrid search (search + semantic=true)
Hybrid mode combines full-text and vector results using reciprocal rank fusion.
It requires both search=<text> and semantic=true. The hybrid path only activates when both are present.
It also requires at least one text column for the full-text side of the query, in addition to the target vector column.
curl "http://localhost:8090/api/collections/articles?search=postgres+indexing&semantic=true&distance=cosine&perPage=10"
Rules from dispatchVectorPaths:
- semantic=true cannot be combined with nearest or semantic_query
- nearest and semantic_query are mutually exclusive
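The parameter rules can be expressed as a small dispatcher. This is a sketch of the documented rules only, not the Go code in dispatchVectorPaths. The function name, the returned mode labels, and the fallback to a plain list when semantic=true arrives without search are assumptions.

```python
def select_query_mode(params: dict) -> str:
    """Pick a query mode from list-endpoint query parameters."""
    has_nearest = bool(params.get("nearest"))
    has_semantic_query = bool(params.get("semantic_query"))
    semantic = params.get("semantic") == "true"
    has_search = bool(params.get("search"))

    if semantic and (has_nearest or has_semantic_query):
        raise ValueError("semantic=true cannot be combined with nearest or semantic_query")
    if has_nearest and has_semantic_query:
        raise ValueError("nearest and semantic_query are mutually exclusive")

    if semantic and has_search:
        return "hybrid"  # requires both search and semantic=true
    if has_nearest:
        return "nearest"
    if has_semantic_query:
        return "semantic_query"
    return "list"  # no vector path selected
```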
Hybrid responses include fused fields:
- _fts_rank
- _vector_distance
- _hybrid_score
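Reciprocal rank fusion scores each row by summing 1 / (k + rank) over every ranking in which it appears. The sketch below shows the general technique on two ranked id lists. The constant k = 60 is a commonly used default and an assumption here; AYB's actual constant and tie-breaking are not covered by this page.

```python
from typing import Dict, List, Tuple


def reciprocal_rank_fusion(fts_ids: List[str], vector_ids: List[str],
                           k: int = 60) -> List[Tuple[str, float]]:
    """Fuse two rankings (best first) into one list sorted by RRF score."""
    scores: Dict[str, float] = {}
    for ranking in (fts_ids, vector_ids):
        for rank, doc_id in enumerate(ranking, start=1):
            # Each appearance contributes 1 / (k + rank); ranks start at 1.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; sorted() is stable, so ties keep first-seen order.
    return sorted(scores.items(), key=lambda item: -item[1])
```

A row that ranks well in both lists outranks a row that tops only one of them, which is why hybrid results favor documents matched by both the full-text and the vector side.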
BYOK chat (movies demo)
The movies demo exposes a Bring-Your-Own-Key (BYOK) chat flow that pairs vector search with streaming AI completions. Provider API keys live in AYB's vault and are referenced by name when registering a BYOK binding, so the raw key never travels in the request body. For the full demo overview, see Demos.
Accepted providers: openai, anthropic, ollama. The vault secret named in the BYOK binding must already exist before the binding can be set — POST /api/admin/movies/byok validates the secret against the vault before installing the mapping.
Set a BYOK binding
curl -X POST http://localhost:8090/api/admin/movies/byok \
-H "Authorization: Bearer $AYB_ADMIN_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"provider": "openai",
"secret_name": "openai_api_key"
}'
Clear a BYOK binding
DELETE /api/admin/movies/byok/{provider} removes the binding for the given provider.
curl -X DELETE http://localhost:8090/api/admin/movies/byok/openai \
-H "Authorization: Bearer $AYB_ADMIN_TOKEN"
Clearing is idempotent: clearing an unbound provider returns 204 just like clearing a bound one.
Stream chat
POST /api/admin/movies/chat/stream resolves the provider (BYOK-aware), then streams the completion as Server-Sent Events. The wire format is a single start event, zero or more chunk events, then either a done event on completion or an error event on stream failure.
curl -N -X POST http://localhost:8090/api/admin/movies/chat/stream \
-H "Authorization: Bearer $AYB_ADMIN_TOKEN" \
-H "Content-Type: application/json" \
-H "Accept: text/event-stream" \
-d '{
"messages": [
{"role": "user", "content": "Recommend a movie like Inception."}
],
"provider": "openai",
"model": "gpt-4o-mini",
"session_id": ""
}'
session_id is optional. When it is empty or not a valid UUID, AYB mints a fresh session id and returns it on the start event so the client can append further turns to the same conversation.
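A client consuming this stream needs to split it into events and confirm the start, chunk, done/error ordering. The sketch below parses generic Server-Sent Events lines and checks that ordering. It does not interpret the data payloads, whose shape is not specified on this page, and both function names are hypothetical.

```python
from typing import Iterable, Iterator, List, Tuple


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (event, data) pairs from decoded SSE lines.

    A blank line terminates an event. Unlike a strict SSE parser, this
    sketch also emits named events that carry no data.
    """
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            if data or event != "message":
                yield event, "\n".join(data)
            event, data = "message", []
            continue
        if line.startswith(":"):
            continue  # comment / keep-alive line
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # strip the single optional leading space
        if field == "event":
            event = value
        elif field == "data":
            data.append(value)


def is_valid_sequence(names: List[str]) -> bool:
    """Check the order: one start, zero or more chunk, then done or error."""
    if len(names) < 2 or names[0] != "start":
        return False
    if names[-1] not in ("done", "error"):
        return False
    return all(name == "chunk" for name in names[1:-1])
```

With an HTTP client that exposes the response body line by line, the decoded lines can be fed directly into parse_sse, and the client can stop reading once a done or error event arrives.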