Download the PHP package codewithkyrian/chromadb-php without Composer
Package chromadb-php
Short Description A PHP client for the Chroma Open Source Embedding Database
License MIT
ChromaDB PHP
A PHP library for interacting with Chroma vector database seamlessly.
Note: This package is framework-agnostic and can be used in any PHP project. If you're using Laravel, however, you might want to check out the Laravel-specific package, which provides a more Laravel-like experience and includes a few extra features.
Description
Chroma is an open-source vector database that allows you to store, search, and analyze high-dimensional data at scale. It is designed to be fast, scalable, and reliable. It makes it easy to build LLM (Large Language Model) applications and services that require high-dimensional vector search.
ChromaDB PHP provides a simple and intuitive interface for interacting with Chroma from PHP. It enables you to:
- Create, read, update, and delete documents.
- Execute queries and aggregations.
- Manage collections and indexes.
- Handle authentication and authorization.
- Utilize other ChromaDB features seamlessly.
- And more...
Small Example
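A minimal end-to-end sketch of the workflow this document describes: create a collection, add a few embeddings, then query for the nearest ones. It assumes `$chroma` is an already connected client and that `add()` and `query()` accept the named arguments listed in the sections on inserting and querying. The function name, collection name, ids, and vectors are illustrative only.

```php
<?php

/**
 * Create a collection, insert three tiny embeddings, and return the ids of
 * the two results closest to the first embedding.
 *
 * Assumptions: $chroma is a connected chromadb-php client, and the
 * collection object exposes add() and query() with these named arguments.
 */
function smallExample(object $chroma): array
{
    $collection = $chroma->createCollection('test-collection');

    $collection->add(
        ids: ['test1', 'test2', 'test3'],
        embeddings: [
            [1.0, 2.0, 3.0],
            [4.0, 5.0, 6.0],
            [7.0, 8.0, 9.0],
        ],
        metadatas: [
            ['url' => 'https://example.com/test1'],
            ['url' => 'https://example.com/test2'],
            ['url' => 'https://example.com/test3'],
        ],
    );

    $response = $collection->query(
        queryEmbeddings: [[1.0, 2.0, 3.0]],
        nResults: 2,
    );

    // Assumed response shape: one list of ids per query embedding.
    return $response->ids[0];
}
```

Real embeddings have far more dimensions than three; the short vectors here only keep the sketch readable.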
Requirements
- PHP 8.1 or higher
- ChromaDB 0.4.0 or higher running in client/server mode
Running ChromaDB
In order to use this library, you need to have ChromaDB running somewhere. You can either run it locally or in the cloud. (Chroma doesn't support cloud yet, but it will soon.)
For now, ChromaDB can only run in-memory in Python. You can, however, run it in client/server mode by either running the Python project or using the Docker image (recommended).
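To run the Docker image, use the command docker run -p 8000:8000 chromadb/chroma.
You can also pass in environment variables from a .env file by adding Docker's --env-file option to that command.
Or, if you prefer a docker-compose file, define a service that uses the chromadb/chroma image and publishes port 8000, then start it with docker compose up -d.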
(Check out the Chroma Documentation for more information on how to run ChromaDB.)
Either way, you can now access ChromaDB at http://localhost:8000.
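Installation
With Composer, the package can be installed by running composer require codewithkyrian/chromadb-php.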
Usage
Connecting to ChromaDB
By default, the client will try to connect to http://localhost:8000 using the default database name default_database and the default tenant name default_tenant. You can, however, change these values by constructing the client with the factory method.
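The factory-based connection described above might look like the sketch below. The builder method names (`withHost`, `withPort`, `withDatabase`, `withTenant`, `connect`) and the `Codewithkyrian\ChromaDB` namespace are assumptions recalled from the package's documentation, so check them against the installed version. The wrapper function `connectToChroma` is hypothetical.

```php
<?php

use Codewithkyrian\ChromaDB\ChromaDB;

/**
 * Build a client for a non-default host, port, database and tenant.
 *
 * Assumption: the factory exposes withHost(), withPort(), withDatabase(),
 * withTenant() and connect(), and the host string includes the scheme.
 */
function connectToChroma(string $host, int $port, string $database, string $tenant): object
{
    return ChromaDB::factory()
        ->withHost($host)
        ->withPort($port)
        ->withDatabase($database)
        ->withTenant($tenant)
        ->connect();
}
```

In a project that has loaded Composer's autoloader, a call such as `connectToChroma('http://localhost', 8000, 'new_database', 'new_tenant')` would then return the client used in the rest of this document.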
If the tenant or database doesn't exist, the package will automatically create them for you.
Authentication
ChromaDB supports static token-based authentication. To use it, you need to start the Chroma server with the required environment variables, as stated in the documentation. If you're using the Docker image, you can pass in the environment variables using the --env flag or a .env file. With a docker-compose file, you can use the env_file option or set the environment variables directly in the service definition.
You can then connect to ChromaDB using the factory method:
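A sketch of a token-authenticated connection, assuming the factory exposes a `withAuthToken()` method (recalled from the package's documentation, not confirmed by this text). The wrapper function and its empty-token check are illustrative additions.

```php
<?php

use Codewithkyrian\ChromaDB\ChromaDB;

/**
 * Connect to a Chroma server that was started with static token auth.
 *
 * Assumption: the factory exposes withAuthToken() and connect().
 */
function connectWithToken(string $token): object
{
    if ($token === '') {
        // Fail early instead of sending an unauthenticated request.
        throw new InvalidArgumentException('An auth token is required.');
    }

    return ChromaDB::factory()
        ->withAuthToken($token)
        ->connect();
}
```

Reading the token from the environment, for example `connectWithToken(getenv('CHROMA_TOKEN') ?: '')` with a variable name of your choosing, keeps it out of source control.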
Getting the version
Creating a Collection
Creating a collection is as simple as calling the createCollection method on the client and passing in the name of the collection.
If the collection already exists in the database, the package will throw an exception.
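Because `createCollection` throws when the name is already taken, callers may want to handle that case explicitly. The sketch below assumes `$chroma` is a connected client. It catches `\Throwable` because the specific exception class is not named in this document, which means it also catches unrelated failures such as connection errors. The helper name is hypothetical.

```php
<?php

/**
 * Try to create a collection; return null instead of throwing on failure.
 *
 * Assumption: $chroma is a connected chromadb-php client.
 */
function createCollectionOrNull(object $chroma, string $name): ?object
{
    try {
        return $chroma->createCollection($name);
    } catch (\Throwable $e) {
        // Covers "already exists" as well as any other error.
        error_log("Could not create collection '{$name}': " . $e->getMessage());
        return null;
    }
}
```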
Inserting Documents
To insert documents into a collection, you need to provide the following:
- ids: An array of document ids. The ids must be unique and must be strings.
- embeddings: An array of document embeddings. Each embedding must be a 1D array of floats with a consistent length. You can compute the embeddings using any embedding model of your choice (just make sure that's what you use when querying as well).
- metadatas: An array of document metadatas. Each metadata must be an array of key-value pairs.
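The constraints listed above (unique string ids, embeddings of one consistent length) can be checked before a batch is sent. The sketch below does that in plain PHP and then calls `add()` on a collection object. The named arguments mirror the parameter names in the list, and both helper function names are hypothetical.

```php
<?php

/**
 * Validate a batch against the rules above: string ids, unique ids,
 * one embedding per id, and the same length for every embedding.
 */
function checkBatch(array $ids, array $embeddings): void
{
    foreach ($ids as $id) {
        if (!is_string($id)) {
            throw new InvalidArgumentException('Every id must be a string.');
        }
    }
    if (count($ids) !== count(array_unique($ids))) {
        throw new InvalidArgumentException('Ids must be unique.');
    }
    if (count($ids) !== count($embeddings)) {
        throw new InvalidArgumentException('Provide one embedding per id.');
    }

    $dimensions = null;
    foreach ($embeddings as $vector) {
        $dimensions ??= count($vector);
        if (count($vector) !== $dimensions) {
            throw new InvalidArgumentException('Embeddings must share one length.');
        }
    }
}

/**
 * Validate, then insert. Assumes $collection exposes add() with the
 * named arguments ids, embeddings and metadatas.
 */
function addBatch(object $collection, array $ids, array $embeddings, array $metadatas): void
{
    checkBatch($ids, $embeddings);
    $collection->add(ids: $ids, embeddings: $embeddings, metadatas: $metadatas);
}
```

Validating on the client side is optional, but it turns a server error into a clearer local one.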
If you don't have the embeddings, you can pass in the documents and provide an embedding function that will be used to compute the embeddings for you.
Passing in Embedding Function
To use an embedding function, you need to pass it in as an argument when creating the collection:
The embedding function must be an instance of EmbeddingFunctionInterface. There are a few built-in embedding functions that you can use:
- OpenAIEmbeddingFunction: This embedding function uses the OpenAI API to compute the embeddings. You can get your OpenAI API key and organization id from your OpenAI dashboard, and you can omit the organization id if your API key doesn't belong to an organization. The model name is optional as well and defaults to text-embedding-ada-002.
- JinaEmbeddingFunction: This is a wrapper for the Jina embedding models. You can use it by passing your Jina API key and the desired model. The model defaults to jina-embeddings-v2-base-en.
- HuggingFaceEmbeddingServerFunction: This embedding function is a wrapper around the HuggingFace Text Embedding Server. Before using it, you need to have the HuggingFace Embedding Server running somewhere locally.
Besides the built-in embedding functions, you can also create your own embedding function by implementing the EmbeddingFunction interface (anonymous classes work too).
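A custom embedding function written as an anonymous class could be sketched as follows. To keep the example runnable on its own, it declares a local stand-in interface. The real interface is assumed to expose a single `generate(array $texts): array` method, and in a real project you would import it from the package instead. The character-bucket "embedding" is a toy used only to produce deterministic vectors; it is not a meaningful model.

```php
<?php

// Local stand-in so the sketch runs without the package installed.
// Assumption: the package's interface has this same method signature.
interface EmbeddingFunction
{
    public function generate(array $texts): array;
}

$embeddingFunction = new class implements EmbeddingFunction {
    private const DIMENSIONS = 8;

    /** Return one fixed-length, unit-normalized vector per input text. */
    public function generate(array $texts): array
    {
        return array_map(function (string $text): array {
            $vector = array_fill(0, self::DIMENSIONS, 0.0);

            // Toy featurization: count bytes per bucket.
            for ($i = 0, $n = strlen($text); $i < $n; $i++) {
                $vector[ord($text[$i]) % self::DIMENSIONS] += 1.0;
            }

            // Scale to unit length; leave all-zero vectors untouched.
            $norm = sqrt(array_sum(array_map(fn (float $x) => $x * $x, $vector)));
            if ($norm > 0.0) {
                $vector = array_map(fn (float $x) => $x / $norm, $vector);
            }

            return $vector;
        }, $texts);
    }
};
```

What matters to the package is the contract: an array of texts goes in, and an array with one float vector per text, all of the same length, comes out.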
The embedding function will be called for each batch of documents that are inserted into the collection, and must be provided either when creating the collection or when querying the collection. If you don't provide an embedding function, and you don't provide the embeddings, the package will throw an exception.
Inserting Documents into a Collection with an Embedding Function
Getting a Collection
Or with an embedding function:
Make sure that the embedding function you provide is the same one that was used when creating the collection.
Counting the items in a collection
Updating a collection
Deleting Documents
Querying a Collection
To query a collection, you need to provide the following:
- queryEmbeddings (optional): An array of query embeddings. Each embedding must be a 1D array of floats. You can compute the embeddings using any embedding model of your choice (just make sure that's what you use when inserting as well).
- nResults: The number of results to return. Defaults to 10.
- queryTexts (optional): An array of query texts. The texts must be strings. You can omit this if you provide the embeddings.
- where (optional): The where clause to use to filter items based on their metadata. The where clause must be an array of key-value pairs. The key must be a string, and the value can be a string or an array of valid filter values. The valid filters are $eq, $ne, $in, $nin, $gt, $gte, $lt and $lte:
  - $eq: Equals
  - $ne: Not equals
  - $gt: Greater than
  - $gte: Greater than or equal to
  - $lt: Less than
  - $lte: Less than or equal to
  You can also use multiple filters.
- whereDocument (optional): The where clause to use to filter items based on their document. The where clause must be an array of key-value pairs. The key must be a string, and the value can be a string or an array of valid filter values. In this case, only two filtering keys are supported: $contains and $not_contains.
- include (optional): An array of fields to include in the response. Possible values are embeddings, documents, metadatas and distances. It defaults to embeddings and metadatas (documents are not included by default because they can be large). distances is only valid for querying and not for getting. It returns the distances between the query embeddings and the embeddings of the results.
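The query parameters above can be combined in a single call. The sketch below builds a metadata filter and a document filter as plain PHP arrays and passes them to `query()` on a collection object. The metadata keys (`age`, `country`) and the function name are hypothetical. Combining conditions with `$and` is an assumption based on Chroma's filter syntax rather than on this document, which only states that multiple filters are possible.

```php
<?php

/**
 * Query by text with a metadata filter, a document filter, and an
 * explicit include list.
 *
 * Assumption: $collection exposes query() with the named arguments
 * described in the list above.
 */
function queryWithFilters(object $collection, string $text, string $keyword): object
{
    // Single quotes keep PHP from treating $and, $gte, ... as variables.
    $where = [
        '$and' => [
            ['age' => ['$gte' => 18]],
            ['country' => ['$eq' => 'NG']],
        ],
    ];

    // Only keep results whose document text contains the keyword.
    $whereDocument = ['$contains' => $keyword];

    return $collection->query(
        queryTexts: [$text],
        nResults: 5,
        where: $where,
        whereDocument: $whereDocument,
        include: ['documents', 'metadatas', 'distances'],
    );
}
```

Passing `queryTexts` instead of `queryEmbeddings` relies on the collection having an embedding function, as described in the embedding function section.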
Other relevant information about querying and retrieving a collection can be found in the ChromaDB Documentation.
Deleting items in a collection
To delete the documents in a collection, pass in an array of the ids of the items:
Passing the ids is optional. You can delete items from a collection using a where filter:
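Both deletion styles described above can be sketched with one helper. It assumes the collection exposes a `delete()` method that accepts `ids` and `where` as named arguments. The helper name and its refusal to run without any selector are illustrative choices, not package behavior.

```php
<?php

/**
 * Delete items by id, by metadata filter, or both.
 *
 * Assumption: $collection exposes delete() with named arguments
 * ids and where.
 */
function deleteItems(object $collection, ?array $ids = null, ?array $where = null): void
{
    if ($ids === null && $where === null) {
        // Guard against an accidental "delete everything" style call.
        throw new InvalidArgumentException('Pass ids, a where filter, or both.');
    }

    $collection->delete(ids: $ids, where: $where);
}
```

For example, `deleteItems($collection, ids: ['test1', 'test2'])` targets specific items, while `deleteItems($collection, where: ['source' => 'import'])` removes everything matching a (hypothetical) metadata key.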
Deleting a collection
Deleting a collection is as simple as passing in the name of the collection to be deleted.
Testing
Contributors
- Kyrian Obikwelu
- Other contributors are welcome.
License
This project is licensed under the MIT License. See the LICENSE file for more information.