EVA DB#

Database system for building simpler and faster AI-powered applications.

Welcome to EVA DB#

EVA DB is an AI-SQL database for developing applications powered by AI models. We aim to simplify the development and deployment of AI-powered applications that operate on structured (tables, feature stores) and unstructured data (videos, text, podcasts, PDFs, etc.).

GitHub: https://github.com/georgia-tech-db/eva
PyPI: https://pypi.org/project/evadb/
Twitter: https://twitter.com/evadb_ai
Slack: Invite link

Why EVA?#

Over the last decade, AI models have radically changed the world of natural language processing and computer vision. They are accurate on various tasks ranging from question answering to object tracking in videos. However, two challenges prevent many users from benefiting from these models.

Usability: To use an AI model, the user needs to program against multiple low-level libraries, like OpenCV, PyTorch, and Hugging Face. This tedious process often leads to a complex application that glues together these libraries to accomplish the given task. This programming complexity prevents people who are experts in other domains from benefiting from these models.
Money and Time: Running these deep learning models on large video or document datasets is costly and time-consuming. For example, a state-of-the-art object detection model takes multiple GPU-years to process just one week of video from a single traffic monitoring camera. Besides the money spent on hardware, these models also increase the time you spend waiting for inference to finish.

Proposed Solution#

That’s where EVA DB comes in.

1. Quickly build AI-Powered Applications#

Historically, SQL database systems have been successful because the basic structure of the query language is simple enough that users without prior experience can learn a usable subset in their first sitting. EVA supports a simple SQL-like query language designed to make it easier for users to leverage AI models. With this query language, the user can chain multiple models in a single query to accomplish complicated tasks with minimal programming.

Here is an illustrative query that examines the emotions of actors in a movie by leveraging multiple deep-learning models that take care of detecting faces and analyzing the emotions of the detected bounding boxes:

--- Analyze the emotions of actors in a movie scene
SELECT id, bbox, EmotionDetector(Crop(data, bbox))
FROM Interstellar
   JOIN LATERAL UNNEST(FaceDetector(data)) AS Face(bbox, conf)
WHERE id < 15;

EVA’s declarative query language reduces the complexity of the application, leading to more maintainable code that allows users to build on top of each other’s queries.
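To see what the declarative form saves, the sketch below spells out roughly the same logic imperatively in plain Python. It is not EVA code: `face_detector`, `crop`, and `emotion_detector` are hypothetical stand-ins that return canned values, and frames are modeled as small dictionaries invented for this example.

```python
# Imperative sketch of the query's logic; all three "models" are stubs.

def face_detector(frame):
    """Hypothetical detector: returns (bbox, confidence) pairs for a frame."""
    return frame["faces"]

def crop(frame, bbox):
    """Hypothetical crop: returns a handle to the region inside bbox."""
    return {"frame_id": frame["id"], "bbox": bbox}

def emotion_detector(region):
    """Hypothetical classifier: label derived from the box's x coordinate."""
    return "happy" if region["bbox"][0] % 2 == 0 else "neutral"

def analyze(frames, max_id=15):
    rows = []
    for frame in frames:
        if frame["id"] >= max_id:  # WHERE id < 15
            continue
        # JOIN LATERAL UNNEST: one output row per detected face
        for bbox, _conf in face_detector(frame):
            rows.append((frame["id"], bbox, emotion_detector(crop(frame, bbox))))
    return rows

# Made-up input: frame 0 has two faces, frame 1 has none, frame 20 is filtered out.
frames = [
    {"id": 0, "faces": [((10, 20, 50, 60), 0.98), ((71, 15, 110, 58), 0.91)]},
    {"id": 1, "faces": []},
    {"id": 20, "faces": [((4, 4, 8, 8), 0.80)]},
]
result = analyze(frames)
```

Even with trivial stubs, the loop nesting, filtering, and row assembly all have to be written by hand, whereas the query states only what is wanted and leaves the execution order to the system.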

EVA comes with a wide range of models for analyzing unstructured data, including image classification, object detection, OCR, and face detection. It is fully implemented in Python and licensed under the Apache license. It already contains integrations with widely used AI pipelines based on Hugging Face, PyTorch, and OpenAI.

The high-level SQL API allows even beginners to use EVA in a few lines of code. Advanced users can define custom user-defined functions that wrap around any AI model or Python library.
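As a rough illustration of the wrapping idea, the sketch below puts a stand-in model behind a uniform callable interface. It deliberately does not use EVA's actual UDF base class or registration syntax; `BrightnessClassifier` and its `setup` and `forward` methods are hypothetical names chosen for this example.

```python
from typing import List, Sequence

class BrightnessClassifier:
    """Hypothetical UDF-style wrapper around a trivial 'model'."""

    def setup(self, threshold: float = 0.5) -> None:
        # A real wrapper would load model weights here.
        self.threshold = threshold

    def forward(self, frames: Sequence[Sequence[float]]) -> List[str]:
        # Produce one label per input frame from its mean pixel intensity.
        labels = []
        for pixels in frames:
            mean = sum(pixels) / len(pixels)
            labels.append("bright" if mean >= self.threshold else "dark")
        return labels

udf = BrightnessClassifier()
udf.setup(threshold=0.5)
labels = udf.forward([[0.9, 0.8, 0.7], [0.1, 0.2, 0.0]])
```

The point of the pattern is that any model or library call can sit behind the same batch-in, batch-out shape, which is what lets a query engine treat it like an ordinary function.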

2. Save time and money#

EVA DB automatically optimizes queries to reduce inference cost and query execution time using its Cascades-style extensible query optimizer, which is tailored for AI pipelines. The Cascades query optimization framework has worked well in SQL database systems for several decades. Query optimization in EVA is the bridge that connects the declarative query language to efficient execution.

EVA accelerates AI pipelines using a collection of optimizations inspired by SQL database systems including function caching, sampling, and cost-based operator reordering.
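To make cost-based reordering concrete, here is a minimal sketch of a textbook predicate-ordering heuristic. It is not EVA's optimizer: it assumes independent predicates, and the predicate names, per-row costs, and selectivities are invented numbers for illustration.

```python
# Order filter predicates so cheap, highly selective ones run first.

def rank(pred):
    # Lower rank runs first. drop_rate is the fraction of rows a predicate rejects.
    drop_rate = 1.0 - pred["selectivity"]
    return float("inf") if drop_rate == 0 else pred["cost"] / drop_rate

def expected_cost(preds):
    # Each predicate only runs on the rows that survived the earlier ones.
    total, surviving = 0.0, 1.0
    for pred in preds:
        total += surviving * pred["cost"]
        surviving *= pred["selectivity"]
    return total

# cost: assumed time per row; selectivity: assumed fraction of rows that pass.
predicates = [
    {"name": "ObjectDetector(frame) = 'car'", "cost": 50.0, "selectivity": 0.3},
    {"name": "ColorClassifier(frame) = 'red'", "cost": 5.0, "selectivity": 0.5},
    {"name": "id < 15", "cost": 0.001, "selectivity": 0.1},
]

reordered = sorted(predicates, key=rank)
naive_cost = expected_cost(predicates)
optimized_cost = expected_cost(reordered)
```

Under these made-up numbers the cheap `id < 15` filter moves to the front and the expensive detector runs last, on only the rows that survive the other two filters, so the expected per-row cost drops sharply compared with the written order.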

EVA supports an AI-oriented query language for analyzing both structured and unstructured data. Here are some illustrative applications:

Using ChatGPT to ask questions based on videos

Analysing traffic flow at an intersection

Examining the emotion palette of actors in a movie

Finding similar images on Reddit

Classifying images based on their content

Image Segmentation using Hugging Face

Recognizing license plates

Analysing toxicity of social media memes

The Getting Started page shows how you can use EVA for different AI tasks and how you can easily extend EVA to support your custom deep learning model through user-defined functions.

The User Guides section contains Jupyter Notebooks that demonstrate how to use various features of EVA. Each notebook includes a link to Google Colab, where you can run the code yourself.

Key Features#

With EVA, you can easily combine SQL and deep learning models to build next-generation database applications. EVA treats deep learning models as functions similar to traditional SQL functions like SUM().
EVA is extensible by design. You can write a user-defined function (UDF) that wraps around your custom deep learning model. In fact, all the built-in models included in EVA are written as user-defined functions.
EVA comes with a collection of built-in sampling, caching, and filtering optimizations inspired by relational database systems. These optimizations help speed up queries on large datasets and save money spent on model inference.
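The effect of caching and sampling can be sketched in a few lines of plain Python. This is only an illustration of the two ideas, not how EVA implements them; `CachedModel`, `sample`, and the string "frames" are hypothetical.

```python
class CachedModel:
    """Memoizes results per (model name, frame id) so repeated queries skip inference."""

    def __init__(self, name, fn):
        self.name = name
        self.fn = fn
        self.cache = {}
        self.calls = 0  # counts real invocations of the underlying model

    def __call__(self, frame_id, frame):
        key = (self.name, frame_id)
        if key not in self.cache:
            self.calls += 1
            self.cache[key] = self.fn(frame)
        return self.cache[key]

def sample(frames, every):
    """Keeps every `every`-th frame, paired with its original index."""
    return [(i, f) for i, f in enumerate(frames) if i % every == 0]

# len() stands in for an expensive model; strings stand in for frames.
detector = CachedModel("detector", lambda frame: len(frame))
frames = ["aa", "bbb", "c", "dddd", "ee", "f"]

first = [detector(i, f) for i, f in sample(frames, every=2)]
second = [detector(i, f) for i, f in sample(frames, every=2)]
```

Sampling halves the number of frames the model sees, and the second pass is served entirely from the cache, so `detector.calls` stays at the number of sampled frames.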

Next Steps#

Getting Started

A step-by-step guide to installing EVA and running queries

Query Language

List of all the query commands supported by EVA

User Defined Functions

A step-by-step tour of registering a user-defined function that wraps around a custom deep learning model

Illustrative EVA Applications#

Community#

Join the EVA community on Slack to ask questions and to share your ideas for improving EVA.