Question Answering#



Introduction#

In this tutorial, we show how to use HuggingFace and OpenAI models in EvaDB to answer questions about videos. We first convert the speech in the video to text using a HuggingFace model. The generated transcript is stored in a table as a text column for subsequent analysis. We then use an OpenAI model to answer questions based on that text column.

EvaDB makes it easy to answer questions based on videos using its built-in support for HuggingFace and OpenAI models.

Prerequisites#

To follow along, you will need to set up a local instance of EvaDB via pip.

Connect to EvaDB#

After installing EvaDB, use the following Python code to establish a connection and obtain a cursor for running EvaQL queries.

import evadb
cursor = evadb.connect().cursor()

We will assume that the input ukraine_video video is loaded into EvaDB. To download the video and load it into EvaDB, see the complete question answering notebook on Colab.
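If the video is not loaded yet, an EvaQL `LOAD VIDEO` statement can register it. The snippet below is a minimal sketch, assuming the file has already been downloaded to a local path; `ukraine_video.mp4` is a hypothetical file name, so substitute the path used in the notebook. The helper only builds the statement, which makes it easy to inspect before sending it through the cursor.

```python
def load_video_query(path: str, table: str = "ukraine_video") -> str:
    """Build an EvaQL LOAD VIDEO statement for a local video file."""
    # Reject quotes instead of guessing at an escaping rule.
    if "'" in path:
        raise ValueError("path must not contain single quotes")
    return f"LOAD VIDEO '{path}' INTO {table};"


# 'ukraine_video.mp4' is a hypothetical local file name.
query = load_video_query("ukraine_video.mp4")
print(query)

# With the cursor created above, the statement would be run as:
# cursor.query(query).df()
```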

Create Speech Recognition Function#

To create a custom SpeechRecognizer function based on the popular Whisper model, use the CREATE FUNCTION statement. In this query, we leverage EvaDB’s built-in support for HuggingFace models. We only need to specify the task and the model parameters in the query to create this function:

CREATE FUNCTION SpeechRecognizer
TYPE HuggingFace
    TASK 'automatic-speech-recognition'
    MODEL 'openai/whisper-base';
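From Python, the statement can be sent through the cursor obtained earlier. The following is a sketch rather than the notebook's exact code: `register_speech_recognizer` is a hypothetical helper name, and `IF NOT EXISTS` is added only so that the call can be repeated without an error; drop it to match the statement above exactly.

```python
CREATE_SPEECH_RECOGNIZER = """
CREATE FUNCTION IF NOT EXISTS SpeechRecognizer
TYPE HuggingFace
    TASK 'automatic-speech-recognition'
    MODEL 'openai/whisper-base';
"""


def register_speech_recognizer(cursor):
    """Run the CREATE FUNCTION statement and return the result as a DataFrame."""
    # `cursor` is expected to be the object returned by
    # evadb.connect().cursor().
    return cursor.query(CREATE_SPEECH_RECOGNIZER).df()


# Usage with a live connection:
# import evadb
# cursor = evadb.connect().cursor()
# print(register_speech_recognizer(cursor))
```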

Note

EvaDB has built-in support for a wide range of HuggingFace models.

Create ChatGPT Function#

EvaDB has built-in support for the ChatGPT function from OpenAI. You will need to set your OpenAI API key in the environment as shown below:

# Set OpenAI key
import os
os.environ["OPENAI_API_KEY"] = "sk-..."
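To avoid committing a key to a notebook, the key can instead be exported in the shell before starting Python and only checked here. This is a small sketch of such a check; `openai_key_is_set` is a hypothetical helper, not part of EvaDB.

```python
import os


def openai_key_is_set(env=os.environ) -> bool:
    """Return True if OPENAI_API_KEY is present and not blank."""
    return bool(env.get("OPENAI_API_KEY", "").strip())


if not openai_key_is_set():
    print("OPENAI_API_KEY is not set; export it before running ChatGPT queries.")
```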

Note

EvaDB has built-in support for a wide range of OpenAI models. The ChatGPT function is a wrapper around an OpenAI API call. You can also switch to another large language model that runs locally by defining a custom AI function.

Convert Speech to Text#

After registering the SpeechRecognizer function, we run it over the video to obtain the video’s transcript. EvaDB supports direct reference to the audio component of the video as shown in this query:

CREATE TABLE text_summary AS
SELECT SpeechRecognizer(audio)
FROM ukraine_video;

Here, the SpeechRecognizer function is applied to the audio component of the ukraine_video video loaded into EvaDB. Its output is stored in the text column of the text_summary table.

Here is the query’s output DataFrame:

+-------------------------------------------------------------------------------------------------------------------------+
|                                                    text_summary.text                                                    |
+-------------------------------------------------------------------------------------------------------------------------+
| The war in Ukraine has been on for 415 days. Who is winning it? Not Russia. Certainly not Ukraine. It is the US oil ... |
+-------------------------------------------------------------------------------------------------------------------------+
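When experimenting, it can be convenient to rebuild the transcript table from scratch. The sketch below assumes that dropping any existing text_summary table first is acceptable; `rebuild_transcript` is a hypothetical helper name, and the `CREATE TABLE` statement is the same one shown above.

```python
TRANSCRIPT_STATEMENTS = [
    # Remove a table left over from an earlier run, if any.
    "DROP TABLE IF EXISTS text_summary;",
    # Same statement as in the tutorial text.
    """CREATE TABLE text_summary AS
SELECT SpeechRecognizer(audio)
FROM ukraine_video;""",
]


def rebuild_transcript(cursor):
    """Run each statement in order and return the last result as a DataFrame."""
    result = None
    for statement in TRANSCRIPT_STATEMENTS:
        result = cursor.query(statement).df()
    return result


# Usage with a live connection:
# rebuild_transcript(cursor)
# print(cursor.query("SELECT * FROM text_summary;").df())
```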

Question Answering using ChatGPT#

We next run an EvaQL query that applies the ChatGPT function to the text column to answer questions about the video. The text column serves as the context for the large language model. This query asks whether the video is related to the war between Ukraine and Russia.

SELECT ChatGPT(
    'Is this video summary related to Ukraine russia war',
    text)
FROM text_summary;

Here is the query’s output DataFrame:

+--------------------------------------------------------------------------------------------------------------------------+
|                                                     chatgpt.response                                                     |
+--------------------------------------------------------------------------------------------------------------------------+
| No, the video summary provided does not appear to be related to the Ukraine-Russia war. It seems to be a conversatio ... |
+--------------------------------------------------------------------------------------------------------------------------+
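To ask several questions against the same transcript, the query text can be generated from a question string. This is a minimal sketch; `chatgpt_query` is a hypothetical helper, and it rejects single quotes rather than assuming a particular escaping rule.

```python
def chatgpt_query(question: str, column: str = "text",
                  table: str = "text_summary") -> str:
    """Build an EvaQL query that asks ChatGPT a question about a text column."""
    # Reject quotes so the question stays a valid string literal.
    if "'" in question:
        raise ValueError("question must not contain single quotes")
    return f"SELECT ChatGPT('{question}', {column}) FROM {table};"


query = chatgpt_query("Is this video summary related to Ukraine russia war")
print(query)

# With a live connection:
# print(cursor.query(query).df())
```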

Leverage Text Processing AI Engines with EvaDB#

By integrating databases and AI engines using EvaDB, developers can easily extract insights from text data with just a few EvaQL queries. These powerful natural language processing (NLP) models from OpenAI and HuggingFace are capable of complex text processing tasks (e.g., answering complex questions with context obtained from a column in a table).

EvaDB lets developers incorporate powerful NLP capabilities into their AI-powered applications while saving time and resources compared to traditional AI development pipelines.

What’s Next?#

👋 If you are excited about our vision of bringing AI inside databases, consider:



