Home Rental Prediction#
Introduction#
In this tutorial, we present how to use Prediction AI Engines in EvaDB to predict home rental prices. EvaDB makes it easy to do predictions using its built-in AutoML engines with your existing databases.
Prerequisites#
To follow along, you will need to set up a local instance of EvaDB via pip.
Connect to EvaDB#
After installing EvaDB, use the following Python code to establish a connection and obtain a cursor
for running EvaQL
queries.
import evadb
cursor = evadb.connect().cursor()
Connect EvaDB to PostgreSQL Database Server#
We will assume that you have a PostgreSQL
database server running locally that contains the data needed for analysis. Follow these instructions to install PostgreSQL.
EvaDB lets you connect to your favorite databases, data warehouses, data lakes, etc., via the CREATE DATABASE
statement. In this query, we connect EvaDB to an existing PostgreSQL
server:
CREATE DATABASE postgres_data
WITH ENGINE = 'postgres',
PARAMETERS = {
"user": "eva",
"password": "password",
"host": "localhost",
"port": "5432",
"database": "evadb"
}
We will assume that the input data is loaded into a PostgreSQL
database.
To load the home rental data into your database, see the complete home rental prediction notebook on Colab.
Preview the Home Sales Data#
We use the home rental data in this usecase. The data contains eight columns: number_of_rooms
, number_of_bathrooms
, sqft
, location
, days_on_market
, initial_price
, neighborhood
, and rental_price
.
SELECT * FROM postgres_data.home_rentals LIMIT 3;
This query previews the data in the home_rentals table:
+------------------------------+----------------------------------+-------------------+-----------------------+-----------------------------+----------------------------+---------------------------+---------------------------+
| home_rentals.number_of_rooms | home_rentals.number_of_bathrooms | home_rentals.sqft | home_rentals.location | home_rentals.days_on_market | home_rentals.initial_price | home_rentals.neighborhood | home_rentals.rental_price |
|------------------------------|----------------------------------|-------------------|-----------------------|-----------------------------|----------------------------|---------------------------|---------------------------|
| 1 | 1 | 674 | good | 1 | 2167 | downtown | 2167 |
| 1 | 1 | 554 | poor | 19 | 1883 | westbrae | 1883 |
| 0 | 1 | 529 | great | 3 | 2431 | south_side | 2431 |
+------------------------------+----------------------------------+-------------------+-----------------------+-----------------------------+----------------------------+---------------------------+---------------------------+
Train a Home Rental Prediction Model#
Let’s next train a prediction model from the home_rental table using EvaDB’s CREATE FUNCTION
query.
We will use the built-in Ludwig engine for this task.
CREATE OR REPLACE FUNCTION PredictHouseRent FROM
( SELECT * FROM postgres_data.home_rental )
TYPE Ludwig
PREDICT 'rental_price'
TIME_LIMIT 3600;
In the above query, we use all the columns (except rental_price
) from home_rental
table to predict the rental_price
column.
We set the training time out to be 3600 seconds.
Note
Go over Model Training with Ludwig page on exploring all configurable paramters for the model training frameworks.
+----------------------------------------------+
| Function PredictHouseRent successfully added |
+----------------------------------------------+
Predict the Home Rental Price using the Trained Model#
Next we use the trained PredictHouseRent
to predict the home rental price.
SELECT PredictHouseRent(*) FROM postgres_data.home_rentals LIMIT 3;
We use *
to simply pass all columns into the PredictHouseRent
function.
+-------------------------------------------+
| predicthouserent.rental_price_predictions |
+-------------------------------------------+
| 2087.763672 |
| 1793.570190 |
| 2346.319824 |
+-------------------------------------------+
We have the option to utilize a LATERAL JOIN
to compare the actual rental prices in the home_rentals
dataset with the predicted rental prices generated by the trained model, PredictHouseRent
.
SELECT rental_price, predicted_rental_price
FROM postgres_data.home_rentals
JOIN LATERAL PredictHouseRent(*) AS Predicted(predicted_rental_price)
LIMIT 3;
Here is the query’s output:
+---------------------------+----------------------------------+
| home_rentals.rental_price | Predicted.predicted_rental_price |
+---------------------------+----------------------------------+
| 2167 | 2087.763672 |
| 1883 | 1793.570190 |
| 2431 | 2346.319824 |
+------------------ --------+----------------------------------+
What’s Next?#
👋 EvaDB’s vision is to bring AI inside your database system and make it easy to build fast AI-powered apps. If you liked this tutorial and are excited about our vision, show some ❤️ by:
🐙 giving a ⭐ for the EvaDB repository on Github: https://github.com/georgia-tech-db/evadb
📟 engaging with the EvaDB community on Slack to ask questions and share your ideas and thoughts: https://evadb.ai/community
🎉 contributing to EvaDB by developing cool applications/integrations: https://github.com/georgia-tech-db/evadb/issues
🐦 following us on Twitter: https://twitter.com/evadb_ai
📝 following us on Medium: https://medium.com/evadb-blog