MNIST TUTORIAL#


Start EVA server#

We reuse the start-server notebook to launch the EVA server and open a connection to it.

!wget -nc "https://raw.githubusercontent.com/georgia-tech-db/eva/master/tutorials/00-start-eva-server.ipynb"
%run 00-start-eva-server.ipynb
cursor = connect_to_server()
File '00-start-eva-server.ipynb' already there; not retrieving.

[  -z "$(lsof -ti:5432)" ] || kill -9 $(lsof -ti:5432)
nohup eva_server > eva.log 2>&1 &

WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
Note: you may need to restart the kernel to use updated packages.
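
The launch log above frees port 5432 and then starts `eva_server` in the background, so one quick way to confirm the server came up is to check that something is listening on that port. The following is a minimal sketch using only the Python standard library. `is_port_open` is a hypothetical helper (not part of EVA), and the host `127.0.0.1` is an assumption about where the server is bound.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        # create_connection raises OSError if nothing accepts the connection
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # 5432 is the port the start-server script frees before launching eva_server.
    # The host is an assumption; adjust it to match your setup.
    print("EVA port reachable:", is_port_open("127.0.0.1", 5432))
```

If the check reports `False`, `eva.log` (the file the launch command redirects output to) is the first place to look.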

Downloading the videos#

# Download the MNIST digits packaged as a video
!wget -nc https://www.dropbox.com/s/yxljxz6zxoqu54v/mnist.mp4
# Download the UDF implementation used later in this tutorial
!wget -nc https://raw.githubusercontent.com/georgia-tech-db/eva/master/tutorials/apps/mnist/eva_mnist_udf.py
--2022-12-18 17:37:23--  https://www.dropbox.com/s/yxljxz6zxoqu54v/mnist.mp4
Resolving www.dropbox.com (www.dropbox.com)... 162.125.81.18, 2620:100:6031:18::a27d:5112
Connecting to www.dropbox.com (www.dropbox.com)|162.125.81.18|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: /s/raw/yxljxz6zxoqu54v/mnist.mp4 [following]
--2022-12-18 17:37:25--  https://www.dropbox.com/s/raw/yxljxz6zxoqu54v/mnist.mp4
Reusing existing connection to www.dropbox.com:443.
HTTP request sent, awaiting response... 302 Found
Location: https://uc0434f86f4e20eb47bfce2d0904.dl.dropboxusercontent.com/cd/0/inline/By1gM8gLIuELxuYPr39tgADfJbLZU-kr0e84LkeUeMUOrSbmYjLMReXusSb2odj64Ve6Pxr7tvBNT_cj_Stv1Gqpj3sjyqMP9nupUu6EWZpOkBN97XcH3djlLow_EsJlPT9fDl9elRf5UxQGD_WK97xXHHNzhE0qfsLl57m-2n5cxw/file# [following]
--2022-12-18 17:37:26--  https://uc0434f86f4e20eb47bfce2d0904.dl.dropboxusercontent.com/cd/0/inline/By1gM8gLIuELxuYPr39tgADfJbLZU-kr0e84LkeUeMUOrSbmYjLMReXusSb2odj64Ve6Pxr7tvBNT_cj_Stv1Gqpj3sjyqMP9nupUu6EWZpOkBN97XcH3djlLow_EsJlPT9fDl9elRf5UxQGD_WK97xXHHNzhE0qfsLl57m-2n5cxw/file
Resolving uc0434f86f4e20eb47bfce2d0904.dl.dropboxusercontent.com (uc0434f86f4e20eb47bfce2d0904.dl.dropboxusercontent.com)... 162.125.81.15, 2620:100:6031:15::a27d:510f
Connecting to uc0434f86f4e20eb47bfce2d0904.dl.dropboxusercontent.com (uc0434f86f4e20eb47bfce2d0904.dl.dropboxusercontent.com)|162.125.81.15|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 62156 (61K) [video/mp4]
Saving to: 'mnist.mp4'

mnist.mp4           100%[===================>]  60.70K   156KB/s    in 0.4s    

2022-12-18 17:37:27 (156 KB/s) - 'mnist.mp4' saved [62156/62156]

File 'eva_mnist_udf.py' already there; not retrieving.
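
The `-nc` flag makes `wget` skip files that already exist, which is why the second download above is reported as "already there; not retrieving". Outside a notebook, the same behaviour can be approximated in plain Python. This is a sketch under assumptions: `fetch_if_missing` is a hypothetical helper, and it relies only on `urllib.request.urlretrieve` from the standard library.

```python
from pathlib import Path
from urllib.request import urlretrieve


def fetch_if_missing(url: str, dest: str) -> bool:
    """Download url to dest unless dest already exists.

    Returns True if a download happened, False if the file was already there.
    """
    path = Path(dest)
    if path.exists():
        # Mirrors wget -nc: keep the existing file and do nothing
        return False
    urlretrieve(url, str(path))
    return True
```

For example, `fetch_if_missing("https://raw.githubusercontent.com/georgia-tech-db/eva/master/tutorials/apps/mnist/eva_mnist_udf.py", "eva_mnist_udf.py")` would fetch the UDF file only on the first call. For the video, the log above reports a size of 62156 bytes, so comparing `Path("mnist.mp4").stat().st_size` against that number is a cheap sanity check.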

Upload the video for analysis#

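# Drop the table left over from a previous run
cursor.execute('DROP TABLE MNISTVid')
response = cursor.fetch_all()
print(response)
# Load the downloaded video into a fresh MNISTVid table
cursor.execute('LOAD VIDEO "mnist.mp4" INTO MNISTVid')
response = cursor.fetch_all()
print(response)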
@status: ResponseStatus.SUCCESS
@batch: 
                                       0
0  Table Successfully dropped: MNISTVid
@query_time: 0.024652965999848675
@status: ResponseStatus.SUCCESS
@batch: 
                            0
0  Number of loaded VIDEO: 1
@query_time: 0.06907113200031745
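
The drop-then-load pattern above repeats the same `execute` / `fetch_all` pair for each statement. A small wrapper can make that reusable; the sketch below is illustrative only. `reload_video` and `RecordingCursor` are hypothetical names, and the stand-in cursor exists so the example runs without a server. With a real connection, the `cursor` returned by `connect_to_server()` would be passed instead.

```python
def reload_video(cursor, video_path: str, table: str):
    """Drop `table`, load `video_path` into it, and return both responses."""
    responses = []
    for query in (
        f"DROP TABLE {table}",
        f'LOAD VIDEO "{video_path}" INTO {table}',
    ):
        # Same two calls the tutorial issues for every statement
        cursor.execute(query)
        responses.append(cursor.fetch_all())
    return responses


class RecordingCursor:
    """Stand-in cursor for a dry run; it records queries instead of sending them."""

    def __init__(self):
        self.queries = []

    def execute(self, query: str) -> None:
        self.queries.append(query)

    def fetch_all(self):
        return f"dry run: {self.queries[-1]}"


dry = RecordingCursor()
reload_video(dry, "mnist.mp4", "MNISTVid")
print(dry.queries)
```

The dry run prints the two statements exactly as they appear in the cell above, which is a convenient way to inspect generated queries before sending them.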

Visualize Video#

from IPython.display import Video
Video("mnist.mp4", embed=True)

Create a user-defined function (UDF) for analyzing the frames#

cursor.execute("""CREATE UDF IF NOT EXISTS MnistCNN
                  INPUT  (data NDARRAY (3, 28, 28))
                  OUTPUT (label TEXT(2))
                  TYPE  Classification
                  IMPL  'eva_mnist_udf.py';
        """)
response = cursor.fetch_all()
print(response)
@status: ResponseStatus.SUCCESS
@batch: 
                                              0
0  UDF MnistCNN already exists, nothing added.
@query_time: 0.014347114999509358
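
The `CREATE UDF` statement declares a contract: the input is an array of shape (3, 28, 28) and the output is a short text label. The sketch below illustrates what satisfying that contract can look like with NumPy. It is not the content of `eva_mnist_udf.py`; `to_channel_first` and `scores_to_label` are hypothetical helpers, and the 28x28 channel-last frame and the ten-element score vector are assumptions made for illustration.

```python
import numpy as np


def to_channel_first(frame: np.ndarray) -> np.ndarray:
    """Reorder an (H, W, C) frame to (C, H, W)."""
    return np.transpose(frame, (2, 0, 1))


def scores_to_label(scores: np.ndarray) -> str:
    """Map per-digit scores to the digit with the highest score, as text."""
    return str(int(np.argmax(scores)))


# Hypothetical 28x28 RGB frame stored channel-last
frame = np.zeros((28, 28, 3), dtype=np.uint8)
print(to_channel_first(frame).shape)  # (3, 28, 28)

# Hypothetical classifier output: one score per digit 0-9
scores = np.array([0.0, 0.1, 0.0, 0.0, 0.0, 0.0, 0.0, 0.8, 0.1, 0.0])
print(scores_to_label(scores))  # 7
```

Returning the digit as a string matches the `label TEXT(2)` output type declared above.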

Run the Image Classification UDF on video#

cursor.execute("""SELECT data, MnistCNN(data).label 
                  FROM MNISTVid
                  WHERE id = 30 OR id = 50 OR id = 70 OR id = 0 OR id = 140""")
response = cursor.fetch_all()
print(response.batch)
                                                                                         mnistvid.data  \
0  [[[ 0  0  0]\n [ 0  0  0]\n [ 0  0  0]\n [ 0  0  0]\n [ 0  0  0]\n [ 0  0  0]\n [ 0  0  0]\n [ 0...   
1  [[[2 2 2]\n [2 2 2]\n [2 2 2]\n [2 2 2]\n [2 2 2]\n [2 2 2]\n [2 2 2]\n [2 2 2]\n [2 2 2]\n [2 2...   
2  [[[13 13 13]\n [ 2  2  2]\n [ 2  2  2]\n [13 13 13]\n [ 6  6  6]\n [ 0  0  0]\n [ 5  5  5]\n [22...   
3  [[[ 0  0  0]\n [ 0  0  0]\n [ 0  0  0]\n [ 0  0  0]\n [ 0  0  0]\n [ 0  0  0]\n [ 1  1  1]\n [ 3...   
4  [[[ 0  0  0]\n [ 0  0  0]\n [ 0  0  0]\n [ 0  0  0]\n [ 0  0  0]\n [ 0  0  0]\n [ 0  0  0]\n [ 0...   

  mnistcnn.label  
0              6  
1              2  
2              3  
3              7  
4              5  
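
The `WHERE` clause in the query above spells out each frame id by hand. When the list of frames changes often, the predicate can be generated from a Python list instead. This is a minimal sketch: `build_frame_query` is a hypothetical helper, and it only uses the `id = ... OR ...` form that appears in the tutorial query.

```python
def build_frame_query(table: str, udf: str, frame_ids) -> str:
    """Build a SELECT that runs `udf` on the frames with the given ids."""
    ids = [int(i) for i in frame_ids]  # int() rejects non-numeric input
    if not ids:
        raise ValueError("frame_ids must contain at least one id")
    predicate = " OR ".join(f"id = {i}" for i in ids)
    return f"SELECT data, {udf}(data).label FROM {table} WHERE {predicate}"


query = build_frame_query("MNISTVid", "MnistCNN", [30, 50, 70, 0, 140])
print(query)
```

The resulting string can be passed to `cursor.execute(query)` in place of the hand-written statement.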

Visualize output of query on the video#

# !pip install matplotlib
import matplotlib.pyplot as plt
import numpy as np

# Create a figure (fig) with one row of five axes (ax)
fig, ax = plt.subplots(nrows=1, ncols=5, figsize=[6,8])

# DataFrame holding the rows returned by the query
df = response.batch.frames
for axi in ax.flat:
    # Pick a random row; the same frame may be drawn more than once
    idx = np.random.randint(len(df))
    img = df['mnistvid.data'].iloc[idx]
    label = df['mnistcnn.label'].iloc[idx]
    axi.imshow(img)
    axi.set_title(f'label: {label}')

plt.show()
[Output figure: a row of five video frames, each titled with its predicted label.]
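
Because the plotting loop calls `np.random.randint` once per axis, it samples rows with replacement, so the same frame can appear in several panels while others are never shown. If each returned frame should appear at most once, the indices can be drawn without replacement. The sketch below assumes NumPy's `Generator.choice`; `pick_unique_indices` is a hypothetical helper.

```python
import numpy as np


def pick_unique_indices(n_rows: int, n_axes: int, seed=None) -> np.ndarray:
    """Pick up to n_axes distinct row indices from range(n_rows)."""
    rng = np.random.default_rng(seed)
    k = min(n_rows, n_axes)  # cannot draw more distinct rows than exist
    return rng.choice(n_rows, size=k, replace=False)


indices = pick_unique_indices(n_rows=5, n_axes=5, seed=0)
print(sorted(indices.tolist()))  # [0, 1, 2, 3, 4]
```

In the plotting cell, the loop would then become `for axi, idx in zip(ax.flat, indices):` with the `np.random.randint` line removed.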