Train & Deploy YOLOv7 to Streamlit

Steven Smiley
Nov 27, 2022 · 14 min read

Make a web application out of a custom trained YOLOv7 model.

Image from my iPhone, taken on 11/15/2022

Object detection algorithms have been improving steadily over the last few years. As of 2022, the state of the art is YOLOv7, as its authors state in their recent paper¹:

“YOLOv7 surpasses all known object detectors in both speed and accuracy in the range from 5 FPS to 160 FPS and has the highest accuracy 56.8% AP among all known real-time object detectors with 30 FPS or higher on GPU V100.”¹

You can play with their code base here: https://github.com/WongKinYiu/yolov7. I went ahead and developed my own Graphical User Interface (GUI) called Full-Loop-YOLO² to interact with training/deploying models with their code base. It simplifies much of the process. You can find my repository for that tool here: https://github.com/stevensmiley1989/Full_Loop_YOLO.

Screenshot of my GUI from my website.²

This article will explain how I deployed one of these custom trained YOLOv7 models to a web application, using the open source framework, Streamlit.³

If you want to see the web application first, click here.

The code for this article is located in my GitHub repo here.

What is Streamlit³?

“Streamlit is an open-source Python library that makes it easy to create and share beautiful, custom web apps for machine learning and data science. In just a few minutes you can build and deploy powerful data apps.”

So we can just deploy, right? It is that simple? … not yet, speed racer. A few steps come first:

  1. Gather Training Data
  2. Train a Custom YOLOv7 Model
  3. Create a custom Python class to run inference with the YOLOv7 model
  4. Integrate the custom Python class with Streamlit
  5. … then Deploy

Gather Training Data

You need data to train a model. Fortunately, you can find a plethora of data for free on the internet. In my case, I chose the VisDrone⁴ object detection dataset. You can find the data I used here: http://aiskyeye.com/download/object-detection-2/.

Train a Custom YOLOv7 Model

In order to work with Full-Loop-YOLO, you need to convert the VisDrone annotation files to PascalVOC format. Full-Loop-YOLO converts PascalVOC (i.e., x0,y0,x1,y1) .xml files to YOLO-format (i.e., xc,yc,w,h) .txt files when you click Create YOLO Objects, so the only real work is converting VisDrone to PascalVOC. See the helper function VisDrone_to_PascalVOC below.

PascalVOC format: xmin,ymin,xmax,ymax

YOLO format: xc,yc,w,h

VisDrone annotation format: xmin,ymin,w,h
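For reference, here is a minimal sketch of the coordinate math relating the three formats (the function names are mine, purely for illustration; Full-Loop-YOLO handles the PascalVOC-to-YOLO step for you, and the helper below handles VisDrone-to-PascalVOC):

# Illustrative only: the box-coordinate math behind the three formats.
def visdrone_to_pascalvoc_box(xmin, ymin, w, h):
    # VisDrone stores the top-left corner plus width/height in pixels
    return xmin, ymin, xmin + w, ymin + h  # -> xmin, ymin, xmax, ymax

def pascalvoc_to_yolo_box(xmin, ymin, xmax, ymax, img_w, img_h):
    # YOLO stores the box center and size, normalized by the image dimensions
    xc = (xmin + xmax) / 2.0 / img_w
    yc = (ymin + ymax) / 2.0 / img_h
    w = (xmax - xmin) / img_w
    h = (ymax - ymin) / img_h
    return xc, yc, w, h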

import numpy as np
from pathlib import Path
from lxml import etree as ET
from PIL import Image
import os

def VisDrone_to_PascalVOC(path_anno,path_img,custom_new_path_anno=None):
    '''VisDrone Dataset annotation format
    xmin,ymin,w,h,score_i,cat_i,trunc_i,occ_i
    INPUT:
        READS a Single Annotation from VisDrone, Format .txt
        READS a Single Image from VisDrone, Format .jpg
    OUTPUT:
        WRITES a Single Annotation of PascalVOC, Format .xml
    '''
    f=open(path_anno,'r')
    f_read=f.readlines()
    f.close()
    All_bboxes=[]
    for i,line in enumerate(f_read):
        xmin,ymin,w,h,score_i,label_i,trunc_i,occ_i=line.strip().split(',')
        xmin=int(xmin)
        xmax=int(xmin)+int(w)
        ymin=int(ymin)
        ymax=int(ymin)+int(h)

        dic_i={"xmin":xmin,
               "xmax":xmax,
               "ymin":ymin,
               "ymax":ymax,
               "Label":label_i,
               "trunc_i":trunc_i,
               "occ_i":occ_i}

        All_bboxes.append(dic_i)
    if type(custom_new_path_anno)==type(None):
        # Create basepath to PascalVOC Annotations
        basepath_PascalVOC=os.path.join(os.path.dirname(path_anno),"Annotations")
        if os.path.exists(basepath_PascalVOC)==False:
            os.makedirs(basepath_PascalVOC)
        # Create new PascalVOC annotation path
        basename_xml_i=os.path.basename(path_anno).split('.')[0]+'.xml'
        path_anno_PascalVOC_i=os.path.join(basepath_PascalVOC,basename_xml_i)
    else:
        path_anno_PascalVOC_i=custom_new_path_anno #if you want to specify a specific path

    # import img for width/height/depth information
    img = np.array(Image.open(path_img).convert('RGB'))

    # folder/filename entries for the PascalVOC annotation
    anno_folder = os.path.basename(os.path.dirname(path_img))
    filename = os.path.basename(path_img)

    # create an ET.Element tree for the new annotation
    annotation = ET.Element('annotation')
    ET.SubElement(annotation, 'folder').text = str(anno_folder)
    ET.SubElement(annotation, 'filename').text = str(filename)
    ET.SubElement(annotation, 'path').text = str(filename)

    source = ET.SubElement(annotation, 'source')
    ET.SubElement(source, 'database').text = 'VisDrone'

    size = ET.SubElement(annotation, 'size')
    ET.SubElement(size, 'width').text = str(img.shape[1])
    ET.SubElement(size, 'height').text = str(img.shape[0])
    ET.SubElement(size, 'depth').text = str(img.shape[2])

    ET.SubElement(annotation, 'segmented').text = '0'

    for item in All_bboxes:
        label = item['Label']
        trunc = item['trunc_i']
        diff = item['occ_i']
        xmax = item['xmax']
        xmin = item['xmin']
        ymin = item['ymin']
        ymax = item['ymax']

        object_i = ET.SubElement(annotation, 'object')
        ET.SubElement(object_i, 'name').text = label
        ET.SubElement(object_i, 'truncated').text = trunc
        ET.SubElement(object_i, 'difficult').text = diff

        bndbox = ET.SubElement(object_i, 'bndbox')
        ET.SubElement(bndbox, 'xmin').text = str(xmin)
        ET.SubElement(bndbox, 'ymin').text = str(ymin)
        ET.SubElement(bndbox, 'xmax').text = str(xmax)
        ET.SubElement(bndbox, 'ymax').text = str(ymax)

    tree = ET.ElementTree(annotation)
    tree.write(path_anno_PascalVOC_i,pretty_print=True)
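To convert the whole dataset, you can loop the helper over each annotation/image pair. A minimal driver might look like this (the directory names are placeholders for wherever you unpacked the VisDrone download):

import glob
import os

path_annos = "VisDrone2019-DET-train/annotations"  # placeholder paths
path_imgs = "VisDrone2019-DET-train/images"

for path_anno in sorted(glob.glob(os.path.join(path_annos, "*.txt"))):
    basename = os.path.basename(path_anno).split('.')[0]
    path_img = os.path.join(path_imgs, basename + ".jpg")
    if os.path.exists(path_img):
        VisDrone_to_PascalVOC(path_anno, path_img)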

Full-Loop-YOLO steps to train YOLOv7

#1. Go to my repo and follow the directions for setting up YOLOv7 training:

https://github.com/stevensmiley1989/Full_Loop_YOLO

git clone https://github.com/stevensmiley1989/Full_Loop_YOLO.git
Full-Loop-YOLO repo

#2. Run Full-Loop YOLO

python3 Full_Loop_YOLO.py

#3. Create New Model (button click)

Create New Model

#4. Set Yolo_Files path

This is the path where your model files will be written. My suggestion is to put it at the same directory level as JPEGImages/Annotations/Yolo_Objs for consistency between dataset and model development.

#5. Set Annotations path

This is where all of the PascalVOC Annotations we just converted from VisDrone are located.

#6. Set JPEGImages path

This is where all of the VisDrone images that correspond to the Annotations are located (i.e., example.xml has an image called example.jpg in the JPEGImages directory). A quick way to sanity-check that pairing is sketched below.
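If you want to verify the annotation/image pairing before training, a small check might look like this (paths are placeholders for your own directories):

import glob
import os

path_annotations = "Annotations"  # placeholder paths
path_jpegs = "JPEGImages"

for xml_path in glob.glob(os.path.join(path_annotations, "*.xml")):
    jpg_path = os.path.join(path_jpegs, os.path.basename(xml_path).replace(".xml", ".jpg"))
    if not os.path.exists(jpg_path):
        print("Missing image for", xml_path)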

#7. Set Yolo_Objs path

This is where all of the YOLO .txt files will be generated for training. My suggestion is to keep this directory at the same level as Annotations/JPEGImages/Yolo_Files.

#8. Modify the Width/Height/Prefix you want to train with.

I changed the input dimensions from 640x640 to 1056x1056 for a slightly bigger yolov7-tiny model.

#9. Generate Configuration Files for Training (button click)

On future runs you can simply load these instead of regenerating them.

#10. Create Yolo Objects (button click)

The first time, the radio button should be set to “Create new”; on future runs you can reuse the previous objects. Once this step is done, all of your JPEGImages/Annotations have been read and converted into YOLO objects (.txt/.jpg) in the Yolo_Objs directory, which is the format required for training YOLO.

#11. Split Training/Validation (button click)

My recommendation is to leave it at 70% Train / 30% Validate. (A rough sketch of what this split produces is shown below.)
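Under the hood, a 70/30 split simply partitions the list of labeled images into a training list and a validation list. Full-Loop-YOLO writes its own lists for you; the sketch below just illustrates the idea (file and directory names are mine, for illustration only):

import glob
import os
import random

# gather every labeled image in Yolo_Objs and shuffle it
imgs = sorted(glob.glob(os.path.join("Yolo_Objs", "*.jpg")))
random.shuffle(imgs)
split = int(0.7 * len(imgs))

# 70% of the image paths go to training, the rest to validation
with open("train.txt", "w") as f:
    f.write("\n".join(imgs[:split]))
with open("valid.txt", "w") as f:
    f.write("\n".join(imgs[split:]))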

#12. Create Scripts (button click)

This creates all of the bash training & inference scripts, which you can find in the Yolo_Files directory. Click the Scripts icon to open them directly.

#13. TRAIN Scripts Buttons (button click)

This brings up a top-level pop-up for starting training with the desired YOLO variant, in this case YOLOv7-tiny.

#14. Click Train to start training (button click)

We will keep it around 40 epochs at first and train YOLOv7-tiny.

We can watch our progress through TensorBoard as it trains:

Our best weights will be best.pt, located in the yolov7-tiny sub-directory.

Create a custom Python class to run inference with the YOLOv7 model

Now that we have a trained model, we need a way to integrate it with Streamlit. The repo for YOLOv7 has some functionality we will copy to make this work, specifically scripts located in yolov7/utils & yolov7/models.

I wrote this helper Python class (SingleInference_YOLOV7) that can run inference with trained YOLOv7 models given a weights path, the model input size, and an input image path or cv2 matrix.

#singleinference_yolov7.py
import random
import numpy as np
import os
import sys
import torch
import cv2
import logging

class SingleInference_YOLOV7:
    '''
    SimpleInference_YOLOV7
    created by Steven Smiley 2022/11/24

    INPUTS:
    VARIABLES               TYPE    DESCRIPTION
    1. img_size,            #int#   #this is the yolov7 model size, should be square so 640 for a square 640x640 model etc.
    2. path_yolov7_weights, #str#   #this is the path to your yolov7 weights
    3. path_img_i,          #str#   #path to a single .jpg image for inference (NOT REQUIRED, can load cv2matrix with self.load_cv2mat())

    OUTPUT:
    VARIABLES                       TYPE    DESCRIPTION
    1. predicted_bboxes_PascalVOC   #list#  #list of values for detections containing the following (name,x0,y0,x1,y1,score)

    CREDIT
    Please see https://github.com/WongKinYiu/yolov7.git for Yolov7 resources (i.e. utils/models)
    @article{wang2022yolov7,
        title={{YOLOv7}: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors},
        author={Wang, Chien-Yao and Bochkovskiy, Alexey and Liao, Hong-Yuan Mark},
        journal={arXiv preprint arXiv:2207.02696},
        year={2022}
    }
    '''
    def __init__(self,
                 img_size, path_yolov7_weights,
                 path_img_i='None',
                 device_i='cpu',
                 conf_thres=0.25,
                 iou_thres=0.5):

        self.conf_thres=conf_thres
        self.iou_thres=iou_thres
        self.clicked=False
        self.img_size=img_size
        self.path_yolov7_weights=path_yolov7_weights
        self.path_img_i=path_img_i
        from utils.general import check_img_size, non_max_suppression, scale_coords
        from utils.torch_utils import select_device
        from models.experimental import attempt_load
        self.scale_coords=scale_coords
        self.non_max_suppression=non_max_suppression
        self.select_device=select_device
        self.attempt_load=attempt_load
        self.check_img_size=check_img_size

        #Initialize
        self.predicted_bboxes_PascalVOC=[]
        self.im0=None
        self.im=None
        self.device = self.select_device(device_i) #gpu 0,1,2,3 etc or '' if cpu
        self.half = self.device.type != 'cpu' # half precision only supported on CUDA
        self.logging=logging
        self.logging.basicConfig(level=self.logging.DEBUG)



    def load_model(self):
        '''
        Loads the yolov7 model

        self.path_yolov7_weights = r"/example_path/my_model/best.pt"
        self.device = '0' for gpu cuda 0, '' for cpu
        '''
        # Load model
        self.model = self.attempt_load(self.path_yolov7_weights, map_location=self.device) # load FP32 model
        self.stride = int(self.model.stride.max()) # model stride
        self.img_size = self.check_img_size(self.img_size, s=self.stride) # check img_size
        if self.half:
            self.model.half() # to FP16

        # Get names and colors
        self.names = self.model.module.names if hasattr(self.model, 'module') else self.model.names
        self.colors = [[random.randint(0, 255) for _ in range(3)] for _ in self.names]

        # Run inference
        if self.device.type != 'cpu':
            self.model(torch.zeros(1, 3, self.img_size, self.img_size).to(self.device).type_as(next(self.model.parameters()))) # run once

    def read_img(self,path_img_i):
        '''
        Reads a single path to a .jpg file with OpenCV

        path_img_i = r"/example_path/img_example_i.jpg"
        '''
        #Read path_img_i
        if type(path_img_i)==type('string'):
            if os.path.exists(path_img_i):
                self.path_img_i=path_img_i
                self.im0=cv2.imread(self.path_img_i)
                print('self.im0.shape',self.im0.shape)
                #self.im0=cv2.resize(self.im0,(self.img_size,self.img_size))
            else:
                log_i=f'read_img \t Bad path for path_img_i:\n {path_img_i}'
                self.logging.error(log_i)
        else:
            log_i=f'read_img \t Bad type for path_img_i\n {path_img_i}'
            self.logging.error(log_i)

    def load_cv2mat(self,im0=None):
        '''
        Loads an OpenCV matrix

        im0 = cv2.imread(self.path_img_i)
        '''
        if type(im0)!=type(None):
            self.im0=im0
        if type(self.im0)!=type(None):
            self.img=self.im0.copy()
            self.imn = cv2.cvtColor(self.im0, cv2.COLOR_BGR2RGB)
            self.img=self.imn.copy()
            image = self.img.copy()
            # letterbox to the configured model input size
            image, self.ratio, self.dwdh = self.letterbox(image, new_shape=(self.img_size, self.img_size), auto=False)
            self.image_letter=image.copy()
            image = image.transpose((2, 0, 1))

            image = np.expand_dims(image, 0)
            image = np.ascontiguousarray(image)
            self.im = image.astype(np.float32)
            self.im = torch.from_numpy(self.im).to(self.device)
            self.im = self.im.half() if self.half else self.im.float() # uint8 to fp16/32
            self.im /= 255.0 # 0 - 255 to 0.0 - 1.0
            if self.im.ndimension() == 3:
                self.im = self.im.unsqueeze(0)
        else:
            log_i=f'load_cv2mat \t Bad self.im0\n {self.im0}'
            self.logging.error(log_i)

    def inference(self):
        '''
        Inferences with the yolov7 model, given a valid input image (self.im)
        '''
        # Inference
        if type(self.im)!=type(None):
            self.outputs = self.model(self.im, augment=False)[0]
            # Apply NMS
            self.outputs = self.non_max_suppression(self.outputs, self.conf_thres, self.iou_thres, classes=None, agnostic=False)
            img_i=self.im0.copy()
            self.ori_images = [img_i]
            self.predicted_bboxes_PascalVOC=[]
            for i,det in enumerate(self.outputs):
                if len(det):
                    # Rescale boxes from img_size to im0 size
                    #det[:, :4] = self.scale_coords(self.im.shape[2:], det[:, :4], self.im0.shape).round()
                    # Visualizing bounding box prediction.
                    batch_id=i
                    image = self.ori_images[int(batch_id)]

                    for j,(*bboxes,score,cls_id) in enumerate(reversed(det)):
                        x0=float(bboxes[0].cpu().detach().numpy())
                        y0=float(bboxes[1].cpu().detach().numpy())
                        x1=float(bboxes[2].cpu().detach().numpy())
                        y1=float(bboxes[3].cpu().detach().numpy())
                        self.box = np.array([x0,y0,x1,y1])
                        self.box -= np.array(self.dwdh*2)
                        self.box /= self.ratio
                        self.box = self.box.round().astype(np.int32).tolist()
                        cls_id = int(cls_id)
                        score = round(float(score),3)
                        name = self.names[cls_id]
                        self.predicted_bboxes_PascalVOC.append([name,x0,y0,x1,y1,score]) #PascalVOC annotations
                        color = self.colors[self.names.index(name)]
                        name += ' '+str(score)
                        cv2.rectangle(image,self.box[:2],self.box[2:],color,2)
                        cv2.putText(image,name,(self.box[0], self.box[1] - 2),cv2.FONT_HERSHEY_SIMPLEX,0.75,[225, 255, 255],thickness=2)
                    self.image=image
                else:
                    self.image=self.im0.copy()
        else:
            log_i=f'Bad type for self.im\n {self.im}'
            self.logging.error(log_i)

    def show(self):
        '''
        Displays the detections if any are present
        '''
        if len(self.predicted_bboxes_PascalVOC)>0:
            self.TITLE='Press any key or click mouse to quit'
            cv2.namedWindow(self.TITLE)
            cv2.setMouseCallback(self.TITLE,self.onMouse)
            while cv2.waitKey(1) == -1 and not self.clicked:
                #print(self.image.shape)
                cv2.imshow(self.TITLE, self.image)
            cv2.destroyAllWindows()
            self.clicked=False
        else:
            log_i=f'Nothing detected for {self.path_img_i} \n \t w/ conf_thres={self.conf_thres} & iou_thres={self.iou_thres}'
            self.logging.debug(log_i)

    def letterbox(self,im, new_shape=(640, 640), color=(114, 114, 114), auto=True, scaleup=True, stride=32):
        '''
        Formats the image in letterbox format for yolov7
        '''
        # Resize and pad image while meeting stride-multiple constraints
        shape = im.shape[:2] # current shape [height, width]
        if isinstance(new_shape, int):
            new_shape = (new_shape, new_shape)

        # Scale ratio (new / old)
        r = min(new_shape[0] / shape[0], new_shape[1] / shape[1])
        if not scaleup: # only scale down, do not scale up (for better val mAP)
            r = min(r, 1.0)

        # Compute padding
        new_unpad = int(round(shape[1] * r)), int(round(shape[0] * r))
        dw, dh = new_shape[1] - new_unpad[0], new_shape[0] - new_unpad[1] # wh padding

        if auto: # minimum rectangle
            dw, dh = np.mod(dw, stride), np.mod(dh, stride) # wh padding

        dw /= 2 # divide padding into 2 sides
        dh /= 2

        if shape[::-1] != new_unpad: # resize
            im = cv2.resize(im, new_unpad, interpolation=cv2.INTER_LINEAR)
        top, bottom = int(round(dh - 0.1)), int(round(dh + 0.1))
        left, right = int(round(dw - 0.1)), int(round(dw + 0.1))
        im = cv2.copyMakeBorder(im, top, bottom, left, right, cv2.BORDER_CONSTANT, value=color) # add border
        return im, r, (dw, dh)

    def onMouse(self,event,x,y,flags,param):
        '''
        Handles closing example window
        '''
        if event==cv2.EVENT_LBUTTONUP:
            self.clicked=True

if __name__=='__main__':

    #INPUTS
    img_size=1056
    path_yolov7_weights="weights/best.pt"
    path_img_i=r"test_images/DJI_0028_fps24_frame00000040.jpg"

    #INITIALIZE THE app
    app=SingleInference_YOLOV7(img_size,path_yolov7_weights,path_img_i,device_i='cpu',conf_thres=0.25,iou_thres=0.5)

    #LOAD & INFERENCE
    app.load_model() #Load the yolov7 model
    app.read_img(path_img_i) #read in the jpg image from the full path, note not required if you want to load a cv2matrix instead directly
    app.load_cv2mat() #load the OpenCV matrix, note could directly feed a cv2matrix here as app.load_cv2mat(cv2matrix)
    app.inference() #make single inference
    app.show() #show results
    print(f'''
    app.predicted_bboxes_PascalVOC\n
    \t name,x0,y0,x1,y1,score\n
    {app.predicted_bboxes_PascalVOC}''')

Integrate the custom Python class with Streamlit

First we need a GitHub account and a repository dedicated to hosting our web application with Streamlit, since Streamlit loads your webpage from that repository. After those ducks are in a row, we need to make sure the repo has the minimum requirements:

  1. A requirements file (i.e. requirements.txt)
  2. A python file with streamlit running (i.e. streamlit_yolov7.py)

It is kind of a chicken & egg situation: you might not know your requirements until you try running your application a few times. It took me a few tries to figure out the right OpenCV package for pip installation. Here is what ended up in my requirements.txt file:

opencv-contrib-python-headless
streamlit==0.81.1
torch
torchvision
requests
yolov7
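
For reference, the repository Streamlit reads from ends up looking roughly like this (based on my repo; your file names may differ):

STREAMLIT_YOLOV7/
├── requirements.txt
├── streamlit_yolov7.py
├── singleinference_yolov7.py
├── weights/best.pt
├── test_images/
├── utils/    (copied from the YOLOv7 repo)
└── models/   (copied from the YOLOv7 repo)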

In order to integrate our YOLOv7 inference class & model with Streamlit, we need to import the Streamlit library and a few others:

import singleinference_yolov7
from singleinference_yolov7 import SingleInference_YOLOV7
import os
import streamlit as st
import logging
import requests
from PIL import Image
from io import BytesIO
import numpy as np
import cv2

Let's start a class for this and keep some debug logging:

class Streamlit_YOLOV7(SingleInference_YOLOV7):
    '''
    streamlit app that uses yolov7
    '''
    def __init__(self,):
        self.logging_main=logging
        self.logging_main.basicConfig(level=self.logging_main.DEBUG)

Let's make a function that calls our inference class, and since Streamlit does not provide a GPU as far as I am aware, let's make the device default to the CPU:

    def new_yolo_model(self,img_size,path_yolov7_weights,path_img_i,device_i='cpu'):
        '''
        SimpleInference_YOLOV7
        created by Steven Smiley 2022/11/24

        INPUTS:
        VARIABLES               TYPE    DESCRIPTION
        1. img_size,            #int#   #this is the yolov7 model size, should be square so 640 for a square 640x640 model etc.
        2. path_yolov7_weights, #str#   #this is the path to your yolov7 weights
        3. path_img_i,          #str#   #path to a single .jpg image for inference (NOT REQUIRED, can load cv2matrix with self.load_cv2mat())

        OUTPUT:
        VARIABLES                       TYPE    DESCRIPTION
        1. predicted_bboxes_PascalVOC   #list#  #list of values for detections containing the following (name,x0,y0,x1,y1,score)

        CREDIT
        Please see https://github.com/WongKinYiu/yolov7.git for Yolov7 resources (i.e. utils/models)
        @article{wang2022yolov7,
            title={{YOLOv7}: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors},
            author={Wang, Chien-Yao and Bochkovskiy, Alexey and Liao, Hong-Yuan Mark},
            journal={arXiv preprint arXiv:2207.02696},
            year={2022}
        }
        '''
        super().__init__(img_size,path_yolov7_weights,path_img_i,device_i=device_i)

Next, a helper function to load an image to run inference on:

    def load_image_st(self):
        uploaded_img=st.file_uploader(label='Upload an image')
        if type(uploaded_img) != type(None):
            self.img_data=uploaded_img.getvalue()
            st.image(self.img_data)
            self.im0=Image.open(BytesIO(self.img_data))
            self.im0=np.array(self.im0)
            return self.im0
        elif type(self.im0) != type(None):
            return self.im0
        else:
            return None

And a helper function to run inference with:

    def predict(self):
        self.conf_thres=self.conf_selection
        st.write('Loading image')
        self.load_cv2mat(self.im0)
        st.write('Making inference')
        self.inference()

        self.img_screen=Image.fromarray(self.image).convert('RGB')

        self.capt='DETECTED:'
        if len(self.predicted_bboxes_PascalVOC)>0:
            for item in self.predicted_bboxes_PascalVOC:
                name=str(item[0])
                conf=str(round(100*item[-1],2))
                self.capt=self.capt+' name='+name+' confidence='+conf+'%, '
        st.image(self.img_screen, caption=self.capt, width=None, use_column_width=None, clamp=False, channels="RGB", output_format="auto")
        self.image=None

Now to define the main function of this web app

    def main(self):
        st.title('Custom YoloV7 Object Detector')
        st.subheader(""" Upload an image and run YoloV7 on it.
This model was trained to detect the following classes from a drone's vantage point.
Notice where the model fails.
(i.e. objects too close up & too far away):\n""")
        st.markdown(
            """
            <style>
            .reportview-container .markdown-text-container {
                font-family: monospace;
            }
            .sidebar .sidebar-content {
                background-image: linear-gradient(#2e7bcf,#2e7bcf);
                color: black;
            }
            .Widget>label {
                color: green;
                font-family: monospace;
            }
            [class^="st-b"] {
                color: green;
                font-family: monospace;
            }
            .st-bb {
                background-color: black;
            }
            .st-at {
                background-color: green;
            }
            footer {
                font-family: monospace;
            }
            .reportview-container .main footer, .reportview-container .main footer a {
                color: black;
            }
            header .decoration {
                background-image: none;
            }
            </style>
            """,
            unsafe_allow_html=True,
        )
        st.markdown(
            """
            <style>
            .reportview-container {
                background: url("https://raw.githubusercontent.com/stevensmiley1989/STREAMLIT_YOLOV7/main/misc/IMG_0512_reduce.JPG")
            }
            .sidebar .sidebar-content {
                background: url("https://raw.githubusercontent.com/stevensmiley1989/STREAMLIT_YOLOV7/main/misc/IMG_0512_reduce.JPG")
            }
            </style>
            """,
            unsafe_allow_html=True
        )
        text_i_list=[]
        for i,name_i in enumerate(self.names):
            #text_i_list.append(f'id={i} \t \t name={name_i}\n')
            text_i_list.append(f'{i}: {name_i}\n')
        st.selectbox('Classes',tuple(text_i_list))
        self.conf_selection=st.selectbox('Confidence Threshold',tuple([0.1,0.25,0.5,0.75,0.95]))

        self.response=requests.get(self.path_img_i)
        self.img_screen=Image.open(BytesIO(self.response.content))
        st.image(self.img_screen, caption=self.capt, width=None, use_column_width=None, clamp=False, channels="RGB", output_format="auto")
        st.markdown('YoloV7 on streamlit. Demo of object detection with YoloV7 with a web application.')
        self.im0=np.array(self.img_screen.convert('RGB'))
        self.load_image_st()
        predictions = st.button('Predict on the image?')
        if predictions:
            self.predict()
            predictions=False

Putting it all together:

import singleinference_yolov7
from singleinference_yolov7 import SingleInference_YOLOV7
import os
import streamlit as st
import logging
import requests
from PIL import Image
from io import BytesIO
import numpy as np
import cv2
class Streamlit_YOLOV7(SingleInference_YOLOV7):
    '''
    streamlit app that uses yolov7
    '''
    def __init__(self,):
        self.logging_main=logging
        self.logging_main.basicConfig(level=self.logging_main.DEBUG)

    def new_yolo_model(self,img_size,path_yolov7_weights,path_img_i,device_i='cpu'):
        '''
        SimpleInference_YOLOV7
        created by Steven Smiley 2022/11/24

        INPUTS:
        VARIABLES               TYPE    DESCRIPTION
        1. img_size,            #int#   #this is the yolov7 model size, should be square so 640 for a square 640x640 model etc.
        2. path_yolov7_weights, #str#   #this is the path to your yolov7 weights
        3. path_img_i,          #str#   #path to a single .jpg image for inference (NOT REQUIRED, can load cv2matrix with self.load_cv2mat())

        OUTPUT:
        VARIABLES                       TYPE    DESCRIPTION
        1. predicted_bboxes_PascalVOC   #list#  #list of values for detections containing the following (name,x0,y0,x1,y1,score)

        CREDIT
        Please see https://github.com/WongKinYiu/yolov7.git for Yolov7 resources (i.e. utils/models)
        @article{wang2022yolov7,
            title={{YOLOv7}: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors},
            author={Wang, Chien-Yao and Bochkovskiy, Alexey and Liao, Hong-Yuan Mark},
            journal={arXiv preprint arXiv:2207.02696},
            year={2022}
        }
        '''
        super().__init__(img_size,path_yolov7_weights,path_img_i,device_i=device_i)
    def main(self):
        st.title('Custom YoloV7 Object Detector')
        st.subheader(""" Upload an image and run YoloV7 on it.
This model was trained to detect the following classes from a drone's vantage point.
Notice where the model fails.
(i.e. objects too close up & too far away):\n""")
        st.markdown(
            """
            <style>
            .reportview-container .markdown-text-container {
                font-family: monospace;
            }
            .sidebar .sidebar-content {
                background-image: linear-gradient(#2e7bcf,#2e7bcf);
                color: black;
            }
            .Widget>label {
                color: green;
                font-family: monospace;
            }
            [class^="st-b"] {
                color: green;
                font-family: monospace;
            }
            .st-bb {
                background-color: black;
            }
            .st-at {
                background-color: green;
            }
            footer {
                font-family: monospace;
            }
            .reportview-container .main footer, .reportview-container .main footer a {
                color: black;
            }
            header .decoration {
                background-image: none;
            }
            </style>
            """,
            unsafe_allow_html=True,
        )
        st.markdown(
            """
            <style>
            .reportview-container {
                background: url("https://raw.githubusercontent.com/stevensmiley1989/STREAMLIT_YOLOV7/main/misc/IMG_0512_reduce.JPG")
            }
            .sidebar .sidebar-content {
                background: url("https://raw.githubusercontent.com/stevensmiley1989/STREAMLIT_YOLOV7/main/misc/IMG_0512_reduce.JPG")
            }
            </style>
            """,
            unsafe_allow_html=True
        )
        text_i_list=[]
        for i,name_i in enumerate(self.names):
            #text_i_list.append(f'id={i} \t \t name={name_i}\n')
            text_i_list.append(f'{i}: {name_i}\n')
        st.selectbox('Classes',tuple(text_i_list))
        self.conf_selection=st.selectbox('Confidence Threshold',tuple([0.1,0.25,0.5,0.75,0.95]))

        self.response=requests.get(self.path_img_i)
        self.img_screen=Image.open(BytesIO(self.response.content))
        st.image(self.img_screen, caption=self.capt, width=None, use_column_width=None, clamp=False, channels="RGB", output_format="auto")
        st.markdown('YoloV7 on streamlit. Demo of object detection with YoloV7 with a web application.')
        self.im0=np.array(self.img_screen.convert('RGB'))
        self.load_image_st()
        predictions = st.button('Predict on the image?')
        if predictions:
            self.predict()
            predictions=False

    def load_image_st(self):
        uploaded_img=st.file_uploader(label='Upload an image')
        if type(uploaded_img) != type(None):
            self.img_data=uploaded_img.getvalue()
            st.image(self.img_data)
            self.im0=Image.open(BytesIO(self.img_data))#.convert('RGB')
            self.im0=np.array(self.im0)
            return self.im0
        elif type(self.im0) != type(None):
            return self.im0
        else:
            return None

    def predict(self):
        self.conf_thres=self.conf_selection
        st.write('Loading image')
        self.load_cv2mat(self.im0)
        st.write('Making inference')
        self.inference()

        self.img_screen=Image.fromarray(self.image).convert('RGB')

        self.capt='DETECTED:'
        if len(self.predicted_bboxes_PascalVOC)>0:
            for item in self.predicted_bboxes_PascalVOC:
                name=str(item[0])
                conf=str(round(100*item[-1],2))
                self.capt=self.capt+' name='+name+' confidence='+conf+'%, '
        st.image(self.img_screen, caption=self.capt, width=None, use_column_width=None, clamp=False, channels="RGB", output_format="auto")
        self.image=None

if __name__=='__main__':
    app=Streamlit_YOLOV7()

    #INPUTS for YOLOV7
    img_size=1056
    path_yolov7_weights="weights/best.pt"
    path_img_i="https://raw.githubusercontent.com/stevensmiley1989/STREAMLIT_YOLOV7/main/test_images/DJI_0028_fps24_frame00000040.jpg"

    #INPUTS for webapp
    app.capt="Initial Image"
    app.new_yolo_model(img_size,path_yolov7_weights,path_img_i)
    app.conf_thres=0.65
    app.load_model() #Load the yolov7 model

    app.main()

…DEPLOY

With requirements.txt, streamlit_yolov7.py, and the supporting files pushed to the dedicated GitHub repository, you can test locally with streamlit run streamlit_yolov7.py and then point Streamlit Cloud at the repo so it builds and serves the app. Check out the webpage we just made here.

Thank you for reading!

Feel free to contact me to discuss any issues, questions, or comments.

References

  1. Wang, C.-Y., Bochkovskiy, A., & Liao, H.-Y. M. (2022). YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. arXiv. https://doi.org/10.48550/ARXIV.2207.02696
  2. Full-Loop-YOLO, “GUI for training/inference with YoloV4 & Yolov7.” https://github.com/stevensmiley1989/Full_Loop_YOLO.
  3. Streamlit. https://streamlit.io/
  4. Zhu, P., Wen, L., Du, D., et al. (2021). Detection and Tracking Meet Drones Challenge. IEEE Transactions on Pattern Analysis & Machine Intelligence.


Steven Smiley

Lead Machine Learning Engineer who also enjoys writing about Data Science, CV, DL, ML, AI, Python https://www.linkedin.com/in/stevensmiley1989/