Train & Deploy YOLOv7 to Streamlit

Steven Smiley
Nov 27, 2022 · 14 min read

Make a web application out of a custom trained YOLOv7 model.

Image from my iPhone, taken on 11/15/2022

Object detection algorithms have been improving steadily over the last few years. As of 2022, the state of the art is YOLOv7, as its authors state in their recent paper¹:

“YOLOv7 surpasses all known object detectors in both speed and accuracy in the range from 5 FPS to 160 FPS and has the highest accuracy 56.8% AP among all known real-time object detectors with 30 FPS or higher on GPU V100.”¹

You can play with their code base here: https://github.com/WongKinYiu/yolov7. I went ahead and developed my own Graphical User Interface (GUI) called Full-Loop-YOLO² to interact with training/deploying models with their code base. It simplifies much of the process. You can find my repository for that tool here: https://github.com/stevensmiley1989/Full_Loop_YOLO.

Screenshot of my GUI from my website.²

This article will explain how I deployed one of these custom trained YOLOv7 models to a web application, using the open source framework, Streamlit.³

If you want to see the web application first, click here.

The code for this article is located in my GitHub repo here.

What is Streamlit³?

“Streamlit is an open-source Python library that makes it easy to create and share beautiful, custom web apps for machine learning and data science. In just a few minutes you can build and deploy powerful data apps.”

So we can just deploy, right? It is that simple? … not yet, speed racer. A few steps come first:

  1. Gather Training Data
  2. Train a Custom YOLOv7 Model
  3. Create a custom Python class to run inference with the YOLOv7 model
  4. Integrate the custom Python class with Streamlit
  5. … then Deploy

Gather Training Data

You need data to train a model. Fortunately, you can find a plethora of data for free on the internet. In my case, I chose the VisDrone⁴ object detection dataset. You can find the data I used here: http://aiskyeye.com/download/object-detection-2/.

Train a Custom YOLOv7 Model

In order to work with Full-Loop-YOLO, you need to convert the VisDrone annotation files to PascalVOC format. Full-Loop-YOLO converts PascalVOC (i.e., x0,y0,x1,y1) .xml files to YOLO-format (i.e., xc,yc,w,h) .txt files when you click Create YOLO Objects, so the only real work is converting VisDrone to PascalVOC. See the helper function VisDrone_to_PascalVOC below.

PascalVOC format: xmin,ymin,xmax,ymax

YOLO format: xc,yc,w,h

VisDrone annotation format: xmin,ymin,w,h
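For reference, here is a minimal sketch of the coordinate math relating the three formats (the function names are mine, purely for illustration; Full-Loop-YOLO handles the PascalVOC-to-YOLO step for you, and the helper below handles VisDrone-to-PascalVOC):

# Illustrative only: the box-coordinate math behind the three formats.
def visdrone_to_pascalvoc_box(xmin, ymin, w, h):
    # VisDrone stores the top-left corner plus width/height in pixels
    return xmin, ymin, xmin + w, ymin + h  # -> xmin, ymin, xmax, ymax

def pascalvoc_to_yolo_box(xmin, ymin, xmax, ymax, img_w, img_h):
    # YOLO stores the box center and size, normalized by the image dimensions
    xc = (xmin + xmax) / 2.0 / img_w
    yc = (ymin + ymax) / 2.0 / img_h
    w = (xmax - xmin) / img_w
    h = (ymax - ymin) / img_h
    return xc, yc, w, h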

import numpy as np
from pathlib import Path
from lxml import etree as ET
from PIL import Image
import os

def VisDrone_to_PascalVOC(path_anno,path_img,custom_new_path_anno=None):
    '''VisDrone Dataset annotation format
    xmin,ymin,w,h,score_i,cat_i,trunc_i,occ_i
    INPUT:
        READS a Single Annotation from VisDrone, Format .txt
        READS a Single Image from VisDrone, Format .jpg
    OUTPUT:
        WRITES a Single Annotation of PascalVOC, Format .xml
    '''
    f=open(path_anno,'r')
    f_read=f.readlines()
    f.close()
    All_bboxes=[]
    for i,line in enumerate(f_read):
        xmin,ymin,w,h,score_i,label_i,trunc_i,occ_i=line.strip().split(',')
        xmin=int(xmin)
        xmax=int(xmin)+int(w)
        ymin=int(ymin)
        ymax=int(ymin)+int(h)

        dic_i={"xmin":xmin,
               "xmax":xmax,
               "ymin":ymin,
               "ymax":ymax,
               "Label":label_i,
               "trunc_i":trunc_i,
               "occ_i":occ_i}

        All_bboxes.append(dic_i)
    if type(custom_new_path_anno)==type(None):
        # Create basepath to PascalVOC Annotations
        basepath_PascalVOC=os.path.join(os.path.dirname(path_anno),"Annotations")
        if os.path.exists(basepath_PascalVOC)==False:
            os.makedirs(basepath_PascalVOC)
        # Create new PascalVOC annotation path
        basename_xml_i=os.path.basename(path_anno).split('.')[0]+'.xml'
        path_anno_PascalVOC_i=os.path.join(basepath_PascalVOC,basename_xml_i)
    else:
        path_anno_PascalVOC_i=custom_new_path_anno #if you want to specify a specific path

    # import img for width/height/depth information
    img = np.array(Image.open(path_img).convert('RGB'))

    # folder/filename entries for the PascalVOC annotation
    anno_folder = os.path.basename(os.path.dirname(path_img))
    filename = os.path.basename(path_img)

    # create an ET.Element tree for the new annotation
    annotation = ET.Element('annotation')
    ET.SubElement(annotation, 'folder').text = str(anno_folder)
    ET.SubElement(annotation, 'filename').text = str(filename)
    ET.SubElement(annotation, 'path').text = str(filename)

    source = ET.SubElement(annotation, 'source')
    ET.SubElement(source, 'database').text = 'VisDrone'

    size = ET.SubElement(annotation, 'size')
    ET.SubElement(size, 'width').text = str(img.shape[1])
    ET.SubElement(size, 'height').text = str(img.shape[0])
    ET.SubElement(size, 'depth').text = str(img.shape[2])

    ET.SubElement(annotation, 'segmented').text = '0'

    for item in All_bboxes:
        label = item['Label']
        trunc = item['trunc_i']
        diff = item['occ_i']
        xmax = item['xmax']
        xmin = item['xmin']
        ymin = item['ymin']
        ymax = item['ymax']

        object_i = ET.SubElement(annotation, 'object')
        ET.SubElement(object_i, 'name').text = label
        ET.SubElement(object_i, 'truncated').text = trunc
        ET.SubElement(object_i, 'difficult').text = diff

        bndbox = ET.SubElement(object_i, 'bndbox')
        ET.SubElement(bndbox, 'xmin').text = str(xmin)
        ET.SubElement(bndbox, 'ymin').text = str(ymin)
        ET.SubElement(bndbox, 'xmax').text = str(xmax)
        ET.SubElement(bndbox, 'ymax').text = str(ymax)

    tree = ET.ElementTree(annotation)
    tree.write(path_anno_PascalVOC_i,pretty_print=True)
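To convert the whole dataset, you can loop the helper over each annotation/image pair. A minimal driver might look like this (the directory names are placeholders for wherever you unpacked the VisDrone download):

import glob
import os

path_annos = "VisDrone2019-DET-train/annotations"  # placeholder paths
path_imgs = "VisDrone2019-DET-train/images"

for path_anno in sorted(glob.glob(os.path.join(path_annos, "*.txt"))):
    basename = os.path.basename(path_anno).split('.')[0]
    path_img = os.path.join(path_imgs, basename + ".jpg")
    if os.path.exists(path_img):
        VisDrone_to_PascalVOC(path_anno, path_img)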

Full-Loop-YOLO steps to train YOLOv7

#1. Go to my repo and follow the directions for setting up YOLOv7 training:

https://github.com/stevensmiley1989/Full_Loop_YOLO

git clone https://github.com/stevensmiley1989/Full_Loop_YOLO.git
Full-Loop-YOLO repo

#2. Run Full-Loop YOLO

python3 Full_Loop_YOLO.py

#3. Create New Model (button click)

Create New Model

#4. Set Yolo_Files path

This is the path where your model files will be written. My suggestion is to put it at the same directory level as JPEGImages/Annotations/Yolo_Objs for consistency between dataset and model development.

#5. Set Annotations path

This is where all of the PascalVOC Annotations we just converted from VisDrone are located.

#6. Set JPEGImages path

This is where all of the VisDrone images that correspond to the Annotations are located (i.e., example.xml has an image called example.jpg in the JPEGImages directory). A quick way to sanity-check that pairing is sketched below.
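If you want to verify the annotation/image pairing before training, a small check might look like this (paths are placeholders for your own directories):

import glob
import os

path_annotations = "Annotations"  # placeholder paths
path_jpegs = "JPEGImages"

for xml_path in glob.glob(os.path.join(path_annotations, "*.xml")):
    jpg_path = os.path.join(path_jpegs, os.path.basename(xml_path).replace(".xml", ".jpg"))
    if not os.path.exists(jpg_path):
        print("Missing image for", xml_path)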

#7. Set Yolo_Objs path

This is where all of the YOLO .txt files will be generated for training. My suggestion is to keep this directory at the same level as Annotations/JPEGImages/Yolo_Files.

#8. Modify the Width/Height/Prefix you want to train with.

I changed the input dimensions from 640x640 to 1056x1056 for a slightly bigger yolov7-tiny model.

#9. Generate Configuration Files for Training (button click)

On future runs you can simply load these instead of regenerating them.

#10. Create Yolo Objects (button click)

The first time, the radio button should be set to “Create new”; on future runs you can reuse the previous objects. Once this step is done, all of your JPEGImages/Annotations have been read and converted into YOLO objects (.txt/.jpg) in the Yolo_Objs directory, which is the format required for training YOLO.

#11. Split Training/Validation (button click)

My recommendation is to leave it at 70% Train / 30% Validate. (A rough sketch of what this split produces is shown below.)
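Under the hood, a 70/30 split simply partitions the list of labeled images into a training list and a validation list. Full-Loop-YOLO writes its own lists for you; the sketch below just illustrates the idea (file and directory names are mine, for illustration only):

import glob
import os
import random

# gather every labeled image in Yolo_Objs and shuffle it
imgs = sorted(glob.glob(os.path.join("Yolo_Objs", "*.jpg")))
random.shuffle(imgs)
split = int(0.7 * len(imgs))

# 70% of the image paths go to training, the rest to validation
with open("train.txt", "w") as f:
    f.write("\n".join(imgs[:split]))
with open("valid.txt", "w") as f:
    f.write("\n".join(imgs[split:]))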

#12. Create Scripts (button click)

This creates all of the bash training & inference scripts, which you can find in the Yolo_Files directory. Click the Scripts icon to open them directly.

#13. TRAIN Scripts Buttons (button click)

This brings up a top-level pop-up for starting training with the desired YOLO variant, in this case YOLOv7-tiny.

#14. Click Train to start training (button click)

We will keep it around 40 epochs at first and train YOLOv7-tiny.

We can watch our progress through TensorBoard as it trains:

Our best weights will be best.pt, located in the yolov7-tiny sub-directory.

Create a custom Python class to run inference with the YOLOv7 model

Now that we have a trained model, we need a way to integrate it with Streamlit. The repo for YOLOv7 has some functionality we will copy to make this work, specifically scripts located in yolov7/utils & yolov7/models.

I wrote this helper Python class (SingleInference_YOLOV7) that can run inference with trained YOLOv7 models given a weights path, the model input size, and an input image path or cv2 matrix.

#singleinference_yolov7.py
import random
import numpy as np
import os
import sys
import torch
import cv2
import logging

class SingleInference_YOLOV7:
    '''
    SimpleInference_YOLOV7
    created by Steven Smiley 2022/11/24

    INPUTS:
    VARIABLES               TYPE    DESCRIPTION
    1. img_size,            #int#   #this is the yolov7 model size, should be square so 640 for a square 640x640 model etc.
    2. path_yolov7_weights, #str#   #this is the path to your yolov7 weights
    3. path_img_i,          #str#   #path to a single .jpg image for inference (NOT REQUIRED, can load cv2matrix with self.load_cv2mat())

    OUTPUT:
    VARIABLES                       TYPE    DESCRIPTION
    1. predicted_bboxes_PascalVOC   #list#  #list of values for detections containing the following (name,x0,y0,x1,y1,score)

    CREDIT
    Please see https://github.com/WongKinYiu/yolov7.git for Yolov7 resources (i.e. utils/models)
    @article{wang2022yolov7,
        title={{YOLOv7}: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors},
        author={Wang, Chien-Yao and Bochkovskiy, Alexey and Liao, Hong-Yuan Mark},
        journal={arXiv preprint arXiv:2207.02696},
        year={2022}
    }
    '''
    def __init__(self,
                 img_size, path_yolov7_weights,
                 path_img_i='None',
                 device_i='cpu',
                 conf_thres=0.25,
                 iou_thres=0.5):

        self.conf_thres=conf_thres
        self.iou_thres=iou_thres
        self.clicked=False
        self.img_size=img_size
        self.path_yolov7_weights=path_yolov7_weights
        self.path_img_i=path_img_i
        from utils.general import check_img_size, non_max_suppression, scale_coords
        from utils.torch_utils import select_device
        from models.experimental import attempt_load
        self.scale_coords=scale_coords
        self.non_max_suppression=non_max_suppression
        self.select_device=select_device
        self.attempt_load=attempt_load
        self.check_img_size=check_img_size

        #Initialize
        self.predicted_bboxes_PascalVOC=[]
        self.im0=None
        self.im=None
        self.device = self.select_device(device_i) #gpu 0,1,2,3 etc or '' if cpu
        self.half = self.device.type != 'cpu' # half precision only supported on CUDA
        self.logging=logging
        self.logging.basicConfig(level=self.logging.DEBUG)



    def load_model(self):
        '''
        Loads the yolov7 model

        self.path_yolov7_weights = r"/example_path/my_model/best.pt"
        self.device = '0' for gpu cuda 0, '' for cpu
        '''
        # Load model
        self.model = self.attempt_load(self.path_yolov7_weights, map_location=self.device) # load FP32 model
        self.stride = int(self.model.stride.max()) # model stride
        self.img_size = self.check_img_size(self.img_size, s=self.stride) # check img_size
        if self.half:
            self.model.half() # to FP16

        # Get names and colors
        self.names = self.model.module.names if hasattr(self.model, 'module') else self.model.names
        self.colors = [[random.randint(0, 255) for _ in range(3)] for _ in self.names]

        # Run inference
        if self.device.type != 'cpu':
            self.model(torch.zeros(1, 3, self.img_size, self.img_size).to(self.device).type_as(next(self.model.parameters()))) # run once

    def read_img(self,path_img_i):
        '''
        Reads a single path to a .jpg file with OpenCV

        path_img_i = r"/example_path/img_example_i.jpg"
        '''
        #Read path_img_i
        if type(path_img_i)==type('string'):
            if os.path.exists(path_img_i):
                self.path_img_i=path_img_i
                self.im0=cv2.imread(self.path_img_i)
                print('self.im0.shape',self.im0.shape)
                #self.im0=cv2.resize(self.im0,(self.img_size,self.img_size))
            else:
                log_i=f'read_img \t Bad path for path_img_i:\n {path_img_i}'
                self.logging.error(log_i)
        else:
            log_i=f'read_img \t Bad type for path_img_i\n {path_img_i}'
            self.logging.error(log_i)

    def load_cv2mat(self,im0=None):
        '''
        Loads an OpenCV matrix

        im0 = cv2.imread(self.path_img_i)
        '''
        if type(im0)!=type(None):
            self.im0=im0
        if type(self.im0)!=type(None):
            self.img=self.im0.copy()
            self.imn = cv2.cvtColor(self.im0, cv2.COLOR_BGR2RGB)
            self.img=self.imn.copy()
            image = self.img.copy()
            # letterbox to the configured model input size
            image, self.ratio, self.dwdh = self.letterbox(image, new_shape=(self.img_size, self.img_size), auto=False)
            self.image_letter=image.copy()
            image = image.transpose((2, 0, 1))

            image = np.expand_dims(image, 0)
            image = np.ascontiguousarray(image)
            self.im = image.astype(np.float32)
            self.im = torch.from_numpy(self.im).to(self.device)
            self.im = self.im.half() if self.half else self.im.float() # uint8 to fp16/32
            self.im /= 255.0 # 0 - 255 to 0.0 - 1.0
            if self.im.ndimension() == 3:
                self.im = self.im.unsqueeze(0)
        else:
            log_i=f'load_cv2mat \t Bad self.im0\n {self.im0}'
            self.logging.error(log_i)

    def inference(self):
        '''
        Inferences with the yolov7 model, given a valid input image (self.im)
        '''
        # Inference
        if type(self.im)!=type(None):
            self.outputs = self.model(self.im, augment=False)[0]
            # Apply NMS
            self.outputs = self.non_max_suppression(self.outputs, self.conf_thres, self.iou_thres, classes=None, agnostic=False)
            img_i=self.im0.copy()
            self.ori_images = [img_i]
            self.predicted_bboxes_PascalVOC=[]
            for i,det in enumerate(self.outputs):
                if len(det):
                    # Rescale boxes from img_size to im0 size
                    #det[:, :4] = self.scale_coords(self.im.shape[2:], det[:, :4], self.im0.shape).round()
                    # Visualizing bounding box prediction.
                    batch_id=i
                    image = self.ori_images[int(batch_id)]

                    for j,(*bboxes,score,cls_id) in enumerate(reversed(det)):
                        x0=float(bboxes[0].cpu().detach().numpy())
                        y0=float(bboxes[1].cpu().detach().numpy())
                        x1=float(bboxes[2].cpu().detach().numpy())
                        y1=float(bboxes[3].cpu().detach().numpy())
                        self.box = np.array([x0,y0,x1,y1])
                        self.box -= np.array(self.dwdh*2)
                        self.box /= self.ratio
                        self.box = self.box.round().astype(np.int32).tolist()
                        cls_id = int(cls_id)
                        score = round(float(score),3)
                        name = self.names[cls_id]
                        self.predicted_bboxes_PascalVOC.append([name,x0,y0,x1,y1,score]) #PascalVOC annotations
                        color = self.colors[self.names.index(name)]
                        name += ' '+str(score)
                        cv2.rectangle(image,self.box[:2],self.box[2:],color,2)
                        cv2.putText(image,name,(self.box[0], self.box[1] - 2),cv2.FONT_HERSHEY_SIMPLEX,0.75,[225, 255, 255],thickness=2)
                    self.image=image
                else:
                    self.image=self.im0.copy()
        else:
            log_i=f'Bad type for self.im\n {self.im}'
            self.logging.error(log_i)

    def show(self):
        '''
        Displays the detections if any are present
        '''
        if len(self.predicted_bboxes_PascalVOC)>0:
            self.TITLE='Press any key or click mouse to quit'
            cv2.namedWindow(self.TITLE)
            cv2.setMouseCallback(self.TITLE,self.onMouse)
            while cv2.waitKey(1) == -1 and not self.clicked:
                #print(self.image.shape)
                cv2.imshow(self.TITLE, self.image)
            cv2.destroyAllWindows()
            self.clicked=False
        else:
            log_i=f'Nothing detected for {self.path_img_i} \n \t w/ conf_thres={self.conf_thres} & iou_thres={self.iou_thres}'
            self.logging.debug(log_i)

    def letterbox(self,im, new_shape=(640, 640), color=(114, 114, 114), auto=True, scaleup=True, stride=32):
        '''
        Formats the image in letterbox format for yolov7
        '''
        # Resize and pad image while meeting stride-multiple constraints
        shape = im.shape[:2] # current shape [height, width]
        if isinstance(new_shape, int):
            new_shape = (new_shape, new_shape)

        # Scale ratio (new / old)
        r = min(new_shape[0] / shape[0], new_shape[1] / shape[1])
        if not scaleup: # only scale down, do not scale up (for better val mAP)
            r = min(r, 1.0)

        # Compute padding
        new_unpad = int(round(shape[1] * r)), int(round(shape[0] * r))
        dw, dh = new_shape[1] - new_unpad[0], new_shape[0] - new_unpad[1] # wh padding

        if auto: # minimum rectangle
            dw, dh = np.mod(dw, stride), np.mod(dh, stride) # wh padding

        dw /= 2 # divide padding into 2 sides
        dh /= 2

        if shape[::-1] != new_unpad: # resize
            im = cv2.resize(im, new_unpad, interpolation=cv2.INTER_LINEAR)
        top, bottom = int(round(dh - 0.1)), int(round(dh + 0.1))
        left, right = int(round(dw - 0.1)), int(round(dw + 0.1))
        im = cv2.copyMakeBorder(im, top, bottom, left, right, cv2.BORDER_CONSTANT, value=color) # add border
        return im, r, (dw, dh)

    def onMouse(self,event,x,y,flags,param):
        '''
        Handles closing example window
        '''
        if event==cv2.EVENT_LBUTTONUP:
            self.clicked=True

if __name__=='__main__':

    #INPUTS
    img_size=1056
    path_yolov7_weights="weights/best.pt"
    path_img_i=r"test_images/DJI_0028_fps24_frame00000040.jpg"

    #INITIALIZE THE app
    app=SingleInference_YOLOV7(img_size,path_yolov7_weights,path_img_i,device_i='cpu',conf_thres=0.25,iou_thres=0.5)

    #LOAD & INFERENCE
    app.load_model() #Load the yolov7 model
    app.read_img(path_img_i) #read in the jpg image from the full path, note not required if you want to load a cv2matrix instead directly
    app.load_cv2mat() #load the OpenCV matrix, note could directly feed a cv2matrix here as app.load_cv2mat(cv2matrix)
    app.inference() #make single inference
    app.show() #show results
    print(f'''
    app.predicted_bboxes_PascalVOC\n
    \t name,x0,y0,x1,y1,score\n
    {app.predicted_bboxes_PascalVOC}''')

Integrate the custom Python class with Streamlit

First we need a GitHub account and a repository dedicated to hosting our web application with Streamlit, since Streamlit loads your webpage from that repository. After those ducks are in a row, we need to make sure the repo has the minimum requirements:

  1. A requirements file (i.e. requirements.txt)
  2. A python file with streamlit running (i.e. streamlit_yolov7.py)

It is kind of a chicken & egg situation: you might not know your requirements until you try running your application a few times. It took me a few tries to figure out the right OpenCV package for pip installation. Here is what ended up in my requirements.txt file:

opencv-contrib-python-headless
streamlit==0.81.1
torch
torchvision
requests
yolov7
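
For reference, the repository Streamlit reads from ends up looking roughly like this (based on my repo; your file names may differ):

STREAMLIT_YOLOV7/
├── requirements.txt
├── streamlit_yolov7.py
├── singleinference_yolov7.py
├── weights/best.pt
├── test_images/
├── utils/    (copied from the YOLOv7 repo)
└── models/   (copied from the YOLOv7 repo)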

In order to integrate our YOLOv7 inference class & model with Streamlit, we need to import the Streamlit library and a few others:

import singleinference_yolov7
from singleinference_yolov7 import SingleInference_YOLOV7
import os
import streamlit as st
import logging
import requests
from PIL import Image
from io import BytesIO
import numpy as np
import cv2

Let's start a class for this and keep some debug logging:

class Streamlit_YOLOV7(SingleInference_YOLOV7):
    '''
    streamlit app that uses yolov7
    '''
    def __init__(self,):
        self.logging_main=logging
        self.logging_main.basicConfig(level=self.logging_main.DEBUG)

Let's make a function that calls our inference class, and since Streamlit does not provide a GPU as far as I am aware, let's make the device default to the CPU:

    def new_yolo_model(self,img_size,path_yolov7_weights,path_img_i,device_i='cpu'):
        '''
        SimpleInference_YOLOV7
        created by Steven Smiley 2022/11/24

        INPUTS:
        VARIABLES               TYPE    DESCRIPTION
        1. img_size,            #int#   #this is the yolov7 model size, should be square so 640 for a square 640x640 model etc.
        2. path_yolov7_weights, #str#   #this is the path to your yolov7 weights
        3. path_img_i,          #str#   #path to a single .jpg image for inference (NOT REQUIRED, can load cv2matrix with self.load_cv2mat())

        OUTPUT:
        VARIABLES                       TYPE    DESCRIPTION
        1. predicted_bboxes_PascalVOC   #list#  #list of values for detections containing the following (name,x0,y0,x1,y1,score)

        CREDIT
        Please see https://github.com/WongKinYiu/yolov7.git for Yolov7 resources (i.e. utils/models)
        @article{wang2022yolov7,
            title={{YOLOv7}: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors},
            author={Wang, Chien-Yao and Bochkovskiy, Alexey and Liao, Hong-Yuan Mark},
            journal={arXiv preprint arXiv:2207.02696},
            year={2022}
        }
        '''
        super().__init__(img_size,path_yolov7_weights,path_img_i,device_i=device_i)

Next, a helper function to load an image to run inference on:

    def load_image_st(self):
        uploaded_img=st.file_uploader(label='Upload an image')
        if type(uploaded_img) != type(None):
            self.img_data=uploaded_img.getvalue()
            st.image(self.img_data)
            self.im0=Image.open(BytesIO(self.img_data))
            self.im0=np.array(self.im0)
            return self.im0
        elif type(self.im0) != type(None):
            return self.im0
        else:
            return None

And a helper function to run inference with:

    def predict(self):
        self.conf_thres=self.conf_selection
        st.write('Loading image')
        self.load_cv2mat(self.im0)
        st.write('Making inference')
        self.inference()

        self.img_screen=Image.fromarray(self.image).convert('RGB')

        self.capt='DETECTED:'
        if len(self.predicted_bboxes_PascalVOC)>0:
            for item in self.predicted_bboxes_PascalVOC:
                name=str(item[0])
                conf=str(round(100*item[-1],2))
                self.capt=self.capt+' name='+name+' confidence='+conf+'%, '
        st.image(self.img_screen, caption=self.capt, width=None, use_column_width=None, clamp=False, channels="RGB", output_format="auto")
        self.image=None

Now to define the main function of this web app

    def main(self):
        st.title('Custom YoloV7 Object Detector')
        st.subheader(""" Upload an image and run YoloV7 on it.
This model was trained to detect the following classes from a drone's vantage point.
Notice where the model fails.
(i.e. objects too close up & too far away):\n""")
        st.markdown(
            """
            <style>
            .reportview-container .markdown-text-container {
                font-family: monospace;
            }
            .sidebar .sidebar-content {
                background-image: linear-gradient(#2e7bcf,#2e7bcf);
                color: black;
            }
            .Widget>label {
                color: green;
                font-family: monospace;
            }
            [class^="st-b"] {
                color: green;
                font-family: monospace;
            }
            .st-bb {
                background-color: black;
            }
            .st-at {
                background-color: green;
            }
            footer {
                font-family: monospace;
            }
            .reportview-container .main footer, .reportview-container .main footer a {
                color: black;
            }
            header .decoration {
                background-image: none;
            }
            </style>
            """,
            unsafe_allow_html=True,
        )
        st.markdown(
            """
            <style>
            .reportview-container {
                background: url("https://raw.githubusercontent.com/stevensmiley1989/STREAMLIT_YOLOV7/main/misc/IMG_0512_reduce.JPG")
            }
            .sidebar .sidebar-content {
                background: url("https://raw.githubusercontent.com/stevensmiley1989/STREAMLIT_YOLOV7/main/misc/IMG_0512_reduce.JPG")
            }
            </style>
            """,
            unsafe_allow_html=True
        )
        text_i_list=[]
        for i,name_i in enumerate(self.names):
            #text_i_list.append(f'id={i} \t \t name={name_i}\n')
            text_i_list.append(f'{i}: {name_i}\n')
        st.selectbox('Classes',tuple(text_i_list))
        self.conf_selection=st.selectbox('Confidence Threshold',tuple([0.1,0.25,0.5,0.75,0.95]))

        self.response=requests.get(self.path_img_i)
        self.img_screen=Image.open(BytesIO(self.response.content))
        st.image(self.img_screen, caption=self.capt, width=None, use_column_width=None, clamp=False, channels="RGB", output_format="auto")
        st.markdown('YoloV7 on streamlit. Demo of object detection with YoloV7 with a web application.')
        self.im0=np.array(self.img_screen.convert('RGB'))
        self.load_image_st()
        predictions = st.button('Predict on the image?')
        if predictions:
            self.predict()
            predictions=False

Putting it all together:

import singleinference_yolov7
from singleinference_yolov7 import SingleInference_YOLOV7
import os
import streamlit as st
import logging
import requests
from PIL import Image
from io import BytesIO
import numpy as np
import cv2
class Streamlit_YOLOV7(SingleInference_YOLOV7):
    '''
    streamlit app that uses yolov7
    '''
    def __init__(self,):
        self.logging_main=logging
        self.logging_main.basicConfig(level=self.logging_main.DEBUG)

    def new_yolo_model(self,img_size,path_yolov7_weights,path_img_i,device_i='cpu'):
        '''
        SimpleInference_YOLOV7
        created by Steven Smiley 2022/11/24

        INPUTS:
        VARIABLES               TYPE    DESCRIPTION
        1. img_size,            #int#   #this is the yolov7 model size, should be square so 640 for a square 640x640 model etc.
        2. path_yolov7_weights, #str#   #this is the path to your yolov7 weights
        3. path_img_i,          #str#   #path to a single .jpg image for inference (NOT REQUIRED, can load cv2matrix with self.load_cv2mat())

        OUTPUT:
        VARIABLES                       TYPE    DESCRIPTION
        1. predicted_bboxes_PascalVOC   #list#  #list of values for detections containing the following (name,x0,y0,x1,y1,score)

        CREDIT
        Please see https://github.com/WongKinYiu/yolov7.git for Yolov7 resources (i.e. utils/models)
        @article{wang2022yolov7,
            title={{YOLOv7}: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors},
            author={Wang, Chien-Yao and Bochkovskiy, Alexey and Liao, Hong-Yuan Mark},
            journal={arXiv preprint arXiv:2207.02696},
            year={2022}
        }
        '''
        super().__init__(img_size,path_yolov7_weights,path_img_i,device_i=device_i)
    def main(self):
        st.title('Custom YoloV7 Object Detector')
        st.subheader(""" Upload an image and run YoloV7 on it.
This model was trained to detect the following classes from a drone's vantage point.
Notice where the model fails.
(i.e. objects too close up & too far away):\n""")
        st.markdown(
            """
            <style>
            .reportview-container .markdown-text-container {
                font-family: monospace;
            }
            .sidebar .sidebar-content {
                background-image: linear-gradient(#2e7bcf,#2e7bcf);
                color: black;
            }
            .Widget>label {
                color: green;
                font-family: monospace;
            }
            [class^="st-b"] {
                color: green;
                font-family: monospace;
            }
            .st-bb {
                background-color: black;
            }
            .st-at {
                background-color: green;
            }
            footer {
                font-family: monospace;
            }
            .reportview-container .main footer, .reportview-container .main footer a {
                color: black;
            }
            header .decoration {
                background-image: none;
            }
            </style>
            """,
            unsafe_allow_html=True,
        )
        st.markdown(
            """
            <style>
            .reportview-container {
                background: url("https://raw.githubusercontent.com/stevensmiley1989/STREAMLIT_YOLOV7/main/misc/IMG_0512_reduce.JPG")
            }
            .sidebar .sidebar-content {
                background: url("https://raw.githubusercontent.com/stevensmiley1989/STREAMLIT_YOLOV7/main/misc/IMG_0512_reduce.JPG")
            }
            </style>
            """,
            unsafe_allow_html=True
        )
        text_i_list=[]
        for i,name_i in enumerate(self.names):
            #text_i_list.append(f'id={i} \t \t name={name_i}\n')
            text_i_list.append(f'{i}: {name_i}\n')
        st.selectbox('Classes',tuple(text_i_list))
        self.conf_selection=st.selectbox('Confidence Threshold',tuple([0.1,0.25,0.5,0.75,0.95]))

        self.response=requests.get(self.path_img_i)
        self.img_screen=Image.open(BytesIO(self.response.content))
        st.image(self.img_screen, caption=self.capt, width=None, use_column_width=None, clamp=False, channels="RGB", output_format="auto")
        st.markdown('YoloV7 on streamlit. Demo of object detection with YoloV7 with a web application.')
        self.im0=np.array(self.img_screen.convert('RGB'))
        self.load_image_st()
        predictions = st.button('Predict on the image?')
        if predictions:
            self.predict()
            predictions=False

    def load_image_st(self):
        uploaded_img=st.file_uploader(label='Upload an image')
        if type(uploaded_img) != type(None):
            self.img_data=uploaded_img.getvalue()
            st.image(self.img_data)
            self.im0=Image.open(BytesIO(self.img_data))#.convert('RGB')
            self.im0=np.array(self.im0)
            return self.im0
        elif type(self.im0) != type(None):
            return self.im0
        else:
            return None

    def predict(self):
        self.conf_thres=self.conf_selection
        st.write('Loading image')
        self.load_cv2mat(self.im0)
        st.write('Making inference')
        self.inference()

        self.img_screen=Image.fromarray(self.image).convert('RGB')

        self.capt='DETECTED:'
        if len(self.predicted_bboxes_PascalVOC)>0:
            for item in self.predicted_bboxes_PascalVOC:
                name=str(item[0])
                conf=str(round(100*item[-1],2))
                self.capt=self.capt+' name='+name+' confidence='+conf+'%, '
        st.image(self.img_screen, caption=self.capt, width=None, use_column_width=None, clamp=False, channels="RGB", output_format="auto")
        self.image=None

if __name__=='__main__':
    app=Streamlit_YOLOV7()

    #INPUTS for YOLOV7
    img_size=1056
    path_yolov7_weights="weights/best.pt"
    path_img_i="https://raw.githubusercontent.com/stevensmiley1989/STREAMLIT_YOLOV7/main/test_images/DJI_0028_fps24_frame00000040.jpg"

    #INPUTS for webapp
    app.capt="Initial Image"
    app.new_yolo_model(img_size,path_yolov7_weights,path_img_i)
    app.conf_thres=0.65
    app.load_model() #Load the yolov7 model

    app.main()

…DEPLOY

With requirements.txt, streamlit_yolov7.py, and the supporting files pushed to the dedicated GitHub repository, you can test locally with streamlit run streamlit_yolov7.py and then point Streamlit Cloud at the repo so it builds and serves the app. Check out the webpage we just made here.

Thank you for reading!

Feel free to contact me to discuss any issues, questions, or comments.

References

  1. Wang, C.-Y., Bochkovskiy, A., & Liao, H.-Y. M. (2022). YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. arXiv. https://doi.org/10.48550/ARXIV.2207.02696
  2. Full-Loop-YOLO, “GUI for training/inference with YoloV4 & Yolov7.” https://github.com/stevensmiley1989/Full_Loop_YOLO.
  3. Streamlit. https://streamlit.io/
  4. Zhu, P., Wen, L., Du, D., et al. (2021). Detection and Tracking Meet Drones Challenge. IEEE Transactions on Pattern Analysis & Machine Intelligence.


Steven Smiley

Lead Machine Learning Engineer who also enjoys writing about Data Science, CV, DL, ML, AI, Python https://www.linkedin.com/in/stevensmiley1989/