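Because the image host has expired, the images in this article no longer display. For the illustrated version, visit the Gitee/GitHub pages below. The source code and deployment tutorial can be obtained at the end of this article, or through the email address given at the end of the Gitee/GitHub pages.

Gitee (recommended for access from within China): https://gitee.com/qunmasj/projects

GitHub (recommended for access from outside China): https://github.com/qunshansj?tab=repositories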

# Frog Activity Monitoring and Early-Warning System Based on RCS-OSA-Improved YOLOv7

1. Research Background and Significance

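In recent years, with the rapid development of artificial intelligence, object detection and recognition have been widely applied across many fields. Among deep-learning-based detectors, YOLO (You Only Look Once) has attracted particular attention for its real-time speed and detection accuracy. However, YOLO has limitations with small and densely packed targets, which degrades its performance in some specific scenarios.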

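A frog activity monitoring and early-warning system monitors frog activity and raises warnings about it. Frogs are an important ecological indicator species, so monitoring their activity level matters for assessing and protecting the ecological environment. Traditional monitoring relies mainly on human observation and manual recording, which is labor-intensive, inefficient, and error-prone. Developing a frog activity monitoring and early-warning system based on computer vision therefore has real practical value.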

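Existing frog activity monitoring systems, however, have several problems. First, traditional object detection algorithms perform poorly on small targets such as frogs and are prone to missed and false detections. Second, activity monitoring must run in real time, which traditional algorithms often cannot achieve. Third, activity monitoring requires analyzing and recognizing frog behavior, whereas traditional detectors output only target positions and no finer-grained behavioral information.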

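To address these problems, this study proposes a frog activity monitoring and early-warning system based on YOLOv7 improved with RCS-OSA. The system combines an object detection algorithm with a behavior recognition algorithm, monitoring and analyzing frog activity in real time to provide more accurate and detailed activity information. Specifically, the improved YOLOv7 serves as the detector, and the RCS-OSA module (RCS-based One-Shot Aggregation, where RCS stands for Reparameterized Convolution based on channel Shuffle) is introduced to improve detection of small and dense targets. A behavior recognition algorithm then analyzes and recognizes frog behavior to provide more detailed activity information.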

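The significance of this study lies in the following aspects. First, improving YOLOv7 strengthens the system's detection of small and dense targets, which raises its accuracy and reliability. Second, adding behavior recognition yields more detailed activity information, giving more complete data support for ecological assessment and protection. Third, the study offers an improved approach that other, similar ecological monitoring systems can adopt, so it has some value for wider use.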

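In short, the frog activity monitoring and early-warning system based on RCS-OSA-improved YOLOv7 has both practical and research value. By improving the detector and adding behavior recognition, the system provides more accurate and detailed activity information and offers an effective tool for monitoring frog activity and assessing the ecological environment.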

2. Image Demo

[Image not available]

3. Video Demo

Frog Activity Monitoring and Early-Warning System Based on RCS-OSA-Improved YOLOv7 (Bilibili video)

4. DeepSORT Frog Tracking

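The processing flow of the DeepSORT algorithm is shown in the figure. The algorithm consists of the following four steps:
(1) Preprocess the input video, extract image features with the YOLOv5s [2] network to obtain candidate boxes, then remove overlapping boxes with non-maximum suppression (NMS) to obtain the final detection boxes and their features.
(2) Use a recursive Kalman filter to predict each target's position and motion state in the next frame, compare the prediction with the detection boxes from the detector, and take the box with the higher confidence as the result.
(3) Perform data association with a linearly weighted combination of the Mahalanobis distance on motion information and the cosine similarity of appearance features, and match tracks through cascade assignment.
(4) Output the results, update the tracker parameters, and start detection again.
DeepSORT uses a residual convolutional neural network to extract the appearance features of each target. The input image is scaled to 64×128 pixels, which does not match the aspect ratio of frogs. To adapt the model to frog feature extraction, the network input size is changed to 256×128 pixels. For efficiency and accuracy, the detection part is also changed: the detector in DeepSORT is switched from Faster R-CNN to YOLOv5s and wrapped as a module, which is called to obtain predictions before tracking prediction, matching, and update.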
[Figure: DeepSORT processing flow]
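The data association in step (3) can be sketched in a few lines. This is a minimal, dependency-free illustration and not the project's code: the function names, the weight `lam`, the toy numbers, and the diagonal-covariance simplification of the Mahalanobis distance are all assumptions made for the example.

```python
import math


def mahalanobis_sq_diag(pred_mean, pred_var, measurement):
    """Squared Mahalanobis distance, assuming a diagonal covariance (a simplification)."""
    return sum((z - m) ** 2 / v for z, m, v in zip(measurement, pred_mean, pred_var))


def cosine_distance(a, b):
    """1 - cosine similarity between two appearance feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def association_cost(d_motion, d_appearance, lam=0.5):
    """Linearly weighted combination of the motion and appearance terms."""
    return lam * d_motion + (1.0 - lam) * d_appearance


# Hypothetical track state predicted by the Kalman filter (position only)
track_mean = [100.0, 50.0]
track_var = [4.0, 4.0]
track_feat = [1.0, 0.0]

# Hypothetical detections: (measured position, appearance feature)
detections = [
    ([102.0, 50.0], [1.0, 0.0]),   # close to the track and similar in appearance
    ([140.0, 90.0], [0.0, 1.0]),   # far away and dissimilar
]

costs = [
    association_cost(
        mahalanobis_sq_diag(track_mean, track_var, pos),
        cosine_distance(track_feat, feat),
    )
    for pos, feat in detections
]
best = min(range(len(costs)), key=costs.__getitem__)
print(costs, best)
```

The detection with the lowest combined cost is the preferred match for the track. A full tracker would also gate implausible matches and solve the assignment jointly over all tracks and detections.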

5. Core Code Walkthrough

5.1 deep_sort_tracking_id.py

The encapsulated class is shown below:



import time
from pathlib import Path

import torch
import torch.backends.cudnn as cudnn

# The imports below follow the YOLOv7 and deep_sort_pytorch repository layouts
from models.experimental import attempt_load
from utils.datasets import LoadStreams, LoadImages
from utils.general import check_img_size, check_imshow, set_logging, increment_path
from utils.torch_utils import select_device, load_classifier, TracedModel
from deep_sort_pytorch.utils.parser import get_config
from deep_sort_pytorch.deep_sort import DeepSort


class ObjectDetection:
    def __init__(self, opt):
        self.opt = opt
        self.names = opt.names
        self.source = opt.source
        self.weights = opt.weights
        self.view_img = opt.view_img
        self.save_txt = opt.save_txt
        self.imgsz = opt.img_size
        self.trace = not opt.no_trace
        self.save_img = not opt.nosave and not self.source.endswith('.txt')
        self.webcam = self.source.isnumeric() or self.source.endswith('.txt') or self.source.lower().startswith(
            ('rtsp://', 'rtmp://', 'http://', 'https://'))

        # Directories
        self.save_dir = Path(increment_path(Path(opt.project) / opt.name, exist_ok=opt.exist_ok))  # increment run
        (self.save_dir / 'labels' if self.save_txt else self.save_dir).mkdir(parents=True, exist_ok=True)  # make dir
        # initialize deepsort
        cfg_deep = get_config()
        cfg_deep.merge_from_file("deep_sort_pytorch/configs/deep_sort.yaml")
        self.deepsort = DeepSort(cfg_deep.DEEPSORT.REID_CKPT,
                            max_dist=cfg_deep.DEEPSORT.MAX_DIST, min_confidence=cfg_deep.DEEPSORT.MIN_CONFIDENCE,
                            nms_max_overlap=cfg_deep.DEEPSORT.NMS_MAX_OVERLAP, max_iou_distance=cfg_deep.DEEPSORT.MAX_IOU_DISTANCE,
                            max_age=cfg_deep.DEEPSORT.MAX_AGE, n_init=cfg_deep.DEEPSORT.N_INIT, nn_budget=cfg_deep.DEEPSORT.NN_BUDGET,
                            use_cuda=True)

        self.point_list = []
        # Initialize
        set_logging()
        self.device = select_device(opt.device)
        self.half = self.device.type != 'cpu'  # half precision only supported on CUDA

        # Load model
        self.model = attempt_load(self.weights, map_location=self.device)  # load FP32 model
        self.stride = int(self.model.stride.max())  # model stride
        self.imgsz = check_img_size(self.imgsz, s=self.stride)  # check img_size

        if self.trace:
            self.model = TracedModel(self.model, self.device, opt.img_size)

        if self.half:
            self.model.half()  # to FP16

        # Second-stage classifier
        self.classify = False
        if self.classify:
            self.modelc = load_classifier(name='resnet101', n=2)  # initialize
            # load_state_dict() does not return the module, so move/eval in a separate call
            self.modelc.load_state_dict(torch.load('weights/resnet101.pt', map_location=self.device)['model'])
            self.modelc.to(self.device).eval()

        # Set Dataloader
        self.vid_path, self.vid_writer = None, None
        if self.webcam:
            self.view_img = check_imshow()
            cudnn.benchmark = True  # set True to speed up constant image size inference
            self.dataset = LoadStreams(self.source, img_size=self.imgsz, stride=self.stride)
        else:
            self.dataset = LoadImages(self.source, img_size=self.imgsz, stride=self.stride)

        # Get names and colors
        # load_classes is a helper assumed to be defined elsewhere in this file
        self.names = load_classes(self.names)
        #colors = [[random.randint(0, 255) for _ in range(3)] for _ in names]

        # Run inference
        if self.device.type != 'cpu':
            self.model(torch.zeros(1, 3, self.imgsz, self.imgsz).to(self.device).type_as(next(self.model.parameters())))  # run once
        self.old_img_w = self.old_img_h = self.imgsz
        self.old_img_b = 1

        self.t0 = time.time()

    def detect(self):
        for path, img, im0s, vid_cap in self.dataset:
            img = torch.from_numpy(img).to(self.device)
            img = img.half() if self.half else img.float()  # uint8 to fp16/32
            img /= 255.0  # 0 - 255 to 0.0 - 1.0
            # ...

        return img

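This class encapsulates the object detection functionality, including model initialization, model loading, and running inference. Further features or methods can be added as needed.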

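This program performs object tracking with deep learning models. It first imports the required libraries and modules and defines some helper functions and global variables. It then loads the detection model and the DeepSort model and initializes their parameters. Next it runs the main detection-and-tracking loop, processing input images and outputting tracking results. Finally, it saves the tracking results and, optionally, visualization images. The overall flow is: load the models and configuration files, initialize DeepSort, set up the input source, run the detection and tracking loop, and save the results and visualizations.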
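The excerpt above keeps a `point_list` but does not show how tracked positions become an activity measure. One plausible way to do it is sketched below, under loudly stated assumptions: activity is taken to be the summed frame-to-frame displacement of each track's box center, and the names `box_center` and `accumulate_activity` as well as the toy frames are hypothetical, not taken from the project.

```python
import math
from collections import defaultdict


def box_center(x1, y1, x2, y2):
    """Center point of an (x1, y1, x2, y2) bounding box."""
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def accumulate_activity(frames):
    """Sum the center displacement of every track over a sequence of frames.

    frames: iterable of {track_id: (x1, y1, x2, y2)} dicts, one per video frame.
    Returns {track_id: total displacement in pixels}.
    """
    last_center = {}
    activity = defaultdict(float)
    for boxes in frames:
        for track_id, box in boxes.items():
            center = box_center(*box)
            if track_id in last_center:
                activity[track_id] += math.dist(center, last_center[track_id])
            last_center[track_id] = center
    return dict(activity)


# Hypothetical tracker output for three consecutive frames
frames = [
    {1: (0, 0, 10, 10)},
    {1: (3, 4, 13, 14), 2: (50, 50, 60, 60)},
    {1: (3, 4, 13, 14), 2: (50, 56, 60, 66)},
]
activity = accumulate_activity(frames)
print(activity)
```

A warning rule could then compare each track's accumulated value against a threshold chosen for the deployment; the source does not specify one.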

5.2 detect.py



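import random
import time
from pathlib import Path

import torch
import torch.backends.cudnn as cudnn

# The imports below follow the YOLOv7 repository layout
from models.experimental import attempt_load
from utils.datasets import LoadStreams, LoadImages
from utils.general import (check_img_size, check_imshow, non_max_suppression, apply_classifier,
                           scale_coords, xyxy2xywh, set_logging, increment_path)
from utils.plots import plot_one_box
from utils.torch_utils import select_device, load_classifier, time_synchronized, TracedModel


class ObjectDetector: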
    def __init__(self, weights='yolov7.pt', source='inference/images', img_size=640, conf_thres=0.25, iou_thres=0.45,
                 device='', view_img=False, save_txt=False, save_conf=False, nosave=False, classes=None,
                 agnostic_nms=False, augment=False, update=False, project='runs/detect', name='exp',
                 exist_ok=False, no_trace=False):
        self.weights = weights
        self.source = source
        self.img_size = img_size
        self.conf_thres = conf_thres
        self.iou_thres = iou_thres
        self.device = device
        self.view_img = view_img
        self.save_txt = save_txt
        self.save_conf = save_conf
        self.nosave = nosave
        self.classes = classes
        self.agnostic_nms = agnostic_nms
        self.augment = augment
        self.update = update
        self.project = project
        self.name = name
        self.exist_ok = exist_ok
        self.no_trace = no_trace

    def detect(self):
        source, weights, view_img, save_txt, imgsz, trace = self.source, self.weights, self.view_img, self.save_txt, self.img_size, not self.no_trace
        save_img = not self.nosave and not source.endswith('.txt')  # save inference images
        webcam = source.isnumeric() or source.endswith('.txt') or source.lower().startswith(
            ('rtsp://', 'rtmp://', 'http://', 'https://'))

        # Directories
        save_dir = Path(increment_path(Path(self.project) / self.name, exist_ok=self.exist_ok))  # increment run
        (save_dir / 'labels' if save_txt else save_dir).mkdir(parents=True, exist_ok=True)  # make dir

        # Initialize
        set_logging()
        device = select_device(self.device)
        half = device.type != 'cpu'  # half precision only supported on CUDA

        # Load model
        model = attempt_load(weights, map_location=device)  # load FP32 model
        stride = int(model.stride.max())  # model stride
        imgsz = check_img_size(imgsz, s=stride)  # check img_size

        if trace:
            model = TracedModel(model, device, self.img_size)

        if half:
            model.half()  # to FP16

        # Second-stage classifier
        classify = False
        if classify:
            modelc = load_classifier(name='resnet101', n=2)  # initialize
            # load_state_dict() does not return the module, so move/eval in a separate call
            modelc.load_state_dict(torch.load('weights/resnet101.pt', map_location=device)['model'])
            modelc.to(device).eval()

        # Set Dataloader
        vid_path, vid_writer = None, None
        if webcam:
            view_img = check_imshow()
            cudnn.benchmark = True  # set True to speed up constant image size inference
            dataset = LoadStreams(source, img_size=imgsz, stride=stride)
        else:
            dataset = LoadImages(source, img_size=imgsz, stride=stride)

        # Get names and colors
        names = model.module.names if hasattr(model, 'module') else model.names
        colors = [[random.randint(0, 255) for _ in range(3)] for _ in names]

        # Run inference
        if device.type != 'cpu':
            model(torch.zeros(1, 3, imgsz, imgsz).to(device).type_as(next(model.parameters())))  # run once
        old_img_w = old_img_h = imgsz
        old_img_b = 1

        t0 = time.time()
        for path, img, im0s, vid_cap in dataset:
            img = torch.from_numpy(img).to(device)
            img = img.half() if half else img.float()  # uint8 to fp16/32
            img /= 255.0  # 0 - 255 to 0.0 - 1.0
            if img.ndimension() == 3:
                img = img.unsqueeze(0)

            # Warmup
            if device.type != 'cpu' and (old_img_b != img.shape[0] or old_img_h != img.shape[2] or old_img_w != img.shape[3]):
                old_img_b = img.shape[0]
                old_img_h = img.shape[2]
                old_img_w = img.shape[3]
                for i in range(3):
                    model(img, augment=self.augment)[0]

            # Inference
            t1 = time_synchronized()
            with torch.no_grad():   # Calculating gradients would cause a GPU memory leak
                pred = model(img, augment=self.augment)[0]
            t2 = time_synchronized()

            # Apply NMS
            pred = non_max_suppression(pred, self.conf_thres, self.iou_thres, classes=self.classes, agnostic=self.agnostic_nms)
            t3 = time_synchronized()

            # Apply Classifier
            if classify:
                pred = apply_classifier(pred, modelc, img, im0s)

            # Process detections
            for i, det in enumerate(pred):  # detections per image
                if webcam:  # batch_size >= 1
                    p, s, im0, frame = path[i], '%g: ' % i, im0s[i].copy(), dataset.count
                else:
                    p, s, im0, frame = path, '', im0s, getattr(dataset, 'frame', 0)

                p = Path(p)  # to Path
                save_path = str(save_dir / p.name)  # img.jpg
                txt_path = str(save_dir / 'labels' / p.stem) + ('' if dataset.mode == 'image' else f'_{frame}')  # img.txt
                gn = torch.tensor(im0.shape)[[1, 0, 1, 0]]  # normalization gain whwh
                if len(det):
                    # Rescale boxes from img_size to im0 size
                    det[:, :4] = scale_coords(img.shape[2:], det[:, :4], im0.shape).round()

                    # Print results
                    for c in det[:, -1].unique():
                        n = (det[:, -1] == c).sum()  # detections per class
                        s += f"{n} {names[int(c)]}{'s' * (n > 1)}, "  # add to string

                    # Write results
                    for *xyxy, conf, cls in reversed(det):
                        if save_txt:  # Write to file
                            xywh = (xyxy2xywh(torch.tensor(xyxy).view(1, 4)) / gn).view(-1).tolist()  # normalized xywh
                            line = (cls, *xywh, conf) if self.save_conf else (cls, *xywh)  # label format
                            with open(txt_path + '.txt', 'a') as f:
                                f.write(('%g ' * len(line)).rstrip() % line + '\n')

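                        if save_img or view_img:  # Add bbox to image
                            label = f'{names[int(cls)]} {conf:.2f}'
                            plot_one_box(xyxy, im0, label=label, color=colors[int(cls)], line_thickness=1)
                        # ... (the remainder of the loop is truncated in the source)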

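This file, detect.py, is an object detection program. It uses PyTorch and OpenCV to load a trained model, run detection on input images or video, and save or display the results.

The main flow is as follows:

(1) Import the required libraries and modules.
(2) Define a detect() function that performs object detection.
(3) Parse the command-line arguments, including the model weights path, input source, image size, confidence threshold, and IoU threshold.
(4) Initialize variables and settings, including logging, device selection, model loading, and data loading.
(5) Run the main detection loop over every frame of the input source.
(6) Preprocess each frame, including image conversion and resizing.
(7) Run the model on the frame to obtain predictions.
(8) Post-process the predictions with non-maximum suppression and optional classification.
(9) Draw bounding boxes and labels from the detections, then save or display the result.
(10) After detection finishes, output summary statistics of the results.

The program also includes other functions, such as model update and result saving.

The above is an overview of this program file.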
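The non-maximum suppression step in the flow above can be illustrated without any dependencies. This is a simplified sketch of the idea, not the `non_max_suppression` function from the YOLOv7 utilities: the helper names and toy boxes are assumptions, and the default thresholds simply reuse the `conf_thres=0.25` and `iou_thres=0.45` defaults seen in the constructor.

```python
def iou(a, b):
    """Intersection over union of two boxes in (x1, y1, x2, y2) format."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def nms(boxes, scores, conf_thres=0.25, iou_thres=0.45):
    """Return the indices of the kept boxes, highest score first."""
    # Drop low-confidence boxes, then visit the rest from best to worst score
    order = sorted(
        (i for i, s in enumerate(scores) if s >= conf_thres),
        key=lambda i: scores[i],
        reverse=True,
    )
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box
        if all(iou(boxes[i], boxes[j]) <= iou_thres for j in keep):
            keep.append(i)
    return keep


# Hypothetical detections: two overlapping boxes, one separate box, one weak box
boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 140, 140), (0, 0, 5, 5)]
scores = [0.9, 0.8, 0.7, 0.1]
kept = nms(boxes, scores)
print(kept)
```

The second box overlaps the first with an IoU of about 0.82 and is suppressed, while the last box is removed by the confidence threshold before overlap is even checked.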

5.3 RCS.py

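import torch
import torch.nn as nn
import torch.nn.functional as F


def channel_shuffle(x, groups=2):
    # Deterministic ShuffleNet-style shuffle: interleave channels across groups.
    # A random permutation on every forward pass would change the channel
    # order from call to call, including at inference time.
    b, c, h, w = x.shape
    x = x.reshape(b, groups, c // groups, h, w).transpose(1, 2)
    return x.reshape(b, c, h, w)


class RCS(nn.Module):
    def __init__(self, in_channels, out_channels, stride):
        super(RCS, self).__init__()

        assert in_channels % 2 == 0, "Input channels must be divisible by 2"
        half_channels = in_channels // 2
        # The groups=2 convolutions need an even number of channels in each half
        assert half_channels % 2 == 0, "Half of the input channels must be divisible by 2"
        # The identity branch passes its half through unchanged, so the block
        # cannot change the channel count or the spatial size
        assert out_channels == in_channels and stride == 1, "RCS keeps channels and resolution"

        self.conv1x1 = nn.Conv2d(half_channels, half_channels, kernel_size=1, stride=stride, padding=0, groups=2)
        self.bn1 = nn.BatchNorm2d(half_channels)

        self.conv3x3 = nn.Conv2d(half_channels, half_channels, kernel_size=3, stride=stride, padding=1, groups=2)
        self.bn2 = nn.BatchNorm2d(half_channels)

    def forward(self, x):
        # Split the channels into two equal halves
        c = x.shape[1] // 2
        x1 = x[:, :c, :, :]
        x2 = x[:, c:, :, :]

        out1 = x1  # identity branch

        out2 = F.relu(self.bn1(self.conv1x1(x2)))  # 1x1 branch

        out3 = F.relu(self.bn2(self.conv3x3(x2)))  # 3x3 branch

        out = torch.cat([out1, out2 + out3], dim=1)

        out = channel_shuffle(out, groups=2)

        return out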

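This file defines a neural network module named RCS. It is a class that inherits from nn.Module and implements an RCS block (Reparameterized Convolution based on channel Shuffle).

The constructor takes the number of input channels, the number of output channels, and the stride as parameters. It first checks that the number of input channels is divisible by 2 and then splits the input channels into two equal halves.

The module has three branches: an identity branch, a 1x1 convolution branch, and a 3x3 convolution branch. The identity branch has no parameters, while the 1x1 and 3x3 branches each consist of a convolution layer and a batch normalization layer.

In the forward pass, the input tensor is split into two parts. The identity branch outputs the first part directly. The 1x1 branch applies its convolution and batch normalization to the second part, followed by ReLU. The 3x3 branch does the same with its 3x3 convolution. The outputs of the 1x1 and 3x3 branches are added together and concatenated with the identity branch along the channel dimension. Finally, the channels of the output tensor are shuffled.

The module splits and recombines the channels of the input tensor to increase the model's representational and learning capacity.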
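To make the final shuffle step concrete, the sketch below computes which source channel ends up at each output position under a ShuffleNet-style channel shuffle (reshape to groups, transpose, flatten). It assumes this deterministic shuffle is what the block intends; the function name `channel_shuffle_order` is introduced here for illustration only.

```python
def channel_shuffle_order(num_channels, groups=2):
    """Source channel index for each output position after a channel shuffle.

    Equivalent to reshaping the channel axis to (groups, n), transposing to
    (n, groups), and flattening again.
    """
    assert num_channels % groups == 0, "channels must be divisible by groups"
    per_group = num_channels // groups
    return [g * per_group + i for i in range(per_group) for g in range(groups)]


# With 8 channels, 0-3 come from the identity branch and 4-7 from the conv branches
order = channel_shuffle_order(8, groups=2)
print(order)  # [0, 4, 1, 5, 2, 6, 3, 7]
```

After the shuffle, identity-branch and convolution-branch channels alternate, so the next block's split sends a mix of both to each of its halves.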

5.4 RCS_OSA.py

import torch
import torch.nn as nn
import torch.nn.functional as F

class RCS(nn.Module):
    def __init__(self, in_channels, out_channels, stride=1):
        super(RCS, self).__init__()
        self.conv1 = nn.Conv2d(in_channels, out_channels, kernel_size=3, stride=stride, padding=1)
        self.conv2 = nn.Conv2d(out_channels, out_channels, kernel_size=3, stride=1, padding=1)
        self.conv3 = nn.Conv2d(out_channels, out_channels, kernel_size=3, stride=1, padding=1)
        # The residual sum needs matching shapes, so project the input with a
        # 1x1 convolution whenever the channel count or the resolution changes
        if stride != 1 or in_channels != out_channels:
            self.shortcut = nn.Conv2d(in_channels, out_channels, kernel_size=1, stride=stride, padding=0)
        else:
            self.shortcut = nn.Identity()

    def forward(self, x):
        out = F.relu(self.conv1(x))
        out = F.relu(self.conv2(out))
        out = self.conv3(out)
        out = F.relu(self.shortcut(x) + out)
        return out

class RCS_OSA(nn.Module):
    def __init__(self, in_channels, out_channels, num_modules, stride=1):
        super(RCS_OSA, self).__init__()

        assert out_channels % num_modules == 0, "Output channels must be divisible by the number of RCS modules"
        rcs_out_channels = out_channels // num_modules

        self.rcs_modules = nn.ModuleList()
        for i in range(num_modules):
            # Only the first module may downsample; otherwise the outputs would
            # have different resolutions and could not be concatenated
            self.rcs_modules.append(RCS(in_channels, rcs_out_channels, stride if i == 0 else 1))
            in_channels = rcs_out_channels  # Output of one RCS module is the input to the next

        self.conv = nn.Conv2d(num_modules * rcs_out_channels, out_channels, kernel_size=1, stride=1, padding=0)

    def forward(self, x):
        outputs = []
        for rcs_module in self.rcs_modules:
            x = rcs_module(x)
            outputs.append(x)

        # Feature aggregation
        agg = torch.cat(outputs, dim=1)
        out = self.conv(agg)

        return out

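This file is a PyTorch module that implements the RCS_OSA model. RCS_OSA is built from several RCS modules and is used for image processing tasks. The model takes in_channels input channels, produces out_channels output channels, and contains num_modules modules. Each RCS module outputs out_channels divided by num_modules channels.

In the constructor, a loop creates num_modules RCS modules and adds them to the rcs_modules list. The first RCS module takes in_channels input channels, and every module outputs rcs_out_channels channels. The output of each RCS module is the input of the next one.

In the forward pass, a loop over rcs_modules passes x through each RCS module in turn and stores each module's output in the outputs list. All outputs are then concatenated along the channel dimension to form the aggregated feature agg. Finally, agg goes through the 1x1 convolution layer conv to produce the final output out.

Overall, the model extracts and aggregates features from the input image through several RCS modules to produce its output.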
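The channel bookkeeping in the RCS_OSA constructor can be traced without building any layers. The helper below is a hypothetical illustration (the name `rcs_osa_channel_plan` and the example sizes are assumptions); it mirrors the arithmetic described above and returns the (input, output) channels of each stacked module plus the width of the concatenated tensor that enters the 1x1 convolution.

```python
def rcs_osa_channel_plan(in_channels, out_channels, num_modules):
    """Return ([(in, out) per module], channels after concatenation)."""
    if out_channels % num_modules != 0:
        raise ValueError("out_channels must be divisible by num_modules")
    per_module = out_channels // num_modules

    plan = []
    current_in = in_channels
    for _ in range(num_modules):
        plan.append((current_in, per_module))
        current_in = per_module  # each module feeds the next one

    concat_channels = num_modules * per_module
    return plan, concat_channels


plan, concat_channels = rcs_osa_channel_plan(64, 96, 3)
print(plan, concat_channels)  # [(64, 32), (32, 32), (32, 32)] 96
```

Only the first module sees the original input width; the concatenated width always equals out_channels, so the final 1x1 convolution mixes channels without changing their number.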

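6. Overall System Structure

Overview of the overall functionality and architecture:

This project is a frog activity monitoring and early-warning system based on RCS-OSA-improved YOLOv7. It uses the YOLOv7 deep learning model for object detection and combines it with the DeepSort algorithm for object tracking. The system also includes the RCS and RCS_OSA modules for image processing tasks. In addition, the project provides auxiliary functions such as model export, model evaluation, and model training.

The table below summarizes the function of each file: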

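| File path | Function |
| --- | --- |
| `deep_sort_tracking_id.py` | Object tracking program using the DeepSort algorithm |
| `detect.py` | Object detection program |
| `export.py` | Model export script |
| `hubconf.py` | PyTorch Hub model definitions |
| `RCS.py` | Neural network implementation of the RCS module |
| `RCS_OSA.py` | Neural network implementation of the RCS_OSA module |
| `RCS_YOLO.py` | Neural network implementation of the RCS_YOLO module |
| `test.py` | Model testing script |
| `train.py` | Model training script |
| `train_aux.py` | Auxiliary model training script |
| `ui.py` | User interface script |
| `deep_sort_pytorch\deep_sort\deep_sort.py` | Implementation of the DeepSort algorithm |
| `deep_sort_pytorch\deep_sort\__init__.py` | Initialization file for the DeepSort algorithm |
| `deep_sort_pytorch\deep_sort\deep\evaluate.py` | Model evaluation script |
| `deep_sort_pytorch\deep_sort\deep\feature_extractor.py` | Implementation of the feature extractor |
| `deep_sort_pytorch\deep_sort\deep\model.py` | Model definition |
| `deep_sort_pytorch\deep_sort\deep\original_model.py` | Original model definition |
| `deep_sort_pytorch\deep_sort\deep\test.py` | Model testing script |
| `deep_sort_pytorch\deep_sort\deep\train.py` | Model training script |
| `deep_sort_pytorch\deep_sort\deep\__init__.py` | Initialization file for the deep learning model |
| `deep_sort_pytorch\deep_sort\sort\detection.py` | Implementation of object detections |
| `deep_sort_pytorch\deep_sort\sort\iou_matching.py` | Implementation of IoU matching |
| `deep_sort_pytorch\deep_sort\sort\kalman_filter.py` | Implementation of the Kalman filter |
| `deep_sort_pytorch\deep_sort\sort\linear_assignment.py` | Implementation of linear assignment |
| `deep_sort_pytorch\deep_sort\sort\nn_matching.py` | Implementation of nearest-neighbor matching |
| `deep_sort_pytorch\deep_sort\sort\preprocessing.py` | Implementation of data preprocessing |
| `deep_sort_pytorch\deep_sort\sort\track.py` | Implementation of a single track |
| `deep_sort_pytorch\deep_sort\sort\tracker.py` | Implementation of the multi-target tracker |
| `deep_sort_pytorch\deep_sort\sort\__init__.py` | Initialization file for the tracking algorithm |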

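7. RCS-based One-Shot Aggregation Module

The RCS-YOLO paper proposes the RCS-OSA module, which incorporates RCS into OSA (One-Shot Aggregation), as shown in the figure. RCS modules are stacked repeatedly to ensure feature reuse and to strengthen the information flow between different channels of adjacent-layer features. The authors use different numbers of stacked modules at different positions in the network.
[Figure: structure of the RCS-OSA module]

To reduce network fragmentation, only three feature cascades are kept on the One-Shot Aggregation path, which lightens the computational burden and lowers memory usage. For multi-scale feature fusion, inspired by PANet, RCS-OSA with upsampling and RCS-OSA with RepVGG/RepConv downsampling align feature maps of different sizes so that the two prediction feature layers can exchange information. This lets the detector achieve fast inference with high accuracy.
In addition, RCS-OSA keeps the same number of input channels and the minimum number of output channels, which lowers the memory access cost (MAC). For network construction, the authors use YOLOv7 with 32× max-pooling downsampling as the backbone and downsample with RepVGG/RepConv at a stride of 2. Because the RCS-OSA module offers diverse feature representations at low memory cost, the authors use different numbers of stacked RCS blocks inside RCS-OSA to extract semantic information at different stages of the backbone and neck.
A common metric for computational efficiency (or time complexity) is the number of floating-point operations (FLOPs). However, FLOPs are only an indirect measure of inference speed. Object detectors with a DenseNet backbone, for example, are slower and less energy-efficient, because the channel count grows linearly through dense connections and leads to heavy MAC and considerable computational overhead. Consider an input feature map of size M×M and a convolution kernel of size K×K.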
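The relation between FLOPs and MAC can be checked numerically. The sketch below uses the standard expressions for a single convolution layer, FLOPs = M²·K²·C_in·C_out and MAC = M²·(C_in + C_out) + K²·C_in·C_out; the channel symbols C_in and C_out and the example sizes are assumptions introduced here, since the source names only M and K.

```python
def conv_flops(m, k, c_in, c_out):
    """Multiply-accumulate count of a K x K convolution on an M x M feature map.

    Assumes stride 1, an M x M output, and no bias term.
    """
    return m * m * k * k * c_in * c_out


def conv_mac(m, k, c_in, c_out):
    """Memory access cost: input feature map + output feature map + kernel weights."""
    return m * m * (c_in + c_out) + k * k * c_in * c_out


# Two 1x1 layers with identical FLOPs but different channel balance
flops_balanced = conv_flops(56, 1, 128, 128)
flops_unbalanced = conv_flops(56, 1, 32, 512)
mac_balanced = conv_mac(56, 1, 128, 128)
mac_unbalanced = conv_mac(56, 1, 32, 512)

print(flops_balanced == flops_unbalanced)  # True
print(mac_balanced, mac_unbalanced)        # 819200 1722368
```

Both layers perform the same number of operations, yet the layer with equal input and output widths touches less than half as much memory. This is why FLOPs alone are an indirect speed measure.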

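8. Improved RCS-OSA YOLOv7 Network Structure

[Figure: network structure of the improved RCS-OSA YOLOv7]

To further reduce inference time, the RCS-YOLO design reduces the number of detection heads in Detect from three to two. YOLOv5, YOLOv6, YOLOv7, and YOLOv8 all have three detection heads. Here, only two feature layers are used for prediction: the original nine anchors at different scales are reduced to four, and the anchors are regenerated with unsupervised K-means clustering. The resulting anchor sizes are (87,90), (127,139), (154,171), and (191,240). This reduces not only the number of convolutional layers and the computational complexity of RCS-YOLO, but also the network's overall computation at inference and the time spent on non-maximum suppression in post-processing.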
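Anchor regeneration with K-means can be sketched as follows. The source says only that K-means is used, so the 1 - IoU distance (a common choice for YOLO anchor clustering), the deterministic initialization, and the function names are assumptions. The toy box sizes are made up so that the clusters land on the anchor sizes quoted above; they are not the project's dataset.

```python
def wh_iou(a, b):
    """IoU of two boxes given only (width, height), aligned at a shared corner."""
    inter = min(a[0], b[0]) * min(a[1], b[1])
    return inter / (a[0] * a[1] + b[0] * b[1] - inter)


def kmeans_anchors(boxes, k, iterations=50):
    """Cluster (w, h) pairs with distance 1 - IoU; return k anchors sorted by area."""
    boxes = sorted(boxes, key=lambda wh: wh[0] * wh[1])
    # Deterministic initialization: evenly spaced picks from the area-sorted list
    centers = [boxes[(2 * i + 1) * len(boxes) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for box in boxes:
            # Highest IoU is the same as smallest 1 - IoU distance
            best = max(range(k), key=lambda i: wh_iou(box, centers[i]))
            clusters[best].append(box)
        new_centers = [
            (sum(w for w, _ in c) / len(c), sum(h for _, h in c) / len(c)) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
        if new_centers == centers:
            break  # assignments are stable
        centers = new_centers
    return sorted(centers, key=lambda wh: wh[0] * wh[1])


# Made-up labelled box sizes (width, height) in pixels
toy_boxes = [
    (85, 88), (89, 92),
    (125, 137), (129, 141),
    (152, 169), (156, 173),
    (189, 238), (193, 242),
]
anchors = kmeans_anchors(toy_boxes, k=4)
print(anchors)
```

On a real dataset the input would be the widths and heights of all labelled boxes after scaling to the network input size, and the clustering would usually be run from several initializations.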