Reverse Image Search with the VGG16 Neural Network

Approach

· Prepare an image library in advance and batch-process it: extract a 512-dimensional convolutional feature for each image with the VGG16 network and write the records into the database (ClickHouse);

· To search, extract the target image's features with the same VGG16 pipeline, match them against the stored vectors with ClickHouse's distance functions, and return every image whose similarity exceeds a threshold, as sketched below
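A minimal sketch of the matching step, assuming a hypothetical image_features table and the clickhouse-driver client (table and column names are illustrative; ClickHouse's cosineDistance operates directly on the normalized 512-d vectors produced by the code below):

# assumed schema:
#   CREATE TABLE image_features (path String, feat Array(Float32))
#   ENGINE = MergeTree ORDER BY path
from clickhouse_driver import Client

client = Client(host='localhost')

def search(query_feat, top_k=10, threshold=0.75):
    # cosine similarity = 1 - cosineDistance; ClickHouse allows a
    # SELECT alias to be reused in WHERE
    return client.execute(
        "SELECT path, 1 - cosineDistance(feat, %(q)s) AS score "
        "FROM image_features "
        "WHERE score >= %(t)s ORDER BY score DESC LIMIT %(k)s",
        {'q': query_feat, 't': threshold, 'k': top_k},
    )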

Neural network

# -*- coding: utf-8 -*-
# @Author : tianL.R
# @Email : rtl1312@163.com
# @Time : 2023.11.26
import numpy as np
from PIL import Image
from keras.applications.vgg16 import VGG16
from keras.applications.vgg16 import preprocess_input
from keras.preprocessing import image
from numpy import linalg


class VGG16Net:
    def __init__(self):
        self.input_shape = (224, 224, 3)
        self.weight = 'imagenet'
        self.pooling = 'max'
        self.model_vgg = VGG16(weights=self.weight,
                               input_shape=(self.input_shape[0], self.input_shape[1], self.input_shape[2],),
                               pooling=self.pooling,
                               include_top=False)
        # warm up the model so the first real prediction is not slowed down
        self.model_vgg.predict(np.zeros((1, 224, 224, 3)))

    def detection(self, img_path):
        """
        Extract the (max-pooled) feature of VGG16's last convolutional layer.
        Note: img_path is expected to be a PIL.Image here, not a path string.
        """
        # img = image.load_img(img_path, target_size=(self.input_shape[0], self.input_shape[1]))
        img = img_path.resize((self.input_shape[0], self.input_shape[1]))
        img = image.img_to_array(img)
        img = np.expand_dims(img, axis=0)
        img = preprocess_input(img)
        feat = self.model_vgg.predict(img)
        # L2-normalize the 512-d feature so a plain dot product equals cosine similarity
        norm_feat = feat[0] / linalg.norm(feat[0])
        return norm_feat.tolist()


if __name__ == '__main__':
    img1 = '333.jpg'
    img2 = '555.jpg'
    img1 = Image.open(img1)
    img2 = Image.open(img2)

    vgg = VGG16Net()
    queryVec1 = np.array(vgg.detection(img1))
    queryVec2 = np.array(vgg.detection(img2))
    scores = np.dot(queryVec1, queryVec2)
    # equivalent cosine similarity; the vectors are already unit length
    score2 = queryVec1.dot(queryVec2) / (np.linalg.norm(queryVec1) * np.linalg.norm(queryVec2))
    print(scores)
    print(score2)


A Small Collection of CNN Image-Classification Algorithms

Directory structure

Training layout

· Under the project root, create a dataset folder data_set with a subfolder (named after the dataset) that holds the training and validation sets;

· Under the project root, create a folder class_j to hold the classification JSON file;

· Under the project root, create a folder models to hold the trained model files;

· Network definition: model.py

· Training script: train.py

· Prediction script: predict.py

# project
├── data_set
│   └── data
│       ├── train
│       │   ├── 00001.jpg
│       │   ├── 00002.jpg
│       │   ├── 00003.jpg
│       │   ├── ...
│       │   └── 10000.jpg
│       └── val
│           ├── 00001.jpg
│           ├── 00002.jpg
│           ├── 00003.jpg
│           ├── ...
│           └── 01000.jpg
├── class_j
│   └── class_indices.json
├── models
│   └── model.pth
├── model.py
├── train.py
└── predict.py
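class_indices.json maps class indices back to label names so predict.py can interpret the model's outputs. A minimal sketch of how such a file is typically generated (this assumes train/ is organized one subfolder per class, which the tree above elides):

import json

from torchvision import datasets

# ImageFolder derives the class-name -> index mapping from the subfolder names
train_dataset = datasets.ImageFolder('data_set/data/train')
# invert to index -> class-name, the direction prediction needs
class_indices = {str(v): k for k, v in train_dataset.class_to_idx.items()}

with open('class_j/class_indices.json', 'w') as f:
    json.dump(class_indices, f, indent=4)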

Packaged layout

Taking the GoogLeNet network as an example:

# GoogLeNet
├── class_j
│   └── class_indices.json
├── weights
│   └── GoogLeNet_GPU_v1.pth
└── model.py


Installing the LibreOffice Toolkit on CentOS

· OS: CentOS 7

· LibreOffice: 7.4.5.1 stable

Downloads

· Official site: https://zh-cn.libreoffice.org/download/libreoffice/

· Download archive: https://downloadarchive.documentfoundation.org/libreoffice/old/7.4.5.1/rpm/x86_64/

Download the installer LibreOffice_7.4.5.1_Linux_x86-64_rpm.tar.gz and the Chinese language pack LibreOffice_7.4.5.1_Linux_x86-64_rpm_langpack_zh-CN.tar.gz

Installation

Go to the directory the packages were downloaded to and extract them; here it is /usr/local/

cd /usr/local/                                                        # enter the directory
tar -zxvf LibreOffice_7.4.5.1_Linux_x86-64_rpm.tar.gz                 # extract LibreOffice
tar -zxvf LibreOffice_7.4.5.1_Linux_x86-64_rpm_langpack_zh-CN.tar.gz  # extract the Chinese language pack

Install the LibreOffice and language-pack RPMs; the default install directory is /opt/libreoffice7.4

cd /usr/local/LibreOffice_7.4.5.1_Linux_x86-64_rpm/RPMS/
yum -y install *.rpm
cd /usr/local/LibreOffice_7.4.5.1_Linux_x86-64_rpm_langpack_zh-CN/RPMS
yum -y install *.rpm

Set up soffice: enter the /opt/libreoffice7.4/program directory and install its dependencies

cd /opt/libreoffice7.4/program/
yum install cairo
yum install cups-libs
yum install libSM

Verify

/opt/libreoffice7.4/program/soffice -help

If it prints the help output normally, the installation succeeded; next, add soffice to the environment variables

vim /etc/profile
# libreoffice
export LibreOffice_PATH=/opt/libreoffice7.4/program
export PATH=$LibreOffice_PATH:$PATH
source /etc/profile
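With soffice on the PATH it can also be scripted, e.g. for headless document conversion; a minimal sketch (the input file name is a placeholder):

import subprocess

# convert a document to PDF without a GUI session
subprocess.run(
    ['soffice', '--headless', '--convert-to', 'pdf',
     '--outdir', '/tmp', '/tmp/example.docx'],
    check=True,
)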


A Local Knowledge-Base Q&A System with LangChain + ChatGLM2-6B

Original project on GitHub: https://github.com/imClumsyPanda/langchain-ChatGLM

Deployment

· v 0.2.6

Machine configuration:

· Python environment: anaconda3 + python3.10.12

· GPU: RTX3090*2 + CUDA11.7

· torch: 2.0.1 (CUDA not yet upgraded to 12)

· conda env: py310_dtglm

Model downloads

· m3e https://huggingface.co/moka-ai/m3e-base/tree/main

· chatglm2-6b https://huggingface.co/THUDM/chatglm2-6b/tree/main

chatglm Tsinghua mirror https://cloud.tsinghua.edu.cn/d/674208019e314311ab5c/?p=%2F&mode=list

(All models are downloaded to /root/huggingface here)

Create the virtual environment and install the dependencies

conda create -n py310_dtglm python=3.10.12
conda activate py310_dtglm

pip install --use-pep517 -r requirements.txt -i https://mirror.baidu.com/pypi/simple
pip install --use-pep517 -r requirements_api.txt -i https://mirror.baidu.com/pypi/simple
pip install --use-pep517 -r requirements_webui.txt -i https://mirror.baidu.com/pypi/simple

Update the configuration and model paths

Copy the example config files

python copy_config_example.py

Edit the config file

· model_config.py

MODEL_ROOT_PATH = "/root/huggingface"

MODEL_PATH = {
    "embed_model": {
        ...
        "m3e-base": "/root/huggingface/m3e-base",  # point to the local m3e model
        ...
    },
    # TODO: add all supported llm models
    "llm_model": {
        ...
        "chatglm2-6b": "/root/huggingface/chatglm2-6b",  # point to the local chatglm2-6b model
        ...
    },
}

EMBEDDING_MODEL = "m3e-base"  # the newer embedding SOTA bge-large-zh-v1.5 is also worth trying
LLM_MODEL = "chatglm2-6b"
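With the paths configured, the knowledge base is initialized and the services started; in the 0.2.x releases this is typically done with the project's own scripts (a sketch; verify against the repo's README):

python init_database.py --recreate-vs   # build / rebuild the vector store
python startup.py -a                    # launch the API and WebUI services together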


Running Old and New Nebula3 Cluster Versions Side by Side

· OS: CentOS 7

· Existing nebula version: 2.6.1 (open-source community edition)

· Existing nebula-console version: 2.6.0

· Existing nebula-graph-studio version: 3.2.3

· Second nebula version: 3.6.0 (open-source community edition)

· Second nebula-graph-studio version: 3.2.3

· Second nebula-console version: 3.6.0

Cluster deployment

· Following the single-node deployment, extend the --meta_server_addrs setting in the config files to add the meta hosts

· To keep clear of the ports already taken by version 2.6.1, find the defaults 9559, 19559, 9669, 19669, 9779 and 19779 in the config files and change them to 8559, 18559, 8669, 18669, 8779 and 18779

· Start the cluster

· Set nebula-graph-studio's default port to 7002

· Note: with two nebula instances running, the same version of nebula-graph-studio cannot run twice even on a different port; install nebula-console instead to open a console against each instance at the same time, as below

chmod 111 nebula-console

./nebula-console --addr <host> --port 9669 -u root -p nebula

./nebula-console --addr <host> --port 8669 -u root -p nebula

Nebula3 Single-Node Quick Install

· OS: CentOS 7

· nebula version: 3.6.0 (open-source community edition)

· nebula-graph-studio version: 3.2.3

Single-node deployment

Download the tar package

wget https://oss-cdn.nebula-graph.com.cn/package/3.6.0/nebula-graph-3.6.0.el7.x86_64.tar.gz

Extract and rename

tar -xvzf nebula-graph-3.6.0.el7.x86_64.tar.gz

mv nebula-graph-3.6.0.el7.x86_64 nebula

Set up the config files

cd nebula/etc

mv nebula-graphd.conf.default nebula-graphd.conf

mv nebula-metad.conf.default nebula-metad.conf

mv nebula-storaged.conf.default nebula-storaged.conf

Adjust the file storage locations and node IP addresses in the corresponding files; a cluster works the same way. The key graphd fields are sketched below.
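A sketch of the fields typically edited in nebula-graphd.conf (values are placeholders; nebula-metad.conf and nebula-storaged.conf take the same --meta_server_addrs and --local_ip edits):

--meta_server_addrs=192.168.1.10:9559   # for a cluster, a comma-separated list of every metad host
--local_ip=192.168.1.10                 # this node's IP address
--port=9669                             # service port (9559 for metad, 9779 for storaged)
--log_dir=logs                          # log storage location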


Image Artistic Style Transfer with the VGG16 Neural Network

Basic idea

Extract image features with VGG16 (or another network) and use the Gram matrix to transfer the style of one image onto another.

VGG16

Little introduction needed: runner-up of the 2014 ImageNet classification competition and winner of the localization task. VGG networks stack consecutive small (3x3) convolution kernels and pooling layers to build deep networks of 16 or 19 layers, of which VGG16 and VGG19 are the best known. The two architectures are very similar: both alternate stacks of convolution and pooling layers and finish with fully connected layers for classification. They differ in depth and parameter count; VGG19 adds three convolutional layers over VGG16 and has correspondingly more parameters.

The VGG16/19 implementations can be used directly from Keras, which downloads the pretrained weights automatically

from keras.applications.vgg16 import VGG16
from keras.applications.vgg19 import VGG19

Here we pair it with a transformer network and build the model in PyTorch

import torch
from collections import namedtuple
from torchvision import models
import torch.nn as nn
import torch.nn.functional as F


# VGG16 network definition
class VGG16(torch.nn.Module):
    """Vgg16 Net"""

    def __init__(self, requires_grad=False):
        super(VGG16, self).__init__()
        vgg_pretrained_features = models.vgg16(pretrained=True).features
        self.slice1 = torch.nn.Sequential()
        self.slice2 = torch.nn.Sequential()
        self.slice3 = torch.nn.Sequential()
        self.slice4 = torch.nn.Sequential()

        # split the pretrained feature extractor at the relu1_2 / relu2_2 /
        # relu3_3 / relu4_3 boundaries
        for x in range(4):
            self.slice1.add_module(str(x), vgg_pretrained_features[x])

        for x in range(4, 9):
            self.slice2.add_module(str(x), vgg_pretrained_features[x])

        for x in range(9, 16):
            self.slice3.add_module(str(x), vgg_pretrained_features[x])

        for x in range(16, 23):
            self.slice4.add_module(str(x), vgg_pretrained_features[x])

        if not requires_grad:
            for param in self.parameters():
                param.requires_grad = False

    def forward(self, X):
        h = self.slice1(X)
        h_relu1_2 = h
        h = self.slice2(h)
        h_relu2_2 = h
        h = self.slice3(h)
        h_relu3_3 = h
        h = self.slice4(h)
        h_relu4_3 = h

        vgg_outputs = namedtuple("VggOutputs", ["relu1_2", "relu2_2", "relu3_3", "relu4_3"])
        output = vgg_outputs(h_relu1_2, h_relu2_2, h_relu3_3, h_relu4_3)

        return output


class TransformerNet(torch.nn.Module):
    def __init__(self):
        super(TransformerNet, self).__init__()
        self.model = nn.Sequential(
            ConvBlock(3, 32, kernel_size=9, stride=1),
            ConvBlock(32, 64, kernel_size=3, stride=2),
            ConvBlock(64, 128, kernel_size=3, stride=2),
            ResidualBlock(128),
            ResidualBlock(128),
            ResidualBlock(128),
            ResidualBlock(128),
            ResidualBlock(128),
            ConvBlock(128, 64, kernel_size=3, upsample=True),
            ConvBlock(64, 32, kernel_size=3, upsample=True),
            ConvBlock(32, 3, kernel_size=9, stride=1, normalize=False, relu=False),
        )

    def forward(self, x):
        return self.model(x)


class ResidualBlock(torch.nn.Module):
    def __init__(self, channels):
        super(ResidualBlock, self).__init__()
        self.block = nn.Sequential(
            ConvBlock(channels, channels, kernel_size=3, stride=1, normalize=True, relu=True),
            ConvBlock(channels, channels, kernel_size=3, stride=1, normalize=True, relu=False),
        )

    def forward(self, x):
        return self.block(x) + x


class ConvBlock(torch.nn.Module):
    def __init__(self, in_channels, out_channels, kernel_size, stride=1, upsample=False, normalize=True, relu=True):
        super(ConvBlock, self).__init__()
        self.upsample = upsample
        self.block = nn.Sequential(
            nn.ReflectionPad2d(kernel_size // 2),
            nn.Conv2d(in_channels, out_channels, kernel_size, stride)
        )
        self.norm = nn.InstanceNorm2d(out_channels, affine=True) if normalize else None
        self.relu = relu

    def forward(self, x):
        if self.upsample:
            x = F.interpolate(x, scale_factor=2)
        x = self.block(x)
        if self.norm is not None:
            x = self.norm(x)
        if self.relu:
            x = F.relu(x)
        return x


"""
Model smoke test
"""
if __name__ == '__main__':
    input1 = torch.rand([1, 3, 224, 224])  # a single dummy 224x224 RGB image
    model_x = VGG16()
    print(model_x)

Gram matrix

The Gram matrix of k vectors in n-dimensional Euclidean space is the symmetric matrix of all their pairwise inner products.

A more intuitive view:

Given an input feature map of shape [ch, h, w], flatten the spatial dimensions into a single axis and transpose, giving matrices of shape [ch, h*w] and [h*w, ch]; their inner product is the Gram matrix.

Using the Gram matrix for style transfer:

1. Prepare the target image and the target style image;

2. Use a deep network, starting from white noise, to extract feature vectors from the target image and the style image. Compute the Gram matrices of both feature sets, take minimizing the difference between the matrices as the optimization objective, and iteratively adjust the target image so that its style grows ever closer to the style image.

The Gram matrix in torch:

def gram_matrix(y):
    (b, c, h, w) = y.size()
    features = y.view(b, c, w * h)
    features_t = features.transpose(1, 2)
    gram = features.bmm(features_t) / (c * h * w)
    return gram
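A quick shape check of the function above on a dummy batch (values are arbitrary):

import torch

y = torch.rand(4, 128, 64, 64)  # a batch of 4 feature maps with 128 channels
print(gram_matrix(y).shape)     # torch.Size([4, 128, 128])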

Training

Prepare the training files and a style image, e.g. 20 random images and Van Gogh's The Starry Night

The utils.py helpers

Configure the training arguments:

parser = argparse.ArgumentParser(description="Parser for Training")
parser.add_argument("--style", type=str, default="images/styles/the_starry_night.jpg", help="Path to style image")
parser.add_argument("--dataset", type=str, help="Path to training dataset")
parser.add_argument("--epochs", type=int, default=1, help="Number of training epochs")
parser.add_argument("--batch_size", type=int, default=4, help="Batch size for training")
parser.add_argument("--image_size", type=int, default=256, help="Size of training images")
parser.add_argument("--style_size", type=int, help="Size of style image")
parser.add_argument("--lr", type=float, default=1e-3, help="Learning rate")
parser.add_argument("--lambda_img", type=float, default=1e5, help="Weight for image loss")
parser.add_argument("--lambda_style", type=float, default=1e10, help="Weight for style loss")
parser.add_argument("--model_path", type=str, help="Optional path to checkpoint model")
parser.add_argument("--model_checkpoint", type=int, default=1000, help="Batches between model checkpoints")
parser.add_argument("--result_checkpoint", type=int, default=1000, help="Batches between image result checkpoints")

The transform used for style training

def train_transform(image_size):
    transform = transforms.Compose(
        [
            transforms.Resize(int(image_size * 1.15)),
            transforms.RandomCrop(image_size),
            transforms.ToTensor(),
            transforms.Normalize(mean, std),
        ]
    )
    return transform

The transform used for style conversion

def style_transform(image_size=None):
    resize = [transforms.Resize(image_size)] if image_size else []
    transform = transforms.Compose(resize + [transforms.ToTensor(), transforms.Normalize(mean, std)])
    return transform

Denormalize an image tensor using the mean and standard deviation

def denormalize(tensors):
    for c in range(3):
        tensors[:, c].mul_(std[c]).add_(mean[c])
    return tensors
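The transforms above reference mean and std without defining them; the usual choice (an assumption here) is the ImageNet channel statistics:

# ImageNet channel statistics assumed by the transforms above
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]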

The train.py training script

Training configuration

train_args = TrainArgs()
args = train_args.initialize().parse_args()

args.dataset = './dataset'
args.style = './images/styles/the_starry_night.jpg'
args.epochs = 2400  # epochs * (dataset_size / batch_size) should be a multiple of the 1000-batch checkpoint interval
args.batch_size = 4
args.image_size = 256

Training setup

style_name = args.style.split("/")[-1].split(".")[0]
os.makedirs(f"images/train/{style_name}_training", exist_ok=True)
os.makedirs("checkpoints", exist_ok=True)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
train_dataset = datasets.ImageFolder(args.dataset, train_transform(args.image_size))
dataloader = DataLoader(train_dataset, batch_size=args.batch_size)
transformer = TransformerNet().to(device)
vgg = VGG16(requires_grad=False).to(device)
if args.model_path:
    transformer.load_state_dict(torch.load(args.model_path))
optimizer = Adam(transformer.parameters(), args.lr)
l2_loss = torch.nn.MSELoss().to(device)
style = style_transform(args.style_size)(Image.open(args.style))
style = style.repeat(args.batch_size, 1, 1, 1).to(device)
features_style = vgg(style)
gram_style = [gram_matrix(y) for y in features_style]
image_samples = []
for path in random.sample(glob.glob(f"{args.dataset}/*/*"), len(train_dataset)):
    image_samples += [style_transform(args.image_size)(Image.open(path).resize((224, 224)))]
image_samples = torch.stack(image_samples)

Run the training

def save_result(sample):
    transformer.eval()
    with torch.no_grad():
        output = transformer(image_samples.to(device))
    image_rgb = denormalize(torch.cat((image_samples.cpu(), output.cpu()), 2))
    save_image(image_rgb, f"images/train/{style_name}_training/{sample}.jpg", nrow=4)
    transformer.train()


def save_model(sample):
    torch.save(transformer.state_dict(), f"checkpoints/{style_name}_{sample}.pth")


for epoch in range(args.epochs):
    # iterate the dataloader directly instead of materializing it with list() on every step
    for batch_i, (images, _) in enumerate(dataloader):
        batches_done = epoch * len(dataloader) + batch_i + 1
        optimizer.zero_grad()

        images_original = images.to(device)
        images_transformed = transformer(images_original)

        features_original = vgg(images_original)
        features_transformed = vgg(images_transformed)

        img_loss = args.lambda_img * l2_loss(features_transformed.relu2_2, features_original.relu2_2)

        style_loss = 0
        for ft_y, gm_s in zip(features_transformed, gram_style):
            gm_y = gram_matrix(ft_y)
            style_loss += l2_loss(gm_y, gm_s[: images.size(0), :, :])
        style_loss *= args.lambda_style

        total_loss = img_loss + style_loss
        total_loss.backward()
        optimizer.step()
        if batches_done % args.result_checkpoint == 0:
            save_result(batches_done)
        if args.model_checkpoint > 0 and batches_done % args.model_checkpoint == 0:
            save_model(batches_done)

After the 1,000th iteration

After the 12,000th iteration (2400 epochs * (20 / batch_size)), the effect is pronounced

At this point training is complete and we can predict results

Prediction:

Configure the prediction arguments

predict_args = PredictArgs()
args = predict_args.initialize().parse_args()
args.image_path = './images/input/001.jpg'
args.model_path = './checkpoints/the_starry_night_12000.pth'

Prediction code

os.makedirs("images/output", exist_ok=True)
device = torch.device('cpu')  # or: torch.device("cuda" if torch.cuda.is_available() else "cpu")
transform = style_transform()
transformer = TransformerNet().to(device)
transformer.load_state_dict(torch.load(args.model_path))
transformer.eval()
image_tensor = Variable(transform(Image.open(args.image_path))).to(device)
image_tensor = image_tensor.unsqueeze(0)

with torch.no_grad():
    output_image = denormalize(transformer(image_tensor)).cpu()

name = args.image_path.split("/")[-1]
save_image(output_image, f"images/output/output_{name}")

Approach & references

https://github.com/elleryqueenhomels/fast_neural_style_transfer/tree/master

https://github.com/AaronJny/DeepLearningExamples/tree/master/tf2-neural-style-transfer

https://github.com/Huage001/PaintTransformer

https://github.com/eriklindernoren/Fast-Neural-Style-Transfer/tree/master

https://github.com/NeverGiveU/PaintTransformer-Pytorch-master

https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix

paddleDetection Demo

PPHuman

Pedestrian attribute recognition

Pedestrian attributes

cfg:

crop_thresh: 0.5
attr_thresh: 0.5
kpt_thresh: 0.2
visual: True
warmup_frame: 50

DET:
  model_dir: https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip
  batch_size: 1

MOT:
  model_dir: https://bj.bcebos.com/v1/paddledet/models/pipeline/mot_ppyoloe_l_36e_pipeline.zip
  tracker_config: /exp/work/video/PaddleDetection/deploy/pipeline/config/tracker_config.yml
  batch_size: 1
  skip_frame_num: -1  # preferably no more than 3
  enable: True

KPT:
  model_dir: https://bj.bcebos.com/v1/paddledet/models/pipeline/dark_hrnet_w32_256x192.zip
  batch_size: 8

ATTR:
  model_dir: https://bj.bcebos.com/v1/paddledet/models/pipeline/PPLCNet_x1_0_person_attribute_945_infer.zip
  batch_size: 8
  enable: True

cli:

python deploy/pipeline/pipeline.py --config deploy/pipeline/config/cache/cfg_human.yml --device=gpu --video_file=demo_input/human.mp4 --output_dir=demo_output/


LiveGBS: a GB/T 28181 National-Standard Video Streaming Platform

Package download

LiveGBS GB28181 streaming service download: https://www.liveqing.com/docs/download/LiveGBS.html#%E7%89%88%E6%9C%AC%E4%B8%8B%E8%BD%BD

Choose the Windows builds of the LiveGBS signaling service and the LiveGBS streaming service; the free-edition license lasts 26 days, after which the services must be renewed manually

Installing LiveGBS GB28181

Extract the downloaded packages and start LiveCMS.exe and LiveSMS.exe. If a default port is already occupied, edit the corresponding livecms.ini or livesms.ini config file; here I changed LiveGBS's default port from 10000 to 10005

After a successful start, the livecms and livesms icons appear in the tray


paddleDetection: Video OCR

PPOCR_V4

Install Baidu's latest ppocr_v4 package. The virtual environment used here is py39_vio; it cannot be shared with the face-recognition environment (py38_arcface) because their opencv versions conflict

pip install paddleocr --user -i https://mirror.baidu.com/pypi/simple

Code

Add an --ocr option to cfg_utils.py; set it to True to enable OCR (default False)

parser.add_argument(
    "--ocr",
    type=bool,
    default=False,
    help="use paddlepaddle-ocr")  # note: argparse's type=bool treats any non-empty string as True; action="store_true" is the safer idiom

pipeline.py

from python.visualize import visualize_box_mask, visualize_attr, visualize_pose, visualize_action, visualize_vehicleplate, visualize_vehiclepress, visualize_lane, visualize_vehicle_retrograde, visualize_ocr
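The pipeline code below relies on module-level ocr and lock objects that this excerpt does not show; a minimal sketch of how they are presumably initialized (the names follow the excerpt):

import threading

from paddleocr import PaddleOCR

# PaddleOCR is not thread-safe, so one global lock guards every ocr() call
ocr = PaddleOCR(use_angle_cls=True, lang='ch')
lock = threading.Lock()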
class PipePredictor(object):
    def __init__(self, args, cfg, is_video=True, multi_camera=False):
        self.ocr = args.ocr
def visualize_video(self,
                    image_rgb,
                    result,
                    collector,
                    frame_id,
                    fps,
                    entrance=None,
                    records=None,
                    center_traj=None,
                    do_illegal_parking_recognition=False,
                    illegal_parking_dict=None):
    image = cv2.cvtColor(image_rgb, cv2.COLOR_RGB2BGR)
    mot_res = copy.deepcopy(result.get('mot'))

    if self.ocr:
        lock.acquire()  # take the lock: paddleOCR is not thread-safe
        ocr_result = ocr.ocr(image, cls=True)[0]
        lock.release()
        ocr_boxes = [line[0] for line in ocr_result]
        ocr_txts = [line[1][0] for line in ocr_result]
        ocr_scores = [line[1][1] for line in ocr_result]

        image = visualize_ocr(image, ocr_boxes, ocr_txts, ocr_scores)

visualize.py

def visualize_ocr(im, boxes, texts, score):
    if isinstance(im, str):
        im = Image.open(im)
        im = np.ascontiguousarray(np.copy(im))
        im = cv2.cvtColor(im, cv2.COLOR_RGB2BGR)
    else:
        im = np.ascontiguousarray(np.copy(im))

    # create a transparent layer for drawing text watermarks onto the image
    im = Image.fromarray(im)
    im = im.convert('RGBA')
    im_canvas = Image.new('RGBA', im.size, (255, 255, 255, 0))

    for i, res in enumerate(texts):
        if boxes is not None:
            box = boxes[i]
        text = res
        if text == "":
            continue

        text_scale = max(1.0, int(box[2][1] - box[1][1]))

        draw = ImageDraw.Draw(im_canvas)
        draw.text(
            (box[0][0], box[0][1]),
            text,
            font=ImageFont.truetype(font_file, size=int(text_scale)),
            fill=(255, 255, 0, 85))  # the fourth value is the alpha channel
        try:
            draw.rectangle(
                ((box[0][0], box[0][1]), (box[2][0], box[2][1])),
                fill=None,
                outline=(255, 255, 0),
                width=1)
        except ValueError:
            pass

    # composite the layers
    im = Image.alpha_composite(im, im_canvas)
    im = im.convert('RGB')
    # restore a contiguous array
    im = np.ascontiguousarray(np.copy(im))
    return im
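visualize_ocr also references a module-level font_file that this excerpt does not define; presumably it points at a TrueType font set elsewhere in visualize.py, e.g. (the path is a placeholder):

# placeholder: any TTF covering the characters to be drawn
font_file = 'fonts/SourceHanSansCN-Regular.ttf'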


