doccano

Doccano is an open-source text annotation tool designed to simplify and speed up annotation work. It provides an intuitive user interface that lets annotators label text data with ease and build high-quality training datasets for machine learning and natural language processing tasks.

Link: https://github.com/doccano/doccano

1. Installation and Deployment

Environment

OS: CentOS 7.9

Python: 3.10

doccano: 1.6.2

Install via pip

Note: the Baidu mirror does not carry this package, so use the Tsinghua mirror instead.

pip install doccano==1.6.2 -i https://pypi.tuna.tsinghua.edu.cn/simple

Initialize

doccano init

Create the superuser account and password

doccano createuser --username admin --password 123456

Start the service

doccano webserver --port 8000
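
Once the server is running, a quick reachability check from Python (a minimal sketch, assuming the default host and the port 8000 used above):

from urllib.request import urlopen

# Expect HTTP 200 from the doccano web UI if the service came up correctly.
print(urlopen('http://127.0.0.1:8000').status)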

2. ERNIE-UIE Relation Extraction Fine-tuning Data Annotation

Create a sequence labeling project

Import the incremental training dataset

Note: if the import does not succeed and the page keeps spinning, go to the console and run doccano task to start the background task worker.

Create the entity labels and relation labels

Annotate the data

Export the dataset

{"id": 11, "text": "钢筋调直宜采用机械方法,也可以采用冷拉方法", "relations": [{"id": 1, "from_id": 45, "to_id": 44, "type": "材料"}, {"id": 2, "from_id": 46, "to_id": 44, "type": "材料"}], "entities": [{"id": 44, "start_offset": 0, "end_offset": 4, "label": "工程"}, {"id": 45, "start_offset": 7, "end_offset": 11, "label": "工艺"}, {"id": 46, "start_offset": 17, "end_offset": 21, "label": "工艺"}]}
{"id": 12, "text": "受力钢筋的接头形式应按设计要求采用,若设计无要求时,钢筋宜采用焊接接头和机械连接接头,也可采用绑扎接头。", "relations": [{"id": 3, "from_id": 53, "to_id": 22, "type": "材料"}, {"id": 4, "from_id": 54, "to_id": 22, "type": "材料"}, {"id": 5, "from_id": 55, "to_id": 22, "type": "材料"}], "entities": [{"id": 22, "start_offset": 0, "end_offset": 9, "label": "工程"}, {"id": 53, "start_offset": 31, "end_offset": 35, "label": "工艺"}, {"id": 54, "start_offset": 36, "end_offset": 42, "label": "工艺"}, {"id": 55, "start_offset": 47, "end_offset": 51, "label": "工艺"}]}
{"id": 13, "text": "多层非焊接钢筋骨架的各层钢筋之间,应保持层距准确,宜采用短钢筋支垫。", "relations": [{"id": 6, "from_id": 60, "to_id": 59, "type": "工艺"}], "entities": [{"id": 59, "start_offset": 0, "end_offset": 9, "label": "工程"}, {"id": 60, "start_offset": 28, "end_offset": 33, "label": "工艺"}]}
{"id": 14, "text": "预制桩的修筑工艺包括一体化成孔、自灌注。", "relations": [{"id": 7, "from_id": 62, "to_id": 28, "type": "工艺"}, {"id": 8, "from_id": 61, "to_id": 28, "type": "工艺"}], "entities": [{"id": 28, "start_offset": 0, "end_offset": 3, "label": "工程"}, {"id": 61, "start_offset": 10, "end_offset": 15, "label": "工艺"}, {"id": 62, "start_offset": 16, "end_offset": 19, "label": "工艺"}]}
{"id": 15, "text": "目前我国水运工程的模板用材已向多样化发展,除钢材和木材外,胶木板、竹胶板、塑料等已得到广泛运用,并取得了较好的技术经济效益。", "relations": [{"id": 9, "from_id": 63, "to_id": 52, "type": "材料"}, {"id": 10, "from_id": 64, "to_id": 52, "type": "材料"}, {"id": 11, "from_id": 65, "to_id": 52, "type": "材料"}, {"id": 12, "from_id": 67, "to_id": 52, "type": "材料"}], "entities": [{"id": 52, "start_offset": 4, "end_offset": 8, "label": "工程"}, {"id": 63, "start_offset": 22, "end_offset": 24, "label": "材料"}, {"id": 64, "start_offset": 25, "end_offset": 27, "label": "材料"}, {"id": 65, "start_offset": 29, "end_offset": 32, "label": "材料"}, {"id": 66, "start_offset": 33, "end_offset": 36, "label": "材料"}, {"id": 67, "start_offset": 37, "end_offset": 39, "label": "材料"}]}

Dataset Format Conversion

Reference: https://github.com/PaddlePaddle/PaddleNLP/tree/develop/legacy/model_zoo/uie

Enter the PaddleNLP UIE directory, place the exported JSON file there, create a data folder to hold the datasets, and run:

python doccano.py \
--doccano_file ./data/doccano_ext.json \
--task_type ext \
--save_dir ./data \
--splits 0.8 0.2 0 \
--schema_lang ch

Because there are only a few test samples (5 records), the dev split was not populated automatically; the test-set contents were copied into the dev set by hand.
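
A minimal sketch of that manual copy (assuming doccano.py wrote the splits as data/train.txt and data/test.txt):

import shutil

# Reuse the test split as the dev split so that --do_eval has data to evaluate on.
shutil.copyfile('data/test.txt', 'data/dev.txt')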

Start Training

Under the UIE directory, create checkpoint/model_best to store the model.

Run (GPU):

export finetuned_model=./checkpoint/model_best

python -u -m paddle.distributed.launch --gpus "0,1" finetune.py \
--device gpu \
--logging_steps 10 \
--save_steps 100 \
--eval_steps 100 \
--seed 42 \
--model_name_or_path uie-base \
--output_dir $finetuned_model \
--train_path data/train.txt \
--dev_path data/dev.txt \
--max_seq_length 512 \
--per_device_eval_batch_size 16 \
--per_device_train_batch_size 16 \
--num_train_epochs 100 \
--learning_rate 1e-5 \
--do_train \
--do_eval \
--do_export \
--export_model_dir $finetuned_model \
--label_names "start_positions" "end_positions" \
--overwrite_output_dir \
--disable_tqdm True \
--metric_for_best_model eval_f1 \
--load_best_model_at_end True \
--save_total_limit 1

Training output:

(py39_ppner_2_7_2) [root@jdz uie]# python -u -m paddle.distributed.launch --gpus "1" finetune.py     --device gpu     --logging_steps 10     --save_steps 100     --eval_steps 100     --seed 42     --model_name_or_path uie-base     --output_dir $finetuned_model     --train_path data/train.txt     --dev_path data/dev.txt      --max_seq_length 512      --per_device_eval_batch_size 16     --per_device_train_batch_size  16     --num_train_epochs 100     --learning_rate 1e-5     --do_train     --do_eval     --do_export     --export_model_dir $finetuned_model     --label_names "start_positions" "end_positions"     --overwrite_output_dir     --disable_tqdm True     --metric_for_best_model eval_f1     --load_best_model_at_end  True     --save_total_limit 1 
LAUNCH INFO 2024-06-26 18:05:23,778 ----------- Configuration ----------------------
LAUNCH INFO 2024-06-26 18:05:23,779 auto_parallel_config: None
LAUNCH INFO 2024-06-26 18:05:23,779 auto_tuner_json: None
LAUNCH INFO 2024-06-26 18:05:23,779 devices: 1
LAUNCH INFO 2024-06-26 18:05:23,779 elastic_level: -1
LAUNCH INFO 2024-06-26 18:05:23,779 elastic_timeout: 30
LAUNCH INFO 2024-06-26 18:05:23,779 enable_gpu_log: True
LAUNCH INFO 2024-06-26 18:05:23,779 gloo_port: 6767
LAUNCH INFO 2024-06-26 18:05:23,779 host: None
LAUNCH INFO 2024-06-26 18:05:23,779 ips: None
LAUNCH INFO 2024-06-26 18:05:23,779 job_id: default
LAUNCH INFO 2024-06-26 18:05:23,779 legacy: False
LAUNCH INFO 2024-06-26 18:05:23,779 log_dir: log
LAUNCH INFO 2024-06-26 18:05:23,780 log_level: INFO
LAUNCH INFO 2024-06-26 18:05:23,780 log_overwrite: False
LAUNCH INFO 2024-06-26 18:05:23,780 master: None
LAUNCH INFO 2024-06-26 18:05:23,780 max_restart: 3
LAUNCH INFO 2024-06-26 18:05:23,780 nnodes: 1
LAUNCH INFO 2024-06-26 18:05:23,780 nproc_per_node: None
LAUNCH INFO 2024-06-26 18:05:23,780 rank: -1
LAUNCH INFO 2024-06-26 18:05:23,780 run_mode: collective
LAUNCH INFO 2024-06-26 18:05:23,780 server_num: None
LAUNCH INFO 2024-06-26 18:05:23,780 servers:
LAUNCH INFO 2024-06-26 18:05:23,780 sort_ip: False
LAUNCH INFO 2024-06-26 18:05:23,780 start_port: 6070
LAUNCH INFO 2024-06-26 18:05:23,780 trainer_num: None
LAUNCH INFO 2024-06-26 18:05:23,780 trainers:
LAUNCH INFO 2024-06-26 18:05:23,780 training_script: finetune.py
LAUNCH INFO 2024-06-26 18:05:23,781 training_script_args: ['--device', 'gpu', '--logging_steps', '10', '--save_steps', '100', '--eval_steps', '100', '--seed', '42', '--model_name_or_path', 'uie-base', '--output_dir', './checkpoint/model_best', '--train_path', 'data/train.txt', '--dev_path', 'data/dev.txt', '--max_seq_length', '512', '--per_device_eval_batch_size', '16', '--per_device_train_batch_size', '16', '--num_train_epochs', '100', '--learning_rate', '1e-5', '--do_train', '--do_eval', '--do_export', '--export_model_dir', './checkpoint/model_best', '--label_names', 'start_positions', 'end_positions', '--overwrite_output_dir', '--disable_tqdm', 'True', '--metric_for_best_model', 'eval_f1', '--load_best_model_at_end', 'True', '--save_total_limit', '1']
LAUNCH INFO 2024-06-26 18:05:23,781 with_gloo: 1
LAUNCH INFO 2024-06-26 18:05:23,781 --------------------------------------------------
LAUNCH INFO 2024-06-26 18:05:23,782 Job: default, mode collective, replicas 1[1:1], elastic False
LAUNCH INFO 2024-06-26 18:05:23,797 Run Pod: pwbjet, replicas 1, status ready
LAUNCH INFO 2024-06-26 18:05:23,824 Watching Pod: pwbjet, replicas 1, status running
/root/anaconda3/envs/py39_ppner_2_7_2/lib/python3.9/site-packages/_distutils_hack/__init__.py:33: UserWarning: Setuptools is replacing distutils.
warnings.warn("Setuptools is replacing distutils.")
[2024-06-26 18:05:27,933] [ WARNING] - evaluation_strategy reset to IntervalStrategy.STEPS for do_eval is True. you can also set evaluation_strategy='epoch'.
[2024-06-26 18:05:27,933] [ INFO] - The default value for the training argument `--report_to` will change in v5 (from all installed integrations to none). In v5, you will need to use `--report_to all` to get the same behavior as now. You should start updating your code and make this info disappear :-).
[2024-06-26 18:05:27,934] [ INFO] - ============================================================
[2024-06-26 18:05:27,934] [ INFO] - Model Configuration Arguments
[2024-06-26 18:05:27,934] [ INFO] - paddle commit id :fbf852dd832bc0e63ae31cd4aa37defd829e4c03
[2024-06-26 18:05:27,934] [ INFO] - export_model_dir :./checkpoint/model_best
[2024-06-26 18:05:27,934] [ INFO] - model_name_or_path :uie-base
[2024-06-26 18:05:27,934] [ INFO] - multilingual :False
[2024-06-26 18:05:27,934] [ INFO] -
[2024-06-26 18:05:27,934] [ INFO] - ============================================================
[2024-06-26 18:05:27,934] [ INFO] - Data Configuration Arguments
[2024-06-26 18:05:27,934] [ INFO] - paddle commit id :fbf852dd832bc0e63ae31cd4aa37defd829e4c03
[2024-06-26 18:05:27,935] [ INFO] - dev_path :data/dev.txt
[2024-06-26 18:05:27,935] [ INFO] - dynamic_max_length :None
[2024-06-26 18:05:27,935] [ INFO] - max_seq_length :512
[2024-06-26 18:05:27,935] [ INFO] - train_path :data/train.txt
[2024-06-26 18:05:27,935] [ INFO] -
[2024-06-26 18:05:27,935] [ WARNING] - Process rank: -1, device: gpu, world_size: 1, distributed training: False, 16-bits training: False
[2024-06-26 18:05:27,935] [ INFO] - We are using (<class 'paddlenlp.transformers.ernie.tokenizer.ErnieTokenizer'>, False) to load 'uie-base'.
[2024-06-26 18:05:27,936] [ INFO] - Already cached /root/.paddlenlp/models/uie-base/ernie_3.0_base_zh_vocab.txt
[2024-06-26 18:05:27,968] [ INFO] - tokenizer config file saved in /root/.paddlenlp/models/uie-base/tokenizer_config.json
[2024-06-26 18:05:27,969] [ INFO] - Special tokens file saved in /root/.paddlenlp/models/uie-base/special_tokens_map.json
[2024-06-26 18:05:27,970] [ INFO] - Already cached /root/.paddlenlp/models/uie-base/model_state.pdparams
[2024-06-26 18:05:27,970] [ INFO] - Loading weights file model_state.pdparams from cache at /root/.paddlenlp/models/uie-base/model_state.pdparams
[2024-06-26 18:05:28,898] [ INFO] - Loaded weights file from disk, setting weights to model.
W0626 18:05:29.062965 285242 gpu_resources.cc:119] Please NOTE: device: 1, GPU Compute Capability: 8.6, Driver API Version: 12.2, Runtime API Version: 12.0
W0626 18:05:29.064332 285242 gpu_resources.cc:164] device: 1, cuDNN Version: 8.9.
[2024-06-26 18:05:30,516] [ INFO] - All model checkpoint weights were used when initializing UIE.

[2024-06-26 18:05:30,517] [ INFO] - All the weights of UIE were initialized from the model checkpoint at uie-base.
If your task is similar to the task the model of the checkpoint was trained on, you can already use UIE for predictions without further training.
[2024-06-26 18:05:30,562] [ INFO] - The global seed is set to 42, local seed is set to 43 and random seed is set to 42.
[2024-06-26 18:05:30,655] [ DEBUG] - ============================================================
[2024-06-26 18:05:30,655] [ DEBUG] - Training Configuration Arguments
[2024-06-26 18:05:30,656] [ DEBUG] - paddle commit id : fbf852dd832bc0e63ae31cd4aa37defd829e4c03
[2024-06-26 18:05:30,656] [ DEBUG] - paddlenlp commit id : b39e701e21d11ff66ac3abfc81d384b6af8f8240
[2024-06-26 18:05:30,656] [ DEBUG] - _no_sync_in_gradient_accumulation: True
[2024-06-26 18:05:30,656] [ DEBUG] - activation_quantize_type : None
[2024-06-26 18:05:30,656] [ DEBUG] - adam_beta1 : 0.9
[2024-06-26 18:05:30,656] [ DEBUG] - adam_beta2 : 0.999
[2024-06-26 18:05:30,656] [ DEBUG] - adam_epsilon : 1e-08
[2024-06-26 18:05:30,656] [ DEBUG] - algo_list : None
[2024-06-26 18:05:30,656] [ DEBUG] - amp_custom_black_list : None
[2024-06-26 18:05:30,656] [ DEBUG] - amp_custom_white_list : None
[2024-06-26 18:05:30,656] [ DEBUG] - amp_master_grad : False
[2024-06-26 18:05:30,656] [ DEBUG] - batch_num_list : None
[2024-06-26 18:05:30,657] [ DEBUG] - batch_size_list : None
[2024-06-26 18:05:30,657] [ DEBUG] - bf16 : False
[2024-06-26 18:05:30,657] [ DEBUG] - bf16_full_eval : False
[2024-06-26 18:05:30,657] [ DEBUG] - bias_correction : False
[2024-06-26 18:05:30,657] [ DEBUG] - current_device : gpu:1
[2024-06-26 18:05:30,657] [ DEBUG] - data_parallel_rank : 0
[2024-06-26 18:05:30,657] [ DEBUG] - dataloader_drop_last : False
[2024-06-26 18:05:30,657] [ DEBUG] - dataloader_num_workers : 0
[2024-06-26 18:05:30,657] [ DEBUG] - dataset_rank : 0
[2024-06-26 18:05:30,657] [ DEBUG] - dataset_world_size : 1
[2024-06-26 18:05:30,657] [ DEBUG] - device : gpu
[2024-06-26 18:05:30,657] [ DEBUG] - disable_tqdm : True
[2024-06-26 18:05:30,657] [ DEBUG] - distributed_dataloader : False
[2024-06-26 18:05:30,657] [ DEBUG] - do_compress : False
[2024-06-26 18:05:30,658] [ DEBUG] - do_eval : True
[2024-06-26 18:05:30,658] [ DEBUG] - do_export : True
[2024-06-26 18:05:30,658] [ DEBUG] - do_predict : False
[2024-06-26 18:05:30,658] [ DEBUG] - do_train : True
[2024-06-26 18:05:30,658] [ DEBUG] - eval_accumulation_steps : None
[2024-06-26 18:05:30,658] [ DEBUG] - eval_batch_size : 16
[2024-06-26 18:05:30,658] [ DEBUG] - eval_steps : 100
[2024-06-26 18:05:30,658] [ DEBUG] - evaluation_strategy : IntervalStrategy.STEPS
[2024-06-26 18:05:30,658] [ DEBUG] - flatten_param_grads : False
[2024-06-26 18:05:30,658] [ DEBUG] - force_reshard_pp : False
[2024-06-26 18:05:30,658] [ DEBUG] - fp16 : False
[2024-06-26 18:05:30,658] [ DEBUG] - fp16_full_eval : False
[2024-06-26 18:05:30,658] [ DEBUG] - fp16_opt_level : O1
[2024-06-26 18:05:30,658] [ DEBUG] - gradient_accumulation_steps : 1
[2024-06-26 18:05:30,659] [ DEBUG] - greater_is_better : True
[2024-06-26 18:05:30,659] [ DEBUG] - hybrid_parallel_topo_order : None
[2024-06-26 18:05:30,659] [ DEBUG] - ignore_data_skip : False
[2024-06-26 18:05:30,659] [ DEBUG] - ignore_load_lr_and_optim : False
[2024-06-26 18:05:30,659] [ DEBUG] - input_dtype : int64
[2024-06-26 18:05:30,659] [ DEBUG] - input_infer_model_path : None
[2024-06-26 18:05:30,659] [ DEBUG] - label_names : ['start_positions', 'end_positions']
[2024-06-26 18:05:30,659] [ DEBUG] - lazy_data_processing : True
[2024-06-26 18:05:30,659] [ DEBUG] - learning_rate : 1e-05
[2024-06-26 18:05:30,659] [ DEBUG] - load_best_model_at_end : True
[2024-06-26 18:05:30,659] [ DEBUG] - load_sharded_model : False
[2024-06-26 18:05:30,659] [ DEBUG] - local_process_index : 0
[2024-06-26 18:05:30,659] [ DEBUG] - local_rank : -1
[2024-06-26 18:05:30,659] [ DEBUG] - log_level : -1
[2024-06-26 18:05:30,660] [ DEBUG] - log_level_replica : -1
[2024-06-26 18:05:30,660] [ DEBUG] - log_on_each_node : True
[2024-06-26 18:05:30,660] [ DEBUG] - logging_dir : ./checkpoint/model_best/runs/Jun26_18-05-27_jdz
[2024-06-26 18:05:30,660] [ DEBUG] - logging_first_step : False
[2024-06-26 18:05:30,660] [ DEBUG] - logging_steps : 10
[2024-06-26 18:05:30,660] [ DEBUG] - logging_strategy : IntervalStrategy.STEPS
[2024-06-26 18:05:30,660] [ DEBUG] - logical_process_index : 0
[2024-06-26 18:05:30,660] [ DEBUG] - lr_end : 1e-07
[2024-06-26 18:05:30,660] [ DEBUG] - lr_scheduler_type : SchedulerType.LINEAR
[2024-06-26 18:05:30,660] [ DEBUG] - max_evaluate_steps : -1
[2024-06-26 18:05:30,660] [ DEBUG] - max_grad_norm : 1.0
[2024-06-26 18:05:30,660] [ DEBUG] - max_steps : -1
[2024-06-26 18:05:30,660] [ DEBUG] - metric_for_best_model : eval_f1
[2024-06-26 18:05:30,660] [ DEBUG] - minimum_eval_times : None
[2024-06-26 18:05:30,660] [ DEBUG] - moving_rate : 0.9
[2024-06-26 18:05:30,661] [ DEBUG] - no_cuda : False
[2024-06-26 18:05:30,661] [ DEBUG] - num_cycles : 0.5
[2024-06-26 18:05:30,661] [ DEBUG] - num_train_epochs : 100.0
[2024-06-26 18:05:30,661] [ DEBUG] - onnx_format : True
[2024-06-26 18:05:30,661] [ DEBUG] - optim : OptimizerNames.ADAMW
[2024-06-26 18:05:30,661] [ DEBUG] - optimizer_name_suffix : None
[2024-06-26 18:05:30,661] [ DEBUG] - output_dir : ./checkpoint/model_best
[2024-06-26 18:05:30,661] [ DEBUG] - overwrite_output_dir : True
[2024-06-26 18:05:30,661] [ DEBUG] - past_index : -1
[2024-06-26 18:05:30,661] [ DEBUG] - per_device_eval_batch_size : 16
[2024-06-26 18:05:30,661] [ DEBUG] - per_device_train_batch_size : 16
[2024-06-26 18:05:30,661] [ DEBUG] - pipeline_parallel_config :
[2024-06-26 18:05:30,661] [ DEBUG] - pipeline_parallel_degree : -1
[2024-06-26 18:05:30,661] [ DEBUG] - pipeline_parallel_rank : 0
[2024-06-26 18:05:30,662] [ DEBUG] - power : 1.0
[2024-06-26 18:05:30,662] [ DEBUG] - prediction_loss_only : False
[2024-06-26 18:05:30,662] [ DEBUG] - process_index : 0
[2024-06-26 18:05:30,662] [ DEBUG] - prune_embeddings : False
[2024-06-26 18:05:30,662] [ DEBUG] - recompute : False
[2024-06-26 18:05:30,662] [ DEBUG] - remove_unused_columns : True
[2024-06-26 18:05:30,662] [ DEBUG] - report_to : ['visualdl']
[2024-06-26 18:05:30,662] [ DEBUG] - resume_from_checkpoint : None
[2024-06-26 18:05:30,662] [ DEBUG] - round_type : round
[2024-06-26 18:05:30,662] [ DEBUG] - run_name : ./checkpoint/model_best
[2024-06-26 18:05:30,662] [ DEBUG] - save_on_each_node : False
[2024-06-26 18:05:30,662] [ DEBUG] - save_sharded_model : False
[2024-06-26 18:05:30,662] [ DEBUG] - save_steps : 100
[2024-06-26 18:05:30,662] [ DEBUG] - save_strategy : IntervalStrategy.STEPS
[2024-06-26 18:05:30,663] [ DEBUG] - save_total_limit : 1
[2024-06-26 18:05:30,663] [ DEBUG] - scale_loss : 32768
[2024-06-26 18:05:30,663] [ DEBUG] - seed : 42
[2024-06-26 18:05:30,663] [ DEBUG] - sep_parallel_degree : -1
[2024-06-26 18:05:30,663] [ DEBUG] - sharding : []
[2024-06-26 18:05:30,663] [ DEBUG] - sharding_degree : -1
[2024-06-26 18:05:30,663] [ DEBUG] - sharding_parallel_config :
[2024-06-26 18:05:30,663] [ DEBUG] - sharding_parallel_degree : -1
[2024-06-26 18:05:30,663] [ DEBUG] - sharding_parallel_rank : 0
[2024-06-26 18:05:30,663] [ DEBUG] - should_load_dataset : True
[2024-06-26 18:05:30,663] [ DEBUG] - should_load_sharding_stage1_model: False
[2024-06-26 18:05:30,663] [ DEBUG] - should_log : True
[2024-06-26 18:05:30,663] [ DEBUG] - should_save : True
[2024-06-26 18:05:30,663] [ DEBUG] - should_save_model_state : True
[2024-06-26 18:05:30,664] [ DEBUG] - should_save_sharding_stage1_model: False
[2024-06-26 18:05:30,664] [ DEBUG] - skip_memory_metrics : True
[2024-06-26 18:05:30,664] [ DEBUG] - skip_profile_timer : True
[2024-06-26 18:05:30,664] [ DEBUG] - strategy : dynabert+ptq
[2024-06-26 18:05:30,664] [ DEBUG] - tensor_parallel_config :
[2024-06-26 18:05:30,664] [ DEBUG] - tensor_parallel_degree : -1
[2024-06-26 18:05:30,664] [ DEBUG] - tensor_parallel_rank : 0
[2024-06-26 18:05:30,664] [ DEBUG] - to_static : False
[2024-06-26 18:05:30,664] [ DEBUG] - train_batch_size : 16
[2024-06-26 18:05:30,664] [ DEBUG] - unified_checkpoint : False
[2024-06-26 18:05:30,664] [ DEBUG] - unified_checkpoint_config :
[2024-06-26 18:05:30,664] [ DEBUG] - use_auto_parallel : False
[2024-06-26 18:05:30,664] [ DEBUG] - use_hybrid_parallel : False
[2024-06-26 18:05:30,664] [ DEBUG] - use_pact : True
[2024-06-26 18:05:30,665] [ DEBUG] - warmup_ratio : 0.1
[2024-06-26 18:05:30,665] [ DEBUG] - warmup_steps : 0
[2024-06-26 18:05:30,665] [ DEBUG] - weight_decay : 0.0
[2024-06-26 18:05:30,665] [ DEBUG] - weight_name_suffix : None
[2024-06-26 18:05:30,665] [ DEBUG] - weight_quantize_type : channel_wise_abs_max
[2024-06-26 18:05:30,665] [ DEBUG] - width_mult_list : None
[2024-06-26 18:05:30,665] [ DEBUG] - world_size : 1
[2024-06-26 18:05:30,665] [ DEBUG] -
[2024-06-26 18:05:30,666] [ INFO] - Starting training from resume_from_checkpoint : None
/root/anaconda3/envs/py39_ppner_2_7_2/lib/python3.9/site-packages/paddle/distributed/parallel.py:410: UserWarning: The program will return to single-card operation. Please check 1, whether you use spawn or fleetrun to start the program. 2, Whether it is a multi-card program. 3, Is the current environment multi-card.
warnings.warn(
[2024-06-26 18:05:30,667] [ INFO] - [timelog] checkpoint loading time: 0.00s (2024-06-26 18:05:30)
[2024-06-26 18:05:30,667] [ INFO] - ***** Running training *****
[2024-06-26 18:05:30,667] [ INFO] - Num examples = 68
[2024-06-26 18:05:30,667] [ INFO] - Num Epochs = 100
[2024-06-26 18:05:30,667] [ INFO] - Instantaneous batch size per device = 16
[2024-06-26 18:05:30,668] [ INFO] - Total train batch size (w. parallel, distributed & accumulation) = 16
[2024-06-26 18:05:30,668] [ INFO] - Gradient Accumulation steps = 1
[2024-06-26 18:05:30,668] [ INFO] - Total optimization steps = 500
[2024-06-26 18:05:30,668] [ INFO] - Total num train samples = 6,800
[2024-06-26 18:05:30,670] [ DEBUG] - Number of trainable parameters = 117,946,370 (per device)
/root/anaconda3/envs/py39_ppner_2_7_2/lib/python3.9/site-packages/paddlenlp/transformers/tokenizer_utils_base.py:2538: FutureWarning: The `max_seq_len` argument is deprecated and will be removed in a future version, please use `max_length` instead.
warnings.warn(
/root/anaconda3/envs/py39_ppner_2_7_2/lib/python3.9/site-packages/paddlenlp/transformers/tokenizer_utils_base.py:1938: FutureWarning: The `pad_to_max_length` argument is deprecated and will be removed in a future version, use `padding=True` or `padding='longest'` to pad to the longest sequence in the batch, or use `padding='max_length'` to pad to a max length. In this case, you can give a specific length with `max_length` (e.g. `max_length=45`) or leave max_length to None to pad to the maximal input size of the model (e.g. 512 for Bert).
warnings.warn(
[2024-06-26 18:05:34,727] [ INFO] - loss: 0.00208795, learning_rate: 1e-05, global_step: 10, interval_runtime: 4.0565, interval_samples_per_second: 39.44293710564109, interval_steps_per_second: 2.4651835691025683, progress_or_epoch: 2.0
[2024-06-26 18:05:37,278] [ INFO] - loss: 0.00129266, learning_rate: 1e-05, global_step: 20, interval_runtime: 2.55, interval_samples_per_second: 62.74611413497253, interval_steps_per_second: 3.921632133435783, progress_or_epoch: 4.0
[2024-06-26 18:05:39,833] [ INFO] - loss: 0.00092623, learning_rate: 1e-05, global_step: 30, interval_runtime: 2.5559, interval_samples_per_second: 62.599374107598685, interval_steps_per_second: 3.912460881724918, progress_or_epoch: 6.0
[2024-06-26 18:05:42,413] [ INFO] - loss: 0.00058895, learning_rate: 1e-05, global_step: 40, interval_runtime: 2.5801, interval_samples_per_second: 62.012657597381434, interval_steps_per_second: 3.8757910998363396, progress_or_epoch: 8.0
[2024-06-26 18:05:44,946] [ INFO] - loss: 0.00047016, learning_rate: 1e-05, global_step: 50, interval_runtime: 2.5327, interval_samples_per_second: 63.17332385701746, interval_steps_per_second: 3.948332741063591, progress_or_epoch: 10.0
[2024-06-26 18:05:47,514] [ INFO] - loss: 0.00031726, learning_rate: 1e-05, global_step: 60, interval_runtime: 2.5675, interval_samples_per_second: 62.316911606640005, interval_steps_per_second: 3.8948069754150003, progress_or_epoch: 12.0
[2024-06-26 18:05:50,081] [ INFO] - loss: 0.00024869, learning_rate: 1e-05, global_step: 70, interval_runtime: 2.5673, interval_samples_per_second: 62.32305770213569, interval_steps_per_second: 3.8951911063834808, progress_or_epoch: 14.0
[2024-06-26 18:05:52,658] [ INFO] - loss: 0.00041933, learning_rate: 1e-05, global_step: 80, interval_runtime: 2.5765, interval_samples_per_second: 62.10074334302723, interval_steps_per_second: 3.881296458939202, progress_or_epoch: 16.0
[2024-06-26 18:05:55,223] [ INFO] - loss: 0.00017784, learning_rate: 1e-05, global_step: 90, interval_runtime: 2.5647, interval_samples_per_second: 62.38556549647052, interval_steps_per_second: 3.8990978435294075, progress_or_epoch: 18.0
[2024-06-26 18:05:57,816] [ INFO] - loss: 0.00019145, learning_rate: 1e-05, global_step: 100, interval_runtime: 2.5931, interval_samples_per_second: 61.70326891265709, interval_steps_per_second: 3.856454307041068, progress_or_epoch: 20.0
[2024-06-26 18:05:57,817] [ INFO] - ***** Running Evaluation *****
[2024-06-26 18:05:57,817] [ INFO] - Num examples = 4
[2024-06-26 18:05:57,817] [ INFO] - Total prediction steps = 1
[2024-06-26 18:05:57,817] [ INFO] - Pre device batch size = 16
[2024-06-26 18:05:57,818] [ INFO] - Total Batch size = 16
[2024-06-26 18:05:57,889] [ INFO] - eval_loss: 0.005224619060754776, eval_precision: 0.4, eval_recall: 0.4, eval_f1: 0.4000000000000001, eval_runtime: 0.0705, eval_samples_per_second: 56.74937846074747, eval_steps_per_second: 14.187344615186868, progress_or_epoch: 20.0
[2024-06-26 18:05:57,889] [ INFO] - Saving model checkpoint to ./checkpoint/model_best/checkpoint-100
[2024-06-26 18:05:57,890] [ INFO] - tokenizer config file saved in ./checkpoint/model_best/checkpoint-100/tokenizer_config.json
[2024-06-26 18:05:57,890] [ INFO] - Special tokens file saved in ./checkpoint/model_best/checkpoint-100/special_tokens_map.json
[2024-06-26 18:05:57,902] [ INFO] - Configuration saved in ./checkpoint/model_best/checkpoint-100/config.json
[2024-06-26 18:06:00,646] [ INFO] - Model weights saved in ./checkpoint/model_best/checkpoint-100/model_state.pdparams
[2024-06-26 18:06:00,648] [ INFO] - Saving optimizer files.
[2024-06-26 18:06:07,375] [ INFO] - [timelog] checkpoint saving time: 9.48s (2024-06-26 18:06:07)
[2024-06-26 18:06:09,924] [ INFO] - loss: 0.00022237, learning_rate: 1e-05, global_step: 110, interval_runtime: 12.1083, interval_samples_per_second: 13.21409688201894, interval_steps_per_second: 0.8258810551261837, progress_or_epoch: 22.0
[2024-06-26 18:06:12,478] [ INFO] - loss: 0.00017666, learning_rate: 1e-05, global_step: 120, interval_runtime: 2.5542, interval_samples_per_second: 62.64218200480218, interval_steps_per_second: 3.9151363753001363, progress_or_epoch: 24.0
[2024-06-26 18:06:15,065] [ INFO] - loss: 0.00024367, learning_rate: 1e-05, global_step: 130, interval_runtime: 2.587, interval_samples_per_second: 61.84880269252828, interval_steps_per_second: 3.8655501682830176, progress_or_epoch: 26.0
[2024-06-26 18:06:17,695] [ INFO] - loss: 0.0001915, learning_rate: 1e-05, global_step: 140, interval_runtime: 2.6297, interval_samples_per_second: 60.84334418667863, interval_steps_per_second: 3.8027090116674143, progress_or_epoch: 28.0
[2024-06-26 18:06:20,298] [ INFO] - loss: 0.00017253, learning_rate: 1e-05, global_step: 150, interval_runtime: 2.5998, interval_samples_per_second: 61.542086843031, interval_steps_per_second: 3.8463804276894376, progress_or_epoch: 30.0
[2024-06-26 18:06:22,895] [ INFO] - loss: 0.00014841, learning_rate: 1e-05, global_step: 160, interval_runtime: 2.6008, interval_samples_per_second: 61.51999965531349, interval_steps_per_second: 3.844999978457093, progress_or_epoch: 32.0
[2024-06-26 18:06:25,498] [ INFO] - loss: 0.00012293, learning_rate: 1e-05, global_step: 170, interval_runtime: 2.6024, interval_samples_per_second: 61.48274409158785, interval_steps_per_second: 3.8426715057242404, progress_or_epoch: 34.0
[2024-06-26 18:06:28,099] [ INFO] - loss: 0.00010312, learning_rate: 1e-05, global_step: 180, interval_runtime: 2.601, interval_samples_per_second: 61.514997685471634, interval_steps_per_second: 3.844687355341977, progress_or_epoch: 36.0
[2024-06-26 18:06:30,698] [ INFO] - loss: 9.851e-05, learning_rate: 1e-05, global_step: 190, interval_runtime: 2.5992, interval_samples_per_second: 61.5571084161852, interval_steps_per_second: 3.847319276011575, progress_or_epoch: 38.0
[2024-06-26 18:06:33,295] [ INFO] - loss: 0.00013552, learning_rate: 1e-05, global_step: 200, interval_runtime: 2.5971, interval_samples_per_second: 61.60644154857953, interval_steps_per_second: 3.8504025967862208, progress_or_epoch: 40.0
[2024-06-26 18:06:33,295] [ INFO] - ***** Running Evaluation *****
[2024-06-26 18:06:33,295] [ INFO] - Num examples = 4
[2024-06-26 18:06:33,295] [ INFO] - Total prediction steps = 1
[2024-06-26 18:06:33,295] [ INFO] - Pre device batch size = 16
[2024-06-26 18:06:33,295] [ INFO] - Total Batch size = 16
[2024-06-26 18:06:33,358] [ INFO] - eval_loss: 0.006688571535050869, eval_precision: 0.4, eval_recall: 0.4, eval_f1: 0.4000000000000001, eval_runtime: 0.0621, eval_samples_per_second: 64.41700614712398, eval_steps_per_second: 16.104251536780996, progress_or_epoch: 40.0
[2024-06-26 18:06:33,359] [ INFO] - Saving model checkpoint to ./checkpoint/model_best/checkpoint-200
[2024-06-26 18:06:33,360] [ INFO] - tokenizer config file saved in ./checkpoint/model_best/checkpoint-200/tokenizer_config.json
[2024-06-26 18:06:33,360] [ INFO] - Special tokens file saved in ./checkpoint/model_best/checkpoint-200/special_tokens_map.json
[2024-06-26 18:06:33,370] [ INFO] - Configuration saved in ./checkpoint/model_best/checkpoint-200/config.json
[2024-06-26 18:06:34,574] [ INFO] - Model weights saved in ./checkpoint/model_best/checkpoint-200/model_state.pdparams
[2024-06-26 18:06:34,574] [ INFO] - Saving optimizer files.
[2024-06-26 18:06:39,080] [ INFO] - [timelog] checkpoint saving time: 5.72s (2024-06-26 18:06:39)
[2024-06-26 18:06:41,672] [ INFO] - loss: 0.00011128, learning_rate: 1e-05, global_step: 210, interval_runtime: 8.3772, interval_samples_per_second: 19.099448736351285, interval_steps_per_second: 1.1937155460219553, progress_or_epoch: 42.0
[2024-06-26 18:06:44,288] [ INFO] - loss: 0.00011245, learning_rate: 1e-05, global_step: 220, interval_runtime: 2.6149, interval_samples_per_second: 61.18868160731556, interval_steps_per_second: 3.8242926004572224, progress_or_epoch: 44.0
[2024-06-26 18:06:46,911] [ INFO] - loss: 0.00012268, learning_rate: 1e-05, global_step: 230, interval_runtime: 2.6234, interval_samples_per_second: 60.98920142963071, interval_steps_per_second: 3.8118250893519194, progress_or_epoch: 46.0
[2024-06-26 18:06:49,537] [ INFO] - loss: 0.00013804, learning_rate: 1e-05, global_step: 240, interval_runtime: 2.6259, interval_samples_per_second: 60.93070388366043, interval_steps_per_second: 3.8081689927287767, progress_or_epoch: 48.0
[2024-06-26 18:06:52,117] [ INFO] - loss: 0.00020081, learning_rate: 1e-05, global_step: 250, interval_runtime: 2.5809, interval_samples_per_second: 61.99492724918558, interval_steps_per_second: 3.8746829530740987, progress_or_epoch: 50.0
[2024-06-26 18:06:54,753] [ INFO] - loss: 0.00014258, learning_rate: 1e-05, global_step: 260, interval_runtime: 2.6358, interval_samples_per_second: 60.702937625070206, interval_steps_per_second: 3.793933601566888, progress_or_epoch: 52.0
[2024-06-26 18:06:57,369] [ INFO] - loss: 0.00012738, learning_rate: 1e-05, global_step: 270, interval_runtime: 2.6164, interval_samples_per_second: 61.153369707036596, interval_steps_per_second: 3.8220856066897873, progress_or_epoch: 54.0
[2024-06-26 18:07:00,006] [ INFO] - loss: 0.00010497, learning_rate: 1e-05, global_step: 280, interval_runtime: 2.6373, interval_samples_per_second: 60.669083466377856, interval_steps_per_second: 3.791817716648616, progress_or_epoch: 56.0
[2024-06-26 18:07:02,657] [ INFO] - loss: 0.00010167, learning_rate: 1e-05, global_step: 290, interval_runtime: 2.6499, interval_samples_per_second: 60.38069724825452, interval_steps_per_second: 3.7737935780159075, progress_or_epoch: 58.0
[2024-06-26 18:07:05,297] [ INFO] - loss: 0.000171, learning_rate: 1e-05, global_step: 300, interval_runtime: 2.6399, interval_samples_per_second: 60.60732256501828, interval_steps_per_second: 3.7879576603136424, progress_or_epoch: 60.0
[2024-06-26 18:07:05,298] [ INFO] - ***** Running Evaluation *****
[2024-06-26 18:07:05,298] [ INFO] - Num examples = 4
[2024-06-26 18:07:05,298] [ INFO] - Total prediction steps = 1
[2024-06-26 18:07:05,298] [ INFO] - Pre device batch size = 16
[2024-06-26 18:07:05,298] [ INFO] - Total Batch size = 16
[2024-06-26 18:07:05,357] [ INFO] - eval_loss: 0.007026550825685263, eval_precision: 0.4, eval_recall: 0.4, eval_f1: 0.4000000000000001, eval_runtime: 0.0582, eval_samples_per_second: 68.67042956838507, eval_steps_per_second: 17.16760739209627, progress_or_epoch: 60.0
[2024-06-26 18:07:05,358] [ INFO] - Saving model checkpoint to ./checkpoint/model_best/checkpoint-300
[2024-06-26 18:07:05,358] [ INFO] - tokenizer config file saved in ./checkpoint/model_best/checkpoint-300/tokenizer_config.json
[2024-06-26 18:07:05,358] [ INFO] - Special tokens file saved in ./checkpoint/model_best/checkpoint-300/special_tokens_map.json
[2024-06-26 18:07:05,364] [ INFO] - Configuration saved in ./checkpoint/model_best/checkpoint-300/config.json
[2024-06-26 18:07:07,995] [ INFO] - Model weights saved in ./checkpoint/model_best/checkpoint-300/model_state.pdparams
[2024-06-26 18:07:07,996] [ INFO] - Saving optimizer files.
[2024-06-26 18:07:11,005] [ INFO] - [timelog] checkpoint saving time: 5.64s (2024-06-26 18:07:11)
[2024-06-26 18:07:13,640] [ INFO] - loss: 9.265e-05, learning_rate: 1e-05, global_step: 310, interval_runtime: 8.3438, interval_samples_per_second: 19.17600366118997, interval_steps_per_second: 1.1985002288243731, progress_or_epoch: 62.0
[2024-06-26 18:07:16,285] [ INFO] - loss: 0.00012613, learning_rate: 1e-05, global_step: 320, interval_runtime: 2.6441, interval_samples_per_second: 60.512013700058034, interval_steps_per_second: 3.782000856253627, progress_or_epoch: 64.0
[2024-06-26 18:07:18,887] [ INFO] - loss: 7.656e-05, learning_rate: 1e-05, global_step: 330, interval_runtime: 2.603, interval_samples_per_second: 61.4676855948393, interval_steps_per_second: 3.8417303496774564, progress_or_epoch: 66.0
[2024-06-26 18:07:21,539] [ INFO] - loss: 9.599e-05, learning_rate: 1e-05, global_step: 340, interval_runtime: 2.6517, interval_samples_per_second: 60.33811856609532, interval_steps_per_second: 3.7711324103809574, progress_or_epoch: 68.0
[2024-06-26 18:07:24,185] [ INFO] - loss: 9.96e-05, learning_rate: 1e-05, global_step: 350, interval_runtime: 2.6462, interval_samples_per_second: 60.463986738962795, interval_steps_per_second: 3.7789991711851747, progress_or_epoch: 70.0
[2024-06-26 18:07:26,830] [ INFO] - loss: 9.502e-05, learning_rate: 1e-05, global_step: 360, interval_runtime: 2.6438, interval_samples_per_second: 60.51813635974227, interval_steps_per_second: 3.782383522483892, progress_or_epoch: 72.0
[2024-06-26 18:07:29,486] [ INFO] - loss: 8.396e-05, learning_rate: 1e-05, global_step: 370, interval_runtime: 2.6568, interval_samples_per_second: 60.22368303445632, interval_steps_per_second: 3.76398018965352, progress_or_epoch: 74.0
[2024-06-26 18:07:32,128] [ INFO] - loss: 0.000107, learning_rate: 1e-05, global_step: 380, interval_runtime: 2.6414, interval_samples_per_second: 60.57406701856363, interval_steps_per_second: 3.785879188660227, progress_or_epoch: 76.0
[2024-06-26 18:07:34,774] [ INFO] - loss: 0.00015859, learning_rate: 1e-05, global_step: 390, interval_runtime: 2.646, interval_samples_per_second: 60.46891186644317, interval_steps_per_second: 3.7793069916526982, progress_or_epoch: 78.0
[2024-06-26 18:07:37,422] [ INFO] - loss: 8.44e-05, learning_rate: 1e-05, global_step: 400, interval_runtime: 2.6486, interval_samples_per_second: 60.40979254722639, interval_steps_per_second: 3.7756120342016493, progress_or_epoch: 80.0
[2024-06-26 18:07:37,423] [ INFO] - ***** Running Evaluation *****
[2024-06-26 18:07:37,423] [ INFO] - Num examples = 4
[2024-06-26 18:07:37,423] [ INFO] - Total prediction steps = 1
[2024-06-26 18:07:37,423] [ INFO] - Pre device batch size = 16
[2024-06-26 18:07:37,423] [ INFO] - Total Batch size = 16
[2024-06-26 18:07:37,482] [ INFO] - eval_loss: 0.00755023630335927, eval_precision: 0.4, eval_recall: 0.4, eval_f1: 0.4000000000000001, eval_runtime: 0.0578, eval_samples_per_second: 69.19010227647641, eval_steps_per_second: 17.297525569119102, progress_or_epoch: 80.0
[2024-06-26 18:07:37,482] [ INFO] - Saving model checkpoint to ./checkpoint/model_best/checkpoint-400
[2024-06-26 18:07:37,483] [ INFO] - tokenizer config file saved in ./checkpoint/model_best/checkpoint-400/tokenizer_config.json
[2024-06-26 18:07:37,483] [ INFO] - Special tokens file saved in ./checkpoint/model_best/checkpoint-400/special_tokens_map.json
[2024-06-26 18:07:37,489] [ INFO] - Configuration saved in ./checkpoint/model_best/checkpoint-400/config.json
[2024-06-26 18:07:38,671] [ INFO] - Model weights saved in ./checkpoint/model_best/checkpoint-400/model_state.pdparams
[2024-06-26 18:07:38,671] [ INFO] - Saving optimizer files.
[2024-06-26 18:07:41,730] [ INFO] - [timelog] checkpoint saving time: 4.24s (2024-06-26 18:07:41)
[2024-06-26 18:07:44,394] [ INFO] - loss: 8.615e-05, learning_rate: 1e-05, global_step: 410, interval_runtime: 6.9719, interval_samples_per_second: 22.949178353528108, interval_steps_per_second: 1.4343236470955067, progress_or_epoch: 82.0
[2024-06-26 18:07:47,030] [ INFO] - loss: 8.146e-05, learning_rate: 1e-05, global_step: 420, interval_runtime: 2.6363, interval_samples_per_second: 60.691782269542095, interval_steps_per_second: 3.793236391846381, progress_or_epoch: 84.0
[2024-06-26 18:07:49,660] [ INFO] - loss: 0.0001024, learning_rate: 1e-05, global_step: 430, interval_runtime: 2.6297, interval_samples_per_second: 60.84334970295866, interval_steps_per_second: 3.8027093564349164, progress_or_epoch: 86.0
[2024-06-26 18:07:52,296] [ INFO] - loss: 0.00011797, learning_rate: 1e-05, global_step: 440, interval_runtime: 2.6357, interval_samples_per_second: 60.70518895680559, interval_steps_per_second: 3.7940743098003495, progress_or_epoch: 88.0
[2024-06-26 18:07:54,939] [ INFO] - loss: 0.00014716, learning_rate: 1e-05, global_step: 450, interval_runtime: 2.6435, interval_samples_per_second: 60.52505176192572, interval_steps_per_second: 3.7828157351203573, progress_or_epoch: 90.0
[2024-06-26 18:07:57,583] [ INFO] - loss: 7.592e-05, learning_rate: 1e-05, global_step: 460, interval_runtime: 2.6437, interval_samples_per_second: 60.520177521644605, interval_steps_per_second: 3.782511095102788, progress_or_epoch: 92.0
[2024-06-26 18:08:00,303] [ INFO] - loss: 8.616e-05, learning_rate: 1e-05, global_step: 470, interval_runtime: 2.7201, interval_samples_per_second: 58.82132721602211, interval_steps_per_second: 3.676332951001382, progress_or_epoch: 94.0
[2024-06-26 18:08:02,968] [ INFO] - loss: 7.984e-05, learning_rate: 1e-05, global_step: 480, interval_runtime: 2.6645, interval_samples_per_second: 60.048504909189305, interval_steps_per_second: 3.7530315568243315, progress_or_epoch: 96.0
[2024-06-26 18:08:05,618] [ INFO] - loss: 7.743e-05, learning_rate: 1e-05, global_step: 490, interval_runtime: 2.6507, interval_samples_per_second: 60.3610372488141, interval_steps_per_second: 3.7725648280508812, progress_or_epoch: 98.0
[2024-06-26 18:08:08,285] [ INFO] - loss: 7.793e-05, learning_rate: 1e-05, global_step: 500, interval_runtime: 2.6661, interval_samples_per_second: 60.01194714924565, interval_steps_per_second: 3.750746696827853, progress_or_epoch: 100.0
[2024-06-26 18:08:08,285] [ INFO] - ***** Running Evaluation *****
[2024-06-26 18:08:08,285] [ INFO] - Num examples = 4
[2024-06-26 18:08:08,286] [ INFO] - Total prediction steps = 1
[2024-06-26 18:08:08,286] [ INFO] - Pre device batch size = 16
[2024-06-26 18:08:08,286] [ INFO] - Total Batch size = 16
[2024-06-26 18:08:08,344] [ INFO] - eval_loss: 0.007735834456980228, eval_precision: 0.4, eval_recall: 0.4, eval_f1: 0.4000000000000001, eval_runtime: 0.0574, eval_samples_per_second: 69.71943866123114, eval_steps_per_second: 17.429859665307784, progress_or_epoch: 100.0
[2024-06-26 18:08:08,344] [ INFO] - Saving model checkpoint to ./checkpoint/model_best/checkpoint-500
[2024-06-26 18:08:08,345] [ INFO] - tokenizer config file saved in ./checkpoint/model_best/checkpoint-500/tokenizer_config.json
[2024-06-26 18:08:08,345] [ INFO] - Special tokens file saved in ./checkpoint/model_best/checkpoint-500/special_tokens_map.json
[2024-06-26 18:08:08,351] [ INFO] - Configuration saved in ./checkpoint/model_best/checkpoint-500/config.json
[2024-06-26 18:08:09,518] [ INFO] - Model weights saved in ./checkpoint/model_best/checkpoint-500/model_state.pdparams
[2024-06-26 18:08:09,518] [ INFO] - Saving optimizer files.
[2024-06-26 18:08:11,773] [ INFO] - [timelog] checkpoint saving time: 3.42s (2024-06-26 18:08:11)
[2024-06-26 18:08:11,774] [ INFO] -
Training completed.

[2024-06-26 18:08:11,774] [ INFO] - Loading best model from ./checkpoint/model_best/checkpoint-100 (score: 0.4000000000000001).
[2024-06-26 18:08:12,116] [ INFO] - set state-dict :([], [])
[2024-06-26 18:08:12,118] [ INFO] - train_runtime: 161.4472, train_samples_per_second: 42.1190341985795, train_steps_per_second: 3.0969878087190805, train_loss: 0.00023241847997996956, progress_or_epoch: 100.0
[2024-06-26 18:08:12,134] [ INFO] - Saving model checkpoint to ./checkpoint/model_best
[2024-06-26 18:08:12,135] [ INFO] - tokenizer config file saved in ./checkpoint/model_best/tokenizer_config.json
[2024-06-26 18:08:12,135] [ INFO] - Special tokens file saved in ./checkpoint/model_best/special_tokens_map.json
[2024-06-26 18:08:12,142] [ INFO] - Configuration saved in ./checkpoint/model_best/config.json
[2024-06-26 18:08:13,814] [ INFO] - Model weights saved in ./checkpoint/model_best/model_state.pdparams
[2024-06-26 18:08:13,815] [ INFO] - ***** train metrics *****
[2024-06-26 18:08:13,815] [ INFO] - progress_or_epoch = 100.0
[2024-06-26 18:08:13,815] [ INFO] - train_loss = 0.0002
[2024-06-26 18:08:13,815] [ INFO] - train_runtime = 0:02:41.44
[2024-06-26 18:08:13,815] [ INFO] - train_samples_per_second = 42.119
[2024-06-26 18:08:13,815] [ INFO] - train_steps_per_second = 3.097
[2024-06-26 18:08:13,820] [ INFO] - ***** Running Evaluation *****
[2024-06-26 18:08:13,820] [ INFO] - Num examples = 4
[2024-06-26 18:08:13,820] [ INFO] - Total prediction steps = 1
[2024-06-26 18:08:13,820] [ INFO] - Pre device batch size = 16
[2024-06-26 18:08:13,820] [ INFO] - Total Batch size = 16
[2024-06-26 18:08:13,888] [ INFO] - eval_loss: 0.005224619060754776, eval_precision: 0.4, eval_recall: 0.4, eval_f1: 0.4000000000000001, eval_runtime: 0.0687, eval_samples_per_second: 58.19804494272889, eval_steps_per_second: 14.549511235682223, progress_or_epoch: 100.0
[2024-06-26 18:08:13,889] [ INFO] - ***** eval metrics *****
[2024-06-26 18:08:13,889] [ INFO] - eval_f1 = 0.4
[2024-06-26 18:08:13,889] [ INFO] - eval_loss = 0.0052
[2024-06-26 18:08:13,889] [ INFO] - eval_precision = 0.4
[2024-06-26 18:08:13,889] [ INFO] - eval_recall = 0.4
[2024-06-26 18:08:13,889] [ INFO] - eval_runtime = 0:00:00.06
[2024-06-26 18:08:13,890] [ INFO] - eval_samples_per_second = 58.198
[2024-06-26 18:08:13,890] [ INFO] - eval_steps_per_second = 14.5495
[2024-06-26 18:08:13,890] [ INFO] - progress_or_epoch = 100.0
/root/anaconda3/envs/py39_ppner_2_7_2/lib/python3.9/site-packages/paddle/jit/dy2static/program_translator.py:712: UserWarning: full_graph=False don't support input_spec arguments. It will not produce any effect.
You can set full_graph=True, then you can assign input spec.

warnings.warn(
[2024-06-26 18:08:13,895] [ INFO] - Exporting inference model to ./checkpoint/model_best/model
I0626 18:08:15.824677 285242 program_interpreter.cc:212] New Executor is Running.
[2024-06-26 18:08:18,295] [ INFO] - Inference model exported.
[2024-06-26 18:08:18,297] [ INFO] - tokenizer config file saved in ./checkpoint/model_best/tokenizer_config.json
[2024-06-26 18:08:18,297] [ INFO] - Special tokens file saved in ./checkpoint/model_best/special_tokens_map.json
LAUNCH INFO 2024-06-26 18:08:20,044 Pod completed
LAUNCH INFO 2024-06-26 18:08:20,045 Exit code 0

Model Files

(py39_ppner_2_7_2) [root@jdz uie]# ll checkpoint/model_best/
total 919580
-rw-r--r-- 1 root root 207 Jun 26 18:08 all_results.json
drwxr-xr-x 2 root root 251 Jun 26 18:06 checkpoint-100
drwxr-xr-x 2 root root 251 Jun 26 18:06 checkpoint-200
drwxr-xr-x 2 root root 251 Jun 26 18:07 checkpoint-300
drwxr-xr-x 2 root root 251 Jun 26 18:07 checkpoint-400
drwxr-xr-x 2 root root 251 Jun 26 18:08 checkpoint-500
-rw-r--r-- 1 root root 559 Jun 26 18:08 config.json
-rw-r--r-- 1 root root 469428391 Jun 26 18:08 model.pdiparams
-rw-r--r-- 1 root root 17581 Jun 26 18:08 model.pdiparams.info
-rw-r--r-- 1 root root 153135 Jun 26 18:08 model.pdmodel
-rw-r--r-- 1 root root 471806543 Jun 26 18:12 model_state.pdparams
drwxr-xr-x 7 root root 136 Jun 26 18:05 runs
-rw-r--r-- 1 root root 112 Jun 26 18:08 special_tokens_map.json
drwxr-xr-x 2 root root 90 Jun 26 18:12 static
-rw-r--r-- 1 root root 197 Jun 26 18:08 tokenizer_config.json
-rw-r--r-- 1 root root 16736 Jun 26 18:08 trainer_state.json
-rw-r--r-- 1 root root 2598 Jun 26 18:08 training_args.bin
-rw-r--r-- 1 root root 207 Jun 26 18:08 train_results.json
-rw-r--r-- 1 root root 186807 Jun 26 18:08 vocab.txt
(py39_ppner_2_7_2) [root@jdz uie]#
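
In this listing, model.pdmodel and model.pdiparams are the static-graph inference model written by --do_export (the "Exporting inference model" step in the log), while model_state.pdparams holds the dynamic-graph training weights. A minimal sketch, assuming Paddle's standard inference API, that loads the exported model and lists its input tensor names:

import paddle.inference as paddle_infer

# Load the exported static-graph model from the listing above.
config = paddle_infer.Config(
    'checkpoint/model_best/model.pdmodel',
    'checkpoint/model_best/model.pdiparams',
)
predictor = paddle_infer.create_predictor(config)
print(predictor.get_input_names())  # inspect the model's expected input tensors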

Testing via the API

from pprint import pprint
from paddlenlp import Taskflow

# Extract 工程 entities together with their 工艺 relations
schema = [{'工程': ['工艺']}]

# Load the fine-tuned model from the local checkpoint directory
ie = Taskflow('information_extraction', schema=schema, task_path='./checkpoint/model_best')

ie.set_schema(schema)  # Reset schema
pprint(ie("""受力钢筋的接头形式应按设计要求采用,若设计无要求时,钢筋宜采用焊接接头和机械连接接头,也可采用绑扎接头。"""))

Test Results

[{'工程': [{'end': 9,
'probability': 0.9548929987860788,
'relations': {'工艺': [{'end': 42,
'probability': 0.2548182944884658,
'start': 36,
'text': '机械连接接头'}]},
'start': 0,
'text': '受力钢筋的接头形式'}]}]
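
The extracted relation here has a fairly low probability (about 0.25), which is consistent with the tiny training set. Taskflow also accepts a list of texts in a single call, which is handy for spot-checking several training sentences at once; a small usage sketch reusing the ie instance from above:

texts = [
    '钢筋调直宜采用机械方法,也可以采用冷拉方法',
    '多层非焊接钢筋骨架的各层钢筋之间,应保持层距准确,宜采用短钢筋支垫。',
]
# One result dict is returned per input sentence.
pprint(ie(texts))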
