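Part Number: TDA4VM
I compiled a lane detection model with the edge-ai-tools toolchain. After deploying it on the EVM, the accuracy dropped significantly. What could be causing this?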
Shine:
Are you using your own board or the TI EVM? What exactly were your compilation steps? Could you post your code and the log output?
,
yong xuan:
The TI EVM. I compiled the model by following the README in edge-ai-tools. The log is below.
Available execution providers : ['TIDLExecutionProvider', 'TIDLCompilationProvider', 'CPUExecutionProvider']
Running 1 Models - ['bestb']
Running_Model : bestb
Running shape inference on model ../../../models/public/culane_18.onnx
--------------------2--------------------
tidl_tools_path = /home/leo/code/edgeai-tidl-tools-master/examples/osrt_python/ort/tidl_tools
artifacts_folder = ../../../model-artifacts//bestb/
tidl_tensor_bits = 8
debug_level = 1
num_tidl_subgraphs = 16
tidl_denylist =
tidl_denylist_layer_name =
tidl_denylist_layer_type =
tidl_allowlist_layer_name =
model_type =
tidl_calibration_accuracy_level = 7
tidl_calibration_options:num_frames_calibration = 2
tidl_calibration_options:bias_calibration_iterations = 5
mixed_precision_factor = -1.000000
model_group_id = 0
power_of_2_quantization = 2
enable_high_resolution_optimization = 0
pre_batchnorm_fold = 1
add_data_convert_ops = 3
output_feature_16bit_names_list =
m_params_16bit_names_list =
reserved_compile_constraints_flag = 1601
ti_internal_reserved_1 =
****** WARNING : Network not identified as Object Detection network : (1) Ignore if network is not Object Detection network (2) If network is Object Detection network, please specify "model_type":"OD" as part of OSRT compilation options******
Supported TIDL layer type -- Conv -- Conv_0
Supported TIDL layer type -- Relu -- Relu_1
Supported TIDL layer type -- MaxPool -- MaxPool_2
Supported TIDL layer type -- Conv -- Conv_3
Supported TIDL layer type -- Relu -- Relu_4
Supported TIDL layer type -- Conv -- Conv_5
Supported TIDL layer type -- Add -- Add_6
Supported TIDL layer type -- Relu -- Relu_7
Supported TIDL layer type -- Conv -- Conv_8
Supported TIDL layer type -- Relu -- Relu_9
Supported TIDL layer type -- Conv -- Conv_10
Supported TIDL layer type -- Add -- Add_11
Supported TIDL layer type -- Relu -- Relu_12
Supported TIDL layer type -- Conv -- Conv_13
Supported TIDL layer type -- Relu -- Relu_14
Supported TIDL layer type -- Conv -- Conv_15
Supported TIDL layer type -- Conv -- Conv_16
Supported TIDL layer type -- Add -- Add_17
Supported TIDL layer type -- Relu -- Relu_18
Supported TIDL layer type -- Conv -- Conv_19
Supported TIDL layer type -- Relu -- Relu_20
Supported TIDL layer type -- Conv -- Conv_21
Supported TIDL layer type -- Add -- Add_22
Supported TIDL layer type -- Relu -- Relu_23
Supported TIDL layer type -- Conv -- Conv_24
Supported TIDL layer type -- Relu -- Relu_25
Supported TIDL layer type -- Conv -- Conv_26
Supported TIDL layer type -- Conv -- Conv_27
Supported TIDL layer type -- Add -- Add_28
Supported TIDL layer type -- Relu -- Relu_29
Supported TIDL layer type -- Conv -- Conv_30
Supported TIDL layer type -- Relu -- Relu_31
Supported TIDL layer type -- Conv -- Conv_32
Supported TIDL layer type -- Add -- Add_33
Supported TIDL layer type -- Relu -- Relu_34
Supported TIDL layer type -- Conv -- Conv_35
Supported TIDL layer type -- Relu -- Relu_36
Supported TIDL layer type -- Conv -- Conv_37
Supported TIDL layer type -- Conv -- Conv_38
Supported TIDL layer type -- Add -- Add_39
Supported TIDL layer type -- Relu -- Relu_40
Supported TIDL layer type -- Conv -- Conv_41
Supported TIDL layer type -- Relu -- Relu_42
Supported TIDL layer type -- Conv -- Conv_43
Supported TIDL layer type -- Add -- Add_44
Supported TIDL layer type -- Relu -- Relu_45
Supported TIDL layer type -- Conv -- Conv_46
Supported TIDL layer type -- Reshape -- Reshape_48
Supported TIDL layer type -- Gemm -- Gemm_49
Supported TIDL layer type -- Relu -- Relu_50
Supported TIDL layer type -- Gemm -- Gemm_51
Supported TIDL layer type -- Reshape -- Reshape_53
Preliminary subgraphs created = 1
Final number of subgraphs created are : 1, - Offloaded Nodes - 52, Total Nodes - 52
SUGGESTION -- [TIDL_InnerProductLayer] Size larger than 2048 * 2048 is not optimal.
Running runtimes graphviz - /home/leo/code/edgeai-tidl-tools-master/examples/osrt_python/ort/tidl_tools/tidl_graphVisualiser_runtimes.out ../../../model-artifacts//bestb//allowedNode.txt ../../../model-artifacts//bestb//tempDir/graphvizInfo.txt ../../../model-artifacts//bestb//tempDir/runtimes_visualization.svg
*** In TIDL_createStateImportFunc ***
Compute on node : TIDLExecutionProvider_TIDL_0_0
0, Conv, 3, 1, input, 201
1, Relu, 1, 1, 201, 129
2, MaxPool, 1, 1, 129, 130
3, Conv, 3, 1, 130, 204
4, Relu, 1, 1, 204, 133
5, Conv, 3, 1, 133, 207
6, Add, 2, 1, 207, 136
7, Relu, 1, 1, 136, 137
8, Conv, 3, 1, 137, 210
9, Relu, 1, 1, 210, 140
10, Conv, 3, 1, 140, 213
11, Add, 2, 1, 213, 143
12, Relu, 1, 1, 143, 144
13, Conv, 3, 1, 144, 216
14, Relu, 1, 1, 216, 147
15, Conv, 3, 1, 147, 219
16, Conv, 3, 1, 144, 222
17, Add, 2, 1, 219, 152
18, Relu, 1, 1, 152, 153
19, Conv, 3, 1, 153, 225
20, Relu, 1, 1, 225, 156
21, Conv, 3, 1, 156, 228
22, Add, 2, 1, 228, 159
23, Relu, 1, 1, 159, 160
24, Conv, 3, 1, 160, 231
25, Relu, 1, 1, 231, 163
26, Conv, 3, 1, 163, 234
27, Conv, 3, 1, 160, 237
28, Add, 2, 1, 234, 168
29, Relu, 1, 1, 168, 169
30, Conv, 3, 1, 169, 240
31, Relu, 1, 1, 240, 172
32, Conv, 3, 1, 172, 243
33, Add, 2, 1, 243, 175
34, Relu, 1, 1, 175, 176
35, Conv, 3, 1, 176, 246
36, Relu, 1, 1, 246, 179
37, Conv, 3, 1, 179, 249
38, Conv, 3, 1, 176, 252
39, Add, 2, 1, 249, 184
40, Relu, 1, 1, 184, 185
41, Conv, 3, 1, 185, 255
42, Relu, 1, 1, 255, 188
43, Conv, 3, 1, 188, 258
44, Add, 2, 1, 258, 191
45, Relu, 1, 1, 191, 192
46, Conv, 3, 1, 192, 193
47, Reshape, 2, 1, 193, 195
48, Gemm, 3, 1, 195, 196
49, Relu, 1, 1, 196, 197
50, Gemm, 3, 1, 197, 198
51, Reshape, 2, 1, 198, output
Input tensor name - input
Output tensor name - output
In TIDL_onnxRtImportInit subgraph_name=output
Layer 0, subgraph id output, name=output
Layer 1, subgraph id output, name=input
In TIDL_runtimesOptimizeNet: LayerIndex = 54, dataIndex = 53
************** Frame index 1 : Running float import *************
In TIDL_runtimesPostProcessNet
SUGGESTION: [TIDL_InnerProductLayer] Gemm_51 Size larger than 2048 * 2048 is not optimal.
****************************************************** 1 WARNINGS 0 ERRORS ******************************************************
************ in TIDL_subgraphRtCreate ************
The soft limit is 2048
The hard limit is 2048
MEM: Init ... !!!
MEM: Init ... Done !!!
0.0s: VX_ZONE_INIT:Enabled
0.5s: VX_ZONE_ERROR:Enabled
0.6s: VX_ZONE_WARNING:Enabled
0.1579s: VX_ZONE_INIT:[tivxInit:184] Initialization Done !!!
************ TIDL_subgraphRtCreate done ************
******* In TIDL_subgraphRtInvoke ********
Layer, Layer Cycles,kernelOnlyCycles, coreLoopCycles,LayerSetupCycles,dmaPipeupCycles, dmaPipeDownCycles, PrefetchCycles,copyKerCoeffCycles,LayerDeinitCycles,LastBlockCycles, paddingTrigger, paddingWait,LayerWithoutPad,LayerHandleCopy, BackupCycles, RestoreCycles,
1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
6, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
7, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
8, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
9, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
10, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
11, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
12, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
13, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
14, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
15, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
16, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
17, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
18, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
19, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
20, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
21, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
22, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
23, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
24, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
25, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
26, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
27, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
28, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
29, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
30, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
31, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
32, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
33, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
34, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
35, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
36, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
37, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
38, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
39, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
40, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
Sum of Layer Cycles 0
Sub Graph Stats 237.000000 16919031.000000 186.000000
******* TIDL_subgraphRtInvoke done ********
********** Frame Index 1 : Running float inference **********
******* In TIDL_subgraphRtInvoke ********
Layer, Layer Cycles,kernelOnlyCycles, coreLoopCycles,LayerSetupCycles,dmaPipeupCycles, dmaPipeDownCycles, PrefetchCycles,copyKerCoeffCycles,LayerDeinitCycles,LastBlockCycles, paddingTrigger, paddingWait,LayerWithoutPad,LayerHandleCopy, BackupCycles, RestoreCycles,
1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
6, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
7, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
8, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
9, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
10, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
11, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
12, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
13, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
14, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
15, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
16, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
17, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
18, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
19, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
20, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
21, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
22, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
23, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
24, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
25, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
26, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
27, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
28, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
29, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
30, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
31, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
32, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
33, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
34, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
35, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
36, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
37, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
38, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
39, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
40, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
Sum of Layer Cycles 0
Sub Graph Stats 319.000000 16811320.000000 131.000000
******* TIDL_subgraphRtInvoke done ********
********** Frame Index 2 : Running fixed point mode for calibration **********
In TIDL_runtimesPostProcessNet
~~~~~Running TIDL in PC emulation mode to collect Activations range for each layer~~~~~
Processing config file #0 : /home/leo/code/edgeai-tidl-tools-master/model-artifacts/bestb/tempDir/output_tidl_io_.qunat_stats_config.txt
Freeing memory for user provided Net
----------------------- TIDL Process with REF_ONLY FLOW ------------------------
# 0 . .. T 16715.82 .... ..... ... .... .....
# 1 . .. T 16788.76 .... ..... ... .... .....
~~~~~Running TIDL in PC emulation mode to collect Activations range for each layer~~~~~
Processing config file #0 : /home/leo/code/edgeai-tidl-tools-master/model-artifacts/bestb/tempDir/output_tidl_io_.qunat_stats_config.txt
Freeing memory for user provided Net
----------------------- TIDL Process with REF_ONLY FLOW ------------------------
# 0 . .. T 10349.83 .... ..... ... .... .....
# 1 . .. T 10283.69 .... ..... ... .... .....
***************** Calibration iteration number 0 completed ************************
~~~~~Running TIDL in PC emulation mode to collect Activations range for each layer~~~~~
Processing config file #0 : /home/leo/code/edgeai-tidl-tools-master/model-artifacts/bestb/tempDir/output_tidl_io_.qunat_stats_config.txt
Freeing memory for user provided Net
----------------------- TIDL Process with REF_ONLY FLOW ------------------------
# 0 . .. T 10276.73 .... ..... ... .... .....
# 1 . .. T 10274.64 .... ..... ... .... .....
***************** Calibration iteration number 1 completed ************************
~~~~~Running TIDL in PC emulation mode to collect Activations range for each layer~~~~~
Processing config file #0 : /home/leo/code/edgeai-tidl-tools-master/model-artifacts/bestb/tempDir/output_tidl_io_.qunat_stats_config.txt
Freeing memory for user provided Net
----------------------- TIDL Process with REF_ONLY FLOW ------------------------
# 0 . .. T 10418.43 .... ..... ... .... .....
# 1 . .. T 10384.01 .... ..... ... .... .....
***************** Calibration iteration number 2 completed ************************
~~~~~Running TIDL in PC emulation mode to collect Activations range for each layer~~~~~
Processing config file #0 : /home/leo/code/edgeai-tidl-tools-master/model-artifacts/bestb/tempDir/output_tidl_io_.qunat_stats_config.txt
Freeing memory for user provided Net
----------------------- TIDL Process with REF_ONLY FLOW ------------------------
# 0 . .. T 10272.75 .... ..... ... .... .....
# 1 . .. T 10359.20 .... ..... ... .... .....
***************** Calibration iteration number 3 completed ************************
~~~~~Running TIDL in PC emulation mode to collect Activations range for each layer~~~~~
Processing config file #0 : /home/leo/code/edgeai-tidl-tools-master/model-artifacts/bestb/tempDir/output_tidl_io_.qunat_stats_config.txt
Freeing memory for user provided Net
----------------------- TIDL Process with REF_ONLY FLOW ------------------------
# 0 . .. T 10398.03 .... ..... ... .... .....
# 1 . .. T 10254.21 .... ..... ... .... .....
***************** Calibration iteration number 4 completed ************************
------------------ Network Compiler Traces -----------------------------
NC running for device: 1
Running with OTF buffer optimizations
successful Memory allocation
Rerunning network compiler for reshape
------------------ Network Compiler Traces -----------------------------
NC running for device: 1
Running with OTF buffer optimizations
successful Memory allocation
SUGGESTION: [TIDL_InnerProductLayer] Gemm_51 Size larger than 2048 * 2048 is not optimal.
****************************************************** 1 WARNINGS 0 ERRORS ******************************************************
Completed_Model : 1, Name : bestb , Total time : 91370.38, Offload Time : 16865.18 , DDR RW MBs : 0, Output File : py_out_bestb_ppp.jpg
************ in TIDL_subgraphRtDelete ************
MEM: Deinit ... !!!
MEM: Alloc's: 26 alloc's of 743870520 bytes
MEM: Free's : 26 free's of 743870520 bytes
MEM: Open's : 0 allocs of 0 bytes
MEM: Deinit ... Done !!!
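The log above shows the model was compiled with 8-bit tensors, 2 calibration frames and 5 bias calibration iterations. A common first experiment when accuracy drops is to recompile with 16-bit tensors or with more calibration frames and see whether the gap narrows. The sketch below only builds the options dictionary. The key names are the ones I understand the edgeai-tidl-tools OSRT Python examples to use (`tensor_bits`, `accuracy_level`, `advanced_options:calibration_frames`, `advanced_options:calibration_iterations`), so please verify them against your release. `build_compile_options` is a hypothetical helper, not part of the tools.

```python
def build_compile_options(tidl_tools_path, artifacts_folder,
                          tensor_bits=8, calibration_frames=2,
                          calibration_iterations=5):
    """Build a TIDL compile-options dict (hypothetical helper).

    Defaults mirror the values printed in the log above. The key names
    are assumptions based on the edgeai-tidl-tools OSRT examples.
    """
    if tensor_bits not in (8, 16, 32):
        raise ValueError("tensor_bits must be 8, 16 or 32")
    if calibration_frames < 1 or calibration_iterations < 1:
        raise ValueError("calibration settings must be positive")
    return {
        "tidl_tools_path": tidl_tools_path,
        "artifacts_folder": artifacts_folder,
        "tensor_bits": tensor_bits,
        "accuracy_level": 1,
        "advanced_options:calibration_frames": calibration_frames,
        "advanced_options:calibration_iterations": calibration_iterations,
        "debug_level": 1,
    }


# Example: same paths as in the log, but 16-bit and more calibration frames.
options_16bit = build_compile_options(
    "/home/leo/code/edgeai-tidl-tools-master/examples/osrt_python/ort/tidl_tools",
    "../../../model-artifacts/bestb/",
    tensor_bits=16,
    calibration_frames=20,  # illustrative value, not a recommendation from TI
)
```

In the OSRT examples, a dictionary like this is passed as the first entry of `provider_options` when creating the ONNX Runtime session with `TIDLCompilationProvider`. If the 16-bit artifacts give good results on the EVM, the loss is most likely due to 8-bit quantization rather than pre- or post-processing differences.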
,
Shine:
I have forwarded this to the engineers on the English E2E forum. Please follow the replies in this thread: https://e2e.ti.com/support/processors-group/processors/f/processors-forum/1267412/tda4vm-edge-ai-tools-accuracy-of-lane-detection-model-is-decreased
,
yong xuan:
There is still no reply on the English forum thread. Could you help follow up?
,
Shine:
Sorry about that. I have already pinged them. The issue may be fairly complex, so the engineer needs some time to reply.
,
Shine:
Please see the reply from the E2E engineer below.
Please check out our extensive documentation on Performance and Accuracy here :
https://github.com/TexasInstruments/edgeai-tidl-tools/blob/master/docs/tidl_osr_debug.md
,
yong xuan:
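I have read this document. I set all the parameters it covers according to the recommended values, and I also set accuracy_level=1. Yet the detection results on the board still differ greatly from those on the PC. What is causing the loss of accuracy, and how can I fix it?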
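To narrow down where the accuracy is lost, it can help to feed one identical, already preprocessed input to the float model on the PC (for example with CPUExecutionProvider), to the TIDL artifacts on the PC, and to the same artifacts on the EVM, then compare the raw output tensors numerically. The following is a minimal sketch of such a comparison in plain Python. `compare_outputs` and its metric names are hypothetical, and the outputs are assumed to have been flattened to lists of floats beforehand.

```python
import math


def compare_outputs(reference, candidate):
    """Compare two flattened output tensors (hypothetical helper).

    Returns the maximum absolute difference, the mean absolute
    difference and the cosine similarity between the two vectors.
    """
    if len(reference) != len(candidate) or not reference:
        raise ValueError("outputs must be non-empty and of equal length")
    diffs = [abs(r - c) for r, c in zip(reference, candidate)]
    dot = sum(r * c for r, c in zip(reference, candidate))
    norm_ref = math.sqrt(sum(r * r for r in reference))
    norm_cand = math.sqrt(sum(c * c for c in candidate))
    if norm_ref == 0.0 or norm_cand == 0.0:
        cosine = float("nan")  # undefined for an all-zero vector
    else:
        cosine = dot / (norm_ref * norm_cand)
    return {
        "max_abs_diff": max(diffs),
        "mean_abs_diff": sum(diffs) / len(diffs),
        "cosine_similarity": cosine,
    }


# Illustrative values only, not measurements from this model.
float_out = [0.10, 0.80, 0.05, 0.05]
tidl_out = [0.12, 0.75, 0.07, 0.06]
report = compare_outputs(float_out, tidl_out)
```

If the PC float output and the PC TIDL output already disagree, the cause is likely in quantization or calibration. If the two PC results agree but the EVM result differs, the input pipeline on the board (resize, color order, mean and scale) is worth checking first.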
,
Shine:
I see that you have already asked the engineer in the E2E thread. Please follow the replies there.