Part Number: TDA2SX
Could you kindly give some information about GPU performance? My app crashes when OpenGL renders eight images to the framebuffer. Every image is a .jpg file with a resolution of 1280*720 and a size of about 450 KBytes, and OpenGL renders each image into an icon-sized area (150 pixels wide, 100 pixels high).
Almost every time, the app crashes as soon as OpenGL calls glDrawArrays to draw an image. The render frequency is about 25 fps. Since the data is not very big, I think the OpenGL workload is not heavy. So the question is: why does the app crash, and what are the limits of GPU performance and resources?
By the way, these are the steps when OpenGL renders the screen:
1, load 8 pictures from the TF card mounted on the board: load one picture, render one icon, and repeat;
2, the jpg image is not resized before OpenGL renders it; the resizing is done by OpenGL.
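For scale, the 450 KByte figure is the compressed file size; what a texture normally holds after decoding is the uncompressed pixel data. The sketch below estimates that footprint, assuming each JPEG is decoded to 4 bytes per pixel (RGBA8888) and uploaded at full resolution without mipmaps. Both are assumptions, since the post does not show the loading code, and the function name is only illustrative.

```cpp
#include <cstddef>

// Hypothetical helper: bytes needed for one uncompressed texture.
// bytesPerPixel = 4 assumes an RGBA8888 upload (not stated in the post).
constexpr std::size_t textureBytes(std::size_t width, std::size_t height,
                                   std::size_t bytesPerPixel = 4) {
    return width * height * bytesPerPixel;
}

// One 1280*720 source image, and all eight of them together.
constexpr std::size_t kOneImage = textureBytes(1280, 720);
constexpr std::size_t kEightImages = 8 * kOneImage;
```

Under these assumptions one source image takes 3,686,400 bytes and eight take 29,491,200 bytes, so the decoded data is several times larger than the files on the card.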
thanks in advance.
BR.
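I would like to understand the GPU's performance. An application on the board currently crashes when rendering 8 images. The images are jpg files with a resolution of 1280*720 and a file size of 450 KBytes; OpenGL renders them as thumbnails of 150*100 at a render frequency of 25 fps.
Almost every time OpenGL calls glDrawArrays to render, the app crashes. Since the image data is not particularly large, the rendering workload should not be heavy either, so I would like to know the GPU's performance and resources, as well as the reason for the app crash.
What the rendering actually does:
On the first render, the 8 images are loaded one by one from the TF card on the board: load one, render one;
The 1280*720 jpg images are not scaled in advance when rendered as thumbnails; the scaling is done by OpenGL.
Thanks!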
Cherry Zhou:
Hi, we have received your question and forwarded it to the engineer; please expect a response. Thanks!
,
Cherry Zhou:
Hi,
Sorry for the delayed response. The issue description is not very clear.
What exactly do you mean by "the app would break down once OpenGL calls glDrawArrays"?
Is there a reference test case that can be reproduced on the TI EVM? Or some logs that indicate the problem?
We need a reference test case, logs, or a clear description of the problem to proceed.
Thanks and Best Regards,
Cherry
,
henry o:
Hello, nice to hear from you, and thanks for your advice. Maybe my description of this issue was not clear, so I have rearranged the question below:
1, The use case (partly) that the app uses: Capture -> Dup Dup -> Sync_disp -> Gate_camera_display -> Merge_display -> SgxFrmcpy (A15) -> Display. In the SgxFrmCpy link, I put the frameRender function in the SgxFrmCpy processFrame function, i.e. whenever the SgxFrmCpy link gets new data from the previous link, it calls frameRender to render the screen: the video captured by the cameras plus the UI widgets drawn by OpenGL.
2, The app processes the user's touch operations on the touch screen and switches to the corresponding UI, which means frameRender can render different UI widgets when the SgxFrmCpy link receives new image data.
3, The app crashes almost every time frameRender tries to render the settings UI widgets.
4, When the app crashes, I found it is calling the OpenGL glDrawArrays function to render some icons.
5, The icons come from jpg image files on the TF card mounted on the board. The procedure is as follows: 1, read the image from the TF card; 2, bind the image to a tex (a data struct used to render images and text); 3, call the OpenGL API to render the image as an icon, i.e. resize the image and draw it to the framebuffer.
6, There are 8 icons to render, but the app is always killed by the Linux OS. So I want to know the details and performance of the GPU.
Thanks in advance.
BR.
ohenry
btw: this is the function to render the icons:
void className::fixed_paint_tex(TEX tex,float px,float py,float lx,float ly,float xx,float yy,float winWidth,float winHeight,float width,float height,bool stretch,float angle,float opacity)
{
    /* remember the current culling state so it can be restored at the end */
    GLboolean cull_enable=glIsEnabled(GL_CULL_FACE);
    glEnable(GL_CULL_FACE);
    glCullFace(GL_BACK);
    /* convert the pixel rectangle to normalized device coordinates */
    float x0,y0,x1,y1;
    if(stretch){
        x0=(xx-winWidth/2)/(winWidth/2);
        y0=(winHeight/2-yy)/(winHeight/2);
        x1=(xx+width-winWidth/2)/(winWidth/2);
        y1=(winHeight/2-(yy+height))/(winHeight/2);
    }else{
        x0=(xx-winWidth/2)/(winWidth/2);
        y0=(winHeight/2-yy)/(winHeight/2);
        x1=(xx+lx-winWidth/2)/(winWidth/2);
        y1=(winHeight/2-(yy+ly))/(winHeight/2);
    }
    //////////////////////////////////////////////////////////////////
    /* two triangles, 5 floats per vertex: x, y, z, u, v */
    float vertices[30+1]={
        x0, y0, -0.1, px/(float)tex.tex_width,      py/(float)tex.tex_height,      // 0    2/3
        x0, y1, -0.1, px/(float)tex.tex_width,      (py+ly)/(float)tex.tex_height, //
        x1, y0, -0.1, (px+lx)/(float)tex.tex_width, py/(float)tex.tex_height,      //
        x1, y0, -0.1, (px+lx)/(float)tex.tex_width, py/(float)tex.tex_height,      //
        x0, y1, -0.1, px/(float)tex.tex_width,      (py+ly)/(float)tex.tex_height, //
        x1, y1, -0.1, (px+lx)/(float)tex.tex_width, (py+ly)/(float)tex.tex_height, // 1/4  5
        0
    };
    /* rotate the quad around its centre */
    float arc_angle=angle*M_PI/180.0f;
    float mid_x=(x0+x1)/2.0;
    float mid_y=(y0+y1)/2.0;
    if(angle > 0.000001 || angle < -0.000001) /* omit calculation when angle == 0; */
    {
        for(int i=0;i<30;i+=5)
        {
            xx=(vertices[i]-mid_x)*cos(arc_angle)-(vertices[i+1]-mid_y)*sin(arc_angle)+mid_x;
            yy=(vertices[i]-mid_x)*sin(arc_angle)+(vertices[i+1]-mid_y)*cos(arc_angle)+mid_y;
            vertices[i]=xx;
            vertices[i+1]=yy;
        }
    }
    GLuint program = shaderObj->get_program();
    glUseProgram(program);
    glUniform1i(glGetUniformLocation(program, "sTexture"), 0);
    glUniform1f(glGetUniformLocation(program, "opacity"), opacity);
    glActiveTexture(GL_TEXTURE0);
    glBindTexture(GL_TEXTURE_2D, tex.texid);
    glDisable(GL_DEPTH_TEST);
    if(b_blend)
    {
        glEnable(GL_BLEND);
        glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);
    }
    else
        glDisable(GL_BLEND);
    /* no buffer objects bound: the attribute pointers read from client memory; stride 20 = 5 floats */
    glBindBuffer(GL_ARRAY_BUFFER, 0);
    glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, 0);
    glEnableVertexAttribArray(0);
    glEnableVertexAttribArray(1);
    glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, 20, vertices);
    glVertexAttribPointer(1, 2, GL_FLOAT, GL_FALSE, 20, vertices+3);
    glDrawArrays(GL_TRIANGLES, 0, 6); /* app breaks down when calls this function. */
    glDisableVertexAttribArray(0);
    glDisableVertexAttribArray(1);
    if(cull_enable)
        glEnable(GL_CULL_FACE);
    else
        glDisable(GL_CULL_FACE);
}
FYI.
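The coordinate arithmetic at the top of this function maps a pixel rectangle, with its origin at the top-left corner of the window, to OpenGL normalized device coordinates in [-1, 1]. Isolated as a small helper, it might look like the sketch below. The struct and function names are hypothetical and are not part of the original code; the formulas are the ones used for x0, y0, x1 and y1.

```cpp
// Hypothetical helper mirroring the x0/y0/x1/y1 computation in fixed_paint_tex.
struct NdcRect {
    float x0, y0;  // top-left corner in NDC
    float x1, y1;  // bottom-right corner in NDC
};

// Maps a pixel rectangle (origin top-left, y pointing down) to NDC
// (origin at the window centre, y pointing up).
inline NdcRect pixelRectToNdc(float x, float y, float w, float h,
                              float winWidth, float winHeight) {
    const float halfW = winWidth / 2.0f;
    const float halfH = winHeight / 2.0f;
    NdcRect r;
    r.x0 = (x - halfW) / halfW;
    r.y0 = (halfH - y) / halfH;
    r.x1 = (x + w - halfW) / halfW;
    r.y1 = (halfH - (y + h)) / halfH;
    return r;
}
```

A rectangle covering the whole window maps to the corners (-1, 1) and (1, -1), which is a quick way to check the sign conventions without a GL context.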
,
Cherry Zhou:
Hi Dear,
Sorry for the delayed response here.
What does "the app breaks down" mean?
Are you getting a GPU crash or a Linux kernel panic?
Do you have some logs with the errors?
Thanks!
,
henry o:
Hi, nice to hear from you. Below is the app crash log.
apps_jh6.out is the app name. When the app calls glDrawArrays(…), it crashes and prints the log below. The Linux kernel is unstable when the app crashes: sometimes the kernel also crashes and restarts, and sometimes the app halts but Linux keeps working.
FYI.
thanks and best regards!
[ 1858.599018] apps_jh6.out invoked oom-killer: gfp_mask=0x24000c4, order=0, oom_score_adj=0
[ 1858.600066] apps_jh6.out cpuset=/ mems_allowed=0
[ 1858.600679] CPU: 0 PID: 400 Comm: apps_jh6.out Tainted: G W O 4.4.84 #5
[ 1858.601641] Hardware name: Generic DRA74X (Flattened Device Tree)
[ 1858.602417] Backtrace:
[ 1858.602755] [<c00131c4>] (dump_backtrace) from [<c00133c0>] (show_stack+0x18/0x1c)
[ 1858.603715] r7:ee353478 r6:60070013 r5:00000000 r4:c0842ad0
[ 1858.604479] [<c00133a8>] (show_stack) from [<c0252128>] (dump_stack+0x8c/0xa0)
[ 1858.605406] [<c025209c>] (dump_stack) from [<c0120608>] (dump_header+0x5c/0x1ac)
[ 1858.606346] r7:ee353478 r6:00000000 r5:edf47adc r4:ee353000
[ 1858.607103] [<c01205ac>] (dump_header) from [<c00d5e38>] (oom_kill_process+0x2fc/0x448)
[ 1858.608118] r10:c082b8f8 r9:000104a4 r8:000000a2 r7:ee353478 r6:0006414c r5:edf47adc
[ 1858.609149] r4:ee353000
[ 1858.609487] [<c00d5b3c>] (oom_kill_process) from [<c00d62e4>] (out_of_memory+0x2f0/0x32c)
[ 1858.610523] r10:c082b8f8 r9:000104a4 r8:c082b8f8 r7:c082bb78 r6:0006414c r5:edf47adc
[ 1858.611551] r4:ee353000
[ 1858.611890] [<c00d5ff4>] (out_of_memory) from [<c00db1ac>] (__alloc_pages_nodemask+0x924/0x964)
[ 1858.612992] r10:c0867680 r9:024000c4 r8:00000000 r7:c0867690 r6:edf46000 r5:00000000
[ 1858.614019] r4:00000000
[ 1858.614425] [<c00da888>] (__alloc_pages_nodemask) from [<bf0059fc>] (NewAllocPagesLinuxMemArea+0xcc/0x278 [pvrsrvkm])
[ 1858.615769] r10:00004000 r9:00000000 r8:00000000 r7:bf0345ec r6:000011f0 r5:0000047c
[ 1858.616797] r4:f26a41f0
[ 1858.617233] [<bf005930>] (NewAllocPagesLinuxMemArea [pvrsrvkm]) from [<bf000a18>] (OSAllocPages_Impl+0xe4/0xfc [pvrsrvkm])
[ 1858.618632] r10:00000000 r9:ede75840 r8:00800000 r7:02014200 r6:edf47c34 r5:00000203
[ 1858.619660] r4:02014200
[ 1858.620096] [<bf000934>] (OSAllocPages_Impl [pvrsrvkm]) from [<bf00892c>] (BM_ImportMemory+0x284/0x580 [pvrsrvkm])
[ 1858.621406] r5:00000203 r4:edee1080
[ 1858.621987] [<bf0086a8>] (BM_ImportMemory [pvrsrvkm]) from [<bf0129dc>] (RA_Alloc+0xb8/0x29c [pvrsrvkm])
[ 1858.623188] r10:00000040 r9:ee368200 r8:bf0086a8 r7:edf47ca0 r6:00000040 r5:00800000
[ 1858.624220] r4:ee368200
[ 1858.624661] [<bf012924>] (RA_Alloc [pvrsrvkm]) from [<bf008d00>] (BM_Alloc+0xd8/0x50c [pvrsrvkm])
[ 1858.625786] r10:00000040 r9:ee368200 r8:00800000 r7:c31ce608 r6:c2d39ac0 r5:edee1080
[ 1858.626818] r4:00000203
[ 1858.627257] [<bf008c28>] (BM_Alloc [pvrsrvkm]) from [<bf009350>] (AllocDeviceMem+0xb4/0x194 [pvrsrvkm])
[ 1858.628448] r10:bf033b98 r9:edee1080 r8:00800000 r7:00000003 r6:edf47d4c r5:c31ce600
[ 1858.629476] r4:00000000
[ 1858.629916] [<bf00929c>] (AllocDeviceMem [pvrsrvkm]) from [<bf009cf8>] (_PVRSRVAllocDeviceMemKM+0xb8/0x224 [pvrsrvkm])
[ 1858.631270] r9:eddd30c0 r8:f1a05000 r7:ede755c0 r6:ee382200 r5:edee1080 r4:00000003
[ 1858.632396] [<bf009c40>] (_PVRSRVAllocDeviceMemKM [pvrsrvkm]) from [<bf01565c>] (PVRSRVAllocDeviceMemBW+0x194/0x40c [pvrsrvkm])
[ 1858.633849] r7:ede755c0 r6:00000000 r5:f1a06000 r4:00000000
[ 1858.634707] [<bf0154c8>] (PVRSRVAllocDeviceMemBW [pvrsrvkm]) from [<bf0181dc>] (BridgedDispatchKM+0x94/0x25c [pvrsrvkm])
[ 1858.636083] r8:f1a06000 r7:f1a05000 r6:bf0154c8 r5:ede755c0 r4:edf47e68
[ 1858.637076] [<bf018148>] (BridgedDispatchKM [pvrsrvkm]) from [<bf004d74>] (PVRSRV_BridgeDispatchKM+0x180/0x338 [pvrsrvkm])
[ 1858.638475] r8:00000040 r7:eddd30c0 r6:000000ac r5:c01c6707 r4:edf47e68
[ 1858.639421] [<bf004bf4>] (PVRSRV_BridgeDispatchKM [pvrsrvkm]) from [<c031d888>] (drm_ioctl+0x140/0x454)
[ 1858.640611] r7:ee3c2c00 r6:c08a0310 r5:0000001c r4:edf47e68
[ 1858.641367] [<c031d748>] (drm_ioctl) from [<c0133df4>] (do_vfs_ioctl+0x3f0/0x614)
[ 1858.642316] r10:00000000 r9:edf46000 r8:78c92e4c r7:00000010 r6:eddd3000 r5:ee32e648
[ 1858.643349] r4:78c92e4c
[ 1858.643687] [<c0133a04>] (do_vfs_ioctl) from [<c0134054>] (SyS_ioctl+0x3c/0x64)
[ 1858.644616] r10:00000000 r9:edf46000 r8:78c92e4c r7:401c6440 r6:eddd3000 r5:00000010
[ 1858.645644] r4:eddd3001
[ 1858.645985] [<c0134018>] (SyS_ioctl) from [<c000fae0>] (ret_fast_syscall+0x0/0x34)
[ 1858.646945] r9:edf46000 r8:c000fc84 r7:00000036 r6:401c6440 r5:78c92e4c r4:0000001c
[ 1858.648038] Mem-Info:
[ 1858.648370] active_anon:32705 inactive_anon:0 isolated_anon:0
[ 1858.648370] active_file:3983 inactive_file:4625 isolated_file:0
[ 1858.648370] unevictable:0 dirty:12 writeback:31 unstable:0
[ 1858.648370] slab_reclaimable:741 slab_unreclaimable:1912
[ 1858.648370] mapped:35661 shmem:0 pagetables:401 bounce:0
[ 1858.648370] free:314564 free_pcp:162 free_cma:31785
[ 1858.652569] DMA free:107040kB min:1556kB low:1944kB high:2332kB active_anon:11996kB inactive_anon:0kB active_file:48kB inactive_file:40kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:753664kB managed:332080kB mlocked:0kB dirty:20kB writeback:0kB mapped:116004kB shmem:0kB slab_reclaimable:2964kB slab_unreclaimable:7648kB kernel_stack:1200kB pagetables:228kB unstable:0kB bounce:0kB free_pcp:12kB local_pcp:0kB free_cma:103476kB writeback_tmp:0kB pages_scanned:1100 all_unreclaimable? yes
[ 1858.658174] lowmem_reserve[]: 0 0 1253 1253
[ 1858.658780] HighMem free:1151804kB min:512kB low:3804kB high:7096kB active_anon:118824kB inactive_anon:0kB active_file:15884kB inactive_file:18460kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1307648kB managed:1307648kB mlocked:0kB dirty:28kB writeback:124kB mapped:26640kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:1376kB unstable:0kB bounce:0kB free_pcp:4kB local_pcp:4kB free_cma:23664kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[ 1858.664379] lowmem_reserve[]: 0 0 0 0
[ 1858.664894] DMA: 22*4kB (UEHC) 14*8kB (UHC) 13*16kB (HC) 4*32kB (H) 5*64kB (HC) 2*128kB (C) 0*256kB 3*512kB (C) 2*1024kB (HC) 2*2048kB (HC) 24*4096kB (C) = 107096kB
[ 1858.667005] HighMem: 551*4kB (UMC) 258*8kB (UMC) 333*16kB (UMC) 145*32kB (UMC) 184*64kB (UMC) 118*128kB (UMC) 85*256kB (UMC) 73*512kB (UMC) 37*1024kB (UM) 29*2048kB (UMC) 233*4096kB (MC) = 1151900kB
[ 1858.669398] 8598 total pagecache pages
[ 1858.669880] 0 pages in swap cache
[ 1858.670322] Swap cache stats: add 0, delete 0, find 0/0
[ 1858.671004] Free swap = 0kB
[ 1858.671383] Total swap = 0kB
[ 1858.671763] 515328 pages RAM
[ 1858.672133] 326912 pages HighMem/MovableOnly
[ 1858.672678] 105396 pages reserved
[ 1858.673101] 51200 pages cma reserved
[ 1858.673568] [ pid ] uid tgid total_vm rss nr_ptes nr_pmds swapents oom_score_adj name
[ 1858.674738] [ 119] 0 119 566 406 4 2 0 0 sh
[ 1858.675802] [ 172] 0 172 339108 68381 406 3 0 0 apps_jh6.out
[ 1858.676983] Out of memory: Kill process 172 (apps_jh6.out) score 162 or sacrifice child
[ 1858.678327] Killed process 172 (apps_jh6.out) total-vm:1356432kB, anon-rss:130596kB, file-rss:142928kB
[ 1858.679658] apps_jh6.out: page allocation failure: order:0, mode:0x24000c4
[ 1858.680561] CPU: 0 PID: 400 Comm: apps_jh6.out Tainted: G W O 4.4.84 #5
[ 1858.681524] Hardware name: Generic DRA74X (Flattened Device Tree)
[ 1858.682297] Backtrace:
[ 1858.682632] [<c00131c4>] (dump_backtrace) from [<c00133c0>] (show_stack+0x18/0x1c)
[ 1858.683594] r7:c082866c r6:60070013 r5:00000000 r4:c0842ad0
[ 1858.684354] [<c00133a8>] (show_stack) from [<c0252128>] (dump_stack+0x8c/0xa0)
[ 1858.684404] apps_jh6.out: page allocation failure: order:1, mode:0x26000c0
[ 1858.6890 r6:20000013 r5:00000000 r4:c0842ad0
[ 1858.755913] [<c00133a8>] (show_stack) from [<c0252128>] (dump_stack+0x8c/0xa0)
[ 1858.756837] [<c025209c>] (dump_stack) from [<c00d7ee4>] (warn_alloc_failed+0xe4/0x124)
[ 1858.757841] r7:c0867690 r6:00000000 r5:00000001 r4:026000c0
[ 1858.758598] [<c00d7e04>] (warn_alloc_failed) from [<c00daa4c>] (__alloc_pages_nodemask+0x1c4/0x964)
[ 1858.759745] r3:00040001 r2:00000000
[ 1858.760215] r6:ede68000 r5:00000000 r4:00000000
[ 1858.760831] [<c00da888>] (__alloc_pages_nodemask) from [<c00db4a8>] (alloc_kmem_pages_node+0x28/0xb4)
[ 1858.761999] r10:6e6803f8 r9:edde8c00 r8:c000fc84 r7:6e67fec8 r6:00000001 r5:ecd5f800
[ 1858.763030] r4:026000c0
[ 1858.763371] [<c00db480>] (alloc_kmem_pages_node) from [<c003655c>] (copy_process+0x124/0x14c8)
[ 1858.764462] r7:6e67fec8 r6:6e6803f8 r5:ecd5f800 r4:003d0f00
[ 1858.765215] [<c0036438>] (copy_process) from [<c0037a34>] (_do_fork+0x78/0x334)
[ 1858.766143] r10:6e6803f8 r9:ede68000 r8:c000fc84 r7:00000000 r6:00000000 r5:0022ca98
[ 1858.767171] r4:003d0f00
[ 1858.767508] [<c00379bc>] (_do_fork) from [<c0037de4>] (SyS_clone+0x28/0x30)
[ 1858.768392] r10:00000000 r9:ede68000 r8:c000fc84 r7:00000078 r6:00000000 r5:0022ca98
[ 1858.769420] r4:6e6803f8
[ 1858.769758] [<c0037dbc>] (SyS_clone) from [<c000fae0>] (ret_fast_syscall+0x0/0x34)
[ 1858.770985] Mem-Info:
[ 1858.771291] active_anon:32705 inactive_anon:0 isolated_anon:0
[ 1858.771291] active_file:3983 inactive_file:4621 isolated_file:0
[ 1858.771291] unevictable:0 dirty:12 writeback:31 unstable:0
[ 1858.771291] slab_reclaimable:714 slab_unreclaimable:1912
[ 1858.771291] mapped:35661 shmem:0 pagetables:401 bounce:0
[ 1858.771291] free:315836 free_pcp:52 free_cma:31785
[ 1858.775563] DMA free:111540kB min:1556kB low:1944kB high:2332kB active_anon:11996kB inactive_anon:0kB active_file:48kB inactive_file:24kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:753664kB managed:332080kB mlocked:0kB dirty:20kB writeback:0kB mapped:116004kB shmem:0kB slab_reclaimable:2856kB slab_unreclaimable:7648kB kernel_stack:1200kB pagetables:228kB unstable:0kB bounce:0kB free_pcp:204kB local_pcp:60kB free_cma:103476kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[ 1858.781122] lowmem_reserve[]: 0 0 1253 1253
[ 1858.781702] HighMem free:1151804kB min:512kB low:3804kB high:7096kB active_anon:118824kB inactive_anon:0kB active_file:15884kB inactive_file:18460kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1307648kB managed:1307648kB mlocked:0kB dirty:28kB writeback:124kB mapped:26640kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:1376kB unstable:0kB bounce:0kB free_pcp:4kB local_pcp:0kB free_cma:23664kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[ 1858.787335] lowmem_reserve[]: 0 0 0 0
[ 1858.787839] DMA: 183*4kB (UMEHC) 84*8kB (UMEHC) 41*16kB (MEHC) 13*32kB (MEH) 11*64kB (UMEHC) 15*128kB (UEC) 2*256kB (U) 3*512kB (C) 2*1024kB (HC) 2*2048kB (HC) 24*4096kB (C) = 111596kB
[ 1858.790220] HighMem: 551*4kB (UMC) 258*8kB (UMC) 333*16kB (UMC) 145*32kB (UMC) 184*64kB (UMC) 118*128kB (UMC) 85*256kB (UMC) 73*512kB (UMC) 37*1024kB (UM) 29*2048kB (UMC) 233*4096kB (MC) = 1151900kB
[ 1858.792659] 8598 total pagecache pages
[ 1858.793140] 0 pages in swap cache
[ 1858.793566] Swap cache stats: add 0, delete 0, find 0/0
[ 1858.794231] Free swap = 0kB
[ 1858.794927] Total swap = 0kB
[ 1858.795300] 515328 pages RAM
[ 1858.795671] 326912 pages HighMem/MovableOnly
[ 1858.796215] 105396 pages reserved
[ 1858.796638] 51200 pages cma reserved
[ 1858.987238] virtio_rpmsg_bus virtio2: msg received with no recipient
[ 1859.120497] virtio_rpmsg_bus virtio2: msg received with no recipient
[ 1859.253782] virtio_rpmsg_bus virtio2: msg received with no recipient
[ 1859.387170] virtio_rpmsg_bus virtio2: msg received with no recipient
,
Cherry Zhou:
Hi Henry,
Sorry for the delay in responding.
This issue doesn't have anything to do with GPU performance.
From the trace, it looks like a memory allocation issue.
Was the SGX framecopy use case working before you added your changes?
Regards,
Cherry
,
henry o:
Hi Cherry, thanks for your kind reply. I think the answer is right. Since low memory is used up, the oom-killer kills the process (apps_jh6.out). SR1 is about 300 MB and sits in the low-memory region, so the memory available to the Linux kernel is limited. When apps_jh6.out requests memory to display the scaled image, low memory runs out and the oom-killer acts.
thanks.
BR.
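The figures in the oom-killer report are consistent with this reading. A rough worked check is sketched below, using values copied from the first Mem-Info dump: the DMA zone line (free:107040kB, free_cma:103476kB, min:1556kB) and the HighMem zone line (free:1151804kB, free_cma:23664kB, min:512kB). It assumes that the failing driver allocation could use neither CMA-reserved pages nor HighMem, which fits the DMA zone being reported as "all_unreclaimable? yes" but is an interpretation, not something stated in the thread. The struct and function names are illustrative.

```cpp
// Values in kB, as printed by the kernel in the Mem-Info dump.
struct ZoneInfo {
    long freeKb;     // "free:"
    long freeCmaKb;  // "free_cma:"
    long minKb;      // "min:" watermark
};

// Free memory left in a zone once CMA-reserved pages are excluded.
constexpr long usableFreeKb(const ZoneInfo& z) {
    return z.freeKb - z.freeCmaKb;
}

constexpr ZoneInfo kLowmemZone{107040, 103476, 1556};  // "DMA" zone line
constexpr ZoneInfo kHighmemZone{1151804, 23664, 512};  // "HighMem" zone line
```

With these numbers, usableFreeKb(kLowmemZone) is 3564 kB while usableFreeKb(kHighmemZone) is 1128140 kB. Almost all free low memory was CMA-reserved, so an allocation restricted to ordinary low memory could fail even though more than 1 GB of RAM was still free overall.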