Part Number: AM5718
Other Parts Discussed in Thread: TPS65916, TMP102
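We are using a core board from Forlinx Embedded (飞凌嵌入式) based on the AM5718. We currently have three units, and all of them have shown the core board powering down, even though the DC 5 V that the baseboard supplies to the core board is normal. Attached are the log files saved to a USB drive for two of the power-down events. They contain two temperature-related messages:
With heatsink, but the fan was not switched on: Nov 24 02:45:20 ok5718-idk user.emerg kernel: [31329.394634] thermal thermal_zone5: critical temperature reached(255 C),shutting down
With heatsink and fan running: Nov 25 18:12:19 ok5718-idk user.emerg kernel: [ 2691.074642] thermal thermal_zone5: critical temperature reached(127 C),shutting down
The 255 °C in the first case surely cannot be a real temperature, and in the second case the fan is powerful enough that 127 °C does not seem plausible either. Has anyone run into this? What should we test or verify next?
Log before the first power-down:
Log before the second power-down: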
Nov 25 18:12:19 ok5718-idk user.emerg kernel: [ 2691.074642] thermal thermal_zone5: critical temperature reached(127 C),shutting down
Nov 25 18:12:19 ok5718-idk daemon.warn systemd[1]: sysinit.target: Found ordering cycle on sysinit.target/start
Nov 25 18:12:19 ok5718-idk daemon.warn systemd[1]: sysinit.target: Found dependency on alignment.service/start
Nov 25 18:12:19 ok5718-idk daemon.warn systemd[1]: sysinit.target: Found dependency on basic.target/stop
Nov 25 18:12:19 ok5718-idk daemon.warn systemd[1]: sysinit.target: Found dependency on sysinit.target/start
Nov 25 18:12:19 ok5718-idk daemon.warn systemd[1]: sysinit.target: Breaking ordering cycle by deleting job alignment.service/start
Nov 25 18:12:19 ok5718-idk daemon.err systemd[1]: alignment.service: Job alignment.service/start deleted to break ordering cycle starting with sysinit.target/start
Nov 25 18:12:19 ok5718-idk daemon.info systemd[1]: Stopping Update UTMP about System Boot/Shutdown…
Nov 25 18:12:19 ok5718-idk daemon.info systemd[1]: Stopped target System Time Synchronized.
Nov 25 18:12:19 ok5718-idk daemon.info systemd[1]: Stopping Network Time Synchronization…
Nov 25 18:12:19 ok5718-idk daemon.info systemd[1]: Stopped target Multi-User System.
Nov 25 18:12:19 ok5718-idk daemon.info charon: 00[DMN] signal of type SIGINT received. Shutting down
Nov 25 18:12:19 ok5718-idk daemon.info charon: 00[KNL] received netlink error: Address family not supported by protocol (97)
Nov 25 18:12:19 ok5718-idk daemon.info ipsec[909]: 00[DMN] Starting IKE charon daemon (strongSwan 5.5.0, Linux 4.9.41, armv7l)
Nov 25 18:12:19 ok5718-idk daemon.info ipsec[909]: 00[CFG] opening PKCS#11 library failed: /usr/lib/softhsm/libsecstore.so.1: cannot open shared object file: No such file or directory
Nov 25 18:12:19 ok5718-idk daemon.info ipsec[909]: 00[KNL] received netlink error: Address family not supported by protocol (97)
Nov 25 18:12:19 ok5718-idk daemon.info ipsec[909]: 00[KNL] unable to create IPv4 routing table rule
Nov 25 18:12:19 ok5718-idk daemon.info ipsec[909]: 00[CFG] loading ca certificates from '/etc/ipsec.d/cacerts'
Nov 25 18:12:19 ok5718-idk daemon.info ipsec[909]: 00[LIB] file coded in unknown format, discarded
Nov 25 18:12:19 ok5718-idk daemon.info ipsec[909]: 00[LIB] building CRED_CERTIFICATE – X509 failed, tried 5 builders
Nov 25 18:12:19 ok5718-idk daemon.info ipsec[909]: 00[CFG] loading ca certificate from '/etc/ipsec.d/cacerts/.gitignore' failed
Nov 25 18:12:19 ok5718-idk daemon.info ipsec[909]: 00[CFG] loading aa certificates from '/etc/ipsec.d/aacerts'
Nov 25 18:12:19 ok5718-idk daemon.info ipsec[909]: 00[LIB] file coded in unknown format, discarded
Nov 25 18:12:19 ok5718-idk daemon.info ipsec[909]: 00[LIB] building CRED_CERTIFICATE – X509 failed, tried 5 builders
Nov 25 18:12:19 ok5718-idk daemon.info ipsec[909]: 00[CFG] loading AA certificate from '/etc/ipsec.d/aacerts/.gitignore' failed
Nov 25 18:12:19 ok5718-idk daemon.info ipsec[909]: 00[CFG] loading ocsp signer certificates from '/etc/ipsec.d/ocspcerts'
Nov 25 18:12:19 ok5718-idk daemon.info ipsec[909]: 00[LIB] file coded in unknown format, discarded
Nov 25 18:12:19 ok5718-idk daemon.info ipsec[909]: 00[LIB] building CRED_CERTIFICATE – X509 failed, tried 5 builders
Nov 25 18:12:19 ok5718-idk daemon.info ipsec[909]: 00[CFG] loading certificate from '/etc/ipsec.d/ocspcerts/.gitignore' failed
Nov 25 18:12:19 ok5718-idk daemon.info ipsec[909]: 00[CFG] loading attribute certificates from '/etc/ipsec.d/acerts'
Nov 25 18:12:19 ok5718-idk daemon.info ipsec[909]: 00[LIB] file coded in unknown format, discarded
Nov 25 18:12:19 ok5718-idk daemon.info ipsec[909]: 00[LIB] building CRED_CERTIFICATE – X509_AC failed, tried 3 builders
Nov 25 18:12:19 ok5718-idk daemon.info ipsec[909]: 00[CFG] loading attribute certificate from '/etc/ipsec.d/acerts/.gitignore' failed
Nov 25 18:12:19 ok5718-idk daemon.info ipsec[909]: 00[CFG] loading crls from '/etc/ipsec.d/crls'
Nov 25 18:12:19 ok5718-idk daemon.info ipsec[909]: 00[LIB] file coded in unknown format, discarded
Nov 25 18:12:19 ok5718-idk daemon.info ipsec[909]: 00[LIB] building CRED_CERTIFICATE – X509_CRL failed, tried 4 builders
Nov 25 18:12:19 ok5718-idk daemon.info ipsec[909]: 00[CFG] loading crl from '/etc/ipsec.d/crls/.gitignore' failed
Nov 25 18:12:19 ok5718-idk daemon.info ipsec[909]: 00[CFG] loading secrets from '/etc/ipsec.secrets'
Nov 25 18:12:19 ok5718-idk daemon.info ipsec[909]: 00[CFG] opening secrets file '/etc/ipsec.secrets' failed: No such file or directory
Nov 25 18:12:19 ok5718-idk daemon.info ipsec[909]: 00[LIB] loaded plugins: charon pkcs11 aes des rc2 sha2 sha1 md5 random nonce x509 revocation constraints pubkey pkcs1 pkcs7 pkcs8 pkcs12 pgp dnskey sshkey pem openssl fips-prf xcbc cmac hmac ctr curl sqlite attr kernel-netl
Nov 25 18:12:19 ok5718-idk daemon.info avahi-daemon[621]: Got SIGTERM, quitting.
Nov 25 18:12:19 ok5718-idk daemon.info ipsec[909]: 00[JOB] spawning 16 worker threads
Nov 25 18:12:19 ok5718-idk daemon.info ipsec[909]: 05[KNL] interface can0 activated
Nov 25 18:12:19 ok5718-idk daemon.info ipsec[909]: 08[KNL] interface can1 activated
Nov 25 18:12:19 ok5718-idk daemon.info ipsec[909]: 00[DMN] signal of type SIGINT received. Shutting down
Nov 25 18:12:19 ok5718-idk daemon.info ipsec[909]: 00[KNL] received netlink error: Address family not supported by protocol (97)
Nov 25 18:12:19 ok5718-idk daemon.info systemd[1]: Stopping strongSwan IPsec IKEv1/IKEv2 daemon using ipsec.conf…
Nov 25 18:12:19 ok5718-idk daemon.info avahi-daemon[621]: Leaving mDNS multicast group on interface eth1.IPv4 with address 192.168.1.35.
Nov 25 18:12:19 ok5718-idk daemon.info lighttpd[1041]: 2021-11-25 18:12:18: (../../lighttpd-1.4.41/src/server.c.1821) server stopped by UID = 0 PID = 1
Nov 25 18:12:19 ok5718-idk daemon.info ipsec[909]: charon stopped after 200 ms
Nov 25 18:12:19 ok5718-idk daemon.info ipsec[909]: ipsec starter stopped
Nov 25 18:12:19 ok5718-idk daemon.info systemd[1]: Stopping Login Service…
Nov 25 18:12:19 ok5718-idk user.notice kernel: klogd: exiting
Nov 25 18:12:19 ok5718-idk daemon.info systemd[1]: Stopping Avahi mDNS/DNS-SD Stack…
Nov 25 18:12:19 ok5718-idk authpriv.info ipsec_starter[909]: charon stopped after 200 ms
Nov 25 18:12:19 ok5718-idk daemon.info systemd[1]: Stopping Kernel Logging Service…
Nov 25 18:12:19 ok5718-idk authpriv.info ipsec_starter[909]: ipsec starter stopped
Nov 25 18:12:19 ok5718-idk daemon.info systemd[1]: Stopping Load/Save Screen Backlight Brightness of backlight:backlight…
Nov 25 18:12:19 ok5718-idk daemon.info avahi-daemon[621]: Leaving mDNS multicast group on interface eth0.IPv4 with address 192.168.0.232.
Nov 25 18:12:19 ok5718-idk daemon.info systemd[1]: Stopping Load/Save Random Seed…
Nov 25 18:12:19 ok5718-idk daemon.info avahi-daemon[621]: avahi-daemon 0.6.32 exiting.
Nov 25 18:12:19 ok5718-idk daemon.info systemd[1]: Stopped target Sound Card.
Nov 25 18:12:19 ok5718-idk local0.info snmpd[998]: Received TERM or STOP signal… shutting down…
Nov 25 18:12:19 ok5718-idk daemon.info systemd[1]: Stopping Simple Network Management Protocol (SNMP) Daemon….
Nov 25 18:12:19 ok5718-idk daemon.info ofonod[604]: Terminating
Nov 25 18:12:19 ok5718-idk daemon.info systemd[1]: Stopping helloworld…
Nov 25 18:12:19 ok5718-idk daemon.notice dbus[640]: [system] Activating via systemd: service name='org.bluez' unit='dbus-org.bluez.service'
Nov 25 18:12:19 ok5718-idk daemon.info systemd[1]: Stopped target Timers.
Nov 25 18:12:19 ok5718-idk daemon.info ofonod[604]: Exit
Nov 25 18:12:19 ok5718-idk daemon.info systemd[1]: Stopped Daily Cleanup of Temporary Directories.
Nov 25 18:12:19 ok5718-idk daemon.info systemd[1]: Stopping Lightning Fast Webserver With Light System Requirements…
Nov 25 18:12:19 ok5718-idk daemon.info systemd[1]: Stopped target Login Prompts.
Nov 25 18:12:19 ok5718-idk daemon.info systemd[1]: Stopping Getty on tty1…
Nov 25 18:12:19 ok5718-idk daemon.info systemd[1]: Stopping Serial Getty on ttyS2…
Nov 25 18:12:19 ok5718-idk daemon.info systemd[1]: Stopping Telephony service…
Nov 25 18:12:19 ok5718-idk daemon.info systemd[1]: Closed RPCbind Server Activation Socket.
Nov 25 18:12:19 ok5718-idk syslog.info syslogd exiting
Shine:
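Please use the command # cat /sys/class/thermal/thermal_zone0/temp to check the temperatures of the other 4 thermal zones as well. Is zone5 the only one that reads too high? The data in the following document is for your reference. https://www.ti.com/lit/an/spraci0/spraci0.pdf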
,
Sinty Liang:
root@ok5718-idk:~# cat /sys/class/thermal/thermal_zone0/temp
90200
root@ok5718-idk:~# cat /sys/class/thermal/thermal_zone1/temp
90200
root@ok5718-idk:~# cat /sys/class/thermal/thermal_zone2/temp
89800
root@ok5718-idk:~# cat /sys/class/thermal/thermal_zone3/temp
89400
root@ok5718-idk:~# cat /sys/class/thermal/thermal_zone4/temp
90200
root@ok5718-idk:~# cat /sys/class/thermal/thermal_zone5/temp
65437
,
Shine:
These temperatures look normal.
,
Sinty Liang:
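Hello,
1. How is this temperature value (for example, 90200 for zone0) converted?
2. These values were read about 5 minutes after power-on, not just before a fault occurred.
3. So far I have collected 8 log files from fault events. Every one of them contains "thermal thermal_zone5: failed to read out thermal zone (-121)". Two of them contain "thermal thermal_zone5: critical temperature reached(127 C),shutting down", five contain "thermal thermal_zone5: critical temperature reached(255 C),shutting down", and the last one has no critical-temperature message at all.
4. Testing so far shows that the fault occurs in all three configurations: bare chip, chip + heatsink, and chip + heatsink + fan.
5. Measured with a thermal imager, the bare chip (AM5718) is at about 70 °C and the PMIC (TPS65916) at about 80 °C.
Thank you very much for following up on this.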
,
Sinty Liang:
Attachment: 核心板掉电log文件.zip (core board power-down log files)
,
Shine:
1. Divide the printed value by 1000 to get the temperature in °C. For example, 90200 for zone0 is 90.2 °C.
2. Which version of the Processor SDK are you using?
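The conversion described above can be sketched in Python as follows. This is only an illustration and not part of the Processor SDK: the helper names millicelsius_to_celsius and read_zone_celsius are made up for this example, and the sysfs path is the one used with cat earlier in the thread.

```python
from pathlib import Path


def millicelsius_to_celsius(raw: int) -> float:
    """Convert a sysfs thermal reading (millidegrees Celsius) to degrees Celsius."""
    return raw / 1000


def read_zone_celsius(zone: int, base: str = "/sys/class/thermal") -> float:
    """Read thermal_zone<zone>/temp and return the value in degrees Celsius.

    Raises OSError if the kernel cannot read the underlying sensor.
    """
    text = Path(base, f"thermal_zone{zone}", "temp").read_text()
    return millicelsius_to_celsius(int(text.strip()))
```

For instance, millicelsius_to_celsius(90200) gives 90.2, matching the zone0 value posted above.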
,
Sinty Liang:
ti-processor-sdk-linux-am57xx-evm-04.01.00.06-Linux-x86-Install.bin
,
Sinty Liang:
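One more question: which temperature does thermal_zone5 report? The manual does not describe it.
Here is what I have found so far:
thermal_zone0 corresponds to the MPU temperature
thermal_zone1 corresponds to the GPU temperature
thermal_zone2 corresponds to the CORE temperature
thermal_zone3 corresponds to the DSP temperature
thermal_zone4 corresponds to the IVA temperature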
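One way to check this mapping on the running board is to read the type file that the Linux thermal framework exposes next to temp in each zone directory. The sketch below lists them. The function name zone_types is invented for this example, and the assumption that zones defined in the device tree report their node name as the type should be verified on the actual kernel.

```python
from pathlib import Path


def zone_types(base: str = "/sys/class/thermal") -> dict:
    """Map each thermal_zone<N> directory name to the contents of its `type` file."""
    result = {}
    for zone_dir in sorted(Path(base).glob("thermal_zone*")):
        type_file = zone_dir / "type"
        if type_file.is_file():
            result[zone_dir.name] = type_file.read_text().strip()
    return result
```

Calling zone_types() on the target and printing the result should show which sensor or device-tree node sits behind thermal_zone5.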
,
Shine:
Please update the Processor SDK version. https://www.ti.com.cn/tool/download/PROCESSOR-SDK-LINUX-AM57X#downloads
thermal_zone5 corresponds to the DSPEVE temperature. Please refer to section 18.4.6.2, Thermal Management Related Registers, in the TRM. https://www.ti.com.cn/cn/lit/ug/spruhz7j/spruhz7j.pdf
,
Sinty Liang:
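What is the purpose of updating the SDK version? Has a bug in the current version already been confirmed as the cause of the false over-temperature reports, or is this just to check whether the version is the cause and rule it out?
Our software engineer says we cannot upgrade and can only use Forlinx's heavily customized version, otherwise our application will not run. If you can confirm that the version is the cause, we will need to find a way around this quickly.
Thanks again!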
,
Shine:
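It is to check whether the version is the cause and rule it out.
The bare-chip temperature measured with the thermal imager is normal, but the value read by software is higher, so we need to find out whether the problem lies in the software or in the on-chip temperature sensor. Alternatively, you could run a cross-check on a TI EVM.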
,
Sinty Liang:
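We created six temp.txt files on the USB drive to record the temperatures of the six thermal zones in real time. After a power-down we checked the data: zone5 was below 60 °C, yet the log still shows the 255 C alarm and shutdown. So we can be sure that this is a false alarm from the system!
Attachment: 温度相关messages.zip (temperature-related messages)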
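A logger of the kind described here could look roughly like the sketch below. It is not the script used in this thread. The names sample_zone and log_once, the CSV layout, and the output path are all assumptions made for illustration. Compared with a plain cat loop, it records failed reads (with their errno) instead of dropping them, and it calls fsync so the last samples are more likely to reach the USB drive before power is lost.

```python
import os
import time
from pathlib import Path


def sample_zone(temp_path) -> str:
    """Return one log field: degrees Celsius as text, or a marker if the read fails."""
    try:
        raw = int(Path(temp_path).read_text().strip())
    except OSError as exc:
        # A failing sensor read surfaces here, e.g. errno 121 on an I2C error.
        return f"read-error errno={exc.errno}"
    except ValueError:
        return "parse-error"
    return f"{raw / 1000:.3f}"


def log_once(zones, out_file, base="/sys/class/thermal") -> None:
    """Append one timestamped line with a field for every requested zone."""
    fields = [sample_zone(Path(base, f"thermal_zone{z}", "temp")) for z in zones]
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    with open(out_file, "a") as fh:
        fh.write(stamp + "," + ",".join(fields) + "\n")
        fh.flush()
        os.fsync(fh.fileno())  # push the line to the storage device
```

Calling log_once(range(6), "/path/to/usb/temps.csv") once per second from a loop would give one line per sample. The path here is a placeholder.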
,
Shine:
I suggest running a cross-check on a TI EVM board to find out whether the problem lies in the software or in the on-chip temperature sensor.
,
user6044729:
I have run into this problem too and am working on it now. I found that zone0-4 are the five thermal zones described in the CPU manual:
0x4A00232C CTRL_CORE_TEMP_SENSOR_MPU
0x4A002330 CTRL_CORE_TEMP_SENSOR_GPU
0x4A002334 CTRL_CORE_TEMP_SENSOR_CORE
0x4A002578 CTRL_CORE_TEMP_SENSOR_IVA
0x4A002574 CTRL_CORE_TEMP_SENSOR_DSPEVE
zone5 is the TMP102 temperature sensor chip that you recommend and that we use. The device tree configuration is as follows:
&thermal_zones {
	board_thermal: board_thermal {
		polling-delay-passive = <1250>; /* milliseconds */
		polling-delay = <1500>; /* milliseconds */

		/* sensorID */
		thermal-sensors = <&tmp1020>;

		board_trips: trips {
			board_alert0: board_alert {
				temperature = <40000>; /* millicelsius */
				hysteresis = <2000>; /* millicelsius */
				type = "active";
			};

			board_crit: board_crit {
				temperature = <105000>; /* millicelsius */
				hysteresis = <0>; /* millicelsius */
				type = "critical";
			};
		};

		board_cooling_maps: cooling-maps {
			map0 {
				trip = <&board_alert0>;
				cooling-device =
					<&pwm_fan THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
			};
		};
	};
};
The circuit is as follows:
At present, the temperatures printed periodically after startup are all normal, around 60-70 °C.
A board that shut down printed the following messages:
thermal thermal_zone5: failed to read out thermal zone (-121)
thermal thermal_zone5: failed to read out thermal zone (-121)
thermal thermal_zone5: failed to read out thermal zone (-121)
thermal thermal_zone5: failed to read out thermal zone (-121)
cat: read error: Remote I/O error
thermal thermal_zone5: critical temperature reached(255 C),shutting down
So I would like to ask whether you have seen a similar problem. Thank you.
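The (-121) in the kernel message is a negated errno value. The short sketch below decodes it with Python's standard errno module. On Linux, 121 is EREMOTEIO, whose message text is the same "Remote I/O error" that cat printed. The function name decode_kernel_error is invented for this example. Linux I2C bus drivers commonly return this code when a device does not acknowledge a transfer, but whether that is what happens on this board is not established in the thread.

```python
import errno
import os


def decode_kernel_error(code: int) -> str:
    """Return the symbolic errno name and message for a negative kernel return code."""
    number = abs(code)
    name = errno.errorcode.get(number, "UNKNOWN")
    return f"{name} ({os.strerror(number)})"
```

On a Linux host, decode_kernel_error(-121) returns a string starting with EREMOTEIO, which points at the sensor read on the bus rather than at the temperature itself.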
,
Shine:
I see that the errata lists a bug that causes spurious alerts. Please refer to the workaround.
Spurious Thermal Alert Generation When Temperature Remains in Expected Range
https://www.ti.com/lit/er/sprz436g/sprz436g.pdf
,
Sinty Liang:
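The errata items all concern the AM5718's internal sensors, but our problem is a false alarm from the off-chip temperature sensor, while the values actually sampled are fine. We are still verifying this and do not have a conclusion yet.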
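One observation that may help narrow this down: the two critical values in the logs, 127 C and 255 C, coincide with the largest positive temperatures a TMP102 register can encode in its 12-bit format (127.9375 °C) and its 13-bit extended format (255.9375 °C), once the fraction is truncated. The sketch below decodes a raw register word according to the TMP102 data format (left-justified two's complement, 0.0625 °C per LSB). The function name is made up for this example, and the idea that the shutdown value comes from corrupted bus data rather than a real reading is only a hypothesis that nobody in this thread has confirmed.

```python
def tmp102_raw_to_celsius(raw: int, extended: bool = False) -> float:
    """Decode a 16-bit TMP102 temperature register value into degrees Celsius.

    Normal mode: 12-bit two's complement in bits 15..4.
    Extended mode: 13-bit two's complement in bits 15..3.
    One LSB is 0.0625 C in both modes.
    """
    bits = 13 if extended else 12
    value = (raw & 0xFFFF) >> (16 - bits)
    if value & (1 << (bits - 1)):  # sign bit set: negative temperature
        value -= 1 << bits
    return value * 0.0625
```

With this decoding, 0x1900 is 25.0 °C, 0x7FF0 is 127.9375 °C in normal mode, and 0x7FFF is 255.9375 °C in extended mode. Capturing the raw I2C bytes at the moment of a failure would show whether such a pattern is actually being read.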
,
user6044729:
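Yes. On your forum I found a thread about abnormal I2C reads during high/low-temperature testing. Could this be related to the pull-up resistors? The core board we use has 2.2K pull-ups, and I am not sure whether that is the problem.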
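Whether a 2.2K pull-up is suitable depends on the bus voltage and the bus capacitance, neither of which is given in this thread. As a rough check, the I2C-bus specification estimates the 30%-70% rise time as 0.8473 x Rp x Cb and bounds the minimum pull-up by (VDD - VOL) / IOL. The sketch below applies both formulas. The function names are invented for this example, and the 100 pF and 3.3 V figures used afterwards are placeholders, not measurements from this board.

```python
def i2c_rise_time_ns(pullup_ohms: float, bus_capacitance_pf: float) -> float:
    """Estimate the 30%-70% rise time of an open-drain I2C line in nanoseconds."""
    # ohms * picofarads gives picoseconds; divide by 1000 for nanoseconds.
    return 0.8473 * pullup_ohms * bus_capacitance_pf / 1000


def min_pullup_ohms(vdd: float, vol_max: float = 0.4, iol_amps: float = 0.003) -> float:
    """Smallest pull-up that keeps the low level at or below vol_max at iol_amps sink current."""
    return (vdd - vol_max) / iol_amps
```

With the placeholder values, i2c_rise_time_ns(2200, 100) is about 186 ns, under the specification's 300 ns Fast-mode and 1000 ns Standard-mode limits, and min_pullup_ohms(3.3) is about 967 ohms. Measuring the real rise time and low level with an oscilloscope across the temperature range would be more conclusive than this estimate.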
,
user6044729:
Do you have any suggestions for how to troubleshoot this problem?