TI Chinese Support Network
A site that collects and shares TI technical questions in Chinese

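UBIFS error: file system crash

Some of our products have recently been returned for repair with an intermittent fault. Our analysis points to UBIFS file system errors. Any help would be appreciated. Details follow:

[Product platform]

① CPU: AM3354

② Linux kernel version: Linux version 3.2.0

③ gcc compiler version: gcc version 4.5.3 20110311 (prerelease)

④ File system: ubifs

[Symptoms]

(Each unit fails a little differently, but all of them point to UBIFS file system errors. Two symptoms are listed below for analysis.)

Symptom 1: the system fails to boot normally. The boot log is as follows: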

U-Boot SPL 2011.09 (Sep 11 2013 - 08:39:49)
Texas Instruments Revision detection unimplemented

U-Boot 2011.09 (Sep 11 2013 - 08:39:49)

DRAM:  256 MiB
NAND:  256 MiB
MMC:   OMAP SD/MMC: 0
*** Warning - bad CRC, using default environment

Net:   cpsw
Hit any key to stop autoboot:  0
Booting from nand ...
HW ECC BCH8 Selected

NAND read: device 0 offset 0x280000, size 0x500000
 5242880 bytes read: OK
## Booting kernel from Legacy Image at 80007fc0 ...
   Image Name:   Linux-3.2.0
   Image Type:   ARM Linux Kernel Image (uncompressed)
   Data Size:    2960864 Bytes = 2.8 MiB
   Load Address: 80008000
   Entry Point:  80008000
   Verifying Checksum ... OK
   XIP Kernel Image ... OK
OK

Starting kernel ...

Uncompressing Linux... done, booting the kernel.
[    0.000000] Linux version 3.2.0 (rm@localhost.localdomain) (gcc version 4.5.3 20110311 (prerelease) (GCC) ) #113 Mon Nov 3 17:05:49 CST 2014

………………………..

(The Linux kernel boots normally at this point with no abnormal messages; output omitted.)

……………………………..

[    4.001856] rtc-s35390a 1-0030: setting system clock to 2000-01-02 21:31:24 UTC (946848684)
[    4.078074] UBIFS: recovery needed
[    4.235435] UBIFS: recovery completed
[    4.239322] UBIFS: mounted UBI device 0, volume 0, name "rootfs"
[    4.245663] UBIFS: file system size:   199225344 bytes (194556 KiB, 189 MiB, 1569 LEBs)
[    4.254106] UBIFS: journal size:       9023488 bytes (8812 KiB, 8 MiB, 72 LEBs)
[    4.261799] UBIFS: media format:       w4/r0 (latest is w4/r0)
[    4.267951] UBIFS: default compressor: lzo
[    4.272260] UBIFS: reserved for root:  0 bytes (0 KiB)
[    4.280961] VFS: Mounted root (ubifs filesystem) on device 0:13.
[    4.288499] Freeing init memory: 608K
INIT: version 2.86 booting

/* UBIFS file system errors start here */
[    4.754005] UBIFS error (pid 823): ubifs_check_node: bad CRC: calculated 0xbe587da8, read 0xe45bdd67
[    4.763690] UBIFS error (pid 823): ubifs_check_node: bad node at LEB 211:113632
[    4.771406] UBIFS error (pid 823): ubifs_read_node: expected node type 9
[    4.778479] UBIFS error (pid 823): ubifs_iget: failed to read inode 189, error -117
[    4.786577] UBIFS error (pid 823): ubifs_lookup: dead directory entry 'default', error -117
[    4.795383] UBIFS warning (pid 823): ubifs_ro_mode: switched to read-only mode, error -117
[    4.804094] Backtrace:
[    4.806720] [<c0017978>] (dump_backtrace+0x0/0x110) from [<c03dc4ec>] (dump_stack+0x18/0x1c)
[    4.815640]  r6:cf37d000 r5:cf47a368 r4:60008400 r3:c05f6c48
[    4.821655] [<c03dc4d4>] (dump_stack+0x0/0x1c) from [<c01736d0>] (ubifs_ro_mode+0x74/0x78)
[    4.830387] [<c017365c>] (ubifs_ro_mode+0x0/0x78) from [<c016d5bc>] (ubifs_lookup+0x148/0x150)
[    4.839461]  r4:cf480298 r3:c05f6c48
[    4.843253] [<c016d474>] (ubifs_lookup+0x0/0x150) from [<c00b0858>] (d_alloc_and_lookup+0x4c/0x6c)
[    4.852694]  r8:00000001 r7:00000000 r6:cfab1ed8 r5:cf480298 r4:cf47a368
[    4.859812] [<c00b080c>] (d_alloc_and_lookup+0x0/0x6c) from [<c00b2810>] (do_lookup+0x254/0x34c)
[    4.869070]  r6:cfab1e44 r5:cfab1ed8 r4:cfab1e3c r3:00000000
[    4.875080] [<c00b25bc>] (do_lookup+0x0/0x34c) from [<c00b2a3c>] (link_path_walk+0x134/0x7d0)
[    4.884080] [<c00b2908>] (link_path_walk+0x0/0x7d0) from [<c00b44f8>] (path_openat+0xa4/0x398)
[    4.893153] [<c00b4454>] (path_openat+0x0/0x398) from [<c00b48fc>] (do_filp_open+0x34/0x88)
[    4.901972] [<c00b48c8>] (do_filp_open+0x0/0x88) from [<c00a6b00>] (do_sys_open+0xe8/0x180)
[    4.910771]  r7:00000001 r6:00000003 r5:00020000 r4:cf93c000
[    4.916781] [<c00a6a18>] (do_sys_open+0x0/0x180) from [<c00a6bc0>] (sys_open+0x28/0x2c)
[    4.925243] [<c00a6b98>] (sys_open+0x0/0x2c) from [<c0014280>] (ret_fast_syscall+0x0/0x30)
INIT: Entering runlevel: 51: can

/* UBIFS file system errors are printed again here */
[    5.045726] UBIFS error (pid 826): ubifs_check_node: bad CRC: calculated 0xbe587da8, read 0xe45bdd67
[    5.055414] UBIFS error (pid 826): ubifs_check_node: bad node at LEB 211:113632
[    5.063116] UBIFS error (pid 826): ubifs_read_node: expected node type 9
[    5.070190] UBIFS error (pid 826): ubifs_iget: failed to read inode 189, error -117
[    5.078290] UBIFS error (pid 826): ubifs_lookup: dead directory entry 'default', error -117
/etc/init.d/rc: .: line 18: can't open /etc/default/rcS
Cannot create Qt/Embedded data directory: /tmp/qtembedded-0
Cannot create Qt/Embedded data directory: /tmp/qtembedded-0
Cannot create Qt/Embedded data directory: /tmp/qtembedded-0
Cannot create Qt/Embedded data directory: /tmp/qtembedded-0
Cannot create Qt/Embedded data directory: /tmp/qtembedded-0
Cannot create Qt/Embedded data directory: /tmp/qtembedded-0
Cannot create Qt/Embedded data directory: /tmp/qtembedded-0
Cannot create Qt/Embedded data directory: /tmp/qtembedded-0
Cannot create Qt/Embedded data directory: /tmp/qtembedded-0
Cannot create Qt/Embedded data directory: /tmp/qtembedded-0
INIT: Id "tty2" respawning too fast: disabled for 5 minutes
[    9.584319] UBIFS error (pid 858): make_reservation: cannot reserve 160 bytes in jhead 1, error -30
[    9.593872] UBIFS error (pid 858): ubifs_write_inode: can't write inode 5005, error -30

/* At this point the Linux system hangs and goes no further */
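
For readers triaging a log like the one above, the numeric codes can be decoded mechanically. The following is a minimal Python sketch, not part of the original post. It pulls the UBIFS error and warning lines out of a console log and maps the two codes that appear in this thread to their Linux errno names: -117 is EUCLEAN ("Structure needs cleaning"), which UBIFS returns when a node fails its CRC check, and -30 is EROFS ("Read-only file system"). The function name `summarize` and the regular expressions are illustrative assumptions.

```python
import re

# Linux errno values, listed by hand so the sketch also runs on non-Linux hosts.
ERRNO_NAMES = {
    30: "EROFS",     # Read-only file system
    117: "EUCLEAN",  # Structure needs cleaning
}

# Matches e.g. "UBIFS error (pid 823): ubifs_iget: failed to read inode 189, error -117"
UBIFS_LINE = re.compile(
    r"UBIFS (?P<level>error|warning) \(pid (?P<pid>\d+)\): "
    r"(?P<func>\w+): (?P<msg>.*)"
)
ERROR_CODE = re.compile(r"error (-\d+)")


def summarize(log_text):
    """Return (function, errno name or None, message) for each UBIFS error/warning line."""
    rows = []
    for line in log_text.splitlines():
        m = UBIFS_LINE.search(line)
        if not m:
            continue
        name = None
        code = ERROR_CODE.search(m.group("msg"))
        if code:
            number = -int(code.group(1))
            name = ERRNO_NAMES.get(number, f"errno {number}")
        rows.append((m.group("func"), name, m.group("msg")))
    return rows


# Lines copied from the log in this thread.
sample = """\
[    4.754005] UBIFS error (pid 823): ubifs_check_node: bad CRC: calculated 0xbe587da8, read 0xe45bdd67
[    4.778479] UBIFS error (pid 823): ubifs_read_node: expected node type 9
[    4.786577] UBIFS error (pid 823): ubifs_iget: failed to read inode 189, error -117
[    9.584319] UBIFS error (pid 858): make_reservation: cannot reserve 160 bytes in jhead 1, error -30
"""

for func, name, msg in summarize(sample):
    print(f"{func:20s} {name or '-':8s} {msg}")
```

Read this way, the -30 errors look like a follow-on effect: once the file system has switched to read-only mode, later writes fail with EROFS, so the bad-CRC node is the event to investigate.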

 

 

 

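Symptom 2: the system boots successfully, but file operations fail. Operating on files crashes the file system, which then becomes read-only: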

U-Boot SPL 2011.09 (Sep 11 2013 - 08:39:49)
Texas Instruments Revision detection unimplemented
U-Boot 2011.09 (Sep 11 2013 - 08:39:49)

DRAM:  256 MiB
NAND:  256 MiB
MMC:   OMAP SD/MMC: 0
*** Warning - bad CRC, using default environment

Net:   cpsw
Hit any key to stop autoboot:  0
Booting from nand ...
HW ECC BCH8 Selected

NAND read: device 0 offset 0x280000, size 0x500000
 5242880 bytes read: OK
## Booting kernel from Legacy Image at 80007fc0 ...
   Image Name:   Linux-3.2.0
   Image Type:   ARM Linux Kernel Image (uncompressed)
   Data Size:    2960864 Bytes = 2.8 MiB
   Load Address: 80008000
   Entry Point:  80008000
   Verifying Checksum ... OK
   XIP Kernel Image ... OK
OK

Starting kernel ...

Uncompressing Linux... done, booting the kernel.
[    0.000000] Linux version 3.2.0 (rm@localhost.localdomain) (gcc version 4.5.3 20110311 (prerelease) (GCC) ) #113 Mon Nov 3 17:05:49 CST 2014

………………………..

(The Linux kernel boots normally at this point with no abnormal messages; output omitted.)

……………………………..

 

[    4.001908] rtc-s35390a 1-0030: setting system clock to 2000-01-04 01:28:44 UTC (946949324)
[    4.078180] UBIFS: recovery needed
[    4.180129] UBIFS: recovery completed
[    4.184043] UBIFS: mounted UBI device 0, volume 0, name "rootfs"
[    4.190372] UBIFS: file system size:   199225344 bytes (194556 KiB, 189 MiB, 1569 LEBs)
[    4.198815] UBIFS: journal size:       9023488 bytes (8812 KiB, 8 MiB, 72 LEBs)
[    4.206518] UBIFS: media format:       w4/r0 (latest is w4/r0)
[    4.212655] UBIFS: default compressor: lzo
[    4.216996] UBIFS: reserved for root:  0 bytes (0 KiB)
[    4.226220] VFS: Mounted root (ubifs filesystem) on device 0:13.
[    4.233732] Freeing init memory: 608K
INIT: version 2.86 booting
Please wait: booting...
Starting udev
[    5.093881] udevd (841): /proc/841/oom_adj is deprecated, please use /proc/841/oom_score_adj instead.
[    9.827571] alignment: ignoring faults is unsafe on this CPU.  Defaulting to fixup mode.
Root filesystem already rw, not remounting
Caching udev devnodes
ALSA: Restoring mixer settings...
Configuring network interfaces... /usr/sbin/alsactl: load_state:1625: No soundcards found...
[   10.684082] net eth0: CPSW phy found : id is : 0x1cc915
done.
Setting up IP spoofing protection: rp_filter.
[   10.784501] net eth1: CPSW phy found : id is : 0x1cc915
INIT: Entering runlevel: 5
Starting system message bus: dbus.
Starting telnet daemon.
Starting syslogd/klogd: done
Starting thttpd.

root@am335x:~# –1–fd=16—
[   13.363970] net eth0: CPSW phy found : id is : 0x1cc915
[   15.354352] PHY: 0:01 - Link is Up - 100/Full

/* At this point Linux has booted successfully */

/* Deleting files on the system, as shown below, then triggers errors */

root@am335x:~# cd /home/
root@am335x:/home# ls
app         app.tar.gz  app_U       chargeData  mtar        root
root@am335x:/home# rm -rf app.tar.gz    /* delete a file */
root@am335x:/home# cd /usr/
root@am335x:/usr# cd qt/lib/fonts/
root@am335x:/usr/qt/lib/fonts# rm wenquanyi_160_75.qpf /* delete a file */
root@am335x:/usr/qt/lib/fonts# rm wenquanyi_160_50.qpf  /* delete a file */

/* After the deletions above, UBIFS starts reporting errors and switches to read-only */
[   67.372753] UBIFS error (pid 2021): ubifs_check_node: bad CRC: calculated 0x8819f8fc, read 0xd9c5d8fd
[   67.382503] UBIFS error (pid 2021): ubifs_check_node: bad node at LEB 224:384
[   67.390038] UBIFS error (pid 2021): ubifs_read_node: expected node type 9
[   67.397198] UBIFS warning (pid 2021): ubifs_ro_mode: switched to read-only mode, error -117
[   67.406001] Backtrace:
[   67.408631] [<c0017978>] (dump_backtrace+0x0/0x110) from [<c03dc4ec>] (dump_stack+0x18/0x1c)
[   67.417540]  r6:ffffff8b r5:cf519048 r4:60008400 r3:c05f6c48
[   67.423557] [<c03dc4d4>] (dump_stack+0x0/0x1c) from [<c01736d0>] (ubifs_ro_mode+0x74/0x78)
[   67.432288] [<c017365c>] (ubifs_ro_mode+0x0/0x78) from [<c016a0b4>] (ubifs_jnl_delete_inode+0x94/0xb4)
[   67.442097]  r4:cf37d000 r3:00000000
[   67.445899] [<c016a020>] (ubifs_jnl_delete_inode+0x0/0xb4) from [<c016efd8>] (ubifs_evict_inode+0xc0/0x100)
[   67.456164]  r9:cfb8a000 r8:c0014428 r7:0000000a r6:c03f2688 r5:cf37d000
[   67.463184] r4:cf519048
[   67.465985] [<c016ef18>] (ubifs_evict_inode+0x0/0x100) from [<c00bcebc>] (evict+0x7c/0x160)
[   67.474791]  r5:c03f2688 r4:cf519048
[   67.478578] [<c00bce40>] (evict+0x0/0x160) from [<c00bd0b0>] (iput+0x110/0x1b0)
[   67.486282]  r5:cf376600 r4:cf519048
[   67.490078] [<c00bcfa0>] (iput+0x0/0x1b0) from [<c00b3a2c>] (do_unlinkat+0x120/0x164)
[   67.498329]  r6:cf519048 r5:cf58d118 r4:00000000 r3:00000000
[   67.504341] [<c00b390c>] (do_unlinkat+0x0/0x164) from [<c00b4d3c>] (sys_unlink+0x18/0x1c)
[   67.512944]  r6:00000000 r5:00000000 r4:bebd1da9
[   67.517868] [<c00b4d24>] (sys_unlink+0x0/0x1c) from [<c0014280>] (ret_fast_syscall+0x0/0x30)
[   67.526763] UBIFS error (pid 2021): ubifs_evict_inode: can't delete inode 8398, error -117
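
The "bad node at LEB n:offset" messages can be related to the volume geometry printed at mount time. The sketch below is an illustration rather than anything from the original post. It derives the LEB size from the "file system size" line (199225344 bytes over 1569 LEBs) and converts the two reported positions, LEB 211:113632 and LEB 224:384, into a logical offset within the UBI volume and a page index within the LEB. The 2 KiB page size is an assumption, since the thread does not name the NAND part on these boards, and the helper names `leb_size` and `locate` are hypothetical.

```python
FS_SIZE_BYTES = 199225344   # "file system size" from the mount log
LEB_COUNT = 1569            # LEB count from the same log line
PAGE_SIZE = 2048            # ASSUMPTION: 2 KiB NAND pages (not stated in the thread)


def leb_size(fs_size, leb_count):
    """Derive the LEB size; the file system size must be a whole number of LEBs."""
    size, remainder = divmod(fs_size, leb_count)
    if remainder:
        raise ValueError("file system size is not a whole number of LEBs")
    return size


def locate(leb, offset, leb_bytes, page_size=PAGE_SIZE):
    """Translate a 'LEB n:offset' pair into logical volume and page coordinates."""
    if not 0 <= offset < leb_bytes:
        raise ValueError("offset lies outside the LEB")
    return {
        "volume_offset": leb * leb_bytes + offset,  # logical, not a flash address
        "page_in_leb": offset // page_size,
        "offset_in_page": offset % page_size,
    }


LEB_BYTES = leb_size(FS_SIZE_BYTES, LEB_COUNT)
print("LEB size:", LEB_BYTES, "bytes")
for leb, offset in [(211, 113632), (224, 384)]:
    print(f"LEB {leb}:{offset} ->", locate(leb, offset, LEB_BYTES))
```

The derived LEB size is 126976 bytes (124 KiB), which would be consistent with a 128 KiB erase block that gives up two 2 KiB pages to UBI headers, although that layout is also an assumption. These are logical coordinates only: UBI remaps LEBs to physical erase blocks, so finding the physical block behind a failing LEB requires UBI's own mapping information.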

 

Please help, this is urgent!

Jian Zhou:

May I ask what the customer's application scenario looks like? Does abnormal power loss happen frequently?

jinlei zhang:

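Reply to Jian Zhou:

Hello!

Thank you very much for the prompt answer!

When we first started analyzing this problem, we also suspected that frequent power loss was the cause. To verify this, we built a test fixture and ran repeated power-cycling experiments:

Experiment 1: boot to the command line -> cut power immediately -> wait 5 s. Ran for 3 days and 3 nights (estimated: about 3000 cycles)

Experiment 2: cut power before boot completes (kernel about halfway through startup) -> wait 5 s. Ran for 1 night (estimated: about 700 cycles)

However, we could not reproduce the symptoms seen on the units returned from the field!

At the moment we are stuck: because we cannot reproduce the symptoms, we cannot determine the root cause.

Our questions:

① What are the possible causes? Is it the kernel or the UBIFS file system?

② Is it related to the product's application software? For example, which application operations could cause this problem?

③ Have other TI customers run into this kind of problem? How did they solve it?

Looking forward to your reply. Thank you!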

Jian Zhou:

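Reply to jinlei zhang:

If this is an industrial environment, I suggest switching to the more robust YAFFS file system.

Alternatively, make the root file system read-only.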

jinlei zhang:

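Reply to Jian Zhou:

Hello!

Thank you very much for the suggestions!

Our product does operate in an industrial environment.

Are there reference documents or materials for the two options you proposed?

① How do we build the more robust YAFFS file system? Does TI provide one that can be used directly? (CPU: AM3354, Linux kernel version: Linux version 3.2.0)

② How do we make the root file system read-only?

In addition:

Because this involves recalling and reworking products already in the field, which is a major undertaking, we want to be sure the fix is effective. We would therefore like to know:

Have other TI customers encountered similar problems? How did they solve them? Was the problem eliminated in the end?

Looking forward to your reply. Thank you!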

Yaoming Qin:

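Reply to jinlei zhang:

The file system is pure software built on top of the MTD driver, so it is independent of the specific platform. You can refer to http://processors.wiki.ti.com/index.php/Create_a_YAFFS_Target_Image?keyMatch=yaffs&tisearch=Search-EN or search Baidu for more material.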

Yaoming Qin:

Reply to jinlei zhang:

As for making the file system read-only: in the kernel command-line arguments set in U-Boot, change rw to ro.
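
The rw-to-ro change described above can be sketched as a small string transformation, for example when generating a U-Boot environment in a build script. This is a minimal illustration, not a procedure given in the thread. The function name `make_root_readonly` is hypothetical, and the example bootargs string is invented for illustration because the thread does not show the real one.

```python
def make_root_readonly(bootargs):
    """Return bootargs with every standalone 'rw' token replaced by 'ro'.

    Works on whole tokens so that values merely containing 'rw' are untouched.
    If neither token is present, 'ro' is appended to make the intent explicit.
    """
    tokens = bootargs.split()
    if "rw" in tokens:
        return " ".join("ro" if t == "rw" else t for t in tokens)
    if "ro" in tokens:
        return " ".join(tokens)
    return " ".join(tokens + ["ro"])


# HYPOTHETICAL bootargs for illustration only; the real ones are not shown in the thread.
example = "console=ttyO0,115200n8 ubi.mtd=7,2048 root=ubi0:rootfs rootfstype=ubifs rw"
print(make_root_readonly(example))
```

Changing the kernel argument is probably not sufficient on its own. The boot log in this thread shows an init script reporting "Root filesystem already rw, not remounting" and an application writing under /tmp, so the init scripts and any paths the application writes to would likely also need review, for instance by placing them on a RAM-backed file system.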

leo chen:

Reply to Yaoming Qin:

I wonder whether ramfs would be workable. That way the system would start from its initial state on every reboot.

young young:

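Hello!

Has your problem been solved? Many people have run into this, and TI has never given an answer that resolves it.

I suspect an ECC error-correction problem. I ran into the same problem as you: it occurs when I use the TC58NVG1S3HTA00 NAND flash with 8-bit ECC. On the failing unit, you could check whether a NAND bad block happens to lie inside the file system partition. If it does, then after many power cycles, as soon as data is read from or written to that bad-block area, the error output you posted appears and the system fails to boot. We have not solved it yet.

How are things on your side?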

jinlei zhang:

Reply to young young:

Hello!

Our problem has not been solved yet. We are still looking for a solution...

young young:

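Reply to jinlei zhang:

Hello!

I hope we can look into this together. I suspect the ECC error correction is going wrong. Are you using BCH8?

In our case, whenever there is a bad block in the file system partition, the file system fails to come up.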
