XFS – Empty files after a crash

Another major bug in XFS:

After a power failure, the server came back up but showed some files with a size of 0 (when I was sure that should not be possible).
Running ‘du -sh’ on the folder, however, showed that the total size was > 0 …

The bug was already reported on the XFS mailing list here:

http://oss.sgi.com/archives/xfs/2012-02/msg00517.html

If you have many files to recover, the procedure is really a pain. That’s why I created a small PHP script that does the trick.

https://github.com/odoucet/xfsrepair

I’m not responsible if you destroy your data 😉 but this script worked well for me.
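
Before running any recovery, it helps to list which files are actually affected. Here is a minimal PHP sketch of that check (it is not the script from the repository above, and the directory path is only an example): a victim of this bug reports a size of 0 in stat while still having data blocks allocated.

<?php
// Minimal detection sketch: list files whose stat size is 0 but which
// still have data blocks allocated (the symptom described above).
// The directory path is an example -- adjust it to your own folder.

$dir = '/path/to/folder';

$it = new RecursiveIteratorIterator(
    new RecursiveDirectoryIterator($dir, FilesystemIterator::SKIP_DOTS)
);

foreach ($it as $file) {
    if (!$file->isFile()) {
        continue;
    }
    $st = stat($file->getPathname());
    // 'blocks' is reported in 512-byte units; a zero-sized file that still
    // owns blocks is a candidate for recovery.
    if ($st['size'] === 0 && $st['blocks'] > 0) {
        printf("%s: size=0 but %d x 512-byte blocks allocated\n",
            $file->getPathname(), $st['blocks']);
    }
}

This only identifies candidates; the actual data extraction is what the script linked above automates.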

8 thoughts on “XFS – Empty files after a crash”

  1. Recently we had a power cut on a file server with XFS and we saw the effect you describe. I did a recovery test using your script and it works well; we hope to recover the data. Some binary files were already recovered and appeared usable in the application. Many thanks for a very useful script. However, I see a strange behaviour with small ASCII files. It seems that recovery is done using whole blocks, 32k in my case: for short original ASCII files < 32k, the recovered file is always 32k and random content is appended to the file. It looks as if the EOF is lost somehow when the blocks are copied, or something like this. I do not have any experience with XFS. Do you think it is possible to correct this behaviour?

  2. Hello Mariusz,
    We do not know where the EOF is, because I believe this information is stored in the metadata. I also experienced this problem, but I did not find any workaround for it. Maybe you could ask on the XFS mailing list (http://xfs.org/index.php/XFS_email_list_and_archives). So yes, recovery is not perfect, but it is still better than nothing 🙂

  3. Hi Olivier,
    Thanks for the answer. I see a strange effect: the recovered files are much bigger than the originals. I did the test with non-zero files which in principle should be OK. For example:
    original – 162968992
    recovered – 1303773184
    which is very close to a factor of 8.
    Do you have any idea what it could be?

  4. I have found that, in the case of my storage, stat returns the number of 512-byte blocks, while dd is called with bs=4096; hence the factor of 8. I modified the number of blocks passed to dd in your script and now everything works. I recovered all affected files. Thanks for the help.

  5. Good to know. Could you please give me the output of ‘stat’ on one of your files? I’ll see what differs from my code and update it if necessary.

  6. du bbInclPowheg_9091.dst
    262144 bbInclPowheg_9091.dst

    stat bbInclPowheg_9091.dst
    File: `bbInclPowheg_9091.dst'
    Size: 0 Blocks: 524288 IO Block: 4096 regular empty file
    Device: 801h/2049d Inode: 16868 Links: 1
    Access: (0644/-rw-r--r--) Uid: ( 1009/ UNKNOWN) Gid: ( 1000/ UNKNOWN)
    Access: 2012-03-29 13:29:51.878887000 +0200
    Modify: 2011-09-26 11:00:46.299480000 +0200
    Change: 2012-09-12 14:50:14.665280518 +0200

    It seems that stat gives the number of device blocks, 512 bytes each in my case.
    One can get the correct size of the file (the size reported by du) by multiplying the number of blocks from stat by 512:
    524288 × 512 bytes = 268435456 bytes = 262144k (see the worked example after the comments)

    xfs_db for this file gives the number of 4096-byte blocks, i.e. 65536.
    In your script I replaced the number of blocks from stat with the number of blocks from xfs_db, so the dd command using a 4096-byte block size gives the correct file size on output.

  7. Thank you. I’ve updated my source code and I now use xfs_db for the sector size.
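
As a short worked example of the unit confusion discussed in the comments above (in PHP, to match the script; the variable names are only illustrative), using the numbers from the stat output in comment 6:

<?php
// stat reports "Blocks:" in 512-byte units, whatever the filesystem block
// size is, which explains the factor-8 oversize when dd is run with bs=4096.

$statBlocks  = 524288;   // "Blocks:" from stat, in 512-byte units
$fsBlockSize = 4096;     // "IO Block:" from stat (the XFS block size)

$bytes    = $statBlocks * 512;            // 268435456 bytes of real data
$fsBlocks = intdiv($bytes, $fsBlockSize); // 65536 blocks of 4096 bytes, as xfs_db reports

printf("real size : %d bytes = %dk (matches 'du')\n", $bytes, intdiv($bytes, 1024));
printf("dd count  : %d (with bs=%d)\n", $fsBlocks, $fsBlockSize);
printf("ratio     : %d (using the stat count with bs=4096 copies 8x too much data)\n",
       $fsBlockSize / 512);

The fix described in comment 4 amounts to feeding dd the 4096-byte block count (65536) rather than the 512-byte count from stat (524288).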
