
Bug#940821: linux-image-5.2.0-2-amd64: file cache corruption with nfs4



Control: tags -1 + moreinfo

Hi Anton,

On Fri, Sep 20, 2019 at 11:09:29AM +0100, Anton Ivanov wrote:
> Package: src:linux
> Version: 5.2.9-2
> Severity: critical
> Justification: breaks unrelated software
> 
> Dear Maintainer,
> 
> NFSv4 caching is completely broken on SMP.
> 
> How to reproduce:
> 
> Option 1. clone openwrt, run while make clean && make -j `nproc` ; do true ; done
> 
> It will break depending on number of CPUs within several runs. 
> 
> Symptoms of breakage. A directory on the client looks empty. Example (mnt is an NFSv4 mount):
> 
> ls -laF /mnt/src/openwrt/build_dir/target-mips_24kc_musl/linux-ar71xx_tiny/linux-4.14.125/arch/mips/include/generated/uapi/asm
> total 8
> drwxr-xr-x 2 anivanov anivanov 4096 Sep 20 10:51 ./
> drwxr-xr-x 3 anivanov anivanov 4096 Sep 20 10:51 ../
> 
> While it actually has a file in it (same on server):
> 
> ls -laF /exports/work/src/openwrt/build_dir/target-mips_24kc_musl/linux-ar71xx_tiny/linux-4.14.125/arch/mips/include/generated/uapi/asm
> total 12
> drwxr-xr-x 2 anivanov anivanov 4096 Sep 20 10:51 ./
> drwxr-xr-x 3 anivanov anivanov 4096 Sep 20 10:51 ../
> -rw-r--r-- 1 anivanov anivanov   32 Sep 20 10:51 ipcbuf.h
> 
> This cache entry on the client does not expire as it should per the NFSv4 caching documentation - the only way of dealing with it is reboot, unmount or caches drop.
> 
> Option 2. Have your $HOME on nfsv4 and use thunderbird. Move mails between folders. Sooner or later (usually sooner) you will lose an email.
> 
> So this is both "breaks unrelated software" and "data loss" depending on what you are doing.
> 
> Tested on:
> 
> AMD Ryzen 5 2400G, AMD Ryzen 5 1600X, AMD Ryzen 5 1600, AMD A8-6500
> 
> Shows up on all. Fastest on the 6 core 12 thread ryzens, slowest on the AMD A8 (takes up to 3 iterations of make there).

It looks like no one so far has been able to confirm or pinpoint the
issue (neither here nor upstream).

Are you still able to reproduce the issue with recent kernel versions?
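
If it helps with retesting, the symptom described above (a directory
that lists as empty on the client while the server shows a file in it)
could be checked mechanically. The following is only a minimal sketch,
not a tested reproducer: the function name missing_on_client and the
idea of saving the server-side listing to a file are assumptions made
for illustration, not something taken from the report.

```shell
#!/bin/sh
# Hypothetical helper (name and interface are assumptions for illustration).
# usage: missing_on_client CLIENT_DIR SERVER_LIST
#   CLIENT_DIR   directory as seen through the NFSv4 mount on the client
#   SERVER_LIST  file with one entry name per line, e.g. the output of
#                `ls -A` run on the server in the exported directory
# Prints every name that is in SERVER_LIST but not in the client's listing.
missing_on_client() {
    client_dir=$1
    server_list=$2
    tmp=$(mktemp) || return 1
    # Sort both listings in the C locale so comm sees a consistent order.
    LC_ALL=C ls -A "$client_dir" | LC_ALL=C sort > "$tmp"
    # comm -13 keeps only the lines unique to the second input (the server).
    LC_ALL=C sort "$server_list" | LC_ALL=C comm -13 "$tmp" -
    rm -f "$tmp"
}
```

After a run of the build loop, the listing of the affected directory
would be saved on the server, copied to the client, and passed to the
function together with the same directory under the NFSv4 mount. Any
name it prints is visible on the server but not on the client; empty
output means the two listings agree.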

Regards,
Salvatore

