[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

Re: file born 30 seconds after its creation on ext4 - bug?



On 2022-04-30 15:19:21 +0200, Thomas Schmitt wrote:
> An explanation of the observed problem would need:
> - a mechanism which delayed the content production of the inode while it
>   was already in use for open and write,
> - or a mechanism which caused ext4 to hide the inode to other processes
>   and to write a wrong birth timestamp,
> - or a mechanism which deleted the file shortly after it was created and
>   re-created it 30 seconds later with its full expected content,
> - or something of which i cannot think yet.
> 
> (The birth timestamp happens to match roughly the time when the file finally
> became visible to other processes. It matches the modification time of the
> finally visible file.)

I'm wondering whether the data are transferred from the VFS to ext4
necessarily within the same openat system call or could just be kept
in the VFS as long as they are not needed elsewhere, i.e. the VFS
behaving like a cache. In the latter case, since the VFS doesn't
have a notion of birth timestamp (from the code I've read), a bug
in the VFS code could explain the behavior I had observed. This is
the only explanation I could have.

> --------------------------------------------------------------------------
> 
> The strange strace report around the time when the file finally appeared
> 
>   openat(AT_FDCWD, "….out", O_WRONLY|O_CREAT|O_APPEND, 0666 <unfinished ...>
>   <... openat resumed>)  = 3
> 
> does not mean any disturbance, but only that strace had to deal with more
> than one thread or process at the time. Between above two lines there is
> supposed to have been another line with a system call, though.
> man 1 strace:
> 
>    If  a  system call is being executed and meanwhile another one is being
>    called from a different thread/process then strace will try to preserve
>    the  order  of  those  events and mark the ongoing call as being unfin‐
>    ished.  When the call returns it will be marked as resumed.
> 
>    [pid 28772] select(4, [3], NULL, NULL, NULL <unfinished ...>
>    [pid 28779] clock_gettime(CLOCK_REALTIME, {1130322148, 939977000}) = 0
>    [pid 28772] <... select resumed> )      = 1 (in [3])

Thanks for the information. I'd say that "resumed" is misleading, as
it makes me think that "select" was interrupted. IMHO, "continued"
would be much better.

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)


Reply to: