public inbox for goredo-devel@lists.cypherpunks.ru
From: Sergey Matveev <stargrave@stargrave•org>
To: goredo-devel@lists.cypherpunks.ru
Subject: Re: redo-ood taking much longer to return in a copy of a project, compared with the original
Date: Fri, 19 Nov 2021 23:41:16 +0300
Message-ID: <YZgL+AmcRH6LkJSq@stargrave.org>
In-Reply-To: <BE3C3BD8-EB1F-4AE7-883A-6DE07D732A75@gmail.com>


Greetings!

*** Karolis K [2021-11-19 22:23]:
>My understanding was that since .redo dependencies are stored under each dir individually the computations shouldn’t depend on where the root is located. But somehow it does.

Each recorded dependency is stored with the following metainformation
(some real example):

    [...]
    Type: ifchange
    Target: all.do
    Hash: 48a30bcbca86c8e2f66daa4111f86d59c79d59619a2445c7004f23b5db45de22
    Size: 875
    CtimeSec: 1628512691
    CtimeNsec: 304504000
    [...]

By default, if a file's ctime is unchanged, the file is assumed to be
unmodified and its contents are not read to compare the hash. When you
copy your project, all files get a new ctime (even with "cp -a", which,
unlike "cp -r", preserves mtime), so redo is forced to check each file's
contents. If you "export REDO_INODE_NO_TRUST=1", that behaviour (always
checking the hash) is used everywhere. Ctime metainformation is just an
optimization based on the assumption that the filesystem can be trusted
in that way. After copying, the recorded ctimes are useless and you just
lose that optimization.

And basically nothing can be done, as far as I can see. The only
information we can fully trust is the file's contents, which can also be
trusted through a long enough collision-resistant hash function. Another
thing that can be used to skip hash checking is the file's size: if it
differs, then the file's contents differ. And as an optimization to avoid
reading the file every time, we can use some metainformation from the
filesystem. Basically only mtime and ctime can be useful here. mtime
can hardly be trusted: https://apenwarr.ca/log/20181113
ctime is better, but it can give "false positives", for example when just
adding a hardlink. But it is still very helpful in practice.

-- 
Sergey Matveev (http://www.stargrave.org/)
OpenPGP: CF60 E89A 5923 1E76 E263  6422 AE1A 8109 E498 57EF

