From: "Jan Niklas Böhm" <mail@jnboehm•com>
To: goredo-devel@lists.cypherpunks.ru
Subject: Re: Suggestion to revert touching files when the hash matches (problem with hardlinks)
Date: Tue, 1 Nov 2022 08:50:20 +0100
Message-ID: <64d10f4c-b1db-c04e-e238-ee7c26fd5595@jnboehm.com>
In-Reply-To: <eee2108f-a0b1-45be-8dfe-1cffc5eba5e0@spacefrogg.net>
> Hardlinks are a bad idea due to their "automatic updates". You no longer get the guarantee that your output is only changed by redo.
Unfortunately I am kind of stuck with hardlinks at this point. I
have not actually looked into symlinks in detail yet, but that feels a
bit hacky (since then there is only an indirect link between the file
and the data contents).
> Are you sure this did not also happen before 1.23? Because I know this error. After the first run of b.do, you've already established the hardlink between a and b. Linking again to $3 doesn't change the fact that you also changed b directly (via the common link to a).
I am fairly sure that this is due to the symlinking, also because this
error does not occur when the output of "a" is changed and thus the file
gets renamed.
The reason is that when both "a" and "b" point to the same inode and we
have to redo both, it roughly goes like this:
echo aaa > a.tmp
# goredo does the following, but only if a.tmp != a
mv a.tmp a
So now "a" and "b" point to different inodes and "b" remains unchanged.
Then when we "redo b" it will establish the hardlink again. This is
what in my opinion should also happen if the output of "redo a" did not
change the contents of "a".
While the contents do not change throughout, by touching "a" the mtime
for "b" does change and that's what messes up the state in the redo
process / recfiles, unless I am misunderstanding something.
The file attributes after the first call to "redo b" (when it exits with
0) are:
a, inode = 123, mtime = 1
b, inode = 123, mtime = 2
Now with version 1.27.1 (or any after 1.23.0), when we change "a.do" so
that it is rerun, but its output does not change, and then run "redo a",
the files look like:
a, inode = 123, mtime = 3
b, inode = 123, mtime = 3 # not 2 anymore, error for redo
Whereas if we moved $3 to "a", it would look like:
a, inode = 321, mtime = 3
b, inode = 123, mtime = 2
And "b" could be redone once more, since it would not appear to have
been modified externally.
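
For illustration, the difference between the two cases can be reproduced
outside of redo with a small shell sketch. This is only a sketch under
assumptions: the temporary directory, the "stamp" reference file, the
fixed timestamps and the variable names are made up for the example, and
the plain "touch" and "mv" calls merely stand in for the behaviour
described above rather than being goredo's actual code. It also assumes
a shell whose "[" supports -ef and -nt (bash and dash do).

```shell
#!/bin/sh
# Sketch: compare "touch in place" with "rename over the target" when
# "a" and "b" are hardlinks to the same inode.
set -eu
dir=$(mktemp -d)
cd "$dir"

echo aaa > a
ln a b                          # "a" and "b" now share one inode
touch -t 202201010000 a         # give the shared inode an old mtime
touch -t 202206010000 stamp     # reference point, newer than that mtime

# Case 1: touch "a" in place, standing in for the "hash matches" case.
touch a
[ a -ef b ] && same_after_touch=yes || same_after_touch=no
[ b -nt stamp ] && b_changed_by_touch=yes || b_changed_by_touch=no

# Case 2: write a temporary file and rename it over "a".
touch -t 202201010000 b         # reset the shared mtime first
echo aaa > a.tmp
mv a.tmp a                      # "a" gets a new inode, "b" keeps the old one
[ a -ef b ] && same_after_mv=yes || same_after_mv=no
[ b -nt stamp ] && b_changed_by_mv=yes || b_changed_by_mv=no

echo "touch: same inode=$same_after_touch, b mtime changed=$b_changed_by_touch"
echo "mv:    same inode=$same_after_mv, b mtime changed=$b_changed_by_mv"

cd /
rm -rf "$dir"
```

In the touch case "b" picks up the new mtime through the shared inode,
while in the rename case "b" keeps both its inode and its old mtime.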
> As an alternative, you could look into using 'cp --reflink' on modern file systems.
Thanks for that suggestion, this actually reflects a bit better what I
would want to happen. Unfortunately it is not supported on the machines
I am using.
What I am not sure about is what triggers the copy mechanism and
whether that is well suited here. On the one hand, if touching the file
already triggers the copy, then goredo's update mechanism becomes
fairly expensive, since it now causes a full copy instead of just a
rename. On the other hand, if touching does not cause a copy, then the
issue outlined above will persist. Of course this is rather
hypothetical, since I cannot use it anyway.