public inbox for goredo-devel@lists.cypherpunks.ru
Atom feed
From: "Jan Niklas Böhm" <mail@jnboehm•com>
To: goredo-devel@lists.cypherpunks.ru
Subject: Re: Suggestion to revert touching files when the hash matches (problem with hardlinks)
Date: Tue, 1 Nov 2022 14:14:37 +0100	[thread overview]
Message-ID: <179b06ff-a4f2-dfbd-3040-6e20c07084e3@jnboehm.com> (raw)
In-Reply-To: <e8df6a33-afb7-4f6a-a84b-cdebefc7c782@spacefrogg.net>

>>      a: inode=646, mtime=1667292427
>>      b: inode=646, mtime=1667292427
> 
> Your bug is right here. This is a misuse of redo and breaks its logic. A and B are expected to mean different things, so they MUST NOT point to the same inode. Period.
> 
>>      a: inode=655, mtime=1667292646
>>      b: inode=655, mtime=1667292646  # wasn't redone, but changed
> 
> This is, where you trigger the bug. Yes, reverting goredo resolves this exact TRIGGER of the bug. Your actual bug is still present, see above.
> 
> I reiterate: Separate paths MUST represent separate inodes or you break redo logic one way or another.

Thank you for elaborating on this.  I see what you mean now.  To be 
honest, I am not aware of anyone making this strict assumption.

> Your bug has nothing to do with the change in 1.23. I don't see why your case of misusing redo should be given any preference over a completely useful optimisation that is in line with the logic of redo.

There are two reasons for changing the behavior:

1. The optimization is not improving the entire execution by a lot. 
Instead of calling os.Rename, goredo calls os.Remove and os.Chtimes. 
Since both the temporary file and the target are in the same directory, 
they're also on the same drive, making os.Rename a cheap operation.
2. The behavior I suggested agrees with the implementation by apenwarr.

> One way out of your mess is that you create a c.do with an indirect dependency:
> redo-ifchange a
> ln a b # indirect dependency
> touch $3
> 
> Users have to depend on c and know that they should use b. Now, you are honest to your consumers, that b is some dirty optimisation that may/will be changed whenever a is changed.

My current setup is structured somewhat like this.  One of my issues is 
that the analogue to the target c produces many files, which I do not 
want to be visible unless they are specifically requested via redoing 
them (in which case they'll be hardlinked from another directory where 
they are saved).

> Final words: I sympathize with what you are trying to achieve. But you should rather fix the bug instead of removing one of its triggers.

Thanks, and thanks for the correspondence (despite the disagreements).

An alternative suggestion is to change the function `isModified`. 
Currently it only checks the ctime/mtime based on the value 
REDO_INODE_TRUST, but will not check the file contents in case the time 
differs (this is the behavior of `Inode.Equals`).  Adding a hash check 
before returning the computed modified value would decrease the number 
of false positives here, because this is the reason why the warning 
about an externally modified file is emitted in the first place.  I'm 
not sure whether the hash check should be done in Inode.Equals or 
outside of that function.

  reply	other threads:[~2022-11-01 13:16 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-10-31 21:37 Suggestion to revert touching files when the hash matches (problem with hardlinks) Jan Niklas Böhm
2022-11-01  6:42 ` goredo
2022-11-01  7:50   ` Jan Niklas Böhm
2022-11-01  8:21     ` goredo
2022-11-01  9:02       ` Jan Niklas Böhm
2022-11-01 11:49         ` Spacefrogg
2022-11-01 13:14           ` Jan Niklas Böhm [this message]
2022-11-02 13:57             ` Sergey Matveev
2022-11-02 22:42               ` Jan Niklas Böhm
2022-11-03  8:55                 ` Sergey Matveev