*** goredo [2021-11-09 13:43]:
>I also was wondering what redo-stamp currently does, exactly.
>apenwarr/redo uses it to achieve the behaviour that the output of a
>target can be independent of it's hash. They use it like the following:

Initially goredo tried to fully mimic the behaviour of apenwarr/redo,
and redo-stamp had (or should have had) exactly the same behaviour. But
I soon became convinced that redo-stamp is a useless and completely
unnecessary complication. The main difference between apenwarr's view of
redo and mine is that I am confident it is ok to always
(cryptographically) checksum the target.

https://redo.readthedocs.io/en/latest/FAQImpl/#why-not-always-use-checksum-based-dependencies-instead-of-timestamps
http://www.goredo.cypherpunks.ru/FAQ.html

In my practice, a huge number of .do files ended with something like
"command -v redo-stamp > /dev/null || exit 0 ; redo-stamp <$3". I
realized (and I assume this applies to most redo users who use it for
building software) that redo-stamping is what you nearly always want.

apenwarr/redo's documentation states somewhere that always-checksumming
is mainly useful for making fewer false-positive OOD decisions. That is
true. But I am confident that hashing can be considered a pretty cheap
operation. Even if it sometimes slows something down, it greatly
simplifies .do files and the overall redo implementation.

apenwarr/redo basically has two ways of determining whether a target
has changed:

* either it has a different mtime+size+whatever metainformation
* or it used redo-stamp and has a different hash

goredo, like redo-c, has a single way:

* it has a different hash
* and, just as an optimization, that check can be skipped if ctime is
  the same (goredo's REDO_INODE_NO_TRUST=1 forcefully distrusts
  everything related to the inode's metainformation, so the hash is
  checked anyway: the most trustworthy OOD decision)
* and, as another optimization, the target is OOD if its size differs

1.
Can we trust mtime and the other metainformation to be guaranteed to
change if the underlying file has definitely changed? According to
https://apenwarr.ca/log/20181113 it is good enough in practice, but it
can break on some FUSEd filesystems. So if we want strong confidence in
the OOD determination, we should check the hash: it will definitely be
different if something has changed (let's forget about possible
collisions of a long enough strong cryptographic hash, their
probability is negligible)

2. Or we can use the more "reliable" ctime check (again, that can also
fail on strange/broken FUSE filesystems/drivers, for example).
apenwarr/redo does not use ctime, because it could create too many
false positives (like a change in the number of hard links). But ctime
can also be broken/untrusted, so cryptographic hashing again saves us
here

From what I have seen, and as I understand it, redo-stamp is used
mainly with redo-always targets. Because redo-always will change the
inode enough to satisfy the OOD decision anyway, people use redo-stamp
to avoid the false-positive OOD decision and the resource-wasting
rebuild.

redo-c/goredo's OOD determination based on inodes/hashes is very simple
from the implementation point of view. redo-always+redo-stamp hugely
complicates the overall logic and code. I look at redo-stamp as a kind
of hack to prevent redo-always targets from making everything they
touch OOD (which is what redo-always is intended to do by definition).
And I came to the conclusion that redo-always itself is just an ugly
idea. Not redo-always itself, but the huge complications aimed at
skipping the rebuild of everything all the time, because the OOD check
definitely should say "it is OOD, because it depends on an
always-target, which is always OOD by definition". redo-always just
should be used.
At least in the way many people (from what I have seen, and I assume)
use it, to create this kind of target:

    redo-always
    env | sort
    command -v redo-stamp > /dev/null || exit 0 ; redo-stamp <$3
    # command check is for compatibility with implementations without redo-stamp

I used to do that all the time. But I got tired of those stamps (needed
to prevent rebuilding of literally everything, because everything
depends on environment variables, for example) and of all the
complications introduced with redo-always. For me, it (redo-always) is
just a harmful idea. I tried to note all of that in
http://www.goredo.cypherpunks.ru/FAQ.html

Another issue with hashes/stamps is that you do not always want to
checksum the target's value itself. If someone decides that the hash of
a nonexistent target equals the hash of an empty string, and if the
redo implementation creates the resulting file even when nothing was
sent to stdout, then of course there is no way to make that target
always OOD (possibly that was the reason people invented redo-always?).
But with goredo (and redo-c, as I remember) there is no problem: if
nothing was sent to stdout, then no output file is created, and a
nonexistent file is always OOD. But if you wish to explicitly create an
empty file, then you can just always touch "$3". Constant hashing won't
harm you here in any way.

If you really, really wish to check only some metainformation (only
mtime, for example), then nothing prevents you from creating an
intermediate target that contains the output of (stat -f %m $1) and
depending not on the (probably) huge file, but on that intermediate
metainformation file, which holds only the data you wish to check.

>redo-ifchange $input_files
>cmd $input_files >$3
>for f in $input_files; do
>  redo-stamp <$f
>done

I do not understand where the catch is :-). redo-ifchange
"$input_files" clearly and explicitly states: rebuild that target (do
cmd $input_files), and everyone who depends on it, if any of
$input_files are changed.
If $input_files are not changed, then that target won't be OOD, won't
be rebuilt, and no one who depends on it will be rebuilt either (if it
is the only dependency, of course). In your example, redo-stamp
literally says: this target is OOD if the hash of all the $input_files
data has changed. redo-ifchange $input_files (with implicit hashing)
says exactly that too. Doesn't it?

-- 
Sergey Matveev (http://www.stargrave.org/)
OpenPGP: CF60 E89A 5923 1E76 E263  6422 AE1A 8109 E498 57EF
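
To make the single hash-based check described in this message concrete, here is a minimal shell sketch. It is not goredo's or redo-c's actual code. The function names (file_hash, file_size, record, is_ood) and the "size hash" record format are hypothetical, invented for illustration, and the ctime shortcut is left out because stat flags differ between BSD and GNU systems. It assumes one of sha256sum, shasum, or BSD sha256 is installed.

```shell
#!/bin/sh
# Hypothetical sketch, NOT goredo's real implementation: decide whether
# a file is out-of-date (OOD) relative to a stored "size hash" record.

# file_hash FILE: print the SHA-256 hex digest with whichever tool exists.
file_hash() {
    if command -v sha256sum > /dev/null ; then
        sha256sum < "$1" | cut -d ' ' -f 1
    elif command -v shasum > /dev/null ; then
        shasum -a 256 < "$1" | cut -d ' ' -f 1
    else
        sha256 -q "$1"   # BSD fallback (assumption)
    fi
}

# file_size FILE: print the size in bytes (tr strips BSD wc padding).
file_size() {
    wc -c < "$1" | tr -d ' '
}

# record FILE: print the "size hash" line to remember after a build.
record() {
    printf '%s %s\n' "$(file_size "$1")" "$(file_hash "$1")"
}

# is_ood FILE RECORD: exit status 0 if FILE is OOD relative to RECORD.
is_ood() {
    [ -e "$1" ] || return 0        # a nonexistent file is always OOD
    old_size=${2%% *}
    old_hash=${2#* }
    # Cheap shortcut: a different size means different content.
    [ "$(file_size "$1")" = "$old_size" ] || return 0
    # The deciding check: compare the content hashes.
    [ "$(file_hash "$1")" = "$old_hash" ] || return 0
    return 1
}
```

With this, a file whose mtime changed but whose content did not (for example after touch) is not reported as OOD, which is the effect people otherwise reach for redo-stamp to get.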