Re: redoing unnecessary targets when a do file is modified but the output remains unchanged

From: Sergey Matveev <stargrave_at_domain.hidden>
Date: Wed, 5 May 2021 10:35:10 +0300
Message-ID: <YJJKukYSoYIa5JNB_at_domain.hidden>
Greetings!

*** Andrey Dobrovolsky [2021-05-05 01:52]:
>I was fascinated by redo idea after opennet.ru announced Your goredo
>release, Your habr article helped a lot too, thanks for both!

Glad they were useful! redo is indeed life-changing for me.
Unfortunately it made me allergic to any kind of Makefiles :-)

>Now I use my fork of Leah Neukirchen's redo-c. The problem You are
>talking about bothered me too, and I've solved it in my  dev2 branch

>github.com/AndreyDobrovolskyOdessa/redo-c

Glad that your version works for you! Still, I am completely unsure how
it can be done right, given the issues raised earlier:
http://lists.cypherpunks.ru/archive/goredo-devel/2102/0015.html

I quickly looked at your changes and one thing seems very strange to me.
Possibly I am wrong, because I did not read the whole code to see the
full picture, but:

    In fact "redo" without targets is full equivalent of "redo-always"

(taken from one of your commit messages) seems to be just plain wrong.
"redo" literally says "re-do the specified targets", but "redo-always"
marks the *currently* executing target as an "always" target. They have
completely different purposes: one initiates the building of targets,
the other records an "always" dependency for the target that is already
running.

>But hashed dependencies allow to do only what is
>really needed to be done, and that's really great!

Well, apenwarr/redo's redo-stamp, goredo and redo-c all use hashed
dependencies. They just do literally what they were told: redo if those
targets/dependencies have changed. Redoing dependencies that possibly
are not dependencies anymore, because of the "dynamic" nature of
redo-ifchange, is very confusing to me (and, it seems, to the authors of
apenwarr/redo and redo-c).
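
To make "hashed dependencies" concrete, here is a minimal Python sketch
of the idea. It is not the actual code of goredo, redo-c or
apenwarr/redo; all function names are hypothetical, and the use of
BLAKE2b is only an assumption for the illustration. It shows just the
principle: a target is redone only when the content of a recorded
dependency differs, not merely because the file was rewritten.

```python
import hashlib
import os
import tempfile


def file_hash(path):
    """Return the hex BLAKE2b digest of a file's contents."""
    h = hashlib.blake2b()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def record_deps(paths):
    """Remember the content hash of every dependency used by a build."""
    return {p: file_hash(p) for p in paths}


def is_out_of_date(recorded):
    """True if any recorded dependency is missing or has new content."""
    for path, old_hash in recorded.items():
        if not os.path.exists(path) or file_hash(path) != old_hash:
            return True
    return False


def demo():
    """Rewrite a dependency with the same, then with different content."""
    with tempfile.TemporaryDirectory() as d:
        dep = os.path.join(d, "dep.txt")
        with open(dep, "w") as f:
            f.write("v1\n")
        recorded = record_deps([dep])

        # Same bytes written again: timestamps change, the hash does not.
        with open(dep, "w") as f:
            f.write("v1\n")
        after_rewrite = is_out_of_date(recorded)

        # Different bytes: the hash changes, so the target must be redone.
        with open(dep, "w") as f:
            f.write("v2\n")
        after_change = is_out_of_date(recorded)
        return after_rewrite, after_change
```

Note that the sketch only checks the dependencies that were recorded
during the last build; it says nothing about the harder question above
of whether those are still dependencies at all.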

>(yet?) in multi-threaded redoing, that's why current version is
>single-threaded.

redo-c already uses the jobserver protocol and, as I remember,
parallelizes jobs well. And each target is a separate shell/redo
invocation. I do not see where multithreading can help. Reading the
metainformation of all those files and directories (ctime, inodes,
whatever) is bound by syscalls and IO, so threads won't make it faster.
Hashing multiple dependencies in parallel will even harm, because of
non-sequential IO, unless the hash algorithm is the bottleneck. Actually
the very first thing I did in redo-c (when there was no goredo) was
replacing SHA256 with BLAKE2b. SHA256 is the slowest of the well-known
and widely used algorithms: SHA512 is considerably faster on 64-bit
machines. And hashing really can easily be the bottleneck on my
computer. goredo uses BLAKE3, which, while still cryptographically
secure, is 12 times faster than SHA256 in a single thread.
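
If you want to check the hash speed on your own machine, a rough Python
sketch like the one below is enough. It is only an illustration, not a
rigorous benchmark: the function names and the 16 MiB buffer size are
arbitrary choices of mine, and it covers only the algorithms available
in Python's standard hashlib (BLAKE3 is not among them, so it is left
out). Results depend heavily on the CPU and on how the library was
built; for example, processors with hardware SHA extensions can make
SHA256 look much better.

```python
import hashlib
import time


def throughput(name, data, rounds=5):
    """Return the best observed MiB/s for hashlib algorithm `name`."""
    best = float("inf")
    for _ in range(rounds):
        start = time.perf_counter()
        hashlib.new(name, data).digest()
        best = min(best, time.perf_counter() - start)
    # Guard against a zero reading from a very coarse timer.
    return len(data) / (1024 * 1024) / max(best, 1e-9)


def compare(size=16 * 1024 * 1024):
    """Measure single-threaded throughput over a zero-filled buffer."""
    data = bytes(size)
    names = ("sha256", "sha512", "blake2b")
    return {name: throughput(name, data) for name in names}


if __name__ == "__main__":
    for name, rate in compare().items():
        print(f"{name:8s} {rate:10.1f} MiB/s")
```

The buffer is hashed from memory, so this measures only the algorithm
itself and not the IO that a real build would also pay for.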

Thanks for sharing your experience and the fork, which could be useful
to others!

-- 
Sergey Matveev (http://www.stargrave.org/)
OpenPGP: CF60 E89A 5923 1E76 E263  6422 AE1A 8109 E498 57EF

Received on 2021-05-05 07:35:10 UTC
