public inbox for goredo-devel@lists.cypherpunks.ru
Atom feed
From: spacefrogg-goredo@spacefrogg•net
To: goredo-devel@lists.cypherpunks.ru
Subject: OOD checking done too eagerly
Date: Fri, 22 Jan 2021 16:23:51 +0100	[thread overview]
Message-ID: <87pn1x806w.fsf@paris.i-did-not-set--mail-host-address--so-tickle-me> (raw)
In-Reply-To: 87sg6t80s8.fsf@paris.i-did-not-set--mail-host-address--so-tickle-me

I was just testing 1.0.0 (size checking! :) and
noticed something odd.

I believe OOD checking is done too eagerly and to deeply, as I'd like to
show in a small case.
The dependency chain is as follows:
c.do -> b.do -> a.do -> x
with a, b and c being built with redo and x being a source. Each is just
a copy of its dependency.
After building c, to populate the dependency database, x is being
modified. Now, running `redo -d c` again produces the following trace:

```
[30328] dbg  [53277686-f322-4fc4-36df-b66997ca2441] run: redo [c] cwd:/tmp/goredo dirprefix:
[30328] js   release from initial fill
[30328] dbg  collecting deps
[30328] dbg  building 1 targets: [c]
[30328] lock LOCK_NB: /tmp/goredo/.redo/c.lock
[30328] dbg  ifchange: /tmp/goredo/.redo/.redo.c.dep.2157eaa2 <- c.do
[30328] wait c
[30328] dbg  sh: c: /bin/sh [-e c.do c c .redo.c.21594a82.3] cwd:/tmp/goredo dirprefix:
[30328] js   acquire for sh: c: /bin/sh [-e c.do c c .redo.c.21594a82.3] cwd:/tmp/goredo dirprefix:
[30328] js   acquired for sh: c: /bin/sh [-e c.do c c .redo.c.21594a82.3] cwd:/tmp/goredo dirprefix:
[30328] dbg  c runs [30333]
[30334] dbg  . [53277686-f322-4fc4-36df-b66997ca2441] run: redo-ifchange [b] cwd:/tmp/goredo dirprefix:
[30334] js   . release from ifchange entered
[30334] dbg  . collecting deps
[30334] dbg  . building 1 targets: [b]
[30334] dbg  . ood: b checking
[30334] dbg  . ood: b -> b.do: checking
[30334] dbg  . ood: b -> b.do: same inode
[30334] dbg  . ood: b -> b.do: is source
[30334] dbg  . ood: b -> a: checking
[30334] dbg  . ood: b -> a: same inode
[30334] dbg  . ood: . a checking
[30334] dbg  . ood: . a -> a.do: checking
[30334] dbg  . ood: . a -> a.do: same inode
[30334] dbg  . ood: . a -> a.do: is source
[30334] dbg  . ood: . a -> x: checking
[30334] dbg  . ood: . a -> x: size differs
[30334] dbg  . ood: . a: true
[30334] dbg  . ood: b -> a: ood
[30334] dbg  . ood: b: true
[30334] lock . LOCK_NB: /tmp/goredo/.redo/b.lock
[30334] dbg  . ifchange: /tmp/goredo/.redo/.redo.b.dep.2188c262 <- b.do
[30334] wait . b
[30334] dbg  . sh: b: /bin/sh [-e b.do b b .redo.b.218bbe4a.3] cwd:/tmp/goredo dirprefix:
[30334] js   . acquire for sh: b: /bin/sh [-e b.do b b .redo.b.218bbe4a.3] cwd:/tmp/goredo dirprefix:
[30334] js   . acquired for sh: b: /bin/sh [-e b.do b b .redo.b.218bbe4a.3] cwd:/tmp/goredo dirprefix:
[30334] dbg  . b runs [30341]
[30342] dbg  . . [53277686-f322-4fc4-36df-b66997ca2441] run: redo-ifchange [a] cwd:/tmp/goredo dirprefix:
[30342] js   . . release from ifchange entered
[30342] dbg  . . collecting deps
[30342] dbg  . . building 1 targets: [a]
[30342] dbg  . . ood: a checking
[30342] dbg  . . ood: a -> a.do: checking
[30342] dbg  . . ood: a -> a.do: same inode
[30342] dbg  . . ood: a -> a.do: is source
[30342] dbg  . . ood: a -> x: checking
[30342] dbg  . . ood: a -> x: size differs
[30342] dbg  . . ood: a: true
...
```
So, process 30334 (redo-ifchange b) runs the complete dependency
chain down to the source x, determining that b must be OOD.

This is repeated in 30342 (redo-ifchange a).

The problem is, the dependency chain behind a could be arbitrarily
large, forcing redo to do this expensive calculation unnecessarily
often.

What the log should look like (in my opinion) is:
```
[30328] dbg  [53277686-f322-4fc4-36df-b66997ca2441] run: redo [c] cwd:/tmp/goredo dirprefix:
[30328] js   release from initial fill
[30328] dbg  collecting deps
[30328] dbg  building 1 targets: [c]
[30328] lock LOCK_NB: /tmp/goredo/.redo/c.lock
[30328] dbg  ifchange: /tmp/goredo/.redo/.redo.c.dep.2157eaa2 <- c.do
[30328] wait c
[30328] dbg  sh: c: /bin/sh [-e c.do c c .redo.c.21594a82.3] cwd:/tmp/goredo dirprefix:
[30328] js   acquire for sh: c: /bin/sh [-e c.do c c .redo.c.21594a82.3] cwd:/tmp/goredo dirprefix:
[30328] js   acquired for sh: c: /bin/sh [-e c.do c c .redo.c.21594a82.3] cwd:/tmp/goredo dirprefix:
[30328] dbg  c runs [30333]
[30334] dbg  . [53277686-f322-4fc4-36df-b66997ca2441] run: redo-ifchange [b] cwd:/tmp/goredo dirprefix:
[30334] js   . release from ifchange entered
# Immediately enter dependency c.do -> b.do without checking
[30334] lock . LOCK_NB: /tmp/goredo/.redo/b.lock
[30334] dbg  . ifchange: /tmp/goredo/.redo/.redo.b.dep.2188c262 <- b.do
[30334] wait . b
[30334] dbg  . sh: b: /bin/sh [-e b.do b b .redo.b.218bbe4a.3] cwd:/tmp/goredo dirprefix:
[30334] js   . acquire for sh: b: /bin/sh [-e b.do b b .redo.b.218bbe4a.3] cwd:/tmp/goredo dirprefix:
[30334] js   . acquired for sh: b: /bin/sh [-e b.do b b .redo.b.218bbe4a.3] cwd:/tmp/goredo dirprefix:
[30334] dbg  . b runs [30341]
# Immediately enter the dependency chain b.do -> a.do without checking
[30342] dbg  . . [53277686-f322-4fc4-36df-b66997ca2441] run: redo-ifchange [a] cwd:/tmp/goredo dirprefix:
[30342] js   . . release from ifchange entered
[30342] dbg  . . collecting deps
[30342] dbg  . . building 1 targets: [a]
[30342] dbg  . . ood: a checking
[30342] dbg  . . ood: a -> a.do: checking
[30342] dbg  . . ood: a -> a.do: same inode
[30342] dbg  . . ood: a -> a.do: is source
[30342] dbg  . . ood: a -> x: checking
[30342] dbg  . . ood: a -> x: size differs
[30342] dbg  . . ood: a: true
# Back in `redo-ifchange b`, check if running redo-* actually forced `a` OOD
[30334] dbg  . collecting deps
[30334] dbg  . building 1 targets: [b]
[30334] dbg  . ood: b checking
[30334] dbg  . ood: b -> b.do: checking
[30334] dbg  . ood: b -> b.do: same inode
[30334] dbg  . ood: b -> b.do: is source
[30334] dbg  . ood: b -> a: checking
# Now a's size will also differ.
[30334] dbg  . ood: b -> a: size differs
[30334] dbg  . ood: b -> a: ood
[30334] dbg  . ood: b: true
# This is no longer necessary. If a is still current, we're done.
# [30334] dbg  . ood: . a checking
# [30334] dbg  . ood: . a -> a.do: checking
# [30334] dbg  . ood: . a -> a.do: same inode
# [30334] dbg  . ood: . a -> a.do: is source
# [30334] dbg  . ood: . a -> x: checking
# [30334] dbg  . ood: . a -> x: size differs
# [30334] dbg  . ood: . a: true
# [30334] dbg  . ood: b -> a: ood
# [30334] dbg  . ood: b: true
```
As you can see, this cuts down the complete second dependency check to a
single level in the tree hierarchy.

The only reason for a target to become OOD is that its immediate
dependencies are OOD. So, this cuts down a quadratic number of checks to
a linear number.

–Michael

       reply	other threads:[~2021-01-22 15:50 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <87sg6t80s8.fsf@paris.i-did-not-set--mail-host-address--so-tickle-me>
2021-01-22 15:23 ` spacefrogg-goredo [this message]
2021-01-22 16:12   ` OOD checking done too eagerly Sergey Matveev
2021-01-22 16:24     ` Sergey Matveev