From: Sergey Matveev <stargrave@stargrave•org>
To: goredo-devel@lists.cypherpunks.ru
Subject: Re: Dependency collection takes very long
Date: Sat, 7 Oct 2023 18:29:15 +0300
Message-ID: <ZSF5SycIun8Y5A6H@stargrave.org>
In-Reply-To: <bd1ec0b6419ef2f3ebd5c6cca78da433@spacefrogg.net>

Greetings!

The next release will contain a HUGE number of optimisations.
redo-sources will be orders of magnitude faster with many targets.
redo(-ifchange) will be many times faster.

I tested all of that with 10k synthetic targets, each depending on
another 1k common targets. The .redo directory contained more than 2.1GB
of dependency files, and a big part of the total time was spent just
parsing that much data. So I decided to use binary .dep files instead of
recfile .rec ones. They are just a trivial concatenation of 64-bit
integers, raw hashes, single-byte types and so on, and they are nearly
3 times smaller than the .rec ones.

My test case was (just whatever came into my head):

    all.do:
    ------------------------ >8 ------------------------
    for i in `seq 10000` ; do deps="$deps $i" ; done
    redo-ifchange $deps
    echo ok
    ------------------------ >8 ------------------------

    default.do:
    ------------------------ >8 ------------------------
    for i in `seq 1000` ; do deps="$deps $i.2nd" ; done
    redo-ifchange $deps
    dd if=/dev/random of=$3 bs=1K count=1 2>/dev/null
    ------------------------ >8 ------------------------

    default.2nd.do:
    ------------------------ >8 ------------------------
    echo 'print $RANDOM' | zsh
    ------------------------ >8 ------------------------

With all of that, with 8 parallel jobs (-j 8) and REDO_NO_SYNC=1
(apenwarr/redo just always has fsync disabled for its sqlite3 database):
goredo's initial run took 79 seconds, .redo holds 865MB.
apenwarr/redo's took 870 seconds, .redo holds 784MB.
goredo's "redo-ifchange all" took 18 seconds.
apenwarr/redo's took 68 seconds.

However, goredo consumes much more memory, because it loads all the
dependency information into memory. That heavily pressures Go's garbage
collector, so memory usage can be decreased, for example by 1/3, but at
the cost of a 2x-3x longer runtime (still faster than apenwarr's anyway).

Because of the dependency format change, the major version number will
be raised, and you will have to run redo-depfix to convert the existing
.redo/*.rec files to .redo/*.dep.

-- 
Sergey Matveev (http://www.stargrave.org/)
OpenPGP: 12AD 3268 9C66 0D42 6967  FD75 CB82 0563 2107 AD8A
