Greetings! I investigated why it is so slow and found various issues. The main one: .redo/*.rec files were read far too often, when it was completely unnecessary, and that gave a huge I/O load. It indeed happened during dependency collection: the whole .rec file was often read, although only its first line (Build: ...) was actually used (a rough sketch of that idea is in the P.S. below).

> By the looks of it, redo always hashes source files.

No, I did not notice any unnecessary hashing.

> redo-targets returns instantly, but redo-sources gets stuck as well,

redo-sources was written in an algorithmically awful way. I created several default.do targets producing ten thousand targets with thousands of partly shared dependencies. The .rec directory alone weighs more than a gigabyte. Just parsing it (it lives on tmpfs, so only memory is involved, no disk I/O) takes nearly 20 seconds on my machine. The current optimised version of redo-sources finishes in 22 seconds, compared to... I do not know what, because I never waited for the old one to finish.

redo-ifchange runs for 60 seconds: 20 seconds is parsing all .redo/*.rec files during "dependencies collection", and another 20 seconds is parsing all .redo/*.rec files again during the out-of-date (OOD) decision. I can cache (just two lines of code, also sketched in the P.S.) the already loaded/parsed *.rec files during "dependencies collection", so 1/3 of the time goes away. But that took a lot of RAM (several gigabytes, because of the gigabyte-sized .redo directory), since all dependency information was stored as an ordinary map with all those huge field names. I also optimised the code to reduce that memory usage.

I am also thinking about using binary files for storing the dependency information. I am currently working on that: it heavily reduces their size and should also speed up their parsing.

There are still several *.rec-reading places left that check whether the target was already built by a parallel process. That can be mitigated in practice, but currently there are no plans for that.

With so many targets and processes I also often catch a "read /tmp/foo/31.2nd: bad file descriptor" error and *completely* do not understand why that is happening.

I have already created several commits that heavily optimise many places, but the work is still in progress. I thought I would finish today, but no. Unfortunately I can return to working on goredo only after several days.

> As a comparison, apenwarr/redo needs a fraction of a second to check all
> dependencies. (It does no source file hashing, though.)

In general, apenwarr/redo should be faster anyway, because of its centralised SQLite3 database with indexing capabilities and pretty compact binary storage. For my synthetic workload with thousands of targets and dependencies, goredo instead has to read more than a dozen thousand files and parse the rather verbose recfile format. My current test binary format reduces the .redo directory size threefold, but that is still hundreds of megabytes of data to parse.

-- 
Sergey Matveev (http://www.stargrave.org/)
OpenPGP: 12AD 3268 9C66 0D42 6967 FD75 CB82 0563 2107 AD8A
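
P.S. A few rough sketches of the ideas mentioned above; they are simplified illustrations, not the actual goredo code. First, the "read only the first line" change: instead of parsing a whole .rec file during dependency collection, open it and stop right after the Build: line (firstBuildLine is a made-up name here):

    package deps

    import (
        "bufio"
        "os"
        "strings"
    )

    // firstBuildLine reads only the first "Build: ..." line of a .rec file
    // and stops, instead of reading and parsing the whole file.
    // (Illustration only; name and exact line handling are made up to
    // match the "Build: ..." first line mentioned above.)
    func firstBuildLine(path string) (string, error) {
        f, err := os.Open(path)
        if err != nil {
            return "", err
        }
        defer f.Close()
        sc := bufio.NewScanner(f)
        if !sc.Scan() {
            return "", sc.Err() // empty file or read error
        }
        return strings.TrimPrefix(sc.Text(), "Build: "), nil
    }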
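
The "just two lines of code" cache is essentially the following pattern; Dep and the parse function are hypothetical stand-ins for goredo's internals:

    package deps

    // Dep is a hypothetical stand-in for one parsed dependency record.
    type Dep struct {
        Type, Name, Hash string
    }

    // recCache remembers already parsed .rec files by path, so each file
    // is read and parsed at most once during dependency collection.
    // A real implementation would need a mutex around it for parallel builds.
    var recCache = make(map[string][]Dep)

    // cachedParse consults the cache before calling the real parser
    // (passed in as a function here, since this sketch has no real one).
    func cachedParse(path string, parse func(string) ([]Dep, error)) ([]Dep, error) {
        if deps, ok := recCache[path]; ok {
            return deps, nil
        }
        deps, err := parse(path)
        if err != nil {
            return nil, err
        }
        recCache[path] = deps
        return deps, nil
    }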
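
And the binary storage idea, shown here with encoding/gob purely as an example (the format I am actually testing differs): field names are written once per stream instead of being repeated for every record as in the textual recfile format, so both the size and the parsing time drop.

    package deps

    import (
        "bytes"
        "encoding/gob"
    )

    // Dep is repeated here so the sketch is self-contained.
    type Dep struct {
        Type, Name, Hash string
    }

    // encodeDeps stores the dependency list as a compact binary stream.
    func encodeDeps(deps []Dep) ([]byte, error) {
        var buf bytes.Buffer
        if err := gob.NewEncoder(&buf).Encode(deps); err != nil {
            return nil, err
        }
        return buf.Bytes(), nil
    }

    // decodeDeps is the corresponding parse step when reading state back.
    func decodeDeps(data []byte) ([]Dep, error) {
        var deps []Dep
        err := gob.NewDecoder(bytes.NewReader(data)).Decode(&deps)
        return deps, err
    }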