public inbox for goredo-devel@lists.cypherpunks.ru
Atom feed
* File descriptor leak
@ 2022-05-08  4:44 Andrew Chambers
  2022-05-08  8:22 ` goredo
  2022-05-08 11:42 ` Sergey Matveev
  0 siblings, 2 replies; 7+ messages in thread
From: Andrew Chambers @ 2022-05-08  4:44 UTC (permalink / raw)
  To: goredo-devel

[-- Attachment #1: Type: text/plain, Size: 441 bytes --]

I have an especially large redo build (for building musl-libc) and I get 
the error:

main.go:506: src/stdio/putw.o: open /home/ac/src/musl/default.do: too 
many open files

I am using goredo 1.24.0 built with go1.16.9. I am fairly sure this is a 
file descriptor leak in goredo, as I don't have any problems with other 
redo implementations.

I have attached the default.do file that must be placed at the root of a 
musl v1.2.3 file tree.

[-- Attachment #2: default.do --]
[-- Type: text/plain, Size: 1748 bytes --]

set -eux
set -o pipefail
exec >&2

ARCH=${ARCH:-x86_64}
AR=${AR:-ar}
CC=${CC:-cc}
CFLAGS=${CFLAGS:--O2 -D _XOPEN_SOURCE=700}

bits_hdrs () {
	printf "%s\n" arch/$ARCH/bits/*.h
	printf "%s\n" arch/generic/bits/*.h
}

internal_hdrs () {
	echo "src/internal/version.h"
	printf "%s\n" src/internal/*.h
}

all_hdrs () {
	internal_hdrs
	echo "include/bits/alltypes.h"
	echo "include/bits/syscall.h"
	bits_hdrs
	printf "%s\n" include/*.h include/*/*.h
}

all_src () {
	printf "%s\n" src/*/$ARCH/*.[csS]
	printf "%s\n" src/*/*.c src/malloc/mallocng/*.c
}

all_obj () {
	all_src | sed -e 's/\.[csS]$/.o/g' 
}

case "$1" in

	all)
		mkdir -p include/bits
		redo-ifchange libc.a
	;;
	
	libc.a)
		obj="$(all_obj)"
		redo-ifchange $obj
		$AR rc "$3" $obj
	;;

	src/internal/version.h)
		echo "#define VERSION \"unknown\"" > "$3"
	;;

	include/bits/alltypes.h)
		redo-ifchange tools/mkalltypes.sed arch/$ARCH/bits/alltypes.h.in include/alltypes.h.in
		sed -f tools/mkalltypes.sed arch/$ARCH/bits/alltypes.h.in include/alltypes.h.in > "$3"
	;;

	include/bits/syscall.h)
		redo-ifchange arch/$ARCH/bits/syscall.h.in
		cp arch/$ARCH/bits/syscall.h.in "$3"
		sed -n -e s/__NR_/SYS_/p < arch/$ARCH/bits/syscall.h.in >> "$3"
	;;
	
	*.o)
		cfile="${1%.o}.c"
		sfile="${1%.o}.s"
		Sfile="${1%.o}.S"

		if test -e "$cfile"
		then
			src="$cfile"
		elif test -e "$sfile"
		then
			src="$sfile"
		elif test -e "$Sfile"
		then
			src="$Sfile"
		else
			echo "don't know how to build $1" 2>&1
			exit 1
		fi

		redo-ifchange $(all_hdrs) "$src"
		includes="
			-nostdinc
			-I./arch/$ARCH
			-I./arch/generic
			-I./src/internal
			-I./src/include
			-I./include
		"
		$CC $CFLAGS $includes -c -o "$3" "$src"
	;;

	*)
		echo "don't know how to build $1" 2>&1
		exit 1
	;;
esac

^ permalink raw reply	[flat|nested] 7+ messages in thread

* File descriptor leak
  2022-05-08  4:44 File descriptor leak Andrew Chambers
@ 2022-05-08  8:22 ` goredo
  2022-05-08  9:12   ` Andrew Chambers
  2022-05-08 11:42 ` Sergey Matveev
  1 sibling, 1 reply; 7+ messages in thread
From: goredo @ 2022-05-08  8:22 UTC (permalink / raw)
  To: goredo-devel

You likely run into the same trouble I did before. You can artificially restrict the number of parallel redo-ifchange calls with "xargs -n 50 ALL_ARGS | redo-ifchange" or try to increase the ulimit values for open files.

If you look into the archives, I, too, mentioned this problem. The way goredo is designed, makes it open all prospective fds upfront although with no intention to run as many parallel processes. So the task is to limit the number of arguments a single redo-ifchange invocation sees. The author's stance at the time was to keep the code simple and let the work be done in the redo targets. But I am still in favour of a more conservative approach to opening files of targets that are not immediately run.

–Michael

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: File descriptor leak
  2022-05-08  8:22 ` goredo
@ 2022-05-08  9:12   ` Andrew Chambers
  2022-05-08 11:21     ` goredo
  0 siblings, 1 reply; 7+ messages in thread
From: Andrew Chambers @ 2022-05-08  9:12 UTC (permalink / raw)
  To: goredo-devel


> You likely run into the same trouble I did before. You can artificially restrict the number of parallel redo-ifchange calls with "xargs -n 50 ALL_ARGS | redo-ifchange" or try to increase the ulimit values for open files.
>
> If you look into the archives, I, too, mentioned this problem. The way goredo is designed, makes it open all prospective fds upfront although with no intention to run as many parallel processes. So the task is to limit the number of arguments a single redo-ifchange invocation sees. The author's stance at the time was to keep the code simple and let the work be done in the redo targets. But I am still in favour of a more conservative approach to opening files of targets that are not immediately run.
>
> –Michael
>

If its the case the problem can be solved by limiting the arg size, I 
suppose the main function of redo-ifchange could just perform this 
batching of the arg list without any major change to the code.

I guess we can see what other people say.


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: File descriptor leak
  2022-05-08  9:12   ` Andrew Chambers
@ 2022-05-08 11:21     ` goredo
  0 siblings, 0 replies; 7+ messages in thread
From: goredo @ 2022-05-08 11:21 UTC (permalink / raw)
  To: goredo-devel

As I said, goredo is lazy in this respect and just opens all file descriptors, it knows, it will need upfront. It still does the batching of processes, when it comes to the recursive execution of the targets.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: File descriptor leak
  2022-05-08  4:44 File descriptor leak Andrew Chambers
  2022-05-08  8:22 ` goredo
@ 2022-05-08 11:42 ` Sergey Matveev
  2022-05-09  0:31   ` Andrew Chambers
  1 sibling, 1 reply; 7+ messages in thread
From: Sergey Matveev @ 2022-05-08 11:42 UTC (permalink / raw)
  To: goredo-devel

[-- Attachment #1: Type: text/plain, Size: 10324 bytes --]

Greetings!

*** Andrew Chambers [2022-05-08 16:44]:
>I am fairly sure this is a
>file descriptor leak in goredo, as I don't have any problems with other redo
>implementations.

As spacefrogg already answered, goredo opens lock files for all targets
came from the command line. I tried you default.do with musl and saw:

      PID COMM                FD T V FLAGS    REF  OFFSET PRO NAME        
    11263 goredo            text v r r-------   -       - -   /home/stargrave/work/goredo/goredo
    11263 goredo            ctty v c rw------   -       - -   /dev/pts/22       
    11263 goredo             cwd v d r-------   -       - -   /tmp/musl-1.2.3   
    11263 goredo            root v d r-------   -       - -   /                 
    11263 goredo               0 v c r-------   2       0 -   /dev/null         
    11263 goredo               1 p - rw------   4       0 -   -                 
    11263 goredo               2 p - rw------   4       0 -   -                 
    11263 goredo               3 v r -w------   1       0 -   /tmp/musl-1.2.3/src/select/.redo/poll.o.lock
    11263 goredo               4 v r rw------   3       0 -   -                 
    11263 goredo               5 p - rw------  28       0 -   -                 
    11263 goredo               6 p - rw------  31       0 -   -                 
    11263 goredo               7 p - rw------  28       0 -   -                 
    11263 goredo               8 v r -wa-----   3       0 -   /tmp/musl-1.2.3/.redo/.redo.libc.a.rec.8d2431ca
    11263 goredo               9 v r rw------  21   53097 -   -                 
    11263 goredo              10 v r rw------  22       0 -   -                 
    11263 goredo              11 v r -w------   1       0 -   /tmp/musl-1.2.3/src/linux/.redo/timerfd.o.lock
    11263 goredo              12 v r -w------   1       0 -   /tmp/musl-1.2.3/src/linux/.redo/unshare.o.lock
    11263 goredo              13 v r rw------   1       0 -   /tmp/musl-1.2.3/src/linux/.redo.prctl.o.9b587891
    11263 goredo              14 v r -w------   1       0 -   /tmp/musl-1.2.3/src/linux/.redo/utimes.o.lock
    11263 goredo              15 v r -w------   1       0 -   /tmp/musl-1.2.3/src/linux/.redo/vhangup.o.lock
    11263 goredo              16 v r -wa-----   3       0 -   /tmp/musl-1.2.3/src/linux/.redo/.redo.prctl.o.rec.9b55ee9a
    11263 goredo              17 k - rw------   2       0 -   -                 
    11263 goredo              18 p - rw---n--   2       0 -   -                 
    11263 goredo              19 p - rw---n--   1       0 -   -                 
    11263 goredo              20 p - rw---n--   3       0 -   -                 
    11263 goredo              21 v r rw------   1       0 -   /tmp/musl-1.2.3/src/linux/.redo.name_to_handle_at.o.9b4468e0
    11263 goredo              22 v r -wa-----   3       0 -   /tmp/musl-1.2.3/src/linux/.redo/.redo.name_to_handle_at.o.rec.9b41bf26
    11263 goredo              23 v r rw------   1       0 -   /tmp/musl-1.2.3/src/linux/.redo.memfd_create.o.9b341573
    11263 goredo              24 v r rw------   1       0 -   /tmp/musl-1.2.3/src/linux/.redo.pivot_root.o.9b50a895
    11263 goredo              25 v r -wa-----   3   59585 -   /tmp/musl-1.2.3/src/linux/.redo/.redo.memfd_create.o.rec.9b315eb7
    11263 goredo              26 p - rw---n--   3       0 -   -                 
    11263 goredo              27 v r -wa-----   3       0 -   /tmp/musl-1.2.3/src/linux/.redo/.redo.pivot_root.o.rec.9b4e1c31
    11263 goredo              28 v r rw------   1       0 -   /tmp/musl-1.2.3/src/linux/.redo.mount.o.9b404696
    11263 goredo              29 v r -wa-----   3   56373 -   /tmp/musl-1.2.3/src/linux/.redo/.redo.mount.o.rec.9b3dad33
    11263 goredo              30 p - rw---n--   3       0 -   -                 
    11263 goredo              31 p - rw---n--   3       0 -   -                 
    11263 goredo              32 v r rw------   1       0 -   /tmp/musl-1.2.3/src/linux/.redo.personality.o.9b4cb7c2
    11263 goredo              33 v r -wa-----   3       0 -   /tmp/musl-1.2.3/src/linux/.redo/.redo.personality.o.rec.9b4a0ca2
    11263 goredo              34 p - rw---n--   3       0 -   -                 
    11263 goredo              35 p - rw---n--   3       0 -   -                 
    11263 goredo              36 v r -w------   1       0 -   /tmp/musl-1.2.3/src/math/.redo/signgam.o.lock
    11263 goredo              37 v r rw------   2       0 -   /tmp/musl-1.2.3/src/linux/.redo.ppoll.o.9b5491f0
    11263 goredo              38 v r -wa-----   2       0 -   /tmp/musl-1.2.3/src/linux/.redo/.redo.ppoll.o.rec.9b520a93
    11263 goredo              39 v r rw------   1       0 -   /tmp/musl-1.2.3/src/linux/.redo.gettid.o.9b0f6737
    11263 goredo              40 v r -w------   1       0 -   /tmp/musl-1.2.3/src/math/.redo/scalblnl.o.lock
    11263 goredo              41 v r -w------   1       0 -   /tmp/musl-1.2.3/src/math/.redo/significand.o.lock
    11263 goredo              42 v r -w------   1       0 -   /tmp/musl-1.2.3/src/math/.redo/significandf.o.lock
    11263 goredo              43 v r -w------   1       0 -   /tmp/musl-1.2.3/src/select/.redo/pselect.o.lock
    11263 goredo              44 p - rw---n--   3       0 -   -                 
    11263 goredo              45 p - rw------   7       0 -   -                 
    11263 goredo              46 v r -w------   1       0 -   /tmp/musl-1.2.3/src/locale/.redo/catclose.o.lock
    11263 goredo              47 v c r-------   3       0 -   /dev/null         
    11263 goredo              48 p - rw------   1       0 -   -                 
    11263 goredo              49 v r -w------   1       0 -   /tmp/musl-1.2.3/src/math/.redo/sinf.o.lock
    11263 goredo              50 v r -wa-----   3   59573 -   /tmp/musl-1.2.3/src/linux/.redo/.redo.gettid.o.rec.9b0cdcb5
    11263 goredo              51 v r -w------   1       0 -   /tmp/musl-1.2.3/src/math/.redo/sin.o.lock
    11263 goredo              52 v r -w------   1       0 -   /tmp/musl-1.2.3/src/math/.redo/cbrt.o.lock
    11263 goredo              53 v r -w------   1       0 -   /tmp/musl-1.2.3/src/locale/.redo/catgets.o.lock
    11263 goredo              54 v r -w------   1       0 -   /tmp/musl-1.2.3/src/math/.redo/sincos.o.lock
    11263 goredo              55 v r -w------   1       0 -   /tmp/musl-1.2.3/src/locale/.redo/c_locale.o.lock
    11263 goredo              56 v r -w------   1       0 -   /tmp/musl-1.2.3/src/linux/.redo/sethostname.o.lock
    11263 goredo              57 v r -w------   1       0 -   /tmp/musl-1.2.3/src/misc/.redo/setdomainname.o.lock
    11263 goredo              58 p - rw---n--   3       0 -   -                 
    11263 goredo              61 v r -w------   1       0 -   /tmp/musl-1.2.3/src/math/.redo/asinhl.o.lock
    11263 goredo              62 v r -w------   1       0 -   /tmp/musl-1.2.3/src/linux/.redo/wait4.o.lock
    11263 goredo              63 v r -w------   1       0 -   /tmp/musl-1.2.3/src/stdio/.redo/fseek.o.lock
    11263 goredo              64 v r -w------   1       0 -   /tmp/musl-1.2.3/src/mman/.redo/mmap.o.lock
    11263 goredo              65 v r -w------   1       0 -   /tmp/musl-1.2.3/src/math/.redo/__invtrigl.o.lock
    11263 goredo              66 v r -w------   1       0 -   /tmp/musl-1.2.3/src/math/.redo/sinh.o.lock
    11263 goredo              67 v r -w------   1       0 -   /tmp/musl-1.2.3/src/math/.redo/cbrtf.o.lock
    11263 goredo              68 v r -w------   1       0 -   /tmp/musl-1.2.3/src/math/.redo/atan2f.o.lock
    11263 goredo              69 v r -w------   1       0 -   /tmp/musl-1.2.3/src/math/.redo/cbrtl.o.lock
    [more than thousand of opened lock files]

and it was run with more than a thousand of targets indeed.

spacefrogg advised to use xargs to limit number of arguments passed.
That will work and I like that. Making artificial "xargs"-like limiter
inside goredo leads to the question: how many arguments should be
processed at once? Of course that could be configurable, but that is an
additional burden for the end-user, completely the same as adding xargs.

For example github.com/leahneukirchen/redo-c does not have the problems
with many opened lockfiles, because it takes jobserver's token before it
opens those lockfiles. I explicitly decided to take jobserver's token
after we really decided that target must be done and run. Goredo opens
lockfiles, checks various metainformation, makes some preparations and
temporary files and only after that it acquires jobserver's token to run
the do-file itself. I explicitly wanted to acquire token as late as
possible.

But that leads to possibility of many simultaneously opened (maximally
prepared target before acquiring the jobserver's token and running the
command itself), that, as I can see, brings more complications and
distractions to the end user.

So let's acquire jobserver's token (that already acts as a limiter for
number of jobs simultaneously running) before we open the lockfile. That
will solve the problem of many opened lockfiles. If you pass unlimited
number of jobs (-j 0), then of course it will open as many files as it
wish.

There is a drawback: we can acquire jobserver's token, open the
lockfile, process metadata and see that actually that target was already
done, so we return the jobserver's token back -- it was useless and no
do-file was actually run. But I did not notice any performance
differences in practice, because all of that seems to be anyway pretty
fast enough and not deserving the attention.

So with 1.25.0 that kind of problems should pass away. I do not like
that, but seems that is pretty acceptable compromise, with only a minor
changes in the code.

I also overviewed my code again and found several places where file
descriptors were not closed. But all of them were only related to failed
situations or supplementary helper commands.

By the way, just a remark, your default.do contains "set -o pipefail",
that does not exist in POSIX shell. For example FreeBSD's /bin/sh tells:
    set: Illegal option -o pipefail

-- 
Sergey Matveev (http://www.stargrave.org/)
OpenPGP: CF60 E89A 5923 1E76 E263  6422 AE1A 8109 E498 57EF

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: File descriptor leak
  2022-05-08 11:42 ` Sergey Matveev
@ 2022-05-09  0:31   ` Andrew Chambers
  2022-05-09  7:10     ` Sergey Matveev
  0 siblings, 1 reply; 7+ messages in thread
From: Andrew Chambers @ 2022-05-09  0:31 UTC (permalink / raw)
  To: goredo-devel


> inside goredo leads to the question: how many arguments should be
> processed at once? Of course that could be configurable, but that is an
> additional burden for the end-user, completely the same as adding xargs.
One idea would perhaps have been to inspect ulimit, but I agree that is 
still just a guess.
>
>
> So with 1.25.0 that kind of problems should pass away. I do not like
> that, but seems that is pretty acceptable compromise, with only a minor
> changes in the code.
>
>
Thanks, I appreciate your reasoned approach to resolving the issue.
> By the way, just a remark, your default.do contains "set -o pipefail",
> that does not exist in POSIX shell. For example FreeBSD's /bin/sh tells:
>      set: Illegal option -o pipefail
>
Note this is accepted for upcoming POSIX, so in the end Freebsd will 
need to change :)

Kind regards,
Andrew Chambers

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: File descriptor leak
  2022-05-09  0:31   ` Andrew Chambers
@ 2022-05-09  7:10     ` Sergey Matveev
  0 siblings, 0 replies; 7+ messages in thread
From: Sergey Matveev @ 2022-05-09  7:10 UTC (permalink / raw)
  To: goredo-devel

[-- Attachment #1: Type: text/plain, Size: 623 bytes --]

*** Andrew Chambers [2022-05-09 12:31]:
>One idea would perhaps have been to inspect ulimit

That will definitely work too. But that depends on each exact system's
configuration, where upper ulimit's values can be strictly lowered by
the administrator and you won't be able to increase them.

>Note this is accepted for upcoming POSIX, so in the end Freebsd will need to
>change :)

Ah, did not know that. That is good, because pipefail seems to be the
very first thing one want to be standardized.

-- 
Sergey Matveev (http://www.stargrave.org/)
OpenPGP: CF60 E89A 5923 1E76 E263  6422 AE1A 8109 E498 57EF

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2022-05-09  7:10 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-05-08  4:44 File descriptor leak Andrew Chambers
2022-05-08  8:22 ` goredo
2022-05-08  9:12   ` Andrew Chambers
2022-05-08 11:21     ` goredo
2022-05-08 11:42 ` Sergey Matveev
2022-05-09  0:31   ` Andrew Chambers
2022-05-09  7:10     ` Sergey Matveev