public inbox for nncp-devel@lists.cypherpunks.ru
From: Sergey Matveev <stargrave@stargrave•org>
To: nncp-devel@lists.cypherpunks.ru
Subject: Re: Issues with very large packets
Date: Fri, 19 Feb 2021 15:36:16 +0300	[thread overview]
Message-ID: <YC+wy3ePulQ/wXu2@stargrave.org> (raw)
In-Reply-To: <87im6phxdz.fsf@complete.org>

[-- Attachment #1: Type: text/plain, Size: 3401 bytes --]

Greetings!

*** John Goerzen [2021-02-18 15:35]:
>First, nncp-stat was extremely slow on a system that had packets like that
>queued for transmission.  I'm wondering if it is trying to read all the
>packets, even when called with no parameters?
>It makes me wonder if nncp-daemon was doing some sort of expensive scan at
>the beginning of the call, and either it or the OS cached the results?

Both of them, like all other commands, use the (very simple) src/jobs.go
code to get the list of available encrypted packets in the spool. It
does I/O, but I would not call it expensive:

* list files in the directory (node's spool)
* open each file and read its XDR-encoded header
* this header (currently) takes only 172 bytes of data
* seek each file back to the beginning
* return all that job metainformation together with the opened file descriptors

So the only I/O is reading the directory and 172 bytes from each file.
If you have many files, then yes, it will take some time. Note that ZFS
will still read a whole block/record (up to 128 KiB by default) to
serve each of those header reads. And yes, if you repeat the operation,
the ZFS ARC cache should already contain those blocks, which is why it
is much faster.
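
For illustration, a minimal sketch of that scan loop (not the actual
src/jobs.go code; names such as Job, hdrLen and scanSpool are made up
for this example):

// Illustrative sketch of the spool scan described above.
package spool

import (
	"io"
	"os"
	"path/filepath"
)

const hdrLen = 172 // size of the XDR-encoded packet header

// Job carries the header metainformation plus an already opened descriptor.
type Job struct {
	Fd     *os.File
	Header [hdrLen]byte
	Size   int64
}

// scanSpool lists the node's spool directory, reads only the fixed-size
// header from the beginning of each packet, seeks back to the start and
// returns the collected jobs. The packet's size never changes the amount
// of I/O done here.
func scanSpool(dir string) ([]Job, error) {
	entries, err := os.ReadDir(dir)
	if err != nil {
		return nil, err
	}
	var jobs []Job
	for _, entry := range entries {
		if entry.IsDir() {
			continue
		}
		fd, err := os.Open(filepath.Join(dir, entry.Name()))
		if err != nil {
			continue
		}
		var job Job
		// Only the first 172 bytes are read, regardless of file size.
		if _, err := io.ReadFull(fd, job.Header[:]); err != nil {
			fd.Close()
			continue
		}
		// Rewind so the caller can stream the whole packet later.
		if _, err := fd.Seek(0, io.SeekStart); err != nil {
			fd.Close()
			continue
		}
		fi, err := fd.Stat()
		if err != nil {
			fd.Close()
			continue
		}
		job.Fd = fd
		job.Size = fi.Size()
		jobs = append(jobs, job)
	}
	return jobs, nil
}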

First of all: I have no idea how the file's size could affect the
algorithm above at all. Possibly a read-ahead configuration may fetch
more than a single ZFS block, but that is a matter of just a few blocks
in the worst case. So I see no difference between getting the
metainformation of a 1 TiB file and a 1 GiB one. I keep mentioning ZFS
only because reading that little piece of information is more expensive
on it compared to other filesystems (and I think that is OK, because in
most real-world use cases you want to read the whole file afterwards).

If we keep that metainformation nearby, it should help a lot:

* Keeping a separate database/cache file is of course not an option,
  because of the complexity and the consistency questions it raises.
* In nearly all NNCP code, the niceness level is the only piece of that
  header used everywhere. A long time ago I actually thought about
  keeping it in the filename itself. But I did not like that clumsy
  optimization then and still do not like it, because niceness is a kind
  of private, valuable metainformation itself (if someone sends a packet
  with an uncommon niceness of 145, the mere fact that a packet with it
  appeared somewhere may be valuable). That is not nice :-)
* We can keep a copy of that metainformation (the 172-byte header) in a
  ".meta"/whatever file nearby. If it does not exist, then read the
  beginning of the packet file as before. It can be recreated atomically
  at any time. I like that solution.
* That kind of information can also be kept in the filesystem's extended
  attributes. Honestly, I have never worked with xattrs at all; today was
  my first time invoking the setextattr/getextattr commands. But it seems
  they should give even faster access to that information than a separate
  file would on any filesystem. If xattrs are missing (disabled?), then
  fall back to ordinary file reading.
* But as far as I can see, OpenBSD does not support xattrs at all, so a
  fallback to a separate ".meta" file would be valuable anyway.

So it seems I am going to implement keeping that header in xattrs, with
a fallback to separate-file storage.
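
Roughly the shape it could take, as an illustrative sketch only (the
attribute name, the ".meta" suffix and the Linux-flavoured
golang.org/x/sys/unix calls are assumptions here, not the actual
implementation):

// Illustrative sketch: prefer an extended attribute, fall back to a
// ".meta" file nearby, and finally to reading the packet's beginning.
package spoolmeta

import (
	"io"
	"os"

	"golang.org/x/sys/unix"
)

const (
	hdrLen   = 172             // size of the XDR-encoded packet header
	xattrKey = "user.nncp.hdr" // hypothetical attribute name
)

// loadHeader tries the extended attribute first, then the ".meta" file,
// and finally falls back to reading the beginning of the packet itself.
func loadHeader(path string) ([]byte, error) {
	buf := make([]byte, hdrLen)
	if n, err := unix.Getxattr(path, xattrKey, buf); err == nil && n == hdrLen {
		return buf, nil
	}
	if meta, err := os.ReadFile(path + ".meta"); err == nil && len(meta) == hdrLen {
		return meta, nil
	}
	fd, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	defer fd.Close()
	if _, err := io.ReadFull(fd, buf); err != nil {
		return nil, err
	}
	return buf, nil
}

// saveHeader stores the header in an xattr and, if that fails (missing
// or disabled xattr support), atomically (re)creates the ".meta" file
// by writing a temporary file and renaming it into place.
func saveHeader(path string, hdr []byte) error {
	if err := unix.Setxattr(path, xattrKey, hdr, 0); err == nil {
		return nil
	}
	tmp := path + ".meta.tmp"
	if err := os.WriteFile(tmp, hdr, 0o666); err != nil {
		return err
	}
	return os.Rename(tmp, path+".meta")
}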

-- 
Sergey Matveev (http://www.stargrave.org/)
OpenPGP: CF60 E89A 5923 1E76 E263  6422 AE1A 8109 E498 57EF

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]


Thread overview: 8+ messages
2021-02-18 21:35 Issues with very large packets John Goerzen
2021-02-19 12:36 ` Sergey Matveev [this message]
2021-02-19 19:18   ` John Goerzen
2021-02-19 19:46     ` Sergey Matveev
2021-02-19 20:34       ` John Goerzen
2021-02-20 19:56         ` Sergey Matveev
2021-02-21  4:31           ` John Goerzen
2021-02-21  8:27             ` Sergey Matveev