public inbox for nncp-devel@lists.cypherpunks.ru
* nncp-bundle, nncp-caller, and .seen not working?
@ 2020-12-30  6:37 John Goerzen
  2020-12-30 14:08 ` Sergey Matveev
  0 siblings, 1 reply; 13+ messages in thread
From: John Goerzen @ 2020-12-30  6:37 UTC (permalink / raw)
  To: nncp-devel

Hi,

Today I had a slow wifi link between two computers passing a few GB 
of data, so I experimented with nncp-bundle.

I didn't delete from the source machine.  I transferred to USB, 
then used nncp-bundle -rx on the destination.

It appeared to work.  I used nncp-toss with -seen and it did its 
thing.

Strangely, however, nncp-caller would re-download the same 
packets.

I noticed that nncp-bundle put incoming data in a directory named 
for the local self's id, while nncp-caller put incoming data in a 
directory named for the neighbor's id.  Perhaps that's the issue? 
I'd wind up with two .seen files, one in each place, after all 
this was over.

I haven't tested but wonder if nncp-xfer would have the same 
issue.

Thanks!

- John

* Re: nncp-bundle, nncp-caller, and .seen not working?
  2020-12-30  6:37 nncp-bundle, nncp-caller, and .seen not working? John Goerzen
@ 2020-12-30 14:08 ` Sergey Matveev
  2020-12-30 15:37   ` John Goerzen
  0 siblings, 1 reply; 13+ messages in thread
From: Sergey Matveev @ 2020-12-30 14:08 UTC (permalink / raw)
  To: nncp-devel

Greetings!

*** John Goerzen [2020-12-30 00:37]:
>I noticed that nncp-bundle put incoming data in a directory named for the
>local self's id, while nncp-caller put incoming data in a directory named for
>the neighbor's id.  Perhaps that's the issue?

Of course that was a bug. I do not know (or remember?) why the data was
placed in self's rx/ directory instead of the sender's one. nncp-toss
worked anyway, because sender/recipient information is taken from the
packet's header. I have fixed that in:
http://www.git.cypherpunks.ru/?p=nncp.git;a=commit;h=87245f236b40415c6e30fbaee18cdaf46b7f5c57
You can try it just by overwriting the nncp-bundle/main.go file and
rebuilding.

By the way, I plan to add the ability to create uncompressed nncp-exec
packets. Currently nncp-exec always compresses the data inside, because
originally it was used only for email transmission. I will add another
plain packet type and an additional command line option for nncp-exec.
Everything else will stay the same, and backwards compatibility with
existing nncp-exec-generated packets won't break. After that I will
create a new release with the current nncp-bundle fix and documentation
updates.

-- 
Sergey Matveev (http://www.stargrave.org/)
OpenPGP: CF60 E89A 5923 1E76 E263  6422 AE1A 8109 E498 57EF

* Re: nncp-bundle, nncp-caller, and .seen not working?
  2020-12-30 14:08 ` Sergey Matveev
@ 2020-12-30 15:37   ` John Goerzen
  2020-12-30 15:44     ` John Goerzen
  2020-12-30 19:10     ` Sergey Matveev
  0 siblings, 2 replies; 13+ messages in thread
From: John Goerzen @ 2020-12-30 15:37 UTC (permalink / raw)
  To: Sergey Matveev; +Cc: nncp-devel


On Wed, Dec 30 2020, Sergey Matveev wrote:

> packet's header. I have fixed that in:

Thanks!

> By the way, I plan to add ability to create uncompressed 
> nncp-exec
> packets. Currently nncp-exec always compresses the data inside, 
> because
> originally it was used only for email transmission. I will add 
> another
> plain packet type and an additional command line option for 
> nncp-exec.
> Everything will stay the same and backwards compatibility with 
> existing
> nncp-exec-generated packets won't break. After that I will 
> create new
> release with current nncp-bundle fix and documentation updates.

Interesting!  I just happened across an apparent memory leak in 
nncp-exec in which it was using over 1GB of RAM.  To be sure, I 
may have been piping several GB of data to it.  But I would expect 
it to stream out the data as it is read from stdin, hopefully?  I 
wonder if this is related?

Thanks again for NNCP.  I am really appreciating it!

- John

* Re: nncp-bundle, nncp-caller, and .seen not working?
  2020-12-30 15:37   ` John Goerzen
@ 2020-12-30 15:44     ` John Goerzen
  2020-12-30 19:10     ` Sergey Matveev
  1 sibling, 0 replies; 13+ messages in thread
From: John Goerzen @ 2020-12-30 15:44 UTC (permalink / raw)
  To: Sergey Matveev; +Cc: nncp-devel

On Wed, Dec 30 2020, John Goerzen wrote:

> Interesting!  I just happened across an apparent memory leak in 
> nncp-exec in
> which it was using over 1GB of RAM.  To be sure, I may have been 
> piping several
> GB of data to it.  But I would expect it to stream out the data 
> as it is read
> from stdin, hopefully?  I wonder if this is related?

Also related:

On the receiving side, I'm getting "data exceeds max slice limit" 
for these large ones.

Is there a file size limit to exec and (un-chunked) file?

- John

* Re: nncp-bundle, nncp-caller, and .seen not working?
  2020-12-30 15:37   ` John Goerzen
  2020-12-30 15:44     ` John Goerzen
@ 2020-12-30 19:10     ` Sergey Matveev
  2020-12-30 19:16       ` Sergey Matveev
                         ` (2 more replies)
  1 sibling, 3 replies; 13+ messages in thread
From: Sergey Matveev @ 2020-12-30 19:10 UTC (permalink / raw)
  To: nncp-devel

*** John Goerzen [2020-12-30 09:37]:
>Interesting!  I just happened across an apparent memory leak in nncp-exec in
>which it was using over 1GB of RAM.  To be sure, I may have been piping
>several GB of data to it.  But I would expect it to stream out the data as it
>is read from stdin, hopefully?  I wonder if this is related?

My intention for uncompressed execs is not related. I just see that
someone can (obviously) use nncp-exec for tasks other than sending
(highly compressible) email/news, and compression can be harmful or
useless in many cases.

nncp-exec really does store all the compressed data in memory before
writing it to disk. I can replace that with writing to a temporary file,
as "nncp-file - dst:..." already does now.

The problem with streaming is that you do not know the exact file size
in advance, and it is written at the beginning of the first encrypted
block. So currently, for nncp-file reading from stdin, I store all the
read data in a temporary file and then read it back, knowing its size.
Of course ephemeral symmetric encryption is used, so even if the
computer suddenly shuts down and that file stays there -- no one can
decrypt it.

I am not sure (it is too late to think clearly :-)), but I think that
the temporary file can be replaced with the following scheme:

* we create an ordinary encrypted packet in place, streaming all the
  data inside, but leaving its first encrypted 128KiB block blank
  (filled with zeros)! Its payload is kept in RAM
* after all the data is read and written, we fill in the size field of
  that block in RAM and encrypt it
* then we seek to the zero-filled first block and overwrite it with its
  fully ready encrypted contents from RAM

All encrypted blocks are independent of each other
(http://www.nncpgo.org/Encrypted.html), so we can safely do that. A
zero-filled block will also skip real block allocation on ZFS with any
kind of compression enabled. But that single 128KiB block won't be
placed sequentially with the other ones. Not a big deal, I think.
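
A minimal sketch of the placeholder-then-overwrite idea above, not NNCP's
actual code. It assumes a hypothetical layout where the reserved region is
a bare 8-byte big-endian size field rather than a full encrypted block,
and it leaves out encryption entirely. The names writeDeferredSize,
roundTrip and sizeFieldLen are made up for illustration.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"io"
	"os"
	"path/filepath"
	"strings"
)

// sizeFieldLen is a hypothetical 8-byte plaintext size field; a real
// packet would hold encrypted, authenticated data here.
const sizeFieldLen = 8

// writeDeferredSize streams src into a new file at path. The size field
// at the start is zero-filled first and overwritten once src hits EOF.
func writeDeferredSize(path string, src io.Reader) (int64, error) {
	f, err := os.Create(path)
	if err != nil {
		return 0, err
	}
	defer f.Close()
	// Reserve the size field: the real value is not known yet.
	if _, err = f.Write(make([]byte, sizeFieldLen)); err != nil {
		return 0, err
	}
	// Stream the payload; io.Copy returns the number of bytes copied.
	n, err := io.Copy(f, src)
	if err != nil {
		return 0, err
	}
	// Seek back to the placeholder and overwrite it with the size.
	if _, err = f.Seek(0, io.SeekStart); err != nil {
		return 0, err
	}
	var size [sizeFieldLen]byte
	binary.BigEndian.PutUint64(size[:], uint64(n))
	if _, err = f.Write(size[:]); err != nil {
		return 0, err
	}
	return n, f.Sync()
}

// roundTrip writes payload to a temporary file and reads the size back.
func roundTrip(payload string) (uint64, error) {
	dir, err := os.MkdirTemp("", "deferred-size")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "packet")
	if _, err = writeDeferredSize(path, strings.NewReader(payload)); err != nil {
		return 0, err
	}
	raw, err := os.ReadFile(path)
	if err != nil {
		return 0, err
	}
	return binary.BigEndian.Uint64(raw[:sizeFieldLen]), nil
}

func main() {
	size, err := roundTrip("hello world")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("recorded size:", size)
}
```

The output file must be seekable for this to work, so the same trick does
not apply when the packet itself is written to a pipe.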

But here is another problem: the whole packet is hashed from beginning
to end, so changing that single block requires the entire hash
calculation to be done again. It is good that we get rid of the
temporary file altogether, but we still need to sequentially write all
the data from stdin, do the seek/overwrite and then sequentially read
all the data again for checksumming.

We can avoid that by using hash trees (Merkle trees). Changing that
single block will lead to recalculation of only one tree path, without
reading the data from the disk again. Moreover it gives the ability to
parallelize hash calculations (I doubt it is a bottleneck anywhere in
practice, but who knows!).
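
To illustrate why only one tree path needs rehashing, here is a
simplified binary hash tree. It is only a sketch under assumptions:
SHA-256 stands in for NNCP's real hash, a node without a sibling is
paired with itself, there is no leaf/node domain separation (a
production design would want it), and at least one block is assumed.
All names are hypothetical.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

type digest = [sha256.Size]byte

// parent hashes the pair of children of node i from the level below.
// A trailing node without a sibling is paired with itself.
func parent(level []digest, i int) digest {
	left := level[2*i]
	right := left
	if 2*i+1 < len(level) {
		right = level[2*i+1]
	}
	return sha256.Sum256(append(left[:], right[:]...))
}

// buildTree returns every level of the tree, leaves first, root last.
// It expects at least one block.
func buildTree(blocks [][]byte) [][]digest {
	level := make([]digest, len(blocks))
	for i, b := range blocks {
		level[i] = sha256.Sum256(b)
	}
	levels := [][]digest{level}
	for len(level) > 1 {
		next := make([]digest, (len(level)+1)/2)
		for i := range next {
			next[i] = parent(level, i)
		}
		levels = append(levels, next)
		level = next
	}
	return levels
}

// root returns the single digest at the top of the tree.
func root(levels [][]digest) digest {
	return levels[len(levels)-1][0]
}

// updateLeaf replaces leaf idx and rehashes only its path to the root.
// It returns how many hashes were recomputed.
func updateLeaf(levels [][]digest, idx int, block []byte) int {
	levels[0][idx] = sha256.Sum256(block)
	rehashed := 1
	for d := 0; d+1 < len(levels); d++ {
		idx /= 2
		levels[d+1][idx] = parent(levels[d], idx)
		rehashed++
	}
	return rehashed
}

func main() {
	blocks := make([][]byte, 8)
	for i := range blocks {
		blocks[i] = []byte{byte(i)}
	}
	tree := buildTree(blocks)
	// The first block is finalized last, as in the scheme above.
	blocks[0] = []byte("final first block")
	n := updateLeaf(tree, 0, blocks[0])
	fmt.Println("hashes recomputed:", n)
	fmt.Println("matches full rebuild:", root(tree) == root(buildTree(blocks)))
}
```

With 8 blocks the update touches one leaf and three inner nodes, while
the other seven blocks are never read again.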

So... currently I do not see what can go wrong with that scheme. No
temporary files (for "nncp-file -" and nncp-exec) and the possibility
of parallelizable hashing. And of course it will "fix" (actually it is
not a bug, that is just how it has worked all the time :-)) the issue
with nncp-exec memory consumption.

But it will make current packets incompatible with that scheme, because
of the different checksum algorithm.

>On the receiving side, I'm getting "data exceeds max slice limit" for these
>large ones.
>Is there a file size limit to exec and (un-chunked) file?

There should not be any. Currently I do not know where that error can
be raised. I will look at it later, probably when the "new scheme" is
implemented.

>Thanks again for NNCP.  I am really appreciating it!

Glad to hear that!

-- 
Sergey Matveev (http://www.stargrave.org/)
OpenPGP: CF60 E89A 5923 1E76 E263  6422 AE1A 8109 E498 57EF

* Re: nncp-bundle, nncp-caller, and .seen not working?
  2020-12-30 19:10     ` Sergey Matveev
@ 2020-12-30 19:16       ` Sergey Matveev
  2021-01-01  5:36       ` John Goerzen
  2021-01-01  5:39       ` John Goerzen
  2 siblings, 0 replies; 13+ messages in thread
From: Sergey Matveev @ 2020-12-30 19:16 UTC (permalink / raw)
  To: nncp-devel

*** Sergey Matveev [2020-12-30 22:10]:
>Problem with streaming is that you do not know exact file size in
>advance, that is written in the beginning of the first encrypted block.

Actually it is not stored in the first block, but just precedes it. So
"size" is just a 64-bit ciphertext + 128-bit MAC tag. So we do not have
to store those 128KiB of data, only the 64-bit size.

-- 
Sergey Matveev (http://www.stargrave.org/)
OpenPGP: CF60 E89A 5923 1E76 E263  6422 AE1A 8109 E498 57EF

* Re: nncp-bundle, nncp-caller, and .seen not working?
  2020-12-30 19:10     ` Sergey Matveev
  2020-12-30 19:16       ` Sergey Matveev
@ 2021-01-01  5:36       ` John Goerzen
  2021-01-02 12:56         ` Sergey Matveev
  2021-01-01  5:39       ` John Goerzen
  2 siblings, 1 reply; 13+ messages in thread
From: John Goerzen @ 2021-01-01  5:36 UTC (permalink / raw)
  To: Sergey Matveev; +Cc: nncp-devel

Happy New Year!

On Wed, Dec 30 2020, Sergey Matveev wrote:
> Should not be any. Currently do not know where it can raise that 
> error.
> Will look at it later, probably when the "new scheme" will be 
> implemented.
>

I'm getting things like this periodically:

Dec 31 12:41:00 hostname nncp-caller[14357]: E 
2020-12-31T18:41:00.090076849Z
[sp-recv err="xdr:decodeArray: data exceeds max slice limit - 
read: '415373007'"
nice="255" node="L...Q"]

Dec 31 15:48:00 hostname nncp-caller[14357]: E 
2020-12-31T21:48:00.046258445Z
[sp-recv err="xdr:decodeArray: data exceeds max slice limit - 
read:
'3814302942'" nice="255"
node="L...Q"]

It doesn't seem to result in any harm in the end, though I will 
note that multi-GB files seem to cause either the caller or the 
daemon to hang up.  It's not entirely clear which, but it seems 
that the caller takes a while to finish the write (an fsync?  or 
hash verification?) and the daemon times out, perhaps killing 
simultaneous transfers (it appears that's a thing?)

The nice value is odd too; I've never explicitly set anything to 
nice 255.  Maybe some form of corruption?

Thanks again,

John

* Re: nncp-bundle, nncp-caller, and .seen not working?
  2020-12-30 19:10     ` Sergey Matveev
  2020-12-30 19:16       ` Sergey Matveev
  2021-01-01  5:36       ` John Goerzen
@ 2021-01-01  5:39       ` John Goerzen
  2021-01-02 13:04         ` Sergey Matveev
  2021-01-06 17:15         ` Sergey Matveev
  2 siblings, 2 replies; 13+ messages in thread
From: John Goerzen @ 2021-01-01  5:39 UTC (permalink / raw)
  To: Sergey Matveev; +Cc: nncp-devel


On Wed, Dec 30 2020, Sergey Matveev wrote:

> But here is another problem: the whole packet is hashed from 
> beginning
> to the end. And changing that single block will require the 
> entire
> hashing calculation to be done again. It is good that we get rid 
> of
> temporary file at all, but we still need to sequentially write 
> all the
> data from stdin, make seek/overwrite and then sequentially read 
> all the
> data again for its checksumming.
>
> We can avoid it by using hash trees, Merkle trees. Changing of 
> that
>

I think I'm following here :-)

I'm not familiar with Merkle trees, but one solution to this would 
be to hash it in the order it's written at verification time: save 
the first 128K in RAM, read the rest, and when you're at EOF, 
supply that first 128K to the hasher and then you'd have your 
calculation.
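
A rough sketch of this reordered hashing, under assumptions: SHA-256
stands in for NNCP's real hash, the "first block" is simply the first
128 KiB of the stream, and the function names are hypothetical. The
writer hashes the tail while streaming and the first block last; the
verifier holds the first block in RAM and feeds it to the hasher at EOF,
so both sides hash the bytes in the same order.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"fmt"
	"io"
	"os"
)

const firstBlockLen = 128 * 1024

// writerSum is what the sender would compute: the tail is hashed while
// it streams out, the first block is hashed last, once it is final.
func writerSum(first, tail []byte) [sha256.Size]byte {
	h := sha256.New()
	h.Write(tail)
	h.Write(first)
	var sum [sha256.Size]byte
	copy(sum[:], h.Sum(nil))
	return sum
}

// verifierSum reads a packet front to back, keeps the first block in
// RAM, hashes everything after it, and feeds the first block at EOF.
func verifierSum(r io.Reader) ([sha256.Size]byte, error) {
	var sum [sha256.Size]byte
	first := make([]byte, firstBlockLen)
	n, err := io.ReadFull(r, first)
	// A packet shorter than one block is fine; other errors are not.
	if err != nil && err != io.EOF && err != io.ErrUnexpectedEOF {
		return sum, err
	}
	h := sha256.New()
	if _, err := io.Copy(h, r); err != nil {
		return sum, err
	}
	h.Write(first[:n])
	copy(sum[:], h.Sum(nil))
	return sum, nil
}

// verifyBytes is a small helper for in-memory packets.
func verifyBytes(packet []byte) ([sha256.Size]byte, error) {
	return verifierSum(bytes.NewReader(packet))
}

func main() {
	first := bytes.Repeat([]byte{1}, firstBlockLen)
	tail := []byte("rest of the packet")
	packet := append(append([]byte{}, first...), tail...)
	got, err := verifyBytes(packet)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("sums match:", got == writerSum(first, tail))
}
```

The cost is one block of RAM on the verifying side, and the resulting
digest no longer equals a plain front-to-back hash of the file.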

- John

* Re: nncp-bundle, nncp-caller, and .seen not working?
  2021-01-01  5:36       ` John Goerzen
@ 2021-01-02 12:56         ` Sergey Matveev
  0 siblings, 0 replies; 13+ messages in thread
From: Sergey Matveev @ 2021-01-02 12:56 UTC (permalink / raw)
  To: nncp-devel

Happy New Year everyone!

*** John Goerzen [2020-12-31 23:36]:
>I'm getting things like this periodically:

Aah! Yes, I have also seen similar errors in my logs for a long time.
Well, actually I do not remember why that issue is still not fixed.
Either I could not find the reason, or I never tried, because, as you
mentioned, no data is actually lost or corrupted, so why bother :-).
When I start working on NNCP this year, I will look at those issues
more closely again. My daemon/caller just restart their connection and
everything continues to flow smoothly again. So it is just an
annoyance.

-- 
Sergey Matveev (http://www.stargrave.org/)
OpenPGP: CF60 E89A 5923 1E76 E263  6422 AE1A 8109 E498 57EF

* Re: nncp-bundle, nncp-caller, and .seen not working?
  2021-01-01  5:39       ` John Goerzen
@ 2021-01-02 13:04         ` Sergey Matveev
  2021-01-06 17:15         ` Sergey Matveev
  1 sibling, 0 replies; 13+ messages in thread
From: Sergey Matveev @ 2021-01-02 13:04 UTC (permalink / raw)
  To: nncp-devel

*** John Goerzen [2020-12-31 23:39]:
>I'm not familiar with Merkle trees, but one solution to this would be to hash
>it in the order it's written at verification time: save the first 128K in
>RAM, read the rest, and when you're at EOF, supply that first 128K to the
>hasher and then you'd have your calculation.

Of course that is also a solution, and a very simple one indeed. But it
is just not as beautiful and elegant as a general tree hashing mode. The
current hashing is even compatible with ordinary b2sum commands, but
tree hashing of course won't be anymore. I will look at all of this more
closely soon.

-- 
Sergey Matveev (http://www.stargrave.org/)
OpenPGP: CF60 E89A 5923 1E76 E263  6422 AE1A 8109 E498 57EF

* Re: nncp-bundle, nncp-caller, and .seen not working?
  2021-01-01  5:39       ` John Goerzen
  2021-01-02 13:04         ` Sergey Matveev
@ 2021-01-06 17:15         ` Sergey Matveev
  2021-01-06 18:55           ` Sergey Matveev
  2021-01-07 17:56           ` John Goerzen
  1 sibling, 2 replies; 13+ messages in thread
From: Sergey Matveev @ 2021-01-06 17:15 UTC (permalink / raw)
  To: nncp-devel

*** John Goerzen [2020-12-31 23:39]:
>> We can avoid it by using hash trees, Merkle trees. Changing of that
>
>I think I'm following here :-)

Well, I tried to get rid of temporary files and to calculate the
checksum differently. I tried my own Merkle tree implementation, but
did not finish it. I decided to use just
hash(hash(header) || hash(body)), where the body is hashed while it is
written to the disk, and at the end, when we overwrite the header, we
just calculate two more hashes. I also tried BLAKE3 and it was twice as
fast, without any parallelization, on my hardware -- I was just curious
about it.
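
The hash(hash(header) || hash(body)) construction can be sketched as
follows. This is an illustration under assumptions, not the code that was
tried: SHA-256 stands in for the real hash, the header is a made-up
"size=N" string, and streamBody, packetSum and demo are hypothetical
names.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"io"
	"os"
	"strings"
)

type digest = [sha256.Size]byte

// streamBody copies src to dst and hashes the body on the way, so the
// body never has to be read back from disk.
func streamBody(dst io.Writer, src io.Reader) (digest, int64, error) {
	h := sha256.New()
	n, err := io.Copy(io.MultiWriter(dst, h), src)
	var sum digest
	copy(sum[:], h.Sum(nil))
	return sum, n, err
}

// packetSum computes hash(hash(header) || hash(body)): two extra hashes
// over short inputs once the header is final.
func packetSum(header []byte, bodySum digest) digest {
	headerSum := sha256.Sum256(header)
	return sha256.Sum256(append(headerSum[:], bodySum[:]...))
}

// demo streams body, builds a hypothetical header from its size, and
// returns the streamed checksum next to one computed from scratch.
func demo(body string) (digest, digest, error) {
	bodySum, n, err := streamBody(io.Discard, strings.NewReader(body))
	if err != nil {
		return digest{}, digest{}, err
	}
	header := []byte(fmt.Sprintf("size=%d", n))
	streamed := packetSum(header, bodySum)
	direct := packetSum(header, sha256.Sum256([]byte(body)))
	return streamed, direct, nil
}

func main() {
	streamed, direct, err := demo("payload read from stdin")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("streamed sum equals direct sum:", streamed == direct)
}
```

Because the body digest is already known when the header is rewritten,
finalizing the checksum costs two short hashes regardless of body size.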

But I failed. All of that can be done with a single encrypted packet.
But when you want to send via some other nodes, all those wrapped
encrypted packets are created *on the fly* simultaneously. Overwriting
headers requires the ability to seek inside those *unaligned* encrypted
streams. So it can hardly be done.

Moreover, I have started to refactor nncp-exec (to be able to use
temporary files at least). And I remembered why it was so simple and
even kept everything in memory. If nncp-exec transfers a very big
volume of data, then it should be chunked (depending on configuration,
of course) like an ordinary nncp-file. That means creating multiple
NNCP packets, which can be reordered during the transfer. You can not
run any command until the whole data is ready. Moreover, all the data
is inside multiple encrypted packets.

So there must be some complicated code that somehow finds the .meta
packet containing information about all the other chunks. Then it must
somehow find those chunks. That differs from current nncp-file/toss
operations, because they decrypt packets and store them on the
filesystem. Ok, let's just decrypt nncp-exec's chunks too, store them
on the filesystem and do literally something like
"nncp-reass -stdout nncp-exec's.meta |".

So either we have very complex code trying to find chunks *inside*
encrypted packets (obviously by parsing them again and again, or by
having some additional state somewhere), or we just have an ordinary
nncp-file-like transmission.

But current nncp-reass has options for keeping fragments on the disk
even after reassembling. It is able to feed them and delete them at
once, not requiring double disk space (for the chunks and the whole
reassembled file). If it feeds the file to some external command --
what kind of behaviour should it have? Delete all files if that
external command fails? Keep them? Possibly it is better to reassemble
them on a dedup-ed ZFS dataset and pass that reassembled file as an
argument? I simply can not answer any of those questions.

So I remembered why nncp-exec is so simple. If you want to pass big
volumes of data -- then pass them using nncp-file, and then send
nncp-exec commands that are given the filename. The remote side's exec
configuration will work with incoming files however the user wants. An
nncp-exec packet can reach the destination much earlier than the
nncp-file data itself. It can just fail, again and again, every time
nncp-toss meets the nncp-exec packet, until the nncp-file data arrives
and is (probably) reassembled. I would prefer to use some kind of
cron-ed daemon that checks for new files in some special directory of
reassembled files, which are treated as tasks for some command.
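
One possible shape for such a cron-ed job, as a sketch only. The spool
directory, the ".task" file names and the handler are all hypothetical,
and the sketch assumes files appear in the directory only when complete
(for example, moved in after reassembly). A file is removed only when
its handler succeeds, so failed tasks are retried on the next run.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// processSpool calls handle for every regular file in dir. A file is
// deleted only after handle succeeds; failures are left for a retry.
// It returns the names of the files that were processed.
func processSpool(dir string, handle func(path string) error) ([]string, error) {
	entries, err := os.ReadDir(dir) // sorted by file name
	if err != nil {
		return nil, err
	}
	var done []string
	for _, e := range entries {
		if !e.Type().IsRegular() {
			continue
		}
		path := filepath.Join(dir, e.Name())
		if err := handle(path); err != nil {
			continue // keep the file; try again on the next run
		}
		if err := os.Remove(path); err != nil {
			return done, err
		}
		done = append(done, e.Name())
	}
	return done, nil
}

// demoSpool creates two task files; the stand-in handler rejects
// "b.task". It returns the processed names and how many files remain.
func demoSpool() ([]string, int, error) {
	dir, err := os.MkdirTemp("", "spool")
	if err != nil {
		return nil, 0, err
	}
	defer os.RemoveAll(dir)
	for _, name := range []string{"a.task", "b.task"} {
		if err := os.WriteFile(filepath.Join(dir, name), []byte("data"), 0o600); err != nil {
			return nil, 0, err
		}
	}
	handle := func(path string) error {
		if strings.HasSuffix(path, "b.task") {
			return errors.New("command failed")
		}
		return nil
	}
	done, err := processSpool(dir, handle)
	if err != nil {
		return nil, 0, err
	}
	rest, err := os.ReadDir(dir)
	if err != nil {
		return nil, 0, err
	}
	return done, len(rest), nil
}

func main() {
	done, left, err := demoSpool()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("processed:", done, "left for retry:", left)
}
```

In a real setup the handler would run whatever external command the
files are meant for; here it is a stand-in so the retry behaviour is
visible.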

It is the most flexible way, without huge code complications and many
hard-to-answer questions/decisions. As I remember, FidoNet had a
similar kind of thing, where big binary files were transferred with an
"attached" additional packet referencing that binary file.

That is why nncp-exec packets can not be chunked, can not be big and no
temporary files are made: those -exec packets are relatively small
anyway (well, actually there could be multi-megabyte email messages,
but I assume that NNCP is running on computers with several hundred MB
of RAM). Their "nature" is to be streamed into some command, then
tossed, and repeated again if it fails. If they are big, then nncp-file
must be used for the data transfer itself (chunked, reassembled, and so
on).

-- 
Sergey Matveev (http://www.stargrave.org/)
OpenPGP: CF60 E89A 5923 1E76 E263  6422 AE1A 8109 E498 57EF

* Re: nncp-bundle, nncp-caller, and .seen not working?
  2021-01-06 17:15         ` Sergey Matveev
@ 2021-01-06 18:55           ` Sergey Matveev
  2021-01-07 17:56           ` John Goerzen
  1 sibling, 0 replies; 13+ messages in thread
From: Sergey Matveev @ 2021-01-06 18:55 UTC (permalink / raw)
  To: nncp-devel

*** Sergey Matveev [2021-01-06 20:15]:
>That is why nncp-exec packets can not be chunked, can not be big and no
>temporary files are made, because anyway those -exec packets are
>relatively small

That also means that I see no benefit in the ability to create
non-compressed nncp-exec packets. Those packets are small, and even if
they contain incompressible data, Zstandard quickly falls back to its
incompressible mode of operation, with unnoticeable CPU and space
overhead.

-- 
Sergey Matveev (http://www.stargrave.org/)
OpenPGP: CF60 E89A 5923 1E76 E263  6422 AE1A 8109 E498 57EF

* Re: nncp-bundle, nncp-caller, and .seen not working?
  2021-01-06 17:15         ` Sergey Matveev
  2021-01-06 18:55           ` Sergey Matveev
@ 2021-01-07 17:56           ` John Goerzen
  1 sibling, 0 replies; 13+ messages in thread
From: John Goerzen @ 2021-01-07 17:56 UTC (permalink / raw)
  To: Sergey Matveev; +Cc: nncp-devel


On Wed, Jan 06 2021, Sergey Matveev wrote:

> That is why nncp-exec packets can not be chunked, can not be big 
> and no
> temporary files are made, because anyway those -exec packets are
> relatively small (well, actually there could be multimegabyte 
> email
> messages, but I assume that NNCP is running on computers with 
> multiple
> hundreds MB of RAM). Their "nature" is to be streamed inside 
> some
> command, then tossed, repeated again if it fail. If they are 
> big, then
> nncp-file must be used for data transfer itself (chunked, 
> reassembled,
> and so on).

Thank you for the new update and all your work on NNCP.  This will 
be a very nice improvement!

The option to use an encrypted temporary file for nncp-exec will 
be perfect for me and the temp file solution is fine.  chunked for 
NNCP isn't important for my use case; although the files are 
sometimes many GB in size, that is still smaller than the 
transport (whether network or airgapped) and thus the resumable 
transfers are sufficient for my needs.

Thanks again!

- John
