public inbox for nncp-devel@lists.cypherpunks.ru
From: Sergey Matveev <stargrave@stargrave•org>
To: nncp-devel@lists.cypherpunks.ru
Subject: Re: I/O timeout in nncp-daemon
Date: Sat, 23 Jan 2021 12:28:17 +0300
Message-ID: <YAvsPFmR462hkFHA@stargrave.org>
In-Reply-To: <875z3px2u4.fsf@complete.org>
Greetings!
*** John Goerzen [2021-01-21 17:55]:
>This timeout is much lower than any of the deadlines set in the configuration
>file, and appears to be a few seconds.
10 seconds. It comes from the various
    conn.Set*Deadline(time.Now().Add(DefaultDeadline))
lines in the code.
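To illustrate how such a deadline turns into an I/O timeout, here is a
minimal sketch. It assumes an in-memory net.Pipe() as a stand-in for the
real TCP connection, and writeTimesOut is a hypothetical helper that is
not part of NNCP. net.Pipe() is unbuffered, so the write stalls at once;
with real TCP the same thing happens only after the kernel buffers fill
up.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"time"
)

// DefaultDeadline is the 10-second value discussed above.
const DefaultDeadline = 10 * time.Second

// writeTimesOut reports whether writing to a peer that never reads
// fails with a timeout once the deadline d has passed.
// Hypothetical helper, for illustration only.
func writeTimesOut(d time.Duration) bool {
	local, remote := net.Pipe() // in-memory stand-in for a TCP connection
	defer local.Close()
	defer remote.Close()

	// Push the deadline forward right before the I/O call.
	if err := local.SetWriteDeadline(time.Now().Add(d)); err != nil {
		return false
	}
	_, err := local.Write([]byte("FILE chunk")) // remote never calls Read()

	var netErr net.Error
	return errors.As(err, &netErr) && netErr.Timeout()
}

func main() {
	// A short deadline keeps the demo fast; DefaultDeadline would behave
	// the same way, only after 10 seconds.
	fmt.Println("i/o timeout:", writeTimesOut(50*time.Millisecond))
}
```

The deadline is absolute, so it has to be set again before each
Read()/Write() call if every operation should get its own time budget.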
>Would it be possible to make this configurable, or larger?
Well... when I wrote that, I assumed that the remote system has to give
an answer (from the TCP point of view) within 10 seconds anyway.
The overall file retrieval algorithm is simple:
* after the handshake is made, each side sends a list of the packets
  (with their nice values) available for the remote side (INFO)
* then each side, if it wants to (depending on its configuration), sends
  a request (FREQ) to the remote side to start sending the specified
  packet's contents from the specified offset
* after that FREQ is received, the sending of many FILE packets begins.
  Each FILE contains a chunk of the encrypted packet. The packet's
  contents are saved to "PKT-HASH.part"
* when the encrypted packet is fully retrieved (its length equals the
  length known in advance from INFO), a background checksum checker is
  started. If the checksum is good, it renames "PKT-HASH.part" to
  "PKT-HASH"
* the checker runs in the background, while receiving/sending of other
  packets continues
* when another packet is fully retrieved, but the background checker is
  still busy -- NNCP waits for the checker to complete
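The waiting rule from the last two items can be modelled roughly as
follows. This is only a sketch of the behaviour described above, not
NNCP's actual code: the receiver type, packetComplete, wait and
stallDemo are hypothetical names, and the checksum itself is replaced
by an arbitrary function.

```go
package main

import "fmt"

// receiver models only the waiting rule: at most one background check.
type receiver struct {
	busy chan struct{} // closed when the running check finishes; nil if none
}

// packetComplete is called when a packet has been fully received.
func (r *receiver) packetComplete(check func()) {
	if r.busy != nil {
		<-r.busy // the stall: nothing reads from the socket meanwhile
	}
	done := make(chan struct{})
	r.busy = done
	go func() {
		check()
		close(done)
	}()
}

// wait blocks until the most recently started check has finished.
func (r *receiver) wait() {
	if r.busy != nil {
		<-r.busy
	}
}

// stallDemo returns the order of events for two completed packets.
func stallDemo() []string {
	var events []string
	release := make(chan struct{})
	r := &receiver{}

	r.packetComplete(func() { // check of packet A: slow, waits for release
		<-release
		events = append(events, "check of A finished")
	})
	events = append(events, "packet B fully received")
	close(release)              // the slow check is finally allowed to end
	r.packetComplete(func() {}) // blocks until the check of A is done
	events = append(events, "receiver resumes reading")
	r.wait()
	return events
}

func main() {
	for _, e := range stallDemo() {
		fmt.Println(e)
	}
}
```

In the sketch the receiver cannot resume before the check of A has
finished. With a multi-gigabyte packet on a slow disk that check can
easily take longer than the 10-second deadline on the remote side.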
And it seems (I am sure) that exactly because of the last step the
program "hangs" and does not read anything more from the socket. Of
course, some buffers may still hold another FILE chunk, but they fill up
quickly and then wait for the program to issue a Read() on the socket.
Checksumming is required to be completely sure that the received file is
good and that we can send the "DONE" message to the remote side, which
will then delete the packet from its spool.
You know, when I was writing the online part of NNCP, I was not
thinking about huge files at all. Initially NNCP had no online
communication at all.
What can be done?
* the receiver can send a HALT packet to stop the remote side from
  sending any data. I do not like that option, because it can easily
  lead to a constant HALT+FREQ exchange. It is hard to determine whether
  we really need to stop reception because we have got many gigabytes of
  data on a USB2 HDD
* the receiver can send FREQs not for a bunch of files, but for a single
  one only, waiting for its reception and checksumming, and only then
  sending the next FREQ. I do not like that, because it leads to many
  round-trips. Currently it can send many KiBs of FREQs in just a single
  TCP segment, leading to a non-stop stream of sent packets
* the receiver can keep a queue of packets that need to be checked.
  Instead of waiting for the checksummer to finish, it just appends to
  that queue
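The third option could look roughly like this: a buffered channel feeds
a single background checker, so the receiver hands a packet over and
goes straight back to reading the socket. It is only a sketch;
checkQueue, newCheckQueue, enqueue, finish and queueDemo are
hypothetical names, and the buffer size of 8 is an arbitrary assumption.

```go
package main

import (
	"fmt"
	"sync"
)

// checkQueue hands completed packets to a single background checker.
type checkQueue struct {
	jobs chan string
	wg   sync.WaitGroup
}

// newCheckQueue starts one worker that checks packets one at a time.
func newCheckQueue(size int, check func(pkt string)) *checkQueue {
	q := &checkQueue{jobs: make(chan string, size)}
	q.wg.Add(1)
	go func() {
		defer q.wg.Done()
		for pkt := range q.jobs {
			check(pkt)
		}
	}()
	return q
}

// enqueue returns immediately unless the buffer is already full.
func (q *checkQueue) enqueue(pkt string) { q.jobs <- pkt }

// finish stops accepting packets and waits for all pending checks.
func (q *checkQueue) finish() {
	close(q.jobs)
	q.wg.Wait()
}

// queueDemo queues three packets while the checker is still blocked.
func queueDemo() (queuedWhileBusy int, checked []string) {
	release := make(chan struct{})
	q := newCheckQueue(8, func(pkt string) {
		<-release // pretend that every check is slow
		checked = append(checked, pkt)
	})
	for _, pkt := range []string{"A", "B", "C"} {
		q.enqueue(pkt) // does not wait for the checker
		queuedWhileBusy++
	}
	close(release) // let the checker proceed
	q.finish()
	return queuedWhileBusy, checked
}

func main() {
	n, checked := queueDemo()
	fmt.Println("queued while checker was busy:", n)
	fmt.Println("checked in order:", checked)
}
```

The queue is bounded here, so a full buffer would block the receiver
again; an unbounded list would avoid that at the cost of memory.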
But! Anyway, I do not like the fact that the checker and the receiver
work simultaneously, leading to constantly interleaved read/write
operations that kill HDD performance. It greatly decreases the overall
receiving speed. Ideally we should either only send data or only receive
data, to be able to write it to the disk *sequentially* (this can be
done now by specifying rx/tx modes). Then we should read it
sequentially for checksum verification, without any network
transmission at all.
Moreover, what if we deal with a 1TiB file? There is a high probability
that the daemon/caller will be restarted, and no one will start checking
that .part file until the next online session.
I think that there should be another intermediate step in packet
processing. "PKT-HASH.part" is a partly received file. Now it is time to
create some kind of "PKT-HASH.done-but-unchecked" one, and another
nncp-checker daemon that just verifies the checksum and renames
"PKT-HASH.done-but-unchecked" to "PKT-HASH". There also needs to be a
possibility for nncp-daemon *not* to do checksumming immediately, so
that the checksum check can be done completely asynchronously from the
transmission. Of course tossing must be done with the -seen option, to
save the fact that some file was seen and (possibly) already processed.
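A rough sketch of what one pass of such a checker could do over a spool
directory is given below. Everything in it is an assumption made for
illustration: checkSpool, fileDigest and demo are hypothetical names,
the directory layout is simplified, and hex-encoded SHA-256 stands in
for the hash and encoding that NNCP really uses for packet names.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"os"
	"path/filepath"
	"strings"
)

// Suffix name taken from the proposal above.
const uncheckedSuffix = ".done-but-unchecked"

// fileDigest streams the file through SHA-256 (a stand-in hash).
func fileDigest(path string) (string, error) {
	f, err := os.Open(path)
	if err != nil {
		return "", err
	}
	defer f.Close()
	h := sha256.New()
	if _, err := io.Copy(h, f); err != nil {
		return "", err
	}
	return hex.EncodeToString(h.Sum(nil)), nil
}

// checkSpool verifies every unchecked file in dir and renames the good
// ones to their final name. Bad files are left untouched.
func checkSpool(dir string) (accepted, rejected int, err error) {
	entries, err := os.ReadDir(dir)
	if err != nil {
		return 0, 0, err
	}
	for _, e := range entries {
		name := e.Name()
		if e.IsDir() || !strings.HasSuffix(name, uncheckedSuffix) {
			continue
		}
		want := strings.TrimSuffix(name, uncheckedSuffix)
		path := filepath.Join(dir, name)
		got, err := fileDigest(path)
		if err != nil {
			return accepted, rejected, err
		}
		if got != want {
			rejected++
			continue
		}
		if err := os.Rename(path, filepath.Join(dir, want)); err != nil {
			return accepted, rejected, err
		}
		accepted++
	}
	return accepted, rejected, nil
}

// demo builds a temporary spool with one good and one corrupted file.
func demo() (accepted, rejected int, err error) {
	dir, err := os.MkdirTemp("", "spool")
	if err != nil {
		return 0, 0, err
	}
	defer os.RemoveAll(dir)

	good := []byte("encrypted packet")
	sum := sha256.Sum256(good)
	goodName := hex.EncodeToString(sum[:]) + uncheckedSuffix
	if err := os.WriteFile(filepath.Join(dir, goodName), good, 0o600); err != nil {
		return 0, 0, err
	}
	badName := strings.Repeat("0", 64) + uncheckedSuffix
	if err := os.WriteFile(filepath.Join(dir, badName), []byte("corrupted"), 0o600); err != nil {
		return 0, 0, err
	}
	return checkSpool(dir)
}

func main() {
	accepted, rejected, err := demo()
	fmt.Println("accepted:", accepted, "rejected:", rejected, "err:", err)
}
```

Such a pass reads each file sequentially and touches no socket, so it
can run at any time, including after a restart of the daemon.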
--
Sergey Matveev (http://www.stargrave.org/)
OpenPGP: CF60 E89A 5923 1E76 E263 6422 AE1A 8109 E498 57EF