Note: Links to docs will display an error until the docs builds have been completed.
As of commit 0975d5e with merge base 4b96575 ():
๐ Looks good so far! There are no failures yet. ๐
This comment was automatically generated by Dr. CI and updates every 15 minutes.
134 | 137 | public: | |
135 | 138 | explicit UvTcpSocket(uv_loop_t* loop) { | |
136 | 139 | uv_tcp_init(loop, &client); | |
140 | if(int err = uv_tcp_nodelay(&client, 1)) { |
do we have some microbenchmark code that we could use to validate the improvement (and no unexpected perf impact)?
Yes, I'll do some more testing on this internally but I expect it'll be pretty safe.
We already set TCP_NODELAY on everything that uses socket.cpp so the older TCPStoreBackend and the current TCPStore client. Not setting this seems like an oversight when writing the new backend -- it's a bit weird to set it only on the client but not the server as well since it's asymmetric.
One more data point: we do internal buffering of messages so we only call write() once the entire message is ready to be sent. Thus we only call write once per request so there won't ever be another message that can be coalesced.
make sense. would it be useful for users to have an option to actually set the value instead of relying on OS?
@kurman the OS setting is the global max so the value passed to listen()
can only be used to shrink it more. I don't see a good use case for limiting this -- there's no case where we would want to intentionally overflow the backlog and there's no performance benefit to setting it lower. It may save a tiny amount of memory
We set it to -1 on the old TCPStoreBackend so this is matching existing behaviors.
@d4l3k has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
@d4l3k has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
34 | // This controls how many un-accepted TCP connections can be waiting in the | ||
35 | // backlog. This should be at least world size to avoid issues on init. We set | ||
36 | // it to -1 to use the host max value which is controlled by `soconnmax`. | ||
37 | #define DEFAULT_BACKLOG -1 |
This DEFAULT_BACKLOG setting seems to be following the libuv example -- likely they didn't realize that -1 worked to default to max and just set it to be the max known world size?
https://github.com/libuv/libuv/blob/v1.x/docs/code/tcp-echo-server/main.c#L7
@pytorchbot merge -f 'Landed internally'
(Initiating merge automatically since Phabricator Diff has merged, using force because this PR might not pass merge_rules.json but landed internally)
Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f
as last resort and instead consider -i/--ignore-current
to continue the merge ignoring current failures. This will allow currently pending tests to finish and report signal before the merge.
Learn more about merging in the wiki.
Questions? Feedback? Please reach out to the PyTorch DevX Team
Login to write a write a comment.
This adjusts the settings of the libuv backend to match the older TCPStore.
tcp_abort_on_overflow
set it results in connections being reset.Test plan:
Benchmark script:
Performance seems to be negligible difference between TCP_NODELAY and not for a single host
cc @mrshenli @pritamdamania87 @zhaojuanmao @satgera @gqchen @aazzolini @osalpekar @jiayisuse @H-Huang @kwen2501 @awgu @penguinwu @fegin @XilunWu @wanchaol @fduwjj @wz337 @tianyu-l @wconstab @yf225 @chauhang