cpuguy83 has quit [Remote host closed the connection]
cpuguy83 has joined #jruby
hosiawak has quit [Ping timeout: 268 seconds]
hosiawak has joined #jruby
hosiawak has quit [Remote host closed the connection]
hosiawak has joined #jruby
hosiawak has quit [Ping timeout: 268 seconds]
hosiawak has joined #jruby
hosiawak has quit [Ping timeout: 240 seconds]
hosiawak has joined #jruby
hosiawak has quit [Ping timeout: 268 seconds]
hosiawak has joined #jruby
cpuguy83 has quit [Remote host closed the connection]
cpuguy83 has joined #jruby
cpuguy83 has quit [Remote host closed the connection]
cpuguy83 has joined #jruby
cpuguy83 has quit [Ping timeout: 240 seconds]
<headius[m]> I cracked the code
<headius[m]> for some reason JRuby sends the headers early as a separate packet, and the client doesn't ack that until after 0.04s or so
cpuguy83 has joined #jruby
cpuguy83 has quit [Ping timeout: 240 seconds]
cpuguy83 has joined #jruby
cpuguy83 has quit [Remote host closed the connection]
Antiarc has quit [Quit: ZNC 1.7.4+deb7 - https://znc.in]
Antiarc has joined #jruby
hosiawak has quit [Ping timeout: 245 seconds]
hosiawak has joined #jruby
hosiawak has quit [Ping timeout: 250 seconds]
hosiawak has joined #jruby
hosiawak has quit [Ping timeout: 276 seconds]
_whitelogger has joined #jruby
cpuguy83 has joined #jruby
hosiawak has joined #jruby
KeyJoo has joined #jruby
cpuguy83 has quit [Ping timeout: 240 seconds]
hosiawak has quit [Remote host closed the connection]
rusk has joined #jruby
drbobbeaty has joined #jruby
drbobbeaty has quit [Quit: My MacBook Pro has gone to sleep. ZZZzzz…]
KeyJoo has quit [Quit: KeyJoo]
shellac has joined #jruby
drbobbeaty has joined #jruby
lucasb has joined #jruby
shellac has quit [Quit: Computer has gone to sleep.]
shellac has joined #jruby
subbu is now known as subbu|away
cpuguy83 has joined #jruby
cpuguy83 has quit [Ping timeout: 240 seconds]
bzb has joined #jruby
subbu|away is now known as subbu
enebo has quit [Ping timeout: 246 seconds]
xardion has quit [Remote host closed the connection]
xardion has joined #jruby
enebo has joined #jruby
<headius[m]> Good morning!
bzb has quit [Quit: Leaving]
enebo has quit [Ping timeout: 240 seconds]
enebo has joined #jruby
shellac has quit [Ping timeout: 245 seconds]
<headius[m]> aha, I finally cracked this ACK nut
<headius[m]> crACKed it
<rdubya[m]> 🥳
<headius[m]> from 200 req/s to 20k
<rdubya[m]> nice
<headius[m]> unfortunately it's not good news
<rdubya[m]> not so nice lol
<headius[m]> the problem lies in our socket impl
<headius[m]> I've just hacked around part of it in the benchclient
<headius[m]> well "problem" may be a stretch
<headius[m]> we work properly, but because we don't obey all the socket flags puma sets, we send response packets differently than MRI
<headius[m]> it just happens to hit the delayed ACK problem
<headius[m]> actually I have to update that because I did just get QUICKACK to work
<headius[m]> there
<headius[m]> ok so the "bug" is basically that JRuby's response is broken into packets differently than MRI's
<lopex> ip packets ?
<headius[m]> I believe it's TCP_CORK doing it, but MRI sends headers plus the first part of the response all at once while acking the request
<lopex> er, tcp
<enebo[m]> yeah so for hello world size payloads which can complete processing a request in much less than 0.04s people will notice us as slower
<headius[m]> we send headers as one packet and then wait for ack before sending the body
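For reference, a minimal Linux-only sketch of the corking behavior being described here, assuming Ruby's Socket::TCP_CORK constant is available; the port and payload are placeholders for illustration, not Puma's actual code:

    require "socket"

    server = TCPServer.new(9292)                    # placeholder port
    client = server.accept
    # cork: ask the kernel to hold partial frames until uncorked
    client.setsockopt(Socket::IPPROTO_TCP, Socket::TCP_CORK, 1)
    client.write("HTTP/1.1 200 OK\r\nContent-Length: 5\r\n\r\n")  # headers buffered, not sent yet
    client.write("hello")                           # body joins the same buffer
    # uncork: headers + body flush onto the wire together
    client.setsockopt(Socket::IPPROTO_TCP, Socket::TCP_CORK, 0)
    client.close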
<headius[m]> enebo: yeah if the request required more than 40ms we probably would not notice this, and also possibly if it were a larger packet
<enebo[m]> so my main takeaway is people doing minimal benchmarking may end up drawing a poor conclusion
<lopex> mss ?
<headius[m]> MRI does not have the 40ms problem even though the body gets broken into two packets
<headius[m]> may be because it closes connection after the second part of response
<enebo[m]> if QUICKACK is a client-side setting then realistically I doubt we can influence this mistake in evaluations of Rubies either
<headius[m]> there's some questions remaining but the bottom line is that we packetize the response differently and it just so happens to trigger this delay problem
<headius[m]> right, it's a client-side socket option that appears to get reset to default very easily
<enebo[m]> I guess in our talk we can lightly cover the dangers of using too tiny of a bench as well as point out this issue
<headius[m]> my first patch set it right after connection establishment and that didn't work
<enebo[m]> With a larger bench showing us perform reasonably it should be easier to make the point
<headius[m]> which sent me down this rat's nest of tcpdump
<headius[m]> I moved the option to immediately before request write and that did it
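A minimal sketch of that client-side workaround, assuming a Linux Ruby where Socket::TCP_QUICKACK is defined; the host, port, and request are placeholders. The point is that the option has to be re-armed before each write, since the kernel quietly resets it to its default:

    require "socket"

    sock = TCPSocket.new("127.0.0.1", 9292)   # placeholder host/port
    5.times do
      # re-enable QUICKACK immediately before the write; setting it once
      # right after connect was not enough, as described above
      sock.setsockopt(Socket::IPPROTO_TCP, Socket::TCP_QUICKACK, 1)
      sock.write("GET / HTTP/1.1\r\nHost: localhost\r\n\r\n")
      sock.readpartial(65_536)                # read (part of) the response
    end
    sock.close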
<enebo[m]> it is troubling that people make decisions running really silly benchmarks. I do get it too, but it is still troubling
<enebo[m]> of course unless someone wants to make a scalable date service :)
<headius[m]> from my research it appears this is a problem on Linux and on Windows, though on Windows it appears there's a way to disable the delay globally
<headius[m]> Linux does not normally have a way unless you use a kernel patch, which is apparently associated with realtime...sorta makes sense; if you want realtime behavior you don't want randomly delayed ACKs
<headius[m]> well this would certainly affect, say, an NTP server
<enebo[m]> headius: half serious question...what issues would prevent us going full native for sockets? ssl?
<headius[m]> primarily it's the many socket structs
<enebo[m]> I guess there is some variety there
<headius[m]> we could really just adopt the FFI-based socket lib that was written for rbx and which TR has been improving, but in both of their cases they have a build time step on each system to generate the struct layouts
<headius[m]> in theory that library is at least as good as what we have and probably a lot better in most ways
<enebo[m]> in most cases it feels like a difference in behavior more than different struct layouts, but "it is a big undertaking" I guess is the main answer
<enebo[m]> meh on ffi sockets
<headius[m]> There's a possible out for us too, though... netty ships a pure-native socket library for Windows and Linux that has all the options and flags and such
<enebo[m]> aha yeah I wondered about netty since it is the god of servers
<headius[m]> of course that ties us to a pre-built binary
<headius[m]> yeah they basically figured all this out but never blogged it
<headius[m]> as far as I could find
<enebo[m]> and will limit what platforms we are on without contributing to that library
<headius[m]> correct
<enebo[m]> At this point I would say education on client benching may be best short term bet. Perhaps we can have some benchmarking page on our wiki
<headius[m]> of course we can maintain a dual impl that's JDK sockets on unsupported platforms, but yeah
<enebo[m]> because 0.04s is not a massive penalty for a real app and we are unsure whether MRI or JRuby will really pay this in ordinary responses
<headius[m]> it's also possible that we could pregenerate socket structs for all the platforms we need and it would be fine
<headius[m]> I mean, it's a finite set
<enebo[m]> the lazy are never going to follow the full logic tree of that
<enebo[m]> I am wrestling with the effort vs the actual problem
<headius[m]> yeah it's a bit frustrating
<enebo[m]> If it was low effort I feel this conversation would be worth more
<headius[m]> we aren't really broken here
<headius[m]> we're just not obeying some low-level, Linux-specific packetization flags
<headius[m]> really it's JDK that's broken if these flags are really recommended or required, because we're just calling JDK's sockets
<enebo[m]> as it stands it really hurts bare-metal benching, which is misused by some to evaluate JRuby, but realistically this is probably not a real problem for nearly all conventional uses
<headius[m]> perhaps...I don't have any idea how big a problem this might be in reality
<enebo[m]> you also bring up a reasonable point. JDK {n} may end up fixing this at some point too
<headius[m]> it's not like it adds 40ms across the board, or even reliably
<enebo[m]> well we need to probably look at tcpdump and see how common that delay is
<enebo[m]> you actually know enough now to be able to observe whether bigger stuff even has the issue at least
<headius[m]> at worst it's 40ms minus whatever time is consumed between the request and the final ack of the response
<headius[m]> so request handling eats part of that 40
<enebo[m]> perhaps we need to do more analysis before we evaluate solutions
<enebo[m]> yeah
<headius[m]> I think the problem here is that this intermediate packet doesn't look like it needs an immediate ack, so client doesn't send one
<headius[m]> I put both our packets and MRI's packets (minus the final part of the body) here: https://gist.github.com/headius/66abd0cb5142240026dc6562919b039a
<headius[m]> actually nevermind...I see now the MRI response packet is 25068, which is 25k for the body and 68 for the headers
<headius[m]> so MRI manages to respond in exactly one packet
<headius[m]> the delayed ack doesn't matter because it's done
<enebo[m]> whoa how big are packets these days?
<headius[m]> yeah seems big
<enebo[m]> has it always been 65k?
<headius[m]> let me grab MRI's final ack
<headius[m]> I suspect it's sent immediately because wrk sends the next request with it
<headius[m]> so the pipeline flows
<enebo[m]> aha ok coming back
<enebo[m]> a TCP packet can be 65k but the Ethernet frame MTU is like 1500
<lopex> isnt this tcpi_snd_mss and tcpi_rcv_mss ?
<enebo[m]> so lower physical layers are breaking that up and TCP does a bunch of hijinx to assemble that into a "packet"
<lopex> mtu is on eth only right ?
<headius[m]> yup that's it
<enebo[m]> including retransmission or reordering if one gets sent out of order
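One way to see the negotiated segment size from Ruby, as a sketch (host and port are placeholders); on loopback the MSS is far larger than the ~1460 bytes implied by a 1500-byte Ethernet MTU, which is how a 25k response can go out as "one packet" locally:

    require "socket"

    sock = TCPSocket.new("127.0.0.1", 9292)   # placeholder host/port
    mss = sock.getsockopt(Socket::IPPROTO_TCP, Socket::TCP_MAXSEG).int
    puts "negotiated MSS: #{mss} bytes"       # ~64k on loopback, ~1460 on ethernet
    sock.close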
<headius[m]> so here's the tail end packet info from MRI response followed by the ack from client
* headius[m] sent a long message: < >
<headius[m]> it sends the next request with the ack
<headius[m]> in the JRuby case there's no data to send with the ack so it delays it
<headius[m]> and server sits waiting for it
<enebo[m]> we need to send "
<enebo[m]> :)
<headius[m]> yeah I don't remember what you can send from client in the middle of the response
<headius[m]> in any case it's not "broken", it's just unfortunate
<enebo[m]> This is interesting too that you are using localhost
<headius[m]> puma could be modified to do this better, possibly...it does do separate writes for headers and body
<enebo[m]> I mean I get what is happening but on a real network multiple packets have their own latency as well
<headius[m]> presumably that's why TCP_CORK is used, so those get sent as a single packet
<enebo[m]> so likely this will not be as big a deal in a real app
<headius[m]> hey you know what, I'll try a 70k response on MRI
<enebo[m]> I guess MRI doing it in one packet is the big win here
<headius[m]> that should break response into two packets and have the same problem if theories line up
<enebo[m]> yeah makes sense
<headius[m]> interesting
<headius[m]> no delay problem...but MRI does not wait to send the second part of the response
<headius[m]> so the ack comes with the next request anyway, after two packets from server
<headius[m]> that may be a clue...it's possible TCP_NODELAY is making this work for them
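A sketch of the server-side flag being suspected here, assuming Ruby's Socket::TCP_NODELAY (one of the socket flags Puma sets); the port and oversized body are placeholders. Disabling Nagle's algorithm lets the kernel push writes immediately instead of waiting on the outstanding ACK:

    require "socket"

    server = TCPServer.new(9292)              # placeholder port
    client = server.accept
    client.setsockopt(Socket::IPPROTO_TCP, Socket::TCP_NODELAY, true)
    body = "x" * 70_000                       # large enough to span multiple packets
    client.write("HTTP/1.1 200 OK\r\nContent-Length: #{body.bytesize}\r\n\r\n")
    client.write(body)                        # pushed immediately, no Nagle delay
    client.close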
cpuguy83 has joined #jruby
<headius[m]> ahah but we are ok too now
<enebo[m]> yay
<headius[m]> I'll revert my socket patches to JRuby and try it
<headius[m]> yeah it's working
<headius[m]> heh, now I want to make sure it's still broken
Antiarc has quit [Remote host closed the connection]
<headius[m]> yeah back down to 25k and it breaks
<headius[m]> and restore QUICKACK and it's fixed
<headius[m]> so I'm not entirely clear why the larger response means server doesn't wait for ack
Antiarc has joined #jruby
<enebo[m]> probably just hopes for the best with possible retransmission later
<headius[m]> we do still break the headers off as a separate initial packet
<headius[m]> but unlike the 25k case, the 75k case starts right in on the body
<enebo[m]> I guess though if in both of these cases the server wants to send two packets, why would it wait in one case
<headius[m]> "we"
<headius[m]> I mean the kernel
<headius[m]> could be that the kernel sees the next packet is a big sucker and just goes for it
<enebo[m]> syscall is likely a big difference for MRI
<headius[m]> yeah I guess that's what you said
<headius[m]> so like it knows there's more packets to come and has to dump the buffer
<enebo[m]> I can only ponder why a larger amount of data would just decide to push forward
<headius[m]> right
<headius[m]> that gets into deeper magic of how it's deciding when to send packets
<enebo[m]> but TCP does have to potentially retransmit lost packets too so it is not too clear to me why this is the case
<headius[m]> in any case it's not something I can see in tcpdump and I suspect I wouldn't see puma actually waiting
<enebo[m]> yeah
<headius[m]> it's merrily tossing stuff onto the wire and it's just that stars align when the body fits entirely in a packet
<headius[m]> kernel holds it for ack
enebo has quit [Ping timeout: 245 seconds]
subbu is now known as subbu|lunch
hosiawak has joined #jruby
<hosiawak> headius[m]: I managed to figure out deployment on Puma, this real life Rails app does 2x better in terms of reqs/s on JRuby than on MRI 2.6 (invokedynamic + Graal VM), it also uses 1.2 GB as opposed to 2.5 GB on MRI and runs all the bg processes (delayed job, sidekiq, mails etc.) in the same process using Quartz. So overall I'm very pleased with JRuby. Thanks for your great work :)
cpuguy83 has quit [Remote host closed the connection]
<headius[m]> Oh, you know I was going to ask if you really needed to use a war file...we definitely recommend deploying on Puma if you can because the Java server model is kind of a dying practice
<headius[m]> And those numbers look excellent! Maybe we can get you to do a guest blog post at some point
subbu|lunch is now known as subbu
<lopex[m]> headius: did you see the message above?
<headius[m]> MTU?
cpuguy83 has joined #jruby
<lopex[m]> the one about jruby/puma/graal
enebo has joined #jruby
<headius[m]> I did, that's what I was responding to
<headius[m]> hosiawak fwiw we have not seen GraalVM perform better than openjdk on any non-trivial jruby application
<headius[m]> If you haven't tried it already, I would recommend testing openjdk. If your app is actually faster on Graal VM it would be a first
<enebo[m]> hosiawak: Also if you are using Java 9+ be sure to specify parallel GC: -J-XX:+UseParallelGC
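For reference, a hypothetical invocation passing that flag through to the JVM (the Rack config path is a placeholder):

    jruby -J-XX:+UseParallelGC -S puma config.ru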
<enebo[m]> lopex: MTU is ethernet only I believe
<lopex> enebo[m]: yeah MSS is the tcp one right ?
<enebo[m]> lopex: you are dredging info I used to enjoy 30 years ago :P
<enebo[m]> lopex: possibly...I wonder if I still have that Stevens network book
<enebo[m]> I believe I have some ancient TCP/IP multi-volume thing too
<enebo[m]> yeah MSS is the TCP one
<lopex> yeah, I was just asking
<enebo[m]> Maximum Segment Size
<enebo[m]> Maximum Transmission Unit
<lopex> I know
<enebo[m]> Synonyms Suck Folks (SSF)
<lopex> was just wondering about that packet size and mss
<lopex> since it's on the tcp socket right ?
<enebo[m]> yeah mss will be the limiting factor I guess?
<enebo[m]> Part of me is weirded out this design has held up for so long
<enebo[m]> I mean I know things have changed here and there but overall the main bits have held up
<enebo[m]> And this was all designed around potentially adding other protocols
<enebo[m]> Is it brilliant or just too incumbent to ever change?
<lopex> like that bgp thingy ?
<lopex> I read it's even worse
<enebo> BGP is on top of transport though right?
<lopex> you tell me
<lopex> it would make sense
<enebo> NO YOU TELL ME :)
<headius[m]> I still have my Stevens networking book
<headius[m]> no idea if it's still useful though
drbobbeaty has quit [Quit: My MacBook Pro has gone to sleep. ZZZzzz…]
hosiawak has quit [Ping timeout: 240 seconds]
cpuguy83 has quit [Remote host closed the connection]
rtyler has joined #jruby
<rtyler> headius[m]: this may be relevant to your interests https://twitter.com/damageboy/status/1194751035136450560
<headius[m]> oh fun
cpuguy83 has joined #jruby