#jruby on 2019-12-17 — irc logs at freenode.irclog.whitequark.org

2019-08-12 18:53 ChanServ changed the topic of #jruby to: Get 9.2.8.0! http://jruby.org/ | http://wiki.jruby.org | http://logs.jruby.org/jruby/ | http://bugs.jruby.org | Paste at http://gist.github.com

00:05 nirvdrum has quit [Ping timeout: 248 seconds]

00:25 ur5us has quit [Ping timeout: 248 seconds]

00:45 ur5us has joined #jruby

00:53 nirvdrum has joined #jruby

01:32 lucasb has quit [Quit: Connection closed for inactivity]

05:00 nirvdrum has quit [Ping timeout: 268 seconds]

05:51 ur5us has quit [Ping timeout: 265 seconds]

06:12 nirvdrum has joined #jruby

06:29 ur5us has joined #jruby

07:16 nirvdrum has quit [Ping timeout: 248 seconds]

08:13 rusk has joined #jruby

08:29 ur5us has quit [Ping timeout: 248 seconds]

09:00 nirvdrum has joined #jruby

09:04 dopplerg- has joined #jruby

09:06 havenwood_ has joined #jruby

09:06 adam12_ has joined #jruby

09:09 _Caerus has joined #jruby

09:10 Caerus has quit [*.net *.split]

09:10 dopplergange has quit [*.net *.split]

09:10 havenwood has quit [*.net *.split]

09:10 adam12 has quit [*.net *.split]

09:10 havenwood_ is now known as havenwood

09:10 havenwood has joined #jruby

09:10 havenwood has quit [Changing host]

09:30 KeyJoo has joined #jruby

09:31 ur5us has joined #jruby

10:08 ur5us has quit [Ping timeout: 248 seconds]

12:15 shellac has joined #jruby

12:15 shellac has quit [Client Quit]

14:06 lucasb has joined #jruby

14:56 adam12_ is now known as adam12

14:57 adam12 is now known as Guest95748

14:58 Guest95748 is now known as adam12

16:03 xardion has quit [Remote host closed the connection]

16:08 xardion has joined #jruby

17:34 rusk has quit [Remote host closed the connection]

17:49 KeyJoo has quit [Quit: KeyJoo]

18:29 <headius[m]> subbu: hey I am working on some optimizations for blocks and have a question about flags and compiler passes

18:29 <headius[m]> there's an instruction for blocks to load the frame's closure, so it can be yielded to... LoadFrameClosureInstr

18:29 <headius[m]> this is added conservatively to all closures by the builder, but if we never yield it becomes dead code and goes away

18:30 <headius[m]> however its flag changes stick around... it forces a frame when no frame load is actually needed

18:30 <headius[m]> so all blocks are pushing and popping a frame and twiddling its visibility even though most don't need it

18:31 <headius[m]> the lifecycle of flags is confusing...once we set them they stay set, but various paths cause them to be calculated very early

18:32 <headius[m]> seems there needs to be a reset at some point, or perhaps every point, during the optimization passes

18:33 <headius[m]> naively resetting them after DCE or before ACP seems to damage non-local break though

18:40 travis-ci has joined #jruby

18:40 <travis-ci> jruby/jruby (reduce_block_overhead:2b31906 by Charles Oliver Nutter): The build passed. https://travis-ci.org/jruby/jruby/builds/626326525 [177 min 23 sec]

18:40 travis-ci has left #jruby [#jruby]

18:43 <subbu> oh man ... that seems like a lifetime ago ... i am in india right now .. let me try to page in some of it tomorrow and see what i can find unless enebo[m] knows the answer readily.

19:06 <enebo[m]> The naive reset not working answer is that the instrs determine flags but also the passes themselves will set flags. So it is a combination of things. I think perhaps ACP itself should determine if we can eliminate it at that point.

19:08 <enebo[m]> I did no work on ACP itself but seemingly we just need to know which properties satisfy when a frame is not needed and then not add it during ACP OR a pass later which examines the state of things and removes conservative assumptions.

19:45 <headius[m]> So you are saying that perhaps make ACP walk instructions again?

19:46 <enebo[m]> headius: possibly...if ACP removes something which will change it so we know we don't need to push a frame

19:46 <enebo[m]> otherwise we can not add it in the first place if we know all we need to

19:47 <headius[m]> Currently ACP just asks for flags but those flags reflect instructions that have been removed

19:48 <headius[m]> The load of the frame closure is removed properly, but it has already tainted the flags

19:48 <headius[m]> Clearly there is a disconnect between the life cycle of the flags and the life cycle of the instructions

19:48 <headius[m]> There may be many other things not optimizing properly because these early flags leak into the optimized IR later on

19:48 <enebo[m]> perhaps whatever has removed whatever was holding it back should be removing the flag then...but flags were added more conservatively initially

19:50 <headius[m]> I thought about trying that, but if two instructions trigger a flag, removing one of those instructions should not remove the flag

19:51 <headius[m]> So it needs to be a full recalculate after instructions get modified, at least for flags that are driven by the instructions present

19:51 <enebo[m]> yeah so the property would be not having both and would require a full scan

19:51 <enebo[m]> but with that said if you remove one of them you can then fire off a full scan afterwards or none if nothing was removed
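
The point made above (two instructions can trigger the same flag, so removing one must not blindly clear it, and a full recalculation is only needed when something was actually removed) can be sketched as follows. This is a minimal illustration, not JRuby code: `SketchFlag`, `SketchInstr`, `SketchScope`, and `removeDead` are hypothetical names, and the flag each instruction contributes is an assumption taken from the discussion.

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical stand-ins for illustration; these are not JRuby's IRFlags, Instr, or IRScope.
enum SketchFlag { REQUIRES_BLOCK, REQUIRES_SELF }

final class SketchInstr {
    final String name;
    final EnumSet<SketchFlag> flags; // flags this instruction forces on its scope

    SketchInstr(String name, EnumSet<SketchFlag> flags) {
        this.name = name;
        this.flags = flags;
    }
}

final class SketchScope {
    final List<SketchInstr> instrs = new ArrayList<>();
    private final EnumSet<SketchFlag> flags = EnumSet.noneOf(SketchFlag.class);

    // Full recalculation: start from empty and union what the surviving instrs need.
    void computeFlags() {
        flags.clear();
        for (SketchInstr instr : instrs) {
            flags.addAll(instr.flags);
        }
    }

    // DCE-like step: rescan only if something was actually removed.
    boolean removeDead(Predicate<SketchInstr> isDead) {
        boolean removed = instrs.removeIf(isDead);
        if (removed) {
            computeFlags();
        }
        return removed;
    }

    boolean has(SketchFlag flag) {
        return flags.contains(flag);
    }
}

final class FlagRescanDemo {
    public static void main(String[] args) {
        SketchScope scope = new SketchScope();
        scope.instrs.add(new SketchInstr("load_frame_closure", EnumSet.of(SketchFlag.REQUIRES_BLOCK)));
        scope.instrs.add(new SketchInstr("proc_new", EnumSet.of(SketchFlag.REQUIRES_BLOCK)));
        scope.computeFlags();

        // Removing one of two contributors keeps the flag set.
        scope.removeDead(i -> i.name.equals("load_frame_closure"));
        System.out.println(scope.has(SketchFlag.REQUIRES_BLOCK));

        // Removing the last contributor clears it on the rescan.
        scope.removeDead(i -> i.name.equals("proc_new"));
        System.out.println(scope.has(SketchFlag.REQUIRES_BLOCK));
    }
}
```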

19:52 <headius[m]> The good news is that if we fix this, that unnecessary frame push and pop will just go away

19:52 <enebo[m]> the utility of what flags represent is valuable but maintaining them is tough. Calculating them on demand would cost a tremendous amount of time

19:53 <headius[m]> With that gone, only a few tweaks are required to make leaf closures have almost zero dispatch cost

19:53 <enebo[m]> So let me ask a question...is it ACP which potentially removes some instrs or is it conceivably others as well?

19:54 <headius[m]> ACP only adds instructions

19:54 <enebo[m]> ok but isn't ACP the pass that adds the frame push?

19:54 <headius[m]> It examines flags to see if a frame or scope are needed, adds those instructions, and then marks the scope as having call protocol

19:55 <enebo[m]> so possibly a good "today" solution is a prescan in ACP to see whether any instrs which would force a frame are present, and update flags. Later determine whether that can be pushed further back into other pass of building itself

19:55 <headius[m]> In this case, the load of the frame closure has set that the frame is required, so ACP adds it

19:56 <headius[m]> Even though that load was removed

19:56 <enebo[m]> I believe this is mostly fallout from just being conservative with the intention to shave this away but I guess we need to know what allows this

19:57 <enebo[m]> what removes that load?

19:57 <headius[m]> I could make ACP rescan instructions, but if we have logic that is attempting to force a frame or scope based on information other than the instructions it won't see that

19:57 <headius[m]> I believe dce

19:57 <enebo[m]> ah DCE should probably have a post scan

19:58 <headius[m]> that's what I tried...reset flags and rescan, but it loses something needed for break

19:58 <enebo[m]> it would be DCE responsibility to know it removed something which would change the scopes flags

19:58 <enebo[m]> oh don't rescan

19:58 <enebo[m]> just scan and remove flag(s) which were results of removed instrs

19:58 <headius[m]> it works otherwise and this unnecessary frame goes away

19:58 <headius[m]> yeah how?

19:59 <enebo[m]> which instrs exist which require this flag?

19:59 <headius[m]> this case is LoadFrameClosure

20:00 <enebo[m]> If that is only a single instr which reflects this then that would be enough to remove the flag

20:00 <enebo[m]> and it can even be done at the time the instr is marked dead

20:00 <enebo[m]> if it is a set of instrs then I guess you need to run over all passes if you see any in the set

20:00 <headius[m]> it's NEEDS_BLOCK

20:01 <headius[m]> also set by CallBase for calls with literal block

20:01 <headius[m]> seems only those two set it

20:01 <headius[m]> hmm

20:01 <headius[m]> actually I don't think CallBase should set it

20:02 <headius[m]> ahh I see

20:02 <enebo[m]> in LiveVariableNode.markDeadInstrs something in the while can remove it but now I am not sure if I am totally following along

20:03 <headius[m]> it sets it for Proc.new and for calls to method names that might access frame

20:03 <headius[m]> not for calls that take a block

20:03 <enebo[m]> so the flag specifically is NEEDS_BLOCK?

20:03 <headius[m]> yeah

20:04 <headius[m]> my issue with removing flags is that I don't know if there are other instructions remaining that still need it

20:04 <headius[m]> like if the LoadFrameClosure gets removed, but later on there's a Proc.new, the body still needs frame closure

20:05 <enebo[m]> yeah that is where I started this conversation...we need to know what needs the flag. For this case instr wise Proc.new call and whatever does load_frame_closure

20:06 <headius[m]> ok let me turn this around a bit

20:06 <enebo[m]> we also have REQUIRE_ALL_FRAME_{ two} subsets

20:06 <headius[m]> why doesn't a full reset and rescan work?

20:06 ur5us has joined #jruby

20:07 <headius[m]> I feel like if there are flags that can't be restored by a full rescan then there's a flaw in flags

20:07 <enebo[m]> I believe it is because the passes can observe something not inherent in a full scan BUT that maybe is not true

20:07 <enebo[m]> yeah I cannot fully answer that...it could just be a bug

20:07 <headius[m]> something clearly removes a flag needed for break in a block

20:07 <enebo[m]> we would need to audit all flag sets and make sure it is all just sourced from current set of instrs

20:07 <headius[m]> or rather, a rescan does not properly restore that flag

20:08 <headius[m]> yeah that's where I'm leaning now

20:08 <enebo[m]> I mean by default calculateflags just uses the method on the instr

20:08 <headius[m]> if there are flags that are not based on instrs, they should be a separate set

20:08 <enebo[m]> and some of the logic for what flags are needed is in understanding more globally whether a set of instrs means something

20:08 <headius[m]> like for forcing a frame or scope regardless of what instrs need

20:08 <headius[m]> but there shouldn't be many of those cases...ideally flags should always be rescannable

20:09 <headius[m]> the combination of scope state + instrs should always be enough to restore flags from empty

20:09 <enebo[m]> and passes

20:09 <headius[m]> I say not passes

20:10 <headius[m]> at least for flags that are based on what instructions are present

20:10 <headius[m]> passes are a transformation of scope + instrs

20:10 <enebo[m]> so we have 2 sets of data and one cannot be recalcd?

20:10 <enebo[m]> IRFlags specifically what is true about the scope

20:11 <enebo[m]> here is a question so I better get your head space on this...

20:12 <enebo[m]> where is REQUIRES_BLOCK?

20:12 <headius[m]> it only comes from instructions, so it should accurately be restored in a rescan

20:12 <headius[m]> something needed for break does not get restored from a rescan

20:13 <headius[m]> I'm saying that is a bug

20:14 <enebo[m]> well in the case of REQUIRES_BLOCK it sounds like calculateFlags on instr can just set REQUIRES_BLOCK if there is the presence of Proc.new OR load_frame_closure so I agree here

20:14 <enebo[m]> I would like to know when this does not work

20:14 <headius[m]> so you can try this and see the breakage I see

20:15 <headius[m]> in DCE after it runs, remove FLAGS_CALCULATED from the scope's flags and rescan

20:15 <enebo[m]> I have reset on occasion and seen breakage but I think some of this may be stuff like reuse_parent_dynscope being removed

20:15 <headius[m]> then try to run rake tests

20:15 <enebo[m]> some of the flags are calculated in a more complicated way than just is there an instr present

20:15 <headius[m]> yeah so that's a good example of a flag that should not be resettable this way

20:16 <headius[m]> I wouldn't say more complicated, I'd say it's based on things other than instrs

20:16 <headius[m]> so I think the issue is that we're overloading flags to be both scope state and information about the instructions

20:16 <enebo[m]> ok so I think we both agree there is some set of completely rederivable flags and in fact they are a virtue in that we can simply rescan once

20:16 <headius[m]> yes

20:16 <enebo[m]> instr.computeScopeFlags() on each one

20:17 <enebo[m]> There is a secondary set of state which depends on whole scope analysis which is an outcome of passes

20:17 <headius[m]> yeah this makes more sense to me now

20:17 <enebo[m]> whole scope analysis which may even include some child scopes

20:18 <enebo[m]> we really have two solutions to consider here

20:18 <enebo[m]> we could separate these or we could delineate the type and make a rescan subset

20:18 <headius[m]> I feel like separating would be better

20:19 <headius[m]> the entire purpose of caching the instr flags is so we don't have to rescan them, but we should be able to rebuild that subset entirely from instrs without wiping out flags that are derived elsewhere

20:19 <headius[m]> so computeInstrFlags should be a separate thing that can always flush and rescan
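
The separation being proposed here, with instruction-derived flags in one set that can always be flushed and rescanned and flags from other state in a set that a rescan never touches, might look roughly like this. It is a sketch under assumptions: `SplitScope`, `SplitFlag`, `force`, and the method bodies are hypothetical, and only the name `computeInstrFlags` comes from the conversation.

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.List;

// Hypothetical flag names for illustration; not JRuby's actual IRFlags enum.
enum SplitFlag { REQUIRES_BLOCK, REUSE_PARENT_DYNSCOPE }

final class SplitScope {
    // Each entry models the flags a single instruction needs.
    private final List<EnumSet<SplitFlag>> instrs = new ArrayList<>();

    // Cache of flags derived purely from the instructions currently present.
    private final EnumSet<SplitFlag> instrFlags = EnumSet.noneOf(SplitFlag.class);

    // Flags set by passes or scope state; never cleared by a rescan.
    private final EnumSet<SplitFlag> forcedFlags = EnumSet.noneOf(SplitFlag.class);

    void addInstr(EnumSet<SplitFlag> needs) {
        instrs.add(needs);
    }

    void removeInstr(int index) {
        instrs.remove(index);
    }

    void force(SplitFlag flag) {
        forcedFlags.add(flag);
    }

    // Flush and rescan only the instruction-derived subset.
    void computeInstrFlags() {
        instrFlags.clear();
        for (EnumSet<SplitFlag> needs : instrs) {
            instrFlags.addAll(needs);
        }
    }

    // A frame is needed if either source asks for the block.
    boolean needsFrame() {
        return instrFlags.contains(SplitFlag.REQUIRES_BLOCK)
            || forcedFlags.contains(SplitFlag.REQUIRES_BLOCK);
    }

    boolean isForced(SplitFlag flag) {
        return forcedFlags.contains(flag);
    }
}
```

The cost of this layout is the extra field per scope; the benefit is that a rescan cannot accidentally drop a flag that did not come from an instruction.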

20:20 <enebo[m]> I do really feel something to make this technical decision more orthogonal would be to make all accesses via isBlockRequired()

20:20 <enebo[m]> from a usage standpoint and probably a set otherwise

20:21 <headius[m]> I don't follow

20:21 <enebo[m]> this would isolate the implementation enough where the only consideration would be memory consumption and pain of keeping the actual enumset as a more messy impl

20:21 <enebo[m]> we should not be doing scope.getIRFlags() from passes

20:21 <enebo[m]> we are leaking impl

20:21 <headius[m]> yeah

20:21 <headius[m]> well it's not really doing that but in this case it's needsFrame()

20:21 <enebo[m]> we should properly scope.isBlockRequired { get it from wherever }

20:22 <headius[m]> needsFrame checks for a number of flags representing frame fields, and REQUIRES_BLOCK is still there even though the instr that added it is gone

20:22 <headius[m]> yeah

20:22 <enebo[m]> if we properly use predicates and setters then I don't care if we split them or not

20:22 <enebo[m]> my only followup on splitting is adding a whole field for the separation

20:23 <enebo[m]> leaving combined would make the impl more complicated but we would more or less restrict any uses of how we store it to IRScope and IRFlags itself so it would stay an impl detail
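
The alternative described above, keeping one combined set but hiding it behind predicates and setters so that passes never see how it is stored, could be sketched like this. Everything here is hypothetical (`MixedScope`, `MixedFlag`, `INSTR_DERIVED`, `rescan`); only `isBlockRequired` is a name used in the conversation, and which flags count as rederivable is an assumption.

```java
import java.util.EnumSet;
import java.util.List;

// Hypothetical flag names for illustration; not JRuby's actual IRFlags enum.
enum MixedFlag { REQUIRES_BLOCK, REUSE_PARENT_DYNSCOPE }

final class MixedScope {
    // Assumption: the subset of flags that can be rebuilt from instructions alone.
    private static final EnumSet<MixedFlag> INSTR_DERIVED =
        EnumSet.of(MixedFlag.REQUIRES_BLOCK);

    // Single combined store; callers never touch it directly.
    private final EnumSet<MixedFlag> flags = EnumSet.noneOf(MixedFlag.class);

    // Predicates are the only way passes read flag state.
    boolean isBlockRequired() {
        return flags.contains(MixedFlag.REQUIRES_BLOCK);
    }

    boolean reusesParentDynScope() {
        return flags.contains(MixedFlag.REUSE_PARENT_DYNSCOPE);
    }

    // Setter for a flag that comes from whole-scope analysis, not from instrs.
    void setReuseParentDynScope(boolean value) {
        if (value) {
            flags.add(MixedFlag.REUSE_PARENT_DYNSCOPE);
        } else {
            flags.remove(MixedFlag.REUSE_PARENT_DYNSCOPE);
        }
    }

    // Rescan clears and rebuilds only the rederivable subset.
    void rescan(List<EnumSet<MixedFlag>> perInstrFlags) {
        flags.removeAll(INSTR_DERIVED);
        for (EnumSet<MixedFlag> needs : perInstrFlags) {
            EnumSet<MixedFlag> derived = EnumSet.copyOf(needs);
            derived.retainAll(INSTR_DERIVED); // ignore anything instrs should not set
            flags.addAll(derived);
        }
    }
}
```

Because the storage is private, switching later to two separate fields would not change any caller.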

20:26 <enebo[m]> headius: so I think we did unwrap some mystery of flags today

20:28 <headius[m]> I will look into splitting for now, and see if I can separate flags derived from instructions from flags derived through other state

20:28 <enebo[m]> My other ponder is whether this should be happening more than a few places. rederiving every pass is possible but I think it is only important on flags determining whether we add something or remove something

20:28 <headius[m]> my thinking is that calculating if a frame is needed would look at both, so we can force frame and scope behavior in one set, and derive it in another

20:28 <enebo[m]> in passes which are about live variable state for example we probably don't look at any flags so resetting for them won't help

20:29 <enebo[m]> It sounds like it will definitely help in ACP

20:29 <headius[m]> For the no argument times block, once this is fixed there are two other items

20:30 <headius[m]> One is that eval type field, which I think needs to be made final and eval with a block should always make a new one

20:30 <headius[m]> So blocks that are not used in an eval don't constantly have to check and set this field

20:30 <headius[m]> The other one is argument processing which we talked about in Richmond

20:31 <headius[m]> In this case, as a normal no arg block, we should just remove the instruction altogether. It's only being inserted to check arity if the block is a lambda

20:32 <headius[m]> I pushed a branch that makes block type final, and if proc needs to change it, it clones the block

20:32 <headius[m]> Since the vast majority of blocks will always be normal, very few will ever be cloned. That commit appears to be green
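
The idea of a final block type that is changed by cloning rather than by mutation can be illustrated with a small immutable class. This is only a sketch of the general pattern; `SketchBlock`, `SketchBlockType`, and `withType` are hypothetical names and do not reflect the code on the branch mentioned above.

```java
// Hypothetical block kinds for illustration; not JRuby's Block.Type.
enum SketchBlockType { NORMAL, PROC, LAMBDA }

final class SketchBlock {
    private final String body;          // stands in for the compiled block body
    private final SketchBlockType type; // final: never mutated after construction

    SketchBlock(String body, SketchBlockType type) {
        this.body = body;
        this.type = type;
    }

    SketchBlockType type() {
        return type;
    }

    String body() {
        return body;
    }

    // Returns this block when the type already matches, otherwise a clone
    // carrying the new type. The original is left untouched.
    SketchBlock withType(SketchBlockType newType) {
        return newType == type ? this : new SketchBlock(body, newType);
    }
}
```

Since most blocks stay normal, the common path returns the same object and allocates nothing.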

20:33 <headius[m]> This gets back to old discussions about splitting up the blocks of different types, so rather than having a field indicating what block type or eval type it is, that is all handled in instructions

20:34 <headius[m]> Possibly by lazily recompiling as a lambda or eval block

20:34 <enebo[m]> yeah

20:34 <enebo[m]> That has been something I have spiked once or twice and ran into issues

20:34 <enebo[m]> but hey I got inlining seemingly green perhaps it is not as bad as I think

20:35 <enebo[m]> I do think one issue is the idea that one parent scope has to share two child blocks which represent one piece of code

20:35 <enebo[m]> although I don't think that was the problem I had

20:36 <enebo[m]> Personally I really feel making a new IRClosure for a second form should just work but the parent has to be conservative so both work

20:37 <enebo[m]> this is where I don't remember how many times we getChildClosures and set flags/state on the parent

20:39 <enebo[m]> IRScope.calculateClosureScopeFlags() has one issue with nonlocal returns since if it is a lambda it doesn't but if it is proc it does

20:40 <enebo[m]> so parent sort of has to set it to expecting them and I believe that largely just adds a BB which recv_exception and then does the nonlocalreturn helper

20:40 <enebo[m]> anyways that is just a wrinkle and should not affect the lion's share of code which is ordinary procs and not lambdas

20:40 <enebo[m]> also that BB will never get reached in actual code so it may just end up as code bloat

23:13 ur5us_ has joined #jruby

23:14 ur5us has quit [Read error: Connection reset by peer]

23:56 nirvdrum has quit [Ping timeout: 248 seconds]