KarolBucekGitter has quit [Ping timeout: 240 seconds]
KarolBucekGitter has joined #jruby
rg[m] has quit [Ping timeout: 246 seconds]
rdubya[m] has quit [Ping timeout: 246 seconds]
kares[m] has quit [Ping timeout: 246 seconds]
TimGitter[m]1 has quit [Ping timeout: 246 seconds]
sintel[m] has quit [Ping timeout: 246 seconds]
venkatkms[m] has quit [Ping timeout: 246 seconds]
i8her8oat[m] has quit [Ping timeout: 246 seconds]
multislacker[m] has quit [Ping timeout: 246 seconds]
smarks[m] has quit [Ping timeout: 246 seconds]
elisabeth[m] has quit [Ping timeout: 246 seconds]
rg[m] has joined #jruby
smarks[m] has joined #jruby
multislacker[m] has joined #jruby
i8her8oat[m] has joined #jruby
kares[m] has joined #jruby
TimGitter[m]1 has joined #jruby
rdubya[m] has joined #jruby
venkatkms[m] has joined #jruby
sintel[m] has joined #jruby
elisabeth[m] has joined #jruby
ur5us has quit [Ping timeout: 256 seconds]
ur5us has joined #jruby
jmalves has joined #jruby
nirvdrum has quit [Ping timeout: 260 seconds]
jmalves has quit [Ping timeout: 246 seconds]
bga57 has quit [Ping timeout: 272 seconds]
bga57 has joined #jruby
bga57 has quit [Ping timeout: 272 seconds]
bga57 has joined #jruby
_whitelogger has joined #jruby
bga57 has quit [Ping timeout: 272 seconds]
bga57 has joined #jruby
Antiarc has quit [Ping timeout: 265 seconds]
Antiarc has joined #jruby
ur5us has quit [Remote host closed the connection]
ur5us has joined #jruby
ur5us has quit [Ping timeout: 246 seconds]
ur5us has joined #jruby
snickers has joined #jruby
snickers has quit [Ping timeout: 260 seconds]
ur5us has quit [Ping timeout: 246 seconds]
drbobbeaty has joined #jruby
ur5us has joined #jruby
nirvdrum has joined #jruby
ur5us has quit [Ping timeout: 246 seconds]
<headius[m]> enebo: you around?
<enebo[m]> headius: sure
<headius[m]> OpenJDK is BUSTE*D
<headius[m]> wow that formatted nice
<enebo[m]> I saw your tweet
<enebo[m]> with Atilla
<headius[m]> yeah cl4es messaged me later and we paired on some code spelunking
<headius[m]> I think this is the root problem
<headius[m]> you can see the closing in URLClassLoader.close, and via the openConnection logic it eventually uses a JDK-global cache of JarFile instances
<headius[m]> so blammo, eventually it closes a JarFile that's in the cache
<headius[m]> this is the cause of all the exceptions in that bug, plus all our sporadic "zip file closed" errors in CI
<enebo[m]> Atilla mentioned making his own URLCL? Is that too big an idea?
<headius[m]> I'd much rather find one we could reuse because this isn't simple stuff
<headius[m]> I'm actually not hating the idea of not closing URLClassLoader
<headius[m]> I don't really get why it's doing all these gymnastics to cache JarFile instances anyway
<headius[m]> the right fix, however, is probably to avoid ever using ClassLoader.getResourceAsStream or URL.openStream directly
<headius[m]> this is really shockingly broken
<enebo[m]> we have a need to caching presumably because we do not require all files within a jar (like warbler or our kernel files)
<headius[m]> so there's two caches involved here
<headius[m]> inside JarResource there's a static reference to JarCache, which was contributed to us by a user
<headius[m]> that cache takes a jar path string and builds a cache of JarEntry that's faster to search than manually walking entries
<headius[m]> it's used anywhere we use JarResource.openWhatever
<headius[m]> that does not appear to be broken, at least not exactly
<headius[m]> the cache that's broken is internal to OpenJDK... whenever you request a resource from a jar using either a URL or a URLClassLoader, it eventually calls into a global cache mapping jar URLs to JarFile instances
<headius[m]> to reuse those JarFile for... reasons?
<headius[m]> but URLClassLoader also adds the jar files it encounters into a list of things to close
<headius[m]> I just don't understand how this hasn't been reported and fixed years ago
<enebo[m]> but there are in fact two separate things in a jar for us as well
<enebo[m]> classfiles which presumably somehow is the reason for some deep cache
<enebo[m]> and ordinary files (e.g. .rb)
<headius[m]> the cache we make doesn't get used for classloading
<headius[m]> it's only used for things like globbing files from a jar
<enebo[m]> I guess us reading all ordinary ones into memory and closing/not accessing from the jarfile is probably not a solution because it would be time consuming
<enebo[m]> oh yeah I got that
<headius[m]> I think classloader resources may at some level also be going through this jar file cache
<enebo[m]> I am just trying to think about why we actually hit the error and it is to access these resources and not actual classfiles
<headius[m]> yeah seems like all the errors I saw were loading ruby scripts from inside the jar
<enebo[m]> I am surprised there is no bug parade on this
<headius[m]> I think I saw a case where it was looking up a class, but it was opening the class as a resource to see if it was there
<headius[m]> so again a resource access
<enebo[m]> yeah if we could know a jar is loaded we could reload it with our own jar code and then all requiers would grab from that...seems far fetched
<headius[m]> the logic inside URLClassLoader and on down for opening these connections is just stupid complicated
<enebo[m]> I imagne it ends up being a stream at some point so not super useful...and I doubt we want to read the entire resource stream into memory
<enebo[m]> I mean a second time
<enebo[m]> and sequentially
<enebo[m]> even forget that this is about concurrent interaction
<enebo[m]> that will not even work because it might be closed by the time you try it
<headius[m]> yeah single-thread sequential access will also break
<headius[m]> it's totally broken
<headius[m]> like if you use two URLClassLoader to read the same jar, and you close them, one will break
<headius[m]> it's that simple
<enebo[m]> Too bad there is no concept of dynamic shading
<enebo[m]> Or not really shading but something which confuses the lowest level into thinking they are in fact different jars
<enebo[m]> like adding something to the url which is not stripped off via normalization
<headius[m]> we could copy it to a temp location :-)
<enebo[m]> possibly it is a good solution for warbler at least
<enebo[m]> we could try and generalize it but I don't know
<enebo[m]> So I guess though this is purely a problem with emebdding
<enebo[m]> I would think most warbler people would not start up n runtime instances in one runtime but startup one with n threads
elia has joined #jruby
elia has quit [Client Quit]
<headius[m]> warbler's a good example... I have never gotten a good picture of which servers, or which versions of those servers, unpack the war to a temp location
<headius[m]> but we've gotten a few reports recently about FFI-based libraries that have native lilbs
<headius[m]> they clearly don't work from within a jar
<headius[m]> but FFI would have to know the file URL it was given to open as a library is in a jar and needs to be unpacked, etc
<headius[m]> could try to load the file into shared memory somewhere and dlopen it as a pointer?
<headius[m]> madness
<headius[m]> anyway yeah the typical case when you control all the threads would be to use same instance
<headius[m]> but I think there are a lot of apps out there using JRuby as a scripting language for a larger app, so they separate it by runtime or ScriptingContainer
<enebo[m]> yeah so we see this in testing via multiple instances in same process and the people reporting are embedding use cases
<enebo[m]> Just thinking through impact of this mostly
<enebo[m]> So my comment about warbler maybe is not relevant
<enebo[m]> but if we could rename jars somehow on each script container load then it might eliminate this problem and create a new problem
<enebo[m]> since we would have the same jar possibly loaded hundreds of times
<headius[m]> yup exactly
<headius[m]> I never put all these cases together until this guy came up with a pocket reproduction
<enebo[m]> Seems like scripting container with jar resources is just a problem we might be stuck with until JVM actually fixes it
<enebo[m]> Our position will be to not store resources in a jar due to this I guess?
<enebo[m]> for some applications that just means exploding those resources to some runtime location
<enebo[m]> That sucks of course but the way you have painted this I am not seeing a lot of recourse
<enebo[m]> Another solution would be for apps to not load multiple containers but use them in one of our threaded modes
<enebo[m]> (as a suggested fix)
<headius[m]> well, all of jruby-complete is resources in a jar
<enebo[m]> that really only helps if you do not require isolation
<enebo[m]> yeah but if you only load it once it is fine right?
<enebo[m]> Or I guess it could still close
<enebo[m]> Do we know why it closes?
<headius[m]> did you see this one? https://github.com/jruby/jruby/pull/6269
<enebo[m]> finalization of our runtime makes it think it is done even though another comes in
<headius[m]> it is possible to work around this by always getting a URLConnection and turning off caching
<headius[m]> rather than using the shortcut methods
<headius[m]> it's just gross
<enebo[m]> ok but at least it is a workaround although I guess that will have an impact :)
<headius[m]> this plus a ClassLoader.getResourceAsString patch should fix it for all JRuby
<headius[m]> Stream
<headius[m]> yeah I have no idea
<headius[m]> hmmm
<headius[m]> I wonder when we started calling close
<headius[m]> URLClassLoader.close was added in Java 7
<enebo[m]> yeah is this a function on n concurrent opens/closes lifecycles from only us
<headius[m]> yeah
<enebo[m]> so no close would probably not have them ever close but would be a leak
<headius[m]> I also figured out in JDK code it is a concurrency bug
<enebo[m]> although not a leak in practice probably for most uses
<headius[m]> it does try to remove this JarFile from the cache, but there's no guarantee that someone else isn't using it
<enebo[m]> So perhaps a property added for this problem saying 'do not close jar resources' which we put in a wiki and they can determine whether that is really a leak for them
<headius[m]> oh a property for not using the jar file cache maybe?
<headius[m]> clearly this is going to be needed in enough places I will have to make a utility method
<enebo[m]> well I was just thinking if we never close the jar file in any way it will never close so we will never see the error
<enebo[m]> it is about finalization of the whole jar file right?
<headius[m]> yeah that's the problem though, the jar file gets registered in JRubyClassLoader and we explicitly close it on teardown
<enebo[m]> or individual resource contention within a jar?
<headius[m]> URLClassLoader closes these JarFile that it used, which it probably got from the cache, and it does it in such a way that someone might get a stale entry
<enebo[m]> oh I guess even if we did not close on teardown it may still think they need to close
<headius[m]> oh I think I get what you're asking... the other fix
<headius[m]> yeah I have no idea if we don't close URLClassLoader will it leak a ton of stuff?
<headius[m]> JarFile is a ZipFile which eventually holds a RandomAccessFile
<headius[m]> huh I just realized Java 9 removed a bunch of finalizers
<headius[m]> anyway I don't think RandomAccessFile will close itself on finalization
<headius[m]> it's just so broken every way I think about it
<headius[m]> I was thinking "oh well the jar file will just stay in the cache and be okay" but then why did they write this code to close jar file instances?
<enebo[m]> probably an optimization or thought of closing the corner case that you really are done with a jar and you do not want it anymore
<enebo[m]> seemingly though this is only an implementation problem which can get fixed
<enebo[m]> not for what most people are using today but the semantics of closing seems reasonable if done right
<headius[m]> that fails immediately
<enebo[m]> if you remove urlc.close() and do it alot I wonder how much growth occurs (or if it also fails)
<headius[m]> in theory it should use the same cached JarFile
<headius[m]> I'll try i
<headius[m]> it
<headius[m]> hmm mac Activity Monitor doesn't seem to show file descriptors
<headius[m]> java 5193 headius 10r REG 1,5 15440228 27842244 /Users/headius/projects/jruby/lib/jruby.jar
<headius[m]> only the one entry
<headius[m]> now the problem in JRuby though is that we also use URLClassLoader for nested jars, which get unpacked to unique temp locations
<headius[m]> ugh we could manually clean those up
<headius[m]> unique temp locations would leak pretty badly
<headius[m]> ugh I wonder if his example is unpacking jars for every loop
<headius[m]> what a can of worms
<enebo[m]> so if all distinct files always had the same location and no temp file stuff we could potentially just say anything loaded will never be unloaded (with a property perhaps so it can be disabled (or enabled))
<enebo[m]> time it !
<headius[m]> right
<headius[m]> that would be the worst outcome if it's all just filesystem locations
<enebo[m]> It is possible this is way faster also
<headius[m]> well we are clearly not benefiting from the JDK cache if they close it when we close the CL
<headius[m]> we do have in hand a list of the tempfile jars, so I think it should be possible to get a JarURLConnection that has the cached JarFile and kill it
<headius[m]> we'll know it's ours because it's a tempfile
<headius[m]> this is relying on some super deep magic though... opening a new connection assuming we are going to get the cached JarFile to close it
<headius[m]> 😬
<enebo[m]> heh
<enebo[m]> So unpacking to tempfiles happend for jffi but what else does that?
<enebo[m]> If we had an extended URLClassL that was TempURLClassL then assuming we do not reuse temp loaded between instances they could just close
<headius[m]> nested jars
<headius[m]> like all the jars packed inside jruby-complete.jar
<enebo[m]> or do we not control how those get loaded out of a jar
<headius[m]> hmm
<headius[m]> ugh it's possible
<headius[m]> we'd have a separate TempCL that wraps JRubyCL
<headius[m]> TempCL would actually have to become JRubyCL since that's our top-level classloader on every API
<enebo[m]> top-level jars we never close (maybe on property) and for nested exploded to temp dirs we use different CL and always close?
<headius[m]> this would be hidden inside it I suppose
<headius[m]> internally delegate so that only temp jars get closed
<enebo[m]> yeah
<headius[m]> that's not terrible
<enebo[m]> I am just brainstorming of course but it sounds like it might work :)
<enebo[m]> Still sounds like we need top-level urlclassloaders potentialy not close (not sure about defaulting that on or off by default)
<enebo[m]> defaulting it on means potentially more memory someone might notice and report an issue. Leaving close on will probably generate an issue report and then they will be told/discover they can use the property
<enebo[m]> I am all about not getting issue reports :)
<headius[m]> I think I have a patch for your idea
<headius[m]> I seems to work with jruby.jar but I will check if it leaks with jruby-complete
<headius[m]> hmm still blew up
<enebo[m]> hmm
<headius[m]> I'm going to take a breather... been messing with this thing for two days
<headius[m]> If you see anything obvious in that patch give it a try... the example class and command line are in the bug
<enebo[m]> ok
Antiarc has quit [Ping timeout: 246 seconds]
subbu is now known as subbu|lunch
Antiarc has joined #jruby
subbu|lunch is now known as subbu
ebarrett has quit [Quit: WeeChat 2.8]
ur5us has joined #jruby
JohnPhillips3141 has joined #jruby
nirvdrum has quit [Ping timeout: 272 seconds]
ur5us has quit [Ping timeout: 272 seconds]