sb0 changed the topic of #m-labs to: ARTIQ, Migen, MiSoC, Mixxeo & other M-Labs projects :: fka #milkymist :: Logs http://irclog.whitequark.org/m-labs
<sb0> there's this OSError unittest problem...
sb0 has quit [Quit: Leaving]
fengling has quit [Quit: WeeChat 1.4]
fengling has joined #m-labs
ylamarre has joined #m-labs
kuldeep has joined #m-labs
sandeepkr has joined #m-labs
ylamarre has quit [Quit: ylamarre]
sb0 has joined #m-labs
fengling has quit [Ping timeout: 240 seconds]
stekern has joined #m-labs
fengling has joined #m-labs
FabM has joined #m-labs
sb0 has quit [Quit: Leaving]
fengling has quit [Ping timeout: 240 seconds]
<whitequark> sb0: I've an opportunity to grab some silver plated bolts with a bunch of other fasteners I'm ordering for other stuff
<whitequark> which sizes did you need?
fengling has joined #m-labs
fengling has quit [Ping timeout: 240 seconds]
sb0 has joined #m-labs
<sb0> whitequark, M6 screws, but they need to have specific lengths that depend on the flange you connect to the cube
<whitequark> yeah, which?
<sb0> that's why I went for MoS2, because the length requirement *plus* the silver-plating requirement makes it annoying
<sb0> I'd have to measure them
<sb0> I don't have the figures anymore
<whitequark> ok.
<sb0> if they're cheap and easy to get, take an assortment of M6 of various lengths... might come in handy, even for regular flanges
<sb0> can you even use any M screw for conflat? don't they need to be high-strength as well?
<whitequark> nah
<whitequark> well, any stainless screw
<whitequark> stainless is already quite hard compared to cheap mild steel
<whitequark> https://imgur.com/FkblHjk which one should we get for the lab?
<sb0> the mid-sized one?
<whitequark> ok
<sb0> it should be large enough for things like kf25 clamps
<whitequark> maybe we should get the large one then
<sb0> we also have a bunch of small electronic parts
<whitequark> though... kf25 should be fine with either
<whitequark> electronic parts go in tiny plastic bags anyway
<whitequark> or you'll never fish them out
<whitequark> also, marking
<whitequark> so it's not that important for those
sb0 has quit [Quit: Leaving]
fengling has joined #m-labs
fengling has quit [Ping timeout: 240 seconds]
<cr1901_modern> sb0: You're using IPV6 in your test case test_ctlmgr.py, correct?
fengling has joined #m-labs
fengling has quit [Ping timeout: 240 seconds]
<cr1901_modern> sb0: This is the function that's failing: https://github.com/python/cpython/blob/3.5/Modules/overlapped.c#L1016-L1057 B/c it's a Windows error, it's either failing in Py_ConnectEx (which is an alias for a ConnectEx in WINAPI), or parse_address. I believe what's happening is that Python is sending a SOCKADDR_IN to ConnectEx by mistake when it should be sending a SOCKADDR_IN6.
<cr1901_modern> I can confirm that line 506 of asyncio\windows_events.py errors out immediately when passed a two-length address tuple, whereas if I use pdb to set a breakpoint at line 495 of asyncio\windows_events.py and extend the tuple to 4 elements ("::1", $PORT, 0, 0), the breakpoint is triggered multiple times before erroring out with OSError at the same location.
<cr1901_modern> ("[WinError 10022] An invalid argument was supplied" can be returned if Windows doesn't like the socket parameters.)
<cr1901_modern> sb0: Currently, I haven't found anywhere in our user code that lets us force Python to create a 4-argument tuple when running Overlapped_ConnectEx
fengling has joined #m-labs
fengling has quit [Ping timeout: 240 seconds]
fengling has joined #m-labs
fengling has quit [Ping timeout: 240 seconds]
ylamarre has joined #m-labs
sandeepkr has quit [Ping timeout: 240 seconds]
kuldeep has quit [Ping timeout: 240 seconds]
kuldeep has joined #m-labs
sandeepkr has joined #m-labs
kuldeep has quit [Max SendQ exceeded]
kuldeep has joined #m-labs
kuldeep has quit [Max SendQ exceeded]
kuldeep has joined #m-labs
fengling has joined #m-labs
sb0 has joined #m-labs
<sb0> cr1901_modern, are you certain that python simply calls it with the wrong object type?
<sb0> cr1901_modern, why does it have the correct type sometimes?
<sb0> and yes, the test uses ipv6
fengling has quit [Ping timeout: 240 seconds]
<whitequark> sb0: looks like the dashboard also fails with the same problem... should I look at it?
<cr1901_modern> sb0: No, I'm not certain that ConnectEx() gets the wrong object type. I am certain that the Python wrapper function (which is called Overlapped_ConnectEx) is passed a 2-length tuple in that test case. To be certain I'd somehow have to hook the call to ConnectEx or parse_address().
<cr1901_modern> That being said, if you look at parse_address, it's an if-else... and the 2-tuple that's passed in by the test case should be parsed correctly by PyArg_ParseTuple("sH", ...)
<sb0> whitequark, the dashboard cannot connect at all?
altker128 has joined #m-labs
<altker128> Hey guys.
<altker128> For the MiSoC project, is there a debugger? Looking at the SRC, I see some comments on "serial debugger"
<sb0> altker128, where do you see such comments?
<cr1901_modern> It's in the custom additions to LM32, IIRC
<altker128> cr1901_modern: Do you have any personal experience using it?
<cr1901_modern> No, unfortunately :(
<cr1901_modern> sb0: I have python running in a debugger... I'm trying to break on the ConnectEx() WINAPI call... naturally it's not working as expected
<sb0> cr1901_modern, I don't think you need any debugger. just print() whatever python passes to it
<sb0> it's probably very straightforward to reproduce, just create_connection to ::1
<cr1901_modern> sb0: The ConnectEx call isn't made from Python code; it's made from C code. So I want to break JUST before the error so I can know whether it's going to fail or not
<cr1901_modern> The C code decides based on tuple length whether to try calling ConnectEx with IPv6 or IPv4 params
<sb0> okay and do you think this C code is wrong
<sb0> does python pass it the correct tuple length or not
<sb0> ?
fengling has joined #m-labs
<cr1901_modern> Python passes it a 2-tuple; for IPv6, it expects a 4-tuple. I'm not sure why Python is deciding to send a 2-tuple
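A rough Python analogue of the length-based dispatch cr1901_modern describes, assuming the C code picks the address family purely from the tuple length. The helper name `pack_sockaddr` is hypothetical and `socket.inet_pton` stands in for the Windows address parsing; this is an illustration of the mismatch, not the code in overlapped.c.

```python
import socket


def pack_sockaddr(addr):
    """Pick the address family from the tuple length, then parse the host."""
    if len(addr) == 2:
        # 2-tuple: treated as IPv4 (host, port).
        host, port = addr
        family = socket.AF_INET
    elif len(addr) == 4:
        # 4-tuple: treated as IPv6 (host, port, flowinfo, scope_id).
        host, port, _flowinfo, _scope_id = addr
        family = socket.AF_INET6
    else:
        raise ValueError("expected a 2-tuple or a 4-tuple")
    # inet_pton raises OSError when host is not a valid literal for family.
    return family, socket.inet_pton(family, host), port
```

Under that assumption, `pack_sockaddr(("::1", 4242))` raises OSError because "::1" is parsed as an IPv4 literal, while `pack_sockaddr(("::1", 4242, 0, 0))` succeeds.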
<cr1901_modern> sb0: Didn't you just say that sometimes the code doesn't throw OSError?
<cr1901_modern> "(12:45:58 PM) sb0: cr1901_modern, why does it have the correct type sometimes?"
<sb0> apparently not
<sb0> it fails consistently
<sb0> I thought so because this is such an obvious problem
<sb0> but my opinion of python is perhaps still too high
<cr1901_modern> Kinda late to rewrite everything...
<cr1901_modern> My main gripe is that Windoze is giving a shitty vague error message
<sb0> yeah
<sb0> it fails all the time
<sb0> and python definitely fucked that up
<sb0> all you need to do is create ProactorEventLoop, set it as main event loop, and call open_connection with ::1 as first parameter
<sb0> the port doesn't even need to be open
<cr1901_modern> sb0: Oh, sorry, I didn't get far enough yet to create a minimal complete example :P
<sb0> IPv4 is not affected
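A minimal sketch of the reproduction sb0 describes, assuming Python 3.5 or later. The helpers `make_loop`, `unused_port`, and `try_connect` are hypothetical names introduced for illustration; the loop is passed to `run_until_complete` directly rather than installed with `asyncio.set_event_loop`, and a default loop is used off Windows, where ProactorEventLoop does not exist.

```python
import asyncio
import socket
import sys


def make_loop():
    """Return a ProactorEventLoop on Windows, a default loop elsewhere."""
    if sys.platform == "win32":
        return asyncio.ProactorEventLoop()
    return asyncio.new_event_loop()


def unused_port():
    """Ask the OS for a free port, then release it so nothing listens on it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))
        return s.getsockname()[1]


def try_connect(host, port, timeout=5.0):
    """Attempt a TCP connection; return the exception raised, or None."""
    loop = make_loop()

    async def attempt():
        reader, writer = await asyncio.wait_for(
            asyncio.open_connection(host, port), timeout)
        writer.close()

    try:
        loop.run_until_complete(attempt())
        return None
    except (OSError, asyncio.TimeoutError) as exc:
        return exc
    finally:
        loop.close()
```

Comparing `repr(try_connect("::1", unused_port()))` with `repr(try_connect("127.0.0.1", unused_port()))` on an affected Windows install should show whether the IPv6 case fails differently from a plain refused connection.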
fengling has quit [Ping timeout: 240 seconds]
<cr1901_modern> sb0: I didn't find many chances to overlapped.c since the version bump, so I'm guessing the code above is breaking. Haven't figured out where yet
<cr1901_modern> changes*
<sb0> bah the first thing you check is whether overlapped.c is called with correct arguments or not.
<sb0> just print() what it gives it
<sb0> the bug is trivial to reproduce and printing will work easily
<cr1901_modern> sb0: I also think it's a Python bug at this point... see linked comment https://github.com/python/cpython/blob/3.5/Lib/asyncio/base_events.py#L604-L607
<cr1901_modern> sb0: >>> loop.run_until_complete(loop.getaddrinfo("::1", 4242))
<cr1901_modern> [(<AddressFamily.AF_INET6: 23>, 0, 0, '', ('::1', 4242, 0, 0))]
<cr1901_modern> ^ To help further pinpoint the bug. getaddrinfo returns the correct tuple o.0;
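The same shape difference can be seen with the blocking `socket.getaddrinfo`, without any event loop. This is a small sketch; `numeric_sockaddr` is a hypothetical helper, and `AI_NUMERICHOST` is used so that no DNS lookup happens.

```python
import socket


def numeric_sockaddr(host, port):
    """Resolve a numeric IP literal and return (family, sockaddr)."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM,
                               flags=socket.AI_NUMERICHOST)
    family, _type, _proto, _canonname, sockaddr = infos[0]
    return family, sockaddr


for literal in ("127.0.0.1", "::1"):
    family, sockaddr = numeric_sockaddr(literal, 4242)
    # IPv4 yields (host, port); IPv6 yields (host, port, flowinfo, scope_id).
    print(family, len(sockaddr), sockaddr)
```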
<sb0> bah. is create_connection feeding that directly into overlapped.c or not?
<sb0> did you set the loop to proactor before calling that?
<altker128> Sorry to be off topic, are you guys using the MiSoC in Altera / Xilinx parts?
<sb0> yes, both
<sb0> some lattice as well
<altker128> What's the resource utilization like for all three?
<sb0> ~kLUT for a small SoC
<cr1901_modern> sb0: Hold on two seconds before I answer that
<altker128> sb0: But never used any debug capability?
<cr1901_modern> sb0: Yes, I set the proactor event loop. I just copied the four lines of your bug report snippet and changed the function called
<cr1901_modern> create_connection() calls _ensure_resolved(), which is a thin wrapper over loop.getaddrinfo()
<sb0> altker128, no
<sb0> but there are many bits of code you could integrate if you're interested in that
<cr1901_modern> the return value from _ensure_resolved() is destructured (host, family, cname, address, etc) and then passed into overlapped.c
<sb0> why don't you add a print() right before it calls overlapped.c?
<cr1901_modern> sb0: That's how I figured out in the first place that a 2-tuple was being passed :)
<cr1901_modern> (well I set a breakpoint using pdb, but same difference)
<cr1901_modern> https://github.com/python/cpython/blob/3.5/Lib/asyncio/base_events.py#L150-L161 This is what I'm focusing on right now... when I set a breakpoint on getaddrinfo, it doesn't trigger
<cr1901_modern> (Meaning _ensure_resolved isn't actually calling getaddrinfo, but another wrapper :/)
<cr1901_modern> To reiterate: open_connection => create_connection => overlapped.c (b/c the proactor in ProactorEventLoop on Windows is a wrapper over overlapped.c)
<cr1901_modern> sb0: Pretty sure the problem is this line: https://github.com/python/cpython/blob/3.5/Lib/asyncio/base_events.py#L142
<cr1901_modern> sb0: That's the problem line. Change "return af, type, proto, '', (host, port)" to "return af, type, proto, '', (host, port, 0, 0)", line 142 in asyncio/base_events.py. The OSError disappears
<cr1901_modern> (I get a different error, but... one thing at a time :P)
<sb0> doesn't that break v4?
<sb0> what is the new error?
<cr1901_modern> sb0: Of course it prob breaks v4 :P. I didn't make it foolproof yet
<cr1901_modern> New error is ConnectionRefusedError: [WinError 1225] The remote computer refused the network connection
<cr1901_modern> And a TimeoutError was raised on line 57
<sb0> the previous code already returns "af, type, proto, '', (host, port)"
<sb0> well, 3.5.1 was before this commit.
<cr1901_modern> sb0: Before this commit, getaddrinfo was guaranteed to run.
<cr1901_modern> Your linked commit tries to bypass getaddrinfo, and does so in a way that a 2-tuple is always returned, when getaddrinfo itself may return a 4-tuple
<sb0> okay thanks.
<cr1901_modern> The bug is that _ipaddr_info() doesn't return values that match getaddrinfo's return values in all possible scenarios
<sb0> I see
<cr1901_modern> Np... glad I could help. At least it's now known it's a legit Python bug
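A sketch of what a family-aware fast path could look like, assuming the goal is only to match the sockaddr shapes getaddrinfo returns for IP literals. `ipaddr_info` is a hypothetical stand-in, not CPython's `_ipaddr_info()` and not the upstream fix; flowinfo and scope ID are hard-coded to 0 here, which is an assumption.

```python
import ipaddress
import socket


def ipaddr_info(host, port, type=socket.SOCK_STREAM,
                proto=socket.IPPROTO_TCP):
    """Return a getaddrinfo-style 5-tuple for an IP literal, else None."""
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal: the caller should fall back to getaddrinfo.
        return None
    if ip.version == 6:
        # IPv6 sockaddr is a 4-tuple: (host, port, flowinfo, scope_id).
        return socket.AF_INET6, type, proto, "", (host, port, 0, 0)
    # IPv4 sockaddr stays a 2-tuple: (host, port).
    return socket.AF_INET, type, proto, "", (host, port)
```

Unlike the one-line change tried above, this keeps the 2-tuple for IPv4, so it should not break v4 the way sb0 was worried about.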
fengling has joined #m-labs
FabM has quit [Quit: ChatZilla 0.9.92 [Firefox 47.0/20160604131506]]
fengling has quit [Ping timeout: 240 seconds]
fengling has joined #m-labs
fengling has quit [Ping timeout: 240 seconds]
<whitequark> rjo: in https://github.com/m-labs/artiq/issues/512#issuecomment-232168447 by "trigger" do you mean on our buildbot?
<whitequark> because that shouldn't be possible in the web interface, but should be possible in IRC
<whitequark> (unauthenticated, that is)
fengling has joined #m-labs
<bb-m-labs> build #173 of artiq-kc705-nist_qc2 is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/artiq-kc705-nist_qc2/builds/173
fengling has quit [Ping timeout: 240 seconds]
<bb-m-labs> build #229 of artiq-pipistrello-nist_qc1 is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/artiq-pipistrello-nist_qc1/builds/229
<rjo> whitequark: yes. that's what i meant and told him.
mumptai has joined #m-labs
<rjo> whitequark: are you building rust-or1k currently?
<rjo> or are you writing a networking stack in rust?
<whitequark> former
<whitequark> a networking stack in rust sounds like something best left to other people, more experienced and more invested in TCP than me
<whitequark> from what I know about TCP implementation, it sounds easy and is full of gnarly interoperability issues
<cr1901_modern> Neither sounds particularly like fun (although two ppl have told me adding a new backend to Rust isn't THAT bad)
<whitequark> in any case, there is no significant advantage to having a TCP/IP stack in Rust since it is a cleanly separated layer that you can just wall off behind an FFI facade, without it affecting your Rust interfaces
<cr1901_modern> What prompted using Rust?
<rjo> whitequark: ack
<bb-m-labs> build #230 of artiq-pipistrello-nist_qc1 is complete: Failure [failed conda_build] Build details are at http://buildbot.m-labs.hk/builders/artiq-pipistrello-nist_qc1/builds/230
<whitequark> rjo: ooh, my timing error extraction code is working
<whitequark> I was going to test it but never did
fengling has joined #m-labs
<altker128> Do you guys still use the Navre AVR core?
fengling has quit [Ping timeout: 240 seconds]
fengling has joined #m-labs
fengling has quit [Ping timeout: 240 seconds]
mumptai has quit [Remote host closed the connection]
sandeepkr_ has joined #m-labs
kuldeep has quit [Ping timeout: 250 seconds]
sandeepkr has quit [Ping timeout: 244 seconds]
sandeepkr__ has joined #m-labs
kuldeep has joined #m-labs
sandeepkr_ has quit [Read error: No route to host]
sandeepkr has joined #m-labs
kuldeep has quit [Ping timeout: 250 seconds]
sandeepkr__ has quit [Ping timeout: 272 seconds]
fengling has joined #m-labs
ylamarre has quit [Quit: ylamarre]
fengling has quit [Ping timeout: 240 seconds]