#m-labs on 2016-07-12 — irc logs at freenode.irclog.whitequark.org

2015-03-04 14:45 sb0 changed the topic of #m-labs to: ARTIQ, Migen, MiSoC, Mixxeo & other M-Labs projects :: fka #milkymist :: Logs http://irclog.whitequark.org/m-labs

00:33 <sb0> there's this OSError unittest problem...

02:32 sb0 has quit [Quit: Leaving]

02:39 fengling has quit [Quit: WeeChat 1.4]

02:40 fengling has joined #m-labs

02:41 ylamarre has joined #m-labs

03:45 kuldeep has joined #m-labs

03:45 sandeepkr has joined #m-labs

04:23 ylamarre has quit [Quit: ylamarre]

04:25 sb0 has joined #m-labs

04:38 fengling has quit [Ping timeout: 240 seconds]

05:21 stekern has joined #m-labs

05:31 fengling has joined #m-labs

06:16 FabM has joined #m-labs

07:04 sb0 has quit [Quit: Leaving]

09:07 fengling has quit [Ping timeout: 240 seconds]

10:03 <whitequark> sb0: I've an opportunity to grab some silver plated bolts with a bunch of other fasteners I'm ordering for other stuff

10:03 <whitequark> which sizes did you need?

10:34 fengling has joined #m-labs

10:39 fengling has quit [Ping timeout: 240 seconds]

10:46 sb0 has joined #m-labs

11:00 <sb0> whitequark, M6 screws, but they need to have specific lengths that depend on the flange you connect to the cube

11:00 <whitequark> yeah, which?

11:00 <sb0> that's why I went for MoS2, because the length requirement *plus* the silver-plating requirement makes it annoying

11:01 <sb0> I'd have to measure them

11:01 <sb0> I don't have the figures anymore

11:01 <whitequark> ok.

11:02 <sb0> if they're cheap and easy to get, take an assortment of M6 of various lengths... might come in handy, even for regular flanges

11:04 <sb0> can you even use any M screw for conflat? don't they need to be high-strength as well?

11:04 <whitequark> nah

11:04 <whitequark> well, any stainless screw

11:04 <whitequark> stainless is already quite hard compared to cheap mild steel

11:04 <whitequark> https://imgur.com/FkblHjk which one should we get for the lab?

11:07 <sb0> the mid-sized one?

11:07 <whitequark> ok

11:08 <sb0> it should be large enough for things like kf25 clamps

11:08 <whitequark> maybe we should get the large one then

11:09 <sb0> we also have a bunch of small electronic parts

11:09 <whitequark> though... kf25 should be fine with either

11:09 <whitequark> electronic parts go in tiny plastic bags anyway

11:09 <whitequark> or you'll never fish them out

11:09 <whitequark> also, marking

11:09 <whitequark> so it's not that important for those

11:25 sb0 has quit [Quit: Leaving]

11:36 fengling has joined #m-labs

11:41 fengling has quit [Ping timeout: 240 seconds]

12:14 <cr1901_modern> sb0: You're using IPv6 in your test case test_ctlmgr.py, correct?

12:37 fengling has joined #m-labs

12:42 fengling has quit [Ping timeout: 240 seconds]

12:43 <cr1901_modern> sb0: This is the function that's failing: https://github.com/python/cpython/blob/3.5/Modules/overlapped.c#L1016-L1057 B/c it's a Windows error, it's either failing in Py_ConnectEx (which is an alias for a ConnectEx in WINAPI), or parse_address. I believe what's happening is that Python is sending a SOCKADDR_IN to ConnectEx by mistake when it should be sending a SOCKADDR_IN6.

12:45 <cr1901_modern> I can confirm that that line 506 of asyncio\windows_events.py errors out immediately when passed a two-length address tuple, whereas if I used pdb to set a breakpoint at line 495 of asyncio\windows_events.py, and extend the tuple to 4 elements ("::1", $PORT, 0, 0), the breakpoint will be triggered multiple times before erroring out with OSError at the same location.

12:47 <cr1901_modern> ("[WinError 10022] An invalid argument was supplied" can be returned if Windows doesn't like the socket parameters.)

12:49 <cr1901_modern> sb0: Currently, I haven't found anywhere in our user code that lets us force Python to create a 4-argument tuple when running Overlapped_ConnectEx

13:39 fengling has joined #m-labs

13:44 fengling has quit [Ping timeout: 240 seconds]

14:40 fengling has joined #m-labs

14:45 fengling has quit [Ping timeout: 240 seconds]

15:48 ylamarre has joined #m-labs

16:08 sandeepkr has quit [Ping timeout: 240 seconds]

16:09 kuldeep has quit [Ping timeout: 240 seconds]

16:21 kuldeep has joined #m-labs

16:21 sandeepkr has joined #m-labs

16:25 kuldeep has quit [Max SendQ exceeded]

16:26 kuldeep has joined #m-labs

16:30 kuldeep has quit [Max SendQ exceeded]

16:30 kuldeep has joined #m-labs

16:44 fengling has joined #m-labs

16:45 sb0 has joined #m-labs

16:45 <sb0> cr1901_modern, are you certain that python simply calls it with the wrong object type?

16:45 <sb0> cr1901_modern, why does it have the correct type sometimes?

16:47 <sb0> and yes, the test uses ipv6

16:49 fengling has quit [Ping timeout: 240 seconds]

16:54 <whitequark> sb0: looks like the dashboard also fails with the same problem... should I look at it?

17:01 <cr1901_modern> sb0: No, I'm not certain that ConnectEx() gets the wrong object type. I am certain that the Python wrapper function (which is called Overlapped_ConnectEx) is passed a 2-length tuple in that test case. To be certain I'd somehow have to hook the call to ConnectEx or parse_address().

17:02 <cr1901_modern> That being said, if you look at parse_address, it's an if-else... and the 2-tuple that's passed in by the test case should be parsed correctly by PyArg_ParseTuple("sH", ...)

17:05 <sb0> whitequark, the dashboard cannot connect at all?

17:07 <whitequark> https://github.com/m-labs/artiq/issues/508

17:29 altker128 has joined #m-labs

17:29 <altker128> Hey guys.

17:30 <altker128> For the MiSoC project, is there a debugger? Looking at the SRC, I see some comments on "serial debugger"

17:38 <sb0> altker128, where do you see such comments?

17:38 <cr1901_modern> It's in the custom additions to LM32, IIRC

17:39 <altker128> cr1901_modern: Do you have any personal experience using it?

17:39 <cr1901_modern> No, unfortunately :(

17:40 <cr1901_modern> sb0: I have python running in a debugger... I'm trying to break on the ConnectEx() WINAPI call... naturally it's not working as expected

17:40 <sb0> cr1901_modern, I don't think you need any debugger. just print() whatever python passes to it

17:41 <sb0> it's probably very straightforward to reproduce, just create_connection to ::1

17:42 <cr1901_modern> sb0: The ConnectEx call isn't called from Python code; it's called from C code. So I want to break JUST before the error so I can know whether it's going to fail or not

17:42 <cr1901_modern> The C code decides based on tuple length whether to try calling ConnectEx with IPv6 or IPv4 params

17:44 <sb0> okay and do you think this C code is wrong

17:44 <sb0> does python pass it the correct tuple length or not

17:45 <sb0> ?

17:45 fengling has joined #m-labs

17:45 <cr1901_modern> Python passes it a 2-tuple; for IPv6, it expects a 4-tuple. I'm not sure why Python is deciding to send a 2-tuple
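
A minimal Python sketch of the dispatch described above, for illustration only. It is not the code in overlapped.c: family_for and parses_as are hypothetical helper names, and the assumption is simply that the address family is picked from the tuple length and never from the host string. Under that assumption a 2-tuple holding "::1" is treated as IPv4, and "::1" is not a valid IPv4 literal.

```python
import socket


def family_for(address):
    """Pick an address family from the tuple length only (hypothetical helper).

    Assumption from the discussion: a 2-tuple (host, port) is treated as IPv4,
    a 4-tuple (host, port, flowinfo, scope_id) as IPv6.
    """
    if len(address) == 2:
        return socket.AF_INET
    if len(address) == 4:
        return socket.AF_INET6
    raise ValueError("expected a 2-tuple or a 4-tuple")


def parses_as(family, host):
    """Return True if host is a valid numeric address for the given family."""
    try:
        socket.inet_pton(family, host)
        return True
    except OSError:
        return False


if __name__ == "__main__":
    for addr in (("::1", 4242), ("::1", 4242, 0, 0)):
        fam = family_for(addr)
        print(addr, fam.name, parses_as(fam, addr[0]))
```

The port 4242 is the one used later in this log; any port would do, since only the tuple shape matters here.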

17:46 <cr1901_modern> sb0: Didn't you just say that sometimes the code doesn't throw OSError?

17:46 <cr1901_modern> "(12:45:58 PM) sb0: cr1901_modern, why does it have the correct type sometimes?"

17:46 <sb0> apparently not

17:46 <sb0> it fails consistently

17:46 <sb0> I thought so because this is such an obvious problem

17:46 <sb0> but my opinion of python is perhaps still too high

17:47 <cr1901_modern> Kinda late to rewrite everything...

17:48 <cr1901_modern> My main gripe is that Windoze is giving a shitty vague error message

17:48 <sb0> yeah

17:48 <sb0> it fails all the time

17:48 <sb0> and python definitely fucked that up

17:49 <sb0> all you need to do is create ProactorEventLoop, set it as main event loop, and call open_connection with ::1 as first parameter

17:49 <sb0> the port doesn't even need to be open

17:49 <cr1901_modern> sb0: Oh, sorry I didn't get that far yet to create a minimum component example :P

17:49 <sb0> IPv4 is not affected
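
The reproduction steps sb0 describes could be sketched roughly as follows. This is an assumption-laden sketch, not the snippet from the bug report: make_loop and try_connect are hypothetical names, port 4242 is borrowed from later in this log, and the fallback to the default event loop exists only so the script also runs on non-Windows systems, where ProactorEventLoop is unavailable and the failure discussed here is not expected.

```python
import asyncio
import sys


def make_loop():
    # ProactorEventLoop exists only on Windows; elsewhere use the default loop
    # so the script still runs (it just cannot show the Windows-specific error).
    if sys.platform == "win32":
        return asyncio.ProactorEventLoop()
    return asyncio.new_event_loop()


async def _connect(host, port):
    # open_connection resolves the host and then connects a stream socket.
    reader, writer = await asyncio.wait_for(
        asyncio.open_connection(host, port), timeout=5
    )
    writer.close()


def try_connect(host, port):
    """Return 'connected' or the name of the exception class that was raised."""
    loop = make_loop()
    try:
        loop.run_until_complete(_connect(host, port))
        return "connected"
    except (OSError, asyncio.TimeoutError) as exc:
        return type(exc).__name__
    finally:
        loop.close()


if __name__ == "__main__":
    # The port does not need to be open; a refused connection is the healthy outcome.
    print("::1       ->", try_connect("::1", 4242))
    print("127.0.0.1 ->", try_connect("127.0.0.1", 4242))
```

Comparing the two printed results on the affected setup should show whether only the IPv6 literal fails before a connection is even attempted.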

17:50 fengling has quit [Ping timeout: 240 seconds]

17:51 <cr1901_modern> sb0: I didn't find many chances to overlapped.c since the version bump, so I'm guessing the code above is breaking. Haven't figured out where yet

17:51 <cr1901_modern> changes*

17:57 <sb0> http://bugs.python.org/issue27500

17:58 <sb0> bah the first thing you check is whether overlapped.c is called with correct arguments or not.

17:58 <sb0> just print() what it gives it

17:58 <sb0> the bug is trivial to reproduce and printing will work easily

18:00 <cr1901_modern> sb0: I also think it's a Python bug at this point... see linked comment https://github.com/python/cpython/blob/3.5/Lib/asyncio/base_events.py#L604-L607

18:08 <cr1901_modern> sb0: >>> loop.run_until_complete(loop.getaddrinfo("::1", 4242))

18:08 <cr1901_modern> [(<AddressFamily.AF_INET6: 23>, 0, 0, '', ('::1', 4242, 0, 0))]

18:08 <cr1901_modern> ^ To help further pinpoint the bug. getaddrinfo returns the correct tuple o.0;

18:10 <sb0> bah. is create_connection feeding that directly into overlapped.c or not?

18:10 <sb0> did you set the loop to proactor before calling that?

18:11 <altker128> Sorry to be off topic, are you guys using the MiSoC in Altera / Xilinx parts?

18:11 <sb0> yes, both

18:11 <sb0> some lattice as well

18:12 <altker128> What's the resource utilization like for all three?

18:13 <sb0> ~kLUT for a small SoC

18:13 <cr1901_modern> sb0: Hold on two seconds before I answer that

18:14 <altker128> sb0: But never used any debug capability?

18:15 <cr1901_modern> sb0: Yes, I set the proactor event loop. I just copied the four lines of your bug report snippet and changed the function called

18:15 <cr1901_modern> create_connection() calls _ensure_resolved(), which is a thin wrapper over loop.getaddrinfo()

18:16 <sb0> altker128, no

18:16 <sb0> but there are many bits of code you could integrate if you're interested in that

18:16 <cr1901_modern> the return value from _ensure_resolved() is destructured (host, family, cname, address, etc) and then passed into overlapped.c

18:18 <sb0> why don't you add a print() right before it calls overlapped.c?

18:19 <cr1901_modern> sb0: That's how I figured out in the first place that a 2-tuple was being passed :)

18:19 <cr1901_modern> (well I set a breakpoint using pdb, but same difference)

18:21 <cr1901_modern> https://github.com/python/cpython/blob/3.5/Lib/asyncio/base_events.py#L150-L161 This is what I'm focusing on right now... when I set a breakpoint on getaddrinfo, it doesn't trigger

18:21 <cr1901_modern> (Meaning _ensure_resolved isn't actually calling getaddrinfo, but another wrapper :/)

18:29 <cr1901_modern> To reiterate: open_connection => create_connection => overlapped.c (b/c the proactor in ProactorEventLoop on Windows is a wrapper over overlapped.c)

18:34 <cr1901_modern> sb0: Pretty sure the problem is this line: https://github.com/python/cpython/blob/3.5/Lib/asyncio/base_events.py#L142

18:38 <cr1901_modern> sb0: That's the problem line. Change "return af, type, proto, '', (host, port)" to "return af, type, proto, '', (host, port, 0, 0)", line 142 in asyncio/base_events.py. The OSError disappears

18:38 <cr1901_modern> (I get a different error, but... one thing at a time :P)

18:39 <sb0> doesn't that break v4?

18:39 <sb0> what is the new error?

18:40 <cr1901_modern> sb0: Of course it prob breaks v4 :P. I didn't make it foolproof yet

18:40 <cr1901_modern> New error is ConnectionRefusedError: [WinError 1225] The remote computer refused the network connection

18:40 <cr1901_modern> And a TimeoutError was raised on line 57

18:41 <sb0> the previous code already returns "af, type, proto, '', (host, port)"

18:41 <sb0> https://github.com/python/cpython/blame/490dd4e568c68353d107134b6d864d5a2d9e5b0e/Lib/asyncio/base_events.py#L141

18:42 <sb0> introduced here: https://github.com/python/cpython/commit/03df54d549173e17e1cf9a767199de32a363aa6b

18:42 <sb0> well, 3.5.1 was before this commit.

18:42 <cr1901_modern> sb0: Before this commit, getaddrinfo was guaranteed to run.

18:43 <cr1901_modern> Your linked commit tries to bypass getaddrinfo, and does so in a way that a 2-tuple is always returned, when getaddrinfo itself may return a 4-tuple

18:44 <sb0> okay thanks.

18:44 <cr1901_modern> The bug is that _ipaddr_info() doesn't return parameters that match getaddrinfo's return values in all possible scenarios
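
One way a getaddrinfo shortcut could keep the sockaddr shape consistent is sketched below. This is not the upstream fix: ipaddr_info_sketch is a hypothetical name, the flowinfo and scope_id values of 0 follow the one-line change tried above, and addresses carrying a zone ID (such as "fe80::1%eth0") are deliberately handed back to the real resolver.

```python
import ipaddress
import socket


def ipaddr_info_sketch(host, port, type=socket.SOCK_STREAM, proto=socket.IPPROTO_TCP):
    """Return a getaddrinfo-style 5-tuple for a literal IP, or None (hypothetical).

    None means "not a plain IP literal": the caller should fall back to
    getaddrinfo() instead of guessing.
    """
    if "%" in host:
        # Scoped IPv6 literals need a real scope_id; leave them to getaddrinfo.
        return None
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        return None
    if ip.version == 6:
        # IPv6 sockaddr is a 4-tuple: (host, port, flowinfo, scope_id).
        return socket.AF_INET6, type, proto, "", (host, port, 0, 0)
    # IPv4 sockaddr stays a 2-tuple: (host, port).
    return socket.AF_INET, type, proto, "", (host, port)


if __name__ == "__main__":
    print(ipaddr_info_sketch("::1", 4242))
    print(ipaddr_info_sketch("127.0.0.1", 4242))
    print(ipaddr_info_sketch("example.com", 4242))
```

Branching on the IP version keeps the IPv4 path unchanged, which addresses sb0's concern that appending two zeros unconditionally would break v4.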

18:44 <sb0> I see

18:45 <cr1901_modern> Np... glad I could help. At least it's now known it's a legit Python bug

18:46 fengling has joined #m-labs

18:48 FabM has quit [Quit: ChatZilla 0.9.92 [Firefox 47.0/20160604131506]]

18:51 fengling has quit [Ping timeout: 240 seconds]

19:48 fengling has joined #m-labs

19:53 fengling has quit [Ping timeout: 240 seconds]

20:27 <whitequark> rjo: in https://github.com/m-labs/artiq/issues/512#issuecomment-232168447 by "trigger" do you mean on our buildbot?

20:27 <whitequark> because that shouldn't be possible in the web interface, but should be possible in IRC

20:27 <whitequark> (unauthenticated, that is)

20:49 fengling has joined #m-labs

20:54 <bb-m-labs> build #173 of artiq-kc705-nist_qc2 is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/artiq-kc705-nist_qc2/builds/173

20:54 fengling has quit [Ping timeout: 240 seconds]

20:56 <bb-m-labs> build #229 of artiq-pipistrello-nist_qc1 is complete: Success [build successful] Build details are at http://buildbot.m-labs.hk/builders/artiq-pipistrello-nist_qc1/builds/229

20:59 <rjo> whitequark: yes. that's what i meant and told him.

20:59 mumptai has joined #m-labs

21:00 <rjo> whitequark: are you building rust-or1k currently?

21:00 <rjo> or are you writing a networking stack in rust?

21:07 <whitequark> former

21:08 <whitequark> a networking stack in rust sounds like something best left to other people, more experienced and more invested in TCP than me

21:09 <whitequark> from what I know about TCP implementation, it sounds easy and is full of gnarly interoperability issues

21:09 <cr1901_modern> Neither sounds particularly like fun (although two ppl have told me adding a new backend to Rust isn't THAT bad)

21:10 <whitequark> in any case, there is no significant advantage to having a TCP/IP stack in Rust since it is a cleanly separated layer that you can just wall off behind an FFI facade, without it affecting your Rust interfaces

21:13 <cr1901_modern> What prompted using Rust?

21:16 <rjo> whitequark: ack

21:47 <bb-m-labs> build #230 of artiq-pipistrello-nist_qc1 is complete: Failure [failed conda_build] Build details are at http://buildbot.m-labs.hk/builders/artiq-pipistrello-nist_qc1/builds/230

21:48 <whitequark> rjo: ooh, my timing error extraction code is working

21:48 <whitequark> I was going to test it but never did

21:51 fengling has joined #m-labs

21:52 <altker128> Do you guys still use the Navre AVR core?

21:56 fengling has quit [Ping timeout: 240 seconds]

22:52 fengling has joined #m-labs

22:57 fengling has quit [Ping timeout: 240 seconds]

23:01 mumptai has quit [Remote host closed the connection]

23:28 sandeepkr_ has joined #m-labs

23:31 kuldeep has quit [Ping timeout: 250 seconds]

23:31 sandeepkr has quit [Ping timeout: 244 seconds]

23:41 sandeepkr__ has joined #m-labs

23:41 kuldeep has joined #m-labs

23:43 sandeepkr_ has quit [Read error: No route to host]

23:44 sandeepkr has joined #m-labs

23:47 kuldeep has quit [Ping timeout: 250 seconds]

23:47 sandeepkr__ has quit [Ping timeout: 272 seconds]

23:54 fengling has joined #m-labs

23:58 ylamarre has quit [Quit: ylamarre]

23:59 fengling has quit [Ping timeout: 240 seconds]