<arigato>
adetokunbo: sorry, you'd need fijal to answer that question
<arigato>
I guess the first step is to write a three-lines test?
jcea has joined #pypy
<arigato>
what is slow here is conn.send(data_memory[start:])
<arigato>
where data_memory is a much longer string than what a single call to send() will actually consume
<arigato>
I think on CPython it takes an amount of time that does not depend on len(data_memory) at all
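A minimal sketch of the kind of short test mentioned above, assuming Python 3 and a local socket pair. The helper name `send_in_slices` is hypothetical and does not come from the thread; it only reproduces the `conn.send(data_memory[start:])` loop so that its timing can be compared across interpreters and payload sizes.

```python
import socket
import threading
import time


def send_in_slices(payload, use_memoryview=True):
    """Send payload over a local socket pair, re-slicing after every send()."""
    left, right = socket.socketpair()
    received = []

    def drain():
        # Keep reading until the sending side is closed.
        while True:
            chunk = right.recv(65536)
            if not chunk:
                break
            received.append(len(chunk))

    reader = threading.Thread(target=drain)
    reader.start()

    data_memory = memoryview(payload) if use_memoryview else payload
    start = 0
    t0 = time.perf_counter()
    while start < len(data_memory):
        # send() usually consumes much less than the remaining slice.
        start += left.send(data_memory[start:])
    elapsed = time.perf_counter() - t0

    left.close()
    reader.join()
    right.close()
    return sum(received), elapsed
```

Running it with growing payloads (for example 1 MB, 10 MB, 100 MB) and comparing the elapsed time per byte should show whether the cost of each `send()` depends on the length of the remaining slice.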
<adetokunbo>
Ok. Reading through the thread, I got the impression that the issue was that conn.send was taking slices from data_memory, but that these slices were being copied unnecessarily
<arigato>
yes
<arigato>
I think that on CPython, conn.send(data_memory[start:]) is done by passing a pointer and a size around, and never actually doing the copy of all the data in "data_memory[start:]"
<arigato>
on PyPy it doesn't work that way
<arigato>
I think that "data_memory[start:]" by itself doesn't do a copy,
<arigato>
but conn.send(m) does if m is not a string (e.g. is a memoryview)
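The claim that slicing a memoryview does not copy can be illustrated at the Python level. This is only a sketch of the language-level semantics on Python 3 (the function name `slice_shares_memory` is made up for illustration); it says nothing about what `conn.send()` does internally on either interpreter.

```python
def slice_shares_memory():
    """Show that a memoryview slice sees later changes to the storage."""
    data = bytearray(b"abcdefgh")
    data_memory = memoryview(data)
    tail = data_memory[4:]     # a new view on the same bytes, no copy
    copied = bytes(data[4:])   # an explicit copy, for comparison
    data[4] = ord("X")         # mutate the underlying storage in place
    return bytes(tail), copied
```

The view reflects the mutation while the explicit copy does not, which is what "no copy" means here.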
<haypo>
arigato: CPython 3 relies heavily on the Py_buffer API which can be seen as a pointer+size structure
<adetokunbo>
So, I could not tell clearly from the thread, but I inferred that this is because of how memoryview is implemented in terms of the interpreter-level buffer. I.e., that's why Jason and Fijal end up agreeing that improving bufferstr is the route to the solution
<arigato>
right, so the first question is to double-check that it is still slow now (it probably is)
<arigato>
then the problem of bufferstr() is that it returns an RPython string
<arigato>
you'd like instead to return, say, an instance of a new small class with a few attributes
<arigato>
similar to Py_buffer in CPython
<adetokunbo>
ok
<arigato>
basically, a raw pointer, a size, a reference to an object to keep alive
<arigato>
and maybe something more to ensure that the string doesn't move in memory while you're using it
<haypo>
arigato: (on CPython, the performance is mostly steady)
<arigato>
(in case the Py_buffer contains a raw pointer inside an RPython string)
<arigato>
something using rffi.get_nonmovingbuffer()
<arigato>
or rffi.get_nonmovingbuffer_final_null()
<arigato>
it needs to be done carefully with try:finally:
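As a rough illustration of the "raw pointer, size, keepalive" record and the try/finally discipline described above, here is a plain CPython sketch using ctypes. It is an analogy only: `RawBufferView`, `acquire_raw_view` and `peek` are hypothetical names, and the real change would be written in RPython with rffi helpers rather than ctypes.

```python
import ctypes


class RawBufferView(object):
    """Hypothetical Py_buffer-like record: raw address, size, keepalive."""

    def __init__(self, address, size, keepalive):
        self.address = address      # integer address of the first byte
        self.size = size            # number of valid bytes from address
        self.keepalive = keepalive  # object that owns the memory

    def release(self):
        # Forget the owner; the address must not be used afterwards.
        self.address = 0
        self.size = 0
        self.keepalive = None


def acquire_raw_view(storage, start=0):
    """Expose storage[start:] (a bytearray) as an address plus a size."""
    carray = (ctypes.c_char * len(storage)).from_buffer(storage)
    return RawBufferView(ctypes.addressof(carray) + start,
                         len(storage) - start, carray)


def peek(storage, start, count):
    """Read up to count bytes through the raw address, then always release."""
    view = acquire_raw_view(storage, start)
    try:
        return ctypes.string_at(view.address, min(count, view.size))
    finally:
        view.release()
```

The point of the `finally` clause is that the release happens even if the code using the pointer raises, so the memory is never left pinned by accident.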
<adetokunbo>
arigato: thanks I'll start looking
<arigato>
maybe simpler, as a good first step:
<arigato>
use the RPython buffer object's get_raw_address() method
<arigato>
this can actually play tricks if the buffer is based on an RPython string, after which the string is guaranteed never to move again
<arigato>
(see rpython.rlib.buffer for get_raw_address)
<arigato>
then the goal is to replace the unwrap_spec(...'bufferstr'...) inside pypy.module._socket.interp_socket for sendall() and other functions
<arigato>
with e.g. unwrap_spec(...'raw_ptr'...)
<arigato>
or maybe, just 'buffer', and you get a Buffer instance
<adetokunbo>
thanks again! I'll begin by confirming that this is indeed still slow, then I will attempt this simpler approach you've outlined.
<arigato>
seems that all pieces are here nowadays, it's only a matter of putting them together. this code (e.g. in pypy.module._socket) was written long ago, before the GC had any ability to freeze object positions
<arigato>
we could easily add to the class Buffer a context manager, so that if you say "with buffer as raw_ptr:" you get a raw ptr valid for the duration of the "with"
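The "with buffer as raw_ptr:" idea can be sketched in ordinary Python 3. In this sketch a memoryview stands in for the raw pointer, and `PinnedBuffer` and `demo` are hypothetical names, not part of rpython.rlib.buffer. It relies on CPython's behaviour that a bytearray cannot be resized while a view on it is exported, which plays the role of "the storage does not move for the duration of the with".

```python
class PinnedBuffer(object):
    """Hypothetical wrapper: the view is valid only inside the with block."""

    def __init__(self, storage):
        self._storage = storage
        self._view = None

    def __enter__(self):
        # Exporting a view keeps the bytearray from being resized.
        self._view = memoryview(self._storage)
        return self._view

    def __exit__(self, exc_type, exc_value, traceback):
        # Deterministically end the export, even on error.
        self._view.release()
        self._view = None
        return False


def demo():
    storage = bytearray(b"payload")
    with PinnedBuffer(storage) as raw:
        first = raw[0]
        try:
            storage.extend(b"!")      # refused while the view is exported
            resized_inside = True
        except BufferError:
            resized_inside = False
    storage.extend(b"!")              # allowed again after the with block
    return first, resized_inside, bytes(storage)
```

A caller such as a socket send function would then only touch the pointer inside the `with` block and never store it beyond that.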
<arigato>
then using it in pypy.module._socket is easy
<arigato>
well, "easy" in both cases with enough quotes
amaury_ has joined #pypy
<adetokunbo>
ok
<danchr>
mattip_away: sounds likely; that builder is using OS X 10.9, which is rather old, and likely includes headers for the system OpenSSL — which is 0.9.8
<danchr>
later versions dropped the headers, but retained the library
<danchr>
(ideally, PyPy/CPython should use one of the system frameworks on OS X rather than OpenSSL, but that seems like a significant undertaking, and might not even be possible depending on the API exposed)
<tos9>
Which has mattip saying to delete some random stuff, and I was hoping to be lazy :)
<tos9>
Looks like installing even just subprocess32 itself blows up. Maybe that one is easier to fix.
tbodt has joined #pypy
jamesaxl has joined #pypy
<ronan>
haypo: "the Py_buffer API which can be seen as a pointer+size structure" No, no, no, no, no! That's a dangerous misconception whose consequences I've been fighting for 2 weeks.
<ronan>
only contiguous, 1-D Py_buffers may plausibly be thought of as pointer+size
<haypo>
ronan: most converters require contiguous 1-D data
<ronan>
yes, but all code handling buffers needs to consider the general case
<haypo>
ronan: it depends if a function converts to Py_buffer or gets a Py_buffer
<haypo>
most Python functions convert to Py_buffer
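The distinction being argued here can be seen from Python 3 through memoryview attributes. This is a small sketch, with `looks_like_pointer_plus_size` as a made-up helper name; it only checks the two conditions named above (one dimension, contiguous) and ignores other Py_buffer fields such as the item format.

```python
def looks_like_pointer_plus_size(view):
    """True only for contiguous, one-dimensional views."""
    return view.ndim == 1 and view.contiguous


flat = memoryview(b"abcdef")           # 1-D, contiguous
strided = flat[::2]                    # 1-D, but every other byte
grid = flat.cast("B", shape=[2, 3])    # contiguous, but 2-D

checks = [looks_like_pointer_plus_size(v) for v in (flat, strided, grid)]
```

Only the first view passes; the strided and two-dimensional ones need their strides or shape as well, which is the general case that code receiving a buffer has to consider.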
<mattip>
valgrind sees an invalid read, of something that was freed, in a GC cycle
<mattip>
the "something" is a class.method returned from PyObject_GetAttrStr
<mattip>
which is freed by calling method.tp_dealloc
<mattip>
(the class.method is defined in C via tp_methods)
<mattip>
but how can tp_dealloc be called from a bytecode?
tbodt has joined #pypy
<ronan>
mattip: explicit del, maybe??
<mattip>
maybe, but AFAICT that object never escapes the 10 lines of C it lives in inside a block of cython-generated code
tbodt has quit [Read error: Connection reset by peer]
marr has joined #pypy
tbodt has joined #pypy
forgottenone has quit [Ping timeout: 260 seconds]
tbodt has quit [Ping timeout: 260 seconds]
tbodt has joined #pypy
ramonvg has quit [Quit: Lost terminal]
tbodt has quit [Ping timeout: 240 seconds]
yuyichao has quit [Ping timeout: 260 seconds]
<mattip>
maybe it is just random, I quit and ran again, this time I got "Invalid read of size 8", "Address 0x2554dbe8 is not stack'd, malloc'd or (recently) free'd"
<mattip>
so maybe there is some corruption of the gc somehow
arigato has quit [Quit: Leaving]
tbodt has joined #pypy
Tiberium has quit [Remote host closed the connection]