cfbolz changed the topic of #pypy to: PyPy, the flexible snake (IRC logs: https://botbot.me/freenode/pypy/ ) | use cffi for calling C | "the modern world where network packets and compiler optimizations are effectively hostile"
forgottenone has quit [Quit: Konversation terminated!]
kolko has joined #pypy
vkirilichev has quit [Ping timeout: 240 seconds]
adamholmberg has quit [Read error: Connection reset by peer]
adamholmberg has joined #pypy
antocuni has quit [Ping timeout: 246 seconds]
inhahe_ has quit [Ping timeout: 240 seconds]
raynold has joined #pypy
inad922 has quit [Remote host closed the connection]
adamholmberg has quit [Remote host closed the connection]
rokujyouhitoma has joined #pypy
adamholmberg has joined #pypy
adamholmberg has quit [Read error: Connection reset by peer]
adamholmberg has joined #pypy
rokujyouhitoma has quit [Ping timeout: 240 seconds]
Guest34725 has quit [Remote host closed the connection]
marvin has joined #pypy
marvin is now known as Guest8550
oberstet has quit [Ping timeout: 240 seconds]
vkirilichev has joined #pypy
vkirilichev has quit [Ping timeout: 240 seconds]
realitix has quit [Quit: Leaving]
redj_ has joined #pypy
redj_ has quit [Read error: Connection reset by peer]
redj_ has joined #pypy
redj has quit [Disconnected by services]
redj has joined #pypy
redj_ has quit [Remote host closed the connection]
redj has quit [Remote host closed the connection]
redj has joined #pypy
kipras`away is now known as kipras
rokujyouhitoma has joined #pypy
rokujyouhitoma has quit [Ping timeout: 248 seconds]
tbodt has joined #pypy
vkirilichev has joined #pypy
Garen has quit [Read error: Connection reset by peer]
Garen has joined #pypy
vkirilichev has quit [Ping timeout: 240 seconds]
adamholmberg has quit [Remote host closed the connection]
adamholmberg has joined #pypy
adamholmberg has quit [Ping timeout: 248 seconds]
adamholmberg has joined #pypy
tbodt has quit [Quit: My Mac has gone to sleep. ZZZzzz…]
rokujyouhitoma has joined #pypy
tbodt has joined #pypy
rokujyouhitoma has quit [Ping timeout: 248 seconds]
tbodt has quit [Quit: My Mac has gone to sleep. ZZZzzz…]
jcea has joined #pypy
tbodt has joined #pypy
tbodt has quit [Remote host closed the connection]
jimbaker has quit [Quit: Quitting]
vkirilichev has joined #pypy
haypo has joined #pypy
<haypo>
hi. i sent an email to python-dev to propose to add a new opt-in C API to CPython 3.7 which would hide implementation details: https://mail.python.org/pipermail/python-dev/2017-September/149264.html [Python-Dev] New C API not leaking implementation details: an usable stable ABI
<kenaan>
arigo buildbot[cleanup-hg-bookmarks] da40d3078ef5 /bot2/pypybuildbot/builds.py: Tentative Windows version
vkirilichev has joined #pypy
<kenaan>
arigo default 4484e0986b55 /rpython/rlib/_rsocket_rffi.py: Blind fix for Windows
vkirilic_ has joined #pypy
vkirilichev has quit [Read error: Connection reset by peer]
marky1991 has quit [Read error: Connection reset by peer]
<arigato>
haypo: yes, as far as pypy is concerned, this PEP is just more boring work for us (but likely not hard)
rokujyouhitoma has joined #pypy
<arigato>
I don't quite believe that an incremental approach would work to eventually bring the CPython C API back to a sane complexity allowing free experimentation
<arigato>
any attempt to do that will just create a N+1'th way to do things
rokujyouhitoma has quit [Ping timeout: 248 seconds]
kolko has quit [Ping timeout: 248 seconds]
adamholmberg has quit [Remote host closed the connection]
adamholmberg has joined #pypy
adamholmberg has quit [Ping timeout: 260 seconds]
<haypo>
fijal: "what's the point in having the opt-in API?" it's the only possible way to make changes without breaking the world
<haypo>
fijal: the py2 vs py3 experience showed that we need to support both APIs until enough code is compatible
<fijal>
well, I don't know
<fijal>
py3 went too far and not far enough
<haypo>
fijal: sorry, this is one example of thing that i should document in the PEP, i know, but i didn't update the PEP yet
<fijal>
it changed everything, but didn't use that to change anything substantial or something
<fijal>
but yes, I don't think this is a step in good direction
<haypo>
fijal: why not?
<fijal>
because we would have to a) support both b) what sort of C API would not leak details?
<haypo>
fijal: hiding implementation details is the only solution that i see to make Python evolve, not only CPython
<fijal>
the crazy parts are like PyIncref/PyDecref
<haypo>
fijal: my PEP doesn't address the reference counting issue, that should be handled later
<haypo>
fijal: i have ideas, but since it's not doable in the short-term and depend on the first PEP to be accepted *and* adopted, i prefer to not even start discussing removing reference counting :)
<fijal>
so what?
<haypo>
fijal: IMHO allowing a Python implementation to change the memory layout (of PyObject structures) is already a major enhancement
<fijal>
maybe in 10 years it'll get somewhere?
<fijal>
so my question is: why don't you guys kill the really problematic APIs?
<fijal>
like, PySequence_*
<haypo>
fijal: a stable ABI will be useful right now for Linux distributors: a single binary for multiple Python versions, and maybe also multiple Python "runtimes"
<fijal>
stable ABI is absolutely unthinkable for us
<haypo>
fijal: Martin von Loewis created the "stable ABI". this limited API was never widely adopted, maybe because it doesn't add many features
<fijal>
where e.g. the PyIncref has to be a call
<haypo>
fijal: "I also gave you some feedback on the original proposal no?" i recall that you contacted me on Hangout or Twitter, i'm sorry, i don't recall. did you write an email?
<fijal>
I have no idea
<fijal>
I think I told you something over gchat, but who knows?
<haypo>
fijal: Google in that case :)
<haypo>
arigato: "I don't quite believe that an incremental approach would work (...)" do you see another solution?
<haypo>
arigato: IMHO pyston & gilectomy projects are "blocked" by the C API
<haypo>
i'm trying to fix the root issues
<haypo>
as fijal said, it's only half of the solution
<haypo>
but IMHO it's worth it
<haypo>
arigato, fijal : i'm coming to you to know which APIs cause you headaches
<fijal>
yes
<haypo>
which ones should be removed in the first steps
<haypo>
fijal: i don't know how cpyext is implemented. replacing (PyObject **)(((char*)list) + 24) with a PyList_GET_ITEM(list, index) function call should help you to avoid conversions, no?
nimaje1 has joined #pypy
nimaje has quit [Killed (barjavel.freenode.net (Nickname regained by services))]
nimaje1 is now known as nimaje
<fijal>
PySequence_FAST is my main issue
<arigato>
"I'm trying to fix the root issues": maybe, but imho in the wrong way. You're adding a N+1'th way which will take ages to be adopted by a small fraction of the C modules, and stay forever as an N+1'th way
<arigato>
pypy never tried to implement binary compatibility, and makes PyList_GET_ITEM a function in the first place
<haypo>
arigato: oh, i didn't know that pypy already converts PyList_GET_ITEM() to a function call. so basically, you already implemented my idea for PyPy?
<haypo>
fijal: sorry, i'm not aware of the PySequence_FAST issue
<haypo>
arigato: yes, it is possible that both APIs will coexist for many years
<fijal>
haypo: PySequence_Fast_ITEMS
<fijal>
that guy
<arigato>
almost every "macro" is a function call, yes
<fijal>
it returns PyObject**
<fijal>
which is a touch hard to implement
<arigato>
well, we implemented it in the end, so now it's there
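To illustrate the `PySequence_Fast_ITEMS` complaint above, here is a toy model in Python. It is not PyPy's actual code: the class `CompactIntList` and its methods are invented for this sketch, and it simply assumes an implementation that keeps integers unboxed. Item-at-a-time access can box one value on demand, while an API that hands out the whole item array forces a second, boxed representation to be built and kept alive.

```python
from array import array


class CompactIntList:
    """Hypothetical list that stores integers unboxed (illustration only)."""

    def __init__(self, values):
        self._raw = array("q", values)   # packed 64-bit ints, no per-item objects
        self._boxed = None               # built only if someone asks for the array

    def get_item(self, index):
        # Function-call style access (in the spirit of PyList_GetItem):
        # box a single item on demand.
        return int(self._raw[index])

    def fast_items(self):
        # Array style access (in the spirit of PySequence_Fast_ITEMS): the
        # caller expects a stable array of object references, so every item
        # is boxed and the copy is kept for as long as the list lives.
        if self._boxed is None:
            self._boxed = [int(v) for v in self._raw]
        return self._boxed


lst = CompactIntList(range(5))
third = lst.get_item(2)      # no extra storage needed
items = lst.fast_items()     # materializes a second, boxed representation
```

The sketch ignores what happens when the list is mutated after `fast_items()` was called, which is part of what makes the real thing "a touch hard to implement".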
<fijal>
that too
<haypo>
arigato: i don't see how doing nothing makes things better :)
<fijal>
haypo: I don't believe having an opt-in API that does not go even half way is helping anyone
<fijal>
it won't make our lives easier, people will generally not use it
<arigato>
haypo: it makes things better, because it means there is *not* an N+1'th part of the API to reimplement in another Python implementation
<haypo>
arigato, fijal : FYI instagram is building a large team working on Python performance, so we will have more hands to enhance Python
<haypo>
fijal: oh, i wrote the PEP for my own usage :) i see a direct advantage to experiment different optimizations, i gave examples in the PEP
<haypo>
fijal: we are discussing how to create a carrot big enough to motivate people to port the code
<fijal>
haypo: your examples won't work
<fijal>
right, so a carrot would be something like "from this point on, stuff will always work on pypy", maybe
<haypo>
fijal: can you explain why they will not work?
tbodt has joined #pypy
<fijal>
Indirect Reference Counting - would make stuff a lot slower
<arigato>
right now, people are motivated by the carrot of cffi, which allows to write sane code that works well on pypy too
<arigato>
with the admitted drawback that it is a complete rewrite
<fijal>
haypo: carrots - right now there are (mostly) 2 problems with C API
<fijal>
one is called numpy the other is called cython
<haypo>
fijal: "Indirect Reference Counting", oh ambv told me the same. well, it's just an example
<fijal>
Remove Reference Counting, New Garbage Collector - this requires DIFFERENT API than refcount
<fijal>
e.g. write barriers
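As a rough illustration of the write-barrier remark, the following Python sketch models a two-generation heap. Everything in it (`Obj`, `ToyHeap`, `write_field`) is hypothetical and chosen for this example only. The point it tries to show is that every pointer store has to pass through a hook, which an API built around direct field access and manual refcounting does not provide.

```python
class Obj:
    """Toy heap object: a generation flag plus a dict of reference fields."""

    def __init__(self, name, old=False):
        self.name = name
        self.old = old
        self.fields = {}


class ToyHeap:
    def __init__(self):
        # Old objects that may point at young ones; a minor collection would
        # scan these instead of walking the whole old generation.
        self.remembered = set()

    def write_field(self, owner, field, value):
        # The write barrier: every pointer store has to go through here.
        if owner.old and value is not None and not value.old:
            self.remembered.add(owner)
        owner.fields[field] = value


heap = ToyHeap()
cache = Obj("cache", old=True)
fresh = Obj("fresh")

heap.write_field(cache, "entry", fresh)   # recorded: old -> young reference
heap.write_field(fresh, "owner", cache)   # young -> old needs no record
```

C code that writes into an object's fields directly would skip `write_field`, and the collector in this model would then miss the old-to-young reference.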
<haypo>
fijal: i don't think that they are problem, we can fix numpy & cython
<fijal>
I don't know, so how can we make a plan for numpy & cython?
<fijal>
did you chat with njs?
<fijal>
Remove the GIL <- this needs a better GC
<haypo>
fijal: "Remove Reference Counting" you're right, i now understood that, it should be removed from the PEP
<haypo>
fijal: i wasn't contacted by njs
<fijal>
Tagged pointers <- I don't know about that, I claim tagged pointers are not very useful, but it's a debated topic
<fijal>
haypo: well, you should reach out to him, the numeric/data science community is the main part of keeping C API alive
<haypo>
fijal: from a HHVM developer, i heard that tagged pointers are promising
<fijal>
and they're happy to explore ideas how to make it pypy friendly
<fijal>
yeah there is a lot of stuff done on tagged pointers, but cfbolz would tell you that there is very little actual evidence
<fijal>
e.g. people either do tagged pointers or they don't, but usually there isn't 1-1 comparison
<fijal>
anyway, I'm not an expert
<fijal>
you can definitely get the same (or better) benefits without tagged pointers and with clever JIT and GC
<arigato>
I think some old version of pypy was a 1-1 comparison, which turned out clearly against tagged pointers, but it was before the time of the JIT
<fijal>
arigato: I think we had at least one jit-related comparison
<haypo>
arigato: ok, good to know
<fijal>
haypo: so something that gets us somewhere in the middle, but does not help pypy or anyone would really be nothing interesting
<mattip>
hi
<fijal>
if you can have a transitional path for numpy and cython & a way to help pypy, that might be a good enough carrot
<fijal>
"maybe in 10 years" is definitely not
<fijal>
(maybe running pypy is not a good enough carrot to start with)
<fijal>
if it's not killed, we need a mess to support it anyway
<haypo>
i never used PySequence_Fast_ITEMS()
<fijal>
so having an opt in version that has a different API means that we have to support both
<fijal>
so specifically
<fijal>
1) the opt in, as opposed to direct deprecation means we have to support it
<fijal>
2) if the opt in and the opposite is different, we have to support both which is more work for us
<haypo>
fijal: it's not well described yet, but my whole plan involves to have a strict separation between API written for CPython itself, and the public API
<fijal>
which means that ideally, don't do it
<haypo>
fijal: sadly, i don't want to touch the current API because I don't want to be responsible of breaking the world
<fijal>
so an immediate step would be to remove PySequence_Fast_ITEMS from the documentation
<fijal>
if you're not willing to do that, please at least don't add new ones (with any intentions)
<haypo>
fijal: i'm trying to design a path where we can actually enhance the API to make our life easier, to allow to change CPython internals, especially to make it faster
<fijal>
kill the C API
<fijal>
(or deprecate it)
<haypo>
fijal: come on
<haypo>
fijal: are you seriously proposing that?
<fijal>
-or- provide a path for numpy and cython to either compile to C API or to something else
<fijal>
haypo: from 3.8 you can make it private
<fijal>
or 3.9
<fijal>
you need a transition path for all the people who REALLY NEED IT
<haypo>
i don't think that it's feasible, IMHO it doesn't make sense
<haypo>
there are too many C extensions in the wild which use the C API
<arigato>
(...yes, I see how we could make cffi extensions compilable only once for pypy and cpython. there may not really be a point, though)
<fijal>
so, yes, there are a few, but these days it's either numpy, cffi or cython
<haypo>
nobody would like to rewrite all C extensions with no carrot
<arigato>
haypo: for example, we haven't implemented the large amount of new PyUnicode_Xxx functions from CPython 3.3. It seems they are not really used so far, but it could change and if it does it will become another major pain point for pypy
<haypo>
PyPy is way faster. is this carrot big enough to justify rewriting numpy with cython?
<fijal>
haypo: if we can provide them with something, maybe!
<fijal>
there are some plans, but no funding yet
<fijal>
idea would be to have a higher level language that can compile to C API
<fijal>
but can also compile to e.g. jitcode
<haypo>
fijal: if you consider that the C API must be removed, please propose a PEP
<haypo>
fijal: this is how the Python language evolves
<fijal>
it won't be even considered, so what's the point?
<fijal>
haypo: you're one of the more revolutionary people on python-dev and what you're proposing is extremely conservative, from our point of view
<haypo>
fijal: well, it might create an interesting discussion?
<fijal>
doubt it
<haypo>
fijal: "what you're proposing is extremely conservative", oh, it is. i didn't say the opposite
<fijal>
right
<fijal>
so why cutting new APIs is unthinkable?
<fijal>
why e.g. unicode (as armin says) has to expose all kinds of internal crap?
<haypo>
fijal: can i ask you a question? do you understand why we need this very slow transition plan?
<fijal>
can we kill those? no one seems to use them (yet)
<fijal>
haypo: I understand a slow transition plan, and anything that leaves data science people behind would be massively frowned upon
<fijal>
but why do you dismiss our ideas as unthinkable?
<haypo>
"anything that leaves data science people behind would be massively frowned upon" it would just be a no-go
<haypo>
fijal: "but why do you dismiss our ideas as unthinkable?" well. see python3
<fijal>
right
<haypo>
fijal: in 2017, python3 transition is still on-going
<fijal>
ok, so here are concrete proposals:
<fijal>
a) deprecate PySequence_FAST_XXX
<haypo>
fijal: large companies like Dropbox... who hired Guido Van Rossum (and Benjamin Peterson)... are still running Python 2
<fijal>
b) deprecate and remove unicode APIs no one uses (yet)
<fijal>
python3 came with no carrots, that's why
<haypo>
fijal: we provided many tools, documentations, books, etc. to port code to Python 3
<fijal>
but no actual carrots
<haypo>
fijal: another massive backward incompatible change is a no-go from the start
<arigato>
note that I'm unsure why PyList_GET_ITEM must be modified, given that PyList_GetItem already exists---why not just deprecate PyList_GET_ITEM?
<arigato>
(and PyList_GetItem while we're at it---there is already PySequence_GetItem and PyObject_GetItem)
<haypo>
IMHO we don't have the opportunity to do it again
<fijal>
it's not any faster (it is now with some unicode ops), it never provided any answers to any of the questions
<haypo>
arigato: "why not just deprecate PyList_GET_ITEM?" oh, that's an open question. but I would like to minimize changes
<mattip>
(also PyDict_Next is problematic)
<haypo>
arigato: barry wants to deprecate it
<fijal>
haypo: if there is no feasible way to stop adding new APIs and start removing old (bit by bit), then there is no point in discussion
<haypo>
arigato: maybe, it will only be deprecated if you opt-in for the new API
<fijal>
haypo: deprecate-if-you-opt-in is a terrible idea, really
<fijal>
please don't add any new APIs
<haypo>
fijal: what is your carrot for "remove the C API"?
<fijal>
haypo: run stuff on pypy faster
<fijal>
or more specifically - unladen swallow, gilectomy and pyston would work far better if there was no C API
<fijal>
(and pypy)
<haypo>
fijal: hum, i don't see any need to add a new API. but the thing is that i don't know how C extensions use the C API
<fijal>
so new python3 stuff is not used much yet
<fijal>
maybe it's a good start to deprecate new APIs that got added in python3
rokujyouhitoma has joined #pypy
<fijal>
(unicode, finalizers)
<fijal>
then maybe deprecate Py<SpecialType>_stuff
<fijal>
note that deprecate and remove in two releases is actually not a terrible strategy
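The "deprecate, then remove two releases later" strategy can be sketched at the Python level with the standard `warnings` module. This is only an analogy for what a C API deprecation would look like; the decorator name `deprecated`, the function `old_get_item`, and the version label "N+2" are all made up for the example.

```python
import functools
import warnings


def deprecated(removed_in):
    """Mark a function as deprecated; `removed_in` is a hypothetical version label."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            warnings.warn(
                f"{func.__name__} is deprecated and scheduled for removal in {removed_in}",
                DeprecationWarning,
                stacklevel=2,   # point the warning at the caller, not the wrapper
            )
            return func(*args, **kwargs)
        return wrapper
    return decorate


@deprecated(removed_in="N+2")
def old_get_item(seq, index):
    return seq[index]


# Record warnings so the deprecation is visible even where they are filtered.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    value = old_get_item([10, 20, 30], 1)
```

The function keeps working during the deprecation window; callers only see a warning naming the release in which it is expected to disappear.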
<haypo>
arigato: (and PyList_GetItem while we're at it---there is already PySequence_GetItem and PyObject_GetItem) again, in a first step, i want to minimize required changes to maximize adoption
<fijal>
if this was what python3 did, we would have been all on python3
<haypo>
arigato: *but* such opt-in option would allow to remove stuffs *slowly*
<fijal>
haypo: if the changes are too small you would see zero adoption
<haypo>
arigato: it will depend on the speed of C extensions to be updated to the "new API"
<fijal>
so maybe here is the actual proposal - list the APIs that pypy supports
<fijal>
this is the vast majority of used APIs in the wild
<fijal>
then declare "if you pass this flag, stuff is guaranteed to run on pypy"
<haypo>
fijal: i cannot promise that we are not going to add new APIs. i only want to find a way to remove APIs :)
<fijal>
promise is a very strong word
<arigato>
bitbucket.org/pypy/pypy/raw/default/pypy/module/cpyext/stubs.py <- stuff in pypy 2.7 that was never implemented
<fijal>
but you can probably promise to have a discussion with alternative implementations before doing so
<haypo>
fijal: right now, i think that it's just illegal for backward compatibility reasons to remove anything. or even modify in an incompatible way
<fijal>
right sure, but you can still emit warnings if you use certain things
<fijal>
I don't know if you can emit a warning if you read fields directly
<fijal>
haypo: I think a good start would be something that reflects the reality of cpyext
<haypo>
fijal: "or more specifically - unladen swallow, gilectomy and pyston would work far better if there was no C API" i don't think that it's a good summary of the Python ecosystem. Python is popular thanks to the C API. and we all know the drawbacks of the C API :)
<haypo>
but ok, it seems like i have to explain that in the PEP :)
<fijal>
of course it's hard to argue with the popularity
<haypo>
since it seems like we disagree or at least that it's a known fact, and many people asked me a similar question :)
rokujyouhitoma has quit [Ping timeout: 248 seconds]
<simpson>
I mean, folks are not in agreement about exactly which features of the C API are drawbacks.
<simpson>
Can I list Cython as a drawback, for example?
<fijal>
it really would not make sense on a sane language
<haypo>
fijal: for unicode, the old API sucks :) many added PyUnicode are better since they use a PyObject* instead of Py_UNICODE*
<haypo>
but for sure, it adds more work for you, sorry about that
<fijal>
def PyUnicode_ClearFreeList
<fijal>
def Py_UNICODE_ISPRINTABLE
<haypo>
then declare "if you pass this flag, stuff is guaranteed to run on pypy" <= aha, i like this idea :)
<fijal>
def PyUnicode_DecodeUTF8Stateful
<fijal>
def PyUnicode_EncodeUTF32
<fijal>
def PyUnicode_EncodeCharmap
vkirilic_ has quit [Remote host closed the connection]
<fijal>
is really PyUnicode_DecodeMBCSStateful a needed C API?
<arigato>
fijal: I'm not sure the stubs.py has been even updated in py3.5
<fijal>
PyUnicode_4BYTE_DATA
<fijal>
arigato: no :)
<haypo>
fijal: "so cython exists because python is slow" i don't know. maybe cython exists because you want to call C functions but don't want to use directly the ugly C API :-)
<arigato>
right, so it's the same as the trunk (py2.7) one
<haypo>
fijal: "is really PyUnicode_DecodeMBCSStateful a needed C API?" if you are asking me the question, the answer is no
<arigato>
the real list of unimplemented CPython 3.5 API is much longer
<fijal>
haypo: it exists because slowness, but yeah, that too
<fijal>
(cffi solves the latter quite well, as seen by the super wide adoption)
<haypo>
fijal: the reason why we expose the API is... i'm not sure that i can tell you, i'm too ashamed :)
<haypo>
fijal: so part of the answer is that CPython consumes its own API, but that most CPython developers are too lazy to annotate functions with Py_BUILD_CORE
<fijal>
so ALL OF THE INTERNAL UNICODE STUFF should be deprecated before someone starts using it
<fijal>
sorry this page just makes me mad
<haypo>
i have no opinion about deprecating functions at this point
<arigato>
the C API used to make tons of sense for CPython and definitely contributed to its popularity. but it grew immensely since then, to the point where it's a major pain for Python alternatives, and alternatives like Cython or cffi exist nowadays. so..?
<haypo>
arigato: "so..?" so, let's fix that
<fijal>
haypo: you can remove half of this page from documentation and no one would bat an eye
<fijal>
today
<fijal>
you might not be able to do the same in a year
<arigato>
cool. let's deprecate the C API then
<haypo>
arigato: maybe i wasn't explicit enough, the C API is a blocker issue for CPython itself
<haypo>
arigato: deprecate it to replace it with what?
<fijal>
haypo: so I would go ahead with a proposal to deprecate it and provide reasonable alternatives
<haypo>
fijal: sure, please go ahead
<fijal>
like, a high-level (cython-ish) language that you can express all of the stuff that cpython devs would maintain
<haypo>
from what I heard, Cython is really cool, but i never used it
<fijal>
so you know if you write stuff in this meta-language, it would compile to C using C API for each python
<haypo>
does PyPy "support Cython"?
<fijal>
cython is a bit too big, I think even numpy people said a subset of cython
antocuni has joined #pypy
<fijal>
yes
<haypo>
cool
<fijal>
but the way it works is through using C API, because you mix C API in places sometimes
<arigato>
(well, pypy supports the C API necessary for Cython-compiled CPython modules)
<fijal>
arigato: but it's much easier to change cython
<simpson>
haypo: I have a modest proposal: cffi everywhere, and *no* C API. Period. No more Cython, no more SWIG. While I recognize that this is zealous, hopefully it indicates to you the breadth of opinions on the issue.
<fijal>
than C API
<haypo>
please don't get me wrong, i am in favor of generalizing the usage of Cython
<mattip>
haypo: the answer to supporting cython is definitely yes, pandas (a large consumer of cython) works on pypy
<arigato>
I'm with simpson on that particular topic :-)
<pjenvey>
yea, if there's going to be a long transition away from the current C API, you really want to go all the way with it, transition to the FFI layer. while solving the maintenance of numpy/etc however it needs to be done
<haypo>
but i don't feel able to port major C extensions to Cython, whereas I expect little changes for my proposed new C API
tbodt has quit [Quit: My Mac has gone to sleep. ZZZzzz…]
<fijal>
haypo: so if cpython devs are willing to move to a subset of cython for "external" modules (e.g. array module) and compile it, that would be a big step forward
<fijal>
haypo: numpy people are. They are (mostly) moving to cython with massive behemoth being numpy
<fijal>
*but* they're willing to negotiate
tbodt has joined #pypy
<haypo>
simpson: same question that i asked fijal: are you seriously proposing that?
<fijal>
like, deprecate the C API and make cython-ish (likely a subset) language an official replacement
<haypo>
simpson: if yes, can you please port all major C extensions to cffi?
<fijal>
haypo: he is putting himself in a position so I can claim I'm the reasonable and if you don't talk to me, you have to negotiate with him :)
<fijal>
I'm not sure it's working
<fijal>
*the reasonable one
<pjenvey>
you're going to have a long transition anyway, deprecating the C API doesn't mean it's going to be removed immediately (of course it won't) -- not everything needs to be rewritten in cffi before next tuesday
<simpson>
haypo: No, I opted to leave the Python community and build a new language, because this is not the only thing that I dislike about Python where I know that I won't be able to convince a majority of core devs of my position. So I'm not "serious" in that I won't really reap the rewards of such an endeavor; I have no stake. Otherwise, yes, I'm dead serious. Kill it with fire.
<haypo>
previously, it was said that an opt-in option is useless because nobody will port their code to a new API
<pjenvey>
well there has to be a carrot, fijal proposed the carrot being "works on pypy, and works well"
<haypo>
and now you propose to not make tiny changes to the C API but remove it, and expect that everybody will rush to use a totally different API?
<pjenvey>
maybe that's not enough of a carrot
nimaje1 has joined #pypy
nimaje1 is now known as nimaje
nimaje is now known as Guest38313
Guest38313 has quit [Killed (orwell.freenode.net (Nickname regained by services))]
<haypo>
pjenvey: i am not sure that pypy is a big enough carrot. otherwise, companies would already have invested money to do it
<fijal>
haypo: I'm not sure either. But it's bigger than no carrot at all
<pjenvey>
there definitely isn't much of a carrot w/ your proposed PEP (sorry) =]
<fijal>
(and companies have invested in pypy)
<simpson>
Well, everybody was supposed to have rushed to cffi already. When I ask folks why they don't use cffi, they usually claim inertia, or lack of buy-in, or lack of compelling features. I'm starting to think that a non-trivial chunk of the Python community just isn't interested in better infrastructure.
<haypo>
pjenvey: from what I heard, PyPy has issues, it's not only a matter of performance
<fijal>
simpson: well, people RUSHED to cffi
<haypo>
but sorry, i don't know much more, i cannot explain why people are not using more PyPy
<fijal>
haypo: if you want I can tell you :-)
<pjenvey>
haypo: there's been some investments, I think the community has certainly invested in it, because it's still around and in use =]
<haypo>
pjenvey: well, PyPy seems to be less used than what I would like to see
<haypo>
i should ask why instagram doesn't use PyPy?
<fijal>
I think ambv can answer that :)
<fijal>
numpy is a big issue usually
<mattip>
if the answer is numpy/pandas/cython, that all pretty much works these days
<fijal>
and it only has started working recently (and is slow)
<haypo>
simpson: "... yes, I'm dead serious ..." ok, well, let's agree to disagree on removing the C API :)
<antocuni>
arigato, fijal: IIRC, at some point we tried tagged pointers and we saw that they brought no noticeable speedup (but also no slowdown)
<arigato>
antocuni: ok, thanks
rokujyouhitoma has joined #pypy
<antocuni>
I think that it was because tagged pointers mostly save allocations, but allocations are cheap on pypy
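For readers unfamiliar with the technique being discussed, here is a minimal Python model of tagged pointers. It is a sketch under stated assumptions, not any VM's real layout: a one-bit tag in the low bit, a 63-bit small-integer range, and a list named `heap` standing in for allocated boxes, all chosen for illustration. It shows the one thing the discussion attributes to tagging: small integers avoid an allocation.

```python
TAG_BITS = 1
INT_TAG = 1     # low bit set: the word itself carries a small integer

heap = []       # stand-in for heap-allocated boxes; the index plays the role of an address


def make_word(value):
    """Encode a value as a machine word (toy model)."""
    if isinstance(value, int) and -(1 << 62) <= value < (1 << 62):
        return (value << TAG_BITS) | INT_TAG     # no allocation
    heap.append(value)                           # allocation needed
    return (len(heap) - 1) << TAG_BITS           # low bit clear: a "pointer"


def read_word(word):
    """Decode a word produced by make_word."""
    if word & INT_TAG:
        return word >> TAG_BITS                  # arithmetic shift keeps the sign
    return heap[word >> TAG_BITS]


small = make_word(-7)        # fits in the tagged range, nothing is allocated
big = make_word(1 << 70)     # too large, goes to the heap
text = make_word("hello")    # not an integer, goes to the heap
```

If allocation is already cheap, as claimed above for PyPy, the saving in the first branch of `make_word` is small, which matches the "no speedup, no slowdown" observation.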
<pjenvey>
haypo: you'd have to ask them. more abstractly, i think any VM, whether it's pypy or whatever it may be is in the same spot here. that the long path should be away from the C API
<haypo>
simpson: i don't want to offend you, i'm just surprised that someone proposes to break all existing code. for what? to support pypy?
<arigato>
...in other words, "why doesn't instagram use PyPy" is answered by "because of the C API that is a mess to support in pypy"...
<haypo>
i'm sorry, i have trouble following all discussions in parallel, i'm getting too much information in a short time. but all your comments are already very useful to me
<pjenvey>
to support any potentially better VM
<fijal>
haypo: yes, to have more implementations
<haypo>
arigato: my plan is to remove problematic APIs to make C extensions more usable on PyPy. but fijal says that it adds more work for PyPy, so now i'm confused :)
<fijal>
haypo: pypy is really the only one standing in the graveyard of alternative implementations
<fijal>
haypo: I'm saying that adding new APIs is adding more problems
<fijal>
if you don't plan to add new APIs sure
<simpson>
haypo: I'm pretty hard to offend. Anyway, my primary goal along these lines, five years ago, would have been to remove the ability for any foreign C code to enter my address space. More PyPy usage would be great.
<fijal>
but I also pointed out to very concrete steps of how to not make stuff harder
<haypo>
fijal: it was said that we need to solve the "static PyTypeObject" issue. i expect that new APIs will be needed, to have a more declarative way to define new types
rokujyouhitoma has quit [Remote host closed the connection]
<fijal>
haypo: that's a solved problem on pypy
<haypo>
simpson: i don't understand why pypy developers never come to python-dev to propose changes to ease the compatibility?
<arigato>
haypo: that's because, to a large extent, there *are* no such changes
<arigato>
you can deprecate stuff, which might stop being used in 10 years
<arigato>
but that's about it
<simpson>
haypo: Well, *I* am not a PyPy developer; I don't want to lump myself in with the folks actually doing the back-breaking work here. But I've tried to be vocal about adopting things like cffi over SWIG/Cython, I had two aborted attempts to rewrite PIL in pure Python, and I try to make my code work on PyPy when I can.
<haypo>
arigato: yeah, cpython is a dead dinosaur, it's super slow to move :)
<arigato>
that's not my point here
<haypo>
arigato: at least, we can complain that CPython fails to remove APIs
<arigato>
my point is that you can't really make the C API much nicer to support in another implementation
<haypo>
well, in practice, i'm quite sure that we remove C APIs sometimes, in less than 10 years :)
<haypo>
arigato: you are right. but there is room for small changes
<arigato>
it's too low-level and tied to the implementation
<arigato>
yes, and we don't really care about small changes, because they are at most a few hours of work for us
<haypo>
arigato: we have to find a way to collaborate :)
<haypo>
come on, we both care about the same programming language :)
<pjenvey>
haypo: well the PEP says the goal is to limit the C API for the sake of potential VM optimizations -- not "to make C extensions more usable on PyPy"
<arigato>
at this point I'd say Yes, but we don't care about the same programming language *implementation*
<haypo>
each time i join this channel, i'm depressed like everything is a dead end and we are fucked :)
<haypo>
i'm now reading the backlog to take notes
<simpson>
Sorry, that's probably me.
<pjenvey>
I agree you should limit it for the sake of optimizations, but go whole hog =]
<arigato>
and so the way to collaborate would be if CPython would drop its C API and replace it with something else, like fijal suggested
<fijal>
haypo: well, I'm trying to propose ways
<fijal>
haypo: like, don't add stuff like half of the PyUnicode_xxx
<mattip>
haypo: fwiw, I have had actually good collaboration with numpy/cython/pandas, they are very willing to accept pull requests
<mattip>
and engage in discussions
<fijal>
haypo: if we can't remove the PyUnicode_TranslateCharmap in less time that it takes to implement it, there is very little point of discussing
tbodt has quit [Quit: My Mac has gone to sleep. ZZZzzz…]
<haypo>
fijal: (i'm reading the backlog) modifying numpy & cython to make them use the new API is not only part of my PEP, but also a requirement for a successful PEP. if we cannot modify cython & numpy, the PEP is not going to work. so i am volunteering to propose changes and make sure that they go upstream
<fijal>
haypo: so how about putting on everything that's not supported by pypy (as a first step) "DONT USE IMPLEMENTATION DETAIL"
<fijal>
is it too far?
<fijal>
we support the vast majority of C extensions out there that don't touch internals of C objects
<arigato>
(you can list in bulk all 214 functions from stubs.py and propose to deprecate them on the grounds that they seem unused by many C extension modules)
<haypo>
fijal: "2) if the opt in and the opposite is different, we have to support both which is more work for us" i don't see how a new API without PySequence_Fast_ITEMS() would add you more work, since you already have to support PySequence_Fast_ITEMS
<fijal>
you keep saying "new API"
<fijal>
what is the new API?
<fijal>
more functions?
<fijal>
simpson: I might go your way tbh
<fijal>
haypo: sorry I gonna go to sleep now, since it's 1 am and I have GPU cycles to improve on, I will read the backlog though
* arigato
zzz too
<fijal>
arigato: yes what are you doing up till now? ;-)
<haypo>
fijal: "so why cutting new APIs is unthinkable?" CPython is made by volunteers. anyone is free to propose new features. users love new features. sometimes, it's hard to prevent extending the C API when adding a new feature, like the PEP 442 (finalizer)
<simpson>
Hey, discord is better than chaos. I am okay with this.
<haypo>
fijal: but most additions to the C API are mistakes, and slowly all core developers become aware of the issue, and i'm proposing a practical solution to detect more easily API additions: reorganize header files in Include/ of CPython
<arigato>
fijal: getting glued in some discussion about the C API and CPython :-)
<arigato>
would be cool if recognized mistakes were marked as such in the documentation
<haypo>
fijal: about the Unicode API, I dislike the idea of removing all functions added in 3.3. for example, i prefer the newly added PyUnicode_FindChar() ("black box API") rather than Py_UNICODE_MATCH() which relies on the Py_UNICODE* type
<haypo>
fijal: Py_UNICODE type was 16-bit or 32-bit depending on the platforms, which was problematic
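[editor's note, not part of the log] A minimal sketch of why a 16-bit versus 32-bit code unit is a portability problem: the same string needs a different number of units, so C code that indexes a raw buffer sees different lengths on different platforms. The helper `code_units` is a hypothetical name, and it uses Python's standard codecs as a stand-in for the C-level buffer rather than the C API itself.

```python
def code_units(text, unit_bytes):
    """Count the code units needed to store text with 2- or 4-byte units."""
    # Assumption: UTF-16 models the 16-bit case, UTF-32 the 32-bit case.
    codec = {2: "utf-16-le", 4: "utf-32-le"}[unit_bytes]
    return len(text.encode(codec)) // unit_bytes


snake = "\U0001F40D"  # a character outside the Basic Multilingual Plane

narrow = code_units(snake, 2)  # stored as a surrogate pair
wide = code_units(snake, 4)    # stored as a single unit
```

Text limited to the Basic Multilingual Plane gives the same count either way, which is probably why such bugs only showed up with rarer characters.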
<fijal>
haypo: more PyUnicode_4BYTE_DATA and friends
<haypo>
mattip: i am not aware of the PyDict_Next issue?
<haypo>
fijal: "fijal> haypo: if there is no feasible way to stop adding new APIs and start removing old (bit by bit), then there is point in discussion" i don't understand why you are so "final" (i'm not sure about the translation :-/). it's not because we have 2 issues that we cannot fix 1 issue
<fijal>
haypo: ok, can we remove PyUnicode_4BYTE_DATA?
<fijal>
and all the other things that clearly leak internal details
<fijal>
or not *remove* but mark as a mistake in documentation
<fijal>
INTERNAL ONLY DONT USE
<haypo>
arigato, fijal : about PyObject_GetItem vs PyList_GetItem, i understood that PyList_GetItem() was exposed to make CPython internals as efficient as possible. i'm not sure that it makes sense to use PyList_GetItem() (instead of PyObject_GetItem) in C extensions
<fijal>
haypo: so what about leaking internal data from unicode?
<haypo>
PyUnicode_ClearFreeList() <= this function is public? oops
<haypo>
"PyUnicode_4BYTE_DATA" i don't know what to do with such API. it can be used to produce the most efficient C code to produce/consume Unicode strings, but it obviously "hurts" the C API, it requires that PyPy implements the API
<haypo>
we have to draw a line somewhere between speed and "portability"
<fijal>
haypo: so if you ask why am I being so extreme
<kenaan>
arigo default a38744e20a86 /pypy/module/pyexpat/__init__.py: Add missing import. No clue why, it is not necessary to run the tests???
<fijal>
it's because if it's not obvious that a function that no one uses yet that leaks all the internal details very badly needs to be hidden, then we really have no point of reference
<haypo>
PyUnicode_4BYTE_DATA is also one example of API which makes it more complicated to write code working on Python 2 & Python 3 for example
<fijal>
which is why noone uses it
<fijal>
but if you make people move to python 3 someone will
tbodt has joined #pypy
<fijal>
but this is like the very obvious example - it's incredibly bad for us and underutilized
<fijal>
BUT, it needs to be there for SPEED
<haypo>
fijal: "so ALL OF THE INTERNAL UNICODE STUFF should be deprecated before someone starts using it" i don't understand why you didn't come to python-dev to propose that?
<fijal>
haypo: resignation?
<fijal>
I find it very hard for python-dev to agree on something that's obviously useful and not hard to agree on
<fijal>
this is at least a bit contentious
yuyichao_ has quit [Ping timeout: 248 seconds]
<fijal>
haypo: to give you concrete examples. I failed to convince people that python code and _ C module should have the same behavior
<arigato>
haypo: fwiw, some of the "optimizations" like PyList_GetItem vs PySequence_GetItem make no sense as optimization: the latter could be easily improved by a special case for lists and tuples (the common cases), after which the performance is just the same
<haypo>
even if C extensions are modified to use cython, it would remain useful for CPython that Cython would use a new C API, so CPython internals can evolve
rokujyouhitoma has joined #pypy
<fijal>
haypo: we're not even discussing removing it right? just marking it as internal
<arigato>
haypo: (and PyList_GetItem also checks the type first, it just raises an exception)
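[editor's note, not part of the log] A rough way to observe this from Python, assuming a CPython interpreter where `ctypes.pythonapi` is available (it is CPython-specific). The wrapper names `generic_getitem` and `list_getitem_rejects` are hypothetical. Because PyList_GetItem returns a borrowed reference, the sketch only exercises its error path and never a successful lookup.

```python
import ctypes

api = ctypes.pythonapi  # CPython-only handle to the interpreter's C API

# PySequence_GetItem returns a new reference, which py_object takes over.
api.PySequence_GetItem.argtypes = [ctypes.py_object, ctypes.c_ssize_t]
api.PySequence_GetItem.restype = ctypes.py_object

# PyList_GetItem returns a borrowed reference; only its failure case is used here.
api.PyList_GetItem.argtypes = [ctypes.py_object, ctypes.c_ssize_t]
api.PyList_GetItem.restype = ctypes.py_object


def generic_getitem(seq, index):
    """Index any sequence through the generic C API entry point."""
    return api.PySequence_GetItem(seq, index)


def list_getitem_rejects(obj):
    """Return True if PyList_GetItem refuses obj because it is not a list."""
    if isinstance(obj, list):
        return False  # skip the call: a borrowed result would be mishandled
    try:
        api.PyList_GetItem(obj, 0)
    except SystemError:
        return True  # the type check inside PyList_GetItem fired
    return False
```

The generic call works for both lists and tuples, while the list-specific call still pays for a type check and fails on a tuple.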
<fijal>
so cython can choose to use it anyway
<haypo>
pjenvey: "you're going to have a long transition anyway" well, don't tell anyone, but i hope that most C extensions will work *unchanged*
<fijal>
haypo: so what are you trying to achieve, actually?
<fijal>
if most extensions would work unchanged, what are you achieving compared to status quo?
<haypo>
simpson: "... I dislike about Python where I know that I won't be able to convince a majority of core devs of my position" are you sure that CPython core developers are the issue? IMHO the root issue is called "backward compatibility". we all still want to "break" (change) everything for good reasons :)
<haypo>
pjenvey: "there definitely isn't much of a carrot w/ your proposed PEP (sorry) =]" hum, it seems like i failed to explain the advantages for CPython itself, but FYI most if not all CPython core developers like my idea of a new opt-in C API
<haypo>
simpson: "I'm starting to think that a non-trivial chunk of the Python community just isn't interested in better infrastructure." i don't see why this statement would be specific to Python
rokujyouhitoma has quit [Ping timeout: 240 seconds]
<fijal>
simpson: he's right, I don't think it's python specific
<arigato>
haypo: as far as we're concerned, the goal "most C extensions will work unchanged" is incompatible with "things that would help e.g. pypy": given the precondition, anything you change is just additional work for pypy
<haypo>
antocuni: "I think that it was because tagged pointers mostly save allocations, but allocations are cheap on pypy" i'm not sure that allocations are cheap in CPython
<antocuni>
haypo: I'm referring to pypy
<fijal>
haypo: so our study does not have anything to say about cpython really
<haypo>
antocuni: i spent 6 months to modify CPython internals (_PyObject_FastCall) to avoid allocating a tuple to pass positional arguments: it made C functions 20 ns faster, which is significant when the function took only 100 ns
<haypo>
fijal: how did you solve the "static PyTypeObject" issue in PyPy?
<fijal>
we create pytypeobject and populate it with wrappers
<fijal>
it was quite a bit of work
<fijal>
but it's done
<haypo>
pjenvey: well the PEP says the goal is to limit the C API for the sake of potential VM optimizations -- not "to make C extensions more usable on PyPy" <= yes, because i'm not sure that we can combine both goals. according to this discussion, it's just impossible :-)
<fijal>
haypo: right so you came to us to ask "how does it help pypy" and the answer is "it doesn't"
<fijal>
if you want to limit C API to make optimizations, that's perfectly fine
<fijal>
but has nothing to do with us
<haypo>
fijal: "new API" currently, it seems that it means replacing macros with functions
<haypo>
fijal: i also noticed the "static PyTypeObject" issue
<fijal>
for what it's worth VERY few people are writing new C extensions
<fijal>
using C API directly
<fijal>
with a rare exception of language bindings (e.g. rust)
<haypo>
arigato: "would be cool if recognized mistakes were marked as such in the documentation" hum, it seems like the issue wasn't annoying enough to motivate anyone to deprecate APIs
<mattip>
arigato: still around? about xml issue 2641, I will only be able to get to a windows machine next week
<haypo>
arigato: maybe the fact that i'm coming to you now shows that the issue becomes annoying enough :)
<mattip>
if you could download local5_8.zip and check the expat there that might give us a clue
<arigato>
mattip: right, yes, my Windows VM still has an older localxxx.zip
<arigato>
mattip: I'll try (but zzz tonight)
<mattip>
:( sorry for the mess
<haypo>
fijal: "can we remove PyUnicode_4BYTE_DATA?" my plan is to clarify what was designed to only be used by CPython internally, and what should be public
<fijal>
so let's mark it as such?
<haypo>
fijal: but you need a transition plan for that, you cannot just "break" the API, to "respect the community" (to "avoid a Python 4 chaos"), or i don't know how to explain that
<fijal>
you don't need that for marking it in documentation
<fijal>
"this is an internal API, but it won't be removed, don't use it"
<haypo>
fijal: FYI nobody complained about PyUnicode_ClearFreeList() on python-dev nor the bug tracker, so nothing was done
<fijal>
clearly it was never meant to be a public API
<fijal>
so maybe the procedure for exposing the API makes it public by default and that should be changed?
<haypo>
fijal: " to give you concrete examples. I failed to convince people that python code and _ C module should have the same behavior" there is PEP 399 which requires that, no?
<haypo>
fijal: *i* don't want to remove PyList_GetItem nor PyList_GET_ITEM because i would like to minimize required changes
<haypo>
fijal: "if most extensions would work unchanged, what are you achieving compared to status quo?" a concrete issue is that you cannot run a GTK application using system python3-dbg because the ABI is different
<haypo>
fijal: "we create pytypeobject and populate it with wrappers" "it was quite a bit of work" "but it's done" hum, on one side you complain that i give you too much work, on the other side i'm trying to change CPython in a way where you wouldn't have to work around all these CPython implementation detail issues :)
<mattip>
failing test on own builds - TypeError: getcalldescr() got an unexpected keyword argument 'calling_graph'
<haypo>
fijal: "for what is worth VERY few people are writing new C extensions" oh really? that's good to know :-) to be honest, i don't know how numpy, cython, PyGtk, MySQL-Python, etc. are implemented
<haypo>
or lxml?
<fijal>
haypo: which of those is new?
<haypo>
fijal: [ .. PyUnicode_4BYTE_DATA? .. ] "so let's mark it as such?" for PyUnicode_4BYTE_DATA, yeah, i think that it's perfectly reasonable :-)
gutworth has joined #pypy
<haypo>
fijal: maybe we can just remove it from the new C API?
<haypo>
fijal: to decide what to do with one specific API, my plan is to test the top 100 most popular C extensions and make sure that nothing breaks
<fijal>
a mark in docs would do, really
<haypo>
fijal: i mean, check how many C extensions fail, for example it's acceptable if a change only breaks 1%; we can fix the only failing C extension
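[editor's note, not part of the log] A minimal sketch of that kind of survey, assuming a plain textual scan of extension sources is an acceptable first approximation of "does it break" (actually building and testing each extension would be the real check). The names `find_flagged_calls`, `breakage_report`, `FLAGGED` and `SAMPLE` are hypothetical, and the sample sources are made up for illustration.

```python
import re


def find_flagged_calls(source, flagged):
    """Return the flagged API names that appear as identifiers in C source."""
    names = set(re.findall(r"\b[A-Za-z_]\w*\b", source))
    return sorted(names & set(flagged))


def breakage_report(extensions, flagged):
    """Return (affected extensions with the names they use, affected fraction)."""
    affected = {}
    for name, source in extensions.items():
        hits = find_flagged_calls(source, flagged)
        if hits:
            affected[name] = hits
    rate = len(affected) / len(extensions) if extensions else 0.0
    return affected, rate


# Hypothetical inputs, for illustration only.
FLAGGED = {"PyUnicode_4BYTE_DATA", "PyUnicode_ClearFreeList"}
SAMPLE = {
    "ext_a": "item = PySequence_GetItem(seq, 0);",
    "ext_b": "Py_UCS4 *p = PyUnicode_4BYTE_DATA(s);",
}

affected, rate = breakage_report(SAMPLE, FLAGGED)
```

Comparing `rate` against a threshold such as the 1% mentioned above would then decide whether a removal looks acceptable.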
<haypo>
fijal: "you don't need that for marking it in documentation" oh, i fear that nobody reads the doc, nor takes care of warnings ;-)
<fijal>
no, people do
<fijal>
you can't find it otherwise
<fijal>
having it internally marked in docs is a very good start
<mattip>
+1, also my experience, the docs are well respected
<haypo>
ok
<fijal>
or having it not in docs, but somewhere else (internal stuff)
<haypo>
fijal: "haypo: which of those is new?" sorry, i don't understand, i'm lost in the discussion :)
<fijal>
haypo: so yes, if you can do one thing, would be to mark in docs all the stupid functions that are internal-only
<haypo>
(I succeeded to read the backlog again, sorry, i was slow, the discussion was productive :-))
<fijal>
haypo: I said specifically "*new* C extensions"
<haypo>
fijal: ok, we can do that, sure
<fijal>
which means when people decide to write new software
<haypo>
fijal: ah
<haypo>
fijal: the problem is the backward compatibility, which means legacy stuff :)
<fijal>
it's definitely used, less so than you think
<fijal>
right
<fijal>
but that means *new* APIs are not used that much
<fijal>
especially py3.3+ APIs
<haypo>
don't underestimate the cost of the technical debt
<fijal>
I certainly don't
<fijal>
we spent very significant amount of effort on cpyext
<haypo>
everyday i realize that the "py3 vs py2" issue is worse than what i expected :)
<haypo>
hehe
<haypo>
ok, fine
<fijal>
but yes
<fijal>
marking PyUnicode_4BYTE_DATA as INTERNAL USE in docs or having it moved to internal cpython docs would be a very productive outcome indeed
<antocuni>
or, to say it differently: let's say that we can find a subset of the C API which is possible to implement efficiently on PyPy, and mark it somehow (either in the docs or in the code); and let's say that we can modify the 100 most popular C extensions to use just that. Then, 99% of the work would be done
<fijal>
*and* having it done by default (where API is not public unless specified) for new APIs
<fijal>
antocuni: well, that can't be done
<haypo>
fijal, arigato, simpson, pjenvey, mattip : again, thank you for your help. i took a lot of notes. i will try to include most of them in my next PEP proposal. i will try to come back to you once the PEP will be written
<fijal>
antocuni: but it can be done to WORK
<arigato>
:-)
<fijal>
(which does not cease to amaze me tbh)
<antocuni>
fijal: you mean that it can work but not fast?
<haypo>
i will add a section "Remove the C API" since, yeah, it's an obvious alternative, even if i forgot it
<fijal>
antocuni: yes, but I'm constantly amazed it works at all
<antocuni>
true enough
<fijal>
haypo: "remove and replace"
<fijal>
repeal and replace the C API
<fijal>
bad joke, but quite apt
<haypo>
i'm sure that someone else will ask me why i don't take this path, so it should be helpful to explain why it's not my favorite option
<haypo>
fijal :)
<fijal>
"replace with cython" is really not the worst
<fijal>
like, I would add a special paragraph about that, even if to discuss "why not" and what would it take
<haypo>
my practical problem is that my "new C API" PEP becomes a giant monster with long tentacles
<fijal>
it has to
<haypo>
i have to find a way to split it into smaller PEPs
<fijal>
haypo: the 3 key extensions that would need to be ported are numpy, cython and cffi
<fijal>
everything else either stays on python 2 or can be ported with just peer pressure
<haypo>
for example, i want to write a PEP just to split Include/ headers to clarify public vs private APIs
<mattip>
antocuni, fijal: even more amazing than the fact that cpyext works, is the fact that we can extensively test it three ways: untranslated/translated/with cpython
<fijal>
haypo: so can we start with a documentation PEP?
<haypo>
fijal: numpy & cython were already on my list, cffi was only mentioned. ok, i will explicitly add cffi to my list of extensions that must be adapted
<fijal>
haypo: "document internal APIs as such" <- that might even work
<fijal>
especially ones rarely used
<mattip>
numpy has a nice model for internal vs public APIs
<fijal>
mattip: kudos to you and ronan
<simpson>
haypo: I do not have a single prepared list of my demands; I'll write one up, but I'm not sure if sharing it with python-dev or python-ideas is going to be productive.
<haypo>
fijal: i don't know. right now, i'm exhausted. i have to take time to analyze that and try to organize my ideas
<fijal>
cool
<simpson>
In the meantime, thank you so much for this discussion. I learned a lot.
<fijal>
simpson: for what it's worth, I'm moving forward with quill
<haypo>
simpson: i'm sorry that you got a bad feedback on python-dev in the past
<haypo>
simpson: tell me if i can help you
<haypo>
i can be your bridge to python-dev :)
<haypo>
if you fear dragons, and i can totally understand that, i do fear them as well!
<pjenvey>
the docs are great for hitting folks over the head with, e.g. numpy might have used something private but once pointed out to them they're very quick to fix it
<haypo>
FYI we started to use the GCC attribute to deprecate functions
<simpson>
haypo: I have never brought much of this forward beyond a few personal conversations with core devs and other folks core to the ecosystem. Don't be surprised if some of it is a little radical.