antocuni changed the topic of #pypy to: PyPy, the flexible snake (IRC logs: https://botbot.me/freenode/pypy/ ) | use cffi for calling C | "PyPy: the Gradual Reduction of Magic (tm)"
<kenaan>
hroncok hroncok/fix-typeerror-str-does-not-support-the-b-1514414905375 0551d0495942 /lib_pypy/pyrepl/unix_console.py: Fix: TypeError: 'str' does not support the buffer interfac...
<fijal>
This is an automatically-generated email to inform you that Domain Administrator (under handle ML7717-GANDI) has purchased a SSL Standard certificate for the domain pypy.org.
<fijal>
is it spam or what is it?
<fijal>
arigato: maybe the answer is to help squeaky and declare them official?
<arigato>
+1
<fijal>
I'll ping him on twitter
<fijal>
or mail sounds better
marr has joined #pypy
Joce has joined #pypy
Joce has quit [Ping timeout: 260 seconds]
Joce has joined #pypy
<Joce>
Hi all, what portable macos builds are you referring to? I checked squeaky's github and bitbucket and only found linux builds there
<fijal>
then maybe we need to do the work ourselves :)
<fijal>
or maybe even I need to do this
<arigato>
ah, confusion
<arigato>
I think we're actually talking about homebrew
<arigato>
which usually makes their own pypy
danieljabailey has quit [Ping timeout: 256 seconds]
danieljabailey has joined #pypy
awkwardpenguin has joined #pypy
awkwardpenguin has quit [Ping timeout: 240 seconds]
<LarstiQ>
Rotonen pasted a script to build on OSX/Windows as well
antocuni has joined #pypy
danieljabailey has quit [Ping timeout: 264 seconds]
<and1>
thanks, but I can't use non-public pypy releases
<arigato>
which is not a problem, except in one case: when memory usage goes down a lot, and you'd like to return the free memory to the OS
<LarstiQ>
and1: it's public?
<and1>
but not easy to install by an average user
<arigato>
then I can only assume that your problem is solved in this branch, and you need to come back in three months when we do the next release
<LarstiQ>
and1: right, but you're not an average user here, the point would be to see if it helps your usecase and give feedback to the developers
<LarstiQ>
or wait three months to see if something else needs to be done
<and1>
ok, I'll try it if I have time (building pypy is quite time-consuming)
<and1>
I found __pypy__.newdict('strdict') - is it a good match for a dict with string keys?
<and1>
s/dict/cache
<arigato>
this is not necessary
<arigato>
it creates a dict that is essentially just {}
<arigato>
it's possible to do advanced things using __pypy__.newdict('module')
<arigato>
because this creates a dict which is optimized for read-only key-values
<arigato>
but it is only useful if the reads out of the dict occur with constant keys
<and1>
by constant you mean == or is?
<arigato>
like typical for a module dict, which is often read with a global variable name, which is completely constant
<arigato>
"constant" here means a JIT-time constant
<arigato>
i.e. always provably the same object
<and1>
ok, so it's not very useful for dicts with words coming from random documents
<arigato>
no
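The newdict distinction discussed above can be sketched as follows. This is a hedged example: `__pypy__.newdict` is a PyPy-only builtin, so the snippet falls back to a plain dict on CPython to stay runnable anywhere, and the `'module'` strategy only pays off when lookups use JIT-constant keys, as arigato notes.

```python
# __pypy__.newdict is a PyPy-only builtin; fall back to a plain dict
# on CPython so this snippet runs everywhere.
try:
    from __pypy__ import newdict      # PyPy only
except ImportError:
    def newdict(kind):                # CPython fallback: an ordinary dict
        return {}

d = newdict('module')                 # optimized for read-mostly, constant keys
d['answer'] = 42
print(d['answer'])                    # behaves like a normal dict
```

On PyPy the resulting dict behaves like `{}` but uses a storage strategy tuned for module-style access (reads with globally constant key names).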
<and1>
I thought about implementing a simple C/C++ extension that exposes e.g. std::map. Basically the only thing that matters is allocating the values outside of pypy's gc.
<arigato>
try it out, but it's unlikely to help because you need a way to refer back to Python objects (e.g. the values of dict)
<and1>
the caches are with unicode keys and unicode values, so it will be easy
<arigato>
either you have a problem or you didn't try and are just thinking aloud
<and1>
I verified that disabling the caches, which are about 20MB in total size, reduces memory usage by hundreds of megabytes
<and1>
up to 500MB
<arigato>
are you sure it's fragmentation?
<and1>
what else could it be if the caches alone take only 20MB?
<arigato>
how did you measure 20MB?
<and1>
2*sum(len(w) for w in d.keys()+d.values())
<arigato>
if they are short strings then that's a large underestimate
<arigato>
because the dictionary is big and each string has several words of overhead
<and1>
so how could I measure it better?
<arigato>
normally I'd say "the answer is 500MB as you measured", but of course we don't want that answer here :-)
<arigato>
try to add maybe 8 words per string (64 bytes)? and multiply by 4, not 2 bytes per char
<arigato>
these 8 words should roughly count the dictionary, the W_UnicodeObject, and the RPython unicode string
<and1>
but these 64 bytes are for a dict? I have only two dicts
<arigato>
8 words = 2-3 words in W_UnicodeObject, 3 words in RPython unicode, and ~2 words inside the big dictionary
<arigato>
the big dictionary is a single large object but it should use maybe 3 words per key-value pair
<and1>
I didn't know the overhead is that big
<and1>
is it the same for bytestrings?
<arigato>
yes --- ah right, you avoid the 2-3 words in W_UnicodeObject if your dictionary contains *only* unicode strings
<arigato>
for keys
<arigato>
(not for values)
<and1>
that's the case
<arigato>
well anyway, add 6 or 7 words instead
<arigato>
it's still much more than the length of the unicode string if it is very short
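The estimate arigato suggests above (roughly 7 machine words of overhead per string plus 4 bytes per character, instead of the 2 bytes/char behind the original 20MB figure) can be sketched as a small helper. The word count is the rough approximation from the discussion, not exact PyPy internals.

```python
# Rough per-string memory estimate for a unicode-keyed cache dict:
# ~7 words (56 bytes on 64-bit) of object/dict overhead per string,
# plus 4 bytes per character.
WORD = 8                        # bytes per machine word on 64-bit
OVERHEAD = 7 * WORD             # RPython string + wrapper + dict slots (approx.)

def estimated_bytes(d):
    strings = list(d.keys()) + list(d.values())
    return sum(OVERHEAD + 4 * len(s) for s in strings)

cache = {u'hello': u'world', u'foo': u'bar'}
print(estimated_bytes(cache))   # 4 strings: 4*56 + 4*16 = 288
```

Even for this tiny example the fixed overhead dominates the character data, which is the point being made about very short strings.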
<and1>
ok thanks, I must go now and will verify if my assumptions are correct. Thanks.
<arigato>
and yes, in this kind of case, if you want to optimize memory as much as possible, use cffi to call a C or C++ version---but note that it also has overheads if you malloc() every single string
<arigato>
(lower than pypy, but still)
and1 has quit [Quit: Leaving]
<arigato>
in this case, the best approach is probably to use a specialized data structure, in C or in Python (with the array or mmap module for example)
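A minimal sketch of that "specialized data structure" idea: pack many small strings into one bytes blob plus an array of offsets, so the GC sees two large objects instead of thousands of small ones. The class name and linear layout here are illustrative assumptions; a real cache would add a hash or sorted index for lookup.

```python
from array import array

class PackedStrings:
    """Store many small strings in one blob + offset table."""
    def __init__(self, strings):
        encoded = [s.encode('utf-8') for s in strings]
        self.blob = b''.join(encoded)          # one large allocation
        self.offsets = array('L', [0])         # compact, GC-opaque offsets
        for e in encoded:
            self.offsets.append(self.offsets[-1] + len(e))

    def __getitem__(self, i):
        return self.blob[self.offsets[i]:self.offsets[i + 1]].decode('utf-8')

p = PackedStrings([u'hello', u'world'])
print(p[1])   # world
```

This trades per-string object overhead for a little decode work on access, which matches the memory-versus-fragmentation trade-off discussed above.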
<Cloud10>
I think it's a bug in PyPy. The code works perfectly in CPython.
squeaky_pl has joined #pypy
<arigato>
you're lucky, then
<Cloud10>
The docs say that "sys.setrecursionlimit(n) sets the limit only approximately, by setting the usable stack space to n * 768 bytes"
<Cloud10>
So there shouldn't be a segfault surely, arigato
<arigato>
no, it's a case where you can segfault CPython too
<arigato>
for example, try doing sys.setrecursionlimit(10**6) followed by def f(): f() and f()
<Cloud10>
But I resize the stack to be big enough
<Cloud10>
The docs promise that setrecursionlimit(n) only requires n*768 bytes
<Cloud10>
I provided over that
<arigato>
I admit that maybe the issue is threading.stack_size() which maybe has no effect
<Cloud10>
That makes sense!
<Cloud10>
Is there any reason it doesn't do anything?
<mattip>
Cloud10: which pypy/platform?
<arigato>
is that pypy2 or pypy3, and indeed, which platform
<Cloud10>
PyPy3 5.10 linux64
<Cloud10>
PyPy2 5.8 linux64 also has the problem
<Cloud10>
In fact, I *think* stack_size can only resize downwards
<Cloud10>
in PyPy that is
<arigato>
I vaguely suspect vmprof
<arigato>
with a version of pypy from Dec 19 I get a Fatal RPython error: AssertionError in the jit's handler_rvmprof_code_1
<arigato>
with "--jit off" I get a double-free error
<arigato>
Cloud10: why do you think stack_size can only resize downwards?
<Cloud10>
arigato: With set_stack(4000) using threading.stack_size(n*768) it segfaults but with threading.stack_size(n*32768) it works
<arigato>
sorry, not following
<arigato>
note that the interpretation of "*768" you make is backward, too:
<Cloud10>
How's it backwards?
<Cloud10>
setrecursionlimit sets the cap on stack in bytes to 768*n
<Cloud10>
I set the stack also to 768*n
<arigato>
yes, which means it is guaranteed to crash
<arigato>
let me explain
<arigato>
when you have a stack of N bytes, it's the total size of the stack, which includes uncontrollable bits and pieces
<arigato>
if you use setrecursionlimit to set the cap to 768*n bytes, it means that the "central" portion of the stack will not grow larger than 768*n bytes, where "central" means from the first to the last Python interpreter call
<arigato>
if that number 768*n is also the total size, then you're guaranteed to crash, because the total size is exceeded while the central portion is still smaller than 768*n
<Cloud10>
Okay, that makes sense. So I need to add some constant amount as well?
<arigato>
yes
<Cloud10>
Does 1MB sound sensible?
<arigato>
yes, likely
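The sizing rule arigato describes can be sketched like this: the thread stack must hold the recursion budget (768 bytes per level, per the docs quoted above) plus headroom for everything outside the interpreter frames; sizing it at exactly `limit*768` is guaranteed to overflow. The 1MB headroom is the rough constant from the discussion, not an exact figure, and the helper name is hypothetical.

```python
import sys
import threading

def run_with_stack(limit, target):
    """Run target() in a thread whose stack fits `limit` recursion levels."""
    headroom = 1024 * 1024                                   # ~1MB for non-interpreter stack use
    size = ((limit * 768 + headroom + 4095) // 4096) * 4096  # page-align
    threading.stack_size(size)       # applies to threads started after this call
    sys.setrecursionlimit(limit)
    t = threading.Thread(target=target)
    t.start()
    t.join()

def recurse(n):
    return 0 if n == 0 else 1 + recurse(n - 1)

out = []
run_with_stack(20000, lambda: out.append(recurse(5000)))
print(out)
```

Note that `threading.stack_size()` only affects threads created afterwards, and some platforms impose a minimum size or page-size multiple, hence the rounding.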
<arigato>
but with a very large number like 32768*n, like in the pasted code, it makes a large enough stack anyway
<arigato>
it still crashes
<arigato>
I'm retranslating a trunk version of pypy and I'll look
<Cloud10>
Is it likely to work in earlier PyPys?
<Cloud10>
I'll try PyPy 2.0 and see
<arigato>
yes, it used to work at some point, though maybe not the threading.stack_size()
<arigato>
but right now I seem to be getting strange crashes even without threads
<Cloud10>
Will stack_size get implemented at some point or will PyPy always have constant stacks?
<arigato>
it *is* implemented
<arigato>
seems that the problem is not threading at all
<Cloud10>
It works on PyPy2 2.0.1 indeed. Although it's much slower than CPython. Why's that?
<kenaan>
mattip release-pypy3.5-v5.9.x 737d3f5af2ce /pypy/module/errno/interp_errno.py: fix for win32 (grafted from 85e44c9458db62931917a86f8614d131b136aaff)
<kenaan>
hroncok release-pypy3.5-v5.9.x ac1ac8fceed5 /lib_pypy/pyrepl/unix_console.py: Fix: TypeError: 'str' does not support the buffer interface Fixes https://bitbucket.org/pypy/pypy...
<kenaan>
rlamy release-pypy3.5-v5.9.x f05e6bdccc8d /pypy/module/posix/: Fix: the 'flags' argument to setxattr() had no effect (grafted from 70dfe4f14f678cefb5bdc58ed4ac4b35...
<kenaan>
mattip release-pypy3.5-v5.9.x 291eb92c6b5d /pypy/module/: update version numbers