cfbolz changed the topic of #pypy to: PyPy, the flexible snake (IRC logs: https://quodlibet.duckdns.org/irc/pypy/latest.log.html#irc-end ) | use cffi for calling C | if a pep adds a mere 25-30 [C-API] functions or so, it's a drop in the ocean (cough) - Armin
<mattip>
and we probably should not be discarding warmup, we never did resolve that in the pyperf benchmark suite
<cfbolz>
yep
suhdonghwi has quit [Remote host closed the connection]
<arigato>
french windows, pypy3: x = time.tzname
<arigato>
>>>> (x,x,x,x)
<arigato>
('Europe de l\Ufffff44fuest (heure d\Ufffff4e9t\u79c0', 'Europe de l\Ufffff44fuest (heure d\Ufffff4e9t\x39', 'Europe de l\Ufffff44fuest (heure d\Ufffff4e9t\x34', 'Europe de l\Ufffff44fuest (heure d\Ufffff4e9t\x66')
<arigato>
sorry that's x = time.tzname[0]
<arigato>
note how each time it reprs the same string differently at the end
adamholmberg has joined #pypy
tsaka__ has joined #pypy
<cfbolz>
Ugh
<arigato>
I have no clue!
<arigato>
if I write any expression, even ['%s' % x, '%s %s' % (x, x), x], which uses x four times, then I get the same 4 endings as above, consistently, in order
<arigato>
>>>> ord(x[-1]), x[-1]
<arigato>
(0, '')
<arigato>
>>>> ord(x[-1]) == 0, x[-1] == '\0'
<arigato>
(True, False)
<arigato>
I bet it's very bogus utf8 that confuses everything
<cfbolz>
arigato: we should have a __pypy__ function that gives us the underlying string as bytes for situations like this
dddddd has joined #pypy
bitbit has joined #pypy
_stian has joined #pypy
_stian has quit [Remote host closed the connection]
__stian_ has joined #pypy
__stian_ has quit [Remote host closed the connection]
__stian_ has joined #pypy
<__stian_>
@shunning result = rbigint(source.digits[:source.size], source.sign, source.size) is most correct.
adamholmberg has quit [Remote host closed the connection]
xcm has quit [Read error: Connection reset by peer]
xcm has joined #pypy
adamholmberg has joined #pypy
jvesely has joined #pypy
<arigato>
cfbolz: good idea
<mattip>
arigato: can you check what pyinteractive does?
<arigato>
I guess just the tests in module/time would fail on my windows, but every time I try to run py.test I first need to spend an hour fixing random problems
<mattip>
:(
Smigwell has joined #pypy
<arigato>
but I can try pyinteractive
<mattip>
when you say "french windows" is that something I can somehow configure my english windows to do, or do I have to download a french image?
<arigato>
I know of no way to do that, no
<arigato>
I just happen to have a windows in french here
<arigato>
OK, pyinteractive says W_UnicodeObject('Europe de l\x92Ouest (heure d\x92\xe9t\xe9)\x00')
<arigato>
that's latin1 or similar
<mattip>
the trailing \x00 should go away if you pull the latest HEAD
<arigato>
cool
<arigato>
the rest is "just" a matter of doing some encoding instead of just space.newtext(random_byte_string)
<arigato>
well some decoding I guess
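The pyinteractive repr suggests the time zone name arrives as bytes in the Windows ANSI code page rather than as UTF-8. A minimal sketch of the difference in plain Python 3 (not PyPy interpreter code), assuming the code page is cp1252, which the log does not state, and using a hypothetical helper name `decode_tzname`:

```python
# Bytes as shown by pyinteractive for time.tzname[0] on the French Windows box.
raw = b"Europe de l\x92Ouest (heure d\x92\xe9t\xe9)\x00"


def decode_tzname(data, encoding="cp1252"):
    """Hypothetical helper: drop the trailing NUL, then decode explicitly.

    cp1252 is an assumption; the real code page depends on the Windows locale.
    """
    return data.rstrip(b"\x00").decode(encoding)


# The same bytes are not valid UTF-8, so treating them as text unchecked
# produces a broken string.
try:
    raw.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False

name = decode_tzname(raw)
print(valid_utf8, name)
```

The point is only that a decoding step with a known encoding has to sit between the OS call and the text object.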
<arigato>
I also think it's time to fix unicodeobject.py:59: remove the "if sys.platform=='win32'" and fix things
<mattip>
+1
<mattip>
also to remove the rpython unicode use in posix calls (unicode traits)
<arigato>
or at least stop using it in pypy
<mattip>
yeah, sorry, I meant in pypy's posix calls
<arigato>
hum wait
<mattip>
we should have a utf8 traits or so
<arigato>
we still want to call the XxxW() functions from the Windows API, right? and these take UTF16, which is what rpython unicodes are
<mattip>
ahh. but then we should be calling the unicode helpers to convert to/from utf16, not rpython_str.{en,de}code
<mattip>
so maybe a utf16 traits
<arigato>
but unicode == utf16 on windows
<arigato>
rpython unicodes, that is
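For reference, the conversion the XxxW() functions need is from UTF-8 bytes to 16-bit UTF-16 code units. A rough sketch in plain Python 3, with a hypothetical helper name `utf8_to_utf16_units`; it illustrates the encoding only and is not how rpython does it:

```python
def utf8_to_utf16_units(utf8_bytes):
    """Hypothetical helper: UTF-8 bytes -> list of 16-bit UTF-16 code units."""
    encoded = utf8_bytes.decode("utf-8").encode("utf-16-le")
    return [
        int.from_bytes(encoded[i:i + 2], "little")
        for i in range(0, len(encoded), 2)
    ]


# Characters in the Basic Multilingual Plane take one code unit each.
bmp = utf8_to_utf16_units("\xe9t\xe9".encode("utf-8"))

# Characters above U+FFFF take two code units (a surrogate pair).
astral = utf8_to_utf16_units("\U0001F40D".encode("utf-8"))

print([hex(u) for u in bmp], [hex(u) for u in astral])
```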
jcea has joined #pypy
<mattip>
I don't remember the details, but there are some places we encode/decode/encode to make a posix call
<arigato>
as far as I see, it'll get the utf8 bytes, and call the rpython .decode('utf-8') on it (interp_posix.py:75)
<arigato>
which gives an rpython unicode
<arigato>
this seems to be correct
<arigato>
CPython does the same but caches the utf16 on the PyUnicodeObject too
<arigato>
no, wrong
<mattip>
it goes through FileEncoder/FileDecoder, and as_bytes or as_unicode is called from rpython
<arigato>
if you call e.g. posix.unlink("some_unicode") it goes through FileEncoder indeed
YannickJadoul has quit [Remote host closed the connection]
<arigato>
-> realunicode_w()
<arigato>
-> rpython_str.decode('utf-8') again
<mattip>
+1, I wanted to kill realunicode_w but didn't get around to it
<arigato>
be careful, it does something slightly different than FileDecoder.as_unicode() with respect to surrogates
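As a general illustration of why surrogate handling matters here, standard Python 3 codecs already give three different answers for the same kind of input depending on the error handler. This sketch makes no claim about what realunicode_w() or FileDecoder.as_unicode() actually do; `try_decode` is a hypothetical helper:

```python
def try_decode(data, errors):
    """Hypothetical helper: decode as UTF-8, returning None on failure."""
    try:
        return data.decode("utf-8", errors)
    except UnicodeDecodeError:
        return None


# UTF-8-style encoding of the lone surrogate U+DC00.
lone = b"\xed\xb0\x80"

strict = try_decode(lone, "strict")            # rejected
passed = try_decode(lone, "surrogatepass")     # kept as a lone surrogate

# A byte that is not UTF-8 at all, as can appear in file names.
escaped = try_decode(b"\xff", "surrogateescape")

print(strict, ascii(passed), ascii(escaped))
```

Two code paths that differ only in the error handler will agree on ordinary names and disagree on exactly these edge cases.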
<mattip>
right, it was too tricky and failed even more tests when I touched it
<mattip>
right now we are at "it mostly works, don't touch"
<mattip>
moving forward is a major effort that should be sponsored
<arigato>
so do we have a case where it doesn't work?
<arigato>
realunicode_w() is a relic of the pre-utf8 world, but it might still be needed on Windows to convert directly to utf16 for filenames
tsaka_ has joined #pypy
<arigato>
same about the explicit .decode('utf-8'), I guess
<mattip>
gotta run but there are failing unicode related tests on win32
<arigato>
do you want me to have a look? I'm on Windows right now
<mattip>
(we lost our issue labels - I filed an issue with heptapod)
<mattip>
frustrating because I don't know which of the posix calls is giving the wrong answer: I am not sure the file is even being created with the correct name anymore
<arigato>
space.newfilename() is suspicious as an API, because it always takes a bytes argument
<arigato>
os.listdir() is already sending raw stuff to space.newtext() and crashing if I do the check in unicodeobject.py
jcea has quit [Remote host closed the connection]
<arigato>
re issue 3134: open() is doing the right thing, so I guess it's exists() that doesn't
<mattip>
it hits one of the stat calls, which seems to be wrong
<arigato>
no, exists() is a pure python function that checks os.stat()
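A minimal sketch of that shape, modelled on the standard library's os.path.exists(); the exact exception list is an assumption and may differ between versions:

```python
import os


def exists(path):
    """Return True if os.stat() succeeds on path, False otherwise."""
    try:
        os.stat(path)
    except (OSError, ValueError):
        return False
    return True


print(exists(os.getcwd()))
```

Because the check is only a thin wrapper, a wrong answer for a non-ASCII name points at how os.stat() receives the path, not at exists() itself.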
<arigato>
stat() in interp_posix.py is called with a Path object, which has various attributes including "as_unicode", which is the utf8 bytes in this case
<arigato>
but it's passed directly to the rposix.py api
<mattip>
right, so I remember getting to the conclusion that os.stat is doing something wrong with the fsencode/fsdecode
<mattip>
ahh, you are on it
* mattip
dinner
<arigato>
it seems that various functions do that
marky1991 has quit [Remote host closed the connection]
<arigato>
it's what the generic call_rposix() helper does
marky1991 has joined #pypy
_whitelogger has joined #pypy
marky1991 has quit [Ping timeout: 258 seconds]
jcea has joined #pypy
jacob22_ has joined #pypy
<arigato>
pushed something, it seems to help os.path.exists() do the right thing
xcm has quit [Remote host closed the connection]
xcm has joined #pypy
suhdonghwi has quit [Remote host closed the connection]