arigato changed the topic of #pypy to: PyPy, the flexible snake (IRC logs: https://quodlibet.duckdns.org/irc/pypy/latest.log.html#irc-end ) | use cffi for calling C | mac OS and Fedora are not Windows
infernix has joined #pypy
speeder39_ has joined #pypy
<kenaan> mattip unicode-utf8-py3 5a61af129d87 /pypy/objspace/std/unicodeobject.py: fix logic, remove dead code
themsay has quit [Ping timeout: 250 seconds]
themsay has joined #pypy
stillinbeta has joined #pypy
phlebas has joined #pypy
krono has joined #pypy
Alex_Gaynor has joined #pypy
avakdh has joined #pypy
starlord has joined #pypy
ronan has joined #pypy
hexa- has joined #pypy
speeder39_ has quit [Quit: Connection closed for inactivity]
jcea has quit [Remote host closed the connection]
speeder39_ has joined #pypy
[Arfrever] has quit [Quit: leaving]
[Arfrever] has joined #pypy
Garen has quit [Read error: Connection reset by peer]
Garen has joined #pypy
xcm has quit [Remote host closed the connection]
xcm has joined #pypy
dddddd has quit [Read error: Connection reset by peer]
dustinm has quit [Quit: Leaving]
dustinm has joined #pypy
agronholm has quit [Read error: Connection reset by peer]
agronholm has joined #pypy
speeder39_ has quit [Quit: Connection closed for inactivity]
Ai9zO5AP has joined #pypy
dustinm has quit [Quit: Leaving]
i9zO5AP has joined #pypy
dustinm has joined #pypy
Ai9zO5AP has quit [Ping timeout: 268 seconds]
beystef has joined #pypy
beystef has quit [Ping timeout: 246 seconds]
illume has joined #pypy
zmt01 has quit [Read error: Connection reset by peer]
zmt00 has joined #pypy
<kenaan> mattip taskengine-sorted-optionals 879181847bd8 /: close abandoned branch
<kenaan> mattip inline-taskengine 468c9599a1f6 /: close abandoned branch
<kenaan> mattip numpypy-ctypes 7dc47b5a8a95 /: close abandoned branch
<kenaan> mattip numpy-record-type-pure-python 32693f76ec8f /: close abandoned branch
<kenaan> mattip struct-double 877a16704f95 /: close abandoned branch
<kenaan> mattip dynamic-specialized-tuple 5b69b3275f06 /: close abandoned branch
<kenaan> mattip jit-sys-exc-info b1f2ea41a0c5 /: close abandoned branch
i9zO5AP has quit [Quit: WeeChat 2.3]
Ai9zO5AP has joined #pypy
oberstet has joined #pypy
illume has quit [Ping timeout: 240 seconds]
illume has joined #pypy
arigo has joined #pypy
illume has quit [Client Quit]
antocuni has joined #pypy
xcm has quit [Remote host closed the connection]
<Ninpo> mattip: hey is there a 3.6 branch of your unicode stuff? :D
<Ninpo> This edge isn't bleeding enough hehe
<mattip> Ninpo: the plan is to merge it to py3.5 soonish, then merge that to py3.6
<Ninpo> I was hoping to wake up to a full run timing result but I forgot to increase net_write_timeout on MySQL so had to start it again :|
xcm has joined #pypy
<mattip> looking at check_for_mojibake, you never use the result from decoding field_bytes
<Ninpo> I know
<Ninpo> I'm literally looking for exceptions or not.
<Ninpo> It was suggested I binascii.unhexlify once instead of on each decode call
<mattip> it seems you are detecting whether any(binascii.unhexlify(data) > 127), which could also be written as
<Ninpo> originally I was
<Ninpo> In that pound-python paste is the straight up decode as it seemed quicker
<mattip> for c in data[::2]: if c > '7': #found non-ascii
<Ninpo> What's that doing?
<Ninpo> I still need to care if it's utf8 non ascii, not just non ascii
<Ninpo> is data indexable if it's bytes?
<mattip> yes
<Ninpo> so you're skipping every other character there? and seeing if it's higher than '7'? why '7'?
<mattip> if a c is ascii, then ord(c) < 128, so its hex repr is < '80'
<Ninpo> OH you're saying crawl the hex value
<Ninpo> derp
<Ninpo> and hex is two digits ofc
<Ninpo> penny just dropped
<Ninpo> Is there a magic number that's _definitely_ utf8 not latin1?
<Ninpo> 256 or higher in hex?
<mattip> no, 255 is the biggest 2-char hex you can have
<Ninpo> valid cp1252 I meant
<Ninpo> oh right
<Ninpo> hrm
<Ninpo> so still falls down on looking for utf8?
<mattip> try to grok the table at Description here https://en.wikipedia.org/wiki/UTF-8
<Ninpo> I think staring at that is what led me down the binascii path
<Ninpo> otherwise to look for utf8 don't I then start testing for 110 1110 or 11110 in the hex?
antocuni has quit [Ping timeout: 240 seconds]
<Ninpo> I mean seeing if it's all ASCII sure
<mattip> do you really have many invalid utf8 sequences?
<Ninpo> yep.
<Ninpo> over 24m fields at last full check
<Ninpo> ranging from decodes properly so just needs encoding back to cp1252, to proper knackered and needs an ftfy pass
<mattip> well then maybe the best you can do is skip the ascii ones, but then you will have to unhexlify
<Ninpo> yeah
xcm has quit [Remote host closed the connection]
<Ninpo> So your snippet there can be run before any unhex happens? I'll give that a try today thank you
<Ninpo> wonder how much actual time it'll save, I'll test shortly :D
<Ninpo> Genius idea to look at the raw hex for ascii, never occurred to me
illume has joined #pypy
<Ninpo> and that won't run the risk of seeing the start of a utf8 point that starts with 1 ?
xcm has joined #pypy
<Ninpo> given '1' < '7'
<mattip> how can you know it is a utf8 point? "for b in range(127): bytes((b,)).decode('ascii')" succeeds for all b (python3)
<Ninpo> I don't understand the question
<Ninpo> Or I don't understand your earlier example/hex
<Ninpo> you were looking for every other value being higher than 7 right?
<Ninpo> That table I tried to grok says byte 1 is '1'
<Ninpo> or starts with 1
<mattip> using the table, the only way to form a two- or three- or four-byte codepoint is by having the first byte be over 127
<Ninpo> What's the '110xxxxx'?
<Ninpo> oh it's binary
* Ninpo facepalms
<mattip> first bit is 1, so it is over 127 (actually first two bits are 1 so higher than 191)
<mattip> "The first 128 characters (US-ASCII) need one byte. The next 1,920 characters need two bytes to encode"
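The bit patterns in that UTF-8 table can be read straight off the first byte of a sequence. A minimal sketch, assuming only the patterns from the table (0xxxxxxx, 110xxxxx, 1110xxxx, 11110xxx, with 10xxxxxx as a continuation byte); the function name `utf8_sequence_length` is hypothetical, and the sketch does not reject overlong or otherwise invalid lead bytes the way a strict decoder would:

```python
def utf8_sequence_length(lead_byte: int) -> int:
    """Return how many bytes a UTF-8 sequence starting with lead_byte spans.

    Returns 0 when the byte cannot start a sequence (a continuation byte
    10xxxxxx, or a pattern beyond 11110xxx). Only the bit patterns are
    checked; strict validity (e.g. overlong encodings) is not.
    """
    if lead_byte < 0x80:   # 0xxxxxxx: plain ASCII, one byte
        return 1
    if lead_byte < 0xC0:   # 10xxxxxx: continuation byte, not a start
        return 0
    if lead_byte < 0xE0:   # 110xxxxx: two-byte sequence
        return 2
    if lead_byte < 0xF0:   # 1110xxxx: three-byte sequence
        return 3
    if lead_byte < 0xF8:   # 11110xxx: four-byte sequence
        return 4
    return 0


if __name__ == "__main__":
    for text in ("A", "é", "€"):
        first = text.encode("utf-8")[0]
        print(text, hex(first), utf8_sequence_length(first))
```

Every multi-byte lead byte is 0xC0 or above, which is why the first hex digit of such a pair is always greater than '7'.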
<Ninpo> I'm kinda weak on binary/hex etc so apologies I'm not getting this immediately.
<mattip> np
<Ninpo> So in a hex sequence it's always pairs, and multiple pairs can make up a sequence. You're saying if I understand correctly, if any pair starts with a value higher than '7' it must be a sequence higher than 127?
<Ninpo> and I only need to stop on the first one
<Ninpo> well in my case, see if it decodes as utf8
<Ninpo> but if we don't find any "start" to the pairs higher than 7 it's safely ascii, no need to unhex or carry on decoding?
<mattip> +1
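The idea agreed on above could be sketched roughly as follows. This assumes the field arrives as hex-encoded `bytes` (two hex digits per original byte); the name `hex_is_ascii` is hypothetical. In Python 3, iterating over `bytes` yields integers, so the comparison is against `ord('7')` rather than the string `'7'`:

```python
import binascii

SEVEN = ord("7")  # 0x37; the digits '8', '9' and the letters a-f / A-F all sort above it


def hex_is_ascii(hex_data: bytes) -> bool:
    """Return True if every hex pair in hex_data encodes a byte below 0x80.

    Only the first digit of each pair (the high nibble) is inspected,
    so nothing is unhexlified and no intermediate object is allocated.
    """
    for high_digit in hex_data[::2]:
        if high_digit > SEVEN:
            return False  # first non-ASCII byte found, stop early
    return True


if __name__ == "__main__":
    print(hex_is_ascii(binascii.hexlify(b"hello")))
    print(hex_is_ascii(binascii.hexlify("héllo".encode("utf-8"))))
```

A `False` result only says that some byte is 0x80 or higher; whether the field is valid UTF-8 still has to be decided by unhexlifying and decoding it.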
<Ninpo> whew I get it, I think
<Ninpo> Thanks mattip
<mattip> hope it helps
<Ninpo> me too! worth a shot
<Ninpo> when it's over half a day to scan, any optimisation ends up exponential
<Ninpo> Another thing I want to investigate is pure async via trio_mysql as opposed to sqlalchemy with thread workers
<Ninpo> I know pypy is faster with pymysql than mysqldb due to the C interface
<Ninpo> So I'm using that as the dialect
<mattip> no idea
<Ninpo> I'm gonna kill this long run and conclude your branch _is_ faster overall mattip by a good amount. Projection suggests 4 hours less time than last run on v6.0
<Ninpo> my v6.0 vs v7.0 stable were roughly the same so whatever this branch does it's working :D
<Ninpo> Let me know if there's anything newer I should grab and build mattip
<mattip> four hours out of how many total?
<Ninpo> 14
<Ninpo> on first run
<Ninpo> This run currently projected to finish in 10 and change
<mattip> nice
<Ninpo> I'm gonna kill it I'm dying to try your suggestions
<Ninpo> mattip: quick question on the walking of the hex value, am I not setting myself up for a much slower operation on a large result that's all ascii? Since I'm checking every single hex sequence in that case? Or do you think that's still faster in pypy than an unhex/decode("ascii")?
<mattip> unhex/decode walks it twice, allocates an intermediary, and does more complicated checks
<Ninpo> ah I see
<mattip> in the all-ascii case you save all that. In the non-ascii case you only save decode('ascii') but I think you still win
<mattip> since you do not need to try/except and internally the code is also simpler
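For comparison, the unhex/decode approach being discussed might look like the sketch below. The function name `is_ascii_via_decode` is hypothetical and the hex-encoded `bytes` input is an assumption. The comments mark the two passes and the intermediate object mattip mentions:

```python
import binascii


def is_ascii_via_decode(hex_data: bytes) -> bool:
    """Return True if the hex-encoded field decodes cleanly as ASCII."""
    raw = binascii.unhexlify(hex_data)  # pass 1: allocates the intermediate bytes object
    try:
        raw.decode("ascii")             # pass 2: walks the data again, result discarded
    except UnicodeDecodeError:
        return False
    return True


if __name__ == "__main__":
    print(is_ascii_via_decode(b"68656c6c6f"))
    print(is_ascii_via_decode(b"68c3a9"))
```

Which variant is faster in practice depends on the interpreter and the data, so it is worth timing both on a representative sample.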
<Ninpo> Actually I don't want to keep walking data, how do I break out of this on the first char?
<Ninpo> first > 7 char
<Ninpo> Goes in my else right?
<Ninpo> if blah > 7 break and have my tests under else?
<mattip> maybe for: else
<Ninpo> I thought else fired _if_ you broke
<Ninpo> no it's the opposite isn't it
<Ninpo> for this break, else no break do
<mattip> try it and see
<mattip> on a small loop
<Ninpo> yeah you're right
<Ninpo> else happens if break doesn't
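A small loop confirms the `for`/`else` behaviour worked out above: the `else` block runs only when the loop finishes without hitting `break`. The sketch below uses a hypothetical helper, `first_non_ascii_pair`, on hex-encoded `bytes`:

```python
def first_non_ascii_pair(hex_data: bytes) -> int:
    """Return the offset of the first hex pair encoding a byte >= 0x80, or -1."""
    for offset in range(0, len(hex_data), 2):
        if hex_data[offset] > ord("7"):
            found = offset
            break        # stop walking at the first non-ASCII pair
    else:
        found = -1       # runs only if the loop never hit break
    return found


if __name__ == "__main__":
    print(first_non_ascii_pair(b"68656c6c6f"))
    print(first_non_ascii_pair(b"6162c3a9"))
```

The work that needs the non-ASCII case (unhexlify, then try decoding as UTF-8) belongs after the `break`; the all-ASCII shortcut goes under `else`.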
<Ninpo> it's about the same/slightly slower on the small dataset mattip, testing on the bigger one now
<Ninpo> comes in around a second slower with the char walk vs the unhex/decode ascii
<Ninpo> on smaller set
<Ninpo> waiting on bigger set results
Zaab1t has joined #pypy
mosajjal has joined #pypy
mosajjal has quit [Client Quit]
mosajjal has joined #pypy
<Ninpo> yeah ends up 2 minutes slower on the larger set
<Ninpo> Worthy experiment though thank you mattip
<Ninpo> If for no other reason than TIL about hex better :D
<mattip> :(
<mattip> does it give the same results?
<Ninpo> yes
<kenaan> mattip default 20486c92ed2a /rpython/memory/gc/test/test_direct.py: fix test for linux 32
Zaab1t has quit [Ping timeout: 245 seconds]
mosajjal has quit [Remote host closed the connection]
mosajjal has joined #pypy
xcm has quit [Remote host closed the connection]
xcm has joined #pypy
<bbot2> Started: http://buildbot.pypy.org/builders/pypy-c-jit-linux-x86-64/builds/6019 [mattip: force build, unicode-utf8-py3]
<bbot2> Started: http://buildbot.pypy.org/builders/own-linux-x86-64/builds/7293 [mattip: force build, unicode-utf8-py3]
<bbot2> Started: http://buildbot.pypy.org/builders/rpython-win-x86-32/builds/125 [mattip: force build, unicode-utf8-py3]
<bbot2> Started: http://buildbot.pypy.org/builders/rpython-linux-x86-64/builds/149 [mattip: force build, unicode-utf8-py3]
<bbot2> Started: http://buildbot.pypy.org/builders/pypy-c-jit-macosx-x86-64/builds/4226 [mattip: force build, unicode-utf8-py3]
mosajjal has quit [Remote host closed the connection]
mosajjal has joined #pypy
mosajjal has quit [Client Quit]
dddddd has joined #pypy
mosajjal has joined #pypy
mosajjal has quit [Client Quit]
mosajjal has joined #pypy
mosajjal has quit [Ping timeout: 246 seconds]
jcea has joined #pypy
xcm has quit [Remote host closed the connection]
Taggnostr has quit [Quit: Switching to single player mode.]
Taggnostr has joined #pypy
xcm has joined #pypy
marky1991 has joined #pypy
marky1991 has quit [Ping timeout: 250 seconds]
mosajjal has joined #pypy
<mosajjal> hi everyone! I've been in contact for official Docker image for pypy and I did some digging around
<mosajjal> this is an official quote from Docker docs:
<mosajjal> While it is preferable to have upstream software authors maintaining their corresponding Official Images, this is not a strict requirement. Creating and maintaining images for Official Images is a public process. It takes place openly on GitHub where participation is encouraged. Anyone can provide feedback, contribute code, suggest process changes,
<mosajjal> or even propose a new Official Image.
<mosajjal> fijal
mosajjal has quit [Remote host closed the connection]
mosajjal has joined #pypy
themsay has quit [Ping timeout: 250 seconds]
themsay has joined #pypy
<kenaan> rlamy rpath-enforceargs 2bb6bffd210f /: Close obsolete branch
Masklinn has joined #pypy
antocuni has joined #pypy
<fijal> mosajjal: hey! I'm on my phone will be back in couple h
<mosajjal> ok cool
<LarstiQ> mosajjal: for those who haven't been following along, is there a question/suggestion?
<cfbolz> mosajjal: what were the issues you found with the current docker image?
<rguillebert> the issue probably is that they haven't been updated to PyPy 7.0 yet
<cfbolz> There is a pull request for 7.0
<cfbolz> So it's probably going to be available soon
<cfbolz> rguillebert, mosajjal ^^
mosajjal has quit [Ping timeout: 272 seconds]
mosajjal has joined #pypy
<mosajjal> the problem is, official pypy repo in Docker hub isn't maintained by pypy team
<cfbolz> mosajjal: yes, we don't have the bandwidth for that. We don't maintain Debian packages ourselves either
<antocuni> I think it's "official" because it is maintained by the docker guys; it's not that the ubuntu image is maintained by canonical
<mosajjal> I think a VPS will do the trick. After all, you guys don't have a lot of periodic releases anyway. Maybe once a month or sth?
<mosajjal> antocuni thing is, Docker is only doing that because pypy team isn't. Also, they don't have ARM64 release
<rguillebert> I think cfbolz was talking about mental bandwidth :)
<rguillebert> not data bandwidth
<mosajjal> lol
<antocuni> mosajjal: I suppose that if you volunteer maintaining it, we would be happy to give you the necessary permissions :)
<mosajjal> That's cool. I'll work on building an automated script to build and push images. I'll try it on my own and will let you know if it's stable enough
<mosajjal> I should try to be better than the official package first. I believe it's bloated and the default jessie image is way too big for pypy
<cfbolz> Yes, there's an issue on the repo to use alpine as the base
<cfbolz> But nobody finished the work
<mattip> do the 16.04 (jessie, ubuntu) binaries work on alpine?
<rguillebert> probably not, alpine doesn't use glibc
<mosajjal> cfbolz alpine would be a lot of work. Without glibc, making pypy is almost impossible (I've looked into it before)
<mosajjal> On the other hand, tinycore is good IMO. only 9M and it has glibc and a package manager
<bbot2> Failure: http://buildbot.pypy.org/builders/pypy-c-jit-linux-x86-64/builds/6019 [mattip: force build, unicode-utf8-py3]
<cfbolz> mosajjal: according to the issue PyPy2 works on alpine
<mosajjal> I wonder what patches are needed to build pypy for Alpine
<mosajjal> maybe there's another approach to this: getting pypy on Alpine's community repository and then try to slim down the Docker container
<mosajjal> I'm gonna test it for 7.0.0
<fijal> mosajjal: it does not ship arm64 because pypy does not support arm64
<bbot2> Failure: http://buildbot.pypy.org/builders/pypy-c-jit-macosx-x86-64/builds/4226 [mattip: force build, unicode-utf8-py3]
xcm has quit [Read error: Connection reset by peer]
xcm has joined #pypy
marky1991 has joined #pypy
marky1991 has quit [Remote host closed the connection]
marky1991 has joined #pypy
<bbot2> Failure: http://buildbot.pypy.org/builders/rpython-linux-x86-64/builds/149 [mattip: force build, unicode-utf8-py3]
gaze__ has joined #pypy
oberstet has quit [Remote host closed the connection]
marvin has quit [Remote host closed the connection]
marvin has joined #pypy
marky1991 has quit [Ping timeout: 268 seconds]
marvin has quit [Remote host closed the connection]
marvin has joined #pypy
<bbot2> Failure: http://buildbot.pypy.org/builders/own-linux-x86-64/builds/7293 [mattip: force build, unicode-utf8-py3]
xcm has quit [Remote host closed the connection]
xcm has joined #pypy
Masklinn has quit []
Zaab1t has joined #pypy
marky1991 has joined #pypy
Ai9zO5AP has quit [Ping timeout: 244 seconds]
Ai9zO5AP has joined #pypy
Masklinn has joined #pypy
illume has quit [Quit: My MacBook Pro has gone to sleep. ZZZzzz…]
Curi0 has quit [Ping timeout: 246 seconds]
Curi0 has joined #pypy
antocuni has quit [Ping timeout: 250 seconds]
xcm has quit [Remote host closed the connection]
xcm has joined #pypy
<bbot2> Failure: http://buildbot.pypy.org/builders/rpython-win-x86-32/builds/125 [mattip: force build, unicode-utf8-py3]
sknebel is now known as skenbel
skenbel is now known as sknebel
<mattip> cfbolz: it seems something is off with the arm32 jit backend, there are multiple failing tests
<mattip> maybe since the merge of regalloc-playground ?
<cfbolz> mattip: yes, would fit timing wise, and the general area of the code
<cfbolz> Will take a look tomorrow
kipras has quit [Read error: Connection reset by peer]
kipras has joined #pypy
kipras has quit [Read error: Connection reset by peer]
dmalcolm has quit [Ping timeout: 246 seconds]
kipras has joined #pypy
dmalcolm has joined #pypy
<kenaan> cfbolz default 26471ac5ce5f /rpython/jit/backend/arm/regalloc.py: try to fix arm
<cfbolz> mattip: probably fixed it, let's see what the build says
_whitelogger has quit [Remote host closed the connection]
_whitelogger_ has joined #pypy
themsay has quit [Ping timeout: 250 seconds]
dan- has joined #pypy
dan- has quit [Changing host]
dan- has joined #pypy
igitoor has quit [Changing host]
igitoor has joined #pypy
<mattip> cfbolz: +1
themsay has joined #pypy
themsay has quit [Ping timeout: 240 seconds]
themsay has joined #pypy
Zaab1t has quit [Quit: bye bye friends]
<mjacob> arigo: hi! i'm trying to fix _cffi_ssl on revdb. do you think that in _str_to_ffi_buffer() in lib_pypy/_cffi_ssl/_stdssl/utility.py it's okay to return `view` instead of `ffi.from_buffer(view)`?
<mjacob> is there any reason why `ffi.from_buffer(view)` should be preferred?
speeder39_ has joined #pypy
<kenaan> cfbolz promote-unicode 9617f2038bf2 /: close to-be-merged branch
<kenaan> cfbolz default 9d4fe930924e /: merge promote-unicode mostly for completeness sake: support for rlib.jit.promote_unicode, which behaves like prom...
antocuni has joined #pypy
antocuni has quit [Ping timeout: 246 seconds]
illume has joined #pypy
mosajjal has quit [Ping timeout: 250 seconds]
themsay has quit [Ping timeout: 246 seconds]
senyai has joined #pypy
_whitelogger has joined #pypy
demonimin has quit [Quit: bye]
demonimin has joined #pypy
illume has quit [Quit: My MacBook Pro has gone to sleep. ZZZzzz…]
moei has quit [Quit: Leaving...]
antocuni has joined #pypy
epelesis has joined #pypy
<tumbleweed> arigo: going to cut a cffi release? Debian is in soft freeze for buster. But I can still get things in for another week or so
kipras has quit [Ping timeout: 244 seconds]
marky1991 has quit [Ping timeout: 268 seconds]
speeder39_ has quit [Quit: Connection closed for inactivity]
kipras has joined #pypy
<kenaan> mattip unicode-utf8-py3 d89df30bad0b /pypy/objspace/std/unicodeobject.py: raise correct error
<kenaan> mattip unicode-utf8-py3 6165ec8e5e76 /rpython/rtyper/lltypesystem/rffi.py: allow surrogates in wcharpsize2utf8
antocuni has quit [Read error: Connection reset by peer]
antocuni has joined #pypy
<kenaan> mjacob py3.5-ssl-revdb 9dbb4911ba4d /lib_pypy/_cffi_ssl/_stdssl/__init__.py: Remove unnecessary variable.
<kenaan> mjacob py3.5-ssl-revdb 5c289da45ef2 /lib_pypy/_cffi_ssl/_stdssl/__init__.py: Defer creation of C buffer.
<kenaan> mjacob py3.5-ssl-revdb 664e95442ff7 /lib_pypy/_cffi_ssl/_stdssl/__init__.py: Fix _SSLSocket.read() for buffers that can’t get their raw addresses taken (e.g. when running on top of Re...
<kenaan> mjacob py3.5-ssl-revdb 08a735234778 /lib_pypy/_cffi_ssl/_stdssl/utility.py: Share code.