<mattip>
"The first 128 characters (US-ASCII) need one byte. The next 1,920 characters need two bytes to encode"
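The byte counts in that quote can be checked directly. A minimal sketch, assuming Python 3; the helper name `utf8_len` is made up for illustration:

```python
def utf8_len(ch):
    """Number of bytes the single character `ch` occupies in UTF-8."""
    return len(ch.encode("utf-8"))

# U+0000..U+007F (the first 128 code points, US-ASCII) take one byte each.
one_byte = [utf8_len("A"), utf8_len("\x7f")]

# U+0080..U+07FF (the next 0x800 - 0x80 = 1,920 code points) take two bytes.
two_bytes = [utf8_len("\x80"), utf8_len("\u00e9"), utf8_len("\u07ff")]

# U+0800 is the first code point that needs three bytes.
three_bytes = utf8_len("\u0800")

# Every byte of a multi-byte sequence has its high bit set, so in a hex dump
# the first digit of each such pair is 8 or higher.
e_acute_hex = "\u00e9".encode("utf-8").hex()
```

Here `e_acute_hex` is `"c3a9"`: both pairs start with a digit above 7, which is the property the hex scan discussed below relies on.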
<Ninpo>
I'm kinda weak on binary/hex etc so apologies I'm not getting this immediately.
<mattip>
np
<Ninpo>
So in a hex sequence it's always pairs, and multiple pairs can make up a sequence. You're saying if I understand correctly, if any pair starts with a value higher than '7' it must be a sequence higher than 127?
<Ninpo>
and I only need to stop on the first one
<Ninpo>
well in my case, see if it decodes as utf8
<Ninpo>
but if we don't find any "start" to the pairs higher than 7 it's safely ascii, no need to unhex or carry on decoding?
<mattip>
+1
<Ninpo>
whew I get it, I think
<Ninpo>
Thanks mattip
<mattip>
hope it helps
<Ninpo>
me too! worth a shot
<Ninpo>
when it's over half a day to scan, any optimisation ends up exponential
<Ninpo>
Another thing I want to investigate is pure async via trio_mysql as opposed to sqlalchemy with thread workers
<Ninpo>
I know pypy is faster with pymysql than mysqldb due to the C interface
<Ninpo>
So I'm using that as the dialect
<mattip>
no idea
<Ninpo>
I'm gonna kill this long run and conclude your branch _is_ faster overall mattip by a good amount. Projection suggests 4 hours less time than last run on v6.0
<Ninpo>
my v6.0 vs v7.0 stable were roughly the same so whatever this branch does it's working :D
<Ninpo>
Let me know if there's anything newer I should grab and build mattip
<mattip>
four hours out of how many total?
<Ninpo>
14
<Ninpo>
on first run
<Ninpo>
This run currently projected to finish in 10 and change
<mattip>
nice
<Ninpo>
I'm gonna kill it I'm dying to try your suggestions
<Ninpo>
mattip: quick question on the walking of the hex value, am I not setting myself up for a much slower operation on a large result that's all ascii? Since I'm checking every single hex sequence in that case? Or do you think that's still faster in pypy than an unhex/decode("ascii")?
<mattip>
unhex/decode walks it twice, allocates an intermediary, and does more complicated checks
<Ninpo>
ah I see
<mattip>
in the all-ascii case you save all that. In the non-ascii case you only save decode('ascii') but I think you still win
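The two approaches compared here can be sketched side by side. This is only an illustration under stated assumptions: the input is a well-formed hex string (an even number of hex digits), Python 3 is used, and the function names and the "ascii" / "utf8" / "binary" labels are invented for the example rather than taken from Ninpo's code. It has not been benchmarked on PyPy or anywhere else.

```python
import binascii

def classify_by_decoding(hex_text):
    """Baseline: unhex first, then try decode('ascii'), then decode('utf-8')."""
    raw = binascii.unhexlify(hex_text)      # allocates the intermediate bytes
    try:
        raw.decode("ascii")                 # second walk over the data
        return "ascii"
    except UnicodeDecodeError:
        pass
    try:
        raw.decode("utf-8")
        return "utf8"
    except UnicodeDecodeError:
        return "binary"

def classify_by_scanning(hex_text):
    """Look at the first hex digit of each pair; unhex only if one is above 7."""
    for i in range(0, len(hex_text), 2):
        # '8', '9' and the letters a-f / A-F all compare greater than '7',
        # so this is true exactly when the byte is 0x80 or higher.
        if hex_text[i] > "7":
            break                           # stop at the first non-ASCII byte
    else:
        return "ascii"                      # no unhex, no decode needed
    try:
        binascii.unhexlify(hex_text).decode("utf-8")
        return "utf8"
    except UnicodeDecodeError:
        return "binary"
```

In the all-ASCII case the scanning version never calls `unhexlify` or `decode`; in the non-ASCII case it skips only the failed `decode("ascii")` attempt, which matches the trade-off described above.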
<mosajjal>
hi everyone! I've been in contact for official Docker image for pypy and I did some digging around
<mosajjal>
this is an official quote from Docker docs:
<mosajjal>
While it is preferable to have upstream software authors maintaining their corresponding Official Images, this is not a strict requirement. Creating and maintaining images for Official Images is a public process. It takes place openly on GitHub where participation is encouraged. Anyone can provide feedback, contribute code, suggest process changes,
<mosajjal>
or even propose a new Official Image.
<mosajjal>
fijal
<kenaan>
rlamy rpath-enforceargs 2bb6bffd210f /: Close obsolete branch
<fijal>
mosajjal: hey! I'm on my phone will be back in couple h
<mosajjal>
ok cool
<LarstiQ>
mosajjal: for those who haven't been following along, is there a question/suggestion?
<cfbolz>
mosajjal: what were the issues you found with the current docker image?
<rguillebert>
the issue probably is that they haven't been updated to PyPy 7.0 yet
<cfbolz>
There is a pull request for 7.0
<cfbolz>
So it's probably going to be available soon
<mosajjal>
the problem is, official pypy repo in Docker hub isn't maintained by pypy team
<cfbolz>
mosajjal: yes, we don't have the bandwidth for that. We don't maintain Debian packages ourselves either
<antocuni>
I think it's "official" because it is maintained by the docker guys; it's not that the ubuntu image is maintained by canonical
<mosajjal>
I think a VPS will do the trick. After all, you guys don't have a lot of periodic releases anyway. Maybe once a month or sth?
<mosajjal>
antocuni thing is, Docker is only doing that because pypy team isn't. Also, they don't have ARM64 release
<rguillebert>
I think cfbolz was talking about mental bandwidth :)
<rguillebert>
not data bandwidth
<mosajjal>
lol
<antocuni>
mosajjal: I suppose that if you volunteer to maintain it, we would be happy to give you the necessary permissions :)
<mosajjal>
That's cool. I'll work on building an automated script to build and push images. I'll try it on my own and will let you know if it's stable enough
<mosajjal>
I should try to be better than the official package first. I believe it's bloated and the default jessie image is way too big for pypy
<cfbolz>
Yes, there's an issue on the repo to use alpine as the base
<cfbolz>
But nobody finished the work
<mattip>
do the 16.04 (jessie, ubuntu) binaries work on alpine?
<rguillebert>
probably not, alpine doesn't use glibc
<mosajjal>
cfbolz alpine would be a lot of work. Without glibc, making pypy is almost impossible (I've looked into it before)
<mosajjal>
On the other hand, tinycore is good IMO. only 9M and it has glibc and a package manager
<kenaan>
cfbolz default 26471ac5ce5f /rpython/jit/backend/arm/regalloc.py: try to fix arm
<cfbolz>
mattip: probably fixed it, let's see what the build says
<mattip>
cfbolz: +1
<mjacob>
arigo: hi! i'm trying to fix _cffi_ssl on revdb. do you think that in _str_to_ffi_buffer() in lib_pypy/_cffi_ssl/_stdssl/utility.py it's okay to return `view` instead of `ffi.from_buffer(view)`?
<mjacob>
is there any reason why `ffi.from_buffer(view)` should be preferred?
<kenaan>
cfbolz promote-unicode 9617f2038bf2 /: close to-be-merged branch
<kenaan>
cfbolz default 9d4fe930924e /: merge promote-unicode mostly for completeness sake: support for rlib.jit.promote_unicode, which behaves like prom...
<kenaan>
mjacob py3.5-ssl-revdb 5c289da45ef2 /lib_pypy/_cffi_ssl/_stdssl/__init__.py: Defer creation of C buffer.
<kenaan>
mjacob py3.5-ssl-revdb 664e95442ff7 /lib_pypy/_cffi_ssl/_stdssl/__init__.py: Fix _SSLSocket.read() for buffers that can’t get their raw addresses taken (e.g. when running on top of Re...