cfbolz changed the topic of #pypy to: PyPy, the flexible snake (IRC logs: https://quodlibet.duckdns.org/irc/pypy/latest.log.html#irc-end ) | use cffi for calling C | if a pep adds a mere 25-30 [C-API] functions or so, it's a drop in the ocean (cough) - Armin
dansan has joined #pypy
tos9 has quit [Ping timeout: 258 seconds]
Ninpo has quit [Ping timeout: 240 seconds]
jvesely has joined #pypy
Ninpo has joined #pypy
marself has quit [Ping timeout: 240 seconds]
marself has joined #pypy
tos9 has joined #pypy
tos9 has quit [Ping timeout: 244 seconds]
tos9 has joined #pypy
marself has quit [Ping timeout: 264 seconds]
marself has joined #pypy
jcea has quit [Quit: jcea]
_whitelogger has joined #pypy
todda7 has quit [Ping timeout: 260 seconds]
oberstet has quit [Remote host closed the connection]
_whitelogger has joined #pypy
jvesely has quit [Quit: jvesely]
fling has quit [Remote host closed the connection]
fling has joined #pypy
fling has quit [Quit: ZNC 1.7.2+deb3 - https://znc.in]
fling has joined #pypy
oberstet has joined #pypy
todda7 has joined #pypy
kipras`away has quit [Ping timeout: 256 seconds]
kipras`away has joined #pypy
todda7 has quit [Ping timeout: 258 seconds]
dmalcolm has joined #pypy
dmalcolm_ has quit [Ping timeout: 240 seconds]
todda7 has joined #pypy
TheNewbie has joined #pypy
TheNewbie has quit [Quit: Leaving]
rubdos has quit [Ping timeout: 240 seconds]
lritter has joined #pypy
i9zO5AP has joined #pypy
todda7 has quit [Ping timeout: 260 seconds]
todda7 has joined #pypy
todda7 has quit [Ping timeout: 240 seconds]
todda7 has joined #pypy
marself has quit [Quit: WeeChat 2.8]
Smigwell has joined #pypy
Rhy0lite has joined #pypy
exarkun has joined #pypy
<exarkun> Can I change PyPy's idea of the "filesystem encoding" after startup?
<exarkun> I just want it to be UTF-8
<exarkun> CPython has no public interface for this as far as I can tell but it has a Py_FileSystemDefaultEncoding symbol that I can manipulate with cffi
<mattip> exarkun: my guess is it will randomly break things like opening files, but who knows
<exarkun> mattip: Yea, not really interested in that, I just want to know how to do it.
<exarkun> For my application, things are already randomly broken if the encoding isn't UTF-8.
<tos9> exarkun: (I don't know the answer, but given https://foss.heptapod.net/pypy/pypy/-/blob/branch/default/pypy/module/sys/interp_encoding.py#L69 it looks like that question may be equivalent to "can you get access to the space object from interp level"?)
<exarkun> There are constraints I can't do anything about in the foreseeable future which make this so.
<tos9> because if so you can seemingly mutate that attribute on it
<exarkun> Darn. I think object space is pretty darn hard to manipulate from Python.
<mattip> I think you are supposed to do something like LC_CTYPE= 'en_US.UTF-8'
<exarkun> Yes, probably so
<mattip> in the environment before running python/pypy
<exarkun> Unfortunately you can't always count on having such an environment at process startup.
jcea has joined #pypy
<exarkun> On CPython, changing the environment after Py_Initialize runs has no effect. Looking at the linked code, I guess I doubt changing the environment would work on PyPy either ... unless you can do it before the first call to getfilesystemencoding, maybe.
<exarkun> but it looks like by the time any application code can run is too late
<mattip> so write a wrapper that sets it then calls python
<tos9> exarkun: I'm assuming you're saying you're handed an already running Python process and it's been misconfigured?
<tos9> inb4 you say this has to do with some CI provider or something
<exarkun> eh, I have a CI job for this but it's intentional
<exarkun> This is actually a program for users to run
<exarkun> not much I can do if they run it without LANG set
<exarkun> mattip: yea, sure, it'd just be waaaay easier if I could `sys.setfilesystemencoding("UTF-8")`
<simpson> rjarry: I appreciate you posting that; it's a good lesson for other language designers: Force UTF-8 and write adapters for each OS. Completely take away the ability to choose to get filesystem encodings wrong.
<mattip> by the time you can import sys it is too late
<rjarry> simpson: I'm not even sure filesystem paths should be "encoded", to me they should remain in bytes
<rjarry> in fact, nothing prevents you (on linux, AFAIK) from using non printable characters in filenames
<rjarry> well, "most of the time" it works
<simpson> rjarry: That's a Linux-only view. On other OSs, they *are* Unicode and encoded. I'm suggesting that language runtimes should paper over this difference completely. (As a corollary, perhaps we should stop encouraging people to have filenames full of trash bytes.)
<rjarry> hehe
<rjarry> that would break backward compatibility for people who rely on having '\b' in their folder names, lol
<rjarry> btw, the same problem exists for network device names on linux
<rjarry> any non '\0' ascii character is considered valid
<simpson> One horrible mistake of history at a time.
jvesely has joined #pypy
wilbowma has quit [Ping timeout: 246 seconds]
wilbowma has joined #pypy
<cfbolz> exarkun: sounds reasonable to me to add an API for that
<cfbolz> (maybe in __pypy__)
<cfbolz> file an issue?
<exarkun> Sure
<mattip> https://bugs.python.org/issue9632 is where it was removed from cpython
<exarkun> Actually, considering the latest Python 3.x behavior is kinda sorta "always UTF-8", a new API would only be for 2.x.
<exarkun> So is it worthwhile?
<arigato> exarkun: yes
RemoteFox has joined #pypy
RemoteFox has left #pypy [#pypy]
<exarkun> Okay, cool, filing
<arigato> I don't know why there is sys.setdefaultencoding() that you need reload(sys) to access (which is very obscure), but there is no sys.setfilesystemencoding() at all
<exarkun> I guess there was for a couple releases of 3.x and then it was deleted
<exarkun> "mojibake by construction" or something
<exarkun> in general I agree it's a dangerous behavior so I can sort of understand the argument for not having it
<exarkun> but when the detected value is wrong and broken it sure sucks not to have it
<arigato> you could also do "if sys.getfilesystemencoding() != 'utf-8': os.environ['X'] = 'Y'; os.execv(sys.executable, [sys.executable] + sys.argv)"...
<exarkun> yea, but re-executing a process is fraught :/
<exarkun> plus 2x python startup time for a short-lived cli sucks
<exarkun> (at least it only applies to linux so there's no wonky windows process code to think about ... still)
<exarkun> I guess doing that for only-linux/only-pypy/only-non-utf8 might limit the impact enough ...
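The re-exec idea discussed above could look roughly like the following. This is a minimal sketch, not something anyone in the channel wrote. The guard variable name `FSENC_REEXEC_DONE`, the helper names, and the choice of `C.UTF-8` as the locale are all assumptions made for illustration. It rebuilds the command line from `sys.executable` and `sys.argv` only, so interpreter flags and `python -m` invocations are not preserved.

```python
import codecs
import os
import sys

# Hypothetical marker variable (not a real Python or PyPy setting). It stops
# the child from re-executing again if the requested locale is not honoured.
GUARD = "FSENC_REEXEC_DONE"


def is_utf8(encoding):
    """Return True if `encoding` names UTF-8 under any alias ("utf8", "UTF-8", ...)."""
    try:
        return codecs.lookup(encoding).name == "utf-8"
    except (LookupError, TypeError):
        # Unknown codec name, or None (possible on Python 2).
        return False


def build_reexec_env(environ, locale_name="C.UTF-8"):
    """Return a copy of `environ` that requests a UTF-8 locale and sets the guard."""
    env = dict(environ)
    env["LC_ALL"] = locale_name  # assumption: this locale exists on the host
    env[GUARD] = "1"
    return env


def maybe_reexec():
    """Re-execute the interpreter once if the filesystem encoding is not UTF-8."""
    if is_utf8(sys.getfilesystemencoding()) or os.environ.get(GUARD):
        return
    env = build_reexec_env(os.environ)
    # os.execve replaces the current process; it only "returns" by raising OSError.
    os.execve(sys.executable, [sys.executable] + sys.argv, env)
```

Calling `maybe_reexec()` at the very top of a command-line entry point would be the intended use. Calling it from library code, a test runner, or an interactive session would restart the host process, which is probably not what anyone wants.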
todda7 has quit [Ping timeout: 272 seconds]
todda7 has joined #pypy
dansan has quit [Ping timeout: 264 seconds]
todda7 has quit [Ping timeout: 258 seconds]
rubdos has joined #pypy
todda7 has joined #pypy
Dejan has joined #pypy
lritter has quit [Quit: Leaving]
Ai9zO5AP has joined #pypy
i9zO5AP has quit [Ping timeout: 240 seconds]
todda7 has quit [Ping timeout: 258 seconds]
<exarkun> So functionality missing on PyPy + different schemes for building/packaging/installing CPython make the os.execv approach tempting... But of course there's a zillion edge cases
todda7 has joined #pypy
<exarkun> What if the code is imported as a library, what if it is run from the interactive interpreter, what if it is being run as part of a test suite...
todda7 has quit [Ping timeout: 256 seconds]
<exarkun> Haha. Also if the platform doesn't have the locale you pick for LANG then Python still goes with ASCII.
fling has quit [*.net *.split]
jerith has quit [*.net *.split]
kanaka has quit [*.net *.split]
LarstiQ has quit [*.net *.split]
Dejan has quit [Quit: Leaving]
<exarkun> Ah, and then there's `python -m ...` which randomly shuffles everything around some more.
kanaka has joined #pypy
fling has joined #pypy
jerith has joined #pypy
LarstiQ has joined #pypy
fling has quit [Max SendQ exceeded]
<exarkun> Okay, it's not clear to me that enough information about how the process was started is actually preserved any more to be able to re-execute it with a different environment.
<exarkun> Probably time to just say "LANG!=*.UTF-8 is unsupported" :/
fling has joined #pypy
todda7 has joined #pypy
proteusguy has quit [Ping timeout: 258 seconds]
i9zO5AP has joined #pypy
Ai9zO5AP has quit [Ping timeout: 240 seconds]
proteusguy has joined #pypy
jacob22 has quit [Read error: Connection reset by peer]
jacob22 has joined #pypy
<mattip> rain, rain
dansan has joined #pypy
dansan has quit [Excess Flood]
<arigato> got some snow at the top of the mountain sunday (which I reached by cable car)
dansan has joined #pypy
dansan has quit [Excess Flood]
dansan has joined #pypy
jvesely has quit [Quit: jvesely]
<Hodgestar> mattip, arigato: There was snow on the top of Table Mountain this weekend (!!).
_whitelogger has joined #pypy
jvesely has joined #pypy
todda7 has quit [Ping timeout: 265 seconds]
<tos9> Someone dropped some ice cream in front of my apt in NYC
<tos9> (sorry I was feeling left out of the snow discussion)
lritter has joined #pypy
Smigwell has left #pypy [#pypy]
<lazka> exarkun, cpython will never use ascii since 3.7 I think, it will force utf-8 in that case
<lazka> pep538
<lazka> PYTHONCOERCECLOCALE=warn LANG=NOPE python3 -c "import sys; print(sys.getfilesystemencoding())"
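The shell one-liner above can also be reproduced from Python, which makes it easier to compare several environments or interpreters. This is a minimal sketch; the helper name `fs_encoding_with_env` is made up for illustration. The child inherits the parent's environment apart from the overrides, so an `LC_ALL` already set in the parent would still take precedence over `LANG`.

```python
import os
import subprocess
import sys


def fs_encoding_with_env(overrides):
    """Run a child interpreter with `overrides` applied to the environment
    and return the filesystem encoding it reports."""
    env = dict(os.environ)
    env.update(overrides)
    out = subprocess.check_output(
        [sys.executable, "-c", "import sys; print(sys.getfilesystemencoding())"],
        env=env,
    )
    # Encoding names are plain ASCII, so decoding the output this way is safe.
    return out.decode("ascii").strip()
```

For example, `fs_encoding_with_env({"LANG": "NOPE", "PYTHONCOERCECLOCALE": "warn"})` mirrors the command above, using whichever interpreter is running the script.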
tbodt has quit [Ping timeout: 272 seconds]
kanaka has quit [Remote host closed the connection]
tbodt has joined #pypy
lritter has quit [Quit: Leaving]
speeder39_ has joined #pypy
jacob22 has quit [Read error: Connection reset by peer]
jacob22 has joined #pypy