ec changed the topic of #elliottcable to: a 𝕯𝖊𝖓 𝖔𝖋 𝕯𝖊𝖙𝖊𝖗𝖒𝖎𝖓𝖊𝖉 𝕯𝖆𝖒𝖘𝖊𝖑𝖘 slash s͔̞u͕͙p͙͓e̜̺r̼̦i̼̜o̖̬r̙̙ c̝͉ụ̧͘ḷ̡͙ţ͓̀ || #ELLIOTTCABLE is not about ELLIOTTCABLE
Sgeo_ has quit [Read error: Connection reset by peer]
Sgeo has joined #elliottcable
<ec> i wrote a thing, if anybody wants to proofread my technical writing, lol
<ec> 'specially if you don't Know Unicode Crap that well.
<ec> sigh i love and miss programming
<ljharb> ec: interesting. so this is for when you’re calling directly into Ocaml-compiled native modules?
<ec> ehhhhhhm, close. I see where I went wrong with the word "native"
<ec> there's no native component here; but there's code that was *written for* native platforms.
<ljharb> but invoked in node? or, js compiled to run in Ocaña
<ec> which you, the (ab)user, are compiling via BuckleScript to JavaScript instead. and thus need my little horror to adapt it.
<ljharb> ocaml
<ljharb> so is this lib something the compiler could inject?
<ec> I wish lol
<ljharb> but like ideally
<ec> so now that I have it working, and can publish working libraries for Current Real-World BuckleScript stuff, as I need to,
<ec> I'm definitely going to go complain to People
<ljharb> presumably you could write a babel transform that could be applied to the js output tho
<ec> but part of the problem is that this is a Can't-Make-Everyone-Happy situation
<ljharb> so that nobody ever has to manually use your lib
<ec> out of BuckleScript users, there's "people writing their JavaScript in OCaml", and then there's "people compiling their OCaml to JavaScript" — which sound similar, but aren't the same.
<ec> the former group know, and expect, one string-semantic; the latter group another
<ec> the real problem here, when you boil it down, is not BuckleScript, or JavaScript; it's OCaml. OCaml *doesn't have* a Unicode-handling story. All the UTF-8 handling stuff is very … ad-hoc, and ‘just what people do.’
<ec> there actually *is* no String type (in the JavaScript, fully-featured, Unicode-aware sense) in OCaml; only a "char array" type that's *named* `string`.
<ec> unfortunately, people expect to. y'know. use strings. and do string-y stuff with them. so, BuckleScript took a reasonable-if-annoying stance of "We're gonna leverage all of the JavaScript string-machinery, so most of the time, things function as you expect … and so code transpiles to clean, minimal, obvious operations"
<ec> but, yeah, that totally fucks up Unicode-handling in all these ancient rickety OCaml libraries.
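To make the mismatch described above concrete, here is a minimal TypeScript sketch. It is not code from ec's library: `utf8Bytes` is a hypothetical helper written only for this illustration, and it assumes the OCaml code in question expects to see UTF-8 bytes. A JavaScript string counts UTF-16 code units instead, so lengths and indices disagree for anything outside ASCII.

```typescript
// Hypothetical helper: hand-rolled UTF-8 encoding of a JavaScript string,
// so the sketch does not depend on TextEncoder or Buffer being available.
function utf8Bytes(s: string): number[] {
  const out: number[] = [];
  for (let i = 0; i < s.length; i++) {
    let cp = s.charCodeAt(i);
    // Combine a well-formed surrogate pair into a single code point.
    // (A lone surrogate falls through and is encoded as three bytes,
    // which is not valid UTF-8; good enough for a sketch.)
    if (cp >= 0xd800 && cp <= 0xdbff && i + 1 < s.length) {
      const lo = s.charCodeAt(i + 1);
      if (lo >= 0xdc00 && lo <= 0xdfff) {
        cp = 0x10000 + ((cp - 0xd800) << 10) + (lo - 0xdc00);
        i++;
      }
    }
    if (cp < 0x80) {
      out.push(cp);
    } else if (cp < 0x800) {
      out.push(0xc0 | (cp >> 6), 0x80 | (cp & 0x3f));
    } else if (cp < 0x10000) {
      out.push(0xe0 | (cp >> 12), 0x80 | ((cp >> 6) & 0x3f), 0x80 | (cp & 0x3f));
    } else {
      out.push(
        0xf0 | (cp >> 18),
        0x80 | ((cp >> 12) & 0x3f),
        0x80 | ((cp >> 6) & 0x3f),
        0x80 | (cp & 0x3f)
      );
    }
  }
  return out;
}

const eAcute = "\u00e9";      // U+00E9, LATIN SMALL LETTER E WITH ACUTE
const grin = "\ud83d\ude00";  // U+1F600, outside the Basic Multilingual Plane

// What JavaScript string machinery reports: UTF-16 code units.
const jsLengths = [eAcute.length, grin.length];                        // [1, 2]
// What byte-oriented code would expect to see: UTF-8 bytes.
const byteLengths = [utf8Bytes(eAcute).length, utf8Bytes(grin).length]; // [2, 4]
```

Code that walks a string byte by byte and code that leans on JavaScript's `.length` and indexing will therefore disagree on both strings, which is the kind of breakage being described.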
<ec> in an ideal world it's not BuckleScript, or me, that comes up with a solution, but the *OCaml* community.
<ec> I'm trying to find a venerable GitHub Issue about this
<ec> but yeah *ideally* we'd collectively stop using, and maybe even eventually deprecate, the `string` type. (we've already started this in a different direction, for a different reason, with the new `bytes` type.) and have real, type-level encoding information and tooling ......
<ec> which is exactly the painful transition Ruby made from 1.8 to 1.9, btw. this is a well-documented growing pain for language designers: turns out, you can't make a language without already knowing Literally Everything about encoding and human language and ughhhhhhhhhhhhhhhhhh; otherwise, you're just, just *gonna* have to rebuild everything from scratch after community input from people who Actually Know Encoding
<ec> Thingies™
<ljharb> i mean tho, how did the ocaml designers not know about this
<ec> you mean, in the '80s, before Unicode existed? :P
<ljharb> is ocaml that old?
<ec> that's a somewhat facetious response, of course; OCaml, as opposed to the progenitor languages it extended, is younger than that … but also *not* so much, because Unicode also wasn't actually, well, universal, for a long time
<ec> *but* that said it's not just a matter of knowing Unicode exists. It's more … 'how do we allow developers to ergonomically deal with the real-world landscape of encodings?'
<ec> which is just a specific instantiation of the single, only Language Development Question that encompasses all language decisions: ‘How much do we hide from our user? How much do we abstract, how much power do we take away for their safety?’
<ec> what it boils down to is people building *programming* languages are somewhat rarely *human*-language nerds; and tend to belong to the tribe of programmers born of Silicon Valley: ‘eh, I can type "LOL", it's good enough’
<ec> aaaaaaaaand then their languages grow and gain users that have to deal with real-world things like higher-plane glyphs, combining characters, legacy encodings or even outright malformed input, interoperability with systems that won't transit *well*-formed output … and those users get pissed, and kinda by definition-of-the-problem the language is now popular and established enough that those mistakes can't be unmade …
<ec> aaaaaaaaaaaand now your popular tool is a part of that ecosystem-of-other-shitty-tools-making-encoding-horrible-for-everyone, doing its very darnedest to make everything worse for everybody. great!
<ec> tl;dr I strongly respect Ruby for literally making its first large backwards-compatibility-breaking change (the string-encoding overhaul in 1.9, after what, over a decade? woah.) because the maintainers finally realised how important this was to The World As A Whole, lol
<ec> ANYWAY re: ocaml specifically: this is fixable of course, but OCaml is a community of crotchety academics, prolly mostly white, prolly mostly male, not exactly brimming with SJW culture and wokeness … everyone seems to think "uhhh just install Camomile if you have to 'deal with' some unicode crap ... idk? worry about it when it breaks." is good enough
<jfhbrook> lgpl huh?
<ec> hm?
<jfhbrook> idk in python I have to use byte strings and unicode strings
<jfhbrook> and like ok you have to pull in a lgpl library (camomile) to get unicode strings, but now you have unicode strings and it's fine, right?
<jfhbrook> though to be fair in your case
<jfhbrook> probably every library is written to use bytestrings so you'd have to convert in and out all the time anyway
<ec> nnnnnot quite — it's more "what's the interop story". Are all 'unicode strings' UTF-8 bytes in a byte-array? that should be something the language standardises (and, ideally, provides alternatives/escape-hatches to, as well), not something Some library authors Sometimes do.
<jfhbrook> it's fair to say that standardization is useful
<jfhbrook> when something's already a de facto standard is when you need it the least tho
<ec> Daniel Bünzli had a thing on this that I'm trying to find, in the docs to one of his Unicode-handling modules
<ec> ugh anyway I've already spent too long on this today, time to actually *use* this effort I put in, back in the place where I unearthed the bug, and get something shipped 🙄
<jfhbrook> hah, I hear that
<jfhbrook> work's been a little hectic lately
<jfhbrook> I mean hectic is the wrong word - busy I guess, stressful
<ec> oh but anyway you can see one of the fallout effects of that sort of agnosticism-based choice, right now
<jfhbrook> but I've been learning emacs, that's fun!
<ec> if BuckleScript were working off of a base that *inherently* differentiated, then it could sanely compile the two things two different ways.
<jfhbrook> predictably malformed - that's good
<ec> "the byte-string type" gets compiled to array-handling JavaScript, effort can be expended to maintain semantics for existing byte-array-manipulating-OCaml-code *and* produce idiomatic output; whereas "the user-input-string type" gets compiled with encoding/decoding machinery to massage it into JavaScript UCS-2 yadda yadda yadda.
<ec> but with this design? from the *language* perspective, the two are indistinguishable. there's no way to satisfy both requirements.
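A minimal TypeScript sketch of the distinction being described, under loud assumptions: `ByteString`, `Text`, and the helper functions are hypothetical names invented for this illustration, not anything BuckleScript or OCaml actually provides. The point is only that once the source language tells the two kinds of value apart, each can get its own JavaScript representation.

```typescript
// Hypothetical: two distinct types where OCaml has a single `string`.
type ByteString = { readonly kind: "bytes"; readonly data: Uint8Array };
type Text = { readonly kind: "text"; readonly data: string };

// Byte-oriented values keep array semantics: one element per byte.
function byteString(bytes: number[]): ByteString {
  return { kind: "bytes", data: new Uint8Array(bytes) };
}

// User-facing text rides on the native JavaScript string machinery.
function text(s: string): Text {
  return { kind: "text", data: s };
}

// The same source-level operation can now mean two different things,
// chosen by type rather than by guesswork.
function lengthOf(v: ByteString | Text): number {
  return v.kind === "bytes"
    ? v.data.length  // number of bytes
    : v.data.length; // number of UTF-16 code units
}

const asBytes = byteString([0xc3, 0xa9]); // the UTF-8 encoding of U+00E9
const asText = text("\u00e9");            // the same character as a JS string

const lengths = [lengthOf(asBytes), lengthOf(asText)]; // [2, 1]
```

With a single undifferentiated `string` type, a compiler has to pick one of these two representations for every value, which is the bind described above.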
<ec> this exact thing is playing out with mutation — having mutable strings was causing serious problems for both the compiler and the community;
<ec> sure, you can just say "hey this is a string, and we're not gonna mutate it, and you shouldn't either", and document that at the library-level, maybe mint a type,
<ec> but that's just not the same.
<ec> finally things snapped in favour of breaking backwards-compatibility (in a really well-thought-out way, btw, imo!)
<ec> OCaml 4.02 introduced a new type, `bytes`, for mutable strings, by default just an alias to `string` … along with an optional compiler-flag, `-safe-string`, to make `string` immutable, thus opting-in to breaking code that should have already switched from `string` to the explicit `bytes` type if they needed mutation ...
<ec> then 4.06 swapped the default, leaving iirc `-unsafe-string` to make legacy code work, but defaulting to `string` being an immutable type … and finally 5.0 removed the flag, breaking code that wasn't fixed in the intervening years
<ec> I might be off by one on all those numbers idk lmao
<ec> but. I appreciated that careful approach. I think processes like that are a good candidate for a replacement for the effectively-defunct SemVer, may ye rest in peace
<ec> anybody know if you can export/import typescript types? I still don't use typescript often enough to keep any of it in my head between forays ;_;
englishm has quit [Excess Flood]
englishm has joined #elliottcable
Rurik has joined #elliottcable
<ljharb> ec: lol things that don't follow semver make me facepalm so hard
<ljharb> ec: yes, you can import and export type space values
<ljharb> ec: sadly, TS doesn't have `import type` like flow does, so you have no way of knowing lexically at the callsite
Rurik has quit [Quit: My MacBook has gone to sleep. ZZZzzz…]