flux changed the topic of #ocaml to: Discussions about the OCaml programming language | http://caml.inria.fr/ | Grab OCaml 3.10.2 from http://caml.inria.fr/ocaml/release.html (featuring new camlp4 and more!)
<palomer> don't worry
<palomer> almost done my lexer generator
<palomer> how is a backslash before an EOF usually interpreted?
<Eridius> before EOF? I'd say syntax error
<Eridius> but maybe you should test it on existing languages
<palomer> what's the consensus, though?
<palomer> and what's \z ?
<palomer> or \p ?
<Eridius> I've never heard of those
mandel has joined #ocaml
<palomer> they're just z and p, right?
<palomer> I mean, what if a lexer encounters them
<Eridius> whatever you want
<palomer> does it output an error?
<Eridius> dude, you're making the lexer. YOU CAN CHOOSE
<palomer> I know I can choose
<palomer> but what's the consensus?
* Eridius sighs and idles
<palomer> there's nothing to sigh about!
<palomer> these things are standard
<palomer> ocaml gives a warning
<Eridius> I sighed because you keep asking questions as if you expect me to magically know what the most common behavior of lexers is when they encounter unexpected escapes
<Eridius> it's not like there's a standard
<palomer> oh
<Eridius> if you want to see what other languages do, test it
<palomer> thought there was
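As Eridius says, the policy for unknown escapes is the lexer author's choice. A minimal sketch of one possible policy follows, in which both `\z` and a backslash at end of input are reported as errors. The names `unescape`, `escape_error`, `Unknown_escape` and `Trailing_backslash` are hypothetical and are not taken from palomer's lexer; the sketch assumes a current OCaml with the built-in `result` type.

```ocaml
(* Hypothetical helper, not palomer's lexer: decode the escapes found in
   the body of a string literal. The error policy is an assumption. *)
type escape_error =
  | Unknown_escape of char   (* e.g. \z or \p *)
  | Trailing_backslash       (* a backslash right before end of input *)

let unescape (s : string) : (string, escape_error) result =
  let n = String.length s in
  let buf = Buffer.create n in
  let rec go i =
    if i >= n then Ok (Buffer.contents buf)
    else if s.[i] <> '\\' then begin
      (* ordinary character: copy it through *)
      Buffer.add_char buf s.[i];
      go (i + 1)
    end
    else if i + 1 >= n then Error Trailing_backslash
    else
      match s.[i + 1] with
      | 'n' -> Buffer.add_char buf '\n'; go (i + 2)
      | 't' -> Buffer.add_char buf '\t'; go (i + 2)
      | '\\' -> Buffer.add_char buf '\\'; go (i + 2)
      | '"' -> Buffer.add_char buf '"'; go (i + 2)
      | c -> Error (Unknown_escape c)
  in
  go 0
```

A more lenient lexer could instead keep the backslash and the unknown character and emit a warning, which is what palomer reports OCaml doing.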
<palomer> btw, I finished writing my lexer
<Eridius> grats
<palomer> it supports escape sequences
<palomer> if you're interested
<Eridius> sure
<Eridius> hrm, that pastebin doesn't seem to handle your code properly
<Eridius> actually, I think it's not handling \"
<palomer> how so?
<Eridius> look at the URL you pasted
tomh_-_ has quit ["http://www.mibbit.com ajax IRC Client"]
<Eridius> it's syntax colorizing wrong
<Eridius> it thinks "{\"}" ends the string at the \"
<Eridius> i.e. it's not handling the escape
<palomer> oh, ocaml handles it fine
<Eridius> yeah, the pastebin doesn't
* palomer loves his lexer
<palomer> fast too!
m3ga has quit [Read error: 110 (Connection timed out)]
Amorphous has quit [Read error: 104 (Connection reset by peer)]
threeve has joined #ocaml
Amorphous has joined #ocaml
electronx has joined #ocaml
jeddhaberstro has quit []
tomh_-_ has joined #ocaml
jeddhaberstro has joined #ocaml
jeddhaberstro has quit [Client Quit]
pastasauce has joined #ocaml
Palace_Chan has quit [Client Quit]
threeve has quit []
seafood has joined #ocaml
sporkmonger has quit []
sporkmonger has joined #ocaml
struktured has quit [Read error: 110 (Connection timed out)]
struktured has joined #ocaml
mandel has quit ["leaving"]
johnnowak has joined #ocaml
Camarade_Tux has quit [Read error: 110 (Connection timed out)]
<flux> palomer, I guess UTF8 strings have their length easily available?
<flux> umh, of course they would have, even regular strings have
Camarade_Tux has joined #ocaml
<thelema> flux: the length of the string in bytes, yes. in characters - that's more difficult.
<palomer> hmm
<palomer> flux, yeah
<palomer> thelema, is it? UTF8.length
<thelema> that function has to iterate the string. O(n)
<thelema> s/iterate/scan/
struktured_ has joined #ocaml
<flux> thelema, has to? it could just keep track of the length, assuming the string cannot be mutated without the module
struktured has quit [Read error: 110 (Connection timed out)]
struktured_ is now known as struktured
<thelema> it could.
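A rough sketch of both points: counting characters requires a scan over the bytes, while a module with an abstract type can cache the count once, as flux suggests. This is not the source of the real `UTF8.length`; `utf8_length` and `Counted` are hypothetical names, and the code assumes valid UTF-8 and immutable strings.

```ocaml
(* Count code points by skipping continuation bytes (10xxxxxx).
   O(n) in the byte length. Assumes the input is valid UTF-8. *)
let utf8_length (s : string) : int =
  let count = ref 0 in
  String.iter
    (fun c -> if Char.code c land 0xC0 <> 0x80 then incr count)
    s;
  !count

(* flux's idea: keep the character count next to the bytes. Because the
   type is abstract, nothing outside the module can invalidate the cache. *)
module Counted : sig
  type t
  val of_string : string -> t   (* O(n): scans once *)
  val length : t -> int         (* O(1): reads the cached count *)
  val to_string : t -> string
end = struct
  type t = { bytes : string; chars : int }
  let of_string s = { bytes = s; chars = utf8_length s }
  let length t = t.chars
  let to_string t = t.bytes
end
```

The scan still happens once per string, at construction, so this only helps when the length is queried repeatedly.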
Associat0r has quit []
Associat0r has joined #ocaml
seafood has quit [Read error: 110 (Connection timed out)]
Snark has joined #ocaml
im_alone has quit [Read error: 110 (Connection timed out)]
Associat0r has quit [Connection timed out]
<palomer> whoa
<palomer> UTF8.length is O(n)?
<palomer> that's nuts!
<palomer> I call UTF8.length _all_ _the_ _time_
<ozy`> if you know stuff about UTF-8, this is the logical outcome, sadly
<flux> palomer, did you check the source or documentation to see which is the case?
<palomer> but...what about roped string?
<palomer> this means that utf8 is, like, super broken
<palomer> flux, not yet
<palomer> it...shouldn't really matter
<flux> palomer, broken because length isn't O(1)?
<palomer> since I decided my stuff wouldn't go into production
<palomer> flux, yeah!
<flux> palomer, C-guys live with it :)
<flux> what is rope's length complexity?
<palomer> O(1)
<palomer> methinks
<palomer> C-guys are nuts!
asmanur has joined #ocaml
Yoric[DT] has joined #ocaml
itewsh has joined #ocaml
<flux> well, they have other means of finding whether a point is at the string's end
<flux> so what we need is iterators!
Yoric[DT] has quit ["Ex-Chat"]
Linktim has joined #ocaml
fremo has joined #ocaml
filp has joined #ocaml
Linktim_ has joined #ocaml
itewsh has quit ["KTHXBYE"]
Linktim has quit [Read error: 113 (No route to host)]
Linktim has joined #ocaml
Linktim_ has quit [Read error: 110 (Connection timed out)]
johnnowak has quit []
im_alone has joined #ocaml
<tsuyoshi> hm? what do you need to call UTF8.length for?
<flux> he needs to find whether he's at the end of the string.
<tsuyoshi> oh.. that's not the right way to do it
Asmadeus has joined #ocaml
<flux> how does one iterate utf8 strings without doing it all at once?
<tsuyoshi> well.. you would go by the length of the string in bytes, not characters
<flux> what's the point in iterating utf8 by bytes?
<tsuyoshi> uh.. because the length in bytes is very cheap to find
<flux> yes, but then you need to understand utf8 where you should have offloaded that part to a library?-o
<tsuyoshi> I mean obviously there should be a get_next_character function that is string -> character option or whatever
<electronx> use Haskell instead :)
Linktim_ has joined #ocaml
<tsuyoshi> well, the problem with getting the length of a utf8 string is that it's slow.. haskell doesn't really solve that problem at all
<flux> extlib's get(n) is O(n)
<flux> I wonder if camomile's the same
<tsuyoshi> well, probably
<flux> but if so, or palomer is using extlib, then his lexer is doubly slow
<flux> but extlib does in fact provide iterators
Linktim_ has quit [Client Quit]
Linktim_ has joined #ocaml
tomh_-_ has quit ["http://www.mibbit.com ajax IRC Client"]
<tsuyoshi> hrm.. that's a little primitive
<flux> what would be the alternative? other than making the type abstract..
<electronx> tsuyoshi: haskell has many implementations, how do you know they're slow?
<electronx> i'm pretty sure there is an optimized lib available
<tsuyoshi> electronx: oh, I just think haskell itself is slow
<flux> electronx, how do they solve the issue then?
<electronx> c lib i bet
<flux> a list-of-character-based approach would do nothing to solve it. bytestring-based approach - not much different from UTF8-module - could do other things
<tsuyoshi> I think the thing you want is like
<electronx> haskell and ocaml aren't that different in speed
<electronx> pretty close actually
<tsuyoshi> module type UTF8Stream = sig type t val of_string : string -> t val next : t -> character option val previous : t -> character option val rewind : t -> unit val fastforward : t -> unit end
<flux> ats beats everything anyway and has a more powerful type system anyway, we should use that ;)
<tsuyoshi> where character would be an int or whatever you use to represent unicode characters
<tsuyoshi> next and previous would return None when they are at the end or the beginning of the string
<electronx> flux: ats user base is like only a few people :)
<tsuyoshi> plus you want iter, map, fold, but those are trivial, given next
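A minimal sketch of the kind of cursor tsuyoshi describes, reduced to `of_string`, `next` and `rewind` (`previous` and `fastforward` are left out). End of input is detected from the byte length, so no character count is ever needed. The names `Utf8_cursor`, `width` and `fold` are hypothetical, code points are represented as `int`, and the decoder assumes valid UTF-8 (a truncated sequence would raise `Invalid_argument`).

```ocaml
module Utf8_cursor : sig
  type t
  val of_string : string -> t
  val next : t -> int option   (* next code point, or None at the end *)
  val rewind : t -> unit
end = struct
  type t = { s : string; mutable pos : int }   (* pos is a byte offset *)

  let of_string s = { s; pos = 0 }
  let rewind t = t.pos <- 0

  (* Number of bytes in the sequence that starts with lead byte b. *)
  let width b =
    if b < 0x80 then 1
    else if b < 0xE0 then 2
    else if b < 0xF0 then 3
    else 4

  let next t =
    if t.pos >= String.length t.s then None
    else begin
      let b0 = Char.code t.s.[t.pos] in
      let w = width b0 in
      (* keep only the payload bits of the lead byte *)
      let init = if w = 1 then b0 else b0 land (0xFF lsr (w + 1)) in
      let cp = ref init in
      for i = 1 to w - 1 do
        (* each continuation byte contributes its low 6 bits *)
        cp := (!cp lsl 6) lor (Char.code t.s.[t.pos + i] land 0x3F)
      done;
      t.pos <- t.pos + w;
      Some !cp
    end
end

(* fold written purely in terms of next, as suggested above *)
let rec fold f acc cur =
  match Utf8_cursor.next cur with
  | None -> acc
  | Some cp -> fold f (f acc cp) cur
```

Each call to `next` does constant work per character, so a full pass over the string is O(n) in total, with no call to a length function.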
Linktim has quit [Connection timed out]
Linktim has joined #ocaml
<flux> (also ats' runtime lib is gpl, which I consider a big minus)
<tsuyoshi> really.. you could write a thing that does lazy indexing of a utf8 string... but I think that's overkill
<electronx> flux: you could persuade the author to change it to Open BSD
<electronx> flux: he'll prob do
<flux> I don't think I care enough ;)
<electronx> ya
<electronx> looks interesting as a language though
<flux> also, apparently all gzipped c-sources are _smaller_ than the gzipped ats sources, which is quite a feat in itself..
<electronx> yeah
* electronx googles ats language in industry
<electronx> but look how ugly the language is
<electronx> it's horrible
<flux> what's horrible about it?
<electronx> syntax
<electronx> looks ugly
<flux> reading stuff that one doesn't understand can be difficult
<subconscious> flux: you are taking this guy seriously? lol
Linktim has quit [Read error: 104 (Connection reset by peer)]
Linktim_ has quit [Read error: 110 (Connection timed out)]
Linktim has joined #ocaml
Linktim_ has joined #ocaml
asmanur_ has joined #ocaml
tomh_-_ has joined #ocaml
det has quit [Remote closed the connection]
det has joined #ocaml
det has quit [Remote closed the connection]
det has joined #ocaml
Yoric[DT] has joined #ocaml
middayc has joined #ocaml
struk_atwork has quit [Connection reset by peer]
struk_atwork has joined #ocaml
asmanur has quit [Read error: 110 (Connection timed out)]
Linktim has quit [Read error: 110 (Connection timed out)]
electronx has quit []
Camarade_Tux has quit []
itewsh has joined #ocaml
Camarade_Tux has joined #ocaml
marmotine has joined #ocaml
Yoric[DT] has quit [Read error: 113 (No route to host)]
Linktim has joined #ocaml
itewsh has quit ["KTHXBYE"]
Linktim_ has quit [Read error: 110 (Connection timed out)]
Linktim_ has joined #ocaml
Linktim has quit [Read error: 110 (Connection timed out)]
Yoric[DT] has joined #ocaml
Linktim has joined #ocaml
Linktim_ has quit [Read error: 110 (Connection timed out)]
kekstyle has quit ["Quitte"]
itewsh has joined #ocaml
pango_ has joined #ocaml
itewsh has quit [Read error: 110 (Connection timed out)]
itewsh has joined #ocaml
Snark has quit [Read error: 60 (Operation timed out)]
tomh_-_ has quit [Read error: 104 (Connection reset by peer)]
<thelema> palomer: well, ropes have to keep track of lengths, so roped UTF-8 isn't so bad.
<Jedai> thelema: But wouldn't Rope need to keep track of the memory length, not the "real" length in characters ?
<thelema> Jedai: no, ropes keep track of whatever the string will get indexed by.
<thelema> So you can have (reasonably) efficient byte *or* character indexing.
<thelema> I guess you could have both, but the overhead keeps growing.
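A small sketch of thelema's point that a rope can cache whichever length it is indexed by, here the character count. This is not the rope from any particular library; `rope`, `leaf`, `length`, `concat` and `count_chars` are hypothetical names, and leaves are assumed to hold valid UTF-8.

```ocaml
type rope =
  | Leaf of string * int        (* UTF-8 bytes, cached character count *)
  | Node of rope * rope * int   (* children, cached character count *)

(* O(n) scan, paid once per leaf: count bytes that are not continuations *)
let count_chars (s : string) : int =
  let n = ref 0 in
  String.iter (fun c -> if Char.code c land 0xC0 <> 0x80 then incr n) s;
  !n

let leaf s = Leaf (s, count_chars s)

(* O(1): the count is stored at every node *)
let length = function
  | Leaf (_, n) | Node (_, _, n) -> n

(* O(1): the new count is the sum of the children's cached counts *)
let concat a b = Node (a, b, length a + length b)
```

Caching the byte length as well would mean one more integer per node, which is the growing overhead thelema mentions.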
itewsh has quit [Remote closed the connection]
tomh_-_ has joined #ocaml
jlouis has quit [Remote closed the connection]
pango_ has quit [Remote closed the connection]
<Yoric[DT]> thelema: what's the point of byte indexing?
<thelema> Yoric[DT]: one might have a reason to pull bytes out of a UTF-8 string.
<Yoric[DT]> mmmhh....
<Yoric[DT]> I guess.
* Yoric[DT] would favor character indexing.
<thelema> Not high priority for the library.
<Yoric[DT]> Ah, ok.
<thelema> probably sufficient to allow users to convert the rope to a byte string, which would allow any byte indexing they wanted.
<Yoric[DT]> I assume so.
johnnowak has joined #ocaml
subconscious has quit [Read error: 113 (No route to host)]
subconscious has joined #ocaml
jlouis has joined #ocaml
Linktim has quit ["Quitte"]
asmanur_ has quit [Read error: 110 (Connection timed out)]
Linktim has joined #ocaml
icarus901 has quit [Read error: 104 (Connection reset by peer)]
tomh_-_ has quit ["http://www.mibbit.com ajax IRC Client"]
guillem has joined #ocaml
Yoric[DT] has quit [Read error: 113 (No route to host)]
Yoric[DT] has joined #ocaml
Yoric[DT] has quit ["Ex-Chat"]
tomh_-_ has joined #ocaml
tomh_-_ has quit [Remote closed the connection]
hkBst has joined #ocaml
itewsh has joined #ocaml
Linktim_ has joined #ocaml
<subconscious> just curious about something like this, http://rafb.net/p/0BsPCn76.html
<subconscious> would the Pairs get compiled out?
jeddhaberstro has joined #ocaml
<subconscious> all that tupling and untupling
<thelema> subconscious: no. the OCaml compiler pretty consistently does what you tell it to do.
<subconscious> Is it possible to tell it to erase those tuples?
<thelema> if you remove the Pair tag, the compiler may optimize the paired function arguments into normal curried arguments (which it's very efficient at)
<subconscious> how do I do that?
Linktim has quit [Read error: 110 (Connection timed out)]
Linktim has joined #ocaml
<thelema> let rec f m0 n0 = match m0 with O -> S n0 | S m1 -> (match n0 with O -> f m1 (S O) | S n1 -> f m1 (f m0 n1))
Linktim_ has quit [Read error: 110 (Connection timed out)]
Linktim_ has joined #ocaml
<subconscious> ok thanks thelema, I will try this
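For reference, thelema's one-liner laid out as a complete program. The pasted code is not available, so the `nat` type below is an assumption inferred from the `O` and `S` constructors, and `of_int` and `to_int` are hypothetical helpers added only to make the function easy to try. The function is the Ackermann function on Peano numerals.

```ocaml
(* Assumed definition: Peano naturals, inferred from the constructors used *)
type nat = O | S of nat

(* Curried version from the chat: two ordinary arguments, no tuple *)
let rec f m0 n0 =
  match m0 with
  | O -> S n0
  | S m1 ->
    (match n0 with
     | O -> f m1 (S O)
     | S n1 -> f m1 (f m0 n1))

(* Hypothetical helpers for testing *)
let rec of_int n = if n <= 0 then O else S (of_int (n - 1))
let rec to_int = function O -> 0 | S n -> 1 + to_int n
```

Keep the inputs small when trying it: the result grows very quickly in the first argument.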
asmanur has joined #ocaml
itewsh has quit [Read error: 60 (Operation timed out)]
itewsh has joined #ocaml
pastasauce has quit []
Linktim has quit [Read error: 110 (Connection timed out)]
filp has quit ["Bye"]
itewsh has quit [Connection timed out]
itewsh has joined #ocaml
johnnowak has quit []
Linktim has joined #ocaml
Linktim_ has quit [Read error: 110 (Connection timed out)]
Snark has joined #ocaml
itewsh has quit [Read error: 110 (Connection timed out)]
itewsh has joined #ocaml
jeremiah has quit [Read error: 110 (Connection timed out)]
itewsh has quit [Remote closed the connection]
Linktim_ has joined #ocaml
Linktim has quit [Read error: 110 (Connection timed out)]
asmanur has quit [Read error: 110 (Connection timed out)]
tomh_-_ has joined #ocaml
jeddhaberstro_ has joined #ocaml
jeddhaberstro has quit [Read error: 110 (Connection timed out)]
middayc has quit []
Yoric[DT] has joined #ocaml
<thelema> Yoric[DT]: what do you think of the one-file batteries.ml?
Snark has quit ["Ex-Chat"]
vixey has joined #ocaml
subconscious has quit [Read error: 113 (No route to host)]
<Yoric[DT]> thelema: what do you mean?
<Yoric[DT]> Actually, I'm tired.
<Yoric[DT]> I spent my whole day at a workshop, so I hope that can wait til tomorrow.
<thelema> ok. tomorrow.
<Yoric[DT]> Sorry.
<Yoric[DT]> I spent my day half-listening to talks while rushing my slides.
<Yoric[DT]> I'm really spent.
<thelema> It's okay. I'll just see if I can finish this camomile bit
<Yoric[DT]> Thanks.
<thelema> rest well.
<Yoric[DT]> Thanks.
<thelema> Yoric[DT]: unicode branch here
itewsh has joined #ocaml
Asmadeus has quit ["nighters"]
<Yoric[DT]> thelema: I'll look tomorrow.
<Yoric[DT]> Thanks.
<Yoric[DT]> For the moment, time to call it a night.
Yoric[DT] has quit ["Ex-Chat"]
Linktim_ has quit ["Quitte"]
rwmjones_ has joined #ocaml
seafood has joined #ocaml
OChameau has joined #ocaml
tomh_-_ has quit ["http://www.mibbit.com ajax IRC Client"]
itewsh has quit ["KTHXBYE"]
hkBst has quit [Read error: 104 (Connection reset by peer)]
threeve has joined #ocaml
marmotine has quit ["mv marmotine Laurie"]
threeve has quit []
rwmjones_ has quit ["Closed connection"]
* palomer is no fan of regular expressions
<Eridius> why not?
seafood has quit [Read error: 110 (Connection timed out)]
<palomer> too powerful
<ozy`> and I don't like modules
jlouis has quit ["Leaving"]
Associat0r has joined #ocaml
Palace_Chan has joined #ocaml
threeve has joined #ocaml
<mbishop> get out!
threeve has quit []