#ocaml on 2003-03-29 — irc logs at freenode.irclog.whitequark.org

2002-11-09 13:39 Yurik changed the topic of #ocaml to: http://icfpcontest.cse.ogi.edu/ -- OCaml wins | http://www.ocaml.org/ | http://caml.inria.fr/oreilly-book/ | http://icfp2002.cs.brown.edu/ | SWIG now supports OCaml| Early releases of OCamlBDB and OCamlGettext are available

00:00 <Vincenz> :(

00:01 * Vincenz sighs

00:03 <Vincenz> never mind

00:05 lament has joined #ocaml

00:05 <Vincenz> lament: do you know ocaml well, please say yes

00:05 <lament> no.

00:06 <Vincenz> :(

00:06 <Vincenz> whee: any clues?

00:07 <whee> yes, back. heh

00:07 <Vincenz> :)

00:08 <whee> you just want to tokenize some input stream, that's either an alphanumeric identifier or integer?

00:08 <Vincenz> that's just a basis

00:08 <Vincenz> but yes

00:08 <Vincenz> but why won't it remove the spaces in the line?

00:08 <Vincenz> | [< ''a'..'z'|'A'..'Z' as c; r = (identifier [c]); spaces >] -> IDENT (implode(r))

00:09 <whee> well, that's the job of the tokenizer :)

00:09 <Vincenz> euhm...but

00:09 <Vincenz> if I do...

00:09 <whee> see, with that, execution terminates right after

00:09 <Vincenz> parsetoken(stream), spaces(stream), parsetoken(stream) it works

00:09 <whee> so you don't actually go and scan the rest of the stream

00:09 <Vincenz> oh

00:09 <whee> what you typically want to do is return a stream of tokens

00:09 <Vincenz> [< >] makes it stop?

00:09 <whee> no, your IDENT (implode(r)) thing does

00:10 <whee> I think

00:10 <Vincenz> but...

00:10 <Vincenz> the spaces is still in the matching part

00:10 <Vincenz> perhaps....

00:11 <whee> you want to approach it differently, I think

00:11 <Vincenz> the [< >] on identifier makes it Parse Failure

00:11 <Vincenz> whee: I've seen other approaches, just wondering why this won't work

00:11 <whee> I don't believe it's properly recursing on itself

00:11 <Vincenz> but...

00:12 <Vincenz> it doesn't have to recurse

00:12 <Vincenz> I should be able to do

00:12 <Vincenz> parsetoken, parsetoken

00:12 <Vincenz> and get

00:12 <Vincenz> IDENT "abc", INT 123

00:12 <whee> can you paste all the code in a /msg to me?

00:12 * Vincenz nods

00:14 mattam has joined #ocaml

00:14 <whee> er, parsetoken looks wrong

00:15 <whee> I think you really want to have parsetoken as two cases, one where r = integer and another where r = identifier

00:15 <Vincenz> indeed

00:15 <Vincenz> and I call it once, I get string

00:15 <Vincenz> I call it again

00:15 <Vincenz> and I get an int

00:15 <Vincenz> that should be the idea

00:15 <Vincenz> but it gets stuck on the space

00:16 <whee> I don't know if the spaces are the problem. parsetoken just looks zany to me

00:16 <Vincenz> heh

00:16 <whee> you're matching the same character classes twice, once in parsetoken, once in the actual place where it should be

00:17 <whee> I'd also take that one step further and make spaces another case for parsetoken which will just recurse on itself until there's no more whitespace

00:17 <Vincenz> that's alright

00:17 <Vincenz> that part works

00:18 <Vincenz> without a space it works fine

00:18 <Vincenz> it just gets stuck on the space

00:19 <Vincenz> is it possible to trace execution?

00:20 <Vincenz> # let mystream = Stream.of_string("abc123");;

00:20 <Vincenz> val mystream : char Stream.t = <abstr>

00:20 <Vincenz> # parsetoken(mystream);;

00:20 <Vincenz> - : token = IDENT "abc"

00:20 <Vincenz> # parsetoken(mystream);;

00:20 <Vincenz> - : token = INT 123

00:20 <Vincenz> # let mystream = Stream.of_string("abc 123");;

00:20 <Vincenz> val mystream : char Stream.t = <abstr>

00:20 <Vincenz> # parsetoken(mystream);;

00:20 <Vincenz> - : token = IDENT "abc"

00:20 <Vincenz> # parsetoken(mystream);;

00:20 <Vincenz> - : token = INT 0

00:20 <Vincenz> # Stream.peek(mystream);;

00:20 <Vincenz> - : char option = Some ' '

00:21 <Vincenz> # spaces(mystream);;

00:21 <Vincenz> - : unit = ()

00:21 <Vincenz> # parsetoken(mystream);;

00:21 <Vincenz> - : token = INT 123

00:21 <Vincenz> why doesn't it get past the spaces in parsetoken?

00:21 <whee> I still think it's fundamentally flawed code, so :)

00:22 <Vincenz> I wish I knew why tho

00:22 <whee> parsetoken is broken. :P

00:22 <Vincenz> it works like expected without spaces...

00:24 <whee> remove the ;spaces part, and add a case [< spaces; rest >] -> parsetoken rest
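
whee's suggestion, treating whitespace as one more case that recurses until a real token starts, can be sketched in plain OCaml without the camlp4 syntax extension. This is only an illustration under assumptions: the `cursor` type and the helpers `peek`, `advance` and `take_while` are hypothetical stand-ins for `char Stream.t` and its operations, the `EOF` token is added for the sketch, and identifiers are taken to be letters only, matching the session above where "abc123" lexes as IDENT "abc" followed by INT 123.

```ocaml
type token = IDENT of string | INT of int | EOF

(* Hypothetical cursor standing in for a char Stream.t. *)
type cursor = { text : string; mutable pos : int }

let peek c =
  if c.pos < String.length c.text then Some c.text.[c.pos] else None

let advance c = c.pos <- c.pos + 1

(* Consume characters while [keep] holds and return them as a string. *)
let take_while keep c =
  let start = c.pos in
  let rec loop () =
    match peek c with
    | Some ch when keep ch -> advance c; loop ()
    | _ -> ()
  in
  loop ();
  String.sub c.text start (c.pos - start)

let is_alpha ch = (ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z')
let is_digit ch = ch >= '0' && ch <= '9'

(* Whitespace is just another case: skip one blank and try again. *)
let rec parsetoken c =
  match peek c with
  | None -> EOF
  | Some (' ' | '\t' | '\n') -> advance c; parsetoken c
  | Some ch when is_alpha ch -> IDENT (take_while is_alpha c)
  | Some ch when is_digit ch -> INT (int_of_string (take_while is_digit c))
  | Some ch -> failwith (Printf.sprintf "unexpected character %C" ch)
```

With this shape, two successive calls on a cursor over "abc 123" should return the identifier and then the integer, with no separate call to a whitespace skipper in between.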

00:24 <Vincenz> I don't want it to recursively call itself yet, and I know otherways to do it, I'm just wondering WHY it won't match

00:26 <Vincenz> I could do a spaces

00:26 <Vincenz> and then a match...

00:26 <Vincenz> instead of parsing right away

00:26 <Vincenz> just curious why this won't work

00:28 <Vincenz> it's really odd

00:28 <Vincenz> I mean...

00:28 <Vincenz> if I do

00:28 <Vincenz> [< ''0'..'9' as c; r = (integer (int_of_digit c)); ''a' >] -> INT r

00:28 <Vincenz> and parse with 123a, I get 123

00:28 <Vincenz> but with 123, i get a parse error

00:28 <Vincenz> so it DOES check that last part

00:28 <Vincenz> just not with spaces

00:29 <Vincenz> WO!!!

00:29 <Vincenz> it works now

00:29 <Vincenz> odd

00:29 <Vincenz> | [< ''a'..'z'|'A'..'Z' as c; r = (identifier [c]); a = spaces >] -> IDENT (implode(r))

00:29 <Vincenz> oops, sorry

00:30 <Vincenz> now that I put it in a var, it works

00:30 <Vincenz> VERY odd

00:30 <Vincenz> well at least it works :)
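
A likely explanation for why `a = spaces` works where the bare `spaces` did not: in the camlp4 stream parser syntax, a component of the form `pattern = expr` calls a parser function on the stream, while a bare identifier is a pattern that binds the remaining stream to that name. Under that reading, the bare `spaces` never called the function; it only shadowed it, so no blanks were consumed. The plain OCaml sketch below shows the same pitfall with list patterns. The function names here are illustrative and not taken from Vincenz's code.

```ocaml
(* Drop leading blanks from a list of characters. *)
let rec spaces = function
  | (' ' | '\t' | '\n') :: rest -> spaces rest
  | rest -> rest

(* Mistaken version: inside the pattern, [spaces] is a fresh variable
   bound to the tail of the list. The function [spaces] is never called. *)
let after_first_wrong = function
  | _ :: spaces -> spaces
  | [] -> []

(* Intended version: bind the tail to a name, then call the function on it. *)
let after_first_right = function
  | _ :: rest -> spaces rest
  | [] -> []
```

On the input `['a'; ' '; '1']`, the first version leaves the blank in place and the second removes it, which mirrors the INT 0 versus INT 123 results in the session above.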

00:31 <Vincenz> WOO, back to work :)

00:34 <whee> you should get into the habit of separating function parameters from the function

00:34 <whee> heh

00:34 <whee> and I still don't like that code :P

00:35 <Vincenz> I do, I have a plan with it :P

00:35 <Vincenz> at what point do you check syntax?

00:35 <whee> when you're writing? :P

00:35 <Vincenz> like...(for example)..begin is followed by end?

00:35 <Vincenz> at this stage, or the next stage on the AST?

00:35 <whee> well you'd check for syntax during parsing and construction of the AST

00:35 <whee> some things you can check that way, anyway

00:36 <whee> other things need to be checked when the AST is built

00:36 * Vincenz nods

00:36 <Vincenz> but not at this lexing stage, right?

00:36 <Vincenz> now I just turn text into tokens?

00:36 skylan has quit [Read error: 104 (Connection reset by peer)]

00:36 skylan has joined #ocaml

00:36 * Vincenz is a newb to ocaml and compilers, but is trying to learn by building one bottom-up

00:36 <whee> I'd go and transform the stream of chars into a stream of tokens

00:36 <Vincenz> and check after?

00:37 <Vincenz> yeah, prolly easier

00:37 <whee> well, yes

00:37 <Vincenz> thnx
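
The division of labour discussed here, where the lexer only produces tokens and the parser checks that each begin has its end while building the AST, might look like the following sketch. The `token` and `ast` types, the `Syntax_error` exception and both function names are assumptions made for illustration, and a token list stands in for a token stream.

```ocaml
type token = BEGIN | END | IDENT of string

type ast = Name of string | Block of ast list

exception Syntax_error of string

(* Parse items until an END or the end of input.
   Returns the items parsed so far and the tokens that remain. *)
let rec parse_items acc = function
  | BEGIN :: rest ->
    let body, rest = parse_items [] rest in
    (match rest with
     | END :: rest -> parse_items (Block body :: acc) rest
     | _ -> raise (Syntax_error "begin without matching end"))
  | IDENT s :: rest -> parse_items (Name s :: acc) rest
  | (END :: _ | []) as rest -> (List.rev acc, rest)

(* A whole program must consume every token; a leftover END is an error. *)
let parse_program tokens =
  match parse_items [] tokens with
  | items, [] -> items
  | _, _ -> raise (Syntax_error "end without matching begin")
```

The lexer never needs to know that BEGIN and END come in pairs. The mismatch surfaces in the parser, at the point where the nested `Block` node would be built.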

00:37 <Vincenz> what can you use besides (* for comments?

00:38 <whee> that's all you can use

00:39 <Vincenz> ok

00:41 <Vincenz> you know any site that defines standard code in ocaml? (like, how to tab, etc...)

00:42 <whee> http://caml.inria.fr/FAQ/pgl-eng.html

00:45 <Vincenz> ah thanx :)

00:45 <Vincenz> you can put an | on the first match case?

00:45 <whee> yes

00:46 <Vincenz> good to know
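
As a small illustration of the leading `|`: the two functions below differ only in whether the first case carries a bar, and they behave identically. The `token` type and function names are made up for the example.

```ocaml
type token = IDENT of string | INT of int

(* First case written with a leading bar. *)
let describe_with_bar = function
  | IDENT name -> "identifier " ^ name
  | INT n -> "integer " ^ string_of_int n

(* Same function, first case written without the bar. *)
let describe_without_bar = function
    IDENT name -> "identifier " ^ name
  | INT n -> "integer " ^ string_of_int n
```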

00:46 TachYon has quit [Remote closed the connection]

01:03 skylan_ has joined #ocaml

01:05 Vincenz has quit ["sleep..."]

01:10 skylan has quit [Connection reset by peer]

02:55 mattam has quit ["zZz"]

03:02 palomer has joined #ocaml

03:17 lament has quit ["Did you know that God's name is ERIS, and that He is a girl?"]

03:27 Kinners has left #ocaml []

04:03 palomer has quit [Remote closed the connection]

06:01 mattam has joined #ocaml

08:52 whee has quit ["Leaving"]

09:29 TachYon has joined #ocaml

09:33 TachYon has quit [Client Quit]

10:52 Vincenz has joined #ocaml

10:53 <Vincenz> Hi!

11:03 Kinners has joined #ocaml

11:07 skylan_ has quit [Excess Flood]

11:07 skylan__ has joined #ocaml

11:53 mellum has joined #ocaml

11:54 rox has quit [Excess Flood]

11:55 rox has joined #ocaml

12:35 gl has quit [Read error: 60 (Operation timed out)]

12:45 Kinners has left #ocaml []

13:13 two-face has joined #ocaml

13:15 gl has joined #ocaml

13:26 two-face has quit ["Client Exiting"]

14:42 skylan__ is now known as skylan

15:40 pattern_ has quit [brunner.freenode.net irc.freenode.net]

15:41 pattern_ has joined #ocaml

15:55 mellum has quit [Read error: 110 (Connection timed out)]

16:34 Krystof is now known as gerd

16:34 gerd is now known as Krystof

16:34 sshaw_ has joined #ocaml

17:04 whee has joined #ocaml

17:09 <Vincenz> hmm is the parser in ocaml LALR?

17:15 <whee> eeh, what

17:16 <whee> you should probably check the reference manual, since I can't remember

17:16 <whee> if you're talking about stream parsers that camlp4 provides

17:17 <whee> I think it's LL, though

17:39 mrvn_ has joined #ocaml

17:44 <Vincenz> oh

17:44 <Vincenz> LALL

17:45 <Vincenz> Look-Ahead Left..?

17:45 <Vincenz> damn forgot the terminology

17:45 <Vincenz> or wait

17:46 * Vincenz gets his compiler book out

17:46 <Vincenz> I doubt it's LL

17:46 <Vincenz> if anything LR
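
For context on the LL versus LR question: ocamlyacc generates LALR(1) parsers, whereas camlp4 stream parsers are, as far as I understand them, recursive descent parsers that pick a case by looking at the next element of the stream, which is the LL style whee has in mind. A minimal hand-written sketch of that style follows. The grammar, the `token` type and the function names are assumptions for illustration, and a token list stands in for a stream.

```ocaml
type token = INT of int | PLUS

(* Grammar:  expr ::= INT rest
             rest ::= PLUS INT rest | (empty)   *)

let parse_int = function
  | INT n :: tl -> (n, tl)
  | _ -> failwith "expected integer"

(* One token of lookahead decides which production of [rest] to use. *)
let rec parse_rest acc tokens =
  match tokens with
  | PLUS :: tl ->
    let n, tl = parse_int tl in
    parse_rest (acc + n) tl
  | _ -> (acc, tokens)  (* any other token selects the empty production *)

(* Returns the sum and the tokens left unconsumed. *)
let parse_expr tokens =
  let n, tl = parse_int tokens in
  parse_rest n tl
```

Each nonterminal becomes one function, and the choice between productions is made top-down from the next token alone. An LR parser instead shifts tokens onto a stack and reduces once a full right-hand side has been seen.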

17:54 mrvn has quit [Read error: 110 (Connection timed out)]

17:57 <Vincenz> if my internet were any slower

17:57 <Vincenz> I could write my individual characters that I type by snailmail to my provider

18:00 mellum has joined #ocaml

18:05 systems has joined #ocaml

18:12 sshaw_ has quit [Read error: 104 (Connection reset by peer)]

18:39 lament has joined #ocaml

18:42 mattam_ has joined #ocaml

18:43 systems has left #ocaml []

18:59 mattam has quit [Read error: 110 (Connection timed out)]

20:05 mrvn_ is now known as mrvn

20:29 systems has joined #ocaml

20:38 TrOn has joined #ocaml

20:49 systems has left #ocaml []

22:47 mrvn has quit [Read error: 60 (Operation timed out)]

23:04 mrvn has joined #ocaml

23:30 mrvn has quit [Read error: 110 (Connection timed out)]

23:51 mrvn has joined #ocaml