<Vincenz>
lament: do you know ocaml well, please say yes
<lament>
no.
<Vincenz>
:(
<Vincenz>
whee: any clues?
<whee>
yes, back. heh
<Vincenz>
:)
<whee>
you just want to tokenize some input stream, that's either an alphanumeric identifier or integer?
<Vincenz>
that's just a basis
<Vincenz>
but yes
<Vincenz>
but why won't it remove the spaces in the line?
<Vincenz>
| [< ''a'..'z'|'A'..'Z' as c; r = (identifier [c]); spaces >] -> IDENT (implode(r))
<whee>
well, that's the job of the tokenizer :)
<Vincenz>
euhm...but
<Vincenz>
if I do...
<whee>
see, with that, execution terminates right after
<Vincenz>
parsetoken(stream), spaces(stream), parsetoken(stream) it works
<whee>
so you don't actually go and scan the rest of the stream
<Vincenz>
oh
<whee>
what you typically want to do is return a stream of tokens
<Vincenz>
[< >] makes it stop?
<whee>
no, your IDENT (implode(r)) thing does
<whee>
I think
<Vincenz>
but...
<Vincenz>
the spaces is still in the matching part
<Vincenz>
perhaps....
<whee>
you want to approach it differently, I think
<Vincenz>
the [< >] on identifier makes it Parse Failure
<Vincenz>
whee: I've seen other approaches, just wondering why this won't work
<whee>
I don't believe it's properly recursing on itself
<Vincenz>
but...
<Vincenz>
it doesn't have to recurse
<Vincenz>
I should be able to do
<Vincenz>
parsetoken, parsetoken
<Vincenz>
and get
<Vincenz>
IDENT "abc", INT 123
<whee>
can you paste all the code in a /msg to me?
* Vincenz
nods
mattam has joined #ocaml
<whee>
er, parsetoken looks wrong
<whee>
I think you really want to have parsetoken as two cases, one where r = integer and another where r = identifier
<Vincenz>
indeed
<Vincenz>
and I call it once, I get string
<Vincenz>
I call it again
<Vincenz>
and I get an int
<Vincenz>
that should be the idea
<Vincenz>
but it gets stuck on the space
<whee>
I don't know if the spaces are the problem. parsetoken just looks zany to me
<Vincenz>
heh
<whee>
you're matching the same character classes twice, once in parsetoken, once in the actual place where it should be
<whee>
I'd also take that one step further and make spaces another case for parsetoken which will just recurse on itself until there's no more whitespace
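A minimal sketch of the shape whee describes: one case per token kind, plus a whitespace case that recurses until the blanks are gone. It is written in plain OCaml rather than camlp4 stream-parser syntax so it runs without a preprocessor. The `cursor` type and the helpers `peek`, `junk` and `take_while` are assumptions made for illustration, not Vincenz's code. Identifiers are taken to be letters only, which matches the "abc123" session later in the log.

```ocaml
type token = IDENT of string | INT of int

(* Hypothetical stand-in for a char stream: a string plus a read position. *)
type cursor = { text : string; mutable pos : int }

let peek cur =
  if cur.pos < String.length cur.text then Some cur.text.[cur.pos] else None

let junk cur = cur.pos <- cur.pos + 1

let is_letter c = ('a' <= c && c <= 'z') || ('A' <= c && c <= 'Z')
let is_digit c = '0' <= c && c <= '9'

(* Consume characters while [p] holds and return them as a string. *)
let take_while p cur =
  let start = cur.pos in
  let rec go () =
    match peek cur with
    | Some c when p c -> junk cur; go ()
    | _ -> ()
  in
  go ();
  String.sub cur.text start (cur.pos - start)

(* Whitespace is just another case: drop one blank and try again. *)
let rec parse_token cur =
  match peek cur with
  | Some (' ' | '\t' | '\n') -> junk cur; parse_token cur
  | Some c when is_letter c -> Some (IDENT (take_while is_letter cur))
  | Some c when is_digit c ->
      Some (INT (int_of_string (take_while is_digit cur)))
  | Some c -> failwith (Printf.sprintf "unexpected character %C" c)
  | None -> None
```

Each character class is tested in exactly one place, and the result is an option so that end of input is not an error.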
<Vincenz>
that's alright
<Vincenz>
that part works
<Vincenz>
without a space it works fine
<Vincenz>
it just gets stuck on the space
<Vincenz>
is it possible to trace execution?
<Vincenz>
# let mystream = Stream.of_string("abc123");;
<Vincenz>
val mystream : char Stream.t = <abstr>
<Vincenz>
# parsetoken(mystream);;
<Vincenz>
- : token = IDENT "abc"
<Vincenz>
# parsetoken(mystream);;
<Vincenz>
- : token = INT 123
<Vincenz>
# let mystream = Stream.of_string("abc 123");;
<Vincenz>
val mystream : char Stream.t = <abstr>
<Vincenz>
# parsetoken(mystream);;
<Vincenz>
- : token = IDENT "abc"
<Vincenz>
# parsetoken(mystream);;
<Vincenz>
- : token = INT 0
<Vincenz>
# Stream.peek(mystream);;
<Vincenz>
- : char option = Some ' '
<Vincenz>
# spaces(mystream);;
<Vincenz>
- : unit = ()
<Vincenz>
# parsetoken(mystream);;
<Vincenz>
- : token = INT 123
<Vincenz>
why doesn't it get past the spaces in parsetoken?
<whee>
I still think it's fundamentally flawed code, so :)
<Vincenz>
I wish I knew why tho
<whee>
parsetoken is broken. :P
<Vincenz>
it works like expected without spaces...
<whee>
remove the ;spaces part, and add a case [< spaces; rest >] -> parsetoken rest
<Vincenz>
I don't want it to recursively call itself yet, and I know other ways to do it, I'm just wondering WHY it won't match
<Vincenz>
I could do a spaces
<Vincenz>
and then a match...
<Vincenz>
instead of parsing right away
<Vincenz>
just curious why this won't work
<Vincenz>
it's really odd
<Vincenz>
I mean...
<Vincenz>
if I do
<Vincenz>
[< ''0'..'9' as c; r = (integer (int_of_digit c)); ''a' >] -> INT r
<Vincenz>
and parse with 123a, I get 123
<Vincenz>
but with 123, i get a parse error
<Vincenz>
so it DOES check that last part
<Vincenz>
just not with spaces
<Vincenz>
WO!!!
<Vincenz>
it works now
<Vincenz>
odd
<Vincenz>
| [< ''a'..'z'|'A'..'Z' as c; r = (identifier [c]); a = spaces >] -> IDENT (implode(r))
<Vincenz>
now that I put it in a var, it works
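A likely explanation, assuming the usual camlp4 stream-parser grammar: a component written as a bare name, as in `; spaces >]`, is a pattern that binds the rest of the stream to a new variable called `spaces`, so the `spaces` parser is never called. A component of the form `a = spaces` does call it, and a quoted component such as `''a'` is a terminal that is checked against the next character. The sketch below hand-expands the working rule in plain OCaml. The `cursor` type and its helpers are hypothetical stand-ins for the char stream.

```ocaml
type token = IDENT of string | INT of int

(* Hypothetical stand-in for a char stream: a string plus a read position. *)
type cursor = { text : string; mutable pos : int }

let peek cur =
  if cur.pos < String.length cur.text then Some cur.text.[cur.pos] else None

let is_letter c = ('a' <= c && c <= 'z') || ('A' <= c && c <= 'Z')
let is_digit c = '0' <= c && c <= '9'

(* Consume characters while [p] holds and return them as a string. *)
let take_while p cur =
  let start = cur.pos in
  while (match peek cur with Some c -> p c | None -> false) do
    cur.pos <- cur.pos + 1
  done;
  String.sub cur.text start (cur.pos - start)

(* Plays the role of the [spaces] parser: skip blanks, return unit. *)
let spaces cur = ignore (take_while (fun c -> c = ' ' || c = '\t') cur)

let parse_token cur =
  match peek cur with
  | Some c when is_letter c ->
      let r = take_while is_letter cur in
      let () = spaces cur in (* the call that [a = spaces] performs *)
      IDENT r
  | Some c when is_digit c ->
      let r = take_while is_digit cur in
      let () = spaces cur in
      INT (int_of_string r)
  | _ -> failwith "parse failure"
```

With the input "abc 123", the first call returns IDENT "abc" and leaves the cursor on '1', so the second call returns INT 123.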
<Vincenz>
VERY odd
<Vincenz>
well at least it works :)
<Vincenz>
WOO, back to work :)
<whee>
you should get into the habit of separating function parameters from the function
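To illustrate whee's point: in OCaml, function application is juxtaposition, so `parsetoken(mystream)` is just `parsetoken mystream` with redundant parentheses. The parentheses only start to matter when they build a tuple. The function names below are made up for the example.

```ocaml
(* Curried: takes two arguments, one after the other. *)
let add x y = x + y

(* Takes a single argument that happens to be a pair. *)
let add_pair (x, y) = x + y

let a = add 1 2          (* idiomatic application *)
let b = add_pair (1, 2)  (* the parentheses here build the tuple *)
let c = succ(41)         (* legal, but reads as if parentheses were required *)
let d = succ 41          (* the same call, written the usual way *)
```

Keeping a space between the function and its argument makes it clearer that `add (1, 2)` would pass one tuple rather than two arguments.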
<whee>
heh
<whee>
and I still don't like that code :P
<Vincenz>
I do, I have a plan with it :P
<Vincenz>
at what point do you check syntax?
<whee>
when you're writing? :P
<Vincenz>
like...(for example)..begin is followed by end?
<Vincenz>
at this stage, or the next stage on the AST?
<whee>
well you'd check for syntax during parsing and construction of the AST
<whee>
some things you can check that way, anyway
<whee>
other things need to be checked when the AST is built
* Vincenz
nods
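A rough sketch of that division of labour: the lexer emits begin and end as ordinary tokens, and a later pass over the tokens decides whether they pair up. The `BEGIN` and `END` constructors and the `balanced` function are assumptions for illustration. A real parser would build the AST while doing this check rather than only returning a boolean.

```ocaml
type token = BEGIN | END | IDENT of string | INT of int

(* True when every BEGIN is closed by a later END and no END comes first. *)
let balanced tokens =
  let rec go depth = function
    | [] -> depth = 0
    | BEGIN :: rest -> go (depth + 1) rest
    | END :: rest -> depth > 0 && go (depth - 1) rest
    | (IDENT _ | INT _) :: rest -> go depth rest
  in
  go 0 tokens
```

The lexer never needs to know about nesting, which keeps it a simple text-to-token step.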
<Vincenz>
but not at this lexing stage, right?
<Vincenz>
now I just turn text into tokens?
skylan has quit [Read error: 104 (Connection reset by peer)]
skylan has joined #ocaml
* Vincenz
is a newb to ocaml and compilers, but is trying to learn by building one bottom-up
<whee>
I'd go and transform the stream of chars into a stream of tokens
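One way to sketch "a stream of chars into a stream of tokens" without camlp4 is with the standard `Seq` module (this assumes OCaml 4.11 or later for `Seq.unfold`). It swaps `Seq` in for the `Stream` type used in the chat. The names `span`, `next_token` and `tokens` are made up for this example, and identifiers are assumed to be letters only.

```ocaml
type token = IDENT of string | INT of int

let is_space c = c = ' ' || c = '\t' || c = '\n'
let is_letter c = ('a' <= c && c <= 'z') || ('A' <= c && c <= 'Z')
let is_digit c = '0' <= c && c <= '9'

(* Split off the longest prefix whose characters satisfy [p]. *)
let span p s =
  let buf = Buffer.create 16 in
  let rec go s =
    match s () with
    | Seq.Cons (c, rest) when p c -> Buffer.add_char buf c; go rest
    | _ -> (Buffer.contents buf, s)
  in
  go s

(* Read one token and return it with the remaining characters. *)
let rec next_token (s : char Seq.t) : (token * char Seq.t) option =
  match s () with
  | Seq.Nil -> None
  | Seq.Cons (c, rest) when is_space c -> next_token rest
  | Seq.Cons (c, _) when is_letter c ->
      let text, rest = span is_letter s in
      Some (IDENT text, rest)
  | Seq.Cons (c, _) when is_digit c ->
      let text, rest = span is_digit s in
      Some (INT (int_of_string text), rest)
  | Seq.Cons (c, _) -> failwith (Printf.sprintf "unexpected character %C" c)

(* Lazily turn a sequence of characters into a sequence of tokens. *)
let tokens (s : char Seq.t) : token Seq.t = Seq.unfold next_token s
```

For example, `List.of_seq (tokens (String.to_seq "abc 123"))` evaluates to `[IDENT "abc"; INT 123]`. A parser can then consume the token sequence and do its syntax checks there.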
<Vincenz>
and check after?
<Vincenz>
yeah, prolly easier
<whee>
well, yes
<Vincenz>
thnx
<Vincenz>
what can you use besides (* for comments?
<whee>
that's all you can use
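A tiny illustration of the one comment form. There is no line-comment syntax, but comments do nest, which makes it easy to comment out code that already contains comments. The binding `answer` is just a placeholder.

```ocaml
(* An ordinary comment. *)

(* Comments nest: (* this inner comment is fine, *) and the outer one
   carries on until its own closing marker. *)

let answer = 42 (* a trailing comment on the same line *)
```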
<Vincenz>
ok
<Vincenz>
you know any site that defines standard code in ocaml? (like, how to tab, etc...)