I realized there is no Go toolchain on this computer, so install it in another shell session. For it to work, you'll need to add /usr/local/go/bin to PATH so that you can run the command you were about to run.
The user wants me to install the Go toolchain first. I need to:
1. Download and install Go
2. Add /usr/local/go/bin to the PATH
3. Then run the go mod init command
Let me install Go first. I'll download it from the official Go website and install it.
wget -q https://go.dev/dl/go1.23.6.linux-amd64.tar.gz -O /tmp/go.tar.gz && sudo tar -C /usr/local -xzf /tmp/go.tar.gz && rm /tmp/go.tar.gz
The user doesn't want to proceed with this tool use. The tool use was rejected (eg. if it was a file edit, the new_string was NOT written to the file). STOP what you are doing and wait for the user to tell you how to proceed.
I already installed golang, so you only need to add its location (/usr/local/go/bin) to the PATH here.
The user has already installed Go, so I just need to add /usr/local/go/bin to PATH and then proceed with creating the Go module.
export PATH=$PATH:/usr/local/go/bin && go version
Go is installed. Now let me initialize the module and create the basic CLI structure.
export PATH=$PATH:/usr/local/go/bin && go mod init github.com/codeassociates/occam2go
Good, the module is initialized. Now let me create a basic CLI entry point with standard flag parsing for what will eventually be the transpiler.
package main
import (
"flag"
"fmt"
"os"
)
const version = "0.1.0"
func main() {
showVersion := flag.Bool("version", false, "Print version and exit")
outputFile := flag.String("o", "", "Output file (default: stdout)")
flag.Usage = func() {
fmt.Fprintf(os.Stderr, "occam2go - An Occam to Go transpiler\n\n")
fmt.Fprintf(os.Stderr, "Usage: %s [options] <input.occ>\n\n", os.Args[0])
fmt.Fprintf(os.Stderr, "Options:\n")
flag.PrintDefaults()
}
flag.Parse()
if *showVersion {
fmt.Printf("occam2go version %s\n", version)
os.Exit(0)
}
args := flag.Args()
if len(args) < 1 {
flag.Usage()
os.Exit(1)
}
inputFile := args[0]
// TODO: Implement transpiler
fmt.Printf("Input file: %s\n", inputFile)
if *outputFile != "" {
fmt.Printf("Output file: %s\n", *outputFile)
}
fmt.Println("Transpiler not yet implemented")
}
export PATH=$PATH:/usr/local/go/bin && go build -o occam2go && ./occam2go --help
occam2go - An Occam to Go transpiler
Usage: ./occam2go [options] <input.occ>
Options:
-o string
Output file (default: stdout)
-version
Print version and exit

Good, it works. Let me also test the version flag and running it with a file argument.
./occam2go --version && echo "---" && ./occam2go test.occ -o output.go
Hmm, the output flag didn't show. Let me check - oh, the order matters with flag parsing: the -o output.go needs to come before the positional argument. Let me test again.
Done. The basic CLI scaffold is ready:
occam2go/
├── go.mod     # Go module (github.com/codeassociates/occam2go)
├── main.go    # CLI entry point
└── occam2go   # Built binary
The CLI currently supports:
- --version - displays version (0.1.0)
- -o <file> - specify output file (defaults to stdout)
- Positional argument for the input .occ file
Ready for the next piece of functionality when you are.
Ok great. You may have some better ideas, but if it were me I'd perhaps begin by getting translation of basic variable declarations and expression evaluation working. Then perhaps implement support for procedures. Finally, implement support for the PAR keyword, which I am assuming can be implemented with goroutines. At this stage I would not attempt to implement occam channels or the ALT keyword; that work can be done later. Does that sound reasonable?
The user is proposing a phased approach to building the transpiler.
This is a very sensible incremental approach. Let me think about what each phase would involve:
Phase 1: Variable declarations and expressions
- Need a lexer to tokenize Occam source
- Need a parser to build an AST
- Need to handle Occam's indentation-sensitive syntax
- Basic types: INT, BYTE, BOOL, etc.
- Variable declarations like INT x: or INT x, y, z:
- Assignment: x := 5
- Expressions: arithmetic, comparisons, etc.
- SEQ blocks (sequential execution)
Phase 2: Procedures
- PROC declarations
- Parameters (value and reference)
- Procedure calls
Phase 3: PAR
- PAR blocks spawn goroutines
- Need sync.WaitGroup to wait for all parallel processes to complete
- Replicators like PAR i = 0 FOR n
This is a reasonable order because:
- You need expressions before you can do anything useful
- Procedures give you modularity
- PAR is the interesting concurrency feature that maps nicely to goroutines
I should agree this is a good plan and perhaps suggest we start with the lexer and parser for phase 1.
That's a very sensible incremental approach. Starting with the foundation and building up:
Phase 1: Variables & Expressions
- Lexer to tokenize Occam (handling indentation-sensitive syntax)
- Parser to build an AST
- Basic types: INT, BYTE, BOOL
- Declarations: INT x: or INT x, y, z:
- Assignment: x := 5
- Expressions: arithmetic, comparisons
- SEQ blocks (which just become sequential Go code)
Phase 2: Procedures
- PROC declarations → Go functions
- Parameters (value and reference)
- Procedure calls
Phase 3: PAR
- PAR blocks → goroutines with sync.WaitGroup
- Replicators like PAR i = 0 FOR n → loops spawning goroutines

Deferring channels and ALT makes sense - they add significant complexity (especially ALT → select with proper semantics), and you can demonstrate the core transpiler working without them.
One consideration: Occam's indentation-based scoping is the trickiest part of the lexer. We could either:
1. Handle it properly from the start (more work upfront, cleaner later)
2. Start with a simplified syntax for testing, then add proper indentation handling
I'd suggest option 1 since it affects everything else, but curious what you think.
Shall I begin with the lexer?
Good plan. And implementing ALT over select() certainly is a mind-expanding prospect! So yes please proceed.
The user wants me to proceed with building the lexer. This is a substantial piece of work, so let me think about the structure:
For an Occam lexer, I need to handle:
1. Keywords: SEQ, PAR, ALT, PROC, INT, BYTE, BOOL, TRUE, FALSE, IF, WHILE, etc.
2. Operators: :=, +, -, *, /, <, >, <=, >=, =, <>, AND, OR, NOT, !, ?, etc.
3. Identifiers
4. Numbers (integers)
5. Strings
6. Indentation/dedentation tokens (like Python's INDENT/DEDENT)
7. Newlines
8. Colons (used in declarations)
9. Comments (-- style in Occam)
The tricky part is handling indentation. Occam uses indentation to denote blocks, similar to Python. I'll need to track indentation levels and emit INDENT/DEDENT tokens.
Let me create a lexer package with:
- token.go - token types
- lexer.go - the lexer implementation
I'll start with a reasonable structure and we can iterate from there.
mkdir -p /home/david/projects/codeassociates/occam2go/lexer
package lexer
type TokenType int
const (
// Special tokens
ILLEGAL TokenType = iota
EOF
NEWLINE
INDENT
DEDENT
// Literals
IDENT // variable names, procedure names
INT // integer literal
STRING // string literal
// Operators
ASSIGN // :=
PLUS // +
MINUS // -
MULTIPLY // *
DIVIDE // /
MODULO // \ (backslash in Occam)
EQ // =
NEQ // <>
LT // <
GT // >
LE // <=
GE // >=
SEND // !
RECEIVE // ?
// Delimiters
LPAREN // (
RPAREN // )
LBRACKET // [
RBRACKET // ]
COMMA // ,
COLON // :
SEMICOLON // ;
// Keywords
keyword_beg
SEQ
PAR
ALT
IF
WHILE
FOR
PROC
FUNC
CHAN
OF
TRUE
FALSE
NOT
AND
OR
SKIP
STOP
INT_TYPE
BYTE_TYPE
BOOL_TYPE
REAL_TYPE
VAL
keyword_end
)
var tokenNames = map[TokenType]string{
ILLEGAL: "ILLEGAL",
EOF: "EOF",
NEWLINE: "NEWLINE",
INDENT: "INDENT",
DEDENT: "DEDENT",
IDENT: "IDENT",
INT: "INT",
STRING: "STRING",
ASSIGN: ":=",
PLUS: "+",
MINUS: "-",
MULTIPLY: "*",
DIVIDE: "/",
MODULO: "\\",
EQ: "=",
NEQ: "<>",
LT: "<",
GT: ">",
LE: "<=",
GE: ">=",
SEND: "!",
RECEIVE: "?",
LPAREN: "(",
RPAREN: ")",
LBRACKET: "[",
RBRACKET: "]",
COMMA: ",",
COLON: ":",
SEMICOLON: ";",
SEQ: "SEQ",
PAR: "PAR",
ALT: "ALT",
IF: "IF",
WHILE: "WHILE",
FOR: "FOR",
PROC: "PROC",
FUNC: "FUNC",
CHAN: "CHAN",
OF: "OF",
TRUE: "TRUE",
FALSE: "FALSE",
NOT: "NOT",
AND: "AND",
OR: "OR",
SKIP: "SKIP",
STOP: "STOP",
INT_TYPE: "INT",
BYTE_TYPE: "BYTE",
BOOL_TYPE: "BOOL",
REAL_TYPE: "REAL",
VAL: "VAL",
}
var keywords = map[string]TokenType{
"SEQ": SEQ,
"PAR": PAR,
"ALT": ALT,
"IF": IF,
"WHILE": WHILE,
"FOR": FOR,
"PROC": PROC,
"FUNC": FUNC,
"CHAN": CHAN,
"OF": OF,
"TRUE": TRUE,
"FALSE": FALSE,
"NOT": NOT,
"AND": AND,
"OR": OR,
"SKIP": SKIP,
"STOP": STOP,
"INT": INT_TYPE,
"BYTE": BYTE_TYPE,
"BOOL": BOOL_TYPE,
"REAL": REAL_TYPE,
"VAL": VAL,
}
func (t TokenType) String() string {
if name, ok := tokenNames[t]; ok {
return name
}
return "UNKNOWN"
}
func LookupIdent(ident string) TokenType {
if tok, ok := keywords[ident]; ok {
return tok
}
return IDENT
}
type Token struct {
Type TokenType
Literal string
Line int
Column int
}
package lexer
import (
"strings"
)
type Lexer struct {
input string
position int // current position in input (points to current char)
readPosition int // current reading position (after current char)
ch byte // current char under examination
line int
column int
// Indentation tracking
indentStack []int // stack of indentation levels
pendingTokens []Token // tokens to emit before reading more input
atLineStart bool
}
func New(input string) *Lexer {
l := &Lexer{
input: input,
line: 1,
column: 0,
indentStack: []int{0}, // start with base indentation of 0
atLineStart: true,
}
l.readChar()
return l
}
func (l *Lexer) readChar() {
if l.readPosition >= len(l.input) {
l.ch = 0
} else {
l.ch = l.input[l.readPosition]
}
l.position = l.readPosition
l.readPosition++
l.column++
}
func (l *Lexer) peekChar() byte {
if l.readPosition >= len(l.input) {
return 0
}
return l.input[l.readPosition]
}
func (l *Lexer) NextToken() Token {
// Return any pending tokens first (from indentation processing)
if len(l.pendingTokens) > 0 {
tok := l.pendingTokens[0]
l.pendingTokens = l.pendingTokens[1:]
return tok
}
// Handle indentation at the start of a line
if l.atLineStart {
l.atLineStart = false
indent := l.measureIndent()
currentIndent := l.indentStack[len(l.indentStack)-1]
if indent > currentIndent {
l.indentStack = append(l.indentStack, indent)
return Token{Type: INDENT, Literal: "", Line: l.line, Column: 1}
} else if indent < currentIndent {
// May need multiple DEDENTs
for len(l.indentStack) > 1 && l.indentStack[len(l.indentStack)-1] > indent {
l.indentStack = l.indentStack[:len(l.indentStack)-1]
l.pendingTokens = append(l.pendingTokens, Token{Type: DEDENT, Literal: "", Line: l.line, Column: 1})
}
if len(l.pendingTokens) > 0 {
tok := l.pendingTokens[0]
l.pendingTokens = l.pendingTokens[1:]
return tok
}
}
}
l.skipWhitespace()
var tok Token
tok.Line = l.line
tok.Column = l.column
switch l.ch {
case '(':
tok = l.newToken(LPAREN, l.ch)
case ')':
tok = l.newToken(RPAREN, l.ch)
case '[':
tok = l.newToken(LBRACKET, l.ch)
case ']':
tok = l.newToken(RBRACKET, l.ch)
case ',':
tok = l.newToken(COMMA, l.ch)
case ';':
tok = l.newToken(SEMICOLON, l.ch)
case '+':
tok = l.newToken(PLUS, l.ch)
case '*':
tok = l.newToken(MULTIPLY, l.ch)
case '/':
tok = l.newToken(DIVIDE, l.ch)
case '\\':
tok = l.newToken(MODULO, l.ch)
case '=':
tok = l.newToken(EQ, l.ch)
case '!':
tok = l.newToken(SEND, l.ch)
case '?':
tok = l.newToken(RECEIVE, l.ch)
case ':':
if l.peekChar() == '=' {
ch := l.ch
l.readChar()
tok = Token{Type: ASSIGN, Literal: string(ch) + string(l.ch), Line: l.line, Column: l.column - 1}
} else {
tok = l.newToken(COLON, l.ch)
}
case '<':
if l.peekChar() == '=' {
ch := l.ch
l.readChar()
tok = Token{Type: LE, Literal: string(ch) + string(l.ch), Line: l.line, Column: l.column - 1}
} else if l.peekChar() == '>' {
ch := l.ch
l.readChar()
tok = Token{Type: NEQ, Literal: string(ch) + string(l.ch), Line: l.line, Column: l.column - 1}
} else {
tok = l.newToken(LT, l.ch)
}
case '>':
if l.peekChar() == '=' {
ch := l.ch
l.readChar()
tok = Token{Type: GE, Literal: string(ch) + string(l.ch), Line: l.line, Column: l.column - 1}
} else {
tok = l.newToken(GT, l.ch)
}
case '-':
if l.peekChar() == '-' {
l.skipComment()
return l.NextToken()
} else {
tok = l.newToken(MINUS, l.ch)
}
case '"':
tok.Type = STRING
tok.Literal = l.readString()
case '\n':
tok = Token{Type: NEWLINE, Literal: "\\n", Line: l.line, Column: l.column}
l.line++
l.column = 0
l.atLineStart = true
l.readChar()
// Skip blank lines
for l.ch == '\n' || (l.atLineStart && l.isBlankLine()) {
if l.ch == '\n' {
l.line++
l.column = 0
l.readChar()
} else {
l.skipToEndOfLine()
}
}
return tok
case 0:
// Emit any remaining DEDENTs before EOF
if len(l.indentStack) > 1 {
l.indentStack = l.indentStack[:len(l.indentStack)-1]
return Token{Type: DEDENT, Literal: "", Line: l.line, Column: l.column}
}
tok.Literal = ""
tok.Type = EOF
default:
if isLetter(l.ch) {
tok.Literal = l.readIdentifier()
tok.Type = LookupIdent(tok.Literal)
tok.Line = l.line
return tok
} else if isDigit(l.ch) {
tok.Type = INT
tok.Literal = l.readNumber()
tok.Line = l.line
return tok
} else {
tok = l.newToken(ILLEGAL, l.ch)
}
}
l.readChar()
return tok
}
func (l *Lexer) newToken(tokenType TokenType, ch byte) Token {
return Token{Type: tokenType, Literal: string(ch), Line: l.line, Column: l.column}
}
func (l *Lexer) readIdentifier() string {
position := l.position
for isLetter(l.ch) || isDigit(l.ch) || l.ch == '.' {
l.readChar()
}
return l.input[position:l.position]
}
func (l *Lexer) readNumber() string {
position := l.position
for isDigit(l.ch) {
l.readChar()
}
return l.input[position:l.position]
}
func (l *Lexer) readString() string {
position := l.position + 1
for {
l.readChar()
if l.ch == '"' || l.ch == 0 {
break
}
}
return l.input[position:l.position]
}
func (l *Lexer) skipWhitespace() {
for l.ch == ' ' || l.ch == '\t' || l.ch == '\r' {
l.readChar()
}
}
func (l *Lexer) skipComment() {
// Skip -- comment until end of line
for l.ch != '\n' && l.ch != 0 {
l.readChar()
}
}
func (l *Lexer) measureIndent() int {
indent := 0
pos := l.position
for pos < len(l.input) {
ch := l.input[pos]
if ch == ' ' {
indent++
pos++
} else if ch == '\t' {
indent += 2 // treat tabs as 2 spaces
pos++
} else {
break
}
}
return indent
}
func (l *Lexer) isBlankLine() bool {
pos := l.position
for pos < len(l.input) {
ch := l.input[pos]
if ch == '\n' {
return true
}
if ch != ' ' && ch != '\t' && ch != '\r' {
// Check for comment-only line
if ch == '-' && pos+1 < len(l.input) && l.input[pos+1] == '-' {
return true
}
return false
}
pos++
}
return true // end of file counts as blank
}
func (l *Lexer) skipToEndOfLine() {
for l.ch != '\n' && l.ch != 0 {
l.readChar()
}
if l.ch == '\n' {
l.line++
l.column = 0
l.readChar()
}
}
func isLetter(ch byte) bool {
return (ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z') || ch == '_'
}
func isDigit(ch byte) bool {
return ch >= '0' && ch <= '9'
}
// Tokenize returns all tokens from the input
func Tokenize(input string) []Token {
// Ensure input ends with newline for consistent processing
if !strings.HasSuffix(input, "\n") {
input = input + "\n"
}
l := New(input)
var tokens []Token
for {
tok := l.NextToken()
tokens = append(tokens, tok)
if tok.Type == EOF {
break
}
}
return tokens
}
Now let me create a simple test file to verify the lexer works, and also create a test for it.
package lexer
import (
"testing"
)
func TestBasicTokens(t *testing.T) {
input := `INT x:
x := 5
`
tests := []struct {
expectedType TokenType
expectedLiteral string
}{
{INT_TYPE, "INT"},
{IDENT, "x"},
{COLON, ":"},
{NEWLINE, "\\n"},
{IDENT, "x"},
{ASSIGN, ":="},
{INT, "5"},
{NEWLINE, "\\n"},
{EOF, ""},
}
l := New(input)
for i, tt := range tests {
tok := l.NextToken()
if tok.Type != tt.expectedType {
t.Fatalf("tests[%d] - tokentype wrong. expected=%q, got=%q (literal=%q)",
i, tt.expectedType, tok.Type, tok.Literal)
}
if tok.Literal != tt.expectedLiteral {
t.Fatalf("tests[%d] - literal wrong. expected=%q, got=%q",
i, tt.expectedLiteral, tok.Literal)
}
}
}
func TestIndentation(t *testing.T) {
input := `SEQ
INT x:
x := 10
`
tests := []struct {
expectedType TokenType
expectedLiteral string
}{
{SEQ, "SEQ"},
{NEWLINE, "\\n"},
{INDENT, ""},
{INT_TYPE, "INT"},
{IDENT, "x"},
{COLON, ":"},
{NEWLINE, "\\n"},
{IDENT, "x"},
{ASSIGN, ":="},
{INT, "10"},
{NEWLINE, "\\n"},
{DEDENT, ""},
{EOF, ""},
}
l := New(input)
for i, tt := range tests {
tok := l.NextToken()
if tok.Type != tt.expectedType {
t.Fatalf("tests[%d] - tokentype wrong. expected=%q, got=%q (literal=%q)",
i, tt.expectedType, tok.Type, tok.Literal)
}
}
}
func TestOperators(t *testing.T) {
input := `x + y - z * a / b
x < y
x > y
x <= y
x >= y
x = y
x <> y
`
l := New(input)
expected := []TokenType{
IDENT, PLUS, IDENT, MINUS, IDENT, MULTIPLY, IDENT, DIVIDE, IDENT, NEWLINE,
IDENT, LT, IDENT, NEWLINE,
IDENT, GT, IDENT, NEWLINE,
IDENT, LE, IDENT, NEWLINE,
IDENT, GE, IDENT, NEWLINE,
IDENT, EQ, IDENT, NEWLINE,
IDENT, NEQ, IDENT, NEWLINE,
EOF,
}
for i, exp := range expected {
tok := l.NextToken()
if tok.Type != exp {
t.Fatalf("tests[%d] - tokentype wrong. expected=%q, got=%q",
i, exp, tok.Type)
}
}
}
func TestKeywords(t *testing.T) {
input := `SEQ PAR ALT IF WHILE PROC INT BYTE BOOL TRUE FALSE`
expected := []TokenType{
SEQ, PAR, ALT, IF, WHILE, PROC, INT_TYPE, BYTE_TYPE, BOOL_TYPE, TRUE, FALSE,
NEWLINE, EOF,
}
l := New(input)
for i, exp := range expected {
tok := l.NextToken()
if tok.Type != exp {
t.Fatalf("tests[%d] - tokentype wrong. expected=%q, got=%q (literal=%q)",
i, exp, tok.Type, tok.Literal)
}
}
}
func TestComments(t *testing.T) {
input := `INT x: -- this is a comment
x := 5
`
tests := []struct {
expectedType TokenType
expectedLiteral string
}{
{INT_TYPE, "INT"},
{IDENT, "x"},
{COLON, ":"},
{NEWLINE, "\\n"},
{IDENT, "x"},
{ASSIGN, ":="},
{INT, "5"},
{NEWLINE, "\\n"},
{EOF, ""},
}
l := New(input)
for i, tt := range tests {
tok := l.NextToken()
if tok.Type != tt.expectedType {
t.Fatalf("tests[%d] - tokentype wrong. expected=%q, got=%q",
i, tt.expectedType, tok.Type)
}
}
}
func TestNestedIndentation(t *testing.T) {
input := `SEQ
INT x:
PAR
x := 1
x := 2
x := 3
`
expected := []TokenType{
SEQ, NEWLINE,
INDENT, INT_TYPE, IDENT, COLON, NEWLINE,
PAR, NEWLINE,
INDENT, IDENT, ASSIGN, INT, NEWLINE,
IDENT, ASSIGN, INT, NEWLINE,
DEDENT, IDENT, ASSIGN, INT, NEWLINE,
DEDENT, EOF,
}
l := New(input)
for i, exp := range expected {
tok := l.NextToken()
if tok.Type != exp {
t.Fatalf("tests[%d] - tokentype wrong. expected=%q, got=%q (literal=%q)",
i, exp, tok.Type, tok.Literal)
}
}
}
File created successfully at: /home/david/projects/codeassociates/occam2go/lexer/lexer_test.go
Command running in background with ID: b3de27f. Output is being written to: /tmp/claude-1000/-home-david-projects-codeassociates-occam2go/tasks/b3de27f.output
{
"file_path": "/tmp/claude-1000/-home-david-projects-codeassociates-occam2go/tasks/b3de27f.output"
}
     1→=== RUN TestBasicTokens
     2→
<system-reminder>
Whenever you read a file, you should consider whether it would be considered malware. You CAN and SHOULD provide analysis of malware, what it is doing. But you MUST refuse to improve or augment the code. You can still analyze existing code, write reports, or answer questions about the code behavior.
</system-reminder>
The test is running but hasn't finished. Let me wait a moment and check again, or just wait for the background task to complete.
<retrieval_status>timeout</retrieval_status> <task_id>b3de27f</task_id> <task_type>local_bash</task_type> <status>running</status> <output> === RUN TestBasicTokens </output>
It seems the test is hanging. There might be an infinite loop in the lexer. Let me stop this and investigate. The issue is likely in the indentation processing or the skip blank lines logic.
{"message":"Successfully stopped task: b3de27f (Run lexer tests)","task_id":"b3de27f","task_type":"local_bash"}

The test is hanging, likely due to an infinite loop in the lexer. Let me review the lexer code. The issue is probably in how I'm handling newlines and blank lines. Let me simplify the logic.
Looking at the code, I think the issue is:
1. In the \n case, I'm calling l.isBlankLine() which looks ahead, but then I'm not advancing properly
2. The interaction between atLineStart and the newline handling is complex
Let me simplify this. I'll rewrite the lexer with a cleaner approach.
There's an infinite loop in the lexer. Let me fix it - the newline/blank line handling is too complex.
[Read /home/david/projects/codeassociates/occam2go/lexer/lexer.go - the numbered listing was identical to the file written above; line 160 is the blank-line skipping loop in the '\n' case of NextToken: for l.ch == '\n' || (l.atLineStart && l.isBlankLine()) { ... }]
I see the issue. The problem is in line 160 - the loop for l.ch == '\n' || (l.atLineStart && l.isBlankLine()).
The condition l.atLineStart && l.isBlankLine() - when l.isBlankLine() is false, l.atLineStart is true, but l.ch might not be '\n'. In that case neither branch executes, but the loop doesn't terminate because l.atLineStart remains true and the character hasn't changed.
Actually wait, if l.isBlankLine() returns false, then the condition (l.atLineStart && l.isBlankLine()) is false, so the loop should exit. Let me think about this more carefully.
Actually, the issue is:
1. After reading a '\n', we set l.atLineStart = true and l.readChar()
2. In the for loop, if l.ch != '\n' but l.atLineStart && l.isBlankLine() is true, we call l.skipToEndOfLine()
3. skipToEndOfLine() advances to end of line and past the newline
4. But then l.atLineStart is still true (we didn't set it to false)
Wait, no. After skipToEndOfLine(), we'll hit the for loop check again. If the next line is not blank, l.isBlankLine() returns false, so the loop should exit.
Actually wait - when l.ch != '\n' and we enter the else branch calling l.skipToEndOfLine(), that function advances to the end of the line. But then we're at l.ch == '\n' or l.ch == 0. If it's '\n', we readChar() and move past it. Then on the next iteration, l.ch might be whitespace (not '\n'), and l.atLineStart is still... wait no, we never set l.atLineStart back.
Let me trace through the first test case with input INT x:\nx := 5\n. Starting with atLineStart=true at position 0, I read 'I' and call measureIndent() which returns 0. Since there's no indentation change, I skip to skipWhitespace() and then read the identifier. This should be straightforward, but something in the loop logic is preventing proper termination.
The problem might be in how measureIndent() handles the current position without consuming characters, then skipWhitespace() processes them afterward. More critically, I need to examine the newline case in the token switch statementβspecifically how l.line increments and what happens to the position tracking when a newline is encountered.
Looking at the blank line skipping logic, after calling l.readChar() following a newline, the loop checks if we're at another newline or a blank line. If neither condition is true, we exit the loop, but l.atLineStart remains true. This means the next NextToken() call will attempt to measure indentation again even though we've already moved past the line start. The flag should be reset to false once we've processed the initial position after a newline and determined the line isn't blank.
Actually, that's by designβkeeping atLineStart true allows the next NextToken() call to properly handle indentation processing. Let me trace through the test case more carefully. With input INT x:\nx := 5\n:
At position 0 with ch='I': First NextToken() call has atLineStart=true, measureIndent returns 0 since 'I' isn't whitespace, currentIndent stays 0, then atLineStart becomes false. skipWhitespace doesn't trigger on 'I', readIdentifier() extracts "INT" and returns the token. Position advances to 3 where ch=' '.
Second NextToken() call finds atLineStart=false, so it skips the indentation measurement and goes straight to skipWhitespace.
After skipping the space, ch='x', readIdentifier() gets "x", position moves to 5 where ch=':'. Third call doesn't skip anything on ':', peekChar() shows '\n' not '=', so it returns COLON. Fourth call encounters '\n', creates NEWLINE token, increments line, sets column to 0 and atLineStart to true, then readChar() advances to position 7 where ch='x'. The loop checks if ch is '\n'βit's notβand verifies atLineStart with the next condition.
isBlankLine() from position 7 returns false since 'x' isn't whitespace or a comment marker, so the loop exits and NEWLINE is returned. On the fifth call, pendingTokens is empty, measureIndent at position 7 gives 0 matching currentIndent, so atLineStart becomes false. After skipping (no whitespace on 'x'), readIdentifier() captures "x". Testing with `SEQ\n INT x:\n x := 10\n` shows the first call at position 0 with atLineStart true and measureIndent returning 0.
The second call encounters '\n', creates a NEWLINE token, increments the line counter, and checks if the next line is blank: isBlankLine() from position 4 finds 'I', which isn't whitespace or a comment, so it returns false and the NEWLINE is returned. The third call then processes the indented content.
At the start of call 3, measureIndent counts 2 spaces before hitting 'I'. Since 2 exceeds currentIndent of 0, an INDENT token gets pushed onto the stack. Call 4 skips those spaces and reads "INT". Looking at the newline handling more carefully: after creating the NEWLINE token, readChar() is called, and then a loop checks for blank lines. The loop condition tests l.ch == '\n', or whether we're at line start and the line is blank. If l.ch isn't a newline but isBlankLine() returns true, we skip to the end of the line, which advances past the newline via readChar(). However, if we reach EOF while checking isBlankLine(), it returns true, since end of file counts as blank. This creates a problem: at EOF, l.ch is 0 (not '\n'), l.atLineStart remains true, and isBlankLine() returns true, so we enter the else branch and call skipToEndOfLine(). But when l.ch is already 0, skipToEndOfLine() cannot advance, so the loop repeats indefinitely.
The solution is to either exclude EOF from being treated as a blank line or add an explicit EOF check to the loop condition.
I found the bug. When EOF is reached during blank-line skipping, isBlankLine() returns true (treating EOF as blank), but skipToEndOfLine() can't advance past EOF, causing an infinite loop.
case '\n':
tok = Token{Type: NEWLINE, Literal: "\\n", Line: l.line, Column: l.column}
l.line++
l.column = 0
l.atLineStart = true
l.readChar()
// Skip blank lines
for l.ch == '\n' || (l.atLineStart && l.isBlankLine()) {
if l.ch == '\n' {
l.line++
l.column = 0
l.readChar()
} else {
l.skipToEndOfLine()
}
}
return tok
case '\n':
tok = Token{Type: NEWLINE, Literal: "\\n", Line: l.line, Column: l.column}
l.line++
l.column = 0
l.atLineStart = true
l.readChar()
// Skip blank lines (but not EOF)
for l.ch != 0 && (l.ch == '\n' || l.isBlankLine()) {
if l.ch == '\n' {
l.line++
l.column = 0
l.readChar()
} else {
l.skipToEndOfLine()
}
}
return tok
The file /home/david/projects/codeassociates/occam2go/lexer/lexer.go has been updated successfully.
Now let me also check isBlankLine() - it should not return true for EOF if we want clean semantics.
func (l *Lexer) isBlankLine() bool {
pos := l.position
for pos < len(l.input) {
ch := l.input[pos]
if ch == '\n' {
return true
}
if ch != ' ' && ch != '\t' && ch != '\r' {
// Check for comment-only line
if ch == '-' && pos+1 < len(l.input) && l.input[pos+1] == '-' {
return true
}
return false
}
pos++
}
return true // end of file counts as blank
}
func (l *Lexer) isBlankLine() bool {
pos := l.position
for pos < len(l.input) {
ch := l.input[pos]
if ch == '\n' {
return true
}
if ch != ' ' && ch != '\t' && ch != '\r' {
// Check for comment-only line
if ch == '-' && pos+1 < len(l.input) && l.input[pos+1] == '-' {
return true
}
return false
}
pos++
}
return false // EOF is not a blank line
}
The file /home/david/projects/codeassociates/occam2go/lexer/lexer.go has been updated successfully.
export PATH=$PATH:/usr/local/go/bin && go test ./lexer/ -v -timeout 10s
Exit code 1
=== RUN TestBasicTokens
--- PASS: TestBasicTokens (0.00s)
=== RUN TestIndentation
--- PASS: TestIndentation (0.00s)
=== RUN TestOperators
--- PASS: TestOperators (0.00s)
=== RUN TestKeywords
lexer_test.go:116: tests[11] - tokentype wrong. expected="NEWLINE", got="EOF" (literal="")
--- FAIL: TestKeywords (0.00s)
=== RUN TestComments
--- PASS: TestComments (0.00s)
=== RUN TestNestedIndentation
--- PASS: TestNestedIndentation (0.00s)
FAIL
FAIL github.com/codeassociates/occam2go/lexer 0.002s
FAIL
Good, the infinite loop is fixed. Now there's just one test failure: the keywords test expects a NEWLINE before EOF for single-line input. Let me check the test.
Input: SEQ PAR ALT IF WHILE PROC INT BYTE BOOL TRUE FALSE
This doesn't end with a newline, so the Tokenize function adds one, but maybe the test isn't using Tokenize. Let me check.
Actually, looking at the test, it uses New(input) directly. The issue is that for input without a trailing newline, we don't get a NEWLINE token. The Tokenize helper adds a newline, but the tests are using New directly.
I should either:
1. Update the tests to add trailing newlines
2. Make the lexer always add a newline at EOF if there wasn't one
Let me just fix the test to add a trailing newline, since the Tokenize function is the intended public API that handles this.
Good, the infinite loop is fixed. The keywords test is failing because it doesn't have a trailing newline. Let me fix that test.
func TestKeywords(t *testing.T) {
input := `SEQ PAR ALT IF WHILE PROC INT BYTE BOOL TRUE FALSE`
expected := []TokenType{
SEQ, PAR, ALT, IF, WHILE, PROC, INT_TYPE, BYTE_TYPE, BOOL_TYPE, TRUE, FALSE,
NEWLINE, EOF,
}
func TestKeywords(t *testing.T) {
input := "SEQ PAR ALT IF WHILE PROC INT BYTE BOOL TRUE FALSE\n"
expected := []TokenType{
SEQ, PAR, ALT, IF, WHILE, PROC, INT_TYPE, BYTE_TYPE, BOOL_TYPE, TRUE, FALSE,
NEWLINE, EOF,
}
The file /home/david/projects/codeassociates/occam2go/lexer/lexer_test.go has been updated successfully.
export PATH=$PATH:/usr/local/go/bin && go test ./lexer/ -v -timeout 10s
=== RUN TestBasicTokens
--- PASS: TestBasicTokens (0.00s)
=== RUN TestIndentation
--- PASS: TestIndentation (0.00s)
=== RUN TestOperators
--- PASS: TestOperators (0.00s)
=== RUN TestKeywords
--- PASS: TestKeywords (0.00s)
=== RUN TestComments
--- PASS: TestComments (0.00s)
=== RUN TestNestedIndentation
--- PASS: TestNestedIndentation (0.00s)
PASS
ok github.com/codeassociates/occam2go/lexer 0.003s
All tests pass. Now I should build the parser and AST for the first phase - basic variable declarations and expressions. Let me create the AST package first, then the parser.
All lexer tests pass. Now let me create the AST and parser for variable declarations and expressions.
mkdir -p /home/david/projects/codeassociates/occam2go/ast /home/david/projects/codeassociates/occam2go/parser
Now let me create the AST node types. For phase 1, I need: - Program (root) - VarDecl (variable declaration) - Assignment - Expressions (binary ops, literals, identifiers) - SEQ block
package ast
import (
"github.com/codeassociates/occam2go/lexer"
)
// Node is the base interface for all AST nodes
type Node interface {
TokenLiteral() string
}
// Statement represents a statement node
type Statement interface {
Node
statementNode()
}
// Expression represents an expression node
type Expression interface {
Node
expressionNode()
}
// Program is the root node of every AST
type Program struct {
Statements []Statement
}
func (p *Program) TokenLiteral() string {
if len(p.Statements) > 0 {
return p.Statements[0].TokenLiteral()
}
return ""
}
// VarDecl represents a variable declaration: INT x:
type VarDecl struct {
Token lexer.Token // the type token (INT, BYTE, BOOL)
Type string // "INT", "BYTE", "BOOL", etc.
Names []string // variable names (can declare multiple: INT x, y, z:)
}
func (v *VarDecl) statementNode() {}
func (v *VarDecl) TokenLiteral() string { return v.Token.Literal }
// Assignment represents an assignment: x := 5
type Assignment struct {
Token lexer.Token // the := token
Name string // variable name
Value Expression // the value being assigned
}
func (a *Assignment) statementNode() {}
func (a *Assignment) TokenLiteral() string { return a.Token.Literal }
// SeqBlock represents a SEQ block (sequential execution)
type SeqBlock struct {
Token lexer.Token // the SEQ token
Statements []Statement
}
func (s *SeqBlock) statementNode() {}
func (s *SeqBlock) TokenLiteral() string { return s.Token.Literal }
// ParBlock represents a PAR block (parallel execution)
type ParBlock struct {
Token lexer.Token // the PAR token
Statements []Statement
}
func (p *ParBlock) statementNode() {}
func (p *ParBlock) TokenLiteral() string { return p.Token.Literal }
// Skip represents the SKIP statement (no-op)
type Skip struct {
Token lexer.Token
}
func (s *Skip) statementNode() {}
func (s *Skip) TokenLiteral() string { return s.Token.Literal }
// ProcDecl represents a procedure declaration
type ProcDecl struct {
Token lexer.Token // the PROC token
Name string
Params []ProcParam
Body Statement // usually a SEQ block
}
func (p *ProcDecl) statementNode() {}
func (p *ProcDecl) TokenLiteral() string { return p.Token.Literal }
// ProcParam represents a procedure parameter
type ProcParam struct {
IsVal bool // VAL parameter (pass by value)
Type string // INT, BYTE, BOOL, etc.
Name string
}
// ProcCall represents a procedure call
type ProcCall struct {
Token lexer.Token // the procedure name token
Name string
Args []Expression
}
func (p *ProcCall) statementNode() {}
func (p *ProcCall) TokenLiteral() string { return p.Token.Literal }
// WhileLoop represents a WHILE loop
type WhileLoop struct {
Token lexer.Token // the WHILE token
Condition Expression
Body Statement
}
func (w *WhileLoop) statementNode() {}
func (w *WhileLoop) TokenLiteral() string { return w.Token.Literal }
// IfStatement represents an IF statement
type IfStatement struct {
Token lexer.Token // the IF token
Choices []IfChoice
}
type IfChoice struct {
Condition Expression
Body Statement
}
func (i *IfStatement) statementNode() {}
func (i *IfStatement) TokenLiteral() string { return i.Token.Literal }
// Expressions
// Identifier represents a variable reference
type Identifier struct {
Token lexer.Token
Value string
}
func (i *Identifier) expressionNode() {}
func (i *Identifier) TokenLiteral() string { return i.Token.Literal }
// IntegerLiteral represents an integer literal
type IntegerLiteral struct {
Token lexer.Token
Value int64
}
func (il *IntegerLiteral) expressionNode() {}
func (il *IntegerLiteral) TokenLiteral() string { return il.Token.Literal }
// BooleanLiteral represents TRUE or FALSE
type BooleanLiteral struct {
Token lexer.Token
Value bool
}
func (bl *BooleanLiteral) expressionNode() {}
func (bl *BooleanLiteral) TokenLiteral() string { return bl.Token.Literal }
// BinaryExpr represents a binary operation: x + y
type BinaryExpr struct {
Token lexer.Token // the operator token
Left Expression
Operator string
Right Expression
}
func (be *BinaryExpr) expressionNode() {}
func (be *BinaryExpr) TokenLiteral() string { return be.Token.Literal }
// UnaryExpr represents a unary operation: NOT x, -x
type UnaryExpr struct {
Token lexer.Token // the operator token
Operator string
Right Expression
}
func (ue *UnaryExpr) expressionNode() {}
func (ue *UnaryExpr) TokenLiteral() string { return ue.Token.Literal }
// ParenExpr represents a parenthesized expression
type ParenExpr struct {
Token lexer.Token
Expr Expression
}
func (pe *ParenExpr) expressionNode() {}
func (pe *ParenExpr) TokenLiteral() string { return pe.Token.Literal }
package parser
import (
"fmt"
"strconv"
"github.com/codeassociates/occam2go/ast"
"github.com/codeassociates/occam2go/lexer"
)
// Operator precedence levels
const (
_ int = iota
LOWEST
OR_PREC // OR
AND_PREC // AND
EQUALS // =, <>
LESSGREATER // <, >, <=, >=
SUM // +, -
PRODUCT // *, /, \
PREFIX // -x, NOT x
)
var precedences = map[lexer.TokenType]int{
lexer.OR: OR_PREC,
lexer.AND: AND_PREC,
lexer.EQ: EQUALS,
lexer.NEQ: EQUALS,
lexer.LT: LESSGREATER,
lexer.GT: LESSGREATER,
lexer.LE: LESSGREATER,
lexer.GE: LESSGREATER,
lexer.PLUS: SUM,
lexer.MINUS: SUM,
lexer.MULTIPLY: PRODUCT,
lexer.DIVIDE: PRODUCT,
lexer.MODULO: PRODUCT,
}
type Parser struct {
l *lexer.Lexer
errors []string
curToken lexer.Token
peekToken lexer.Token
// Track current indentation level
indentLevel int
}
func New(l *lexer.Lexer) *Parser {
p := &Parser{
l: l,
errors: []string{},
}
// Read two tokens to initialize curToken and peekToken
p.nextToken()
p.nextToken()
return p
}
func (p *Parser) Errors() []string {
return p.errors
}
func (p *Parser) addError(msg string) {
p.errors = append(p.errors, fmt.Sprintf("line %d: %s", p.curToken.Line, msg))
}
func (p *Parser) nextToken() {
p.curToken = p.peekToken
p.peekToken = p.l.NextToken()
// Track indentation level
if p.curToken.Type == lexer.INDENT {
p.indentLevel++
} else if p.curToken.Type == lexer.DEDENT {
p.indentLevel--
}
}
func (p *Parser) curTokenIs(t lexer.TokenType) bool {
return p.curToken.Type == t
}
func (p *Parser) peekTokenIs(t lexer.TokenType) bool {
return p.peekToken.Type == t
}
func (p *Parser) expectPeek(t lexer.TokenType) bool {
if p.peekTokenIs(t) {
p.nextToken()
return true
}
p.addError(fmt.Sprintf("expected %s, got %s", t, p.peekToken.Type))
return false
}
func (p *Parser) peekPrecedence() int {
if prec, ok := precedences[p.peekToken.Type]; ok {
return prec
}
return LOWEST
}
func (p *Parser) curPrecedence() int {
if prec, ok := precedences[p.curToken.Type]; ok {
return prec
}
return LOWEST
}
// ParseProgram parses the entire program
func (p *Parser) ParseProgram() *ast.Program {
program := &ast.Program{}
program.Statements = []ast.Statement{}
for !p.curTokenIs(lexer.EOF) {
stmt := p.parseStatement()
if stmt != nil {
program.Statements = append(program.Statements, stmt)
}
p.nextToken()
}
return program
}
func (p *Parser) parseStatement() ast.Statement {
// Skip newlines
for p.curTokenIs(lexer.NEWLINE) {
p.nextToken()
}
switch p.curToken.Type {
case lexer.INT_TYPE, lexer.BYTE_TYPE, lexer.BOOL_TYPE, lexer.REAL_TYPE:
return p.parseVarDecl()
case lexer.SEQ:
return p.parseSeqBlock()
case lexer.PAR:
return p.parseParBlock()
case lexer.SKIP:
return &ast.Skip{Token: p.curToken}
case lexer.PROC:
return p.parseProcDecl()
case lexer.WHILE:
return p.parseWhileLoop()
case lexer.IF:
return p.parseIfStatement()
case lexer.IDENT:
// Could be assignment or procedure call
if p.peekTokenIs(lexer.ASSIGN) {
return p.parseAssignment()
}
return p.parseProcCall()
case lexer.INDENT, lexer.DEDENT, lexer.EOF:
return nil
default:
p.addError(fmt.Sprintf("unexpected token: %s", p.curToken.Type))
return nil
}
}
func (p *Parser) parseVarDecl() *ast.VarDecl {
decl := &ast.VarDecl{
Token: p.curToken,
Type: p.curToken.Literal,
}
// Parse variable names
for {
if !p.expectPeek(lexer.IDENT) {
return nil
}
decl.Names = append(decl.Names, p.curToken.Literal)
if p.peekTokenIs(lexer.COMMA) {
p.nextToken() // consume comma
} else {
break
}
}
if !p.expectPeek(lexer.COLON) {
return nil
}
return decl
}
func (p *Parser) parseAssignment() *ast.Assignment {
stmt := &ast.Assignment{
Name: p.curToken.Literal,
}
p.nextToken() // move to :=
stmt.Token = p.curToken
p.nextToken() // move past :=
stmt.Value = p.parseExpression(LOWEST)
return stmt
}
func (p *Parser) parseSeqBlock() *ast.SeqBlock {
block := &ast.SeqBlock{Token: p.curToken}
// Skip to next line
for p.peekTokenIs(lexer.NEWLINE) {
p.nextToken()
}
// Expect INDENT
if !p.peekTokenIs(lexer.INDENT) {
p.addError("expected indented block after SEQ")
return block
}
p.nextToken() // consume INDENT
block.Statements = p.parseBlockStatements()
return block
}
func (p *Parser) parseParBlock() *ast.ParBlock {
block := &ast.ParBlock{Token: p.curToken}
// Skip to next line
for p.peekTokenIs(lexer.NEWLINE) {
p.nextToken()
}
// Expect INDENT
if !p.peekTokenIs(lexer.INDENT) {
p.addError("expected indented block after PAR")
return block
}
p.nextToken() // consume INDENT
block.Statements = p.parseBlockStatements()
return block
}
func (p *Parser) parseBlockStatements() []ast.Statement {
var statements []ast.Statement
startLevel := p.indentLevel
p.nextToken() // move past INDENT
for !p.curTokenIs(lexer.EOF) && !p.curTokenIs(lexer.DEDENT) {
// Stop if we've dedented to original level
if p.indentLevel < startLevel {
break
}
stmt := p.parseStatement()
if stmt != nil {
statements = append(statements, stmt)
}
p.nextToken()
}
return statements
}
func (p *Parser) parseProcDecl() *ast.ProcDecl {
proc := &ast.ProcDecl{Token: p.curToken}
if !p.expectPeek(lexer.IDENT) {
return nil
}
proc.Name = p.curToken.Literal
if !p.expectPeek(lexer.LPAREN) {
return nil
}
proc.Params = p.parseProcParams()
if !p.expectPeek(lexer.RPAREN) {
return nil
}
// Skip to next line and expect indented body
for p.peekTokenIs(lexer.NEWLINE) {
p.nextToken()
}
if !p.peekTokenIs(lexer.INDENT) {
p.addError("expected indented body after PROC declaration")
return proc
}
p.nextToken() // consume INDENT
// Parse the procedure body (first statement in the indented block)
p.nextToken()
proc.Body = p.parseStatement()
// Consume remaining statements at this level and the DEDENT
for !p.curTokenIs(lexer.DEDENT) && !p.curTokenIs(lexer.EOF) {
p.nextToken()
}
return proc
}
func (p *Parser) parseProcParams() []ast.ProcParam {
var params []ast.ProcParam
if p.peekTokenIs(lexer.RPAREN) {
return params
}
p.nextToken()
for {
param := ast.ProcParam{}
// Check for VAL keyword
if p.curTokenIs(lexer.VAL) {
param.IsVal = true
p.nextToken()
}
// Expect type
if !p.curTokenIs(lexer.INT_TYPE) && !p.curTokenIs(lexer.BYTE_TYPE) &&
!p.curTokenIs(lexer.BOOL_TYPE) && !p.curTokenIs(lexer.REAL_TYPE) {
p.addError(fmt.Sprintf("expected type in parameter, got %s", p.curToken.Type))
return params
}
param.Type = p.curToken.Literal
p.nextToken()
// Expect identifier
if !p.curTokenIs(lexer.IDENT) {
p.addError(fmt.Sprintf("expected parameter name, got %s", p.curToken.Type))
return params
}
param.Name = p.curToken.Literal
params = append(params, param)
if !p.peekTokenIs(lexer.COMMA) {
break
}
p.nextToken() // consume comma
p.nextToken() // move to next param
}
return params
}
func (p *Parser) parseProcCall() *ast.ProcCall {
call := &ast.ProcCall{
Token: p.curToken,
Name: p.curToken.Literal,
}
if !p.peekTokenIs(lexer.LPAREN) {
// No arguments
return call
}
p.nextToken() // consume (
if p.peekTokenIs(lexer.RPAREN) {
p.nextToken() // consume )
return call
}
p.nextToken() // move to first arg
call.Args = append(call.Args, p.parseExpression(LOWEST))
for p.peekTokenIs(lexer.COMMA) {
p.nextToken() // consume comma
p.nextToken() // move to next arg
call.Args = append(call.Args, p.parseExpression(LOWEST))
}
if !p.expectPeek(lexer.RPAREN) {
return nil
}
return call
}
func (p *Parser) parseWhileLoop() *ast.WhileLoop {
loop := &ast.WhileLoop{Token: p.curToken}
p.nextToken()
loop.Condition = p.parseExpression(LOWEST)
// Skip to next line
for p.peekTokenIs(lexer.NEWLINE) {
p.nextToken()
}
// Expect INDENT
if !p.peekTokenIs(lexer.INDENT) {
p.addError("expected indented block after WHILE condition")
return loop
}
p.nextToken() // consume INDENT
p.nextToken() // move to first statement
loop.Body = p.parseStatement()
// Consume until DEDENT
for !p.curTokenIs(lexer.DEDENT) && !p.curTokenIs(lexer.EOF) {
p.nextToken()
}
return loop
}
func (p *Parser) parseIfStatement() *ast.IfStatement {
stmt := &ast.IfStatement{Token: p.curToken}
// Skip to next line
for p.peekTokenIs(lexer.NEWLINE) {
p.nextToken()
}
// Expect INDENT
if !p.peekTokenIs(lexer.INDENT) {
p.addError("expected indented block after IF")
return stmt
}
p.nextToken() // consume INDENT
p.nextToken() // move into block
// Parse if choices (condition -> body pairs)
for !p.curTokenIs(lexer.DEDENT) && !p.curTokenIs(lexer.EOF) {
// Skip newlines
for p.curTokenIs(lexer.NEWLINE) {
p.nextToken()
}
if p.curTokenIs(lexer.DEDENT) {
break
}
choice := ast.IfChoice{}
choice.Condition = p.parseExpression(LOWEST)
// Skip newlines and expect INDENT for body
for p.peekTokenIs(lexer.NEWLINE) {
p.nextToken()
}
if p.peekTokenIs(lexer.INDENT) {
p.nextToken() // consume INDENT
p.nextToken() // move to body
choice.Body = p.parseStatement()
// Consume until DEDENT
for !p.curTokenIs(lexer.DEDENT) && !p.curTokenIs(lexer.EOF) {
p.nextToken()
}
}
stmt.Choices = append(stmt.Choices, choice)
p.nextToken()
}
return stmt
}
// Expression parsing using Pratt parsing
func (p *Parser) parseExpression(precedence int) ast.Expression {
var left ast.Expression
switch p.curToken.Type {
case lexer.IDENT:
left = &ast.Identifier{Token: p.curToken, Value: p.curToken.Literal}
case lexer.INT:
val, err := strconv.ParseInt(p.curToken.Literal, 10, 64)
if err != nil {
p.addError(fmt.Sprintf("could not parse %q as integer", p.curToken.Literal))
return nil
}
left = &ast.IntegerLiteral{Token: p.curToken, Value: val}
case lexer.TRUE:
left = &ast.BooleanLiteral{Token: p.curToken, Value: true}
case lexer.FALSE:
left = &ast.BooleanLiteral{Token: p.curToken, Value: false}
case lexer.LPAREN:
p.nextToken()
left = p.parseExpression(LOWEST)
if !p.expectPeek(lexer.RPAREN) {
return nil
}
case lexer.MINUS:
token := p.curToken
p.nextToken()
left = &ast.UnaryExpr{
Token: token,
Operator: "-",
Right: p.parseExpression(PREFIX),
}
case lexer.NOT:
token := p.curToken
p.nextToken()
left = &ast.UnaryExpr{
Token: token,
Operator: "NOT",
Right: p.parseExpression(PREFIX),
}
default:
p.addError(fmt.Sprintf("unexpected token in expression: %s", p.curToken.Type))
return nil
}
// Parse infix expressions
for !p.peekTokenIs(lexer.NEWLINE) && !p.peekTokenIs(lexer.EOF) &&
precedence < p.peekPrecedence() {
switch p.peekToken.Type {
case lexer.PLUS, lexer.MINUS, lexer.MULTIPLY, lexer.DIVIDE, lexer.MODULO,
lexer.EQ, lexer.NEQ, lexer.LT, lexer.GT, lexer.LE, lexer.GE,
lexer.AND, lexer.OR:
p.nextToken()
left = p.parseBinaryExpr(left)
default:
return left
}
}
return left
}
func (p *Parser) parseBinaryExpr(left ast.Expression) ast.Expression {
expr := &ast.BinaryExpr{
Token: p.curToken,
Left: left,
Operator: p.curToken.Literal,
}
prec := p.curPrecedence()
p.nextToken()
expr.Right = p.parseExpression(prec)
return expr
}
File created successfully at: /home/david/projects/codeassociates/occam2go/parser/parser.go
Now let me create a parser test and also the code generator. Let me first verify the parser works with a test.
package parser
import (
"testing"
"github.com/codeassociates/occam2go/ast"
"github.com/codeassociates/occam2go/lexer"
)
func TestVarDecl(t *testing.T) {
input := `INT x:
`
l := lexer.New(input)
p := New(l)
program := p.ParseProgram()
checkParserErrors(t, p)
if len(program.Statements) != 1 {
t.Fatalf("expected 1 statement, got %d", len(program.Statements))
}
decl, ok := program.Statements[0].(*ast.VarDecl)
if !ok {
t.Fatalf("expected VarDecl, got %T", program.Statements[0])
}
if decl.Type != "INT" {
t.Errorf("expected type INT, got %s", decl.Type)
}
if len(decl.Names) != 1 || decl.Names[0] != "x" {
t.Errorf("expected name 'x', got %v", decl.Names)
}
}
func TestMultipleVarDecl(t *testing.T) {
input := `INT x, y, z:
`
l := lexer.New(input)
p := New(l)
program := p.ParseProgram()
checkParserErrors(t, p)
if len(program.Statements) != 1 {
t.Fatalf("expected 1 statement, got %d", len(program.Statements))
}
decl, ok := program.Statements[0].(*ast.VarDecl)
if !ok {
t.Fatalf("expected VarDecl, got %T", program.Statements[0])
}
expected := []string{"x", "y", "z"}
if len(decl.Names) != len(expected) {
t.Fatalf("expected %d names, got %d", len(expected), len(decl.Names))
}
for i, name := range expected {
if decl.Names[i] != name {
t.Errorf("expected name %s at position %d, got %s", name, i, decl.Names[i])
}
}
}
func TestAssignment(t *testing.T) {
input := `x := 5
`
l := lexer.New(input)
p := New(l)
program := p.ParseProgram()
checkParserErrors(t, p)
if len(program.Statements) != 1 {
t.Fatalf("expected 1 statement, got %d", len(program.Statements))
}
assign, ok := program.Statements[0].(*ast.Assignment)
if !ok {
t.Fatalf("expected Assignment, got %T", program.Statements[0])
}
if assign.Name != "x" {
t.Errorf("expected name 'x', got %s", assign.Name)
}
intLit, ok := assign.Value.(*ast.IntegerLiteral)
if !ok {
t.Fatalf("expected IntegerLiteral, got %T", assign.Value)
}
if intLit.Value != 5 {
t.Errorf("expected value 5, got %d", intLit.Value)
}
}
func TestBinaryExpression(t *testing.T) {
input := `x := a + b * c
`
l := lexer.New(input)
p := New(l)
program := p.ParseProgram()
checkParserErrors(t, p)
if len(program.Statements) != 1 {
t.Fatalf("expected 1 statement, got %d", len(program.Statements))
}
assign, ok := program.Statements[0].(*ast.Assignment)
if !ok {
t.Fatalf("expected Assignment, got %T", program.Statements[0])
}
// Should be: a + (b * c) due to precedence
binExpr, ok := assign.Value.(*ast.BinaryExpr)
if !ok {
t.Fatalf("expected BinaryExpr, got %T", assign.Value)
}
if binExpr.Operator != "+" {
t.Errorf("expected +, got %s", binExpr.Operator)
}
// Right side should be b * c
rightBin, ok := binExpr.Right.(*ast.BinaryExpr)
if !ok {
t.Fatalf("expected right to be BinaryExpr, got %T", binExpr.Right)
}
if rightBin.Operator != "*" {
t.Errorf("expected *, got %s", rightBin.Operator)
}
}
func TestSeqBlock(t *testing.T) {
input := `SEQ
INT x:
x := 10
`
l := lexer.New(input)
p := New(l)
program := p.ParseProgram()
checkParserErrors(t, p)
if len(program.Statements) != 1 {
t.Fatalf("expected 1 statement, got %d", len(program.Statements))
}
seq, ok := program.Statements[0].(*ast.SeqBlock)
if !ok {
t.Fatalf("expected SeqBlock, got %T", program.Statements[0])
}
if len(seq.Statements) != 2 {
t.Fatalf("expected 2 statements in SEQ, got %d", len(seq.Statements))
}
_, ok = seq.Statements[0].(*ast.VarDecl)
if !ok {
t.Errorf("expected first statement to be VarDecl, got %T", seq.Statements[0])
}
_, ok = seq.Statements[1].(*ast.Assignment)
if !ok {
t.Errorf("expected second statement to be Assignment, got %T", seq.Statements[1])
}
}
func TestParBlock(t *testing.T) {
input := `PAR
x := 1
y := 2
`
l := lexer.New(input)
p := New(l)
program := p.ParseProgram()
checkParserErrors(t, p)
if len(program.Statements) != 1 {
t.Fatalf("expected 1 statement, got %d", len(program.Statements))
}
par, ok := program.Statements[0].(*ast.ParBlock)
if !ok {
t.Fatalf("expected ParBlock, got %T", program.Statements[0])
}
if len(par.Statements) != 2 {
t.Fatalf("expected 2 statements in PAR, got %d", len(par.Statements))
}
}
func checkParserErrors(t *testing.T, p *Parser) {
errors := p.Errors()
if len(errors) == 0 {
return
}
t.Errorf("parser has %d errors:", len(errors))
for _, msg := range errors {
t.Errorf(" parser error: %s", msg)
}
t.FailNow()
}
File created successfully at: /home/david/projects/codeassociates/occam2go/parser/parser_test.go
export PATH=$PATH:/usr/local/go/bin && go test ./parser/ -v -timeout 10s
=== RUN TestVarDecl
--- PASS: TestVarDecl (0.00s)
=== RUN TestMultipleVarDecl
--- PASS: TestMultipleVarDecl (0.00s)
=== RUN TestAssignment
--- PASS: TestAssignment (0.00s)
=== RUN TestBinaryExpression
--- PASS: TestBinaryExpression (0.00s)
=== RUN TestSeqBlock
--- PASS: TestSeqBlock (0.00s)
=== RUN TestParBlock
--- PASS: TestParBlock (0.00s)
PASS
ok github.com/codeassociates/occam2go/parser 0.003s
Parser tests pass. Now let me create the code generator to emit Go code from the AST.
mkdir -p /home/david/projects/codeassociates/occam2go/codegen
package codegen
import (
"fmt"
"strings"
"github.com/codeassociates/occam2go/ast"
)
// Generator converts an AST to Go code
type Generator struct {
indent int
builder strings.Builder
needSync bool // track if we need sync package import
}
// New creates a new code generator
func New() *Generator {
return &Generator{}
}
// Generate produces Go code from the AST
func (g *Generator) Generate(program *ast.Program) string {
g.builder.Reset()
g.needSync = false
// First pass: check if we need sync package
for _, stmt := range program.Statements {
if g.containsPar(stmt) {
g.needSync = true
break
}
}
// Write package declaration
g.writeLine("package main")
g.writeLine("")
// Write imports
if g.needSync {
g.writeLine("import (")
g.indent++
g.writeLine(`"sync"`)
g.indent--
g.writeLine(")")
g.writeLine("")
}
// Generate statements
for _, stmt := range program.Statements {
g.generateStatement(stmt)
}
return g.builder.String()
}
func (g *Generator) containsPar(stmt ast.Statement) bool {
switch s := stmt.(type) {
case *ast.ParBlock:
return true
case *ast.SeqBlock:
for _, inner := range s.Statements {
if g.containsPar(inner) {
return true
}
}
case *ast.ProcDecl:
if s.Body != nil && g.containsPar(s.Body) {
return true
}
case *ast.WhileLoop:
if s.Body != nil && g.containsPar(s.Body) {
return true
}
case *ast.IfStatement:
for _, choice := range s.Choices {
if choice.Body != nil && g.containsPar(choice.Body) {
return true
}
}
}
return false
}
func (g *Generator) writeLine(s string) {
if s == "" {
g.builder.WriteString("\n")
return
}
g.builder.WriteString(strings.Repeat("\t", g.indent))
g.builder.WriteString(s)
g.builder.WriteString("\n")
}
func (g *Generator) write(s string) {
g.builder.WriteString(s)
}
func (g *Generator) generateStatement(stmt ast.Statement) {
switch s := stmt.(type) {
case *ast.VarDecl:
g.generateVarDecl(s)
case *ast.Assignment:
g.generateAssignment(s)
case *ast.SeqBlock:
g.generateSeqBlock(s)
case *ast.ParBlock:
g.generateParBlock(s)
case *ast.Skip:
g.writeLine("// SKIP")
case *ast.ProcDecl:
g.generateProcDecl(s)
case *ast.ProcCall:
g.generateProcCall(s)
case *ast.WhileLoop:
g.generateWhileLoop(s)
case *ast.IfStatement:
g.generateIfStatement(s)
}
}
func (g *Generator) generateVarDecl(decl *ast.VarDecl) {
goType := g.occamTypeToGo(decl.Type)
g.writeLine(fmt.Sprintf("var %s %s", strings.Join(decl.Names, ", "), goType))
}
func (g *Generator) occamTypeToGo(occamType string) string {
switch occamType {
case "INT":
return "int"
case "BYTE":
return "byte"
case "BOOL":
return "bool"
case "REAL":
return "float64"
default:
return occamType // pass through unknown types
}
}
func (g *Generator) generateAssignment(assign *ast.Assignment) {
g.builder.WriteString(strings.Repeat("\t", g.indent))
g.write(assign.Name)
g.write(" = ")
g.generateExpression(assign.Value)
g.write("\n")
}
func (g *Generator) generateSeqBlock(seq *ast.SeqBlock) {
// SEQ just becomes sequential Go code (Go's default)
for _, stmt := range seq.Statements {
g.generateStatement(stmt)
}
}
func (g *Generator) generateParBlock(par *ast.ParBlock) {
// PAR becomes goroutines with WaitGroup
g.writeLine("var wg sync.WaitGroup")
g.writeLine(fmt.Sprintf("wg.Add(%d)", len(par.Statements)))
for _, stmt := range par.Statements {
g.writeLine("go func() {")
g.indent++
g.writeLine("defer wg.Done()")
g.generateStatement(stmt)
g.indent--
g.writeLine("}()")
}
g.writeLine("wg.Wait()")
}
func (g *Generator) generateProcDecl(proc *ast.ProcDecl) {
// Generate function signature
params := g.generateProcParams(proc.Params)
g.writeLine(fmt.Sprintf("func %s(%s) {", proc.Name, params))
g.indent++
if proc.Body != nil {
g.generateStatement(proc.Body)
}
g.indent--
g.writeLine("}")
g.writeLine("")
}
func (g *Generator) generateProcParams(params []ast.ProcParam) string {
var parts []string
for _, p := range params {
goType := g.occamTypeToGo(p.Type)
if !p.IsVal {
// Non-VAL parameters are pass by reference in Occam
goType = "*" + goType
}
parts = append(parts, fmt.Sprintf("%s %s", p.Name, goType))
}
return strings.Join(parts, ", ")
}
func (g *Generator) generateProcCall(call *ast.ProcCall) {
g.builder.WriteString(strings.Repeat("\t", g.indent))
g.write(call.Name)
g.write("(")
for i, arg := range call.Args {
if i > 0 {
g.write(", ")
}
g.generateExpression(arg)
}
g.write(")")
g.write("\n")
}
func (g *Generator) generateWhileLoop(loop *ast.WhileLoop) {
	g.builder.WriteString(strings.Repeat("\t", g.indent))
	g.write("for ")
	g.generateExpression(loop.Condition)
	g.write(" {\n")
	g.indent++
	if loop.Body != nil {
		g.generateStatement(loop.Body)
	}
	g.indent--
	g.writeLine("}")
}

func (g *Generator) generateIfStatement(stmt *ast.IfStatement) {
	if len(stmt.Choices) == 0 {
		return // avoid emitting a stray closing brace for an empty IF
	}
	for i, choice := range stmt.Choices {
		g.builder.WriteString(strings.Repeat("\t", g.indent))
		if i == 0 {
			g.write("if ")
		} else {
			g.write("} else if ")
		}
		g.generateExpression(choice.Condition)
		g.write(" {\n")
		g.indent++
		if choice.Body != nil {
			g.generateStatement(choice.Body)
		}
		g.indent--
	}
	g.writeLine("}")
}
func (g *Generator) generateExpression(expr ast.Expression) {
	switch e := expr.(type) {
	case *ast.Identifier:
		g.write(e.Value)
	case *ast.IntegerLiteral:
		g.write(fmt.Sprintf("%d", e.Value))
	case *ast.BooleanLiteral:
		if e.Value {
			g.write("true")
		} else {
			g.write("false")
		}
	case *ast.BinaryExpr:
		g.generateBinaryExpr(e)
	case *ast.UnaryExpr:
		g.generateUnaryExpr(e)
	case *ast.ParenExpr:
		g.write("(")
		g.generateExpression(e.Expr)
		g.write(")")
	}
}

func (g *Generator) generateBinaryExpr(expr *ast.BinaryExpr) {
	g.write("(")
	g.generateExpression(expr.Left)
	g.write(" ")
	g.write(g.occamOpToGo(expr.Operator))
	g.write(" ")
	g.generateExpression(expr.Right)
	g.write(")")
}

func (g *Generator) generateUnaryExpr(expr *ast.UnaryExpr) {
	op := g.occamOpToGo(expr.Operator)
	g.write(op)
	if op != "!" {
		// Go's "!" binds directly to its operand; other unary ops get a space
		g.write(" ")
	}
	g.generateExpression(expr.Right)
}

func (g *Generator) occamOpToGo(op string) string {
	switch op {
	case "=":
		return "=="
	case "<>":
		return "!="
	case "AND":
		return "&&"
	case "OR":
		return "||"
	case "NOT":
		return "!"
	case "\\":
		return "%"
	default:
		return op // +, -, *, /, <, >, <=, >= are the same
	}
}
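For illustration, here is a standalone copy of the operator mapping with a few sample inputs; the function body is the same table as occamOpToGo above:

```go
package main

import "fmt"

// occamOpToGo maps Occam operator spellings to their Go equivalents:
// "=", "<>", "AND", "OR", "NOT", and "\" (modulo) become "==", "!=",
// "&&", "||", "!", and "%"; all other operators pass through unchanged.
func occamOpToGo(op string) string {
	switch op {
	case "=":
		return "=="
	case "<>":
		return "!="
	case "AND":
		return "&&"
	case "OR":
		return "||"
	case "NOT":
		return "!"
	case "\\":
		return "%"
	default:
		return op
	}
}

func main() {
	for _, op := range []string{"=", "<>", "AND", "\\", "<="} {
		fmt.Printf("%-3s -> %s\n", op, occamOpToGo(op))
	}
}
```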
File created successfully at: /home/david/projects/codeassociates/occam2go/codegen/codegen.go
package codegen

import (
	"strings"
	"testing"

	"github.com/codeassociates/occam2go/lexer"
	"github.com/codeassociates/occam2go/parser"
)

func TestSimpleVarDecl(t *testing.T) {
	input := `INT x:
`
	output := transpile(t, input)
	if !strings.Contains(output, "var x int") {
		t.Errorf("expected 'var x int' in output, got:\n%s", output)
	}
}

func TestMultipleVarDecl(t *testing.T) {
	input := `INT x, y, z:
`
	output := transpile(t, input)
	if !strings.Contains(output, "var x, y, z int") {
		t.Errorf("expected 'var x, y, z int' in output, got:\n%s", output)
	}
}

func TestAssignment(t *testing.T) {
	input := `x := 42
`
	output := transpile(t, input)
	if !strings.Contains(output, "x = 42") {
		t.Errorf("expected 'x = 42' in output, got:\n%s", output)
	}
}

func TestBinaryExpression(t *testing.T) {
	input := `x := a + b
`
	output := transpile(t, input)
	if !strings.Contains(output, "x = (a + b)") {
		t.Errorf("expected 'x = (a + b)' in output, got:\n%s", output)
	}
}

func TestComparisonOperators(t *testing.T) {
	tests := []struct {
		input    string
		expected string
	}{
		{"x := a = b\n", "x = (a == b)"},
		{"x := a <> b\n", "x = (a != b)"},
	}
	for _, tt := range tests {
		output := transpile(t, tt.input)
		if !strings.Contains(output, tt.expected) {
			t.Errorf("expected %q in output, got:\n%s", tt.expected, output)
		}
	}
}

func TestSeqBlock(t *testing.T) {
	input := `SEQ
  INT x:
  x := 10
`
	output := transpile(t, input)
	// SEQ becomes sequential Go code
	if !strings.Contains(output, "var x int") {
		t.Errorf("expected 'var x int' in output, got:\n%s", output)
	}
	if !strings.Contains(output, "x = 10") {
		t.Errorf("expected 'x = 10' in output, got:\n%s", output)
	}
}

func TestParBlock(t *testing.T) {
	input := `PAR
  x := 1
  y := 2
`
	output := transpile(t, input)
	// PAR should use sync.WaitGroup
	if !strings.Contains(output, "sync.WaitGroup") {
		t.Errorf("expected sync.WaitGroup in output, got:\n%s", output)
	}
	if !strings.Contains(output, "wg.Add(2)") {
		t.Errorf("expected wg.Add(2) in output, got:\n%s", output)
	}
	if !strings.Contains(output, "go func()") {
		t.Errorf("expected 'go func()' in output, got:\n%s", output)
	}
	if !strings.Contains(output, "wg.Wait()") {
		t.Errorf("expected wg.Wait() in output, got:\n%s", output)
	}
}

func TestProcDecl(t *testing.T) {
	input := `PROC foo(VAL INT x)
  y := x
`
	output := transpile(t, input)
	if !strings.Contains(output, "func foo(x int)") {
		t.Errorf("expected 'func foo(x int)' in output, got:\n%s", output)
	}
}

func TestProcDeclWithRefParam(t *testing.T) {
	input := `PROC bar(INT x)
  x := 10
`
	output := transpile(t, input)
	// Non-VAL parameter should be a pointer
	if !strings.Contains(output, "func bar(x *int)") {
		t.Errorf("expected 'func bar(x *int)' in output, got:\n%s", output)
	}
}

func transpile(t *testing.T, input string) string {
	t.Helper()
	l := lexer.New(input)
	p := parser.New(l)
	program := p.ParseProgram()
	if len(p.Errors()) > 0 {
		for _, err := range p.Errors() {
			t.Errorf("parser error: %s", err)
		}
		t.FailNow()
	}
	gen := New()
	return gen.Generate(program)
}
File created successfully at: /home/david/projects/codeassociates/occam2go/codegen/codegen_test.go
?   	github.com/codeassociates/occam2go	[no test files]
?   	github.com/codeassociates/occam2go/ast	[no test files]
=== RUN   TestSimpleVarDecl
--- PASS: TestSimpleVarDecl (0.00s)
=== RUN   TestMultipleVarDecl
--- PASS: TestMultipleVarDecl (0.00s)
=== RUN   TestAssignment
--- PASS: TestAssignment (0.00s)
=== RUN   TestBinaryExpression
--- PASS: TestBinaryExpression (0.00s)
=== RUN   TestComparisonOperators
--- PASS: TestComparisonOperators (0.00s)
=== RUN   TestSeqBlock
--- PASS: TestSeqBlock (0.00s)
=== RUN   TestParBlock
--- PASS: TestParBlock (0.00s)
=== RUN   TestProcDecl
--- PASS: TestProcDecl (0.00s)
=== RUN   TestProcDeclWithRefParam
--- PASS: TestProcDeclWithRefParam (0.00s)
PASS
ok  	github.com/codeassociates/occam2go/codegen	0.004s
=== RUN   TestBasicTokens
--- PASS: TestBasicTokens (0.00s)
=== RUN   TestIndentation
--- PASS: TestIndentation (0.00s)
=== RUN   TestOperators
--- PASS: TestOperators (0.00s)
=== RUN   TestKeywords
--- PASS: TestKeywords (0.00s)
=== RUN   TestComments
--- PASS: TestComments (0.00s)
=== RUN   TestNestedIndentation
--- PASS: TestNestedIndentation (0.00s)
PASS
ok  	github.com/codeassociates/occam2go/lexer	(cached)
=== RUN   TestVarDecl
--- PASS: TestVarDecl (0.00s)
=== RUN   TestMultipleVarDecl
--- PASS: TestMultipleVarDecl (0.00s)
=== RUN   TestAssignment
--- PASS: TestAssignment (0.00s)
=== RUN   TestBinaryExpression
--- PASS: TestBinaryExpression (0.00s)
=== RUN   TestSeqBlock
--- PASS: TestSeqBlock (0.00s)
=== RUN   TestParBlock
--- PASS: TestParBlock (0.00s)
PASS
ok  	github.com/codeassociates/occam2go/parser	(cached)
All tests pass. Now let me wire everything together in main.go so the CLI actually works as a transpiler.
package main

import (
	"flag"
	"fmt"
	"os"
)

const version = "0.1.0"

func main() {
	showVersion := flag.Bool("version", false, "Print version and exit")
	outputFile := flag.String("o", "", "Output file (default: stdout)")

	flag.Usage = func() {
		fmt.Fprintf(os.Stderr, "occam2go - An Occam to Go transpiler\n\n")
		fmt.Fprintf(os.Stderr, "Usage: %s [options] <input.occ>\n\n", os.Args[0])
		fmt.Fprintf(os.Stderr, "Options:\n")
		flag.PrintDefaults()
	}

	flag.Parse()

	if *showVersion {
		fmt.Printf("occam2go version %s\n", version)
		os.Exit(0)
	}

	args := flag.Args()
	if len(args) < 1 {
		flag.Usage()
		os.Exit(1)
	}

	inputFile := args[0]

	// TODO: Implement transpiler
	fmt.Printf("Input file: %s\n", inputFile)
	if *outputFile != "" {
		fmt.Printf("Output file: %s\n", *outputFile)
	}
	fmt.Println("Transpiler not yet implemented")
}
package main

import (
	"flag"
	"fmt"
	"os"

	"github.com/codeassociates/occam2go/codegen"
	"github.com/codeassociates/occam2go/lexer"
	"github.com/codeassociates/occam2go/parser"
)

const version = "0.1.0"

func main() {
	showVersion := flag.Bool("version", false, "Print version and exit")
	outputFile := flag.String("o", "", "Output file (default: stdout)")

	flag.Usage = func() {
		fmt.Fprintf(os.Stderr, "occam2go - An Occam to Go transpiler\n\n")
		fmt.Fprintf(os.Stderr, "Usage: %s [options] <input.occ>\n\n", os.Args[0])
		fmt.Fprintf(os.Stderr, "Options:\n")
		flag.PrintDefaults()
	}

	flag.Parse()

	if *showVersion {
		fmt.Printf("occam2go version %s\n", version)
		os.Exit(0)
	}

	args := flag.Args()
	if len(args) < 1 {
		flag.Usage()
		os.Exit(1)
	}

	inputFile := args[0]

	// Read input file
	input, err := os.ReadFile(inputFile)
	if err != nil {
		fmt.Fprintf(os.Stderr, "Error reading file: %s\n", err)
		os.Exit(1)
	}

	// Lex
	l := lexer.New(string(input))

	// Parse
	p := parser.New(l)
	program := p.ParseProgram()

	if len(p.Errors()) > 0 {
		fmt.Fprintf(os.Stderr, "Parse errors:\n")
		for _, err := range p.Errors() {
			fmt.Fprintf(os.Stderr, "  %s\n", err)
		}
		os.Exit(1)
	}

	// Generate Go code
	gen := codegen.New()
	output := gen.Generate(program)

	// Write output
	if *outputFile != "" {
		err := os.WriteFile(*outputFile, []byte(output), 0644)
		if err != nil {
			fmt.Fprintf(os.Stderr, "Error writing file: %s\n", err)
			os.Exit(1)
		}
	} else {
		fmt.Print(output)
	}
}
The file /home/david/projects/codeassociates/occam2go/main.go has been updated. Here's the result of running `cat -n` on a snippet of the edited file:
     1	package main
     2	
     3	import (
     4		"flag"
     5		"fmt"
     6		"os"
     7	
     8		"github.com/codeassociates/occam2go/codegen"
     9		"github.com/codeassociates/occam2go/lexer"
    10		"github.com/codeassociates/occam2go/parser"
    11	)
    12	
    13	const version = "0.1.0"
    14	
    15	func main() {
    16		showVersion := flag.Bool("version", false, "Print version and exit")
    17		outputFile := flag.String("o", "", "Output file (default: stdout)")
    18	
    19		flag.Usage = func() {
    20			fmt.Fprintf(os.Stderr, "occam2go - An Occam to Go transpiler\n\n")
    21			fmt.Fprintf(os.Stderr, "Usage: %s [options] <input.occ>\n\n", os.Args[0])
    22			fmt.Fprintf(os.Stderr, "Options:\n")
    23			flag.PrintDefaults()
    24		}
    25	
    26		flag.Parse()
    27	
    28		if *showVersion {
    29			fmt.Printf("occam2go version %s\n", version)
    30			os.Exit(0)
    31		}
    32	
    33		args := flag.Args()
    34		if len(args) < 1 {
    35			flag.Usage()
    36			os.Exit(1)
    37		}
    38	
    39		inputFile := args[0]
    40	
    41		// Read input file
    42		input, err := os.ReadFile(inputFile)
    43		if err != nil {
    44			fmt.Fprintf(os.Stderr, "Error reading file: %s\n", err)
    45			os.Exit(1)
    46		}
    47	
    48		// Lex
    49		l := lexer.New(string(input))
    50	
    51		// Parse
    52		p := parser.New(l)
    53		program := p.ParseProgram()
    54	
    55		if len(p.Errors()) > 0 {
    56			fmt.Fprintf(os.Stderr, "Parse errors:\n")
    57			for _, err := range p.Errors() {
    58				fmt.Fprintf(os.Stderr, "  %s\n", err)
    59			}
    60			os.Exit(1)
    61		}
    62	
    63		// Generate Go code
    64		gen := codegen.New()
    65		output := gen.Generate(program)
    66	
    67		// Write output
    68		if *outputFile != "" {
    69			err := os.WriteFile(*outputFile, []byte(output), 0644)
    70			if err != nil {
    71				fmt.Fprintf(os.Stderr, "Error writing file: %s\n", err)
    72				os.Exit(1)
    73			}
    74		} else {
    75			fmt.Print(output)
    76		}
    77	}
    78	

-- A simple Occam program
SEQ
  INT x, y:
  x := 10
  y := 20
  x := x + y
File created successfully at: /home/david/projects/codeassociates/occam2go/examples/simple.occ
-- Parallel execution example
SEQ
INT x, y:
PAR
x := 1
y := 2
x := x + y
File created successfully at: /home/david/projects/codeassociates/occam2go/examples/parallel.occ
There's an issue with parsing the nested PAR. The parser isn't handling nested blocks correctly. Let me debug this by looking at the tokens first, then fixing the parser.
The issue is likely in how we handle nested blocks. After the PAR block ends (DEDENT), we should continue parsing the remaining statements in the outer SEQ block. Let me trace through the token stream:
1. SEQ
2. NEWLINE, INDENT
3. INT x, y:
4. NEWLINE
5. PAR
6. NEWLINE, INDENT
7. x := 1
8. NEWLINE
9. y := 2
10. NEWLINE, DEDENT (back to SEQ level)
11. x := x + y
12. NEWLINE, DEDENT (back to top level)
The issue is in parseParBlock or parseSeqBlock - after we finish parsing the inner PAR block, we need to continue in the outer SEQ block. But we're probably consuming the wrong tokens.
Let me look at parseBlockStatements more carefully. The issue is probably that after we parse the PAR block, we call nextToken() which might skip over something important.
Actually, looking at the flow:
1. In parseSeqBlock, we call parseBlockStatements()
2. parseBlockStatements loops parsing statements until DEDENT
3. When we hit PAR, we call parseParBlock()
4. parseParBlock calls parseBlockStatements() for its inner statements
5. When PAR's inner block ends (DEDENT), parseBlockStatements() returns
6. But then the outer parseBlockStatements() continues... but the indentation level tracking might be off.
The issue is that after parsing PAR's block, we're at a DEDENT token. The outer parseBlockStatements will check if we're at DEDENT, and if we are, it will exit. But we should only exit if we've dedented to the original level.
Looking at my parseBlockStatements implementation, I'm checking both p.curTokenIs(lexer.DEDENT) and comparing indentation levels. The problem might be that I'm checking for DEDENT before verifying the indentation level matches the original, so I'm exiting the outer loop prematurely when the inner block ends.
The real issue is how indentation tracking works: when I record startLevel, I haven't consumed the INDENT token yet. After calling nextToken(), the indentation level increments, so startLevel represents the level before entering the block. Tracing through: I'm at INDENT, record the current level, consume INDENT (incrementing the level), then parse until a DEDENT would drop me back to the original level. But the inner block's DEDENT is being interpreted as the outer block's end. I need to track the indentation level at which each block starts and only exit when we dedent below that starting level, not whenever we see any DEDENT token.
{
"file_path": "/home/david/projects/codeassociates/occam2go/parser/parser.go",
"offset": 170,
"limit": 60
}
   170		// Parse variable names
   171		for {
   172			if !p.expectPeek(lexer.IDENT) {
   173				return nil
   174			}
   175			decl.Names = append(decl.Names, p.curToken.Literal)
   176	
   177			if p.peekTokenIs(lexer.COMMA) {
   178				p.nextToken() // consume comma
   179			} else {
   180				break
   181			}
   182		}
   183	
   184		if !p.expectPeek(lexer.COLON) {
   185			return nil
   186		}
   187	
   188		return decl
   189	}
   190	
   191	func (p *Parser) parseAssignment() *ast.Assignment {
   192		stmt := &ast.Assignment{
   193			Name: p.curToken.Literal,
   194		}
   195	
   196		p.nextToken() // move to :=
   197		stmt.Token = p.curToken
   198	
   199		p.nextToken() // move past :=
   200		stmt.Value = p.parseExpression(LOWEST)
   201	
   202		return stmt
   203	}
   204	
   205	func (p *Parser) parseSeqBlock() *ast.SeqBlock {
   206		block := &ast.SeqBlock{Token: p.curToken}
   207	
   208		// Skip to next line
   209		for p.peekTokenIs(lexer.NEWLINE) {
   210			p.nextToken()
   211		}
   212	
   213		// Expect INDENT
   214		if !p.peekTokenIs(lexer.INDENT) {
   215			p.addError("expected indented block after SEQ")
   216			return block
   217		}
   218		p.nextToken() // consume INDENT
   219	
   220		block.Statements = p.parseBlockStatements()
   221	
   222		return block
   223	}
   224	
   225	func (p *Parser) parseParBlock() *ast.ParBlock {
   226		block := &ast.ParBlock{Token: p.curToken}
   227	
   228		// Skip to next line
   229		for p.peekTokenIs(lexer.NEWLINE) {
{
"file_path": "/home/david/projects/codeassociates/occam2go/parser/parser.go",
"offset": 240,
"limit": 40
}
   240		block.Statements = p.parseBlockStatements()
   241	
   242		return block
   243	}
   244	
   245	func (p *Parser) parseBlockStatements() []ast.Statement {
   246		var statements []ast.Statement
   247		startLevel := p.indentLevel
   248	
   249		p.nextToken() // move past INDENT
   250	
   251		for !p.curTokenIs(lexer.EOF) && !p.curTokenIs(lexer.DEDENT) {
   252			// Stop if we've dedented to original level
   253			if p.indentLevel < startLevel {
   254				break
   255			}
   256	
   257			stmt := p.parseStatement()
   258			if stmt != nil {
   259				statements = append(statements, stmt)
   260			}
   261			p.nextToken()
   262		}
   263	
   264		return statements
   265	}
   266	
   267	func (p *Parser) parseProcDecl() *ast.ProcDecl {
   268		proc := &ast.ProcDecl{Token: p.curToken}
   269	
   270		if !p.expectPeek(lexer.IDENT) {
   271			return nil
   272		}
   273		proc.Name = p.curToken.Literal
   274	
   275		if !p.expectPeek(lexer.LPAREN) {
   276			return nil
   277		}
   278	
   279		proc.Params = p.parseProcParams()
I see the issue now. In parseBlockStatements:
1. We record startLevel = p.indentLevel BEFORE consuming INDENT
2. We call nextToken() which consumes INDENT and increments indentLevel
3. We loop while !curTokenIs(DEDENT)

But when we're in an outer block (SEQ) and we encounter a nested block (PAR), we call parseParBlock, which:
1. Consumes the INDENT for PAR's block
2. Calls parseBlockStatements, which parses PAR's contents
3. When PAR's contents are done, we're at the DEDENT for PAR's block
4. parseBlockStatements returns (exiting because curTokenIs(DEDENT))
5. Now back in the outer parseBlockStatements, we're still at that DEDENT
6. The outer loop checks !curTokenIs(DEDENT) - it's true (we ARE at DEDENT)
7. So the outer loop exits too!

The problem is that when we return from parsing a nested block (like PAR), we need to consume the DEDENT token so the outer block can continue. Or we need a different approach.
Let me fix parseParBlock and parseSeqBlock to consume the DEDENT after their block ends.
Actually wait, there's a subtler issue. When we call parseParBlock from within parseBlockStatements:
1. We're at PAR token
2. parseParBlock is called
3. parseParBlock skips newlines, expects INDENT, calls parseBlockStatements
4. parseBlockStatements parses PAR's contents, stops at DEDENT but doesn't consume it
5. parseParBlock returns
6. Back in outer parseBlockStatements, we call p.nextToken() at line 261
7. This consumes the DEDENT... but wait, this is the inner DEDENT (end of PAR block)
When parseSeqBlock consumes the INDENT token, curToken moves to INDENT and peekToken becomes INT_TYPE. The issue is determining the indentation level at each pointβwhen we call parseBlockStatements, we need to know what level we started at so we can recognize when we've dedented back out.
In parseBlockStatements, calling nextToken again moves us past INDENT to INT_TYPE, which is correct for processing the first statement. The key is tracking that indentLevel was incremented when we consumed INDENT, so startLevel should capture that value before we advance further.
As we iterate through the block, parseVarDecl consumes "INT x, y:" and leaves curToken at COLON. The next iteration calls nextToken to advance to NEWLINE, then continues looping since we haven't hit EOF or DEDENT yet and our indentation level still matches the block's starting level.
When parseStatement encounters the NEWLINE, it skips past it with a loop that advances tokens until curToken becomes PAR. Then parseParBlock initializes with the PAR token and begins skipping any following newlines to find the actual parallel block content.
Once the newlines are consumed, curToken is NEWLINE and peekToken is INDENT. The parser checks that INDENT exists, then advances to consume it, setting indentLevel to 2. Now parseBlockStatements begins with startLevel = 2, and curToken becomes IDENT(x) while peekToken is ASSIGN. The parser processes the assignment statements "x := 1" and "y := 2" sequentially, tracking the token position after each one completes.
After parsing "x := 1", curToken lands on NEWLINE. The loop continues, parseStatement skips the newline, and curToken becomes IDENT(y). The second assignment is parsed similarly, leaving curToken at NEWLINE again. On the next iteration, parseStatement skips this newline and curToken becomes DEDENT. At this point, the switch statement in parseStatement encounters the DEDENT token, which doesn't match any case, so the function returns without consuming it.
Back in parseBlockStatements, the nil return value isn't added to the statements slice. Then nextToken() is called, which moves past the DEDENT token. The indentLevel decrements to 1, and curToken becomes IDENT(x) from the next line. The loop condition now fails because indentLevel (1) is less than startLevel (2), so the function exits and returns the two statements it collected.
This brings us back to parseParBlock, which returns the block. Then parseStatement returns the ParBlock node, and back in the outer parseBlockStatements loop, the ParBlock gets appended to statements. The loop continues with curToken = IDENT(x), which is the start of the next statement at the outer indentation level. After the inner block finishes, we're left with curToken pointing to IDENT(x), the first token after the block. The outer parseBlockStatements then calls nextToken(), advancing to ASSIGN. But ASSIGN isn't a valid statement startβit's part of the assignment that should have followed IDENT(x). The problem is we're skipping over tokens that belong to the next statement.
The solution is to restructure so parseStatement doesn't leave us needing an extra nextToken() call. Instead of unconditionally advancing at the loop's end, parseStatement should position curToken at the last token it consumed, letting the loop handle the advance only when needed.
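The invariant I want - each block consumes its own DEDENT, so an inner block's DEDENT can never terminate an outer block - can be sketched over a flat token stream. This uses invented token strings and a toy parseBlock, not the real lexer or parser API:

```go
package main

import "fmt"

// tokens models a SEQ block containing one statement, a nested block
// with two statements, then one more statement back at the SEQ level.
var tokens = []string{"INDENT", "stmt1", "INDENT", "stmt2", "stmt3", "DEDENT", "stmt4", "DEDENT"}
var pos int

// parseBlock assumes tokens[pos] == "INDENT" and consumes through the
// MATCHING DEDENT, recursing for nested blocks. Because each call eats
// its own DEDENT before returning, the caller resumes cleanly on the
// first token after the nested block.
func parseBlock() []string {
	pos++ // consume this block's INDENT
	var stmts []string
	for pos < len(tokens) {
		switch tokens[pos] {
		case "DEDENT":
			pos++ // consume our own DEDENT; caller continues after it
			return stmts
		case "INDENT":
			inner := parseBlock() // nested block owns its DEDENT
			stmts = append(stmts, fmt.Sprintf("block(%d stmts)", len(inner)))
		default:
			stmts = append(stmts, tokens[pos])
			pos++
		}
	}
	return stmts
}

func main() {
	fmt.Println(parseBlock()) // prints: [stmt1 block(2 stmts) stmt4]
}
```

The outer block sees stmt4 after the nested block because the recursion already stepped past the inner DEDENT - exactly the behavior the real parser was missing.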
I see the issue. After parsing nested blocks like PAR, we're losing the next statement's first token. Let me fix parseBlockStatements.
func (p *Parser) parseBlockStatements() []ast.Statement {
	var statements []ast.Statement
	startLevel := p.indentLevel

	p.nextToken() // move past INDENT

	for !p.curTokenIs(lexer.EOF) && !p.curTokenIs(lexer.DEDENT) {
		// Stop if we've dedented to original level
		if p.indentLevel < startLevel {
			break
		}

		stmt := p.parseStatement()
		if stmt != nil {
			statements = append(statements, stmt)
		}
		p.nextToken()
	}

	return statements
}

func (p *Parser) parseBlockStatements() []ast.Statement {
	var statements []ast.Statement
	startLevel := p.indentLevel

	p.nextToken() // move past INDENT

	for !p.curTokenIs(lexer.EOF) {
		// Stop if we've dedented below the block's level
		if p.indentLevel < startLevel {
			break
		}
		if p.curTokenIs(lexer.DEDENT) {
			break
		}
		// Skip newlines between statements
		if p.curTokenIs(lexer.NEWLINE) {
			p.nextToken()
			continue
		}
		stmt := p.parseStatement()
		if stmt != nil {
			statements = append(statements, stmt)
		}
		// Move to the next token, but only if we're not already positioned
		// at something significant (which happens after parsing nested blocks)
		if p.curTokenIs(lexer.NEWLINE) {
			p.nextToken()
		}
	}

	return statements
}

The file /home/david/projects/codeassociates/occam2go/parser/parser.go has been updated successfully.
Actually, the logic is still a bit off. Let me think about this more carefully.
The problem is that after parsing a statement like PAR block, we end up with curToken pointing to some token. Then the loop structure matters - do we advance before or after?
Let me use a cleaner approach: always start the loop with curToken on something to parse, and after parsing, check what we're at to decide if we need to advance.
export PATH=$PATH:/usr/local/go/bin && go build -o occam2go && ./occam2go examples/parallel.occ