Implement the following plan:
The transpiler cleanly processes course_nofloat.module but produces 598 parse errors on the full course.module because float_io.occ uses 6 unimplemented constructs. All 598 errors trace to 4 parse-level root causes plus 2 codegen-only features.
lexer/lexer.go: Add [/] tracking to parenDepth so NEWLINEs inside [...] are suppressed (needed for multi-line array literals like VAL PowerTenFrac IS [#80000000, #A0000000, ...] :). Just add l.parenDepth++ on [ and l.parenDepth-- on ].
lexer/token.go: Add RETYPES keyword token + entries in tokenNames and keywords maps.
ast/ast.go: Add two nodes:
- ArrayLiteral (Expression) — Token, Elements []Expression
- RetypesDecl (Statement) — Token, IsVal, TargetType, IsArray, ArraySize, Name, Source
parser/parser.go → parseAbbreviation(): After the [] open-array check (line 325-329), before the isTypeToken check (line 332), detect: if curToken is IDENT and peekToken is IS, it's an untyped abbreviation VAL <name> IS <expr> :. Parse with Type = "".
parser/parser.go → parseExpression() LBRACKET case (line 2591): After [, parse first expression, then:
- COMMA → array literal: continue parsing comma-separated elements until ]
- FROM → slice expression (existing logic)
- FOR → slice shorthand (existing logic)
- RBRACKET → single-element array literal
parser/parser.go → parseAbbreviation(): After parsing VAL [n]<type> <name>, if next token is RETYPES (instead of IS), parse as RetypesDecl. Handle both VAL INT X RETYPES X : and VAL [2]INT X RETYPES X :.
parser/parser.go → parseBinaryExpr() (line 2680): After p.nextToken() past the operator, skip NEWLINE/INDENT tokens (tracking indent count). After parsing RHS, consume matching NEWLINE+DEDENT pairs to restore indent balance. This handles expr AND\n expr patterns.
codegen/codegen.go:
- Package-level abbreviations with Type == "": emit var name = expr (omit type, let Go infer)
- ArrayLiteral: emit []int{e1, e2, ...} (all float_io.occ arrays are INT)
- Add ArrayLiteral case to containsX scanner functions
codegen/codegen.go:
- VAL INT X RETYPES X : → X := int(math.Float32bits(float32(X)))
- VAL [2]INT X RETYPES X : → _tmp := math.Float64bits(float64(X)); X := [2]int{int(uint32(_tmp)), int(uint32(_tmp >> 32))}
- Set needMath = true
codegen/codegen.go:
- Add transpIntrinsics map for LONGPROD/LONGDIV/LONGSUM/LONGDIFF/NORMALISE/SHIFTRIGHT/SHIFTLEFT
- In generateFuncCallExpr(): prefix intrinsic names with _
- Add containsIntrinsics() scanner → needMathBits flag → "math/bits" import
- Emit 7 Go helper functions (using uint64/uint32 arithmetic for 32-bit transputer semantics)
codegen/codegen.go → generateProcCall(): If name is CAUSEERROR, emit panic("CAUSEERROR").
go build -o occam2go .
./occam2go -I kroc/modules/course/libsrc -D TARGET.BITS.PER.WORD=32 -o /tmp/course_out.go kroc/modules/course/libsrc/course.module
go vet /tmp/course_out.go
go test ./...
| File | Changes |
|---|---|
lexer/lexer.go |
[/] parenDepth tracking (2 lines) |
lexer/token.go |
RETYPES keyword (3 additions) |
ast/ast.go |
ArrayLiteral + RetypesDecl nodes |
parser/parser.go |
Untyped VAL, array literals, RETYPES, multi-line continuation |
codegen/codegen.go |
Array literal gen, RETYPES gen, intrinsic helpers, CAUSEERROR |
codegen/e2e_*_test.go |
New E2E tests |
codegen/codegen_test.go |
New unit tests |
parser/parser_test.go |
New parser tests |
If you need specific details from before exiting plan mode (like exact code snippets, error messages, or content you generated), read the full transcript at: /home/david/.claude/projects/-home-david-projects-code-associates-occam2go/c4302863-3896-4a30-bd1b-760356b188c1.jsonl
The user wants me to implement the Phase 2 plan for full course module transpilation. This is a large, multi-step implementation. Let me start by reading the key files I'll need to modify, then work through the steps systematically.
Let me begin by reading the current state of the key files in parallel.
I'll implement the Phase 2 plan step by step. Let me start by reading all the key files I need to modify.
1→package lexer
2→
3→type TokenType int
4→
5→const (
6→ // Special tokens
7→ ILLEGAL TokenType = iota
8→ EOF
9→ NEWLINE
10→ INDENT
11→ DEDENT
12→
13→ // Literals
14→ IDENT // variable names, procedure names
15→ INT // integer literal
16→ STRING // string literal
17→ BYTE_LIT // byte literal: 'A', '*n', etc.
18→
19→ // Operators
20→ ASSIGN // :=
21→ PLUS // +
22→ MINUS // -
23→ MULTIPLY // *
24→ DIVIDE // /
25→ MODULO // \ (backslash in Occam)
26→ EQ // =
27→ NEQ // <>
28→ LT // <
29→ GT // >
30→ LE // <=
31→ GE // >=
32→ SEND // !
33→ RECEIVE // ?
34→ AMPERSAND // & (guard separator in ALT)
35→ BITAND // /\ (bitwise AND)
36→ BITOR // \/ (bitwise OR)
37→ BITXOR // >< (bitwise XOR)
38→ BITNOT // ~ (bitwise NOT)
39→ LSHIFT // << (left shift)
40→ RSHIFT // >> (right shift)
41→
42→ // Delimiters
43→ LPAREN // (
44→ RPAREN // )
45→ LBRACKET // [
46→ RBRACKET // ]
47→ COMMA // ,
48→ COLON // :
49→ SEMICOLON // ;
50→
51→ // Keywords
52→ keyword_beg
53→ SEQ
54→ PAR
55→ ALT
56→ IF
57→ CASE
58→ ELSE
59→ WHILE
60→ FOR
61→ FROM
62→ PROC
63→ FUNC
64→ FUNCTION
65→ VALOF
66→ RESULT
67→ IS
68→ CHAN
69→ OF
70→ TRUE
71→ FALSE
72→ NOT
73→ AND
74→ OR
75→ SKIP
76→ STOP
77→ INT_TYPE
78→ BYTE_TYPE
79→ BOOL_TYPE
80→ REAL_TYPE
81→ REAL32_TYPE
82→ REAL64_TYPE
83→ TIMER
84→ AFTER
85→ VAL
86→ PROTOCOL
87→ RECORD
88→ SIZE_KW
89→ STEP
90→ MOSTNEG_KW
91→ MOSTPOS_KW
92→ INITIAL
93→ PLUS_KW // PLUS (modular addition keyword, distinct from + symbol)
94→ MINUS_KW // MINUS (modular subtraction keyword, distinct from - symbol)
95→ TIMES // TIMES (modular multiplication keyword)
96→ keyword_end
97→)
98→
99→var tokenNames = map[TokenType]string{
100→ ILLEGAL: "ILLEGAL",
101→ EOF: "EOF",
102→ NEWLINE: "NEWLINE",
103→ INDENT: "INDENT",
104→ DEDENT: "DEDENT",
105→
106→ IDENT: "IDENT",
107→ INT: "INT",
108→ STRING: "STRING",
109→ BYTE_LIT: "BYTE_LIT",
110→
111→ ASSIGN: ":=",
112→ PLUS: "+",
113→ MINUS: "-",
114→ MULTIPLY: "*",
115→ DIVIDE: "/",
116→ MODULO: "\\",
117→ EQ: "=",
118→ NEQ: "<>",
119→ LT: "<",
120→ GT: ">",
121→ LE: "<=",
122→ GE: ">=",
123→ SEND: "!",
124→ RECEIVE: "?",
125→ AMPERSAND: "&",
126→ BITAND: "/\\",
127→ BITOR: "\\/",
128→ BITXOR: "><",
129→ BITNOT: "~",
130→ LSHIFT: "<<",
131→ RSHIFT: ">>",
132→
133→ LPAREN: "(",
134→ RPAREN: ")",
135→ LBRACKET: "[",
136→ RBRACKET: "]",
137→ COMMA: ",",
138→ COLON: ":",
139→ SEMICOLON: ";",
140→
141→ SEQ: "SEQ",
142→ PAR: "PAR",
143→ ALT: "ALT",
144→ IF: "IF",
145→ CASE: "CASE",
146→ ELSE: "ELSE",
147→ WHILE: "WHILE",
148→ FOR: "FOR",
149→ FROM: "FROM",
150→ PROC: "PROC",
151→ FUNC: "FUNC",
152→ FUNCTION: "FUNCTION",
153→ VALOF: "VALOF",
154→ RESULT: "RESULT",
155→ IS: "IS",
156→ CHAN: "CHAN",
157→ OF: "OF",
158→ TRUE: "TRUE",
159→ FALSE: "FALSE",
160→ NOT: "NOT",
161→ AND: "AND",
162→ OR: "OR",
163→ SKIP: "SKIP",
164→ STOP: "STOP",
165→ INT_TYPE: "INT",
166→ BYTE_TYPE: "BYTE",
167→ BOOL_TYPE: "BOOL",
168→ REAL_TYPE: "REAL",
169→ REAL32_TYPE: "REAL32",
170→ REAL64_TYPE: "REAL64",
171→ TIMER: "TIMER",
172→ AFTER: "AFTER",
173→ VAL: "VAL",
174→ PROTOCOL: "PROTOCOL",
175→ RECORD: "RECORD",
176→ SIZE_KW: "SIZE",
177→ STEP: "STEP",
178→ MOSTNEG_KW: "MOSTNEG",
179→ MOSTPOS_KW: "MOSTPOS",
180→ INITIAL: "INITIAL",
181→ PLUS_KW: "PLUS",
182→ MINUS_KW: "MINUS",
183→ TIMES: "TIMES",
184→}
185→
186→var keywords = map[string]TokenType{
187→ "SEQ": SEQ,
188→ "PAR": PAR,
189→ "ALT": ALT,
190→ "IF": IF,
191→ "CASE": CASE,
192→ "ELSE": ELSE,
193→ "WHILE": WHILE,
194→ "FOR": FOR,
195→ "FROM": FROM,
196→ "PROC": PROC,
197→ "FUNC": FUNC,
198→ "FUNCTION": FUNCTION,
199→ "VALOF": VALOF,
200→ "RESULT": RESULT,
201→ "IS": IS,
202→ "CHAN": CHAN,
203→ "OF": OF,
204→ "TRUE": TRUE,
205→ "FALSE": FALSE,
206→ "NOT": NOT,
207→ "AND": AND,
208→ "OR": OR,
209→ "SKIP": SKIP,
210→ "STOP": STOP,
211→ "INT": INT_TYPE,
212→ "BYTE": BYTE_TYPE,
213→ "BOOL": BOOL_TYPE,
214→ "REAL": REAL_TYPE,
215→ "REAL32": REAL32_TYPE,
216→ "REAL64": REAL64_TYPE,
217→ "TIMER": TIMER,
218→ "AFTER": AFTER,
219→ "VAL": VAL,
220→ "PROTOCOL": PROTOCOL,
221→ "RECORD": RECORD,
222→ "SIZE": SIZE_KW,
223→ "STEP": STEP,
224→ "MOSTNEG": MOSTNEG_KW,
225→ "MOSTPOS": MOSTPOS_KW,
226→ "INITIAL": INITIAL,
227→ "PLUS": PLUS_KW,
228→ "MINUS": MINUS_KW,
229→ "TIMES": TIMES,
230→}
231→
232→func (t TokenType) String() string {
233→ if name, ok := tokenNames[t]; ok {
234→ return name
235→ }
236→ return "UNKNOWN"
237→}
238→
239→func LookupIdent(ident string) TokenType {
240→ if tok, ok := keywords[ident]; ok {
241→ return tok
242→ }
243→ return IDENT
244→}
245→
246→type Token struct {
247→ Type TokenType
248→ Literal string
249→ Line int
250→ Column int
251→}
252→ 1→package lexer
2→
3→import (
4→ "strings"
5→)
6→
7→type Lexer struct {
8→ input string
9→ position int // current position in input (points to current char)
10→ readPosition int // current reading position (after current char)
11→ ch byte // current char under examination
12→ line int
13→ column int
14→
15→ // Indentation tracking
16→ indentStack []int // stack of indentation levels
17→ pendingTokens []Token // tokens to emit before reading more input
18→ atLineStart bool
19→
20→ // Parenthesis depth: suppress INDENT/DEDENT/NEWLINE inside (...)
21→ parenDepth int
22→}
23→
24→func New(input string) *Lexer {
25→ l := &Lexer{
26→ input: input,
27→ line: 1,
28→ column: 0,
29→ indentStack: []int{0}, // start with base indentation of 0
30→ atLineStart: true,
31→ }
32→ l.readChar()
33→ return l
34→}
35→
36→func (l *Lexer) readChar() {
37→ if l.readPosition >= len(l.input) {
38→ l.ch = 0
39→ } else {
40→ l.ch = l.input[l.readPosition]
41→ }
42→ l.position = l.readPosition
43→ l.readPosition++
44→ l.column++
45→}
46→
47→func (l *Lexer) peekChar() byte {
48→ if l.readPosition >= len(l.input) {
49→ return 0
50→ }
51→ return l.input[l.readPosition]
52→}
53→
54→func (l *Lexer) NextToken() Token {
55→ // Return any pending tokens first (from indentation processing)
56→ if len(l.pendingTokens) > 0 {
57→ tok := l.pendingTokens[0]
58→ l.pendingTokens = l.pendingTokens[1:]
59→ return tok
60→ }
61→
62→ // Handle indentation at the start of a line
63→ if l.atLineStart {
64→ l.atLineStart = false
65→ indent := l.measureIndent()
66→ currentIndent := l.indentStack[len(l.indentStack)-1]
67→
68→ if l.parenDepth > 0 {
69→ // Inside parentheses: suppress INDENT/DEDENT tokens
70→ // (don't modify indentStack — resume normal tracking after close paren)
71→ } else if indent > currentIndent {
72→ l.indentStack = append(l.indentStack, indent)
73→ return Token{Type: INDENT, Literal: "", Line: l.line, Column: 1}
74→ } else if indent < currentIndent {
75→ // May need multiple DEDENTs
76→ for len(l.indentStack) > 1 && l.indentStack[len(l.indentStack)-1] > indent {
77→ l.indentStack = l.indentStack[:len(l.indentStack)-1]
78→ l.pendingTokens = append(l.pendingTokens, Token{Type: DEDENT, Literal: "", Line: l.line, Column: 1})
79→ }
80→ if len(l.pendingTokens) > 0 {
81→ tok := l.pendingTokens[0]
82→ l.pendingTokens = l.pendingTokens[1:]
83→ return tok
84→ }
85→ }
86→ }
87→
88→ l.skipWhitespace()
89→
90→ var tok Token
91→ tok.Line = l.line
92→ tok.Column = l.column
93→
94→ switch l.ch {
95→ case '(':
96→ l.parenDepth++
97→ tok = l.newToken(LPAREN, l.ch)
98→ case ')':
99→ if l.parenDepth > 0 {
100→ l.parenDepth--
101→ }
102→ tok = l.newToken(RPAREN, l.ch)
103→ case '[':
104→ tok = l.newToken(LBRACKET, l.ch)
105→ case ']':
106→ tok = l.newToken(RBRACKET, l.ch)
107→ case ',':
108→ tok = l.newToken(COMMA, l.ch)
109→ case ';':
110→ tok = l.newToken(SEMICOLON, l.ch)
111→ case '+':
112→ tok = l.newToken(PLUS, l.ch)
113→ case '*':
114→ tok = l.newToken(MULTIPLY, l.ch)
115→ case '/':
116→ if l.peekChar() == '\\' {
117→ ch := l.ch
118→ l.readChar()
119→ tok = Token{Type: BITAND, Literal: string(ch) + string(l.ch), Line: l.line, Column: l.column - 1}
120→ } else {
121→ tok = l.newToken(DIVIDE, l.ch)
122→ }
123→ case '\\':
124→ if l.peekChar() == '/' {
125→ ch := l.ch
126→ l.readChar()
127→ tok = Token{Type: BITOR, Literal: string(ch) + string(l.ch), Line: l.line, Column: l.column - 1}
128→ } else {
129→ tok = l.newToken(MODULO, l.ch)
130→ }
131→ case '~':
132→ tok = l.newToken(BITNOT, l.ch)
133→ case '=':
134→ tok = l.newToken(EQ, l.ch)
135→ case '!':
136→ tok = l.newToken(SEND, l.ch)
137→ case '?':
138→ tok = l.newToken(RECEIVE, l.ch)
139→ case '&':
140→ tok = l.newToken(AMPERSAND, l.ch)
141→ case ':':
142→ if l.peekChar() == '=' {
143→ ch := l.ch
144→ l.readChar()
145→ tok = Token{Type: ASSIGN, Literal: string(ch) + string(l.ch), Line: l.line, Column: l.column - 1}
146→ } else {
147→ tok = l.newToken(COLON, l.ch)
148→ }
149→ case '<':
150→ if l.peekChar() == '=' {
151→ ch := l.ch
152→ l.readChar()
153→ tok = Token{Type: LE, Literal: string(ch) + string(l.ch), Line: l.line, Column: l.column - 1}
154→ } else if l.peekChar() == '>' {
155→ ch := l.ch
156→ l.readChar()
157→ tok = Token{Type: NEQ, Literal: string(ch) + string(l.ch), Line: l.line, Column: l.column - 1}
158→ } else if l.peekChar() == '<' {
159→ ch := l.ch
160→ l.readChar()
161→ tok = Token{Type: LSHIFT, Literal: string(ch) + string(l.ch), Line: l.line, Column: l.column - 1}
162→ } else {
163→ tok = l.newToken(LT, l.ch)
164→ }
165→ case '>':
166→ if l.peekChar() == '=' {
167→ ch := l.ch
168→ l.readChar()
169→ tok = Token{Type: GE, Literal: string(ch) + string(l.ch), Line: l.line, Column: l.column - 1}
170→ } else if l.peekChar() == '>' {
171→ ch := l.ch
172→ l.readChar()
173→ tok = Token{Type: RSHIFT, Literal: string(ch) + string(l.ch), Line: l.line, Column: l.column - 1}
174→ } else if l.peekChar() == '<' {
175→ ch := l.ch
176→ l.readChar()
177→ tok = Token{Type: BITXOR, Literal: string(ch) + string(l.ch), Line: l.line, Column: l.column - 1}
178→ } else {
179→ tok = l.newToken(GT, l.ch)
180→ }
181→ case '#':
182→ if isHexDigit(l.peekChar()) {
183→ tok.Type = INT
184→ tok.Literal = l.readHexNumber()
185→ tok.Line = l.line
186→ return tok
187→ } else {
188→ tok = l.newToken(ILLEGAL, l.ch)
189→ }
190→ case '-':
191→ if l.peekChar() == '-' {
192→ l.skipComment()
193→ return l.NextToken()
194→ } else {
195→ tok = l.newToken(MINUS, l.ch)
196→ }
197→ case '"':
198→ tok.Type = STRING
199→ tok.Literal = l.readString()
200→ case '\'':
201→ tok.Type = BYTE_LIT
202→ tok.Literal = l.readByteLiteral()
203→ tok.Line = l.line
204→ tok.Column = l.column
205→ case '\n':
206→ l.line++
207→ l.column = 0
208→ l.atLineStart = true
209→ l.readChar()
210→ // Skip blank lines (but not EOF)
211→ for l.ch != 0 && (l.ch == '\n' || l.isBlankLine()) {
212→ if l.ch == '\n' {
213→ l.line++
214→ l.column = 0
215→ l.readChar()
216→ } else {
217→ l.skipToEndOfLine()
218→ }
219→ }
220→ if l.parenDepth > 0 {
221→ // Inside parentheses: suppress NEWLINE, get next real token
222→ return l.NextToken()
223→ }
224→ tok = Token{Type: NEWLINE, Literal: "\\n", Line: l.line, Column: l.column}
225→ return tok
226→ case 0:
227→ // Emit any remaining DEDENTs before EOF
228→ if len(l.indentStack) > 1 {
229→ l.indentStack = l.indentStack[:len(l.indentStack)-1]
230→ return Token{Type: DEDENT, Literal: "", Line: l.line, Column: l.column}
231→ }
232→ tok.Literal = ""
233→ tok.Type = EOF
234→ default:
235→ if isLetter(l.ch) {
236→ tok.Literal = l.readIdentifier()
237→ tok.Type = LookupIdent(tok.Literal)
238→ tok.Line = l.line
239→ return tok
240→ } else if isDigit(l.ch) {
241→ tok.Type = INT
242→ tok.Literal = l.readNumber()
243→ tok.Line = l.line
244→ return tok
245→ } else {
246→ tok = l.newToken(ILLEGAL, l.ch)
247→ }
248→ }
249→
250→ l.readChar()
251→ return tok
252→}
253→
254→func (l *Lexer) newToken(tokenType TokenType, ch byte) Token {
255→ return Token{Type: tokenType, Literal: string(ch), Line: l.line, Column: l.column}
256→}
257→
258→func (l *Lexer) readIdentifier() string {
259→ position := l.position
260→ for isLetter(l.ch) || isDigit(l.ch) || l.ch == '.' {
261→ l.readChar()
262→ }
263→ return l.input[position:l.position]
264→}
265→
266→func (l *Lexer) readNumber() string {
267→ position := l.position
268→ for isDigit(l.ch) {
269→ l.readChar()
270→ }
271→ return l.input[position:l.position]
272→}
273→
274→func (l *Lexer) readHexNumber() string {
275→ // Current char is '#', skip it
276→ l.readChar()
277→ position := l.position
278→ for isHexDigit(l.ch) {
279→ l.readChar()
280→ }
281→ return "0x" + l.input[position:l.position]
282→}
283→
284→func (l *Lexer) readString() string {
285→ position := l.position + 1
286→ for {
287→ l.readChar()
288→ if l.ch == '"' || l.ch == 0 {
289→ break
290→ }
291→ }
292→ return l.input[position:l.position]
293→}
294→
295→func (l *Lexer) readByteLiteral() string {
296→ // Current char is the opening single quote.
297→ // Read content between single quotes, handling *' escape.
298→ // In occam, * is the escape character. ** means literal *, *' means literal '.
299→ position := l.position + 1
300→ escaped := false
301→ for {
302→ l.readChar()
303→ if l.ch == 0 {
304→ break
305→ }
306→ if escaped {
307→ // This char is the escaped character; consume it and clear flag
308→ escaped = false
309→ continue
310→ }
311→ if l.ch == '*' {
312→ escaped = true
313→ continue
314→ }
315→ if l.ch == '\'' {
316→ break
317→ }
318→ }
319→ return l.input[position:l.position]
320→}
321→
322→func (l *Lexer) skipWhitespace() {
323→ for l.ch == ' ' || l.ch == '\t' || l.ch == '\r' {
324→ l.readChar()
325→ }
326→}
327→
328→func (l *Lexer) skipComment() {
329→ // Skip -- comment until end of line
330→ for l.ch != '\n' && l.ch != 0 {
331→ l.readChar()
332→ }
333→}
334→
335→func (l *Lexer) measureIndent() int {
336→ indent := 0
337→ pos := l.position
338→ for pos < len(l.input) {
339→ ch := l.input[pos]
340→ if ch == ' ' {
341→ indent++
342→ pos++
343→ } else if ch == '\t' {
344→ indent += 2 // treat tabs as 2 spaces
345→ pos++
346→ } else {
347→ break
348→ }
349→ }
350→ return indent
351→}
352→
353→func (l *Lexer) isBlankLine() bool {
354→ pos := l.position
355→ for pos < len(l.input) {
356→ ch := l.input[pos]
357→ if ch == '\n' {
358→ return true
359→ }
360→ if ch != ' ' && ch != '\t' && ch != '\r' {
361→ // Check for comment-only line
362→ if ch == '-' && pos+1 < len(l.input) && l.input[pos+1] == '-' {
363→ return true
364→ }
365→ return false
366→ }
367→ pos++
368→ }
369→ return false // EOF is not a blank line
370→}
371→
372→func (l *Lexer) skipToEndOfLine() {
373→ for l.ch != '\n' && l.ch != 0 {
374→ l.readChar()
375→ }
376→ if l.ch == '\n' {
377→ l.line++
378→ l.column = 0
379→ l.readChar()
380→ }
381→}
382→
383→func isLetter(ch byte) bool {
384→ return (ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z') || ch == '_'
385→}
386→
387→func isDigit(ch byte) bool {
388→ return ch >= '0' && ch <= '9'
389→}
390→
391→func isHexDigit(ch byte) bool {
392→ return (ch >= '0' && ch <= '9') || (ch >= 'a' && ch <= 'f') || (ch >= 'A' && ch <= 'F')
393→}
394→
395→// Tokenize returns all tokens from the input
396→func Tokenize(input string) []Token {
397→ // Ensure input ends with newline for consistent processing
398→ if !strings.HasSuffix(input, "\n") {
399→ input = input + "\n"
400→ }
401→
402→ l := New(input)
403→ var tokens []Token
404→ for {
405→ tok := l.NextToken()
406→ tokens = append(tokens, tok)
407→ if tok.Type == EOF {
408→ break
409→ }
410→ }
411→ return tokens
412→}
413→ 1→package ast
2→
3→import (
4→ "github.com/codeassociates/occam2go/lexer"
5→)
6→
7→// Node is the base interface for all AST nodes
8→type Node interface {
9→ TokenLiteral() string
10→}
11→
12→// Statement represents a statement node
13→type Statement interface {
14→ Node
15→ statementNode()
16→}
17→
18→// Expression represents an expression node
19→type Expression interface {
20→ Node
21→ expressionNode()
22→}
23→
24→// Program is the root node of every AST
25→type Program struct {
26→ Statements []Statement
27→}
28→
29→func (p *Program) TokenLiteral() string {
30→ if len(p.Statements) > 0 {
31→ return p.Statements[0].TokenLiteral()
32→ }
33→ return ""
34→}
35→
36→// VarDecl represents a variable declaration: INT x:
37→type VarDecl struct {
38→ Token lexer.Token // the type token (INT, BYTE, BOOL)
39→ Type string // "INT", "BYTE", "BOOL", etc.
40→ Names []string // variable names (can declare multiple: INT x, y, z:)
41→}
42→
43→func (v *VarDecl) statementNode() {}
44→func (v *VarDecl) TokenLiteral() string { return v.Token.Literal }
45→
46→// ArrayDecl represents an array declaration: [5]INT arr:
47→type ArrayDecl struct {
48→ Token lexer.Token // the [ token
49→ Size Expression // array size
50→ Type string // element type ("INT", "BYTE", "BOOL", etc.)
51→ Names []string // variable names
52→}
53→
54→func (a *ArrayDecl) statementNode() {}
55→func (a *ArrayDecl) TokenLiteral() string { return a.Token.Literal }
56→
57→// Assignment represents an assignment: x := 5 or arr[i] := 5 or [arr FROM n FOR m] := value
58→type Assignment struct {
59→ Token lexer.Token // the := token
60→ Name string // variable name
61→ Index Expression // optional: index expression for arr[i] := x (nil for simple assignments)
62→ SliceTarget *SliceExpr // optional: slice target for [arr FROM n FOR m] := value
63→ Value Expression // the value being assigned
64→}
65→
66→func (a *Assignment) statementNode() {}
67→func (a *Assignment) TokenLiteral() string { return a.Token.Literal }
68→
69→// MultiAssignTarget represents one target in a multi-assignment.
70→// Name is always set. Index is non-nil for indexed targets like arr[i].
71→type MultiAssignTarget struct {
72→ Name string // variable name
73→ Index Expression // optional: index expression for arr[i] (nil for simple ident)
74→}
75→
76→// MultiAssignment represents a multi-target assignment: a, b := func(x)
77→type MultiAssignment struct {
78→ Token lexer.Token // the := token
79→ Targets []MultiAssignTarget // targets on the left side
80→ Values []Expression // expressions on the right side
81→}
82→
83→func (m *MultiAssignment) statementNode() {}
84→func (m *MultiAssignment) TokenLiteral() string { return m.Token.Literal }
85→
86→// SeqBlock represents a SEQ block (sequential execution)
87→// If Replicator is non-nil, this is a replicated SEQ (SEQ i = 0 FOR n)
88→type SeqBlock struct {
89→ Token lexer.Token // the SEQ token
90→ Statements []Statement
91→ Replicator *Replicator // optional replicator
92→}
93→
94→func (s *SeqBlock) statementNode() {}
95→func (s *SeqBlock) TokenLiteral() string { return s.Token.Literal }
96→
97→// ParBlock represents a PAR block (parallel execution)
98→// If Replicator is non-nil, this is a replicated PAR (PAR i = 0 FOR n)
99→type ParBlock struct {
100→ Token lexer.Token // the PAR token
101→ Statements []Statement
102→ Replicator *Replicator // optional replicator
103→}
104→
105→func (p *ParBlock) statementNode() {}
106→func (p *ParBlock) TokenLiteral() string { return p.Token.Literal }
107→
108→// Replicator represents a replication spec: i = start FOR count [STEP step]
109→type Replicator struct {
110→ Variable string // loop variable name
111→ Start Expression // start value
112→ Count Expression // number of iterations
113→ Step Expression // optional step value (nil means step of 1)
114→}
115→
116→// Skip represents the SKIP statement (no-op)
117→type Skip struct {
118→ Token lexer.Token
119→}
120→
121→func (s *Skip) statementNode() {}
122→func (s *Skip) TokenLiteral() string { return s.Token.Literal }
123→
124→// Stop represents the STOP statement (deadlock/halt)
125→type Stop struct {
126→ Token lexer.Token
127→}
128→
129→func (s *Stop) statementNode() {}
130→func (s *Stop) TokenLiteral() string { return s.Token.Literal }
131→
132→// ProcDecl represents a procedure declaration
133→type ProcDecl struct {
134→ Token lexer.Token // the PROC token
135→ Name string
136→ Params []ProcParam
137→ Body []Statement // local declarations + body process
138→}
139→
140→func (p *ProcDecl) statementNode() {}
141→func (p *ProcDecl) TokenLiteral() string { return p.Token.Literal }
142→
143→// ProcParam represents a procedure parameter
144→type ProcParam struct {
145→ IsVal bool // VAL parameter (pass by value)
146→ Type string // INT, BYTE, BOOL, etc.
147→ Name string
148→ IsChan bool // true if this is a CHAN OF <type> parameter
149→ IsChanArray bool // true for []CHAN OF TYPE params
150→ IsOpenArray bool // true for []TYPE params (open array)
151→ ChanElemType string // element type when IsChan (e.g., "INT")
152→ ChanDir string // "?" for input, "!" for output, "" for bidirectional
153→ ArraySize string // non-empty for fixed-size array params like [2]INT
154→}
155→
156→// ProcCall represents a procedure call
157→type ProcCall struct {
158→ Token lexer.Token // the procedure name token
159→ Name string
160→ Args []Expression
161→}
162→
163→func (p *ProcCall) statementNode() {}
164→func (p *ProcCall) TokenLiteral() string { return p.Token.Literal }
165→
166→// FuncDecl represents a function declaration (single or multi-result)
167→type FuncDecl struct {
168→ Token lexer.Token // the return type token
169→ ReturnTypes []string // return types: ["INT"], ["INT", "INT"], etc.
170→ Name string
171→ Params []ProcParam
172→ Body []Statement // local decls + body statements (VALOF form), empty for IS form
173→ ResultExprs []Expression // return expressions (from IS or RESULT)
174→}
175→
176→func (f *FuncDecl) statementNode() {}
177→func (f *FuncDecl) TokenLiteral() string { return f.Token.Literal }
178→
179→// FuncCall represents a function call expression
180→type FuncCall struct {
181→ Token lexer.Token // the function name token
182→ Name string
183→ Args []Expression
184→}
185→
186→func (f *FuncCall) expressionNode() {}
187→func (f *FuncCall) TokenLiteral() string { return f.Token.Literal }
188→
189→// WhileLoop represents a WHILE loop
190→type WhileLoop struct {
191→ Token lexer.Token // the WHILE token
192→ Condition Expression
193→ Body []Statement
194→}
195→
196→func (w *WhileLoop) statementNode() {}
197→func (w *WhileLoop) TokenLiteral() string { return w.Token.Literal }
198→
199→// IfStatement represents an IF statement
200→type IfStatement struct {
201→ Token lexer.Token // the IF token
202→ Choices []IfChoice
203→ Replicator *Replicator // optional replicator for IF i = start FOR count
204→}
205→
206→type IfChoice struct {
207→ Condition Expression
208→ Body []Statement
209→ NestedIf *IfStatement // non-nil when this choice is a nested/replicated IF
210→}
211→
212→func (i *IfStatement) statementNode() {}
213→func (i *IfStatement) TokenLiteral() string { return i.Token.Literal }
214→
215→// CaseStatement represents a CASE statement
216→type CaseStatement struct {
217→ Token lexer.Token // the CASE token
218→ Selector Expression // the selector expression
219→ Choices []CaseChoice
220→}
221→
222→type CaseChoice struct {
223→ Values []Expression // nil/empty for ELSE
224→ IsElse bool
225→ Body []Statement
226→}
227→
228→func (c *CaseStatement) statementNode() {}
229→func (c *CaseStatement) TokenLiteral() string { return c.Token.Literal }
230→
231→// Expressions
232→
233→// Identifier represents a variable reference
234→type Identifier struct {
235→ Token lexer.Token
236→ Value string
237→}
238→
239→func (i *Identifier) expressionNode() {}
240→func (i *Identifier) TokenLiteral() string { return i.Token.Literal }
241→
242→// IntegerLiteral represents an integer literal
243→type IntegerLiteral struct {
244→ Token lexer.Token
245→ Value int64
246→}
247→
248→func (il *IntegerLiteral) expressionNode() {}
249→func (il *IntegerLiteral) TokenLiteral() string { return il.Token.Literal }
250→
251→// BooleanLiteral represents TRUE or FALSE
252→type BooleanLiteral struct {
253→ Token lexer.Token
254→ Value bool
255→}
256→
257→func (bl *BooleanLiteral) expressionNode() {}
258→func (bl *BooleanLiteral) TokenLiteral() string { return bl.Token.Literal }
259→
260→// StringLiteral represents a string literal: "hello"
261→type StringLiteral struct {
262→ Token lexer.Token
263→ Value string
264→}
265→
266→func (sl *StringLiteral) expressionNode() {}
267→func (sl *StringLiteral) TokenLiteral() string { return sl.Token.Literal }
268→
269→// ByteLiteral represents a byte literal: 'A', '*n', etc.
270→type ByteLiteral struct {
271→ Token lexer.Token
272→ Value byte
273→}
274→
275→func (bl *ByteLiteral) expressionNode() {}
276→func (bl *ByteLiteral) TokenLiteral() string { return bl.Token.Literal }
277→
278→// BinaryExpr represents a binary operation: x + y
279→type BinaryExpr struct {
280→ Token lexer.Token // the operator token
281→ Left Expression
282→ Operator string
283→ Right Expression
284→}
285→
286→func (be *BinaryExpr) expressionNode() {}
287→func (be *BinaryExpr) TokenLiteral() string { return be.Token.Literal }
288→
289→// UnaryExpr represents a unary operation: NOT x, -x
290→type UnaryExpr struct {
291→ Token lexer.Token // the operator token
292→ Operator string
293→ Right Expression
294→}
295→
296→func (ue *UnaryExpr) expressionNode() {}
297→func (ue *UnaryExpr) TokenLiteral() string { return ue.Token.Literal }
298→
299→// TypeConversion represents a type conversion expression: INT x, BYTE n, etc.
300→type TypeConversion struct {
301→ Token lexer.Token // the type token (INT, BYTE, etc.)
302→ TargetType string // "INT", "BYTE", "BOOL", "REAL"
303→ Expr Expression // the expression to convert
304→}
305→
306→func (tc *TypeConversion) expressionNode() {}
307→func (tc *TypeConversion) TokenLiteral() string { return tc.Token.Literal }
308→
309→// SizeExpr represents a SIZE expression: SIZE arr
310→type SizeExpr struct {
311→ Token lexer.Token // the SIZE token
312→ Expr Expression // the array/string expression
313→}
314→
315→func (se *SizeExpr) expressionNode() {}
316→func (se *SizeExpr) TokenLiteral() string { return se.Token.Literal }
317→
318→// MostExpr represents MOSTNEG/MOSTPOS type expressions: MOSTNEG INT, MOSTPOS BYTE, etc.
319→type MostExpr struct {
320→ Token lexer.Token // the MOSTNEG or MOSTPOS token
321→ ExprType string // "INT", "BYTE", "REAL32", "REAL64", etc.
322→ IsNeg bool // true for MOSTNEG, false for MOSTPOS
323→}
324→
325→func (me *MostExpr) expressionNode() {}
326→func (me *MostExpr) TokenLiteral() string { return me.Token.Literal }
327→
328→// ParenExpr represents a parenthesized expression
329→type ParenExpr struct {
330→ Token lexer.Token
331→ Expr Expression
332→}
333→
334→func (pe *ParenExpr) expressionNode() {}
335→func (pe *ParenExpr) TokenLiteral() string { return pe.Token.Literal }
336→
337→// IndexExpr represents an array index expression: arr[i]
338→type IndexExpr struct {
339→ Token lexer.Token // the [ token
340→ Left Expression // the array expression
341→ Index Expression // the index expression
342→}
343→
344→func (ie *IndexExpr) expressionNode() {}
345→func (ie *IndexExpr) TokenLiteral() string { return ie.Token.Literal }
346→
347→// ChanDecl represents a channel declaration: CHAN OF INT c: or [n]CHAN OF INT cs:
348→type ChanDecl struct {
349→ Token lexer.Token // the CHAN token
350→ ElemType string // the element type (INT, BYTE, etc.)
351→ Names []string // channel names
352→ IsArray bool // true for [n]CHAN OF TYPE
353→ Size Expression // array size when IsArray
354→}
355→
356→func (c *ChanDecl) statementNode() {}
357→func (c *ChanDecl) TokenLiteral() string { return c.Token.Literal }
358→
359→// Send represents a channel send: c ! x or c ! x ; y or c ! tag ; x
360→type Send struct {
361→ Token lexer.Token // the ! token
362→ Channel string // channel name
363→ ChannelIndex Expression // non-nil for cs[i] ! value
364→ Value Expression // value to send (simple send, backward compat)
365→ Values []Expression // additional values for sequential sends (c ! x ; y)
366→ VariantTag string // variant tag name for variant sends (c ! tag ; x)
367→}
368→
369→func (s *Send) statementNode() {}
370→func (s *Send) TokenLiteral() string { return s.Token.Literal }
371→
372→// Receive represents a channel receive: c ? x or c ? x ; y
373→type Receive struct {
374→ Token lexer.Token // the ? token
375→ Channel string // channel name
376→ ChannelIndex Expression // non-nil for cs[i] ? x
377→ Variable string // variable to receive into (simple receive)
378→ Variables []string // additional variables for sequential receives (c ? x ; y)
379→}
380→
381→func (r *Receive) statementNode() {}
382→func (r *Receive) TokenLiteral() string { return r.Token.Literal }
383→
384→// AltBlock represents an ALT block (alternation/select)
385→type AltBlock struct {
386→ Token lexer.Token // the ALT token
387→ Cases []AltCase
388→}
389→
390→func (a *AltBlock) statementNode() {}
391→func (a *AltBlock) TokenLiteral() string { return a.Token.Literal }
392→
393→// AltCase represents a single case in an ALT block
394→type AltCase struct {
395→ Guard Expression // optional guard condition (nil if no guard)
396→ Channel string // channel name
397→ ChannelIndex Expression // non-nil for cs[i] ? x in ALT
398→ Variable string // variable to receive into
399→ Body []Statement // the body to execute
400→ IsTimer bool // true if this is a timer AFTER case
401→ Timer string // timer name (when IsTimer)
402→ Deadline Expression // AFTER deadline expression (when IsTimer)
403→}
404→
405→// TimerDecl represents a timer declaration: TIMER tim:
406→type TimerDecl struct {
407→ Token lexer.Token // the TIMER token
408→ Names []string // timer variable names
409→}
410→
411→func (td *TimerDecl) statementNode() {}
412→func (td *TimerDecl) TokenLiteral() string { return td.Token.Literal }
413→
414→// TimerRead represents a timer read: tim ? t
415→type TimerRead struct {
416→ Token lexer.Token // the ? token
417→ Timer string // timer name
418→ Variable string // variable to receive time into
419→}
420→
421→func (tr *TimerRead) statementNode() {}
422→func (tr *TimerRead) TokenLiteral() string { return tr.Token.Literal }
423→
424→// ProtocolDecl represents a protocol declaration
425→type ProtocolDecl struct {
426→ Token lexer.Token // the PROTOCOL token
427→ Name string // protocol name
428→ Kind string // "simple", "sequential", or "variant"
429→ Types []string // element types (simple: len=1, sequential: len>1)
430→ Variants []ProtocolVariant // only for Kind="variant"
431→}
432→
433→type ProtocolVariant struct {
434→ Tag string // tag name (e.g., "text", "quit")
435→ Types []string // associated types (empty for no-payload tags)
436→}
437→
438→func (pd *ProtocolDecl) statementNode() {}
439→func (pd *ProtocolDecl) TokenLiteral() string { return pd.Token.Literal }
440→
441→// VariantReceive represents a variant protocol receive: c ? CASE ...
442→type VariantReceive struct {
443→ Token lexer.Token // the ? token
444→ Channel string
445→ ChannelIndex Expression // non-nil for cs[i] ? CASE ...
446→ Cases []VariantCase
447→}
448→
449→type VariantCase struct {
450→ Tag string // variant tag name
451→ Variables []string // variables to bind payload fields
452→ Body Statement
453→}
454→
455→func (vr *VariantReceive) statementNode() {}
456→func (vr *VariantReceive) TokenLiteral() string { return vr.Token.Literal }
457→
458→// RecordDecl represents a record type declaration: RECORD POINT { INT x: INT y: }
459→type RecordDecl struct {
460→ Token lexer.Token // the RECORD token
461→ Name string // record type name
462→ Fields []RecordField // named fields
463→}
464→
465→type RecordField struct {
466→ Type string // "INT", "BYTE", "BOOL", "REAL"
467→ Name string
468→}
469→
470→func (rd *RecordDecl) statementNode() {}
471→func (rd *RecordDecl) TokenLiteral() string { return rd.Token.Literal }
472→
473→// SliceExpr represents an array slice: [arr FROM start FOR length]
474→type SliceExpr struct {
475→ Token lexer.Token // the [ token
476→ Array Expression // the array being sliced
477→ Start Expression // start index
478→ Length Expression // number of elements
479→}
480→
481→func (se *SliceExpr) expressionNode() {}
482→func (se *SliceExpr) TokenLiteral() string { return se.Token.Literal }
483→
484→// Abbreviation represents an abbreviation: VAL INT x IS 42:, INT y IS z:, or INITIAL INT x IS 42:
485→type Abbreviation struct {
486→ Token lexer.Token // VAL, INITIAL, or type token
487→ IsVal bool // true for VAL abbreviations
488→ IsInitial bool // true for INITIAL declarations
489→ IsOpenArray bool // true for []TYPE abbreviations (e.g. VAL []BYTE)
490→ Type string // "INT", "BYTE", "BOOL", etc.
491→ Name string // variable name
492→ Value Expression // the expression
493→}
494→
495→func (a *Abbreviation) statementNode() {}
496→func (a *Abbreviation) TokenLiteral() string { return a.Token.Literal }
497→ 1→package parser
2→
3→import (
4→ "fmt"
5→ "strconv"
6→ "strings"
7→
8→ "github.com/codeassociates/occam2go/ast"
9→ "github.com/codeassociates/occam2go/lexer"
10→)
11→
12→// Operator precedence levels
13→const (
14→ _ int = iota
15→ LOWEST
16→ OR_PREC // OR
17→ AND_PREC // AND
18→ EQUALS // =, <>
19→ LESSGREATER // <, >, <=, >=
20→ SUM // +, -
21→ PRODUCT // *, /, \
22→ PREFIX // -x, NOT x
23→ INDEX // arr[i]
24→)
25→
26→var precedences = map[lexer.TokenType]int{
27→ lexer.OR: OR_PREC,
28→ lexer.AND: AND_PREC,
29→ lexer.EQ: EQUALS,
30→ lexer.NEQ: EQUALS,
31→ lexer.LT: LESSGREATER,
32→ lexer.GT: LESSGREATER,
33→ lexer.LE: LESSGREATER,
34→ lexer.GE: LESSGREATER,
35→ lexer.AFTER: LESSGREATER,
36→ lexer.PLUS: SUM,
37→ lexer.MINUS: SUM,
38→ lexer.PLUS_KW: SUM,
39→ lexer.MINUS_KW: SUM,
40→ lexer.MULTIPLY: PRODUCT,
41→ lexer.DIVIDE: PRODUCT,
42→ lexer.MODULO: PRODUCT,
43→ lexer.TIMES: PRODUCT,
44→ lexer.BITAND: PRODUCT,
45→ lexer.LSHIFT: PRODUCT,
46→ lexer.RSHIFT: PRODUCT,
47→ lexer.BITOR: SUM,
48→ lexer.BITXOR: SUM,
49→ lexer.LBRACKET: INDEX,
50→}
51→
52→type Parser struct {
53→ l *lexer.Lexer
54→ errors []string
55→
56→ curToken lexer.Token
57→ peekToken lexer.Token
58→
59→ // Track current indentation level
60→ indentLevel int
61→
62→ // Track timer names to distinguish timer reads from channel receives
63→ timerNames map[string]bool
64→
65→ // Track protocol names and definitions
66→ protocolNames map[string]bool
67→ protocolDefs map[string]*ast.ProtocolDecl
68→
69→ // Track record type names and definitions
70→ recordNames map[string]bool
71→ recordDefs map[string]*ast.RecordDecl
72→}
73→
74→func New(l *lexer.Lexer) *Parser {
75→ p := &Parser{
76→ l: l,
77→ errors: []string{},
78→ timerNames: make(map[string]bool),
79→ protocolNames: make(map[string]bool),
80→ protocolDefs: make(map[string]*ast.ProtocolDecl),
81→ recordNames: make(map[string]bool),
82→ recordDefs: make(map[string]*ast.RecordDecl),
83→ }
84→ // Read two tokens to initialize curToken and peekToken
85→ p.nextToken()
86→ p.nextToken()
87→ return p
88→}
89→
90→func (p *Parser) Errors() []string {
91→ return p.errors
92→}
93→
94→func (p *Parser) addError(msg string) {
95→ p.errors = append(p.errors, fmt.Sprintf("line %d: %s", p.curToken.Line, msg))
96→}
97→
98→func (p *Parser) nextToken() {
99→ p.curToken = p.peekToken
100→ p.peekToken = p.l.NextToken()
101→
102→ // Track indentation level
103→ if p.curToken.Type == lexer.INDENT {
104→ p.indentLevel++
105→ } else if p.curToken.Type == lexer.DEDENT {
106→ p.indentLevel--
107→ }
108→}
109→
110→func (p *Parser) curTokenIs(t lexer.TokenType) bool {
111→ return p.curToken.Type == t
112→}
113→
114→func (p *Parser) peekTokenIs(t lexer.TokenType) bool {
115→ return p.peekToken.Type == t
116→}
117→
118→func (p *Parser) expectPeek(t lexer.TokenType) bool {
119→ if p.peekTokenIs(t) {
120→ p.nextToken()
121→ return true
122→ }
123→ p.addError(fmt.Sprintf("expected %s, got %s", t, p.peekToken.Type))
124→ return false
125→}
126→
127→func (p *Parser) peekPrecedence() int {
128→ if prec, ok := precedences[p.peekToken.Type]; ok {
129→ return prec
130→ }
131→ return LOWEST
132→}
133→
134→func (p *Parser) curPrecedence() int {
135→ if prec, ok := precedences[p.curToken.Type]; ok {
136→ return prec
137→ }
138→ return LOWEST
139→}
140→
141→// ParseProgram parses the entire program
142→func (p *Parser) ParseProgram() *ast.Program {
143→ program := &ast.Program{}
144→ program.Statements = []ast.Statement{}
145→
146→ for !p.curTokenIs(lexer.EOF) {
147→ stmt := p.parseStatement()
148→ if stmt != nil {
149→ program.Statements = append(program.Statements, stmt)
150→ }
151→ p.nextToken()
152→ }
153→
154→ return program
155→}
156→
157→func (p *Parser) parseStatement() ast.Statement {
158→ // Skip newlines
159→ for p.curTokenIs(lexer.NEWLINE) {
160→ p.nextToken()
161→ }
162→
163→ switch p.curToken.Type {
164→ case lexer.VAL:
165→ return p.parseAbbreviation()
166→ case lexer.INITIAL:
167→ return p.parseInitialDecl()
168→ case lexer.INT_TYPE, lexer.BYTE_TYPE, lexer.BOOL_TYPE, lexer.REAL_TYPE, lexer.REAL32_TYPE, lexer.REAL64_TYPE:
169→ if p.peekTokenIs(lexer.FUNCTION) || p.peekTokenIs(lexer.FUNC) || p.peekTokenIs(lexer.COMMA) {
170→ return p.parseFuncDecl()
171→ }
172→ return p.parseVarDeclOrAbbreviation()
173→ case lexer.LBRACKET:
174→ return p.parseArrayDecl()
175→ case lexer.CHAN:
176→ return p.parseChanDecl()
177→ case lexer.PROTOCOL:
178→ return p.parseProtocolDecl()
179→ case lexer.RECORD:
180→ return p.parseRecordDecl()
181→ case lexer.TIMER:
182→ return p.parseTimerDecl()
183→ case lexer.SEQ:
184→ return p.parseSeqBlock()
185→ case lexer.PAR:
186→ return p.parseParBlock()
187→ case lexer.ALT:
188→ return p.parseAltBlock()
189→ case lexer.SKIP:
190→ return &ast.Skip{Token: p.curToken}
191→ case lexer.STOP:
192→ return &ast.Stop{Token: p.curToken}
193→ case lexer.PROC:
194→ return p.parseProcDecl()
195→ case lexer.WHILE:
196→ return p.parseWhileLoop()
197→ case lexer.IF:
198→ return p.parseIfStatement()
199→ case lexer.CASE:
200→ return p.parseCaseStatement()
201→ case lexer.IDENT:
202→ // Check for record variable declaration: TYPENAME var:
203→ if p.recordNames[p.curToken.Literal] && p.peekTokenIs(lexer.IDENT) {
204→ return p.parseRecordVarDecl()
205→ }
206→ // Could be assignment, indexed assignment, indexed send/receive, send, receive, or procedure call
207→ if p.peekTokenIs(lexer.LBRACKET) {
208→ return p.parseIndexedOperation()
209→ }
210→ if p.peekTokenIs(lexer.ASSIGN) {
211→ return p.parseAssignment()
212→ }
213→ if p.peekTokenIs(lexer.COMMA) {
214→ return p.parseMultiAssignment()
215→ }
216→ if p.peekTokenIs(lexer.SEND) {
217→ return p.parseSend()
218→ }
219→ if p.peekTokenIs(lexer.RECEIVE) {
220→ if p.timerNames[p.curToken.Literal] {
221→ return p.parseTimerRead()
222→ }
223→ return p.parseReceive()
224→ }
225→ return p.parseProcCall()
226→ case lexer.INDENT, lexer.DEDENT, lexer.EOF:
227→ return nil
228→ default:
229→ p.addError(fmt.Sprintf("unexpected token: %s", p.curToken.Type))
230→ return nil
231→ }
232→}
233→
234→func (p *Parser) parseVarDecl() *ast.VarDecl {
235→ decl := &ast.VarDecl{
236→ Token: p.curToken,
237→ Type: p.curToken.Literal,
238→ }
239→
240→ // Parse variable names
241→ for {
242→ if !p.expectPeek(lexer.IDENT) {
243→ return nil
244→ }
245→ decl.Names = append(decl.Names, p.curToken.Literal)
246→
247→ if p.peekTokenIs(lexer.COMMA) {
248→ p.nextToken() // consume comma
249→ } else {
250→ break
251→ }
252→ }
253→
254→ if !p.expectPeek(lexer.COLON) {
255→ return nil
256→ }
257→
258→ return decl
259→}
260→
261→// parseVarDeclOrAbbreviation parses either a variable declaration (INT x:)
262→// or a non-VAL abbreviation (INT x IS expr:). Called when current token is a type keyword.
263→func (p *Parser) parseVarDeclOrAbbreviation() ast.Statement {
264→ typeToken := p.curToken
265→ typeName := p.curToken.Literal
266→
267→ // Consume the name
268→ if !p.expectPeek(lexer.IDENT) {
269→ return nil
270→ }
271→ name := p.curToken.Literal
272→
273→ // Check if this is an abbreviation (next token is IS)
274→ if p.peekTokenIs(lexer.IS) {
275→ p.nextToken() // consume IS
276→ p.nextToken() // move to expression
277→ value := p.parseExpression(LOWEST)
278→
279→ if !p.expectPeek(lexer.COLON) {
280→ return nil
281→ }
282→
283→ return &ast.Abbreviation{
284→ Token: typeToken,
285→ IsVal: false,
286→ Type: typeName,
287→ Name: name,
288→ Value: value,
289→ }
290→ }
291→
292→ // Otherwise, it's a regular variable declaration — continue parsing names
293→ decl := &ast.VarDecl{
294→ Token: typeToken,
295→ Type: typeName,
296→ Names: []string{name},
297→ }
298→
299→ // Parse additional comma-separated names
300→ for p.peekTokenIs(lexer.COMMA) {
301→ p.nextToken() // consume comma
302→ if !p.expectPeek(lexer.IDENT) {
303→ return nil
304→ }
305→ decl.Names = append(decl.Names, p.curToken.Literal)
306→ }
307→
308→ if !p.expectPeek(lexer.COLON) {
309→ return nil
310→ }
311→
312→ return decl
313→}
314→
315→// parseAbbreviation parses a VAL abbreviation: VAL INT x IS expr:
316→// Also handles VAL []BYTE x IS "string": (open array abbreviation)
317→// Current token is VAL.
318→func (p *Parser) parseAbbreviation() *ast.Abbreviation {
319→ token := p.curToken // VAL token
320→
321→ p.nextToken()
322→
323→ // Check for []TYPE (open array abbreviation)
324→ isOpenArray := false
325→ if p.curTokenIs(lexer.LBRACKET) && p.peekTokenIs(lexer.RBRACKET) {
326→ isOpenArray = true
327→ p.nextToken() // consume ]
328→ p.nextToken() // move to type
329→ }
330→
331→ // Expect a type keyword
332→ if !isTypeToken(p.curToken.Type) {
333→ p.addError(fmt.Sprintf("expected type after VAL, got %s", p.curToken.Type))
334→ return nil
335→ }
336→ typeName := p.curToken.Literal
337→
338→ // Expect name
339→ if !p.expectPeek(lexer.IDENT) {
340→ return nil
341→ }
342→ name := p.curToken.Literal
343→
344→ // Expect IS
345→ if !p.expectPeek(lexer.IS) {
346→ return nil
347→ }
348→
349→ // Parse expression
350→ p.nextToken()
351→ value := p.parseExpression(LOWEST)
352→
353→ // Expect colon
354→ if !p.expectPeek(lexer.COLON) {
355→ return nil
356→ }
357→
358→ return &ast.Abbreviation{
359→ Token: token,
360→ IsVal: true,
361→ IsOpenArray: isOpenArray,
362→ Type: typeName,
363→ Name: name,
364→ Value: value,
365→ }
366→}
367→
368→// parseInitialDecl parses an INITIAL declaration: INITIAL INT x IS expr:
369→// Current token is INITIAL.
370→func (p *Parser) parseInitialDecl() *ast.Abbreviation {
371→ token := p.curToken // INITIAL token
372→
373→ // Expect a type keyword
374→ p.nextToken()
375→ if !p.curTokenIs(lexer.INT_TYPE) && !p.curTokenIs(lexer.BYTE_TYPE) &&
376→ !p.curTokenIs(lexer.BOOL_TYPE) && !p.curTokenIs(lexer.REAL_TYPE) &&
377→ !p.curTokenIs(lexer.REAL32_TYPE) && !p.curTokenIs(lexer.REAL64_TYPE) {
378→ p.addError(fmt.Sprintf("expected type after INITIAL, got %s", p.curToken.Type))
379→ return nil
380→ }
381→ typeName := p.curToken.Literal
382→
383→ // Expect name
384→ if !p.expectPeek(lexer.IDENT) {
385→ return nil
386→ }
387→ name := p.curToken.Literal
388→
389→ // Expect IS
390→ if !p.expectPeek(lexer.IS) {
391→ return nil
392→ }
393→
394→ // Parse expression
395→ p.nextToken()
396→ value := p.parseExpression(LOWEST)
397→
398→ // Expect colon
399→ if !p.expectPeek(lexer.COLON) {
400→ return nil
401→ }
402→
403→ return &ast.Abbreviation{
404→ Token: token,
405→ IsInitial: true,
406→ Type: typeName,
407→ Name: name,
408→ Value: value,
409→ }
410→}
411→
412→func (p *Parser) parseAssignment() *ast.Assignment {
413→ stmt := &ast.Assignment{
414→ Name: p.curToken.Literal,
415→ }
416→
417→ p.nextToken() // move to :=
418→ stmt.Token = p.curToken
419→
420→ p.nextToken() // move past :=
421→ stmt.Value = p.parseExpression(LOWEST)
422→
423→ return stmt
424→}
425→
426→func (p *Parser) parseMultiAssignment() *ast.MultiAssignment {
427→ firstTarget := ast.MultiAssignTarget{Name: p.curToken.Literal}
428→ return p.parseMultiAssignmentFrom(firstTarget)
429→}
430→
431→// parseMultiAssignmentFrom parses a multi-assignment given the first target already parsed.
432→// The current token should be on the first target's last token (ident or ']').
433→// Peek token should be COMMA.
434→func (p *Parser) parseMultiAssignmentFrom(firstTarget ast.MultiAssignTarget) *ast.MultiAssignment {
435→ stmt := &ast.MultiAssignment{
436→ Targets: []ast.MultiAssignTarget{firstTarget},
437→ }
438→
439→ // Parse comma-separated targets: a, b[i], c
440→ for p.peekTokenIs(lexer.COMMA) {
441→ p.nextToken() // consume comma
442→ p.nextToken() // move to next target
443→ target := ast.MultiAssignTarget{Name: p.curToken.Literal}
444→ if p.peekTokenIs(lexer.LBRACKET) {
445→ p.nextToken() // move to [
446→ p.nextToken() // move past [
447→ target.Index = p.parseExpression(LOWEST)
448→ if !p.expectPeek(lexer.RBRACKET) {
449→ return nil
450→ }
451→ }
452→ stmt.Targets = append(stmt.Targets, target)
453→ }
454→
455→ p.nextToken() // move to :=
456→ stmt.Token = p.curToken
457→
458→ p.nextToken() // move past :=
459→
460→ // Parse comma-separated values
461→ stmt.Values = []ast.Expression{p.parseExpression(LOWEST)}
462→ for p.peekTokenIs(lexer.COMMA) {
463→ p.nextToken() // consume comma
464→ p.nextToken() // move to next expression
465→ stmt.Values = append(stmt.Values, p.parseExpression(LOWEST))
466→ }
467→
468→ return stmt
469→}
470→
471→func (p *Parser) parseArrayDecl() ast.Statement {
472→ lbracketToken := p.curToken
473→
474→ // Parse size expression after [
475→ p.nextToken()
476→ size := p.parseExpression(LOWEST)
477→
478→ // Check if this is a slice assignment: [arr FROM start FOR length] := value
479→ // Also handles [arr FOR length] shorthand (FROM 0)
480→ if p.peekTokenIs(lexer.FROM) || p.peekTokenIs(lexer.FOR) {
481→ return p.parseSliceAssignment(lbracketToken, size)
482→ }
483→
484→ // Expect ]
485→ if !p.expectPeek(lexer.RBRACKET) {
486→ return nil
487→ }
488→
489→ // Check if this is a channel array: [n]CHAN OF TYPE
490→ if p.peekTokenIs(lexer.CHAN) {
491→ p.nextToken() // move to CHAN
492→ chanDecl := &ast.ChanDecl{
493→ Token: p.curToken,
494→ IsArray: true,
495→ Size: size,
496→ }
497→
498→ // Expect OF (optional — CHAN BYTE is shorthand for CHAN OF BYTE)
499→ if p.peekTokenIs(lexer.OF) {
500→ p.nextToken() // consume OF
501→ }
502→
503→ // Expect type (INT, BYTE, BOOL, etc.) or protocol name (IDENT)
504→ p.nextToken()
505→ if p.curTokenIs(lexer.INT_TYPE) || p.curTokenIs(lexer.BYTE_TYPE) ||
506→ p.curTokenIs(lexer.BOOL_TYPE) || p.curTokenIs(lexer.REAL_TYPE) ||
507→ p.curTokenIs(lexer.REAL32_TYPE) || p.curTokenIs(lexer.REAL64_TYPE) {
508→ chanDecl.ElemType = p.curToken.Literal
509→ } else if p.curTokenIs(lexer.IDENT) {
510→ chanDecl.ElemType = p.curToken.Literal
511→ } else {
512→ p.addError(fmt.Sprintf("expected type after CHAN, got %s", p.curToken.Type))
513→ return nil
514→ }
515→
516→ // Parse channel names
517→ for {
518→ if !p.expectPeek(lexer.IDENT) {
519→ return nil
520→ }
521→ chanDecl.Names = append(chanDecl.Names, p.curToken.Literal)
522→
523→ if p.peekTokenIs(lexer.COMMA) {
524→ p.nextToken() // consume comma
525→ } else {
526→ break
527→ }
528→ }
529→
530→ if !p.expectPeek(lexer.COLON) {
531→ return nil
532→ }
533→
534→ return chanDecl
535→ }
536→
537→ // Regular array declaration
538→ decl := &ast.ArrayDecl{Token: lbracketToken, Size: size}
539→
540→ // Expect type (INT, BYTE, BOOL, REAL, REAL32, REAL64)
541→ p.nextToken()
542→ if !p.curTokenIs(lexer.INT_TYPE) && !p.curTokenIs(lexer.BYTE_TYPE) &&
543→ !p.curTokenIs(lexer.BOOL_TYPE) && !p.curTokenIs(lexer.REAL_TYPE) &&
544→ !p.curTokenIs(lexer.REAL32_TYPE) && !p.curTokenIs(lexer.REAL64_TYPE) {
545→ p.addError(fmt.Sprintf("expected type after array size, got %s", p.curToken.Type))
546→ return nil
547→ }
548→ decl.Type = p.curToken.Literal
549→
550→ // Parse variable names
551→ for {
552→ if !p.expectPeek(lexer.IDENT) {
553→ return nil
554→ }
555→ decl.Names = append(decl.Names, p.curToken.Literal)
556→
557→ if p.peekTokenIs(lexer.COMMA) {
558→ p.nextToken() // consume comma
559→ } else {
560→ break
561→ }
562→ }
563→
564→ if !p.expectPeek(lexer.COLON) {
565→ return nil
566→ }
567→
568→ return decl
569→}
570→
571→// parseSliceAssignment parses [arr FROM start FOR length] := value
572→// Also handles [arr FOR length] shorthand (start defaults to 0).
573→// Called from parseArrayDecl when FROM or FOR is detected after the array expression.
574→// lbracketToken is the [ token, arrayExpr is the already-parsed array expression.
575→func (p *Parser) parseSliceAssignment(lbracketToken lexer.Token, arrayExpr ast.Expression) ast.Statement {
576→ var startExpr ast.Expression
577→ if p.peekTokenIs(lexer.FOR) {
578→ // [arr FOR length] shorthand — start is 0
579→ startExpr = &ast.IntegerLiteral{Token: lexer.Token{Type: lexer.INT, Literal: "0"}, Value: 0}
580→ } else {
581→ p.nextToken() // consume FROM
582→ p.nextToken() // move to start expression
583→ startExpr = p.parseExpression(LOWEST)
584→ }
585→
586→ if !p.expectPeek(lexer.FOR) {
587→ return nil
588→ }
589→ p.nextToken() // move to length expression
590→ lengthExpr := p.parseExpression(LOWEST)
591→
592→ if !p.expectPeek(lexer.RBRACKET) {
593→ return nil
594→ }
595→
596→ if !p.expectPeek(lexer.ASSIGN) {
597→ return nil
598→ }
599→
600→ assignToken := p.curToken
601→ p.nextToken() // move past :=
602→
603→ value := p.parseExpression(LOWEST)
604→
605→ return &ast.Assignment{
606→ Token: assignToken,
607→ SliceTarget: &ast.SliceExpr{
608→ Token: lbracketToken,
609→ Array: arrayExpr,
610→ Start: startExpr,
611→ Length: lengthExpr,
612→ },
613→ Value: value,
614→ }
615→}
616→
617→func (p *Parser) parseIndexedOperation() ast.Statement {
618→ name := p.curToken.Literal
619→
620→ p.nextToken() // move to [
621→ p.nextToken() // move past [
622→ index := p.parseExpression(LOWEST)
623→
624→ // Expect ]
625→ if !p.expectPeek(lexer.RBRACKET) {
626→ return nil
627→ }
628→
629→ // Check what follows ]
630→ if p.peekTokenIs(lexer.COMMA) {
631→ // Multi-assignment starting with indexed target: name[index], ... := ...
632→ firstTarget := ast.MultiAssignTarget{Name: name, Index: index}
633→ return p.parseMultiAssignmentFrom(firstTarget)
634→ }
635→ if p.peekTokenIs(lexer.ASSIGN) {
636→ // Indexed assignment: name[index] := value
637→ p.nextToken() // move to :=
638→ stmt := &ast.Assignment{
639→ Name: name,
640→ Token: p.curToken,
641→ Index: index,
642→ }
643→ p.nextToken() // move past :=
644→ stmt.Value = p.parseExpression(LOWEST)
645→ return stmt
646→ }
647→
648→ if p.peekTokenIs(lexer.SEND) {
649→ // Indexed channel send: cs[i] ! value
650→ p.nextToken() // move to !
651→ sendToken := p.curToken
652→ p.nextToken() // move past !
653→
654→ stmt := &ast.Send{
655→ Token: sendToken,
656→ Channel: name,
657→ ChannelIndex: index,
658→ }
659→
660→ // Check if this is a variant send: first token is an identifier that is a variant tag
661→ if p.curTokenIs(lexer.IDENT) && p.peekTokenIs(lexer.SEMICOLON) {
662→ possibleTag := p.curToken.Literal
663→ if p.isVariantTag(possibleTag) {
664→ stmt.VariantTag = possibleTag
665→ p.nextToken() // move to ;
666→ for p.curTokenIs(lexer.SEMICOLON) {
667→ p.nextToken() // move past ;
668→ val := p.parseExpression(LOWEST)
669→ stmt.Values = append(stmt.Values, val)
670→ }
671→ return stmt
672→ }
673→ }
674→
675→ stmt.Value = p.parseExpression(LOWEST)
676→
677→ // Check for sequential send
678→ for p.peekTokenIs(lexer.SEMICOLON) {
679→ p.nextToken() // move to ;
680→ p.nextToken() // move past ;
681→ val := p.parseExpression(LOWEST)
682→ stmt.Values = append(stmt.Values, val)
683→ }
684→
685→ return stmt
686→ }
687→
688→ if p.peekTokenIs(lexer.RECEIVE) {
689→ // Indexed channel receive: cs[i] ? x or cs[i] ? CASE ...
690→ p.nextToken() // move to ?
691→ recvToken := p.curToken
692→
693→ // Check for variant receive: cs[i] ? CASE
694→ if p.peekTokenIs(lexer.CASE) {
695→ p.nextToken() // move to CASE
696→ return p.parseVariantReceiveWithIndex(name, index, recvToken)
697→ }
698→
699→ stmt := &ast.Receive{
700→ Token: recvToken,
701→ Channel: name,
702→ ChannelIndex: index,
703→ }
704→
705→ if !p.expectPeek(lexer.IDENT) {
706→ return nil
707→ }
708→ stmt.Variable = p.curToken.Literal
709→
710→ // Check for sequential receive
711→ for p.peekTokenIs(lexer.SEMICOLON) {
712→ p.nextToken() // move to ;
713→ if !p.expectPeek(lexer.IDENT) {
714→ return nil
715→ }
716→ stmt.Variables = append(stmt.Variables, p.curToken.Literal)
717→ }
718→
719→ return stmt
720→ }
721→
722→ // Default: treat as indexed assignment (shouldn't reach here normally)
723→ p.addError(fmt.Sprintf("expected :=, !, or ? after %s[...], got %s", name, p.peekToken.Type))
724→ return nil
725→}
726→
727→func (p *Parser) parseIndexExpression(left ast.Expression) *ast.IndexExpr {
728→ expr := &ast.IndexExpr{
729→ Token: p.curToken,
730→ Left: left,
731→ }
732→
733→ p.nextToken() // move past [
734→ expr.Index = p.parseExpression(LOWEST)
735→
736→ if !p.expectPeek(lexer.RBRACKET) {
737→ return nil
738→ }
739→
740→ return expr
741→}
742→
743→func (p *Parser) parseChanDecl() *ast.ChanDecl {
744→ decl := &ast.ChanDecl{Token: p.curToken}
745→
746→ // Expect OF (optional — CHAN BYTE is shorthand for CHAN OF BYTE)
747→ if p.peekTokenIs(lexer.OF) {
748→ p.nextToken() // consume OF
749→ }
750→
751→ // Expect type (INT, BYTE, BOOL, etc.) or protocol name (IDENT)
752→ p.nextToken()
753→ if p.curTokenIs(lexer.INT_TYPE) || p.curTokenIs(lexer.BYTE_TYPE) ||
754→ p.curTokenIs(lexer.BOOL_TYPE) || p.curTokenIs(lexer.REAL_TYPE) ||
755→ p.curTokenIs(lexer.REAL32_TYPE) || p.curTokenIs(lexer.REAL64_TYPE) {
756→ decl.ElemType = p.curToken.Literal
757→ } else if p.curTokenIs(lexer.IDENT) {
758→ decl.ElemType = p.curToken.Literal
759→ } else {
760→ p.addError(fmt.Sprintf("expected type after CHAN, got %s", p.curToken.Type))
761→ return nil
762→ }
763→
764→ // Parse channel names
765→ for {
766→ if !p.expectPeek(lexer.IDENT) {
767→ return nil
768→ }
769→ decl.Names = append(decl.Names, p.curToken.Literal)
770→
771→ if p.peekTokenIs(lexer.COMMA) {
772→ p.nextToken() // consume comma
773→ } else {
774→ break
775→ }
776→ }
777→
778→ if !p.expectPeek(lexer.COLON) {
779→ return nil
780→ }
781→
782→ return decl
783→}
784→
785→func (p *Parser) parseProtocolDecl() *ast.ProtocolDecl {
786→ decl := &ast.ProtocolDecl{Token: p.curToken}
787→
788→ // Expect protocol name
789→ if !p.expectPeek(lexer.IDENT) {
790→ return nil
791→ }
792→ decl.Name = p.curToken.Literal
793→
794→ // Check if this is IS form (simple/sequential) or CASE form (variant)
795→ if p.peekTokenIs(lexer.NEWLINE) || p.peekTokenIs(lexer.INDENT) {
796→ // Could be variant: PROTOCOL NAME \n INDENT CASE ...
797→ // Skip newlines
798→ for p.peekTokenIs(lexer.NEWLINE) {
799→ p.nextToken()
800→ }
801→
802→ if p.peekTokenIs(lexer.INDENT) {
803→ p.nextToken() // consume INDENT
804→ p.nextToken() // move into block
805→
806→ if p.curTokenIs(lexer.CASE) {
807→ // Variant protocol
808→ decl.Kind = "variant"
809→ decl.Variants = p.parseProtocolVariants()
810→ p.protocolNames[decl.Name] = true
811→ p.protocolDefs[decl.Name] = decl
812→ return decl
813→ }
814→ }
815→
816→ p.addError("expected IS or CASE in protocol declaration")
817→ return nil
818→ }
819→
820→ // IS form: PROTOCOL NAME IS TYPE [; TYPE]*
821→ if !p.expectPeek(lexer.IS) {
822→ return nil
823→ }
824→
825→ // Parse type list
826→ p.nextToken()
827→ typeName := p.parseProtocolTypeName()
828→ if typeName == "" {
829→ return nil
830→ }
831→ decl.Types = append(decl.Types, typeName)
832→
833→ // Check for sequential: ; TYPE
834→ for p.peekTokenIs(lexer.SEMICOLON) {
835→ p.nextToken() // move to ;
836→ p.nextToken() // move past ;
837→ typeName = p.parseProtocolTypeName()
838→ if typeName == "" {
839→ return nil
840→ }
841→ decl.Types = append(decl.Types, typeName)
842→ }
843→
844→ if len(decl.Types) == 1 {
845→ decl.Kind = "simple"
846→ } else {
847→ decl.Kind = "sequential"
848→ }
849→
850→ p.protocolNames[decl.Name] = true
851→ p.protocolDefs[decl.Name] = decl
852→ return decl
853→}
854→
855→func (p *Parser) parseProtocolTypeName() string {
856→ switch p.curToken.Type {
857→ case lexer.INT_TYPE:
858→ return "INT"
859→ case lexer.BYTE_TYPE:
860→ return "BYTE"
861→ case lexer.BOOL_TYPE:
862→ return "BOOL"
863→ case lexer.REAL_TYPE:
864→ return "REAL"
865→ case lexer.REAL32_TYPE:
866→ return "REAL32"
867→ case lexer.REAL64_TYPE:
868→ return "REAL64"
869→ case lexer.IDENT:
870→ return p.curToken.Literal
871→ default:
872→ p.addError(fmt.Sprintf("expected type name in protocol, got %s", p.curToken.Type))
873→ return ""
874→ }
875→}
876→
877→func (p *Parser) parseProtocolVariants() []ast.ProtocolVariant {
878→ var variants []ast.ProtocolVariant
879→
880→ // Skip to next line after CASE
881→ for p.peekTokenIs(lexer.NEWLINE) {
882→ p.nextToken()
883→ }
884→
885→ // Expect INDENT
886→ if !p.peekTokenIs(lexer.INDENT) {
887→ p.addError("expected indented block after CASE in protocol")
888→ return variants
889→ }
890→ p.nextToken() // consume INDENT
891→ startLevel := p.indentLevel
892→ p.nextToken() // move into block
893→
894→ for !p.curTokenIs(lexer.EOF) {
895→ // Skip newlines
896→ for p.curTokenIs(lexer.NEWLINE) {
897→ p.nextToken()
898→ }
899→
900→ // Handle DEDENT tokens
901→ for p.curTokenIs(lexer.DEDENT) {
902→ if p.indentLevel < startLevel {
903→ return variants
904→ }
905→ p.nextToken()
906→ }
907→
908→ // Skip any more newlines after DEDENT
909→ for p.curTokenIs(lexer.NEWLINE) {
910→ p.nextToken()
911→ }
912→
913→ if p.curTokenIs(lexer.EOF) {
914→ break
915→ }
916→
917→ if p.indentLevel < startLevel {
918→ break
919→ }
920→
921→ // Parse variant: tag [; TYPE]*
922→ if !p.curTokenIs(lexer.IDENT) {
923→ p.addError(fmt.Sprintf("expected variant tag name, got %s", p.curToken.Type))
924→ return variants
925→ }
926→
927→ v := ast.ProtocolVariant{
928→ Tag: p.curToken.Literal,
929→ }
930→
931→ // Parse optional types after semicolons
932→ for p.peekTokenIs(lexer.SEMICOLON) {
933→ p.nextToken() // move to ;
934→ p.nextToken() // move past ;
935→ typeName := p.parseProtocolTypeName()
936→ if typeName == "" {
937→ return variants
938→ }
939→ v.Types = append(v.Types, typeName)
940→ }
941→
942→ variants = append(variants, v)
943→
944→ // Advance past newline if needed
945→ if !p.curTokenIs(lexer.NEWLINE) && !p.curTokenIs(lexer.DEDENT) && !p.curTokenIs(lexer.EOF) {
946→ p.nextToken()
947→ }
948→ }
949→
950→ return variants
951→}
952→
953→func (p *Parser) parseRecordDecl() *ast.RecordDecl {
954→ decl := &ast.RecordDecl{Token: p.curToken}
955→
956→ // Expect record name
957→ if !p.expectPeek(lexer.IDENT) {
958→ return nil
959→ }
960→ decl.Name = p.curToken.Literal
961→
962→ // Skip newlines
963→ for p.peekTokenIs(lexer.NEWLINE) {
964→ p.nextToken()
965→ }
966→
967→ // Expect INDENT for field block
968→ if !p.peekTokenIs(lexer.INDENT) {
969→ p.addError("expected indented block after RECORD declaration")
970→ return nil
971→ }
972→ p.nextToken() // consume INDENT
973→ startLevel := p.indentLevel
974→ p.nextToken() // move into block
975→
976→ // Parse field declarations: TYPE name[, name]*:
977→ for !p.curTokenIs(lexer.EOF) {
978→ // Skip newlines
979→ for p.curTokenIs(lexer.NEWLINE) {
980→ p.nextToken()
981→ }
982→
983→ // Handle DEDENT tokens
984→ for p.curTokenIs(lexer.DEDENT) {
985→ if p.indentLevel < startLevel {
986→ p.recordNames[decl.Name] = true
987→ p.recordDefs[decl.Name] = decl
988→ return decl
989→ }
990→ p.nextToken()
991→ }
992→
993→ // Skip any more newlines after DEDENT
994→ for p.curTokenIs(lexer.NEWLINE) {
995→ p.nextToken()
996→ }
997→
998→ if p.curTokenIs(lexer.EOF) {
999→ break
1000→ }
1001→
1002→ if p.indentLevel < startLevel {
1003→ break
1004→ }
1005→
1006→ // Expect a type keyword (INT, BYTE, BOOL, REAL, REAL32, REAL64)
1007→ if !p.curTokenIs(lexer.INT_TYPE) && !p.curTokenIs(lexer.BYTE_TYPE) &&
1008→ !p.curTokenIs(lexer.BOOL_TYPE) && !p.curTokenIs(lexer.REAL_TYPE) &&
1009→ !p.curTokenIs(lexer.REAL32_TYPE) && !p.curTokenIs(lexer.REAL64_TYPE) {
1010→ p.addError(fmt.Sprintf("expected type in record field, got %s", p.curToken.Type))
1011→ return nil
1012→ }
1013→ fieldType := p.curToken.Literal
1014→
1015→ // Parse field names (comma-separated)
1016→ for {
1017→ if !p.expectPeek(lexer.IDENT) {
1018→ return nil
1019→ }
1020→ decl.Fields = append(decl.Fields, ast.RecordField{
1021→ Type: fieldType,
1022→ Name: p.curToken.Literal,
1023→ })
1024→
1025→ if p.peekTokenIs(lexer.COMMA) {
1026→ p.nextToken() // consume comma
1027→ } else {
1028→ break
1029→ }
1030→ }
1031→
1032→ // Expect colon
1033→ if !p.expectPeek(lexer.COLON) {
1034→ return nil
1035→ }
1036→
1037→ // Advance past newline if needed
1038→ if !p.curTokenIs(lexer.NEWLINE) && !p.curTokenIs(lexer.DEDENT) && !p.curTokenIs(lexer.EOF) {
1039→ p.nextToken()
1040→ }
1041→ }
1042→
1043→ p.recordNames[decl.Name] = true
1044→ p.recordDefs[decl.Name] = decl
1045→ return decl
1046→}
1047→
1048→func (p *Parser) parseRecordVarDecl() *ast.VarDecl {
1049→ decl := &ast.VarDecl{
1050→ Token: p.curToken,
1051→ Type: p.curToken.Literal,
1052→ }
1053→
1054→ // Parse variable names
1055→ for {
1056→ if !p.expectPeek(lexer.IDENT) {
1057→ return nil
1058→ }
1059→ decl.Names = append(decl.Names, p.curToken.Literal)
1060→
1061→ if p.peekTokenIs(lexer.COMMA) {
1062→ p.nextToken() // consume comma
1063→ } else {
1064→ break
1065→ }
1066→ }
1067→
1068→ if !p.expectPeek(lexer.COLON) {
1069→ return nil
1070→ }
1071→
1072→ return decl
1073→}
1074→
1075→func (p *Parser) parseTimerDecl() *ast.TimerDecl {
1076→ decl := &ast.TimerDecl{Token: p.curToken}
1077→
1078→ // Parse timer names
1079→ for {
1080→ if !p.expectPeek(lexer.IDENT) {
1081→ return nil
1082→ }
1083→ decl.Names = append(decl.Names, p.curToken.Literal)
1084→ p.timerNames[p.curToken.Literal] = true
1085→
1086→ if p.peekTokenIs(lexer.COMMA) {
1087→ p.nextToken() // consume comma
1088→ } else {
1089→ break
1090→ }
1091→ }
1092→
1093→ if !p.expectPeek(lexer.COLON) {
1094→ return nil
1095→ }
1096→
1097→ return decl
1098→}
1099→
1100→func (p *Parser) parseTimerRead() *ast.TimerRead {
1101→ stmt := &ast.TimerRead{
1102→ Timer: p.curToken.Literal,
1103→ }
1104→
1105→ p.nextToken() // move to ?
1106→ stmt.Token = p.curToken
1107→
1108→ if !p.expectPeek(lexer.IDENT) {
1109→ return nil
1110→ }
1111→ stmt.Variable = p.curToken.Literal
1112→
1113→ return stmt
1114→}
1115→
1116→func (p *Parser) parseSend() *ast.Send {
1117→ stmt := &ast.Send{
1118→ Channel: p.curToken.Literal,
1119→ }
1120→
1121→ p.nextToken() // move to !
1122→ stmt.Token = p.curToken
1123→
1124→ p.nextToken() // move past !
1125→
1126→ // Check if this is a variant send: first token is an identifier that is a variant tag
1127→ // We detect this by checking if the identifier is followed by SEMICOLON
1128→ // and the identifier is NOT followed by an operator (i.e., it's a bare tag name)
1129→ if p.curTokenIs(lexer.IDENT) && p.peekTokenIs(lexer.SEMICOLON) {
1130→ // Could be variant send (tag ; values) or expression ; values
1131→ // Check if the identifier is a known protocol variant tag
1132→ // For simplicity, if IDENT is followed by SEMICOLON and the ident is
1133→ // not followed by an operator, treat it as a variant tag
1134→ // We save the ident and check further
1135→ possibleTag := p.curToken.Literal
1136→ // Check if this identifier is a protocol variant tag
1137→ if p.isVariantTag(possibleTag) {
1138→ stmt.VariantTag = possibleTag
1139→ p.nextToken() // move to ;
1140→ // Parse remaining values after the tag
1141→ for p.curTokenIs(lexer.SEMICOLON) {
1142→ p.nextToken() // move past ;
1143→ val := p.parseExpression(LOWEST)
1144→ stmt.Values = append(stmt.Values, val)
1145→ }
1146→ return stmt
1147→ }
1148→ }
1149→
1150→ stmt.Value = p.parseExpression(LOWEST)
1151→
1152→ // Check for sequential send: c ! expr ; expr ; ...
1153→ for p.peekTokenIs(lexer.SEMICOLON) {
1154→ p.nextToken() // move to ;
1155→ p.nextToken() // move past ;
1156→ val := p.parseExpression(LOWEST)
1157→ stmt.Values = append(stmt.Values, val)
1158→ }
1159→
1160→ return stmt
1161→}
1162→
1163→func (p *Parser) isVariantTag(name string) bool {
1164→ for _, proto := range p.protocolDefs {
1165→ if proto.Kind == "variant" {
1166→ for _, v := range proto.Variants {
1167→ if v.Tag == name {
1168→ return true
1169→ }
1170→ }
1171→ }
1172→ }
1173→ return false
1174→}
1175→
1176→func (p *Parser) parseReceive() ast.Statement {
1177→ channel := p.curToken.Literal
1178→
1179→ p.nextToken() // move to ?
1180→ recvToken := p.curToken
1181→
1182→ // Check for variant receive: c ? CASE
1183→ if p.peekTokenIs(lexer.CASE) {
1184→ p.nextToken() // move to CASE
1185→ return p.parseVariantReceive(channel, recvToken)
1186→ }
1187→
1188→ stmt := &ast.Receive{
1189→ Channel: channel,
1190→ Token: recvToken,
1191→ }
1192→
1193→ if !p.expectPeek(lexer.IDENT) {
1194→ return nil
1195→ }
1196→ stmt.Variable = p.curToken.Literal
1197→
1198→ // Check for sequential receive: c ? x ; y ; z
1199→ for p.peekTokenIs(lexer.SEMICOLON) {
1200→ p.nextToken() // move to ;
1201→ if !p.expectPeek(lexer.IDENT) {
1202→ return nil
1203→ }
1204→ stmt.Variables = append(stmt.Variables, p.curToken.Literal)
1205→ }
1206→
1207→ return stmt
1208→}
1209→
1210→func (p *Parser) parseVariantReceive(channel string, token lexer.Token) *ast.VariantReceive {
1211→ stmt := &ast.VariantReceive{
1212→ Token: token,
1213→ Channel: channel,
1214→ }
1215→
1216→ // Skip to next line
1217→ for p.peekTokenIs(lexer.NEWLINE) {
1218→ p.nextToken()
1219→ }
1220→
1221→ // Expect INDENT
1222→ if !p.peekTokenIs(lexer.INDENT) {
1223→ p.addError("expected indented block after ? CASE")
1224→ return stmt
1225→ }
1226→ p.nextToken() // consume INDENT
1227→ startLevel := p.indentLevel
1228→ p.nextToken() // move into block
1229→
1230→ // Parse variant cases (similar to parseCaseStatement pattern)
1231→ for !p.curTokenIs(lexer.EOF) {
1232→ // Skip newlines
1233→ for p.curTokenIs(lexer.NEWLINE) {
1234→ p.nextToken()
1235→ }
1236→
1237→ // Handle DEDENT tokens
1238→ for p.curTokenIs(lexer.DEDENT) {
1239→ if p.indentLevel < startLevel {
1240→ return stmt
1241→ }
1242→ p.nextToken()
1243→ }
1244→
1245→ // Skip any more newlines after DEDENT
1246→ for p.curTokenIs(lexer.NEWLINE) {
1247→ p.nextToken()
1248→ }
1249→
1250→ if p.curTokenIs(lexer.EOF) {
1251→ break
1252→ }
1253→
1254→ if p.indentLevel < startLevel {
1255→ break
1256→ }
1257→
1258→ // Safety guard: record position before parsing to detect no-progress
1259→ prevToken := p.curToken
1260→ prevPeek := p.peekToken
1261→
1262→ // Parse a variant case: tag [; var]* \n INDENT body
1263→ vc := ast.VariantCase{}
1264→
1265→ if !p.curTokenIs(lexer.IDENT) {
1266→ p.addError(fmt.Sprintf("expected variant tag name, got %s", p.curToken.Type))
1267→ p.nextToken() // skip unrecognized token to avoid infinite loop
1268→ continue
1269→ }
1270→ vc.Tag = p.curToken.Literal
1271→
1272→ // Parse optional variables after semicolons: tag ; x ; y
1273→ for p.peekTokenIs(lexer.SEMICOLON) {
1274→ p.nextToken() // move to ;
1275→ if !p.expectPeek(lexer.IDENT) {
1276→ return stmt
1277→ }
1278→ vc.Variables = append(vc.Variables, p.curToken.Literal)
1279→ }
1280→
1281→ // Skip newlines and expect INDENT for body
1282→ for p.peekTokenIs(lexer.NEWLINE) {
1283→ p.nextToken()
1284→ }
1285→
1286→ if p.peekTokenIs(lexer.INDENT) {
1287→ p.nextToken() // consume INDENT
1288→ p.nextToken() // move to body
1289→ vc.Body = p.parseStatement()
1290→
1291→ // Advance past the last token of the statement if needed
1292→ if !p.curTokenIs(lexer.NEWLINE) && !p.curTokenIs(lexer.DEDENT) && !p.curTokenIs(lexer.EOF) {
1293→ p.nextToken()
1294→ }
1295→ }
1296→
1297→ stmt.Cases = append(stmt.Cases, vc)
1298→
1299→ // No-progress guard: if we haven't moved, break to prevent infinite loop
1300→ if p.curToken == prevToken && p.peekToken == prevPeek {
1301→ p.nextToken() // force progress
1302→ if p.curToken == prevToken {
1303→ break
1304→ }
1305→ }
1306→ }
1307→
1308→ return stmt
1309→}
1310→
1311→func (p *Parser) parseVariantReceiveWithIndex(channel string, channelIndex ast.Expression, token lexer.Token) *ast.VariantReceive {
1312→ stmt := &ast.VariantReceive{
1313→ Token: token,
1314→ Channel: channel,
1315→ ChannelIndex: channelIndex,
1316→ }
1317→
1318→ // Skip to next line
1319→ for p.peekTokenIs(lexer.NEWLINE) {
1320→ p.nextToken()
1321→ }
1322→
1323→ // Expect INDENT
1324→ if !p.peekTokenIs(lexer.INDENT) {
1325→ p.addError("expected indented block after ? CASE")
1326→ return stmt
1327→ }
1328→ p.nextToken() // consume INDENT
1329→ startLevel := p.indentLevel
1330→ p.nextToken() // move into block
1331→
1332→ for !p.curTokenIs(lexer.EOF) {
1333→ for p.curTokenIs(lexer.NEWLINE) {
1334→ p.nextToken()
1335→ }
1336→
1337→ for p.curTokenIs(lexer.DEDENT) {
1338→ if p.indentLevel < startLevel {
1339→ return stmt
1340→ }
1341→ p.nextToken()
1342→ }
1343→
1344→ for p.curTokenIs(lexer.NEWLINE) {
1345→ p.nextToken()
1346→ }
1347→
1348→ if p.curTokenIs(lexer.EOF) {
1349→ break
1350→ }
1351→
1352→ if p.indentLevel < startLevel {
1353→ break
1354→ }
1355→
1356→ // Safety guard: record position before parsing to detect no-progress
1357→ prevToken := p.curToken
1358→ prevPeek := p.peekToken
1359→
1360→ vc := ast.VariantCase{}
1361→
1362→ if !p.curTokenIs(lexer.IDENT) {
1363→ p.addError(fmt.Sprintf("expected variant tag name, got %s", p.curToken.Type))
1364→ p.nextToken() // skip unrecognized token to avoid infinite loop
1365→ continue
1366→ }
1367→ vc.Tag = p.curToken.Literal
1368→
1369→ for p.peekTokenIs(lexer.SEMICOLON) {
1370→ p.nextToken() // move to ;
1371→ if !p.expectPeek(lexer.IDENT) {
1372→ return stmt
1373→ }
1374→ vc.Variables = append(vc.Variables, p.curToken.Literal)
1375→ }
1376→
1377→ for p.peekTokenIs(lexer.NEWLINE) {
1378→ p.nextToken()
1379→ }
1380→
1381→ if p.peekTokenIs(lexer.INDENT) {
1382→ p.nextToken() // consume INDENT
1383→ p.nextToken() // move to body
1384→ vc.Body = p.parseStatement()
1385→
1386→ if !p.curTokenIs(lexer.NEWLINE) && !p.curTokenIs(lexer.DEDENT) && !p.curTokenIs(lexer.EOF) {
1387→ p.nextToken()
1388→ }
1389→ }
1390→
1391→ stmt.Cases = append(stmt.Cases, vc)
1392→
1393→ // No-progress guard: if we haven't moved, break to prevent infinite loop
1394→ if p.curToken == prevToken && p.peekToken == prevPeek {
1395→ p.nextToken() // force progress
1396→ if p.curToken == prevToken {
1397→ break
1398→ }
1399→ }
1400→ }
1401→
1402→ return stmt
1403→}
1404→
1405→func (p *Parser) parseSeqBlock() *ast.SeqBlock {
1406→ block := &ast.SeqBlock{Token: p.curToken}
1407→
1408→ // Check for replicator: SEQ i = start FOR count
1409→ if p.peekTokenIs(lexer.IDENT) {
1410→ // Save position to check if it's a replicator
1411→ p.nextToken() // move to identifier
1412→ if p.peekTokenIs(lexer.EQ) {
1413→ // This is a replicator
1414→ block.Replicator = p.parseReplicator()
1415→ } else {
1416→ // Not a replicator, this shouldn't happen in valid Occam
1417→ // (SEQ followed by identifier at same indentation level)
1418→ p.addError("unexpected identifier after SEQ")
1419→ return block
1420→ }
1421→ }
1422→
1423→ // Skip to next line
1424→ for p.peekTokenIs(lexer.NEWLINE) {
1425→ p.nextToken()
1426→ }
1427→
1428→ // Expect INDENT
1429→ if !p.peekTokenIs(lexer.INDENT) {
1430→ p.addError("expected indented block after SEQ")
1431→ return block
1432→ }
1433→ p.nextToken() // consume INDENT
1434→
1435→ block.Statements = p.parseBlockStatements()
1436→
1437→ return block
1438→}
1439→
1440→func (p *Parser) parseParBlock() *ast.ParBlock {
1441→ block := &ast.ParBlock{Token: p.curToken}
1442→
1443→ // Check for replicator: PAR i = start FOR count
1444→ if p.peekTokenIs(lexer.IDENT) {
1445→ // Save position to check if it's a replicator
1446→ p.nextToken() // move to identifier
1447→ if p.peekTokenIs(lexer.EQ) {
1448→ // This is a replicator
1449→ block.Replicator = p.parseReplicator()
1450→ } else {
1451→ // Not a replicator, this shouldn't happen in valid Occam
1452→ // (PAR followed by identifier at same indentation level)
1453→ p.addError("unexpected identifier after PAR")
1454→ return block
1455→ }
1456→ }
1457→
1458→ // Skip to next line
1459→ for p.peekTokenIs(lexer.NEWLINE) {
1460→ p.nextToken()
1461→ }
1462→
1463→ // Expect INDENT
1464→ if !p.peekTokenIs(lexer.INDENT) {
1465→ p.addError("expected indented block after PAR")
1466→ return block
1467→ }
1468→ p.nextToken() // consume INDENT
1469→
1470→ block.Statements = p.parseBlockStatements()
1471→
1472→ return block
1473→}
1474→
1475→// parseReplicator parses: variable = start FOR count [STEP step]
1476→// Assumes the variable identifier has already been consumed and is in curToken
1477→func (p *Parser) parseReplicator() *ast.Replicator {
1478→ rep := &ast.Replicator{
1479→ Variable: p.curToken.Literal,
1480→ }
1481→
1482→ // Expect =
1483→ if !p.expectPeek(lexer.EQ) {
1484→ return nil
1485→ }
1486→
1487→ // Parse start expression
1488→ p.nextToken()
1489→ rep.Start = p.parseExpression(LOWEST)
1490→
1491→ // Expect FOR
1492→ if !p.expectPeek(lexer.FOR) {
1493→ return nil
1494→ }
1495→
1496→ // Parse count expression
1497→ p.nextToken()
1498→ rep.Count = p.parseExpression(LOWEST)
1499→
1500→ // Optional STEP
1501→ if p.peekTokenIs(lexer.STEP) {
1502→ p.nextToken() // consume STEP
1503→ p.nextToken() // move to step expression
1504→ rep.Step = p.parseExpression(LOWEST)
1505→ }
1506→
1507→ return rep
1508→}
1509→
1510→func (p *Parser) parseAltBlock() *ast.AltBlock {
1511→ block := &ast.AltBlock{Token: p.curToken}
1512→
1513→ // Skip to next line
1514→ for p.peekTokenIs(lexer.NEWLINE) {
1515→ p.nextToken()
1516→ }
1517→
1518→ // Expect INDENT
1519→ if !p.peekTokenIs(lexer.INDENT) {
1520→ p.addError("expected indented block after ALT")
1521→ return block
1522→ }
1523→ p.nextToken() // consume INDENT
1524→
1525→ block.Cases = p.parseAltCases()
1526→
1527→ return block
1528→}
1529→
1530→func (p *Parser) parseAltCases() []ast.AltCase {
1531→ var cases []ast.AltCase
1532→ startLevel := p.indentLevel
1533→
1534→ p.nextToken() // move past INDENT
1535→
1536→ for !p.curTokenIs(lexer.EOF) {
1537→ // Skip newlines
1538→ for p.curTokenIs(lexer.NEWLINE) {
1539→ p.nextToken()
1540→ }
1541→
1542→ // Handle DEDENT tokens
1543→ for p.curTokenIs(lexer.DEDENT) {
1544→ if p.indentLevel < startLevel {
1545→ return cases
1546→ }
1547→ p.nextToken()
1548→ }
1549→
1550→ // Skip any more newlines after DEDENT
1551→ for p.curTokenIs(lexer.NEWLINE) {
1552→ p.nextToken()
1553→ }
1554→
1555→ if p.curTokenIs(lexer.EOF) {
1556→ break
1557→ }
1558→
1559→ if p.indentLevel < startLevel {
1560→ break
1561→ }
1562→
1563→ // Safety guard: record position before parsing to detect no-progress
1564→ prevToken := p.curToken
1565→ prevPeek := p.peekToken
1566→
1567→ // Parse an ALT case: [guard &] channel ? var
1568→ altCase := p.parseAltCase()
1569→ if altCase != nil {
1570→ cases = append(cases, *altCase)
1571→ }
1572→
1573→ // No-progress guard: if we haven't moved, break to prevent infinite loop
1574→ if p.curToken == prevToken && p.peekToken == prevPeek {
1575→ break
1576→ }
1577→ }
1578→
1579→ return cases
1580→}
1581→
1582→func (p *Parser) parseAltCase() *ast.AltCase {
1583→ altCase := &ast.AltCase{}
1584→
1585→ // Check for guard: expression & channel ? var
1586→ // For now, we expect: channel ? var (no guard support yet)
1587→ // or: guard & channel ? var
1588→
1589→ // First token should be identifier (channel name or guard start)
1590→ if !p.curTokenIs(lexer.IDENT) && !p.curTokenIs(lexer.TRUE) && !p.curTokenIs(lexer.FALSE) {
1591→ p.addError(fmt.Sprintf("expected channel name or guard in ALT case, got %s", p.curToken.Type))
1592→ return nil
1593→ }
1594→
1595→ // Look ahead to determine if this is a guard or channel
1596→ // If next token is & then we have a guard
1597→ // If next token is ? then it's a channel/timer receive
1598→ if p.peekTokenIs(lexer.RECEIVE) {
1599→ name := p.curToken.Literal
1600→ if p.timerNames[name] {
1601→ // Timer case: tim ? AFTER deadline
1602→ altCase.IsTimer = true
1603→ altCase.Timer = name
1604→ p.nextToken() // move to ?
1605→ if !p.expectPeek(lexer.AFTER) {
1606→ return nil
1607→ }
1608→ p.nextToken() // move past AFTER
1609→ altCase.Deadline = p.parseExpression(LOWEST)
1610→ } else {
1611→ // Simple case: channel ? var
1612→ altCase.Channel = name
1613→ p.nextToken() // move to ?
1614→ if !p.expectPeek(lexer.IDENT) {
1615→ return nil
1616→ }
1617→ altCase.Variable = p.curToken.Literal
1618→ }
1619→ } else if p.peekTokenIs(lexer.LBRACKET) {
1620→ // Indexed channel case: cs[i] ? var
1621→ name := p.curToken.Literal
1622→ altCase.Channel = name
1623→ p.nextToken() // move to [
1624→ p.nextToken() // move past [
1625→ altCase.ChannelIndex = p.parseExpression(LOWEST)
1626→ if !p.expectPeek(lexer.RBRACKET) {
1627→ return nil
1628→ }
1629→ if !p.expectPeek(lexer.RECEIVE) {
1630→ return nil
1631→ }
1632→ if !p.expectPeek(lexer.IDENT) {
1633→ return nil
1634→ }
1635→ altCase.Variable = p.curToken.Literal
1636→ } else {
1637→ // Could be a guard followed by & channel ? var
1638→ // For simplicity, parse expression until we hit &
1639→ // For now, only support simple TRUE/FALSE or identifier guards
1640→ guard := p.parseExpression(LOWEST)
1641→ altCase.Guard = guard
1642→
1643→ // Expect &
1644→ if !p.peekTokenIs(lexer.AMPERSAND) {
1645→ p.addError("expected & after guard in ALT case")
1646→ return nil
1647→ }
1648→ p.nextToken() // move to &
1649→ p.nextToken() // move past &
1650→
1651→ // Now expect channel ? var or channel[index] ? var
1652→ if !p.curTokenIs(lexer.IDENT) {
1653→ p.addError(fmt.Sprintf("expected channel name after guard, got %s", p.curToken.Type))
1654→ return nil
1655→ }
1656→ altCase.Channel = p.curToken.Literal
1657→
1658→ if p.peekTokenIs(lexer.LBRACKET) {
1659→ // Indexed channel with guard: guard & cs[i] ? var
1660→ p.nextToken() // move to [
1661→ p.nextToken() // move past [
1662→ altCase.ChannelIndex = p.parseExpression(LOWEST)
1663→ if !p.expectPeek(lexer.RBRACKET) {
1664→ return nil
1665→ }
1666→ }
1667→
1668→ if !p.expectPeek(lexer.RECEIVE) {
1669→ return nil
1670→ }
1671→ if !p.expectPeek(lexer.IDENT) {
1672→ return nil
1673→ }
1674→ altCase.Variable = p.curToken.Literal
1675→ }
1676→
1677→ // Skip to next line for the body
1678→ for p.peekTokenIs(lexer.NEWLINE) {
1679→ p.nextToken()
1680→ }
1681→
1682→ // Expect INDENT for body
1683→ if !p.peekTokenIs(lexer.INDENT) {
1684→ p.addError("expected indented body after ALT case")
1685→ return altCase
1686→ }
1687→ p.nextToken() // consume INDENT
1688→ p.nextToken() // move into body
1689→
1690→ altCase.Body = p.parseBodyStatements()
1691→
1692→ return altCase
1693→}
1694→
1695→func (p *Parser) parseBlockStatements() []ast.Statement {
1696→ var statements []ast.Statement
1697→ startLevel := p.indentLevel
1698→
1699→ p.nextToken() // move past INDENT
1700→
1701→ for !p.curTokenIs(lexer.EOF) {
1702→ // Skip newlines
1703→ for p.curTokenIs(lexer.NEWLINE) {
1704→ p.nextToken()
1705→ }
1706→
1707→ // Handle DEDENT tokens
1708→ // If we're at a DEDENT and indentLevel has dropped below startLevel,
1709→ // this DEDENT ends our block - stop parsing
1710→ // If indentLevel >= startLevel, this DEDENT is from a nested block - skip it
1711→ for p.curTokenIs(lexer.DEDENT) {
1712→ if p.indentLevel < startLevel {
1713→ return statements
1714→ }
1715→ p.nextToken() // skip nested block's DEDENT
1716→ }
1717→
1718→ // Skip any more newlines after DEDENT
1719→ for p.curTokenIs(lexer.NEWLINE) {
1720→ p.nextToken()
1721→ }
1722→
1723→ if p.curTokenIs(lexer.EOF) {
1724→ break
1725→ }
1726→
1727→ // Double-check we haven't gone below our level
1728→ if p.indentLevel < startLevel {
1729→ break
1730→ }
1731→
1732→ stmt := p.parseStatement()
1733→ if stmt != nil {
1734→ statements = append(statements, stmt)
1735→ }
1736→
1737→ // After parsing a statement, we need to advance.
1738→ // But if we're already at NEWLINE/DEDENT/EOF, the next iteration will handle it.
1739→ // Only advance if we're still on the last token of the statement.
1740→ if !p.curTokenIs(lexer.NEWLINE) && !p.curTokenIs(lexer.DEDENT) && !p.curTokenIs(lexer.EOF) {
1741→ p.nextToken()
1742→ }
1743→ }
1744→
1745→ return statements
1746→}
1747→
1748→// parseBodyStatements parses multiple statements inside a branch body
1749→// (IF choice, CASE choice, ALT case, WHILE). Called after the caller has
1750→// consumed the INDENT token and advanced into the body.
1751→// Returns all statements found at this indentation level.
1752→func (p *Parser) parseBodyStatements() []ast.Statement {
1753→ var statements []ast.Statement
1754→ startLevel := p.indentLevel
1755→
1756→ for !p.curTokenIs(lexer.EOF) {
1757→ // Skip newlines
1758→ for p.curTokenIs(lexer.NEWLINE) {
1759→ p.nextToken()
1760→ }
1761→
1762→ // Handle DEDENT tokens
1763→ for p.curTokenIs(lexer.DEDENT) {
1764→ if p.indentLevel < startLevel {
1765→ return statements
1766→ }
1767→ p.nextToken()
1768→ }
1769→
1770→ // Skip any more newlines after DEDENT
1771→ for p.curTokenIs(lexer.NEWLINE) {
1772→ p.nextToken()
1773→ }
1774→
1775→ if p.curTokenIs(lexer.EOF) {
1776→ break
1777→ }
1778→
1779→ if p.indentLevel < startLevel {
1780→ break
1781→ }
1782→
1783→ // Safety guard: record position before parsing to detect no-progress
1784→ prevToken := p.curToken
1785→ prevPeek := p.peekToken
1786→
1787→ stmt := p.parseStatement()
1788→ if stmt != nil {
1789→ statements = append(statements, stmt)
1790→ }
1791→
1792→ // Advance past the last token of the statement if needed
1793→ if !p.curTokenIs(lexer.NEWLINE) && !p.curTokenIs(lexer.DEDENT) && !p.curTokenIs(lexer.EOF) {
1794→ p.nextToken()
1795→ }
1796→
1797→ // No-progress guard: if we haven't moved, break to prevent infinite loop
1798→ if p.curToken == prevToken && p.peekToken == prevPeek {
1799→ break
1800→ }
1801→ }
1802→
1803→ return statements
1804→}
1805→
1806→func (p *Parser) parseProcDecl() *ast.ProcDecl {
1807→ proc := &ast.ProcDecl{Token: p.curToken}
1808→
1809→ if !p.expectPeek(lexer.IDENT) {
1810→ return nil
1811→ }
1812→ proc.Name = p.curToken.Literal
1813→
1814→ if !p.expectPeek(lexer.LPAREN) {
1815→ return nil
1816→ }
1817→
1818→ proc.Params = p.parseProcParams()
1819→
1820→ if !p.expectPeek(lexer.RPAREN) {
1821→ return nil
1822→ }
1823→
1824→ // Skip to next line and expect indented body
1825→ for p.peekTokenIs(lexer.NEWLINE) {
1826→ p.nextToken()
1827→ }
1828→
1829→ if !p.peekTokenIs(lexer.INDENT) {
1830→ p.addError("expected indented body after PROC declaration")
1831→ return proc
1832→ }
1833→ p.nextToken() // consume INDENT
1834→
1835→ // Parse all statements in the body (local declarations + body process)
1836→ bodyLevel := p.indentLevel
1837→ p.nextToken()
1838→
1839→ for !p.curTokenIs(lexer.EOF) {
1840→ // Skip newlines
1841→ for p.curTokenIs(lexer.NEWLINE) {
1842→ p.nextToken()
1843→ }
1844→
1845→ // Handle DEDENTs
1846→ for p.curTokenIs(lexer.DEDENT) {
1847→ if p.indentLevel < bodyLevel {
1848→ goto procBodyDone
1849→ }
1850→ p.nextToken()
1851→ }
1852→
1853→ // Skip more newlines after DEDENT
1854→ for p.curTokenIs(lexer.NEWLINE) {
1855→ p.nextToken()
1856→ }
1857→
1858→ if p.curTokenIs(lexer.EOF) || p.indentLevel < bodyLevel {
1859→ break
1860→ }
1861→
1862→ stmt := p.parseStatement()
1863→ if stmt != nil {
1864→ proc.Body = append(proc.Body, stmt)
1865→ }
1866→
1867→ if !p.curTokenIs(lexer.NEWLINE) && !p.curTokenIs(lexer.DEDENT) && !p.curTokenIs(lexer.EOF) {
1868→ p.nextToken()
1869→ }
1870→ }
1871→procBodyDone:
1872→
1873→ // Optionally consume KRoC-style colon terminator
1874→ if p.peekTokenIs(lexer.COLON) {
1875→ p.nextToken()
1876→ }
1877→
1878→ return proc
1879→}
1880→
1881→// isTypeToken returns true if the token type is a scalar type keyword.
1882→func isTypeToken(t lexer.TokenType) bool {
1883→ return t == lexer.INT_TYPE || t == lexer.BYTE_TYPE ||
1884→ t == lexer.BOOL_TYPE || t == lexer.REAL_TYPE ||
1885→ t == lexer.REAL32_TYPE || t == lexer.REAL64_TYPE
1886→}
1887→
1888→func (p *Parser) parseProcParams() []ast.ProcParam {
1889→ var params []ast.ProcParam
1890→
1891→ if p.peekTokenIs(lexer.RPAREN) {
1892→ return params
1893→ }
1894→
1895→ p.nextToken()
1896→
1897→ // Track the previous param's type info for shared-type parameters
1898→ var prevParam *ast.ProcParam
1899→
1900→ for {
1901→ // Skip newlines inside parameter lists (multi-line params)
1902→ // Note: INDENT/DEDENT/NEWLINE inside (...) are suppressed by the lexer
1903→ for p.curTokenIs(lexer.NEWLINE) {
1904→ p.nextToken()
1905→ }
1906→
1907→ param := ast.ProcParam{}
1908→
1909→ // Check if this is a shared-type parameter: after a comma, if current token
1910→ // is an IDENT that is NOT a type keyword, record name, CHAN, VAL, RESULT, or [,
1911→ // re-use the previous param's type/flags.
1912→ if prevParam != nil && p.curTokenIs(lexer.IDENT) && !p.recordNames[p.curToken.Literal] {
1913→ // This is a shared-type param — re-use type info from previous param
1914→ param.IsVal = prevParam.IsVal
1915→ param.Type = prevParam.Type
1916→ param.IsChan = prevParam.IsChan
1917→ param.IsChanArray = prevParam.IsChanArray
1918→ param.IsOpenArray = prevParam.IsOpenArray
1919→ param.ChanElemType = prevParam.ChanElemType
1920→ param.ArraySize = prevParam.ArraySize
1921→ param.Name = p.curToken.Literal
1922→
1923→ // Check for channel direction marker (? or !)
1924→ if (param.IsChan || param.IsChanArray) && (p.peekTokenIs(lexer.RECEIVE) || p.peekTokenIs(lexer.SEND)) {
1925→ p.nextToken()
1926→ param.ChanDir = p.curToken.Literal
1927→ }
1928→
1929→ params = append(params, param)
1930→ prevParam = ¶ms[len(params)-1]
1931→
1932→ if !p.peekTokenIs(lexer.COMMA) {
1933→ break
1934→ }
1935→ p.nextToken() // consume comma
1936→ p.nextToken() // move to next param
1937→ continue
1938→ }
1939→
1940→ // Check for VAL keyword
1941→ if p.curTokenIs(lexer.VAL) {
1942→ param.IsVal = true
1943→ p.nextToken()
1944→ }
1945→
1946→ // Check for RESULT keyword (output-only parameter — maps to pointer like non-VAL)
1947→ if p.curTokenIs(lexer.RESULT) {
1948→ // RESULT is semantically like non-VAL (pointer param), just skip it
1949→ p.nextToken()
1950→ }
1951→
1952→ // Check for []CHAN OF <type>, []TYPE (open array), or [n]TYPE (fixed-size array)
1953→ if p.curTokenIs(lexer.LBRACKET) {
1954→ if p.peekTokenIs(lexer.RBRACKET) {
1955→ // Open array: []CHAN OF TYPE or []TYPE
1956→ p.nextToken() // consume ]
1957→ p.nextToken() // move past ]
1958→ if p.curTokenIs(lexer.CHAN) {
1959→ // []CHAN OF <type> or []CHAN <type> (channel array parameter)
1960→ param.IsChan = true
1961→ param.IsChanArray = true
1962→ if p.peekTokenIs(lexer.OF) {
1963→ p.nextToken() // consume OF
1964→ }
1965→ p.nextToken() // move to element type
1966→ if isTypeToken(p.curToken.Type) || p.curTokenIs(lexer.IDENT) {
1967→ param.ChanElemType = p.curToken.Literal
1968→ } else {
1969→ p.addError(fmt.Sprintf("expected type after []CHAN, got %s", p.curToken.Type))
1970→ return params
1971→ }
1972→ p.nextToken()
1973→ } else if isTypeToken(p.curToken.Type) {
1974→ param.IsOpenArray = true
1975→ param.Type = p.curToken.Literal
1976→ p.nextToken()
1977→ } else if p.curTokenIs(lexer.IDENT) && p.recordNames[p.curToken.Literal] {
1978→ param.IsOpenArray = true
1979→ param.Type = p.curToken.Literal
1980→ p.nextToken()
1981→ } else {
1982→ p.addError(fmt.Sprintf("expected type after [], got %s", p.curToken.Type))
1983→ return params
1984→ }
1985→ } else {
1986→ // Fixed-size array: [n]TYPE
1987→ p.nextToken() // move past [
1988→ if !p.curTokenIs(lexer.INT) {
1989→ p.addError(fmt.Sprintf("expected array size, got %s", p.curToken.Type))
1990→ return params
1991→ }
1992→ param.ArraySize = p.curToken.Literal
1993→ if !p.expectPeek(lexer.RBRACKET) {
1994→ return params
1995→ }
1996→ p.nextToken() // move to type
1997→ if isTypeToken(p.curToken.Type) {
1998→ param.Type = p.curToken.Literal
1999→ } else if p.curTokenIs(lexer.IDENT) && p.recordNames[p.curToken.Literal] {
2000→ param.Type = p.curToken.Literal
2001→ } else {
2002→ p.addError(fmt.Sprintf("expected type after [%s], got %s", param.ArraySize, p.curToken.Type))
2003→ return params
2004→ }
2005→ p.nextToken()
2006→ }
2007→ } else if p.curTokenIs(lexer.CHAN) {
2008→ // Check for CHAN OF <type> or CHAN <type>
2009→ param.IsChan = true
2010→ if p.peekTokenIs(lexer.OF) {
2011→ p.nextToken() // consume OF
2012→ }
2013→ p.nextToken() // move to element type
2014→ if isTypeToken(p.curToken.Type) || p.curTokenIs(lexer.IDENT) {
2015→ param.ChanElemType = p.curToken.Literal
2016→ } else {
2017→ p.addError(fmt.Sprintf("expected type after CHAN, got %s", p.curToken.Type))
2018→ return params
2019→ }
2020→ p.nextToken()
2021→ } else if p.curTokenIs(lexer.IDENT) && p.recordNames[p.curToken.Literal] {
2022→ // Record type parameter
2023→ param.Type = p.curToken.Literal
2024→ p.nextToken()
2025→ } else {
2026→ // Expect scalar type
2027→ if !isTypeToken(p.curToken.Type) {
2028→ p.addError(fmt.Sprintf("expected type in parameter, got %s", p.curToken.Type))
2029→ return params
2030→ }
2031→ param.Type = p.curToken.Literal
2032→ p.nextToken()
2033→ }
2034→
2035→ // Expect identifier
2036→ if !p.curTokenIs(lexer.IDENT) {
2037→ p.addError(fmt.Sprintf("expected parameter name, got %s", p.curToken.Type))
2038→ return params
2039→ }
2040→ param.Name = p.curToken.Literal
2041→
2042→ // Check for channel direction marker (? or !)
2043→ if (param.IsChan || param.IsChanArray) && (p.peekTokenIs(lexer.RECEIVE) || p.peekTokenIs(lexer.SEND)) {
2044→ p.nextToken()
2045→ param.ChanDir = p.curToken.Literal
2046→ }
2047→
2048→ params = append(params, param)
2049→ prevParam = ¶ms[len(params)-1]
2050→
2051→ if !p.peekTokenIs(lexer.COMMA) {
2052→ break
2053→ }
2054→ p.nextToken() // consume comma
2055→ p.nextToken() // move to next param
2056→ }
2057→
2058→ return params
2059→}
2060→
2061→func (p *Parser) parseProcCall() *ast.ProcCall {
2062→ call := &ast.ProcCall{
2063→ Token: p.curToken,
2064→ Name: p.curToken.Literal,
2065→ }
2066→
2067→ if !p.peekTokenIs(lexer.LPAREN) {
2068→ // No arguments
2069→ return call
2070→ }
2071→
2072→ p.nextToken() // consume (
2073→
2074→ if p.peekTokenIs(lexer.RPAREN) {
2075→ p.nextToken() // consume )
2076→ return call
2077→ }
2078→
2079→ p.nextToken() // move to first arg
2080→ call.Args = append(call.Args, p.parseExpression(LOWEST))
2081→ // Consume optional channel direction annotation at call site (e.g., out!)
2082→ if p.peekTokenIs(lexer.SEND) || p.peekTokenIs(lexer.RECEIVE) {
2083→ p.nextToken()
2084→ }
2085→
2086→ for p.peekTokenIs(lexer.COMMA) {
2087→ p.nextToken() // consume comma
2088→ p.nextToken() // move to next arg
2089→ call.Args = append(call.Args, p.parseExpression(LOWEST))
2090→ // Consume optional channel direction annotation at call site
2091→ if p.peekTokenIs(lexer.SEND) || p.peekTokenIs(lexer.RECEIVE) {
2092→ p.nextToken()
2093→ }
2094→ }
2095→
2096→ if !p.expectPeek(lexer.RPAREN) {
2097→ return nil
2098→ }
2099→
2100→ return call
2101→}
2102→
2103→func (p *Parser) parseFuncDecl() *ast.FuncDecl {
2104→ fn := &ast.FuncDecl{
2105→ Token: p.curToken,
2106→ ReturnTypes: []string{p.curToken.Literal},
2107→ }
2108→
2109→ // Parse additional return types for multi-result functions: INT, INT FUNCTION
2110→ for p.peekTokenIs(lexer.COMMA) {
2111→ p.nextToken() // consume comma
2112→ p.nextToken() // move to next type
2113→ fn.ReturnTypes = append(fn.ReturnTypes, p.curToken.Literal)
2114→ }
2115→
2116→ // Consume FUNCTION keyword
2117→ p.nextToken()
2118→
2119→ if !p.expectPeek(lexer.IDENT) {
2120→ return nil
2121→ }
2122→ fn.Name = p.curToken.Literal
2123→
2124→ if !p.expectPeek(lexer.LPAREN) {
2125→ return nil
2126→ }
2127→
2128→ fn.Params = p.parseProcParams()
2129→
2130→ if !p.expectPeek(lexer.RPAREN) {
2131→ return nil
2132→ }
2133→
2134→ // Force all params to IsVal = true (occam FUNCTION params are always VAL)
2135→ for i := range fn.Params {
2136→ fn.Params[i].IsVal = true
2137→ }
2138→
2139→ // Skip newlines, expect INDENT
2140→ for p.peekTokenIs(lexer.NEWLINE) {
2141→ p.nextToken()
2142→ }
2143→
2144→ if !p.peekTokenIs(lexer.INDENT) {
2145→ p.addError("expected indented body after FUNCTION declaration")
2146→ return fn
2147→ }
2148→ funcLevel := p.indentLevel
2149→ p.nextToken() // consume INDENT
2150→ p.nextToken() // move into body
2151→
2152→ // IS form: simple expression return
2153→ if p.curTokenIs(lexer.IS) {
2154→ p.nextToken() // move past IS
2155→ fn.ResultExprs = []ast.Expression{p.parseExpression(LOWEST)}
2156→
2157→ // Consume remaining tokens and DEDENTs back to function's indentation level
2158→ for !p.curTokenIs(lexer.EOF) {
2159→ if p.curTokenIs(lexer.DEDENT) && p.indentLevel <= funcLevel {
2160→ break
2161→ }
2162→ p.nextToken()
2163→ }
2164→
2165→ // Optionally consume KRoC-style colon terminator
2166→ if p.peekTokenIs(lexer.COLON) {
2167→ p.nextToken()
2168→ }
2169→ return fn
2170→ }
2171→
2172→ // VALOF form: local declarations, then VALOF keyword, then body, then RESULT
2173→ // Parse local declarations (type keywords before VALOF)
2174→ for p.curTokenIs(lexer.INT_TYPE) || p.curTokenIs(lexer.BYTE_TYPE) ||
2175→ p.curTokenIs(lexer.BOOL_TYPE) || p.curTokenIs(lexer.REAL_TYPE) ||
2176→ p.curTokenIs(lexer.REAL32_TYPE) || p.curTokenIs(lexer.REAL64_TYPE) {
2177→ stmt := p.parseVarDecl()
2178→ if stmt != nil {
2179→ fn.Body = append(fn.Body, stmt)
2180→ }
2181→ // Advance past NEWLINE
2182→ for p.peekTokenIs(lexer.NEWLINE) {
2183→ p.nextToken()
2184→ }
2185→ p.nextToken()
2186→ }
2187→
2188→ // Expect VALOF keyword
2189→ if !p.curTokenIs(lexer.VALOF) {
2190→ p.addError(fmt.Sprintf("expected VALOF or IS in function body, got %s", p.curToken.Type))
2191→ return fn
2192→ }
2193→
2194→ // Skip newlines and expect INDENT for VALOF body
2195→ for p.peekTokenIs(lexer.NEWLINE) {
2196→ p.nextToken()
2197→ }
2198→
2199→ if !p.peekTokenIs(lexer.INDENT) {
2200→ p.addError("expected indented block after VALOF")
2201→ return fn
2202→ }
2203→ p.nextToken() // consume INDENT
2204→ startLevel := p.indentLevel
2205→ p.nextToken() // move into VALOF body
2206→
2207→ // Parse the VALOF body — declarations and statements until RESULT
2208→ for !p.curTokenIs(lexer.RESULT) && !p.curTokenIs(lexer.EOF) {
2209→ // Skip newlines
2210→ for p.curTokenIs(lexer.NEWLINE) {
2211→ p.nextToken()
2212→ }
2213→ // Handle DEDENTs
2214→ for p.curTokenIs(lexer.DEDENT) {
2215→ if p.indentLevel < startLevel {
2216→ break
2217→ }
2218→ p.nextToken()
2219→ }
2220→ if p.curTokenIs(lexer.EOF) || p.curTokenIs(lexer.RESULT) {
2221→ break
2222→ }
2223→ stmt := p.parseStatement()
2224→ if stmt != nil {
2225→ fn.Body = append(fn.Body, stmt)
2226→ }
2227→ if !p.curTokenIs(lexer.NEWLINE) && !p.curTokenIs(lexer.DEDENT) && !p.curTokenIs(lexer.EOF) && !p.curTokenIs(lexer.RESULT) {
2228→ p.nextToken()
2229→ }
2230→ }
2231→
2232→ // Parse RESULT expression(s) — comma-separated for multi-result functions
2233→ if p.curTokenIs(lexer.RESULT) {
2234→ p.nextToken() // move past RESULT
2235→ fn.ResultExprs = []ast.Expression{p.parseExpression(LOWEST)}
2236→ for p.peekTokenIs(lexer.COMMA) {
2237→ p.nextToken() // consume comma
2238→ p.nextToken() // move to next expression
2239→ fn.ResultExprs = append(fn.ResultExprs, p.parseExpression(LOWEST))
2240→ }
2241→ }
2242→
2243→ // Consume remaining tokens and DEDENTs back to function's indentation level
2244→ for !p.curTokenIs(lexer.EOF) {
2245→ if p.curTokenIs(lexer.DEDENT) && p.indentLevel <= funcLevel {
2246→ break
2247→ }
2248→ p.nextToken()
2249→ }
2250→
2251→ // Optionally consume KRoC-style colon terminator
2252→ if p.peekTokenIs(lexer.COLON) {
2253→ p.nextToken()
2254→ }
2255→
2256→ return fn
2257→}
2258→
2259→// parseByteLiteralValue processes the raw content of a byte literal (between single quotes),
2260→// handling occam escape sequences (* prefix), and returns the resulting byte value.
2261→func (p *Parser) parseByteLiteralValue(raw string) (byte, error) {
2262→ if len(raw) == 0 {
2263→ return 0, fmt.Errorf("empty byte literal")
2264→ }
2265→ if raw[0] == '*' {
2266→ if len(raw) != 2 {
2267→ return 0, fmt.Errorf("invalid escape sequence in byte literal: '*%s'", raw[1:])
2268→ }
2269→ switch raw[1] {
2270→ case 'n':
2271→ return '\n', nil
2272→ case 'c':
2273→ return '\r', nil
2274→ case 't':
2275→ return '\t', nil
2276→ case 's':
2277→ return ' ', nil
2278→ case '*':
2279→ return '*', nil
2280→ case '\'':
2281→ return '\'', nil
2282→ case '"':
2283→ return '"', nil
2284→ default:
2285→ return 0, fmt.Errorf("unknown escape sequence in byte literal: '*%c'", raw[1])
2286→ }
2287→ }
2288→ if len(raw) != 1 {
2289→ return 0, fmt.Errorf("byte literal must be a single character, got %q", raw)
2290→ }
2291→ return raw[0], nil
2292→}
2293→
2294→func (p *Parser) parseFuncCallExpr() *ast.FuncCall {
2295→ call := &ast.FuncCall{
2296→ Token: p.curToken,
2297→ Name: p.curToken.Literal,
2298→ }
2299→
2300→ p.nextToken() // consume (
2301→
2302→ if p.peekTokenIs(lexer.RPAREN) {
2303→ p.nextToken() // consume )
2304→ return call
2305→ }
2306→
2307→ p.nextToken() // move to first arg
2308→ call.Args = append(call.Args, p.parseExpression(LOWEST))
2309→
2310→ for p.peekTokenIs(lexer.COMMA) {
2311→ p.nextToken() // consume comma
2312→ p.nextToken() // move to next arg
2313→ call.Args = append(call.Args, p.parseExpression(LOWEST))
2314→ }
2315→
2316→ if !p.expectPeek(lexer.RPAREN) {
2317→ return nil
2318→ }
2319→
2320→ return call
2321→}
2322→
2323→func (p *Parser) parseWhileLoop() *ast.WhileLoop {
2324→ loop := &ast.WhileLoop{Token: p.curToken}
2325→
2326→ p.nextToken()
2327→ loop.Condition = p.parseExpression(LOWEST)
2328→
2329→ // Skip to next line
2330→ for p.peekTokenIs(lexer.NEWLINE) {
2331→ p.nextToken()
2332→ }
2333→
2334→ // Expect INDENT
2335→ if !p.peekTokenIs(lexer.INDENT) {
2336→ p.addError("expected indented block after WHILE condition")
2337→ return loop
2338→ }
2339→ p.nextToken() // consume INDENT
2340→ p.nextToken() // move to first statement
2341→
2342→ loop.Body = p.parseBodyStatements()
2343→
2344→ return loop
2345→}
2346→
2347→func (p *Parser) parseIfStatement() *ast.IfStatement {
2348→ stmt := &ast.IfStatement{Token: p.curToken}
2349→
2350→ // Check for replicator: IF i = start FOR count
2351→ if p.peekTokenIs(lexer.IDENT) {
2352→ p.nextToken() // move to identifier
2353→ if p.peekTokenIs(lexer.EQ) {
2354→ stmt.Replicator = p.parseReplicator()
2355→ } else {
2356→ p.addError("unexpected identifier after IF")
2357→ return stmt
2358→ }
2359→ }
2360→
2361→ // Skip to next line
2362→ for p.peekTokenIs(lexer.NEWLINE) {
2363→ p.nextToken()
2364→ }
2365→
2366→ // Expect INDENT
2367→ if !p.peekTokenIs(lexer.INDENT) {
2368→ p.addError("expected indented block after IF")
2369→ return stmt
2370→ }
2371→ p.nextToken() // consume INDENT
2372→ startLevel := p.indentLevel
2373→ p.nextToken() // move into block
2374→
2375→ // Parse if choices (condition -> body pairs)
2376→ for !p.curTokenIs(lexer.EOF) {
2377→ // Skip newlines
2378→ for p.curTokenIs(lexer.NEWLINE) {
2379→ p.nextToken()
2380→ }
2381→
2382→ // Handle DEDENT tokens
2383→ for p.curTokenIs(lexer.DEDENT) {
2384→ if p.indentLevel < startLevel {
2385→ return stmt
2386→ }
2387→ p.nextToken()
2388→ }
2389→
2390→ // Skip any more newlines after DEDENT
2391→ for p.curTokenIs(lexer.NEWLINE) {
2392→ p.nextToken()
2393→ }
2394→
2395→ if p.curTokenIs(lexer.EOF) {
2396→ break
2397→ }
2398→
2399→ if p.indentLevel < startLevel {
2400→ break
2401→ }
2402→
2403→ // Safety guard: record position before parsing to detect no-progress
2404→ prevToken := p.curToken
2405→ prevPeek := p.peekToken
2406→
2407→ choice := ast.IfChoice{}
2408→
2409→ // Nested IF (plain or replicated) used as a choice within this IF
2410→ if p.curTokenIs(lexer.IF) {
2411→ nestedIf := p.parseIfStatement()
2412→ choice.NestedIf = nestedIf
2413→ } else {
2414→ choice.Condition = p.parseExpression(LOWEST)
2415→
2416→ // Skip newlines and expect INDENT for body
2417→ for p.peekTokenIs(lexer.NEWLINE) {
2418→ p.nextToken()
2419→ }
2420→
2421→ if p.peekTokenIs(lexer.INDENT) {
2422→ p.nextToken() // consume INDENT
2423→ p.nextToken() // move to body
2424→ choice.Body = p.parseBodyStatements()
2425→ }
2426→ }
2427→
2428→ stmt.Choices = append(stmt.Choices, choice)
2429→
2430→ // No-progress guard: if we haven't moved, break to prevent infinite loop
2431→ if p.curToken == prevToken && p.peekToken == prevPeek {
2432→ break
2433→ }
2434→ }
2435→
2436→ return stmt
2437→}
2438→
2439→func (p *Parser) parseCaseStatement() *ast.CaseStatement {
2440→ stmt := &ast.CaseStatement{Token: p.curToken}
2441→
2442→ // Parse selector expression on the same line
2443→ p.nextToken()
2444→ stmt.Selector = p.parseExpression(LOWEST)
2445→
2446→ // Skip to next line
2447→ for p.peekTokenIs(lexer.NEWLINE) {
2448→ p.nextToken()
2449→ }
2450→
2451→ // Expect INDENT
2452→ if !p.peekTokenIs(lexer.INDENT) {
2453→ p.addError("expected indented block after CASE")
2454→ return stmt
2455→ }
2456→ p.nextToken() // consume INDENT
2457→ startLevel := p.indentLevel
2458→ p.nextToken() // move into block
2459→
2460→ // Parse case choices
2461→ for !p.curTokenIs(lexer.EOF) {
2462→ // Skip newlines
2463→ for p.curTokenIs(lexer.NEWLINE) {
2464→ p.nextToken()
2465→ }
2466→
2467→ // Handle DEDENT tokens
2468→ for p.curTokenIs(lexer.DEDENT) {
2469→ if p.indentLevel < startLevel {
2470→ return stmt
2471→ }
2472→ p.nextToken()
2473→ }
2474→
2475→ // Skip any more newlines after DEDENT
2476→ for p.curTokenIs(lexer.NEWLINE) {
2477→ p.nextToken()
2478→ }
2479→
2480→ if p.curTokenIs(lexer.EOF) {
2481→ break
2482→ }
2483→
2484→ if p.indentLevel < startLevel {
2485→ break
2486→ }
2487→
2488→ // Safety guard: record position before parsing to detect no-progress
2489→ prevToken := p.curToken
2490→ prevPeek := p.peekToken
2491→
2492→ choice := ast.CaseChoice{}
2493→
2494→ if p.curTokenIs(lexer.ELSE) {
2495→ choice.IsElse = true
2496→ } else {
2497→ // Parse value expression
2498→ choice.Values = append(choice.Values, p.parseExpression(LOWEST))
2499→ }
2500→
2501→ // Skip newlines and expect INDENT for body
2502→ for p.peekTokenIs(lexer.NEWLINE) {
2503→ p.nextToken()
2504→ }
2505→
2506→ if p.peekTokenIs(lexer.INDENT) {
2507→ p.nextToken() // consume INDENT
2508→ p.nextToken() // move to body
2509→ choice.Body = p.parseBodyStatements()
2510→ }
2511→
2512→ stmt.Choices = append(stmt.Choices, choice)
2513→
2514→ // No-progress guard: if we haven't moved, break to prevent infinite loop
2515→ if p.curToken == prevToken && p.peekToken == prevPeek {
2516→ break
2517→ }
2518→ }
2519→
2520→ return stmt
2521→}
2522→
2523→// Expression parsing using Pratt parsing
2524→
2525→func (p *Parser) parseExpression(precedence int) ast.Expression {
2526→ var left ast.Expression
2527→
2528→ switch p.curToken.Type {
2529→ case lexer.IDENT:
2530→ if p.peekTokenIs(lexer.LPAREN) {
2531→ left = p.parseFuncCallExpr()
2532→ } else {
2533→ left = &ast.Identifier{Token: p.curToken, Value: p.curToken.Literal}
2534→ }
2535→ case lexer.INT:
2536→ base := 10
2537→ literal := p.curToken.Literal
2538→ if strings.HasPrefix(literal, "0x") || strings.HasPrefix(literal, "0X") {
2539→ base = 16
2540→ literal = literal[2:]
2541→ }
2542→ val, err := strconv.ParseInt(literal, base, 64)
2543→ if err != nil {
2544→ p.addError(fmt.Sprintf("could not parse %q as integer", p.curToken.Literal))
2545→ return nil
2546→ }
2547→ left = &ast.IntegerLiteral{Token: p.curToken, Value: val}
2548→ case lexer.TRUE:
2549→ left = &ast.BooleanLiteral{Token: p.curToken, Value: true}
2550→ case lexer.FALSE:
2551→ left = &ast.BooleanLiteral{Token: p.curToken, Value: false}
2552→ case lexer.STRING:
2553→ left = &ast.StringLiteral{Token: p.curToken, Value: p.curToken.Literal}
2554→ case lexer.BYTE_LIT:
2555→ b, err := p.parseByteLiteralValue(p.curToken.Literal)
2556→ if err != nil {
2557→ p.addError(err.Error())
2558→ return nil
2559→ }
2560→ left = &ast.ByteLiteral{Token: p.curToken, Value: b}
2561→ case lexer.LPAREN:
2562→ p.nextToken()
2563→ left = p.parseExpression(LOWEST)
2564→ if !p.expectPeek(lexer.RPAREN) {
2565→ return nil
2566→ }
2567→ case lexer.MINUS, lexer.MINUS_KW:
2568→ token := p.curToken
2569→ p.nextToken()
2570→ left = &ast.UnaryExpr{
2571→ Token: token,
2572→ Operator: "-",
2573→ Right: p.parseExpression(PREFIX),
2574→ }
2575→ case lexer.NOT:
2576→ token := p.curToken
2577→ p.nextToken()
2578→ left = &ast.UnaryExpr{
2579→ Token: token,
2580→ Operator: "NOT",
2581→ Right: p.parseExpression(PREFIX),
2582→ }
2583→ case lexer.BITNOT:
2584→ token := p.curToken
2585→ p.nextToken()
2586→ left = &ast.UnaryExpr{
2587→ Token: token,
2588→ Operator: "~",
2589→ Right: p.parseExpression(PREFIX),
2590→ }
2591→ case lexer.LBRACKET:
2592→ // Slice expression: [arr FROM start FOR length] or [arr FOR length]
2593→ lbracket := p.curToken
2594→ p.nextToken() // move past [
2595→ arrayExpr := p.parseExpression(LOWEST)
2596→ var startExpr ast.Expression
2597→ if p.peekTokenIs(lexer.FOR) {
2598→ // [arr FOR length] shorthand — start is 0
2599→ startExpr = &ast.IntegerLiteral{Token: lexer.Token{Type: lexer.INT, Literal: "0"}, Value: 0}
2600→ } else {
2601→ if !p.expectPeek(lexer.FROM) {
2602→ return nil
2603→ }
2604→ p.nextToken() // move past FROM
2605→ startExpr = p.parseExpression(LOWEST)
2606→ }
2607→ if !p.expectPeek(lexer.FOR) {
2608→ return nil
2609→ }
2610→ p.nextToken() // move past FOR
2611→ lengthExpr := p.parseExpression(LOWEST)
2612→ if !p.expectPeek(lexer.RBRACKET) {
2613→ return nil
2614→ }
2615→ left = &ast.SliceExpr{
2616→ Token: lbracket,
2617→ Array: arrayExpr,
2618→ Start: startExpr,
2619→ Length: lengthExpr,
2620→ }
2621→ case lexer.SIZE_KW:
2622→ token := p.curToken
2623→ p.nextToken()
2624→ left = &ast.SizeExpr{
2625→ Token: token,
2626→ Expr: p.parseExpression(PREFIX),
2627→ }
2628→ case lexer.MOSTNEG_KW, lexer.MOSTPOS_KW:
2629→ token := p.curToken
2630→ isNeg := token.Type == lexer.MOSTNEG_KW
2631→ // Expect a type name next
2632→ if !p.peekTokenIs(lexer.INT_TYPE) && !p.peekTokenIs(lexer.BYTE_TYPE) &&
2633→ !p.peekTokenIs(lexer.BOOL_TYPE) && !p.peekTokenIs(lexer.REAL_TYPE) &&
2634→ !p.peekTokenIs(lexer.REAL32_TYPE) && !p.peekTokenIs(lexer.REAL64_TYPE) {
2635→ p.addError(fmt.Sprintf("expected type after %s, got %s", token.Literal, p.peekToken.Type))
2636→ return nil
2637→ }
2638→ p.nextToken()
2639→ left = &ast.MostExpr{
2640→ Token: token,
2641→ ExprType: p.curToken.Literal,
2642→ IsNeg: isNeg,
2643→ }
2644→ case lexer.INT_TYPE, lexer.BYTE_TYPE, lexer.BOOL_TYPE, lexer.REAL_TYPE, lexer.REAL32_TYPE, lexer.REAL64_TYPE:
2645→ token := p.curToken
2646→ p.nextToken()
2647→ left = &ast.TypeConversion{
2648→ Token: token,
2649→ TargetType: token.Literal,
2650→ Expr: p.parseExpression(PREFIX),
2651→ }
2652→ default:
2653→ p.addError(fmt.Sprintf("unexpected token in expression: %s", p.curToken.Type))
2654→ return nil
2655→ }
2656→
2657→ // Parse infix expressions
2658→ for !p.peekTokenIs(lexer.NEWLINE) && !p.peekTokenIs(lexer.EOF) &&
2659→ precedence < p.peekPrecedence() {
2660→
2661→ switch p.peekToken.Type {
2662→ case lexer.PLUS, lexer.MINUS, lexer.MULTIPLY, lexer.DIVIDE, lexer.MODULO,
2663→ lexer.PLUS_KW, lexer.MINUS_KW, lexer.TIMES,
2664→ lexer.EQ, lexer.NEQ, lexer.LT, lexer.GT, lexer.LE, lexer.GE,
2665→ lexer.AND, lexer.OR, lexer.AFTER,
2666→ lexer.BITAND, lexer.BITOR, lexer.BITXOR, lexer.LSHIFT, lexer.RSHIFT:
2667→ p.nextToken()
2668→ left = p.parseBinaryExpr(left)
2669→ case lexer.LBRACKET:
2670→ p.nextToken()
2671→ left = p.parseIndexExpression(left)
2672→ default:
2673→ return left
2674→ }
2675→ }
2676→
2677→ return left
2678→}
2679→
2680→func (p *Parser) parseBinaryExpr(left ast.Expression) ast.Expression {
2681→ expr := &ast.BinaryExpr{
2682→ Token: p.curToken,
2683→ Left: left,
2684→ Operator: p.curToken.Literal,
2685→ }
2686→
2687→ prec := p.curPrecedence()
2688→ p.nextToken()
2689→ expr.Right = p.parseExpression(prec)
2690→
2691→ return expr
2692→}
2693→ 1→package codegen
2→
3→import (
4→ "fmt"
5→ "strings"
6→
7→ "github.com/codeassociates/occam2go/ast"
8→)
9→
10→// Generator converts an AST to Go code
11→type Generator struct {
12→ indent int
13→ builder strings.Builder
14→ needSync bool // track if we need sync package import
15→ needFmt bool // track if we need fmt package import
16→ needTime bool // track if we need time package import
17→ needOs bool // track if we need os package import
18→ needMath bool // track if we need math package import
19→
20→ // Track procedure signatures for proper pointer handling
21→ procSigs map[string][]ast.ProcParam
22→ // Track current procedure's reference parameters
23→ refParams map[string]bool
24→
25→ // Protocol support
26→ protocolDefs map[string]*ast.ProtocolDecl
27→ chanProtocols map[string]string // channel name → protocol name
28→ tmpCounter int // for unique temp variable names
29→
30→ // Record support
31→ recordDefs map[string]*ast.RecordDecl
32→ recordVars map[string]string // variable name → record type name
33→
34→ // Nesting level: 0 = package level, >0 = inside a function
35→ nestingLevel int
36→}
37→
38→// Built-in print procedures
39→var printBuiltins = map[string]bool{
40→ "print.int": true,
41→ "print.string": true,
42→ "print.bool": true,
43→ "print.newline": true,
44→}
45→
46→// New creates a new code generator
47→func New() *Generator {
48→ return &Generator{}
49→}
50→
51→// goIdent converts an occam identifier to a valid Go identifier.
52→// Occam allows dots in identifiers (e.g., out.repeat); Go does not.
53→func goIdent(name string) string {
54→ return strings.ReplaceAll(name, ".", "_")
55→}
56→
57→// Generate produces Go code from the AST
58→func (g *Generator) Generate(program *ast.Program) string {
59→ g.builder.Reset()
60→ g.needSync = false
61→ g.needFmt = false
62→ g.needTime = false
63→ g.needOs = false
64→ g.needMath = false
65→ g.procSigs = make(map[string][]ast.ProcParam)
66→ g.refParams = make(map[string]bool)
67→ g.protocolDefs = make(map[string]*ast.ProtocolDecl)
68→ g.chanProtocols = make(map[string]string)
69→ g.tmpCounter = 0
70→ g.recordDefs = make(map[string]*ast.RecordDecl)
71→ g.recordVars = make(map[string]string)
72→
73→ // First pass: collect procedure signatures, protocols, and check for PAR/print
74→ for _, stmt := range program.Statements {
75→ if g.containsPar(stmt) {
76→ g.needSync = true
77→ }
78→ if g.containsPrint(stmt) {
79→ g.needFmt = true
80→ }
81→ if g.containsTimer(stmt) {
82→ g.needTime = true
83→ }
84→ if g.containsStop(stmt) {
85→ g.needOs = true
86→ g.needFmt = true
87→ }
88→ if g.containsMostExpr(stmt) {
89→ g.needMath = true
90→ }
91→ if proc, ok := stmt.(*ast.ProcDecl); ok {
92→ g.procSigs[proc.Name] = proc.Params
93→ g.collectNestedProcSigs(proc.Body)
94→ }
95→ if fn, ok := stmt.(*ast.FuncDecl); ok {
96→ g.procSigs[fn.Name] = fn.Params
97→ }
98→ if proto, ok := stmt.(*ast.ProtocolDecl); ok {
99→ g.protocolDefs[proto.Name] = proto
100→ }
101→ if rec, ok := stmt.(*ast.RecordDecl); ok {
102→ g.recordDefs[rec.Name] = rec
103→ }
104→ g.collectChanProtocols(stmt)
105→ g.collectRecordVars(stmt)
106→ }
107→
108→ // Write package declaration
109→ g.writeLine("package main")
110→ g.writeLine("")
111→
112→ // Write imports
113→ if g.needSync || g.needFmt || g.needTime || g.needOs || g.needMath {
114→ g.writeLine("import (")
115→ g.indent++
116→ if g.needFmt {
117→ g.writeLine(`"fmt"`)
118→ }
119→ if g.needMath {
120→ g.writeLine(`"math"`)
121→ }
122→ if g.needOs {
123→ g.writeLine(`"os"`)
124→ }
125→ if g.needSync {
126→ g.writeLine(`"sync"`)
127→ }
128→ if g.needTime {
129→ g.writeLine(`"time"`)
130→ }
131→ g.indent--
132→ g.writeLine(")")
133→ g.writeLine("")
134→ }
135→
136→ // Separate protocol, record, procedure declarations from other statements
137→ var typeDecls []ast.Statement
138→ var procDecls []ast.Statement
139→ var mainStatements []ast.Statement
140→
141→ // First pass: check if there are any proc/func declarations
142→ hasProcDecls := false
143→ for _, stmt := range program.Statements {
144→ if _, ok := stmt.(*ast.ProcDecl); ok {
145→ hasProcDecls = true
146→ break
147→ }
148→ if _, ok := stmt.(*ast.FuncDecl); ok {
149→ hasProcDecls = true
150→ break
151→ }
152→ }
153→
154→ var abbrDecls []ast.Statement
155→ for _, stmt := range program.Statements {
156→ switch stmt.(type) {
157→ case *ast.ProtocolDecl, *ast.RecordDecl:
158→ typeDecls = append(typeDecls, stmt)
159→ case *ast.ProcDecl, *ast.FuncDecl:
160→ procDecls = append(procDecls, stmt)
161→ case *ast.Abbreviation:
162→ if hasProcDecls {
163→ // Top-level abbreviations need to be at package level
164→ // so PROCs can reference them
165→ abbrDecls = append(abbrDecls, stmt)
166→ } else {
167→ mainStatements = append(mainStatements, stmt)
168→ }
169→ default:
170→ mainStatements = append(mainStatements, stmt)
171→ }
172→ }
173→
174→ // Generate type definitions first (at package level)
175→ for _, stmt := range typeDecls {
176→ g.generateStatement(stmt)
177→ }
178→
179→ // Generate package-level abbreviations (constants)
180→ for _, stmt := range abbrDecls {
181→ abbr := stmt.(*ast.Abbreviation)
182→ goType := g.occamTypeToGo(abbr.Type)
183→ if abbr.IsOpenArray {
184→ goType = "[]" + goType
185→ }
186→ g.builder.WriteString("var ")
187→ g.write(fmt.Sprintf("%s %s = ", goIdent(abbr.Name), goType))
188→ g.generateExpression(abbr.Value)
189→ g.write("\n")
190→ }
191→ if len(abbrDecls) > 0 {
192→ g.writeLine("")
193→ }
194→
195→ // Generate procedure declarations (at package level)
196→ for _, stmt := range procDecls {
197→ g.generateStatement(stmt)
198→ }
199→
200→ // Generate main function with other statements
201→ if len(mainStatements) > 0 {
202→ g.writeLine("func main() {")
203→ g.indent++
204→ g.nestingLevel++
205→ for _, stmt := range mainStatements {
206→ g.generateStatement(stmt)
207→ }
208→ g.nestingLevel--
209→ g.indent--
210→ g.writeLine("}")
211→ }
212→
213→ return g.builder.String()
214→}
215→
216→// collectNestedProcSigs recursively collects procedure/function signatures
217→// from nested declarations inside PROC bodies.
218→func (g *Generator) collectNestedProcSigs(stmts []ast.Statement) {
219→ for _, stmt := range stmts {
220→ if proc, ok := stmt.(*ast.ProcDecl); ok {
221→ g.procSigs[proc.Name] = proc.Params
222→ g.collectNestedProcSigs(proc.Body)
223→ }
224→ if fn, ok := stmt.(*ast.FuncDecl); ok {
225→ g.procSigs[fn.Name] = fn.Params
226→ }
227→ }
228→}
229→
230→func (g *Generator) containsPar(stmt ast.Statement) bool {
231→ switch s := stmt.(type) {
232→ case *ast.ParBlock:
233→ return true
234→ case *ast.SeqBlock:
235→ for _, inner := range s.Statements {
236→ if g.containsPar(inner) {
237→ return true
238→ }
239→ }
240→ case *ast.AltBlock:
241→ for _, c := range s.Cases {
242→ for _, inner := range c.Body {
243→ if g.containsPar(inner) {
244→ return true
245→ }
246→ }
247→ }
248→ case *ast.ProcDecl:
249→ for _, inner := range s.Body {
250→ if g.containsPar(inner) {
251→ return true
252→ }
253→ }
254→ case *ast.FuncDecl:
255→ for _, inner := range s.Body {
256→ if g.containsPar(inner) {
257→ return true
258→ }
259→ }
260→ case *ast.WhileLoop:
261→ for _, inner := range s.Body {
262→ if g.containsPar(inner) {
263→ return true
264→ }
265→ }
266→ case *ast.IfStatement:
267→ for _, choice := range s.Choices {
268→ if choice.NestedIf != nil {
269→ if g.containsPar(choice.NestedIf) {
270→ return true
271→ }
272→ }
273→ for _, inner := range choice.Body {
274→ if g.containsPar(inner) {
275→ return true
276→ }
277→ }
278→ }
279→ case *ast.CaseStatement:
280→ for _, choice := range s.Choices {
281→ for _, inner := range choice.Body {
282→ if g.containsPar(inner) {
283→ return true
284→ }
285→ }
286→ }
287→ case *ast.VariantReceive:
288→ for _, c := range s.Cases {
289→ if c.Body != nil && g.containsPar(c.Body) {
290→ return true
291→ }
292→ }
293→ }
294→ return false
295→}
296→
297→func (g *Generator) containsPrint(stmt ast.Statement) bool {
298→ switch s := stmt.(type) {
299→ case *ast.ProcCall:
300→ return printBuiltins[s.Name]
301→ case *ast.SeqBlock:
302→ for _, inner := range s.Statements {
303→ if g.containsPrint(inner) {
304→ return true
305→ }
306→ }
307→ case *ast.ParBlock:
308→ for _, inner := range s.Statements {
309→ if g.containsPrint(inner) {
310→ return true
311→ }
312→ }
313→ case *ast.AltBlock:
314→ for _, c := range s.Cases {
315→ for _, inner := range c.Body {
316→ if g.containsPrint(inner) {
317→ return true
318→ }
319→ }
320→ }
321→ case *ast.ProcDecl:
322→ for _, inner := range s.Body {
323→ if g.containsPrint(inner) {
324→ return true
325→ }
326→ }
327→ case *ast.FuncDecl:
328→ for _, inner := range s.Body {
329→ if g.containsPrint(inner) {
330→ return true
331→ }
332→ }
333→ case *ast.WhileLoop:
334→ for _, inner := range s.Body {
335→ if g.containsPrint(inner) {
336→ return true
337→ }
338→ }
339→ case *ast.IfStatement:
340→ for _, choice := range s.Choices {
341→ if choice.NestedIf != nil {
342→ if g.containsPrint(choice.NestedIf) {
343→ return true
344→ }
345→ }
346→ for _, inner := range choice.Body {
347→ if g.containsPrint(inner) {
348→ return true
349→ }
350→ }
351→ }
352→ case *ast.CaseStatement:
353→ for _, choice := range s.Choices {
354→ for _, inner := range choice.Body {
355→ if g.containsPrint(inner) {
356→ return true
357→ }
358→ }
359→ }
360→ case *ast.VariantReceive:
361→ for _, c := range s.Cases {
362→ if c.Body != nil && g.containsPrint(c.Body) {
363→ return true
364→ }
365→ }
366→ }
367→ return false
368→}
369→
370→func (g *Generator) containsTimer(stmt ast.Statement) bool {
371→ switch s := stmt.(type) {
372→ case *ast.TimerDecl, *ast.TimerRead:
373→ return true
374→ case *ast.AltBlock:
375→ for _, c := range s.Cases {
376→ if c.IsTimer {
377→ return true
378→ }
379→ for _, inner := range c.Body {
380→ if g.containsTimer(inner) {
381→ return true
382→ }
383→ }
384→ }
385→ case *ast.SeqBlock:
386→ for _, inner := range s.Statements {
387→ if g.containsTimer(inner) {
388→ return true
389→ }
390→ }
391→ case *ast.ParBlock:
392→ for _, inner := range s.Statements {
393→ if g.containsTimer(inner) {
394→ return true
395→ }
396→ }
397→ case *ast.ProcDecl:
398→ for _, inner := range s.Body {
399→ if g.containsTimer(inner) {
400→ return true
401→ }
402→ }
403→ case *ast.FuncDecl:
404→ for _, inner := range s.Body {
405→ if g.containsTimer(inner) {
406→ return true
407→ }
408→ }
409→ case *ast.WhileLoop:
410→ for _, inner := range s.Body {
411→ if g.containsTimer(inner) {
412→ return true
413→ }
414→ }
415→ case *ast.IfStatement:
416→ for _, choice := range s.Choices {
417→ if choice.NestedIf != nil {
418→ if g.containsTimer(choice.NestedIf) {
419→ return true
420→ }
421→ }
422→ for _, inner := range choice.Body {
423→ if g.containsTimer(inner) {
424→ return true
425→ }
426→ }
427→ }
428→ case *ast.CaseStatement:
429→ for _, choice := range s.Choices {
430→ for _, inner := range choice.Body {
431→ if g.containsTimer(inner) {
432→ return true
433→ }
434→ }
435→ }
436→ case *ast.VariantReceive:
437→ for _, c := range s.Cases {
438→ if c.Body != nil && g.containsTimer(c.Body) {
439→ return true
440→ }
441→ }
442→ }
443→ return false
444→}
445→
446→func (g *Generator) containsStop(stmt ast.Statement) bool {
447→ switch s := stmt.(type) {
448→ case *ast.Stop:
449→ return true
450→ case *ast.SeqBlock:
451→ for _, inner := range s.Statements {
452→ if g.containsStop(inner) {
453→ return true
454→ }
455→ }
456→ case *ast.ParBlock:
457→ for _, inner := range s.Statements {
458→ if g.containsStop(inner) {
459→ return true
460→ }
461→ }
462→ case *ast.AltBlock:
463→ for _, c := range s.Cases {
464→ for _, inner := range c.Body {
465→ if g.containsStop(inner) {
466→ return true
467→ }
468→ }
469→ }
470→ case *ast.ProcDecl:
471→ for _, inner := range s.Body {
472→ if g.containsStop(inner) {
473→ return true
474→ }
475→ }
476→ case *ast.FuncDecl:
477→ for _, inner := range s.Body {
478→ if g.containsStop(inner) {
479→ return true
480→ }
481→ }
482→ case *ast.WhileLoop:
483→ for _, inner := range s.Body {
484→ if g.containsStop(inner) {
485→ return true
486→ }
487→ }
488→ case *ast.IfStatement:
489→ for _, choice := range s.Choices {
490→ if choice.NestedIf != nil {
491→ if g.containsStop(choice.NestedIf) {
492→ return true
493→ }
494→ }
495→ for _, inner := range choice.Body {
496→ if g.containsStop(inner) {
497→ return true
498→ }
499→ }
500→ }
501→ case *ast.CaseStatement:
502→ for _, choice := range s.Choices {
503→ for _, inner := range choice.Body {
504→ if g.containsStop(inner) {
505→ return true
506→ }
507→ }
508→ }
509→ case *ast.VariantReceive:
510→ for _, c := range s.Cases {
511→ if c.Body != nil && g.containsStop(c.Body) {
512→ return true
513→ }
514→ }
515→ }
516→ return false
517→}
518→
519→func (g *Generator) containsMostExpr(stmt ast.Statement) bool {
520→ switch s := stmt.(type) {
521→ case *ast.Assignment:
522→ return g.exprNeedsMath(s.Value) || g.exprNeedsMath(s.Index)
523→ case *ast.MultiAssignment:
524→ for _, t := range s.Targets {
525→ if g.exprNeedsMath(t.Index) {
526→ return true
527→ }
528→ }
529→ for _, v := range s.Values {
530→ if g.exprNeedsMath(v) {
531→ return true
532→ }
533→ }
534→ case *ast.Abbreviation:
535→ return g.exprNeedsMath(s.Value)
536→ case *ast.SeqBlock:
537→ for _, inner := range s.Statements {
538→ if g.containsMostExpr(inner) {
539→ return true
540→ }
541→ }
542→ case *ast.ParBlock:
543→ for _, inner := range s.Statements {
544→ if g.containsMostExpr(inner) {
545→ return true
546→ }
547→ }
548→ case *ast.ProcDecl:
549→ for _, inner := range s.Body {
550→ if g.containsMostExpr(inner) {
551→ return true
552→ }
553→ }
554→ case *ast.FuncDecl:
555→ for _, inner := range s.Body {
556→ if g.containsMostExpr(inner) {
557→ return true
558→ }
559→ }
560→ case *ast.WhileLoop:
561→ if g.exprNeedsMath(s.Condition) {
562→ return true
563→ }
564→ for _, inner := range s.Body {
565→ if g.containsMostExpr(inner) {
566→ return true
567→ }
568→ }
569→ case *ast.IfStatement:
570→ for _, choice := range s.Choices {
571→ if choice.NestedIf != nil {
572→ if g.containsMostExpr(choice.NestedIf) {
573→ return true
574→ }
575→ }
576→ if g.exprNeedsMath(choice.Condition) {
577→ return true
578→ }
579→ for _, inner := range choice.Body {
580→ if g.containsMostExpr(inner) {
581→ return true
582→ }
583→ }
584→ }
585→ case *ast.CaseStatement:
586→ if g.exprNeedsMath(s.Selector) {
587→ return true
588→ }
589→ for _, choice := range s.Choices {
590→ for _, v := range choice.Values {
591→ if g.exprNeedsMath(v) {
592→ return true
593→ }
594→ }
595→ for _, inner := range choice.Body {
596→ if g.containsMostExpr(inner) {
597→ return true
598→ }
599→ }
600→ }
601→ case *ast.Send:
602→ if g.exprNeedsMath(s.Value) {
603→ return true
604→ }
605→ for _, v := range s.Values {
606→ if g.exprNeedsMath(v) {
607→ return true
608→ }
609→ }
610→ case *ast.ProcCall:
611→ for _, arg := range s.Args {
612→ if g.exprNeedsMath(arg) {
613→ return true
614→ }
615→ }
616→ case *ast.AltBlock:
617→ for _, c := range s.Cases {
618→ for _, inner := range c.Body {
619→ if g.containsMostExpr(inner) {
620→ return true
621→ }
622→ }
623→ }
624→ case *ast.VariantReceive:
625→ for _, c := range s.Cases {
626→ if c.Body != nil && g.containsMostExpr(c.Body) {
627→ return true
628→ }
629→ }
630→ }
631→ return false
632→}
633→
634→func (g *Generator) exprNeedsMath(expr ast.Expression) bool {
635→ if expr == nil {
636→ return false
637→ }
638→ switch e := expr.(type) {
639→ case *ast.MostExpr:
640→ // BYTE uses literal 0/255, doesn't need math
641→ return e.ExprType != "BYTE"
642→ case *ast.BinaryExpr:
643→ return g.exprNeedsMath(e.Left) || g.exprNeedsMath(e.Right)
644→ case *ast.UnaryExpr:
645→ return g.exprNeedsMath(e.Right)
646→ case *ast.ParenExpr:
647→ return g.exprNeedsMath(e.Expr)
648→ case *ast.TypeConversion:
649→ return g.exprNeedsMath(e.Expr)
650→ case *ast.SizeExpr:
651→ return g.exprNeedsMath(e.Expr)
652→ case *ast.IndexExpr:
653→ return g.exprNeedsMath(e.Left) || g.exprNeedsMath(e.Index)
654→ case *ast.FuncCall:
655→ for _, arg := range e.Args {
656→ if g.exprNeedsMath(arg) {
657→ return true
658→ }
659→ }
660→ case *ast.SliceExpr:
661→ return g.exprNeedsMath(e.Array) || g.exprNeedsMath(e.Start) || g.exprNeedsMath(e.Length)
662→ }
663→ return false
664→}
665→
666→func (g *Generator) generateMostExpr(e *ast.MostExpr) {
667→ switch e.ExprType {
668→ case "INT":
669→ if e.IsNeg {
670→ g.write("math.MinInt")
671→ } else {
672→ g.write("math.MaxInt")
673→ }
674→ case "BYTE":
675→ if e.IsNeg {
676→ g.write("0")
677→ } else {
678→ g.write("255")
679→ }
680→ case "REAL32":
681→ if e.IsNeg {
682→ g.write("-math.MaxFloat32")
683→ } else {
684→ g.write("math.MaxFloat32")
685→ }
686→ case "REAL64":
687→ if e.IsNeg {
688→ g.write("-math.MaxFloat64")
689→ } else {
690→ g.write("math.MaxFloat64")
691→ }
692→ }
693→}
694→
695→func (g *Generator) writeLine(s string) {
696→ if s == "" {
697→ g.builder.WriteString("\n")
698→ return
699→ }
700→ g.builder.WriteString(strings.Repeat("\t", g.indent))
701→ g.builder.WriteString(s)
702→ g.builder.WriteString("\n")
703→}
704→
705→func (g *Generator) write(s string) {
706→ g.builder.WriteString(s)
707→}
708→
709→func (g *Generator) generateStatement(stmt ast.Statement) {
710→ switch s := stmt.(type) {
711→ case *ast.VarDecl:
712→ g.generateVarDecl(s)
713→ case *ast.ArrayDecl:
714→ g.generateArrayDecl(s)
715→ case *ast.ChanDecl:
716→ g.generateChanDecl(s)
717→ case *ast.Assignment:
718→ g.generateAssignment(s)
719→ case *ast.Send:
720→ g.generateSend(s)
721→ case *ast.Receive:
722→ g.generateReceive(s)
723→ case *ast.SeqBlock:
724→ g.generateSeqBlock(s)
725→ case *ast.ParBlock:
726→ g.generateParBlock(s)
727→ case *ast.AltBlock:
728→ g.generateAltBlock(s)
729→ case *ast.Skip:
730→ g.writeLine("// SKIP")
731→ case *ast.Stop:
732→ g.writeLine(`fmt.Fprintln(os.Stderr, "STOP encountered")`)
733→ g.writeLine("select {}")
734→ case *ast.ProcDecl:
735→ g.generateProcDecl(s)
736→ case *ast.FuncDecl:
737→ g.generateFuncDecl(s)
738→ case *ast.ProcCall:
739→ g.generateProcCall(s)
740→ case *ast.WhileLoop:
741→ g.generateWhileLoop(s)
742→ case *ast.IfStatement:
743→ g.generateIfStatement(s)
744→ case *ast.CaseStatement:
745→ g.generateCaseStatement(s)
746→ case *ast.TimerDecl:
747→ g.generateTimerDecl(s)
748→ case *ast.TimerRead:
749→ g.generateTimerRead(s)
750→ case *ast.ProtocolDecl:
751→ g.generateProtocolDecl(s)
752→ case *ast.VariantReceive:
753→ g.generateVariantReceive(s)
754→ case *ast.RecordDecl:
755→ g.generateRecordDecl(s)
756→ case *ast.Abbreviation:
757→ g.generateAbbreviation(s)
758→ case *ast.MultiAssignment:
759→ g.generateMultiAssignment(s)
760→ }
761→}
762→
763→func (g *Generator) generateVarDecl(decl *ast.VarDecl) {
764→ goType := g.occamTypeToGo(decl.Type)
765→ goNames := make([]string, len(decl.Names))
766→ for i, n := range decl.Names {
767→ goNames[i] = goIdent(n)
768→ }
769→ g.writeLine(fmt.Sprintf("var %s %s", strings.Join(goNames, ", "), goType))
770→ // Suppress "declared and not used" for each variable
771→ for _, n := range goNames {
772→ g.writeLine(fmt.Sprintf("_ = %s", n))
773→ }
774→}
775→
776→func (g *Generator) generateAbbreviation(abbr *ast.Abbreviation) {
777→ g.builder.WriteString(strings.Repeat("\t", g.indent))
778→ g.write(fmt.Sprintf("%s := ", goIdent(abbr.Name)))
779→ g.generateExpression(abbr.Value)
780→ g.write("\n")
781→}
782→
783→func (g *Generator) generateChanDecl(decl *ast.ChanDecl) {
784→ goType := g.occamTypeToGo(decl.ElemType)
785→ if decl.IsArray {
786→ for _, name := range decl.Names {
787→ n := goIdent(name)
788→ g.builder.WriteString(strings.Repeat("\t", g.indent))
789→ g.write(fmt.Sprintf("%s := make([]chan %s, ", n, goType))
790→ g.generateExpression(decl.Size)
791→ g.write(")\n")
792→ g.builder.WriteString(strings.Repeat("\t", g.indent))
793→ g.write(fmt.Sprintf("for _i := range %s { %s[_i] = make(chan %s) }\n", n, n, goType))
794→ }
795→ } else {
796→ for _, name := range decl.Names {
797→ g.writeLine(fmt.Sprintf("%s := make(chan %s)", goIdent(name), goType))
798→ }
799→ }
800→}
801→
802→func (g *Generator) generateTimerDecl(decl *ast.TimerDecl) {
803→ for _, name := range decl.Names {
804→ g.writeLine(fmt.Sprintf("// TIMER %s", name))
805→ }
806→}
807→
808→func (g *Generator) generateTimerRead(tr *ast.TimerRead) {
809→ g.writeLine(fmt.Sprintf("%s = int(time.Now().UnixMicro())", goIdent(tr.Variable)))
810→}
811→
812→func (g *Generator) generateArrayDecl(decl *ast.ArrayDecl) {
813→ goType := g.occamTypeToGo(decl.Type)
814→ for _, name := range decl.Names {
815→ n := goIdent(name)
816→ g.builder.WriteString(strings.Repeat("\t", g.indent))
817→ g.write(fmt.Sprintf("%s := make([]%s, ", n, goType))
818→ g.generateExpression(decl.Size)
819→ g.write(")\n")
820→ }
821→}
822→
823→func (g *Generator) generateSend(send *ast.Send) {
824→ g.builder.WriteString(strings.Repeat("\t", g.indent))
825→ g.write(goIdent(send.Channel))
826→ if send.ChannelIndex != nil {
827→ g.write("[")
828→ g.generateExpression(send.ChannelIndex)
829→ g.write("]")
830→ }
831→ g.write(" <- ")
832→
833→ protoName := g.chanProtocols[send.Channel]
834→ proto := g.protocolDefs[protoName]
835→ gProtoName := goIdent(protoName)
836→
837→ if send.VariantTag != "" && proto != nil && proto.Kind == "variant" {
838→ // Variant send with explicit tag: c <- _proto_NAME_tag{values...}
839→ g.write(fmt.Sprintf("_proto_%s_%s{", gProtoName, goIdent(send.VariantTag)))
840→ for i, val := range send.Values {
841→ if i > 0 {
842→ g.write(", ")
843→ }
844→ g.generateExpression(val)
845→ }
846→ g.write("}")
847→ } else if proto != nil && proto.Kind == "variant" && send.Value != nil && len(send.Values) == 0 {
848→ // Check if the send value is a bare identifier matching a variant tag
849→ if ident, ok := send.Value.(*ast.Identifier); ok && g.isVariantTag(protoName, ident.Value) {
850→ g.write(fmt.Sprintf("_proto_%s_%s{}", gProtoName, goIdent(ident.Value)))
851→ } else {
852→ g.generateExpression(send.Value)
853→ }
854→ } else if len(send.Values) > 0 && proto != nil && proto.Kind == "sequential" {
855→ // Sequential send: c <- _proto_NAME{val1, val2, ...}
856→ g.write(fmt.Sprintf("_proto_%s{", gProtoName))
857→ g.generateExpression(send.Value)
858→ for _, val := range send.Values {
859→ g.write(", ")
860→ g.generateExpression(val)
861→ }
862→ g.write("}")
863→ } else {
864→ // Simple send
865→ g.generateExpression(send.Value)
866→ }
867→ g.write("\n")
868→}
869→
870→func (g *Generator) generateReceive(recv *ast.Receive) {
871→ chanRef := goIdent(recv.Channel)
872→ if recv.ChannelIndex != nil {
873→ var buf strings.Builder
874→ buf.WriteString(goIdent(recv.Channel))
875→ buf.WriteString("[")
876→ // Generate the index expression into a temporary buffer
877→ oldBuilder := g.builder
878→ g.builder = strings.Builder{}
879→ g.generateExpression(recv.ChannelIndex)
880→ buf.WriteString(g.builder.String())
881→ g.builder = oldBuilder
882→ buf.WriteString("]")
883→ chanRef = buf.String()
884→ }
885→
886→ if len(recv.Variables) > 0 {
887→ // Sequential receive: _tmpN := <-c; x = _tmpN._0; y = _tmpN._1
888→ tmpName := fmt.Sprintf("_tmp%d", g.tmpCounter)
889→ g.tmpCounter++
890→ g.writeLine(fmt.Sprintf("%s := <-%s", tmpName, chanRef))
891→ varRef := goIdent(recv.Variable)
892→ if g.refParams[recv.Variable] {
893→ varRef = "*" + varRef
894→ }
895→ g.writeLine(fmt.Sprintf("%s = %s._0", varRef, tmpName))
896→ for i, v := range recv.Variables {
897→ vRef := goIdent(v)
898→ if g.refParams[v] {
899→ vRef = "*" + vRef
900→ }
901→ g.writeLine(fmt.Sprintf("%s = %s._%d", vRef, tmpName, i+1))
902→ }
903→ } else {
904→ varRef := goIdent(recv.Variable)
905→ if g.refParams[recv.Variable] {
906→ varRef = "*" + varRef
907→ }
908→ g.writeLine(fmt.Sprintf("%s = <-%s", varRef, chanRef))
909→ }
910→}
911→
912→func (g *Generator) generateProtocolDecl(proto *ast.ProtocolDecl) {
913→ gName := goIdent(proto.Name)
914→ switch proto.Kind {
915→ case "simple":
916→ goType := g.occamTypeToGoBase(proto.Types[0])
917→ g.writeLine(fmt.Sprintf("type _proto_%s = %s", gName, goType))
918→ g.writeLine("")
919→ case "sequential":
920→ g.writeLine(fmt.Sprintf("type _proto_%s struct {", gName))
921→ g.indent++
922→ for i, t := range proto.Types {
923→ goType := g.occamTypeToGoBase(t)
924→ g.writeLine(fmt.Sprintf("_%d %s", i, goType))
925→ }
926→ g.indent--
927→ g.writeLine("}")
928→ g.writeLine("")
929→ case "variant":
930→ // Interface type
931→ g.writeLine(fmt.Sprintf("type _proto_%s interface {", gName))
932→ g.indent++
933→ g.writeLine(fmt.Sprintf("_is_%s()", gName))
934→ g.indent--
935→ g.writeLine("}")
936→ g.writeLine("")
937→ // Concrete types for each variant
938→ for _, v := range proto.Variants {
939→ gTag := goIdent(v.Tag)
940→ if len(v.Types) == 0 {
941→ // No-payload variant: empty struct
942→ g.writeLine(fmt.Sprintf("type _proto_%s_%s struct{}", gName, gTag))
943→ } else {
944→ g.writeLine(fmt.Sprintf("type _proto_%s_%s struct {", gName, gTag))
945→ g.indent++
946→ for i, t := range v.Types {
947→ goType := g.occamTypeToGoBase(t)
948→ g.writeLine(fmt.Sprintf("_%d %s", i, goType))
949→ }
950→ g.indent--
951→ g.writeLine("}")
952→ }
953→ g.writeLine(fmt.Sprintf("func (_proto_%s_%s) _is_%s() {}", gName, gTag, gName))
954→ g.writeLine("")
955→ }
956→ }
957→}
958→
959→func (g *Generator) generateVariantReceive(vr *ast.VariantReceive) {
960→ protoName := g.chanProtocols[vr.Channel]
961→ gProtoName := goIdent(protoName)
962→ chanRef := goIdent(vr.Channel)
963→ if vr.ChannelIndex != nil {
964→ var buf strings.Builder
965→ buf.WriteString(goIdent(vr.Channel))
966→ buf.WriteString("[")
967→ oldBuilder := g.builder
968→ g.builder = strings.Builder{}
969→ g.generateExpression(vr.ChannelIndex)
970→ buf.WriteString(g.builder.String())
971→ g.builder = oldBuilder
972→ buf.WriteString("]")
973→ chanRef = buf.String()
974→ }
975→ g.writeLine(fmt.Sprintf("switch _v := (<-%s).(type) {", chanRef))
976→ for _, vc := range vr.Cases {
977→ g.writeLine(fmt.Sprintf("case _proto_%s_%s:", gProtoName, goIdent(vc.Tag)))
978→ g.indent++
979→ for i, v := range vc.Variables {
980→ g.writeLine(fmt.Sprintf("%s = _v._%d", goIdent(v), i))
981→ }
982→ if vc.Body != nil {
983→ g.generateStatement(vc.Body)
984→ }
985→ g.indent--
986→ }
987→ g.writeLine("}")
988→}
989→
990→func (g *Generator) isVariantTag(protoName, tagName string) bool {
991→ proto := g.protocolDefs[protoName]
992→ if proto == nil {
993→ return false
994→ }
995→ for _, v := range proto.Variants {
996→ if v.Tag == tagName {
997→ return true
998→ }
999→ }
1000→ return false
1001→}
1002→
1003→func (g *Generator) collectChanProtocols(stmt ast.Statement) {
1004→ switch s := stmt.(type) {
1005→ case *ast.ChanDecl:
1006→ if _, ok := g.protocolDefs[s.ElemType]; ok {
1007→ for _, name := range s.Names {
1008→ g.chanProtocols[name] = s.ElemType
1009→ }
1010→ }
1011→ case *ast.SeqBlock:
1012→ for _, inner := range s.Statements {
1013→ g.collectChanProtocols(inner)
1014→ }
1015→ case *ast.ParBlock:
1016→ for _, inner := range s.Statements {
1017→ g.collectChanProtocols(inner)
1018→ }
1019→ case *ast.ProcDecl:
1020→ // Register PROC param channels (including channel array params)
1021→ for _, p := range s.Params {
1022→ if p.IsChan || p.IsChanArray {
1023→ if _, ok := g.protocolDefs[p.ChanElemType]; ok {
1024→ g.chanProtocols[p.Name] = p.ChanElemType
1025→ }
1026→ }
1027→ }
1028→ for _, inner := range s.Body {
1029→ g.collectChanProtocols(inner)
1030→ }
1031→ case *ast.FuncDecl:
1032→ for _, inner := range s.Body {
1033→ g.collectChanProtocols(inner)
1034→ }
1035→ case *ast.WhileLoop:
1036→ for _, inner := range s.Body {
1037→ g.collectChanProtocols(inner)
1038→ }
1039→ case *ast.IfStatement:
1040→ for _, choice := range s.Choices {
1041→ if choice.NestedIf != nil {
1042→ g.collectChanProtocols(choice.NestedIf)
1043→ }
1044→ for _, inner := range choice.Body {
1045→ g.collectChanProtocols(inner)
1046→ }
1047→ }
1048→ case *ast.CaseStatement:
1049→ for _, choice := range s.Choices {
1050→ for _, inner := range choice.Body {
1051→ g.collectChanProtocols(inner)
1052→ }
1053→ }
1054→ case *ast.AltBlock:
1055→ for _, c := range s.Cases {
1056→ for _, inner := range c.Body {
1057→ g.collectChanProtocols(inner)
1058→ }
1059→ }
1060→ }
1061→}
1062→
1063→func (g *Generator) collectRecordVars(stmt ast.Statement) {
1064→ switch s := stmt.(type) {
1065→ case *ast.VarDecl:
1066→ if _, ok := g.recordDefs[s.Type]; ok {
1067→ for _, name := range s.Names {
1068→ g.recordVars[name] = s.Type
1069→ }
1070→ }
1071→ case *ast.SeqBlock:
1072→ for _, inner := range s.Statements {
1073→ g.collectRecordVars(inner)
1074→ }
1075→ case *ast.ParBlock:
1076→ for _, inner := range s.Statements {
1077→ g.collectRecordVars(inner)
1078→ }
1079→ case *ast.ProcDecl:
1080→ for _, p := range s.Params {
1081→ if !p.IsChan {
1082→ if _, ok := g.recordDefs[p.Type]; ok {
1083→ g.recordVars[p.Name] = p.Type
1084→ }
1085→ }
1086→ }
1087→ for _, inner := range s.Body {
1088→ g.collectRecordVars(inner)
1089→ }
1090→ case *ast.FuncDecl:
1091→ for _, inner := range s.Body {
1092→ g.collectRecordVars(inner)
1093→ }
1094→ case *ast.WhileLoop:
1095→ for _, inner := range s.Body {
1096→ g.collectRecordVars(inner)
1097→ }
1098→ case *ast.IfStatement:
1099→ for _, choice := range s.Choices {
1100→ if choice.NestedIf != nil {
1101→ g.collectRecordVars(choice.NestedIf)
1102→ }
1103→ for _, inner := range choice.Body {
1104→ g.collectRecordVars(inner)
1105→ }
1106→ }
1107→ case *ast.CaseStatement:
1108→ for _, choice := range s.Choices {
1109→ for _, inner := range choice.Body {
1110→ g.collectRecordVars(inner)
1111→ }
1112→ }
1113→ case *ast.AltBlock:
1114→ for _, c := range s.Cases {
1115→ for _, inner := range c.Body {
1116→ g.collectRecordVars(inner)
1117→ }
1118→ }
1119→ }
1120→}
1121→
1122→func (g *Generator) generateRecordDecl(rec *ast.RecordDecl) {
1123→ g.writeLine(fmt.Sprintf("type %s struct {", goIdent(rec.Name)))
1124→ g.indent++
1125→ for _, f := range rec.Fields {
1126→ goType := g.occamTypeToGoBase(f.Type)
1127→ g.writeLine(fmt.Sprintf("%s %s", goIdent(f.Name), goType))
1128→ }
1129→ g.indent--
1130→ g.writeLine("}")
1131→ g.writeLine("")
1132→}
1133→
1134→// occamTypeToGoBase converts a type name without checking protocol defs
1135→// (used inside protocol generation to avoid infinite recursion)
1136→func (g *Generator) occamTypeToGoBase(occamType string) string {
1137→ switch occamType {
1138→ case "INT":
1139→ return "int"
1140→ case "BYTE":
1141→ return "byte"
1142→ case "BOOL":
1143→ return "bool"
1144→ case "REAL":
1145→ return "float64"
1146→ case "REAL32":
1147→ return "float32"
1148→ case "REAL64":
1149→ return "float64"
1150→ default:
1151→ return occamType
1152→ }
1153→}
1154→
1155→func (g *Generator) occamTypeToGo(occamType string) string {
1156→ switch occamType {
1157→ case "INT":
1158→ return "int"
1159→ case "BYTE":
1160→ return "byte"
1161→ case "BOOL":
1162→ return "bool"
1163→ case "REAL":
1164→ return "float64"
1165→ case "REAL32":
1166→ return "float32"
1167→ case "REAL64":
1168→ return "float64"
1169→ default:
1170→ // Check if it's a protocol name
1171→ if _, ok := g.protocolDefs[occamType]; ok {
1172→ return "_proto_" + occamType
1173→ }
1174→ // Check if it's a record type name
1175→ if _, ok := g.recordDefs[occamType]; ok {
1176→ return occamType
1177→ }
1178→ return occamType // pass through unknown types
1179→ }
1180→}
1181→
1182→func (g *Generator) generateAssignment(assign *ast.Assignment) {
1183→ g.builder.WriteString(strings.Repeat("\t", g.indent))
1184→
1185→ if assign.SliceTarget != nil {
1186→ // Slice assignment: [arr FROM start FOR length] := value
1187→ // Maps to: copy(arr[start : start + length], value)
1188→ g.write("copy(")
1189→ g.generateExpression(assign.SliceTarget.Array)
1190→ g.write("[")
1191→ g.generateExpression(assign.SliceTarget.Start)
1192→ g.write(" : ")
1193→ g.generateExpression(assign.SliceTarget.Start)
1194→ g.write(" + ")
1195→ g.generateExpression(assign.SliceTarget.Length)
1196→ g.write("], ")
1197→ g.generateExpression(assign.Value)
1198→ g.write(")\n")
1199→ return
1200→ }
1201→
1202→ if assign.Index != nil {
1203→ // Check if this is a record field access
1204→ if _, ok := g.recordVars[assign.Name]; ok {
1205→ if ident, ok := assign.Index.(*ast.Identifier); ok {
1206→ // Record field: p.x = value (Go auto-dereferences pointers)
1207→ g.write(goIdent(assign.Name))
1208→ g.write(".")
1209→ g.write(goIdent(ident.Value))
1210→ g.write(" = ")
1211→ g.generateExpression(assign.Value)
1212→ g.write("\n")
1213→ return
1214→ }
1215→ }
1216→ // Array index: dereference if ref param
1217→ if g.refParams[assign.Name] {
1218→ g.write("*")
1219→ }
1220→ g.write(goIdent(assign.Name))
1221→ g.write("[")
1222→ g.generateExpression(assign.Index)
1223→ g.write("]")
1224→ } else {
1225→ // Simple assignment: dereference if ref param
1226→ if g.refParams[assign.Name] {
1227→ g.write("*")
1228→ }
1229→ g.write(goIdent(assign.Name))
1230→ }
1231→ g.write(" = ")
1232→ g.generateExpression(assign.Value)
1233→ g.write("\n")
1234→}
1235→
1236→func (g *Generator) generateSeqBlock(seq *ast.SeqBlock) {
1237→ if seq.Replicator != nil {
1238→ if seq.Replicator.Step != nil {
1239→ // Replicated SEQ with STEP: counter-based loop
1240→ v := goIdent(seq.Replicator.Variable)
1241→ counter := "_repl_" + v
1242→ g.builder.WriteString(strings.Repeat("\t", g.indent))
1243→ g.write(fmt.Sprintf("for %s := 0; %s < ", counter, counter))
1244→ g.generateExpression(seq.Replicator.Count)
1245→ g.write(fmt.Sprintf("; %s++ {\n", counter))
1246→ g.indent++
1247→ g.builder.WriteString(strings.Repeat("\t", g.indent))
1248→ g.write(fmt.Sprintf("%s := ", v))
1249→ g.generateExpression(seq.Replicator.Start)
1250→ g.write(fmt.Sprintf(" + %s * ", counter))
1251→ g.generateExpression(seq.Replicator.Step)
1252→ g.write("\n")
1253→ } else {
1254→ // Replicated SEQ: SEQ i = start FOR count becomes a for loop
1255→ v := goIdent(seq.Replicator.Variable)
1256→ g.builder.WriteString(strings.Repeat("\t", g.indent))
1257→ g.write(fmt.Sprintf("for %s := ", v))
1258→ g.generateExpression(seq.Replicator.Start)
1259→ g.write(fmt.Sprintf("; %s < ", v))
1260→ g.generateExpression(seq.Replicator.Start)
1261→ g.write(" + ")
1262→ g.generateExpression(seq.Replicator.Count)
1263→ g.write(fmt.Sprintf("; %s++ {\n", v))
1264→ g.indent++
1265→ }
1266→ for _, stmt := range seq.Statements {
1267→ g.generateStatement(stmt)
1268→ }
1269→ g.indent--
1270→ g.writeLine("}")
1271→ } else {
1272→ // SEQ just becomes sequential Go code (Go's default)
1273→ for _, stmt := range seq.Statements {
1274→ g.generateStatement(stmt)
1275→ }
1276→ }
1277→}
1278→
1279→func (g *Generator) generateParBlock(par *ast.ParBlock) {
1280→ if par.Replicator != nil {
1281→ // Replicated PAR: PAR i = start FOR count becomes goroutines in a loop
1282→ g.writeLine("var wg sync.WaitGroup")
1283→ g.builder.WriteString(strings.Repeat("\t", g.indent))
1284→ g.write("wg.Add(int(")
1285→ g.generateExpression(par.Replicator.Count)
1286→ g.write("))\n")
1287→
1288→ v := goIdent(par.Replicator.Variable)
1289→ if par.Replicator.Step != nil {
1290→ counter := "_repl_" + v
1291→ g.builder.WriteString(strings.Repeat("\t", g.indent))
1292→ g.write(fmt.Sprintf("for %s := 0; %s < ", counter, counter))
1293→ g.generateExpression(par.Replicator.Count)
1294→ g.write(fmt.Sprintf("; %s++ {\n", counter))
1295→ g.indent++
1296→ // Compute loop variable from counter — also serves as closure capture
1297→ g.builder.WriteString(strings.Repeat("\t", g.indent))
1298→ g.write(fmt.Sprintf("%s := ", v))
1299→ g.generateExpression(par.Replicator.Start)
1300→ g.write(fmt.Sprintf(" + %s * ", counter))
1301→ g.generateExpression(par.Replicator.Step)
1302→ g.write("\n")
1303→ } else {
1304→ g.builder.WriteString(strings.Repeat("\t", g.indent))
1305→ g.write(fmt.Sprintf("for %s := ", v))
1306→ g.generateExpression(par.Replicator.Start)
1307→ g.write(fmt.Sprintf("; %s < ", v))
1308→ g.generateExpression(par.Replicator.Start)
1309→ g.write(" + ")
1310→ g.generateExpression(par.Replicator.Count)
1311→ g.write(fmt.Sprintf("; %s++ {\n", v))
1312→ g.indent++
1313→ // Capture loop variable to avoid closure issues
1314→ g.writeLine(fmt.Sprintf("%s := %s", v, v))
1315→ }
1316→ g.writeLine("go func() {")
1317→ g.indent++
1318→ g.writeLine("defer wg.Done()")
1319→ for _, stmt := range par.Statements {
1320→ g.generateStatement(stmt)
1321→ }
1322→ g.indent--
1323→ g.writeLine("}()")
1324→
1325→ g.indent--
1326→ g.writeLine("}")
1327→ g.writeLine("wg.Wait()")
1328→ } else {
1329→ // PAR becomes goroutines with WaitGroup
1330→ g.writeLine("var wg sync.WaitGroup")
1331→ g.writeLine(fmt.Sprintf("wg.Add(%d)", len(par.Statements)))
1332→
1333→ for _, stmt := range par.Statements {
1334→ g.writeLine("go func() {")
1335→ g.indent++
1336→ g.writeLine("defer wg.Done()")
1337→ g.generateStatement(stmt)
1338→ g.indent--
1339→ g.writeLine("}()")
1340→ }
1341→
1342→ g.writeLine("wg.Wait()")
1343→ }
1344→}
1345→
1346→func (g *Generator) generateAltBlock(alt *ast.AltBlock) {
1347→ // ALT becomes Go select statement
1348→ // For guards, we use a pattern with nil channels
1349→
1350→ // Check if any cases have guards
1351→ hasGuards := false
1352→ for _, c := range alt.Cases {
1353→ if c.Guard != nil {
1354→ hasGuards = true
1355→ break
1356→ }
1357→ }
1358→
1359→ if hasGuards {
1360→ // Generate channel variables for guarded cases
1361→ for i, c := range alt.Cases {
1362→ if c.Guard != nil {
1363→ g.builder.WriteString(strings.Repeat("\t", g.indent))
1364→ g.write(fmt.Sprintf("var _alt%d chan ", i))
1365→ // We don't know the channel type here, so use interface{}
1366→ // Actually, we should use the same type as the original channel
1367→ // For now, let's just reference the original channel conditionally
1368→ g.write(fmt.Sprintf("int = nil\n")) // Assuming int for now
1369→ g.builder.WriteString(strings.Repeat("\t", g.indent))
1370→ g.write(fmt.Sprintf("if "))
1371→ g.generateExpression(c.Guard)
1372→ g.write(fmt.Sprintf(" { _alt%d = %s }\n", i, goIdent(c.Channel)))
1373→ }
1374→ }
1375→ }
1376→
1377→ g.writeLine("select {")
1378→ for i, c := range alt.Cases {
1379→ g.builder.WriteString(strings.Repeat("\t", g.indent))
1380→ if c.IsTimer {
1381→ g.write("case <-time.After(time.Duration(")
1382→ g.generateExpression(c.Deadline)
1383→ g.write(" - int(time.Now().UnixMicro())) * time.Microsecond):\n")
1384→ } else if c.Guard != nil {
1385→ g.write(fmt.Sprintf("case %s = <-_alt%d:\n", goIdent(c.Variable), i))
1386→ } else if c.ChannelIndex != nil {
1387→ g.write(fmt.Sprintf("case %s = <-%s[", goIdent(c.Variable), goIdent(c.Channel)))
1388→ g.generateExpression(c.ChannelIndex)
1389→ g.write("]:\n")
1390→ } else {
1391→ g.write(fmt.Sprintf("case %s = <-%s:\n", goIdent(c.Variable), goIdent(c.Channel)))
1392→ }
1393→ g.indent++
1394→ for _, s := range c.Body {
1395→ g.generateStatement(s)
1396→ }
1397→ g.indent--
1398→ }
1399→ g.writeLine("}")
1400→}
1401→
1402→func (g *Generator) generateProcDecl(proc *ast.ProcDecl) {
1403→ // Track reference parameters for this procedure
1404→ oldRefParams := g.refParams
1405→ newRefParams := make(map[string]bool)
1406→ // Inherit parent's ref params for closure captures when nested
1407→ if g.nestingLevel > 0 {
1408→ for k, v := range oldRefParams {
1409→ newRefParams[k] = v
1410→ }
1411→ }
1412→ for _, p := range proc.Params {
1413→ if !p.IsVal && !p.IsChan && !p.IsChanArray && !p.IsOpenArray {
1414→ newRefParams[p.Name] = true
1415→ } else {
1416→ // Own param shadows any inherited ref param with same name
1417→ delete(newRefParams, p.Name)
1418→ }
1419→ // Register chan params with protocol mappings
1420→ if p.IsChan || p.IsChanArray {
1421→ if _, ok := g.protocolDefs[p.ChanElemType]; ok {
1422→ g.chanProtocols[p.Name] = p.ChanElemType
1423→ }
1424→ }
1425→ // Register record-typed params
1426→ if !p.IsChan {
1427→ if _, ok := g.recordDefs[p.Type]; ok {
1428→ g.recordVars[p.Name] = p.Type
1429→ }
1430→ }
1431→ }
1432→ g.refParams = newRefParams
1433→
1434→ // Generate function signature
1435→ params := g.generateProcParams(proc.Params)
1436→ gName := goIdent(proc.Name)
1437→ if g.nestingLevel > 0 {
1438→ // Nested PROC: generate as Go closure
1439→ g.writeLine(fmt.Sprintf("%s := func(%s) {", gName, params))
1440→ } else {
1441→ g.writeLine(fmt.Sprintf("func %s(%s) {", gName, params))
1442→ }
1443→ g.indent++
1444→ g.nestingLevel++
1445→
1446→ for _, stmt := range proc.Body {
1447→ g.generateStatement(stmt)
1448→ }
1449→
1450→ g.nestingLevel--
1451→ g.indent--
1452→ g.writeLine("}")
1453→ g.writeLine("")
1454→
1455→ // Restore previous context
1456→ g.refParams = oldRefParams
1457→}
1458→
1459→func (g *Generator) generateProcParams(params []ast.ProcParam) string {
1460→ var parts []string
1461→ for _, p := range params {
1462→ var goType string
1463→ if p.IsChanArray {
1464→ goType = "[]" + chanDirPrefix(p.ChanDir) + g.occamTypeToGo(p.ChanElemType)
1465→ } else if p.IsChan {
1466→ goType = chanDirPrefix(p.ChanDir) + g.occamTypeToGo(p.ChanElemType)
1467→ } else if p.IsOpenArray {
1468→ goType = "[]" + g.occamTypeToGo(p.Type)
1469→ } else if p.ArraySize != "" {
1470→ // Fixed-size array parameter: [n]TYPE
1471→ goType = "[" + p.ArraySize + "]" + g.occamTypeToGo(p.Type)
1472→ if !p.IsVal {
1473→ goType = "*" + goType
1474→ }
1475→ } else {
1476→ goType = g.occamTypeToGo(p.Type)
1477→ if !p.IsVal {
1478→ // Non-VAL parameters are pass by reference in Occam
1479→ goType = "*" + goType
1480→ }
1481→ }
1482→ parts = append(parts, fmt.Sprintf("%s %s", goIdent(p.Name), goType))
1483→ }
1484→ return strings.Join(parts, ", ")
1485→}
1486→
1487→func chanDirPrefix(dir string) string {
1488→ switch dir {
1489→ case "?":
1490→ return "<-chan " // input/receive-only
1491→ case "!":
1492→ return "chan<- " // output/send-only
1493→ default:
1494→ return "chan " // bidirectional
1495→ }
1496→}
1497→
1498→func (g *Generator) generateProcCall(call *ast.ProcCall) {
1499→ // Handle built-in print procedures
1500→ if printBuiltins[call.Name] {
1501→ g.generatePrintCall(call)
1502→ return
1503→ }
1504→
1505→ g.builder.WriteString(strings.Repeat("\t", g.indent))
1506→ g.write(goIdent(call.Name))
1507→ g.write("(")
1508→
1509→ // Look up procedure signature to determine which args need address-of
1510→ params := g.procSigs[call.Name]
1511→
1512→ for i, arg := range call.Args {
1513→ if i > 0 {
1514→ g.write(", ")
1515→ }
1516→ // If this parameter is not VAL (i.e., pass by reference), take address
1517→ // Channels and channel arrays are already reference types, so no & needed
1518→ if i < len(params) && !params[i].IsVal && !params[i].IsChan && !params[i].IsChanArray && !params[i].IsOpenArray && params[i].ArraySize == "" {
1519→ g.write("&")
1520→ }
1521→ // Wrap string literals with []byte() when passed to []BYTE parameters
1522→ if _, isStr := arg.(*ast.StringLiteral); isStr && i < len(params) && params[i].IsOpenArray && params[i].Type == "BYTE" {
1523→ g.write("[]byte(")
1524→ g.generateExpression(arg)
1525→ g.write(")")
1526→ } else {
1527→ g.generateExpression(arg)
1528→ }
1529→ }
1530→ g.write(")")
1531→ g.write("\n")
1532→}
1533→
1534→func (g *Generator) generateFuncDecl(fn *ast.FuncDecl) {
1535→ params := g.generateProcParams(fn.Params)
1536→
1537→ // Build return type string
1538→ var returnTypeStr string
1539→ if len(fn.ReturnTypes) == 1 {
1540→ returnTypeStr = g.occamTypeToGo(fn.ReturnTypes[0])
1541→ } else {
1542→ goTypes := make([]string, len(fn.ReturnTypes))
1543→ for i, rt := range fn.ReturnTypes {
1544→ goTypes[i] = g.occamTypeToGo(rt)
1545→ }
1546→ returnTypeStr = "(" + strings.Join(goTypes, ", ") + ")"
1547→ }
1548→
1549→ gName := goIdent(fn.Name)
1550→ if g.nestingLevel > 0 {
1551→ // Nested FUNCTION: generate as Go closure
1552→ g.writeLine(fmt.Sprintf("%s := func(%s) %s {", gName, params, returnTypeStr))
1553→ } else {
1554→ g.writeLine(fmt.Sprintf("func %s(%s) %s {", gName, params, returnTypeStr))
1555→ }
1556→ g.indent++
1557→ g.nestingLevel++
1558→
1559→ for _, stmt := range fn.Body {
1560→ g.generateStatement(stmt)
1561→ }
1562→
1563→ if len(fn.ResultExprs) > 0 {
1564→ g.builder.WriteString(strings.Repeat("\t", g.indent))
1565→ g.write("return ")
1566→ for i, expr := range fn.ResultExprs {
1567→ if i > 0 {
1568→ g.write(", ")
1569→ }
1570→ g.generateExpression(expr)
1571→ }
1572→ g.write("\n")
1573→ }
1574→
1575→ g.nestingLevel--
1576→ g.indent--
1577→ g.writeLine("}")
1578→ g.writeLine("")
1579→}
1580→
1581→func (g *Generator) generateFuncCallExpr(call *ast.FuncCall) {
1582→ g.write(goIdent(call.Name))
1583→ g.write("(")
1584→ params := g.procSigs[call.Name]
1585→ for i, arg := range call.Args {
1586→ if i > 0 {
1587→ g.write(", ")
1588→ }
1589→ // Wrap string literals with []byte() when passed to []BYTE parameters
1590→ if _, isStr := arg.(*ast.StringLiteral); isStr && i < len(params) && params[i].IsOpenArray && params[i].Type == "BYTE" {
1591→ g.write("[]byte(")
1592→ g.generateExpression(arg)
1593→ g.write(")")
1594→ } else {
1595→ g.generateExpression(arg)
1596→ }
1597→ }
1598→ g.write(")")
1599→}
1600→
1601→func (g *Generator) generateMultiAssignment(stmt *ast.MultiAssignment) {
1602→ g.builder.WriteString(strings.Repeat("\t", g.indent))
1603→ for i, target := range stmt.Targets {
1604→ if i > 0 {
1605→ g.write(", ")
1606→ }
1607→ if target.Index != nil {
1608→ // Check if this is a record field access
1609→ if _, ok := g.recordVars[target.Name]; ok {
1610→ if ident, ok := target.Index.(*ast.Identifier); ok {
1611→ g.write(goIdent(target.Name))
1612→ g.write(".")
1613→ g.write(goIdent(ident.Value))
1614→ continue
1615→ }
1616→ }
1617→ if g.refParams[target.Name] {
1618→ g.write("(*")
1619→ g.write(goIdent(target.Name))
1620→ g.write(")")
1621→ } else {
1622→ g.write(goIdent(target.Name))
1623→ }
1624→ g.write("[")
1625→ g.generateExpression(target.Index)
1626→ g.write("]")
1627→ } else {
1628→ if g.refParams[target.Name] {
1629→ g.write("*")
1630→ }
1631→ g.write(goIdent(target.Name))
1632→ }
1633→ }
1634→ g.write(" = ")
1635→ for i, val := range stmt.Values {
1636→ if i > 0 {
1637→ g.write(", ")
1638→ }
1639→ g.generateExpression(val)
1640→ }
1641→ g.write("\n")
1642→}
1643→
1644→func (g *Generator) generatePrintCall(call *ast.ProcCall) {
1645→ g.builder.WriteString(strings.Repeat("\t", g.indent))
1646→
1647→ switch call.Name {
1648→ case "print.int", "print.string", "print.bool":
1649→ g.write("fmt.Println(")
1650→ if len(call.Args) > 0 {
1651→ g.generateExpression(call.Args[0])
1652→ }
1653→ g.write(")")
1654→ case "print.newline":
1655→ g.write("fmt.Println()")
1656→ }
1657→
1658→ g.write("\n")
1659→}
1660→
1661→func (g *Generator) generateWhileLoop(loop *ast.WhileLoop) {
1662→ g.builder.WriteString(strings.Repeat("\t", g.indent))
1663→ g.write("for ")
1664→ g.generateExpression(loop.Condition)
1665→ g.write(" {\n")
1666→ g.indent++
1667→
1668→ for _, s := range loop.Body {
1669→ g.generateStatement(s)
1670→ }
1671→
1672→ g.indent--
1673→ g.writeLine("}")
1674→}
1675→
1676→func (g *Generator) generateIfStatement(stmt *ast.IfStatement) {
1677→ if stmt.Replicator != nil {
1678→ // Replicated IF: IF i = start FOR count → for loop with break on first match
1679→ g.generateReplicatedIfLoop(stmt, false)
1680→ } else {
1681→ // Flatten non-replicated nested IFs into the parent choice list
1682→ choices := g.flattenIfChoices(stmt.Choices)
1683→ g.generateIfChoiceChain(choices, true)
1684→ }
1685→}
1686→
1687→// flattenIfChoices inlines choices from non-replicated nested IFs into a flat list.
1688→// Replicated nested IFs are preserved as-is (they need special loop codegen).
1689→func (g *Generator) flattenIfChoices(choices []ast.IfChoice) []ast.IfChoice {
1690→ var flat []ast.IfChoice
1691→ for _, c := range choices {
1692→ if c.NestedIf != nil && c.NestedIf.Replicator == nil {
1693→ // Non-replicated nested IF: inline its choices recursively
1694→ flat = append(flat, g.flattenIfChoices(c.NestedIf.Choices)...)
1695→ } else {
1696→ flat = append(flat, c)
1697→ }
1698→ }
1699→ return flat
1700→}
1701→
1702→// generateReplicatedIfLoop emits a for loop that breaks on first matching choice.
1703→// When withinFlag is true, it sets the named flag to true before breaking.
1704→func (g *Generator) generateReplicatedIfLoop(stmt *ast.IfStatement, withinFlag bool, flagName ...string) {
1705→ repl := stmt.Replicator
1706→ v := goIdent(repl.Variable)
1707→ if repl.Step != nil {
1708→ counter := "_repl_" + v
1709→ g.builder.WriteString(strings.Repeat("\t", g.indent))
1710→ g.write(fmt.Sprintf("for %s := 0; %s < ", counter, counter))
1711→ g.generateExpression(repl.Count)
1712→ g.write(fmt.Sprintf("; %s++ {\n", counter))
1713→ g.indent++
1714→ g.builder.WriteString(strings.Repeat("\t", g.indent))
1715→ g.write(fmt.Sprintf("%s := ", v))
1716→ g.generateExpression(repl.Start)
1717→ g.write(fmt.Sprintf(" + %s * ", counter))
1718→ g.generateExpression(repl.Step)
1719→ g.write("\n")
1720→ } else {
1721→ g.builder.WriteString(strings.Repeat("\t", g.indent))
1722→ g.write(fmt.Sprintf("for %s := ", v))
1723→ g.generateExpression(repl.Start)
1724→ g.write(fmt.Sprintf("; %s < ", v))
1725→ g.generateExpression(repl.Start)
1726→ g.write(" + ")
1727→ g.generateExpression(repl.Count)
1728→ g.write(fmt.Sprintf("; %s++ {\n", v))
1729→ g.indent++
1730→ }
1731→
1732→ for i, choice := range stmt.Choices {
1733→ g.builder.WriteString(strings.Repeat("\t", g.indent))
1734→ if i == 0 {
1735→ g.write("if ")
1736→ } else {
1737→ g.write("} else if ")
1738→ }
1739→ g.generateExpression(choice.Condition)
1740→ g.write(" {\n")
1741→ g.indent++
1742→
1743→ for _, s := range choice.Body {
1744→ g.generateStatement(s)
1745→ }
1746→ if withinFlag && len(flagName) > 0 {
1747→ g.writeLine(fmt.Sprintf("%s = true", flagName[0]))
1748→ }
1749→ g.writeLine("break")
1750→
1751→ g.indent--
1752→ }
1753→ g.writeLine("}")
1754→
1755→ g.indent--
1756→ g.writeLine("}")
1757→}
1758→
1759→// generateIfChoiceChain emits a chain of if/else-if for the given choices.
1760→// When a replicated nested IF is encountered, it splits the chain and uses
1761→// a _ifmatched flag to determine whether remaining choices should be tried.
1762→func (g *Generator) generateIfChoiceChain(choices []ast.IfChoice, isFirst bool) {
1763→ // Find first replicated nested IF
1764→ replIdx := -1
1765→ for i, c := range choices {
1766→ if c.NestedIf != nil && c.NestedIf.Replicator != nil {
1767→ replIdx = i
1768→ break
1769→ }
1770→ }
1771→
1772→ if replIdx == -1 {
1773→ // No replicated nested IFs — simple if/else-if chain
1774→ for i, choice := range choices {
1775→ g.builder.WriteString(strings.Repeat("\t", g.indent))
1776→ if i == 0 && isFirst {
1777→ g.write("if ")
1778→ } else {
1779→ g.write("} else if ")
1780→ }
1781→ g.generateExpression(choice.Condition)
1782→ g.write(" {\n")
1783→ g.indent++
1784→
1785→ for _, s := range choice.Body {
1786→ g.generateStatement(s)
1787→ }
1788→
1789→ g.indent--
1790→ }
1791→ if len(choices) > 0 {
1792→ g.writeLine("}")
1793→ }
1794→ return
1795→ }
1796→
1797→ // Split at the replicated nested IF
1798→ before := choices[:replIdx]
1799→ replChoice := choices[replIdx]
1800→ after := choices[replIdx+1:]
1801→
1802→ // Emit choices before the replicated IF as a normal if-else chain
1803→ if len(before) > 0 {
1804→ for i, choice := range before {
1805→ g.builder.WriteString(strings.Repeat("\t", g.indent))
1806→ if i == 0 && isFirst {
1807→ g.write("if ")
1808→ } else {
1809→ g.write("} else if ")
1810→ }
1811→ g.generateExpression(choice.Condition)
1812→ g.write(" {\n")
1813→ g.indent++
1814→ for _, s := range choice.Body {
1815→ g.generateStatement(s)
1816→ }
1817→ g.indent--
1818→ }
1819→ // Open else block for the replicated IF + remaining choices
1820→ g.builder.WriteString(strings.Repeat("\t", g.indent))
1821→ g.write("} else {\n")
1822→ g.indent++
1823→ }
1824→
1825→ // Emit the replicated nested IF with a flag
1826→ needFlag := len(after) > 0
1827→ flagName := fmt.Sprintf("_ifmatched%d", g.tmpCounter)
1828→ g.tmpCounter++
1829→ if needFlag {
1830→ g.writeLine(fmt.Sprintf("%s := false", flagName))
1831→ }
1832→ g.generateReplicatedIfLoop(replChoice.NestedIf, needFlag, flagName)
1833→
1834→ // Emit remaining choices inside if !flagName (recursive for multiple)
1835→ if len(after) > 0 {
1836→ g.builder.WriteString(strings.Repeat("\t", g.indent))
1837→ g.write(fmt.Sprintf("if !%s {\n", flagName))
1838→ g.indent++
1839→ g.generateIfChoiceChain(after, true) // recursive for remaining
1840→ g.indent--
1841→ g.writeLine("}")
1842→ }
1843→
1844→ if len(before) > 0 {
1845→ g.indent--
1846→ g.writeLine("}")
1847→ }
1848→}
1849→
1850→func (g *Generator) generateCaseStatement(stmt *ast.CaseStatement) {
1851→ g.builder.WriteString(strings.Repeat("\t", g.indent))
1852→ g.write("switch ")
1853→ g.generateExpression(stmt.Selector)
1854→ g.write(" {\n")
1855→
1856→ for _, choice := range stmt.Choices {
1857→ if choice.IsElse {
1858→ g.writeLine("default:")
1859→ } else {
1860→ g.builder.WriteString(strings.Repeat("\t", g.indent))
1861→ g.write("case ")
1862→ for i, val := range choice.Values {
1863→ if i > 0 {
1864→ g.write(", ")
1865→ }
1866→ g.generateExpression(val)
1867→ }
1868→ g.write(":\n")
1869→ }
1870→ g.indent++
1871→ for _, s := range choice.Body {
1872→ g.generateStatement(s)
1873→ }
1874→ g.indent--
1875→ }
1876→
1877→ g.writeLine("}")
1878→}
1879→
1880→func (g *Generator) generateExpression(expr ast.Expression) {
1881→ switch e := expr.(type) {
1882→ case *ast.Identifier:
1883→ if g.refParams[e.Value] {
1884→ g.write("*" + goIdent(e.Value))
1885→ } else {
1886→ g.write(goIdent(e.Value))
1887→ }
1888→ case *ast.IntegerLiteral:
1889→ g.write(fmt.Sprintf("%d", e.Value))
1890→ case *ast.StringLiteral:
1891→ g.write(fmt.Sprintf("%q", e.Value))
1892→ case *ast.ByteLiteral:
1893→ g.write(fmt.Sprintf("byte(%d)", e.Value))
1894→ case *ast.BooleanLiteral:
1895→ if e.Value {
1896→ g.write("true")
1897→ } else {
1898→ g.write("false")
1899→ }
1900→ case *ast.BinaryExpr:
1901→ g.generateBinaryExpr(e)
1902→ case *ast.UnaryExpr:
1903→ g.generateUnaryExpr(e)
1904→ case *ast.SizeExpr:
1905→ g.write("len(")
1906→ g.generateExpression(e.Expr)
1907→ g.write(")")
1908→ case *ast.ParenExpr:
1909→ g.write("(")
1910→ g.generateExpression(e.Expr)
1911→ g.write(")")
1912→ case *ast.IndexExpr:
1913→ // Check if this is a record field access
1914→ if ident, ok := e.Left.(*ast.Identifier); ok {
1915→ if _, ok := g.recordVars[ident.Value]; ok {
1916→ if field, ok := e.Index.(*ast.Identifier); ok {
1917→ g.generateExpression(e.Left)
1918→ g.write(".")
1919→ g.write(goIdent(field.Value))
1920→ break
1921→ }
1922→ }
1923→ }
1924→ g.generateExpression(e.Left)
1925→ g.write("[")
1926→ g.generateExpression(e.Index)
1927→ g.write("]")
1928→ case *ast.SliceExpr:
1929→ g.generateExpression(e.Array)
1930→ g.write("[")
1931→ g.generateExpression(e.Start)
1932→ g.write(" : ")
1933→ g.generateExpression(e.Start)
1934→ g.write(" + ")
1935→ g.generateExpression(e.Length)
1936→ g.write("]")
1937→ case *ast.FuncCall:
1938→ g.generateFuncCallExpr(e)
1939→ case *ast.TypeConversion:
1940→ g.write(g.occamTypeToGo(e.TargetType))
1941→ g.write("(")
1942→ g.generateExpression(e.Expr)
1943→ g.write(")")
1944→ case *ast.MostExpr:
1945→ g.generateMostExpr(e)
1946→ }
1947→}
1948→
1949→func (g *Generator) generateBinaryExpr(expr *ast.BinaryExpr) {
1950→ g.write("(")
1951→ g.generateExpression(expr.Left)
1952→ g.write(" ")
1953→ g.write(g.occamOpToGo(expr.Operator))
1954→ g.write(" ")
1955→ g.generateExpression(expr.Right)
1956→ g.write(")")
1957→}
1958→
1959→func (g *Generator) generateUnaryExpr(expr *ast.UnaryExpr) {
1960→ op := g.occamOpToGo(expr.Operator)
1961→ g.write(op)
1962→ if op == "!" || op == "^" {
1963→ // Go's logical NOT and bitwise NOT don't need space
1964→ } else {
1965→ g.write(" ")
1966→ }
1967→ g.generateExpression(expr.Right)
1968→}
1969→
1970→func (g *Generator) occamOpToGo(op string) string {
1971→ switch op {
1972→ case "=":
1973→ return "=="
1974→ case "<>":
1975→ return "!="
1976→ case "AND":
1977→ return "&&"
1978→ case "OR":
1979→ return "||"
1980→ case "NOT":
1981→ return "!"
1982→ case "\\":
1983→ return "%"
1984→ case "AFTER":
1985→ return ">"
1986→ case "/\\":
1987→ return "&"
1988→ case "\\/":
1989→ return "|"
1990→ case "><":
1991→ return "^"
1992→ case "~":
1993→ return "^"
1994→ case "<<":
1995→ return "<<"
1996→ case ">>":
1997→ return ">>"
1998→ case "PLUS":
1999→ return "+"
2000→ case "MINUS":
2001→ return "-"
2002→ case "TIMES":
2003→ return "*"
2004→ default:
2005→ return op // +, -, *, /, <, >, <=, >= are the same
2006→ }
2007→}
2008→Now I have a thorough understanding of the codebase. Let me create the task list and start implementing step by step. This is a large implementation effort, so let me organize it well.
Let me now also look at the float_io.occ file to understand the actual patterns we need to handle, and look at the existing tests to understand the testing patterns.
Let me also look at the float_io.occ to understand the actual patterns, and the existing test files.
{
"file_path": "/home/david/projects/code-associates/occam2go/codegen/e2e_test.go",
"limit": 80
}head -100 /home/david/projects/code-associates/occam2go/kroc/modules/course/libsrc/float_io.occ 2>/dev/null || echo "FILE NOT FOUND"
./occam2go -I kroc/modules/course/libsrc -D TARGET.BITS.PER.WORD=32 -o /tmp/course_out.go kroc/modules/course/libsrc/course.module 2>&1 | head -30
{
"file_path": "/home/david/projects/code-associates/occam2go/kroc/modules/course/libsrc/float_io.occ",
"limit": 150
} 1→--
2→-- Type conversion library
3→-- Copyright (C) 1988, 1990 Inmos Limited
4→--
5→-- This library is free software; you can redistribute it and/or
6→-- modify it under the terms of the GNU Lesser General Public
7→-- License as published by the Free Software Foundation; either
8→-- version 2 of the License, or (at your option) any later version.
9→--
10→-- This library is distributed in the hope that it will be useful,
11→-- but WITHOUT ANY WARRANTY; without even the implied warranty of
12→-- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
13→-- Lesser General Public License for more details.
14→--
15→-- You should have received a copy of the GNU Lesser General Public
16→-- License along with this library; if not, write to the Free Software
17→-- Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
18→--
19→
20→--** @module course
21→
22→--{{{ #'s
23→-- #OPTION "E V" -- no compiler libraries, no vector space
24→#COMMENT "Type conversion library V1.6 4/12/90"
25→#COMMENT "(C) Copyright INMOS limited 1988, 1990"
26→--}}}
27→
28→--{{{ REAL32TOSTRING
29→--* Format a [@code REAL32] number as a decimal string.
30→--
31→-- [@code Ip] and [@code Dp] control how the number is formatted.
32→-- If [@code Ip > 0] and [@code Dp > 0], fixed-point format will be used with
33→-- [@code Ip] digits before the point and [@code Dp] digits after the point.
34→-- If [@code Ip = 0] and [@code Dp > 0], exponential format will be used with
35→-- [@code Dp] digits in the fraction.
36→-- If [@code Ip = 0] and [@code Dp = 0], a "reasonable" format will be chosen
37→-- automatically.
38→--
39→-- @param len The number of characters ([@code BYTE]s) of string occupied by
40→-- the formatted decimal representation of the real number
41→-- @param string An array containing the formatted decimal representation of
42→-- the real number in the first [@code len] bytes, the remaining bytes being
43→-- undefined
44→-- @param X The real number, in IEEE format, to be converted
45→-- @param Ip The first of two formatting values
46→-- @param Dp The second of two formatting values
47→PROC course.REAL32TOSTRING (RESULT INT len, RESULT []BYTE string, VAL REAL32 X, VAL INT Ip, Dp)
48→
49→ --{{{ specification
50→ --{{{ Notes
51→ -- Notes: Rounding mode is round to nearest.
52→ -- Which format is used depends on the combination of values
53→ -- of Ip, Dp and X. In all cases, any digits beyond the 9th
54→ -- significant digit for single precision or 17th significant
55→ -- digit for double precision will be given as 0 and cannot
56→ -- be considered accurate.
57→ -- If string overflows this routine acts as an invalid process.
58→ --{{{ Case i
59→ -- Case (i): Ip = 0, Dp = 0 => free format
60→ -- Where possible a fixed point representation is used. If
61→ -- it is not used then exponential form is used. It is not
62→ -- used if more than 9 | 17 significant digits of accuracy
63→ -- ( single | double ) would be required before the decimal
64→ -- point, or if there are more than 3 zeroes after the decimal
65→ -- point before the first significant digit. In any case, the
66→ -- maximum number of characters returned in string is 15 for a
67→ -- single precision X, and 24 for a double precision X.
68→ -- string is left justified.
69→ -- If X is infinity or a NaN, then the string will contain one
70→ -- of "Inf", "-Inf" or "NaN", but not the quotes.
71→ --}}}
72→ --{{{ Case ii
73→ -- Case (ii): Ip = 0, Dp > 0 => exponential format
74→ -- The form of exponential format is, firstly either a minus
75→ -- sign or a space ( this latter instead of an explicit plus
76→ -- sign ), a fraction in the form <digit>.<digits>, the
77→ -- exponential symbol ( E ), the sign of the exponent ( + or
78→ -- - ), then the exponent, which is two digits for a single
79→ -- precision X, three digits for a double precision X. Dp
80→ -- gives the number of digits in the fraction ( 1 before the
81→ -- point, and the others after, but not counting the point ).
82→ -- The total number of characters in string is ( Dp + 6 )
83→ -- for a single precision X, and ( Dp + 7 ) for a double
84→ -- precision X.
85→ -- If Dp = 1 then the fraction is of the form <space>digit
86→ -- ( which, note, will not result in occam syntax for the
87→ -- real ).
88→ -- If X is infinity or a NaN, then the string will contain one
89→ -- of " Inf", "-Inf" or " NaN", but not the quotes, padded on
90→ -- the right to fill the field width.
91→ --}}}
92→ --{{{ Case iii
93→ -- Case (iii): Ip > 0, Dp > 0 => fixed point if possible
94→ -- Ip gives the number of places before the point, not
95→ -- counting the sign place; Dp the number of places after
96→ -- the point. Padding spaces are added on the left as
97→ -- necessary. If the number will not fit the format,
98→ -- then an exponential format is used with the same field
99→ -- width as the fixed point representation would have had.
100→ -- If Ip and Dp are very small then an exponential
101→ -- representation may not fit in the field width so the
102→ -- special value "Ov" with a sign is returned.
103→ -- There are always ( Ip + Dp + 2 ) characters in string,
104→ -- the 2 being the decimal point and the sign ( - or space ).
105→ -- If X is infinity or a NaN, then the string will contain one
106→ -- of " Inf", "-Inf" or " NaN", but not the quotes, padded on
107→ -- the right to fill the field width.
108→ --}}}
109→ -- All other combinations of Ip and Dp are meaningless and will cause
110→ -- an error.
111→ --}}}
112→ --}}}
113→ --{{{ history
114→ -- V1.0, 1-Jul-88
115→ -- Added V to the OPTIONs; V1.1, 22-Mar-90 SRH
116→ -- Formatting completely revamped; bug report number TS/37; V1.2,
117→ -- 26-Apr-90 SRH
118→ -- Added more comments; 5-Oct-90 SRH
119→ -- Removed the test ( Xexp >= BitsPerWord ) from Round because
120→ -- redundant; V1.3, 11-Oct-90 SRH
121→ -- Altered to take account of rounding done to numbers almost half
122→ -- way between two other numbers - round.direction introduced; bug
123→ -- report number TS/789; V1.4, 1-Nov-90 SRH
124→ -- Removed a bit of redundant code ( that which used to display Un );
125→ -- V1.5, 15-Nov-90 SRH
126→ -- Changed the names of some variables in QuickLog to make code
127→ -- clearer; 28-Nov-90 SRH
128→ -- Added assignment to zero of Carry in ScaleX when overflow to
129→ -- infinity; this prevents possible addition of 1 to infinity to
130→ -- make a NaN; V1.6, 4-Dec-90 SRH
131→ --}}}
132→ --{{{ more information
133→ -- To help understand the binary to character conversion see "Contributions
134→ -- to a Proposed Standard for Binary Floating-Point Arithmetic" by J T Coonen,
135→ -- PhD thesis, 1984, University of California, Berkeley.
136→ --}}}
137→
138→ #IF (TARGET.BITS.PER.WORD = 32)
139→ VAL INT X RETYPES X :
140→
141→ --{{{ Definitions
142→ --{{{ Word length specific values for 32 bit machine
143→ VAL SignBit IS #80000000 :
144→ VAL BitsPerWord IS 32 :
145→ VAL MaxInt IS #FFFFFFFF :
146→ VAL BytesPerWord IS 4 :
147→ --}}}
148→ --{{{ IEEE Single length constants for 32 bit implementation
149→ VAL RealShift IS 8 :
150→ VAL BitsInFrac IS 24 :./occam2go -I kroc/modules/course/libsrc -D TARGET.BITS.PER.WORD=32 -o /tmp/course_out.go kroc/modules/course/libsrc/course.module 2>&1 | tail -5
line 3941: unexpected token: INT line 3941: unexpected token: ) line 3941: unexpected token: <= line 3941: unexpected token: ) line 3941: unexpected token: )
/home/david/projects/code-associates/occam2go/codegen/e2e_concurrency_test.go /home/david/projects/code-associates/occam2go/codegen/e2e_preproc_test.go /home/david/projects/code-associates/occam2go/codegen/e2e_helpers_test.go /home/david/projects/code-associates/occam2go/codegen/e2e_protocol_test.go /home/david/projects/code-associates/occam2go/codegen/e2e_record_test.go /home/david/projects/code-associates/occam2go/codegen/e2e_proc_func_test.go /home/david/projects/code-associates/occam2go/codegen/e2e_replicator_test.go /home/david/projects/code-associates/occam2go/codegen/e2e_array_test.go /home/david/projects/code-associates/occam2go/codegen/e2e_basic_test.go /home/david/projects/code-associates/occam2go/codegen/e2e_types_test.go /home/david/projects/code-associates/occam2go/codegen/codegen_test.go /home/david/projects/code-associates/occam2go/codegen/e2e_control_test.go
./occam2go -I kroc/modules/course/libsrc -D TARGET.BITS.PER.WORD=32 -o /tmp/course_out.go kroc/modules/course/libsrc/course.module 2>&1 | wc -l
{
"file_path": "/home/david/projects/code-associates/occam2go/codegen/e2e_helpers_test.go"
} 1→package codegen
2→
3→import (
4→ "os"
5→ "os/exec"
6→ "path/filepath"
7→ "testing"
8→
9→ "github.com/codeassociates/occam2go/lexer"
10→ "github.com/codeassociates/occam2go/parser"
11→ "github.com/codeassociates/occam2go/preproc"
12→)
13→
14→// transpileCompileRun takes Occam source, transpiles to Go, compiles, runs,
15→// and returns the stdout output
16→func transpileCompileRun(t *testing.T, occamSource string) string {
17→ t.Helper()
18→
19→ // Transpile
20→ l := lexer.New(occamSource)
21→ p := parser.New(l)
22→ program := p.ParseProgram()
23→
24→ if len(p.Errors()) > 0 {
25→ for _, err := range p.Errors() {
26→ t.Errorf("parser error: %s", err)
27→ }
28→ t.FailNow()
29→ }
30→
31→ gen := New()
32→ goCode := gen.Generate(program)
33→
34→ // Create temp directory for this test
35→ tmpDir, err := os.MkdirTemp("", "occam2go-test-*")
36→ if err != nil {
37→ t.Fatalf("failed to create temp dir: %v", err)
38→ }
39→ defer os.RemoveAll(tmpDir)
40→
41→ // Write Go source
42→ goFile := filepath.Join(tmpDir, "main.go")
43→ if err := os.WriteFile(goFile, []byte(goCode), 0644); err != nil {
44→ t.Fatalf("failed to write Go file: %v", err)
45→ }
46→
47→ // Compile
48→ binFile := filepath.Join(tmpDir, "main")
49→ compileCmd := exec.Command("go", "build", "-o", binFile, goFile)
50→ compileOutput, err := compileCmd.CombinedOutput()
51→ if err != nil {
52→ t.Fatalf("compilation failed: %v\nOutput: %s\nGo code:\n%s", err, compileOutput, goCode)
53→ }
54→
55→ // Run
56→ runCmd := exec.Command(binFile)
57→ output, err := runCmd.CombinedOutput()
58→ if err != nil {
59→ t.Fatalf("execution failed: %v\nOutput: %s", err, output)
60→ }
61→
62→ return string(output)
63→}
64→
65→// transpileCompileRunFromFile takes an occam file path, preprocesses it,
66→// then transpiles, compiles, and runs.
67→func transpileCompileRunFromFile(t *testing.T, mainFile string, includePaths []string) string {
68→ t.Helper()
69→
70→ pp := preproc.New(preproc.WithIncludePaths(includePaths))
71→ expanded, err := pp.ProcessFile(mainFile)
72→ if err != nil {
73→ t.Fatalf("preprocessor error: %v", err)
74→ }
75→ if len(pp.Errors()) > 0 {
76→ for _, e := range pp.Errors() {
77→ t.Errorf("preprocessor warning: %s", e)
78→ }
79→ }
80→
81→ return transpileCompileRun(t, expanded)
82→}
83→grep -n "RETYPES" /home/david/projects/code-associates/occam2go/kroc/modules/course/libsrc/float_io.occ
grep -n "CAUSEERROR" /home/david/projects/code-associates/occam2go/kroc/modules/course/libsrc/float_io.occ
grep -n "LONGPROD\|LONGDIV\|LONGSUM\|LONGDIFF\|NORMALISE\|SHIFTRIGHT\|SHIFTLEFT" /home/david/projects/code-associates/occam2go/kroc/modules/course/libsrc/float_io.occ | head -30
242: Zfrac, Carry := LONGPROD (PowerTenFrac[St], 244: Places, Zfrac, Carry := NORMALISE (Zfrac, Carry) 269: Xfrac, Carry := LONGDIV (Xfrac >> 1, 0, Zfrac) 272: -- set. This means that the smallest that Xfrac as answer to the LONGDIV 278: --Borrow, dummy := LONGDIFF (Carry, Zfrac >> 1, 0) --see if over half left 291: Carry, remainder := LONGDIV( Carry, 0, Zfrac ) 300: Xfrac, Carry := LONGPROD (Xfrac, Zfrac, 0) 304: Places, Xfrac, Carry := NORMALISE (Xfrac, Carry) 316: Carry, Xfrac := LONGSUM (Xfrac, 1, 0) 343: dummy, X := SHIFTRIGHT (Xexp, Xfrac << 1, BitsPerWord / 2) 348: X, low.word := LONGPROD (-X, R2L + 1, 0) 349: Borrow, dummy := LONGDIFF (0, low.word, 0) 350: Borrow, LogX := LONGDIFF (0, X, Borrow) 352: LogX, dummy := LONGPROD (X, R2L, 0) --mul by Log10(2) 379: Xint, Guard := SHIFTRIGHT (Xfrac, 0, Places) 409: ExpExtra, Xfrac := SHIFTLEFT (0, Xint, Places) 441: N, T := LONGDIV (0, N, 10) --effects N := N / 10 ; T := N REM 10 601: Xexp, Xfrac := SHIFTLEFT (0, X /\ (~SignBit), RealShift + 1) 700: Places, Xfrac, Carry := NORMALISE (Xfrac, 0) 738: Carry, M := LONGDIFF (Xfrac, PowerTenFrac[N], 0) 1207: Carry, D[0] := LONGPROD ( X[0], Y[0], 0) 1208: Carry, D[1] := LONGPROD ( X[0], Y[1], Carry) 1209: D[3], D[2] := LONGPROD ( X[1], Y[0], 0) 1210: R.hi, R.lo := LONGPROD ( X[1], Y[1], Carry) 1211: Carry, Guard := LONGSUM ( D[2], D[1], 0) 1212: Carry, R.lo := LONGSUM ( R.lo, D[3], Carry) 1213: Carry, R.hi := LONGSUM ( R.hi, 0, Carry) 1279: Quot, Carry := LONGDIV ( Hi, Lo, Yfrac[1] ) 1283: W[1], W[0] := LONGPROD ( Quot, Yfrac[0], 0 ) 1284: W[2], W[1] := LONGPROD ( Quot, Yfrac[1], W[1] )
Let me check the key patterns more carefully — the untyped VAL and array literal patterns.
{
"file_path": "/home/david/projects/code-associates/occam2go/kroc/modules/course/libsrc/float_io.occ",
"offset": 138,
"limit": 80
} 138→ #IF (TARGET.BITS.PER.WORD = 32)
139→ VAL INT X RETYPES X :
140→
141→ --{{{ Definitions
142→ --{{{ Word length specific values for 32 bit machine
143→ VAL SignBit IS #80000000 :
144→ VAL BitsPerWord IS 32 :
145→ VAL MaxInt IS #FFFFFFFF :
146→ VAL BytesPerWord IS 4 :
147→ --}}}
148→ --{{{ IEEE Single length constants for 32 bit implementation
149→ VAL RealShift IS 8 :
150→ VAL BitsInFrac IS 24 :
151→ VAL RealExpSh IS 23 :
152→ VAL RealExp IS 255 :
153→ VAL RealXcess IS 127 :
154→ VAL RealRBit IS 128 :
155→ VAL RealInf IS #7F800000 :
156→ --}}}
157→ --{{{ read/write constants
158→ -- write number constants
159→ VAL Table10 IS 13 : --exact power of ten in word
160→ VAL Bias10 IS 3 : --extras for full power ten range
161→ VAL MaxDecN IS 9 : --maximum dec digits; IEEE sec. 5.6
162→ VAL R2L IS #4D10 : --Log10(2) = 0.3010299956639811952 in low half word
163→
164→ -- The table PowerTenFrac contains the fraction with implicit bit
165→ -- explicit and then normalized so that there is a one bit in bit 31
166→ -- of powers of 10 from 10^0 to 10^13. For example,
167→ -- 10^3 = 1111101000 in binary, which is 1.111101 * 2^9; so the fraction
168→ -- bits, with implicit bit explicit ( ie the 1 before the point kept ) is
169→ -- 1111101, and the actual exponent is 9. This latter ( the actual
170→ -- exponent ) is what is in the corresponding table PowerTenExp.
171→ -- The index of an entry is the same as the power of 10; hence
172→ -- PowerTenFrac[ 3 ] = #FA000000, and PowerTenExp[ 3 ] = 9.
173→ -- No bits are lost in the PowerTenFrac table; even the largest value,
174→ -- 10^13, is exact.
175→ -- These two tables of constants are simply set up in advance for
176→ -- efficiency.
177→
178→ VAL PowerTenFrac IS [#80000000, #A0000000, #C8000000, #FA000000,
179→ #9C400000, #C3500000, #F4240000, #98968000,
180→ #BEBC2000, #EE6B2800, #9502F900, #BA43B740,
181→ #E8D4A510, #9184E72A] :
182→
183→ VAL PowerTenExp IS [0, 3, 6, 9, 13, 16, 19, 23, 26, 29, 33, 36, 39, 43] :
184→
185→ VAL BiasTenFrac IS [#9184E72A, #CECB8F28, #EB194F8E] : --P13, P27, P40
186→
187→ VAL BiasTenExp IS [43, 89, 132] :
188→
189→ VAL BiasTen IS [14, 28, 41, 54] : --bias changeover points
190→ --}}}
191→ --{{{ rounding constants
192→ VAL INT rounded.down IS -1 :
193→ VAL INT exact IS 0 :
194→ VAL INT rounded.up IS 1 :
195→ --}}}
196→ --}}}
197→ --{{{ ScaleX
198→ PROC ScaleX (INT Xexp, Xfrac, VAL INT Scale, INT direction)
199→
200→ -- Scale X by value of 10^Scale
201→
202→ INT Carry, Places, Sb, St, Zexp, Zfrac :
203→ SEQ
204→ --{{{ get abs value of Scale
205→ IF
206→ Scale < 0
207→ St := -Scale
208→ TRUE
209→ St := Scale
210→ --}}}
211→ --{{{ set Zexp and Zfrac
212→ IF
213→ St <= Table10
214→ --{{{ easy case, 10 in table
215→ SEQ
216→ Zexp := PowerTenExp[St]
217→ Zfrac := PowerTenFrac[St]{
"file_path": "/home/david/projects/code-associates/occam2go/kroc/modules/course/libsrc/float_io.occ",
"offset": 145,
"limit": 70
} 145→ VAL MaxInt IS #FFFFFFFF :
146→ VAL BytesPerWord IS 4 :
147→ --}}}
148→ --{{{ IEEE Single length constants for 32 bit implementation
149→ VAL RealShift IS 8 :
150→ VAL BitsInFrac IS 24 :
151→ VAL RealExpSh IS 23 :
152→ VAL RealExp IS 255 :
153→ VAL RealXcess IS 127 :
154→ VAL RealRBit IS 128 :
155→ VAL RealInf IS #7F800000 :
156→ --}}}
157→ --{{{ read/write constants
158→ -- write number constants
159→ VAL Table10 IS 13 : --exact power of ten in word
160→ VAL Bias10 IS 3 : --extras for full power ten range
161→ VAL MaxDecN IS 9 : --maximum dec digits; IEEE sec. 5.6
162→ VAL R2L IS #4D10 : --Log10(2) = 0.3010299956639811952 in low half word
163→
164→ -- The table PowerTenFrac contains the fraction with implicit bit
165→ -- explicit and then normalized so that there is a one bit in bit 31
166→ -- of powers of 10 from 10^0 to 10^13. For example,
167→ -- 10^3 = 1111101000 in binary, which is 1.111101 * 2^9; so the fraction
168→ -- bits, with implicit bit explicit ( ie the 1 before the point kept ) is
169→ -- 1111101, and the actual exponent is 9. This latter ( the actual
170→ -- exponent ) is what is in the corresponding table PowerTenExp.
171→ -- The index of an entry is the same as the power of 10; hence
172→ -- PowerTenFrac[ 3 ] = #FA000000, and PowerTenExp[ 3 ] = 9.
173→ -- No bits are lost in the PowerTenFrac table; even the largest value,
174→ -- 10^13, is exact.
175→ -- These two tables of constants are simply set up in advance for
176→ -- efficiency.
177→
178→ VAL PowerTenFrac IS [#80000000, #A0000000, #C8000000, #FA000000,
179→ #9C400000, #C3500000, #F4240000, #98968000,
180→ #BEBC2000, #EE6B2800, #9502F900, #BA43B740,
181→ #E8D4A510, #9184E72A] :
182→
183→ VAL PowerTenExp IS [0, 3, 6, 9, 13, 16, 19, 23, 26, 29, 33, 36, 39, 43] :
184→
185→ VAL BiasTenFrac IS [#9184E72A, #CECB8F28, #EB194F8E] : --P13, P27, P40
186→
187→ VAL BiasTenExp IS [43, 89, 132] :
188→
189→ VAL BiasTen IS [14, 28, 41, 54] : --bias changeover points
190→ --}}}
191→ --{{{ rounding constants
192→ VAL INT rounded.down IS -1 :
193→ VAL INT exact IS 0 :
194→ VAL INT rounded.up IS 1 :
195→ --}}}
196→ --}}}
197→ --{{{ ScaleX
198→ PROC ScaleX (INT Xexp, Xfrac, VAL INT Scale, INT direction)
199→
200→ -- Scale X by value of 10^Scale
201→
202→ INT Carry, Places, Sb, St, Zexp, Zfrac :
203→ SEQ
204→ --{{{ get abs value of Scale
205→ IF
206→ Scale < 0
207→ St := -Scale
208→ TRUE
209→ St := Scale
210→ --}}}
211→ --{{{ set Zexp and Zfrac
212→ IF
213→ St <= Table10
214→ --{{{ easy case, 10 in table{
"file_path": "/home/david/projects/code-associates/occam2go/kroc/modules/course/libsrc/float_io.occ",
"offset": 238,
"limit": 70
} 238→ --{{{ valid power
239→ SEQ
240→ St := (St - BiasTen[Sb]) + 1
241→ Zexp := (PowerTenExp[St] + BiasTenExp[Sb]) + 1
242→ Zfrac, Carry := LONGPROD (PowerTenFrac[St],
243→ BiasTenFrac[Sb], 0)
244→ Places, Zfrac, Carry := NORMALISE (Zfrac, Carry)
245→ Zexp := Zexp - Places
246→ --}}}
247→ --}}}
248→ --}}}
249→ --{{{ set Xexp, Xfrac and Carry
250→ IF
251→ Zexp < 0
252→ --{{{ out of range
253→ SEQ
254→ IF
255→ Scale < 0 --set to big number
256→ Xexp := -RealExp
257→ TRUE
258→ Xexp := RealExp
259→ Xfrac := SignBit --prevent normalisation
260→ Carry := 0
261→ --}}}
262→ Scale < 0
263→ --{{{ scale down
264→ SEQ
265→ Xexp := Xexp - Zexp
266→ -- least significant bit of Xfrac is not set, so Xfrac >> 1 does not loose
267→ -- information
268→ #PRAGMA DEFINED Zfrac
269→ Xfrac, Carry := LONGDIV (Xfrac >> 1, 0, Zfrac)
270→ -- Zfrac always has top bit set; Xfrac >> 1 always has top bit clear and,
271→ -- because it is passed as a normalised number, has the second top bit
272→ -- set. This means that the smallest that Xfrac as answer to the LONGDIV
273→ -- can be is #40000000, and so when it comes to normalising it later, only
274→ -- the top bit of Carry will matter.
275→
276→ -- original code to assign Carry:
277→ --INT dummy :
278→ --Borrow, dummy := LONGDIFF (Carry, Zfrac >> 1, 0) --see if over half left
279→ --IF
280→ -- Borrow <> 0 --less than half
281→ -- Carry := 0
282→ -- TRUE --note sticky bit
283→ -- Carry := SignBit
284→
285→ --{{{ obtain more accurate Carry - matters in assignment of direction later
286→ IF
287→ Carry = 0
288→ SKIP --divided exactly
289→ TRUE
290→ INT remainder :
291→ Carry, remainder := LONGDIV( Carry, 0, Zfrac )
292→ -- Carry is not zero now if this branch of the IF taken
293→ --}}}
294→ --}}}
295→ TRUE
296→ --{{{ scale up
297→ SEQ
298→ Xexp := (Xexp + Zexp) + 1
299→ #PRAGMA DEFINED Zfrac
300→ Xfrac, Carry := LONGPROD (Xfrac, Zfrac, 0)
301→ --}}}
302→ --}}}
303→ --{{{ normalise
304→ Places, Xfrac, Carry := NORMALISE (Xfrac, Carry)
305→ Xexp := Xexp - Places
306→ --}}}
307→ --{{{ round and set direction{
"file_path": "/home/david/projects/code-associates/occam2go/kroc/modules/course/libsrc/float_io.occ",
"offset": 1098,
"limit": 30
} 1098→ --{{{ more information
1099→ -- To help understand the binary to character conversion see "Contributions
1100→ -- to a Proposed Standard for Binary Floating-Point Arithmetic" by J T Coonen,
1101→ -- PhD thesis, 1984, University of California, Berkeley.
1102→ --}}}
1103→
1104→ #IF (TARGET.BITS.PER.WORD = 32)
1105→ VAL [2]INT X RETYPES X :
1106→
1107→ --{{{ Definitions
1108→ --{{{ Word length specific values for 32 bit machine
1109→ VAL SignBit IS #80000000 :
1110→ VAL BitsPerWord IS 32 :
1111→ VAL MaxUnsignedInt IS #FFFFFFFF :
1112→ VAL BytesPerWord IS 4 :
1113→ --}}}
1114→ --{{{ IEEE Double length constants for 32 bit implementation
1115→ VAL DRealShift IS 11 :
1116→ VAL DBitsInFrac IS 53 :
1117→ VAL DRealExpSh IS 20 :
1118→ VAL DRealExp IS #7FF :
1119→ VAL DRealXcess IS 1023 :
1120→ VAL DRealRBit IS #400 :
1121→ VAL DRealInf IS #7FF00000 :
1122→ --}}}
1123→ --{{{ read/write constants
1124→ -- write number constants
1125→ VAL Table10 IS 27 : --exact power of ten in word
1126→ VAL Bias10 IS 4 : --extras for full power ten range
1127→ VAL MaxDecN IS 17 : --maximum dec digits; IEEE 754-1985, Sec 5.6Now I have a thorough understanding of the patterns. Let me look at some multi-line expression patterns.
grep -n 'AND$\|OR$' /home/david/projects/code-associates/occam2go/kroc/modules/course/libsrc/float_io.occ | head -20
522: round.up := ( direction = exact ) AND 523: ( ( Buf[digits - 1] = '1' ) OR 524: ( Buf[digits - 1] = '3' ) OR 525: ( Buf[digits - 1] = '5' ) OR 526: ( Buf[digits - 1] = '7' ) OR 754: ( (Ip <> 0) AND (LogX < Ip) ) OR 755: ((Dp = 0) AND 756: (((-max.no.of.zeroes.before.first.significant.digit) - 1) <= LogX) AND 779: ( (Ip <> 0) AND (LogX.1 < Ip) ) OR 780: ((Dp = 0) AND 781: (((-max.no.of.zeroes.before.first.significant.digit) - 1) <= LogX.1) AND 1710: round.up := ( direction = exact ) AND 1711: ( ( Buf[digits - 1] = '1' ) OR 1712: ( Buf[digits - 1] = '3' ) OR 1713: ( Buf[digits - 1] = '5' ) OR 1714: ( Buf[digits - 1] = '7' ) OR 1943: (Xexp < PowerTenExp[N - 1]) OR 1960: ( (Ip <> 0) AND (LogX < Ip) ) OR 1961: ( (Dp = 0) AND (LogX < N) AND 1981: ( (Ip <> 0) AND (LogX.1 < Ip) ) OR
{
"file_path": "/home/david/projects/code-associates/occam2go/kroc/modules/course/libsrc/float_io.occ",
"offset": 520,
"limit": 15
} 520→ round.up := TRUE
521→ TRUE
522→ round.up := ( direction = exact ) AND
523→ ( ( Buf[digits - 1] = '1' ) OR
524→ ( Buf[digits - 1] = '3' ) OR
525→ ( Buf[digits - 1] = '5' ) OR
526→ ( Buf[digits - 1] = '7' ) OR
527→ ( Buf[digits - 1] = '9' ) )
528→ TRUE
529→ round.up := Buf[digits] >= '5'
530→ --}}}
531→ IF
532→ round.up
533→ --{{{ round up
534→ SEQNow I understand all the patterns. Let me create the implementation plan with tasks and start working through them step by step.
Key patterns to handle:
1. Bracket NEWLINE suppression - [ and ] need to track parenDepth like ( and ) so multi-line array literals work
2. RETYPES keyword - new token
3. ArrayLiteral and RetypesDecl AST nodes
4. Untyped VAL abbreviations - VAL SignBit IS #80000000 : (no type keyword)
5. Array literal expressions - [#80000000, #A0000000, ...]
VAL INT X RETYPES X : and VAL [2]INT X RETYPES X :expr AND\n expr patternspanic("CAUSEERROR")Now I'll start implementing these in order.
{
"subject": "Lexer: bracket NEWLINE suppression + RETYPES keyword",
"activeForm": "Implementing lexer changes"
}{
"subject": "AST: ArrayLiteral + RetypesDecl nodes",
"activeForm": "Adding AST node types"
}{
"subject": "Parser: untyped VAL, array literals, RETYPES, multi-line expressions",
"activeForm": "Implementing parser changes"
}Task #3 created successfully: Parser: untyped VAL, array literals, RETYPES, multi-line expressions
{
"subject": "Codegen: all new constructs + intrinsics + CAUSEERROR",
"activeForm": "Implementing codegen changes"
}{
"subject": "Tests + integration verification",
"activeForm": "Writing tests and verifying integration"
}case '[': tok = l.newToken(LBRACKET, l.ch) case ']': tok = l.newToken(RBRACKET, l.ch)
case '[':
l.parenDepth++
tok = l.newToken(LBRACKET, l.ch)
case ']':
if l.parenDepth > 0 {
l.parenDepth--
}
tok = l.newToken(RBRACKET, l.ch)The file /home/david/projects/code-associates/occam2go/lexer/lexer.go has been updated successfully.
PLUS_KW // PLUS (modular addition keyword, distinct from + symbol) MINUS_KW // MINUS (modular subtraction keyword, distinct from - symbol) TIMES // TIMES (modular multiplication keyword) keyword_end
RETYPES // RETYPES (bit-level type reinterpretation) PLUS_KW // PLUS (modular addition keyword, distinct from + symbol) MINUS_KW // MINUS (modular subtraction keyword, distinct from - symbol) TIMES // TIMES (modular multiplication keyword) keyword_end
The file /home/david/projects/code-associates/occam2go/lexer/token.go has been updated successfully.
INITIAL: "INITIAL", PLUS_KW: "PLUS",
INITIAL: "INITIAL", RETYPES: "RETYPES", PLUS_KW: "PLUS",
The file /home/david/projects/code-associates/occam2go/lexer/token.go has been updated successfully.
"INITIAL": INITIAL, "PLUS": PLUS_KW,
"INITIAL": INITIAL, "RETYPES": RETYPES, "PLUS": PLUS_KW,
The file /home/david/projects/code-associates/occam2go/lexer/token.go has been updated successfully.
func (a *Abbreviation) statementNode() {}
func (a *Abbreviation) TokenLiteral() string { return a.Token.Literal }func (a *Abbreviation) statementNode() {}
func (a *Abbreviation) TokenLiteral() string { return a.Token.Literal }
// ArrayLiteral represents an array literal expression: [expr1, expr2, ...]
type ArrayLiteral struct {
Token lexer.Token // the [ token
Elements []Expression // the elements
}
func (al *ArrayLiteral) expressionNode() {}
func (al *ArrayLiteral) TokenLiteral() string { return al.Token.Literal }
// RetypesDecl represents a RETYPES declaration:
// VAL INT X RETYPES X : or VAL [2]INT X RETYPES X :
type RetypesDecl struct {
Token lexer.Token // the VAL token
IsVal bool // always true for now (VAL ... RETYPES ...)
TargetType string // "INT", "REAL32", etc.
IsArray bool // true for [n]TYPE
ArraySize Expression // array size when IsArray
Name string // target variable name
Source string // source variable name
}
func (r *RetypesDecl) statementNode() {}
func (r *RetypesDecl) TokenLiteral() string { return r.Token.Literal }The file /home/david/projects/code-associates/occam2go/ast/ast.go has been updated successfully.
First, let me modify parseAbbreviation() to handle untyped VAL abbreviations and RETYPES.
// parseAbbreviation parses a VAL abbreviation: VAL INT x IS expr:
// Also handles VAL []BYTE x IS "string": (open array abbreviation)
// Current token is VAL.
func (p *Parser) parseAbbreviation() *ast.Abbreviation {
token := p.curToken // VAL token
p.nextToken()
// Check for []TYPE (open array abbreviation)
isOpenArray := false
if p.curTokenIs(lexer.LBRACKET) && p.peekTokenIs(lexer.RBRACKET) {
isOpenArray = true
p.nextToken() // consume ]
p.nextToken() // move to type
}
// Expect a type keyword
if !isTypeToken(p.curToken.Type) {
p.addError(fmt.Sprintf("expected type after VAL, got %s", p.curToken.Type))
return nil
}
typeName := p.curToken.Literal
// Expect name
if !p.expectPeek(lexer.IDENT) {
return nil
}
name := p.curToken.Literal
// Expect IS
if !p.expectPeek(lexer.IS) {
return nil
}
// Parse expression
p.nextToken()
value := p.parseExpression(LOWEST)
// Expect colon
if !p.expectPeek(lexer.COLON) {
return nil
}
return &ast.Abbreviation{
Token: token,
IsVal: true,
IsOpenArray: isOpenArray,
Type: typeName,
Name: name,
Value: value,
}
}// parseAbbreviation parses VAL abbreviations:
// VAL INT x IS expr: (typed VAL abbreviation)
// VAL []BYTE x IS "string": (open array abbreviation)
// VAL x IS expr: (untyped VAL abbreviation)
// VAL INT X RETYPES X : (RETYPES declaration)
// VAL [n]INT X RETYPES X : (array RETYPES declaration)
// Current token is VAL.
func (p *Parser) parseAbbreviation() ast.Statement {
token := p.curToken // VAL token
p.nextToken()
// Check for []TYPE (open array abbreviation)
isOpenArray := false
if p.curTokenIs(lexer.LBRACKET) && p.peekTokenIs(lexer.RBRACKET) {
isOpenArray = true
p.nextToken() // consume ]
p.nextToken() // move to type
}
// Check for [n]TYPE (fixed-size array, used in RETYPES)
isArray := false
var arraySize ast.Expression
if !isOpenArray && p.curTokenIs(lexer.LBRACKET) {
// Could be [n]TYPE name RETYPES ...
isArray = true
p.nextToken() // move past [
arraySize = p.parseExpression(LOWEST)
if !p.expectPeek(lexer.RBRACKET) {
return nil
}
p.nextToken() // move to type
}
// Check for untyped VAL abbreviation: VAL name IS expr :
// Detect: curToken is IDENT and peekToken is IS (no type keyword)
if !isOpenArray && !isArray && p.curTokenIs(lexer.IDENT) && p.peekTokenIs(lexer.IS) {
name := p.curToken.Literal
p.nextToken() // consume IS
p.nextToken() // move to expression
value := p.parseExpression(LOWEST)
if !p.expectPeek(lexer.COLON) {
return nil
}
return &ast.Abbreviation{
Token: token,
IsVal: true,
Type: "",
Name: name,
Value: value,
}
}
// Expect a type keyword
if !isTypeToken(p.curToken.Type) {
p.addError(fmt.Sprintf("expected type after VAL, got %s", p.curToken.Type))
return nil
}
typeName := p.curToken.Literal
// Expect name
if !p.expectPeek(lexer.IDENT) {
return nil
}
name := p.curToken.Literal
// Check for RETYPES (instead of IS)
if p.peekTokenIs(lexer.RETYPES) {
p.nextToken() // consume RETYPES
if !p.expectPeek(lexer.IDENT) {
return nil
}
source := p.curToken.Literal
if !p.expectPeek(lexer.COLON) {
return nil
}
return &ast.RetypesDecl{
Token: token,
IsVal: true,
TargetType: typeName,
IsArray: isArray,
ArraySize: arraySize,
Name: name,
Source: source,
}
}
// Expect IS
if !p.expectPeek(lexer.IS) {
return nil
}
// Parse expression
p.nextToken()
value := p.parseExpression(LOWEST)
// Expect colon
if !p.expectPeek(lexer.COLON) {
return nil
}
return &ast.Abbreviation{
Token: token,
IsVal: true,
IsOpenArray: isOpenArray,
Type: typeName,
Name: name,
Value: value,
}
}The file /home/david/projects/code-associates/occam2go/parser/parser.go has been updated successfully.
Now update parseStatement() to handle the fact that parseAbbreviation() now returns ast.Statement (was *ast.Abbreviation):
The call site is already return p.parseAbbreviation() which returns an ast.Statement now, so the parseStatement switch is fine. But let me verify the return type change doesn't break anything:
case lexer.LBRACKET:
// Slice expression: [arr FROM start FOR length] or [arr FOR length]
lbracket := p.curToken
p.nextToken() // move past [
arrayExpr := p.parseExpression(LOWEST)
var startExpr ast.Expression
if p.peekTokenIs(lexer.FOR) {
// [arr FOR length] shorthand — start is 0
startExpr = &ast.IntegerLiteral{Token: lexer.Token{Type: lexer.INT, Literal: "0"}, Value: 0}
} else {
if !p.expectPeek(lexer.FROM) {
return nil
}
p.nextToken() // move past FROM
startExpr = p.parseExpression(LOWEST)
}
if !p.expectPeek(lexer.FOR) {
return nil
}
p.nextToken() // move past FOR
lengthExpr := p.parseExpression(LOWEST)
if !p.expectPeek(lexer.RBRACKET) {
return nil
}
left = &ast.SliceExpr{
Token: lbracket,
Array: arrayExpr,
Start: startExpr,
Length: lengthExpr,
} case lexer.LBRACKET:
// Could be: [arr FROM start FOR length], [arr FOR length], or [expr, expr, ...] array literal
lbracket := p.curToken
p.nextToken() // move past [
firstExpr := p.parseExpression(LOWEST)
if p.peekTokenIs(lexer.COMMA) {
// Array literal: [expr, expr, ...]
elements := []ast.Expression{firstExpr}
for p.peekTokenIs(lexer.COMMA) {
p.nextToken() // consume comma
p.nextToken() // move to next element
elements = append(elements, p.parseExpression(LOWEST))
}
if !p.expectPeek(lexer.RBRACKET) {
return nil
}
left = &ast.ArrayLiteral{
Token: lbracket,
Elements: elements,
}
} else if p.peekTokenIs(lexer.RBRACKET) {
// Single-element array literal: [expr]
p.nextToken() // consume ]
left = &ast.ArrayLiteral{
Token: lbracket,
Elements: []ast.Expression{firstExpr},
}
} else {
// Slice expression: [arr FROM start FOR length] or [arr FOR length]
var startExpr ast.Expression
if p.peekTokenIs(lexer.FOR) {
// [arr FOR length] shorthand — start is 0
startExpr = &ast.IntegerLiteral{Token: lexer.Token{Type: lexer.INT, Literal: "0"}, Value: 0}
} else {
if !p.expectPeek(lexer.FROM) {
return nil
}
p.nextToken() // move past FROM
startExpr = p.parseExpression(LOWEST)
}
if !p.expectPeek(lexer.FOR) {
return nil
}
p.nextToken() // move past FOR
lengthExpr := p.parseExpression(LOWEST)
if !p.expectPeek(lexer.RBRACKET) {
return nil
}
left = &ast.SliceExpr{
Token: lbracket,
Array: firstExpr,
Start: startExpr,
Length: lengthExpr,
}
}The file /home/david/projects/code-associates/occam2go/parser/parser.go has been updated successfully.
Now add multi-line expression continuation in parseBinaryExpr(). The key pattern is: after consuming the operator, if the next token is NEWLINE, skip NEWLINE/INDENT tokens and parse the RHS, then consume matching DEDENTs.
func (p *Parser) parseBinaryExpr(left ast.Expression) ast.Expression {
expr := &ast.BinaryExpr{
Token: p.curToken,
Left: left,
Operator: p.curToken.Literal,
}
prec := p.curPrecedence()
p.nextToken()
expr.Right = p.parseExpression(prec)
return expr
}func (p *Parser) parseBinaryExpr(left ast.Expression) ast.Expression {
expr := &ast.BinaryExpr{
Token: p.curToken,
Left: left,
Operator: p.curToken.Literal,
}
prec := p.curPrecedence()
p.nextToken()
// Handle multi-line expression continuation: operator at end of line,
// expression continues on next indented line(s).
indentCount := 0
for p.curTokenIs(lexer.NEWLINE) || p.curTokenIs(lexer.INDENT) {
if p.curTokenIs(lexer.INDENT) {
indentCount++
}
p.nextToken()
}
expr.Right = p.parseExpression(prec)
// Consume matching NEWLINE+DEDENT pairs for the indents we consumed
for indentCount > 0 {
if p.peekTokenIs(lexer.NEWLINE) {
p.nextToken()
}
if p.peekTokenIs(lexer.DEDENT) {
p.nextToken()
indentCount--
} else {
break
}
}
return expr
}The file /home/david/projects/code-associates/occam2go/parser/parser.go has been updated successfully.
Now I also need to update the infix expression loop to continue past NEWLINEs when there's a pending operator. The issue is the infix loop currently stops at NEWLINE:
for !p.peekTokenIs(lexer.NEWLINE) && !p.peekTokenIs(lexer.EOF) && precedence < p.peekPrecedence() {
This is correct for the normal case — the NEWLINE stops parsing. But the parseBinaryExpr already handles the case where the operator is at end-of-line and the RHS is on the next line. The real question is: in the expression (direction = exact) AND\n (...), the AND is at end of line, and the parser reads the AND token. Then parseBinaryExpr gets called, which moves past AND to NEWLINE, sees the NEWLINE+INDENT, and skips them to get the RHS. That should work.
But wait — looking at the infix loop more carefully, if peekToken is NEWLINE, the loop stops. So when we have expr AND\n, the peek token after the infix operator AND would NOT be NEWLINE because AND hasn't been consumed yet. Let me trace through:
parseExpression() sees expr as leftpeekToken — which would be AND (not NEWLINE)parseBinaryExpr(left)parseBinaryExpr, curToken is AND, we do p.nextToken() — now curToken is NEWLINEBut there's a subtlety: after parseBinaryExpr returns, the infix loop continues. The issue is for chained operators like expr OR\n expr AND\n expr. After parsing the first OR+RHS, the peek might be NEWLINE. Let me check... Actually, the right-hand expression in parseBinaryExpr calls parseExpression(prec) which will parse the inner expression. The inner call to parseExpression will see expr AND\n expr and handle it recursively because AND has higher precedence.
Let me build and test this compiles.
Good, compiles clean. Let me test it quickly against the course module to see error reduction.
go build -o occam2go . && ./occam2go -I kroc/modules/course/libsrc -D TARGET.BITS.PER.WORD=32 -o /tmp/course_out.go kroc/modules/course/libsrc/course.module 2>&1 | wc -l
./occam2go -I kroc/modules/course/libsrc -D TARGET.BITS.PER.WORD=32 -o /tmp/course_out.go kroc/modules/course/libsrc/course.module 2>&1
Exit code 1 Parse errors: line 2717: unexpected token in expression: SEQ line 2742: unexpected token in expression: SEQ line 3299: unexpected token in expression: NEWLINE line 3304: unexpected token in expression: NEWLINE line 3922: unexpected token in expression: SEQ line 3943: unexpected token in expression: SEQ Parse errors: line 2717: unexpected token in expression: SEQ line 2742: unexpected token in expression: SEQ line 3299: unexpected token in expression: NEWLINE line 3304: unexpected token in expression: NEWLINE line 3922: unexpected token in expression: SEQ line 3943: unexpected token in expression: SEQ
Let me look at those error lines in float_io.occ. The "unexpected token in expression: SEQ" and "unexpected token in expression: NEWLINE" errors suggest there are some patterns I'm not handling correctly.
The errors are on lines 2717, 2742, 3299, 3304, 3922, 3943 of the preprocessed output. But they might not match the source file lines exactly due to preprocessing. Let me find the actual patterns.
Wait — the preprocessor expands the file, so the line numbers in the error messages are from the preprocessed output. Let me check what's at those lines. Actually, the preprocessed output is the full course.module expansion. Let me look at the relevant parts of float_io.occ.
I need to examine the preprocessed output directly to see what's actually on those error lines.
./occam2go -I kroc/modules/course/libsrc -D TARGET.BITS.PER.WORD=32 -o /tmp/course_out.go kroc/modules/course/libsrc/course.module 2>&1; head -1 /dev/null; # Let me look at the actual line numbers in the preprocessed file. Let me create a debug version. go run . -I kroc/modules/course/libsrc -D TARGET.BITS.PER.WORD=32 -o /tmp/course_out.go kroc/modules/course/libsrc/course.module 2>&1 | head -1
Parse errors: line 2717: unexpected token in expression: SEQ line 2742: unexpected token in expression: SEQ line 3299: unexpected token in expression: NEWLINE line 3304: unexpected token in expression: NEWLINE line 3922: unexpected token in expression: SEQ line 3943: unexpected token in expression: SEQ Parse errors:
{
"prompt": "I need to find the source lines that correspond to parse errors at lines 2717, 2742, 3299, 3304, 3922, 3943 in the preprocessed output of the course.module file.\n\nPlease do the following:\n1. Look at the main.go file to understand how the preprocessor is invoked\n2. Write a small Go program or use the existing tooling to preprocess the course.module file and extract lines around 2717, 2742, 3299, 3304, 3922, 3943 (±5 lines context each)\n\nThe command to preprocess is:\n```\n./occam2go -I kroc/modules/course/libsrc -D TARGET.BITS.PER.WORD=32 -o /tmp/course_out.go kroc/modules/course/libsrc/course.module\n```\n\nBut we need to see the preprocessed output. Read main.go to understand how to get the preprocessed text, then write a quick script.\n\nWorking directory: /home/david/projects/code-associates/occam2go",
"subagent_type": "Bash"
}[
{
"type": "text",
"text": "Now the picture is clear. Here is a summary of all 6 parse errors and what they correspond to in the preprocessed source:\n\n---\n\n**Lines 2717, 2742, 3922, 3943 -- \"unexpected token in expression: SEQ\"**\n\nThese are all the same pattern: an `IF` construct where a **comment line** (`--{{{`) appears between the boolean condition and its associated body (`SEQ`). The parser is treating the comment/fold-mark line as part of the condition expression and then encountering `SEQ` unexpectedly.\n\nFor example, at line 2717:\n\n```\n (((-max.no.of.zeroes.before.first.significant.digit) - 1) <= LogX) AND\n (LogX < N))\n --{{{ try to output in mmm.nnnn form <-- comment between condition and body\n SEQ <-- parser chokes here\n```\n\nThe occam `IF` construct expects:\n```\nIF\n condition\n body\n```\n\nBut these have a `--{{{` fold marker comment on the line between the condition and its indented body. The comment should be transparent to the parser, but it seems the parser is not handling this case -- the comment line is likely causing the parser to think the condition expression is complete, and then `SEQ` at the same indentation as the comment is unexpected because it's being parsed as a continuation of the expression context.\n\nAll four instances are in float formatting code (`float_io.occ`) with the same structure: multi-line boolean conditions in `IF` branches, a fold-marker comment, then `SEQ`.\n\n---\n\n**Lines 3299, 3304 -- \"unexpected token in expression: NEWLINE\"**\n\nThese are multi-assignment statements where the right-hand side is a function call that continues on the next line:\n\n```\n Zfrac[1], Zfrac[0], Guard :=\n RealIMul (Zfrac, [PowerTenFrac FROM (S + S) FOR 2])\n```\n\nand:\n\n```\n Zfrac[1], Zfrac[0], Guard :=\n RealIMul (Zfrac, [BiasTenFrac FROM (N + N) FOR 2])\n```\n\nThe `:=` is at the end of the line, and the RHS expression (`RealIMul(...)`) is on the **next line** at a deeper indentation. The parser encounters `NEWLINE` after `:=` when it expects an expression on the right-hand side. The parser doesn't currently support multi-line assignments where the RHS continues on the next indented line.\n\n---\n\n**Summary of root causes:**\n\n| Lines | Error | Root Cause |\n|-------|-------|------------|\n| 2717, 2742, 3922, 3943 | `unexpected token in expression: SEQ` | Comment (`--{{{`) between IF condition and body confuses the parser; the `SEQ` keyword is not expected in expression context |\n| 3299, 3304 | `unexpected token in expression: NEWLINE` | Multi-line assignment with `:=` at end of line and RHS on next indented line; parser doesn't consume NEWLINE+INDENT to find the continuation expression |"
},
{
"type": "text",
"text": "agentId: a3abc0d (for resuming to continue this agent's work if needed)\n<usage>total_tokens: 14546\ntool_uses: 4\nduration_ms: 706540</usage>"
}
]OK so there are two remaining issues:
--{{{ fold markers are comment lines. In occam, -- starts a comment that goes to end of line. The preprocessor should strip these, but let me check... Actually, the lexer handles -- comments by skipping them. But the issue is: in the IF statement parsing, the comment line between the condition and the body causes a NEWLINE. The condition expression ends at NEWLINE. Then the parser sees the comment line (which the lexer skips), then NEWLINE again, then SEQ. But the parser expects another IF condition or a DEDENT at this point.Looking at the actual error case more carefully, the multi-line condition ends with (LogX < N)) and then there's a comment at the same indentation level, followed by SEQ which should be the body. The issue is that the comment line is breaking the parser's expectation of what comes after the condition. Examining the lexer's blank line detection logic to understand how it handles comment-only lines and whether they're properly skipped before the indentation check for the IF body. I'm seeing the full condition now — it ends with a double closing parenthesis, which means there's an outer paren wrapping the entire AND expression. The real issue is figuring out where the condition actually ends and the body begins, especially with that fold comment sitting between them. I need to trace through the exact token sequence to understand how the parser should handle the transition from the closing paren to the INDENT that marks the start of the SEQ body. my multi-line continuation handling, the DEDENT balancing is wrong.
Let me trace through what's happening step by step:
parseExpression(), parsing the IF condition(((-max.no.of.zeroes.before.first.significant.digit) - 1) <= LogX) AND\n (LogX < N))parseBinaryExpr(LogX < N)) — but this has TWO closing parensThe double ) means the outer paren is part of a bigger expression. Looking at the actual context from the plan transcript, these are multi-line conditions in IF branches where the entire condition is wrapped in parentheses. Since ( and ) suppress NEWLINE in the lexer, multi-line continuation inside parens should work without special handling — the issue only arises when the operator is at the end of a line without parentheses wrapping the whole expression.
Now that I've added [ and ] to parenDepth in the lexer, I need to make sure that inside [...] brackets, the same NEWLINE suppression applies.
The real problem is what happens after the condition closes with ) and there's a NEWLINE, followed by a fold-mark comment on the next line, then SEQ on the line after that. The IF statement parser expects NEWLINE then INDENT for the body, but if a comment line sits between them at the same indent level, I need to trace through how the lexer handles this scenario. and SEQ is at deeper indent, then the indent for SEQ should produce an INDENT token. The comment line is treated as a blank line and skipped.
Hmm, but actually the flow is:
1. The parser finishes parsing the condition expression
2. The condition parsing left the token at the closing )
3. The parser's peekToken after ) is ... what?
Let me reconsider. The IF condition is wrapped in parens. If the whole condition is (something), then the paren depth handles the multi-line bits. The closing ) brings parenDepth back to 0. After ), the next characters are newlines and the comment line.
Since parenDepth is 0 now, the newline WILL produce a NEWLINE token. Then the comment line is blank, gets skipped. Then SEQ at the next indentation level produces INDENT.
So the sequence of tokens after the condition should be: NEWLINE, INDENT, SEQ.
In the IF parser, I'm skipping over NEWLINE tokens, checking for INDENT, consuming it, then moving to parse the body statements. This should handle the flow correctly—the parser finds SEQ as the start of the body.
But the error "unexpected token in expression: SEQ" means the parser is treating SEQ as part of an expression rather than a statement, which suggests the parser isn't reaching the body parsing logic at all.
I think the issue is that my parseBinaryExpr change is too aggressive with consuming NEWLINE+INDENT pairs after operators, and it's eating the NEWLINE and INDENT that should mark the start of the IF body block.
Wait, actually I'm reconsidering — the condition expression has outer parentheses around the whole thing, so parenDepth > 0 means the lexer is suppressing NEWLINEs anyway. The AND operators are nested inside those parens, so they shouldn't be producing any NEWLINE tokens at all. The lexer should be handling this correctly.
Let me look at the actual preprocessed output around those error line numbers to see what's really happening.
cat > /tmp/dump_preproc.go << 'EOF'
package main
import (
"fmt"
"os"
"strings"
"github.com/codeassociates/occam2go/preproc"
)
func main() {
pp := preproc.New(
preproc.WithIncludePaths([]string{"kroc/modules/course/libsrc"}),
preproc.WithDefines(map[string]string{"TARGET.BITS.PER.WORD": "32"}),
)
expanded, err := pp.ProcessFile("kroc/modules/course/libsrc/course.module")
if err != nil {
fmt.Fprintln(os.Stderr, err)
os.Exit(1)
}
lines := strings.Split(expanded, "\n")
// Print lines around errors: 2717, 2742, 3299, 3304, 3922, 3943
targets := []int{2717, 2742, 3299, 3304, 3922, 3943}
for _, t := range targets {
fmt.Printf("\n=== Around line %d ===\n", t)
for i := t - 6; i <= t + 4; i++ {
if i >= 0 && i < len(lines) {
fmt.Printf("%d: %s\n", i+1, lines[i])
}
}
}
}
EOF
go run /tmp/dump_preproc.go 2>&1=== Around line 2717 ===
2712: ( (Ip <> 0) AND (LogX < Ip) ) OR
2713: ((Dp = 0) AND
2714: (((-max.no.of.zeroes.before.first.significant.digit) - 1) <= LogX) AND
2715: (LogX < N))
2716: --{{{ try to output in mmm.nnnn form
2717: SEQ
2718: --{{{ possibly restrict
2719: M := (Dp + LogX) + 1 --current estimate at actual field width
2720: IF
2721: M < 0 --must ensure that M >= 0
2722: M := 0
=== Around line 2742 ===
2737: ( (Ip <> 0) AND (LogX.1 < Ip) ) OR
2738: ((Dp = 0) AND
2739: (((-max.no.of.zeroes.before.first.significant.digit) - 1) <= LogX.1) AND
2740: (LogX.1 < N))
2741: --{{{ output in mmm.nnnn form
2742: SEQ
2743: --{{{ set Scale
2744: IF
2745: LogX.1 < 0 --always 0decimal.point
2746: Scale := 1
2747: TRUE
=== Around line 3299 ===
3294: SEQ
3295: IF
3296: (N = 0) AND (S <= Table10) --final scale factor
3297: SEQ
3298: Zfrac[1], Zfrac[0], Guard :=
3299: RealIMul (Zfrac, [PowerTenFrac FROM (S + S) FOR 2])
3300: Zexp := (Zexp + PowerTenExp[S]) + 1
3301: TRUE
3302: SEQ
3303: Zfrac[1], Zfrac[0], Guard :=
3304: RealIMul (Zfrac, [BiasTenFrac FROM (N + N) FOR 2])
=== Around line 3304 ===
3299: RealIMul (Zfrac, [PowerTenFrac FROM (S + S) FOR 2])
3300: Zexp := (Zexp + PowerTenExp[S]) + 1
3301: TRUE
3302: SEQ
3303: Zfrac[1], Zfrac[0], Guard :=
3304: RealIMul (Zfrac, [BiasTenFrac FROM (N + N) FOR 2])
3305: S := S - BiasTen[N]
3306: Zexp := (Zexp + BiasTenExp[N]) + 1
3307: --{{{ Normalise and round Z
3308: Places, Zfrac[1], dummy := NORMALISE (Zfrac[1], Zfrac[0])
3309: Zfrac[0], Guard := SHIFTLEFT (Zfrac[0], Guard, Places)
=== Around line 3922 ===
3917: IF
3918: ( (Ip <> 0) AND (LogX < Ip) ) OR
3919: ( (Dp = 0) AND (LogX < N) AND
3920: (((-max.no.of.zeroes.before.first.significant.digit) - 1 ) <= LogX) )
3921: --{{{ try to output in mmm.nnnn form
3922: SEQ
3923: Y := (Dp + LogX) + 1 --current estimate at actual field width
3924: IF
3925: Y < 0 --must ensure that Y >= 0
3926: Y := 0
3927: TRUE
=== Around line 3943 ===
3938: IF
3939: ( (Ip <> 0) AND (LogX.1 < Ip) ) OR
3940: ( (Dp = 0) AND (LogX.1 < N) AND
3941: (((-max.no.of.zeroes.before.first.significant.digit) - 1 ) <= LogX) )
3942: --{{{ output in mmm.nnnn form
3943: SEQ
3944: --{{{ set Scale
3945: IF
3946: LogX.1 < 0 --always 0decimal.point
3947: Scale := 1
3948: TRUENow I see the issues clearly.
Issue 1: Lines 2717, 2742, 3922, 3943 - SEQ errors
These are IF conditions that span multiple lines. The condition expression is:
( (Ip <> 0) AND (LogX < Ip) ) OR
((Dp = 0) AND
(((-max.no.of.zeroes.before.first.significant.digit) - 1) <= LogX) AND
(LogX < N))
Note: the OR is at end of line 2712, followed by the continuation on line 2713-2715. The issue is the condition OR\n ((Dp = 0) AND...).
I need to trace through the parser logic — when the IF parser calls parseExpression(LOWEST) for the condition, it should handle the multi-line OR expression correctly through the binary expression parser's continuation logic, but something's going wrong with how the SEQ token is being recognized after the closing parenthesis. Advancing to the OR token and parsing it as a binary expression, where the operator has its own precedence level. After OR, the next token is a newline, so my code needs to skip over any newlines and indentation to find the right operand for the binary operation.
Once the right side is parsed, I'm consuming the matching dedentation tokens to properly close out the indented block. I need to trace through the lexer's paren depth tracking as it processes this multi-line condition. The expression starts with an opening paren and includes nested parens and brackets, so I'm checking how the depth changes through each token to understand what the peek token would be after the closing parens. The RHS is fully parenthesized with nested expressions for (Dp = 0) and the nested conditions, all balanced out to close at the final ), so newlines inside are suppressed. After parsing the RHS, the cursor lands on that closing paren and parenDepth returns to 0 since we're back at the outer IF level. Now I need to handle what comes after.
I'm looping through to consume any NEWLINE and DEDENT tokens that follow. Since parenDepth is 0, the peek shows a NEWLINE, so I consume it. The next line is a comment that the lexer treats as blank, and I need to figure out what happens when the lexer encounters it.
After skipping the blank comment line, the real next line is SEQ. I'm trying to determine whether SEQ is indented deeper than the IF body level or at the same level, which matters for understanding the parse structure.
Looking at the column positions: the IF starts at column 15, the condition at 17, and the continuation after OR at 19. The SEQ is also at column 19, which means it's at the same indentation as the OR continuation rather than deeper than the IF itself. This suggests the entire condition spans multiple lines at varying indentation levels, and then the body follows.
The parser is consuming the NEWLINE and INDENT tokens during the binary expression parsing for the OR operator, which means by the time it reaches the SEQ, those indent tokens have already been consumed. The lexer sees the SEQ at column 19, but the parser's state doesn't match what the indentation structure actually requires. Now I'm tracing through what happens after the closing parenthesis: the lexer produces a NEWLINE, then processes the comment line at column 19, which gets skipped since it's at the same indent level. The next real token should be SEQ on line 2717, with no DEDENT in between since the indent hasn't changed.
The problem is my code expects a DEDENT to match the INDENT it consumed earlier, but since SEQ is at the same indentation level, there's no DEDENT token. The loop breaks without decrementing indentCount, leaving the indent tracking off by one, and then control returns to the IF parser with the token stream in an inconsistent state. After parseBinaryExpr returns, curToken might have advanced past the closing parenthesis to a NEWLINE or even further, which throws off the token position for the rest of the parser. The DEDENT-consuming loop I wrote has a flaw: it only breaks when there's no DEDENT to consume, leaving the indent stack out of sync with what the IF parser expects when it looks for the body's INDENT token. SEQ is a keyword token, not an identifier. Since the peek token doesn't match DEDENT, I exit the loop with indentCount still at 1, meaning there's an unmatched INDENT. Back in parseExpression, the infix operator loop checks if SEQ has a precedence—it doesn't, so the loop exits and I return the expression.
Now I'm tracing through what happens when this expression returns to the IF parser. The binary expression becomes the left operand, and the loop checks the peek token again, which is still SEQ. Since SEQ has no precedence, the infix loop exits and parseExpression returns with curToken at NEWLINE. Back in the IF parser, it tries to skip newlines and find an INDENT for the body, but instead encounters SEQ as the current token, so it fails to parse a body and moves on to look for the next choice.
The real issue is that my multi-line continuation logic in parseBinaryExpr consumes the INDENT to reach the right-hand side, but then expects a matching DEDENT that doesn't come until after the body block ends—since the body is at the same indentation level as the continued expression. I think the solution is to stop trying to consume the DEDENT in parseBinaryExpr and instead just handle the NEWLINE+INDENT to get to the RHS, letting the caller manage the indentation balance. The IF parser already tracks its own indentation level separately, so this shouldn't break that logic.
But I'm realizing my whole approach of consuming INDENT and DEDENT in parseBinaryExpr is too rigid. Maybe instead of modifying parseBinaryExpr, I should handle this at the expression parsing level—either by skipping NEWLINE tokens in the infix loop when there's a valid operator ahead, or by checking the next non-NEWLINE token for operators before deciding whether to continue parsing. I need to think through how the original occam syntax actually handles these multi-line expressions.
In occam, indentation determines whether an expression continues, and our lexer produces INDENT/DEDENT tokens to reflect that. So when an expression spans multiple lines at deeper indentation, we get NEWLINE followed by INDENT. The key insight is that in the infix loop, when I encounter a NEWLINE, I should consume it and check if the next token is a valid operator—if it is, keep parsing; if not, put the NEWLINE back or handle it differently.
But actually, looking at the real code in float_io.occ, these multi-line expressions aren't wrapped in outer parentheses like I initially thought. The IF condition itself spans multiple lines with OR and AND operators continuing across line breaks, so the lexer's NEWLINE/INDENT tokens are definitely being produced and need to be handled properly in the expression parser. I'm realizing the core issue: when an OR operator appears at the end of a line with no closing parenthesis, the lexer produces a NEWLINE token, and my parser needs to handle that continuation. The problem is that when I consume an INDENT token to skip over the indentation on the next line, I'm incrementing the parser's indent level but not always finding a matching DEDENT to decrement it, which throws off the tracking.
The key insight is that I don't need to manually track and consume DEDENTs in the binary expression parser—the DEDENT tokens that come from the actual end of the indented block will naturally decrement the indent level back to where it should be. The IF statement parser already has logic to stop consuming DEDENTs once the indent level drops below the starting level, so if I just let the indent level track correctly, everything should align. I'm exploring whether skipping only NEWLINEs in the expression parser would work, but that would cause the expression to stop prematurely since INDENT has no precedence. I could modify the infix loop to peek ahead when encountering a NEWLINE to detect continuations, though that gets complicated. Another angle is tracking the INDENT more carefully in the binary expression parser to avoid mismatched dedents later.
Actually, looking at the real patterns in the code, these multi-line expressions only appear in IF conditions where the body sits at the same indentation level as the condition continuation—there's no extra INDENT for the body itself. So the parser needs to handle IF statements where the body doesn't require an additional indentation level relative to the condition.
Let me trace through the actual spacing: the IF keyword is at column 15, the first line of the condition is at column 17 (one INDENT deeper), and the subsequent condition lines are at column 19 (another INDENT deeper), but then the SEQ body appears at the same level as those continuation lines rather than going deeper still.
This suggests the body's indentation is being calculated relative to the IF statement itself, not relative to where the condition expression ends, which is how occam's grammar actually works—the condition gets one indent level, and the body gets another level from there. I need to trace through the parenthesis depth more carefully to understand when NEWLINE and INDENT tokens should be suppressed. On line 2712, the opening parenthesis increments the depth, and I need to track how it changes through the closing parentheses to determine whether the INDENT on line 2713 actually gets emitted or gets suppressed by the parser's paren-depth logic. Continuing through the closing parentheses and the DEDENT-consuming loop in my parser, I'm tracking how the indentation gets unwound after the binary expression finishes. The loop consumes any remaining NEWLINE and DEDENT tokens to properly handle the indentation level changes that occurred during the multi-line expression.
However, I'm noticing a problem: after consuming the NEWLINE following the closing paren, the next token is SEQ at the same indentation level as before, which means there's no DEDENT token to consume. The loop breaks without decrementing indentCount, leaving the parser's indentation level one step too high compared to the actual source indentation.
Back in the IF parser, I'm checking for a NEWLINE followed by an INDENT to parse the body. But since peekToken is SEQ (not NEWLINE or INDENT), the body parsing is skipped entirely, which is incorrect—the body should have been parsed.
The parser then loops back, advances past the NEWLINE token, and now curToken is SEQ. When it tries to parse SEQ as an expression, it fails because SEQ isn't a valid expression prefix, triggering an error.
The core issue is that after a multi-line expression continuation, the body keyword appears at the same indentation level as the continuation itself, so there's no INDENT token to signal the start of the body. I'm considering two approaches: either stop consuming INDENT in the binary expression parser and let it naturally terminate when it hits INDENT, which would break multi-line continuations, or have the parser consume INDENT but track it carefully to restore it if no matching DEDENT follows—though that's getting complicated.
A cleaner option is to handle multi-line continuations at the expression level rather than in binary operators, by peeking past NEWLINEs to check if an infix operator follows. But the simplest fix might be to make the IF parser itself handle the case where the body keyword is already at the right indentation without requiring an additional INDENT token, since that INDENT was already consumed during expression parsing. token IS the body statement (without an extra INDENT).
But wait, this would be fragile. It means after any multi-line expression, the parser might accidentally consume the next statement as part of the current scope.
Actually, let me reconsider Option A with a twist. What if instead of consuming INDENT in parseBinaryExpr, we just consume NEWLINEs? But INDENT is needed for the lexer to properly track indentation...
Hmm, actually there's a simpler approach. Let me look at how the problem manifests:
What if parseBinaryExpr, when it can't find a matching DEDENT, simply decrements indentLevel manually to account for the unmatched INDENT it consumed? The lexer has its own indent stack that will eventually produce the corresponding DEDENT token when indentation actually decreases, so the parser's tracking should stay in sync as long as I'm careful about when those DEDENTs arrive. I'm tracing through the execution and realizing this approach breaks down—the IF parser can't locate the body because the indentation state is wrong. I need to step back and reconsider how to handle multi-line expressions entirely, rather than trying to manage them within the binary expression parser.
The real issue is that parseExpression() stops at newlines, so it never continues parsing infix operators that appear on the next line. I could modify the infix loop to lookahead past newlines and indentation to check if a valid infix operator follows, but that requires consuming tokens without modifying parser state, which is tricky. Another option is to move this logic entirely into the lexer—have it suppress newlines and indentation tokens in certain contexts so the parser naturally continues across lines.
The cleanest approach might be to track the last token type in the lexer and suppress newlines that immediately follow binary operators, similar to how newlines are already suppressed inside parentheses. This way the parser sees a continuous stream of tokens and doesn't need special lookahead logic.
But I need to be careful here—suppressing the newline also means the indentation tracking gets disrupted, since INDENT and DEDENT tokens depend on the newline to signal where indentation changes occur. I should only suppress the newline itself while preserving the indentation logic.
Actually, the cleanest approach is to treat continuation lines after binary operators similarly to how lines inside parentheses are handled—skip both the NEWLINE token and the indent/dedent processing for that line, so the deeper indentation doesn't trigger an unwanted INDENT token. est way: when the previous token was a binary operator, increment parenDepth temporarily so that NEWLINE/INDENT/DEDENT are suppressed. Then decrement it when we see the next real token.
Actually, this is getting complicated. Let me go back to the parseBinaryExpr approach but fix the DEDENT issue.
Revised parseBinaryExpr approach: Instead of trying to consume DEDENT, just consume NEWLINE+INDENT, track the count, and manually adjust p.indentLevel back down after parsing the RHS. This way, the parser's indentLevel is correct, but the lexer's indent stack has the extra level. The DEDENT from the lexer will come later, and when it does, the parser will see it and handle it.
But wait, the DEDENT token will come later and decrement p.indentLevel below the correct value. That's a problem too. The real issue is that INDENT/DEDENT tokens are lexer artifacts that mess up the parser's indentation tracking when I try to consume them in the middle of an expression. Maybe I should just skip NEWLINE tokens and leave INDENT alone, but then the RHS won't parse correctly since INDENT doesn't match any expression pattern. I could try suppressing INDENT at the lexer level with a flag, but that's getting too convoluted.
Let me try the simplest approach: track the last real token type in the lexer and suppress NEWLINE/INDENT/DEDENT when the previous token was a binary operator or assignment operator. This way, the lexer handles the continuation logic, and the parser doesn't have to deal with INDENT/DEDENT in the middle of expressions.
Implementing this by tracking the last real token type and checking if it's a binary or assignment operator before emitting NEWLINE tokens. Writing continuation logic... Still writing the indent logic...
Actually, I realize the issue—once we process a continuation line and read the next real token, lastRealTokenType gets updated, so the suppression won't carry over incorrectly. The check in the indent processing section should work fine because it only suppresses when we're actively in a continuation context. to whatever that token is. So the next NEWLINE won't be suppressed (unless it too follows an operator).
This should work! But there's a subtle issue: after processing the continuation (suppressing NEWLINE and INDENT/DEDENT), the next line might ALSO be a continuation. For example:
expr AND
expr2 OR
expr3
In this case:
1. After AND, NEWLINE+INDENT suppressed → parse expr2
2. After expr2, lastRealTokenType is the closing ) or whatever
3. After OR, lastRealTokenType is OR
4. NEWLINE+INDENT suppressed → parse expr3
This works because each operator at end of line will trigger continuation. And when the expression ends (no more operators), the NEWLINE is NOT suppressed, and the normal indent/dedent processing resumes.
But here's the real problem: the indent stack in the lexer doesn't get updated while we're suppressing tokens during continuation. So when the expression finally ends and we return to the original indentation level, the lexer might generate spurious INDENT/DEDENT tokens or fail to generate the ones we actually need. For instance, if we suppress NEWLINE+INDENT across multiple lines and then hit a line at a different indentation, the stack won't reflect the intermediate indentation changes we skipped over. isContinuationOp(')')` is false, so NEWLINE IS produced. Then at line D (SEQ), indent is 18. The lexer's indent stack still has 16 at the top (since INDENT was suppressed for line B). So indent 18 > 16 → INDENT is produced.
Wait, but at what point was the indent stack at 16? Let's trace:
- Before line A: indent stack has some value, let's say 14 (from the IF body level) and 16 (from the IF choice level). Top of stack is 16.
- Line A: atLineStart, indent = 16, same as top of stack. No INDENT/DEDENT.
- After line A: \n → set atLineStart. isContinuationOp(OR) = true.
- Line B: atLineStart, indent = 18, but since lastRealTokenType = OR (continuation), INDENT is suppressed and the indent stack keeps 16 at top.
- Line C: still inside parens (parenDepth > 0 from (( on line B), so everything is suppressed until the matching ).
- After line C: \n → atLineStart, isContinuationOp(')') = false, parenDepth = 0, so NEWLINE is produced.
Now I need to look at how the newline handler actually works in the lexer. Handling newlines...
Continuing to trace the newline handler logic...
When we hit the newline after line C, lastRealTokenType is ) which isn't a continuation operator, so the NEWLINE token gets produced. On the next call, we're at the start of line D with indent 18, and since the indent stack still shows 16 (the continuation line's indent was suppressed), we generate an INDENT token. The parser then consumes the NEWLINE and checks for INDENT to continue parsing the block.
Now I'm thinking about how := works as a continuation operator for the assignment spanning lines 3298-3299.
When the parser encounters :=, it suppresses the NEWLINE token, so the next line's INDENT is also suppressed. This lets the expression parser see RealIMul(...) directly as the right-hand side of the assignment.
I'm looking at how the multiassignment parsing handles this — after consuming :=, it moves past the operator and then parses the expression on the following line.
If NEWLINE isn't suppressed after :=, the parser would pass NEWLINE to parseExpression, which tries to parse it as a prefix expression and fails. So := definitely needs to be marked as a continuation operator to handle multi-line patterns correctly.
Now I'm implementing the solution by tracking the last real token type in the lexer, checking for continuation operators when encountering NEWLINE, and suppressing it appropriately. The continuation operators include binary operators like AND, OR, arithmetic operators, comparison operators, and punctuation like COMMA and SEMICOLON that indicate the expression continues on the next line. I should be comprehensive and include all binary operators to handle continuation properly, since that's how the original occam parser would work anyway. I'll also need to handle the := assignment operator and potentially the SEND operator at end of line.
Now I'm thinking through the cases where NEWLINE should not trigger continuation — like when a statement genuinely ends with a value or closing paren, where there's no operator signaling that more is coming. The logic seems sound: if the last real token isn't a continuation operator, the NEWLINE gets produced normally, which correctly handles separate statements and expressions that are already complete.
For the implementation, I should mirror the existing pattern for parenthesis depth — adding the continuation check right alongside the parenDepth logic in both the newline handler and the indent processing at the top of NextToken. I need to be careful to reset lastRealTokenType appropriately after processing a continuation line so it doesn't interfere with subsequent tokens.
The challenge is that tokens get returned from many places throughout NextToken, so I'd need to track and update lastRealTokenType before each return statement to avoid missing updates. I'm settling on wrapping NextToken with a helper function that tracks the last token type, so I can cleanly separate the raw token generation from the tracking logic without cluttering the lexer with multiple return points. I'm realizing the wrapper approach actually handles this correctly—when nextTokenRaw recursively calls NextToken to suppress a newline, that recursive call updates lastTokenType to the actual token being returned, and then the outer call also updates it to that same token, so it ends up tracking the last real token before the next newline check. Let me trace through a concrete example to verify this works as expected. lastTokenType) → OR → true, so I skip indent processing and whitespace, then encounter the opening parenthesis and return it. After wrapping back through the recursive calls, lastTokenType becomes LPAREN, which is correct—the next newline inside the parentheses will be suppressed by the paren depth tracking anyway. Now let me trace through the := continuation case where the assignment operator is followed by a newline and indented content.
When I hit the newline after :=, the paren depth is zero and the continuation operator check returns true, so I suppress the newline and recurse. In the recursive call, I'm at the line start, I measure the indent, and since the last token was ASSIGN (a continuation operator), I suppress the indent token, skip whitespace, and process the identifier that follows. This correctly handles the continuation. Let me verify this works for multiple chained continuations as well—after each operator like AND, the newline and indent get suppressed, allowing the next expression to be parsed seamlessly across lines.
Now I'm thinking through what happens after the final expression in a chain like this. Once I've parsed expr3, the last token type is something like RPAREN or IDENT, and the newline is no longer suppressed. The key insight is that I never pushed the continuation indents onto the indent stack—I only suppressed them. So when I eventually reach a line at a different indentation level, the indent measurement compares against the base level that's still on the stack, and the DEDENT processing works correctly.
Looking at the example, when SEQ appears at indent 18 after the continuation lines, the lexer sees 18 is greater than the 16 that's still on top of the stack, so it produces an INDENT token and pushes 18. This gives the IF parser exactly the INDENT it needs for the body. The continuation lines never touched the stack, so the body indentation triggers the right token. I'm ready to implement this approach now.
Let me reconsider my approach for multi-line expressions. The parseBinaryExpr approach has issues with INDENT tracking. A better approach is to handle continuation at the lexer level, similar to paren suppression. Let me revert the parseBinaryExpr change and implement lexer-level continuation instead.
func (p *Parser) parseBinaryExpr(left ast.Expression) ast.Expression {
expr := &ast.BinaryExpr{
Token: p.curToken,
Left: left,
Operator: p.curToken.Literal,
}
prec := p.curPrecedence()
p.nextToken()
// Handle multi-line expression continuation: operator at end of line,
// expression continues on next indented line(s).
indentCount := 0
for p.curTokenIs(lexer.NEWLINE) || p.curTokenIs(lexer.INDENT) {
if p.curTokenIs(lexer.INDENT) {
indentCount++
}
p.nextToken()
}
expr.Right = p.parseExpression(prec)
// Consume matching NEWLINE+DEDENT pairs for the indents we consumed
for indentCount > 0 {
if p.peekTokenIs(lexer.NEWLINE) {
p.nextToken()
}
if p.peekTokenIs(lexer.DEDENT) {
p.nextToken()
indentCount--
} else {
break
}
}
return expr
}func (p *Parser) parseBinaryExpr(left ast.Expression) ast.Expression {
expr := &ast.BinaryExpr{
Token: p.curToken,
Left: left,
Operator: p.curToken.Literal,
}
prec := p.curPrecedence()
p.nextToken()
expr.Right = p.parseExpression(prec)
return expr
}The file /home/david/projects/code-associates/occam2go/parser/parser.go has been updated successfully.
Now implement lexer-level continuation. When the last real token is a binary operator or :=, suppress NEWLINE and INDENT/DEDENT on the continuation line.
type Lexer struct {
input string
position int // current position in input (points to current char)
readPosition int // current reading position (after current char)
ch byte // current char under examination
line int
column int
// Indentation tracking
indentStack []int // stack of indentation levels
pendingTokens []Token // tokens to emit before reading more input
atLineStart bool
// Parenthesis depth: suppress INDENT/DEDENT/NEWLINE inside (...)
parenDepth int
}type Lexer struct {
input string
position int // current position in input (points to current char)
readPosition int // current reading position (after current char)
ch byte // current char under examination
line int
column int
// Indentation tracking
indentStack []int // stack of indentation levels
pendingTokens []Token // tokens to emit before reading more input
atLineStart bool
// Parenthesis/bracket depth: suppress INDENT/DEDENT/NEWLINE inside (...) and [...]
parenDepth int
// Last real token type for continuation detection.
// When the last token is a binary operator or :=, NEWLINE and INDENT/DEDENT
// are suppressed on the next line (multi-line expression continuation).
lastTokenType TokenType
}The file /home/david/projects/code-associates/occam2go/lexer/lexer.go has been updated successfully.
Now I need to add the isContinuationOp function and modify NextToken to track lastTokenType and suppress NEWLINE/INDENT on continuation lines. Let me refactor NextToken to use a wrapper pattern.
func (l *Lexer) NextToken() Token {
// Return any pending tokens first (from indentation processing)
if len(l.pendingTokens) > 0 {
tok := l.pendingTokens[0]
l.pendingTokens = l.pendingTokens[1:]
return tok
}
// Handle indentation at the start of a line
if l.atLineStart {
l.atLineStart = false
indent := l.measureIndent()
currentIndent := l.indentStack[len(l.indentStack)-1]
if l.parenDepth > 0 {
// Inside parentheses: suppress INDENT/DEDENT tokens
// (don't modify indentStack — resume normal tracking after close paren)
} else if indent > currentIndent {func (l *Lexer) NextToken() Token {
tok := l.nextTokenInner()
// Track last real token type for continuation detection
if tok.Type != NEWLINE && tok.Type != INDENT && tok.Type != DEDENT && tok.Type != EOF {
l.lastTokenType = tok.Type
}
return tok
}
func (l *Lexer) nextTokenInner() Token {
// Return any pending tokens first (from indentation processing)
if len(l.pendingTokens) > 0 {
tok := l.pendingTokens[0]
l.pendingTokens = l.pendingTokens[1:]
return tok
}
// Handle indentation at the start of a line
if l.atLineStart {
l.atLineStart = false
indent := l.measureIndent()
currentIndent := l.indentStack[len(l.indentStack)-1]
if l.parenDepth > 0 || isContinuationOp(l.lastTokenType) {
// Inside parentheses/brackets or after a continuation operator:
// suppress INDENT/DEDENT tokens
// (don't modify indentStack — resume normal tracking after)
} else if indent > currentIndent {The file /home/david/projects/code-associates/occam2go/lexer/lexer.go has been updated successfully.
if l.parenDepth > 0 {
// Inside parentheses: suppress NEWLINE, get next real token
return l.NextToken()
} if l.parenDepth > 0 || isContinuationOp(l.lastTokenType) {
// Inside parentheses/brackets or after a continuation operator:
// suppress NEWLINE, get next real token
return l.NextToken()
}The file /home/david/projects/code-associates/occam2go/lexer/lexer.go has been updated successfully.
// Tokenize returns all tokens from the input
// isContinuationOp returns true if the given token type, when appearing at the
// end of a line, indicates that the expression continues on the next line.
// This causes NEWLINE and INDENT/DEDENT suppression on the continuation line.
func isContinuationOp(t TokenType) bool {
switch t {
case AND, OR,
PLUS, MINUS, MULTIPLY, DIVIDE, MODULO,
PLUS_KW, MINUS_KW, TIMES,
EQ, NEQ, LT, GT, LE, GE,
BITAND, BITOR, BITXOR, LSHIFT, RSHIFT,
ASSIGN, AFTER:
return true
}
return false
}
// Tokenize returns all tokens from the inputThe file /home/david/projects/code-associates/occam2go/lexer/lexer.go has been updated successfully.
go build -o occam2go . && ./occam2go -I kroc/modules/course/libsrc -D TARGET.BITS.PER.WORD=32 -o /tmp/course_out.go kroc/modules/course/libsrc/course.module 2>&1
Zero parse errors! Let me also run the existing test suite to make sure nothing is broken.
? github.com/codeassociates/occam2go [no test files] ? github.com/codeassociates/occam2go/ast [no test files] ok github.com/codeassociates/occam2go/codegen 10.399s ok github.com/codeassociates/occam2go/lexer 0.002s ok github.com/codeassociates/occam2go/modgen (cached) ok github.com/codeassociates/occam2go/parser 0.002s ok github.com/codeassociates/occam2go/preproc (cached)
Exit code 1 # command-line-arguments # [command-line-arguments] vet: /tmp/course_out.go:1234:14: expected ';', found ':=' (and 10 more errors) # command-line-arguments # [command-line-arguments] vet: /tmp/course_out.go:1234:14: expected ';', found ':=' (and 10 more errors)
1228→ RealInf := 2139095040
1229→ Table10 := 13
1230→ Bias10 := 3
1231→ MaxDecN := 9
1232→ R2L := 19728
1233→ PowerTenFrac :=
1234→ PowerTenExp :=
1235→ BiasTenFrac :=
1236→ BiasTenExp :=
1237→ BiasTen :=
1238→ rounded_down := - 1
1239→ exact := 0
1240→ rounded_up := 1
1241→ ScaleX := func(Xexp *int, Xfrac *int, Scale int, direction *int) {
1242→ var Carry, Places, Sb, St, Zexp, Zfrac int
1243→ _ = Carry
1244→ _ = Places
1245→ _ = Sb
1246→ _ = St
1247→ _ = ZexpThe array literal expressions are being generated as empty. The ArrayLiteral nodes aren't being handled in codegen. Also, there are RETYPES declarations that need codegen. And untyped VAL abbreviations too.
Let me now check what the array literals look like — they're PowerTenFrac := with nothing after. This means the codegen doesn't handle ArrayLiteral expressions at all (which makes sense — I haven't implemented the codegen yet).
Let me also check the RETYPES:
The array literals are empty and RETYPES nodes aren't generating code yet. Let me mark the parser task done and start on codegen.
Let me check what the RETYPES and CAUSEERROR look like in the generated code, and also look at the intrinsic function call patterns.
grep -n "RETYPES\|CAUSEERROR\|LONGPROD\|LONGDIV\|LONGSUM\|LONGDIFF\|NORMALISE\|SHIFTRIGHT\|SHIFTLEFT" /tmp/course_out.go | head -30
1279: Zfrac, Carry = LONGPROD(PowerTenFrac[St], BiasTenFrac[Sb], 0) 1280: Places, Zfrac, Carry = NORMALISE(Zfrac, Carry) 1294: *Xfrac, Carry = LONGDIV((*Xfrac >> 1), 0, Zfrac) 1300: Carry, remainder = LONGDIV(Carry, 0, Zfrac) 1304: *Xfrac, Carry = LONGPROD(*Xfrac, Zfrac, 0) 1306: Places, *Xfrac, Carry = NORMALISE(*Xfrac, Carry) 1314: Carry, *Xfrac = LONGSUM(*Xfrac, 1, 0) 1328: dummy, X = SHIFTRIGHT(Xexp, (Xfrac << 1), (BitsPerWord / 2)) 1333: X, low_word = LONGPROD(- X, (R2L + 1), 0) 1334: Borrow, dummy = LONGDIFF(0, low_word, 0) 1335: Borrow, *LogX = LONGDIFF(0, X, Borrow) 1337: *LogX, dummy = LONGPROD(X, R2L, 0) 1347: *Xint, Guard = SHIFTRIGHT(*Xfrac, 0, Places) 1370: ExpExtra, *Xfrac = SHIFTLEFT(0, *Xint, Places) 1383: N, T = LONGDIV(0, N, 10) 1502: Xexp, Xfrac = SHIFTLEFT(0, (X & ^SignBit), (RealShift + 1)) 1575: Places, Xfrac, Carry = NORMALISE(Xfrac, 0) 1597: Carry, M = LONGDIFF(Xfrac, PowerTenFrac[N], 0) 1814: Carry, D[0] = LONGPROD(X[0], Y[0], 0) 1815: Carry, D[1] = LONGPROD(X[0], Y[1], Carry) 1816: D[3], D[2] = LONGPROD(X[1], Y[0], 0) 1817: R_hi, R_lo = LONGPROD(X[1], Y[1], Carry) 1818: Carry, Guard = LONGSUM(D[2], D[1], 0) 1819: Carry, R_lo = LONGSUM(R_lo, D[3], Carry) 1820: Carry, R_hi = LONGSUM(R_hi, 0, Carry) 1841: Quot, Carry = LONGDIV(Hi, Lo, Yfrac[1]) 1844: W[1], W[0] = LONGPROD(Quot, Yfrac[0], 0) 1845: W[2], W[1] = LONGPROD(Quot, Yfrac[1], W[1]) 1846: Carry, Rem_Lo = LONGDIFF(Guard, W[0], 0) 1847: Carry, Rem_Hi = LONGDIFF(Lo, W[1], Carry)
Good — the intrinsics are being generated as-is (called as Go functions). I need to:
1. Emit them as _LONGPROD(...) etc. (prefixed with _)
2. Add the helper function implementations
3. Handle ArrayLiteral in expression generation
4. Handle RetypesDecl in statement generation
5. Handle CAUSEERROR() → panic("CAUSEERROR")
6. Handle untyped VAL abbreviations at package level
Let me also check the RETYPES lines.
grep -n "RETYPES\|CAUSEERROR" /tmp/course_out.go
RETYPES and CAUSEERROR aren't in the output at all. Let me check — the RetypesDecl nodes might just be silently skipped by codegen. Let me check what happens with them.
grep -c "LONGPROD\|LONGDIV\|NORMALISE\|SHIFTLEFT\|SHIFTRIGHT\|LONGSUM\|LONGDIFF" /tmp/course_out.go
Now let me implement all the codegen changes. I'll work through the codegen.go file systematically.
{
"file_path": "/home/david/projects/code-associates/occam2go/codegen/codegen.go",
"offset": 1,
"limit": 40
} 1→package codegen
2→
3→import (
4→ "fmt"
5→ "strings"
6→
7→ "github.com/codeassociates/occam2go/ast"
8→)
9→
10→// Generator converts an AST to Go code
11→type Generator struct {
12→ indent int
13→ builder strings.Builder
14→ needSync bool // track if we need sync package import
15→ needFmt bool // track if we need fmt package import
16→ needTime bool // track if we need time package import
17→ needOs bool // track if we need os package import
18→ needMath bool // track if we need math package import
19→
20→ // Track procedure signatures for proper pointer handling
21→ procSigs map[string][]ast.ProcParam
22→ // Track current procedure's reference parameters
23→ refParams map[string]bool
24→
25→ // Protocol support
26→ protocolDefs map[string]*ast.ProtocolDecl
27→ chanProtocols map[string]string // channel name → protocol name
28→ tmpCounter int // for unique temp variable names
29→
30→ // Record support
31→ recordDefs map[string]*ast.RecordDecl
32→ recordVars map[string]string // variable name → record type name
33→
34→ // Nesting level: 0 = package level, >0 = inside a function
35→ nestingLevel int
36→}
37→
38→// Built-in print procedures
39→var printBuiltins = map[string]bool{
40→ "print.int": true,// Generator converts an AST to Go code
type Generator struct {
indent int
builder strings.Builder
needSync bool // track if we need sync package import
needFmt bool // track if we need fmt package import
needTime bool // track if we need time package import
needOs bool // track if we need os package import
needMath bool // track if we need math package import
// Track procedure signatures for proper pointer handling
procSigs map[string][]ast.ProcParam
// Track current procedure's reference parameters
refParams map[string]bool
// Protocol support
protocolDefs map[string]*ast.ProtocolDecl
chanProtocols map[string]string // channel name → protocol name
tmpCounter int // for unique temp variable names
// Record support
recordDefs map[string]*ast.RecordDecl
recordVars map[string]string // variable name → record type name
// Nesting level: 0 = package level, >0 = inside a function
nestingLevel int
}// Generator converts an AST to Go code
type Generator struct {
indent int
builder strings.Builder
needSync bool // track if we need sync package import
needFmt bool // track if we need fmt package import
needTime bool // track if we need time package import
needOs bool // track if we need os package import
needMath bool // track if we need math package import
needMathBits bool // track if we need math/bits package import
// Track procedure signatures for proper pointer handling
procSigs map[string][]ast.ProcParam
// Track current procedure's reference parameters
refParams map[string]bool
// Protocol support
protocolDefs map[string]*ast.ProtocolDecl
chanProtocols map[string]string // channel name → protocol name
tmpCounter int // for unique temp variable names
// Record support
recordDefs map[string]*ast.RecordDecl
recordVars map[string]string // variable name → record type name
// Nesting level: 0 = package level, >0 = inside a function
nestingLevel int
}
// Transputer intrinsic function names
var transpIntrinsics = map[string]bool{
"LONGPROD": true,
"LONGDIV": true,
"LONGSUM": true,
"LONGDIFF": true,
"NORMALISE": true,
"SHIFTRIGHT": true,
"SHIFTLEFT": true,
}The file /home/david/projects/code-associates/occam2go/codegen/codegen.go has been updated successfully.
Now update the Generate function to handle the new flags and emit intrinsic helpers. Let me update the first-pass scanner and imports.
g.builder.Reset() g.needSync = false g.needFmt = false g.needTime = false g.needOs = false g.needMath = false
g.builder.Reset() g.needSync = false g.needFmt = false g.needTime = false g.needOs = false g.needMath = false g.needMathBits = false
The file /home/david/projects/code-associates/occam2go/codegen/codegen.go has been updated successfully.
if g.containsMostExpr(stmt) {
g.needMath = true
}
if proc, ok := stmt.(*ast.ProcDecl); ok { if g.containsMostExpr(stmt) {
g.needMath = true
}
if g.containsIntrinsics(stmt) {
g.needMathBits = true
}
if g.containsRetypes(stmt) {
g.needMath = true
}
if proc, ok := stmt.(*ast.ProcDecl); ok {The file /home/david/projects/code-associates/occam2go/codegen/codegen.go has been updated successfully.
// Write imports
if g.needSync || g.needFmt || g.needTime || g.needOs || g.needMath {
g.writeLine("import (")
g.indent++
if g.needFmt {
g.writeLine(`"fmt"`)
}
if g.needMath {
g.writeLine(`"math"`)
}
if g.needOs {
g.writeLine(`"os"`)
}
if g.needSync {
g.writeLine(`"sync"`)
}
if g.needTime {
g.writeLine(`"time"`)
}
g.indent--
g.writeLine(")")
g.writeLine("")
} // Write imports
if g.needSync || g.needFmt || g.needTime || g.needOs || g.needMath || g.needMathBits {
g.writeLine("import (")
g.indent++
if g.needFmt {
g.writeLine(`"fmt"`)
}
if g.needMath {
g.writeLine(`"math"`)
}
if g.needMathBits {
g.writeLine(`"math/bits"`)
}
if g.needOs {
g.writeLine(`"os"`)
}
if g.needSync {
g.writeLine(`"sync"`)
}
if g.needTime {
g.writeLine(`"time"`)
}
g.indent--
g.writeLine(")")
g.writeLine("")
}The file /home/david/projects/code-associates/occam2go/codegen/codegen.go has been updated successfully.
// Generate package-level abbreviations (constants)
for _, stmt := range abbrDecls {
abbr := stmt.(*ast.Abbreviation)
goType := g.occamTypeToGo(abbr.Type)
if abbr.IsOpenArray {
goType = "[]" + goType
}
g.builder.WriteString("var ")
g.write(fmt.Sprintf("%s %s = ", goIdent(abbr.Name), goType))
g.generateExpression(abbr.Value)
g.write("\n")
} // Generate package-level abbreviations (constants)
for _, stmt := range abbrDecls {
abbr := stmt.(*ast.Abbreviation)
if abbr.Type == "" {
// Untyped VAL: let Go infer the type
g.builder.WriteString("var ")
g.write(fmt.Sprintf("%s = ", goIdent(abbr.Name)))
g.generateExpression(abbr.Value)
g.write("\n")
} else {
goType := g.occamTypeToGo(abbr.Type)
if abbr.IsOpenArray {
goType = "[]" + goType
}
g.builder.WriteString("var ")
g.write(fmt.Sprintf("%s %s = ", goIdent(abbr.Name), goType))
g.generateExpression(abbr.Value)
g.write("\n")
}
}The file /home/david/projects/code-associates/occam2go/codegen/codegen.go has been updated successfully.
Now I need to emit intrinsic helpers after imports. Let me add them after the proc declarations and before the main function. Actually, the best place is as package-level functions. Let me add them right after the import block, and before the type declarations.
// Separate protocol, record, procedure declarations from other statements
// Emit transputer intrinsic helper functions
if g.needMathBits {
g.emitIntrinsicHelpers()
}
// Separate protocol, record, procedure declarations from other statementsThe file /home/david/projects/code-associates/occam2go/codegen/codegen.go has been updated successfully.
Now add RetypesDecl to the generateStatement switch and handle it, along with ArrayLiteral in generateExpression. Also add the statement classification for RetypesDecl in the package-level sorting. Let me first add the RetypesDecl case to generateStatement.
case *ast.MultiAssignment: g.generateMultiAssignment(s) } }
case *ast.MultiAssignment: g.generateMultiAssignment(s) case *ast.RetypesDecl: g.generateRetypesDecl(s) } }
The file /home/david/projects/code-associates/occam2go/codegen/codegen.go has been updated successfully.
case *ast.MostExpr: g.generateMostExpr(e) } }
case *ast.MostExpr: g.generateMostExpr(e) case *ast.ArrayLiteral: g.generateArrayLiteral(e) } }
The file /home/david/projects/code-associates/occam2go/codegen/codegen.go has been updated successfully.
func (g *Generator) generateAbbreviation(abbr *ast.Abbreviation) {
g.builder.WriteString(strings.Repeat("\t", g.indent))
g.write(fmt.Sprintf("%s := ", goIdent(abbr.Name)))
g.generateExpression(abbr.Value)
g.write("\n")
}func (g *Generator) generateAbbreviation(abbr *ast.Abbreviation) {
g.builder.WriteString(strings.Repeat("\t", g.indent))
if abbr.IsOpenArray && abbr.Type == "BYTE" {
// VAL []BYTE s IS "string": → var s []byte = []byte("string")
g.write(fmt.Sprintf("var %s []byte = []byte(", goIdent(abbr.Name)))
g.generateExpression(abbr.Value)
g.write(")\n")
} else {
g.write(fmt.Sprintf("%s := ", goIdent(abbr.Name)))
g.generateExpression(abbr.Value)
g.write("\n")
}
}The file /home/david/projects/code-associates/occam2go/codegen/codegen.go has been updated successfully.
Wait, that abbreviation change may break existing tests for VAL []BYTE. Let me check what the existing behavior is. Let me look at the existing test.
grep -A5 "VAL \[\]BYTE" /home/david/projects/code-associates/occam2go/codegen/codegen_test.go
grep -rn "IsOpenArray\|VAL.*BYTE.*IS" /home/david/projects/code-associates/occam2go/codegen/codegen_test.go
No existing tests for that. Let me revert that change — the generateAbbreviation function is only called for local (inside function) abbreviations. Package-level abbreviations have their own code path. So I should keep generateAbbreviation simple.
func (g *Generator) generateAbbreviation(abbr *ast.Abbreviation) {
g.builder.WriteString(strings.Repeat("\t", g.indent))
if abbr.IsOpenArray && abbr.Type == "BYTE" {
// VAL []BYTE s IS "string": → var s []byte = []byte("string")
g.write(fmt.Sprintf("var %s []byte = []byte(", goIdent(abbr.Name)))
g.generateExpression(abbr.Value)
g.write(")\n")
} else {
g.write(fmt.Sprintf("%s := ", goIdent(abbr.Name)))
g.generateExpression(abbr.Value)
g.write("\n")
}
}func (g *Generator) generateAbbreviation(abbr *ast.Abbreviation) {
g.builder.WriteString(strings.Repeat("\t", g.indent))
g.write(fmt.Sprintf("%s := ", goIdent(abbr.Name)))
g.generateExpression(abbr.Value)
g.write("\n")
}The file /home/david/projects/code-associates/occam2go/codegen/codegen.go has been updated successfully.
Now add RetypesDecl to the package-level statement classification. It should be emitted at the local level (inside a function), not at package level.
var abbrDecls []ast.Statement
for _, stmt := range program.Statements {
switch stmt.(type) {
case *ast.ProtocolDecl, *ast.RecordDecl:
typeDecls = append(typeDecls, stmt)
case *ast.ProcDecl, *ast.FuncDecl:
procDecls = append(procDecls, stmt)
case *ast.Abbreviation:
if hasProcDecls {
// Top-level abbreviations need to be at package level
// so PROCs can reference them
abbrDecls = append(abbrDecls, stmt)
} else {
mainStatements = append(mainStatements, stmt)
}
default:
mainStatements = append(mainStatements, stmt)
}
} var abbrDecls []ast.Statement
for _, stmt := range program.Statements {
switch s := stmt.(type) {
case *ast.ProtocolDecl, *ast.RecordDecl:
typeDecls = append(typeDecls, stmt)
case *ast.ProcDecl, *ast.FuncDecl:
procDecls = append(procDecls, stmt)
case *ast.Abbreviation:
if hasProcDecls {
// Top-level abbreviations need to be at package level
// so PROCs can reference them
abbrDecls = append(abbrDecls, stmt)
} else {
mainStatements = append(mainStatements, stmt)
}
case *ast.RetypesDecl:
_ = s
// RETYPES declarations are local to functions, not package-level
mainStatements = append(mainStatements, stmt)
default:
mainStatements = append(mainStatements, stmt)
}
}The file /home/david/projects/code-associates/occam2go/codegen/codegen.go has been updated successfully.
Now I need to update the generateProcCall to handle CAUSEERROR and the generateFuncCallExpr to prefix intrinsic names. Let me also add the implementation functions for ArrayLiteral, RetypesDecl, intrinsic helpers, and the containsX scanners.
First, CAUSEERROR in generateProcCall:
func (g *Generator) generateProcCall(call *ast.ProcCall) {
// Handle built-in print procedures
if printBuiltins[call.Name] {
g.generatePrintCall(call)
return
}
g.builder.WriteString(strings.Repeat("\t", g.indent))
g.write(goIdent(call.Name))func (g *Generator) generateProcCall(call *ast.ProcCall) {
// Handle built-in print procedures
if printBuiltins[call.Name] {
g.generatePrintCall(call)
return
}
// Handle CAUSEERROR
if call.Name == "CAUSEERROR" {
g.writeLine(`panic("CAUSEERROR")`)
return
}
g.builder.WriteString(strings.Repeat("\t", g.indent))
g.write(goIdent(call.Name))The file /home/david/projects/code-associates/occam2go/codegen/codegen.go has been updated successfully.
func (g *Generator) generateFuncCallExpr(call *ast.FuncCall) {
g.write(goIdent(call.Name))
g.write("(")func (g *Generator) generateFuncCallExpr(call *ast.FuncCall) {
if transpIntrinsics[call.Name] {
g.write("_" + call.Name)
} else {
g.write(goIdent(call.Name))
}
g.write("(")The file /home/david/projects/code-associates/occam2go/codegen/codegen.go has been updated successfully.
Now add all the new helper functions at the end of codegen.go. I'll add them before the occamOpToGo function:
{
"file_path": "/home/david/projects/code-associates/occam2go/codegen/codegen.go",
"offset": 2030,
"limit": 30
}2030→ return "&&" 2031→ case "OR": 2032→ return "||" 2033→ case "NOT": 2034→ return "!" 2035→ case "\\": 2036→ return "%" 2037→ case "AFTER": 2038→ return ">" 2039→ case "/\\": 2040→ return "&" 2041→ case "\\/": 2042→ return "|" 2043→ case "><": 2044→ return "^" 2045→ case "~": 2046→ return "^" 2047→ case "<<": 2048→ return "<<" 2049→ case ">>": 2050→ return ">>" 2051→ case "PLUS": 2052→ return "+" 2053→ case "MINUS": 2054→ return "-" 2055→ case "TIMES": 2056→ return "*" 2057→ default: 2058→ return op // +, -, *, /, <, >, <=, >= are the same 2059→ }
{
"file_path": "/home/david/projects/code-associates/occam2go/codegen/codegen.go",
"offset": 2058,
"limit": 5
}return op // +, -, *, /, <, >, <=, >= are the same } }
return op // +, -, *, /, <, >, <=, >= are the same
}
}
// generateArrayLiteral emits a Go slice literal: []int{e1, e2, ...}
func (g *Generator) generateArrayLiteral(al *ast.ArrayLiteral) {
g.write("[]int{")
for i, elem := range al.Elements {
if i > 0 {
g.write(", ")
}
g.generateExpression(elem)
}
g.write("}")
}
// generateRetypesDecl emits code for a RETYPES declaration.
// VAL INT X RETYPES X : → X_retyped := int(math.Float32bits(float32(X)))
// VAL [2]INT X RETYPES X : → _tmp := math.Float64bits(float64(X)); X_retyped := [2]int{...}
func (g *Generator) generateRetypesDecl(r *ast.RetypesDecl) {
gName := goIdent(r.Name)
gSource := goIdent(r.Source)
if r.IsArray {
// VAL [2]INT X RETYPES X : — split float64 into two int32 words
tmpVar := fmt.Sprintf("_retmp%d", g.tmpCounter)
g.tmpCounter++
g.writeLine(fmt.Sprintf("%s := math.Float64bits(float64(%s))", tmpVar, gSource))
g.writeLine(fmt.Sprintf("%s := [2]int{int(int32(uint32(%s))), int(int32(uint32(%s >> 32)))}", gName, tmpVar, tmpVar))
} else {
// VAL INT X RETYPES X : — reinterpret float32 as int
g.writeLine(fmt.Sprintf("%s := int(int32(math.Float32bits(float32(%s))))", gName, gSource))
}
}
// containsIntrinsics checks if a statement tree contains transputer intrinsic calls.
func (g *Generator) containsIntrinsics(stmt ast.Statement) bool {
return g.walkStatements(stmt, func(e ast.Expression) bool {
if fc, ok := e.(*ast.FuncCall); ok {
return transpIntrinsics[fc.Name]
}
return false
})
}
// containsRetypes checks if a statement tree contains RETYPES declarations.
func (g *Generator) containsRetypes(stmt ast.Statement) bool {
switch s := stmt.(type) {
case *ast.RetypesDecl:
return true
case *ast.SeqBlock:
for _, inner := range s.Statements {
if g.containsRetypes(inner) {
return true
}
}
case *ast.ParBlock:
for _, inner := range s.Statements {
if g.containsRetypes(inner) {
return true
}
}
case *ast.ProcDecl:
for _, inner := range s.Body {
if g.containsRetypes(inner) {
return true
}
}
case *ast.FuncDecl:
for _, inner := range s.Body {
if g.containsRetypes(inner) {
return true
}
}
case *ast.WhileLoop:
for _, inner := range s.Body {
if g.containsRetypes(inner) {
return true
}
}
case *ast.IfStatement:
for _, choice := range s.Choices {
if choice.NestedIf != nil && g.containsRetypes(choice.NestedIf) {
return true
}
for _, inner := range choice.Body {
if g.containsRetypes(inner) {
return true
}
}
}
case *ast.CaseStatement:
for _, choice := range s.Choices {
for _, inner := range choice.Body {
if g.containsRetypes(inner) {
return true
}
}
}
}
return false
}
// walkStatements recursively walks a statement tree, applying fn to all expressions.
// Returns true if fn returns true for any expression.
func (g *Generator) walkStatements(stmt ast.Statement, fn func(ast.Expression) bool) bool {
switch s := stmt.(type) {
case *ast.Assignment:
return g.walkExpr(s.Value, fn) || g.walkExpr(s.Index, fn)
case *ast.MultiAssignment:
for _, v := range s.Values {
if g.walkExpr(v, fn) {
return true
}
}
case *ast.Abbreviation:
return g.walkExpr(s.Value, fn)
case *ast.SeqBlock:
for _, inner := range s.Statements {
if g.walkStatements(inner, fn) {
return true
}
}
case *ast.ParBlock:
for _, inner := range s.Statements {
if g.walkStatements(inner, fn) {
return true
}
}
case *ast.ProcDecl:
for _, inner := range s.Body {
if g.walkStatements(inner, fn) {
return true
}
}
case *ast.FuncDecl:
for _, inner := range s.Body {
if g.walkStatements(inner, fn) {
return true
}
}
case *ast.WhileLoop:
if g.walkExpr(s.Condition, fn) {
return true
}
for _, inner := range s.Body {
if g.walkStatements(inner, fn) {
return true
}
}
case *ast.IfStatement:
for _, choice := range s.Choices {
if choice.NestedIf != nil && g.walkStatements(choice.NestedIf, fn) {
return true
}
if g.walkExpr(choice.Condition, fn) {
return true
}
for _, inner := range choice.Body {
if g.walkStatements(inner, fn) {
return true
}
}
}
case *ast.CaseStatement:
if g.walkExpr(s.Selector, fn) {
return true
}
for _, choice := range s.Choices {
for _, v := range choice.Values {
if g.walkExpr(v, fn) {
return true
}
}
for _, inner := range choice.Body {
if g.walkStatements(inner, fn) {
return true
}
}
}
case *ast.Send:
if g.walkExpr(s.Value, fn) {
return true
}
for _, v := range s.Values {
if g.walkExpr(v, fn) {
return true
}
}
case *ast.ProcCall:
for _, arg := range s.Args {
if g.walkExpr(arg, fn) {
return true
}
}
case *ast.AltBlock:
for _, c := range s.Cases {
for _, inner := range c.Body {
if g.walkStatements(inner, fn) {
return true
}
}
}
case *ast.VariantReceive:
for _, c := range s.Cases {
if c.Body != nil && g.walkStatements(c.Body, fn) {
return true
}
}
}
return false
}
// walkExpr recursively walks an expression tree, applying fn.
func (g *Generator) walkExpr(expr ast.Expression, fn func(ast.Expression) bool) bool {
if expr == nil {
return false
}
if fn(expr) {
return true
}
switch e := expr.(type) {
case *ast.BinaryExpr:
return g.walkExpr(e.Left, fn) || g.walkExpr(e.Right, fn)
case *ast.UnaryExpr:
return g.walkExpr(e.Right, fn)
case *ast.ParenExpr:
return g.walkExpr(e.Expr, fn)
case *ast.TypeConversion:
return g.walkExpr(e.Expr, fn)
case *ast.SizeExpr:
return g.walkExpr(e.Expr, fn)
case *ast.IndexExpr:
return g.walkExpr(e.Left, fn) || g.walkExpr(e.Index, fn)
case *ast.FuncCall:
for _, arg := range e.Args {
if g.walkExpr(arg, fn) {
return true
}
}
case *ast.SliceExpr:
return g.walkExpr(e.Array, fn) || g.walkExpr(e.Start, fn) || g.walkExpr(e.Length, fn)
case *ast.ArrayLiteral:
for _, elem := range e.Elements {
if g.walkExpr(elem, fn) {
return true
}
}
}
return false
}
// emitIntrinsicHelpers writes the Go helper functions for transputer intrinsics.
// These implement 32-bit transputer semantics using uint32/uint64 arithmetic.
func (g *Generator) emitIntrinsicHelpers() {
g.writeLine("// Transputer intrinsic helper functions")
g.writeLine("func _LONGPROD(a, b, c int) (int, int) {")
g.writeLine("\tr := uint64(uint32(a))*uint64(uint32(b)) + uint64(uint32(c))")
g.writeLine("\treturn int(int32(uint32(r >> 32))), int(int32(uint32(r)))")
g.writeLine("}")
g.writeLine("")
g.writeLine("func _LONGDIV(hi, lo, divisor int) (int, int) {")
g.writeLine("\tn := (uint64(uint32(hi)) << 32) | uint64(uint32(lo))")
g.writeLine("\td := uint64(uint32(divisor))")
g.writeLine("\tif d == 0 { panic(\"LONGDIV: division by zero\") }")
g.writeLine("\treturn int(int32(uint32(n / d))), int(int32(uint32(n % d)))")
g.writeLine("}")
g.writeLine("")
g.writeLine("func _LONGSUM(a, b, carry int) (int, int) {")
g.writeLine("\tr := uint64(uint32(a)) + uint64(uint32(b)) + uint64(uint32(carry))")
g.writeLine("\treturn int(int32(uint32(r >> 32))), int(int32(uint32(r)))")
g.writeLine("}")
g.writeLine("")
g.writeLine("func _LONGDIFF(a, b, borrow int) (int, int) {")
g.writeLine("\tr := uint64(uint32(a)) - uint64(uint32(b)) - uint64(uint32(borrow))")
g.writeLine("\tif uint32(a) >= uint32(b)+uint32(borrow) { return 0, int(int32(uint32(r))) }")
g.writeLine("\treturn 1, int(int32(uint32(r)))")
g.writeLine("}")
g.writeLine("")
g.writeLine("func _NORMALISE(hi, lo int) (int, int, int) {")
g.writeLine("\tv := (uint64(uint32(hi)) << 32) | uint64(uint32(lo))")
g.writeLine("\tif v == 0 { return 64, 0, 0 }")
g.writeLine("\tn := bits.LeadingZeros64(v)")
g.writeLine("\tv <<= uint(n)")
g.writeLine("\treturn n, int(int32(uint32(v >> 32))), int(int32(uint32(v)))")
g.writeLine("}")
g.writeLine("")
g.writeLine("func _SHIFTRIGHT(hi, lo, n int) (int, int) {")
g.writeLine("\tv := (uint64(uint32(hi)) << 32) | uint64(uint32(lo))")
g.writeLine("\tv >>= uint(uint32(n))")
g.writeLine("\treturn int(int32(uint32(v >> 32))), int(int32(uint32(v)))")
g.writeLine("}")
g.writeLine("")
g.writeLine("func _SHIFTLEFT(hi, lo, n int) (int, int) {")
g.writeLine("\tv := (uint64(uint32(hi)) << 32) | uint64(uint32(lo))")
g.writeLine("\tv <<= uint(uint32(n))")
g.writeLine("\treturn int(int32(uint32(v >> 32))), int(int32(uint32(v)))")
g.writeLine("}")
g.writeLine("")
}The file /home/david/projects/code-associates/occam2go/codegen/codegen.go has been updated successfully.
Now let me also add ArrayLiteral to the exprNeedsMath scanner (for completeness), and add the containsX cases for the new node types:
case *ast.SliceExpr: return g.exprNeedsMath(e.Array) || g.exprNeedsMath(e.Start) || g.exprNeedsMath(e.Length) } return false }
case *ast.SliceExpr:
return g.exprNeedsMath(e.Array) || g.exprNeedsMath(e.Start) || g.exprNeedsMath(e.Length)
case *ast.ArrayLiteral:
for _, elem := range e.Elements {
if g.exprNeedsMath(elem) {
return true
}
}
}
return false
}The file /home/david/projects/code-associates/occam2go/codegen/codegen.go has been updated successfully.
go build -o occam2go . && ./occam2go -I kroc/modules/course/libsrc -D TARGET.BITS.PER.WORD=32 -o /tmp/course_out.go kroc/modules/course/libsrc/course.module 2>&1
Exit code 1 # command-line-arguments # [command-line-arguments] vet: /tmp/course_out.go:1263:7: cannot use int(int32(math.Float32bits(float32(X)))) (value of type int) as float32 value in assignment # command-line-arguments # [command-line-arguments] vet: /tmp/course_out.go:1263:7: cannot use int(int32(math.Float32bits(float32(X)))) (value of type int) as float32 value in assignment
The RETYPES is assigning to the same variable name (X) but with different type. In occam, VAL INT X RETYPES X : means: take the float32 X (parameter) and reinterpret its bits as an INT, giving a new X (shadowing the parameter). In Go, we need a different approach.
Let me look at the context:
1258→func file_in_number(n *int, max int, ok *bool, in <-chan byte) {
1259→ file_in_int(&*n, max, &*ok, in)
1260→}
1261→
1262→func course_REAL32TOSTRING(len *int, string []byte, X float32, Ip int, Dp int) {
1263→ X := int(int32(math.Float32bits(float32(X))))
1264→ SignBit := 2147483648
1265→ BitsPerWord := 32
1266→ MaxInt := 4294967295
1267→ BytesPerWord := 4The issue is: X is already declared as a float32 parameter, and we're trying to do X := int(...) which would try to declare a new variable with the same name. In Go, := in a function body CAN shadow a parameter, but the error is actually about type — it says "cannot use int as float32 value in assignment". Wait, that's odd for :=.
Actually, looking more carefully at the error: cannot use int(int32(math.Float32bits(float32(X)))) (value of type int) as float32 value in assignment. This suggests Go sees this as an = assignment (not :=). Let me check — in Go, if the variable is already declared in the same scope, := becomes =. Since X is a parameter in the same scope, Go treats X := int(...) as just X = int(...), which is a type mismatch.
The fix: use a different variable name for the RETYPES result. In occam, RETYPES creates a new alias, so we should use a distinct name. But wait — in the occam code, the NEW X (the int) shadows the OLD X (the float32). All subsequent code uses X as an int.
The cleanest approach: create a temporary for the old value, then shadow with a new variable.
This session is being continued from a previous conversation that ran out of context. The summary below covers the earlier portion of the conversation.
Analysis: Let me go through the conversation chronologically to capture all important details.
The user provided a detailed implementation plan for "Phase 2: Full Course Module Transpilation (float_io.occ)" with 12 steps.
I started by reading key files: lexer/token.go, lexer/lexer.go, ast/ast.go, parser/parser.go, codegen/codegen.go, and test files.
I checked the current error count: 598 parse errors on the full course.module.
I examined float_io.occ to understand the patterns: RETYPES (lines 139, 1105), CAUSEERROR (lines 988, 2187), intrinsics (~77 occurrences), untyped VAL abbreviations, array literals, multi-line expressions.
Step 1 (Lexer):
[/] tracking to parenDepth in lexer.goAdded RETYPES keyword token to token.go (3 places: enum, tokenNames, keywords)
Step 2 (AST):
Added ArrayLiteral (Expression) and RetypesDecl (Statement) node types to ast.go
Step 3 (Parser - untyped VAL):
Added RETYPES parsing after name when peekToken is RETYPES
Step 4 (Parser - array literals):
Also handles single-element arrays with RBRACKET after first expression
Step 5 (Parser - RETYPES): Already handled in step 3's parseAbbreviation changes.
Step 6 (Parser - multi-line expressions): First attempt: Modified parseBinaryExpr to consume NEWLINE/INDENT, parse RHS, consume DEDENT
Second attempt (final): Lexer-level continuation
- Added lastTokenType field to Lexer struct
- Created NextToken() wrapper that tracks lastTokenType, renamed original to nextTokenInner()
- Added isContinuationOp() function that checks if a token type implies line continuation
- Modified NEWLINE handling to suppress NEWLINE when isContinuationOp(lastTokenType) is true
- Modified indent processing to suppress INDENT/DEDENT when isContinuationOp(lastTokenType) is true
- This approach works because: continuation lines don't modify the indent stack, so when the body starts at the same indentation, the lexer correctly produces INDENT relative to the pre-continuation level
After parser changes: errors went from 598 to 0!
Step 7-10 (Codegen):
Error: RETYPES same-name shadowing
VAL INT X RETYPES X : generates X := int(...) which Go treats as X = int(...) since X is already a parameter_retmp := int(...); X := _retmpCurrent state: The code compiles, transpiles course.module with 0 parse errors, but the RETYPES shadowing fix was just applied and hasn't been tested yet with go vet.
Task list status: - #1 completed: Lexer changes - #2 completed: AST nodes - #3 completed: Parser changes - #4 in_progress: Codegen changes (RETYPES shadowing fix just applied, not yet tested) - #5 pending: Tests + integration verification
Summary:
1. Primary Request and Intent:
The user asked me to implement a detailed 12-step plan for "Phase 2: Full Course Module Transpilation (float_io.occ)". This involves adding 6 unimplemented language constructs to the occam2go transpiler so that the full course.module (including float_io.occ) can be transpiled, compiled, and pass go vet. The constructs are: bracket NEWLINE suppression, RETYPES keyword/declarations, untyped VAL abbreviations, array literal expressions, multi-line expression continuation, transputer intrinsic helper functions, and CAUSEERROR. The plan reference was at /home/david/.claude/projects/-home-david-projects-code-associates-occam2go/c4302863-3896-4a30-bd1b-760356b188c1.jsonl.
:= when parameter and new variable share the same nameMulti-line expression continuation at lexer level (suppressing NEWLINE after binary operators and :=)
Files and Code Sections:
lexer/token.go
RETYPES to the TokenType enum (between INITIAL and PLUS_KW), to tokenNames map, and to keywords map
go
RETYPES // RETYPES (bit-level type reinterpretation)
go
RETYPES: "RETYPES",
go
"RETYPES": RETYPES,lexer/lexer.go
lastTokenType TokenType field to Lexer struct[/] tracking to parenDepth (lines in case '[' and case ']')go
func (l *Lexer) NextToken() Token {
tok := l.nextTokenInner()
if tok.Type != NEWLINE && tok.Type != INDENT && tok.Type != DEDENT && tok.Type != EOF {
l.lastTokenType = tok.Type
}
return tok
}isContinuationOp(l.lastTokenType):
go
if l.parenDepth > 0 || isContinuationOp(l.lastTokenType) {
// suppress INDENT/DEDENT tokens
}go
if l.parenDepth > 0 || isContinuationOp(l.lastTokenType) {
return l.NextToken()
}go
func isContinuationOp(t TokenType) bool {
switch t {
case AND, OR,
PLUS, MINUS, MULTIPLY, DIVIDE, MODULO,
PLUS_KW, MINUS_KW, TIMES,
EQ, NEQ, LT, GT, LE, GE,
BITAND, BITOR, BITXOR, LSHIFT, RSHIFT,
ASSIGN, AFTER:
return true
}
return false
}ast/ast.go
type RetypesDecl struct { Token lexer.Token IsVal bool TargetType string IsArray bool ArraySize Expression Name string Source string } func (r RetypesDecl) statementNode() {} func (r RetypesDecl) TokenLiteral() string { return r.Token.Literal } ```
parser/parser.go
parseAbbreviation() — now returns ast.Statement instead of *ast.AbbreviationVAL name IS expr :), fixed-array VAL for RETYPES (VAL [n]TYPE name RETYPES source :), and simple RETYPES (VAL TYPE name RETYPES source :)go
if !isOpenArray && !isArray && p.curTokenIs(lexer.IDENT) && p.peekTokenIs(lexer.IS) {
name := p.curToken.Literal
p.nextToken() // consume IS
p.nextToken() // move to expression
value := p.parseExpression(LOWEST)
// ... return Abbreviation with Type: ""
}go
if !isOpenArray && p.curTokenIs(lexer.LBRACKET) {
isArray = true
p.nextToken()
arraySize = p.parseExpression(LOWEST)
// expect RBRACKET, advance to type
}go
if p.peekTokenIs(lexer.RETYPES) {
p.nextToken() // consume RETYPES
// expect IDENT (source), COLON
return &ast.RetypesDecl{...}
}parseExpression() for array literals:
go
if p.peekTokenIs(lexer.COMMA) {
// Array literal: [expr, expr, ...]
elements := []ast.Expression{firstExpr}
for p.peekTokenIs(lexer.COMMA) { ... }
left = &ast.ArrayLiteral{Token: lbracket, Elements: elements}
} else if p.peekTokenIs(lexer.RBRACKET) {
// Single-element array literal
left = &ast.ArrayLiteral{...}
} else {
// Slice expression (existing logic)
}codegen/codegen.go
needMathBits bool flag and transpIntrinsics map to GeneratorcontainsIntrinsics() and containsRetypes() scanners to first pass"math/bits" when neededemitIntrinsicHelpers() call after importsgo
if abbr.Type == "" {
g.builder.WriteString("var ")
g.write(fmt.Sprintf("%s = ", goIdent(abbr.Name)))
g.generateExpression(abbr.Value)
g.write("\n")
}go
if call.Name == "CAUSEERROR" {
g.writeLine(`panic("CAUSEERROR")`)
return
}go
if transpIntrinsics[call.Name] {
g.write("_" + call.Name)
}[]int{e1, e2, ...}Errors and Fixes:
First parseBinaryExpr approach failed: Consuming INDENT in parseBinaryExpr and trying to consume matching DEDENT didn't work because the IF body (SEQ) was at the same indentation as the continuation line, so no DEDENT existed between the expression and body. The parser's indentLevel got off by 1, preventing the IF parser from finding the body's INDENT.
:=, suppress NEWLINE and INDENT/DEDENT on the next line. This correctly preserves the indent stack so the body's INDENT is produced relative to the pre-continuation level.RETYPES same-name shadowing error: VAL INT X RETYPES X : generated X := int(int32(math.Float32bits(float32(X)))) which Go treated as X = int(...) (assignment, not declaration) since X was already a float32 parameter, causing type mismatch.
_retmp0 := int(int32(math.Float32bits(float32(X)))); X := _retmp0. This was just applied but not yet tested.Problem Solving:
:= at end of line (2 errors)Second go vet showed RETYPES shadowing type error — fixed with temp variable approach (not yet verified)
All User Messages:
"Implement the following plan: [detailed 12-step Phase 2 plan for full course module transpilation]" — This was the only user message with the full implementation plan.
Pending Tasks:
go vetgo vet /tmp/course_out.go and go test ./...)Update CLAUDE.md and TODO.md with Phase 2 features (implied by the plan's Step 12)
Current Work:
I was implementing Step 8 of the plan (Codegen — RETYPES). I had just fixed a RETYPES same-name shadowing error where VAL INT X RETYPES X : was generating invalid Go code. The fix uses a temp variable when the RETYPES target and source share the same name:
go
if gName == gSource {
tmpVar := fmt.Sprintf("_retmp%d", g.tmpCounter)
g.tmpCounter++
g.writeLine(fmt.Sprintf("%s := int(int32(math.Float32bits(float32(%s))))", tmpVar, gSource))
g.writeLine(fmt.Sprintf("%s := %s", gName, tmpVar))
}
This fix was written to codegen/codegen.go but has not yet been tested with go build, go vet, or go test.
Optional Next Step:
Build and test the RETYPES shadowing fix, then proceed to verify the full course.module transpilation with go vet. The immediate commands to run are:
bash
go build -o occam2go .
./occam2go -I kroc/modules/course/libsrc -D TARGET.BITS.PER.WORD=32 -o /tmp/course_out.go kroc/modules/course/libsrc/course.module
go vet /tmp/course_out.go
go test ./...
If go vet passes, proceed to Task #5: writing parser tests, codegen tests, E2E tests, and updating documentation. The plan specifies:
If you need specific details from before compaction (like exact code snippets, error messages, or content you generated), read the full transcript at: /home/david/.claude/projects/-home-david-projects-code-associates-occam2go/65034dd7-a4f6-4eb4-956f-e3eedc8f87a6.jsonl Please continue the conversation from where we left off without asking the user any further questions. Continue with the last task that you were asked to work on.
Let me continue from where we left off. The last task was implementing Phase 2 of the course module transpilation. I need to:
Let me start by building and testing.
Let me pick up where we left off. I need to verify the RETYPES shadowing fix and then run integration tests. Let me build and test.
./occam2go -I kroc/modules/course/libsrc -D TARGET.BITS.PER.WORD=32 -o /tmp/course_out.go kroc/modules/course/libsrc/course.module 2>&1
Exit code 1 # command-line-arguments # [command-line-arguments] vet: /tmp/course_out.go:1264:7: cannot use _retmp13 (variable of type int) as float32 value in assignment # command-line-arguments # [command-line-arguments] vet: /tmp/course_out.go:1264:7: cannot use _retmp13 (variable of type int) as float32 value in assignment
There's an issue with the RETYPES shadowing — the temp variable is int but is being assigned to a float32 variable. Let me look at the generated code around that line.
1258→func file_in_number(n *int, max int, ok *bool, in <-chan byte) {
1259→ file_in_int(&*n, max, &*ok, in)
1260→}
1261→
1262→func course_REAL32TOSTRING(len *int, string []byte, X float32, Ip int, Dp int) {
1263→ _retmp13 := int(int32(math.Float32bits(float32(X))))
1264→ X := _retmp13
1265→ SignBit := 2147483648
1266→ BitsPerWord := 32
1267→ MaxInt := 4294967295
1268→ BytesPerWord := 4
1269→ RealShift := 8
1270→ BitsInFrac := 24
1271→ RealExpSh := 23
1272→ RealExp := 255The problem is clear: X is a float32 parameter, and X := _retmp13 tries to declare a new X of type int (from the temp), but Go sees it as trying to assign _retmp13 (int) to the existing X (float32). The := actually creates a new shadowing variable, so the real issue is that Go vet detects the type incompatibility at the declaration point — wait, actually X := should create a new variable shadowing the parameter. Let me check more carefully what Go does here.
Actually, the issue is that := in a function body does create a new variable. The error message says "cannot use _retmp13 (variable of type int) as float32 value in assignment." This suggests Go is treating it as an assignment to the existing X float32. But := should create a new variable...
Let me check — maybe it's because of some Go rule about := when the same name exists in the same scope (not just outer scope).
cat > /tmp/test_shadow.go << 'EOF'
package main
import "fmt"
func foo(X float32) {
tmp := int(42)
X := tmp
fmt.Println(X)
}
func main() {
foo(3.14)
}
EOF
go vet /tmp/test_shadow.go 2>&1Exit code 1 # command-line-arguments # [command-line-arguments] vet: /tmp/test_shadow.go:7:10: cannot use tmp (variable of type int) as float32 value in assignment # command-line-arguments # [command-line-arguments] vet: /tmp/test_shadow.go:7:10: cannot use tmp (variable of type int) as float32 value in assignment
So := on a function parameter doesn't create a new variable in Go — it tries to assign to the existing parameter variable. This is because parameters are in the same scope as the function body. So X := tmp is equivalent to X = tmp here, not a new declaration.
To fix this, I need to use a different approach for RETYPES when the source and target are the same name. Options:
{ ... } — but in Go, this would mean the new variable is only visible inside the blockUse var X int = ... — this won't work either, same scope issue
Using a different name for the retyped variable would require replacing all subsequent uses, which gets messy
VAL INT X RETYPES X : creates a NEW variable X that shadows the parameter X. So in Go, I need to create a new X of type int that shadows the parameter X (float32) — and I can do this by introducing a nested block scope where the new declaration is valid.Writing Go code example...
The issue is that occam's RETYPES declaration applies to the entire rest of the function body, so I'd need to wrap everything after the RETYPES in a block. Alternatively, I could just use a differently-named variable like X_int and replace all subsequent uses, or save the original value and perform the bit reinterpretation conversion in the generated code. Hmm, this is tricky. Let me think about what the occam code actually does.
In float_io.occ, the pattern is that the float32 parameter X gets retyped as an int via bit reinterpretation, and all subsequent code in the proc uses X as an integer.
In Go, I can't shadow a parameter with := in the same scope, and I can't redeclare it with var either. The cleanest solution is to use a different variable name for the retyped version — I could save the original parameter to a temp and then declare a new X as an int, though that means updating all the subsequent references throughout the function.
Actually, a better approach: wrap the function body in a scope block where I do the conversion at the top, then all the existing code that references X can just use the retyped version without needing to rename everything. I see the issue now—when the parameter name matches the type we're converting to, we can't just shadow it in the same scope. The cleanest solution is to rename the function parameter to something like X_param, then declare X as the converted integer value inside the function body. This avoids the scoping conflict entirely, though it does require modifying how the procedure is generated.
Alternatively, I could wrap the rest of the function body in a nested block scope, which would allow X := int(...) to create a new variable that shadows the parameter. Or I could use a var declaration instead of := to explicitly shadow the parameter name. I need to step back and think about this differently. The core issue is that Go won't let me redeclare a parameter in the same scope, so I can't just reassign X with a different type. The cleanest solution is to rename the parameter itself during code generation—when there's a RETYPES operation on a parameter, I should generate the function with a different parameter name and then create a local variable with the original name that holds the retyped value. This way, subsequent code can reference X normally without scope conflicts. I'm realizing the cleanest solution is to handle this at the PROC level—when a RETYPES operation shadows a parameter, I should rename the parameter in the function signature itself, which avoids the redeclaration problem entirely. Alternatively, I could use a nested scope block to shadow the binding, though that gets messy with wrapping the rest of the function body. I think the cleanest solution is to detect when a RETYPES declaration in the proc body shadows a parameter, then prefix that parameter name with an underscore during code generation so the new variable can take the original name. Let me look at how generateProcDecl works to implement this approach.
Now I'm examining the RETYPES patterns in the float_io.occ file to understand how many times this shadowing occurs and what the actual usage looks like.
So Go doesn't allow := to shadow a parameter in the same scope — it treats it as assignment. I need a different approach. Let me understand the full picture first.
grep -n "RETYPES" /tmp/course_out.go 2>&1
grep -n "_retmp\|Float32bits\|Float64bits" /tmp/course_out.go 2>&1
1263: _retmp13 := int(int32(math.Float32bits(float32(X))))
1264: X := _retmp13
1830: _retmp15 := math.Float64bits(float64(X))
1831: _reshadow16 := [2]int{int(int32(uint32(_retmp15))), int(int32(uint32(_retmp15 >> 32)))} 1262→func course_REAL32TOSTRING(len *int, string []byte, X float32, Ip int, Dp int) {
1263→ _retmp13 := int(int32(math.Float32bits(float32(X))))
1264→ X := _retmp13
1265→ SignBit := 2147483648
1266→ BitsPerWord := 32 1826→ }
1827→}
1828→
1829→func course_REAL64TOSTRING(len *int, string []byte, X float64, Ip int, Dp int) {
1830→ _retmp15 := math.Float64bits(float64(X))
1831→ _reshadow16 := [2]int{int(int32(uint32(_retmp15))), int(int32(uint32(_retmp15 >> 32)))}
1832→ X := _reshadow16
1833→ SignBit := 2147483648
1834→ BitsPerWord := 32
1835→ MaxUnsignedInt := 4294967295OK so there are two RETYPES occurrences that shadow parameters:
1. course_REAL32TOSTRING(... X float32 ...) → VAL INT X RETYPES X : → tries to make X int
2. course_REAL64TOSTRING(... X float64 ...) → VAL [2]INT X RETYPES X : → tries to make X [2]int
The problem is that Go won't let you use := to redeclare a variable in the same scope as its original declaration (function parameters are in the same scope as the function body).
The cleanest fix is to rename the parameter when there's a RETYPES that shadows it.
I need to scan the proc body in generateProcDecl to find any RETYPES declarations that shadow parameter names, then emit those parameters with a prefix like _X_param. The RETYPES codegen would then use the prefixed name as the source and the original name for the new variable. Let me trace through how generateProcDecl works to find where to hook this in.
Actually, a simpler approach might be to use a scope block within the RETYPES output itself—wrapping the reassignment and new declaration so they don't conflict with the original parameter in the outer scope.
Generating function wrapper code... I'm considering a simpler approach where I create a variable shadowing the parameter with the converted type, but Go doesn't allow redeclaring a variable in the same scope. Let me think through how the proc body generation actually works to find a cleaner solution.
The rename approach seems most straightforward: I'll track which parameters need retyping, prefix them with _rp_ when declaring them, then use that renamed version as the source in the RETYPES conversion. This keeps the logic contained and avoids scope issues.
Actually, a simpler solution just occurred to me—using a block scope to shadow the parameter. By wrapping the function body in braces, I can declare a new variable with the same name inside that block, which will shadow the outer parameter without any renaming tricks. I need to scan the proc body before generating the parameter list to identify which parameters get retyped, then rename those parameters with a prefix and update the RETYPES expression to use the prefixed name. Let me check how generateProcDecl is structured to implement this two-pass approach.
The core issue: Go doesn't allow := to redeclare a function parameter in the same scope. I need to rename the parameter when a RETYPES shadows it. Let me look at the codegen to implement the fix.
{
"pattern": "func \\(g \\*Generator\\) generateProcDecl",
"path": "/home/david/projects/code-associates/occam2go/codegen/codegen.go",
"output_mode": "content"
}{
"file_path": "/home/david/projects/code-associates/occam2go/codegen/codegen.go",
"offset": 1449,
"limit": 80
} 1449→func (g *Generator) generateProcDecl(proc *ast.ProcDecl) {
1450→ // Track reference parameters for this procedure
1451→ oldRefParams := g.refParams
1452→ newRefParams := make(map[string]bool)
1453→ // Inherit parent's ref params for closure captures when nested
1454→ if g.nestingLevel > 0 {
1455→ for k, v := range oldRefParams {
1456→ newRefParams[k] = v
1457→ }
1458→ }
1459→ for _, p := range proc.Params {
1460→ if !p.IsVal && !p.IsChan && !p.IsChanArray && !p.IsOpenArray {
1461→ newRefParams[p.Name] = true
1462→ } else {
1463→ // Own param shadows any inherited ref param with same name
1464→ delete(newRefParams, p.Name)
1465→ }
1466→ // Register chan params with protocol mappings
1467→ if p.IsChan || p.IsChanArray {
1468→ if _, ok := g.protocolDefs[p.ChanElemType]; ok {
1469→ g.chanProtocols[p.Name] = p.ChanElemType
1470→ }
1471→ }
1472→ // Register record-typed params
1473→ if !p.IsChan {
1474→ if _, ok := g.recordDefs[p.Type]; ok {
1475→ g.recordVars[p.Name] = p.Type
1476→ }
1477→ }
1478→ }
1479→ g.refParams = newRefParams
1480→
1481→ // Generate function signature
1482→ params := g.generateProcParams(proc.Params)
1483→ gName := goIdent(proc.Name)
1484→ if g.nestingLevel > 0 {
1485→ // Nested PROC: generate as Go closure
1486→ g.writeLine(fmt.Sprintf("%s := func(%s) {", gName, params))
1487→ } else {
1488→ g.writeLine(fmt.Sprintf("func %s(%s) {", gName, params))
1489→ }
1490→ g.indent++
1491→ g.nestingLevel++
1492→
1493→ for _, stmt := range proc.Body {
1494→ g.generateStatement(stmt)
1495→ }
1496→
1497→ g.nestingLevel--
1498→ g.indent--
1499→ g.writeLine("}")
1500→ g.writeLine("")
1501→
1502→ // Restore previous context
1503→ g.refParams = oldRefParams
1504→}
1505→
1506→func (g *Generator) generateProcParams(params []ast.ProcParam) string {
1507→ var parts []string
1508→ for _, p := range params {
1509→ var goType string
1510→ if p.IsChanArray {
1511→ goType = "[]" + chanDirPrefix(p.ChanDir) + g.occamTypeToGo(p.ChanElemType)
1512→ } else if p.IsChan {
1513→ goType = chanDirPrefix(p.ChanDir) + g.occamTypeToGo(p.ChanElemType)
1514→ } else if p.IsOpenArray {
1515→ goType = "[]" + g.occamTypeToGo(p.Type)
1516→ } else if p.ArraySize != "" {
1517→ // Fixed-size array parameter: [n]TYPE
1518→ goType = "[" + p.ArraySize + "]" + g.occamTypeToGo(p.Type)
1519→ if !p.IsVal {
1520→ goType = "*" + goType
1521→ }
1522→ } else {
1523→ goType = g.occamTypeToGo(p.Type)
1524→ if !p.IsVal {
1525→ // Non-VAL parameters are pass by reference in Occam
1526→ goType = "*" + goType
1527→ }
1528→ }{
"file_path": "/home/david/projects/code-associates/occam2go/codegen/codegen.go",
"offset": 1528,
"limit": 15
} 1528→ }
1529→ parts = append(parts, fmt.Sprintf("%s %s", goIdent(p.Name), goType))
1530→ }
1531→ return strings.Join(parts, ", ")
1532→}
1533→
1534→func chanDirPrefix(dir string) string {
1535→ switch dir {
1536→ case "?":
1537→ return "<-chan " // input/receive-only
1538→ case "!":
1539→ return "chan<- " // output/send-only
1540→ default:
1541→ return "chan " // bidirectional
1542→ }{
"pattern": "func \\(g \\*Generator\\) generateRetypesDecl",
"path": "/home/david/projects/code-associates/occam2go/codegen/codegen.go",
"output_mode": "content"
}{
"file_path": "/home/david/projects/code-associates/occam2go/codegen/codegen.go",
"offset": 2084,
"limit": 50
} 2084→func (g *Generator) generateRetypesDecl(r *ast.RetypesDecl) {
2085→ gName := goIdent(r.Name)
2086→ gSource := goIdent(r.Source)
2087→ if r.IsArray {
2088→ // VAL [2]INT X RETYPES X : — split float64 into two int32 words
2089→ tmpVar := fmt.Sprintf("_retmp%d", g.tmpCounter)
2090→ g.tmpCounter++
2091→ g.writeLine(fmt.Sprintf("%s := math.Float64bits(float64(%s))", tmpVar, gSource))
2092→ if gName == gSource {
2093→ // Shadowing: declare a new variable of a different type
2094→ shadowTmp := fmt.Sprintf("_reshadow%d", g.tmpCounter)
2095→ g.tmpCounter++
2096→ g.writeLine(fmt.Sprintf("%s := [2]int{int(int32(uint32(%s))), int(int32(uint32(%s >> 32)))}", shadowTmp, tmpVar, tmpVar))
2097→ g.writeLine(fmt.Sprintf("%s := %s", gName, shadowTmp))
2098→ } else {
2099→ g.writeLine(fmt.Sprintf("%s := [2]int{int(int32(uint32(%s))), int(int32(uint32(%s >> 32)))}", gName, tmpVar, tmpVar))
2100→ }
2101→ } else {
2102→ // VAL INT X RETYPES X : — reinterpret float32 as int
2103→ if gName == gSource {
2104→ // Shadowing: use a temp to avoid Go treating := as = on the same name
2105→ tmpVar := fmt.Sprintf("_retmp%d", g.tmpCounter)
2106→ g.tmpCounter++
2107→ g.writeLine(fmt.Sprintf("%s := int(int32(math.Float32bits(float32(%s))))", tmpVar, gSource))
2108→ g.writeLine(fmt.Sprintf("%s := %s", gName, tmpVar))
2109→ } else {
2110→ g.writeLine(fmt.Sprintf("%s := int(int32(math.Float32bits(float32(%s))))", gName, gSource))
2111→ }
2112→ }
2113→}
2114→
2115→// containsIntrinsics checks if a statement tree contains transputer intrinsic calls.
2116→func (g *Generator) containsIntrinsics(stmt ast.Statement) bool {
2117→ return g.walkStatements(stmt, func(e ast.Expression) bool {
2118→ if fc, ok := e.(*ast.FuncCall); ok {
2119→ return transpIntrinsics[fc.Name]
2120→ }
2121→ return false
2122→ })
2123→}
2124→
2125→// containsRetypes checks if a statement tree contains RETYPES declarations.
2126→func (g *Generator) containsRetypes(stmt ast.Statement) bool {
2127→ switch s := stmt.(type) {
2128→ case *ast.RetypesDecl:
2129→ return true
2130→ case *ast.SeqBlock:
2131→ for _, inner := range s.Statements {
2132→ if g.containsRetypes(inner) {
2133→ return trueOK, I see the problem clearly now. The approach of using X := _retmp13 doesn't work because Go treats := on a parameter name in the same scope as assignment, not redeclaration.
The fix: In generateProcDecl, before generating the proc, scan the body for RetypesDecl where Name == Source (same-name RETYPES that shadows a parameter). For each such case, rename the parameter in the parameter list (e.g., X → _rp_X). Then in generateRetypesDecl, use the renamed parameter name as the source.
I'll add a rename map to the Generator and populate it during the proc body scan. Before generating the proc, I'll identify any RETYPES declarations that shadow parameter names, build the rename mapping, and then apply those renames when generating the parameter list and when referencing the source in RETYPES declarations.
The issue is clear: X := _retmp13 in the same scope as parameter X float32 is treated as assignment by Go, not redeclaration. The fix is to rename the parameter when a RETYPES shadows it. Let me implement this.
{
"pattern": "retypesRenames|tmpCounter",
"path": "/home/david/projects/code-associates/occam2go/codegen/codegen.go",
"output_mode": "content",
"-n": true
}29: tmpCounter int // for unique temp variable names
82: g.tmpCounter = 0
935: tmpName := fmt.Sprintf("_tmp%d", g.tmpCounter)
936: g.tmpCounter++
1884: flagName := fmt.Sprintf("_ifmatched%d", g.tmpCounter)
1885: g.tmpCounter++
2089: tmpVar := fmt.Sprintf("_retmp%d", g.tmpCounter)
2090: g.tmpCounter++
2094: shadowTmp := fmt.Sprintf("_reshadow%d", g.tmpCounter)
2095: g.tmpCounter++
2105: tmpVar := fmt.Sprintf("_retmp%d", g.tmpCounter)
2106: g.tmpCounter++{
"file_path": "/home/david/projects/code-associates/occam2go/codegen/codegen.go",
"offset": 16,
"limit": 30
} 16→ needTime bool // track if we need time package import
17→ needOs bool // track if we need os package import
18→ needMath bool // track if we need math package import
19→ needMathBits bool // track if we need math/bits package import
20→
21→ // Track procedure signatures for proper pointer handling
22→ procSigs map[string][]ast.ProcParam
23→ // Track current procedure's reference parameters
24→ refParams map[string]bool
25→
26→ // Protocol support
27→ protocolDefs map[string]*ast.ProtocolDecl
28→ chanProtocols map[string]string // channel name → protocol name
29→ tmpCounter int // for unique temp variable names
30→
31→ // Record support
32→ recordDefs map[string]*ast.RecordDecl
33→ recordVars map[string]string // variable name → record type name
34→
35→ // Nesting level: 0 = package level, >0 = inside a function
36→ nestingLevel int
37→}
38→
39→// Transputer intrinsic function names
40→var transpIntrinsics = map[string]bool{
41→ "LONGPROD": true,
42→ "LONGDIV": true,
43→ "LONGSUM": true,
44→ "LONGDIFF": true,
45→ "NORMALISE": true,Now I'll add a retypesRenames map to the Generator struct and use it in both generateProcDecl and generateRetypesDecl. The approach: before generating a proc's parameter list, scan its body for RETYPES that shadow parameters, and rename those parameters.
// Nesting level: 0 = package level, >0 = inside a function nestingLevel int }
// Nesting level: 0 = package level, >0 = inside a function nestingLevel int // RETYPES parameter renames: when a RETYPES declaration shadows a // parameter (e.g. VAL INT X RETYPES X :), the parameter is renamed // in the signature so := can create a new variable with the original name. retypesRenames map[string]string }
The file /home/david/projects/code-associates/occam2go/codegen/codegen.go has been updated successfully.
Now modify generateProcDecl to scan for RETYPES that shadow parameters and set up renames:
// Generate function signature
params := g.generateProcParams(proc.Params)
gName := goIdent(proc.Name)
if g.nestingLevel > 0 {
// Nested PROC: generate as Go closure
g.writeLine(fmt.Sprintf("%s := func(%s) {", gName, params))
} else {
g.writeLine(fmt.Sprintf("func %s(%s) {", gName, params))
}
g.indent++
g.nestingLevel++
for _, stmt := range proc.Body {
g.generateStatement(stmt)
} // Scan proc body for RETYPES declarations that shadow parameters.
// When VAL INT X RETYPES X :, Go can't redeclare X in the same scope,
// so we rename the parameter (e.g. X → _rp_X) and let RETYPES declare the original name.
oldRenames := g.retypesRenames
g.retypesRenames = nil
paramNames := make(map[string]bool)
for _, p := range proc.Params {
paramNames[p.Name] = true
}
for _, stmt := range proc.Body {
if rd, ok := stmt.(*ast.RetypesDecl); ok {
if paramNames[rd.Source] && rd.Name == rd.Source {
if g.retypesRenames == nil {
g.retypesRenames = make(map[string]string)
}
g.retypesRenames[rd.Name] = "_rp_" + goIdent(rd.Name)
}
}
}
// Generate function signature
params := g.generateProcParams(proc.Params)
gName := goIdent(proc.Name)
if g.nestingLevel > 0 {
// Nested PROC: generate as Go closure
g.writeLine(fmt.Sprintf("%s := func(%s) {", gName, params))
} else {
g.writeLine(fmt.Sprintf("func %s(%s) {", gName, params))
}
g.indent++
g.nestingLevel++
for _, stmt := range proc.Body {
g.generateStatement(stmt)
}The file /home/david/projects/code-associates/occam2go/codegen/codegen.go has been updated successfully.
{
"file_path": "/home/david/projects/code-associates/occam2go/codegen/codegen.go",
"offset": 1509,
"limit": 15
} 1509→ if g.nestingLevel > 0 {
1510→ // Nested PROC: generate as Go closure
1511→ g.writeLine(fmt.Sprintf("%s := func(%s) {", gName, params))
1512→ } else {
1513→ g.writeLine(fmt.Sprintf("func %s(%s) {", gName, params))
1514→ }
1515→ g.indent++
1516→ g.nestingLevel++
1517→
1518→ for _, stmt := range proc.Body {
1519→ g.generateStatement(stmt)
1520→ }
1521→
1522→ g.nestingLevel--
1523→ g.indent--{
"file_path": "/home/david/projects/code-associates/occam2go/codegen/codegen.go",
"offset": 1522,
"limit": 10
} 1522→ g.nestingLevel--
1523→ g.indent--
1524→ g.writeLine("}")
1525→ g.writeLine("")
1526→
1527→ // Restore previous context
1528→ g.refParams = oldRefParams
1529→}
1530→
1531→func (g *Generator) generateProcParams(params []ast.ProcParam) string {// Restore previous context g.refParams = oldRefParams }
// Restore previous context g.refParams = oldRefParams g.retypesRenames = oldRenames }
The file /home/david/projects/code-associates/occam2go/codegen/codegen.go has been updated successfully.
{
"file_path": "/home/david/projects/code-associates/occam2go/codegen/codegen.go",
"offset": 1531,
"limit": 25
} 1531→
1532→func (g *Generator) generateProcParams(params []ast.ProcParam) string {
1533→ var parts []string
1534→ for _, p := range params {
1535→ var goType string
1536→ if p.IsChanArray {
1537→ goType = "[]" + chanDirPrefix(p.ChanDir) + g.occamTypeToGo(p.ChanElemType)
1538→ } else if p.IsChan {
1539→ goType = chanDirPrefix(p.ChanDir) + g.occamTypeToGo(p.ChanElemType)
1540→ } else if p.IsOpenArray {
1541→ goType = "[]" + g.occamTypeToGo(p.Type)
1542→ } else if p.ArraySize != "" {
1543→ // Fixed-size array parameter: [n]TYPE
1544→ goType = "[" + p.ArraySize + "]" + g.occamTypeToGo(p.Type)
1545→ if !p.IsVal {
1546→ goType = "*" + goType
1547→ }
1548→ } else {
1549→ goType = g.occamTypeToGo(p.Type)
1550→ if !p.IsVal {
1551→ // Non-VAL parameters are pass by reference in Occam
1552→ goType = "*" + goType
1553→ }
1554→ }
1555→ parts = append(parts, fmt.Sprintf("%s %s", goIdent(p.Name), goType)) parts = append(parts, fmt.Sprintf("%s %s", goIdent(p.Name), goType)) pName := goIdent(p.Name)
if renamed, ok := g.retypesRenames[p.Name]; ok {
pName = renamed
}
parts = append(parts, fmt.Sprintf("%s %s", pName, goType))The file /home/david/projects/code-associates/occam2go/codegen/codegen.go has been updated successfully.
{
"file_path": "/home/david/projects/code-associates/occam2go/codegen/codegen.go",
"offset": 2098,
"limit": 30
} 2098→// generateArrayLiteral emits a Go slice literal: []int{e1, e2, ...}
2099→func (g *Generator) generateArrayLiteral(al *ast.ArrayLiteral) {
2100→ g.write("[]int{")
2101→ for i, elem := range al.Elements {
2102→ if i > 0 {
2103→ g.write(", ")
2104→ }
2105→ g.generateExpression(elem)
2106→ }
2107→ g.write("}")
2108→}
2109→
2110→// generateRetypesDecl emits code for a RETYPES declaration.
2111→// VAL INT X RETYPES X : — reinterpret float32/64 bits as int(s)
2112→// When source and target have the same name (shadowing), we use a temp variable
2113→// to avoid Go's `:=` treating it as `=` to the existing parameter.
2114→func (g *Generator) generateRetypesDecl(r *ast.RetypesDecl) {
2115→ gName := goIdent(r.Name)
2116→ gSource := goIdent(r.Source)
2117→ if r.IsArray {
2118→ // VAL [2]INT X RETYPES X : — split float64 into two int32 words
2119→ tmpVar := fmt.Sprintf("_retmp%d", g.tmpCounter)
2120→ g.tmpCounter++
2121→ g.writeLine(fmt.Sprintf("%s := math.Float64bits(float64(%s))", tmpVar, gSource))
2122→ if gName == gSource {
2123→ // Shadowing: declare a new variable of a different type
2124→ shadowTmp := fmt.Sprintf("_reshadow%d", g.tmpCounter)
2125→ g.tmpCounter++
2126→ g.writeLine(fmt.Sprintf("%s := [2]int{int(int32(uint32(%s))), int(int32(uint32(%s >> 32)))}", shadowTmp, tmpVar, tmpVar))
2127→ g.writeLine(fmt.Sprintf("%s := %s", gName, shadowTmp))Now I'll rewrite generateRetypesDecl to use the renamed parameter source from retypesRenames:
// generateRetypesDecl emits code for a RETYPES declaration.
// VAL INT X RETYPES X : — reinterpret float32/64 bits as int(s)
// When source and target have the same name (shadowing), we use a temp variable
// to avoid Go's `:=` treating it as `=` to the existing parameter.
func (g *Generator) generateRetypesDecl(r *ast.RetypesDecl) {
gName := goIdent(r.Name)
gSource := goIdent(r.Source)
if r.IsArray {
// VAL [2]INT X RETYPES X : — split float64 into two int32 words
tmpVar := fmt.Sprintf("_retmp%d", g.tmpCounter)
g.tmpCounter++
g.writeLine(fmt.Sprintf("%s := math.Float64bits(float64(%s))", tmpVar, gSource))
if gName == gSource {
// Shadowing: declare a new variable of a different type
shadowTmp := fmt.Sprintf("_reshadow%d", g.tmpCounter)
g.tmpCounter++
g.writeLine(fmt.Sprintf("%s := [2]int{int(int32(uint32(%s))), int(int32(uint32(%s >> 32)))}", shadowTmp, tmpVar, tmpVar))
g.writeLine(fmt.Sprintf("%s := %s", gName, shadowTmp))
} else {
g.writeLine(fmt.Sprintf("%s := [2]int{int(int32(uint32(%s))), int(int32(uint32(%s >> 32)))}", gName, tmpVar, tmpVar))
}
} else {
// VAL INT X RETYPES X : — reinterpret float32 as int
if gName == gSource {
// Shadowing: use a temp to avoid Go treating := as = on the same name
tmpVar := fmt.Sprintf("_retmp%d", g.tmpCounter)
g.tmpCounter++
g.writeLine(fmt.Sprintf("%s := int(int32(math.Float32bits(float32(%s))))", tmpVar, gSource))
g.writeLine(fmt.Sprintf("%s := %s", gName, tmpVar))
} else {
g.writeLine(fmt.Sprintf("%s := int(int32(math.Float32bits(float32(%s))))", gName, gSource))
}
}
}// generateRetypesDecl emits code for a RETYPES declaration.
// VAL INT X RETYPES X : — reinterpret float32/64 bits as int(s)
// When source and target share the same name (shadowing a parameter), the parameter
// has been renamed in the signature (e.g. X → _rp_X) so we can use := with the
// original name to create a new variable.
func (g *Generator) generateRetypesDecl(r *ast.RetypesDecl) {
gName := goIdent(r.Name)
gSource := goIdent(r.Source)
// If the parameter was renamed for RETYPES shadowing, use the renamed source
if renamed, ok := g.retypesRenames[r.Source]; ok {
gSource = renamed
}
if r.IsArray {
// VAL [2]INT X RETYPES X : — split float64 into two int32 words
tmpVar := fmt.Sprintf("_retmp%d", g.tmpCounter)
g.tmpCounter++
g.writeLine(fmt.Sprintf("%s := math.Float64bits(float64(%s))", tmpVar, gSource))
g.writeLine(fmt.Sprintf("%s := [2]int{int(int32(uint32(%s))), int(int32(uint32(%s >> 32)))}", gName, tmpVar, tmpVar))
} else {
// VAL INT X RETYPES X : — reinterpret float32 as int
g.writeLine(fmt.Sprintf("%s := int(int32(math.Float32bits(float32(%s))))", gName, gSource))
}
}The file /home/david/projects/code-associates/occam2go/codegen/codegen.go has been updated successfully.
Now let me also do the same for generateFuncDecl since FUNCTIONs might also have RETYPES. Let me check if the float_io.occ uses RETYPES inside functions.
{
"pattern": "RETYPES",
"path": "/home/david/projects/code-associates/occam2go/kroc/modules/course/libsrc/float_io.occ",
"output_mode": "content",
"-B": 3
}136- --}}} 137- 138- #IF (TARGET.BITS.PER.WORD = 32) 139: VAL INT X RETYPES X : -- 1102- --}}} 1103- 1104- #IF (TARGET.BITS.PER.WORD = 32) 1105: VAL [2]INT X RETYPES X :
./occam2go -I kroc/modules/course/libsrc -D TARGET.BITS.PER.WORD=32 -o /tmp/course_out.go kroc/modules/course/libsrc/course.module 2>&1
Exit code 1 # command-line-arguments # [command-line-arguments] vet: /tmp/course_out.go:1642:17: cannot use Xfrac (variable of type int) as *int value in argument to Round # command-line-arguments # [command-line-arguments] vet: /tmp/course_out.go:1642:17: cannot use Xfrac (variable of type int) as *int value in argument to Round
1635→ // SKIP
1636→ }
1637→ } else if true {
1638→ N = MaxDecN
1639→ }
1640→ Scale = ((N - 1) - LogX)
1641→ ScaleX(&Xexp, &Xfrac, Scale, &round_direction)
1642→ Round(&Xexp, Xfrac, Xint, &round_direction)
1643→ Carry, M = _LONGDIFF(Xfrac, PowerTenFrac[N], 0)
1644→ if ((Xexp > PowerTenExp[N]) || ((Xexp == PowerTenExp[N]) && (Carry == 0))) {
1645→ LogX = (LogX + 1)
1646→ Xexp = Yexp
1647→ Xfrac = Yfrac
1648→ ScaleX(&Xexp, &Xfrac, (Scale - 1), &round_direction)
1649→ Round(&Xexp, Xfrac, Xint, &round_direction)The issue is Round(&Xexp, Xfrac, Xint, &round_direction) — Xfrac is being passed without & but the proc expects a pointer. Let me check the Round proc signature.
{
"pattern": "func Round\\(",
"path": "/tmp/course_out.go",
"output_mode": "content"
}{
"pattern": "Round\"?\\s*:?=?\\s*func",
"path": "/tmp/course_out.go",
"output_mode": "content"
}1387: Round := func(Xexp *int, Xfrac *int, Xint *int, direction *int) {
2022: Round := func(Xexp *int, Xfrac *[2]int, Xint *[2]int, direction *int) {So Round expects Xfrac *int (a pointer), but Xfrac is being passed as Xfrac (an int). The occam source likely passes it as a non-VAL parameter (by reference). Let me look at the original occam source to understand.
{
"pattern": "Round.*Xfrac.*Xint\"",
"path": "/home/david/projects/code-associates/occam2go/kroc/modules/course/libsrc/float_io.occ",
"output_mode": "content",
"-n": true
}{
"pattern": "Round\\s*\\(\"",
"path": "/home/david/projects/code-associates/occam2go/kroc/modules/course/libsrc/float_io.occ",
"output_mode": "content",
"-n": true
}{
"pattern": "Round",
"path": "/home/david/projects/code-associates/occam2go/kroc/modules/course/libsrc/float_io.occ",
"output_mode": "content",
"-n": true
}51: -- Notes: Rounding mode is round to nearest.
119: -- Removed the test ( Xexp >= BitsPerWord ) from Round because
355: --{{{ Round
356: PROC Round (INT Xexp, Xfrac, Xint, direction)
734: Round (Xexp, Xfrac, Xint, round.direction)--now as integer
746: Round (Xexp, Xfrac, Xint, round.direction)
1019: -- Notes: Rounding mode is round to nearest.
1087: -- Removed the test Xexp >= ( 2 * BitsPerWord ) from Round because
1477: --{{{ Round
1478: PROC Round (INT Xexp, [2]INT Xfrac, Xint, INT direction)
1919: Round (Xexp, Xfrac, Xint, round.direction) --now as integer
1935: Round (Xexp, Xfrac, Xint, round.direction)
1949: Round (Xexp, Xfrac, Xint, round.direction)Line 356: PROC Round (INT Xexp, Xfrac, Xint, direction) — all INT parameters, not VAL, so they're all reference params. The call is Round (Xexp, Xfrac, Xint, round.direction).
In the generated Go, Round expects (*int, *int, *int, *int) and the call should pass &Xexp, &Xfrac, &Xint, &round_direction. The issue is that Xfrac and Xint are being passed without & on line 1642.
This is a proc call argument generation issue — the codegen needs to know that these are non-VAL params and wrap them with &. Let me check how generateProcCall handles this.
{
"pattern": "func \\(g \\*Generator\\) generateProcCall",
"path": "/home/david/projects/code-associates/occam2go/codegen/codegen.go",
"output_mode": "content"
}{
"file_path": "/home/david/projects/code-associates/occam2go/codegen/codegen.go",
"offset": 1575,
"limit": 80
} 1575→func (g *Generator) generateProcCall(call *ast.ProcCall) {
1576→ // Handle built-in print procedures
1577→ if printBuiltins[call.Name] {
1578→ g.generatePrintCall(call)
1579→ return
1580→ }
1581→
1582→ // Handle CAUSEERROR
1583→ if call.Name == "CAUSEERROR" {
1584→ g.writeLine(`panic("CAUSEERROR")`)
1585→ return
1586→ }
1587→
1588→ g.builder.WriteString(strings.Repeat("\t", g.indent))
1589→ g.write(goIdent(call.Name))
1590→ g.write("(")
1591→
1592→ // Look up procedure signature to determine which args need address-of
1593→ params := g.procSigs[call.Name]
1594→
1595→ for i, arg := range call.Args {
1596→ if i > 0 {
1597→ g.write(", ")
1598→ }
1599→ // If this parameter is not VAL (i.e., pass by reference), take address
1600→ // Channels and channel arrays are already reference types, so no & needed
1601→ if i < len(params) && !params[i].IsVal && !params[i].IsChan && !params[i].IsChanArray && !params[i].IsOpenArray && params[i].ArraySize == "" {
1602→ g.write("&")
1603→ }
1604→ // Wrap string literals with []byte() when passed to []BYTE parameters
1605→ if _, isStr := arg.(*ast.StringLiteral); isStr && i < len(params) && params[i].IsOpenArray && params[i].Type == "BYTE" {
1606→ g.write("[]byte(")
1607→ g.generateExpression(arg)
1608→ g.write(")")
1609→ } else {
1610→ g.generateExpression(arg)
1611→ }
1612→ }
1613→ g.write(")")
1614→ g.write("\n")
1615→}
1616→
1617→func (g *Generator) generateFuncDecl(fn *ast.FuncDecl) {
1618→ params := g.generateProcParams(fn.Params)
1619→
1620→ // Build return type string
1621→ var returnTypeStr string
1622→ if len(fn.ReturnTypes) == 1 {
1623→ returnTypeStr = g.occamTypeToGo(fn.ReturnTypes[0])
1624→ } else {
1625→ goTypes := make([]string, len(fn.ReturnTypes))
1626→ for i, rt := range fn.ReturnTypes {
1627→ goTypes[i] = g.occamTypeToGo(rt)
1628→ }
1629→ returnTypeStr = "(" + strings.Join(goTypes, ", ") + ")"
1630→ }
1631→
1632→ gName := goIdent(fn.Name)
1633→ if g.nestingLevel > 0 {
1634→ // Nested FUNCTION: generate as Go closure
1635→ g.writeLine(fmt.Sprintf("%s := func(%s) %s {", gName, params, returnTypeStr))
1636→ } else {
1637→ g.writeLine(fmt.Sprintf("func %s(%s) %s {", gName, params, returnTypeStr))
1638→ }
1639→ g.indent++
1640→ g.nestingLevel++
1641→
1642→ for _, stmt := range fn.Body {
1643→ g.generateStatement(stmt)
1644→ }
1645→
1646→ if len(fn.ResultExprs) > 0 {
1647→ g.builder.WriteString(strings.Repeat("\t", g.indent))
1648→ g.write("return ")
1649→ for i, expr := range fn.ResultExprs {
1650→ if i > 0 {
1651→ g.write(", ")
1652→ }
1653→ g.generateExpression(expr)
1654→ }The issue is at line 1601: the generateProcCall function checks i < len(params) and adds & for non-VAL params. The problem is likely that Round is a nested PROC (defined inside a closure), and its signature might not be in g.procSigs.
Let me check: Round is defined as a nested PROC inside course_REAL32TOSTRING. Since it's nested, its signature may not be registered in g.procSigs. Let me check how procSigs gets populated.
The issue is likely that Round is a nested PROC and its signature isn't in g.procSigs. Let me check.
{
"pattern": "procSigs",
"path": "/home/david/projects/code-associates/occam2go/codegen/codegen.go",
"output_mode": "content",
"-n": true
}22: procSigs map[string][]ast.ProcParam 83: g.procSigs = make(map[string][]ast.ProcParam) 116: g.procSigs[proc.Name] = proc.Params 120: g.procSigs[fn.Name] = fn.Params 265: g.procSigs[proc.Name] = proc.Params 269: g.procSigs[fn.Name] = fn.Params 1593: params := g.procSigs[call.Name] 1671: params := g.procSigs[call.Name]
{
"file_path": "/home/david/projects/code-associates/occam2go/codegen/codegen.go",
"offset": 106,
"limit": 30
} 106→ if g.containsMostExpr(stmt) {
107→ g.needMath = true
108→ }
109→ if g.containsIntrinsics(stmt) {
110→ g.needMathBits = true
111→ }
112→ if g.containsRetypes(stmt) {
113→ g.needMath = true
114→ }
115→ if proc, ok := stmt.(*ast.ProcDecl); ok {
116→ g.procSigs[proc.Name] = proc.Params
117→ g.collectNestedProcSigs(proc.Body)
118→ }
119→ if fn, ok := stmt.(*ast.FuncDecl); ok {
120→ g.procSigs[fn.Name] = fn.Params
121→ }
122→ if proto, ok := stmt.(*ast.ProtocolDecl); ok {
123→ g.protocolDefs[proto.Name] = proto
124→ }
125→ if rec, ok := stmt.(*ast.RecordDecl); ok {
126→ g.recordDefs[rec.Name] = rec
127→ }
128→ g.collectChanProtocols(stmt)
129→ g.collectRecordVars(stmt)
130→ }
131→
132→ // Write package declaration
133→ g.writeLine("package main")
134→ g.writeLine("")
135→{
"pattern": "func \\(g \\*Generator\\) collectNestedProcSigs",
"path": "/home/david/projects/code-associates/occam2go/codegen/codegen.go",
"output_mode": "content",
"-A": 20
}262:func (g *Generator) collectNestedProcSigs(stmts []ast.Statement) {
263- for _, stmt := range stmts {
264- if proc, ok := stmt.(*ast.ProcDecl); ok {
265- g.procSigs[proc.Name] = proc.Params
266- g.collectNestedProcSigs(proc.Body)
267- }
268- if fn, ok := stmt.(*ast.FuncDecl); ok {
269- g.procSigs[fn.Name] = fn.Params
270- }
271- }
272-}
273-
274-func (g *Generator) containsPar(stmt ast.Statement) bool {
275- switch s := stmt.(type) {
276- case *ast.ParBlock:
277- return true
278- case *ast.SeqBlock:
279- for _, inner := range s.Statements {
280- if g.containsPar(inner) {
281- return true
282- }I see the issue. collectNestedProcSigs only looks at top-level statements within a proc body. It doesn't recurse into SeqBlock, ParBlock, IfStatement, etc. So nested PROCs that are inside a SEQ block within another PROC won't be found.
In float_io.occ, Round is defined as a nested PROC inside course.REAL32TOSTRING. The body of course.REAL32TOSTRING likely has a SEQ block containing Round as a nested proc declaration. Since collectNestedProcSigs only checks for *ast.ProcDecl at the top level of the statement list, it won't find Round if it's inside a SeqBlock.
I need to update collectNestedProcSigs to recursively traverse into compound statement types like SeqBlock, ParBlock, and IfStatement so it can find nested proc declarations at any depth.
The issue is that collectNestedProcSigs only checks direct children, not nested compound blocks like SeqBlock. Round is nested inside a SeqBlock within the outer PROC, so its signature isn't found. Let me fix the recursive scanning.
func (g *Generator) collectNestedProcSigs(stmts []ast.Statement) {
for _, stmt := range stmts {
if proc, ok := stmt.(*ast.ProcDecl); ok {
g.procSigs[proc.Name] = proc.Params
g.collectNestedProcSigs(proc.Body)
}
if fn, ok := stmt.(*ast.FuncDecl); ok {
g.procSigs[fn.Name] = fn.Params
}
}
}func (g *Generator) collectNestedProcSigs(stmts []ast.Statement) {
for _, stmt := range stmts {
switch s := stmt.(type) {
case *ast.ProcDecl:
g.procSigs[s.Name] = s.Params
g.collectNestedProcSigs(s.Body)
case *ast.FuncDecl:
g.procSigs[s.Name] = s.Params
g.collectNestedProcSigs(s.Body)
case *ast.SeqBlock:
g.collectNestedProcSigs(s.Statements)
case *ast.ParBlock:
g.collectNestedProcSigs(s.Processes)
case *ast.IfStatement:
for _, c := range s.Choices {
g.collectNestedProcSigs(c.Body)
}
case *ast.WhileLoop:
g.collectNestedProcSigs(s.Body)
case *ast.CaseStatement:
for _, opt := range s.Options {
g.collectNestedProcSigs(opt.Body)
}
}
}
}The file /home/david/projects/code-associates/occam2go/codegen/codegen.go has been updated successfully.
Exit code 1 # github.com/codeassociates/occam2go/codegen codegen/codegen.go:274:30: s.Processes undefined (type *ast.ParBlock has no field or method Processes) codegen/codegen.go:282:26: s.Options undefined (type *ast.CaseStatement has no field or method Options) # github.com/codeassociates/occam2go/codegen codegen/codegen.go:274:30: s.Processes undefined (type *ast.ParBlock has no field or method Processes) codegen/codegen.go:282:26: s.Options undefined (type *ast.CaseStatement has no field or method Options)
{
"pattern": "type ParBlock struct|type CaseStatement struct",
"path": "/home/david/projects/code-associates/occam2go/ast/ast.go",
"output_mode": "content",
"-A": 5
}99:type ParBlock struct {
100- Token lexer.Token // the PAR token
101- Statements []Statement
102- Replicator *Replicator // optional replicator
103-}
104-
--
216:type CaseStatement struct {
217- Token lexer.Token // the CASE token
218- Selector Expression // the selector expression
219- Choices []CaseChoice
220-}
221- case *ast.ParBlock:
g.collectNestedProcSigs(s.Processes)
case *ast.IfStatement:
for _, c := range s.Choices {
g.collectNestedProcSigs(c.Body)
}
case *ast.WhileLoop:
g.collectNestedProcSigs(s.Body)
case *ast.CaseStatement:
for _, opt := range s.Options {
g.collectNestedProcSigs(opt.Body)
} case *ast.ParBlock:
g.collectNestedProcSigs(s.Statements)
case *ast.IfStatement:
for _, c := range s.Choices {
g.collectNestedProcSigs(c.Body)
}
case *ast.WhileLoop:
g.collectNestedProcSigs(s.Body)
case *ast.CaseStatement:
for _, ch := range s.Choices {
g.collectNestedProcSigs(ch.Body)
}The file /home/david/projects/code-associates/occam2go/codegen/codegen.go has been updated successfully.
go build -o occam2go . 2>&1 && ./occam2go -I kroc/modules/course/libsrc -D TARGET.BITS.PER.WORD=32 -o /tmp/course_out.go kroc/modules/course/libsrc/course.module 2>&1 && go vet /tmp/course_out.go 2>&1
Exit code 1 # command-line-arguments # [command-line-arguments] vet: /tmp/course_out.go:1642:17: cannot use Xfrac (variable of type int) as *int value in argument to Round # command-line-arguments # [command-line-arguments] vet: /tmp/course_out.go:1642:17: cannot use Xfrac (variable of type int) as *int value in argument to Round
Still the same error. The nested proc sigs aren't being collected deep enough. Let me check what surrounds the Round proc in the generated output and in the parsed AST.
{
"pattern": "Round := func\"",
"path": "/tmp/course_out.go",
"output_mode": "content",
"-n": true
}{
"pattern": "Round := func",
"path": "/tmp/course_out.go",
"output_mode": "content",
"-n": true
}1387: Round := func(Xexp *int, Xfrac *int, Xint *int, direction *int) {
2022: Round := func(Xexp *int, Xfrac *[2]int, Xint *[2]int, direction *int) { 1380→ Borrow, dummy = _LONGDIFF(0, low_word, 0)
1381→ Borrow, *LogX = _LONGDIFF(0, X, Borrow)
1382→ } else if true {
1383→ *LogX, dummy = _LONGPROD(X, R2L, 0)
1384→ }
1385→ }
1386→
1387→ Round := func(Xexp *int, Xfrac *int, Xint *int, direction *int) {
1388→ var Places, Guard, ExpExtra int
1389→ _ = Places
1390→ _ = Guard
1391→ _ = ExpExtraSo Round is generated correctly as a closure. Now the call Round(&Xexp, Xfrac, Xint, &round_direction) — the issue is Xfrac and Xint are not being prefixed with &.
Since Round is a nested proc defined as a closure, its signature was registered in procSigs during collectNestedProcSigs of the parent proc's body. But perhaps the signature was registered with the occam name Round while the lookup uses a different name? Let me check.
1639→ } 1640→ Scale = ((N - 1) - LogX) 1641→ ScaleX(&Xexp, &Xfrac, Scale, &round_direction) 1642→ Round(&Xexp, Xfrac, Xint, &round_direction) 1643→ Carry, M = _LONGDIFF(Xfrac, PowerTenFrac[N], 0)
Round(&Xexp, Xfrac, Xint, &round_direction) — Xexp and round_direction get & but Xfrac and Xint don't. Let me check the occam signature:
PROC Round (INT Xexp, Xfrac, Xint, direction) — this is the shared-type pattern where all 4 params share the INT type, all non-VAL. So they should all be *int.
The issue might be in how the parser handles shared-type params in nested procs. Let me add some debug output to understand.
cat > /tmp/test_shared.occ << 'EOF'
PROC main()
PROC inner(INT a, b, c)
SEQ
a := 1
b := 2
c := 3
:
INT x, y, z:
SEQ
x := 0
y := 0
z := 0
inner(x, y, z)
:
EOF
./occam2go -o /tmp/test_shared.go /tmp/test_shared.occ 2>&1 && cat /tmp/test_shared.gopackage main
func main() {
inner := func(a *int, b *int, c *int) {
*a = 1
*b = 2
*c = 3
}
var x, y, z int
_ = x
_ = y
_ = z
x = 0
y = 0
z = 0
inner(&x, &y, &z)
}That works correctly! So shared-type params work fine in general. The issue must be specific to how Round is nested deeper or how the proc signature is being registered. Let me add a quick debug check.
grep -n "PROC Round" /home/david/projects/code-associates/occam2go/kroc/modules/course/libsrc/float_io.occ
356: PROC Round (INT Xexp, Xfrac, Xint, direction) 1478: PROC Round (INT Xexp, [2]INT Xfrac, Xint, INT direction)
{
"file_path": "/home/david/projects/code-associates/occam2go/kroc/modules/course/libsrc/float_io.occ",
"offset": 1476,
"limit": 3
} 1476→ --}}}
1477→ --{{{ Round
1478→ PROC Round (INT Xexp, [2]INT Xfrac, Xint, INT direction)There are TWO Round definitions with different signatures:
1. Line 356: PROC Round (INT Xexp, Xfrac, Xint, direction) — all INT
2. Line 1478: PROC Round (INT Xexp, [2]INT Xfrac, Xint, INT direction) — INT, [2]INT, [2]INT (shared), INT
Since the second definition (line 1478) is processed after the first, it overwrites the signature in procSigs. The call at line 1642 is using the first Round's context, but the procSigs map has been overwritten with the second definition's signature.
The real issue is that both Round procedures are nested inside different parent procedures (course_REAL32TOSTRING and course_REAL64TOSTRING), but procSigs is a flat map that doesn't account for scope. During the first pass, collectNestedProcSigs processes course_REAL32TOSTRING first and registers the 4-INT version of Round, then overwrites it when processing course_REAL64TOSTRING with the mixed-type signature. When code generation happens later, course_REAL32TOSTRING tries to call Round but finds the wrong signature in the map.
The fix needs to handle scoped lookups — either by not overwriting existing signatures, or by deferring nested proc registration until generation time when we can manage scope properly. Let me trace through exactly how collectNestedProcSigs walks the tree to understand where the collision happens. The real issue is that collectNestedProcSigs runs globally in the first pass, so nested procs from different scopes overwrite each other in procSigs. The simplest fix is to register nested proc signatures at generation time instead — when entering generateProcDecl, I'll scan for immediate nested ProcDecls, register them temporarily, generate the body, then restore the old signatures.
The problem is a name collision! There are TWO nested Round procs with different signatures — one inside REAL32TOSTRING and one inside REAL64TOSTRING. The first pass's collectNestedProcSigs puts both into a flat map, so the second overwrites the first. I need to register nested proc signatures locally when entering each proc scope.
{
"file_path": "/home/david/projects/code-associates/occam2go/codegen/codegen.go",
"offset": 1490,
"limit": 40
} 1490→ }
1491→ }
1492→ // Register record-typed params
1493→ if !p.IsChan {
1494→ if _, ok := g.recordDefs[p.Type]; ok {
1495→ g.recordVars[p.Name] = p.Type
1496→ }
1497→ }
1498→ }
1499→ g.refParams = newRefParams
1500→
1501→ // Scan proc body for RETYPES declarations that shadow parameters.
1502→ // When VAL INT X RETYPES X :, Go can't redeclare X in the same scope,
1503→ // so we rename the parameter (e.g. X → _rp_X) and let RETYPES declare the original name.
1504→ oldRenames := g.retypesRenames
1505→ g.retypesRenames = nil
1506→ paramNames := make(map[string]bool)
1507→ for _, p := range proc.Params {
1508→ paramNames[p.Name] = true
1509→ }
1510→ for _, stmt := range proc.Body {
1511→ if rd, ok := stmt.(*ast.RetypesDecl); ok {
1512→ if paramNames[rd.Source] && rd.Name == rd.Source {
1513→ if g.retypesRenames == nil {
1514→ g.retypesRenames = make(map[string]string)
1515→ }
1516→ g.retypesRenames[rd.Name] = "_rp_" + goIdent(rd.Name)
1517→ }
1518→ }
1519→ }
1520→
1521→ // Generate function signature
1522→ params := g.generateProcParams(proc.Params)
1523→ gName := goIdent(proc.Name)
1524→ if g.nestingLevel > 0 {
1525→ // Nested PROC: generate as Go closure
1526→ g.writeLine(fmt.Sprintf("%s := func(%s) {", gName, params))
1527→ } else {
1528→ g.writeLine(fmt.Sprintf("func %s(%s) {", gName, params))
1529→ }I need to add local nested proc signature registration. Let me add it right before the body generation loop:
{
"file_path": "/home/david/projects/code-associates/occam2go/codegen/codegen.go",
"offset": 1530,
"limit": 20
} 1530→ g.indent++
1531→ g.nestingLevel++
1532→
1533→ for _, stmt := range proc.Body {
1534→ g.generateStatement(stmt)
1535→ }
1536→
1537→ g.nestingLevel--
1538→ g.indent--
1539→ g.writeLine("}")
1540→ g.writeLine("")
1541→
1542→ // Restore previous context
1543→ g.refParams = oldRefParams
1544→ g.retypesRenames = oldRenames
1545→}
1546→
1547→func (g *Generator) generateProcParams(params []ast.ProcParam) string {
1548→ var parts []string
1549→ for _, p := range params { g.indent++
g.nestingLevel++
for _, stmt := range proc.Body {
g.generateStatement(stmt)
}
g.nestingLevel--
g.indent--
g.writeLine("}")
g.writeLine("")
// Restore previous context
g.refParams = oldRefParams
g.retypesRenames = oldRenames
} g.indent++
g.nestingLevel++
// Register nested proc/func signatures for this scope so that calls
// within this proc resolve to the correct (local) signature rather than
// a same-named proc from a different scope.
oldSigs := make(map[string][]ast.ProcParam)
g.collectNestedProcSigsScoped(proc.Body, oldSigs)
for _, stmt := range proc.Body {
g.generateStatement(stmt)
}
// Restore overwritten signatures
for name, params := range oldSigs {
if params == nil {
delete(g.procSigs, name)
} else {
g.procSigs[name] = params
}
}
g.nestingLevel--
g.indent--
g.writeLine("}")
g.writeLine("")
// Restore previous context
g.refParams = oldRefParams
g.retypesRenames = oldRenames
}The file /home/david/projects/code-associates/occam2go/codegen/codegen.go has been updated successfully.
Now I need to add the collectNestedProcSigsScoped helper that saves old values before overwriting:
{
"file_path": "/home/david/projects/code-associates/occam2go/codegen/codegen.go",
"offset": 275,
"limit": 20
} 275→ case *ast.IfStatement:
276→ for _, c := range s.Choices {
277→ g.collectNestedProcSigs(c.Body)
278→ }
279→ case *ast.WhileLoop:
280→ g.collectNestedProcSigs(s.Body)
281→ case *ast.CaseStatement:
282→ for _, ch := range s.Choices {
283→ g.collectNestedProcSigs(ch.Body)
284→ }
285→ }
286→ }
287→}
288→
289→func (g *Generator) containsPar(stmt ast.Statement) bool {
290→ switch s := stmt.(type) {
291→ case *ast.ParBlock:
292→ return true
293→ case *ast.SeqBlock:
294→ for _, inner := range s.Statements {func (g *Generator) containsPar(stmt ast.Statement) bool {// collectNestedProcSigsScoped registers nested proc/func signatures into procSigs
// for the current scope. It saves old values into oldSigs so they can be restored
// after the scope ends (preventing name collisions between same-named nested procs
// in different parent procs).
func (g *Generator) collectNestedProcSigsScoped(stmts []ast.Statement, oldSigs map[string][]ast.ProcParam) {
for _, stmt := range stmts {
switch s := stmt.(type) {
case *ast.ProcDecl:
if _, saved := oldSigs[s.Name]; !saved {
oldSigs[s.Name] = g.procSigs[s.Name] // nil if not previously set
}
g.procSigs[s.Name] = s.Params
g.collectNestedProcSigsScoped(s.Body, oldSigs)
case *ast.FuncDecl:
if _, saved := oldSigs[s.Name]; !saved {
oldSigs[s.Name] = g.procSigs[s.Name]
}
g.procSigs[s.Name] = s.Params
g.collectNestedProcSigsScoped(s.Body, oldSigs)
case *ast.SeqBlock:
g.collectNestedProcSigsScoped(s.Statements, oldSigs)
case *ast.ParBlock:
g.collectNestedProcSigsScoped(s.Statements, oldSigs)
case *ast.IfStatement:
for _, c := range s.Choices {
g.collectNestedProcSigsScoped(c.Body, oldSigs)
}
case *ast.WhileLoop:
g.collectNestedProcSigsScoped(s.Body, oldSigs)
case *ast.CaseStatement:
for _, ch := range s.Choices {
g.collectNestedProcSigsScoped(ch.Body, oldSigs)
}
}
}
}
func (g *Generator) containsPar(stmt ast.Statement) bool {The file /home/david/projects/code-associates/occam2go/codegen/codegen.go has been updated successfully.
go build -o occam2go . 2>&1 && ./occam2go -I kroc/modules/course/libsrc -D TARGET.BITS.PER.WORD=32 -o /tmp/course_out.go kroc/modules/course/libsrc/course.module 2>&1 && go vet /tmp/course_out.go 2>&1
Exit code 1 # command-line-arguments # [command-line-arguments] vet: /tmp/course_out.go:1266:2: declared and not used: MaxInt # command-line-arguments # [command-line-arguments] vet: /tmp/course_out.go:1266:2: declared and not used: MaxInt
The RETYPES and proc sig issues are fixed. Now we have a "declared and not used" error for MaxInt. Let me check.
1262→func course_REAL32TOSTRING(len *int, string []byte, _rp_X float32, Ip int, Dp int) {
1263→ X := int(int32(math.Float32bits(float32(_rp_X))))
1264→ SignBit := 2147483648
1265→ BitsPerWord := 32
1266→ MaxInt := 4294967295
1267→ BytesPerWord := 4
1268→ RealShift := 8
1269→ BitsInFrac := 24
1270→ RealExpSh := 23
1271→ RealExp := 255
1272→ RealXcess := 127
1273→ RealRBit := 128
1274→ RealInf := 2139095040
1275→ Table10 := 13
1276→ Bias10 := 3Good — the RETYPES rename is working correctly (_rp_X parameter, X := new variable). The remaining MaxInt declared-not-used issue is from the occam source — it's a constant declared but never referenced. This is a general issue with how we generate _ = var suppression statements.
Let me check if we already generate _ = var for local abbreviations.
{
"pattern": "_ = BitsPerWord",
"path": "/tmp/course_out.go",
"output_mode": "content"
}So the abbreviations defined inside function bodies don't get _ = suppression. These are VAL abbreviations (untyped, like MaxInt := 4294967295). Let me check how abbreviations are generated inside function bodies.
{
"pattern": "Abbreviation.*generateStatement\\|generateStatement.*Abbreviation\\|case \\*ast.Abbreviation\"",
"path": "/home/david/projects/code-associates/occam2go/codegen/codegen.go",
"output_mode": "content",
"-n": true
}{
"pattern": "case \\*ast\\.Abbreviation",
"path": "/home/david/projects/code-associates/occam2go/codegen/codegen.go",
"output_mode": "content",
"-n": true
}193: case *ast.Abbreviation: 630: case *ast.Abbreviation: 858: case *ast.Abbreviation: 2281: case *ast.Abbreviation:
{
"file_path": "/home/david/projects/code-associates/occam2go/codegen/codegen.go",
"offset": 858,
"limit": 30
} 858→ case *ast.Abbreviation:
859→ g.generateAbbreviation(s)
860→ case *ast.MultiAssignment:
861→ g.generateMultiAssignment(s)
862→ case *ast.RetypesDecl:
863→ g.generateRetypesDecl(s)
864→ }
865→}
866→
867→func (g *Generator) generateVarDecl(decl *ast.VarDecl) {
868→ goType := g.occamTypeToGo(decl.Type)
869→ goNames := make([]string, len(decl.Names))
870→ for i, n := range decl.Names {
871→ goNames[i] = goIdent(n)
872→ }
873→ g.writeLine(fmt.Sprintf("var %s %s", strings.Join(goNames, ", "), goType))
874→ // Suppress "declared and not used" for each variable
875→ for _, n := range goNames {
876→ g.writeLine(fmt.Sprintf("_ = %s", n))
877→ }
878→}
879→
880→func (g *Generator) generateAbbreviation(abbr *ast.Abbreviation) {
881→ g.builder.WriteString(strings.Repeat("\t", g.indent))
882→ g.write(fmt.Sprintf("%s := ", goIdent(abbr.Name)))
883→ g.generateExpression(abbr.Value)
884→ g.write("\n")
885→}
886→
887→func (g *Generator) generateChanDecl(decl *ast.ChanDecl) {The generateAbbreviation function (line 880) doesn't add _ = name suppression! Variables declared with var get it (line 875-876), but abbreviations don't. Let me add that.
func (g *Generator) generateAbbreviation(abbr *ast.Abbreviation) {
g.builder.WriteString(strings.Repeat("\t", g.indent))
g.write(fmt.Sprintf("%s := ", goIdent(abbr.Name)))
g.generateExpression(abbr.Value)
g.write("\n")
}func (g *Generator) generateAbbreviation(abbr *ast.Abbreviation) {
g.builder.WriteString(strings.Repeat("\t", g.indent))
g.write(fmt.Sprintf("%s := ", goIdent(abbr.Name)))
g.generateExpression(abbr.Value)
g.write("\n")
// Suppress "declared and not used" for abbreviations inside function bodies
if g.nestingLevel > 0 {
g.writeLine(fmt.Sprintf("_ = %s", goIdent(abbr.Name)))
}
}The file /home/david/projects/code-associates/occam2go/codegen/codegen.go has been updated successfully.
go build -o occam2go . 2>&1 && ./occam2go -I kroc/modules/course/libsrc -D TARGET.BITS.PER.WORD=32 -o /tmp/course_out.go kroc/modules/course/libsrc/course.module 2>&1 && go vet /tmp/course_out.go 2>&1
Exit code 1 # command-line-arguments # [command-line-arguments] vet: /tmp/course_out.go:1996:43: cannot use Zfrac (variable of type []int) as [2]int value in argument to RealIMul # command-line-arguments # [command-line-arguments] vet: /tmp/course_out.go:1996:43: cannot use Zfrac (variable of type []int) as [2]int value in argument to RealIMul
1990→ N = Bias10
1991→ for (N >= 0) {
1992→ if ((N > 0) && (S < BiasTen[N])) {
1993→ N = (N - 1)
1994→ } else if true {
1995→ if ((N == 0) && (S <= Table10)) {
1996→ Zfrac[1], Zfrac[0], Guard = RealIMul(Zfrac, PowerTenFrac[(S + S) : (S + S) + 2])
1997→ Zexp = ((Zexp + PowerTenExp[S]) + 1)
1998→ } else if true {
1999→ Zfrac[1], Zfrac[0], Guard = RealIMul(Zfrac, BiasTenFrac[(N + N) : (N + N) + 2])The issue: RealIMul(Zfrac, ...) — Zfrac is declared as []int (a slice from an array literal) but the function RealIMul expects [2]int. Let me check the RealIMul signature.
{
"pattern": "RealIMul := func",
"path": "/tmp/course_out.go",
"output_mode": "content",
"-n": true
}{
"pattern": "Zfrac\"",
"path": "/tmp/course_out.go",
"output_mode": "content",
"-n": true,
"head_limit": 5
}{
"pattern": "Zfrac",
"path": "/tmp/course_out.go",
"output_mode": "content",
"-n": true,
"head_limit": 8
}1325: var Carry, Places, Sb, St, Zexp, Zfrac int 1331: _ = Zfrac 1339: Zfrac = PowerTenFrac[St] 1342: Zfrac = BiasTenFrac[1] 1362: Zfrac, Carry = _LONGPROD(PowerTenFrac[St], BiasTenFrac[Sb], 0) 1363: Places, Zfrac, Carry = _NORMALISE(Zfrac, Carry) 1377: *Xfrac, Carry = _LONGDIV((*Xfrac >> 1), 0, Zfrac) 1383: Carry, remainder = _LONGDIV(Carry, 0, Zfrac) [Showing results with pagination = limit: 8, offset: 0]
{
"pattern": "Zfrac\"",
"path": "/tmp/course_out.go",
"output_mode": "content",
"-n": true,
"head_limit": 5,
"offset": 8
}{
"pattern": "Zfrac",
"path": "/tmp/course_out.go",
"output_mode": "content",
"-n": true,
"head_limit": 20,
"offset": 8
}1387: *Xfrac, Carry = _LONGPROD(*Xfrac, Zfrac, 0) 1981: Zfrac := make([]int, 2) 1988: Zfrac[1] = SignBit 1989: Zfrac[0] = 0 1996: Zfrac[1], Zfrac[0], Guard = RealIMul(Zfrac, PowerTenFrac[(S + S) : (S + S) + 2]) 1999: Zfrac[1], Zfrac[0], Guard = RealIMul(Zfrac, BiasTenFrac[(N + N) : (N + N) + 2]) 2003: Places, Zfrac[1], dummy = _NORMALISE(Zfrac[1], Zfrac[0]) 2004: Zfrac[0], Guard = _SHIFTLEFT(Zfrac[0], Guard, Places) 2009: Carry, Zfrac[0] = _LONGSUM(Zfrac[0], 1, 0) 2010: Carry, Zfrac[1] = _LONGSUM(Zfrac[1], 0, Carry) 2015: Zfrac[1] = SignBit 2033: (*Xfrac)[1], (*Xfrac)[0], Guard = RealIDiv(*Xfrac[1], *Xfrac[0], 0, Zfrac) 2034: (*Xfrac)[0], Guard, Carry = RealIDiv(*Xfrac[0], Guard, 0, Zfrac) 2041: Guard, Rem_hi, Rem_lo = RealIDiv(Guard, Carry, 0, Zfrac) 2045: (*Xfrac)[1], (*Xfrac)[0], Guard = RealIMul(*Xfrac, Zfrac) [Showing results with pagination = limit: 20, offset: 8]
I see — line 1981: Zfrac := make([]int, 2). The [2]INT Zfrac: declaration in the REAL64 version is generated as a Go slice, but RealIMul expects [2]int (a fixed-size array).
This is a broader issue: the [2]INT declarations in function bodies are generated as make([]int, 2) (slices) but function parameters with [2]INT are [2]int or *[2]int (arrays). Let me check how [2]INT var declarations are generated.
{
"file_path": "/home/david/projects/code-associates/occam2go/codegen/codegen.go",
"offset": 876,
"limit": 20
} 876→ g.writeLine(fmt.Sprintf("_ = %s", n))
877→ }
878→}
879→
880→func (g *Generator) generateAbbreviation(abbr *ast.Abbreviation) {
881→ g.builder.WriteString(strings.Repeat("\t", g.indent))
882→ g.write(fmt.Sprintf("%s := ", goIdent(abbr.Name)))
883→ g.generateExpression(abbr.Value)
884→ g.write("\n")
885→ // Suppress "declared and not used" for abbreviations inside function bodies
886→ if g.nestingLevel > 0 {
887→ g.writeLine(fmt.Sprintf("_ = %s", goIdent(abbr.Name)))
888→ }
889→}
890→
891→func (g *Generator) generateChanDecl(decl *ast.ChanDecl) {
892→ goType := g.occamTypeToGo(decl.ElemType)
893→ if decl.IsArray {
894→ for _, name := range decl.Names {
895→ n := goIdent(name){
"pattern": "generateArrayDecl\"",
"path": "/home/david/projects/code-associates/occam2go/codegen/codegen.go",
"output_mode": "content",
"-n": true
}{
"pattern": "ArrayDecl",
"path": "/home/david/projects/code-associates/occam2go/codegen/codegen.go",
"output_mode": "content",
"-n": true
}815: case *ast.ArrayDecl:
816: g.generateArrayDecl(s)
920:func (g *Generator) generateArrayDecl(decl *ast.ArrayDecl) {{
"file_path": "/home/david/projects/code-associates/occam2go/codegen/codegen.go",
"offset": 920,
"limit": 20
} 920→func (g *Generator) generateArrayDecl(decl *ast.ArrayDecl) {
921→ goType := g.occamTypeToGo(decl.Type)
922→ for _, name := range decl.Names {
923→ n := goIdent(name)
924→ g.builder.WriteString(strings.Repeat("\t", g.indent))
925→ g.write(fmt.Sprintf("%s := make([]%s, ", n, goType))
926→ g.generateExpression(decl.Size)
927→ g.write(")\n")
928→ }
929→}
930→
931→func (g *Generator) generateSend(send *ast.Send) {
932→ g.builder.WriteString(strings.Repeat("\t", g.indent))
933→ g.write(goIdent(send.Channel))
934→ if send.ChannelIndex != nil {
935→ g.write("[")
936→ g.generateExpression(send.ChannelIndex)
937→ g.write("]")
938→ }
939→ g.write(" <- ")I see the issue. generateArrayDecl always uses make([]type, size) which creates a slice. But for fixed-size arrays like [2]INT, when the array is passed to a function that expects [2]int (not []int), there's a type mismatch.
This is a broader issue. In occam, [2]INT Zfrac: declares a fixed-size array. The codegen generates it as a Go slice (make([]int, 2)), but function parameters that take [2]INT are generated as fixed arrays ([2]int or *[2]int).
I need to fix this mismatch. I could generate fixed-size arrays as actual Go arrays instead of slices, but that's a larger refactor affecting indexing and slicing operations. The simpler approach would be to make function parameters accept slices, though that's semantically incorrect. Looking at the actual generated code, I'm seeing the slice being created for Zfrac while RealIMul expects a fixed array, and the function call is trying to pass the slice where a fixed array is needed.
In Go, this won't work directly—I'd need to either convert the slice to an array or change how I'm generating the code. The real decision is whether [2]INT Zfrac: should generate a Go array or slice, and looking at how it's used—indexed assignment, passed to functions expecting arrays, and element access—it should probably be a fixed array.
Actually, I realize Go arrays can be sliced just fine with arr[n:m], so that's not a blocker. But I need to think through how function arguments work when the function signature expects a slice versus an array.
The real issue is more targeted: when a fixed-size array variable gets passed to a function expecting a pointer to that specific array type, I'd need to pass &Zfrac instead of just Zfrac. Let me focus on just fixing the course module case rather than overhauling the entire array handling strategy. where Zfrac is [2]int) would work by value copy. For non-VAL fixed arrays like *[2]int, I'd need to add & at the call site.
Looking at the existing proc call logic, there's a condition that skips adding & when ArraySize == "", which means fixed-size arrays don't get the address operator. That's the bug — non-VAL fixed arrays should get & added, but the current code treats them as if they don't need it.
The type generation already correctly marks non-VAL fixed arrays as *[N]type, so the call site should be passing addresses for them. Let me trace through the RealIMul case where both parameters are VAL [2]INT to understand what's happening.
The real problem is that generateArrayDecl is creating slices instead of arrays for constant-size declarations. I need to check if the size expression is a constant integer at parse time, and if so, generate a proper array type like [2]int instead of a slice. For occam, array sizes are always compile-time constants, so I can safely generate arrays whenever the size is known.
But there's a complication: if I switch to Go arrays, I'll need to handle slice conversions when passing fixed-size arrays to functions expecting open array parameters. Let me trace through the actual occam signatures to see if this is even necessary—ScaleX takes a scalar INT, not an array, and RealIMul expects [2]INT. The question is whether there are other cases where a fixed-size array gets passed to an open array parameter.
Looking at the course module, it seems the fixed arrays are only used with fixed-size or scalar parameters, so I can just make the change to generateArrayDecl and test it. That's the cleaner approach anyway.
The issue is [2]INT variables are generated as slices (make([]int, 2)) but functions expect fixed arrays ([2]int). I need to change generateArrayDecl to generate Go fixed-size arrays for constant-size arrays, and also fix proc call & handling for non-VAL fixed arrays.
func (g *Generator) generateArrayDecl(decl *ast.ArrayDecl) {
goType := g.occamTypeToGo(decl.Type)
for _, name := range decl.Names {
n := goIdent(name)
g.builder.WriteString(strings.Repeat("\t", g.indent))
g.write(fmt.Sprintf("%s := make([]%s, ", n, goType))
g.generateExpression(decl.Size)
g.write(")\n")
}
}func (g *Generator) generateArrayDecl(decl *ast.ArrayDecl) {
goType := g.occamTypeToGo(decl.Type)
// Use a Go fixed-size array when the size is a constant integer literal,
// otherwise fall back to a slice. Fixed arrays are needed for type-correct
// proc calls with [N]TYPE parameters.
if lit, ok := decl.Size.(*ast.IntegerLiteral); ok {
for _, name := range decl.Names {
n := goIdent(name)
g.writeLine(fmt.Sprintf("var %s [%s]%s", n, lit.Value, goType))
}
} else {
for _, name := range decl.Names {
n := goIdent(name)
g.builder.WriteString(strings.Repeat("\t", g.indent))
g.write(fmt.Sprintf("%s := make([]%s, ", n, goType))
g.generateExpression(decl.Size)
g.write(")\n")
}
}
}The file /home/david/projects/code-associates/occam2go/codegen/codegen.go has been updated successfully.
{
"file_path": "/home/david/projects/code-associates/occam2go/codegen/codegen.go",
"offset": 1606,
"limit": 15
} 1606→ g.writeLine("")
1607→
1608→ // Restore previous context
1609→ g.refParams = oldRefParams
1610→ g.retypesRenames = oldRenames
1611→}
1612→
1613→func (g *Generator) generateProcParams(params []ast.ProcParam) string {
1614→ var parts []string
1615→ for _, p := range params {
1616→ var goType string
1617→ if p.IsChanArray {
1618→ goType = "[]" + chanDirPrefix(p.ChanDir) + g.occamTypeToGo(p.ChanElemType)
1619→ } else if p.IsChan {
1620→ goType = chanDirPrefix(p.ChanDir) + g.occamTypeToGo(p.ChanElemType){
"file_path": "/home/david/projects/code-associates/occam2go/codegen/codegen.go",
"offset": 1651,
"limit": 30
} 1651→ default:
1652→ return "chan " // bidirectional
1653→ }
1654→}
1655→
1656→func (g *Generator) generateProcCall(call *ast.ProcCall) {
1657→ // Handle built-in print procedures
1658→ if printBuiltins[call.Name] {
1659→ g.generatePrintCall(call)
1660→ return
1661→ }
1662→
1663→ // Handle CAUSEERROR
1664→ if call.Name == "CAUSEERROR" {
1665→ g.writeLine(`panic("CAUSEERROR")`)
1666→ return
1667→ }
1668→
1669→ g.builder.WriteString(strings.Repeat("\t", g.indent))
1670→ g.write(goIdent(call.Name))
1671→ g.write("(")
1672→
1673→ // Look up procedure signature to determine which args need address-of
1674→ params := g.procSigs[call.Name]
1675→
1676→ for i, arg := range call.Args {
1677→ if i > 0 {
1678→ g.write(", ")
1679→ }
1680→ // If this parameter is not VAL (i.e., pass by reference), take address{
"file_path": "/home/david/projects/code-associates/occam2go/codegen/codegen.go",
"offset": 1680,
"limit": 20
} 1680→ // If this parameter is not VAL (i.e., pass by reference), take address
1681→ // Channels and channel arrays are already reference types, so no & needed
1682→ if i < len(params) && !params[i].IsVal && !params[i].IsChan && !params[i].IsChanArray && !params[i].IsOpenArray && params[i].ArraySize == "" {
1683→ g.write("&")
1684→ }
1685→ // Wrap string literals with []byte() when passed to []BYTE parameters
1686→ if _, isStr := arg.(*ast.StringLiteral); isStr && i < len(params) && params[i].IsOpenArray && params[i].Type == "BYTE" {
1687→ g.write("[]byte(")
1688→ g.generateExpression(arg)
1689→ g.write(")")
1690→ } else {
1691→ g.generateExpression(arg)
1692→ }
1693→ }
1694→ g.write(")")
1695→ g.write("\n")
1696→}
1697→
1698→func (g *Generator) generateFuncDecl(fn *ast.FuncDecl) {
1699→ params := g.generateProcParams(fn.Params)The condition params[i].ArraySize == "" excludes fixed arrays from getting &. I need to remove that exclusion so non-VAL fixed arrays also get &:
// If this parameter is not VAL (i.e., pass by reference), take address
// Channels and channel arrays are already reference types, so no & needed
if i < len(params) && !params[i].IsVal && !params[i].IsChan && !params[i].IsChanArray && !params[i].IsOpenArray && params[i].ArraySize == "" {
g.write("&")
} // If this parameter is not VAL (i.e., pass by reference), take address
// Channels and channel arrays are already reference types, so no & needed
if i < len(params) && !params[i].IsVal && !params[i].IsChan && !params[i].IsChanArray && !params[i].IsOpenArray {
g.write("&")
}The file /home/david/projects/code-associates/occam2go/codegen/codegen.go has been updated successfully.
Now I also need to handle passing fixed-size arrays to open-array params. When a [2]int array is passed to a []int param, Go needs a slice expression arr[:]. But let me first build and see what happens.
go build -o occam2go . 2>&1 && ./occam2go -I kroc/modules/course/libsrc -D TARGET.BITS.PER.WORD=32 -o /tmp/course_out.go kroc/modules/course/libsrc/course.module 2>&1 && go vet /tmp/course_out.go 2>&1
Exit code 1 # command-line-arguments # [command-line-arguments] vet: /tmp/course_out.go:1923:10: expected operand, found '%' (and 10 more errors) # command-line-arguments # [command-line-arguments] vet: /tmp/course_out.go:1923:10: expected operand, found '%' (and 10 more errors)
1918→ _ = R_hi 1919→ _ = R_lo 1920→ _ = Guard 1921→ var Carry int 1922→ _ = Carry 1923→ var D [%!s(int64=4)]int 1924→ Carry, D[0] = _LONGPROD(X[0], Y[0], 0) 1925→ Carry, D[1] = _LONGPROD(X[0], Y[1], Carry) 1926→ D[3], D[2] = _LONGPROD(X[1], Y[0], 0) 1927→ R_hi, R_lo = _LONGPROD(X[1], Y[1], Carry)
The problem: %!s(int64=4) — the array size literal is being formatted incorrectly. The IntegerLiteral.Value is an int64 but I'm using lit.Value with %s format. Let me check the IntegerLiteral type.
{
"pattern": "type IntegerLiteral struct",
"path": "/home/david/projects/code-associates/occam2go/ast/ast.go",
"output_mode": "content",
"-A": 5
}243:type IntegerLiteral struct {
244- Token lexer.Token
245- Value int64
246-}
247-
248-func (il *IntegerLiteral) expressionNode() {} if lit, ok := decl.Size.(*ast.IntegerLiteral); ok {
for _, name := range decl.Names {
n := goIdent(name)
g.writeLine(fmt.Sprintf("var %s [%s]%s", n, lit.Value, goType))
} if lit, ok := decl.Size.(*ast.IntegerLiteral); ok {
for _, name := range decl.Names {
n := goIdent(name)
g.writeLine(fmt.Sprintf("var %s [%d]%s", n, lit.Value, goType))
}The file /home/david/projects/code-associates/occam2go/codegen/codegen.go has been updated successfully.
go build -o occam2go . 2>&1 && ./occam2go -I kroc/modules/course/libsrc -D TARGET.BITS.PER.WORD=32 -o /tmp/course_out.go kroc/modules/course/libsrc/course.module 2>&1 && go vet /tmp/course_out.go 2>&1
Exit code 1 # command-line-arguments # [command-line-arguments] vet: /tmp/course_out.go:1996:50: cannot use PowerTenFrac[(S + S):(S + S) + 2] (value of type []int) as [2]int value in argument to RealIMul # command-line-arguments # [command-line-arguments] vet: /tmp/course_out.go:1996:50: cannot use PowerTenFrac[(S + S):(S + S) + 2] (value of type []int) as [2]int value in argument to RealIMul
Now the issue is slicing. PowerTenFrac[(S+S):(S+S)+2] produces a []int slice, but RealIMul expects [2]int. This is an inherent Go limitation — slicing an array gives a slice, not a fixed-size array.
Let me look at the occam source to understand the pattern.
1993→ N = (N - 1)
1994→ } else if true {
1995→ if ((N == 0) && (S <= Table10)) {
1996→ Zfrac[1], Zfrac[0], Guard = RealIMul(Zfrac, PowerTenFrac[(S + S) : (S + S) + 2])
1997→ Zexp = ((Zexp + PowerTenExp[S]) + 1)
1998→ } else if true {
1999→ Zfrac[1], Zfrac[0], Guard = RealIMul(Zfrac, BiasTenFrac[(N + N) : (N + N) + 2])
2000→ S = (S - BiasTen[N])
2001→ Zexp = ((Zexp + BiasTenExp[N]) + 1)
2002→ }The occam code is RealIMul(Zfrac, [PowerTenFrac FROM (S+S) FOR 2]). The slice [FROM ... FOR 2] produces a []int, but the function expects [2]int.
The cleanest solution: change RealIMul (and similar) to accept []int instead of [2]int. But the proc definition says [2]INT which generates as a fixed-size array.
Actually, the better approach: for proc parameters, when they're VAL fixed-size arrays, generate them as slices rather than arrays. That way both fixed arrays and slices can be passed. Let me think about this more carefully.
Actually, the simplest targeted fix: when generating a function call and the param expects a fixed-size array but the argument is a slice expression, wrap it with a conversion. But we don't have type info in codegen.
The pragmatic fix: change [2]INT proc params to use Go slices. In occam, [2]INT params and []INT open-array params are semantically different (one knows the size, one doesn't), but in Go we can use slices for both.
{
"file_path": "/home/david/projects/code-associates/occam2go/codegen/codegen.go",
"offset": 1613,
"limit": 35
} 1613→func (g *Generator) generateProcParams(params []ast.ProcParam) string {
1614→ var parts []string
1615→ for _, p := range params {
1616→ var goType string
1617→ if p.IsChanArray {
1618→ goType = "[]" + chanDirPrefix(p.ChanDir) + g.occamTypeToGo(p.ChanElemType)
1619→ } else if p.IsChan {
1620→ goType = chanDirPrefix(p.ChanDir) + g.occamTypeToGo(p.ChanElemType)
1621→ } else if p.IsOpenArray {
1622→ goType = "[]" + g.occamTypeToGo(p.Type)
1623→ } else if p.ArraySize != "" {
1624→ // Fixed-size array parameter: [n]TYPE
1625→ goType = "[" + p.ArraySize + "]" + g.occamTypeToGo(p.Type)
1626→ if !p.IsVal {
1627→ goType = "*" + goType
1628→ }
1629→ } else {
1630→ goType = g.occamTypeToGo(p.Type)
1631→ if !p.IsVal {
1632→ // Non-VAL parameters are pass by reference in Occam
1633→ goType = "*" + goType
1634→ }
1635→ }
1636→ pName := goIdent(p.Name)
1637→ if renamed, ok := g.retypesRenames[p.Name]; ok {
1638→ pName = renamed
1639→ }
1640→ parts = append(parts, fmt.Sprintf("%s %s", pName, goType))
1641→ }
1642→ return strings.Join(parts, ", ")
1643→}
1644→
1645→func chanDirPrefix(dir string) string {
1646→ switch dir {
1647→ case "?":Let me think about this more carefully. The issue is that in occam, [2]INT X as a proc param means "X is a 2-element INT array". When calling with a slice like [arr FROM n FOR 2], it works in occam because the types are compatible.
In Go, we have a mismatch: [2]int (fixed-size array) vs []int (slice). They're not interchangeable.
Options:
1. Change fixed-size array params to use slices: []int instead of [2]int
- Pro: Slices can accept both arrays (via arr[:]) and slices
- Con: Loses size information, and we need to make sure non-VAL fixed arrays work correctly
Con: Complex to implement without type info
Use slices everywhere (both declarations and params)
var x []int would have the same issue.I think option 1 is the most pragmatic. Change fixed-size array params to use slices for both VAL and non-VAL, since non-VAL params already use slice semantics anyway.
But I'm realizing there's a semantic issue here. With VAL parameters in occam, the array is read-only to the callee, but converting to a slice means the callee can actually modify the underlying elements. That breaks the value semantics we're trying to preserve.
The real problem is that Go doesn't have a clean way to express "read-only fixed-size array parameter" without either copying the array or using a slice (which allows modification). For this course module, since VAL arrays are only read anyway, the slice approach works in practice, but it's not semantically correct.
Let me step back and focus on what actually needs to change to make go vet pass. The core issue is that slicing operations like PowerTenFrac[(S+S):(S+S)+2] produce a slice, but functions like RealIMul are expecting a fixed-size array parameter. If I change those function signatures to accept slices instead, that would resolve the type mismatch. params from [2]int to []int, this would work. Let me see if there are other issues.
The change: in generateProcParams, generate fixed-size array params as slices instead of arrays.
And for non-VAL, slices are already reference types, so no * needed.
But then at call sites, when passing a [2]int variable (like Zfrac), we'd need Zfrac[:]. Hmm.
Actually, the real solution is simpler—if I just make all function parameters use slices instead of arrays, everything stays consistent and I avoid the conversion hassle at call sites.
Let me reconsider the approach. The simplest fix is to make fixed-size array params use slices too, keeping everything consistent. Both array declarations and params will be slices.
// Use a Go fixed-size array when the size is a constant integer literal,
// otherwise fall back to a slice. Fixed arrays are needed for type-correct
// proc calls with [N]TYPE parameters.
if lit, ok := decl.Size.(*ast.IntegerLiteral); ok {
for _, name := range decl.Names {
n := goIdent(name)
g.writeLine(fmt.Sprintf("var %s [%d]%s", n, lit.Value, goType))
}
} else {
for _, name := range decl.Names {
n := goIdent(name)
g.builder.WriteString(strings.Repeat("\t", g.indent))
g.write(fmt.Sprintf("%s := make([]%s, ", n, goType))
g.generateExpression(decl.Size)
g.write(")\n")
}
} for _, name := range decl.Names {
n := goIdent(name)
g.builder.WriteString(strings.Repeat("\t", g.indent))
g.write(fmt.Sprintf("%s := make([]%s, ", n, goType))
g.generateExpression(decl.Size)
g.write(")\n")
}The file /home/david/projects/code-associates/occam2go/codegen/codegen.go has been updated successfully.
} else if p.ArraySize != "" {
// Fixed-size array parameter: [n]TYPE
goType = "[" + p.ArraySize + "]" + g.occamTypeToGo(p.Type)
if !p.IsVal {
goType = "*" + goType
}
} else { } else if p.ArraySize != "" {
// Fixed-size array parameter: use slice for Go compatibility
// (occam [n]TYPE and []TYPE both map to Go slices)
goType = "[]" + g.occamTypeToGo(p.Type)
} else {The file /home/david/projects/code-associates/occam2go/codegen/codegen.go has been updated successfully.
// If this parameter is not VAL (i.e., pass by reference), take address
// Channels and channel arrays are already reference types, so no & needed
if i < len(params) && !params[i].IsVal && !params[i].IsChan && !params[i].IsChanArray && !params[i].IsOpenArray {
g.write("&")
} // If this parameter is not VAL (i.e., pass by reference), take address
// Channels, channel arrays, open arrays, and fixed-size arrays (mapped to slices) are already reference types
if i < len(params) && !params[i].IsVal && !params[i].IsChan && !params[i].IsChanArray && !params[i].IsOpenArray && params[i].ArraySize == "" {
g.write("&")
}The file /home/david/projects/code-associates/occam2go/codegen/codegen.go has been updated successfully.
Now I also need to update the RETYPES codegen for [2]INT — it currently generates [2]int{...} but should now generate []int{...}:
{
"pattern": "\\[2\\]int",
"path": "/home/david/projects/code-associates/occam2go/codegen/codegen.go",
"output_mode": "content",
"-n": true
}2196: g.writeLine(fmt.Sprintf("%s := [2]int{int(int32(uint32(%s))), int(int32(uint32(%s >> 32)))}", gName, tmpVar, tmpVar)) g.writeLine(fmt.Sprintf("%s := [2]int{int(int32(uint32(%s))), int(int32(uint32(%s >> 32)))}", gName, tmpVar, tmpVar)) g.writeLine(fmt.Sprintf("%s := []int{int(int32(uint32(%s))), int(int32(uint32(%s >> 32)))}", gName, tmpVar, tmpVar))The file /home/david/projects/code-associates/occam2go/codegen/codegen.go has been updated successfully.
go build -o occam2go . 2>&1 && ./occam2go -I kroc/modules/course/libsrc -D TARGET.BITS.PER.WORD=32 -o /tmp/course_out.go kroc/modules/course/libsrc/course.module 2>&1 && go vet /tmp/course_out.go 2>&1
Exit code 1 # command-line-arguments # [command-line-arguments] vet: /tmp/course_out.go:2027:5: invalid operation: cannot indirect Xfrac[1] (variable of type int) # command-line-arguments # [command-line-arguments] vet: /tmp/course_out.go:2027:5: invalid operation: cannot indirect Xfrac[1] (variable of type int)
2020→ }
2021→ if (S > Table10) {
2022→ if (Scale > 0) {
2023→ *Xexp = DRealExp
2024→ } else if true {
2025→ *Xexp = - DRealExp
2026→ }
2027→ *Xfrac[1] = SignBit
2028→ *Xfrac[0] = 0
2029→ Guard = 0
2030→ } else if (Scale < 0) {
2031→ *Xexp = (*Xexp - Zexp)
2032→ (*Xfrac)[1], (*Xfrac)[0] = _SHIFTRIGHT(*Xfrac[1], *Xfrac[0], 1)
2033→ (*Xfrac)[1], (*Xfrac)[0], Guard = RealIDiv(*Xfrac[1], *Xfrac[0], 0, Zfrac)
2034→ (*Xfrac)[0], Guard, Carry = RealIDiv(*Xfrac[0], Guard, 0, Zfrac)The issue is *Xfrac[1] — Xfrac is now []int (a slice param), not *[2]int. The * dereference is wrong. The codegen is treating Xfrac as a pointer param and adding * for assignments, but since we changed fixed-size array params to slices, the * is wrong.
The root issue: when a non-VAL [2]INT Xfrac param is now []int, the codegen's refParams tracking still marks it as a reference param and adds * for reads/writes. Since slices are already reference types, I need to exclude fixed-size array params from refParams.
{
"file_path": "/home/david/projects/code-associates/occam2go/codegen/codegen.go",
"offset": 1469,
"limit": 15
} 1469→ for i, c := range alt.Cases {
1470→ if c.Guard != nil {
1471→ g.builder.WriteString(strings.Repeat("\t", g.indent))
1472→ g.write(fmt.Sprintf("var _alt%d chan ", i))
1473→ // We don't know the channel type here, so use interface{}
1474→ // Actually, we should use the same type as the original channel
1475→ // For now, let's just reference the original channel conditionally
1476→ g.write(fmt.Sprintf("int = nil\n")) // Assuming int for now
1477→ g.builder.WriteString(strings.Repeat("\t", g.indent))
1478→ g.write(fmt.Sprintf("if "))
1479→ g.generateExpression(c.Guard)
1480→ g.write(fmt.Sprintf(" { _alt%d = %s }\n", i, goIdent(c.Channel)))
1481→ }
1482→ }
1483→ }{
"file_path": "/home/david/projects/code-associates/occam2go/codegen/codegen.go",
"offset": 1517,
"limit": 15
} 1517→ newRefParams[k] = v
1518→ }
1519→ }
1520→ for _, p := range proc.Params {
1521→ if !p.IsVal && !p.IsChan && !p.IsChanArray && !p.IsOpenArray {
1522→ newRefParams[p.Name] = true
1523→ } else {
1524→ // Own param shadows any inherited ref param with same name
1525→ delete(newRefParams, p.Name)
1526→ }
1527→ // Register chan params with protocol mappings
1528→ if p.IsChan || p.IsChanArray {
1529→ if _, ok := g.protocolDefs[p.ChanElemType]; ok {
1530→ g.chanProtocols[p.Name] = p.ChanElemType
1531→ }Line 1521: non-VAL params without channel/open-array status are registered as reference params. Fixed-size arrays (p.ArraySize != "") are included here because the condition doesn't exclude them. Since we changed fixed-size array params to slices, they shouldn't be treated as reference params (no * dereference needed).
if !p.IsVal && !p.IsChan && !p.IsChanArray && !p.IsOpenArray {
newRefParams[p.Name] = true if !p.IsVal && !p.IsChan && !p.IsChanArray && !p.IsOpenArray && p.ArraySize == "" {
newRefParams[p.Name] = trueThe file /home/david/projects/code-associates/occam2go/codegen/codegen.go has been updated successfully.
go build -o occam2go . 2>&1 && ./occam2go -I kroc/modules/course/libsrc -D TARGET.BITS.PER.WORD=32 -o /tmp/course_out.go kroc/modules/course/libsrc/course.module 2>&1 && go vet /tmp/course_out.go 2>&1
Exit code 1 # command-line-arguments # [command-line-arguments] vet: /tmp/course_out.go:2608:12: invalid operation: cannot call byte (variable of type *byte): *byte is not a function # command-line-arguments # [command-line-arguments] vet: /tmp/course_out.go:2608:12: invalid operation: cannot call byte (variable of type *byte): *byte is not a function
2603→ var W, C int
2604→ _ = W
2605→ _ = C
2606→ hex_digit := func(byte *byte, digit int) {
2607→ if (digit < 10) {
2608→ *byte = byte((digit + int(byte(48))))
2609→ } else if (digit < 16) {
2610→ *byte = byte((digit + (int(byte(65)) - 10)))
2611→ }
2612→ }The issue: a param named byte conflicts with Go's built-in byte type. The param byte *byte shadows the built-in byte type, so byte(...) tries to call the *byte variable instead of a type conversion.
This is a name collision with Go keywords/builtins. The occam source uses byte as a variable name, which clashes with Go's byte type. Let me check the occam source.
{
"pattern": "hex.digit",
"path": "/home/david/projects/code-associates/occam2go/kroc/modules/course/libsrc/float_io.occ",
"output_mode": "content",
"-n": true,
"-A": 4
}2210: --{{{ hex.digit
2211: PROC hex.digit (BYTE byte, VAL INT digit)
2212- SEQ
2213- IF
2214- digit < 10
2215- byte := (BYTE (digit + (INT '0')))
--
2232: hex.digit (string[i], C)
2233- len := nibbles.per.word
2234-:
2235---}}}
2236-The occam proc has BYTE byte — a param named byte of type BYTE. In Go, byte is a built-in type, so using it as a variable name causes confusion when you also need byte(...) type conversion.
The fix: the goIdent() function should rename identifiers that conflict with Go keywords/builtins. Let me add byte to the set of reserved words that get renamed.
{
"pattern": "func goIdent",
"path": "/home/david/projects/code-associates/occam2go/codegen/codegen.go",
"output_mode": "content",
"-n": true,
"-A": 15
}70:func goIdent(name string) string {
71- return strings.ReplaceAll(name, ".", "_")
72-}
73-
74-// Generate produces Go code from the AST
75-func (g *Generator) Generate(program *ast.Program) string {
76- g.builder.Reset()
77- g.needSync = false
78- g.needFmt = false
79- g.needTime = false
80- g.needOs = false
81- g.needMath = false
82- g.needMathBits = false
83- g.procSigs = make(map[string][]ast.ProcParam)
84- g.refParams = make(map[string]bool)
85- g.protocolDefs = make(map[string]*ast.ProtocolDecl)func goIdent(name string) string {
return strings.ReplaceAll(name, ".", "_")
}// goReserved is a set of Go keywords and predeclared identifiers that cannot be
// used as variable names when they also appear as type conversions in the generated code.
var goReserved = map[string]bool{
"byte": true, "int": true, "string": true, "len": true, "cap": true,
"make": true, "new": true, "copy": true, "close": true, "delete": true,
"panic": true, "recover": true, "print": true, "println": true,
"error": true, "rune": true, "bool": true, "true": true, "false": true,
}
func goIdent(name string) string {
name = strings.ReplaceAll(name, ".", "_")
if goReserved[name] {
return "_" + name
}
return name
}The file /home/david/projects/code-associates/occam2go/codegen/codegen.go has been updated successfully.
go build -o occam2go . 2>&1 && ./occam2go -I kroc/modules/course/libsrc -D TARGET.BITS.PER.WORD=32 -o /tmp/course_out.go kroc/modules/course/libsrc/course.module 2>&1 && go vet /tmp/course_out.go 2>&1
go vet passes! The full course.module including float_io.occ now transpiles cleanly. Let me run the existing tests to make sure nothing is broken.
? github.com/codeassociates/occam2go [no test files] ? github.com/codeassociates/occam2go/ast [no test files] ok github.com/codeassociates/occam2go/codegen 10.482s ok github.com/codeassociates/occam2go/lexer (cached) ok github.com/codeassociates/occam2go/modgen (cached) ok github.com/codeassociates/occam2go/parser (cached) ok github.com/codeassociates/occam2go/preproc (cached)
./occam2go -I kroc/modules/course/libsrc -D TARGET.BITS.PER.WORD=32 -o /tmp/course_nofloat_out.go course_nofloat.module 2>&1 && go vet /tmp/course_nofloat_out.go 2>&1
Now let me mark task #4 complete and move to task #5 — tests and integration verification.
Now let me write the tests. I'll read the existing test files to understand conventions, then add parser, codegen, and e2e tests in parallel.
{
"file_path": "/home/david/projects/code-associates/occam2go/parser/parser_test.go",
"offset": 1,
"limit": 50
} 1→package parser
2→
3→import (
4→ "testing"
5→
6→ "github.com/codeassociates/occam2go/ast"
7→ "github.com/codeassociates/occam2go/lexer"
8→)
9→
10→func TestVarDecl(t *testing.T) {
11→ input := `INT x:
12→`
13→ l := lexer.New(input)
14→ p := New(l)
15→ program := p.ParseProgram()
16→ checkParserErrors(t, p)
17→
18→ if len(program.Statements) != 1 {
19→ t.Fatalf("expected 1 statement, got %d", len(program.Statements))
20→ }
21→
22→ decl, ok := program.Statements[0].(*ast.VarDecl)
23→ if !ok {
24→ t.Fatalf("expected VarDecl, got %T", program.Statements[0])
25→ }
26→
27→ if decl.Type != "INT" {
28→ t.Errorf("expected type INT, got %s", decl.Type)
29→ }
30→
31→ if len(decl.Names) != 1 || decl.Names[0] != "x" {
32→ t.Errorf("expected name 'x', got %v", decl.Names)
33→ }
34→}
35→
36→func TestMultipleVarDecl(t *testing.T) {
37→ input := `INT x, y, z:
38→`
39→ l := lexer.New(input)
40→ p := New(l)
41→ program := p.ParseProgram()
42→ checkParserErrors(t, p)
43→
44→ if len(program.Statements) != 1 {
45→ t.Fatalf("expected 1 statement, got %d", len(program.Statements))
46→ }
47→
48→ decl, ok := program.Statements[0].(*ast.VarDecl)
49→ if !ok {
50→ t.Fatalf("expected VarDecl, got %T", program.Statements[0]){
"file_path": "/home/david/projects/code-associates/occam2go/codegen/codegen_test.go",
"offset": 1,
"limit": 60
} 1→package codegen
2→
3→import (
4→ "strings"
5→ "testing"
6→
7→ "github.com/codeassociates/occam2go/lexer"
8→ "github.com/codeassociates/occam2go/parser"
9→)
10→
11→func TestSimpleVarDecl(t *testing.T) {
12→ input := `INT x:
13→`
14→ output := transpile(t, input)
15→
16→ if !strings.Contains(output, "var x int") {
17→ t.Errorf("expected 'var x int' in output, got:\n%s", output)
18→ }
19→}
20→
21→func TestMultipleVarDecl(t *testing.T) {
22→ input := `INT x, y, z:
23→`
24→ output := transpile(t, input)
25→
26→ if !strings.Contains(output, "var x, y, z int") {
27→ t.Errorf("expected 'var x, y, z int' in output, got:\n%s", output)
28→ }
29→}
30→
31→func TestReal32VarDecl(t *testing.T) {
32→ input := `REAL32 x:
33→`
34→ output := transpile(t, input)
35→
36→ if !strings.Contains(output, "var x float32") {
37→ t.Errorf("expected 'var x float32' in output, got:\n%s", output)
38→ }
39→}
40→
41→func TestReal64VarDecl(t *testing.T) {
42→ input := `REAL64 x:
43→`
44→ output := transpile(t, input)
45→
46→ if !strings.Contains(output, "var x float64") {
47→ t.Errorf("expected 'var x float64' in output, got:\n%s", output)
48→ }
49→}
50→
51→func TestAssignment(t *testing.T) {
52→ input := `x := 42
53→`
54→ output := transpile(t, input)
55→
56→ if !strings.Contains(output, "x = 42") {
57→ t.Errorf("expected 'x = 42' in output, got:\n%s", output)
58→ }
59→}
60→{
"pattern": "func transpile\\(|func checkParserErrors\"",
"path": "/home/david/projects/code-associates/occam2go/codegen/codegen_test.go",
"output_mode": "content",
"-n": true,
"-A": 10
}204:func transpile(t *testing.T, input string) string {
205- t.Helper()
206-
207- l := lexer.New(input)
208- p := parser.New(l)
209- program := p.ParseProgram()
210-
211- if len(p.Errors()) > 0 {
212- for _, err := range p.Errors() {
213- t.Errorf("parser error: %s", err)
214- }{
"file_path": "/home/david/projects/code-associates/occam2go/codegen/codegen_test.go",
"offset": 204,
"limit": 20
} 204→func transpile(t *testing.T, input string) string {
205→ t.Helper()
206→
207→ l := lexer.New(input)
208→ p := parser.New(l)
209→ program := p.ParseProgram()
210→
211→ if len(p.Errors()) > 0 {
212→ for _, err := range p.Errors() {
213→ t.Errorf("parser error: %s", err)
214→ }
215→ t.FailNow()
216→ }
217→
218→ gen := New()
219→ return gen.Generate(program)
220→}
221→
222→func TestBitwiseOperators(t *testing.T) {
223→ tests := []struct {{
"file_path": "/home/david/projects/code-associates/occam2go/codegen/e2e_test.go",
"offset": 1,
"limit": 60
}/home/david/projects/code-associates/occam2go/codegen/e2e_concurrency_test.go /home/david/projects/code-associates/occam2go/codegen/e2e_preproc_test.go /home/david/projects/code-associates/occam2go/codegen/e2e_helpers_test.go /home/david/projects/code-associates/occam2go/codegen/e2e_protocol_test.go /home/david/projects/code-associates/occam2go/codegen/e2e_record_test.go /home/david/projects/code-associates/occam2go/codegen/e2e_proc_func_test.go /home/david/projects/code-associates/occam2go/codegen/e2e_replicator_test.go /home/david/projects/code-associates/occam2go/codegen/e2e_array_test.go /home/david/projects/code-associates/occam2go/codegen/e2e_basic_test.go /home/david/projects/code-associates/occam2go/codegen/e2e_types_test.go /home/david/projects/code-associates/occam2go/codegen/e2e_control_test.go
{
"file_path": "/home/david/projects/code-associates/occam2go/codegen/e2e_helpers_test.go"
} 1→package codegen
2→
3→import (
4→ "os"
5→ "os/exec"
6→ "path/filepath"
7→ "testing"
8→
9→ "github.com/codeassociates/occam2go/lexer"
10→ "github.com/codeassociates/occam2go/parser"
11→ "github.com/codeassociates/occam2go/preproc"
12→)
13→
14→// transpileCompileRun takes Occam source, transpiles to Go, compiles, runs,
15→// and returns the stdout output
16→func transpileCompileRun(t *testing.T, occamSource string) string {
17→ t.Helper()
18→
19→ // Transpile
20→ l := lexer.New(occamSource)
21→ p := parser.New(l)
22→ program := p.ParseProgram()
23→
24→ if len(p.Errors()) > 0 {
25→ for _, err := range p.Errors() {
26→ t.Errorf("parser error: %s", err)
27→ }
28→ t.FailNow()
29→ }
30→
31→ gen := New()
32→ goCode := gen.Generate(program)
33→
34→ // Create temp directory for this test
35→ tmpDir, err := os.MkdirTemp("", "occam2go-test-*")
36→ if err != nil {
37→ t.Fatalf("failed to create temp dir: %v", err)
38→ }
39→ defer os.RemoveAll(tmpDir)
40→
41→ // Write Go source
42→ goFile := filepath.Join(tmpDir, "main.go")
43→ if err := os.WriteFile(goFile, []byte(goCode), 0644); err != nil {
44→ t.Fatalf("failed to write Go file: %v", err)
45→ }
46→
47→ // Compile
48→ binFile := filepath.Join(tmpDir, "main")
49→ compileCmd := exec.Command("go", "build", "-o", binFile, goFile)
50→ compileOutput, err := compileCmd.CombinedOutput()
51→ if err != nil {
52→ t.Fatalf("compilation failed: %v\nOutput: %s\nGo code:\n%s", err, compileOutput, goCode)
53→ }
54→
55→ // Run
56→ runCmd := exec.Command(binFile)
57→ output, err := runCmd.CombinedOutput()
58→ if err != nil {
59→ t.Fatalf("execution failed: %v\nOutput: %s", err, output)
60→ }
61→
62→ return string(output)
63→}
64→
65→// transpileCompileRunFromFile takes an occam file path, preprocesses it,
66→// then transpiles, compiles, and runs.
67→func transpileCompileRunFromFile(t *testing.T, mainFile string, includePaths []string) string {
68→ t.Helper()
69→
70→ pp := preproc.New(preproc.WithIncludePaths(includePaths))
71→ expanded, err := pp.ProcessFile(mainFile)
72→ if err != nil {
73→ t.Fatalf("preprocessor error: %v", err)
74→ }
75→ if len(pp.Errors()) > 0 {
76→ for _, e := range pp.Errors() {
77→ t.Errorf("preprocessor warning: %s", e)
78→ }
79→ }
80→
81→ return transpileCompileRun(t, expanded)
82→}
83→Good, I have a clear understanding of the test patterns. Let me also check one e2e test for reference:
{
"file_path": "/home/david/projects/code-associates/occam2go/codegen/e2e_basic_test.go",
"offset": 1,
"limit": 30
} 1→package codegen
2→
3→import (
4→ "strings"
5→ "testing"
6→)
7→
8→func TestE2E_PrintInt(t *testing.T) {
9→ occam := `SEQ
10→ INT x:
11→ x := 42
12→ print.int(x)
13→`
14→ output := transpileCompileRun(t, occam)
15→ expected := "42\n"
16→ if output != expected {
17→ t.Errorf("expected %q, got %q", expected, output)
18→ }
19→}
20→
21→func TestE2E_Addition(t *testing.T) {
22→ occam := `SEQ
23→ INT x, y:
24→ x := 10
25→ y := 20
26→ print.int(x + y)
27→`
28→ output := transpileCompileRun(t, occam)
29→ expected := "30\n"
30→ if output != expected {{
"pattern": "func checkParserErrors\"",
"path": "/home/david/projects/code-associates/occam2go/parser/parser_test.go",
"output_mode": "content",
"-n": true,
"-A": 10
}{
"pattern": "func checkParserErrors\"",
"path": "/home/david/projects/code-associates/occam2go/parser",
"output_mode": "content",
"-n": true,
"-A": 10
}{
"pattern": "checkParserErrors",
"path": "/home/david/projects/code-associates/occam2go/parser",
"output_mode": "content",
"-n": true
}parser/parser_test.go:16: checkParserErrors(t, p)
parser/parser_test.go:42: checkParserErrors(t, p)
parser/parser_test.go:70: checkParserErrors(t, p)
parser/parser_test.go:101: checkParserErrors(t, p)
parser/parser_test.go:141: checkParserErrors(t, p)
parser/parser_test.go:175: checkParserErrors(t, p)
parser/parser_test.go:197: checkParserErrors(t, p)
parser/parser_test.go:223: checkParserErrors(t, p)
parser/parser_test.go:254: checkParserErrors(t, p)
parser/parser_test.go:284: checkParserErrors(t, p)
parser/parser_test.go:326: checkParserErrors(t, p)
parser/parser_test.go:359: checkParserErrors(t, p)
parser/parser_test.go:389: checkParserErrors(t, p)
parser/parser_test.go:428: checkParserErrors(t, p)
parser/parser_test.go:471: checkParserErrors(t, p)
parser/parser_test.go:525: checkParserErrors(t, p)
parser/parser_test.go:553: checkParserErrors(t, p)
parser/parser_test.go:591: checkParserErrors(t, p)
parser/parser_test.go:625: checkParserErrors(t, p)
parser/parser_test.go:657: checkParserErrors(t, p)
parser/parser_test.go:699: checkParserErrors(t, p)
parser/parser_test.go:749: checkParserErrors(t, p)
parser/parser_test.go:799: checkParserErrors(t, p)
parser/parser_test.go:844: checkParserErrors(t, p)
parser/parser_test.go:882: checkParserErrors(t, p)
parser/parser_test.go:920: checkParserErrors(t, p)
parser/parser_test.go:952: checkParserErrors(t, p)
parser/parser_test.go:986: checkParserErrors(t, p)
parser/parser_test.go:1047: checkParserErrors(t, p)
parser/parser_test.go:1072: checkParserErrors(t, p)
parser/parser_test.go:1112: checkParserErrors(t, p)
parser/parser_test.go:1158: checkParserErrors(t, p)
parser/parser_test.go:1193: checkParserErrors(t, p)
parser/parser_test.go:1238: checkParserErrors(t, p)
parser/parser_test.go:1273: checkParserErrors(t, p)
parser/parser_test.go:1300:func checkParserErrors(t *testing.T, p *Parser) {
parser/parser_test.go:1319: checkParserErrors(t, p)
parser/parser_test.go:1365: checkParserErrors(t, p)
parser/parser_test.go:1393: checkParserErrors(t, p)
parser/parser_test.go:1424: checkParserErrors(t, p)
parser/parser_test.go:1454: checkParserErrors(t, p)
parser/parser_test.go:1491: checkParserErrors(t, p)
parser/parser_test.go:1546: checkParserErrors(t, p)
parser/parser_test.go:1572: checkParserErrors(t, p)
parser/parser_test.go:1617: checkParserErrors(t, p)
parser/parser_test.go:1652: checkParserErrors(t, p)
parser/parser_test.go:1685: checkParserErrors(t, p)
parser/parser_test.go:1712: checkParserErrors(t, p)
parser/parser_test.go:1747: checkParserErrors(t, p)
parser/parser_test.go:1791: checkParserErrors(t, p)
parser/parser_test.go:1829: checkParserErrors(t, p)
parser/parser_test.go:1871: checkParserErrors(t, p)
parser/parser_test.go:1910: checkParserErrors(t, p)
parser/parser_test.go:1958: checkParserErrors(t, p)
parser/parser_test.go:2005: checkParserErrors(t, p)
parser/parser_test.go:2027: checkParserErrors(t, p)
parser/parser_test.go:2057: checkParserErrors(t, p)
parser/parser_test.go:2089: checkParserErrors(t, p)
parser/parser_test.go:2113: checkParserErrors(t, p)
parser/parser_test.go:2143: checkParserErrors(t, p)
parser/parser_test.go:2170: checkParserErrors(t, p)
parser/parser_test.go:2194: checkParserErrors(t, p)
parser/parser_test.go:2232: checkParserErrors(t, p)
parser/parser_test.go:2267: checkParserErrors(t, p)
parser/parser_test.go:2295: checkParserErrors(t, p)
parser/parser_test.go:2323: checkParserErrors(t, p)
parser/parser_test.go:2354: checkParserErrors(t, p)
parser/parser_test.go:2385: checkParserErrors(t, p)
parser/parser_test.go:2432: checkParserErrors(t, p)
parser/parser_test.go:2458: checkParserErrors(t, p)
parser/parser_test.go:2489: checkParserErrors(t, p)
parser/parser_test.go:2532: checkParserErrors(t, p)
parser/parser_test.go:2559: checkParserErrors(t, p)
parser/parser_test.go:2592: checkParserErrors(t, p)
parser/parser_test.go:2642: checkParserErrors(t, p)
parser/parser_test.go:2682: checkParserErrors(t, p)
parser/parser_test.go:2723: checkParserErrors(t, p)
parser/parser_test.go:2751: checkParserErrors(t, p)
parser/parser_test.go:2775: checkParserErrors(t, p)
parser/parser_test.go:2797: checkParserErrors(t, p)
parser/parser_test.go:2841: checkParserErrors(t, p){
"file_path": "/home/david/projects/code-associates/occam2go/parser/parser_test.go",
"offset": 1300,
"limit": 15
} 1300→func checkParserErrors(t *testing.T, p *Parser) {
1301→ errors := p.Errors()
1302→ if len(errors) == 0 {
1303→ return
1304→ }
1305→
1306→ t.Errorf("parser has %d errors:", len(errors))
1307→ for _, msg := range errors {
1308→ t.Errorf(" parser error: %s", msg)
1309→ }
1310→ t.FailNow()
1311→}
1312→
1313→func TestStringLiteral(t *testing.T) {
1314→ input := `x := "hello world"wc -l /home/david/projects/code-associates/occam2go/parser/parser_test.go
{
"file_path": "/home/david/projects/code-associates/occam2go/parser/parser_test.go",
"offset": 2860,
"limit": 20
} 2860→ arg0, ok := call.Args[0].(*ast.Identifier)
2861→ if !ok {
2862→ t.Fatalf("expected Identifier for arg 0, got %T", call.Args[0])
2863→ }
2864→ if arg0.Value != "out" {
2865→ t.Errorf("expected arg 0 = 'out', got %q", arg0.Value)
2866→ }
2867→
2868→ arg1, ok := call.Args[1].(*ast.Identifier)
2869→ if !ok {
2870→ t.Fatalf("expected Identifier for arg 1, got %T", call.Args[1])
2871→ }
2872→ if arg1.Value != "in" {
2873→ t.Errorf("expected arg 1 = 'in', got %q", arg1.Value)
2874→ }
2875→}
2876→wc -l /home/david/projects/code-associates/occam2go/codegen/codegen_test.go
{
"file_path": "/home/david/projects/code-associates/occam2go/codegen/codegen_test.go",
"offset": 600,
"limit": 15
} 600→`
601→ output := transpile(t, input)
602→ if !strings.Contains(output, "x[0], x[1] = x[1], x[0]") {
603→ t.Errorf("expected 'x[0], x[1] = x[1], x[0]' in output, got:\n%s", output)
604→ }
605→}
606→
607→func TestMultiAssignmentMixed(t *testing.T) {
608→ input := `a, x[0] := 1, 2
609→`
610→ output := transpile(t, input)
611→ if !strings.Contains(output, "a, x[0] = 1, 2") {
612→ t.Errorf("expected 'a, x[0] = 1, 2' in output, got:\n%s", output)
613→ }
614→}Now I have a clear picture of the test conventions. Let me write all the tests. I'll add parser tests, codegen tests, and e2e tests in separate files to keep things organized.
{
"subagent_type": "general-purpose",
"prompt": "I need you to append new parser unit tests to /home/david/projects/code-associates/occam2go/parser/parser_test.go. The file currently ends at line 2875 with `}`.\n\nUse the existing test patterns from the file. Key pattern:\n```go\nfunc TestXxx(t *testing.T) {\n input := `occam code here\n`\n l := lexer.New(input)\n p := New(l)\n program := p.ParseProgram()\n checkParserErrors(t, p)\n // assertions on program.Statements\n}\n```\n\nThe imports at the top already include `\"testing\"`, `\"github.com/codeassociates/occam2go/ast\"`, `\"github.com/codeassociates/occam2go/lexer\"`.\n\nAdd these tests at the end of the file (after line 2875):\n\n1. **TestUntypedValAbbreviation** — `VAL x IS 42 :\\n` should parse as Abbreviation with Type==\"\" and Name==\"x\".\n\n2. **TestArrayLiteral** — An expression `[1, 2, 3]\\n` parsed inside an abbreviation like:\n```\nVAL x IS [1, 2, 3] :\n```\nShould parse with an ArrayLiteral containing 3 IntegerLiteral elements.\n\n3. **TestRetypesDecl** — `VAL INT X RETYPES Y :\\n` should parse as ast.RetypesDecl with IsVal=true, TargetType=\"INT\", Name=\"X\", Source=\"Y\", IsArray=false.\n\n4. **TestRetypesDeclArray** — `VAL [2]INT X RETYPES Y :\\n` should parse as ast.RetypesDecl with IsVal=true, TargetType=\"INT\", Name=\"X\", Source=\"Y\", IsArray=true. The ArraySize should be an IntegerLiteral with Value=2.\n\n5. **TestMultiLineBooleanExpression** — Tests that multi-line expressions with AND/OR operators parse correctly inside IF:\n```\nPROC test()\n INT x:\n IF\n TRUE AND\n TRUE\n x := 1\n TRUE\n x := 2\n:\n```\nThis should parse without errors and have an IfStatement.\n\nPlease read the current file first to understand the exact patterns, then use Edit to append these tests after the final `}` at line 2875."
}{
"subagent_type": "general-purpose",
"prompt": "I need you to add tests to two files in /home/david/projects/code-associates/occam2go/codegen/:\n\n## 1. Append to codegen_test.go (ends at line 614)\n\nThe existing test pattern uses:\n```go\nfunc TestXxx(t *testing.T) {\n input := `occam code\n`\n output := transpile(t, input)\n if !strings.Contains(output, \"expected substring\") {\n t.Errorf(\"expected 'expected substring' in output, got:\\n%s\", output)\n }\n}\n```\n\nThe `transpile(t, input)` helper takes occam source and returns Go code. Imports at top: `\"strings\"`, `\"testing\"`, `\"github.com/codeassociates/occam2go/lexer\"`, `\"github.com/codeassociates/occam2go/parser\"`.\n\nAdd these tests:\n\na. **TestArrayLiteralCodegen** — `VAL x IS [10, 20, 30] :\\n` → output should contain `[]int{10, 20, 30}`\n\nb. **TestUntypedValCodegen** — `VAL x IS 42 :\\n` → output should contain `var x = 42`\n\nc. **TestCAUSEERROR** — \n```\nPROC main()\n CAUSEERROR()\n:\n```\n→ output should contain `panic(\"CAUSEERROR\")`\n\nd. **TestGoIdentByteReserved** — Test that identifiers named \"byte\" are renamed. Use:\n```\nPROC main()\n BYTE byte:\n byte := 65\n:\n```\n→ output should contain `_byte` (the variable should be renamed)\n\n## 2. Create new file e2e_phase2_test.go\n\nUse the same package `codegen` and the `transpileCompileRun(t, occamSource)` helper from e2e_helpers_test.go. This takes occam source, transpiles it, compiles the Go output, runs it, and returns stdout.\n\nTests to add:\n\na. **TestE2E_UntypedValAbbreviation** —\n```\nSEQ\n VAL x IS 42 :\n print.int(x)\n```\nExpected output: `\"42\\n\"`\n\nb. **TestE2E_ArrayLiteralIndexing** —\n```\nSEQ\n VAL arr IS [10, 20, 30] :\n print.int(arr[1])\n```\nExpected output: `\"20\\n\"`\n\nc. **TestE2E_MultiLineBooleanIF** —\n```\nSEQ\n INT x:\n x := 1\n IF\n (x > 0) AND\n (x < 10)\n print.int(x)\n TRUE\n print.int(0)\n```\nExpected output: `\"1\\n\"`\n\nd. **TestE2E_CAUSEERROR** —\nTest that CAUSEERROR generates a panic. Use `exec.Command` approach — transpile, compile, run and check that it exits with error:\n```\nPROC main()\n CAUSEERROR()\n:\n```\nFor this test, you'll need to NOT use transpileCompileRun (since it expects success). Instead, manually transpile, compile, and run, checking that the run exits with a non-zero exit code. Import `os`, `os/exec`, `path/filepath`.\n\nPlease read both files first to understand the exact patterns, then use Edit to append to codegen_test.go and Write to create e2e_phase2_test.go."
}[
{
"type": "text",
"text": "All parser tests pass (the full suite). Here is a summary of the 5 tests appended to `/home/david/projects/code-associates/occam2go/parser/parser_test.go`:\n\n1. **TestUntypedValAbbreviation** (line ~2877) -- Parses `VAL x IS 42 :` and asserts it produces an `ast.Abbreviation` with `IsVal=true`, `Type==\"\"`, `Name=\"x\"`, and `Value` being an `IntegerLiteral` with value 42.\n\n2. **TestArrayLiteral** (line ~2920) -- Parses `VAL x IS [1, 2, 3] :` and asserts the abbreviation's `Value` is an `ast.ArrayLiteral` with 3 `IntegerLiteral` elements having values 1, 2, and 3.\n\n3. **TestRetypesDecl** (line ~2955) -- Parses `VAL INT X RETYPES Y :` and asserts it produces an `ast.RetypesDecl` with `IsVal=true`, `TargetType=\"INT\"`, `Name=\"X\"`, `Source=\"Y\"`, and `IsArray=false`.\n\n4. **TestRetypesDeclArray** (line ~2988) -- Parses `VAL [2]INT X RETYPES Y :` and asserts it produces an `ast.RetypesDecl` with `IsVal=true`, `TargetType=\"INT\"`, `Name=\"X\"`, `Source=\"Y\"`, `IsArray=true`, and `ArraySize` being an `IntegerLiteral` with value 2.\n\n5. **TestMultiLineBooleanExpression** (line ~3030) -- Parses a full `PROC` containing an `IF` block with a multi-line `TRUE AND TRUE` condition (the `AND` operator and its right operand on separate lines). Asserts the PROC body contains a `VarDecl` and an `IfStatement` with 2 choices, where the first choice's condition is a `BinaryExpr` with operator `\"AND\"`."
},
{
"type": "text",
"text": "agentId: ac5e51a (for resuming to continue this agent's work if needed)\n<usage>total_tokens: 28226\ntool_uses: 13\nduration_ms: 72155</usage>"
}
][
{
"type": "text",
"text": "Here is a summary of what was done:\n\n## Changes Made\n\n### 1. `/home/david/projects/code-associates/occam2go/codegen/codegen_test.go` (appended 4 tests)\n\n**TestArrayLiteralCodegen** -- Verifies that `VAL x IS [10, 20, 30] :` produces output containing `[]int{10, 20, 30}`.\n\n**TestUntypedValCodegen** -- Verifies that an untyped VAL abbreviation at package level (alongside a PROC) produces `var x = 42`. Note: I included a dummy PROC declaration alongside the abbreviation because the codegen only emits the `var x = 42` form for package-level abbreviations (when PROCs are present). Without PROCs, the abbreviation is placed inside `func main()` and uses `:=` syntax instead.\n\n**TestCAUSEERROR** -- Verifies that `CAUSEERROR()` inside a PROC produces `panic(\"CAUSEERROR\")` in the output.\n\n**TestGoIdentByteReserved** -- Verifies that a variable named `byte` (a Go reserved/predeclared identifier) is renamed to `_byte` in the generated code.\n\n### 2. `/home/david/projects/code-associates/occam2go/codegen/e2e_phase2_test.go` (new file, 4 tests)\n\n**TestE2E_UntypedValAbbreviation** -- End-to-end test: `VAL x IS 42 :` followed by `print.int(x)` outputs `\"42\\n\"`.\n\n**TestE2E_ArrayLiteralIndexing** -- End-to-end test: `VAL arr IS [10, 20, 30] :` followed by `print.int(arr[1])` outputs `\"20\\n\"`.\n\n**TestE2E_MultiLineBooleanIF** -- End-to-end test: A multi-line boolean expression in an IF condition (`(x > 0) AND` continuing on the next line with `(x < 10)`) correctly evaluates and prints `\"1\\n\"`.\n\n**TestE2E_CAUSEERROR** -- End-to-end test: Manually transpiles, compiles, and runs a program with `CAUSEERROR()`, then verifies the process exits with a non-zero exit code (due to panic). This test does not use `transpileCompileRun` since that helper expects successful execution; instead it uses `exec.Command` directly and checks for a non-nil error from `Run()`."
},
{
"type": "text",
"text": "agentId: a4922d0 (for resuming to continue this agent's work if needed)\n<usage>total_tokens: 44887\ntool_uses: 31\nduration_ms: 155554</usage>"
}
]? github.com/codeassociates/occam2go [no test files] ? github.com/codeassociates/occam2go/ast [no test files] ok github.com/codeassociates/occam2go/codegen (cached) ok github.com/codeassociates/occam2go/lexer (cached) ok github.com/codeassociates/occam2go/modgen (cached) ok github.com/codeassociates/occam2go/parser (cached) ok github.com/codeassociates/occam2go/preproc (cached)
Hmm, all cached. Let me force re-running the tests to make sure the new ones are actually executed.
ok github.com/codeassociates/occam2go/parser 0.002s ok github.com/codeassociates/occam2go/codegen 10.913s
All tests pass including the new Phase 2 tests. Let me also run the specific new tests by name to confirm they were picked up:
go test -count=1 -v -run "TestUntypedVal|TestArrayLiteral|TestRetypes|TestMultiLineBoolean|TestCAUSEERROR|TestGoIdent|TestE2E_Phase2|TestE2E_CAUSEERROR|TestE2E_UntypedVal|TestE2E_ArrayLiteral|TestE2E_MultiLineBoolean" ./parser ./codegen 2>&1
=== RUN TestUntypedValAbbreviation --- PASS: TestUntypedValAbbreviation (0.00s) === RUN TestArrayLiteral --- PASS: TestArrayLiteral (0.00s) === RUN TestRetypesDecl --- PASS: TestRetypesDecl (0.00s) === RUN TestRetypesDeclArray --- PASS: TestRetypesDeclArray (0.00s) === RUN TestMultiLineBooleanExpression --- PASS: TestMultiLineBooleanExpression (0.00s) PASS ok github.com/codeassociates/occam2go/parser 0.004s === RUN TestArrayLiteralCodegen --- PASS: TestArrayLiteralCodegen (0.00s) === RUN TestUntypedValCodegen --- PASS: TestUntypedValCodegen (0.00s) === RUN TestCAUSEERROR --- PASS: TestCAUSEERROR (0.00s) === RUN TestGoIdentByteReserved --- PASS: TestGoIdentByteReserved (0.00s) === RUN TestE2E_UntypedValAbbreviation --- PASS: TestE2E_UntypedValAbbreviation (0.09s) === RUN TestE2E_ArrayLiteralIndexing --- PASS: TestE2E_ArrayLiteralIndexing (0.09s) === RUN TestE2E_MultiLineBooleanIF --- PASS: TestE2E_MultiLineBooleanIF (0.08s) === RUN TestE2E_CAUSEERROR --- PASS: TestE2E_CAUSEERROR (0.07s) PASS ok github.com/codeassociates/occam2go/codegen 0.336s
All 13 new tests pass: - 5 parser tests: untyped VAL, array literal, RETYPES, RETYPES array, multi-line boolean - 4 codegen tests: array literal, untyped VAL, CAUSEERROR, Go reserved word escaping - 4 e2e tests: untyped VAL, array literal indexing, multi-line boolean IF, CAUSEERROR panic
Let me now update the task status and update the documentation.
{
"pattern": "Not Yet Implemented",
"path": "/home/david/projects/code-associates/occam2go/CLAUDE.md",
"output_mode": "content",
"-n": true,
"-A": 3
}181:## Not Yet Implemented 182- 183-RETYPES (bit-level type reinterpretation), transputer intrinsics (LONGPROD, LONGDIV, LONGSUM, LONGDIFF, NORMALISE, SHIFTRIGHT, SHIFTLEFT), CAUSEERROR, PRI ALT/PRI PAR, PLACED PAR, PORT OF. These are needed to transpile `float_io.occ` (Phase 2). See `TODO.md` for the full list with priorities.
## Not Yet Implemented RETYPES (bit-level type reinterpretation), transputer intrinsics (LONGPROD, LONGDIV, LONGSUM, LONGDIFF, NORMALISE, SHIFTRIGHT, SHIFTLEFT), CAUSEERROR, PRI ALT/PRI PAR, PLACED PAR, PORT OF. These are needed to transpile `float_io.occ` (Phase 2). See `TODO.md` for the full list with priorities.
## Not Yet Implemented PRI ALT/PRI PAR, PLACED PAR, PORT OF. See `TODO.md` for the full list with priorities.
<tool_use_error>File has not been read yet. Read it first before writing to it.</tool_use_error>
{
"file_path": "/home/david/projects/code-associates/occam2go/CLAUDE.md",
"offset": 125,
"limit": 65
} 125→| `#FF`, `#80000000` | `0xFF`, `0x80000000` (hex integer literals) |
126→| `SIZE arr` / `SIZE "str"` | `len(arr)` / `len("str")` |
127→| `MOSTNEG INT` / `MOSTPOS INT` | `math.MinInt` / `math.MaxInt` |
128→| `MOSTNEG BYTE` / `MOSTPOS BYTE` | `0` / `255` |
129→| `MOSTNEG REAL32` / `MOSTPOS REAL32` | `-math.MaxFloat32` / `math.MaxFloat32` |
130→| `MOSTNEG REAL64` / `MOSTPOS REAL64` | `-math.MaxFloat64` / `math.MaxFloat64` |
131→| `[arr FROM n FOR m]` | `arr[n : n+m]` (array slice) |
132→| `[arr FOR m]` | `arr[0 : m]` (shorthand slice, FROM 0 implied) |
133→| `[arr FROM n FOR m] := src` | `copy(arr[n:n+m], src)` (slice assignment) |
134→| Nested `PROC`/`FUNCTION` | `name := func(...) { ... }` (Go closure) |
135→
136→## Key Parser Patterns
137→
138→### Indentation Tracking
139→- `p.indentLevel` is incremented/decremented in `nextToken()` when INDENT/DEDENT tokens pass through
140→- **startLevel pattern**: After consuming INDENT, save `startLevel := p.indentLevel`. Loop with `for p.curTokenIs(DEDENT) { if p.indentLevel < startLevel { return } }` to distinguish nested DEDENTs from block-ending DEDENTs
141→- Used in: `parseBlockStatements()`, `parseAltCases()`, `parseIfStatement()`, `parseCaseStatement()`
142→
143→### Token Flow Conventions
144→- Callers consume the INDENT token before calling block-parsing functions
145→- Block-parsing functions call `p.nextToken()` first to move past INDENT to the first real token
146→- `parseAssignment()` / `parseExpression()` leave the cursor on the last consumed token
147→- After `parseStatement()` returns, callers must advance if not already at NEWLINE/DEDENT/EOF
148→
149→### Expression Parsing
150→- Pratt parser with precedence levels: OR < AND < EQUALS < COMPARISON < SUM < PRODUCT < PREFIX < INDEX
151→- `parseExpression()` handles prefix (IDENT, INT, STRING, TRUE/FALSE, LPAREN, MINUS, NOT, BITNOT, INT_TYPE/BYTE_TYPE/BOOL_TYPE/REAL_TYPE for type conversions) then infix loop
152→- Function calls detected by `IDENT` followed by `LPAREN`
153→
154→## Adding a New Feature
155→
156→Typical workflow for a new language construct:
157→
158→1. **Lexer** (`lexer/token.go`, `lexer/lexer.go`): Add token types and keywords if needed
159→2. **AST** (`ast/ast.go`): Define new node struct(s) implementing `Statement` or `Expression`
160→3. **Parser** (`parser/parser.go`): Add case to `parseStatement()` switch; implement parse function
161→4. **Codegen** (`codegen/codegen.go`): Add case to `generateStatement()` or `generateExpression()`; implement generation. If the new construct needs an import (sync, fmt, time), add a `containsX()` scanner
162→5. **Tests**: Add parser unit tests in `parser/parser_test.go`, codegen unit tests in `codegen/codegen_test.go`, and e2e tests in `codegen/e2e_test.go`
163→6. **Documentation**: Update TODO.md to reflect support for the new feature.
164→
165→## What's Implemented
166→
167→Preprocessor (`#IF`/`#ELSE`/`#ENDIF`/`#DEFINE`/`#INCLUDE` with search paths, include guards, include-once deduplication, `#COMMENT`/`#PRAGMA`/`#USE` ignored), module file generation from SConscript (`gen-module` subcommand), SEQ, PAR, IF, WHILE, CASE, ALT (with guards, timer timeouts, and multi-statement bodies with scoped declarations), SKIP, STOP, variable/array/channel/timer declarations, abbreviations (`VAL INT x IS 42:`, `INT y IS z:`, `VAL []BYTE s IS "hi":`), assignments (simple and indexed), channel send/receive, channel arrays (`[n]CHAN OF TYPE` with indexed send/receive and `[]CHAN OF TYPE` proc params), PROC (with VAL, RESULT, reference, CHAN, []CHAN, open array `[]TYPE`, fixed-size array `[n]TYPE`, and shared-type params), channel direction restrictions (`CHAN OF INT c?` → `<-chan int`, `CHAN OF INT c!` → `chan<- int`, call-site annotations `out!`/`in?` accepted), multi-line parameter lists (lexer suppresses INDENT/DEDENT/NEWLINE inside parens), FUNCTION (IS and VALOF forms with multi-statement bodies, including multi-result `INT, INT FUNCTION` with `RESULT a, b`), multi-assignment (`a, b := func(...)` including indexed targets like `x[0], x[1] := x[1], x[0]`), KRoC-style colon terminators on PROC/FUNCTION (optional), replicators on SEQ/PAR/IF (with optional STEP), arithmetic/comparison/logical/AFTER/bitwise operators, type conversions (`INT expr`, `BYTE expr`, `REAL32 expr`, `REAL64 expr`, etc.), REAL32/REAL64 types, hex integer literals (`#FF`, `#80000000`), string literals, byte literals (`'A'`, `'*n'` with occam escape sequences), built-in print procedures, protocols (simple, sequential, and variant), record types (with field access via bracket syntax), SIZE operator, array slices (`[arr FROM n FOR m]` and shorthand `[arr FOR m]` with slice assignment), nested PROCs/FUNCTIONs (local definitions as Go closures), MOSTNEG/MOSTPOS (type min/max constants for INT, BYTE, REAL32, REAL64), INITIAL declarations (`INITIAL INT x IS 42:` — mutable variable with initial value), checked (modular) arithmetic (`PLUS`, `MINUS`, `TIMES` — wrapping operators).
168→
169→## Course Module Testing
170→
171→The KRoC course module (`kroc/modules/course/libsrc/course.module`) is a real-world integration test. A reduced version excluding `float_io.occ` is provided:
172→
173→```bash
174→# Transpile course module (without float_io.occ)
175→./occam2go -I kroc/modules/course/libsrc -D TARGET.BITS.PER.WORD=32 -o /tmp/course_out.go course_nofloat.module
176→
177→# Verify Go output compiles (will only fail with "no main" since it's a library)
178→go vet /tmp/course_out.go
179→```
180→
181→## Not Yet Implemented
182→
183→RETYPES (bit-level type reinterpretation), transputer intrinsics (LONGPROD, LONGDIV, LONGSUM, LONGDIFF, NORMALISE, SHIFTRIGHT, SHIFTLEFT), CAUSEERROR, PRI ALT/PRI PAR, PLACED PAR, PORT OF. These are needed to transpile `float_io.occ` (Phase 2). See `TODO.md` for the full list with priorities.
184→Preprocessor (`#IF`/`#ELSE`/`#ENDIF`/`#DEFINE`/`#INCLUDE` with search paths, include guards, include-once deduplication, `#COMMENT`/`#PRAGMA`/`#USE` ignored), module file generation from SConscript (`gen-module` subcommand), SEQ, PAR, IF, WHILE, CASE, ALT (with guards, timer timeouts, and multi-statement bodies with scoped declarations), SKIP, STOP, variable/array/channel/timer declarations, abbreviations (`VAL INT x IS 42:`, `INT y IS z:`, `VAL []BYTE s IS "hi":`), assignments (simple and indexed), channel send/receive, channel arrays (`[n]CHAN OF TYPE` with indexed send/receive and `[]CHAN OF TYPE` proc params), PROC (with VAL, RESULT, reference, CHAN, []CHAN, open array `[]TYPE`, fixed-size array `[n]TYPE`, and shared-type params), channel direction restrictions (`CHAN OF INT c?` → `<-chan int`, `CHAN OF INT c!` → `chan<- int`, call-site annotations `out!`/`in?` accepted), multi-line parameter lists (lexer suppresses INDENT/DEDENT/NEWLINE inside parens), FUNCTION (IS and VALOF forms with multi-statement bodies, including multi-result `INT, INT FUNCTION` with `RESULT a, b`), multi-assignment (`a, b := func(...)` including indexed targets like `x[0], x[1] := x[1], x[0]`), KRoC-style colon terminators on PROC/FUNCTION (optional), replicators on SEQ/PAR/IF (with optional STEP), arithmetic/comparison/logical/AFTER/bitwise operators, type conversions (`INT expr`, `BYTE expr`, `REAL32 expr`, `REAL64 expr`, etc.), REAL32/REAL64 types, hex integer literals (`#FF`, `#80000000`), string literals, byte literals (`'A'`, `'*n'` with occam escape sequences), built-in print procedures, protocols (simple, sequential, and variant), record types (with field access via bracket syntax), SIZE operator, array slices (`[arr FROM n FOR m]` and shorthand `[arr FOR m]` with slice assignment), nested PROCs/FUNCTIONs (local definitions as Go closures), MOSTNEG/MOSTPOS (type min/max constants for INT, BYTE, REAL32, REAL64), INITIAL declarations (`INITIAL INT x IS 42:` — mutable variable with initial value), checked (modular) arithmetic (`PLUS`, `MINUS`, `TIMES` — wrapping operators).
Preprocessor (`#IF`/`#ELSE`/`#ENDIF`/`#DEFINE`/`#INCLUDE` with search paths, include guards, include-once deduplication, `#COMMENT`/`#PRAGMA`/`#USE` ignored), module file generation from SConscript (`gen-module` subcommand), SEQ, PAR, IF, WHILE, CASE, ALT (with guards, timer timeouts, and multi-statement bodies with scoped declarations), SKIP, STOP, variable/array/channel/timer declarations, abbreviations (`VAL INT x IS 42:`, `INT y IS z:`, `VAL []BYTE s IS "hi":`, untyped `VAL x IS expr:`), assignments (simple and indexed), channel send/receive, channel arrays (`[n]CHAN OF TYPE` with indexed send/receive and `[]CHAN OF TYPE` proc params), PROC (with VAL, RESULT, reference, CHAN, []CHAN, open array `[]TYPE`, fixed-size array `[n]TYPE`, and shared-type params), channel direction restrictions (`CHAN OF INT c?` → `<-chan int`, `CHAN OF INT c!` → `chan<- int`, call-site annotations `out!`/`in?` accepted), multi-line parameter lists and expressions (lexer suppresses INDENT/DEDENT/NEWLINE inside parens/brackets and after continuation operators), FUNCTION (IS and VALOF forms with multi-statement bodies, including multi-result `INT, INT FUNCTION` with `RESULT a, b`), multi-assignment (`a, b := func(...)` including indexed targets like `x[0], x[1] := x[1], x[0]`), KRoC-style colon terminators on PROC/FUNCTION (optional), replicators on SEQ/PAR/IF (with optional STEP), arithmetic/comparison/logical/AFTER/bitwise operators, type conversions (`INT expr`, `BYTE expr`, `REAL32 expr`, `REAL64 expr`, etc.), REAL32/REAL64 types, hex integer literals (`#FF`, `#80000000`), string literals, byte literals (`'A'`, `'*n'` with occam escape sequences), built-in print procedures, protocols (simple, sequential, and variant), record types (with field access via bracket syntax), SIZE operator, array slices (`[arr FROM n FOR m]` and shorthand `[arr FOR m]` with slice assignment), array literals (`[1, 2, 3]`), nested PROCs/FUNCTIONs (local definitions as Go closures), MOSTNEG/MOSTPOS (type min/max constants for INT, BYTE, REAL32, REAL64), INITIAL declarations (`INITIAL INT x IS 42:` — mutable variable with initial value), checked (modular) arithmetic (`PLUS`, `MINUS`, `TIMES` — wrapping operators), RETYPES (bit-level type reinterpretation: `VAL INT X RETYPES X :` for float32→int, `VAL [2]INT X RETYPES X :` for float64→int pair), transputer intrinsics (LONGPROD, LONGDIV, LONGSUM, LONGDIFF, NORMALISE, SHIFTRIGHT, SHIFTLEFT — implemented as Go helper functions), CAUSEERROR (maps to `panic("CAUSEERROR")`).The file /home/david/projects/code-associates/occam2go/CLAUDE.md has been updated successfully.
## Course Module Testing The KRoC course module (`kroc/modules/course/libsrc/course.module`) is a real-world integration test. A reduced version excluding `float_io.occ` is provided: ```bash # Transpile course module (without float_io.occ) ./occam2go -I kroc/modules/course/libsrc -D TARGET.BITS.PER.WORD=32 -o /tmp/course_out.go course_nofloat.module # Verify Go output compiles (will only fail with "no main" since it's a library) go vet /tmp/course_out.go ``` ## Not Yet Implemented RETYPES (bit-level type reinterpretation), transputer intrinsics (LONGPROD, LONGDIV, LONGSUM, LONGDIFF, NORMALISE, SHIFTRIGHT, SHIFTLEFT), CAUSEERROR, PRI ALT/PRI PAR, PLACED PAR, PORT OF. These are needed to transpile `float_io.occ` (Phase 2). See `TODO.md` for the full list with priorities.
## Course Module Testing The KRoC course module (`kroc/modules/course/libsrc/course.module`) is a real-world integration test: ```bash # Transpile full course module (including float_io.occ) ./occam2go -I kroc/modules/course/libsrc -D TARGET.BITS.PER.WORD=32 -o /tmp/course_out.go kroc/modules/course/libsrc/course.module # Verify Go output compiles (will only fail with "no main" since it's a library) go vet /tmp/course_out.go ``` ## Not Yet Implemented PRI ALT/PRI PAR, PLACED PAR, PORT OF. See `TODO.md` for the full list with priorities.
The file /home/david/projects/code-associates/occam2go/CLAUDE.md has been updated successfully.
| Nested `PROC`/`FUNCTION` | `name := func(...) { ... }` (Go closure) || Nested `PROC`/`FUNCTION` | `name := func(...) { ... }` (Go closure) |
| `VAL x IS 42:` (untyped) | `var x = 42` (Go type inference) |
| `[1, 2, 3]` (array literal) | `[]int{1, 2, 3}` |
| `VAL INT X RETYPES X :` | `X := int(int32(math.Float32bits(float32(X))))` |
| `VAL [2]INT X RETYPES X :` | `X := []int{lo, hi}` via `math.Float64bits` |
| `CAUSEERROR()` | `panic("CAUSEERROR")` |
| `LONGPROD` / `LONGDIV` etc. | Go helper functions using `uint64`/`math/bits` |The file /home/david/projects/code-associates/occam2go/CLAUDE.md has been updated successfully.
1→# occam2go — Implementation Status
2→
3→## Fully Implemented
4→
5→### Core Constructs
6→- **SEQ** — Sequential execution, with replicators (`SEQ i = 0 FOR n`) and optional STEP
7→- **PAR** — Parallel execution via goroutines + sync.WaitGroup, with replicators
8→- **IF** — Multi-branch conditionals, maps to if/else if chains, with replicators; supports multi-statement bodies (declarations scoped before process)
9→- **WHILE** — Loops, maps to Go `for` loops; supports multi-statement bodies
10→- **CASE** — Pattern matching with multiple cases and ELSE branch; supports multi-statement bodies
11→- **ALT** — Channel alternation, maps to Go `select`; supports boolean guards, timer timeouts, and multi-statement bodies
12→- **SKIP** — No-op process
13→- **STOP** — Error + deadlock
14→
15→### Data Types & Declarations
16→- **INT, BYTE, BOOL, REAL, REAL32, REAL64** — Scalar types (REAL/REAL64 map to float64, REAL32 maps to float32)
17→- **Variable declarations** — `INT x, y, z:`
18→- **Arrays** — `[n]TYPE arr:` with index expressions
19→- **Channels** — `CHAN OF TYPE c:` with send (`!`) and receive (`?`); `CHAN BYTE` shorthand (without `OF`)
20→- **Channel arrays** — `[n]CHAN OF TYPE cs:` with indexed send/receive and `[]CHAN OF TYPE` proc params
21→- **Channel direction** — `CHAN OF INT c?` (receive-only) and `CHAN OF INT c!` (send-only); direction annotations at call sites (`out!`, `in?`) accepted and ignored
22→- **Timers** — `TIMER tim:` with reads and `AFTER` expressions
23→- **Abbreviations** — `VAL INT x IS 1:`, `INT y IS z:` — named constants and aliases
24→- **INITIAL declarations** — `INITIAL INT x IS 42:` — mutable variables with initial values
25→- **Byte literals** — `'A'`, `'0'` with occam escape sequences (`*n`, `*c`, `*t`)
26→- **Hex integer literals** — `#FF`, `#80000000`
27→
28→### Procedures & Functions
29→- **PROC** — Declaration with VAL, reference, CHAN OF, and open array (`[]TYPE`) parameters
30→- **PROC calls** — With automatic `&`/`*` for reference params, pass-through for channels
31→- **FUNCTION (IS form)** — `INT FUNCTION square(VAL INT x) IS x * x`
32→- **FUNCTION (VALOF form)** — Local declarations + VALOF body + RESULT
33→- **Multi-result FUNCTIONs** — `INT, INT FUNCTION f(...)` returning multiple values via `RESULT a, b`
34→- **Nested PROCs/FUNCTIONs** — Local definitions inside a PROC body, compiled as Go closures
35→- **KRoC-style colon terminators** — Optional `:` at end of PROC/FUNCTION body
36→- **Built-in print** — `print.int`, `print.bool`, `print.string`, `print.newline`
37→
38→### Expressions & Operators
39→- **Arithmetic** — `+`, `-`, `*`, `/`, `\` (modulo)
40→- **Comparison** — `=`, `<>`, `<`, `>`, `<=`, `>=`
41→- **Logical** — `AND`, `OR`, `NOT`
42→- **Bitwise** — `/\`, `\/`, `><`, `~`, `<<`, `>>`
43→- **AFTER** — As boolean expression (maps to `>`)
44→- **Parenthesized expressions**
45→- **Array indexing** — `arr[i]`, `arr[expr]`
46→- **String literals** — Double-quoted strings
47→- **Type conversions** — `INT expr`, `BYTE expr`, `REAL32 expr`, `REAL64 expr`
48→- **Checked arithmetic** — `PLUS`, `MINUS`, `TIMES` — modular (wrapping) operators
49→- **MOSTNEG/MOSTPOS** — Type min/max constants for INT, BYTE, REAL32, REAL64
50→- **SIZE operator** — `SIZE arr`, `SIZE "str"` maps to `len()`
51→- **Array slices** — `[arr FROM n FOR m]` with slice assignment
52→- **Multi-assignment** — `a, b := f(...)` including indexed targets like `x[0], x[1] := x[1], x[0]`
53→
54→### Protocols
55→- **Simple** — `PROTOCOL SIG IS INT` (type alias)
56→- **Sequential** — `PROTOCOL PAIR IS INT ; BYTE` (struct)
57→- **Variant** — `PROTOCOL MSG CASE tag; TYPE ...` (interface + concrete types)
58→
59→### Records
60→- **RECORD** — Struct types with field access via bracket syntax (`p[x]`)
61→
62→### Preprocessor
63→- **`#IF` / `#ELSE` / `#ENDIF`** — Conditional compilation with `TRUE`, `FALSE`, `DEFINED()`, `NOT`, equality
64→- **`#DEFINE`** — Symbol definition
65→- **`#INCLUDE`** — File inclusion with search paths and include guards
66→- **`#COMMENT` / `#PRAGMA` / `#USE`** — Ignored (blank lines)
67→- **Predefined symbols** — `TARGET.BITS.PER.WORD = 64`
68→
69→### Tooling
70→- **gen-module** — Generate `.module` files from KRoC SConscript build files
71→
72→---
73→
74→## Not Yet Implemented
75→
76→### Required for shared_screen module (extends course module)
77→
78→| Feature | Notes | Used in |
79→|---------|-------|---------|
80→| **`DATA TYPE X IS TYPE:`** | Simple type alias (e.g. `DATA TYPE COLOUR IS BYTE:`). | shared_screen.inc |
81→| **`DATA TYPE X RECORD`** | Alternative record syntax (vs current `RECORD X`). | shared_screen.inc |
82→| **Counted array protocol** | `BYTE::[]BYTE` — length-prefixed array in protocols. | shared_screen.inc, shared_screen.occ |
83→| **`RESULT` param qualifier** | `RESULT INT len` on PROC params (output-only, like a write-only reference). | float_io.occ |
84→
85→### Other language features
86→
87→| Feature | Notes |
88→|---------|-------|
89→| **PRI ALT / PRI PAR** | Priority variants of ALT and PAR. |
90→| **PLACED PAR** | Assigning processes to specific hardware. |
91→| **PORT OF** | Hardware port mapping. |
92→| **`RETYPES`** | Type punning / reinterpret cast (`VAL INT X RETYPES X :`). Used in float_io.occ. |
93→| **`CAUSEERROR ()`** | Built-in error-raising primitive. Used in float_io.occ. |
94→| **Transputer intrinsics** | `LONGPROD`, `LONGDIV`, `LONGSUM`, `LONGDIFF`, `NORMALISE`, `SHIFTLEFT`, `SHIFTRIGHT`. Used in float_io.occ. |
95→| **`VAL []BYTE` abbreviations** | `VAL []BYTE cmap IS "0123456789ABCDEF":` — named string constants. |
96→| **`#PRAGMA DEFINED`** | Compiler hint to suppress definedness warnings. Can be ignored. |
97→- **Abbreviations** — `VAL INT x IS 1:`, `INT y IS z:` — named constants and aliases
- **Abbreviations** — `VAL INT x IS 1:`, `INT y IS z:`, untyped `VAL x IS expr:` — named constants and aliases
The file /home/david/projects/code-associates/occam2go/TODO.md has been updated successfully.
- **Array slices** — `[arr FROM n FOR m]` with slice assignment - **Multi-assignment** — `a, b := f(...)` including indexed targets like `x[0], x[1] := x[1], x[0]`
- **Array slices** — `[arr FROM n FOR m]` with slice assignment - **Array literals** — `[1, 2, 3]` — inline array/table expressions - **Multi-assignment** — `a, b := f(...)` including indexed targets like `x[0], x[1] := x[1], x[0]` - **Multi-line expression continuation** — Binary operators and `:=` at end of line continue expression on next line
The file /home/david/projects/code-associates/occam2go/TODO.md has been updated successfully.
### Records - **RECORD** — Struct types with field access via bracket syntax (`p[x]`)
### Records
- **RECORD** — Struct types with field access via bracket syntax (`p[x]`)
### Type Reinterpretation & Intrinsics
- **RETYPES** — Bit-level type reinterpretation (`VAL INT X RETYPES X :` for float32→int, `VAL [2]INT X RETYPES X :` for float64→int pair)
- **Transputer intrinsics** — `LONGPROD`, `LONGDIV`, `LONGSUM`, `LONGDIFF`, `NORMALISE`, `SHIFTLEFT`, `SHIFTRIGHT` — extended-precision arithmetic as Go helper functions
- **CAUSEERROR** — Error-raising primitive, maps to `panic("CAUSEERROR")`The file /home/david/projects/code-associates/occam2go/TODO.md has been updated successfully.
| **`RETYPES`** | Type punning / reinterpret cast (`VAL INT X RETYPES X :`). Used in float_io.occ. | | **`CAUSEERROR ()`** | Built-in error-raising primitive. Used in float_io.occ. | | **Transputer intrinsics** | `LONGPROD`, `LONGDIV`, `LONGSUM`, `LONGDIFF`, `NORMALISE`, `SHIFTLEFT`, `SHIFTRIGHT`. Used in float_io.occ. | | **`VAL []BYTE` abbreviations** | `VAL []BYTE cmap IS "0123456789ABCDEF":` — named string constants. |
| **`VAL []BYTE` abbreviations** | `VAL []BYTE cmap IS "0123456789ABCDEF":` — named string constants. |
The file /home/david/projects/code-associates/occam2go/TODO.md has been updated successfully.
{
"file_path": "/home/david/.claude/projects/-home-david-projects-code-associates-occam2go/memory/MEMORY.md"
} 1→# occam2go Memory
2→
3→## Phase 2: float_io.occ (Deferred)
4→
5→The course module's `float_io.occ` requires features not yet implemented. Key items:
6→
7→- **RETYPES**: `VAL INT X RETYPES X :` — bit-level type reinterpretation. Codegen needs `math.Float32frombits`/`Float64bits` or unsafe pointer casts. ~5 occurrences.
8→- **Transputer intrinsics**: LONGPROD, LONGDIV, LONGSUM, LONGDIFF, NORMALISE, SHIFTRIGHT, SHIFTLEFT — extended-precision arithmetic for float-to-string conversion. ~50+ occurrences. Implement as Go helper functions using `math/bits` or `uint64` arithmetic.
9→- **CAUSEERROR()**: Error-raising primitive. Map to `panic("CAUSEERROR")`.
10→- Test with: `./occam2go -I kroc/modules/course/libsrc -D TARGET.BITS.PER.WORD=32 -o /tmp/course_out.go kroc/modules/course/libsrc/course.module`
11→
12→## Key Patterns Learned
13→
14→- **Lexer-level paren suppression**: When dealing with multi-line constructs inside parentheses, suppress INDENT/DEDENT/NEWLINE at the lexer level (not parser). Parser-side suppression breaks because it consumes INDENT tokens that delimit PROC bodies.
15→- **goIdent()**: Occam identifiers use dots (`out.repeat`), Go doesn't. `goIdent()` in codegen.go replaces `.` with `_`. Must be applied everywhere an identifier appears in generated code (~30+ locations).
16→- **Include-once tracking**: Added in `preproc.go` via `included map[string]bool`. Marks files in `resolveAndInclude()` (not `ProcessFile()`) to preserve circular include detection.
17→- **String→[]byte wrapping**: When a proc/func param is `[]BYTE` and the arg is a string literal, wrap with `[]byte()`. Done in `generateProcCall` and `generateFuncCallExpr`.
18→# occam2go Memory ## Phase 2: float_io.occ (COMPLETED) Full course module (including float_io.occ) now transpiles and passes `go vet`: ```bash ./occam2go -I kroc/modules/course/libsrc -D TARGET.BITS.PER.WORD=32 -o /tmp/course_out.go kroc/modules/course/libsrc/course.module ``` ## Key Patterns Learned - **Lexer-level continuation**: Multi-line expressions (operator at end of line) are handled at lexer level via `isContinuationOp()` + `lastTokenType` tracking. Suppresses NEWLINE/INDENT/DEDENT on continuation lines without modifying the indent stack. - **Lexer-level paren/bracket suppression**: `parenDepth` tracks `(`/`)` and `[`/`]` to suppress INDENT/DEDENT/NEWLINE inside brackets. - **goIdent()**: Replaces dots with underscores AND escapes Go reserved words (`byte`, `int`, `string`, `len`, etc.) by prefixing with `_`. Critical for real-world occam code. - **RETYPES parameter rename**: When `VAL INT X RETYPES X :` shadows a function parameter, the parameter is renamed (`X → _rp_X`) in the signature so `:=` can create a new variable. Go doesn't allow `:=` to redeclare a parameter in the same scope. - **Scoped nested proc signatures**: `collectNestedProcSigsScoped()` saves/restores `procSigs` per proc scope to prevent name collisions between same-named nested procs in different parent procs (e.g., two different `Round` procs). - **Fixed-size array params → slices**: `[n]TYPE` proc params map to `[]type` (not `[n]type`) for compatibility with both array variables and slice expressions. - **Abbreviation unused suppression**: `generateAbbreviation()` emits `_ = name` inside function bodies to prevent "declared and not used" errors. - **Include-once tracking**: Added in `preproc.go` via `included map[string]bool`. Marks files in `resolveAndInclude()` (not `ProcessFile()`) to preserve circular include detection. - **String→[]byte wrapping**: When a proc/func param is `[]BYTE` and the arg is a string literal, wrap with `[]byte()`. Done in `generateProcCall` and `generateFuncCallExpr`.
The file /home/david/.claude/projects/-home-david-projects-code-associates-occam2go/memory/MEMORY.md has been updated successfully.
? github.com/codeassociates/occam2go [no test files] ? github.com/codeassociates/occam2go/ast [no test files] ok github.com/codeassociates/occam2go/codegen 10.861s ok github.com/codeassociates/occam2go/lexer 0.001s ok github.com/codeassociates/occam2go/modgen 0.002s ok github.com/codeassociates/occam2go/parser 0.002s ok github.com/codeassociates/occam2go/preproc 0.003s
All tests pass. Here's a summary of everything implemented in Phase 2:
The full course.module (including float_io.occ) now transpiles with 0 parse errors and passes go vet.
Lexer changes:
- [/] bracket NEWLINE suppression (parenDepth tracking)
- RETYPES keyword token
- Multi-line expression continuation (lexer-level isContinuationOp() + lastTokenType tracking)
AST additions:
- ArrayLiteral (Expression) — [1, 2, 3]
- RetypesDecl (Statement) — VAL INT X RETYPES X :
Parser changes:
- Untyped VAL abbreviations (VAL name IS expr :)
- Array literal expressions ([e1, e2, ...])
- RETYPES declarations (single and array forms)
Codegen changes:
- Array literal generation ([]int{...})
- Untyped VAL (var x = expr)
- RETYPES with parameter renaming to handle same-name shadowing
- 7 transputer intrinsic helper functions (LONGPROD, LONGDIV, LONGSUM, LONGDIFF, NORMALISE, SHIFTRIGHT, SHIFTLEFT)
- CAUSEERROR → panic("CAUSEERROR")
- Go reserved word escaping in goIdent() (byte → _byte, etc.)
- Scoped nested proc signature collection (fixes same-named procs in different scopes)
- Abbreviation _ = name suppression in function bodies
- Fixed-size array params mapped to slices for Go compatibility
On branch main Your branch is up to date with 'origin/main'. Changes not staged for commit: (use "git add <file>..." to update what will be committed) (use "git restore <file>..." to discard changes in working directory) modified: CLAUDE.md modified: TODO.md modified: ast/ast.go modified: codegen/codegen.go modified: codegen/codegen_test.go modified: lexer/lexer.go modified: lexer/token.go modified: parser/parser.go modified: parser/parser_test.go Untracked files: (use "git add <file>..." to include in what will be committed) codegen/e2e_phase2_test.go no changes added to commit (use "git add" and/or "git commit -a")
40f906f Merge pull request #26 from codeassociates/feat/course-module-phase1 ba39c57 Update CLAUDE.md with Phase 1 features and Phase 2 roadmap 987f0a3 Support real-world occam syntax for course module transpilation ef96b78 Merge pull request #25 from codeassociates/fix/nested-if-infinite-loop 3f7bf36 Fix infinite loops in parser and support nested IF constructs
CLAUDE.md | 16 +- TODO.md | 12 +- ast/ast.go | 24 +++ codegen/codegen.go | 521 +++++++++++++++++++++++++++++++++++++++++++++--- codegen/codegen_test.go | 44 ++++ lexer/lexer.go | 48 ++++- lexer/token.go | 3 + parser/parser.go | 136 ++++++++++--- parser/parser_test.go | 210 +++++++++++++++++++ 9 files changed, 949 insertions(+), 65 deletions(-)
<persisted-output>
Output too large (48.8KB). Full output saved to: /home/david/.claude/projects/-home-david-projects-code-associates-occam2go/65034dd7-a4f6-4eb4-956f-e3eedc8f87a6/tool-results/bf4b039.txt
Preview (first 2KB):
diff --git a/CLAUDE.md b/CLAUDE.md
index 580f785..b075068 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -132,6 +132,12 @@ Six packages, one pipeline:
| `[arr FOR m]` | `arr[0 : m]` (shorthand slice, FROM 0 implied) |
| `[arr FROM n FOR m] := src` | `copy(arr[n:n+m], src)` (slice assignment) |
| Nested `PROC`/`FUNCTION` | `name := func(...) { ... }` (Go closure) |
+| `VAL x IS 42:` (untyped) | `var x = 42` (Go type inference) |
+| `[1, 2, 3]` (array literal) | `[]int{1, 2, 3}` |
+| `VAL INT X RETYPES X :` | `X := int(int32(math.Float32bits(float32(X))))` |
+| `VAL [2]INT X RETYPES X :` | `X := []int{lo, hi}` via `math.Float64bits` |
+| `CAUSEERROR()` | `panic("CAUSEERROR")` |
+| `LONGPROD` / `LONGDIV` etc. | Go helper functions using `uint64`/`math/bits` |
## Key Parser Patterns
@@ -164,15 +170,15 @@ Typical workflow for a new language construct:
## What's Implemented
-Preprocessor (`#IF`/`#ELSE`/`#ENDIF`/`#DEFINE`/`#INCLUDE` with search paths, include guards, include-once deduplication, `#COMMENT`/`#PRAGMA`/`#USE` ignored), module file generation from SConscript (`gen-module` subcommand), SEQ, PAR, IF, WHILE, CASE, ALT (with guards, timer timeouts, and multi-statement bodies with scoped declarations), SKIP, STOP, variable/array/channel/timer declarations, abbreviations (`VAL INT x IS 42:`, `INT y IS z:`, `VAL []BYTE s IS "hi":`), assignments (simple and indexed), channel send/receive, channel arrays (`[n]CHAN OF TYPE` with indexed send/receive and `[]CHAN OF TYPE` proc params), PROC (with VAL, RESULT, reference, CHAN, []CHAN, open array `[]TYPE`, fixed-size array `[n]TYPE`, and shared-type params), channel direction restrictions (`CHAN OF INT c?` → `<-chan int`, `CHAN OF INT c!` → `chan<- int`, call-site annotations `out!`/`in?` accepted), multi-line parameter lists (lexer suppresses INDENT/DEDENT/NEWLINE inside parens), FUNCTION (IS and VALOF forms with multi-statement bodies, including multi-result `INT, INT FUNCTION` with `RESULT a, b`), multi-assignment
...
</persisted-output>git add CLAUDE.md TODO.md ast/ast.go codegen/codegen.go codegen/codegen_test.go codegen/e2e_phase2_test.go lexer/lexer.go lexer/token.go parser/parser.go parser/parser_test.go
git commit -m "$(cat <<'EOF' Support full course module transpilation including float_io.occ Implement Phase 2 features needed to transpile the complete KRoC course module with float_io.occ (598 parse errors → 0, go vet clean). Adds RETYPES declarations, transputer intrinsic helpers, array literals, untyped VAL abbreviations, multi-line expression continuation, and CAUSEERROR support across all pipeline stages. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> EOF )"
10 files changed, 1051 insertions(+), 65 deletions(-) create mode 100644 codegen/e2e_phase2_test.go
remote: remote: Create a pull request for 'feat/course-module-phase2' on GitHub by visiting: remote: https://github.com/codeassociates/occam2go/pull/new/feat/course-module-phase2 remote: To associate-1.github.com:codeassociates/occam2go.git * [new branch] feat/course-module-phase2 -> feat/course-module-phase2 branch 'feat/course-module-phase2' set up to track 'origin/feat/course-module-phase2'.
gh pr create --title "Support full course module transpilation including float_io.occ" --body "$(cat <<'EOF'
## Summary
- Implement Phase 2 features to transpile the complete KRoC course module with `float_io.occ` (598 parse errors → 0, `go vet` clean)
- Add RETYPES declarations, transputer intrinsic helpers, array literals, untyped VAL abbreviations, multi-line expression continuation, and CAUSEERROR support
- Add 13 new tests (5 parser, 4 codegen, 4 e2e) covering all new constructs
### Lexer
- Bracket `[`/`]` NEWLINE suppression via `parenDepth`
- `RETYPES` keyword token
- Multi-line expression continuation: `isContinuationOp()` suppresses NEWLINE/INDENT/DEDENT after binary operators and `:=`
### AST
- `ArrayLiteral` expression node (`[1, 2, 3]`)
- `RetypesDecl` statement node (`VAL INT X RETYPES X :`)
### Parser
- Untyped VAL abbreviations (`VAL name IS expr :`)
- Array literal expressions (`[e1, e2, ...]`)
- RETYPES declarations (single `VAL INT X RETYPES X :` and array `VAL [2]INT X RETYPES X :`)
### Codegen
- Array literal → `[]int{...}`
- Untyped VAL → `var x = expr`
- RETYPES with parameter renaming for same-name shadowing (Go can't `:=` redeclare params)
- 7 transputer intrinsic Go helpers (LONGPROD, LONGDIV, LONGSUM, LONGDIFF, NORMALISE, SHIFTRIGHT, SHIFTLEFT)
- CAUSEERROR → `panic("CAUSEERROR")`
- Go reserved word escaping (`byte` → `_byte`, etc.)
- Scoped nested proc signature collection (fixes same-named nested procs)
- Fixed-size array params → slices for Go compatibility
- Abbreviation `_ = name` suppression in function bodies
## Test plan
- [x] `go test ./...` — all existing + 13 new tests pass
- [x] `go vet /tmp/course_out.go` — full course.module transpiles cleanly
- [x] `go vet` on `course_nofloat.module` — regression check passes
🤖 Generated with [Claude Code](https://claude.com/claude-code)
EOF
)"