title: gen-module isn't general purpose
state: OPEN
author: dboreham
labels:
comments: 0
assignees:
projects:
milestone:
number: 37
--
I noticed that the `gen-module` subcommand is documented as being able to process kroc module/library definitions via their `SConscript` files, which are Python scripts. However, I think that the current implementation doesn't actually have the ability to process arbitrary Python scripts. Rather, it knows enough about the SConscript file in the course module's libsrc subdirectory to parse it and extract the referenced Occam source files.

For example, if it were given the "root" course module SConscript file (https://github.com/concurrency/kroc/blob/master/modules/course/SConscript) it wouldn't succeed in processing it, because doing so requires executing the Python for loop here: https://github.com/concurrency/kroc/blob/master/modules/course/SConscript#L5

Although having the capability to process arbitrary SConscript files might be nice, I think the implementation approach taken is probably ok given the additional effort that would be required. But I think this tradeoff/limitation should be documented to avoid user confusion.
// Package modgen generates .module files from KRoC SConscript build files.
// It parses the Python-based SConscript to extract source file lists and
// OccamLibrary calls, then produces an occam module file with include guards.
package modgen

import (
	"fmt"
	"regexp"
	"strings"
)

// Library represents an OccamLibrary extracted from SConscript.
type Library struct {
	Name     string   // e.g. "course.lib"
	Sources  []string // source files
	Includes []string // --include files from OCCBUILDFLAGS
	Needs    []string // --need dependencies from OCCBUILDFLAGS
}

// ParseSConscript parses a SConscript file's content and extracts library definitions.
func ParseSConscript(content string) []Library {
	vars := extractSplitVars(content)
	return extractLibraries(content, vars)
}

// GenerateModule creates a .module file from a Library.
// moduleName is the guard symbol (e.g. "COURSE.MODULE").
func GenerateModule(lib Library, moduleName string) string {
	var b strings.Builder
	fmt.Fprintf(&b, "#IF NOT (DEFINED (%s))\n", moduleName)
	fmt.Fprintf(&b, "#DEFINE %s\n", moduleName)

	// Include files from OCCBUILDFLAGS first
	for _, inc := range lib.Includes {
		fmt.Fprintf(&b, "#INCLUDE \"%s\"\n", inc)
	}

	// Then source files
	for _, src := range lib.Sources {
		fmt.Fprintf(&b, "#INCLUDE \"%s\"\n", src)
	}

	b.WriteString("#ENDIF\n")
	return b.String()
}

// splitVarRe matches: varname = Split('''...''') or Split("""...""")
var splitVarRe = regexp.MustCompile(`(\w+)\s*=\s*Split\(\s*(?:'''|""")([^'"]*)(?:'''|""")\s*\)`)

// extractSplitVars finds all variable = Split('''...''') assignments.
func extractSplitVars(content string) map[string][]string {
	vars := map[string][]string{}
	for _, m := range splitVarRe.FindAllStringSubmatch(content, -1) {
		name := m[1]
		files := splitWhitespace(m[2])
		vars[name] = files
	}
	return vars
}

// libCallRe matches OccamLibrary calls (both direct and via return).
// Captures: library name, source variable, and optional OCCBUILDFLAGS.
var libCallRe = regexp.MustCompile(
	`(?:local\.)?OccamLibrary\(\s*` +
		`['"]([^'"]+)['"]\s*,\s*` + // library name
		`(\w+)\s*` + // source variable
		`(?:,[^)]*?)?\)`, // optional extra args
)

// flagsRe extracts OCCBUILDFLAGS value.
var flagsRe = regexp.MustCompile(`OCCBUILDFLAGS\s*=\s*['"]([^'"]+)['"]`)

func extractLibraries(content string, vars map[string][]string) []Library {
	var libs []Library

	for _, m := range libCallRe.FindAllStringSubmatch(content, -1) {
		lib := Library{
			Name: m[1],
		}

		srcVar := m[2]
		if files, ok := vars[srcVar]; ok {
			lib.Sources = files
		}

		// Look for OCCBUILDFLAGS in the full match
		fullMatch := m[0]
		if fm := flagsRe.FindStringSubmatch(fullMatch); fm != nil {
			parseFlags(fm[1], &lib)
		}

		libs = append(libs, lib)
	}

	return libs
}

// parseFlags extracts --include and --need values from OCCBUILDFLAGS.
func parseFlags(flags string, lib *Library) {
	parts := strings.Fields(flags)
	for i := 0; i < len(parts); i++ {
		switch parts[i] {
		case "--include":
			if i+1 < len(parts) {
				lib.Includes = append(lib.Includes, parts[i+1])
				i++
			}
		case "--need":
			if i+1 < len(parts) {
				lib.Needs = append(lib.Needs, parts[i+1])
				i++
			}
		}
	}
}

func splitWhitespace(s string) []string {
	var result []string
	for _, f := range strings.Fields(s) {
		if f != "" {
			result = append(result, f)
		}
	}
	return result
}
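To make the flag handling in `parseFlags` above concrete, here is a minimal standalone sketch. `splitFlags` is a hypothetical helper that mirrors the same loop but returns the two lists instead of filling a `Library`, and the flag string is made up for illustration rather than taken from a real SConscript.

```go
package main

import (
	"fmt"
	"strings"
)

// splitFlags mirrors the parseFlags loop: each "--include X" or "--need X"
// pair is collected, and any other token is skipped.
// (Hypothetical helper name, used here for illustration only.)
func splitFlags(flags string) (includes, needs []string) {
	parts := strings.Fields(flags)
	for i := 0; i < len(parts); i++ {
		if i+1 >= len(parts) {
			break // a trailing flag with no value is ignored
		}
		switch parts[i] {
		case "--include":
			includes = append(includes, parts[i+1])
			i++ // skip the value just consumed
		case "--need":
			needs = append(needs, parts[i+1])
			i++
		}
	}
	return includes, needs
}

func main() {
	// Assumed OCCBUILDFLAGS value; "-v" stands in for any unrecognised flag.
	inc, need := splitFlags("--include consts.inc --need forall -v")
	fmt.Println(inc, need) // prints "[consts.inc] [forall]"
}
```

Only the `--include` values end up in the generated module (as `#INCLUDE` lines ahead of the sources); `GenerateModule` does not emit anything for `Needs`.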
The issue is asking for documentation of the limitation that gen-module only handles simple SConscript files (specifically the pattern-matching approach using regex), not arbitrary Python scripts with loops and complex logic. Let me check the README and any other relevant documentation.
1→# Preamble
2→
3→Greetings humans. This paragraph is the only part of this repo generated by a human. Everything else (code, PRs, CI) was created by LLMs (so far Claude Code Opus 4.5 and 4.6).
4→My goal as the human in this endeavour was to see if I could refute posts I saw after Anthropic publicised their project to have Claude write a C compiler. I noticed many comments to the effect
5→"well that's easy because there's many C compilers in the training data". I had the idea: Occam is so old that there's basically nothing about it on the modern internet.
6→I had also had a long-standing curiosity about the connection between Occam and go(lang) that made me curious as to whether you could build an Occam to golang transpiler.
7→This project is the result: scratching two itches -- can an LLM create a working compiler for a language that doesn't show up in its training data, and can one transpile Occam to Go.
8→Without Claude I'd never have the time to scratch that second itch. And...over to Claude:
9→
10→# occam2go
11→
12→A transpiler from Occam to Go, written in Go.
13→
14→Occam was developed in the 1980s to support concurrent programming on the Transputer. Go, created decades later, shares similar CSP-influenced concurrency primitives. This transpiler bridges the two.
15→
16→## Building
17→
18→```bash
19→go build -o occam2go
20→```
21→
22→## Usage
23→
24→```bash
25→./occam2go [options] <input.occ>
26→./occam2go gen-module [-o output] [-name GUARD] <SConscript>
27→```
28→
29→Options:
30→- `-o <file>` - Write output to file (default: stdout)
31→- `-I <path>` - Include search path for `#INCLUDE` resolution (repeatable)
32→- `-D <SYMBOL>` - Predefined preprocessor symbol (repeatable, supports `SYMBOL=value`)
33→- `-version` - Print version and exit
34→
35→## Running an Example
36→
37→Here's how to transpile, compile, and run an Occam program:
38→
39→```bash
40→# 1. Build the transpiler (only needed once)
41→go build -o occam2go
42→
43→# 2. Transpile an Occam file to Go
44→./occam2go -o output.go examples/print.occ
45→
46→# 3. Compile the generated Go code
47→go build -o output output.go
48→
49→# 4. Run the compiled program
50→./output
51→```
52→
53→Or as a one-liner to see the output immediately:
54→
55→```bash
56→./occam2go -o output.go examples/print.occ && go run output.go
57→```
58→
59→## Example
60→
61→Input (`example.occ`):
62→```occam
63→SEQ
64→ INT x, y:
65→ PAR
66→ x := 1
67→ y := 2
68→ x := x + y
69→```
70→
71→Output:
72→```go
73→package main
74→
75→import (
76→ "sync"
77→)
78→
79→func main() {
80→ var x, y int
81→ var wg sync.WaitGroup
82→ wg.Add(2)
83→ go func() {
84→ defer wg.Done()
85→ x = 1
86→ }()
87→ go func() {
88→ defer wg.Done()
89→ y = 2
90→ }()
91→ wg.Wait()
92→ x = (x + y)
93→}
94→```
95→
96→## Codebase Size
97→
98→~16,200 lines of code (excluding the vendored `kroc/` subtree).
99→
100→| Category | Lines |
101→|----------|------:|
102→| Production code | 7,300 |
103→| Tests | 8,800 |
104→| Occam examples | 135 |
105→| **Total** | **~16,200** |
106→
107→Largest files:
108→
109→| File | Lines |
110→|------|------:|
111→| `parser/parser_test.go` | 3,124 |
112→| `parser/parser.go` | 2,811 |
113→| `codegen/codegen.go` | 2,617 |
114→| `codegen/codegen_test.go` | 669 |
115→| `ast/ast.go` | 520 |
116→| `lexer/lexer.go` | 448 |
117→| `preproc/preproc_test.go` | 437 |
118→| E2E test files (18 files) | 3,826 |
119→
120→## Implemented Features
121→
122→See [TODO.md](TODO.md) for the full implementation status and roadmap.
123→
124→| Occam | Go |
125→|-------|-----|
126→| `INT`, `BYTE`, `BOOL`, `REAL` | `int`, `byte`, `bool`, `float64` |
127→| `SEQ` | Sequential code |
128→| `PAR` | Goroutines with `sync.WaitGroup` |
129→| `IF` | `if / else if` |
130→| `WHILE` | `for` loop |
131→| `STOP` | Print to stderr + `select {}` (deadlock) |
132→| `PROC` with `VAL` params | Functions with value/pointer params |
133→| `:=` assignment | `=` assignment |
134→| Arithmetic: `+`, `-`, `*`, `/`, `\` | `+`, `-`, `*`, `/`, `%` |
135→| Comparison: `=`, `<>`, `<`, `>`, `<=`, `>=` | `==`, `!=`, `<`, `>`, `<=`, `>=` |
136→| Logic: `AND`, `OR`, `NOT` | `&&`, `\|\|`, `!` |
137→| Bitwise: `/\`, `\/`, `><`, `~` | `&`, `\|`, `^`, `^` (AND, OR, XOR, NOT) |
138→| Shifts: `<<`, `>>` | `<<`, `>>` |
139→| Type conversions: `INT x`, `BYTE n` | `int(x)`, `byte(n)` |
140→
141→### Channels
142→
143→| Occam | Go |
144→|-------|-----|
145→| `CHAN OF INT c:` | `c := make(chan int)` |
146→| `c ! x` (send) | `c <- x` |
147→| `c ? y` (receive) | `y = <-c` |
148→| `[5]CHAN OF INT cs:` | `cs := make([]chan int, 5)` + init loop |
149→| `cs[i] ! x` (indexed send) | `cs[i] <- x` |
150→| `cs[i] ? y` (indexed receive) | `y = <-cs[i]` |
151→| `PROC f([]CHAN OF INT cs)` | `func f(cs []chan int)` |
152→
153→Example:
154→```occam
155→SEQ
156→ CHAN OF INT c:
157→ INT result:
158→ PAR
159→ c ! 42
160→ c ? result
161→ print.int(result)
162→```
163→
164→Channel array example:
165→```occam
166→SEQ
167→ [3]CHAN OF INT cs:
168→ INT sum:
169→ sum := 0
170→ PAR
171→ PAR i = 0 FOR 3
172→ cs[i] ! (i + 1) * 10
173→ SEQ i = 0 FOR 3
174→ INT x:
175→ cs[i] ? x
176→ sum := sum + x
177→ print.int(sum)
178→```
179→
180→### Protocols
181→
182→Protocols define the type of data carried on a channel. Three forms are supported:
183→
184→| Occam | Go |
185→|-------|-----|
186→| `PROTOCOL SIG IS INT` | `type _proto_SIG = int` |
187→| `PROTOCOL PAIR IS INT ; BYTE` | `type _proto_PAIR struct { _0 int; _1 byte }` |
188→| `PROTOCOL MSG CASE tag; INT ...` | Interface + concrete structs per tag |
189→| `c ! 42 ; 65` (sequential send) | `c <- _proto_PAIR{42, 65}` |
190→| `c ? x ; y` (sequential recv) | `_tmp := <-c; x = _tmp._0; y = _tmp._1` |
191→| `c ! tag ; val` (variant send) | `c <- _proto_MSG_tag{val}` |
192→| `c ? CASE ...` (variant recv) | `switch _v := (<-c).(type) { ... }` |
193→
194→Sequential protocol example:
195→```occam
196→PROTOCOL PAIR IS INT ; INT
197→
198→SEQ
199→ CHAN OF PAIR c:
200→ INT x, y:
201→ PAR
202→ c ! 10 ; 20
203→ c ? x ; y
204→ print.int(x + y)
205→```
206→
207→Variant protocol example:
208→```occam
209→PROTOCOL MSG
210→ CASE
211→ data; INT
212→ quit
213→
214→SEQ
215→ CHAN OF MSG c:
216→ INT result:
217→ PAR
218→ c ! data ; 42
219→ c ? CASE
220→ data ; result
221→ print.int(result)
222→ quit
223→ SKIP
224→```
225→
226→### Records
227→
228→| Occam | Go |
229→|-------|-----|
230→| `RECORD POINT` with `INT x:` `INT y:` | `type POINT struct { x int; y int }` |
231→| `POINT p:` | `var p POINT` |
232→| `p[x] := 10` | `p.x = 10` |
233→| `p[x]` (in expression) | `p.x` |
234→| `PROC foo(POINT p)` (ref) | `func foo(p *POINT)` |
235→| `PROC foo(VAL POINT p)` (val) | `func foo(p POINT)` |
236→
237→Example:
238→```occam
239→RECORD POINT
240→ INT x:
241→ INT y:
242→
243→SEQ
244→ POINT p:
245→ p[x] := 10
246→ p[y] := 20
247→ print.int(p[x] + p[y])
248→```
249→
250→### Arrays
251→
252→| Occam | Go |
253→|-------|-----|
254→| `[5]INT arr:` | `arr := make([]int, 5)` |
255→| `arr[i] := x` | `arr[i] = x` |
256→| `x := arr[i]` | `x = arr[i]` |
257→
258→Example:
259→```occam
260→SEQ
261→ [5]INT arr:
262→ SEQ i = 0 FOR 5
263→ arr[i] := (i + 1) * 10
264→ INT sum:
265→ sum := 0
266→ SEQ i = 0 FOR 5
267→ sum := sum + arr[i]
268→ print.int(sum)
269→```
270→
271→### ALT (Alternation)
272→
273→| Occam | Go |
274→|-------|-----|
275→| `ALT` | `select` |
276→| `guard & c ? x` | Conditional channel with nil pattern |
277→| `SEQ i = 0 FOR n` | `for i := 0; i < n; i++` |
278→| `PAR i = 0 FOR n` | Parallel `for` loop with goroutines |
279→
280→Example:
281→```occam
282→ALT
283→ c1 ? x
284→ print.int(x)
285→ c2 ? y
286→ print.int(y)
287→```
288→
289→Generates:
290→```go
291→select {
292→case x = <-c1:
293→ fmt.Println(x)
294→case y = <-c2:
295→ fmt.Println(y)
296→}
297→```
298→
299→ALT with guards (optional boolean conditions):
300→```occam
301→ALT
302→ enabled & c1 ? x
303→ process(x)
304→ TRUE & c2 ? y
305→ process(y)
306→```
307→
308→### Replicators
309→
310→Replicators allow you to repeat a block of code a specified number of times.
311→
312→| Occam | Go |
313→|-------|-----|
314→| `SEQ i = 0 FOR n` | `for i := 0; i < n; i++` |
315→| `PAR i = 0 FOR n` | Parallel for loop with goroutines |
316→
317→Example with replicated SEQ:
318→```occam
319→SEQ i = 1 FOR 5
320→ print.int(i)
321→```
322→
323→This prints 1, 2, 3, 4, 5.
324→
325→Example with replicated PAR (spawns n concurrent processes):
326→```occam
327→PAR i = 0 FOR 4
328→ c ! i
329→```
330→
331→### Built-in I/O Procedures
332→
333→| Occam | Go |
334→|-------|-----|
335→| `print.int(x)` | `fmt.Println(x)` |
336→| `print.bool(x)` | `fmt.Println(x)` |
337→| `print.string(x)` | `fmt.Println(x)` |
338→| `print.newline()` | `fmt.Println()` |
339→
340→## Preprocessor and Modules
341→
342→Occam programs use `#INCLUDE` to import library modules. The transpiler includes a textual preprocessor that runs before lexing, handling conditional compilation and file inclusion.
343→
344→### Preprocessor Directives
345→
346→| Directive | Description |
347→|-----------|-------------|
348→| `#INCLUDE "file"` | Textually include a file (resolved relative to current file, then `-I` paths) |
349→| `#DEFINE SYMBOL` | Define a preprocessor symbol |
350→| `#IF expr` | Conditional compilation (`TRUE`, `FALSE`, `DEFINED (SYM)`, `NOT`, `(SYM = val)`) |
351→| `#ELSE` | Alternative branch |
352→| `#ENDIF` | End conditional block |
353→| `#COMMENT`, `#PRAGMA`, `#USE` | Ignored (replaced with blank lines to preserve line numbers) |
354→
355→The predefined symbol `TARGET.BITS.PER.WORD` is set to `64` (Go's `int` is 64 bits wide on 64-bit platforms).
356→
357→### Using Modules with `#INCLUDE`
358→
359→Create a module file with include guards to prevent double-inclusion:
360→
361→```occam
362→-- mathlib.module
363→#IF NOT (DEFINED (MATHLIB.MODULE))
364→#DEFINE MATHLIB.MODULE
365→
366→INT FUNCTION abs(VAL INT x)
367→ INT result:
368→ VALOF
369→ IF
370→ x < 0
371→ result := 0 - x
372→ TRUE
373→ result := x
374→ RESULT result
375→
376→#ENDIF
377→```
378→
379→Then include it in your program:
380→
381→```occam
382→-- main.occ
383→#INCLUDE "mathlib.module"
384→
385→SEQ
386→ print.int(abs(0 - 42))
387→```
388→
389→Transpile with `-I` to specify where to find the module:
390→
391→```bash
392→./occam2go -I examples -o main.go examples/include_demo.occ
393→go run main.go
394→```
395→
396→Output:
397→```
398→42
399→20
400→10
401→```
402→
403→A working example is provided in `examples/include_demo.occ` with `examples/mathlib.module`.
404→
405→### Generating Module Files from KRoC SConscript
406→
407→The KRoC project defines module composition in SConscript (Python) build files. The `gen-module` subcommand parses these to generate `.module` files:
408→
409→```bash
410→# Clone the KRoC repository (one-time setup)
411→./scripts/clone-kroc.sh
412→
413→# Generate a module file from SConscript
414→./occam2go gen-module kroc/modules/course/libsrc/SConscript
415→```
416→
417→This outputs:
418→```
419→#IF NOT (DEFINED (COURSE.MODULE))
420→#DEFINE COURSE.MODULE
421→#INCLUDE "consts.inc"
422→#INCLUDE "utils.occ"
423→#INCLUDE "string.occ"
424→#INCLUDE "demo_cycles.occ"
425→#INCLUDE "demo_nets.occ"
426→#INCLUDE "file_in.occ"
427→#INCLUDE "float_io.occ"
428→#INCLUDE "random.occ"
429→#ENDIF
430→```
431→
432→### Running Programs with the Course Module
433→
434→The KRoC [course module](https://github.com/concurrency/kroc/tree/master/modules/course) is a standard occam library providing I/O utilities (`out.string`, `out.int`, `out.repeat`, etc.) for character-level communication over byte channels. The transpiler fully supports it.
435→
436→Occam programs that follow the standard entry point pattern — a PROC with three `CHAN BYTE` parameters `(keyboard?, screen!, error!)` — automatically get a generated `main()` that wires stdin, stdout, and stderr to channels.
437→
438→```bash
439→# 1. Clone the KRoC repository (one-time setup)
440→./scripts/clone-kroc.sh
441→
442→# 2. Build the transpiler
443→go build -o occam2go
444→
445→# 3. Generate the course module file from the KRoC SConscript (one-time setup)
446→./occam2go gen-module -o kroc/modules/course/libsrc/course.module \
447→ kroc/modules/course/libsrc/SConscript
448→
449→# 4. Transpile an example that uses the course module
450→./occam2go -I kroc/modules/course/libsrc \
451→ -D TARGET.BITS.PER.WORD=32 \
452→ -o hello.go examples/course_hello.occ
453→
454→# 5. Run it
455→go run hello.go
456→```
457→
458→Output:
459→```
460→Hello from occam2go!
461→The answer is: 42
462→------------------------------
463→Counting: 1, 2, 3, 4, 5
464→```
465→
466→The `-I` flag tells the preprocessor where to find the course module source files, and `-D TARGET.BITS.PER.WORD=32` sets the word size expected by the course module (the transpiler defaults to 64).
467→
468→The example program (`examples/course_hello.occ`):
469→```occam
470→#INCLUDE "course.module"
471→
472→PROC hello (CHAN BYTE keyboard?, screen!, error!)
473→ SEQ
474→ out.string ("Hello from occam2go!*c*n", 0, screen!)
475→ out.string ("The answer is: ", 0, screen!)
476→ out.int (42, 0, screen!)
477→ out.string ("*c*n", 0, screen!)
478→ out.repeat ('-', 30, screen!)
479→ out.string ("*c*n", 0, screen!)
480→ out.string ("Counting: ", 0, screen!)
481→ SEQ i = 1 FOR 5
482→ SEQ
483→ IF
484→ i > 1
485→ out.string (", ", 0, screen!)
486→ TRUE
487→ SKIP
488→ out.int (i, 0, screen!)
489→ out.string ("*c*n", 0, screen!)
490→:
491→```
492→
493→You can also transpile the KRoC examples directly:
494→```bash
495→./occam2go -I kroc/modules/course/libsrc \
496→ -D TARGET.BITS.PER.WORD=32 \
497→ -o hello_world.go kroc/modules/course/examples/hello_world.occ
498→go run hello_world.go
499→```
500→
501→## How Channels are Mapped
502→
503→Both Occam and Go draw from Tony Hoare's Communicating Sequential Processes (CSP) model, making channel communication a natural fit for transpilation.
504→
505→### Conceptual Mapping
506→
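507→In Occam, channels are the primary mechanism for communication between parallel processes. A channel is a synchronous, unbuffered, point-to-point connection. Go channels created without a buffer size are likewise synchronous and unbuffered, although Go itself does not restrict a channel to a single sender and a single receiver.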
508→
509→| Concept | Occam | Go |
510→|---------|-------|-----|
511→| Declaration | `CHAN OF INT c:` | `c := make(chan int)` |
512→| Send (blocks until receiver ready) | `c ! value` | `c <- value` |
513→| Receive (blocks until sender ready) | `c ? variable` | `variable = <-c` |
514→| Synchronisation | Implicit in `!` and `?` | Implicit in `<-` |
515→
516→### Synchronous Communication
517→
518→Both languages use synchronous (rendezvous) communication by default:
519→
520→```occam
521→PAR
522→ c ! 42 -- blocks until receiver is ready
523→ c ? x -- blocks until sender is ready
524→```
525→
526→The sender and receiver must both be ready before the communication occurs. This is preserved in the generated Go code, where unbuffered channels have the same semantics.
527→
528→### Differences and Limitations
529→
530→1. **Channel direction**: Occam channels are inherently unidirectional. Go channels can be bidirectional but can be restricted using types (`chan<-` for send-only, `<-chan` for receive-only). The transpiler currently generates bidirectional Go channels.
531→
532→2. **Protocol types**: Simple, sequential, and variant protocols are supported. Nested protocols (protocols referencing other protocols) are not yet supported.
533→
534→3. **Channel arrays**: Channel arrays (`[n]CHAN OF TYPE`) are supported, including indexed send/receive, `[]CHAN OF TYPE` proc params, and ALT with indexed channels.
535→
536→4. **ALT construct**: Occam's `ALT` maps to Go's `select` statement. Basic ALT, guards, and timer timeouts are supported. Priority ALT (`PRI ALT`) and replicated ALT are not yet implemented.
537→
538→## How PAR is Mapped
539→
540→Occam's `PAR` construct runs processes truly in parallel. On the Transputer this was hardware-scheduled; in Go it maps to goroutines coordinated with a `sync.WaitGroup`.
541→
542→### Basic PAR
543→
544→Each branch of a `PAR` block becomes a goroutine. The transpiler inserts a `WaitGroup` to ensure all branches complete before execution continues:
545→
546→```occam
547→PAR
548→ c ! 42
549→ c ? x
550→```
551→
552→Generates:
553→
554→```go
555→var wg sync.WaitGroup
556→wg.Add(2)
557→go func() {
558→ defer wg.Done()
559→ c <- 42
560→}()
561→go func() {
562→ defer wg.Done()
563→ x = <-c
564→}()
565→wg.Wait()
566→```
567→
568→The `wg.Wait()` call blocks until all goroutines have finished, preserving Occam's semantics that execution only continues after all parallel branches complete.
569→
570→### Replicated PAR
571→
572→A replicated `PAR` spawns N concurrent processes using a loop. Each iteration captures the loop variable to avoid closure issues:
573→
574→```occam
575→PAR i = 0 FOR 4
576→ c ! i
577→```
578→
579→Generates:
580→
581→```go
582→var wg sync.WaitGroup
583→wg.Add(int(4))
584→for i := 0; i < 0 + 4; i++ {
585→ i := i // capture loop variable
586→ go func() {
587→ defer wg.Done()
588→ c <- i
589→ }()
590→}
591→wg.Wait()
592→```
593→
594→### Differences and Limitations
595→
596→1. **Scheduling**: Occam on the Transputer had deterministic, priority-based scheduling. Go's goroutine scheduler is preemptive and non-deterministic. Programs that depend on execution order between `PAR` branches may behave differently.
597→
598→2. **Shared memory**: Occam enforces at compile time that parallel processes do not share variables (the "disjointness" rule). The transpiler does not enforce this, so generated Go code may contain data races if the original Occam would have been rejected by a full Occam compiler.
599→
600→3. **PLACED PAR**: Occam's `PLACED PAR` for assigning processes to specific Transputer links or processors is not supported.
601→
602→## How Timers are Mapped
603→
604→Occam's `TIMER` provides access to a hardware clock. The transpiler maps timer operations to Go's `time` package.
605→
606→### Timer Declaration
607→
608→Timer declarations are no-ops in the generated code since Go accesses time through the `time` package directly:
609→
610→```occam
611→TIMER tim:
612→```
613→
614→Generates:
615→
616→```go
617→// TIMER tim
618→```
619→
620→### Reading the Current Time
621→
622→A timer read stores the current time as an integer (microseconds since epoch):
623→
624→```occam
625→TIMER tim:
626→INT t:
627→tim ? t
628→```
629→
630→Generates:
631→
632→```go
633→// TIMER tim
634→var t int
635→t = int(time.Now().UnixMicro())
636→```
637→
638→### Timer Timeouts in ALT
639→
640→Timer cases in ALT allow a process to wait until a deadline. This maps to Go's `time.After` inside a `select`:
641→
642→```occam
643→TIMER tim:
644→INT t:
645→tim ? t
646→ALT
647→ c ? x
648→ process(x)
649→ tim ? AFTER (t + 100000)
650→ handle.timeout()
651→```
652→
653→Generates:
654→
655→```go
656→// TIMER tim
657→var t int
658→t = int(time.Now().UnixMicro())
659→select {
660→case x = <-c:
661→ process(x)
662→case <-time.After(time.Duration((t + 100000) - int(time.Now().UnixMicro())) * time.Microsecond):
663→ handle_timeout()
664→}
665→```
666→
667→The deadline expression `(t + 100000)` represents an absolute time. The generated code computes the remaining duration by subtracting the current time.
668→
669→### AFTER as a Boolean Expression
670→
671→The `AFTER` operator compares two time values and evaluates to `true` if the left operand is later than the right. It maps to `>`:
672→
673→```occam
674→IF
675→ t2 AFTER t1
676→ -- t2 is later
677→```
678→
679→Generates:
680→
681→```go
682→if (t2 > t1) {
683→ // t2 is later
684→}
685→```
686→
687→### Differences and Limitations
688→
689→1. **Clock resolution**: Occam timers are hardware-dependent (often microsecond resolution on the Transputer). The transpiler uses `time.Now().UnixMicro()` for microsecond values, but actual resolution depends on the OS.
690→
691→2. **Guarded timer ALT**: `guard & tim ? AFTER deadline` (timer cases with boolean guards) is not yet supported.
692→
693→3. **Clock wraparound**: Occam's `AFTER` operator handles 32-bit clock wraparound correctly. The transpiler uses a simple `>` comparison, which does not handle wraparound.
package main

import (
	"flag"
	"fmt"
	"os"
	"strings"

	"github.com/codeassociates/occam2go/codegen"
	"github.com/codeassociates/occam2go/lexer"
	"github.com/codeassociates/occam2go/modgen"
	"github.com/codeassociates/occam2go/parser"
	"github.com/codeassociates/occam2go/preproc"
)

const version = "0.1.0"

// multiFlag allows a flag to be specified multiple times (e.g. -I path1 -I path2).
type multiFlag []string

func (f *multiFlag) String() string { return strings.Join(*f, ", ") }
func (f *multiFlag) Set(value string) error {
	*f = append(*f, value)
	return nil
}

func main() {
	// Check for subcommand before parsing flags
	if len(os.Args) >= 2 && os.Args[1] == "gen-module" {
		genModuleCmd(os.Args[2:])
		return
	}

	showVersion := flag.Bool("version", false, "Print version and exit")
	outputFile := flag.String("o", "", "Output file (default: stdout)")
	var includePaths multiFlag
	flag.Var(&includePaths, "I", "Include search path (repeatable)")
	var defines multiFlag
	flag.Var(&defines, "D", "Predefined symbol (repeatable)")

	flag.Usage = func() {
		fmt.Fprintf(os.Stderr, "occam2go - An Occam to Go transpiler\n\n")
		fmt.Fprintf(os.Stderr, "Usage: %s [options] <input.occ>\n", os.Args[0])
		// Keep this line in sync with the usage message in genModuleCmd.
		fmt.Fprintf(os.Stderr, "       %s gen-module [-o output] [-name GUARD] <SConscript>\n\n", os.Args[0])
		fmt.Fprintf(os.Stderr, "Options:\n")
		flag.PrintDefaults()
	}

	flag.Parse()

	if *showVersion {
		fmt.Printf("occam2go version %s\n", version)
		os.Exit(0)
	}

	args := flag.Args()
	if len(args) < 1 {
		flag.Usage()
		os.Exit(1)
	}

	inputFile := args[0]

	// Build defines map
	defs := map[string]string{}
	for _, d := range defines {
		if idx := strings.Index(d, "="); idx >= 0 {
			defs[d[:idx]] = d[idx+1:]
		} else {
			defs[d] = ""
		}
	}

	// Preprocess
	pp := preproc.New(
		preproc.WithIncludePaths(includePaths),
		preproc.WithDefines(defs),
	)
	expanded, err := pp.ProcessFile(inputFile)
	if err != nil {
		fmt.Fprintf(os.Stderr, "Preprocessor error: %s\n", err)
		os.Exit(1)
	}
	if len(pp.Errors()) > 0 {
		fmt.Fprintf(os.Stderr, "Preprocessor warnings:\n")
		for _, e := range pp.Errors() {
			fmt.Fprintf(os.Stderr, "  %s\n", e)
		}
	}

	// Lex
	l := lexer.New(expanded)

	// Parse
	p := parser.New(l)
	program := p.ParseProgram()

	if len(p.Errors()) > 0 {
		fmt.Fprintf(os.Stderr, "Parse errors:\n")
		for _, err := range p.Errors() {
			fmt.Fprintf(os.Stderr, "  %s\n", err)
		}
		os.Exit(1)
	}

	// Generate Go code
	gen := codegen.New()
	output := gen.Generate(program)

	// Write output
	if *outputFile != "" {
		err := os.WriteFile(*outputFile, []byte(output), 0644)
		if err != nil {
			fmt.Fprintf(os.Stderr, "Error writing file: %s\n", err)
			os.Exit(1)
		}
	} else {
		fmt.Print(output)
	}
}

func genModuleCmd(args []string) {
	fs := flag.NewFlagSet("gen-module", flag.ExitOnError)
	outputFile := fs.String("o", "", "Output file (default: stdout)")
	moduleName := fs.String("name", "", "Module guard name (default: derived from library name)")
	fs.Parse(args)

	if fs.NArg() < 1 {
		fmt.Fprintf(os.Stderr, "Usage: occam2go gen-module [-o output] [-name GUARD] <SConscript>\n")
		os.Exit(1)
	}

	sconscriptFile := fs.Arg(0)
	data, err := os.ReadFile(sconscriptFile)
	if err != nil {
		fmt.Fprintf(os.Stderr, "Error reading SConscript: %s\n", err)
		os.Exit(1)
	}

	libs := modgen.ParseSConscript(string(data))
	if len(libs) == 0 {
		fmt.Fprintf(os.Stderr, "No OccamLibrary found in %s\n", sconscriptFile)
		os.Exit(1)
	}

	// Use first library by default
	lib := libs[0]

	// Derive module name from library name if not specified
	guard := *moduleName
	if guard == "" {
		// course.lib → COURSE.MODULE
		name := lib.Name
		if idx := strings.LastIndex(name, "."); idx >= 0 {
			name = name[:idx]
		}
		guard = strings.ToUpper(name) + ".MODULE"
	}

	output := modgen.GenerateModule(lib, guard)

	if *outputFile != "" {
		err := os.WriteFile(*outputFile, []byte(output), 0644)
		if err != nil {
			fmt.Fprintf(os.Stderr, "Error writing file: %s\n", err)
			os.Exit(1)
		}
	} else {
		fmt.Print(output)
	}
}
The issue is requesting documentation of the limitation that gen-module uses regex-based pattern matching to extract information from SConscript files, rather than actually executing the Python code. This means it can only handle simple SConscript files that use straightforward Split('''...''') variable assignments and OccamLibrary() calls, but cannot handle SConscript files with Python logic like for loops.
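As a rough illustration of that limitation, the sketch below runs the two patterns from modgen.go against two small inputs. Both SConscript snippets are hypothetical, written only to show the shape of a declarative file versus a loop-driven one; they are not copied from the KRoC repository, and `summarize` is an illustrative helper name.

```go
package main

import (
	"fmt"
	"regexp"
)

// The same two patterns that modgen.go uses.
var splitVarRe = regexp.MustCompile(`(\w+)\s*=\s*Split\(\s*(?:'''|""")([^'"]*)(?:'''|""")\s*\)`)
var libCallRe = regexp.MustCompile(`(?:local\.)?OccamLibrary\(\s*['"]([^'"]+)['"]\s*,\s*(\w+)\s*(?:,[^)]*?)?\)`)

// Hypothetical declarative SConscript: everything is spelled out literally.
const declarative = `
Import('env')
local = env.Clone()
demo_srcs = Split('''
    a.occ
    b.occ
''')
local.OccamLibrary('demo.lib', demo_srcs, OCCBUILDFLAGS='--include consts.inc')
`

// Hypothetical loop-driven SConscript: the interesting work only happens
// when the Python loop is executed, which pattern matching never does.
const loopDriven = `
Import('env')
for d in ['libsrc', 'examples']:
    SConscript(d + '/SConscript')
`

// summarize counts the Split assignments and OccamLibrary calls the
// patterns can see in the given text.
func summarize(content string) (splits, libs int) {
	splits = len(splitVarRe.FindAllStringSubmatch(content, -1))
	libs = len(libCallRe.FindAllStringSubmatch(content, -1))
	return splits, libs
}

func main() {
	for _, c := range []struct{ name, content string }{
		{"declarative", declarative},
		{"loop-driven", loopDriven},
	} {
		splits, libs := summarize(c.content)
		fmt.Printf("%s: %d Split assignment(s), %d OccamLibrary call(s)\n", c.name, splits, libs)
	}
}
```

For the loop-driven input the patterns find nothing, which in `genModuleCmd` corresponds to the "No OccamLibrary found" error path rather than to any attempt at following the loop.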
The fix is straightforward: add documentation about this limitation. The places to update are:
- The README.md section on "Generating Module Files from KRoC SConscript" (around line 405)
- The modgen package doc comment in modgen.go
- The `modgen/` description in CLAUDE.md
I should add clear notes about these regex-based parsing constraints across all three locations.
The issue asks us to document that gen-module uses regex-based pattern matching rather than actual Python execution, so it only works with simple SConscript files. Here are the places to update:
### Generating Module Files from KRoC SConscript
The KRoC project defines module composition in SConscript (Python) build files. The `gen-module` subcommand parses these to generate `.module` files:
```bash
# Clone the KRoC repository (one-time setup)
./scripts/clone-kroc.sh
# Generate a module file from SConscript
./occam2go gen-module kroc/modules/course/libsrc/SConscript
```
### Generating Module Files from KRoC SConscript
The KRoC project defines module composition in SConscript (Python) build files. The `gen-module` subcommand extracts source file lists from these to generate `.module` files:
```bash
# Clone the KRoC repository (one-time setup)
./scripts/clone-kroc.sh
# Generate a module file from SConscript
./occam2go gen-module kroc/modules/course/libsrc/SConscript
```
**Limitation:** `gen-module` uses regex-based pattern matching to extract `Split('''...''')` variable assignments and `OccamLibrary()` calls from SConscript files. It does not execute the Python code. This means it works with simple, declarative SConscript files (like `modules/course/libsrc/SConscript`) but cannot handle files that rely on Python control flow such as loops or conditionals (like the top-level `modules/course/SConscript`).
The file /home/david/projects/code-associates/occam2go/README.md has been updated successfully.
// Package modgen generates .module files from KRoC SConscript build files.
// It parses the Python-based SConscript to extract source file lists and
// OccamLibrary calls, then produces an occam module file with include guards.
// Package modgen generates .module files from KRoC SConscript build files.
// It uses regex-based pattern matching to extract Split('''...''') variable
// assignments and OccamLibrary() calls, then produces an occam module file
// with include guards.
//
// Note: This does not execute the Python code in SConscript files, so it only
// works with simple, declarative SConscript files. Files that use Python
// control flow (loops, conditionals, etc.) are not supported.
The file /home/david/projects/code-associates/occam2go/modgen/modgen.go has been updated successfully.
6. **`modgen/`** — Generates `.module` files from KRoC SConscript build files. Parses Python-based SConscript to extract source lists and `OccamLibrary` calls.
- `modgen.go` — SConscript parser and module file generator
6. **`modgen/`** — Generates `.module` files from KRoC SConscript build files. Uses regex-based pattern matching (not Python execution) to extract `Split('''...''')` source lists and `OccamLibrary` calls. Only works with simple, declarative SConscript files; files using Python control flow (loops, conditionals) are not supported.
- `modgen.go` — SConscript parser and module file generator
The file /home/david/projects/code-associates/occam2go/CLAUDE.md has been updated successfully.
Here's a summary of the changes I made to address issue #37:
README.md — Added a "Limitation" note to the "Generating Module Files from KRoC SConscript" section explaining that gen-module uses regex-based pattern matching (not Python execution), works with simple declarative SConscript files, and cannot handle files with Python control flow like loops.
modgen/modgen.go — Updated the package doc comment to clarify the regex-based approach and its limitation.
CLAUDE.md — Updated the modgen/ description to mention the regex-based approach and its scope.
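To make the documented limitation concrete, here is a minimal Go sketch of regex-based extraction. It is not the actual `modgen` implementation: the `splitAssign` pattern, the `extractSplitLists` helper, and both input strings are hypothetical and exist only for illustration. It shows that a pattern match can recover a file list that is written out literally, but finds nothing when the values would only exist after Python runs a loop.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// splitAssign matches assignments of the form: name = Split('''a b c''').
// (?s) lets "." span newlines so multi-line triple-quoted lists are captured.
// Hypothetical pattern for illustration; the real modgen regex may differ.
var splitAssign = regexp.MustCompile(`(?s)(\w+)\s*=\s*Split\('''(.*?)'''\)`)

// extractSplitLists is a hypothetical helper (not the real modgen code).
// It returns each Split variable's whitespace-separated entries, keyed by
// variable name.
func extractSplitLists(content string) map[string][]string {
	out := make(map[string][]string)
	for _, m := range splitAssign.FindAllStringSubmatch(content, -1) {
		out[m[1]] = strings.Fields(m[2])
	}
	return out
}

func main() {
	// Declarative style: the file list is literally present in the text.
	// (Made-up input, not copied from the KRoC repository.)
	declarative := "srcs = Split('''\n    utils.occ\n    string.occ\n''')\n"

	// Control-flow style: the values only exist once Python runs the loop,
	// so there is nothing for a pattern match to find. (Also made-up input.)
	withLoop := "for d in ['libsrc', 'examples']:\n    SConscript(d + '/SConscript')\n"

	fmt.Println(extractSplitLists(declarative)["srcs"]) // [utils.occ string.occ]
	fmt.Println(len(extractSplitLists(withLoop)))       // 0
}
```

Handling the second input would require evaluating the Python, which is the extra effort the issue describes as probably not worth it.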
Want me to commit these changes and/or create a PR?
On branch main
Your branch is up to date with 'origin/main'.
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
	modified:   CLAUDE.md
	modified:   README.md
	modified:   modgen/modgen.go
no changes added to commit (use "git add" and/or "git commit -a")
diff --git a/CLAUDE.md b/CLAUDE.md
index b075068..b921039 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -51,7 +51,7 @@ Six packages, one pipeline:
- `codegen_test.go` — Unit tests (transpile, check output strings)
- `e2e_test.go` — End-to-end tests (transpile → `go build` → execute → check stdout)
-6. **`modgen/`** — Generates `.module` files from KRoC SConscript build files. Parses Python-based SConscript to extract source lists and `OccamLibrary` calls.
+6. **`modgen/`** — Generates `.module` files from KRoC SConscript build files. Uses regex-based pattern matching (not Python execution) to extract `Split('''...''')` source lists and `OccamLibrary` calls. Only works with simple, declarative SConscript files; files using Python control flow (loops, conditionals) are not supported.
- `modgen.go` — SConscript parser and module file generator
7. **`main.go`** — CLI entry point wiring the pipeline together
diff --git a/README.md b/README.md
index 165288b..9b9c626 100644
--- a/README.md
+++ b/README.md
@@ -404,7 +404,7 @@ A working example is provided in `examples/include_demo.occ` with `examples/math
### Generating Module Files from KRoC SConscript
-The KRoC project defines module composition in SConscript (Python) build files. The `gen-module` subcommand parses these to generate `.module` files:
+The KRoC project defines module composition in SConscript (Python) build files. The `gen-module` subcommand extracts source file lists from these to generate `.module` files:
```bash
# Clone the KRoC repository (one-time setup)
@@ -414,6 +414,8 @@ The KRoC project defines module composition in SConscript (Python) build files.
./occam2go gen-module kroc/modules/course/libsrc/SConscript
```
+**Limitation:** `gen-module` uses regex-based pattern matching to extract `Split('''...''')` variable assignments and `OccamLibrary()` calls from SConscript files. It does not execute the Python code. This means it works with simple, declarative SConscript files (like `modules/course/libsrc/SConscript`) but cannot handle files that rely on Python control flow such as loops or conditionals (like the top-level `modules/course/SConscript`).
+
This outputs:
```
#IF NOT (DEFINED (COURSE.MODULE))
diff --git a/modgen/modgen.go b/modgen/modgen.go
index f630cbf..ed38051 100644
--- a/modgen/modgen.go
+++ b/modgen/modgen.go
@@ -1,6 +1,11 @@
// Package modgen generates .module files from KRoC SConscript build files.
-// It parses the Python-based SConscript to extract source file lists and
-// OccamLibrary calls, then produces an occam module file with include guards.
+// It uses regex-based pattern matching to extract Split('''...''') variable
+// assignments and OccamLibrary() calls, then produces an occam module file
+// with include guards.
+//
+// Note: This does not execute the Python code in SConscript files, so it only
+// works with simple, declarative SConscript files. Files that use Python
+// control flow (loops, conditionals, etc.) are not supported.
package modgen
import (
e8c9709 Merge pull request #36 from codeassociates/fix/readme-course-module-step
25a045a Add missing gen-module step to course module instructions
2012236 Merge pull request #34 from codeassociates/fix/readme-kroc-link
e19bcca Fix kroc course module link in README
b8b1c8c Merge pull request #32 from codeassociates/dboreham/readme-fix
git add README.md CLAUDE.md modgen/modgen.go && git commit -m "$(cat <<'EOF'
Document gen-module regex-based limitation
Clarifies that gen-module uses regex pattern matching (not Python execution) to extract source lists from SConscript files, so it only works with simple declarative files and cannot handle Python control flow.
Closes #37
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
EOF
)"
3 files changed, 11 insertions(+), 4 deletions(-)
remote:
remote: Create a pull request for 'docs/gen-module-limitation' on GitHub by visiting:
remote:      https://github.com/codeassociates/occam2go/pull/new/docs/gen-module-limitation
remote:
To associate-1.github.com:codeassociates/occam2go.git
 * [new branch]      docs/gen-module-limitation -> docs/gen-module-limitation
branch 'docs/gen-module-limitation' set up to track 'origin/docs/gen-module-limitation'.
gh pr create --title "Document gen-module regex-based limitation" --body "$(cat <<'EOF'
## Summary
- Clarifies that `gen-module` uses regex-based pattern matching (not Python execution) to extract `Split('''...''')` assignments and `OccamLibrary()` calls from SConscript files
- Documents that it only works with simple, declarative SConscript files and cannot handle Python control flow (loops, conditionals)
- Updates README.md, CLAUDE.md, and the `modgen` package doc comment
Closes #37
## Test plan
- [x] Documentation-only change, no code behavior modified
- [x] `go build` succeeds
🤖 Generated with [Claude Code](https://claude.com/claude-code)
EOF
)"
Switched to branch 'main'
Your branch is up to date with 'origin/main'.
From associate-1.github.com:codeassociates/occam2go
   e8c9709..d6b665f  main       -> origin/main
Updating e8c9709..d6b665f
Fast-forward
 CLAUDE.md        | 2 +-
 README.md        | 4 +++-
 modgen/modgen.go | 9 +++++++--
 3 files changed, 11 insertions(+), 4 deletions(-)