Claude Code transcript

Please take a look at issue #37.

$ Bash
View GitHub issue #37
gh issue view 37
title:	gen-module isn't general purpose
state:	OPEN
author:	dboreham
labels:	
comments:	0
assignees:	
projects:	
milestone:	
number:	37
--
I noticed that the `gen-module` subcommand is documented as being able to process kroc module/library definitions via their `SConscript` files, which are Python scripts. However, I think that the current implementation doesn't actually have the ability to process arbitrary Python scripts. Rather it knows enough about the SConscript file in the course module's libsrc subdirectory to parse it and extract the referenced Occam source files. For example, if it were given the "root" course module SConscript file (https://github.com/concurrency/kroc/blob/master/modules/course/SConscript) it wouldn't succeed in processing it because to do so requires executing the Python for loop here: https://github.com/concurrency/kroc/blob/master/modules/course/SConscript#L5

Although having the capability to process arbitrary SConscript files might be nice, I think the implementation approach taken is probably ok given the additional effort that would be required. But I think this tradeoff/limitation should be documented to avoid user confusion.
Read
{
  "file_path": "/home/david/projects/code-associates/occam2go/modgen/modgen.go"
}
     1→// Package modgen generates .module files from KRoC SConscript build files.
     2→// It parses the Python-based SConscript to extract source file lists and
     3→// OccamLibrary calls, then produces an occam module file with include guards.
     4→package modgen
     5→
     6→import (
     7→	"fmt"
     8→	"regexp"
     9→	"strings"
    10→)
    11→
    12→// Library represents an OccamLibrary extracted from SConscript.
    13→type Library struct {
    14→	Name     string   // e.g. "course.lib"
    15→	Sources  []string // source files
    16→	Includes []string // --include files from OCCBUILDFLAGS
    17→	Needs    []string // --need dependencies from OCCBUILDFLAGS
    18→}
    19→
    20→// ParseSConscript parses a SConscript file's content and extracts library definitions.
    21→func ParseSConscript(content string) []Library {
    22→	vars := extractSplitVars(content)
    23→	return extractLibraries(content, vars)
    24→}
    25→
    26→// GenerateModule creates a .module file from a Library.
    27→// moduleName is the guard symbol (e.g. "COURSE.MODULE").
    28→func GenerateModule(lib Library, moduleName string) string {
    29→	var b strings.Builder
    30→	fmt.Fprintf(&b, "#IF NOT (DEFINED (%s))\n", moduleName)
    31→	fmt.Fprintf(&b, "#DEFINE %s\n", moduleName)
    32→
    33→	// Include files from OCCBUILDFLAGS first
    34→	for _, inc := range lib.Includes {
    35→		fmt.Fprintf(&b, "#INCLUDE \"%s\"\n", inc)
    36→	}
    37→
    38→	// Then source files
    39→	for _, src := range lib.Sources {
    40→		fmt.Fprintf(&b, "#INCLUDE \"%s\"\n", src)
    41→	}
    42→
    43→	b.WriteString("#ENDIF\n")
    44→	return b.String()
    45→}
    46→
    47→// splitVarRe matches: varname = Split('''...''') or Split("""...""")
    48→var splitVarRe = regexp.MustCompile(`(\w+)\s*=\s*Split\(\s*(?:'''|""")([^'"]*)(?:'''|""")\s*\)`)
    49→
    50→// extractSplitVars finds all variable = Split('''...''') assignments.
    51→func extractSplitVars(content string) map[string][]string {
    52→	vars := map[string][]string{}
    53→	for _, m := range splitVarRe.FindAllStringSubmatch(content, -1) {
    54→		name := m[1]
    55→		files := splitWhitespace(m[2])
    56→		vars[name] = files
    57→	}
    58→	return vars
    59→}
    60→
    61→// libCallRe matches OccamLibrary calls (both direct and via return).
    62→// Captures: library name, source variable, and optional OCCBUILDFLAGS.
    63→var libCallRe = regexp.MustCompile(
    64→	`(?:local\.)?OccamLibrary\(\s*` +
    65→		`['"]([^'"]+)['"]\s*,\s*` + // library name
    66→		`(\w+)\s*` + // source variable
    67→		`(?:,[^)]*?)?\)`, // optional extra args
    68→)
    69→
    70→// flagsRe extracts OCCBUILDFLAGS value.
    71→var flagsRe = regexp.MustCompile(`OCCBUILDFLAGS\s*=\s*['"]([^'"]+)['"]`)
    72→
    73→func extractLibraries(content string, vars map[string][]string) []Library {
    74→	var libs []Library
    75→
    76→	for _, m := range libCallRe.FindAllStringSubmatch(content, -1) {
    77→		lib := Library{
    78→			Name: m[1],
    79→		}
    80→
    81→		srcVar := m[2]
    82→		if files, ok := vars[srcVar]; ok {
    83→			lib.Sources = files
    84→		}
    85→
    86→		// Look for OCCBUILDFLAGS in the full match
    87→		fullMatch := m[0]
    88→		if fm := flagsRe.FindStringSubmatch(fullMatch); fm != nil {
    89→			parseFlags(fm[1], &lib)
    90→		}
    91→
    92→		libs = append(libs, lib)
    93→	}
    94→
    95→	return libs
    96→}
    97→
    98→// parseFlags extracts --include and --need values from OCCBUILDFLAGS.
    99→func parseFlags(flags string, lib *Library) {
   100→	parts := strings.Fields(flags)
   101→	for i := 0; i < len(parts); i++ {
   102→		switch parts[i] {
   103→		case "--include":
   104→			if i+1 < len(parts) {
   105→				lib.Includes = append(lib.Includes, parts[i+1])
   106→				i++
   107→			}
   108→		case "--need":
   109→			if i+1 < len(parts) {
   110→				lib.Needs = append(lib.Needs, parts[i+1])
   111→				i++
   112→			}
   113→		}
   114→	}
   115→}
   116→
   117→func splitWhitespace(s string) []string {
   118→	var result []string
   119→	for _, f := range strings.Fields(s) {
   120→		if f != "" {
   121→			result = append(result, f)
   122→		}
   123→	}
   124→	return result
   125→}
   126→
Thinking

The issue asks for the limitation to be documented: gen-module handles only simple SConscript files, because it pattern-matches them with regular expressions instead of executing the Python. Scripts that rely on loops or other logic are therefore not processed.

Now let me check the README and CLI help for how gen-module is currently documented.
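
To make the limitation concrete, here is a minimal sketch, assuming simplified copies of the two patterns from modgen.go. The helper name `sourcesByLibrary` and both input scripts are hypothetical and are not copied from the KRoC repository; the loop script only imitates the kind of `for` loop the issue points at.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Simplified copies of the patterns in modgen.go (assumption: the flags
// handling is left out because it does not affect the point being made).
var splitVarRe = regexp.MustCompile(`(\w+)\s*=\s*Split\(\s*(?:'''|""")([^'"]*)(?:'''|""")\s*\)`)
var libCallRe = regexp.MustCompile(`(?:local\.)?OccamLibrary\(\s*['"]([^'"]+)['"]\s*,\s*(\w+)\s*(?:,[^)]*?)?\)`)

// sourcesByLibrary is a hypothetical helper that maps each library name
// found by the regexes to the source files of its Split variable.
func sourcesByLibrary(content string) map[string][]string {
	vars := map[string][]string{}
	for _, m := range splitVarRe.FindAllStringSubmatch(content, -1) {
		vars[m[1]] = strings.Fields(m[2])
	}
	libs := map[string][]string{}
	for _, m := range libCallRe.FindAllStringSubmatch(content, -1) {
		libs[m[1]] = vars[m[2]]
	}
	return libs
}

// flatScript is a hypothetical SConscript in the shape the regexes expect:
// a literal Split assignment followed by a literal OccamLibrary call.
const flatScript = `
srcs = Split('''
    utils.occ
    string.occ
''')
local.OccamLibrary('demo.lib', srcs, OCCBUILDFLAGS='--include consts.inc')
`

// loopScript is a hypothetical SConscript whose content only exists once
// the Python loop is executed, which the regex approach never does.
const loopScript = `
for d in ['libsrc', 'examples']:
    SConscript(d + '/SConscript')
`

func main() {
	flat := sourcesByLibrary(flatScript)
	fmt.Println("flat script libraries:", len(flat), flat["demo.lib"])
	fmt.Println("loop script libraries:", len(sourcesByLibrary(loopScript)))
}
```

The flat script yields one library with two sources, while the loop script yields none, which is the case where the CLI reports that no OccamLibrary was found.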

Read
{
  "file_path": "/home/david/projects/code-associates/occam2go/README.md"
}
     1→# Preamble
     2→
     3→Greetings humans. This paragraph is the only part of this repo generated by a human. Everything else (code, PRs, CI) was created by LLMs (so far Claude Code Opus 4.5 and 4.6).
     4→My goal as the human in this endeavour was to see if I could refute posts I saw after Anthropic publicised their project to have Claude write a C compiler. I noticed many comments to the effect
     5→"well that's easy because there's many C compilers in the training data". I had the idea: Occam is so old that there's basically nothing about it on the modern internet.
     6→I had also had a long-standing curiosity about the connection between Occam and go(lang) that made me curious as to whether you could build an Occam to golang transpiler.
     7→This project is the result: scratching two itches -- can an LLM create a working compiler for a language that doesn't show up in its training data, and can one transpile Occam to Go.
     8→Without Claude I'd never have the time to scratch that second itch. And...over to Claude:
     9→
    10→# occam2go
    11→
    12→A transpiler from Occam to Go, written in Go.
    13→
    14→Occam was developed in the 1980s to support concurrent programming on the Transputer. Go, created decades later, shares similar CSP-influenced concurrency primitives. This transpiler bridges the two.
    15→
    16→## Building
    17→
    18→```bash
    19→go build -o occam2go
    20→```
    21→
    22→## Usage
    23→
    24→```bash
    25→./occam2go [options] <input.occ>
    26→./occam2go gen-module [-o output] [-name GUARD] <SConscript>
    27→```
    28→
    29→Options:
    30→- `-o <file>` - Write output to file (default: stdout)
    31→- `-I <path>` - Include search path for `#INCLUDE` resolution (repeatable)
    32→- `-D <SYMBOL>` - Predefined preprocessor symbol (repeatable, supports `SYMBOL=value`)
    33→- `-version` - Print version and exit
    34→
    35→## Running an Example
    36→
    37→Here's how to transpile, compile, and run an Occam program:
    38→
    39→```bash
    40→# 1. Build the transpiler (only needed once)
    41→go build -o occam2go
    42→
    43→# 2. Transpile an Occam file to Go
    44→./occam2go -o output.go examples/print.occ
    45→
    46→# 3. Compile the generated Go code
    47→go build -o output output.go
    48→
    49→# 4. Run the compiled program
    50→./output
    51→```
    52→
    53→Or as a one-liner to see the output immediately:
    54→
    55→```bash
    56→./occam2go -o output.go examples/print.occ && go run output.go
    57→```
    58→
    59→## Example
    60→
    61→Input (`example.occ`):
    62→```occam
    63→SEQ
    64→  INT x, y:
    65→  PAR
    66→    x := 1
    67→    y := 2
    68→  x := x + y
    69→```
    70→
    71→Output:
    72→```go
    73→package main
    74→
    75→import (
    76→	"sync"
    77→)
    78→
    79→func main() {
    80→	var x, y int
    81→	var wg sync.WaitGroup
    82→	wg.Add(2)
    83→	go func() {
    84→		defer wg.Done()
    85→		x = 1
    86→	}()
    87→	go func() {
    88→		defer wg.Done()
    89→		y = 2
    90→	}()
    91→	wg.Wait()
    92→	x = (x + y)
    93→}
    94→```
    95→
    96→## Codebase Size
    97→
    98→~16,200 lines of code (excluding the vendored `kroc/` subtree).
    99→
   100→| Category | Lines |
   101→|----------|------:|
   102→| Production code | 7,300 |
   103→| Tests | 8,800 |
   104→| Occam examples | 135 |
   105→| **Total** | **~16,200** |
   106→
   107→Largest files:
   108→
   109→| File | Lines |
   110→|------|------:|
   111→| `parser/parser_test.go` | 3,124 |
   112→| `parser/parser.go` | 2,811 |
   113→| `codegen/codegen.go` | 2,617 |
   114→| `codegen/codegen_test.go` | 669 |
   115→| `ast/ast.go` | 520 |
   116→| `lexer/lexer.go` | 448 |
   117→| `preproc/preproc_test.go` | 437 |
   118→| E2E test files (18 files) | 3,826 |
   119→
   120→## Implemented Features
   121→
   122→See [TODO.md](TODO.md) for the full implementation status and roadmap.
   123→
   124→| Occam | Go |
   125→|-------|-----|
   126→| `INT`, `BYTE`, `BOOL`, `REAL` | `int`, `byte`, `bool`, `float64` |
   127→| `SEQ` | Sequential code |
   128→| `PAR` | Goroutines with `sync.WaitGroup` |
   129→| `IF` | `if / else if` |
   130→| `WHILE` | `for` loop |
   131→| `STOP` | Print to stderr + `select {}` (deadlock) |
   132→| `PROC` with `VAL` params | Functions with value/pointer params |
   133→| `:=` assignment | `=` assignment |
   134→| Arithmetic: `+`, `-`, `*`, `/`, `\` | `+`, `-`, `*`, `/`, `%` |
   135→| Comparison: `=`, `<>`, `<`, `>`, `<=`, `>=` | `==`, `!=`, `<`, `>`, `<=`, `>=` |
   136→| Logic: `AND`, `OR`, `NOT` | `&&`, `\|\|`, `!` |
   137→| Bitwise: `/\`, `\/`, `><`, `~` | `&`, `\|`, `^`, `^` (AND, OR, XOR, NOT) |
   138→| Shifts: `<<`, `>>` | `<<`, `>>` |
   139→| Type conversions: `INT x`, `BYTE n` | `int(x)`, `byte(n)` |
   140→
   141→### Channels
   142→
   143→| Occam | Go |
   144→|-------|-----|
   145→| `CHAN OF INT c:` | `c := make(chan int)` |
   146→| `c ! x` (send) | `c <- x` |
   147→| `c ? y` (receive) | `y = <-c` |
   148→| `[5]CHAN OF INT cs:` | `cs := make([]chan int, 5)` + init loop |
   149→| `cs[i] ! x` (indexed send) | `cs[i] <- x` |
   150→| `cs[i] ? y` (indexed receive) | `y = <-cs[i]` |
   151→| `PROC f([]CHAN OF INT cs)` | `func f(cs []chan int)` |
   152→
   153→Example:
   154→```occam
   155→SEQ
   156→  CHAN OF INT c:
   157→  INT result:
   158→  PAR
   159→    c ! 42
   160→    c ? result
   161→  print.int(result)
   162→```
   163→
   164→Channel array example:
   165→```occam
   166→SEQ
   167→  [3]CHAN OF INT cs:
   168→  INT sum:
   169→  sum := 0
   170→  PAR
   171→    PAR i = 0 FOR 3
   172→      cs[i] ! (i + 1) * 10
   173→    SEQ i = 0 FOR 3
   174→      INT x:
   175→      cs[i] ? x
   176→      sum := sum + x
   177→  print.int(sum)
   178→```
   179→
   180→### Protocols
   181→
   182→Protocols define the type of data carried on a channel. Three forms are supported:
   183→
   184→| Occam | Go |
   185→|-------|-----|
   186→| `PROTOCOL SIG IS INT` | `type _proto_SIG = int` |
   187→| `PROTOCOL PAIR IS INT ; BYTE` | `type _proto_PAIR struct { _0 int; _1 byte }` |
   188→| `PROTOCOL MSG CASE tag; INT ...` | Interface + concrete structs per tag |
   189→| `c ! 42 ; 65` (sequential send) | `c <- _proto_PAIR{42, 65}` |
   190→| `c ? x ; y` (sequential recv) | `_tmp := <-c; x = _tmp._0; y = _tmp._1` |
   191→| `c ! tag ; val` (variant send) | `c <- _proto_MSG_tag{val}` |
   192→| `c ? CASE ...` (variant recv) | `switch _v := (<-c).(type) { ... }` |
   193→
   194→Sequential protocol example:
   195→```occam
   196→PROTOCOL PAIR IS INT ; INT
   197→
   198→SEQ
   199→  CHAN OF PAIR c:
   200→  INT x, y:
   201→  PAR
   202→    c ! 10 ; 20
   203→    c ? x ; y
   204→  print.int(x + y)
   205→```
   206→
   207→Variant protocol example:
   208→```occam
   209→PROTOCOL MSG
   210→  CASE
   211→    data; INT
   212→    quit
   213→
   214→SEQ
   215→  CHAN OF MSG c:
   216→  INT result:
   217→  PAR
   218→    c ! data ; 42
   219→    c ? CASE
   220→      data ; result
   221→        print.int(result)
   222→      quit
   223→        SKIP
   224→```
   225→
   226→### Records
   227→
   228→| Occam | Go |
   229→|-------|-----|
   230→| `RECORD POINT` with `INT x:` `INT y:` | `type POINT struct { x int; y int }` |
   231→| `POINT p:` | `var p POINT` |
   232→| `p[x] := 10` | `p.x = 10` |
   233→| `p[x]` (in expression) | `p.x` |
   234→| `PROC foo(POINT p)` (ref) | `func foo(p *POINT)` |
   235→| `PROC foo(VAL POINT p)` (val) | `func foo(p POINT)` |
   236→
   237→Example:
   238→```occam
   239→RECORD POINT
   240→  INT x:
   241→  INT y:
   242→
   243→SEQ
   244→  POINT p:
   245→  p[x] := 10
   246→  p[y] := 20
   247→  print.int(p[x] + p[y])
   248→```
   249→
   250→### Arrays
   251→
   252→| Occam | Go |
   253→|-------|-----|
   254→| `[5]INT arr:` | `arr := make([]int, 5)` |
   255→| `arr[i] := x` | `arr[i] = x` |
   256→| `x := arr[i]` | `x = arr[i]` |
   257→
   258→Example:
   259→```occam
   260→SEQ
   261→  [5]INT arr:
   262→  SEQ i = 0 FOR 5
   263→    arr[i] := (i + 1) * 10
   264→  INT sum:
   265→  sum := 0
   266→  SEQ i = 0 FOR 5
   267→    sum := sum + arr[i]
   268→  print.int(sum)
   269→```
   270→
   271→### ALT (Alternation)
   272→
   273→| Occam | Go |
   274→|-------|-----|
   275→| `ALT` | `select` |
   276→| `guard & c ? x` | Conditional channel with nil pattern |
   277→| `SEQ i = 0 FOR n` | `for i := 0; i < n; i++` |
   278→| `PAR i = 0 FOR n` | Parallel `for` loop with goroutines |
   279→
   280→Example:
   281→```occam
   282→ALT
   283→  c1 ? x
   284→    print.int(x)
   285→  c2 ? y
   286→    print.int(y)
   287→```
   288→
   289→Generates:
   290→```go
   291→select {
   292→case x = <-c1:
   293→    fmt.Println(x)
   294→case y = <-c2:
   295→    fmt.Println(y)
   296→}
   297→```
   298→
   299→ALT with guards (optional boolean conditions):
   300→```occam
   301→ALT
   302→  enabled & c1 ? x
   303→    process(x)
   304→  TRUE & c2 ? y
   305→    process(y)
   306→```
   307→
   308→### Replicators
   309→
   310→Replicators allow you to repeat a block of code a specified number of times.
   311→
   312→| Occam | Go |
   313→|-------|-----|
   314→| `SEQ i = 0 FOR n` | `for i := 0; i < n; i++` |
   315→| `PAR i = 0 FOR n` | Parallel for loop with goroutines |
   316→
   317→Example with replicated SEQ:
   318→```occam
   319→SEQ i = 1 FOR 5
   320→  print.int(i)
   321→```
   322→
   323→This prints 1, 2, 3, 4, 5.
   324→
   325→Example with replicated PAR (spawns n concurrent processes):
   326→```occam
   327→PAR i = 0 FOR 4
   328→  c ! i
   329→```
   330→
   331→### Built-in I/O Procedures
   332→
   333→| Occam | Go |
   334→|-------|-----|
   335→| `print.int(x)` | `fmt.Println(x)` |
   336→| `print.bool(x)` | `fmt.Println(x)` |
   337→| `print.string(x)` | `fmt.Println(x)` |
   338→| `print.newline()` | `fmt.Println()` |
   339→
   340→## Preprocessor and Modules
   341→
   342→Occam programs use `#INCLUDE` to import library modules. The transpiler includes a textual preprocessor that runs before lexing, handling conditional compilation and file inclusion.
   343→
   344→### Preprocessor Directives
   345→
   346→| Directive | Description |
   347→|-----------|-------------|
   348→| `#INCLUDE "file"` | Textually include a file (resolved relative to current file, then `-I` paths) |
   349→| `#DEFINE SYMBOL` | Define a preprocessor symbol |
   350→| `#IF expr` | Conditional compilation (`TRUE`, `FALSE`, `DEFINED (SYM)`, `NOT`, `(SYM = val)`) |
   351→| `#ELSE` | Alternative branch |
   352→| `#ENDIF` | End conditional block |
   353→| `#COMMENT`, `#PRAGMA`, `#USE` | Ignored (replaced with blank lines to preserve line numbers) |
   354→
   355→The predefined symbol `TARGET.BITS.PER.WORD` is set to `64` (Go always uses 64-bit integers).
   356→
   357→### Using Modules with `#INCLUDE`
   358→
   359→Create a module file with include guards to prevent double-inclusion:
   360→
   361→```occam
   362→-- mathlib.module
   363→#IF NOT (DEFINED (MATHLIB.MODULE))
   364→#DEFINE MATHLIB.MODULE
   365→
   366→INT FUNCTION abs(VAL INT x)
   367→  INT result:
   368→  VALOF
   369→    IF
   370→      x < 0
   371→        result := 0 - x
   372→      TRUE
   373→        result := x
   374→    RESULT result
   375→
   376→#ENDIF
   377→```
   378→
   379→Then include it in your program:
   380→
   381→```occam
   382→-- main.occ
   383→#INCLUDE "mathlib.module"
   384→
   385→SEQ
   386→  print.int(abs(0 - 42))
   387→```
   388→
   389→Transpile with `-I` to specify where to find the module:
   390→
   391→```bash
   392→./occam2go -I examples -o main.go examples/include_demo.occ
   393→go run main.go
   394→```
   395→
   396→Output:
   397→```
   398→42
   399→20
   400→10
   401→```
   402→
   403→A working example is provided in `examples/include_demo.occ` with `examples/mathlib.module`.
   404→
   405→### Generating Module Files from KRoC SConscript
   406→
   407→The KRoC project defines module composition in SConscript (Python) build files. The `gen-module` subcommand parses these to generate `.module` files:
   408→
   409→```bash
   410→# Clone the KRoC repository (one-time setup)
   411→./scripts/clone-kroc.sh
   412→
   413→# Generate a module file from SConscript
   414→./occam2go gen-module kroc/modules/course/libsrc/SConscript
   415→```
   416→
   417→This outputs:
   418→```
   419→#IF NOT (DEFINED (COURSE.MODULE))
   420→#DEFINE COURSE.MODULE
   421→#INCLUDE "consts.inc"
   422→#INCLUDE "utils.occ"
   423→#INCLUDE "string.occ"
   424→#INCLUDE "demo_cycles.occ"
   425→#INCLUDE "demo_nets.occ"
   426→#INCLUDE "file_in.occ"
   427→#INCLUDE "float_io.occ"
   428→#INCLUDE "random.occ"
   429→#ENDIF
   430→```
   431→
   432→### Running Programs with the Course Module
   433→
   434→The KRoC [course module](https://github.com/concurrency/kroc/tree/master/modules/course) is a standard occam library providing I/O utilities (`out.string`, `out.int`, `out.repeat`, etc.) for character-level communication over byte channels. The transpiler fully supports it.
   435→
   436→Occam programs that follow the standard entry point pattern — a PROC with three `CHAN BYTE` parameters `(keyboard?, screen!, error!)` — automatically get a generated `main()` that wires stdin, stdout, and stderr to channels.
   437→
   438→```bash
   439→# 1. Clone the KRoC repository (one-time setup)
   440→./scripts/clone-kroc.sh
   441→
   442→# 2. Build the transpiler
   443→go build -o occam2go
   444→
   445→# 3. Generate the course module file from the KRoC SConscript (one-time setup)
   446→./occam2go gen-module -o kroc/modules/course/libsrc/course.module \
   447→                         kroc/modules/course/libsrc/SConscript
   448→
   449→# 4. Transpile an example that uses the course module
   450→./occam2go -I kroc/modules/course/libsrc \
   451→           -D TARGET.BITS.PER.WORD=32     \
   452→           -o hello.go examples/course_hello.occ
   453→
   454→# 5. Run it
   455→go run hello.go
   456→```
   457→
   458→Output:
   459→```
   460→Hello from occam2go!
   461→The answer is: 42
   462→------------------------------
   463→Counting: 1, 2, 3, 4, 5
   464→```
   465→
   466→The `-I` flag tells the preprocessor where to find the course module source files, and `-D TARGET.BITS.PER.WORD=32` sets the word size expected by the course module (the transpiler defaults to 64).
   467→
   468→The example program (`examples/course_hello.occ`):
   469→```occam
   470→#INCLUDE "course.module"
   471→
   472→PROC hello (CHAN BYTE keyboard?, screen!, error!)
   473→  SEQ
   474→    out.string ("Hello from occam2go!*c*n", 0, screen!)
   475→    out.string ("The answer is: ", 0, screen!)
   476→    out.int (42, 0, screen!)
   477→    out.string ("*c*n", 0, screen!)
   478→    out.repeat ('-', 30, screen!)
   479→    out.string ("*c*n", 0, screen!)
   480→    out.string ("Counting: ", 0, screen!)
   481→    SEQ i = 1 FOR 5
   482→      SEQ
   483→        IF
   484→          i > 1
   485→            out.string (", ", 0, screen!)
   486→          TRUE
   487→            SKIP
   488→        out.int (i, 0, screen!)
   489→    out.string ("*c*n", 0, screen!)
   490→:
   491→```
   492→
   493→You can also transpile the KRoC examples directly:
   494→```bash
   495→./occam2go -I kroc/modules/course/libsrc \
   496→           -D TARGET.BITS.PER.WORD=32     \
   497→           -o hello_world.go kroc/modules/course/examples/hello_world.occ
   498→go run hello_world.go
   499→```
   500→
   501→## How Channels are Mapped
   502→
   503→Both Occam and Go draw from Tony Hoare's Communicating Sequential Processes (CSP) model, making channel communication a natural fit for transpilation.
   504→
   505→### Conceptual Mapping
   506→
   507→In Occam, channels are the primary mechanism for communication between parallel processes. A channel is a synchronous, unbuffered, point-to-point connection. Go channels share these characteristics by default.
   508→
   509→| Concept | Occam | Go |
   510→|---------|-------|-----|
   511→| Declaration | `CHAN OF INT c:` | `c := make(chan int)` |
   512→| Send (blocks until receiver ready) | `c ! value` | `c <- value` |
   513→| Receive (blocks until sender ready) | `c ? variable` | `variable = <-c` |
   514→| Synchronisation | Implicit in `!` and `?` | Implicit in `<-` |
   515→
   516→### Synchronous Communication
   517→
   518→Both languages use synchronous (rendezvous) communication by default:
   519→
   520→```occam
   521→PAR
   522→  c ! 42      -- blocks until receiver is ready
   523→  c ? x       -- blocks until sender is ready
   524→```
   525→
   526→The sender and receiver must both be ready before the communication occurs. This is preserved in the generated Go code, where unbuffered channels have the same semantics.
   527→
   528→### Differences and Limitations
   529→
   530→1. **Channel direction**: Occam channels are inherently unidirectional. Go channels can be bidirectional but can be restricted using types (`chan<-` for send-only, `<-chan` for receive-only). The transpiler currently generates bidirectional Go channels.
   531→
   532→2. **Protocol types**: Simple, sequential, and variant protocols are supported. Nested protocols (protocols referencing other protocols) are not yet supported.
   533→
   534→3. **Channel arrays**: Channel arrays (`[n]CHAN OF TYPE`) are supported, including indexed send/receive, `[]CHAN OF TYPE` proc params, and ALT with indexed channels.
   535→
   536→4. **ALT construct**: Occam's `ALT` maps to Go's `select` statement. Basic ALT, guards, and timer timeouts are supported. Priority ALT (`PRI ALT`) and replicated ALT are not yet implemented.
   537→
   538→## How PAR is Mapped
   539→
   540→Occam's `PAR` construct runs processes truly in parallel. On the Transputer this was hardware-scheduled; in Go it maps to goroutines coordinated with a `sync.WaitGroup`.
   541→
   542→### Basic PAR
   543→
   544→Each branch of a `PAR` block becomes a goroutine. The transpiler inserts a `WaitGroup` to ensure all branches complete before execution continues:
   545→
   546→```occam
   547→PAR
   548→  c ! 42
   549→  c ? x
   550→```
   551→
   552→Generates:
   553→
   554→```go
   555→var wg sync.WaitGroup
   556→wg.Add(2)
   557→go func() {
   558→    defer wg.Done()
   559→    c <- 42
   560→}()
   561→go func() {
   562→    defer wg.Done()
   563→    x = <-c
   564→}()
   565→wg.Wait()
   566→```
   567→
   568→The `wg.Wait()` call blocks until all goroutines have finished, preserving Occam's semantics that execution only continues after all parallel branches complete.
   569→
   570→### Replicated PAR
   571→
   572→A replicated `PAR` spawns N concurrent processes using a loop. Each iteration captures the loop variable to avoid closure issues:
   573→
   574→```occam
   575→PAR i = 0 FOR 4
   576→  c ! i
   577→```
   578→
   579→Generates:
   580→
   581→```go
   582→var wg sync.WaitGroup
   583→wg.Add(int(4))
   584→for i := 0; i < 0 + 4; i++ {
   585→    i := i  // capture loop variable
   586→    go func() {
   587→        defer wg.Done()
   588→        c <- i
   589→    }()
   590→}
   591→wg.Wait()
   592→```
   593→
   594→### Differences and Limitations
   595→
   596→1. **Scheduling**: Occam on the Transputer had deterministic, priority-based scheduling. Go's goroutine scheduler is preemptive and non-deterministic. Programs that depend on execution order between `PAR` branches may behave differently.
   597→
   598→2. **Shared memory**: Occam enforces at compile time that parallel processes do not share variables (the "disjointness" rule). The transpiler does not enforce this, so generated Go code may contain data races if the original Occam would have been rejected by a full Occam compiler.
   599→
   600→3. **PLACED PAR**: Occam's `PLACED PAR` for assigning processes to specific Transputer links or processors is not supported.
   601→
   602→## How Timers are Mapped
   603→
   604→Occam's `TIMER` provides access to a hardware clock. The transpiler maps timer operations to Go's `time` package.
   605→
   606→### Timer Declaration
   607→
   608→Timer declarations are no-ops in the generated code since Go accesses time through the `time` package directly:
   609→
   610→```occam
   611→TIMER tim:
   612→```
   613→
   614→Generates:
   615→
   616→```go
   617→// TIMER tim
   618→```
   619→
   620→### Reading the Current Time
   621→
   622→A timer read stores the current time as an integer (microseconds since epoch):
   623→
   624→```occam
   625→TIMER tim:
   626→INT t:
   627→tim ? t
   628→```
   629→
   630→Generates:
   631→
   632→```go
   633→// TIMER tim
   634→var t int
   635→t = int(time.Now().UnixMicro())
   636→```
   637→
   638→### Timer Timeouts in ALT
   639→
   640→Timer cases in ALT allow a process to wait until a deadline. This maps to Go's `time.After` inside a `select`:
   641→
   642→```occam
   643→TIMER tim:
   644→INT t:
   645→tim ? t
   646→ALT
   647→  c ? x
   648→    process(x)
   649→  tim ? AFTER (t + 100000)
   650→    handle.timeout()
   651→```
   652→
   653→Generates:
   654→
   655→```go
   656→// TIMER tim
   657→var t int
   658→t = int(time.Now().UnixMicro())
   659→select {
   660→case x = <-c:
   661→    process(x)
   662→case <-time.After(time.Duration((t + 100000) - int(time.Now().UnixMicro())) * time.Microsecond):
   663→    handle_timeout()
   664→}
   665→```
   666→
   667→The deadline expression `(t + 100000)` represents an absolute time. The generated code computes the remaining duration by subtracting the current time.
   668→
   669→### AFTER as a Boolean Expression
   670→
   671→The `AFTER` operator compares two time values and evaluates to `true` if the left operand is later than the right. It maps to `>`:
   672→
   673→```occam
   674→IF
   675→  t2 AFTER t1
   676→    -- t2 is later
   677→```
   678→
   679→Generates:
   680→
   681→```go
   682→if (t2 > t1) {
   683→    // t2 is later
   684→}
   685→```
   686→
   687→### Differences and Limitations
   688→
   689→1. **Clock resolution**: Occam timers are hardware-dependent (often microsecond resolution on the Transputer). The transpiler uses `time.Now().UnixMicro()` for microsecond values, but actual resolution depends on the OS.
   690→
   691→2. **Guarded timer ALT**: `guard & tim ? AFTER deadline` (timer cases with boolean guards) is not yet supported.
   692→
   693→3. **Clock wraparound**: Occam's `AFTER` operator handles 32-bit clock wraparound correctly. The transpiler uses a simple `>` comparison, which does not handle wraparound.
   694→
Read
{
  "file_path": "/home/david/projects/code-associates/occam2go/main.go"
}
     1→package main
     2→
     3→import (
     4→	"flag"
     5→	"fmt"
     6→	"os"
     7→	"strings"
     8→
     9→	"github.com/codeassociates/occam2go/codegen"
    10→	"github.com/codeassociates/occam2go/lexer"
    11→	"github.com/codeassociates/occam2go/modgen"
    12→	"github.com/codeassociates/occam2go/parser"
    13→	"github.com/codeassociates/occam2go/preproc"
    14→)
    15→
    16→const version = "0.1.0"
    17→
    18→// multiFlag allows a flag to be specified multiple times (e.g. -I path1 -I path2).
    19→type multiFlag []string
    20→
    21→func (f *multiFlag) String() string { return strings.Join(*f, ", ") }
    22→func (f *multiFlag) Set(value string) error {
    23→	*f = append(*f, value)
    24→	return nil
    25→}
    26→
    27→func main() {
    28→	// Check for subcommand before parsing flags
    29→	if len(os.Args) >= 2 && os.Args[1] == "gen-module" {
    30→		genModuleCmd(os.Args[2:])
    31→		return
    32→	}
    33→
    34→	showVersion := flag.Bool("version", false, "Print version and exit")
    35→	outputFile := flag.String("o", "", "Output file (default: stdout)")
    36→	var includePaths multiFlag
    37→	flag.Var(&includePaths, "I", "Include search path (repeatable)")
    38→	var defines multiFlag
    39→	flag.Var(&defines, "D", "Predefined symbol (repeatable)")
    40→
    41→	flag.Usage = func() {
    42→		fmt.Fprintf(os.Stderr, "occam2go - An Occam to Go transpiler\n\n")
    43→		fmt.Fprintf(os.Stderr, "Usage: %s [options] <input.occ>\n", os.Args[0])
    44→		fmt.Fprintf(os.Stderr, "       %s gen-module [-o output] <SConscript>\n\n", os.Args[0])
    45→		fmt.Fprintf(os.Stderr, "Options:\n")
    46→		flag.PrintDefaults()
    47→	}
    48→
    49→	flag.Parse()
    50→
    51→	if *showVersion {
    52→		fmt.Printf("occam2go version %s\n", version)
    53→		os.Exit(0)
    54→	}
    55→
    56→	args := flag.Args()
    57→	if len(args) < 1 {
    58→		flag.Usage()
    59→		os.Exit(1)
    60→	}
    61→
    62→	inputFile := args[0]
    63→
    64→	// Build defines map
    65→	defs := map[string]string{}
    66→	for _, d := range defines {
    67→		if idx := strings.Index(d, "="); idx >= 0 {
    68→			defs[d[:idx]] = d[idx+1:]
    69→		} else {
    70→			defs[d] = ""
    71→		}
    72→	}
    73→
    74→	// Preprocess
    75→	pp := preproc.New(
    76→		preproc.WithIncludePaths(includePaths),
    77→		preproc.WithDefines(defs),
    78→	)
    79→	expanded, err := pp.ProcessFile(inputFile)
    80→	if err != nil {
    81→		fmt.Fprintf(os.Stderr, "Preprocessor error: %s\n", err)
    82→		os.Exit(1)
    83→	}
    84→	if len(pp.Errors()) > 0 {
    85→		fmt.Fprintf(os.Stderr, "Preprocessor warnings:\n")
    86→		for _, e := range pp.Errors() {
    87→			fmt.Fprintf(os.Stderr, "  %s\n", e)
    88→		}
    89→	}
    90→
    91→	// Lex
    92→	l := lexer.New(expanded)
    93→
    94→	// Parse
    95→	p := parser.New(l)
    96→	program := p.ParseProgram()
    97→
    98→	if len(p.Errors()) > 0 {
    99→		fmt.Fprintf(os.Stderr, "Parse errors:\n")
   100→		for _, err := range p.Errors() {
   101→			fmt.Fprintf(os.Stderr, "  %s\n", err)
   102→		}
   103→		os.Exit(1)
   104→	}
   105→
   106→	// Generate Go code
   107→	gen := codegen.New()
   108→	output := gen.Generate(program)
   109→
   110→	// Write output
   111→	if *outputFile != "" {
   112→		err := os.WriteFile(*outputFile, []byte(output), 0644)
   113→		if err != nil {
   114→			fmt.Fprintf(os.Stderr, "Error writing file: %s\n", err)
   115→			os.Exit(1)
   116→		}
   117→	} else {
   118→		fmt.Print(output)
   119→	}
   120→}
   121→
   122→func genModuleCmd(args []string) {
   123→	fs := flag.NewFlagSet("gen-module", flag.ExitOnError)
   124→	outputFile := fs.String("o", "", "Output file (default: stdout)")
   125→	moduleName := fs.String("name", "", "Module guard name (default: derived from library name)")
   126→	fs.Parse(args)
   127→
   128→	if fs.NArg() < 1 {
   129→		fmt.Fprintf(os.Stderr, "Usage: occam2go gen-module [-o output] [-name GUARD] <SConscript>\n")
   130→		os.Exit(1)
   131→	}
   132→
   133→	sconscriptFile := fs.Arg(0)
   134→	data, err := os.ReadFile(sconscriptFile)
   135→	if err != nil {
   136→		fmt.Fprintf(os.Stderr, "Error reading SConscript: %s\n", err)
   137→		os.Exit(1)
   138→	}
   139→
   140→	libs := modgen.ParseSConscript(string(data))
   141→	if len(libs) == 0 {
   142→		fmt.Fprintf(os.Stderr, "No OccamLibrary found in %s\n", sconscriptFile)
   143→		os.Exit(1)
   144→	}
   145→
   146→	// Use first library by default
   147→	lib := libs[0]
   148→
   149→	// Derive module name from library name if not specified
   150→	guard := *moduleName
   151→	if guard == "" {
   152→		// course.lib → COURSE.MODULE
   153→		name := lib.Name
   154→		if idx := strings.LastIndex(name, "."); idx >= 0 {
   155→			name = name[:idx]
   156→		}
   157→		guard = strings.ToUpper(name) + ".MODULE"
   158→	}
   159→
   160→	output := modgen.GenerateModule(lib, guard)
   161→
   162→	if *outputFile != "" {
   163→		err := os.WriteFile(*outputFile, []byte(output), 0644)
   164→		if err != nil {
   165→			fmt.Fprintf(os.Stderr, "Error writing file: %s\n", err)
   166→			os.Exit(1)
   167→		}
   168→	} else {
   169→		fmt.Print(output)
   170→	}
   171→}
   172→
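For reference, the default guard derivation in `genModuleCmd` above (`course.lib` → `COURSE.MODULE`) can be sketched in isolation. This is a minimal illustration, not code from the repository: the helper name `guardName` is hypothetical, and it only mirrors the `strings.LastIndex` / `strings.ToUpper` steps shown in the listing.

```go
package main

import (
	"fmt"
	"strings"
)

// guardName is a hypothetical helper that mirrors the default guard
// derivation in genModuleCmd: drop the final extension, upper-case the
// remainder, and append ".MODULE".
func guardName(libName string) string {
	if idx := strings.LastIndex(libName, "."); idx >= 0 {
		libName = libName[:idx]
	}
	return strings.ToUpper(libName) + ".MODULE"
}

func main() {
	fmt.Println(guardName("course.lib"))   // COURSE.MODULE
	fmt.Println(guardName("my.utils.lib")) // MY.UTILS.MODULE (only the last extension is dropped)
	fmt.Println(guardName("plain"))        // PLAIN.MODULE (no extension to drop)
}
```

Passing `-name GUARD` bypasses this derivation entirely, since the derived name is only used when the flag is empty.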
Thinking

The issue is requesting documentation of the limitation that gen-module uses regex-based pattern matching to extract information from SConscript files, rather than actually executing the Python code. This means it can only handle simple SConscript files that use straightforward Split('''...''') variable assignments and OccamLibrary() calls, but cannot handle SConscript files with Python logic like for loops.

The fix is straightforward - add documentation about this limitation. Let me identify the places to update:

  1. The README.md section on "Generating Module Files from KRoC SConscript" (around line 405)

  2. The modgen package doc comment in modgen.go

  3. Possibly the CLAUDE.md description of modgen

I should add clear notes about these regex-based parsing constraints across all three locations.

The issue asks us to document that gen-module uses regex-based pattern matching rather than actual Python execution, so it only works with simple SConscript files. Here are the places to update:

  1. README.md - Add a note to the "Generating Module Files from KRoC SConscript" section
  2. modgen/modgen.go - Clarify the package doc comment
✏️ Edit README.md
/home/david/projects/code-associates/occam2go/README.md
### Generating Module Files from KRoC SConscript

The KRoC project defines module composition in SConscript (Python) build files. The `gen-module` subcommand parses these to generate `.module` files:

```bash
# Clone the KRoC repository (one-time setup)
./scripts/clone-kroc.sh

# Generate a module file from SConscript
./occam2go gen-module kroc/modules/course/libsrc/SConscript
```
+
### Generating Module Files from KRoC SConscript

The KRoC project defines module composition in SConscript (Python) build files. The `gen-module` subcommand extracts source file lists from these to generate `.module` files:

```bash
# Clone the KRoC repository (one-time setup)
./scripts/clone-kroc.sh

# Generate a module file from SConscript
./occam2go gen-module kroc/modules/course/libsrc/SConscript
```

**Limitation:** `gen-module` uses regex-based pattern matching to extract `Split('''...''')` variable assignments and `OccamLibrary()` calls from SConscript files. It does not execute the Python code. This means it works with simple, declarative SConscript files (like `modules/course/libsrc/SConscript`) but cannot handle files that rely on Python control flow such as loops or conditionals (like the top-level `modules/course/SConscript`).
The file /home/david/projects/code-associates/occam2go/README.md has been updated successfully.
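
To make the limitation concrete, here is a rough sketch of what regex-based extraction looks like and why it cannot follow Python control flow. The regular expressions, the `extractLibraries` helper, and both sample inputs are assumptions made up for illustration; the actual patterns in `modgen.go` may differ.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Hypothetical patterns for illustration only; modgen's real regexes may differ.
var (
	// Matches: name = Split('''file1 file2 ...''')
	splitRe = regexp.MustCompile(`(?s)(\w+)\s*=\s*Split\('''(.*?)'''\)`)
	// Matches: OccamLibrary('lib.name', sourcesVar
	libRe = regexp.MustCompile(`OccamLibrary\(\s*'([^']+)'\s*,\s*(\w+)`)
)

// extractLibraries maps each library name to its source files using purely
// textual matching. No Python is executed.
func extractLibraries(src string) map[string][]string {
	vars := map[string][]string{}
	for _, m := range splitRe.FindAllStringSubmatch(src, -1) {
		vars[m[1]] = strings.Fields(m[2])
	}
	libs := map[string][]string{}
	for _, m := range libRe.FindAllStringSubmatch(src, -1) {
		libs[m[1]] = vars[m[2]]
	}
	return libs
}

func main() {
	// Made-up declarative input: both patterns match.
	declarative := "srcs = Split('''\n  a.occ\n  b.occ\n''')\nenv.OccamLibrary('demo.lib', srcs)\n"
	// Made-up input that relies on a loop: nothing for the patterns to match.
	loop := "for d in ['libsrc', 'examples']:\n    SConscript(d + '/SConscript')\n"

	fmt.Println(extractLibraries(declarative))
	fmt.Println(len(extractLibraries(loop)))
}
```

With the loop-based input the extractor finds no libraries at all, which corresponds to the `No OccamLibrary found` error path in `genModuleCmd`.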
✏️ Edit modgen.go
/home/david/projects/code-associates/occam2go/modgen/modgen.go
// Package modgen generates .module files from KRoC SConscript build files.
// It parses the Python-based SConscript to extract source file lists and
// OccamLibrary calls, then produces an occam module file with include guards.
+
// Package modgen generates .module files from KRoC SConscript build files.
// It uses regex-based pattern matching to extract Split('''...''') variable
// assignments and OccamLibrary() calls, then produces an occam module file
// with include guards.
//
// Note: This does not execute the Python code in SConscript files, so it only
// works with simple, declarative SConscript files. Files that use Python
// control flow (loops, conditionals, etc.) are not supported.
The file /home/david/projects/code-associates/occam2go/modgen/modgen.go has been updated successfully.

Now let me also update the CLAUDE.md description to be consistent:

✏️ Edit CLAUDE.md
/home/david/projects/code-associates/occam2go/CLAUDE.md
6. **`modgen/`** — Generates `.module` files from KRoC SConscript build files. Parses Python-based SConscript to extract source lists and `OccamLibrary` calls.
   - `modgen.go` — SConscript parser and module file generator
+
6. **`modgen/`** — Generates `.module` files from KRoC SConscript build files. Uses regex-based pattern matching (not Python execution) to extract `Split('''...''')` source lists and `OccamLibrary` calls. Only works with simple, declarative SConscript files; files using Python control flow (loops, conditionals) are not supported.
   - `modgen.go` — SConscript parser and module file generator
The file /home/david/projects/code-associates/occam2go/CLAUDE.md has been updated successfully.

Let me verify everything still builds:

$ Bash
Build to verify no issues
go build -o occam2go .

Here's a summary of the changes I made to address issue #37:

  1. README.md — Added a "Limitation" note to the "Generating Module Files from KRoC SConscript" section explaining that gen-module uses regex-based pattern matching (not Python execution), works with simple declarative SConscript files, and cannot handle files with Python control flow like loops.

  2. modgen/modgen.go — Updated the package doc comment to clarify the regex-based approach and its limitation.

  3. CLAUDE.md — Updated the modgen/ description to mention the regex-based approach and its scope.

Want me to commit these changes and/or create a PR?

Yes please.

$ Bash
Show working tree status
git status
On branch main
Your branch is up to date with 'origin/main'.

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
	modified:   CLAUDE.md
	modified:   README.md
	modified:   modgen/modgen.go

no changes added to commit (use "git add" and/or "git commit -a")
$ Bash
Show unstaged changes
git diff
diff --git a/CLAUDE.md b/CLAUDE.md
index b075068..b921039 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -51,7 +51,7 @@ Six packages, one pipeline:
    - `codegen_test.go` — Unit tests (transpile, check output strings)
    - `e2e_test.go` — End-to-end tests (transpile → `go build` → execute → check stdout)
 
-6. **`modgen/`** — Generates `.module` files from KRoC SConscript build files. Parses Python-based SConscript to extract source lists and `OccamLibrary` calls.
+6. **`modgen/`** — Generates `.module` files from KRoC SConscript build files. Uses regex-based pattern matching (not Python execution) to extract `Split('''...''')` source lists and `OccamLibrary` calls. Only works with simple, declarative SConscript files; files using Python control flow (loops, conditionals) are not supported.
    - `modgen.go` — SConscript parser and module file generator
 
 7. **`main.go`** — CLI entry point wiring the pipeline together
diff --git a/README.md b/README.md
index 165288b..9b9c626 100644
--- a/README.md
+++ b/README.md
@@ -404,7 +404,7 @@ A working example is provided in `examples/include_demo.occ` with `examples/math
 
 ### Generating Module Files from KRoC SConscript
 
-The KRoC project defines module composition in SConscript (Python) build files. The `gen-module` subcommand parses these to generate `.module` files:
+The KRoC project defines module composition in SConscript (Python) build files. The `gen-module` subcommand extracts source file lists from these to generate `.module` files:
 
 ```bash
 # Clone the KRoC repository (one-time setup)
@@ -414,6 +414,8 @@ The KRoC project defines module composition in SConscript (Python) build files.
 ./occam2go gen-module kroc/modules/course/libsrc/SConscript
 ```
 
+**Limitation:** `gen-module` uses regex-based pattern matching to extract `Split('''...''')` variable assignments and `OccamLibrary()` calls from SConscript files. It does not execute the Python code. This means it works with simple, declarative SConscript files (like `modules/course/libsrc/SConscript`) but cannot handle files that rely on Python control flow such as loops or conditionals (like the top-level `modules/course/SConscript`).
+
 This outputs:
 ```
 #IF NOT (DEFINED (COURSE.MODULE))
diff --git a/modgen/modgen.go b/modgen/modgen.go
index f630cbf..ed38051 100644
--- a/modgen/modgen.go
+++ b/modgen/modgen.go
@@ -1,6 +1,11 @@
 // Package modgen generates .module files from KRoC SConscript build files.
-// It parses the Python-based SConscript to extract source file lists and
-// OccamLibrary calls, then produces an occam module file with include guards.
+// It uses regex-based pattern matching to extract Split('''...''') variable
+// assignments and OccamLibrary() calls, then produces an occam module file
+// with include guards.
+//
+// Note: This does not execute the Python code in SConscript files, so it only
+// works with simple, declarative SConscript files. Files that use Python
+// control flow (loops, conditionals, etc.) are not supported.
 package modgen
 
 import (
$ Bash
Show recent commit messages
git log --oneline -5
e8c9709 Merge pull request #36 from codeassociates/fix/readme-course-module-step
25a045a Add missing gen-module step to course module instructions
2012236 Merge pull request #34 from codeassociates/fix/readme-kroc-link
e19bcca Fix kroc course module link in README
b8b1c8c Merge pull request #32 from codeassociates/dboreham/readme-fix
$ Bash
Create feature branch
git checkout -b docs/gen-module-limitation
Switched to a new branch 'docs/gen-module-limitation'
$ Bash
Stage and commit changes
git add README.md CLAUDE.md modgen/modgen.go && git commit -m "$(cat <<'EOF'
Document gen-module regex-based limitation

Clarifies that gen-module uses regex pattern matching (not Python execution)
to extract source lists from SConscript files, so it only works with simple
declarative files and cannot handle Python control flow.

Closes #37

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
EOF
)"
3 files changed, 11 insertions(+), 4 deletions(-)
$ Bash
Push branch to remote
git push -u origin docs/gen-module-limitation
remote: 
remote: Create a pull request for 'docs/gen-module-limitation' on GitHub by visiting:        
remote:      https://github.com/codeassociates/occam2go/pull/new/docs/gen-module-limitation        
remote: 
To associate-1.github.com:codeassociates/occam2go.git
 * [new branch]      docs/gen-module-limitation -> docs/gen-module-limitation
branch 'docs/gen-module-limitation' set up to track 'origin/docs/gen-module-limitation'.
$ Bash
Create pull request
gh pr create --title "Document gen-module regex-based limitation" --body "$(cat <<'EOF'
## Summary
- Clarifies that `gen-module` uses regex-based pattern matching (not Python execution) to extract `Split('''...''')` assignments and `OccamLibrary()` calls from SConscript files
- Documents that it only works with simple, declarative SConscript files and cannot handle Python control flow (loops, conditionals)
- Updates README.md, CLAUDE.md, and the `modgen` package doc comment

Closes #37

## Test plan
- [x] Documentation-only change, no code behavior modified
- [x] `go build` succeeds

🤖 Generated with [Claude Code](https://claude.com/claude-code)
EOF
)"
https://github.com/codeassociates/occam2go/pull/38

PR created: https://github.com/codeassociates/occam2go/pull/38

Thank you. I've approved the PR. Please merge.

$ Bash
Merge PR #38
gh pr merge 38 --merge

Merged. Switching back to main:

$ Bash
Switch to main and pull latest
git checkout main && git pull
Switched to branch 'main'
Your branch is up to date with 'origin/main'.
From associate-1.github.com:codeassociates/occam2go
   e8c9709..d6b665f  main       -> origin/main
Updating e8c9709..d6b665f
Fast-forward
 CLAUDE.md        | 2 +-
 README.md        | 4 +++-
 modgen/modgen.go | 9 +++++++--
 3 files changed, 11 insertions(+), 4 deletions(-)

Done. PR #38 is merged and main is up to date.