Gyoji Programming Language

Gyoji Architecture

This page describes the architecture of the Gyoji compiler. A compiler is a complicated machine, and there is more than one way to view it, so each section below describes the same system from a different perspective.

Components

The following diagram shows how the parts of the compiler are organized. Each component is designed to be independent of the others: it exposes an interface, and the "compiler" application ties the components together using dependency-injection techniques. This way, any part of the compiler can be modified or augmented without having to unwind the entire system.

This keeps the compiler and language system modular and reduces how deeply any individual contributor must understand the whole flow.

Compiler Context

+-----------+                               +-----------------+
| Input .j  |                               | Error reporting |
+-----------+                               +-----------------+
      |                                              |
      | Token stream                                 | Error/context API
      |                                              |
+----------------------------------+                 |
| Frontend                         |                 |
| +-----------------------+        |                 |
| | Lex/Parse             |        |-----------------+
| | namespace resolution  |        |
| +-----------------------+        |
|             |                    |
|             | Syntax tree        |
|             |                    |
| +-----------------------+        |
| | Function Resolution   |        |
| | Type Resolution       |        |
| | (Lowering to MIR)     |        |
| +-----------------------+        |
+----------------------------------+
      |                 |
      | AST Tree        |               AnalysisPass
      |        +------------------+     +------------------------------+
      |        | MIR              |-----| Borrow Check (Polonius-like) |
      |        +------------------+     | Unreachable Code             |
      |                 |               | Other semantic checks        |
+-----------------+     |               +------------------------------+
| formatters      |     |
| format-identity |     |
| format-pretty   |     |
| format-tree     |     |
+-----------------+     |
      |                 |
+-------------+         |
| Text output |         |
+-------------+         |
            +------------------------+
            | Code Generation (LLVM) |
            +------------------------+
                        | LLVM
               +-----------------+
               | ELF (.o) output |
               +-----------------+
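
The dependency-injection idea described above can be sketched in a few lines of C++. This is only an illustration under stated assumptions: `ErrorSink`, `Frontend`, `Backend`, `Compiler`, the `Toy*` implementations, and `compile_demo` are hypothetical names invented for this sketch, and the MIR is reduced to a string. The real Gyoji interfaces are likely to look different.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical interfaces; the real Gyoji class names and signatures may differ.
struct ErrorSink {
    virtual ~ErrorSink() = default;
    virtual void report(const std::string& message) = 0;
};

struct Frontend {
    virtual ~Frontend() = default;
    // Returns a textual stand-in for the MIR produced from the source.
    virtual std::string lower(const std::string& source, ErrorSink& errors) = 0;
};

struct Backend {
    virtual ~Backend() = default;
    virtual std::string generate(const std::string& mir, ErrorSink& errors) = 0;
};

// Concrete implementations that only know about the interfaces above.
struct CollectingErrors : ErrorSink {
    std::vector<std::string> messages;
    void report(const std::string& message) override { messages.push_back(message); }
};

struct ToyFrontend : Frontend {
    std::string lower(const std::string& source, ErrorSink& errors) override {
        if (source.empty()) {
            errors.report("empty input");
            return "";
        }
        return "mir(" + source + ")";
    }
};

struct ToyBackend : Backend {
    std::string generate(const std::string& mir, ErrorSink&) override {
        return "object[" + mir + "]";
    }
};

// The "compiler" application receives its parts instead of constructing them.
class Compiler {
public:
    Compiler(std::unique_ptr<Frontend> frontend, std::unique_ptr<Backend> backend)
        : frontend_(std::move(frontend)), backend_(std::move(backend)) {
        assert(frontend_ && backend_);
    }

    std::string compile(const std::string& source, ErrorSink& errors) {
        const std::string mir = frontend_->lower(source, errors);
        if (mir.empty()) return "";  // the frontend already reported the problem
        return backend_->generate(mir, errors);
    }

private:
    std::unique_ptr<Frontend> frontend_;
    std::unique_ptr<Backend> backend_;
};

// Wiring happens in one place, so swapping a component is a one-line change.
inline std::string compile_demo(const std::string& source, CollectingErrors& errors) {
    Compiler compiler(std::make_unique<ToyFrontend>(), std::make_unique<ToyBackend>());
    return compiler.compile(source, errors);
}
```

Because `Compiler` depends only on the abstract `Frontend` and `Backend`, a different back-end could be passed to the constructor without touching the front-end or the error reporting code.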

The components defined in the system are:

Lifetimes

The following "lifetime" diagram shows the lifetimes of the objects involved in each compiler stage. This matters because the compiler is designed so that more than one front-end and back-end can be used. To support this, it is important to understand which components are accessible at each stage of the compile lifecycle. For instance, error handling is used throughout the program, whereas the parse tree is used only in the initial stages and is not directly accessible during code generation.

High-level lifetimes and interactions:

+-------------------------- Compiler Context
| +------------------------ TokenStream Input context
| |                                    provides context to the error handler.
| |
| | +---------------------- Errors : Error reporting library
| | |
| | |
| | |
| | |
| | |
| | |
| |-|--------------------- Parser : Reads from input                    Parsing
| | |                                      * Parses input into a
| | |                                        parse tree
| | |                                      * Records input data
| | |                                        in token stream.
| | | +------------------- TranslationUnit
| | | |
| | | | +----------------- NamespaceContext
| | | | |
| | | | | +--------------- mir::Types                               Type Lowering
| | | | | | 
| | | | | | +------------  mir::Functions                       Function lowering
| | | | | | |
| | | | | | |                                                   Type consistency
| | | | | | |
| | | | | | |
| | | | +----------------- ~NamespaceContext
| | | |   | |
| | | +-------------------  ~TranslationUnit
| | |     | |
| | |     | | +----------  Analysis                              Static Analysis
| | |     | | |                                                  Borrow Checking
| | |     | | |                                                  Visibility rules
| | |     | | |
| | |     | | |
| | |     | | +---------- ~Analysis
| | |     | |
| | |     | +------------- LLVMBackend                           Code Generation
| | |     | |
| | |     | +------------- ~LLVMBackend
| | |     |
| | |     +--------------- ~mir::Types
| | |     
| | +---------------------- ~Errors
| |
| +------------------------ ~TokenStream
|
+-------------------------- ~CompilerContext

As the diagram shows, only the compiler context, the token stream, and the error reporting facilities last for the entire program. Everything else has a more limited lifetime. This enforces some separation of concerns, because a component simply cannot access another component that has not been created yet or has already been destroyed.
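
One way to picture these lifetimes is with nested C++ scopes, where destructors run in reverse order of construction. The sketch below is an assumption-laden illustration, not the actual Gyoji code: `Stage`, `Log`, and `run_pipeline` are hypothetical, and each stage only records its own construction and destruction. As a simplification, the `mir::Types` stand-in is declared before the parse tree so that it can outlive it.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Records construction and destruction so the order can be inspected.
using Log = std::vector<std::string>;

// Hypothetical stand-in for a compiler stage; not the real Gyoji classes.
class Stage {
public:
    Stage(std::string name, Log& log) : name_(std::move(name)), log_(log) {
        log_.push_back(name_);
    }
    ~Stage() { log_.push_back("~" + name_); }
    Stage(const Stage&) = delete;
    Stage& operator=(const Stage&) = delete;

private:
    std::string name_;
    Log& log_;
};

inline Log run_pipeline() {
    Log log;
    {
        // Long-lived facilities: available to every later stage.
        Stage token_stream("TokenStream", log);
        Stage errors("Errors", log);
        Stage types("mir::Types", log);
        {
            // The parse tree and namespace data exist only while lowering runs.
            Stage translation_unit("TranslationUnit", log);
            Stage namespaces("NamespaceContext", log);
        }
        {
            // Analysis cannot touch the parse tree: it is already destroyed.
            Stage analysis("Analysis", log);
        }
        {
            Stage backend("LLVMBackend", log);
        }
    }
    return log;
}
```

Calling `run_pipeline()` returns the construction and destruction events in order, which can be compared line by line against the diagram.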

Lifetime notes:

Library Dependency Stack

This diagram shows how the components may depend on one another. It is needed to establish dependencies in the build system, and it also gives a sense of which component is responsible for which part of the system.

+--------------+
| Frontend     | 
|    Namespaces| +-------------+ 
|    Lowering  | | Analysis    |
+--------------+ |   Semantics | +---------+
    Create MIR   |   Borrowing | | Codegen |
    (immutable)  +-------------+ |  LLVM   |
                     Use MIR     +---------+
                                   Consume MIR
+-------------------------------------------------+
| Compiler Context                                |
+---------------------------+--+------------------+
            +-------------+ |  |            +-------------------+
            | Errors      | |  |            | misc: Utilities   |
            +-------------+ |  |            |       for strings |
               +------------+  |            +-------------------+
               | TokenStream   |
               +---------------+
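
The "Create MIR (immutable)", "Use MIR", and "Consume MIR" labels in the diagram suggest a simple ownership pattern, sketched below in C++. All names here (`Mir`, `lower`, `analyze`, `generate`, `build`) are hypothetical, and the MIR is reduced to a list of function names; the sketch only shows how the three layers could share one data structure without depending on each other.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, heavily simplified MIR: a list of function names.
struct Mir {
    std::vector<std::string> functions;
};

// Frontend: the only layer that builds the MIR.
inline Mir lower(const std::vector<std::string>& declarations) {
    Mir mir;
    for (const std::string& name : declarations) {
        mir.functions.push_back(name);
    }
    return mir;
}

// Analysis: reads the MIR through a const reference and cannot modify it.
inline std::vector<std::string> analyze(const Mir& mir) {
    std::vector<std::string> diagnostics;
    for (const std::string& name : mir.functions) {
        if (name.empty()) diagnostics.push_back("function with empty name");
    }
    return diagnostics;
}

// Codegen: takes ownership of the MIR and consumes it.
inline std::string generate(Mir mir) {
    std::string out;
    for (const std::string& name : mir.functions) {
        out += "define " + name + "\n";
    }
    return out;
}

// Driver: each layer talks only to the MIR, never to another layer.
inline std::string build(const std::vector<std::string>& declarations) {
    Mir mir = lower(declarations);
    if (!analyze(mir).empty()) return "";  // stop before code generation
    return generate(std::move(mir));
}
```

In this arrangement the analysis and code generation layers need only the `Mir` definition at build time, which mirrors the side-by-side placement of the three boxes in the diagram.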

Notes on differences from C syntax