Gyoji Programming Language
Goals
I would like to have a type-safe and memory-safe version of C. This means different things to different people, so a bit of explanation is required. Complete memory safety is a myth. Although languages like Rust provide semantics in 'safe' mode that keep programs memory safe, they do so at the cost of a difficult mental model: the borrow checker. While the borrow checker and region-style semantic reasoning do provide assurances of safety, the model is hard to understand, and it rejects some legitimate use cases that are perfectly memory safe simply because the compiler cannot prove that they are.
Rust cannot prove many common cases safe, and it might be argued that a better reasoning system could someday prove that all memory-safe code is, in fact, safe. This cannot happen, however, because Rice's theorem, a consequence of the undecidability of Turing's halting problem, shows that compilers cannot, in general, prove all cases[1].
The other problem Rust has is that it suffers from bloat. Executables end up very large because the compiler and the core libraries are tightly coupled. The whole 'panic' and unwinding system is built in and brings along a lot of machinery that is often not wanted or needed. You could argue that this is fine because you can always opt out with #![no_std] and check functions with the third-party #[no_panic] attribute, but the fact that it's there by default is a bit troubling. This is particularly true when a numeric function such as a square root may panic on invalid input (in Rust, the signed-integer 'isqrt()' panics on a negative argument, although the floating-point 'sqrt()' returns NaN). Unless the programmer knows to guard against this, it can be a big problem for reliability. The authors of Rust would argue that it's better to panic than to have undefined behaviour, but although the square root of a negative number may be undefined in a mathematical sense, it need not be undefined in a computational sense. Returning NaN or an error value would seem to be a better choice than a panic in these cases, but making philosophical changes to the core standard library at this stage is a tall ask.
Instead, this language follows a different path: minimalism. We aim for a C/Java/C++-style syntax with support for objects, but without a lot of the baggage that comes with them. References and borrow checking provide memory safety. Classes and basic notions of inheritance are supported, but multiple inheritance and its associated ambiguities are not. Interfaces may be multiply inherited, but the vtable is constructed in association with the interface, not as a part of the class. Constructors cannot leak their content before initialization, removing the problem of 'half-initialized' objects. Generics are Java-style with type erasure, which removes a lot of the complexity and bloat of C++-style template metaprogramming. The goal is to unify the best features of the languages that have come before and to do so in the simplest way possible.
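One possible reading of "the vtable is associated with the interface, not the class" is the 'fat pointer' layout sketched below in C++. This is only an illustration under that assumption, not Gyoji's actual implementation; every name in it (ShapeVTable, ShapeRef, Square, as_shape, area) is hypothetical.

```cpp
#include <cassert>

// All names here are hypothetical and chosen only for illustration.

// The "interface": a table of function pointers, one per method.
struct ShapeVTable {
    double (*area)(const void* self);
};

// A reference to "something that implements Shape": the object pointer and
// the interface's table travel side by side instead of inside the object.
struct ShapeRef {
    const void* self;
    const ShapeVTable* vtable;
};

// A plain class. It carries no hidden vtable pointer.
struct Square {
    double side;
};
static_assert(sizeof(Square) == sizeof(double), "no per-object vtable pointer");

// Square's implementation of the Shape interface.
static double square_area(const void* self) {
    const Square* s = static_cast<const Square*>(self);
    return s->side * s->side;
}
static const ShapeVTable kSquareAsShape = { &square_area };

// Pair the object with the table only when it is viewed as a Shape.
ShapeRef as_shape(const Square& s) {
    return ShapeRef{ &s, &kSquareAsShape };
}

// Dynamic dispatch goes through the table carried by the reference.
double area(ShapeRef shape) {
    return shape.vtable->area(shape.self);
}
```

With this layout a class can implement any number of interfaces without its object size growing, since each interface view simply pairs the same object pointer with a different table.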
Guarantees we want the language to make
We would like the language to make certain guarantees about code that it compiles.
Initialization before use
Code that compiles outside of "unsafe" blocks is guaranteed to assign every variable an initial value before it is used. Use, in this context, includes use in an expression to calculate a value (i.e. as an rvalue), and also includes expressions that dereference pointers or take the address of a value.
For local function-scope variables, this guarantee is enforced by some rules:
- After a primitive value is declared, it must first be assigned a value before it is used in any expression.
- After a class value is declared, one of its constructors must be called before it is used.
- Every constructor must initialize every primitive member to some initial value.
- Every constructor must call the constructor of every class member.
Note that under this definition, it is not allowed to take a pointer or reference to a variable and perform the initialization THROUGH that pointer or reference. While this would constitute an initialization before use, tracking initialization through pointers and references is complex enough to make validation of the rule extremely time-consuming. Therefore, Gyoji has taken the decision to disallow this form of initialization in its validation.
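The rules above can be sketched by analogy in C++. C++ does not itself enforce them, and this is not Gyoji syntax; the names Point, Segment, length_squared and demo_initialized are hypothetical and exist only to show code that would satisfy each rule.

```cpp
#include <cassert>

// Hypothetical classes used only for illustration.
struct Point {
    int x;
    int y;
    // Every primitive member receives an initial value.
    Point(int x_, int y_) : x(x_), y(y_) {}
};

struct Segment {
    Point a;
    Point b;
    int id;
    // The constructor of every class member is called, and the primitive
    // member `id` is initialised as well.
    explicit Segment(int id_) : a(0, 0), b(1, 1), id(id_) {}
};

int length_squared(const Segment& s) {
    int dx = s.b.x - s.a.x;
    int dy = s.b.y - s.a.y;
    return dx * dx + dy * dy;
}

int demo_initialized() {
    int scale;      // declared, but not yet usable under the rule
    scale = 3;      // a primitive is assigned before its first use
    Segment s(7);   // a class value has its constructor called before use
    return scale * length_squared(s) + s.id;
}

// The pattern the text describes as disallowed, shown as a comment only:
//   int v;
//   int* p = &v;   // taking the address already counts as a use of `v`
//   *p = 5;        // initialises `v`, but only through the pointer
```

Here demo_initialized() evaluates to 3 * 2 + 7 = 13, and every value read along the way was initialised directly rather than through a pointer or reference.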
Scope Guards
Class instances declared in a local scope will always have their destructor called at the end of the lexical scope in which they were declared.
This guarantee extends to 'goto' and 'break' statements, where a scope is exited by means other than reaching the '}' at the end of the lexical scope. That is, if you 'goto' a label outside the scope in which a variable was declared, its destructor is called as execution leaves that scope.
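C++ destructors give a comparable guarantee, so the behaviour described above can be sketched in C++. This is an analogy rather than Gyoji code; Guard, g_log, leave_by_break and leave_by_goto are hypothetical names, and the global log exists only so the order of destructor calls can be inspected.

```cpp
#include <cassert>
#include <string>

// Hypothetical log that records construction ("+x") and destruction ("-x").
static std::string g_log;

struct Guard {
    char tag;
    explicit Guard(char t) : tag(t) { g_log += '+'; g_log += tag; }
    ~Guard() { g_log += '-'; g_log += tag; }
};

// Leaving a loop body early with `break` still runs the destructor.
std::string leave_by_break() {
    g_log.clear();
    for (int i = 0; i < 3; ++i) {
        Guard g('a');
        if (i == 1) break;
    }
    return g_log;
}

// Jumping out of two nested blocks with `goto` destroys the inner object
// first and the outer object second, before execution reaches the label.
std::string leave_by_goto() {
    g_log.clear();
    {
        Guard outer('o');
        {
            Guard inner('i');
            goto done;
        }
    }
done:
    return g_log;
}
```

In this sketch leave_by_break() yields "+a-a+a-a" (two iterations, each guard destroyed) and leave_by_goto() yields "+o+i-i-o" (inner destroyed before outer).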
Exceptions
Code compiled by Gyoji will not ORIGINATE any exceptions that would break control flow. Note that this does NOT mean that exceptions are impossible. Exceptions may still originate from C or C++ code called indirectly by Gyoji, and care must be taken that the functions called do not raise exceptions.
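One way to take that care on the C++ side is to wrap any function that may throw in a small shim that catches everything and reports failure through a return code. The sketch below assumes the foreign code is reached through a C-linkage entry point, which the text above does not specify; parse_positive and parse_positive_noexcept are hypothetical names.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical third-party C++ function that may throw.
int parse_positive(int raw) {
    if (raw <= 0) {
        throw std::invalid_argument("value is not positive");
    }
    return raw;
}

// Hypothetical shim exposed to non-C++ callers. It never lets an exception
// escape: success is reported as 0 with the result stored in *out, and any
// exception is turned into the error code -1.
extern "C" int parse_positive_noexcept(int raw, int* out) noexcept {
    try {
        *out = parse_positive(raw);
        return 0;
    } catch (...) {
        return -1;
    }
}
```

The caller then sees only ordinary return values, so no unwinding ever crosses the language boundary.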