Written in 2013
Real World OCaml (RWO) will finally explain how OCaml works in the real world. This document, which has no relation to RWO, explains how OCaml works in my fantasies. I will try to highlight only how fantasy differs from reality.
Please note that none of the following describes the real OCaml as of version 4.00.1.
All source files are
The character literal forms \d specify Unicode code-points.
The following tokens are not keywords:
:: is an ordinary operator (also see “builtins” below).
New infix operators are allowed to be declared in modules and module types. The Pervasives module includes infix declarations for the above tokens that are no longer keywords. Only the fixity declarations exposed in the signature are exported to other modules. All such infix operators have the same canonical precedence between
let ... in.
Modules and module types are unified into mixins, like in MixML. This allows, among other things, module definitions to be distributed across multiple files, and for multiple modules to share common sub-modules. The versatility of the ML module system is drastically enhanced.
A hierarchical module namespace is maintained in compiled code, but no compilation unit is allowed to specify an absolute module path. Instead, libraries are allowed to be grafted into any subtree of the module path of client code. This allows client code to use a module path “claimed” by another package by simply moving the other package to a non-conflicting path. The system is nevertheless clever enough to maintain only a single copy of any piece of code.
Stateless modules—i.e., modules that do not have any initialization code—are allowed to be pruned if they are not used (dead module elimination). These modules also have zero initialization overhead. A warning is added to complain about a module having initialization code.
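To make the distinction concrete, here is a small sketch in today's OCaml of a module without initialization code next to one with it. The module names are hypothetical and chosen only for illustration.

```ocaml
(* Stateless: only function definitions, so nothing has to run when the
   module is initialized. *)
module Stateless = struct
  let double x = 2 * x
end

(* Has initialization code: the table is allocated and filled when the
   module is initialized. This is the kind of module the proposed warning
   would flag. *)
module Stateful = struct
  let table : (string, int) Hashtbl.t = Hashtbl.create 16
  let () = Hashtbl.add table "answer" 42
end
```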
utf-8 encoded and the character type has the same size as int and stores Unicode code-points.
New primitive types bytestring fill the roles string play in the real OCaml.
Values of type bytestring are not mutable.
(::)(_, _) are not hard coded into the grammar.
Types and type definitions
Recursion in type abbreviations is simply syntactically impossible. The declaration type t = t is not interpreted as a recursive type abbreviation (which then triggers a compile error) but as an abbreviation of t as an existing t. Such definitions are often needed in functor arguments that require a structure with a type t. Recursion is still implicit in variant (polymorphic or otherwise) definitions.
All non-nullary value constructors (but not polymorphic variants) implicitly define a functional form. To illustrate, the definition type 'a lst = Nil | Cons of 'a * 'a lst also implicitly defines a function _Cons : 'a -> 'a lst -> 'a lst.
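In the real OCaml the functional form has to be written by hand. A minimal sketch, where cons is a hypothetical stand-in for the implicit _Cons and of_list is an illustrative helper:

```ocaml
type 'a lst = Nil | Cons of 'a * 'a lst

(* Hand-written stand-in for the implicit _Cons of the fantasy language. *)
let cons x xs = Cons (x, xs)

(* Unlike the bare constructor, the function can be partially applied
   and passed to higher-order functions. *)
let of_list l = List.fold_right cons l Nil
```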
There is absolutely no difference between type t = T of int * bool and type t = T of (int * bool). Parentheses in type definitions serve only a disambiguating role.
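The difference that exists in the real OCaml can be seen in a short sketch. The type u and the helper unpack_u are hypothetical names added for the comparison.

```ocaml
type t = T of int * bool      (* constructor with two arguments *)
type u = U of (int * bool)    (* constructor with one tuple argument *)

let p = (1, true)

let a = T (1, true)           (* the arguments must be written out in place *)
let b = U p                   (* an existing pair can be passed directly *)
(* let c = T p   is rejected by the real compiler: T expects 2 arguments *)

let unpack_u (U q) = q        (* the whole pair can be bound to one name *)
```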
The following is not silently accepted: let f : 'a -> 'a = fun x -> x + 1. In other words, this definition is identical to let f : 'a. 'a -> 'a = fun x -> x + 1.
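For reference, this is how the real compiler treats the two annotations; the name id is a hypothetical addition for the contrast.

```ocaml
(* Accepted by the real compiler: 'a is a unification variable, so it is
   quietly instantiated to int and f gets type int -> int. *)
let f : 'a -> 'a = fun x -> x + 1

(* With an explicitly quantified annotation the body must really be
   polymorphic. *)
let id : 'a. 'a -> 'a = fun x -> x

(* let g : 'a. 'a -> 'a = fun x -> x + 1   is rejected by the real compiler *)
```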
Views are supported.
Expressions and value definitions
The following is no longer silently accepted: fun x x -> x. Likewise for let f x x = x. In either case, the compiler complains about a repeated argument variable x, as it has always done for fun (x, x) -> x.
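A quick check of what the real compiler does with the curried form today:

```ocaml
(* Accepted by the real compiler: the second x shadows the first, so the
   function returns its second argument. *)
let f x x = x
```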
<expr> match? <pattern> is an expression of type bool. e match? p returns true if and only if (match e with p -> true | _ -> false) returns true.
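In the real OCaml the expansion has to be spelled out each time. A small sketch, with a hypothetical shape type and helper names:

```ocaml
type shape = Circle of float | Square of float | Point

(* Fantasy form:  s match? Circle _  *)
let is_circle s = (match s with Circle _ -> true | _ -> false)

(* The test is typically used as a predicate for filtering or counting. *)
let count_circles shapes = List.length (List.filter is_circle shapes)
```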
<pattern> as <pattern> generalizes <pattern> as <identifier>. The two arguments to as must define disjoint sets of variables, and the union of bindings from both patterns is added to the scope of the pattern.
let! f arg1 ... argn = ... is interpreted as a definition of a function f of n arguments that is forcibly inlined at compile time. Note that such definitions cannot be recursive, and these functions revert to ordinary functions when partially applied, especially when used as arguments to other functions. A corresponding val! declaration is added to the module type. Compiled modules store a reusable Lambda representation of such inlinable definitions.
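The closest mechanism I know of in later versions of the real OCaml is the [@inline] attribute. It is a request to the native-code compiler, not the let! form described here, so the following is only a rough analogue; add3 is a hypothetical example function.

```ocaml
(* [@inline always] asks the native-code compiler to inline direct,
   fully applied calls to add3. *)
let[@inline always] add3 a b c = a + b + c

(* A fully applied call: a candidate for inlining. *)
let direct = add3 1 2 3

(* Partially applied and passed to List.map, add3 is used as an ordinary
   closure, matching the fallback described above. *)
let shifted = List.map (add3 1 2) [10; 20; 30]
```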
The do ... done form is removed, and begin ... end (which are mandatory). The do keyword is reserved for future use in computational expressions, while
Function arguments are evaluated from left to right.
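The real OCaml leaves the evaluation order of function arguments unspecified, so code that depends on the order has to sequence the arguments explicitly. A minimal sketch, with hypothetical helper names (log, note, add):

```ocaml
(* Records the order in which arguments are evaluated. *)
let log = ref []
let note label v = log := label :: !log; v

let add a b = a + b

(* Writing  add (note "first" 1) (note "second" 2)  leaves the order up to
   the compiler. Binding each argument with let fixes it. *)
let result =
  let a = note "first" 1 in
  let b = note "second" 2 in
  add a b
```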
The list type is defined as: type 'a list = Nil | Cons of 'a * 'a list. The legacy forms :: in patterns are retained as synonyms for Cons for backwards compatibility. The infix operator (::) : 'a -> 'a list -> 'a list replaces the hard coded forms <expr> :: <expr> and (::)(<expr>, <expr>) in expressions.
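The idea can be approximated in the real OCaml with a user-defined type and operator. This is only a sketch: the type is named lst to avoid shadowing the built-in list, and the operator name (@:) is a hypothetical choice, since :: itself cannot be rebound with let.

```ocaml
type 'a lst = Nil | Cons of 'a * 'a lst

(* An operator starting with '@' is right-associative in OCaml, so it
   chains like the built-in (::). *)
let ( @: ) x xs = Cons (x, xs)

let xs = 1 @: 2 @: 3 @: Nil

(* Pattern matching uses the ordinary constructors. *)
let rec length = function
  | Nil -> 0
  | Cons (_, rest) -> 1 + length rest
```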
Build Tools and Runtime
There is a canonical, well-maintained, and documented build tool for OCaml: ocamlbuild. All other build systems, including omake, are considered obsolete.
Camlp4 uses only the grammar of OCaml. There is no “revised grammar”, which has been ruthlessly expunged from history. There is also a comprehensive test-suite for OCaml parsing that both the compiler and Camlp4 are required to pass – gone are the days of Camlp4 and OCaml having different opinions on such things as source positions, syntax errors, etc.
The build system does not assume the existence of bash and a standard Unix tool-chain, allowing it to be built on MinGW without the need for Cygwin. Cross-compiling a MinGW binary from Linux is also a standard (but optional) part of the OCaml distribution.
The ocamlyacc tool is deprecated in favour of Menhir.
The runtime encapsulates all its global state in a “runtime context” object. Multiple context objects may exist simultaneously in the same process. The at_exit function adds exit functions to be called when the runtime is destroyed, not when the process exits.
There is an LLVM backend.
The standard library and runtime are re-licensed under an MIT or 2-clause BSD style license. The legally questionable LGPL 2.1 + linking exception license is deprecated. The compiler is re-licensed under the GPL, with the old QPL deprecated. The deprecated licenses apply only to legacy versions.
Num are dropped from the standard distribution. They survive as community-maintained libraries.
The runtime is disentangled from the Unix libraries, so that the entire standard library may be replaced. Community-maintained standard library replacements can jack directly into the OCaml runtime if needed, instead of operating as a shell on top of the standard library.
This document expresses the view of its author, Kaustuv Chaudhuri, alone. It should not be misinterpreted as the views of anyone else, particularly not of the OCaml developers.