1. A naming convention that reflects the differences between C and Lisp identifiers.
2. A core of C utilities for representing, initializing, and using run-time-typed data; closures over lexical functions; multiple values; dynamic binding; non-local exits; and cleanups.
3. Conservative garbage collection.
4. A run-time function-based interface to such Lisp utilities as the object system and environment initialization.
Few C programmers need go into the details of these; it is enough to
know that they exist. The descriptions in this section do, however,
give experienced Lisp programmers some understanding of how Lisp
issues are handled in C by Eclipse. In all cases, each technology was
implemented for maximum portability and so that the design would seem
understandable and usable to both Lisp and C programmers.
2.1: Identifiers
Eclipse defines a naming convention that maps Lisp names to C
identifiers. Eclipse uses this convention for each C identifier it
generates, including the entry points accessible in the Eclipse
library and identifiers appearing in translated user code. These
transformations apply only to C identifiers, all of which are
compile-time entities. Run-time data, such as the strings that form
SYMBOL-NAMEs, are not transformed.
The transformation of identifiers is necessary because C imposes much harsher restrictions on identifiers than Lisp.
Eclipse addresses these differences by translating Lisp symbol names to C identifiers using a convention that follows standard C practice.
CL:LOGICAL-PATHNAME-TRANSLATIONS => clLogicalPathnameTranslations(), USER::FOO-BAR => usrFooBar().
clNIL, clCALL_ARGUMENTS_LIMIT, clLOGICAL_PATHNAME_TRANSLATIONS, usrFOO_BAR.
foo_bar.
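The mixed-case mapping suggested by the function examples above can be sketched in a few lines of C. This is only an illustration inferred from those two examples: the helper name sketch_function_identifier() is hypothetical and is not part of Eclipse, and the sketch deliberately rejects any character other than letters, digits, and hyphens rather than guess at the replacement names Eclipse uses.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of Eclipse. Builds a mixed-case C function
 * identifier from a package prefix and a Lisp symbol name made of letters,
 * digits, and hyphens. Returns 0 on success, or -1 if the buffer is too
 * small or the name holds a character this sketch does not handle. */
int sketch_function_identifier(const char *prefix, const char *lisp_name,
                               char *out, size_t out_size)
{
    size_t n;
    int start_of_word = 1;

    assert(prefix != NULL && lisp_name != NULL && out != NULL);
    n = strlen(prefix);
    if (n >= out_size)
        return -1;
    memcpy(out, prefix, n);

    for (const char *p = lisp_name; *p != '\0'; p++) {
        unsigned char c = (unsigned char)*p;
        if (c == '-') {               /* a hyphen starts a new word */
            start_of_word = 1;
            continue;
        }
        if (!isalnum(c))
            return -1;                /* e.g. '*' is out of scope here */
        if (n + 1 >= out_size)
            return -1;                /* keep room for the terminator */
        out[n++] = (char)(start_of_word ? toupper(c) : tolower(c));
        start_of_word = 0;
    }
    out[n] = '\0';
    return 0;
}
```

Called with the prefix "usr" and the name "FOO-BAR", the helper writes "usrFooBar" into the buffer.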
Eclipse uses package prefixes only when needed for scope or
distinction, or because the identifier would be illegal or reserved
without it. The shortest package nickname is used by default as the
prefix. The system defines the package prefix ``cl'' for all system
utilities and ``usr'' for utilities defined in the Lisp COMMON-LISP-USER
package.
Lexical scope issues in function names (i.e., nested FLET/LABELS) are
handled by preserving the nested chain of function names, separated by
underscores. For example, usrOuterFunction_InnerFunction(). Eclipse
preserves method specializers and qualifiers, and SETF function names,
in a similar manner.
The ``pipe'' escape characters (e.g., #\|) used by the Lisp printer when
a symbol has non-default case are also part of an Eclipse C identifier
name.
Finally, when characters appear in a Lisp symbol that cannot appear in
a C identifier, Eclipse replaces the characters with an alphabetic
name in a contrasting case. Eclipse defines names for all the
non-alphabetic members of the BASE-CHAR repertoire, but
uses hex codes for extended characters (i.e., Unicode). For example,
*DEBUG-IO* => clstarDEBUG_IOstar,
user::|lower-CASE| => usrpipelowerpipe_CASE,
LIST* => clLISTstar, clListSTAR().
Lisp names that are interned in the C package are exempt from this
``name mangling.'' This allows Lisp programs to reference C utilities
that do not follow these conventions.
2.2: Representation
Eclipse provides a single C header file, ``eclipse.h,'' which defines
Lisp data representations. Any C code that uses the Eclipse library
must include this file. The following sections describe some of the
representations defined in ``eclipse.h.''
2.2.1: Objects
The header file defines a C typedef called clObject,
which is used by Eclipse to represent each Lisp
datum. clObject is defined as a machine word that can be
treated either as a pointer or as immediate data.2
Most kinds of clObjects are implemented in Eclipse as pointers to a
heap allocated structure. The first component of this structure
contains type information, including a pointer to the class
metaobject. On some architectures, Eclipse saves space for some data
types by using the least significant bits of the pointer for typing
information. Eclipse also represents some data such as fixnums and
characters by storing them directly in the clObject word as immediate
data. Again, the least significant word bits are used for typing. In
these latter cases, Eclipse reaches the class metaobject through a
globally known array, indexed by the low clObject bits.
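The layout described above can be illustrated with a small tagged-word sketch. Every name below (SkObject, the SK_TAG_ constants, sk_class_of(), and so on) is hypothetical, and the two-bit tag assignment is an assumption made for the example; the real definitions are those in ``eclipse.h.''

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* All names are hypothetical stand-ins, not Eclipse definitions. */
typedef uintptr_t SkObject;                 /* one machine word */

enum { SK_TAG_BITS = 2, SK_TAG_MASK = 3 };
enum { SK_TAG_POINTER = 0, SK_TAG_FIXNUM = 1, SK_TAG_CHARACTER = 2 };

typedef struct SkClass { const char *name; } SkClass;

/* First component of every heap-allocated object: its type information. */
typedef struct SkHeader { const SkClass *class_of; } SkHeader;
typedef struct SkCons { SkHeader header; SkObject car, cdr; } SkCons;

const SkClass sk_fixnum_class = { "FIXNUM" };
const SkClass sk_character_class = { "CHARACTER" };
const SkClass sk_cons_class = { "CONS" };

/* Globally known array, indexed by the low bits of an immediate word. */
static const SkClass *const sk_immediate_classes[4] = {
    NULL, &sk_fixnum_class, &sk_character_class, NULL
};

SkObject sk_from_fixnum(intptr_t n)
{
    return ((uintptr_t)n << SK_TAG_BITS) | SK_TAG_FIXNUM;
}

intptr_t sk_to_fixnum(SkObject o)
{
    assert((o & SK_TAG_MASK) == SK_TAG_FIXNUM);
    /* Clear the tag, then divide; the unsigned-to-signed conversion is
     * assumed to be two's complement, as on all common platforms. */
    return (intptr_t)(o & ~(uintptr_t)SK_TAG_MASK) / (1 << SK_TAG_BITS);
}

SkObject sk_from_pointer(void *p)
{
    assert(((uintptr_t)p & SK_TAG_MASK) == 0);   /* aligned, so tag is 0 */
    return (uintptr_t)p;
}

const SkClass *sk_class_of(SkObject o)
{
    unsigned tag = (unsigned)(o & SK_TAG_MASK);
    if (tag == SK_TAG_POINTER)
        return ((const SkHeader *)o)->class_of;  /* read the heap header */
    return sk_immediate_classes[tag];            /* immediate: table lookup */
}
```

In this sketch a fixnum never touches the heap, while a cons cell is found through an ordinary C pointer whose first field names its class.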
The header file defines a macro to access the class metaobject of all
the built-in clObjects, including structure and instance
objects. The header file and the Eclipse library define a number of
macros and functions for creating different kinds of
clObjects from corresponding C data, and for accessing
the internal C data from different kinds of clObjects.3
To aid in linting, and to shield code from changes in clObject
implementation, the header file defines an assignment macro,
clSetq(place, value), which usually expands into ((place) = (value)).
2.2.2: Functions
Eclipse compiles each Lisp function to a corresponding C function. The
generated C function uses the C variable argument mechanism
(varargs/stdarg) to accept clObjects as arguments. The arguments are
exactly the same as they are in Lisp, except that an additional
argument, the ``symbol'' clEOA, is appended as an End-Of-Arguments
marker. It is used by the function during argument parsing. Eclipse
COMPILE-FILE automatically adds this clEOA marker to Lisp function
calls in generated code. This use of clEOA is less error prone in
hand-written/modified C code than insisting that an explicit argument
count be provided. Some Lisp implementations pass data on a special
Lisp data stack. Eclipse programs pass arguments as ordinary C data.
No extra arguments are needed in Eclipse to represent the Lisp function's defining environment. C programmers can call any Lisp function without needing to know if or how the function refers to an enclosing environment. Eclipse handles this automatically as follows.
Eclipse represents Lisp functions at run-time as closure clObjects
that contain the code to be executed (i.e., a C function pointer) and
the ``closed-over'' environment. The closure environment is defined by
Eclipse as a vector of those variable clBindings (addresses of
clObjects) that were defined in the function's enclosing lexical
environment and used within the inner function code. For many
functions, the environment is empty; that is, there are no lexical
variables used within the function that are defined outside of
it. When Eclipse creates a closure clObject, it fills the closure's
environment with any necessary bindings.
In general, Eclipse uses ordinary C variables to represent local Lisp
variables. Eclipse COMPILE-FILE uses the same variable name and scope
in the generated C code as was present in the original Lisp
source. Eclipse declares these variables as being of type
clObject. Eclipse uses the address of the C variable as the clBinding
of a closed-over Lisp variable with dynamic extent. However, for a
closed-over Lisp variable with indefinite extent, Eclipse generates
code that heap allocates a clBinding. clBindings are shared by all
closures over them, but environments are not.
For each function defined in a non-empty environment, Eclipse
COMPILE-FILE generates two ``environment hooks'' that point to a
closure's environment. One hook is defined statically, outside the C
function, and the other is a local variable within the C function
definition. The static environment hook is initialized by Eclipse when
the closure clObject is created. Generated code initializes the local
environment hook from the static hook immediately upon entering the
function on each call. All references to enclosing variables from
within the inner generated function use this local environment hook to
access the clBinding. Using a cached local variable allows for
reentrant calls to identical code that is closed over different
bindings.
For ``top-level'' functions that only create closures once, Eclipse
initializes the static hook once and it is never changed. For
arbitrary Lisp closure objects created at run time, it is necessary to
call such functions through their closure objects using
FUNCALL or APPLY.4 FUNCALL and APPLY set the environment hook if
necessary before calling the implementing C function. The address of
the static environment hook is stored by Eclipse in the closure
clObject.
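The interplay of closure objects, binding vectors, and the two environment hooks can be sketched as follows. All names (SkClosure, sk_funcall(), sk_add_to_counter(), and so on) are hypothetical, and the functions take a single fixed argument instead of the variable-argument convention to keep the example short. The code stands in for what might be generated for something like (lambda (x) (incf counter x)), where COUNTER is a closed-over variable.

```c
#include <assert.h>
#include <stddef.h>

/* All names are hypothetical stand-ins, not Eclipse definitions. */
typedef long SkObject;                   /* stand-in for clObject */
typedef SkObject *SkBinding;             /* a binding is an object's address */
typedef SkObject (*SkCode)(SkObject arg);

typedef struct SkClosure {
    SkCode code;                         /* the implementing C function */
    SkBinding *env;                      /* vector of closed-over bindings */
    SkBinding **hook;                    /* address of the static hook */
} SkClosure;

/* Static environment hook for the function below. */
static SkBinding *sk_add_to_counter_env;

/* Body of the closure: adds x to the closed-over counter. */
SkObject sk_add_to_counter(SkObject x)
{
    /* Local hook, cached on entry so that later changes to the static
     * hook (for another closure over the same code) cannot affect it. */
    SkBinding *env = sk_add_to_counter_env;

    *env[0] += x;
    return *env[0];
}

/* Calls a closure: sets the static hook if there is one, then calls
 * the implementing C function. */
SkObject sk_funcall(SkClosure *closure, SkObject arg)
{
    assert(closure != NULL && closure->code != NULL);
    if (closure->hook != NULL)
        *closure->hook = closure->env;
    return closure->code(arg);
}
```

Two closures that share sk_add_to_counter() but carry different binding vectors update different counters, because each call through sk_funcall() installs that closure's environment before the code runs.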
2.2.3: Multiple Values
Eclipse defines each generated C function to return the ``primary'' Lisp
value as a clObject value. Eclipse also defines a globally known
pointer to a buffer of multiple clObject values. Some functions just
return the values returned by other functions (i.e., are tail
calls). However, if a function returns a single value (e.g., the value
of a variable), then a macro from ``eclipse.h'' must be used to indicate
in the multiple values buffer that only one value is returned. The
function clValues() can also be used to fill the multiple value buffer
with zero or more values. Receiving multiple values is accomplished by
using a macro from ``eclipse.h'' that introduces a new multiple values
buffer as a local (automatic, stack) C variable. The macro stores the
location of this new buffer in the globally known pointer.
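A minimal sketch of this protocol follows. The names SkValues, sk_current_values, SK_RETURN_ONE, and SK_RECEIVE_VALUES are hypothetical and do not come from ``eclipse.h''; the explicit save and restore of the global pointer around the receiving code is also an assumption of the sketch, since the text does not say how the previous buffer is reinstated.

```c
#include <assert.h>
#include <stddef.h>

/* All names are hypothetical stand-ins, not Eclipse definitions. */
typedef long SkObject;
enum { SK_VALUES_LIMIT = 8 };

typedef struct SkValues {
    size_t count;
    SkObject values[SK_VALUES_LIMIT];
} SkValues;

static SkValues sk_default_values;
SkValues *sk_current_values = &sk_default_values;  /* globally known pointer */

/* Return exactly one value and record that fact in the buffer. */
#define SK_RETURN_ONE(v)                          \
    do {                                          \
        SkObject sk_v = (v);                      \
        sk_current_values->count = 1;             \
        sk_current_values->values[0] = sk_v;      \
        return sk_v;                              \
    } while (0)

/* Introduce a local buffer and make it the current one. */
#define SK_RECEIVE_VALUES(name)                   \
    SkValues name = { 0, { 0 } };                 \
    sk_current_values = &name

/* Like (floor n d): the primary value is the C return value, and both
 * values are placed in the current buffer. */
SkObject sk_floor(SkObject n, SkObject d)
{
    SkObject q = n / d, r = n % d;
    if (r != 0 && ((r < 0) != (d < 0))) {  /* round toward minus infinity */
        q -= 1;
        r += d;
    }
    sk_current_values->count = 2;
    sk_current_values->values[0] = q;
    sk_current_values->values[1] = r;
    return q;
}

/* Receives both values of sk_floor() and returns only the remainder. */
SkObject sk_floor_remainder(SkObject n, SkObject d)
{
    SkValues *saved = sk_current_values;
    SK_RECEIVE_VALUES(received);
    (void)sk_floor(n, d);
    sk_current_values = saved;             /* reinstate the caller's buffer */
    assert(received.count == 2);
    SK_RETURN_ONE(received.values[1]);
}
```

A caller that only wants the primary value can ignore the buffer entirely and use the ordinary C return value, which is what makes the scheme cheap for the common single-value case.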
2.2.4: Dynamic Environment
Eclipse uses a Lisp-specific control stack to keep dynamic environment
information such as dynamic bindings, active cleanups, and exits such
as catchers and closed over blocks/labels. The elements of this stack
are pointers to data identifying the kind of information.
SYMBOL-VALUE to a new value.
setjmp()/longjmp(). Eclipse initializes a C jmp_buf and caches the state of the multiple values machinery.
Eclipse defines macros in ``eclipse.h'' for using the control stack to
establish dynamic bindings, blocks, tagbodies, catchers, and
cleanups. The header file also defines macros for non-local transfers
such as RETURN-FROM, GO, and THROW that unwind the control stack as
necessary. Besides using the appropriate longjmp() machinery, these
transfers take care of unwinding dynamic bindings and executing
UNWIND-PROTECT cleanup forms.
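The following sketch shows how a control stack of this general kind can combine dynamic bindings with setjmp()/longjmp() catchers. It is not Eclipse's implementation: the names (SkFrame, sk_bind(), sk_throw(), and so on) are hypothetical, catch tags are plain integers, and UNWIND-PROTECT cleanups and the caching of the multiple values state are left out for brevity.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>
#include <stdlib.h>

/* All names are hypothetical stand-ins, not Eclipse definitions. */
typedef long SkObject;
typedef enum { SK_BINDING, SK_CATCHER } SkFrameKind;

/* One control-stack entry; a single struct serves both kinds here. */
typedef struct SkFrame {
    SkFrameKind kind;
    struct SkFrame *next;          /* next older entry */
    SkObject *place;               /* SK_BINDING: the rebound variable */
    SkObject saved;                /* SK_BINDING: value to restore */
    int tag;                       /* SK_CATCHER: catch tag */
    jmp_buf target;                /* SK_CATCHER: where to resume */
} SkFrame;

static SkFrame *sk_control_stack = NULL;
static SkObject sk_thrown_value;   /* value carried by a throw */
SkObject sk_demo_special = 1;      /* plays the role of a special variable */

void sk_bind(SkFrame *f, SkObject *place, SkObject value)
{
    f->kind = SK_BINDING;
    f->place = place;
    f->saved = *place;
    *place = value;
    f->next = sk_control_stack;
    sk_control_stack = f;
}

void sk_push_catch(SkFrame *f, int tag)
{
    f->kind = SK_CATCHER;
    f->tag = tag;
    f->next = sk_control_stack;
    sk_control_stack = f;
}

/* Pops entries until `target` is on top, restoring bindings on the way. */
void sk_unwind_to(SkFrame *target)
{
    while (sk_control_stack != target) {
        SkFrame *f = sk_control_stack;
        assert(f != NULL);
        if (f->kind == SK_BINDING)
            *f->place = f->saved;
        sk_control_stack = f->next;
    }
}

void sk_throw(int tag, SkObject value)
{
    for (SkFrame *f = sk_control_stack; f != NULL; f = f->next) {
        if (f->kind == SK_CATCHER && f->tag == tag) {
            sk_unwind_to(f);
            sk_thrown_value = value;
            longjmp(f->target, 1);
        }
    }
    abort();                       /* no matching catcher */
}

static SkObject sk_demo_inner(int do_throw)
{
    SkFrame binding;
    SkObject seen;
    sk_bind(&binding, &sk_demo_special, 2);
    if (do_throw)
        sk_throw(42, sk_demo_special + 10);   /* does not return */
    seen = sk_demo_special;
    sk_unwind_to(binding.next);               /* normal exit: unbind */
    return seen;
}

SkObject sk_demo_catch(int do_throw)
{
    SkFrame catcher;
    SkObject result;
    sk_push_catch(&catcher, 42);
    if (setjmp(catcher.target) == 0) {
        result = sk_demo_inner(do_throw);
        sk_unwind_to(&catcher);
    } else {
        result = sk_thrown_value;             /* arrived via longjmp() */
    }
    sk_control_stack = catcher.next;          /* pop the catcher itself */
    return result;
}
```

Whether sk_demo_inner() returns normally or throws, sk_demo_special is back to its outer value once sk_demo_catch() returns, because the throw walks the control stack and undoes the binding before it jumps. The thrown value travels in a file-scope variable rather than in the catcher's frame so that no automatic variable of the function that called setjmp() is modified between setjmp() and longjmp().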
2.2.5: C Implementations
COMPILE-FILE does not generate platform- or
compiler-specific code. Eclipse abstracts any platform or compiler
dependencies into conditionally defined macros within ``eclipse.h.''
These macros cover such issues as word size, variable argument
mechanism, and function prototypes. This allows the same Eclipse code
to be processed by ANSI/ISO C compilers [Harbison], traditional
(classic/K&R) C compilers [Kernighan], or C++ compilers.
2.3: Memory Management
Eclipse uses a conservative, non-relocating garbage collector,
publicly available from Xerox PARC.[Boehm] In this case, ``conservative''
means that C data, including that held on the C stack or in registers,
are traced by the garbage collector. The system assumes that anything
that looks like it could be a pointer to data is live, and the data
there is not collected. This allows user-written and Eclipse-generated
C code to pass Lisp data around as ordinary C data without any need to
``register'' them by hand with the garbage collector. In addition, the
collector recognizes a pointer that appears to point to within a heap
allocated datum. This allows the collector to work with ``mangled
pointers'' such as those described in Section 2.2.1.
Non-relocating garbage collectors do not move heap allocated data
during collection. This allows Eclipse to implement clObjects as
pointers to data, as opposed to ``indexes,'' ``handles,'' ``pointers to
pointers,'' or other more complex things.
The garbage collector uses incremental/generational collection when supported by the operating system. This means that only a small amount of work is done during each collection, which reduces delays.
The garbage collector used was written for use in arbitrary C/C++ programs, and was not modified for use in Eclipse. It can be used directly by C programmers without using the rest of Eclipse.
The collector defines alternatives to the standard malloc()
utilities. An application that uses malloc() or sbrk() will not work
with Eclipse, but must instead be changed to use GC_malloc(). Care
must be taken when calling certain operating system utilities from C
code, because they sometimes use incompatible malloc-like utilities
internally.
If the provided garbage collector is undesirable for some reason, it
can be replaced with any user-provided, conservative, non-relocating
system with a malloc-like interface.
2.4: Function-Based Interface
The Common Lisp Object System (CLOS) and the semantics of file
loading are two examples of Lisp utilities that have no analog in
C. Eclipse defines functions so that these utilities may be used
within C programs.
2.4.1: CLOS-MOP
Eclipse implements not just the Common Lisp Object System (CLOS), but
its complete MetaObject Protocol (MOP).[Kiczales] The CLOS-MOP defines a
function-based interface for defining, instantiating, and accessing
classes, and for defining and using generic functions and methods. It
is through these MOP functions that Eclipse allows object-oriented
Lisp to be accessed by C, which defines only function-like interfaces
to data.5
2.4.2: Initialization
Unlike C, Lisp source files can contain not only function definitions,
but arbitrary data as compile-file literals, and arbitrary top-level
code that is not part of a function definition. When a program or user
loads a Lisp file into any Lisp implementation, the system creates the
literal data, initializes it, and executes the top level code. C
provides no similar mechanism.
Consider, for example, a file that contains the following function definition:
(DEFUN MY-FUNCTION (A B) (LIST A B))
DEFUN is a macro that essentially expands this function
definition into something like:
(SETF (SYMBOL-FUNCTION 'MY-FUNCTION) (LAMBDA (A B) (LIST A B)))
When this is loaded into any Lisp implementation, the system creates a
function object (a closure) that is stored in the SYMBOL-FUNCTION of
the symbol MY-FUNCTION.
In Eclipse, when COMPILE-FILE generates a C file, it
generates one C function for each Lisp function in the source. It also
generates one extra C function, taking no arguments and returning no
value. This ``initialization function'' executes the ``load-time''
code. Calling this function is semantically equivalent to loading the
corresponding Lisp file. For example, the Eclipse-generated
initialization function for a file containing the previously discussed
DEFUN will intern the symbol MY-FUNCTION and
initialize it with a closure clObject. (See Section 2.2.2.)
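The role of the initialization function can be illustrated with a toy symbol table. Everything here is hypothetical: sk_intern(), sk_my_function(), and sk_my_file_init() are invented for the example, the table stores a bare C function pointer instead of a closure clObject, and the function body adds its arguments instead of building a list, only to keep the sketch short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* All names are hypothetical stand-ins, not Eclipse definitions. */
typedef long SkObject;
typedef SkObject (*SkCode)(SkObject a, SkObject b);

typedef struct SkSymbol {
    const char *name;
    SkCode function;               /* plays the role of SYMBOL-FUNCTION */
} SkSymbol;

enum { SK_MAX_SYMBOLS = 16 };
static SkSymbol sk_symbols[SK_MAX_SYMBOLS];
static size_t sk_symbol_count;

/* Finds the symbol with this name, creating it if necessary. */
SkSymbol *sk_intern(const char *name)
{
    for (size_t i = 0; i < sk_symbol_count; i++)
        if (strcmp(sk_symbols[i].name, name) == 0)
            return &sk_symbols[i];
    assert(sk_symbol_count < SK_MAX_SYMBOLS);
    sk_symbols[sk_symbol_count].name = name;
    sk_symbols[sk_symbol_count].function = NULL;
    return &sk_symbols[sk_symbol_count++];
}

/* Stand-in for the C function generated from a DEFUN in the file. */
SkObject sk_my_function(SkObject a, SkObject b)
{
    return a + b;
}

/* Stand-in for the file's initialization function: no arguments, no
 * return value. It performs the "load-time" work of the file, which here
 * is interning the symbol and installing its function definition. */
void sk_my_file_init(void)
{
    sk_intern("MY-FUNCTION")->function = sk_my_function;
}
```

Until sk_my_file_init() has run, a lookup of MY-FUNCTION through the symbol table finds no definition, which is why initialization functions have to be called before any code that reaches functions through their symbols.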
Before an application calls any ``Lisp functions,'' it must first call the initialization functions for each user-generated file, as well as those for the Eclipse library. For example:
/* Initialize Eclipse system code. */
clInit();    /* Initialize Eclipse run-time library. */
/* The next line is not needed in most applications. */
clInitD();   /* Initialize Eclipse development library. */
/* Initialize user code. */
usrMyFile(); /* An Eclipse-generated initialization function for "my-file.lisp" */
/* Now Lisp can be used. */
clPrint(clEval(clRead(clEOA), clEOA), clEOA);
...
Execution of the Eclipse-generated initialization function ensures that:
clInit() and clInitD(), followed by a call to the read-eval-print loop.