The Digamma Blog: 2009

Thursday, December 10, 2009

Networking & Enyo improvements

Networking

Vesta's networking core is now complete (under c/posix.c for those with source access). This includes a "raw ip pack" function, to build up IP datagrams (as well as higher level protocols). One thing I'm working on now is an implementation of RUDP, for use in client libraries.

Enyo

I've been working on Enyo quite a bite lately, and it now is a relatively nice system for PreDigamma development. PreDigamma is similar to PreScheme, but with GC (since nowadays it's perfectly acceptable for a "lower than application but higher than OS" level language). I was thinking about rewriting the GC in PreDigamma, and compiling bootstrapping the runtime from C, but that's mostly just notes & tests right now.

I'm getting close to the point where many other people can have access to the source & I can move to a Caerwyn-style FPN (Digamma Programmer's Notebook).

Monday, November 23, 2009

Format, define-syntax (rules)

Just a few updates:

Format

The format form:

(format specification-string rest-list)

Now works for all internal types, including all collexions. Formatting of tries, vectors, strings, and lists is supported. There are (generally) two types of formatting for each object: display form and write form; the two types correspond to their top-level forms. The switch is case: lower case is display, upper is write. So, "~t" displays a t(able | rie) as display would, whereas "~T" displays it as write would.

Syntax rules

I've fixed pattern matching in define-syntax so that you can use the usual forms:

(let ((name . val) ...) . body)

Will expand exactly how you expect (unlike before...).

More to come
I've started to bring Ceres & Enyo up to where Vesta is currently; even though Vesta is almost the same speed as TinyScheme, this isn't really fast enough for many applications. Ceres is a SECD-like VM, and Enyo has some neat tricks in it. Hopefully they can be made production ready before year's end.

Tuesday, November 3, 2009

Trie functions

A quick update: tries are a fundamental data structure in Vesta, and used as the backing store to dictionaries. Since tries have many useful properties, it would be great to expose these properties to the "user". I just committed the partial-key? and keys forms to the core foundation. partial-key? returns #t if 100% of the string is within the dictionary, #f if none is and N (which is an integer) for a matching subset. keys on the other hand, does what it sounds like: it returns all keys within a dictionary structure. Now, on top of the set/bag library, I'll also include a simple semantic processing library with the default build.

Friday, October 30, 2009

Performance update

Performance is now around TinyScheme levels (slightly above in some cases, slightly below in others); that's not really saying much, since TinyScheme is quite slow, but it's better than where Vesta used to be. I'll focus on cleaning up everything (but core evaluator & FFI take top priority), as well as performance. Hopefully I'll be able to get better performance in Vesta before I work on Enyo & Ceres.

Thursday, October 15, 2009

Semantic Overview

So, I was discussing a project with lisp-qix, and promised to give a quick write up of Digamma's semantics. So, let's dive right in.

Data types
Digamma hosts the normal set of scheme data types: vectors, lists, atoms, numbers (integers, rationals, (double) reals, & complex), strings, &c. The only additions to those data types are environments and dictionaries.

Environments are to be used for security in functions such as eval: if you don't know where the code you're using comes from, you may wish to evaluate it in a top-level environment that doesn't have I/O, or has specially restricted I/O functions; Environments allow you define & control evaluation programatically, all from a standard program. To do this, Digamma has special forms that interact with environments, such as define-environment, set-environment!, get-environment, clone-environment, &c. eval and load can both take environments as optional arguments, but you can use the with-environment form to evaluate code in a restricted place. This code is the basis for a W7 environment within Digamma, to be used for security-related code in end-user programs; basically, user code can be sandboxed off from everything else in the system (including, technically, itself: remove def, cset!, and set! and the environment could not even modify itself).

Dictionaries are similar to what is found in most contemporary programming languages:

{ test: "this" keys: "work }

The basic "production" is:

'{' ((STRING | KEYWORD | SYMBOL) SEXPR)* '}'

The reason that I've restricted the storage index to character strings is because I'm using tries as the back end data storage mechanism, due to my initial use of Digamma as a semantic analysis platform. Dictionaries have one primitive form that other collexions do not (explained below): has-key?, which simply tells if the key exists within the dictionary. There is one other primitive that I'd like to add which is partial-key?, which would return #t if all characters within the index match, #f if none match and, N (where N is a number) if there is a partial match.

Collexions
One issue that always irked me about Lisp systems is that while there is type unity (all productions are S-Expressions after all), there is no such thing as a collexion, which have common accesors. Now, to be fair, certain Lisps do fix this issue; Gauche has a collections library. However, I think this should be specified in the base system, since it aids in selecting the correct data structure for the task, whilst still allowing room to expand. For instance, you might use an a-list initially, then switch to a dictionary later, for speed or memory issues. To aid in this endeavor, Digamma has a core set of primitive forms that operate on all collexion types (this is another reason why I chose tries: since they have a total ordering, and can easily play the role of "associative arrays" in a map, filtre, &c.). These forms are:

nth, which is an accessor (nth collexion index)
first, which grabs the first element of a collexion, ala car
rest, which returns everything other than the first, in an non-destructive manner
cset!, which destructively set!s a collexion member
cupdate, which returns a shared structure with the collexion index modified

These work on all sequences, and have only minor restrictions for the sequence type (i.e. anything other than a char or an integer cannot be inserted into a string, for instance).

Closures
One thing I've said nothing about yet, but that is totally important to Scheme, is closures. First, closures in Digamma are written fn, but pronounced lambda. Second, they are quite a bit different than their Scheme cousins. First, the pathological:

((fn (x) (+ x x)) 3)

Returns '6', as you might have expected. Let's try something different:

(def y (fn (:opt (x 34))) (+ x x)))
(y) ; returns 68
(y 3) ; returns 6

That :opt keyword is inserted into the middle, because :body and :rest arguments can have optionally specified default values as well. It also includes a case variant (ala case-lambda), a SRFI-89 style lambda, and a typed generic lambda.

(def f (fn :case (0 => 1) (1 => 1) (n => (+ (f (- n 1)) (f (- n 2))))))

The match is Hope-style: changing the order would not effect the result, as the matcher attempts to match the most specific case.

Wrap Up
I'll write future posts on the other parts of Digamma (unix/posix integration, networking, FFI, modules, &c.), but this is a decent overview of the semantic changes to Scheme that I've made for Digamma.

Thursday, October 8, 2009

Cleaning House

So, I'm cleaning up & refactoring the "internals" of Digamma's reference implementation, Vesta. There are two evaluators now, one of which is "stackless". Continuations & tasklets capture the evaluation stack & can "replay" it at later points, but it's rather messy, and performance is behind tinyscheme, which is quite a trick, considering that tinyscheme is one of the slowest scheme systems out there. The stackless version is much faster than the C stack version, and closer in speed to tinyscheme, but that still leaves one order of magnitude between Vesta & established systems. I'll close the gap in the upcoming days with a refactor of the stackless core, and then it's just a spruce up of the syntax-rules pattern matcher and the FFI that is holding Vesta back from being released. By the end of the month, I hope to be pushing out labs that you can run at home...

The Digamma Blog