Savvy (architecture)

This manual is a pre-release preview. Please follow Moonstalk on Twitter or Facebook to be notified of a public release.

Conception

Moonstalk is not ‘inspired’ by any other particular framework; instead, its inception was driven by the desire for clean and flexible markup in views, transparent database handling, and a preference for using RAM and avoiding disk access as much as possible.

Its methodology does build upon the convention-over-configuration approach, extensively using tables from which values may be pulled (including the database, e.g. <title>?(db.record.title)</title>), and in which behaviours can be modified from their defaults (e.g. page.status=404). Further, it avoids object-oriented programming (just as Lua does out of the box), having neither classes nor :method() calls (with some minor exceptions), instead favouring function calls that pass tables with typically optional keys (e.g. Function{arg1=value,arg2=value}).
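The table-passing convention described above can be sketched in plain Lua. The function name Link and its keys are hypothetical, chosen only for illustration; they are not part of Moonstalk's API.

```lua
-- Hypothetical function illustrating the table-argument convention;
-- the name and keys are assumptions, not Moonstalk's API.
local function Link(args)
  args = args or {}
  local url = args.url or "/"        -- optional key with a default
  local label = args.label or url    -- falls back to the URL
  local class = args.class and (' class="' .. args.class .. '"') or ""
  return '<a href="' .. url .. '"' .. class .. '>' .. label .. '</a>'
end

-- Lua allows the parentheses to be dropped when the sole argument is a table.
print(Link{url="/about", label="About"})  --> <a href="/about">About</a>
print(Link{})                             --> <a href="/">/</a>
```

Because every key is optional, new keys can be added later without breaking existing callers.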

Site and application bundles (folders with standardised file naming conventions) are used to simplify file management. Applications extend the core with functionality for specific usage scenarios, so that the core avoids including by default any functionality that is not optimal for every scenario.

Components

Development Framework

Function

Enable the development and programming of sites and applications to serve dynamic web content, composed of both discrete and bundled layouts (design), logic (function) and assets, with an extensible set of functions.

Design

Model

Domain-specific (site) and cross-domain (application) bundles, comprising functions for acting upon and generating responses, invoked and configured through addresses and hook functions. Content–logic isolation is employed with a faceted view–controller model that supports templating, wherein responses are the merger of one or more sections.
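As a rough illustration of content–logic isolation, the sketch below merges values prepared by logic into a view containing ?(…) placeholders of the kind shown under Conception. It assumes a placeholder holds only a dotted key path; the function names lookup and render are hypothetical, and this is not Moonstalk's actual template parser.

```lua
-- Resolve a dotted path such as "db.record.title" within a table.
local function lookup(env, path)
  local value = env
  for key in path:gmatch("[^.]+") do
    if type(value) ~= "table" then return nil end
    value = value[key]
  end
  return value
end

-- Replace each ?(path) placeholder in the view with its value,
-- or with an empty string when the path does not resolve.
local function render(view, env)
  return (view:gsub("%?%(([%w_.]+)%)", function(path)
    local value = lookup(env, path)
    return value ~= nil and tostring(value) or ""
  end))
end

-- The controller's role is stood in for by this table of values.
local env = {db={record={title="Hello"}}}
print(render("<title>?(db.record.title)</title>", env))  --> <title>Hello</title>
```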

Optimisation

All functions and content are loaded once and cached in RAM, but may be reloaded by sequentially restarting multiple backends for zero-downtime upgrades and updates. In developer mode, existing views and controllers are refreshed with each request to facilitate development.
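The load-once behaviour and its developer-mode bypass might be sketched as follows. The names new_cache, loader and developer are illustrative assumptions, and the loader here merely simulates reading and compiling a view.

```lua
-- Returns a fetch function that caches loaded items in RAM.
-- In developer mode the cache is bypassed so edits show immediately.
local function new_cache(loader, developer)
  local cache = {}
  return function(name)
    local item = cache[name]
    if item == nil or developer then
      item = loader(name)  -- expensive step, e.g. read from disk and compile
      cache[name] = item
    end
    return item
  end
end

local loads = 0
local function loader(name)
  loads = loads + 1
  return "compiled:" .. name
end

local production = new_cache(loader, false)
production("home")
production("home")
print(loads)  --> 1 (loaded once, then served from RAM)

local development = new_cache(loader, true)
development("home")
development("home")
print(loads)  --> 3 (reloaded on every request)
```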

Visitor analytics (logging) is considered best handled by third-party services using JavaScript, therefore Moonstalk is typically run with minimal logging for security auditing, performance optimisation and troubleshooting purposes only. If detailed logging is required, it may be enabled in the web server.

Performance

Static and dynamic pages without database queries typically execute in about 1 ms. Pages with database queries typically execute in 1–5 ms.

Database System

Function

Provides data storage (persistence) and retrieval functions to Moonstalk applications and sites. Implemented as an optional [application] module.

Design

Model

The database is a server daemon functioning as a shared Lua environment, operating for local clients (Scribe page backends) over UNIX Domain Sockets. Persisted data is collated and written to an SQLite database on a dedicated thread. Additionally, in a cluster configuration where replication is enabled, the database will maintain network connections with its peer nodes for synchronisation.

Optimisation

The database is intended for the retrieval of tables containing multiple name-value pairs in a single transaction, as opposed to making multiple transactions for keys holding individual values, thus reducing overhead. Results may be returned either as documents (tables) or as aggregated results from Map-Reduce style function operations.
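A Map-Reduce style aggregation over a Lua table might look like the sketch below. The data layout and the map_reduce helper are assumptions made for illustration; this is not the Teller query API.

```lua
-- Illustrative records; the table layout is an assumption.
local posts = {
  {author="ann", words=120},
  {author="bob", words=300},
  {author="ann", words=80},
}

-- Map each record to a key and value, then reduce the values per key.
local function map_reduce(records, map, reduce)
  local results = {}
  for _, record in ipairs(records) do
    local key, value = map(record)
    if key ~= nil then
      results[key] = reduce(results[key], value)
    end
  end
  return results
end

-- Total word count per author, returned as one aggregated table.
local totals = map_reduce(posts,
  function(post) return post.author, post.words end,
  function(sum, words) return (sum or 0) + words end)

print(totals.ann, totals.bob)
```

The caller receives a single table of results rather than making one transaction per record.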

The database server keeps a dedicated connection open for each backend, executing queries sequentially on its main thread, whilst using separate threads for persistence and replication. Queries are thus not delayed by disk writes or network issues resulting from the execution of preceding queries.

The persistence and replication functions batch changes together at a configurable interval to avoid resource (i.e. disk or network) saturation, and the overhead of many small transactions with commonly-changed values (e.g. counters). Typically the interval is low enough that in the case of a crash or unexpected power-off event, only a few seconds of non-critical data could ever be lost.
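The batching described above can be sketched as write coalescing: only the latest value for each key is kept until the next flush. All names here are hypothetical, values are assumed to be non-nil, and the written table merely stands in for the disk or network destination; in practice flush would be called at the configured interval.

```lua
local pending, pending_count = {}, 0
local written = {}  -- stands in for the disk or network destination

-- Record a change; repeated changes to one key overwrite each other.
local function set(key, value)
  if pending[key] == nil then pending_count = pending_count + 1 end
  pending[key] = value
end

-- Write out all pending changes as one batch and return its size.
local function flush()
  local batch = pending_count
  for key, value in pairs(pending) do
    written[key] = value  -- one write per changed key, not per change
  end
  pending, pending_count = {}, 0
  return batch
end

-- A counter changed 1,000 times between flushes results in a single write.
for i = 1, 1000 do set("counters.visits", i) end
set("users.1.name", "Ann")
print(flush())  --> 2
```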

--TODO: Critical writes can be saved synchronously (immediately) and without delaying pending queries, through the use of a spawned thread that exits once either persistence or replication has completed. Likewise critical reads, except that they exit once a required number of nodes have responded (e.g. the first). Note however that the response to critical queries may be delayed by subsequent queries (synchronous responses are not supported).

Procedures are native Lua functions that execute on the main thread and thus block all subsequent queries. Procedures should therefore define structural actions that execute swiftly, unless they spawn a thread to complete their task (in which case no return value is possible), or utilise the functionality provided by the Tasks application to provide a result via polling.

Indexes allow for faster alternate database table lookups but introduce additional complexity in the maintenance of table structures and ownership by applications. To reduce persistence tasks, indexes are not saved but are [re]built on startup, and should be maintained on a per-query basis using delegated event handlers. See the Teller section for details.
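The sketch below shows the general idea of an index that is built at startup and then kept in step by a handler. The users table, the index name and the handler are hypothetical; the actual mechanism is covered in the Teller section.

```lua
-- Hypothetical primary table keyed by id, with an index keyed by email.
local users = {
  [1] = {email="ann@example.com"},
  [2] = {email="bob@example.com"},
}
local users_by_email = {}

-- Rebuild the index from the primary table, as would happen at startup.
local function build_index()
  users_by_email = {}
  for id, user in pairs(users) do
    users_by_email[user.email] = id
  end
end

-- Handler to run with any query that changes a user's email,
-- so the index never points at a stale address.
local function on_email_change(id, new_email)
  local user = users[id]
  users_by_email[user.email] = nil
  user.email = new_email
  users_by_email[new_email] = id
end

build_index()
on_email_change(2, "robert@example.com")
print(users_by_email["robert@example.com"], users_by_email["bob@example.com"])
```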

Saved data remains accessible to other tools because it is stored in the open SQLite database file format. Individual values are saved under a unique key that combines their table and key namespace, with flags to preserve data types. This data is read into the database's Lua environment upon starting.
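One way to picture this storage scheme is to flatten a nested table into rows of key, type flag and value. The dot separator and the flag letters below are assumptions rather than the real on-disk format, and the sketch stops short of writing to SQLite. It also does not preserve the type of the keys themselves, which a real implementation would need to handle.

```lua
-- Assumed single-letter flags for the value types handled here.
local flags = {string="s", number="n", boolean="b"}

-- Flatten a nested table into rows whose key is the dot-joined path.
local function flatten(tbl, prefix, rows)
  rows = rows or {}
  for key, value in pairs(tbl) do
    local path = prefix and (prefix .. "." .. tostring(key)) or tostring(key)
    if type(value) == "table" then
      flatten(value, path, rows)
    else
      rows[#rows + 1] = {key=path, flag=flags[type(value)], value=tostring(value)}
    end
  end
  return rows
end

local rows = flatten{users={[1]={name="Ann", visits=3, admin=true}}}
table.sort(rows, function(a, b) return a.key < b.key end)
for _, row in ipairs(rows) do
  print(row.key, row.flag, row.value)
end
```

On startup the reverse would apply: each row's flag tells the loader how to convert the stored text back into a Lua value.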

Performance

Transaction times depend upon three factors: the size of the response, the complexity of the query (negligible for most fetch lookups), and the queue of pending queries.

This does not consider outstanding requests being queued for distribution amongst page backends by the web server, when under high traffic.

For a single fetch transaction, inclusive of overheads, a nil or simple chunk of text/HTML typically returns in 150 microseconds (0.15 milliseconds), whilst a table containing a few dozen nested keys returns in 200–300 microseconds. Lookups for non-numeric keys, within nested tables, and within very large tables introduce a little additional overhead.

This results in peak performance of 15,000 operations per second per gigahertz (i.e. per dedicated 'CPU' second on a single CPU core). Page backends carrying out this many operations would probably use at least the same amount of CPU time, so you would get half this performance per core unless the database has a dedicated core. Realistically, a page request would never make more than a few database requests, and currently 10,000+ requests per second is perfectly feasible. Performance for complex queries (functions) obviously varies according to the scope of the data. For higher-traffic sites a multi-core 2 GHz+ processor is essential.
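As a worked example of the figures above, the following estimates peak database operations per second for a given clock speed. The function name and the 2 GHz input are illustrative assumptions, and the result is a rough ceiling rather than a benchmark.

```lua
local OPS_PER_GHZ = 15000  -- peak figure quoted above

-- Estimate peak operations per second for one core. When the core is
-- shared with page backends, the text suggests halving the figure.
local function peak_ops(clock_ghz, dedicated_core)
  local ops = OPS_PER_GHZ * clock_ghz
  if not dedicated_core then ops = ops / 2 end
  return ops
end

print(string.format("%d ops/s", peak_ops(2, true)))   --> 30000 ops/s
print(string.format("%d ops/s", peak_ops(2, false)))  --> 15000 ops/s
```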

RAM requirements are approximately double that of disk storage requirements, i.e. for 5 MB of on-disk data (about 40,000 records of 200 characters, or 6,000 blog posts), the database will require 10 MB of RAM. Bear in mind that the non-persistent page cache functionality will require additional memory for every non-authenticated URL's HTML content.

Hosting Backends

Function

Handle HTTP requests for dynamic and static website content, across multiple sites (domains), and shared applications.

Design

Model

A separate web server manages HTTP connections and serves static content directly, whilst distributing requests for dynamic content amongst separate backend daemons running as FastCGI processes. These processes run pre-configured with applications and sites (using the framework), and maintain an always-open connection to a database.

Optimisation

Moonstalk's focus is on serving web applications. Static content is considered best served from a CDN, not from the same server as a webapp, so static and dynamic content are considered separately. However, Moonstalk configures the webserver to serve both, and thus can be used for low-use static assets and CDN origin-pull without unduly reducing capacity for serving web applications.

Additionally, this approach enables the webserver to accept long-running uploads without blocking the FastCGI backends, only invoking a backend upon receiving the completed upload. It further allows for fine-grained security (i.e. rejecting requests before they reach the backend).

In using multiple separate FastCGI processes, the number of processes is tuned to make maximum use of the available CPU cores, or indeed to restrict use to a subset of cores such that capacity can be reserved for other processes such as a database.

The processes preload and cache the sites (settings, views, controllers and applications), compiling functions and keeping everything in RAM, thus vastly reducing processing times as compared to loading from disk with every request.

Performance

With both Scribe backends and the Teller database running a typical web application, 165 requests per second can be handled per gigahertz (i.e. per dedicated 'CPU' second on a single CPU core). On a quad-core 2.5 GHz CPU, the engines can thus serve 4,500 requests per second.

It's not yet apparent whether there are scheduling (inter-process) bottlenecks that, without further optimisation, might prevent increased capacity on faster hardware.