Model 204
A Novel DBMS and Application Platform

Model 204 is a unique application platform that integrates a fourth-generation language (4GL), specifically designed for writing database applications, with a high-performance DBMS engine. Model 204 delivers unparalleled performance for large-scale applications that process enormous databases containing complex structures. Model 204 is firmly entrenched in the Very Large Database (VLDB) niche of the IBM mainframe market.

Add-on products from Sirius Software build upon the solid foundation of Model 204 to significantly enhance its performance and usability. The Janus suite of products places Model 204 at the leading edge of web services and service-oriented architectures. Janus SOAP in particular extends the benefits of object orientation to User Language, the 4GL supported by Model 204. Other Sirius products facilitate real-time application maintenance and deployment, including the complex issue of post-hoc debugging for distributed applications.

The strengths of Model 204 flow from its extraordinary initial problem statement, its clever implementation, and its constant enhancement since. The concept of Model 204 was first articulated in an unsolicited proposal to the Director of the National Security Agency on 13 October 1965. Quoting from the abstract:

Experimental Implementation of a Software System Based on Relational Structures

"Relational structures (described herein) are a novel form of software memory organization suitable for use in large-scale computer-based systems. The particular form of organization allows for the efficient storage of data-bases exhibiting arbitrarily complex interrelationships between the data-elements. Retrieval is achieved without the necessity of searching lists. Thus, the present approach avoids the frequently-encountered difficulty that processing time increases enormously as the data-base grows in size.

Since the approach is suited to problems requiring fast retrieval from large data-bases having complex structure, it may find particular application in natural-language processing (e.g., mechanical translation), fact retrieval, and deductive question-answering systems".

Several things about the proposal are remarkable. The problem statement essentially describes what could be called a "fifth generation" or "post-relational" DBMS. This statement is valid today as the basis for a forward-looking business plan, yet it was executed almost 40 years ago! The authors presciently anticipated the scaling issues that plague traditional DBMS implementations to this day, and they envisioned an approach to solving these scalability problems while still supporting "arbitrarily complex interrelationships between the data-elements", the holy grail of the so-called "Universal Database".

The notion of "software memory organization" was meant to encompass not just the database file structures that allowed for the efficient storage of data exhibiting arbitrarily complex interrelationships, but also the programming language and requisite run-time environment to support the efficient processing of the data. The proposal was for a complete, self-contained application platform with extraordinary data processing abilities. In line with the thinking of the time, the authors imagined a system that would enable end-users to process data without the assistance of professional programmers. Accordingly, they proposed a high-level, English-like processing language (User Language). To support deductive, or inference, processing, they planned to incorporate facilities for processing the list structures found in LISP, a language popular with Artificial Intelligence researchers.

Model 204 embodies six unique attributes that differentiate it from other database management systems and application platforms. Taken together they confer a significant advantage in supporting large-scale complex applications.

  1. The Model 204 data model is far more flexible than the simple tabular format supported by the relational data model. Model 204 records can directly store complex tree-structured objects, without resorting to costly mapping schemes. Model 204 is the only commercial implementation of the Entity-Attribute data model. A Model 204 database file is structured into records, where a record is a container that can hold arbitrary combinations of field occurrences. Field occurrences are logically of the form field name = value. A file can contain several thousand field definitions. Order of field occurrences is maintained within records.
  2. Model 204 provides unequaled indexing technology. Any number of fields in a file can be indexed, using either a hashed structure or a compressed B-tree. In fact, Model 204 pioneered the technique of indexing via hash encoding. Because the database indexing structures were specifically designed to support complex Boolean retrievals, as opposed to single key searches, Model 204 can efficiently perform extremely complex Booleans (including NOT terms) against huge databases. The efficiency advantage comes primarily from not scanning lists. The performance of searching and updating Model 204 indices is essentially flat across an extremely large range of file sizes.
  3. Model 204 integrates a fourth-generation language (4GL), specifically designed for writing database applications, with a high-performance DBMS engine. User Language provides elegant operators, like FOR EACH OCCURRENCE, to leverage the flexible data structures supported by the Model 204 Entity-Attribute data model. The tight integration of User Language with the Model 204 DBMS engine dramatically simplifies application development, significantly improving programmer productivity. At the same time, Model 204 applications coded in User Language exhibit instruction path lengths similar to those of IMS COBOL-DL/I applications, providing extreme scalability. Model 204 provides the only Rapid Application Development environment with proven enterprise-level scalability.
  4. Model 204 provides a unique processing paradigm whereby programs can directly manipulate lists of records (PLACE, REMOVE, the ON clause for FIND, etc.). These are the underpinnings of the inference processing facilities that were part of the original problem statement. Because these statements use the same low-level structures as the Model 204 FIND statement, they are extremely efficient. The closest analogy to this construct in a relational system is temporary tables, which are not nearly as flexible or efficient. List processing radically simplifies certain operations like cross-tabbing and the types of queries used in data mining and OLAP applications.
  5. Model 204 exhibits superb memory management. This is largely because the first versions of Model 204 ran on machines with pitifully small amounts of storage. However, tight memory management was also vital to achieving the original scalability objectives. A side effect of the memory management, when coupled with the integration of User Language and the Model 204 DBMS engine, is that Model 204 can support very large numbers of simultaneous in-flight transactions. This is an important quality for web-based applications that need to support extremely large user communities. Memory management is typically one of the most limiting factors for application servers and web servers.
  6. Model 204 is coded in assembly language, as opposed to a high-level language. This simple fact gives Model 204 a significant performance advantage over its rivals. Given the amount of string manipulation and bit-twiddling involved in a DBMS engine, a "C" implementation of Model 204 would probably require anywhere from two to four times as much CPU as the current assembly language version.
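
To make attributes 1, 2 and 4 more concrete, the following Python sketch models a file whose records are ordered collections of field name = value occurrences, an index that answers a Boolean retrieval without scanning the records, and a record list that a program can add to and remove from. It is only an illustration under stated assumptions: it is not User Language, the names ToyFile, store, find, occurrences and work_list are hypothetical, and Python sets stand in for whatever index and list structures Model 204 actually uses internally.

```python
from collections import defaultdict


class ToyFile:
    """Hypothetical stand-in for a file of entity-attribute records."""

    def __init__(self):
        # Record number -> ordered list of (field name, value) occurrences.
        self.records = []
        # (field name, value) -> set of record numbers holding that pair.
        self.index = defaultdict(set)

    def store(self, occurrences):
        """Store a record and index every field occurrence it contains."""
        occs = list(occurrences)
        recno = len(self.records)
        self.records.append(occs)
        for pair in occs:
            self.index[pair].add(recno)
        return recno

    def find(self, field, value):
        """Return the record numbers where field = value, using only the index."""
        return set(self.index.get((field, value), ()))

    def occurrences(self, recno, field):
        """Return every value of a repeating field, in stored order."""
        return [v for f, v in self.records[recno] if f == field]


f = ToyFile()
# A record may hold any combination of fields, including repeated ones.
f.store([("NAME", "ADAMS"), ("CHILD", "ANN"), ("CHILD", "BOB"), ("STATE", "MA")])
f.store([("NAME", "BAKER"), ("STATE", "MA")])
f.store([("NAME", "CLARK"), ("CHILD", "DAN"), ("STATE", "NY")])

# Boolean retrieval with a negated term: STATE = MA AND NOT CHILD = ANN.
found = f.find("STATE", "MA") - f.find("CHILD", "ANN")

# Boolean retrieval with OR: STATE = NY OR CHILD = ANN.
either = f.find("STATE", "NY") | f.find("CHILD", "ANN")

# A record list: place the found set on it, place one more record, remove one.
work_list = set()
work_list |= found      # place the found records on the list
work_list.add(2)        # place a single record on the list
work_list.discard(1)    # remove a single record from the list

print(found, either, f.occurrences(0, "CHILD"), work_list)
```

Because the retrieval result and the record list are the same kind of object in this sketch, the list operations cost no more than the retrieval itself, which mirrors the point made in attribute 4 about sharing low-level structures with FIND.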