Summary of ARTEWG Workshop on Distributed Systems

21-23 April 1995

Submitted by Mike Kamrad (Computing Devices International) ARTEWG Chair

On 21-23 April 1995 in Santa Barbara CA, the Ada RunTime Environment Working Group (ARTEWG) of the Special Interest Group on Ada (SIGAda) of the Association for Computing Machinery (ACM) held a workshop to discuss the needs for distributed systems software technology in high-tech applications, in both DoD and non-DoD markets, and how the distributed systems features of Ada95 can respond to these needs. A variety of experts in different application domains participated in the workshop.

The conclusions of the summary session follow, starting with observations about the nature of the applications:

  1. Very strong interest in and development of both tightly integrated distributed systems and wide-area distribution using networks, usually in a hierarchical organization
  2. Use of client-server model is very strong
  3. While the programming language is important, it is but one of several important technologies needed (such as the operating system, communication protocols, and the client-server model)
  4. Interfaces are very important to sustain and preserve investment in the presence of rapid technology change. There is much interest in POSIX and CORBA
  5. Many "application needs" appear to be too abstract for support from programming language
And the applicability of the Ada95 Distributed Annex capability:

  1. The new features of OOP, child packages, and especially the partition model concept are useful for structuring distributed applications.
  2. It is not clear that the RPC mechanism is sufficient for all communications, in particular asynchronous message passing, but the existing features provide the potential facilities for supporting additional message passing mechanisms.
  3. Investigation/prototyping is needed to show how DS annex can do the following:
    1. program various forms of fault tolerance
    2. how broadcast/asynchronous message passing can be implemented
    3. how it can interface with existing standards, like CORBA
    4. how it interfaces with other features like RT annex and OOP
    5. adaptation to HRT performance requirements
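As a point of reference for item 2 above, the Distributed Systems Annex (Annex E) distinguishes synchronous remote calls from asynchronous ones. The following is a minimal sketch of a remote call interface; all names are invented for illustration and do not come from any system discussed at the workshop.

```ada
--  Hypothetical sketch of an Ada 95 Remote_Call_Interface package
--  (Distributed Systems Annex, Annex E).  All names are invented.
package Sensor_Service is
   pragma Remote_Call_Interface;

   type Reading is record
      Channel : Natural;
      Value   : Float;
   end record;

   --  Synchronous RPC: the calling partition blocks until the
   --  serving partition returns a result.
   function Latest (Channel : Natural) return Reading;

   --  Asynchronous remote call: the caller continues without
   --  waiting for completion -- one building block for the
   --  additional message-passing styles discussed above.
   procedure Post (R : Reading);
   pragma Asynchronous (Post);
end Sensor_Service;
```

The asynchronous pragma applies only to procedures with in parameters, which is one reason the workshop questioned whether richer broadcast and message-queue mechanisms would need to be layered on top of these primitives.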
The workshop's recommendation for the next step was to address these issues of Ada95 applicability. The consensus of the group was that aggressively pursuing these issues at the pace ARTEWG has pursued issues in the past, namely meeting on a quarterly basis, seems premature; more experience is needed with Distributed Systems Annex implementations and with experimentation using these features in new applications. Therefore it was decided to wait another year or so before meeting again to formally discuss these experiences. It was suggested that this opportunity could occur at the next International Real-Time Ada Workshop, which is tentatively scheduled for September 1996. In the meantime, the existing ARTEWG mailing list will be combined with the mailing list of workshop attendees to maintain communications among interested parties to share experiences and discuss issues.

The result of these recommendations is to close the chapter on ARTEWG as it was originally constituted. ARTEWG has met its goals, because it has produced documents that assist Ada83 for real-time systems and because the real-time facilities of Ada95 satisfy the requirements that originally motivated ARTEWG. As the implementation and usage of Ada95 unfolds over the next year or so, the community will decide what organization will succeed ARTEWG to meet its needs. ARTEWG has served its purpose -- it's time to move on.

Summary of Talks

Summary of Jay Bayne Presentation

Jay Bayne is Vice President of Systems Technology and Strategic Marketing for Elsag Bailey Process Automation. Elsag Bailey builds automation control systems for the continuous and batch processes used to manufacture chemical and pharmaceutical, oil and gas, electric power, food and beverage, and pulp and paper products. Jay described their latest program to design and manufacture scalable command, control and communications platforms and application products suitable for automating global continuous and batch manufacturing enterprises. The program will provide design and development tools, a library of service software and the execution environment.

Jay described the parameters for this program. Process control is migrating from individual device controls within a plant to vertical integration of process controls and information management within and between plants. The goal is an enterprise command, control and communications infrastructure that combines process control with plant and production scheduling, plant configuration and maintenance management, and other secondary control. These automation systems have a long lifetime, typically 15-30 years, so the architecture must be able to evolve over this period. This dictates the layering of the architecture, separating the application control policies and procedures from the underlying implementation mechanisms, which will change as more effective technology emerges. At the same time, the architecture must address the span of control throughout an entire manufacturing enterprise, which Jay categorized into five levels, starting with the synchronous controls on the simplest parts of machines and ending with the asynchronous interplant controls involving order processing, logistics, multi-plant production scheduling and inventory management.

Jay described the technologies they are developing in this program:

ELBA Automation Model

The ELBA Automation Model separates logical and physical process management policy into four abstract layers, proceeding from the lowest: process regulation, process optimization, process adaptation and process organization. All four are present in some form at all five levels of the automation hierarchy. All four layers are supported by a set of services which provide for the supervision of process behavior. These services use a model of control that implements them as a learning automaton.

In a two-loop mechanism, the lower loop provides the conventional signal processing, state propagation, behavior generation and final judgment. The upper loop provides for value judgment, planning and hypothesis to permit adaptation and dynamic tuning. The upper mechanisms provide for interactions between the automation system and both humans and processes, and between processes and their environment. These interactions include event management, planning, contingency response, management policy changes and adaptation to evolving operational requirements.

ELBA Object

The computational model must be able to handle both synchronous signal processing and asynchronous discrete event system behaviors throughout all five levels in the automation hierarchy. The software building block for capturing this is the ELBA object, which captures both types of behaviors and handles both continuous signals and discrete messages.

A classical synchronous signal processing model, with its associated filtering, estimation and control theories, governs the management and control of the continuous process part of the ELBA object. Control of the discrete event part of the ELBA object is defined as a finite state machine. Signal ports supply the process part with information, and message ports supply messages and events to the discrete part.

Membrane and MDS Services

These services isolate the ELBA automation and object models from the implementation details of the underlying hardware and software components. The membrane services provide the bridge between the ELBA object's I/O and other operating needs and the underlying I/O devices and operating systems. ELBA objects need communications that support location-transparent naming for distribution and platform independence, which is provided by the Message Delivery Service (MDS). The MDS provides three classes of service: message delivery, call control and communication service management. Message delivery provides for the actual delivery of messages. The call control service provides for dynamic binding of named entities. The service management establishes the protocol for communications and routing paths. An important dimension of communication is establishing the quality of service (QoS) for individual messages and signals. QoS defines the service performance contracts between cooperating ELBA objects in two types of quality. The first type describes service between the ELBA object and the MDS itself. The second type is between the ELBA object and the underlying execution environment at the destination. The consequence of the QoS is the creation of distributed threads of control shared by multiple ELBA objects, where the QoS parameters dictate the linkage and control of the threads in the local ELBA execution environments to coordinate the end-to-end execution of the distributed thread.

Graphical Design System

The ELBA object provides the design metaphor for defining and constructing the components of the automation system. There are extensive tools for designing individual ELBA objects and connecting them. There is a rich library of service routines that the designer can use to support the activities in the ELBA objects. In addition, there are design mechanisms that permit the designer to construct new and unique services, such as membrane services for unique I/O devices. The ELBA metaphor is also used in monitoring the execution of the resulting process automation systems.

Jay described some of the implementation details of this system. First there is a heavy dependence on existing commercial products and interface standards. Interface Definition Language is used throughout the system implementation to define the majority of its interfaces. At the upper level of the automation system, both Unix and Windows are used for the underlying operating system; at the lower levels of the automation system, Mach/RT 3.0 is used for the operating system. They are looking at new operating systems that can permit user-defined scheduling policies separate from the support mechanisms in the underlying operating system.

C and C++ are the predominant implementation languages used, along with Smalltalk, 4GL and IDL. Jay believes that the choice of programming language is irrelevant in developing the implementation mechanisms supporting the upper levels of the automation system, but that the choice of language is important in the implementation of the lower levels. One of the reasons C and C++ were chosen is the belief that they provide speed and portability to support implementation at all the automation levels. In addition, C and C++ provide a greater variety of tools that are quicker to market, a large supply of library software and a very large labor pool. Ada was never considered a choice.

Summary of Keith Pratt Presentation

Introduction

Military avionics systems consist of the hardware and software elements necessary to perform navigation, sensing, controls and displays, flight path management, weapon delivery and communications. The primary purpose of these systems is to provide information to the pilot.

These systems consist of heterogeneous general purpose processors, graphics processors, and signal processors all communicating over a common communications path. The general purpose processors and graphics processors are implemented with Ada programs. The general purpose processors perform navigation, control, and other general processing, while the graphics processors process display information. The signal processors that process signals from sensors are implemented with macro programs.

Characteristics of Distributed Systems in Military Avionics

Separate, independent software programs are used as the building blocks for these distributed systems. A program (the executable output of the linker) is the unit of software distribution. For Ada this is the main procedure and all packages withed by it, including all tasks and procedures contained in those packages. A single program does not cross a processor boundary or an address space boundary, but more than one independent program may reside on the same processor (in different address spaces). This provides a natural hardware barrier that limits the effect of programming errors.

These independent programs communicate through interface messages only, over a Parallel Interface (PI) bus or a high speed data bus. This message interface is used whether the programs are in the same or different processors. In this fashion one program can be changed and rebuilt (or re-distributed) without impacting the other programs. Only changes to interfaces between programs impact more than one program. Since messages frequently must be sent to multiple destinations, they are broadcast to all. A single broadcast message avoids having to send the message multiple times, once to each destination. The receiving program is responsible for capturing appropriate messages. The sender does not need to know the number or identity of the receivers. The broadcast of messages also permits the capture of data by monitors and simulators that would not necessarily be in the operational system, without having to change the operational software.

Past systems were tightly coupled, synchronous systems characterized by cyclic executives which strictly controlled the time when data comes in, when it is processed, and when it goes out. Current systems are characterized by loosely coupled asynchronous systems. Instead of the central cyclic controller, the processing is controlled by self-scheduling Ada tasks. The Ada tasks are driven by incoming data, not hard wired deadlines. In the absence of inputs the software is designed to "coast". The only time constraints imposed are those required to meet end-to-end data latency requirements.

Tools and Support for Distributed Systems

Data Base Interface Tool
This tool is used to support definition and maintenance of the interface messages passed between tasks. The tool generates Ada package specifications which contain data type definitions (Ada records). These common Ada packages are used by each sender and receiver program of that interface message.
The message interface data types along with other common data types are placed in a common parent Ada library. Each Ada program is then created in a separate sub-library of this parent library.
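A package specification of the kind such a tool generates might look like the following sketch; the message name and fields are invented for illustration and are not taken from an actual system.

```ada
--  Illustrative sketch only: a tool-generated interface package
--  shared by every sender and receiver of this message.  The
--  message name and fields are invented.
package Nav_Messages is

   type Degrees is digits 9 range -180.0 .. 180.0;

   --  One interface message, defined once in the common parent
   --  library and withed by each sending and receiving program.
   type Position_Report is record
      Latitude   : Degrees;
      Longitude  : Degrees;
      Altitude_M : Float;   --  meters
   end record;

end Nav_Messages;
```

Because every sender and receiver withs the same common package, a change to the record forces exactly the affected programs to be rebuilt, which is the interface-impact boundary described above.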
Ada Tasking and Priority Support
Ada tasks are self-scheduling and are assigned priorities based on rate monotonic theory. When multiple Ada programs share the same processor, preemptive priority scheduling is performed across all tasks of all programs on the processor.
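A minimal sketch of the rate-monotonic assignment described here, where the task with the shorter period receives the higher priority; the task names, periods and priority values are all invented for illustration.

```ada
--  Sketch of rate-monotonic priority assignment: shorter period
--  implies higher priority.  Names, periods and priority values
--  are invented, not from an actual avionics system.
procedure RMS_Sketch is

   task Track_Update is            --  e.g. 50 ms period
      pragma Priority (10);
   end Track_Update;

   task Display_Refresh is         --  e.g. 200 ms period
      pragma Priority (7);
   end Display_Refresh;

   task body Track_Update is
   begin
      null;  --  periodic track-maintenance work would go here
   end Track_Update;

   task body Display_Refresh is
   begin
      null;  --  periodic display work would go here
   end Display_Refresh;

begin
   null;  --  tasks activate and run to completion here
end RMS_Sketch;
```

With preemptive priority scheduling across all programs on a processor, the higher-priority short-period task preempts the longer-period one whenever both are ready, which is the behavior rate monotonic analysis assumes.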
Ada Run-Time Resource Manager
The Run-Time Resource Manager performs fault detection and allocates the Ada programs to hardware resources at initial program load. The resource manager will also redistribute programs to remaining available resources upon detecting hardware faults. "Hot spare" processors are used. They do not execute in lock step with their active partner but execute in the same data driven manner as the active processors. The Run-Time Resource Manager is duplicated in these hot spares for fault tolerance.
Non-Intrusive Debuggers
Because stopping a running program can disrupt the entire system, the debugger provides the ability to symbolically examine and modify program variables while the program is running. A task is devoted to getting and responding to debug messages, performing the read and write operations without stopping the program. This task is maintained in operational flight as well as during debug phases, and the debug messages are communicated over a separate test and maintenance (TM) bus.
The debugger can also halt and resume all activity in the system at the same time and these operations can either be initiated by an operator or by software. The stop (or resume) message is received by the Run-Time Resource Manager and it uses run-time capabilities to stop the system. The stop operation is a "gentle" stop in that the system is permitted to come to a known state before halting (e.g., queued messages are allowed to complete).

Summary

Due to the increasing complexity of distributed systems for military avionics applications there is a need to constantly look for ways to simplify the building and integration of these systems.

Summary of Jeff Clark Presentation

Jeff Clark of Computing Devices International (CDI) made a presentation entitled "Distributed Systems In Radar and Tracking Applications", in which he described CDI's experience in using Ada in an early-warning, active RADAR application ("HAVE STARE"). Additionally, Jeff presented the current trends in such systems, as well as the issues involved with the application of distributed systems technology to that domain.

Jeff described a sample application as having a real-time, closed-loop architecture with a 50 millisecond track maintenance time frame, scheduling of future frame activities, and signal processing of past frame returns. Additionally, at a lower relative priority, the system concurrently communicates events and processed data, drives operational displays and controls, and records bulk and processed data. System components thus include an active RADAR, a positioning and control system for the RADAR, signal processing components (e.g. to filter noise), a data recording facility (for later analysis), the track loop, communications processing (i.e., an early-warning system must transmit the warning), and integrated displays and controls. One salient characteristic of these high-powered RADAR systems is that failure to control the RADAR hardware can not only lead to loss of functionality (critical to ensure defensive capability), but can also physically destroy millions of dollars of hardware. (Other safety-critical systems can obviously do the same -- the difference here is the scale of damage.)

Jeff indicated that the commonly-appreciated advantages of distributed computing technology also applied to this domain, namely: spreading functionality of mainframes across several platforms to reduce cost and parallelize processing, improving availability, and facilitating the addition of functionality without redesigning the system. In today's implementation, he indicated that distribution across platforms is at the workstation level for displays and controls, with specialized RADAR control and signal processing hardware (on VME-bus based hardware). Data processing is still done by mainframe-class machines. At a finer level of granularity, the computational load is spread across processors, which may use Ada tasking as well as specialized, O.S.-specific memory and event interfaces.

Typical goals not met by today's systems stem from a scarcity of adequate Ada implementations for parallel and vector processors or for specialized single board computers, and from large data stores and data latency issues that hinder distributed processing in general. He indicated that a few processors can be used effectively, but not many.

Jeff then described their experience on one such successful system, the HAVE STARE (acronym not available) project, which uses Ada83. In this system, displays and controls functionality is distributed across several workstations, and track support loops are distributed across several general purpose processors with vector processors connected for signal processing. Other distributed processing includes specialized RADAR control and communications single-board computers, with signal processing performed by several processors within a single "data processor" machine. The HAVE STARE architecture thus consists of the single-board computer controlling the RADAR, another single-board computer for message processing, a multiprocessor "data processor" for the track loop, mission control, signal processing and data recording, and workstations for displays and controls. The "data processor" is a three-CPU VAX 9000 executing approximately 40 Ada tasks. The displays/controls workstations are VAX 4000 machines.

In the near term, Jeff indicates that there will be minimal new development of this class of system, as the DoD funding emphasis is to maintain existing systems. Furthermore, since existing systems were developed mostly before Ada was available, cost effective solutions for platform upgrades must reuse as much of the legacy design as possible. Ada is well positioned to aid in this requirement since it supports non object-oriented designs, where necessary.

He also indicated that Ada95's facilities in the Distributed Systems Annex would well meet the needs of distributed RADAR applications, via transparent remote subprogram calls and marshaling routines for platform independent data representation.

Finally, it was Jeff's opinion that requirements for very high availability are not yet such that distributed systems are typically proposed, because of the view that more hardware implies higher risk and higher test costs.

Summary of John Woodruff Presentation

John Woodruff of the Lawrence Livermore Laboratory made a delightful presentation to the workshop on the National Ignition Facility being developed at Lawrence Livermore Labs.

The purpose of the facility is to do research on nuclear fusion using high-powered laser beams on frozen deuterium capsules. The laser beams will travel along 192 different paths to be focused simultaneously on the same pellet. These laser paths must be maintained so that all beams arrive simultaneously, focused on a pellet whose dimensions are measured in millimeters.

The distributed computing environment being developed for the National Ignition Facility must handle 3,200 control points at rates up to 10 updates per second for human response, motors and sensors. Most of the setup and data collection capability operates in quasi-real-time with a hierarchy of control levels and interprocessor communications and control.

There are a large number of control points, and therefore processors required to manage them. Current plans predict 220 front end processors, workstations and file servers. To support such a system, distribution is incorporated into the design as a basic philosophy. Processing elements fit into a client-server model using CORBA communication between objects on the network.

Original programming was done in Praxis (an Ada-like language), with recent retrofits being done in Ada on VAXstations. Plans are to use Ada95, especially for the object-oriented features. The design of the system is intended to be reusable and to support the National Ignition Facility for 20+ years.

It was apparent from the presentation that a challenging distributed environment is being envisioned. Current prototypes use VAX/VMS-specific global sections and system services. The move to a client-server model, together with CORBA/IDL interfaces and reprogramming in Ada, is in keeping with recent developments in programming languages and distributed architecture design, and hopefully will position the software to be maintainable through its 20-year life cycle.

Summary of Pete Rizik Presentation

Pete Rizik started his presentation by making it clear that the talk represented his personal views and not necessarily those of his company, SAIC. His concerns revolved around the needs of distributed simulations as used by the military for training and other purposes. In general, his talk was at a high level and was aimed at presenting a clear explanation of this application area. Videos were used to illustrate current, and future, distributed simulation exercises.

Simulation ("an alternative to the real thing"), in this context, is concerned with linking together a number of distinct elements into a potentially global synthetic battlefield. The elements include individual simulators for tanks, planes, ships, etc. The links utilize national, and international, networks. A good distributed simulation has significant benefits in terms of giving realistic training with low environmental impact. However, it was pointed out that a poor (or out-of-date) facility can have a negative training effect.

The importance of distributed simulation is indicated by the number of programs/projects that are looking to exploit this technology. To realize the potential of this type of simulation, a number of key technologies need to be in place.

As the presentation was at a fairly abstract level, it was not of direct relevance to distributed Ada. Nevertheless a number of issues did arise, and it was clear that considerable effort will be placed on this application area in the future. Some simulation elements have clear real-time requirements; for example, when moving through a terrain a simulation rate of perhaps 60 Hz is required. Moreover the networks themselves have real-time requirements placed on them. There is also increasing emphasis on the use of object-oriented paradigms. But perhaps the key role that Ada can play in this domain is as the system integrator. The partitioning facilities within the distribution annex provide a means of modelling the entire system and of integrating disparate elements written in a variety of languages.

Summary of Bob and Suzie Leif Presentation

Bob and Suzie Leif are key developers of medical devices for Ada_Med, a division of Newport Instruments. They have developed several medical devices, including a mid-range hematology analyzer, using Ada. They are both firm believers in Ada and software engineering for medical systems development and are outspoken promoters of Ada in this market.

In an entertaining anecdotal style, Bob and Suzie described the relevance of Ada to the medical devices industry. They drew a direct analogy between the safety and performance requirements of medical devices software and military mission critical software. The US FDA expects software development standards with a well documented and enforced software process, well written and traced requirements with heavy emphasis on hazard and safety concerns, and formalized testing with evidence, through traceability, that the hazard and safety issues were addressed. They claim that employing the software development process encapsulated in MIL-STD 498 and Ada is the solution to satisfying the FDA. To further support this claim, they showed how FDA purchasing procedures demand conformance to product specifications which validated Ada compilers appear to meet and C/C++ compilers don’t.

Bob and Suzie then enumerated other compelling reasons for choosing Ada.

They described the hard problems that “medical command and control systems” face, such as identifying patients and obtaining their records. Identification of patients requires a reliable method such as automated fingerprint identification. In handling patients’ records, there are several thorny issues, including confidentiality and security, multiple data formats and data types (such as images), and access to multiple databases. Consequently they see a heavy emphasis on database technology and image processing technology.

To address these problems they identified several potential opportunities. The first is upgrading the current AdaSAGE database management capability to work with Ada95 and to incorporate the next version of SQL, SQL 2. AdaSAGE is the relational database and menu generation system that was developed by the US Marines and is available on the Internet. Second is research into image prefetching to improve the access time for acquiring images. Finally, the creation of a DoD Telemedicine test bed will investigate the impact of computers, sensors, communications and software on the delivery of health care within the DoD. This work is centered at the US Army Medical Research and Materiel Command, Fort Detrick MD.

Finally Bob and Suzie ended the presentation with a call to the Ada community to address several specific opportunities for maximizing Ada’s value to the medical systems development: getting Ada compilers for DSPs and for Novell network systems.

Summary of Capt. Jules Bartow Presentation

An overview of the Joint Advanced Strike Technology (JAST) Program was presented by Captain Jules Bartow, U.S. Air Force.

The JAST program is a joint effort intended to develop next-generation strike weapons technology for the U.S. Air Force, Navy and Marine Corps. The team is unique in the degree of inter-service coordination: an Air Force Major General is the Program Director and a Navy Rear Admiral is Deputy Director, with Marine Corps, Navy and Air Force personnel serving as Requirements Director, STOVL Director, et cetera.

JAST will develop replacement technology for the Navy's A6, the Air Force's F-16, and the Marines' Harrier aircraft in a "plug-and-play" format in which interchangeable parts are integrated into a common airframe based upon service-specific requirements. For example, airfield landing gear could be replaced directly with gear more suitable for carrier landings. The plug-and-play approach is intended to decrease testing costs, as well as reduce overall development costs due to commonality and sharing across the three services. A common production line is intended as the enabling basis for the "plug-and-play" capability, with engineering/manufacturing development intended to start in the year 2000. Deployment is expected by the year 2002.

A central theme of the JAST effort is affordability and cost-containment. A major approach is thus based on sensible leveraging of COTS products. However, the team is aware that there are "no silver bullets", and that COTS usage involves issues of security, longevity, safety and reliability, integration and performance. The security issues include support for multi-level secure systems, covert channels and cryptography. The longevity issue is most obvious: commercial products are obsolete in less than three years, while aircraft systems easily last ten times as long. Furthermore, the scale of application, on the order of millions of lines of code, dwarfs most COTS products' capabilities.

The team is working with the Software Productivity Consortium, which has identified nine distinct software issues: management of changing requirements, risk identification and management, reuse of both code and processes (especially for legacy systems, such as the Seawolf and F-22 trainers), method and tool integration, method inadequacies, process inadequacies, inadequate techniques for verification of systems and software, inadequate standards for measurement of project performance, and inadequate methods for systems integration. Overall, the team realizes that software management will be critical to the success of the effort, and intends to perform software development maturity evaluations of contractors as part of the selection process.

As part of the COTS and standards theme, the program is considering adoption of Ada95, POSIX (real-time extensions), X-11, OpenGL, and Motif. In order to evaluate the maturity of these standards, the JAST program has funded several near term demonstrations, including an Ada/POSIX Real-Time Demonstration, an Ada Software Fault Tolerance Demonstration, and an Ada95 Demonstration Testbed, among others. Several of these demonstrations have been shown at various conferences, such as the 1995 Software Technology Conference in Salt Lake City.

The Ada/POSIX Real-Time Demonstration examines the issues of real-time applications in a POSIX environment, and performs timing measurements to determine the applicability of POSIX to real-time applications (e.g., avionics applications).

The Ada Software Fault Tolerance Demonstration explores the potential for software-based fault tolerance (as opposed to hardware redundancy, for example) by monitoring and correcting errors in software before they become unrecoverable effects. A real-world weapons-release calculation is used as the prototypical fault-tolerant algorithm.

The Ada95 Demonstration Testbed examines the applicability of Ada95 to future avionics applications by modifying an existing combat aircraft flight simulation to use Ada95's Distributed Systems Annex support for distributed programming. The simulation is therefore distributed across three embedded processors, with connections to two Sun workstations for cockpit displays and out-the-window views. In order to address the survivability issue inherent in combat systems, processor failures are transparently masked such that redundant partitions automatically replace those on failed processors.

Postscript

Finally, on a personal note, I announced that I will not lead any succeeding effort to ARTEWG. First, I have become stale; it is time for someone with fresh ideas and energy to replace me. Second, I have returned to graduate school. Third, my daughter is a manic-depressive who is undergoing intensive therapy. I need to spend more time focused on these two important activities. I will complete my duties as ARTEWG chair by editing the full summary of this workshop for Ada Letters, and I will maintain the expanded ARTEWG mailing list and assist my successor, whoever she/he may be.

I have been richly rewarded as ARTEWG chair -- it's the people of ARTEWG who have made this experience so rewarding. I am so damn proud of what they have accomplished. I have made personal and professional friendships that will last my lifetime. I thank all of you ARTEWGers for your hard work and the great ride you have given me and yourselves over the last ten years.

Mike


Mike Kamrad
Computing Devices International
M/S BLC W3T
8800 Queen Avenue South
Bloomington MN 55431
kamrad@cdev.com
1.612.921.6908
FAX: 1.612.921.6165