4.1 Objects, Relationships and Attributes
In order to express the requirements for the PCIS, it proved necessary
to have a very precise description of the way terms were used. There
are two orthogonal distinctions:
- The distinction between things in the real world and their
representation in the computer
- The distinction between the abstract concept of a thing and
its observed or manipulable information (this distinction has
been known for centuries; see, for example, Aristotle, Peri
Hermeneias (De Interpretatione), approximately 350 B.C.)
leading to four separate concepts:
- Things in the real world, for example a person named "John
Smith". Note that the concern here is with the abstract concept
of this particular person. He may change his name, he may or
may not be physically present when we talk about him, but the
abstract concept of this particular person is quite separate
from the information we may have about him.
- There are facts that we can observe in the real world about
these things, such as a particular person's name, age, height,
weight, etc. We may know many more facts about the real world
thing than are actually recorded in the computer.
- The representation of the abstract concept of a thing in
the computer. This is known as an "object" and may be
considered as an instance of an abstract data type. It is quite
separate from the data values that may be recorded about it;
indeed, you can do nothing with such instances, except apply
the operations relevant for the type.
- The facts recorded in the computer system about objects.
These are known as attributes.
Note that relationships are quite separate from the four concepts of a
thing, an object to represent a thing, facts about things, and their
representations as attributes.
In order to state some more specific requirements that the PCIS shall
satisfy, it is necessary to specify a particular data model within
which those requirements may be expressed.
The term "data model" may be used in two distinct senses:
- It is used to refer to the way in which data are structured
and manipulated, for example, the network model, the
hierarchical model, the relational model and the Entity
Relationship (ER) model.
- It is used to refer to a particular schema to represent the
things of interest in the environment, which are modelled
within the facilities of a data model of the first sort.
The term "data model" is used in the first sense here. Within the IRAC
the data storage and manipulation needs of a software project have been
considered in terms of an "Object Management System".
Software projects deal with a great many collections of data, devices,
people, and other things which need to be treated as single units. The
representations of these in the computer (the object concept described
above) are given the abstract title "objects". A computer-based system
for storing, naming, and manipulating objects is called an "Object
Management System".
Much of the PCIS work has had the underlying assumption that the
typical flat or hierarchical file system found in a modern computer
system is inadequate for the needs of software development projects.
This assumption originates in the fact that most projects and companies
are forced to supplement the file system's facilities with additional
functions, tools, and conventions to be able to do their job. For
example, an attempt is often made to give files a "type" that indicates
the general form of their contents by establishing a naming convention.
Files containing Ada source code might be required to have a name of
the form xxxx.ada, where xxxx is the user-meaningful name of the file.
It may be that the Ada compiler will only accept as input files whose
names are of this form. The convention may reduce the number of
characters that a user can use to make a file name meaningful and is
far from foolproof. The IRAC makes typing an intrinsic part of the
system and thus eliminates these and other problems.
The reader is referred to the list of definitions for the precise
definitions of and rationale for the following terms, which are used
throughout this section: OBJECT, RELATIONSHIP, and ATTRIBUTE. An
understanding of the definitions of these terms is crucial to
understanding the requirements and rationale that follow.
4.1A Data. The PCIS shall provide mechanisms for representing data
using:
a) Objects. An object is the PCIS unit for representing "things"
which are relevant to the needs of tools.
b) Relationships. A relationship is an ordered association among
objects. A relationship among N objects (not necessarily distinct) is
known as an "N-ary" relationship. The PCIS may restrict relationships
to be binary.
c) Attributes. An attribute is an association of an object or
relationship with a value. This is generally the value of a property of
the object or relationship, describing its state.
d) Components of objects. An object may be specified (through a
means left undefined here) to be a component of another object. An
object's components then form a set. This supports abstraction, by
allowing the set to be treated as a single object.
A software project involves many kinds of objects, relationships of
objects, and attributes of objects and relationships, all of which must
be stored and made available. For example, a piece of program source,
some test data, a document, a person or a particular assignment may all
be represented by objects; the actual source text, test data, text of a
document, date created, storage format and access allowed may be
attributes of an object; "compiled from", "referenced in", "written
by", "working on" may be relationships between objects; and "date
compiled" may be an attribute of a relationship.
With more advanced systems, source text may be stored as trees of
smaller-than-file units of text (for example, representing statements
or lexical units); documents may be held as trees of sections,
paragraphs, sentences, and references to object attributes where names
and properties appear; design chart graphics may be held as graphs of
individual defining objects with their graphical placement data.
Because the name of an object in a repository may change, and the name
may appear throughout source text and documents, the source text and
documents may be implemented with relationships to the repository
object's name attribute. This eliminates multiply embedded, separately
located copies of the name attribute. The OMS-based system provides the
facilities which can be used to build powerful application models of
this sort.
Attributes represent the actual data about objects or relationships.
Examples are the actual source or object code of a program, the date an
object or relationship was created, or a status value to describe an
object. The value associated with an object or relationship by means of
an attribute is referred to as the attribute value. Values are numbers,
names (enumeration values), or character sequences, or aggregates
thereof. Relationships are "associations" from one object to another.
Examples are source code to object code, old to new revision of a
document, and owner/user to owned object. Relationships may be
functional mappings (one-to-one or many-to-one) or relational mappings
(one-to-many or many-to-many) and may have attributes.
At least some objects represent the concept of those collections of
data which we normally think of as files or data sets. For these
objects the attribute that contains the data that would (in a
conventional file system) be held in a file is of particular
importance. The data in this attribute may consist of multiple records
in a given format or of undifferentiated sequences of characters or
bits. Examples are source text, test results and cross references.
Other objects may represent hardware devices (either abstract or
virtual), groupings of objects for purposes like naming, and users of
the system.
Within the software engineering domain it is often desirable to operate
on data at different levels of abstraction. The ability to decompose an
object into a number of components allows the levels of abstraction to
be modelled within the OMS. This has a number of important benefits
including potential performance gains. For example, in a tool which
operates on Ada program libraries it may be sufficient to regard the
program library as a single object, whereas in other tools it may be
necessary to examine the components of the program library, that is,
the source files, compilation units and their inter-relationships. An
object that can have components is often referred to as a "composite
object".
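The four kinds of data in Requirement 4.1A can be pictured with a small sketch. It is only an illustration under stated assumptions: the class names `Obj` and `Relationship`, the attribute names, and all values are hypothetical and are not part of the PCIS. Objects compare by identity, so an object stays distinct from the values recorded about it.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple


@dataclass(eq=False)  # eq=False: objects compare by identity, not by attribute values
class Obj:
    """Hypothetical stand-in for an OMS object."""
    attributes: Dict[str, Any] = field(default_factory=dict)
    components: List["Obj"] = field(default_factory=list)


@dataclass(eq=False)
class Relationship:
    """An ordered association among objects; it may carry attributes of its own."""
    name: str
    members: Tuple[Obj, ...]
    attributes: Dict[str, Any] = field(default_factory=dict)


# Illustrative data only; every value here is made up.
source = Obj({"text": "package S is end S;", "storage_format": "text"})
object_code = Obj({"storage_format": "binary"})
compiled_from = Relationship(
    "compiled from", (object_code, source), {"date_compiled": "day 2"}
)
library = Obj({"name": "demo library"}, components=[source, object_code])

# A second object with identical attribute values is still a different object.
lookalike = Obj(dict(source.attributes))
```

Here "compiled from" is a binary relationship with its own "date compiled" attribute, and the library is a composite whose components can still be reached individually.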
4.1B Attribute Values. The PCIS:
a) Shall provide mechanisms for attributes whose values are at
least integer and enumeration types.
This is the minimum requirement. In addition to general mechanisms for
enumeration types, specialized mechanisms (such as for Boolean and
character) may also be provided (for example, if efficiency concerns
warrant).
b) Should provide mechanisms for attributes whose values are real
numbers. The PCIS may provide mechanisms for non-scalar data, but it is
not required to do so.
It is left to the PCIS designer to enumerate the additional attribute
value types that are supported, for example, fixed and floating point
types, array and record types.
Several communities are striving to produce standards giving
portability and interoperability to applications. The choice of
attribute types to be supported is crucial to those goals. So this, of
all areas, is one in which the PCIS designers must give careful
consideration to the choices made through other standards, to maximize
interoperability.
c) Shall provide mechanisms for attributes whose values can be
used as bulk data.
Bulk data may correspond to files in a conventional operating system.
The definition permits, but does not require, an ability for a single
object to have any number of such bulk data attributes.
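As a rough illustration of Requirement 4.1B, the sketch below declares one attribute for each kind of value mentioned: integer, enumeration, real number, and bulk data. The attribute names, the `Status` type, and the `set_attribute` helper are assumptions made for this example; the PCIS does not define them.

```python
from enum import Enum


class Status(Enum):  # hypothetical enumeration type
    DRAFT = 1
    APPROVED = 2


# Hypothetical attribute declarations, one per kind of value named in 4.1B.
ATTRIBUTE_TYPES = {
    "revision": int,     # (a) integer
    "status": Status,    # (a) enumeration
    "coverage": float,   # (b) real number
    "contents": bytes,   # (c) bulk data, comparable to a file's contents
}


def set_attribute(obj, name, value):
    """Store value under name after checking it against the declared type."""
    expected = ATTRIBUTE_TYPES[name]
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, expected):
        raise TypeError(
            f"{name} expects {expected.__name__}, got {type(value).__name__}"
        )
    obj[name] = value


document = {}
set_attribute(document, "revision", 3)
set_attribute(document, "status", Status.DRAFT)
set_attribute(document, "coverage", 0.5)
set_attribute(document, "contents", b"Chapter 1 ...")
```

A call such as `set_attribute(document, "revision", "three")` raises `TypeError` and leaves the stored value unchanged.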
4.1C Shared Components. The PCIS shall provide that any objects may
share components. The PCIS shall also provide that objects may constrain
the number of composite objects of which they are a component, down to
one.
This requirement allows for objects that are composites of overlapping
sets of components, for example, two different Ada program libraries
might share a source file. The ability to constrain the number of
objects of which one is a component is necessary in order to model
basic data structures, such as trees. There is no intent in the
requirement to suggest whether the expression of the constraint is to
be done on a per-object basis, or instead on a class-basis through the
object type. Either approach has benefits, though the latter has the
uniformity of being consistent with the expression of bounds on the
number of incoming and outgoing relationships for a given object (see
Requirement 4.2B(c)).
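A minimal sketch of sharing with an optional bound follows. It assumes the per-object form of the constraint purely for brevity; the requirement deliberately leaves open whether the bound is expressed per object or per type. The names `Component`, `Composite`, and `max_composites` are hypothetical.

```python
class Component:
    def __init__(self, name, max_composites=None):
        self.name = name
        self.max_composites = max_composites  # None: may be shared without limit
        self.composites = []                  # composites this object belongs to


class Composite:
    def __init__(self, name):
        self.name = name
        self.components = []

    def add(self, component):
        """Add a component unless that would exceed its declared bound."""
        limit = component.max_composites
        if limit is not None and len(component.composites) >= limit:
            raise ValueError(
                f"{component.name} may belong to at most {limit} composite(s)"
            )
        component.composites.append(self)
        self.components.append(component)


library_1, library_2 = Composite("library 1"), Composite("library 2")
shared_source = Component("shared source file")
library_1.add(shared_source)
library_2.add(shared_source)  # allowed: no limit was declared

leaf = Component("tree node", max_composites=1)
library_1.add(leaf)           # adding it to a second composite would raise ValueError
```

Setting the bound to one gives each component a single parent, which is the property needed to model trees.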
4.1D Context Sensitive Interpretation of Relationships. The PCIS
shall provide the ability to interpret different relationships
involving shared components in the contexts of the objects containing
those components.
This is best explained diagrammatically (see below). C1 and C2 are
composite objects with the common component S, and each composite has
its own component A. Navigation from S to A in the context of C1
arrives at the package A that is correct for C1, while the same
navigation in the context of C2 arrives at the package A that is
correct for C2.
+===============================================+
||                                             ||
|| +-----------+        +======================++=======================+
|| | package A | 'with(A)   +-----------+      ||          C2          ||
|| |   .       +------------+ with A;   | 'with(A)   +-----------+     ||
|| | -- vers 1 |        ||  | package S +------------+ package A |     ||
|| |   .       |        ||  |   .       |      ||    |   .       |     ||
|| | end A;    |        ||  |   .       |      ||    | -- vers 2 |     ||
|| +-----------+        ||  | end S;    |      ||    |   .       |     ||
||                      ||  +-----------+      ||    | end A;    |     ||
||                      ||                     ||    +-----------+     ||
||          C1          ||                     ||                      ||
+=======================++======================+                      ||
                        ||                                             ||
                        +===============================================+
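The situation in the diagram can be sketched as follows. This is one possible reading, not a prescribed mechanism: it assumes that a relationship on a shared component names its target symbolically and that the name is resolved within the composite used as context. The `Composite` class and the `navigate` function are hypothetical.

```python
class Composite:
    def __init__(self, name, components):
        self.name = name
        self.components = components  # component name -> object, as seen in this composite


def navigate(context, source_name, relationship):
    """Follow a relationship from a component, resolving the target inside context."""
    source = context.components[source_name]
    target_name = source["relationships"][relationship]
    return context.components[target_name]


package_s = {"text": "with A; package S ...", "relationships": {"with": "A"}}
package_a_v1 = {"text": "package A -- vers 1", "relationships": {}}
package_a_v2 = {"text": "package A -- vers 2", "relationships": {}}

# S is one shared object; each composite supplies its own package A.
c1 = Composite("C1", {"S": package_s, "A": package_a_v1})
c2 = Composite("C2", {"S": package_s, "A": package_a_v2})
```

With these definitions, `navigate(c1, "S", "with")` yields version 1 of package A and `navigate(c2, "S", "with")` yields version 2, although S itself is the same object in both composites.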
4.1E Granularity.
a) The PCIS mechanisms shall support objects representing data
which range in size from large (up to the level of granularity of, for
example, a DBMS database or the text of a book) to small (down to the
granularity of, for example, paragraphs within a document or nodes
within a diagram).
The data dealt with in a PSE spans a wide range of sizes, from a book
or complete specification down to structures whose grain of detail is
the individual sentence in a requirements document, the nodes and arcs
within a diagram, or the nodes within the abstract syntax tree of a
program. Further, all this structure is relevant to the software
engineering process and as such should be explicitly modeled within
the OMS.
Clearly the OMS can be used to represent data at any level of
granularity. The important point is that this be achieved efficiently
and economically. If, for example, the PCIS specifies that all objects
have a large number of predefined attributes, then representing fine
grain data as separate objects might cause unacceptable overhead.
b) The PCIS shall facilitate implementations to exploit common
properties of composite objects in order to get good access performance
to all their components.
One of the main benefits of introducing facilities to define composite
objects is that an implementation can anticipate operations on all of a
composite's components at the time its root component is accessed. It
may, for instance, lock all components transitively in advance or
transfer the data supporting the components into a cache.
Facilities such as composite access control lists may be provided to
allow for a centralized security control (at the level of the root, for
instance).
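The anticipation described above can be sketched as a single traversal performed when the root is accessed. The sketch is an assumption about how an implementation might proceed, not a PCIS interface: the `children` map, the `store`, and both function names are hypothetical, and a set of names stands in for real locks.

```python
def transitive_components(root, children):
    """Return root and everything reachable from it, each name once."""
    seen, order, stack = set(), [], [root]
    while stack:
        name = stack.pop()
        if name in seen:
            continue  # shared components are visited only once
        seen.add(name)
        order.append(name)
        stack.extend(children.get(name, ()))
    return order


def open_composite(root, children, store, locks, cache):
    """On access to the root, lock and prefetch every component in advance."""
    for name in transitive_components(root, children):
        locks.add(name)
        cache[name] = store[name]


# Illustrative composite: two units share one component.
children = {"library": ["unit_a", "unit_b"], "unit_a": ["shared"], "unit_b": ["shared"]}
store = {name: f"<data of {name}>" for name in ("library", "unit_a", "unit_b", "shared")}
locks, cache = set(), {}
open_composite("library", children, store, locks, cache)
```

After the call, all four names are locked and cached, so later access to any component needs no further round trip to the store.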
c) The PCIS shall present to tools a uniform interface for
facilities to define and manage instances of, and relationships
between, data of differing granularity, even when different facility
implementations are used for different degrees of granularity.
Tools and users should not be aware of the presence or absence of
different facilities for differing granularities of data.
Administration of data by differing facilities based on granularity may
be necessary for access times commensurate with data size. However, the
user should not suffer the inconveniences of a non-uniform interface to
those facilities, such as reworking tools and end user procedures.
Tools should be oblivious to the existence of multiple or differing
facilities in the same way as they now are oblivious to NFS hosting of
files on a network.
4.1F Data Consistency. The PCIS shall provide mechanisms to ensure
the consistency of the data represented in the Object Management
System. These mechanisms shall include at least typing, access control,
synchronization, transactions, robustness and restoration. These
mechanisms should support consistency which ranges, in time-scales,
from short term (for example, over an individual operation) to long
term (for example, configuration control and checkout/checkin).
The requirement for data consistency must include recognition of the
fact that systems do fail unexpectedly, due to hardware and software
faults. The PCIS must be implementable on a wide variety of hosts;
therefore it is unacceptable to require fault-tolerant or
multiply-redundant hardware. The PCIS design must actively support
recovery from unexpected failures.
There are several basic approaches to this problem. Modern data base
managers usually implement the "transaction", a unit of work that
appears, from the viewpoint of the rest of the system and of unexpected
faults, to happen "all at once". This is sometimes implemented by a
journal system, in which the data base is recorded at a given time and
subsequent transactions are recorded in a journal. After a crash, the
data base is restored from the recording and the transactions from the
journal are re-applied to it.
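The journal approach can be illustrated with a toy in-memory store. This is a sketch under simplifying assumptions: the class `JournaledStore` and its methods are hypothetical, a transaction is reduced to a dictionary of updates, and the recording and journal are assumed to survive the crash.

```python
import copy


class JournaledStore:
    def __init__(self):
        self.data = {}      # live data base, assumed lost or damaged in a crash
        self.snapshot = {}  # recording of the data base at a given time
        self.journal = []   # transactions committed since that recording

    def checkpoint(self):
        """Record the data base and start a fresh journal."""
        self.snapshot = copy.deepcopy(self.data)
        self.journal.clear()

    def commit(self, updates):
        """Journal the whole transaction first, then apply it."""
        self.journal.append(dict(updates))
        self.data.update(updates)

    def recover(self):
        """Restore the recording and re-apply the journaled transactions in order."""
        self.data = copy.deepcopy(self.snapshot)
        for updates in self.journal:
            self.data.update(updates)


db = JournaledStore()
db.commit({"a": 1})
db.checkpoint()
db.commit({"b": 2})
db.commit({"a": 3})
db.data = {"damaged": True}  # simulate a crash that corrupts the live data
db.recover()
```

After recovery the store again holds `{"a": 3, "b": 2}`, the state produced by the committed transactions.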
Modern operating systems often build a certain amount of redundancy
into their data structures and include a "scavenger" program that can
scan the structure after a crash and use the redundancy to correct any
inconsistencies in it. This is sometimes formalized as a system of
"truths" and "hints", in which the truths are handled in such a way
that they cannot be made inconsistent by a crash. Hints are used by the
operating system in its normal operation, for instance, to speed
access, but are always checked against the truths. When a hint and
truth conflict, the hint is discarded. A scavenger program can recreate
the hint system from the truths.
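The truths-and-hints discipline can be sketched in the same spirit. Here the truths are per-object records and the hint is an owner index used to speed lookup. The data layout and both function names are assumptions made for illustration only.

```python
def rebuild_hints(truths):
    """Scavenger: recreate the owner -> object-ids index from the truths alone."""
    hints = {}
    for object_id, record in truths.items():
        hints.setdefault(record["owner"], set()).add(object_id)
    return hints


def objects_owned_by(owner, truths, hints):
    """Use the hints for speed, but check each one against the truths."""
    confirmed = set()
    for object_id in list(hints.get(owner, ())):
        record = truths.get(object_id)
        if record is not None and record["owner"] == owner:
            confirmed.add(object_id)
        else:
            hints[owner].discard(object_id)  # hint conflicts with truth: discard it
    return confirmed


truths = {"doc1": {"owner": "ann"}, "doc2": {"owner": "bob"}}
hints = {"ann": {"doc1", "doc2"}}  # stale: doc2 no longer belongs to ann
found = objects_owned_by("ann", truths, hints)
```

The stale entry is dropped during the lookup, and `rebuild_hints(truths)` can recreate the full index at any time because it reads nothing but the truths.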
The PCIS has aspects of both data base and operating system, and both
of the above approaches are possible implementations. In fact, they are
quite similar at a basic level, differing mostly in the terminology
used. The PCIS designer is not constrained to adopt either approach
except insofar as there are specific requirements for a usable
transaction mechanism. However, data consistency is a very important
requirement, so much so that it may set the tone for the entire PCIS
design.
The consistency mechanisms enumerated here are all specified in other,
more specific requirements. The purpose here is to ensure that they
work together in a complementary way over wide time-scales.