

                            ASIS/Program View Layer
 
                               Control Flow View



INTRODUCTION
------------

The control flow view identifies all possible transfers of control
among the statements of an Ada program unit.  The view is a directed
graph, commonly called a control flow graph (CFG), where each node
represents a statement in the unit and each edge represents a possible
transfer of control between two statements. A CFG can be used to
determine the execution paths through a program unit.  As such, CFGs
have found application in complexity measurement, compiler
optimization, test data selection, and visual display of program
structure.

A control flow view is constructed from an Asis.Declaration element
of one of the following Declaration_Kinds:

	* A_Package_Body_Declaration
	* A_Procedure_Body_Declaration
	* A_Function_Body_Declaration
	* A_Task_Body_Declaration

Given such an element, the view construction function traverses the
element hierarchy rooted at the element and returns a CFG data
structure.  The traversal uses ASIS semantic queries, so the
compilation unit containing the element (and all supporting
compilation units) should be current.  Failure to meet this
requirement may result in the exception ASIS_Failed being raised with
the status Obsolete_Reference_Error.


FILES
-----

This directory contains the following files:

   README (this file)
   build_control_flow_view.2.ada
   control_flow_defs.1.ada
   control_flow_save.1.ada
   control_flow_save.2.ada
   control_flow_scan.1.ada
   control_flow_scan.2.ada
   control_flow_view.1.ada
   control_flow_view.2.ada
   tests (directory)


VIEW SUBSYSTEM COMPONENTS
-------------------------

The control flow subsystem contains four Ada library units.  Two
units comprise the subsystem's interface:

   package Control_Flow_Defs

	Declares the view data structures.

   package Control_Flow_View

        Contains operations for constructing, dumping and destroying
        a view.

Two units form the subsystem's implementation:

   package Control_Flow_Scan

	Performs an ASIS element traversal within a program unit body.
	Constructs the view data structure.

   package Control_Flow_Save

	Provides an operation for dumping a view to a text file.

In addition, procedure Build_Control_Flow_View is a test driver that
demonstrates the operation of the subsystem.  Given an ASIS library
and the name of an Ada unit, the test driver builds a view for each
program unit body in the unit and dumps all the views to a text file.


VIEW SUBSYSTEM DEPENDENCIES
---------------------------

The control flow subsystem imports the following subsystems:

   common
   asis


TEST CASES
----------

The tests directory contains a set of Ada compilation units that test
view construction using each kind of Ada statement.  Included is a
textual dump of the view created for each unit. The format of the dump
files is described in the specification of unit Control_Flow_Save.


VIEW DESCRIPTION
----------------

(In the discussion that follows, an edge in a directed graph is
denoted by the ordered pair (n1, n2), where n1 is called the *source*
node and n2 is called the *target* node.)

A CFG always contains two distinct nodes called the *start* node and the 
*terminal* node.  The start node represents the entry of control into
a program unit and the terminal node represents the exit of control
from the unit.  The start node is never the target of an edge and
the terminal node is never the source of an edge. The start and
terminal nodes are denoted by the symbols "S" and "T", respectively.
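
As a rough illustration of the two invariants just stated, the following
self-contained Ada sketch checks them on a hand-built edge list.  The
names Node_Id, Edge, Edge_List and Well_Formed are hypothetical and are
not part of the subsystem; real CFG nodes carry Asis elements rather
than integers.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Check_Start_Terminal is

   --  Hypothetical stand-ins: nodes are small integers, not Asis elements.
   type Node_Id is range 1 .. 5;
   S : constant Node_Id := 1;   --  start node
   T : constant Node_Id := 5;   --  terminal node

   type Edge is record
      Source, Target : Node_Id;
   end record;

   type Edge_List is array (Positive range <>) of Edge;

   --  S -> 2 -> 3 -> T, plus a branch 2 -> 4 -> T.
   Edges : constant Edge_List :=
     ((S, 2), (2, 3), (2, 4), (3, T), (4, T));

   --  True when S is never a target and T is never a source.
   function Well_Formed (G : Edge_List) return Boolean is
   begin
      for I in G'Range loop
         if G (I).Target = S or else G (I).Source = T then
            return False;
         end if;
      end loop;
      return True;
   end Well_Formed;

begin
   Put_Line ("Well formed: " & Boolean'Image (Well_Formed (Edges)));
end Check_Start_Terminal;
```

A check of this kind could serve as a sanity test on any graph before
it is used for path analysis.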

With one exception, each of the remaining nodes in the graph uniquely
corresponds to one Ada statement belonging to the unit.  The exception
occurs in the case of the Ada IF statement, where the graph contains a
node for each IF and ELSIF condition of the statement.

An edge (n1, n2) connects two statement nodes if there is a possible
transfer of control from statement n1 to statement n2.  The qualification
"possible" means the transfer is permitted by the semantics of the
language, but it may not actually occur during a given execution of
the unit.  Be aware that dynamic factors such as the values of
variables controlling branch conditions and the raising of exceptions
determine the actual flow of execution within a unit.

An edge (S, n) connects the start node and some statement n if execution 
flows immediately to the statement when control enters the unit.  Likewise, 
an edge (n, T) connects some statement n and the terminal node if a 
transfer of control out of the statement causes control to leave the unit.
As an example of the latter, the node representing a return statement is 
always connected to the terminal node (unless the return occurs within an 
accept statement - LRM 5.8).

The inclusion of exception handlers in the language adds an extra wrinkle 
to CFG construction.  A transfer of control to the statement sequence of a 
handler can occur when a raise statement local to the unit names an 
exception that is handled by the handler. A handler's statements 
can also be executed when an exception is propagated from some other unit.  
One approach for representing the latter case is to connect the first 
statement of each handler to the start node.  However, in some applications 
of the CFG it is convenient to treat each handler as a separate subgraph 
since handlers are not executed during normal computation.  We have
chosen the second approach.

Therefore, a CFG consists of one or more distinct subgraphs.  There is one 
subgraph for the main sequence of statements of the program unit.  This 
subgraph includes the start and terminal nodes.  In addition, there is one 
subgraph for each exception handler occurring in the body, including all 
handlers in nested block statements.   Edges can connect nodes
belonging to different subgraphs. For example:

  * When a raise statement raises an exception that is handled by a
    local handler, the CFG contains an edge connecting the raise 
    statement and the first statement in the handler's sequence of 
    statements.

  * When the last statement in a handler transfers control out of the
    program unit, the CFG contains an edge connecting the last statement
    and the terminal node. For a handler occurring in a block statement,
    the edge connects the last statement in the handler to the statement 
    immediately following the block statement.

      CFG Representation
      ------------------

A CFG is an instance of the graph abstract data type presented in
[Booch 87].  The graph type is an Ada generic, where the node (vertex)
and edge (arc) data types are specified via generic formal parameters.
For the CFG, a node is a record of the form:

   type Item_Type is
      record
         Kind : Node_Kind_Type;
         Element : Asis.Element;
      end record;

The possible node kinds are defined by the enumeration type:

   type Node_Kind_Type is (Start, Statement, Terminal, If_Statement_Arm);

The kinds Start and Terminal are reserved for the start and terminal nodes, 
respectively.  All other nodes in the graph have kind Statement except for 
the IF and ELSIF conditions of an IF statement, which have kind 
If_Statement_Arm.

Each node also contains an ASIS element reference.  The kind of element 
referenced is determined by the node kind:

   * Nodes of kind Statement reference an Asis.Statement element.

   * Nodes of kind If_Statement_Arm reference an Asis.If_Statement_Arm 
     element.

   * Nodes of kind Start and Terminal reference the Asis.Statement 
     element returned by one of the Asis functions Subprogram_Body_Block, 
     Package_Body_Block, or Task_Body_Block, depending on the kind of unit.

A CFG edge is a record of the form:

   type Edge_Type is
      record
         Kind : Edge_Kind_Type;
         Element : Asis.Element;
      end record;

The possible edge kinds are defined by the enumeration type:

   type Edge_Kind_Type is ( 
      Prog_Unit_Start, Condition_True, Condition_False, Case_Alt, 
      Select_Arm, Block_Body_Start, Accept_Body_Start, Propagated_Raise, 
      Handled_Raise, Continuation);

Edge kinds indicate the reason for the associated transfer of control.

   * Simple sequencing transfers and unconditional branches
     are Continuation edges.  

   * Conditional two-way branch statements have outgoing 
     Condition_True and Condition_False edges.

   * Multi-way branches from CASE and SELECT statements have Case_Alt 
     and Select_Arm edges, respectively.  

   * Handled_Raise and Propagated_Raise edges distinguish transfers
     of control due to raise statements that are handled locally vs.
     propagated out of the unit.

   * The remaining kinds indicate entry into program units, block
     statements, and accept statements.

Depending on the edge kind, an Asis element reference is sometimes
attached to an edge to aid the CFG user in determining the conditions
under which the transfer of control occurs.  For example, each edge
originating at a case statement node includes a reference to a
Case_Statement_Alternative element.  Using this element, the
expression in each choice of the alternative can be retrieved. (The
exact kind and element reference that appear on each CFG edge are
described in the "CFG Structural Details" section below.)
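
The following sketch suggests how a CFG user might inspect the edges
leaving a case statement node.  It is self-contained and does not use
ASIS: the Element field is replaced by a text label (an assumption made
only for this example), and Edge_List, Case_Out and Count_Kind are
hypothetical names.

```ada
with Ada.Text_IO;           use Ada.Text_IO;
with Ada.Strings.Unbounded; use Ada.Strings.Unbounded;

procedure Case_Edges_Demo is

   --  Edge kinds as listed in this README.
   type Edge_Kind_Type is
     (Prog_Unit_Start, Condition_True, Condition_False, Case_Alt,
      Select_Arm, Block_Body_Start, Accept_Body_Start, Propagated_Raise,
      Handled_Raise, Continuation);

   --  Hypothetical stand-in for Asis.Element: the source text of the
   --  construct, or an empty string in place of Nil_Element.
   type Edge_Type is record
      Kind    : Edge_Kind_Type;
      Element : Unbounded_String;
   end record;

   type Edge_List is array (Positive range <>) of Edge_Type;

   function "+" (S : String) return Unbounded_String
     renames To_Unbounded_String;

   --  Outgoing edges of the node for a three-alternative case statement.
   Case_Out : constant Edge_List :=
     ((Case_Alt, +"when 1"),
      (Case_Alt, +"when 2 | 3"),
      (Case_Alt, +"when others"));

   --  Number of edges of the given kind in a list.
   function Count_Kind
     (Edges : Edge_List; Kind : Edge_Kind_Type) return Natural
   is
      N : Natural := 0;
   begin
      for I in Edges'Range loop
         if Edges (I).Kind = Kind then
            N := N + 1;
         end if;
      end loop;
      return N;
   end Count_Kind;

begin
   for I in Case_Out'Range loop
      Put_Line (Edge_Kind_Type'Image (Case_Out (I).Kind) & " taken on: "
                & To_String (Case_Out (I).Element));
   end loop;
   Put_Line ("Case_Alt edges:"
             & Natural'Image (Count_Kind (Case_Out, Case_Alt)));
end Case_Edges_Demo;
```

With the real data structures, the label would be obtained by applying
ASIS queries to the element attached to the edge.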

A control flow view is an object of the following type:

   type View_Type is
      record
         Graph : Control_Graph.Graph;
         Start : Control_Graph.Vertex;
         Body_Decl : Asis.Declaration;
         Handlers : Handler_Set.Set;
         Terminal : Control_Graph.Vertex;
      end record;

The Graph field is the CFG.  The start and terminal nodes of the CFG 
are located in the Start and Terminal fields, respectively.  Body_Decl 
is the ASIS declaration element from which the CFG was created.  Handlers 
is a set of objects of the following type:

   type Handler_Type is
      record
         Handler : Asis.Exception_Handler;
         Node : Control_Graph.Vertex;
      end record;

The set contains a Handler_Type object for each exception handler in
the program unit.  The object contains 1) a reference to the ASIS exception
handler element and 2) the root node of the CFG subgraph created from
the handler's statements.
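
To illustrate how the Start field and the handler root nodes give
access to the separate subgraphs, the sketch below runs a depth-first
search from each root over a small hand-built graph.  The integer node
numbering and the names Node_Set, Adjacency, Visit and Show are
hypothetical; the real view stores Control_Graph vertices.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Subgraph_Reach is

   --  Hypothetical numbering: 1 = start, 2 = statement, 3 = terminal,
   --  4 and 5 = statements of one exception handler.
   type Node is range 1 .. 5;
   type Node_Set is array (Node) of Boolean;
   type Adjacency is array (Node) of Node_Set;

   Graph : constant Adjacency :=
     (1 => (2 => True, others => False),    --  S -> 2
      2 => (3 => True, others => False),    --  2 -> T
      3 => (others => False),               --  T has no outgoing edge
      4 => (5 => True, others => False),    --  handler root -> 5
      5 => (3 => True, others => False));   --  5 -> T

   --  Depth-first search: mark every node reachable from N.
   procedure Visit (N : Node; Seen : in out Node_Set) is
   begin
      if not Seen (N) then
         Seen (N) := True;
         for M in Node loop
            if Graph (N) (M) then
               Visit (M, Seen);
            end if;
         end loop;
      end if;
   end Visit;

   --  Print the nodes reachable from Root.
   procedure Show (Title : String; Root : Node) is
      Seen : Node_Set := (others => False);
   begin
      Visit (Root, Seen);
      Put (Title & ":");
      for N in Node loop
         if Seen (N) then
            Put (Node'Image (N));
         end if;
      end loop;
      New_Line;
   end Show;

begin
   Show ("Main subgraph", 1);      --  rooted at the Start field
   Show ("Handler subgraph", 4);   --  rooted at a Handler_Type.Node
end Subgraph_Reach;
```

In this example the search from the start node reaches nodes 1, 2 and
3, while the search from the handler root reaches nodes 3, 4 and 5.
The handler statements are not reachable from the start node, which is
why the Handlers set is needed to locate them.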

      CFG Structural Details
      ----------------------

This section describes the structure and attributes of the CFG nodes
and edges created for each kind of Ada statement.  The following is
provided for each statement:

   * The value of the Kind field for each node.

   * The value of the Kind and Element fields for each edge.

   * The target node of each edge leaving a node.



-> (Program Unit Start)

	Node Kind : Start

	Outgoing Edge:
    		Kind : Prog_Unit_Start
    		Element : Nil_Element

    	The target of the edge is the first statement of the unit.

-> null_statement
-> assignment_statement
-> delay_statement
-> procedure_call_statement
-> entry_call_statement
-> abort_statement
-> code_statement

	Node Kind : Statement

	Outgoing Edge:
    		Kind : Continuation
    		Element : Nil_Element

	The target of the edge is the statement that is executed
	immediately after the statement.

-> goto_statement

	Node Kind : Statement

	Outgoing Edge:
    		Kind : Continuation
    		Element : Nil_Element

	The target of the edge is the labelled statement referenced
	by the GOTO statement.

-> case_statement

	Node Kind : Statement
	
        Outgoing Edges (one per case alternative) :
		Kind : Case_Alt
		Element : A_Case_Statement_Alternative
	
	The target of each edge is the first statement of the
	alternative.


-> selective_wait

	Node Kind : Statement

	Outgoing Edges (one per select alternative/ELSE part) :
		Kind : Select_Arm
		Element : A_Select_Statement_Arm
				(A_Selective_Wait_Select_Arm,
				 A_Selective_Wait_Or_Arm,
				 A_Selective_Wait_Else_Arm )

	The target of an edge representing a terminate
	alternative is the terminal node of the CFG.

	The target of all other edges is the first statement of
        the alternative/ELSE part.

-> conditional_entry_call

	Node Kind : Statement

	Outgoing Edges (one per alternative) :
		Kind : Select_Arm
		Element : A_Select_Statement_Arm
				(A_Conditional_Entry_Call_Select_Arm,
				 A_Conditional_Entry_Call_Else_Arm )

	The target of each edge is the first statement of the
	alternative.

-> timed_entry_call

	Node Kind : Statement

	Outgoing Edges (one per alternative) :
		Kind : Select_Arm
		Element : A_Select_Statement_Arm
				(A_Timed_Entry_Call_Select_Arm,
				 A_Timed_Entry_Call_Or_Arm )

	The target of each edge is the first statement of the
	alternative.

-> block_statement

	Node Kind : Statement

	Outgoing Edge :
		Kind : Block_Body_Start
		Element : Nil_Element

	The target of the edge is the first statement in the block.

-> accept_statement

   CASE 1: accept statement WITHOUT a do...end body.

	Node Kind : Statement

	Outgoing Edge : 
		Kind : Continuation
		Element : Nil_Element

	The target of the edge is the statement that is executed
	immediately after the accept statement.

   CASE 2: accept statement WITH a do...end body.

	Node Kind : Statement

	Outgoing Edge : 
		Kind : Accept_Body_Start
		Element : Nil_Element

	The target of the edge is the first statement of the body.

-> loop_statement

   CASE 1: unconditional loop

	Node Kind : Statement

	Outgoing Edge :
		Kind : Continuation
		Element : Nil_Element

	The target of the edge is the first statement within the loop.

   CASE 2: conditional (FOR, WHILE) loop

	Node Kind : Statement

	Outgoing Edge : 
		Kind : Condition_True
		Element : Nil_Element

	Outgoing Edge :
		Kind : Condition_False
		Element : Nil_Element

	The target of the Condition_True edge is the first statement
	within the loop.

	The target of the Condition_False edge is the statement that
	is executed immediately after the loop statement.

   For both cases, each statement in the loop body that can transfer
   control back to the beginning of the loop (for example, the last
   statement of the body) is the source of an additional edge whose
   target is the loop statement.  The Kind of the edge is Continuation
   and the Element is Nil_Element.
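
The loop rules above can be sketched as follows for a WHILE loop whose
body contains only simple statements.  The node numbering and the names
Edge, Edge_List and Loop_Edges are hypothetical and chosen for this
example; the Element field (always Nil_Element here) is omitted.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure While_Loop_Edges is

   type Edge_Kind is (Condition_True, Condition_False, Continuation);

   type Edge is record
      Source, Target : Positive;
      Kind           : Edge_Kind;
   end record;
   type Edge_List is array (Positive range <>) of Edge;

   --  Edges for a WHILE loop whose body holds Body_Length simple
   --  statements.  Hypothetical numbering: node 1 is the loop statement,
   --  nodes 2 .. Body_Length + 1 are the body, and node Body_Length + 2
   --  is the statement executed after the loop.
   function Loop_Edges (Body_Length : Positive) return Edge_List is
      Last_Body : constant Positive := Body_Length + 1;
      Result    : Edge_List (1 .. Body_Length + 2);
   begin
      Result (1) := (1, 2, Condition_True);
      Result (2) := (1, Last_Body + 1, Condition_False);
      for N in 2 .. Last_Body - 1 loop
         Result (N + 1) := (N, N + 1, Continuation);   --  sequencing
      end loop;
      Result (Result'Last) := (Last_Body, 1, Continuation);   --  back edge
      return Result;
   end Loop_Edges;

   Edges : constant Edge_List := Loop_Edges (Body_Length => 2);

begin
   for I in Edges'Range loop
      Put_Line (Positive'Image (Edges (I).Source) & " ->"
                & Positive'Image (Edges (I).Target) & "  "
                & Edge_Kind'Image (Edges (I).Kind));
   end loop;
end While_Loop_Edges;
```

For a two-statement body the sketch produces four edges: the two
branches out of the loop statement, one sequencing edge inside the
body, and the edge from the last body statement back to the loop
statement.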

-> exit_statement

   CASE 1: unconditional exit

	Node Kind : Statement
	
	Outgoing Edge :
		Kind : Continuation
		Element : Nil_Element

        The target of the edge is the statement that is executed
	immediately after the exited loop.

   CASE 2: conditional exit (EXIT...WHEN)

	Node Kind : Statement

	Outgoing Edge :
		Kind : Condition_True
		Element : Nil_Element

	Outgoing Edge :
		Kind : Condition_False
		Element : Nil_Element

     	The target of the Condition_True edge is the statement that 
  	is executed immediately after the exited loop.

	The target of the Condition_False edge is the statement that
	is executed immediately after the exit statement when the
	branch is not taken.
		
-> return_statement

	Node Kind : Statement

	Outgoing Edge : 
		Kind : Continuation
		Element : Nil_Element

	The target of the edge varies depending upon the context of
	the return statement.  If the return occurs within an accept
	statement, the target is the statement that is executed 
	immediately after the innermost accept.  Otherwise, the target
	is the terminal node of the CFG.

-> if_statement

	The IF statement does not map to a single CFG node.  Instead, 
	there is one node for each An_If_Statement_Arm element
	with subkind An_If_Arm or An_Elsif_Arm.  (These represent the 
	"IF condition THEN" and "ELSIF condition THEN" constructs, 
	respectively; each constitutes a two-way branch.)

	Node Kind : If_Statement_Arm

	Outgoing Edge : 
		Kind : Condition_True
		Element : Nil_Element

	Outgoing Edge : 
		Kind : Condition_False
		Element : Nil_Element

	The target of the Condition_True edge is always the first
        statement of the arm.

	The target of the Condition_False edge varies based on the
	kind of arm appearing after the arm in the IF statement:

		* If An_If_Arm or An_Elsif_Arm is followed by
		  An_Elsif_Arm, the target is the latter An_Elsif_Arm.

		* If An_If_Arm or An_Elsif_Arm is followed by 
		  An_Else_Arm, the target is the first statement of
		  the An_Else_Arm.

		* If An_If_Arm or An_Elsif_Arm is followed neither by
		  An_Elsif_Arm nor An_Else_Arm, the target is the
		  statement executed immediately after the IF
		  statement.

	(See the output of test "if1a" in the "tests" directory for 
	an example CFG containing IF statements.)
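
The three rules for the Condition_False target can be written as a
small decision function.  The sketch below is self-contained and only
models the arm kinds; Arm_List, Target_Kind and False_Target are
hypothetical names, not declarations from the subsystem.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure If_False_Targets is

   type Arm_Kind is (An_If_Arm, An_Elsif_Arm, An_Else_Arm);
   type Arm_List is array (Positive range <>) of Arm_Kind;

   type Target_Kind is (Next_Elsif_Arm, First_Else_Statement, After_If);

   --  Condition_False target for the conditional arm at index I.
   function False_Target (Arms : Arm_List; I : Positive) return Target_Kind is
   begin
      if I = Arms'Last then
         return After_If;               --  no arm follows
      elsif Arms (I + 1) = An_Elsif_Arm then
         return Next_Elsif_Arm;         --  node of the following ELSIF
      else
         return First_Else_Statement;   --  first statement of the ELSE
      end if;
   end False_Target;

   --  Models: if ... elsif ... else ... end if;
   Example : constant Arm_List := (An_If_Arm, An_Elsif_Arm, An_Else_Arm);

begin
   for I in Example'Range loop
      if Example (I) /= An_Else_Arm then
         Put_Line (Arm_Kind'Image (Example (I)) & " => "
                   & Target_Kind'Image (False_Target (Example, I)));
      end if;
   end loop;
end If_False_Targets;
```

The ELSE arm is skipped in the loop because it has no condition and
therefore no node of its own in the CFG.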

-> raise_statement

	CFG construction applies the rules of LRM 11.4.1 in 
	determining the target node of an edge originating at a 
	raise statement.  

   CASE 1: raise statement with a name (e.g., "raise foo;")

	Node Kind : Statement

	Outgoing Edge : 
		Kind : Handled_Raise -OR- Propagated_Raise
		Element : base Entity_Name_Definition of the raised
		          exception

	If the raised exception is handled by a handler belonging to 
	an enclosing frame within the body, then the target node is 
	the first statement of that handler, and the edge kind is
	Handled_Raise. Otherwise, the target node is the terminal
	node of the CFG, indicating the exception is propagated 
	out of the unit, and the edge kind is Propagated_Raise.

   CASE 2: raise statement without a name (such a statement must
	   occur in an exception handler), and the enclosing handler 
	   has one or more named exception choices (e.g., "when A | B").

	Node Kind : Statement

	Outgoing Edges (one per exception choice) : 
		Kind : Handled_Raise -OR- Propagated_Raise
		Element : base Entity_Name_Definition of the exception
			  named in the choice

	(The idea behind this case is that the raise statement can
	re-raise each of the exceptions handled by the handler. 
	We conservatively assume that each of the named exceptions
	can somehow reach the handler (it would be impossible to
        determine whether they actually do without performing a global
        flow analysis). Logically, there is a distinct transfer of 
	control out of the raise statement for each exception.  We 
	apply the same rules as in CASE 1 to determine the target node
	of each edge.)

	If the re-raised exception is handled by a handler belonging to 
	an enclosing frame within the body, then the target node is 
	the first statement of that handler, and the edge kind is
	Handled_Raise. Otherwise, the target node is the terminal
	node of the CFG, indicating the exception is propagated 
	out of the unit, and the edge kind is Propagated_Raise.

   CASE 3: raise statement without a name (such a statement must
	   occur in an exception handler), and the enclosing handler 
	   has a "when others" choice.

	Node Kind : Statement

	Outgoing Edges : NONE!!

	(In this case, the set of exceptions that can be re-raised is
	unknown.  Again, without global flow analysis we cannot
	make any assumptions about the exceptions that reach the 
	handler.  Thus, there are no identifiable transfers of control
	out of the statement.  We indicate this by leaving a dangling
	node.)
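
A minimal sketch of the CASE 1 decision follows.  It assumes the
enclosing frames are already known and listed innermost first, and it
treats a "when others" handler in an enclosing frame as handling any
named exception (an assumption of this example, in keeping with the
usual Ada semantics).  The names Exc, Frame, Frame_List and Classify
are hypothetical.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Raise_Edge_Demo is

   --  Hypothetical exceptions declared in the analysed unit.
   type Exc is (Overflow, Underflow, Bad_Input);
   type Exc_Set is array (Exc) of Boolean;

   --  One entry per enclosing frame that has handlers, innermost first.
   type Frame is record
      Named      : Exc_Set;   --  exceptions named in handler choices
      Has_Others : Boolean;   --  frame has a "when others" handler
   end record;
   type Frame_List is array (Positive range <>) of Frame;

   type Raise_Edge_Kind is (Handled_Raise, Propagated_Raise);

   --  Edge kind for "raise E;" issued inside the given frames.
   function Classify (E : Exc; Frames : Frame_List) return Raise_Edge_Kind is
   begin
      for I in Frames'Range loop
         if Frames (I).Named (E) or else Frames (I).Has_Others then
            return Handled_Raise;   --  target: first statement of handler
         end if;
      end loop;
      return Propagated_Raise;      --  target: terminal node
   end Classify;

   Enclosing : constant Frame_List :=
     (1 => (Named      => (Overflow => True, others => False),
            Has_Others => False),
      2 => (Named      => (Underflow => True, others => False),
            Has_Others => False));

begin
   for E in Exc loop
      Put_Line (Exc'Image (E) & " : "
                & Raise_Edge_Kind'Image (Classify (E, Enclosing)));
   end loop;
end Raise_Edge_Demo;
```

In this example Overflow and Underflow are classified as Handled_Raise
and Bad_Input as Propagated_Raise.  For CASE 2, the same function could
be applied once per exception named in the enclosing handler's choices.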


      CFG Construction
      ----------------

The algorithm for building the CFG is based on the one described in
[Bieman 88].  The paper presents an algorithm for analyzing Pascal
programs; we extended the approach for Ada.


REFERENCES
----------

[Bieman 88] Bieman, "A Standard Representation of Imperative Language
            Programs for Data Collection and Software Measures 
            Specification", The Journal of Systems and Software, 8, 
            pp 13-37, 1988.

[Booch 87]  Grady Booch, "Software Components with Ada", Benjamin-Cummings, 
            1987.