From cooper@longshot.ds.boeing.com Thu Aug 17 23:00:09 1995
Return-Path:
Date: Thu, 17 Aug 1995 19:32:50 +0800
From: cooper@longshot.ds.boeing.com (Dan Cooper)
To: ASISWG-technical@sw-eng.falls-church.va.us
Subject: Issue #009: Client/Server ASIS Implementation
Cc: tomt@longshot.ds.boeing.com, stan@longshot.ds.boeing.com,
    gbd@Rational.COM, karlw@Rational.COM, gabrielb@Rational.COM
Content-Length: 2913
X-Lines: 60
Status: RO

ASISWG,

Below is an excerpt from the issue regarding client-server
implementations of ASIS:

> !ASIS Issue #009
> !topic Don't Preclude a Client/Server ASIS Implementation
> ...
> What many third-party tools want to do is to connect to a totally
> separate executable image of ASIS, rather than always have to link
> to ASIS and thus include a large amount of code in their tool's
> executable image.
>
> Currently a tool is tied to a particular implementation, version,
> etc. of an Ada compilation suite, to which ASIS is very tightly
> tied. It's basically a configuration management problem.
>
> Eventually, we would like to see a separate ASIS engine and a
> separate executable -- in a client/server type of relationship.
> This would open the Ada tool market up considerably.

I would like to add yet another justification, and argue strongly that
this approach *ought* to be how ASIS is implemented.

The CM problems mentioned above are certainly valid and would constrain
a third-party vendor who wants to market ASIS-based tools on a long-term
basis. Also, now that we have some experience using Rational's ASIS, we
can confirm the excessive file sizes created for executables: a trivial
"hello world" level of tool (for example, to traverse an Ada file and
count the number of declarations) consumes at least 17 MB on the disk.

As disk prices keep dropping, who really cares, right? Well, there's
another, more relevant consequence: on our SPARC10, the (Unix) linker
requires at least 5 minutes to build the monster!
This is in marked contrast to non-ASIS development, and has had a severe
impact on our productivity. This leads to a couple of considerations for
a developer using ASIS:

* process impact: After you've thought about your design and arrived at
  a strategy, it is frustrating/counterproductive to be forced to crawl
  along; this discourages incremental development and also constrains
  you to *minimize* the number of strategies you can afford to
  fruitfully investigate.

* documentation impact: The (typically) minimal ASIS documentation
  encourages/forces you to "try it and see" in order to fully understand
  details of what a function really does in a particular context. This
  is consistent with the fast turnaround needed for such
  experimentation, but is at odds with current ASIS implementations.

I think ASISWG should strongly encourage ASIS implementors to build to a
client/server architecture. This will not only mitigate the CM and disk
space issues, but also will significantly boost client productivity by
reducing turnaround time between builds.

C.
Daniel Cooper ================================================
Adv Computing Technologist | processes       | all opinions are |
206-655-3519               | + architectures | strictly my own, |
Cooper@Boeing.com          | = systems       | NOT my employers |

From danr@spartan.ssd.hcsc.com Fri Aug 18 08:29:30 1995
Return-Path:
From: danr@spartan.ssd.hcsc.com (Dan Rittersdorf)
Subject: Re: Issue #009: Client/Server ASIS Implementation
To: tawny!longshot.ds.boeing.com!cooper (Dan Cooper)
Date: Fri, 18 Aug 1995 07:19:21 -0400 (EDT)
Cc: ASISWG-technical@sw-eng.falls-church.va.us, tomt@longshot.ds.boeing.com,
    stan@longshot.ds.boeing.com, gbd@Rational.COM, karlw@Rational.COM,
    gabrielb@Rational.COM
Reply-To: Dan.Rittersdorf@mail.hcsc.com
Content-Type: text
Content-Length: 5558
X-Lines: 119
Status: RO

Dan,

> ASISWG,
>
> Below is an excerpt from the issue regarding client-server
> implementations of ASIS:
>
> > !ASIS Issue #009
> > !topic Don't Preclude a Client/Server ASIS Implementation
> ...
> > Eventually, we would like to see a separate ASIS engine and a
> > separate executable -- in a client/server type of relationship.
> > This would open the Ada tool market up considerably.
>
> I would like to add yet another justification, and argue strongly that
> this approach *ought* to be how ASIS is implemented.
> ....
> As disk prices keep dropping, who really cares, right? Well, there's
> another more relevant consequence to this: on our SPARC10, the (Unix)
> linker requires at least 5 minutes to build the monster! This is in
> marked contrast to non-ASIS development, and has had a severe impact on
> our productivity. This leads to a couple of considerations for a
> developer using ASIS: ...
>
> I think ASISWG should strongly encourage ASIS implementors to build to
> a client/server architecture. This will not only mitigate the CM and
> disk space issues, but also will significantly boost client productivity
> by reducing turnaround time between builds.

I agree, but not entirely for the reasons you cite.
These are implementation issues that an ASIS provider could get around
by, for example, providing the bodies as a shared object. On the SPARC,
you should have dynamic linking available to you, right?

The reasons you cite above are the very justifications we use for
providing HAPSE customers the ability to produce dynamically linkable
shared objects from their Ada sources. You can even slip a new dynamic
object in after linking and have the application use it the next time it
runs. It's a great way to reduce turnaround time between builds, or to
slip a fix in while debugging.

On the other hand, if you have enough ASIS calls in your application to
increase the link time that much, they'll be there whether you link with
a direct ASIS or a client-server ASIS. If the ASIS *implementation*
causes a lot of linker work, then changing to a less intensive ASIS
client implementation, or a shared object implementation, might save you
link time. If you link with a "stubbed out" version of ASIS, with empty
bodies, would you still have large link times?

On the other hand, if ASISWG, or ASISRG, or "the friends of ASIS", or
some kind individual with entirely too much time on their hands were to
provide a PD (or otherwise "free") implementation of ASIS on ASIS, with
some client-server technology (such as IP or UNIX sockets) in the
middle... This technology would serve several purposes:

1: It would free the user's application from a particular vendor
   implementation, without so much as relinking their application. (The
   client half could "talk" to any ASIS server.)

2: It *might* reduce the amount of work a linker has to do. If the
   client is also a shared object, you can effectively use lazy binding
   to distribute that link-time overhead throughout the execution of
   your program. At the very least, you can delay it until startup of
   your executable.

3: It would ensure that we meet that design goal stating that we don't
   want to preclude a client-server implementation.
   What better proof than to provide one alongside (NOT IN) the
   standard.

4: If the server side were implemented using an ASIS provider's ASIS
   implementation (ASIS atop ASIS), then the ASIS user could build the
   ASIS client-server themselves. They could rebuild the server for
   another provider's ASIS implementation, and get their application
   speaking to the new server without rebuilding or relinking the
   application/client. The ASIS providers would continue to provide the
   standard, "direct" ASIS implementation.

If I only had time and money on my hands...

In the short term, if you can't get the vendor to provide it, you can
produce a dynamic object version of ASIS, if the Ada vendor supports the
generation of Ada shared objects even slightly. It would mean providing
a thin binding to ASIS, with the bodies implemented using "pragma
interface Ada" (or UNCHECKED) and externally named. Then provide an
implementation of the interfaced routines that simply calls the real
ASIS. Link that implementation with ASIS once, and produce a shared
object. Link your ASIS application with the thin ASIS binding and with
the shared object. Well, after you get the shared object to elaborate,
it should work in theory... :-) It's only software. Think of it as
implementing the client-server interface using "pragma interface" and a
dynamic linker instead of RPC. :-)

> C. Daniel Cooper ================================================
> Adv Computing Technologist | processes       | all opinions are |
> 206-655-3519               | + architectures | strictly my own, |
> Cooper@Boeing.com          | = systems       | NOT my employers |

--
-danr <><
______________________________________________________________________________
Dan.Rittersdorf@mail.hcsc.com          178 Washington St
Harris Computer Systems Corporation    Sparta MI 49345
Ft.
Lauderdale FL 33309                    Ph: (616)887-5431
______________________________________________________________________________

From cooper@longshot.ds.boeing.com Wed Aug 30 15:57:03 1995
Return-Path:
Date: Wed, 30 Aug 1995 12:03:06 +0800
From: cooper@longshot.ds.boeing.com (Dan Cooper)
To: ASISWG-Technical@sw-eng.falls-church.va.us
Subject: Re: Issue #009: Client/Server ASIS Implementation
Content-Length: 7083
X-Lines: 142
Status: RO

All,

I have permission from Rational to post their follow-up to the recent
note I sent out on this subject. I find it a convincing response: my
note took the developer's view, whereas theirs addresses the user's
view; certainly, the user should take precedence. The final paragraph
gives the needed insight as to why a client/server approach is probably
not really viable.

--Dan Cooper

----- Begin Included Message -----

Depends on whether you anticipate asking ASIS for simple characteristics
of Library Units or doing detailed analysis. I can't imagine that anyone
has the patience to wait for a message-passing implementation to
complete on a large, Ada Analyzer-like, application. Were performance
not an issue, lots of things would be possible.

> I would like to add yet another justification, and argue strongly that
> this approach *ought* to be how ASIS is implemented.
>
> The CM problems mentioned above are certainly valid and would constrain
> a third-party vendor who wants to market ASIS-based tools on a long-term
> basis. Also, now that we have some experience using Rational's ASIS, we
> can confirm the excessive file sizes created for executables: a trivial
> "hello world" level of tool (for example, to traverse an Ada file and
> count the number of declarations) consumes at least 17 MB on the disk.

The primary reason that Rational's ASIS is so large is that it uses
various mechanisms that were designed for our compiler. Consequently
they are vastly larger and more complex than ASIS requires.
The ASIS library contains a very large fraction of the Rational Ada
compiler. Not because it wants to include all that, but because there's
presently no way to eliminate the unwanted functionality.

The size of our ASIS library says absolutely nothing, either good or
bad, about whether a server-based ASIS implementation is a worthwhile
idea or not.

> As disk prices keep dropping, who really cares, right? Well, there's
> another more relevant consequence to this: on our SPARC10, the (Unix)
> linker requires at least 5 minutes to build the monster! This is in
> marked contrast to non-ASIS development, and has had a severe impact on
> our productivity. This leads to a couple of considerations for a
> developer using ASIS:

While development agrees that 30-minute link times might be
objectionable, they see no problem with link times in the
single-digit-minutes range. They are in fact envious.

> * process impact: After you've thought about your design and arrived at
> a strategy, it is frustrating/counterproductive to be forced to crawl
> along; this discourages incremental development and also constrains you
> to *minimize* the number of strategies you can afford to fruitfully
> investigate.

<< They don't understand why a server would make this any different?
Just because link times *might* go down? >>

Dan: Yes, by speeding up the turnaround time.

> * documentation impact: The (typically) minimal ASIS documentation
> encourages/forces you to "try it and see" in order to fully understand
> details of what a function really does in a particular context. This is
> consistent with the fast turnaround needed for such experimentation,
> but is at odds with current ASIS implementations.

<< don't know what is meant by "typically" >>

It doesn't apply to Rational's ASIS. We have several thousand pages of
online ASIS documentation. It is replete with examples, explanations,
and cross references.

A server would make no difference here.
Dan: What I'm referring to here is the *fact* that no ASIS documentation
can possibly describe exactly what some query really returns in some
particular (unusual or rare) case of Ada code: the developer must resort
to "try it and see", hence an experimental approach that is frustrated
by slow turnaround.

> I think ASISWG should strongly encourage ASIS implementors to build to
> a client/server architecture. This will not only mitigate the CM and
> disk space issues, but also will significantly boost client productivity
> by reducing turnaround time between builds.

There is no such thing as a free lunch.

To send a message from one process to another takes a "long" time. Take
the average RISC machine today: such a machine running Unix will
typically take anywhere from 0.1 millisecond to several seconds to move
a message from one process to another.

The 0.1 millisecond is highly unlikely to occur in practice. When you
get two programs, locked into memory, furiously sending messages back
and forth, you get the best-case scenario. That is, nobody is doing any
paging, everybody is always ready to read/write, and nobody else on the
machine is interfering. In testing our Apex message mechanisms we've
sometimes seen message times as low as just a few milliseconds, but
typical message times are more like 10 .. 100 milliseconds. And
sometimes Unix boxes simply decide to take a very long time, multiple
seconds, to deliver a message. It's a combination of load, paging, page
replacement algorithms, etc.

But let's take 0.1 millisecond per message. That optimistic number will
show us some things.

When talking with ASIS, you have to make a function call for each and
every ASIS element that you wish to traverse. With a server that's 0.1
millisecond per call plus whatever ASIS internal processing time is
required.
If you take the average line of code and break it down into ASIS
elements, you will find that this mythical average line of code has
somewhere around 8 elements in it (for A := A + 1; you get 8). If we
take a medium-sized Ada unit of 1000 lines, we can easily estimate that
it will have at least around 8000 ASIS elements. (Actually, it will
typically have a lot more.)

8000 elements * 0.1 millisecond per element => 0.8 seconds

That's 0.8 seconds elapsed time just to visit all of the elements in the
unit. And if we take a more realistic 10 milliseconds instead of this
0.1 millisecond, we end up with 80 seconds, or about one minute, to
traverse one unit. Even 1 millisecond gives us 8 seconds just to
traverse one unit.

I have never actually seen the 0.1 millisecond number occur in practice.
It's something that CPU vendors claim to deliver but seldom actually do
deliver.

While I was pondering this message I wrote a little test scenario. It's
just two little programs that talk to each other over a Unix pipe. One
sends; the other receives, and then sends back. They do this 100_000
times. I tried it with a little message of 8 characters and a longer
message of 64 characters. I'm working with a Sparc 10, nobody on it but
me. I ran each test 4 times just to get average numbers. I can't do
better than 1.2 milliseconds per message. That means around 10 seconds
to traverse one little ASIS unit on a totally unloaded machine.

That free lunch isn't so free after all. ASIS data is too "fine grained"
to be workable using servers. All transactions are synchronous too.
Implementing an ASIS server would be relatively easy. But nobody would
ever use it a second time.
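[Editorial note: the two little test programs mentioned above are not
included in the message. The round-trip experiment they describe can be
sketched in modern Python; the iteration count is scaled down so it runs
quickly, and absolute numbers will vary wildly with hardware and load,
which is exactly the point being made.]

```python
# Sketch of the described test: two processes exchange a small fixed-size
# message over a pair of Unix pipes, timing the round trips. POSIX-only.
import os
import time

MSG = b"12345678"  # the 8-character message from the described test

def pipe_round_trip_ms(round_trips=1000):
    c2p_r, c2p_w = os.pipe()   # child -> parent
    p2c_r, p2c_w = os.pipe()   # parent -> child
    pid = os.fork()
    if pid == 0:               # child: echo every message back
        os.close(c2p_r); os.close(p2c_w)
        for _ in range(round_trips):
            data = os.read(p2c_r, len(MSG))
            os.write(c2p_w, data)
        os._exit(0)
    os.close(c2p_w); os.close(p2c_r)
    start = time.perf_counter()
    for _ in range(round_trips):
        os.write(p2c_w, MSG)
        os.read(c2p_r, len(MSG))
    elapsed = time.perf_counter() - start
    os.waitpid(pid, 0)
    return elapsed * 1000.0 / round_trips   # milliseconds per round trip

if __name__ == "__main__":
    ms = pipe_round_trip_ms()
    print("%.4f ms per round trip" % ms)
    # The arithmetic from the message: per-message cost times 8000 elements.
    print("time to visit 8000 elements: %.2f s" % (8000 * ms / 1000.0))
```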
----- End Included Message -----

From cooper@longshot.ds.boeing.com Tue Nov 14 19:18:22 1995
Return-Path:
Date: Tue, 14 Nov 1995 16:02:55 -0800
From: cooper@longshot.ds.boeing.com (Dan Cooper)
To: ASIS-Officers@sw-eng.falls-church.va.us
Subject: more on client-server architecture for ASIS
Content-Length: 44086
X-Lines: 855
Status: RO

All,

The recent ASISWG meeting generated the following action item:

> 9511-11. STEVE BLAKE: Write a statement for Issue #009 concerning
> the fact that the ASIS specification does not preclude a
> client/server implementation; Issue #009 will then become
> Approved.

Before doing that, we should probably append this follow-up (below) to
some of the questions raised in the earlier email I forwarded. I'm
pretty sure the author is Gary Barnes, although I received it
anonymously via our Rational tech rep. It's surprisingly lengthy.

--Dan Cooper

----- Begin Included Message -----

From: karlw@Rational.COM
Date: Fri, 13 Oct 95 08:45:07 PDT
Subject: FW: Re: client-server architecture for ASIS
To: cooper@longshot.ds.boeing.com

Here is the final word on an ASIS client-server architecture.

> > 1. Would modifications to the ASIS architecture or
> > usage model assist a server approach?

No and Yes. Far more No than Yes.

(No) It's the inherent nature of the information. We want to be able to
walk over the entire unit and see every little detail. Even if we only
consider the syntactic details and ignore all the semantic information,
there are thousands (tens to hundreds of thousands in large units) of
things to see. Each one is either the result of or the input to yet
another client-server interaction. Given that each interaction is N
milliseconds, that leads to M seconds to fully examine each unit. Given
hundreds of units (small program, X*10K SLOC) or thousands of units
(medium program, X*100K SLOC) or tens of thousands of units (large
program, X*1M SLOC), the processing time becomes too unwieldy to
actually use.
(Yes) If ASIS were really high level, meaning that ASIS directly
implements everything that, or much of what, anyone could ever want to
ask, then only the "answers" would ever have to flow back across.
However, that would make ASIS of equal or greater complexity than the
average compiler. A chilling thought.

To use an analogy, imagine any ordinary database. In a client-server
approach, there are two extremes. At one extreme, the database link does
nothing but fetch/store individual records (perhaps only individual
fields of individual records). At the other extreme, the database does
everything and the "client" is nothing more than a remote user interface
for the "server", which is now really the entire application (actually
it has become all possible applications).

In the real world, database client-servers are in the middle. The client
is a user interface that allows the user to create queries. The queries
from the user may be simple (e.g. visual) or may be complex (the user
may be allowed to program directly in the query language). The client
side then sends a query program (note: *program*, it sends a *program*)
to the server side. The server side executes the program and returns the
result. The server side *executes* the program because the server side
has easy/fast access to the data.

ASIS is at the first extreme. That's why it won't work. The
communications cost is too extreme. ASIS needs to be more towards the
other extreme. That's the only way to get the communications cost down
to where it might be viable.

An ASIS query language would be the best way to go. Why? Because we
don't know all possible queries, and the only viable compromise is to
have a query language. The only way client-server ever works is to have
the work done on the side that has access to the data needed to get the
work done. When the work to be done cannot be predicted ahead of time,
the only option is to have a work language so that the client-server can
exchange programs written in that language.
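[Editorial note: the contrast between the two extremes can be made
concrete with a toy sketch in Python. Nothing here is real ASIS; the
server class, element names, and counters are invented purely to show
how the number of client-server round trips differs between fetching
elements one at a time and shipping the whole query to the data.]

```python
# Toy model: the "server" owns a parsed unit's elements. A fine-grained
# client pays one round trip per element; a query-shipping client pays one
# round trip total, because the search runs where the data lives.
class ToyServer:
    def __init__(self, elements):
        self.elements = elements
        self.interactions = 0            # count of client-server round trips

    def fetch(self, index):              # fine-grained: one element per call
        self.interactions += 1
        return self.elements[index]

    def run_query(self, predicate):      # query-language style: ship the query
        self.interactions += 1
        return [e for e in self.elements if predicate(e)]

# A made-up unit of ~6000 "elements", half of them identifiers.
unit = ["procedure", "identifier", "assignment",
        "identifier", "literal", "identifier"] * 1000

# Fine-grained traversal: one round trip per element.
srv = ToyServer(unit)
hits = [srv.fetch(i) for i in range(len(unit))].count("identifier")
fine_grained = srv.interactions          # == len(unit) round trips

# Query shipped to the server: the whole search costs one round trip.
srv2 = ToyServer(unit)
hits2 = len(srv2.run_query(lambda e: e == "identifier"))
assert hits == hits2
print(fine_grained, "round trips vs", srv2.interactions)
```

At 1 ms or more per real round trip, the first column is the one that
turns into minutes per unit; the second stays constant no matter how big
the unit is.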
In the case of ASIS, only the client side needs to be able to write and
send programs, and only the server side needs to be able to execute
programs and return results. (In some distributed applications, both
sides, or even all N sides, need to be able to do the writing, sending,
executing, and returning.)

> > 2. Is this an issue for distributed programming in general?

Of course. It always has been. It's fundamental to the genre.

If you can't break up a problem so that the various distributed pieces
are able to avoid heavy continuous synchronous interaction, then the
resulting performance will be terrible. It will always be directly
limited by the speed of that synchronous communication. This has been
known for as long as there have been distributed applications. One
explicit goal of most distributed designs is to make absolutely
everything asynchronous if at all possible. Another explicit goal is
usually to try to make sure that every piece has something to do while
it otherwise has to wait for a response to some query/action.

In most ASIS applications, the program is searching the library looking
for something, or it is searching and gathering statistics or other
information. In other words, the program has to examine some significant
portion of the library (perhaps all of it) in order to compute the
answer to whatever question it has.

My main model for an ASIS client-server (assuming that we don't get
radical and create a query language, and keep the present functional
interface) would be that each query causes a client-server interaction.
That's where the N*10K elements * M milliseconds/interaction results in
X seconds to walk just one unit, and it takes hours (or even days,
weeks, or months) to walk an entire library.

Another possible model (in a continuum of non-query-language models)
would be that the ASIS client side would cache entire unit values
instead of just element values -- doing this without the application
needing to do anything.
Then an element-to-element query that merely walked within a single unit
would not need to cross the communication link. However, this would
force the ASIS server to pre-process each unit when it was first
visited. This, in effect, means that the unit has to be fully traversed,
each element value precomputed, and every syntactic and semantic link
precomputed, and then the result sent to the client. This would make
things like Unit_Declaration very very very very very slow. (I would not
be surprised at seconds, but possibly fewer seconds than doing it
piecemeal the previous uncached way, and it would only happen once,
unless the application ran out of memory and ended up doing some kind of
"paging" or "swapping" of unit/element data, in which case the
application could end up "thrashing".)

Another possible model (at the extreme of the non-query-language
continuum) would be for the server to open a library, fully traverse the
library, and transmit it all to the client side. (Which would then
probably run out of memory, library information being at a minimum 10's
of Mb, and some number of Gb in the really large libraries that some of
our customers have; so a whole virtual memory system would have to be
constructed and used by the ASIS client side.) This would also take a
very long time to actually occur. It takes a long time to transmit Mb or
Gb over TCP/IP or some other connection protocol.

At this extreme, it would be better to have a "library dump" facility
that dumped the vendor library into some "universal" format which would
then be queried by a "universal" ASIS implementation. This
implementation is guaranteed to be slower than whatever the vendor would
otherwise do, just because it has to be "universal", and that would
extract various time/space prices.

> Karl,
>
> Here's a somewhat belated response to earlier mail on the subject.
> Please forward this to whoever it was that wrote the response:

---------------------------------------------------------------------

> > The size of our ASIS library says absolutely nothing, either good or
> > bad, about whether a server-based ASIS implementation is a worthwhile
> > idea or not.
>
> Of course. It's only an indicator of the volume of stuff the linker
> must deal with.
>
> > << They don't understand why a server would make this any
> > different? Just because link times *might* go down? >>
>
> That was the idea: to speed up link time. If a client-server model won't
> achieve this, my whole argument goes out the window (although the other
> arguments still stand).

The client-server model might well achieve this. However, the resulting
program would be usable only for testing. You'd never want to use it for
actual production work. It would be too slow by at least a couple of
orders of magnitude.

It should be very trivial for you to write a client-server. I spent a
couple of days messing with it and I was able to put together an Emacs
lisp program that reads in the ASIS package specifications and spits out
the two Ada sides of the server. With a little tweaking of the program
I've been able to compile both sides. I haven't gotten to run it yet;
our latest-greatest internal compiler has some problem producing code
for one of the units. I expect the eventual execution to be
entertaining.

I earlier did a test client-server that was just two programs spitting
integers at each other as fast as they could. Each side waited for the
other side's integer before spitting another integer back. The best
round-trip time I could achieve on a Sparc-10 using sockets was just
barely under one millisecond (got this one time only), and most of the
trials were >1 millisecond. So the expected round-trip time for this
test should be estimated at probably 1.5 milliseconds (no load) and >10
milliseconds (load average about 2.5 on the Sparc-10).
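[Editorial note: the integer ping-pong test described above is not
included either. A sketch of the same idea in Python, using a Unix
socket pair and a forked child instead of two separate programs, with
each side waiting for the other's 4-byte integer before replying:]

```python
# Sketch of a socket round-trip (ping-pong) test. POSIX-only (os.fork).
import os
import socket
import struct
import time

def socket_round_trip_ms(round_trips=1000):
    parent, child = socket.socketpair()
    pid = os.fork()
    if pid == 0:                       # child: echo each integer back
        parent.close()
        for _ in range(round_trips):
            data = child.recv(4)
            child.sendall(data)
        child.close()
        os._exit(0)
    child.close()
    start = time.perf_counter()
    for i in range(round_trips):
        parent.sendall(struct.pack("!I", i))
        assert struct.unpack("!I", parent.recv(4))[0] == i
    elapsed = time.perf_counter() - start
    parent.close()
    os.waitpid(pid, 0)
    return elapsed * 1000.0 / round_trips   # milliseconds per round trip

if __name__ == "__main__":
    print("%.4f ms per round trip" % socket_round_trip_ms())
```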
In any case, expect traversal times of at least 5-20 seconds for the
average unit with little or no other load on the machine. I expect the
actual ASIS client-server test to be noticeably slower, because the
message size will be rather more than 4 bytes and the programs have more
resource contention than my simple test. The extra time spent will be
largely due to the additional messing about in the operating system
caused by the variable-sized, and much larger, messages.

Messages that are returning "large" array values can be predicted to be
especially slow, based on observations we've made in Apex. Apex uses
large variable-length messages to communicate within its various pieces.
Messages over a Kb or so can be startlingly slow, and the degree of
slowness is not predictable in any way that we've found. N Mb messages
can take several 10's of seconds to propagate, far more than just the
address-load-address-store RISC memory cycles would seem to account for.
Also, their pre-message and post-message-pre-result executions will
cause more system paging activity, resulting in more pages of the other
side of the link being paged out; those pages will have to be paged back
in again on the next message, resulting in more lost time slowing down
the communication.

> > > * documentation impact: The (typically) minimal ASIS documentation
> > > encourages/forces you to "try it and see" in order to fully understand
> > > details of what a function really does in a particular context. This is
> > > consistent with the fast turnaround needed for such experimentation,
> > > but is at odds with current ASIS implementations.
> >
> > << don't know what is meant by "typically" >>
> >
> > It doesn't apply to Rational's ASIS. We have several thousand pages of
> > online ASIS documentation. It is replete with examples, explanations,
> > and cross references.
> >
> > A server would make no difference here.
> The point here is that no amount of documentation can tell you just what
> a specific query will really do with some rare or weird code example;
> or the ASIS implementation may have a bug such that it doesn't behave
> as documented. This leads to substantial "try it and see": a highly
> iterative process that is encumbered by slow link times.

At any given point in your application you know (or can know, by calling
a Kind interface) what kind of Element you have in your hand. Worst
case, all you have to do is start at the first page of the documentation
and search forward looking for references to that Kind of element. The
ASIS interface is such that while lots of interfaces may Return a
particular element Kind, only a very very few (often exactly one or two)
interfaces ever Accept a particular element Kind. Simply search forward
until you find one of the This-Interface-Accepts-These-Element-Kinds
tables that contains the element Kind you are dealing with. There is one
such table for every interface, and they all look the same, so they are
easy to recognize.

Myself, I just have a simple text file that contains all of the ASIS
package specifications concatenated into it. I just search through that
file with my editor until I find the right interface. It seldom takes
more than 2..3 seconds to find what I need. That way I don't have to
remember their names.

All of this information is in both the online documentation as well as
in the actual ASIS package specifications. So you can use either
approach. Every interface also has a complete Returns list of all
element Kinds it can return.

> > To send a message from one process to another takes a "long" time.
> > Take the average RISC machine today, such a machine running Unix will
> > typically take anywhere from 0.1 millisecond to several seconds to move
> > a message from one process to another.
> >
> > The 0.1 millisecond is highly unlikely to occur in practice.
When you
> > get two programs, locked into memory, furiously sending messages back
> > and forth, you get the best-case scenario. That is, nobody is doing
> > any paging, and everybody is always ready to read/write, and nobody
> > else on the machine is interfering. In testing our Apex message
> > mechanisms we've sometimes seen message times as low as just a few
> > milliseconds, but typical message times are more like 10 .. 100
> > milliseconds. And, sometimes Unix boxes simply decide to take a very
> > long time, multiple seconds, to deliver a message. It's a combination
> > of load, paging, page replacement algorithms, etc.
> >
> > But let's take 0.1 millisecond per message. That optimistic number will
> > show us some things.
> >
> > When talking with ASIS, you have to make a function call for each and
> > every ASIS element that you wish to traverse. With a server that's 0.1
> > millisecond per call plus whatever ASIS internal processing time is
> > required.
> >
> > If you take the average line of code, and break it down into ASIS
> > elements, you will find that this mythical average line of code has
> > somewhere around 8 (for A := A + 1;, you get 8) elements in it. If we
> > take a medium sized Ada unit of 1000 lines, we can easily estimate
> > that it will have at least around 8000 ASIS elements. (Actually, it
> > will typically have a lot more.)
> >
> > 8000 elements * 0.1 millisecond per element => 0.8 seconds
> >
> > That's 0.8 seconds elapsed time just to visit all of the elements in
> > the unit. And, if we take a more realistic 10 milliseconds instead
> > of this 0.1 millisecond, we end up with 80 seconds or about one minute
> > to traverse one unit. Even 1 millisecond gives us 8 seconds just to
> > traverse one unit.
> >
> > I have never actually seen the 0.1 millisecond number occur in
> > practice. It's something that CPU vendors claim to deliver but seldom
> > actually do deliver.
While I was pondering this message I wrote a
> > little test scenario. It's just two little programs that talk to each
> > other over a Unix pipe. One sends, the other receives, and then sends
> > back. They do this 100_000 times. I tried it with a little message
> > of 8 characters and a longer message of 64 characters. I'm working
> > with a Sparc 10, nobody on it but me. I ran each test 4 times just to
> > get average numbers. I can't do better than 1.2 milliseconds per
> > message. That means around 10 seconds to traverse one little ASIS
> > unit on a totally unloaded machine.
>
> > << Dan, you may want to try the above-described message-
> > << passing test on your machine. >>
>
> The above is a comprehensive and persuasive argument. Thanks! I certainly
> agree that user performance deserves much higher priority than developer
> convenience. Besides, as it happens, we have discovered a simpler solution
> to reducing link time, directly related to the Unix performance numbers
> cited above.
>
> An experiment showed that "hello world" links in 18 seconds; adding an
> (unused) "WITH Asis" jumps the time to 5 minutes. However, we then
> discovered that the 5-minute link time was substantially attributable to
> network traffic between the workstation where the program resided and the
> other workstation where the multi-megabytes of ASIS were stored. When the
> same program was made coresident on the latter, the link time dropped to
> an astonishing 30 seconds! So, as far as we're concerned, the linker issue
> has been resolved. You might keep this approach in mind to suggest to your
> other ASIS customers.

Yes, we also see this when we link Apex. People who have local disks
often have their personal set of Apex working views on those disks. They
often get link times which are large factors smaller than people who
have to link across a network for some reason. It's just a fact of life
on a busy network.
> > ASIS data is too "fine grained" to be workable using servers. All
> > transactions are synchronous too. Implementing an ASIS server would
> > be relatively easy. But nobody would ever use it a second time.
>
> I've gotten a couple of other responses, basically challenging this
> low-level, though simple, approach. One respondent says:
>
> > I'm still not convinced that a client-server architecture couldn't
> > be crafted. And yes, the client and server would both have to have
> > more intelligence than one might implement if they just built a
> > library of function calls.

This is a crucial point. The level of the interface needs to be higher
if we want client-server to be viable. However, nobody has been able to
come up with any sort of higher-level interface that anyone else has
ever been interested in.

There was one attempt at that. I forget which person from which company
or university. But at one point not too terribly long ago they very
proudly presented their results to us at one of the ASIS meetings. And
everyone was pretty underwhelmed. One person's high-level interface is
another person's who-cares interface, if it doesn't ask/answer the
right questions. One person's high-level interface is another person's
junk interface, if it doesn't ask/answer the questions with just the
right twist. Have you ever heard of a database, any sort of database,
where the interface had/knew all of the conceivable queries? Of course
not. That's why SQL exists for relational databases.

> > The ASIS client-server product could provide a client side
> > application which bundles up requests. And

You can't bundle when the next request depends on the result of this
request. Has this person ever used ASIS for anything? If your basic
problem is one of "searching", then there's no such thing as
"bundling". Most ASIS problems are ones of searching. "What is the
call graph?" (Search out all calls and see who makes them.) "What
variables are unused?"
(Search out all references and declarations and see which ones aren't
referenced.) "Does this program conform to our coding standards?"
(Search for violations.) Etc.

This is why relational databases have SQL. You write your search in
SQL, and then you ship that search program to the server. It does the
search and returns the result. Any usable sort of ASIS client-server
would need to have a programmable, query-language interface.

> > yes this would be more expensive technology. For instance, compilers
> > look ahead and determine how to optimize code. Programmers also do
> > optimization by writing smart applications. Thus, a programmer would
> > code their analysis program differently in a client-server
> > architecture than in a straight function call binding.

All irrelevant. What compilers do or don't do is irrelevant. Compilers
are nothing like database query systems, and ASIS is a database query
system (a crude one). Compilers aren't even client-server applications.
What programmers do is nothing like a database query system.
Programmers aren't client-server applications.

The only viable way to make ASIS into a usable (i.e. fast enough)
client-server would be to design some kind of query language for it.
That is, we would have to design a Turing-complete programming language
(or a 4GL). And then we would have to get vendors to implement it. And
then they would charge you thousands of dollars a seat for it, and
rightly so. It would be as much effort as doing an Ada compiler. Have
you ever read any of the SQL (and other database query system)
literature? It's really hard-core stuff. It's much harder to really
optimize SQL than it is to optimize Fortran, C, or Ada.

> > And yes the vendor would have to publish additional features/APIs,

Why? For what?

> > and yes you wouldn't have as much portability.

Why? I don't see any reason. Perhaps I just don't understand whatever
it is he's talking about.
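The ship-the-search-to-the-server point above can be made concrete with a toy example. Here the "unused variables" search is expressed as a single SQL query against an invented relational schema for ASIS-like elements (the table, columns, and sample rows are all made up for the sketch; this is not any real ASIS schema); the engine traverses the data where it lives, in one round trip:

```python
# Toy "ship the query" illustration: an invented element table, with the
# whole unused-variable search expressed as one SQL query rather than one
# synchronous call per element. Schema and data are hypothetical.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE elements (id INTEGER, kind TEXT, name TEXT)")
db.executemany("INSERT INTO elements VALUES (?, ?, ?)",
               [(1, "decl", "A"), (2, "decl", "B"), (3, "ref", "A")])

# One round trip: the query engine does the traversal server-side.
unused = db.execute("""
    SELECT name FROM elements
    WHERE kind = 'decl'
      AND name NOT IN (SELECT name FROM elements WHERE kind = 'ref')
""").fetchall()
print(unused)
```

The contrast with the fine-grained interface is the point: here the client sends one query and gets one answer back, instead of paying a message round trip for every element visited.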
> > But how come the windows, RPC, and database interfaces have gone to
> > client-server interfaces? And

Gone to? They were designed for that from the beginning. They are also
oranges to our apples.

WINDOWS: How come? Because windows (both X and the Microsoft stuff) is
not a database query system. They aren't any kind of query system.
Almost literally all traffic between a windows application and the
windowing system is asynchronous. The entire system was designed so
that whenever:

a) the application wants something to change, it just sends a message
out and assumes that the change has occurred. No reply occurs. There
are very few places, virtually all at application start-up time, where
a windows application asks any synchronous (can't go forward until I
get an answer) questions, and they are things like what fonts are there
and what colors do I have available.

b) the window system needs to indicate that something has changed, such
as your-window-is-now-an-icon, or your-window-changed-size, or
the-user-says-to-exit, or user-typed-this-key, etc. The window system
doesn't get a reply to these messages; it just goes on.

Comparing ASIS to windows is silly. Windows is not any kind of query
system. The two parts hardly interact at all most of the time. ASIS is
nothing but interaction. Windows is designed to be asynchronous and to
very, very seldom need or want to be synchronous. ASIS is purely
synchronous. If you can design your application so that it doesn't need
to be synchronous (fat chance, given the problem domain), then use Ada
tasking and pretend that the client-server is asynchronous.

RPC: RPC is exactly what we've been talking about already. RPC is
nothing more than remote-procedure-call. You call, then you wait for a
reply. For some things, you call, and then you don't wait, because
there is not going to be a reply. In the rest of the cases, you call,
and don't wait, and every once in a while you check a flag to see if
the answer is here yet.
Variations are call-wait-for-a-while and then change over to
check-a-flag-periodically if the reply isn't here yet. RPC is not an
application. RPC is not a database. RPC is not a query system. RPC is
not any kind of system at all. It's a tool for implementing other
things. It's just a (very slow) procedure call mechanism.

The RPC call-and-wait takes the N milliseconds we've talked about
before. The RPC call-and-don't-wait-no-reply is not of much use with
ASIS, since the interface is 99% question-answer; there are very few
interfaces that just make-something-happen without returning some kind
of result. The RPC call-and-I'll-check-a-flag-later stuff can be done
using Ada tasking if you actually find yourself in a situation where
you don't really need to be synchronous.

In any case, RPC isn't any kind of application. It's just a mechanism
that can be used to connect two remote pieces. It doesn't actually do
anything interesting in and of itself. It's the two pieces that matter.
RPC is just glue.

DATABASE: Databases use SQL or some other query language. You write
your program in SQL and ship the program to the server. The server
executes the program and returns the results. ASIS doesn't have any
sort of query language. Nobody has been able to come up with one. And
implementing such a thing is a vastly bigger project than doing ASIS as
it is now. Do you really think you can get the vendors to do that much
work? If so, how many thousands-per-seat do you think they will charge
for it? Lots, I bet. Want to pay that? ASIS right now is a freebie from
Rational.

> > you see tons of applications built using these APIs that are
> > slightly different and still the applications are making money and
> > are available on a wide range of computer platforms?

I don't care how many oranges somebody has used to make orange juice. I
still say that apples won't make that kind of juice.
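As an aside on the RPC patterns discussed above, the call-and-check-a-flag-later variant is easy to sketch. Here a background thread stands in for the remote side (a loose analogue of the Ada-tasking approach the text mentions, not a real RPC stack):

```python
# The "call, don't wait, check a flag later" pattern: fire the call, poll
# ready(), collect the result when it arrives. A thread stands in for the
# remote server side in this sketch.
import threading

class AsyncCall:
    def __init__(self, fn, *args):
        self.result = None
        self._done = threading.Event()   # the "flag" the caller checks
        self._thread = threading.Thread(target=self._run, args=(fn,) + args)
        self._thread.start()

    def _run(self, fn, *args):
        self.result = fn(*args)
        self._done.set()

    def ready(self):
        return self._done.is_set()       # poll without blocking

    def wait(self):
        self._thread.join()              # or block until the reply lands
        return self.result

call = AsyncCall(lambda x: x * 2, 21)    # fire the "RPC"
# ... the caller could do other useful work here, polling call.ready() ...
print(call.wait())
```

The text's caveat applies in full: this only helps when the caller genuinely has other work to do while waiting, which a question-answer interface like ASIS rarely does.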
> > I still believe that compilation units provide a natural syntactic
> > and fairly natural semantic boundary for analysis programs. What I
> > generally want in an analysis program is the interface information
> > of set/use, call tree, etc. I might want a different approach if I
> > was building a complexity metrics program or a test case generator,
> > however I am skeptical.

If he can come up with a query language, or some other system, then
great, problem solved. Until he has something definite to actually
propose, it's all gee-I-wish-the-world-worked-the-way-I-want-it-to-work
wishful thinking.

It's not obvious to me how such a query language would work. I suppose
we could define a "schema" for ASIS that basically matched the
functions we have now. There would be one "entry", in essence, for each
currently existing primitive interface. Then we could use something
akin to SQL to fashion programs/queries. Those could be shipped off to
the ASIS server to process.

(Have you ever tried to debug SQL or any other
ship-it-there-it-runs-there kind of program, such as the PostScript
programs that got shipped to the Sun NeWS server? It can be a real
problem. You can bet your life that there will be no debugger. Probably
not ever a debugger. You'll be lucky if you even get some kind of
execution trace to help you figure out what went wrong.)

It could get very complicated trying to program any sort of really
sophisticated query into such a system. Yes, it can be done. But it may
not be very easy to do. The entries are very low-level and all of the
queries would tend to be rather high-level. That is, the entries
describe what-is and the queries want to ask what-does-it-all-mean.
Bridging that gap in some sort of high-level language might be a real
challenge. Maybe not. I don't know. I do know that implementing things
like SQL is a major project, just like implementing any
compiler/interpreter/whatever is a major project.
ASIS in its present form can be done by one person, albeit in 2+ years,
but one person.

> > I certainly wouldn't want a language sensitive editor to use an
> > ASIS interface due to performance problems. Even in a
> > non-client-server architecture I believe the performance would be
> > unbearable.

Yes, I agree with that; the performance of a language-sensitive editor
using a client-server ASIS would be unbearable.

> > I could be wrong, but I believe too much stuff would have to be
> > loaded into RAM to make things run with reasonable performance.
>
> Another respondent writes:
>
> > I hope we won't dismiss this approach simply because one of many
> > possible implementations would be too slow. Efficiency was an
> > important issue when we first discussed this topic, and it still is
> > now, as Rational's response shows. But surely there are, or will
> > be, advances in client/server implementation mechanisms and
> > distributed programming that will make this feasible.

Whatever the advances, communication between two different processes on
a machine (much less two different machines) will always be much slower
than direct procedure calls. This sort of thing has been an intense
focus of study around the world since the mid 70's. There is no
possibility of a "good" solution in the sense that things might
eventually get "fast enough". Go out and read the literature. If a
simple procedure call takes time N, then an inter-process (or
inter-task, or inter-thread) call is guaranteed to take M*N, where M is
determined by a vast number of things.

The application wants to make a query. So it creates a message and
calls the operating system. This causes a context switch. On the most
stripped-down systems, with the most dedicated and specially created
hardware, using fixed-length messages, this is guaranteed to cause a
bunch of activity. (Saving of registers; given enough register banks
that can sometimes be avoided, if there aren't too many things going
on.) (Saving of process context, e.g. who-am-I, who-am-I-calling.)
(Saving of virtual address space, paging tables, etc.) Really specially
designed systems can get this down to just a few dozen clock cycles.
Normal OS's take hundreds to thousands of clock cycles. (That is,
hundreds-to-thousands compared to a-few-to-a-couple-dozen clock cycles
for a direct procedure call.)

Once in the OS, the OS copies the message out of the application's
virtual space into its own virtual space. (Yes, specialized systems
sometimes copy messages directly from sender to receiver. Some even
manage to share the message space. Normal OS's don't try to do any of
that.) Then the OS has to switch to the other process, which takes
another dozens-to-hundreds of cycles. And it probably didn't schedule
that other process right away. It's probably trying to do some kind of
timesharing among all of the processes on the system, so the other
process may have to wait some number of timesharing ticks before it
even gets called, typically some N*1/60ths of a second. Then the result
has to come back, which is two more context swaps.

This is why a direct procedure call will always be one to three orders
of magnitude faster than any sort of RPC. In something like ASIS, the
performance of the application will be directly controlled and limited
by the RPC round-trip time. This is a well-known result in the
literature on distributed systems. Highly connected components are
always directly response-limited by the communication time.

> > A usage shift could help too. For example, rather than make
> > individual "fine grained" calls to traverse, make one call to
> > traverse the whole unit and return the tree, which the client can
> > walk independently of the server.

This doesn't have to be dumped in the lap of the application (that
introduces complexity). It can be done under-the-floor by the
client-side portion of the ASIS implementation (just as I mentioned up
above).
But (a) it will take time to do the traversal, and (b) it will be a
*big* lump of data to transmit between processes, so expect the first
call that returns an element from a new unit to take many seconds to
complete. This will tend to make the response time of the application
unpredictable or erratic in the view of the user of that application.
They will have a difficult time predicting how long some "simple"
request they make will take. (There have been lots of studies; users
always prefer a system that is uniformly very slow to a system that is
often fairly fast but sometimes inexplicably/unexpectedly real slow
when all they asked was "just the same thing as before", only they
asked for it in a different unit, or something else caused sudden
unexpected computation.)

> > This reduces RPC overhead to a few calls rather than many. Some
> > creative caching of information on both sides could really help too.

Both sides? Only the client side has anything to gain. And this all
helps only a bit in any case. It would be a good deal of work to do,
and the payback just isn't there in my opinion. All it accomplishes is
faster link times. It doesn't achieve any vendor independence. It
doesn't make the application programming process any faster or better
or more reliable. It doesn't make the application any more general or
even faster. It just makes linking faster (and that's not guaranteed,
it's just likely) and the application is definitely slower. And it's
more work for the vendor (yet another product to maintain). So why
bother?

The effort would be far better spent just trying to make the library
smaller. Our ASIS library in particular has megabytes of
unnecessary/unused code in it. But, given the structure of Ada83,
there's no good way to eliminate most of it. When we start using Ada95
and child units, the library should be able to shrink noticeably. But
that's later and not right now.
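The context-switch arithmetic running through this reply can be collected into a toy model. The cycle counts below are the rough illustrative figures quoted in the text (a handful of cycles for a direct call, hundreds for a normal context switch), not measurements of any real system:

```python
# Toy latency model: one query-response over local IPC costs four context
# switches plus two kernel message copies; a direct call costs a handful of
# cycles. All counts are illustrative figures, not measurements.
DIRECT_CALL_CYCLES = 5      # a simple procedure call
SWITCH_CYCLES = 500         # a normal OS context switch (hundreds-to-thousands)
COPY_CYCLES = 100           # copying one small message through the kernel

def rpc_cycles():
    """Cycles for one round trip: app->OS->server and back."""
    return 4 * SWITCH_CYCLES + 2 * COPY_CYCLES

print(f"RPC/direct-call ratio: ~{rpc_cycles() / DIRECT_CALL_CYCLES:.0f}x")
```

Even with these deliberately modest numbers the ratio lands in the hundreds, which is the two-to-three-orders-of-magnitude gap the reply keeps returning to; faster hardware scales both sides and leaves the ratio intact.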
> > Don't discount hardware advances over the next 10 years that may
> > make this discussion irrelevant.

Won't happen. Whatever hardware advances occur, the computer will get
faster, meaning that the directly-linked application will get faster.
Even if communication time goes to zero, you still have the computer
doing 4 context switches to do one query-response. (I'm assuming that
the two processes are on the same machine. If we're going over a
network, and we haven't come up with some kind of ASIS SQL query
language, then I can only laugh at the whole idea. Network
communication time will ruin everything. No matter how fast the network
gets, the time is still non-zero time added on top of the context
switches.)

Those 4 context switches determine the resulting speed of the
client-server application. That speed will always be at least one order
of magnitude slower. (Why one order of magnitude? If it takes 5
instructions (or clock cycles or whatever) to do a procedure call, then
we only need 50 instructions (or whatever) during a process switch in
order to get our one order of magnitude. I've never seen or read about
a system, even special only-in-the-laboratory systems, that didn't take
at least 10 times as long to context switch as to do a simple call. The
usual difference is more like two to three orders of magnitude.) I
don't care how fast the computer gets. You always lose an order of
magnitude, and in the real world you probably lose closer to three
orders.

In the ASIS applications I have written, and in the ones that I have
seen which other people have written, those applications do very little
other than figure out what ASIS interface to call next. This means that
the application runtimes are basically governed by the time it takes to
do a procedure call. (Most Rational ASIS interfaces do very little real
work. The procedure call overhead is comparable to the actual useful
computation time taken.) No query program is ever fast enough.
Compare two competing systems: the fast one isn't fast enough, and the
slow one is 10 times slower. Which one survives? (With proper
marketing, the slow one, of course! 8^))

> > All we're really talking about here is building a vendor-independent
> > client for vendor-dependent servers.

Oh, well, that's another possible discussion. If we want some kind of
vendor-independent client-side library and then have each vendor do a
server-side server, then we change the game. (We also make it unlikely
that the vendor will actually use the product himself.) In this case
you get faster link times, the same application executable can run
against more than one vendor (meaning fewer supported product versions
for the tool vendor), and a vastly slower application. It still sounds
like a losing proposition, although I do understand the lure for small
companies with severely limited capital.

A better approach would be to "dump" the library into another form, and
that way you eliminate the communications bottleneck. Of course, you
just dumped a hundred megabytes or so of information (given an
"interesting" library with a few hundred thousand lines of code), so
the dumping process took some time. And of course you needed to
find/have those hundred megabytes of scratch space. And of course
you'll have to dump them again next time you compile anything (or else
come up with an incremental-dump engine). And of course the dumped
library is probably less dense (less compacted) than the vendor's
internal library, so it takes more space. And of course the application
now has to read this dumped, and probably portable, and probably vendor
independent, and probably not very optimized, larger, and probably
somewhat slow (but faster than the client-server option) form of the
data.

And, if anyone likes/wants-to-try this approach, they can try it right
now. The vendor doesn't have to do anything. Applications can do this
right now if they want.
They just have to write the dump engine and then implement an ASIS
against it. In fact, that would be the best thing to do. The ASISWG (or
someone) should write the dump engine, and write the ASIS library
implementation that reads the dump, and write an incremental dump
engine, and then all the vendors have to do is take this Public Domain
code, compile it, link it, and ship the result. Then you have your
vendor-independent ASIS from the tool builder viewpoint, and you have
your vendor-dependent dumper, and the vendor doesn't have to agree to
do any more work (or charge any more money) than they already do. The
only penalty is disk space and the dump time. You even get your faster
link times.

And, no, I don't think that end-users will like this a great deal,
because of the extra disk space and the time it will take to dump
things. An application that didn't use the dumper, and which directly
linked with the ASIS library, would be faster and use less disk space.
Those would be very strong selling advantages. No matter how cheap disk
gets, nobody ever has enough.

> > Several questions:
> >
> > 1. Would modifications to the ASIS architecture or usage model
> > assist a server approach?

Yes. But only if someone can design an acceptable ASIS SQL/query
language (i.e. it works, it does what tool builders need, it can be
understood by the average programmer, etc.). And of course we have to
get vendors to actually implement it. Otherwise I don't care what you
do to the interface.

Client-server only works when one of the following holds:

a) queries are only infrequent, or nobody cares how long they take;

b) the two sides only ever send messages to each other, and never do
query-response messages;

c) the two sides only (99%) talk asynchronously and they can, or don't
need to, or can keep busy while waiting for, any responses to messages;

d) detailed queries on one side can be shipped to the other side for
actual processing (i.e.
ship the query program to where the data is so that the data can be
accessed efficiently);

e) detailed queries on one side cause the data to be shipped to that
side (i.e. if the database is "over there" and you need to do detailed
queries, then move (or copy) it "over here" so you can do it
efficiently).

These are all perfectly normal concerns, and results, from the world of
databases. There are entire journals devoted solely to discussing the
issues of query languages, dynamic motion of where-the-data-lives-now,
shipping queries, shipping data, compiling queries, debugging queries,
improving query languages, etc. ASIS is nothing special in this regard.
It is "just another" database. We don't need to invent anything (except
possibly a specialized query language). We aren't doing anything
terribly new. If we desperately want to do client-server stuff, then we
need someone who actually is familiar with client-server database
stuff. There are books written on some of the aspects of things we
would need to consider.

If anyone actually wants it to happen, they need to design the query
language, and they need to implement the client side, and they need to
implement the server-side query engine down to about the present ASIS
level of functionality. Then they need to give it all away Public
Domain and not Copy-Left, or else a funny Copy-Left where people don't
have to make things linked against it Copy-Left (i.e. something like
the way the GNU C and C++ I/O libraries are Copy-Left'ed, but not quite
Copy-Left'ed), or vendors and tool builders will never use any of it,
because that would make their products automatically Copy-Left. And if
vendors and tool builders don't use it, that eliminates a large
fraction of the ASIS audience, and it means that there will never be
terribly robust or well-integrated versions of the thing. Things
offered by a vendor, but not actually used internally by a vendor,
never really work very well and seldom improve in quality over time.
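The dump-engine idea raised a few paragraphs back is simple enough to prototype. A minimal sketch, assuming an invented JSON dump format (a real dump of a vendor library would be far richer, and this is not any real ASIS or Apex format):

```python
# Sketch of the "dump the library" approach: serialize the program library
# once into a portable file, then let tools read the dump directly instead
# of linking against the vendor's library. The layout is hypothetical.
import json, os, tempfile

def dump_library(units, path):
    """Vendor-side dump engine: write the element data out portably."""
    with open(path, "w") as f:
        json.dump(units, f)

def load_library(path):
    """Tool-side reader: the 'ASIS implemented against the dump'."""
    with open(path) as f:
        return json.load(f)

# Hypothetical dumped content for one tiny unit.
units = {"demo": {"kind": "unit", "decls": ["A", "B"], "refs": ["A"]}}
path = os.path.join(tempfile.mkdtemp(), "library.dump")
dump_library(units, path)
assert load_library(path) == units      # round trip succeeds
print("dump size:", os.path.getsize(path), "bytes")
```

The costs the text lists show up immediately even at this toy scale: the dump is a second copy of the data, it goes stale on the next compile, and the reader pays file I/O instead of in-process calls.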
Here's something else to think about with servers. Where does the ASIS
server run? Is there more than one? You noticed that linking something
where the library was way-over-there in the network was slow. Where is
the server? Here? Or way-over-there?

Also, is there more than one server? Does each client get its own
server? If each library is dozens to hundreds of megabytes, then we may
want to have one server per client, or even one server per database. It
could be a major performance issue. One server, N clients, N * 100 MB =
too much space for one process to hold, which means the server must
implement some sort of library paging or library swapping scheme, which
probably lowers the overall throughput of the server if it is serving
multiple clients. If we do multiple servers, what happens when the
system runs out of swap space, or maybe it just starts thrashing
because the servers aren't coordinating? Maybe doing the
paging/swapping in one server would be better? How do we arrange things
so that the user of a tool can control all of this variability?

Things can get very complicated very fast. Not only does a query
language need to be designed, and not only does a public implementation
of the client side and the server side need to be done, but thought
needs to be put into how we control the number and the sharing of
servers. If we blindly go to the one-client/one-server scenario, then
we may penalize systems that could have done better another way. If we
assume N-clients/one-server, then systems with small virtual process
spaces may be unusable, because one server quickly maxes out in space
and thrashes. If we assume that servers run on the local machine, then
what about the availability of a fast network and nearby fast machines
that could serve? What transport mechanisms do we use to connect the
two?

There are no vendor-independent SQL/database libraries, even in the
client-server scenarios, that I am aware of.
There is at least one Public Domain SQL implementation with an
associated database. But it very much is not a high-performance or
high-capacity item, and it is not actually used by anyone for creating
commercial software. I've only ever heard of anyone using it for
academic things, and the stories I've heard were mainly about them
later switching to something commercial that had capacity, speed, and
support.

Is it reasonable to expect that there can be an ASIS vendor-independent
client-server library? Or is this just the
everything-I-want-in-life-should-be-free-and-top-quality folks crying
in the night again? If the relational database world can't/won't/hasn't
done it, can we? Our audience is a lot smaller and more specialized
(meaning: the problem is harder).

> > 2. Is this an issue for distributed programming in general?

Of course. It's been under study for as long as there have been two
computers with a communications link between them, i.e. since the 70's.
There is no "solution"---just like there is no "solution" to building
compilers---there's only work and cost/return tradeoffs.

----- End Included Message -----

From cooper@longshot.ds.boeing.com Tue Nov 14 19:39:47 1995
Return-Path:
Date: Tue, 14 Nov 1995 16:24:48 -0800
From: cooper@longshot.ds.boeing.com (Dan Cooper)
To: ASIS-Officers@sw-eng.falls-church.va.us
Subject: Re: client-server architecture for ASIS
Content-Length: 1482
X-Lines: 39
Status: RO

All,

The recent ASISWG meeting generated the following action item:

> 9511-11. STEVE BLAKE: Write a statement for Issue #009 concerning
> the fact that the ASIS specification does not preclude a
> client/server implementation; Issue #009 will then become
> Approved.

Here's an afterthought to the lengthy comment I just forwarded; it
comes from one of our engineers.
--Dan Cooper

----- Begin Included Message -----

Date: Mon, 16 Oct 1995 07:23:30 -0700
From: "TWEEDY::HAMILTON"@goofy.ds.boeing.com
To: cooper%longshot.ds.boeing.com@PLATO.ds.boeing.com
Subject: Re: client-server architecture for ASIS

Dan,

Hope you enjoy OOPSLA. I can't believe the size of the response. Must
have hit a sensitive nerve on this one. I agree with his comment that
someone needs to come up with a good proposed solution before shooting
off on a tangent. Due to the limited size of this market, I can't
foresee anything ever coming of a client-server ASIS.

The only advantage I can think of for a client-server approach is for
some tool-to-tool communications, or an interactive call graph, or some
other form of browsing. I would expect the response to be kinda slow,
just like my client-server database application. Language-sensitive
editing would be possible, but probably pretty slow and not worth the
effort. I don't see compiler-like applications such as test case
generation, code analysis, etc. using a client-server architecture.

Jim Hamilton

----- End Included Message -----