<!doctype html public "-//w3c//dtd html 4.0 transitional//en">
<!-- -->
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<meta name="Author" content="Steve Totten">
<title>Fault Tolerant (FT) CORBA Services</title>
</head>
<body>
<center>
<h2>Fault Tolerant (FT) CORBA Services</h2></center>
<p>
Points of contact: <a href="mailto:wilson_d@ociweb.com">Dale
Wilson</a> and <a href="mailto:totten_s@ociweb.com">Steve Totten</a>
</p>
<p>
<ul>
<li><a href="#Introduction">Introduction</a></li>
<li>
<a href="#FTCORBA_Services">FT CORBA Services</a>
<ul>
<li><a href="#Replication_Manager">Replication Manager</a></li>
<li><a href="#Fault_Notifier">Fault Notifier</a></li>
<li><a href="#Fault_Detector">Fault Detector</a></li>
<li><a href="#Fault_Detector_Factory">Fault Detector Factory</a></li>
<li><a href="#Redundancy">Redundancy of FT CORBA Infrastructure
Services</a></li>
</ul>
</li>
<li>
<a href="#Sample_FT_Application">Sample FT Application</a>
<ul>
<li><a href="#Replica_Factory">Replica Factory</a></li>
<li><a href="#Replica">Replica</a></li>
<li><a href="#Object_Group_Creator">Object Group Creator</a></li>
<li><a href="#Client">Client</a></li>
<li><a href="#Prototype_Architecture">Prototype Architecture</a></li>
</ul>
</li>
<li><a href="#Propagating_IOGRs">Propagating IOGRs</a></li>
<li><a href="#IOGR_Creation_Manipulation">IOGR Creation and
Manipulation</a></li>
<li><a href="#Bootstrapping">Bootstrapping of FT CORBA
Infrastructure and Application</a></li>
<li><a href="#Future_Work">Future Work</a></li>
</ul>
</p>
<h3><a name="Introduction"></a>Introduction</h3>
<p>
Object Computing, Inc. (OCI) and the Distributed Object Computing
(DOC) group at Vanderbilt University's Institute for Software
Intensive Systems (ISIS) collaborated on a research and development
(R&amp;D) effort to demonstrate the viability of the OMG FT CORBA
specification (defined in Chapter 23 of the <a
href="http://www.omg.org/cgi-bin/doc?formal/02-12-02">CORBA 3.0
specification</a>), with some extensions, as a platform for building
fault-tolerant distributed real-time and embedded (DRE) applications.
</p>
<p>
The OCI team designed, implemented, and tested FT service-level
entities and other components needed to support an FT infrastructure
in TAO, and a sample application to demonstrate TAO's FT capabilities.
The ISIS team provided enhancements to TAO's ORB Core to support fault
tolerance in applications, including implementing ORB-core-level
features defined in the FT CORBA specification.
</p>
<p>
Extensions to the FT CORBA specification investigated during the
project included:
</p>
<ol>
<li>Adding a <code>SEMI_ACTIVE</code> replication style similar to
that described <a
href="http://www.dre.vanderbilt.edu/~schmidt/PDF/WDMS02.pdf">here</a>.</li>
<li>Separating interfaces and type definitions that are common
across multiple specifications into a Portable Group module as
described in the <a
href="http://www.omg.org/cgi-bin/doc?ptc/01-11-09">OMG Data
Parallel Processing specification</a> and the <a
href="http://www.omg.org/cgi-bin/doc?ptc/01-11-08">Unreliable
Multicast Inter-ORB Protocol specification</a>.</li>
<li>Adding factory registration and fault detector factory
interfaces.</li>
<li>Adding mechanisms for bootstrapping FT CORBA systems.</li>
<li>Defining protocols for operating between the ORB core and FT
services.</li>
</ol>
<h3><a name="FTCORBA_Services"></a>FT CORBA Services</h3>
<h4><a name="Replication_Manager"></a>Replication Manager</h4>
<p>
The Replication Manager is perhaps the most visible of FT CORBA's
infrastructure components. Fault tolerant services interact with the
Replication Manager to create object groups, manage an object group's
properties, control an object group's membership, and so forth. The
Replication Manager is also solely responsible for the creation and
maintenance of Interoperable Object Group References (IOGRs).
According to the FT CORBA specification, the Replication Manager's
operations are defined by three separate interfaces:
</p>
<ul>
<li><em>Property Manager</em>: Defines operations for setting
properties, such as replication style, at several levels: by group,
by type, or by default.</li>
<li><em>Object Group Manager</em>: Defines operations to add and
remove members of an object group, change the primary member of an
object group (for passive replication styles only), specify or get
the locations of object group members, and get the current object
group reference and object group identifier.</li>
<li><em>Generic Factory</em>: Defines operations for creating and
destroying objects. The Replication Manager's realization of this
interface effects the creation and destruction of object groups.
Application-specific object factories also realize these operations
to create and destroy replicas as object groups are created and
maintained.</li>
</ul>
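<p>
The property levels listed for the Property Manager can be pictured as
a lookup chain. The C++ fragment below is a minimal sketch of that
idea, not code from TAO: <code>PropertyStore</code> and its member
functions are hypothetical names, property values are reduced to
strings, and the rule that a group-level value takes precedence over a
type-level value, which in turn takes precedence over a default, is
stated here as an assumption.
</p>

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical, simplified stand-in for a property manager; not TAO's API.
class PropertyStore {
public:
  typedef std::map<std::string, std::string> Properties;

  void set_default(const std::string& name, const std::string& value) {
    defaults_[name] = value;
  }
  void set_for_type(const std::string& type_id, const std::string& name,
                    const std::string& value) {
    by_type_[type_id][name] = value;
  }
  void set_for_group(unsigned long group_id, const std::string& name,
                     const std::string& value) {
    by_group_[group_id][name] = value;
  }

  // Fills `value` from the most specific level that defines `name`
  // (group, then type, then default). Returns false if no level defines it.
  bool resolve(unsigned long group_id, const std::string& type_id,
               const std::string& name, std::string& value) const {
    auto g = by_group_.find(group_id);
    if (g != by_group_.end() && lookup(g->second, name, value)) return true;
    auto t = by_type_.find(type_id);
    if (t != by_type_.end() && lookup(t->second, name, value)) return true;
    return lookup(defaults_, name, value);
  }

private:
  static bool lookup(const Properties& props, const std::string& name,
                     std::string& value) {
    auto it = props.find(name);
    if (it == props.end()) return false;
    value = it->second;
    return true;
  }

  Properties defaults_;
  std::map<std::string, Properties> by_type_;
  std::map<unsigned long, Properties> by_group_;
};
```

<p>
With this layout, setting a replication style once as a default covers
every group, while a single group can still override it without
affecting other groups of the same type.
</p>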
<p>
<em>
Note: The Data Parallel (DP) CORBA final adopted specification defines
a new <code>PortableGroup</code> module, including the three
interfaces listed above, to share common interfaces and their
supporting types among DP CORBA, FT CORBA, Load Balancing, and other
specifications. It is identical to a subset of the FT CORBA
specification with a few changes to make it more generic to group
management. TAO already has an implementation of the PortableGroup
module that we adapted for reuse by the Replication Manager's
implementation for this project.
</em>
</p>
<p>
In addition, the Replication Manager serves the role of a consumer for
fault report events propagated to it via the Fault Notifier, so it
must realize the Structured Push Consumer interface from the
CosNotifyComm module (defined in the <a
href="http://www.omg.org/cgi-bin/doc?formal/02-08-04">OMG's
Notification Service specification (formal/02-08-04)</a>). Our
design provides a <em>Fault Consumer/Analyzer framework</em> and
concrete fault consumer and fault analyzer implementations that are
tailored for use by the Replication Manager. The classes making up
this framework, and the relationships among them, are shown in the
figure below.
</p>
<center>
<p>
<a name="FT_FaultAnalyzerFramework"></a> <img
src="FT_FaultAnalyzerFramework.jpg" alt="Fault Consumer/Analyzer
Framework" title="Fault Consumer/Analyzer Framework"></img>
</p>
<h4>Figure 1: Fault Consumer/Analyzer Framework</h4>
</center>
<p>
The OCI team also added a <em>Factory Registry</em> interface, realized
by the Replication Manager, that allows various types of factories,
such as Replica Factories and Fault Detector Factories (all of which
implement the Generic Factory interface), to register with the
Replication Manager. The Replication Manager then uses these
factories when necessary to add members (replicas) to object groups or
to create a new Fault Detector at a specific location to monitor a
replica.
</p>
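<p>
To illustrate the registration idea, the fragment below is a small
self-contained C++ sketch of a registry that records factories and
later picks one by location. It does not reproduce the IDL in
FT_ReplicationManager.idl or PortableGroup.idl: the
<code>FactoryInfo</code> fields, the use of a role string as the key,
and the operation names are assumptions made for this example.
</p>

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Hypothetical description of one registered factory.
struct FactoryInfo {
  std::string location;                 // where this factory creates objects
  std::function<std::string()> create;  // returns a name for the created object
};

// Hypothetical registry, keyed by a role such as "Replica" or "FaultDetector".
class FactoryRegistry {
public:
  void register_factory(const std::string& role, const FactoryInfo& info) {
    by_role_[role].push_back(info);
  }

  // All factories registered for a role; empty if there are none.
  std::vector<FactoryInfo> list_factories(const std::string& role) const {
    auto it = by_role_.find(role);
    return it == by_role_.end() ? std::vector<FactoryInfo>() : it->second;
  }

  // Creates an object of the given role at a location.
  // Returns an empty string if no factory is registered there.
  std::string create_at(const std::string& role,
                        const std::string& location) const {
    for (const FactoryInfo& f : list_factories(role)) {
      if (f.location == location) return f.create();
    }
    return std::string();
  }

private:
  std::map<std::string, std::vector<FactoryInfo>> by_role_;
};
```

<p>
In this sketch the registry owner never needs to know how a replica or
detector is built; it only needs a factory registered at the location
where the new object is wanted.
</p>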
<p>
The Replication Manager's interfaces are defined in <a
href="../../orbsvcs/orbsvcs/FT_ReplicationManager.idl">FT_ReplicationManager.idl</a>.
The Replication Manager also supports interfaces and types from the
PortableGroup module that is defined in <a
href="../../orbsvcs/orbsvcs/PortableGroup.idl">PortableGroup.idl</a>
and additional interfaces and types from the FT module that is defined
in <a
href="../../orbsvcs/orbsvcs/FT_CORBA.idl">FT_CORBA.idl</a>.
Source code for the Replication Manager's implementation is found in
the <a
href="../../orbsvcs/FT_ReplicationManager/">FT_ReplicationManager</a>
directory.
</p>
<h4><a name="Fault_Notifier"></a>Fault Notifier</h4>
<p>
FT CORBA's Fault Notifier is based upon a subset of the <a
href="http://www.omg.org/cgi-bin/doc?formal/02-08-04">OMG's
Notification Service specification (formal/02-08-04)</a>. The Fault
Notifier typically gathers fault reports from Fault Detectors as well
as from application- or platform-specific fault detectors.
</p>
<p>
The Fault Notifier can support an arbitrary number of Fault Detectors
and consumers because it is based upon the Notification Service.
Components interested in receiving fault reports assume the role of
push consumer with respect to the Fault Notifier. The Replication
Manager is one such component, as described above; an application may
provide its own fault analysis capability by connecting an
application-specific fault analyzer as a consumer to the Fault
Notifier. (In fact, a real-world application will likely participate
intimately in identifying and analyzing faults. One way this could be
done is to plug an application-specific fault analyzer into the
Replication Manager, using the <a
href="#FT_FaultAnalyzerFramework">Fault Consumer/Analyzer
framework</a> described above.)
</p>
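<p>
The push-consumer relationship can be sketched in a few lines of C++.
The fragment below is an illustration only and does not use the
Notification Service: <code>FaultReport</code>,
<code>FaultNotifier::connect_consumer</code>, and
<code>FaultNotifier::push_fault</code> are hypothetical names, and a
fault report is reduced to three fields chosen for the example.
</p>

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, flattened fault report.
struct FaultReport {
  std::string location;    // where the failed member was running
  std::string type_id;     // type of the failed object
  unsigned long group_id;  // object group the member belonged to
};

// Hypothetical notifier: every connected consumer receives every report.
class FaultNotifier {
public:
  typedef std::function<void(const FaultReport&)> Consumer;

  // Connects a push consumer and returns its index.
  std::size_t connect_consumer(Consumer consumer) {
    consumers_.push_back(std::move(consumer));
    return consumers_.size() - 1;
  }

  // Called by a detector; fans the report out to all consumers in order.
  void push_fault(const FaultReport& report) const {
    for (const Consumer& consumer : consumers_) consumer(report);
  }

private:
  std::vector<Consumer> consumers_;
};
```

<p>
A Replication Manager consumer and an application-specific analyzer
would both be connected this way, and neither needs to know about the
other or about the detectors that supply the reports.
</p>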
<p>
The Fault Notifier's interfaces are defined in <a
href="../../orbsvcs/orbsvcs/FT_Notifier.idl">FT_Notifier.idl</a>.
Source code for the Fault Notifier's implementation is found in the <a
href="../../orbsvcs/Fault_Notifier/">Fault_Notifier</a>
directory.
</p>
<h4><a name="Fault_Detector"></a>Fault Detector</h4>
<p>
The Fault Detector is the basic component in FT CORBA for monitoring a
fault tolerant system's software components, processes, and processing
nodes and reporting faults. The FT CORBA specification defines a
single monitoring style, the <em>pull</em> monitoring style, in which
a Fault Detector periodically issues a CORBA request
(<code>is_alive</code>) to monitored objects and reports faults for
those objects that fail to respond. A fault detector that monitors a
single replica may be co-located on the same host as that replica. If
the replica fails (defined as failure to reply to the detector's
<code>is_alive</code> invocation within a prescribed time-out period),
the detector issues a fault report to the Fault Notifier, which it
finds via the Replication Manager. In our example application, the
detector then exits since the replica that it was monitoring no longer
exists. Fault detectors can also be deployed on other nodes and used
to monitor other FT CORBA infrastructure components, such as a Fault
Detector or Fault Detector Factory on another host. The pull
monitoring style is used to monitor these components as well.
</p>
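<p>
The pull monitoring loop described above can be sketched as follows.
This C++ fragment is a simplified model rather than the detector in the
Fault_Detector directory: <code>PullDetector</code> is a hypothetical
class, the remote <code>is_alive</code> call is replaced by a callable
whose <code>false</code> result (or exception) stands for a missed
reply or time-out, and real timing between polls is omitted.
</p>

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>

// Hypothetical pull-style detector for a single monitored replica.
class PullDetector {
public:
  PullDetector(std::string location,
               std::function<bool()> is_alive,
               std::function<void(const std::string&)> report_fault)
      : location_(std::move(location)),
        is_alive_(std::move(is_alive)),
        report_fault_(std::move(report_fault)),
        done_(false) {}

  // One monitoring interval: ping the replica; on failure, report the
  // fault once and stop. Returns true while monitoring should continue.
  bool poll_once() {
    if (done_) return false;
    bool alive = false;
    try {
      alive = is_alive_();
    } catch (...) {
      alive = false;  // any failure to reply counts as a fault
    }
    if (!alive) {
      report_fault_(location_);
      done_ = true;  // the detector exits after reporting
    }
    return !done_;
  }

  // Polls until the replica fails or max_polls successful pings have
  // been made. Returns the number of successful pings.
  int run(int max_polls) {
    int successes = 0;
    while (successes < max_polls && poll_once()) ++successes;
    return successes;
  }

private:
  std::string location_;
  std::function<bool()> is_alive_;
  std::function<void(const std::string&)> report_fault_;
  bool done_;
};
```

<p>
A real detector would sleep for the monitoring interval between polls
and bound each <code>is_alive</code> call with the configured time-out;
both are left out here to keep the sketch deterministic.
</p>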
<h4><a name="Fault_Detector_Factory"></a>Fault Detector Factory</h4>
<p>
Fault Detectors are created and managed by a Fault Detector Factory.
There may be many Fault Detector Factories deployed in a typical fault
tolerant system. The Fault Detector Factory implements the Generic
Factory interface.
</p>
<p>
Fault Detectors are created in the same process as their Fault
Detector Factory. Each Fault Detector runs in its own thread and
monitors its replica according to its prescribed monitoring interval
(defined by a property on the object group). The Fault Detector
Factory owns the thread manager for these threads. If a replica
member is removed from an object group, the Fault Detector Factory
that "owns" the Fault Detector that is monitoring that replica can
cause the detector to "quit," thereby causing it to clean up any
resources and its thread to exit.
</p>
<p>
Fault Detector Factories register with the Replication Manager via the
Factory Registry interface. The Replication Manager then uses these
Fault Detector Factories to create new Fault Detectors as needed to
monitor replicas as they are created.
</p>
<p>
The Fault Detector Factory's interfaces are defined in <a
href="../../orbsvcs/orbsvcs/FT_FaultDetectorFactory.idl">FT_FaultDetectorFactory.idl</a>. Source code for the
Fault Detector and Fault Detector Factory implementations is found in
the <a
href="../../orbsvcs/Fault_Detector/">Fault_Detector</a> directory.
</p>
<h4><a name="Redundancy"></a>Redundancy of FT CORBA Infrastructure Services</h4>
<p>
To achieve fault tolerance, a system must not have a single point of
failure. This includes not only application services, but
infrastructure services as well. In the case of this project, the
following FT infrastructure services need to be made fault tolerant
via redundancy:
</p>
<ul>
<li>Replication Manager</li>
<li>Fault Notifier</li>
<li>Fault Detector Factories</li>
<li>Replica Factories</li>
</ul>
<p>
<em>
One of the initial goals of this project was to provide redundant
implementations of each of these services after first providing basic
non-redundant implementations. Unfortunately, due to complexities
encountered during implementation and the relatively short time frame
for the project, we did not complete development of redundant versions
of the various FT infrastructure services. We encourage this to be
given a high priority for any follow-on work.
</em>
</p>
<p>
Note that making the Replication Manager redundant will require direct access to the lower-level state synchronization mechanism (i.e., via a synchronization strategy) while other FT infrastructure services can likely be made fault tolerant using the full range of FT CORBA mechanisms.
</p>
<h3><a name="Sample_FT_Application"></a>Sample FT Application</h3>
<h4><a name="Replica_Factory"></a>Replica Factory</h4>
<p>
A Replica Factory is an application-defined entity that implements the
Generic Factory interface. There may be many Replica Factories
deployed in a typical fault tolerant application. Each Replica
Factory acts as an agent for the Replication Manager to create and
manage replica members of object groups of a specific type at a
specific location. Replica Factories register with the Replication
Manager via the Factory Registry interface. The Replication Manager
then uses these Replica Factories to create new replicas as needed
when creating object groups or adding new members to existing object
groups.
</p>
<p>
We have provided a sample implementation of a Replica Factory as part
of our example application for this project. It implements the
Generic Factory interface from the FT module defined in <a
href="../../orbsvcs/orbsvcs/FT_CORBA.idl">FT_CORBA.idl</a>.
Source code for the example application's Replica Factory is found in
the <a
href="../../orbsvcs/tests/FT_App/">FT_App</a>
directory.
</p>
<h4><a name="Replica"></a>Replica</h4>
<p>
A Replica is an application object that serves as a member of an
object group. Each replica implements an application-defined
interface. In addition, each replica must implement the Pull
Monitorable interface so it can be monitored by a Fault Detector.
Replicas are created by Replica Factories at the request of the
Replication Manager or of another application. Each new replica is
then added to an object group and managed by the Replication Manager.
</p>
<p>
We have provided a sample implementation of a Replica as part of our
example application for this project. A Replica must implement the
<code>PullMonitorable</code>, <code>Checkpointable</code>, and
<code>Updateable</code> interfaces, which are defined in <a
href="../../orbsvcs/orbsvcs/FT_Replica.idl">FT_Replica.idl</a>.
For our example application, a test replica interface is defined in <a
href="../../orbsvcs/tests/FT_App/FT_TestReplica.idl">FT_App/FT_TestReplica.idl</a>.
The implementation of the test replica is also in the <a
href="../../orbsvcs/tests/FT_App/">FT_App</a>
directory.
</p>
<h4><a name="Object_Group_Creator"></a>Object Group Creator</h4>
<p>
The Object Group Creator is a utility for creating an object group.
It can be used by an application to create an initial set of objects
in a system. The Object Group Creator finds the Replication Manager
and uses its Factory Registry interface to get a list of factories it
can use to create objects of the desired type. The Object Group
Creator can be used in different ways depending on whether the object
group's <code>MembershipStyle</code> property value is
application-controlled membership or infrastructure-controlled
membership.
</p>
<ul>
<li>Application-controlled membership: If application-controlled
membership is being used, the Object Group Creator calls the
Replication Manager to create an empty object group, then calls the
factories to create members for the group. Members are added via
the Replication Manager's <code>add_member</code> operation.</li>
<li>Infrastructure-controlled membership: If
infrastructure-controlled membership is being used, and the object
creator is configured to set factories at the type level, the object
creator optionally passes the set of factories to the
<code>set_type_properties</code> operation of the Replication
Manager, then calls Replication Manager's <code>create_object</code>
operation to create an object group.</li>
</ul>
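<p>
The two membership styles lead to two different call sequences, which
the C++ fragment below tries to make concrete. It is a sketch under
stated assumptions, not the Object Group Creator from FT_App:
<code>MockReplicationManager</code> is a hypothetical stand-in that
models a single object group, factories are plain callables, and the
behavior of <code>create_object</code> under infrastructure control
(the manager creates one member per known factory) is assumed for the
example.
</p>

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

enum class MembershipStyle { ApplicationControlled, InfrastructureControlled };

// A factory is modelled as a callable that returns the new member's name.
typedef std::function<std::string()> Factory;

// Hypothetical stand-in for the Replication Manager; models one group.
struct MockReplicationManager {
  std::vector<Factory> type_factories;  // recorded by set_type_properties
  std::vector<std::string> members;     // current group membership
  int add_member_calls = 0;

  void set_type_properties(const std::vector<Factory>& factories) {
    type_factories = factories;
  }
  void add_member(const std::string& member) {
    members.push_back(member);
    ++add_member_calls;
  }
  // Assumption: under infrastructure control the manager populates the
  // group itself from the factories it knows; otherwise it starts empty.
  void create_object(MembershipStyle style) {
    members.clear();
    if (style == MembershipStyle::InfrastructureControlled) {
      for (const Factory& make : type_factories) members.push_back(make());
    }
  }
};

// The two flows an object group creator might follow.
void create_group(MockReplicationManager& rm, MembershipStyle style,
                  const std::vector<Factory>& factories) {
  if (style == MembershipStyle::ApplicationControlled) {
    rm.create_object(style);  // empty group
    for (const Factory& make : factories) {
      rm.add_member(make());  // the creator builds and adds each member
    }
  } else {
    rm.set_type_properties(factories);  // hand the factories to the manager
    rm.create_object(style);            // the manager creates the members
  }
}
```

<p>
Both flows end with a populated group; what differs is which party
calls the factories and whether <code>add_member</code> is used at all.
</p>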
<p>
After creating the object group, the Object Group Creator can
optionally write the group's IOGR to a file or bind it in the Naming
Service so it can be accessed by clients.
</p>
<p>
The Object Group Creator can exist as a stand-alone utility or it can
be integrated with an application. Our example application includes
an implementation of the Object Group Creator in the <a
href="../../orbsvcs/tests/FT_App/">FT_App</a>
directory.
</p>
<h4><a name="Client"></a>Client</h4>
<p>
A client application obtains the object group reference from a file or
from the Naming Service and invokes operations on it as it would a
normal IOR. In the <code>SEMI_ACTIVE</code> replication style, only the primary
replica receives and processes each request. The state
synchronization strategy developed by the ISIS team for this project
is used to synchronize state between the primary and backup replicas
with the completion of each request. If the primary replica fails, the
transparent reinvocation mechanism inherent in the FT ORB (also
developed by the ISIS team) causes the client's failed request to be
automatically reinvoked on a backup replica. Meanwhile, the fault
detection mechanisms described above are used to notify the
Replication Manager of the fault and the Replication Manager takes the
necessary actions to maintain the object group's integrity. Our
example application includes a simple client in the <a
href="../../orbsvcs/tests/FT_App/">FT_App</a>
directory.
</p>
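<p>
Transparent reinvocation can be pictured as a retry loop over the
members named in the group reference. The C++ fragment below is a
conceptual sketch only and does not show how the FT ORB is implemented:
<code>CommFailure</code>, <code>Profile</code>, and
<code>invoke_with_failover</code> are hypothetical names, each member
is modelled as a callable, and the bookkeeping a real ORB needs to
decide whether a request may safely be reissued is omitted.
</p>

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical exception standing in for a communication failure.
struct CommFailure : std::runtime_error {
  explicit CommFailure(const std::string& what) : std::runtime_error(what) {}
};

// One member of the group, modelled as "send request, get reply".
typedef std::function<std::string(const std::string&)> Profile;

// Tries the members in order (the primary first, by assumption) and
// returns the first reply. Throws CommFailure if every member fails.
std::string invoke_with_failover(const std::vector<Profile>& profiles,
                                 const std::string& request) {
  for (const Profile& member : profiles) {
    try {
      return member(request);
    } catch (const CommFailure&) {
      // This member is unreachable; fall through to the next one.
    }
  }
  throw CommFailure("no reachable member in object group");
}
```

<p>
The calling code sees either a reply or a single failure, which is what
makes the reinvocation transparent to the client application.
</p>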
<h4><a name="Prototype_Architecture"></a>Prototype Architecture</h4>
<p>
The figure below shows the architecture of a prototypical FT system
and the relationships among the various FT infrastructure and
application-defined components described above.
</p>
<center>
<p>
<a name="FT_PrototypeArchitecture"></a> <img
src="FT_PrototypeArchitecture.jpg" alt="Architecture of Prototypical FT System" title="Architecture of Prototypical FT System"></img>
</p>
<h4>Figure 2: Architecture of Prototypical FT System</h4>
</center>
<p>
The steps involved in orderly start-up and operation of an FT system
are numbered in Figure 2 and described below:
</p>
<ol>
<li>Start the Naming Service. (This step is optional as none of the
FT components actually depends upon the Naming Service.)</li>
<li>Start the Replication Manager.</li>
<li>Start the Fault Notifier.</li>
<li>The Fault Notifier finds the Replication Manager and registers
with it.</li>
<li>The Replication Manager connects as a consumer to the Fault
Notifier.</li>
<li>Start one or more Fault Detector Factories.</li>
<li>The Fault Detector Factories register with the Replication Manager's Factory Registry.</li>
<li>Start one or more Replica Factories.</li>
<li>The Replica Factories register with the Replication Manager's Factory Registry.</li>
<li>Start the Object Group Creator.</li>
<li>(not shown) The Object Group Creator finds the Replication Manager and gets a list of Fault Detector Factories from the Replication Manager's Factory Registry.</li>
<li>(not shown) The Object Group Creator gets a list of Replica Factories from the Replication Manager's Factory Registry.</li>
<li>The Object Group Creator creates an object group via the Replication Manager's Generic Factory interface.</li>
<li>The Object Group Creator creates one or more Replicas via Replica Factories.</li>
<li>Each Replica Factory creates a Replica.</li>
<li>The Object Group Creator creates a Fault Detector for each Replica via the Fault Detector Factories.</li>
<li>Each Fault Detector Factory creates a Fault Detector for a Replica.</li>
<li>Each Fault Detector finds the Replication Manager and gets the Fault Notifier from the Replication Manager.</li>
<li>Each Fault Detector connects as a supplier to the Fault Notifier.</li>
<li>The Object Group Creator adds each Replica as a member to the object group via the Replication Manager's Object Group Manager interface.</li>
<li>The Replication Manager generates a new IOGR for each added Replica and updates each Replica member of the object group with the new IOGR.</li>
<li>The Object Group Creator optionally binds the IOGR of the object group with the Naming Service or publishes its IOGR in some other way, such as a file.</li>
<li>Start a Client.</li>
<li>The Client optionally resolves the object group by name from the Naming Service or resolves it in some other way, such as from a file or via a corbaloc ObjectURL.</li>
<li>The Client invokes a request on the object group. This request is carried out by the primary Replica of the object group.</li>
<li>Each Fault Detector periodically pings its Replica via the Replica's PullMonitorable interface.</li>
<li>If a Replica fails, the Fault Detector pushes a structured fault report to the Fault Notifier.</li>
<li>The Fault Notifier pushes the structured fault report as an event to the Replication Manager's consumer.</li>
<li>(not shown) The Replication Manager removes the failed member from the object group, selects a new primary for the object group, generates a new IOGR, and updates each Replica member of the object group with the new IOGR.</li>
<li>(not shown) The Replication Manager may also add new members to the object group if the number of replicas has fallen below the object group's MinimumNumberReplicas property. When it adds new members, the Replication Manager also generates a new IOGR and updates each Replica member of the object group with the new IOGR.</li>
</ol>
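<p>
The last two steps in the list (recovery after a fault report) can be
modelled in a few lines. The C++ fragment below is a sketch that makes
several simplifying assumptions and is not the Replication Manager's
code: <code>ObjectGroup</code> and <code>handle_fault</code> are
hypothetical names, members are identified by location strings, the new
primary is simply the first surviving member, generating a new IOGR is
reduced to incrementing a version counter, and replacement members come
from a list of spare locations rather than from registered factories.
</p>

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified view of an object group.
struct ObjectGroup {
  std::vector<std::string> members;  // member locations
  std::string primary;               // location of the primary member
  unsigned long version = 0;         // incremented for each new IOGR
  std::size_t minimum_replicas = 2;  // models MinimumNumberReplicas
};

// Removes a failed member, re-selects the primary if needed, and tops
// the group back up from `spares` while it is below its minimum size.
void handle_fault(ObjectGroup& group, const std::string& failed,
                  std::vector<std::string> spares) {
  auto it = std::find(group.members.begin(), group.members.end(), failed);
  if (it == group.members.end()) return;  // unknown or already removed

  group.members.erase(it);
  if (group.primary == failed) {
    group.primary = group.members.empty() ? std::string()
                                          : group.members.front();
  }
  ++group.version;  // new IOGR after the removal

  while (group.members.size() < group.minimum_replicas && !spares.empty()) {
    group.members.push_back(spares.back());
    spares.pop_back();
    if (group.primary.empty()) group.primary = group.members.front();
    ++group.version;  // new IOGR after each addition
  }
}
```

<p>
After each version change the real system would also push the revised
reference to every surviving member, which this sketch leaves out.
</p>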
<h3><a name="Propagating_IOGRs"></a>Propagating IOGRs</h3>
<p>
The FT CORBA specification requires the Replication Manager to create
and maintain IOGRs. It also requires the FT ORB to perform
<em>most-recent IOGR</em> processing, whereby the FT ORB can update a
client using an old IOGR, by means of a <code>LOCATION_FORWARD</code>
reply, with a new IOGR. However, the specification fails to define a
way for the Replication Manager to propagate revised IOGRs to the FT
ORBs of object group members. Therefore, the OCI and ISIS teams
agreed upon a simple interface (<code>tao_update_iogr</code>) by which
the Replication Manager can propagate revised IOGRs to the FT ORB for
each member of an object group (e.g., after failure of a primary
replica, selection of a new primary member, and generation of a new
IOGR by the Replication Manager). While this interface is
TAO-specific, it accomplishes one of our research goals of
investigating formal protocols by which different ORB implementations
of FT CORBA could be made interoperable. The ISIS team implemented
this interface and the OCI team incorporated its use within the
Replication Manager.
</p>
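<p>
The interplay between a propagated IOGR and most-recent IOGR
processing can be illustrated with a small model. The C++ fragment
below is a sketch, not the <code>tao_update_iogr</code> implementation:
<code>Iogr</code>, <code>MemberOrb</code>, and <code>Reply</code> are
hypothetical types, an IOGR is reduced to a group id, a version number,
and the primary's location, and the rule that a member accepts only a
strictly newer version for its own group is an assumption made for the
example.
</p>

```cpp
#include <cassert>
#include <string>

// Hypothetical, flattened object group reference.
struct Iogr {
  unsigned long group_id;
  unsigned long version;  // object group reference version
  std::string primary;    // location of the primary member
};

enum class ReplyStatus { NoException, LocationForward };

struct Reply {
  ReplyStatus status;
  Iogr iogr;  // the reference to use; meaningful for LocationForward
};

// Hypothetical server-side ORB state for one group member.
class MemberOrb {
public:
  explicit MemberOrb(const Iogr& current) : current_(current) {}

  // Stand-in for the update pushed by the Replication Manager.
  // Accepts only a strictly newer reference for the same group.
  bool update_iogr(const Iogr& revised) {
    if (revised.group_id != current_.group_id ||
        revised.version <= current_.version) {
      return false;
    }
    current_ = revised;
    return true;
  }

  // A client that presents an older version is forwarded to the
  // current reference; otherwise the request proceeds normally.
  Reply handle_request(unsigned long client_version) const {
    if (client_version < current_.version) {
      return Reply{ReplyStatus::LocationForward, current_};
    }
    return Reply{ReplyStatus::NoException, current_};
  }

private:
  Iogr current_;
};
```

<p>
The sketch shows why propagation matters: a member can only forward
clients to a newer reference if it has been told about that reference
first.
</p>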
<h3><a name="IOGR_Creation_Manipulation"></a>IOGR Creation and
Manipulation</h3>
<p>
To support FT CORBA, the Replication Manager must be able to create
and manipulate IOGRs. For example, the Replication Manager's
realization of the Generic Factory interface must return an IOGR.
Also, upon receiving a fault report on an object group, the
Replication Manager may need to remove a member, designate a member as
the new primary replica, and generate a new IOGR that can then be
propagated to each member of the object group.
</p>
<p>
For the purposes of this project, we used TAO's existing
IORManipulation library for creating and managing IOGRs. However, the
IORManipulation library lacked certain features that were needed. We
worked with the ISIS team to define extensions to the IORManipulation
library to:
</p>
<ul>
<li>Support the creation of "empty" IORs that have no profiles.</li>
<li>Support the complete replacement of all the profiles in an
IOR.</li>
<li>Support the addition of the following tagged components:
<ul>
<li>TAG_MULTIPLE_COMPONENTS</li>
<li>TAG_GROUP</li>
<li>TAG_FT_PRIMARY</li>
</ul>
</li>
</ul>
<p>
<em>
While the IORManipulation library can be used to create and manipulate
IOGRs, a longer term approach may be to use a specialized
implementation of the Object Reference Template and IORInterceptor
abstractions defined in sections 21.5.3 and 21.5.4 of the <a
href="http://www.omg.org/cgi-bin/doc?formal/02-12-02">CORBA 3.0
specification</a>, respectively.
</em>
</p>
<h3><a name="Bootstrapping"></a>Bootstrapping of FT CORBA Infrastructure and Application</h3>
<p>
FT CORBA infrastructure and application components must collaborate to
achieve fault tolerance. To do so, infrastructure and application
components must be started and initial objects and object groups
created in an orderly fashion. Much of this "bootstrapping" can be
accomplished by scripting. However, an entity to control the creation
of initial objects and object groups can greatly simplify certain
aspects of the bootstrapping process.
</p>
<p>
The sample application provided as part of this project uses an Object
Group Creator utility to create initial objects and object groups.
The Object Group Creator is implemented as a library that can be
easily integrated with other parts of an application. A simple
wrapper is also provided allowing the Object Group Creator to be used
as a stand-alone executable.
</p>
<h3><a name="Future_Work"></a>Future Work</h3>
<p>
During the course of this research, we uncovered several areas for
further research and development, including:
</p>
<ul>
<li>Adding redundancy to the FT infrastructure services.</li>
<li>Extending FT support to additional platforms, such as the
VxWorks RTOS.</li>
<li>Incorporating FT capabilities into an advanced application,
such as TAO's RT Notification Service or Naming
Service.</li>
<li>Integrating the <a href="ftrt_ec.html">FT RT Event Service</a>
with FT Services. The current implementation of the FT RT Event
Service in TAO is not based on the FT Services implemented
by OCI because the two were developed independently.
</li>
<li>Investigating improvements to the mechanisms used to detect
faults in replicated application objects.</li>
<li>Investigating advanced fault analysis capabilities and
mechanisms by which application- or platform-specific fault
analyzers can be "plugged" into the Replication Manager or other
components.</li>
<li>Investigating formal protocols that will allow interoperability
among different ORB implementations of FT CORBA.</li>
<li>Refactoring of the Replication Manager and Portable Group
implementations to consolidate the representation of object
groups.</li>
<li>Employing standard mechanisms, such as Object Reference Template
and IORInterceptors, instead of the TAO-specific IORManipulation
library, for the creation and manipulation of IOGRs.</li>
<li>Investigating performance and scalability issues in FT
systems.</li>
<li>Enabling the configuration and enforcement of FT quality of
service (QoS) properties, such as bounds on fault detection and
recovery times.</li>
</ul>
</body>
</html>