SYNCHRONIZATION ALGORITHM:
-------------------------
The synchronization algorithm is used by every service in openais to
synchronize the state of the system.

There are 4 events in the synchronization algorithm. These events are in fact
functions that are registered in the service handler data structure. They
are called by the synchronization system whenever a network partitions or
merges.

Within the init event, a service handler should record the temporary state
variables used by the process event.

The process event is responsible for executing synchronization. This event
returns a state indicating whether it has completed. This allows
synchronization to be interrupted when the message queue buffer is full and
resumed later. The process event will be called again by the synchronization
service if requested to do so by the value returned from process.
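
The resumable behaviour of the process event can be sketched as follows. This
is a minimal illustration in Python, not the openais implementation: the
SyncHandler class, the run_sync driver, and the queue capacity are hypothetical
names chosen for the example. In this sketch process returns True when it needs
to be called again and False when it has completed.

```python
from collections import deque


class SyncHandler:
    """Hypothetical service handler with init and process events."""

    def __init__(self, items):
        self.items = items
        self.position = 0

    def init(self):
        # Record the temporary state used by the process event.
        self.position = 0

    def process(self, queue, capacity):
        # Queue items until all are queued or the message queue is full.
        while self.position < len(self.items):
            if len(queue) >= capacity:
                return True  # not finished: ask to be called again
            queue.append(self.items[self.position])
            self.position += 1
        return False  # finished: no more processing is necessary


def run_sync(handler, capacity=2):
    """Hypothetical driver: call process until it reports completion."""
    queue = deque()
    delivered = []
    calls = 0
    handler.init()
    while True:
        again = handler.process(queue, capacity)
        calls += 1
        # Draining the queue stands in for the messages being sent.
        while queue:
            delivered.append(queue.popleft())
        if not again:
            return delivered, calls
```

With five items and a queue capacity of two, process is interrupted twice and
picks up from the recorded position each time it is called again.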

The abort event occurs when a processor failure occurs during synchronization.

The activate event occurs when process has returned that no more processing is
necessary for any node in the cluster and all messages originated by process

CHECKPOINT SYNCHRONIZATION ALGORITHM:
------------------------------------
The purpose of the checkpoint synchronization algorithm is to synchronize
checkpoints after a partition or merge of two or more partitions. The
secondary purpose of the algorithm is to determine the cluster-wide reference
count for every checkpoint.

Every cluster contains a group of checkpoints. Each checkpoint has a
checkpoint name and checkpoint number. The number is used to uniquely reference
an unlinked but still open checkpoint in the cluster.

Every checkpoint contains a reference count which is used to determine when
that checkpoint may be released. The algorithm rebuilds the reference count
information each time a partition or merge occurs.

my_sync_state may have the values SYNC_CHECKPOINT, SYNC_REFCOUNT
my_current_iteration_state contains any data used to iterate the checkpoints

refcount_set contains the reference count for every node, consisting of
    the number of opened connections to the checkpoint and the node identifier
refcount contains a summation of every reference count in the refcount_set
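
The relationship between refcount_set and refcount can be sketched as below.
This is an illustrative Python model only: the Checkpoint class and its field
layout are assumptions made for the example, and the mapping from node
identifier to connection count is one possible representation of the set.

```python
from dataclasses import dataclass, field


@dataclass
class Checkpoint:
    """Hypothetical checkpoint record holding only reference-count data."""
    name: str
    number: int
    # refcount_set: node identifier -> number of opened connections
    refcount_set: dict = field(default_factory=dict)

    @property
    def refcount(self):
        # refcount is the sum of every per-node count in refcount_set.
        return sum(self.refcount_set.values())


ckpt = Checkpoint(name="example", number=7)
ckpt.refcount_set[1] = 2  # node 1 holds two open connections
ckpt.refcount_set[2] = 1  # node 2 holds one open connection
```

Deriving refcount from refcount_set, rather than storing it separately, keeps
the two values from drifting apart in this sketch; a real implementation may
store both.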

pseudocode executed by a processor when the synchronization service calls
the init event
    call sync_checkpoints_enter

pseudocode executed by a processor when the synchronization service calls
the process event in the SYNC_CHECKPOINT state
    if lowest processor identifier of old ring in new ring
        transmit checkpoints or sections starting from my_current_iteration_state
    if all checkpoints and sections could be queued
        call sync_refcounts_enter
    else
        record my_current_iteration_state

    require process to continue
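
One call of the process event in the SYNC_CHECKPOINT state might look like the
following Python sketch. Several things here are assumptions rather than facts
from openais: the state dictionary and its keys, the is_transmitter helper, the
list-based queue with a fixed capacity, and the reading of the first test as
"this processor is the lowest-numbered member of its old ring that is also in
the new ring". A processor that transmits nothing is treated as having queued
everything.

```python
SYNC_CHECKPOINT = "SYNC_CHECKPOINT"
SYNC_REFCOUNT = "SYNC_REFCOUNT"


def is_transmitter(my_id, old_ring, new_ring):
    """Assumed reading: lowest old-ring member still present in new ring."""
    survivors = [node for node in old_ring if node in new_ring]
    return bool(survivors) and my_id == min(survivors)


def process_checkpoints(state, checkpoints, queue, capacity):
    """One process call in SYNC_CHECKPOINT; always asks to continue."""
    index = state["my_current_iteration_state"]
    all_queued = True
    if is_transmitter(state["my_id"], state["old_ring"], state["new_ring"]):
        # Transmit from the recorded position until the queue is full.
        while index < len(checkpoints) and len(queue) < capacity:
            queue.append(checkpoints[index])
            index += 1
        all_queued = index == len(checkpoints)
    if all_queued:
        state["my_sync_state"] = SYNC_REFCOUNT  # sync_refcounts_enter
    else:
        state["my_current_iteration_state"] = index  # resume here next time
    return True  # reference counts still have to be processed
```

The function always returns True because, in this state, the reference counts
have not yet been handled, so process must be called at least once more.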

pseudocode executed by a processor when the synchronization service calls
the process event in the SYNC_REFCOUNT state
    if lowest processor identifier of old ring in new ring
        transmit checkpoint reference counts
    if all checkpoint reference counts could be queued
        require process to not continue
    else
        record my_current_iteration_state for checkpoint reference counts

sync_checkpoints_enter:
    my_sync_state = SYNC_CHECKPOINT
    my_current_iteration_state set to start of checkpoint list

sync_refcounts_enter:
    my_sync_state = SYNC_REFCOUNT
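
The two enter routines amount to a small state machine, sketched here in
Python. The CheckpointSync class and the trace_states helper are hypothetical,
and using index 0 to mean "start of the checkpoint list" is an assumption of
the example.

```python
SYNC_CHECKPOINT = "SYNC_CHECKPOINT"
SYNC_REFCOUNT = "SYNC_REFCOUNT"


class CheckpointSync:
    """Hypothetical holder for the two state variables described above."""

    def __init__(self):
        self.my_sync_state = None
        self.my_current_iteration_state = None

    def sync_checkpoints_enter(self):
        self.my_sync_state = SYNC_CHECKPOINT
        # Index 0 stands for the start of the checkpoint list.
        self.my_current_iteration_state = 0

    def sync_refcounts_enter(self):
        self.my_sync_state = SYNC_REFCOUNT


def trace_states(all_checkpoints_queued):
    """Return the states seen after init and after one process call."""
    sync = CheckpointSync()
    sync.sync_checkpoints_enter()  # entered when synchronization starts
    visited = [sync.my_sync_state]
    if all_checkpoints_queued:  # outcome of the process event
        sync.sync_refcounts_enter()
    visited.append(sync.my_sync_state)
    return visited
```

The state only advances to SYNC_REFCOUNT once every checkpoint and section has
been queued; otherwise the next process call stays in SYNC_CHECKPOINT.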

on event receipt of foreign ring id message

pseudocode executed on event receipt of checkpoint update
    if checkpoint exists in temporary storage
    reset checkpoint refcount array

pseudocode executed on event receipt of checkpoint section update
    if checkpoint section exists in temporary storage
    create checkpoint section

pseudocode executed on event receipt of reference count update
    update temporary checkpoint data storage reference count set by adding
    any reference counts in the temporary message set to those from the
    update that checkpoint's reference count
    set the global checkpoint id to the current checkpoint id + 1 if it
    would increase the global checkpoint id
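
The reference count update steps can be sketched in Python as below. The
function name, the dictionary layout of the temporary checkpoint, and the
global_checkpoint_id key are all assumptions for illustration. The sketch also
assumes that "current checkpoint id" means the number of the checkpoint being
updated, and that counts for the same node are added together.

```python
def on_refcount_update(temp_checkpoint, message_refcounts, state):
    """Merge the reference counts carried by a message into a checkpoint.

    temp_checkpoint: dict with "number", "refcount_set" and "refcount".
    message_refcounts: dict of node identifier -> count from the message.
    state: dict holding the hypothetical "global_checkpoint_id".
    """
    refcount_set = temp_checkpoint["refcount_set"]
    for node_id, count in message_refcounts.items():
        refcount_set[node_id] = refcount_set.get(node_id, 0) + count

    # The checkpoint's reference count is the sum over the whole set.
    temp_checkpoint["refcount"] = sum(refcount_set.values())

    # Only ever raise the global checkpoint id, never lower it.
    candidate = temp_checkpoint["number"] + 1
    if candidate > state["global_checkpoint_id"]:
        state["global_checkpoint_id"] = candidate
```

Raising the global id past every checkpoint number seen keeps numbers handed
out after synchronization from colliding with existing checkpoints.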

pseudocode called when the synchronization service calls the activate event:
    free all previously committed checkpoints and sections
    convert temporary checkpoints and sections to regular checkpoints and sections
    copy my_saved_ring_id to my_old_ring_id

pseudocode called when the synchronization service calls the abort event:
    free all temporary checkpoints and temporary sections
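
The activate and abort events together behave like a commit or rollback of the
temporary storage. The Python sketch below shows that idea under stated
assumptions: the CheckpointStore class and its committed and temporary
dictionaries are hypothetical, and freeing is modelled simply by dropping
references.

```python
class CheckpointStore:
    """Hypothetical store separating committed and temporary checkpoints."""

    def __init__(self):
        self.committed = {}
        self.temporary = {}
        self.my_old_ring_id = None
        self.my_saved_ring_id = None

    def activate(self):
        # Drop the previously committed checkpoints and promote the
        # temporary ones built during synchronization.
        self.committed = self.temporary
        self.temporary = {}
        self.my_old_ring_id = self.my_saved_ring_id

    def abort(self):
        # Discard partial results; committed checkpoints are untouched.
        self.temporary = {}
```

Because synchronization only ever writes to the temporary storage, an abort
leaves the committed checkpoints exactly as they were before it started.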