Distributed Transaction

A Distributed Transaction Management Scheme for Multidatabase Systems*
Xinfeng Ye,
Department of Computer Science,
University of Auckland,
New Zealand.
John A. Keane,
Department of Computation,
UMIST,
Manchester, UK.
* This work is supported in part by Auckland University under grant A18/XXXXX/62090/F3414029.
Abstract
Transaction management in a multidatabase system must ensure global serializability. Local serializable execution is, by itself, not sufficient to ensure global serializability, since local serialisation orders of subtransactions of global transactions must be the same at all systems. In this paper a distributed transaction management scheme is introduced. The scheme maintains autonomy of the local database systems. It is free from global deadlock, and guarantees fairness in the execution of the transactions in the system.
1 Introduction
A multidatabase system (MDBS) is a collection of pre-existing autonomous, and possibly heterogeneous, local database systems (LDBSs). Transactions in an MDBS are of two types:
Local transactions: Those transactions that only access data managed by a single LDBS.
Global transactions: Those transactions that access data managed by more than one LDBS.
Transaction management in the MDBS is hierarchical. Each LDBS controls the local transactions and the subtransactions of the global transactions at its site, and assures serializable execution at that site. The MDBS software controls the global transactions, and assures global serializability.
Global serializability guarantees the correct concurrent execution of global transactions [4]. Global serializability requires that the local serialisation orders of the subtransactions of global transactions be the same at all LDBSs where the global transactions execute. It is difficult to achieve this because the LDBSs do not release information concerning their local serialisation orders.
Transaction management in the MDBS also has to deal with the possibility of global deadlocks. Deadlock detection in an MDBS is difficult and complicated [1]. Therefore, it is highly desirable to incorporate methods assuring freedom from global deadlocks into a transaction management scheme [1].
Most schemes that ensure global serializability in an MDBS use a centralised global transaction manager (GTM) to coordinate execution of the global transactions [3, 5, 8, 10]. A centralised scheme is prone to the bottleneck problem in distributed systems [9]. In [1] a distributed scheme is introduced. However, the scheme is not fair in the sense that a global transaction can be continuously aborted if it conflicts with some other transactions. Therefore, a global transaction may never have a chance to be executed.
In this paper a distributed transaction management scheme is introduced. The scheme is free from global deadlock and does not require knowledge of the transaction management methods of the LDBSs. In the scheme each LDBS has a GTM. The GTM on each site commits or aborts the global transactions based on its local information. The scheme is more flexible than the one in [1], and guarantees fairness in the execution of the transactions.
2 Transaction Management in MDBS
The MDBS software is a layer above the LDBSs. It coordinates the global transactions. Local transactions are submitted to the LDBSs directly. Therefore, the MDBS software is unaware of the existence of the local transactions. Global transactions are submitted to the MDBS. They are decomposed into a set of subtransactions. Each subtransaction is executed at one LDBS, and is treated as if it were a local transaction. A global transaction commits when all the subtransactions of the global transaction have completed their operations successfully. If any of the subtransactions aborts, then the global transaction aborts. Therefore, the operations of a subtransaction can only be committed when the global transaction commits. In the following a transaction refers to either a local transaction or a subtransaction.
Transactions can be modelled as a sequence of read and write operations, denoted as r(x) and w(x) respectively, where x is the data item to be accessed. A local execution consists of all the operations performed at an LDBS. Two operations conflict if both operate on the same data item and one of them is a write. Two transactions conflict directly if they contain conflicting operations. For two transactions, T1 and T2, that conflict directly, T1 precedes T2 (denoted as T1 → T2) in the execution order if T1 is executed before T2 in terms of the conflicting operations. The precedence relation is transitive. A local execution is serializable (also referred to as local serializability) if all the conflicting transactions have the same precedence order on all the conflicting operations. The serialisation order refers to the precedence relation between the transactions in a serializable execution.
Let s1 and s2 be two subtransactions at an LDBS. s1 and s2 indirectly conflict if they do not directly conflict and there are some transactions, f1, ..., fm, such that s1 → f1 → ... → fm → s2. Two transactions are said to conflict with each other if they conflict either directly or indirectly.
Example 1: Assume an MDBS consists of two sites, 1 and 2. Data item a is stored at site 1 and data items b and c are stored at site 2. Global transactions G1 and G2 are defined as:
G1 : r1(a) w1(b)
G2 : w2(a) w2(c)
A local transaction L3 at site 2 is defined as:
L3 : r3(b) w3(c)
Let G1,1 and G1,2 denote the subtransactions of G1 at sites 1 and 2 respectively. Similarly, G2,1 and G2,2 are the subtransactions of G2 at sites 1 and 2.
Suppose that the LDBSs at sites 1 and 2 generate the following executions:
At site 1: e1 = r1(a), w2(a)
At site 2: e2 = r3(b), w1(b), w2(c), w3(c)
G1,1 and G2,1 directly conflict at site 1; the order between them is G1,1 → G2,1. G1,2 and G2,2 directly conflict with L3 at site 2; the order is L3 → G1,2 and G2,2 → L3. Thus, G1,2 and G2,2 indirectly conflict at site 2; the order is G2,2 → G1,2.
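The conflict analysis of Example 1 can be reproduced mechanically. The following is a minimal Python sketch, not part of the scheme itself: the triple encoding of operations and the names `precedence`, `e1`, `e2` and `globally_consistent` are assumptions made for illustration. It derives the transitive precedence relation at each site from the direct conflicts and then checks whether the two sites order any pair of transactions in opposite ways.

```python
from itertools import combinations


def precedence(execution):
    """Return the transitive precedence relation of one local execution.

    `execution` is a list of (transaction, kind, item) triples in the order
    the LDBS performed them, where kind is "r" or "w" (illustrative encoding).
    """
    order = set()
    # combinations() keeps list order, so the first operation ran earlier.
    for (t1, k1, x1), (t2, k2, x2) in combinations(execution, 2):
        # Two operations conflict if they touch the same item and one writes.
        if t1 != t2 and x1 == x2 and "w" in (k1, k2):
            order.add((t1, t2))
    # The precedence relation is transitive, so close it under composition.
    changed = True
    while changed:
        changed = False
        for a, b in list(order):
            for c, d in list(order):
                if b == c and a != d and (a, d) not in order:
                    order.add((a, d))
                    changed = True
    return order


# Executions of Example 1, with subtransactions named after their global transaction.
e1 = [("G1", "r", "a"), ("G2", "w", "a")]
e2 = [("L3", "r", "b"), ("G1", "w", "b"), ("G2", "w", "c"), ("L3", "w", "c")]

site1 = precedence(e1)
site2 = precedence(e2)

# Global serializability fails if some pair is ordered one way at site 1
# and the opposite way at site 2.
globally_consistent = not any((b, a) in site2 for (a, b) in site1)
print(sorted(site1), sorted(site2), globally_consistent)
```

The sketch sees L3 only because it is handed the full local execution; the MDBS software has no such view, which is exactly why the indirect conflict at site 2 goes unnoticed in practice.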
In Example 1, although at both sites the executions are locally serializable, the executions are not globally serializable. This is because the serialisation orders between the subtransactions of G1 and G2 are different at the two sites. The problem is caused by the execution at site 2, where the order of performing the operations is different from the serialisation order between G1,2 and G2,2. The difference is due to L3, which introduces an indirect conflict between G1,2 and G2,2 and forces G2,2 to be serialised preceding G1,2.
The problem described above cannot be solved by limiting the concurrent execution of the global transactions. This is because the MDBS is unaware of the existence of local transactions, thus it cannot detect indirect conflicts between the subtransactions that are caused by the local transactions. In order to solve the problem, [6] proposed a scheme which forces the subtransactions to conflict directly. In the scheme an artificial data item is inserted into each LDBS. This data item is maintained by the LDBS as a regular data item. Each subtransaction is extended to include a write access to that data item at the corresponding LDBS. This access forces the subtransactions to conflict directly.
If the LDBSs use some sort of locking mechanism, then global deadlock could arise. Global deadlock refers to the situation in which two or more global transactions are waiting for each other, and none of them can make progress. A global deadlock involves more than one LDBS.
Each LDBS has some mechanisms to prevent and resolve local deadlock. This, however, does not ensure that global deadlock will not occur.
Example 2: Consider an MDBS where data items a and b are at site i and c and d are at site j. The following global transactions are submitted:
G1 : w1(a) w1(d)
G2 : w2(b) w2(c)
Let L3 and L4 be two local transactions that are submitted at sites i and j respectively:
L3 : r3(a) r3(b)
L4 : r4(c) r4(d)
Let ei and ej be the executions at sites i and j respectively:
ei = r3(a), w1(a), w2(b), r3(b)
ej = r4(c), w2(c), w1(d), r4(d)
A global deadlock may occur as follows: At site i, G1,i waits for data item a locked by L3. L3 waits for data item b locked by G2,i. Therefore, G1,i is waiting indirectly for G2,i at site i. Similarly, G2,j is waiting indirectly for G1,j at site j. G1,j releases d when G1 commits. However, G1 will not commit until G1,i completes. This means that G1,j (G1) is waiting for G1,i to complete. Similarly, G2,i (G2) is waiting for G2,j to complete. Since G1,i is waiting indirectly for G2,i, G1 is waiting for G2 to commit, and vice versa. Hence, there is a global deadlock.
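The waiting chain of Example 2 can be written down as a wait-for graph and checked for a cycle. The sketch below is illustrative only: the dictionary encoding, the node labels and the helper `find_cycle` are assumptions, not part of the scheme. It shows that the graph restricted to site i has no cycle, so a local deadlock detector sees nothing, while the graph that includes the cross-site commit dependencies does.

```python
def find_cycle(wait_for):
    """Return one cycle of a wait-for graph as a list of nodes, or None.

    `wait_for` maps a transaction to the transactions it is waiting for.
    """
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {}
    path = []

    def visit(node):
        colour[node] = GREY
        path.append(node)
        for nxt in wait_for.get(node, ()):
            state = colour.get(nxt, WHITE)
            if state == GREY:
                # nxt is already on the current path: a cycle is closed.
                return path[path.index(nxt):] + [nxt]
            if state == WHITE:
                found = visit(nxt)
                if found:
                    return found
        colour[node] = BLACK
        path.pop()
        return None

    for node in list(wait_for):
        if colour.get(node, WHITE) == WHITE:
            found = visit(node)
            if found:
                return found
    return None


# What the LDBS at site i can observe on its own.
site_i = {"G1,i": ["L3"], "L3": ["G2,i"]}

# The same edges plus site j and the commit dependencies between
# subtransactions of the same global transaction.
global_graph = {
    "G1,i": ["L3"],    # waits for a, locked by L3
    "L3": ["G2,i"],    # waits for b, locked by G2,i
    "G2,i": ["G2,j"],  # G2 cannot commit until G2,j completes
    "G2,j": ["L4"],    # waits for c, locked by L4
    "L4": ["G1,j"],    # waits for d, locked by G1,j
    "G1,j": ["G1,i"],  # G1 cannot commit until G1,i completes
}

local_cycle = find_cycle(site_i)
global_cycle = find_cycle(global_graph)
print(local_cycle, global_cycle)
```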
3 The MDBS Model
The completely distributed system architecture in [1] is used here. The MDBS software at each site consists of a GTM and a set of servers. The GTMs at different sites can communicate with each other. Global transactions can be submitted to any of the GTMs. The GTM to which a global transaction is submitted becomes the coordinator for this transaction. Each global transaction is assigned a timestamp. The timestamps are unique system-wide, and define a total order among the global transactions. The timestamps generated at each site are in monotonically increasing order. The timestamp of a transaction G is denoted as time(G).
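The paper does not say how the timestamps are generated. One common way to obtain timestamps that are unique system-wide, totally ordered, and monotonically increasing at each site is to pair a local counter with a unique site identifier. The Python sketch below assumes that construction; the class name `TimestampSource` and the tuple layout are hypothetical.

```python
import itertools


class TimestampSource:
    """Per-site timestamp generator (an assumed construction, not the paper's)."""

    def __init__(self, site_id):
        self.site_id = site_id
        self._counter = itertools.count(1)

    def next(self):
        # Tuples compare lexicographically: the counter decides first and the
        # site identifier breaks ties, so no two sites issue the same value.
        return (next(self._counter), self.site_id)


site_a = TimestampSource(site_id=1)
site_b = TimestampSource(site_id=2)
t1 = site_a.next()
t2 = site_b.next()
t3 = site_a.next()
print(t1, t2, t3)
```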
The coordinator of a global transaction decomposes the transaction into a set of subtransactions, each of which is sent to a participating site where it can be executed. All subtransactions of the global transaction carry its timestamp. The coordinator controls the commit or the abort of the global transaction. If a global transaction is aborted due to the violation of global serializability, it will be restarted with the same timestamp.
When a GTM receives a subtransaction, it creates a server. A server is a process assigned to a subtransaction by a GTM to act as an agent for the global transaction. A server submits the operations of the subtransaction to the LDBS, monitors its execution and interacts with the local GTM. If a server and the coordinator of the global transaction reside at different sites, then the local GTM of the server controls communication between the server and the coordinator. In this paper, server and subtransaction are used interchangeably. The coordinator of a global transaction is also called the coordinator of the subtransactions (servers) of the global transaction.
Interaction between the servers and the LDBSs occurs (a) when the servers send the operations of the subtransactions to the LDBSs and (b) when the LDBS reports the completion of the operations. A server can instruct a LDBS to commit or abort the operations. The following assumptions are made:
1. Each LDBS guarantees local serializability.
2. All local executions are deadlock-free. That is, either deadlock does not occur in the LDBSs, or if deadlock occurs then it is detected and resolved locally.
3. All local executions are fair. That is, the operations submitted to a LDBS will be completed eventually.
4. Once an LDBS informs a server that the operations of the subtransaction have been completed, the LDBS will not abort the operations unilaterally.
5. The LDBSs do not inform the MDBS software of the local serialisation orders.
4 The Scheme
In a similar way to [6], the scheme in this paper enforces the direct conflict between the subtransactions on a LDBS. In contrast to [6], the global transactions are not subject to global deadlock.
To enforce the direct conflict, each LDBS maintains a data item dummy. A write to dummy is added to each subtransaction. This forces the subtransactions on a LDBS to conflict directly. The conflict makes it possible for the GTM to know the serialisation order of the subtransactions at a site.
The main ideas of the scheme are as follows:
1. When the operations of a subtransaction are completed, the coordinator is informed. When all the subtransactions have completed successfully, the coordinator commits the global transaction. If one of the subtransactions aborts at a site, then the coordinator aborts all other subtransactions.
2. Each server must request the LDBS to perform a write to dummy first. After the LDBS completes this write, the server submits the operations of the subtransaction to the LDBS.
3. Only one server is allowed to submit w(dummy) to an LDBS at any time. The order of committing the operations of the subtransactions must be the same as the order of submitting w(dummy) to the LDBS.
4. When the coordinator of global transaction G is informed that one of its subtransactions has completed, the coordinator requests the other subtransactions of G be executed immediately. When the request is received by the GTM at a site, the GTM aborts all the subtransactions which precede G and whose timestamps are greater than time(G). Aborting transactions prevents global deadlock.
The details of the scheme are explained now.
Servers are in one of several states. A GTM creates a server when the GTM receives a subtransaction. The created server is in the dormant state. A GTM activates a server by changing its state to register when no other servers are in the register state.
When a server enters the register state, it submits w(dummy) to the LDBS. When the operation is completed by the LDBS, the server informs the GTM and enters the active state. A server in the active state submits the operations of the subtransaction to the LDBS. When the operations of a subtransaction are completed by the LDBS, the server sends a request-to-commit message to its coordinator and enters the prepared-to-commit state. The request-to-commit message informs the coordinator that the subtransaction has completed. When the servers of a global transaction have all entered the prepared-to-commit state, the coordinator instructs the servers to enter the ready-to-commit state.
A server sends a ready message to the coordinator when it enters the ready-to-commit state. A server in the ready-to-commit state can only be aborted by the coordinator of the server. The ready-to-commit state is necessary because some of the servers may have been aborted due to the local deadlock prevention measures after they entered the prepared-to-commit state. Therefore, the ready-to-commit state allows the coordinator to test whether all the servers are still available to commit. If the coordinator receives a ready message from all its servers, then the coordinator asks the servers to commit. If a server is aborted before it enters the ready-to-commit state, then the coordinator of the server will receive an abort message. In this case the coordinator will abort all the subtransactions.
At any state, a server aborts itself if it receives an abort message from the GTM. If a server is aborted to prevent local deadlock, then an abort message is sent to the coordinator of the server. When a coordinator receives an abort message, the coordinator asks all its servers to abort.
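The server life cycle described above amounts to a small state machine. The following Python sketch is one way to write it down, under stated assumptions: the state names come from the text, while the event names (`activate`, `dummy_written`, `operations_completed`), the `from_coordinator` flag, and the two terminal states are hypothetical labels introduced here for illustration.

```python
from enum import Enum


class State(Enum):
    DORMANT = "dormant"
    REGISTER = "register"
    ACTIVE = "active"
    PREPARED = "prepared-to-commit"
    READY = "ready-to-commit"
    COMMITTED = "committed"   # terminal state added for the sketch
    ABORTED = "aborted"       # terminal state added for the sketch


# Normal forward path of a server; event names are illustrative.
TRANSITIONS = {
    (State.DORMANT, "activate"): State.REGISTER,            # GTM activates the server
    (State.REGISTER, "dummy_written"): State.ACTIVE,        # w(dummy) completed
    (State.ACTIVE, "operations_completed"): State.PREPARED, # request-to-commit sent
    (State.PREPARED, "continue"): State.READY,              # ready message sent
    (State.READY, "commit"): State.COMMITTED,
}


def step(state, event, from_coordinator=False):
    """Return the next state of a server, or raise ValueError if the event is not allowed."""
    if event == "abort":
        if state in (State.COMMITTED, State.ABORTED):
            raise ValueError(f"server already terminated in state {state.value}")
        # A ready-to-commit server can only be aborted by its coordinator.
        if state is State.READY and not from_coordinator:
            raise ValueError("a ready-to-commit server can only be aborted by its coordinator")
        return State.ABORTED
    if (state, event) not in TRANSITIONS:
        raise ValueError(f"event {event!r} is not expected in state {state.value}")
    return TRANSITIONS[(state, event)]


state = State.DORMANT
for event in ["activate", "dummy_written", "operations_completed", "continue", "commit"]:
    state = step(state, event)
print(state)
```

Keeping the transition table separate from the abort rule makes the one asymmetric case visible: every state before ready-to-commit accepts an abort from any source, whereas ready-to-commit accepts it only from the coordinator.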
The operations of the coordinators and the servers are described below in an event driven style.
Coordinator of a Global Transaction:
1. when received a global transaction:
2.    assign timestamp to the transaction;
3.    decompose the transaction;
4.    send each subtransaction to corresponding LDBS;
5. when received abort message:
6.    send abort message to all LDBSs with subtransactions;
7. when received first request-to-commit message:
8.    send immediate-execution request to all LDBSs except one from which request-to-commit message is received;
9. when request-to-commit messages received from all servers:
10.   send continue message to all LDBSs where servers reside;
11. when ready received from all servers:
12.   send commit message to all LDBSs where servers reside;
Server:
13. when change to register state:
14.   submit w(dummy) to the LDBS; /* Enter register state. */
15. when w(dummy) is completed by the LDBS:
16.   send finish-register message to local GTM;
17.   submit operations of subtransaction to the LDBS; /* Enter active state. */
18. when submitted operations are completed by the LDBS:
19.   send request-to-commit message to local GTM; /* Enter prepared-to-commit state. */
20. when continue message is received:
21.   send ready message to local GTM; /* Enter ready-to-commit state. */
22. when commit message received:
23.   request the LDBS to commit the operations;
24.   terminate;
25. when abort message received:
26.   request the LDBS to abort the operations;
27.   terminate;
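The coordinator's event handlers (lines 5-12) can be sketched in Python as follows. This is a simplified illustration, not the authors' implementation: the class `Coordinator`, its `outbox` list standing in for the network, and the method names are assumptions, and decomposition of the transaction (lines 1-4) is taken as already done.

```python
class Coordinator:
    """Event-driven coordinator of one global transaction (illustrative sketch)."""

    def __init__(self, timestamp, sites):
        self.timestamp = timestamp
        self.sites = set(sites)   # sites holding a subtransaction
        self.requested = set()    # sites whose server sent request-to-commit
        self.ready = set()        # sites whose server sent ready
        self.outbox = []          # (destination site, message) pairs

    def _broadcast(self, message, skip=None):
        for site in sorted(self.sites):
            if site != skip:
                self.outbox.append((site, message))

    def on_abort(self):
        # Lines 5-6: one abort leads to aborting every subtransaction.
        self._broadcast("abort")

    def on_request_to_commit(self, site):
        first = not self.requested
        self.requested.add(site)
        if first:
            # Lines 7-8: ask every other site to execute immediately.
            self._broadcast("immediate-execution", skip=site)
        if self.requested == self.sites:
            # Lines 9-10: all servers are prepared-to-commit.
            self._broadcast("continue")

    def on_ready(self, site):
        self.ready.add(site)
        if self.ready == self.sites:
            # Lines 11-12: all servers are ready-to-commit.
            self._broadcast("commit")


coordinator = Coordinator(timestamp=(1, 1), sites=[1, 2])
coordinator.on_request_to_commit(1)
coordinator.on_request_to_commit(2)
coordinator.on_ready(1)
coordinator.on_ready(2)
print(coordinator.outbox)
```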
A GTM has two tasks. One is to act as coordinator of the global transactions submitted to the GTM. The other is to create and control the servers at a site.
To guarantee global serializability, a GTM has to know the serialisation order of the subtransactions. In this paper a GTM only activates a server when no servers are in the register state (i.e. when no w(dummy) is waiting to be completed). Thus, only one server can submit w(dummy) to the LDBS at a time. As a result, dummy is accessed in the same order as the subtransactions are activated; and this order is the serialisation order of the subtransactions.
In order to achieve global serializability, a GTM only allows a subtransaction a to send a request-to-commit message to a's coordinator when all subtransactions which precede a have committed or aborted. This measure ensures that, if the subtransactions of some global transactions have different local serialisation orders, then the global transactions will not be committed.
Global deadlock must be prevented. In the scheme, a transaction with a larger timestamp is aborted if the transaction might be blocking a transaction with a smaller timestamp from committing. Aborting a transaction with a larger timestamp ensures that all the transactions have a chance to be executed eventually.
Global Transaction Manager:
28. when subtransaction st is received:
29.   create a server for st;
30.   if there are no servers in the register state
        then change st to register state;
        else set st to dormant;
31. when received finish-register message:
32.   activate a server which is dormant;
33. when received request-to-commit message from server st:
34.   if all subtransactions which precede st have committed or aborted
        then forward the message to coordinator of st;
        else hold the message;
35. when received ready message from a server:
36.   forward message to the coordinator of server;
37. when received one of continue/commit/abort from coordinator of server:
38.   forward the message to the server;
39. when received immediate-execution from coordinator of st:
40.   for each server which precedes st and whose timestamp is larger than time(st) do
        if server is not ready-to-commit
          then send abort message to server and the coordinator of the server respectively;
      end-for
41.   if st is dormant then activate st;
42. when server st has committed or aborted:
43.   check if received request-to-commit from server st' which follows st in the serialisation order;
44.   if such a message has been received then forward message to coordinator of st';
Although each GTM forces the subtransactions to be committed in their serialisation order, the scheme allows the concurrent execution of the subtransactions of different global transactions at a site. The concurrent execution of the subtransactions is only restricted when they access the dummy data item in the LDBS. This restriction allows the GTM to know the serialisation order of the subtransactions without violating the autonomy of the LDBSs.
All subtransactions can be ordered according to the precedence relation. However, some of the precedence relations between the transactions do not affect the operations of the transactions [2], so more than one server may be in the prepared-to-commit state if the operations of the servers are not influenced by each other. Hence, sometimes it is necessary for the GTM to delay the request-to-commit message (lines 33-34).
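The GTM's bookkeeping (lines 28-34 and 39-44) can be sketched as a small Python class. This is a simplified model under several assumptions that the paper leaves open: messages are recorded in a list instead of being sent, a committed or aborted server is collapsed into a single "done" state, a dormant server is treated as following every server that has already written dummy, and the first dormant server found is the one activated. The class and method names are hypothetical.

```python
class SiteGTM:
    """Per-site GTM bookkeeping (illustrative sketch, messages are only recorded)."""

    def __init__(self):
        self.order = []   # servers in the order they submitted w(dummy)
        self.state = {}   # server -> dormant | register | active | prepared | ready | done
        self.time = {}    # server -> timestamp of its global transaction
        self.held = []    # servers whose request-to-commit is being held
        self.sent = []    # (message, server) pairs passed on by the GTM

    def _activate_if_free(self, st):
        # Only one server may be in the register state at any time.
        if "register" not in self.state.values():
            self.state[st] = "register"
            self.order.append(st)

    def _predecessors(self, st):
        if st in self.order:
            return self.order[: self.order.index(st)]
        return list(self.order)  # assumption: a dormant server follows all registered ones

    def _all_done(self, servers):
        return all(self.state[p] == "done" for p in servers)

    def receive(self, st, timestamp):        # lines 28-30
        self.time[st] = timestamp
        self.state[st] = "dormant"
        self._activate_if_free(st)

    def finish_register(self, st):           # lines 31-32
        self.state[st] = "active"
        for other, current in list(self.state.items()):
            if current == "dormant":
                self._activate_if_free(other)
                break

    def request_to_commit(self, st):         # lines 33-34
        self.state[st] = "prepared"
        if self._all_done(self._predecessors(st)):
            self.sent.append(("request-to-commit", st))
        else:
            self.held.append(st)

    def immediate_execution(self, st):       # lines 39-41
        for p in self._predecessors(st):
            younger = self.time[p] > self.time[st]
            if younger and self.state[p] not in ("ready", "done"):
                self.sent.append(("abort", p))
                self.finished(p)
        if self.state[st] == "dormant":
            self._activate_if_free(st)

    def finished(self, st):                  # lines 42-44
        self.state[st] = "done"
        if st in self.held:
            self.held.remove(st)
        for h in list(self.held):
            if self._all_done(self._predecessors(h)):
                self.held.remove(h)
                self.sent.append(("request-to-commit", h))


# Site 2 where G2,2 wrote dummy before G1,2, although time(G1) < time(G2).
gtm = SiteGTM()
gtm.receive("G2,2", timestamp=2)
gtm.receive("G1,2", timestamp=1)
gtm.finish_register("G2,2")
gtm.finish_register("G1,2")
gtm.request_to_commit("G1,2")       # held: G2,2 precedes G1,2 and is not finished
held_before = list(gtm.held)
gtm.immediate_execution("G1,2")     # G1's coordinator asks for immediate execution
print(gtm.order, held_before, gtm.sent)
```

In the usage above the younger subtransaction G2,2 is aborted, which in turn releases the held request-to-commit of G1,2.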
5 Correctness of the Scheme
Correctness is proved by showing that the scheme satisfies three properties: (a) global serializability, (b) freedom from deadlock, and (c) fairness.
Theorem 1: Global serializability is guaranteed.
Proof: Assume global serializability is not satisfied. This means that the subtransactions of some committed global transactions have different local serialisation orders at some LDBSs. Without loss of generality, the following assumptions can be made:
1. Two global transactions G1 and G2, such that time(G1) < time(G2), have committed.
2. The subtransactions of G1, G1,1 and G1,2, are executed at sites 1 and 2 respectively.
3. The subtransactions of G2, G2,1 and G2,2, are executed at sites 1 and 2 respectively.
4. The serialisation orders at sites 1 and 2 are G1,1 → G2,1 and G2,2 → G1,2 respectively.
Since G1,1 → G2,1 holds at site 1, G2,1 cannot send a request-to-commit message before the operations of G1,1 are committed (lines 33-34). The coordinator of a global transaction tries to commit the transaction when it receives the request-to-commit message from all its subtransactions. This means that G2 cannot commit until G1 commits. Since G2,1 cannot send a request-to-commit message to its coordinator before G1 commits, no subtransactions of G2 can be in the ready-to-commit state before G1 commits (lines 9-10 and 20-21). Applying the same argument at site 2, G1 cannot commit until G2 commits. From lines 18-19, when the operations of G1,1 are completed at site 1, G1,1 sends a request-to-commit message to its coordinator. In turn, the coordinator will send an immediate-execution request to site 2 (lines 7-8). Since time(G1) < time(G2), G2,2 is aborted at site 2 (lines 39-40). Therefore, G2,2 → G1,2 must not hold at site 2. This means that it is impossible to find a pair of global transactions which violate global serializability. Hence, the assumption that global serializability is not satisfied is wrong. Thus, the theorem holds.
Theorem 2: The scheme is deadlock-free.
Proof: When a deadlock occurs, a cycle can be found in the wait-for graph [4, 7]. Without loss of generality, it can be assumed that a cycle is found in the wait-for graph (see Figure 1).
Figure 1: Wait-for Graph
In the cycle Gi,j is the subtransaction of global transaction Gi at site j, and time(G1) < time(G2). Two cases involving transactions waiting for each other need to be considered:
1. A transaction in the cycle is ready-to-commit.
A subtransaction can only be in the ready-to-commit state for a limited period. This is because a coordinator will receive the ready or abort messages within a limited time. Therefore, the coordinator will instruct the subtransactions either to commit or to abort within a limited time. Hence, the cycle will be broken when the transaction commits or aborts.
2. No transactions in the cycle are in the ready-to-commit state.
The cross-site wait-for relation only holds for the subtransactions of the same global transaction. It means that a subtransaction in the prepared-to-commit state is waiting for another subtransaction to enter the prepared-to-commit state. In the cycle, G2,1 → G2,2 and G1,2 → G1,1 are cross-site relations. Therefore, G2,1 and G1,2 must be in the prepared-to-commit state. From lines 18-19 and 7-8, the coordinators of G2,1 and G1,2 will send an immediate-execution request to sites 2 and 1 respectively. However, the request from the coordinator of G2,1 will not affect G1,2 at site 2. This is because time(G1) < time(G2) (lines 39-40). Since G2,1 is not in the ready-to-commit state, from the scheme (lines 39-40), G2,1 will be aborted when the request of the coordinator of G1,2 is received by site 1. Thus, the cycle is broken, and G1,1 will be executed.
In both cases, the cycle cannot last indefinitely. Hence, the scheme is free from global deadlock.
Theorem 3: The scheme is fair.
Proof: From the assumption in section 3, the scheduling of the LDBSs is fair. Hence, a server will reach the prepared-to-commit state eventually.
According to lines 39-40, a GTM aborts a global transaction if the execution of the transaction might be blocking the execution of a transaction with a smaller timestamp. Clearly, the oldest global transaction will never be aborted. Therefore, the oldest transaction will be committed as long as the operations of one of its subtransactions can be completed by the LDBS. This is because, if one of the subtransactions is completed, then the other subtransactions will be executed when the immediate-execution requests are received by the sites where the subtransactions reside. Since (a) the timestamps assigned to the new transactions increase monotonically, and (b) an aborted transaction will be restarted with the same timestamp, an aborted transaction can become the oldest transaction in the system eventually. Therefore, any transaction can be executed eventually. Hence, the scheme is fair.
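The fairness argument can be illustrated with a deliberately pessimistic toy model in Python. It is not a simulation of the scheme: it assumes that in every round only the oldest pending transaction commits, every other transaction is aborted and restarted with its original timestamp, and new transactions keep arriving with larger timestamps. The function name and parameters are hypothetical.

```python
def rounds_until_commit(pending, target, arrivals_per_round=2):
    """Count rounds until `target` commits in a worst-case toy model.

    `pending` holds the timestamps of the transactions currently in the
    system; `target` must be one of them.
    """
    pending = set(pending)
    next_timestamp = max(pending) + 1
    rounds = 0
    while True:
        rounds += 1
        oldest = min(pending)
        pending.remove(oldest)  # the oldest transaction is never aborted, so it commits
        if oldest == target:
            return rounds
        # Everything else restarts with the same timestamp, while newcomers
        # receive strictly larger timestamps and cannot overtake the target.
        for _ in range(arrivals_per_round):
            pending.add(next_timestamp)
            next_timestamp += 1


few_arrivals = rounds_until_commit({3, 5, 9}, target=9, arrivals_per_round=2)
many_arrivals = rounds_until_commit({3, 5, 9}, target=9, arrivals_per_round=50)
print(few_arrivals, many_arrivals)
```

In this model the wait depends only on how many older transactions are pending, not on the arrival rate of new ones, which is the point of restarting with the same timestamp.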
6 Conclusions
In this paper a transaction management scheme for MDBS has been proposed. The scheme is distributed. Each site manages the global transactions based on locally available information. Therefore, it has the potential to provide a higher degree of fault tolerance, and allows incremental growth of the system. The scheme preserves autonomy and heterogeneity of the LDBSs. It is fair and free from global deadlock. The global consistency is enforced through (a) additional operations on a dummy data item stored in the LDBSs, (b) forbidding concurrent access to the dummy data item by different subtransactions, and (c) forcing the subtransactions to be committed in the order that they access the dummy data item. Freedom from deadlock is achieved by aborting the transactions which might block the execution of the transactions that have smaller timestamps. The aborting only occurs when the transactions with smaller timestamps request the execution of all their subtransactions to be carried out immediately. Therefore, the scheme in this paper is more flexible than [1], where the transactions must be committed in strict timestamp order. The fairness is achieved by assigning different execution priority to the transactions. The priority is based on the timestamps of the transactions. The priority of a transaction with a small timestamp is always higher than that of a transaction with a large timestamp. Therefore, the transaction with the smallest timestamp can always be executed.
References
[1] R.K. Batra, M. Rusinkiewicz & D. Georgakopoulos, A decentralised deadlock-free concurrency control method for multidatabase transactions, Proc. 12th Int. Conf. on Distributed Computing Systems, 1992.
[2] P.A. Bernstein & N. Goodman, Concurrency control in distributed database systems, ACM Computing Surveys 13(2) pp. 185-222, 1981.
[3] Y. Breitbart, D. Georgakopoulos, M. Rusinkiewicz & A. Silberschatz, Rigorous scheduling in multidatabase transactions, Proc. Workshop on Multidatabase and Semantic Interoperability, 1990.
[4] Y. Breitbart & A. Silberschatz, Multidatabase update issues, Proc. ACM SIGMOD Int. Conf. on Management of Data, 1989.
[5] A.K. Elmagarmid & W. Du, A paradigm for concurrency control in heterogeneous distributed database systems, Proc. 6th Int. Conf. on Data Engineering, 1990.
[6] D. Georgakopoulos, M. Rusinkiewicz & A. Sheth, On serializability of multidatabase transactions through forced local conflicts, Proc. 7th IEEE Int. Conf. on Data Engineering, 1991.
[7] H. Korth & A. Silberschatz, Database System Concepts, McGraw-Hill, 1986.
[8] C. Pu, Superdatabases for Composition of Heterogeneous Databases, Proc. 4th Int. Conf. on VLDB, 1988.
[9] A.S. Tanenbaum, Modern Operating Systems, Prentice Hall, 1992.
[10] J. Tang, Using dummy reads to maintain consistency in heterogeneous database systems, Proc. Third Workshop on Future Trends of Distributed Computing Systems, 1992.