CONCURRENCY CONTROL IN DISTRIBUTED DATABASE SYSTEMS

ABSTRACT

This research paper reviews concurrency control and security in distributed database systems. Improving distributed database management systems has become more important as computing environments grow increasingly distributed. The most important security factors for a distributed database are single-level and multilevel access control, protection against inference, and maintenance of integrity. The review identifies many concurrency issues, and in this paper we survey, consolidate, and present the state of the art in distributed database concurrency control.

The core of this analysis is the set of factors that affect concurrency control and security, which together determine the reliability of the database and the quality of its data. This paper examines the underlying features of a distributed database management system. Understanding the tasks of such a system leads toward a design that resolves the issues of concurrency control. Such a design should improve scalability, accessibility, and flexibility when accessing various types of data.

INTRODUCTION

As demand has grown for high performance and high availability, computer systems have moved from centralized to distributed architectures. This shift has raised new issues in database management, and in distributed database systems in particular. Simulating distributed database systems is a difficult task, as many factors influence the results.

These factors include architectural options as well as workload and data distribution. In this paper we present the extensible and easily configurable Distributed Database Simulator, along with results from a comparison of two concurrency control algorithms: timestamp ordering and two-phase locking.

We have earlier implemented a centralized version of the Distributed Database Simulator, in which standard relational database systems were used as the context. The distributed version instead uses data shipping: rather than sending the queries to the data, data is sent to the queries.

Well-known centralized concurrency control techniques can be extended to solve the problem of concurrency control in distributed databases, but not all of them are suitable for a distributed database. One example is serialization graph testing, which works well in a centralized database system given processors that are powerful relative to I/O speed. In a distributed environment, however, keeping the graph updated at all times is prohibitively expensive because of the communication costs.
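To make the cost argument concrete, the following is a minimal sketch of serialization graph testing on a single node. It is not taken from the simulator described in this paper; the class name SerializationGraph and the method request are hypothetical, and the sketch never removes aborted or committed transactions from the graph. Every request must inspect one shared graph, and it is this shared state that would have to be kept current across nodes in a distributed setting.

```python
from collections import defaultdict


class SerializationGraph:
    """Conflict graph: an edge Ti -> Tj means an operation of Ti
    precedes a conflicting operation of Tj on the same item."""

    def __init__(self):
        self.edges = defaultdict(set)     # txn -> set of successor txns
        self.history = defaultdict(list)  # item -> [(txn, op), ...] in schedule order

    def _reaches(self, src, dst):
        """Depth-first search: is dst reachable from src?"""
        stack, seen = [src], set()
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.edges[node])
        return False

    def request(self, txn, op, item):
        """Accept the operation (True), or reject it (False) because
        accepting it would close a cycle in the graph."""
        predecessors = set()
        for other, other_op in self.history[item]:
            # Two operations conflict if they come from different
            # transactions and at least one of them is a write.
            if other != txn and "w" in (op, other_op):
                predecessors.add(other)
        # Adding other -> txn closes a cycle iff txn already reaches other.
        if any(self._reaches(txn, other) for other in predecessors):
            return False
        for other in predecessors:
            self.edges[other].add(txn)
        self.history[item].append((txn, op))
        return True


graph = SerializationGraph()
graph.request("T1", "r", "x")             # accepted
graph.request("T2", "w", "x")             # accepted, adds T1 -> T2
graph.request("T2", "w", "y")             # accepted
rejected = not graph.request("T1", "w", "y")  # would add T2 -> T1: cycle
```

In this usage, the last request is rejected because T1 already precedes T2 through item x, so letting T2 precede T1 on item y would make the schedule non-serializable.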

In recent years, several distributed database systems have been introduced and implemented in practice. Concurrency control in these systems has usually been done by two-phase locking, but as processor speed increases relative to I/O and communication speed, timestamp ordering is expected to compete with two-phase locking in performance. In theory, timestamp ordering should be capable of good performance in distributed systems: it is deadlock-free and avoids much of the communication needed for synchronization and lock management.
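The deadlock-free property follows from the fact that a timestamp ordering scheduler never makes a transaction wait for a lock; it either accepts an operation or rejects it. The following is a minimal sketch of basic timestamp ordering, not the scheduler used in the simulator. The class name TimestampOrderingScheduler is hypothetical, timestamps are assumed to be positive integers assigned when a transaction starts, and a rejected operation is assumed to abort the transaction, which then restarts with a new timestamp.

```python
class TimestampOrderingScheduler:
    """Basic timestamp ordering: operations on an item must arrive
    in timestamp order, otherwise the requesting transaction aborts."""

    def __init__(self):
        self.read_ts = {}   # item -> largest timestamp that has read it
        self.write_ts = {}  # item -> largest timestamp that has written it

    def read(self, ts, item):
        # Too late: a younger transaction has already written the item.
        if ts < self.write_ts.get(item, 0):
            return False
        self.read_ts[item] = max(self.read_ts.get(item, 0), ts)
        return True

    def write(self, ts, item):
        # Too late: a younger transaction has already read or written it.
        if ts < self.read_ts.get(item, 0) or ts < self.write_ts.get(item, 0):
            return False
        self.write_ts[item] = ts
        return True


scheduler = TimestampOrderingScheduler()
scheduler.read(2, "x")                        # accepted
old_write_ok = scheduler.write(1, "x")        # rejected: timestamp 2 already read x
```

No operation ever blocks in this sketch, so no waits-for cycle can form. The price is that a transaction whose operation arrives too late is aborted instead of delayed.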

Simulating distributed database systems is inherently difficult, as there are many factors that may influence the results. These include architectural options as well as workload and data distribution. In this paper, we present the DB simulator and some simulation results.

The DB simulator architecture is extensible, and its parameters and configuration are easy to change. The simulation results in this paper compare the performance and response times of two concurrency control algorithms, timestamp ordering and two-phase locking.
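For contrast with timestamp ordering, the following is a minimal sketch of a lock table for strict two-phase locking. It is an illustration under stated assumptions, not the simulator's scheduler: the names LockManager, acquire, and release_all are hypothetical, lock modes are "S" (shared) and "X" (exclusive), and a refused request simply returns False, where a real scheduler would queue the transaction and handle deadlocks.

```python
class LockManager:
    """Lock table for strict two-phase locking: a transaction acquires
    locks as it goes and releases all of them only at commit or abort."""

    def __init__(self):
        self.locks = {}  # item -> (mode, set of holders); mode is "S" or "X"

    def acquire(self, txn, item, mode):
        """Return True if the lock is granted, False if txn must wait."""
        held = self.locks.get(item)
        if held is None:
            self.locks[item] = (mode, {txn})
            return True
        held_mode, holders = held
        if holders == {txn}:
            # Sole holder: keep the current lock, upgrading S to X if asked.
            if mode == "X" and held_mode == "S":
                self.locks[item] = ("X", holders)
            return True
        if mode == "S" and held_mode == "S":
            holders.add(txn)  # shared locks are compatible with each other
            return True
        return False  # conflict: the caller has to block

    def release_all(self, txn):
        """Shrinking phase, done in one step at commit or abort."""
        for item in list(self.locks):
            _, holders = self.locks[item]
            holders.discard(txn)
            if not holders:
                del self.locks[item]


locks = LockManager()
locks.acquire("T1", "x", "S")                    # granted
locks.acquire("T2", "x", "S")                    # granted, shared with T1
upgrade_blocked = not locks.acquire("T1", "x", "X")  # refused while T2 holds S
```

Unlike the timestamp ordering case, a conflicting request here waits instead of aborting, which is why blocking, and with it the possibility of deadlock, enters the picture.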

The simulations were run with different numbers of nodes, network types, data declustering schemes, and workloads. The results show that for a mix of short and long transactions, throughput is significantly higher with a timestamp ordering scheduler than with a two-phase locking scheduler.

For short and simple transactions, the performance of the two schedulers in DDBs is almost identical. Long transactions are treated more fairly by a two-phase locking scheduler, because a timestamp ordering scheduler has a very high abort rate for long transactions.

RELATED WORK

Much work has been done on the characteristics of centralized schedulers. An interesting model and simulation results can be found in the work of Agrawal, Carey and Livny. Less has been done on distributed schedulers. That work has been mostly theoretical, but some interesting simulation models have been developed and simulated at the University of Wisconsin. Carey and Livny describe a distributed DBMS model that extends their centralized model.

In that work, different simulation parameters are examined through simulations. Several papers about concurrency control have also been written by Thomasian et al.

The most important difference between our approach and earlier approaches is that we focus on data-shipping page-server DDBs, while earlier work was done in the context of query-shipping relational database systems. In addition, inter-operation and inter-transaction times are expected to be much smaller in this kind of system.

DISTRIBUTED DATABASE SIMULATION

This section describes the architecture of the distributed version of the Distributed Database Simulator.

In the following section, we discuss the parameters used in the simulation model, and in a later section we discuss the results from the simulations. We then discuss possible weaknesses and shortcomings of the model. Finally, we present future work and conclude the paper.

Each of the main modules of the simulator is described in turn.

In particular, we focus on the parts that are especially important in the distributed database model. In addition to simulating and comparing schedulers, one of the main goals in developing the Distributed Database Simulator was that it should be useful as a framework for simulating schedulers and easy to extend with new schedulers.

The simulation is event-driven, and each transaction can be thought of as a thread. In the main loop, an event is picked from the event queue and executed. Each event in the queue consists of an event type and the time at which the event is to be executed.

If the event is a TM-event, the TM is called, and if the event is a DM-event, the data manager (DM) is called. Possible reasons for events include a transaction requesting an operation or the data manager finishing a read or write operation on disk.
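The main loop described above can be sketched as follows. This is a minimal illustration, not the simulator's actual code: the class name Simulator, its methods, the fixed disk delay of 5.0 time units, and the string payloads are all assumptions made for the example, and TM is assumed to stand for the transaction manager.

```python
import heapq
import itertools


class Simulator:
    """Event-driven main loop: events are (time, type) pairs that are
    executed in time order and dispatched to the TM or the DM."""

    def __init__(self):
        self.queue = []               # heap of (time, seq, kind, payload)
        self.seq = itertools.count()  # tie-breaker for events at the same time
        self.now = 0.0
        self.log = []

    def schedule(self, time, kind, payload):
        heapq.heappush(self.queue, (time, next(self.seq), kind, payload))

    def transaction_manager(self, payload):
        # Hypothetical TM: pass the requested operation on to the DM,
        # which finishes its disk access after an assumed fixed delay.
        self.log.append((self.now, "TM", payload))
        self.schedule(self.now + 5.0, "DM", payload)

    def data_manager(self, payload):
        # Hypothetical DM: the read or write on disk has completed.
        self.log.append((self.now, "DM", payload))

    def run(self):
        while self.queue:
            self.now, _, kind, payload = heapq.heappop(self.queue)
            if kind == "TM":
                self.transaction_manager(payload)
            elif kind == "DM":
                self.data_manager(payload)


sim = Simulator()
sim.schedule(1.0, "TM", "T1 read x")
sim.schedule(2.0, "TM", "T2 write y")
sim.run()
```

A priority queue keyed on event time keeps the loop itself trivial: handlers only add future events, and the loop always executes the earliest one next. A new scheduler could be tried out by replacing the handler that the TM-event is dispatched to.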
