CONCURRENCY CONTROL THROUGHPUT VS NUMBER OF NODES

THROUGHPUT VS NUMBER OF NODES AND MPL

Distributed DBMSs are used both to increase availability and to increase performance. Performance should improve with an increasing number of nodes. We have simulated with 1, 2, 4, 10, 20, and 50 nodes, and for each of these node configurations with an MPL of 1, 5, 10, 100, 500, and 1000.

Here we assume that all nodes hold an equal share of the data in the database, and that the transaction load is balanced, i.e., on average the same number of transactions are started on each node. We also assume that the probability that a data item accessed by a transaction is located at the transaction's home node is 80%. If it is not on the home node, it is located on a remote node, with each remote node equally likely. The network model used in this simulation is a LAN.
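As a concrete illustration of this access model, the following Python sketch enumerates the simulated configurations and samples the node holding each accessed data item. It is a minimal sketch of the stated assumptions, not our simulator's code; all function and variable names are ours.

```python
import itertools
import random

# Parameter grid from the text; the simulator itself is not shown here.
NODE_COUNTS = [1, 2, 4, 10, 20, 50]
MPL_LEVELS = [1, 5, 10, 100, 500, 1000]
HOME_NODE_PROB = 0.80  # probability an accessed item lives on the home node

def placement_node(home_node: int, num_nodes: int, rng: random.Random) -> int:
    """Return the node holding the next accessed item.

    With probability 0.8 the item is on the transaction's home node;
    otherwise it is on one of the remote nodes, each equally likely.
    """
    if num_nodes == 1 or rng.random() < HOME_NODE_PROB:
        return home_node
    remote = [n for n in range(num_nodes) if n != home_node]
    return rng.choice(remote)

# One (nodes, MPL) configuration per simulation run, 36 runs in total.
configs = list(itertools.product(NODE_COUNTS, MPL_LEVELS))

rng = random.Random(42)
print([placement_node(0, 10, rng) for _ in range(10)])  # mostly node 0
```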

The throughput increases with an increasing number of nodes for both the timestamp ordering (TO) and two-phase locking (2PL) schedulers.

A more interesting result is that the throughput of the TO scheduler is higher than that of the 2PL scheduler. This was not expected. In the comparison done with the centralized version of the simulator, the 2PL scheduler performed better than the TO scheduler in most cases. In a real implementation of a distributed scheduler this result would have been more likely, because 2PL carries an overhead for global deadlock detection. In our implementation, however, the global deadlock checking is done without imposing any delay on the participating transactions.
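For context, global deadlock detection for distributed 2PL amounts to finding cycles in a global waits-for graph assembled from the per-node graphs; in a real system, assembling and checking that graph costs messages and time, which is the overhead referred to above. A minimal sketch of the cycle check (the representation and names are ours):

```python
def has_deadlock(waits_for: dict[str, set[str]]) -> bool:
    """Detect a cycle in a global waits-for graph.

    waits_for maps a transaction id to the set of transactions it is
    waiting on. A cycle in this graph means a (global) deadlock.
    """
    WHITE, GREY, BLACK = 0, 1, 2
    color = {t: WHITE for t in waits_for}

    def visit(t: str) -> bool:
        color[t] = GREY
        for u in waits_for.get(t, ()):
            if color.get(u, WHITE) == GREY:  # back edge: cycle found
                return True
            if color.get(u, WHITE) == WHITE and visit(u):
                return True
        color[t] = BLACK
        return False

    return any(color[t] == WHITE and visit(t) for t in waits_for)

# T1 waits for T2 on one node, T2 waits for T1 on another:
# a global deadlock that no single node can see locally.
assert has_deadlock({"T1": {"T2"}, "T2": {"T1"}})
```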

For the TO scheduler, we note that the curves for an MPL of 500 and an MPL of 1000 are almost identical. Here we have reached the limit of maximal throughput with the available resources, where the main bottleneck is the disks.

For the 2PL scheduler, the maximal throughput is reached at a much lower load. The figure shows that we get the maximal throughput near 100 active transactions.

For 500 and 1000 active transactions, the throughput is much lower. With only one transaction in the system, the throughput is expected to decrease when we add more nodes. This can be verified by examining the figures: when there is only one active transaction, the curve dips as soon as more than one node is introduced, because some of the accessed data must now travel across the network.

Another observation from these figures is that with more nodes and a high-capacity network between them, the distributed database gains considerably in performance compared to a centralized database. The main reason is the higher aggregate disk bandwidth provided by the additional disks. This, of course, requires that the data are evenly accessed and that most accesses go to the local disk without crossing the network. Given these characteristics, a distributed database built from relatively inexpensive, off-the-shelf nodes can compete very well with more expensive parallel database machines.
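The argument can be put in back-of-envelope form: disk capacity scales with the number of nodes, while only the non-local fraction of accesses loads the shared network. A sketch with purely illustrative numbers (the rates below are not taken from our simulation):

```python
def aggregate_disk_ios(nodes: int, ios_per_disk: float = 100.0) -> float:
    """Total disk I/O capacity; grows linearly with the number of
    disks as long as accesses are spread evenly over them."""
    return nodes * ios_per_disk

def lan_ops_per_sec(total_ops: float, p_home: float = 0.8) -> float:
    """Operations per second crossing the shared LAN; a high
    home-node probability keeps this far below network capacity."""
    return total_ops * (1.0 - p_home)

print(aggregate_disk_ios(50))   # 5000.0 I/Os per second in total
print(lan_ops_per_sec(5000.0))  # only 1000.0 of them cross the LAN
```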

Response Times

We present the response times for the two schedulers. The results are taken from the same simulations as those presented in the previous subsection. Only the short transactions were used when producing these curves, because it is usually the short transactions that have the most stringent response-time requirements.

If we compare the two figures, we reach the same conclusion as in the previous subsection: the TO scheduler performs better than the 2PL scheduler for short transactions.

Abort Frequency

We are now able to answer the question of why the TO scheduler has higher throughput than the 2PL scheduler. The length of a transaction strongly influences its abort probability. For short transactions, the abort probability is rather small.

For long transactions, this is not the case. Especially with a high number of parallel transactions, we get many conflicting operations and many aborts. What is important to notice is that the TO scheduler produces more aborts than the 2PL scheduler. This is as expected, since TO does not resolve conflicts as gracefully as 2PL: the 2PL scheduler is more likely to let a transaction wait for the resource, while the TO scheduler aborts more often when a conflicting operation occurs.
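The contrast comes down to how each scheduler reacts to a conflicting operation. The following schematic sketch of the textbook rules (blocking for 2PL, abort-on-late-timestamp for basic TO) is ours, not the simulator's code:

```python
from dataclasses import dataclass

@dataclass
class Txn:
    tid: int
    ts: int  # start timestamp (used by TO only)

def on_conflict_2pl(requester: Txn, holder: Txn) -> str:
    # 2PL: the requester normally blocks until the lock holder
    # finishes; conflicts rarely cause aborts by themselves
    # (deadlocks are handled separately).
    return "wait"

def on_conflict_to(requester: Txn, last_access_ts: int) -> str:
    # Basic TO: an operation arriving "too late" (its timestamp is
    # older than the latest conflicting access) is rejected, and the
    # transaction is aborted and restarted with a new timestamp.
    return "abort" if requester.ts < last_access_ts else "proceed"
```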

Then why does the TO scheduler perform so much better than the 2PL scheduler?

The main reason is that we have a mix of long and short transactions. The long transactions are the reason why the 2PL scheduler performs worse than the TO scheduler. There are two reasons for this:

1. With many active transactions there will be many conflicts, and the long transactions in the system will lock many data elements. The 2PL scheduler lets a short transaction wait for the long transaction to finish, so many short transactions end up blocked for a long time. This is the main reason why the response time for short transactions is so much longer with the 2PL scheduler than with the TO scheduler. It is also the reason why 2PL reaches its maximum throughput earlier than TO.

2. Another effect of the locking is that long transactions have a higher probability of finishing without being aborted. This means that, on average, more resources are occupied for a long time by long transactions under the 2PL scheduler than under the TO scheduler. From the simulation logs we noticed that the percentage of finished transactions that were long was much higher for 2PL than for TO.

A conclusion that can be drawn from this is that 2PL treats long transactions more fairly than the TO scheduler; the cost is a lower maximum throughput. To check this, we ran the same simulations with only short transactions. This time both schedulers showed almost identical performance, as expected.

For a mixed load, i.e., long and short transactions, we keep the load ratio constant: 20% long and 80% short transactions. However, the production ratio changes with increasing total load. Because long transactions are more vulnerable to data contention than short transactions under the TO scheduler, the relative production of long transactions decreases with increasing MPL.
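To make the distinction between load ratio and production ratio concrete, the share of long transactions among committed transactions follows from the per-class completion rates; the probabilities below are purely illustrative, not measured values:

```python
def long_share_of_commits(p_commit_long: float, p_commit_short: float,
                          load_long: float = 0.2) -> float:
    """Share of long transactions among committed transactions.

    The load ratio is fixed (20% long, 80% short); the production
    ratio shifts when the classes commit at different rates.
    """
    load_short = 1.0 - load_long
    long_done = load_long * p_commit_long
    short_done = load_short * p_commit_short
    return long_done / (long_done + short_done)

# Illustrative numbers only: if, under TO at high MPL, long
# transactions commit far less often within the measurement window,
# their share of the output drops well below the 20% load share.
print(long_share_of_commits(0.3, 0.95))  # ~0.07
```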

Throughput with Different Data Placement

It is also important to measure throughput when we change the probability of accessing data on the home node. Here we used both schedulers in a configuration with 10 nodes and an MPL of 1, 20, and 1000, with a LAN network.

First, we examined how the throughput changes as the home node probability ranges from 0% to 100%. As expected, the curves show that the more data accesses go to local data, the better the performance of the system.

For the TO simulations, the curves for an MPL of 1 and an MPL of 20 are easy to explain. For an MPL of 1, the throughput increases only slightly with increasing home node probability, because more and more of the operations can be performed locally and do not have to cross the network. For an MPL of 20 we see a very strong effect from increasing the home node probability: the system goes from a state where the network is the bottleneck to a state where the network contributes very little to the transaction delay.
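One way to see why a higher home-node probability helps is a simple expected-delay estimate: only remote accesses pay the network round trip. The numbers below are placeholders, not our simulation parameters, and the sketch ignores queueing on the shared medium (which dominates at an MPL of 1000, as discussed next):

```python
def expected_op_delay(p_home: float, local_ms: float = 5.0,
                      network_ms: float = 2.0) -> float:
    """Expected delay of one data access.

    Local accesses cost only the disk; remote ones add a round trip
    over the LAN. Contention on the shared medium is ignored here.
    """
    return local_ms + (1.0 - p_home) * network_ms

for p in (0.0, 0.5, 0.8, 1.0):
    print(f"p_home={p:.1f}: {expected_op_delay(p):.1f} ms")
```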

The curve for an MPL of 1000 may look a bit strange. At 0% home node probability the network is the bottleneck, because we model the network as a shared-medium LAN. Already at 10% home node probability the bottleneck has shifted from the network to the disks. From Figure 8, we see that the results for the 2PL simulation are similar.

CONCLUSION

            The results of the comparison between the timestamp ordering scheduler and the two-phase locking scheduler can be summarized as follows:

a. With a mix of long and short transactions, the TO scheduler has a higher throughput than the 2PL scheduler.

b. With only short transactions, the two schedulers perform almost identically.

c. The TO scheduler has a much higher abort probability than the 2PL scheduler.

d. The 2PL scheduler favors long transactions, and the number of long transactions that manage to finish successfully is much higher with the 2PL scheduler.

e. The network is not the bottleneck under a reasonable load; only a heavy load combined with a slow network severely affects performance.

f. 2PL, the standard technique used in centralized DBMSs, proves to perform rather poorly in distributed systems, whereas timestamp ordering based protocols in their various forms seem to provide the best overall performance.

  • 2PL, like other locking techniques, requires deadlock prevention or detection, which is much more complex and costly in a distributed environment.
  • Timestamp ordering techniques (TO) avoid deadlocks entirely.
  • Basic TO (BTO) usually shows better overall performance in a distributed environment than 2PL.
  • ORDER outperforms both 2PL and BTO given low network latency and an efficient implementation of the total ordering algorithm. For high network latencies, ORDER appears to be a rather disadvantageous approach.
  • PREDICT shows basically the same advantages as ORDER.
