PostgreSQL 20% Improvement


Pontus was asked to perform a before-and-after analysis of how much improvement PVTM could make on the PostgreSQL PGBench benchmark. The objective of the tests was not to achieve the maximum performance of PostgreSQL, but rather to compare the out-of-the-box performance before and after using PVTM. The table and graph below show the summary results: PVTM improved transactions per second and average latency by nearly 20%, and reduced the standard deviation of latency (i.e. how predictable transaction times are) by almost a factor of three (a 193.8% delta):

Metric                               No PVTM   PVTM    Delta
TPS (000s) (higher is better)        5.012     5.989   19.5%
Avg latency (ms) (lower is better)   3.990     3.339   19.5%
Stdev latency (ms) (lower is better) 10.655    3.626   193.8%
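The Delta column can be reproduced from the raw numbers. A minimal sketch, assuming the delta is the ratio of the better run to the worse run, minus one, expressed as a percentage (which matches all three rows above):

```python
# Reproducing the Delta column: for higher-is-better metrics the ratio is
# PVTM / no-PVTM; for lower-is-better metrics it is no-PVTM / PVTM.

def delta_pct(no_pvtm, pvtm, higher_is_better):
    """Relative improvement of the PVTM run over the baseline run, in %."""
    ratio = (pvtm / no_pvtm) if higher_is_better else (no_pvtm / pvtm)
    return (ratio - 1.0) * 100.0

tps = delta_pct(5.012, 5.989, higher_is_better=True)    # ~19.5%
lat = delta_pct(3.990, 3.339, higher_is_better=False)   # ~19.5%
std = delta_pct(10.655, 3.626, higher_is_better=False)  # ~193.9%
```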




PGBench Benchmark

The PGBench benchmark simulates retail branch transactions with 100,000 accounts and 10 tellers. The benchmark reports the following statistics:

  • STDEV (ms) – how deterministic the results are
  • Latency (ms) – how fast each transaction was
  • Throughput (tps) – how many transactions / sec

These statistics reflect how long stored procedures take to execute, and the rate of execution; the stored procedure tested has the following commands:
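The snippet is not reproduced in this copy. For reference, pgbench's built-in TPC-B-like script (as documented for PostgreSQL 9.4) issues the following statements per transaction, listed here as strings:

```python
# pgbench's default TPC-B-like transaction (PostgreSQL 9.4 naming).
# :aid, :tid, :bid and :delta are per-transaction random variables.
TPCB_STATEMENTS = [
    "BEGIN;",
    "UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;",
    "SELECT abalance FROM pgbench_accounts WHERE aid = :aid;",
    "UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;",
    "UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;",
    "INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) "
    "VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);",
    "END;",
]
```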


System under test (SUT):


The system under test was a Lenovo x3850 X6, with the following characteristics:

  • 4 CPUs
  • 96 hyperthreads
  • 256 GB memory

The whole test was self-contained within this single server. In order to maximize the throughput without having to tweak the PostgreSQL configuration files, we ran the whole database in a tmpfs file system and left the out-of-the-box configuration in place.
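The tmpfs setup can be sketched as follows; the mount point, size, and binary paths are illustrative assumptions, since the source only states that the whole database lived in tmpfs with the default configuration:

```python
# Sketch of running the database from a RAM-backed tmpfs file system.
# Paths and the 64g size are assumptions, not taken from the source.
setup_cmds = [
    "mount -t tmpfs -o size=64g tmpfs /mnt/pgtmp",       # RAM-backed FS
    "chown postgres:postgres /mnt/pgtmp",
    "/usr/pgsql-9.4/bin/initdb -D /mnt/pgtmp/data",      # default config kept
    "/usr/pgsql-9.4/bin/pg_ctl -D /mnt/pgtmp/data start",
]
```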

The PVTM simulator and GUI ran on a laptop with 16 GB of memory and 2 cores with hyperthreading enabled. The pvtm-agent ran on the SUT and used an ssh tunnel to reach the PVTM simulator over a VPN connection across the internet.
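The tunnel from the SUT to the simulator laptop might look like the following; host name, user, and port are hypothetical, as the source only says an ssh tunnel over a VPN was used:

```python
# Hypothetical ssh tunnel: forward a local port on the SUT to the PVTM
# simulator running on the laptop. The agent then connects to localhost.
TUNNEL_PORT = 9000  # assumption: the simulator's port is not given
tunnel_cmd = f"ssh -N -L {TUNNEL_PORT}:localhost:{TUNNEL_PORT} user@pvtm-laptop"
```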

PostgreSQL 9.4 Installation Procedures:

Here are the step-by-step install procedures for PostgreSQL 9.4:
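The original step-by-step list is not reproduced in this copy. A typical CentOS 7 sequence for installing PostgreSQL 9.4 from the PGDG repository looks like this (the repo RPM URL is deliberately left as a placeholder rather than guessed):

```python
# Typical PGDG-based install of PostgreSQL 9.4 on CentOS 7; the repo RPM
# URL is elided, and these are not necessarily the exact steps used in
# the original tests.
install_cmds = [
    "yum install -y <pgdg-centos94 repo rpm>",   # repo RPM URL elided
    "yum install -y postgresql94-server postgresql94-contrib",
    "/usr/pgsql-9.4/bin/postgresql94-setup initdb",
    "systemctl start postgresql-9.4",
]
```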


Test Procedures

The following test procedures were followed during every test:
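The exact pgbench flags are not reproduced in this copy. A plausible invocation for the dataset described earlier, assuming pgbench's default scale factor 1 (which creates 1 branch, 10 tellers, and 100,000 accounts), with client count and duration as pure assumptions:

```python
# Hypothetical pgbench commands; SCALE matches the dataset described in
# the text, but CLIENTS and DURATION are assumptions not stated in the source.
SCALE = 1          # 1 branch, 10 tellers, 100,000 accounts
CLIENTS = 16       # assumption
DURATION = 300     # seconds; assumption

init_cmd = f"pgbench -i -s {SCALE} pgbench"
run_cmd = f"pgbench -c {CLIENTS} -j {CLIENTS} -T {DURATION} -P 5 pgbench"
```

The `-P 5` flag produces the per-5-second statistics referred to in the Test Analysis section.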


Surprisingly, in the CentOS 7.0 version that we used, enabling the new OS NUMA affinity facilities nearly doubled the latency in the system. After running a few test cases, we settled on the following OS settings, which were used for all the comparative tests:
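The settings themselves are not reproduced in this copy. Purely as an illustration of the kind of NUMA knob CentOS 7 exposes (not necessarily the setting the original tests settled on), automatic NUMA balancing can be toggled via sysctl:

```python
# Illustrative only: disable the kernel's automatic NUMA balancing.
# This is an example of a CentOS 7 NUMA facility, not the documented
# setting from the original tests.
numa_off_cmd = "sysctl -w kernel.numa_balancing=0"
```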

When thread manager was being tested, the following additional steps were taken:

When the thread manager was not being tested (e.g. for the baseline tests), the following additional steps were taken to stop the PVTM agent, and remove the LD_PRELOAD dependency:
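The steps above can be sketched as an LD_PRELOAD toggle; the library path and name are hypothetical, since the source only says the LD_PRELOAD dependency was removed for baseline runs:

```python
# Sketch of toggling the PVTM interposition library. The path and file
# name are assumptions for illustration.
PVTM_LIB = "/opt/pvtm/libpvtm.so"  # hypothetical path

pvtm_env = f"export LD_PRELOAD={PVTM_LIB}"   # before PVTM test runs
baseline_env = "unset LD_PRELOAD"            # before baseline runs
```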


Test Results

There were a total of 38 test cases, with variations on several OS and PVTM parameters. Each test case was run three times, and the most representative run (i.e. the one whose results were closest to the average of the three runs) was used. For brevity, the summary shows only the best results we could get with the OS's native capabilities versus the best we could get with PVTM.
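The "most representative run" rule can be sketched directly: of the three runs, keep the one closest to their mean.

```python
# Pick the run (e.g. by TPS) closest to the mean of all runs.
def most_representative(runs):
    mean = sum(runs) / len(runs)
    return min(runs, key=lambda r: abs(r - mean))

most_representative([5.9, 6.1, 5.0])  # 5.9 is closest to the mean (~5.67)
```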

Here are the test results for the most representative test case where PVTM was not being used (note that the error in the second line is benign, and is simply reflecting the fact that we moved the LD_PRELOAD library to avoid tainting the results):

Here are the test results for the most representative test case where PVTM was being used:

Test Analysis

For brevity, the test results above suppress the statistics that are displayed every 5 seconds while the tests run; however, these statistics provide useful insight into the data. To make the patterns visible, they were plotted in two graphs showing the standard deviation at different scales, and in two graphs showing the transactions per second at different scales.
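Those per-interval statistics come from pgbench's progress lines, which can be parsed mechanically. A sketch, assuming the line format pgbench prints with -P in PostgreSQL 9.4 (the raw logs are not reproduced in this copy):

```python
import re

# Extract per-interval stats from a pgbench progress line of the form:
#   progress: 5.0 s, 5012.3 tps, lat 3.990 ms stddev 10.655
LINE = re.compile(
    r"progress: ([\d.]+) s, ([\d.]+) tps, lat ([\d.]+) ms stddev ([\d.]+)"
)

def parse_progress(line):
    m = LINE.match(line)
    if not m:
        return None
    t, tps, lat, stdev = map(float, m.groups())
    return {"t": t, "tps": tps, "lat": lat, "stdev": stdev}

sample = "progress: 5.0 s, 5012.3 tps, lat 3.990 ms stddev 10.655"
```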

The first two standard deviation graphs show how predictable the database transaction times are, with PVTM and without it (the Virgin OS); the smaller the number, the better, since less variation in query times is to be expected. Note that overall, PVTM reduced the standard deviation, and though it did have some spikes of activity, these were significantly smaller than the spikes with the Virgin OS settings.



A similar pattern emerges in the throughput graphs; in these graphs, the higher the value, the more transactions per second are being made. PVTM's throughput is overall better than the virgin OS's. The slight degradation at the beginning of the graph corresponds to the 'warm-up' period PVTM requires to recognize all the thread communication patterns. The 'dents' in throughput correspond to the same periods where the standard deviation was high. We did not explore the root cause of those spikes; however, the graphs clearly show that PVTM significantly reduces the amplitude of the dents compared with the native OS.





Though this is quite a synthetic test, it illustrates that PVTM can help increase database performance by roughly 20%. The key advantage of these tests is that they are easy to reproduce, enabling users to quickly gauge how well PVTM can help in their current environment.