Free Newsletters :

PostgreSQL 9 Advances Database Replication

New data replication, performance and in-place upgrade added in major update to open source database. Also, a peak ahead at PostgreSQL 9.1.

The open source PostgreSQL project is out with a major new database release this week adding in support for features that expand scalability, security and performance.

PostgreSQL 9.0 includes new features that developers have been asking about for years including replication and hot standby capabilities. The PostgreSQL 9.0 release continues the evolution of the 14 year old open source project for the new era of virtualization and cloud computing.

"If you asked PostgreSQL users three years ago what were their top ten list of missing features, built-in easy replication and in-place upgrades would have been on every single user's list," Josh Berkus, PostgreSQL core team member, told InternetNews.com. "Now, with 9.0, we have both of those things. Also, we've added some other features which require more than the usual amount of testing for upgrades, particularly regarding the stored procedure language PL/pgSQL. Having the "double 0" number (9.0.0) tells our users that."

The PostgreSQL 9.0 release is the first in the 9.x branch and follows the PostgreSQL 8.4 release which debuted in July of 2009.

Berkus explained that hot standby and streaming replication are two complimentary features which can be used either together or separately. Previous versions of PostgreSQL had a feature called 'warm standby' which enabled PostgreSQL users to have a dormant failover server by copying binary log files from the master to the standby.

"Hot standby improves this by allowing you to run read-only queries against the standby server, making it useful for reporting and load balancing," Berkus said.

Hot Standby however is an asynchronous type of replication which could potentially be hours behind the master database.

"This is where streaming replication comes in; it opens a database port connection between the master and the standby and ships over the data changes as they are synced on the master," Berkus said. "This means the standby can be as little as a few milliseconds behind."

Berkus said the tradeoff between streaming replication and hot standby is all about the load on the master database. With hot standby there isn't much load on the master, whereas streaming replication could add extra load.

"As part of testing the new replication, I benchmarked having six standby servers off of one master on Amazon EC2 and the resource utilization on the master was less than 10 percent higher than no standbys at all," Berkus said. "The incremental cost of adding new standbys is extremely low, which means that you can have a lot of them for 'bursty' traffic."

Berkus also noted that the new replication features in PostgreSQL 9.0 require much lower administration than older replication tools, allowing a single sysadmin or lead developer to manage a large cluster of replicated servers without needing a full-time database expert.

"For a cloud host, which might have hundreds of PostgreSQL nodes, that's essential," Berkus. "In addition to the replication, the security features in 9.0 are critical to multi-tenant environments, as well as the query planner improvements to make object-relational mapper queries execute better."

PostgreSQL 9.1

With PostgreSQL 9.0 now out the door, planning on the features for the next release are already underway.

"We have a whole slate of improvements to replication lined up for 9.1, the biggest of which is adding synchronous options," Berkus said. "We've also got teams working on enhanced security options, using label-based access control, which should make multi-tenanting more secure. And there are several clustering projects, like PostgresXC, which potentially could allow multi-master scaling when they reach production quality."

There is also a new technology under development called SQL/MED, which is a protocol for federating databases. VoIP vendor Skype introduced technology for PostgreSQL in 2006 called PL/Proxy for data partitioning that could serve as a starting point.

"Skype showed us all how one could use federated databases through PL/proxy to scale to 200 server nodes," Berkus said. "With SQL/MED, PostgreSQL users will be able to do this even with some of the nodes being other database systems. That feature may take more than one year to mature, however."

Sean Michael Kerner is a senior editor at InternetNews.com, the news service of Internet.com, the network for technology professionals.




Tags: open source, database, cloud computing, PostgreSQL, virtualization


0 Comments (click to add your comment)
Comment and Contribute

 


(Maximum characters: 1200). You have characters left.