Slony-I, the most widely used PostgreSQL replication tool, is inherently inferior to MySQL's built in replication for a number of reasons. First, it uses SQL and triggers to replicate the data across servers. This is considerably slower than MySQL's binary log shipping and makes the communication costs much higher. Second, Slony-I's communication costs grow quadratically in relation to the number of servers in the replication pool (Order(n^2)). This makes it inherently unusable for larger clusters. If we conservatively figure that Slony-I's SQL/trigger method takes twice as much communication as MySQL's binary log shipping, we can easily see how poorly this would work for larger clusters in the real world.
With two servers: MySQL: 2 = 2 PostgreSQL: 2*2^2 = 8
With 4 servers: MySQL: 4 = 4 PostgreSQL: 2*4^2 = 32
With 12 servers: MySQL: 12 = 12 PostgreSQL: 2*12^2 = 288.
While Slony-I is adequate for high availability with two servers, its communication costs are simply prohibitive for scaling out.
Note from Jan Wieck: I don't quite understand how the author of the above came up with those numbers. It seems to refer to the SYNC events and their confirmations. Slony does create these internal housekeeping messages on all nodes and they are transported to all other nodes. But it doesn't transport all user data changes back and forth in that fashion. These are small (maybe 30-100 byte messages) that occur once every couple of seconds. Claiming that with 2 nodes it has 4 times MySQL's communication volume or with 12 nodes 24 times that is outright wrong. Also, Slony does allow cascading of nodes, where one replica will act as a multiplier, serving more replicas in a star like cluster configuration. There is no need to pull directly from the origin.
Slony-I is also difficult to administer.
PGCluster is not useful for situations where high-performance and a decent amount of writes are to be expected. This is because it is a synchronous replication system which waits until a write has happened on all machines in the cluster rather. However, in situations that have very few writes and require data to be absolutely consistent across each database, PGCluster can be a good tool to use.
[edit] PostgreSQL Synchronous ReplicationPostgreSQL 9.1 comes with support for synchronous replication. It thus allows for the popular star-topology configurations with one write and multiple read-only slaves with slaves capable of stepping up in case the primary goes down.
[edit] MySQLMySQL ships with support for asynchronous replication. In this form of replication a log of events is transmitted to the slaves. These slaves must apply the statements or rows in this log to each of the slave servers independently. This has the limitation that each slave may have a slightly different view of the data depending upon the length of their lag in transferring and applying this log. Prior to 5.1 non-deterministic statements in this log can cause records to be inserted or updated differently on each slave [8]
Starting with version 5.1, MySQL supports two forms of replication; statement based replication (SBR) and row based replication (RBR). SBR, used prior to 5.1, collects SQL queries which affect changes to the database in a binary log which the slave servers subscribe to for their changes. RBR instead records the incremental row changes themselves in the binary log that are then applied to the slave. RBR is used automatically when non-deterministic queries are executed on the master. The storage engines NDB and InnoDB, in certain cases, only support replication using this new row based binlog format. [9]
Starting with version 5.5 MySQL supports semi-synchronous [10] replication in which the master confirms receipt of the log by at least one slave before returning from a COMMIT. This allows for greater data integrity and simplified slave promotion in the event of a catastrophic failure of the master.
Within the NDB storage engine MySQL supports scalable synchronous replication between ndbd nodes using a two phase commit protocol which does not rely upon a binary log. The two phase commit protocol still exhibits excellent performance due to the in-memory nature of NDB. Once an update has been committed to memory on two nodes it is considered durable. [11] Replication between two NDB clusters or between NDB and tables in another engine are possible via the asynchronous replication that comes standard with MySQL.
In addition to the built-in asynchronous and semi-synchronous replication methods provided by MySQL Server, additional 3rd party solutions exists to provide synchronous replication.
热门源码