I’m currently working as a solution architect for the SCP group in the document
management competency center.
Our document management system is built on Alfresco, a product
based on Java technologies and a MySQL DB running on
Since the document platform is used by the entire enterprise, we had to implement a
clustering solution for the Alfresco platform in order to provide both scalability
and high availability.
Until recently we had only one MySQL instance, so we wanted to provide a high
availability solution for MySQL as well.
The high availability option currently used by the SCP group for MySQL is
MySQL replication, although the infrastructure team is also currently looking
at Galera: http://codership.com/content/using-galera-cluster
MySQL replication asynchronously copies data from one server, called the master, to other
servers, called slaves. Updates are performed only on the master server, while
read operations can also be performed on the slave servers.
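Because replication is asynchronous, a slave can lag behind the master or stop replicating altogether. The following is a minimal sketch of how the output of SHOW SLAVE STATUS could be evaluated, assuming its columns have already been read into a map. The class name, the method name and the lag threshold are illustrative assumptions and not part of our setup; only the column names (Slave_IO_Running, Slave_SQL_Running, Seconds_Behind_Master) come from MySQL.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: decides whether a slave looks healthy based on
// selected columns of SHOW SLAVE STATUS, passed in as strings.
final class ReplicaHealthSketch {

    // Returns true only when both replication threads report "Yes" and the
    // reported lag is at most maxLagSeconds. A missing, NULL or non-numeric
    // lag value is treated as unhealthy.
    static boolean isHealthy(Map<String, String> slaveStatus, long maxLagSeconds) {
        boolean ioRunning = "Yes".equals(slaveStatus.get("Slave_IO_Running"));
        boolean sqlRunning = "Yes".equals(slaveStatus.get("Slave_SQL_Running"));
        String lag = slaveStatus.get("Seconds_Behind_Master");
        if (!ioRunning || !sqlRunning || lag == null || lag.equalsIgnoreCase("NULL")) {
            return false;
        }
        try {
            return Long.parseLong(lag.trim()) <= maxLagSeconds;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Example values only; a real check would read them from the slave.
        Map<String, String> status = new HashMap<>();
        status.put("Slave_IO_Running", "Yes");
        status.put("Slave_SQL_Running", "Yes");
        status.put("Seconds_Behind_Master", "3");
        System.out.println("healthy: " + isHealthy(status, 30));
    }
}
```

A check like this is mainly useful before trusting a slave for reads or backups, since a stopped or lagging slave will serve stale data.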
There are several benefits to using a master-slave replication configuration:
- Scaling – the application can be configured to perform updates on the master server while performing read
operations on the slave(s).
- Fail over – if the master fails, the application can be configured to fail over to the slave server using
the MySQL JDBC driver. The slave will be read only, but at least the system is
still functional for browsing.
- Disaster recovery – since data is replicated, in case of a major hardware problem on the master, the data is
available on the slave, which can be configured to be the master.
- Backup – backup procedures can be carried out on the slave server, taking live data snapshots without
affecting the performance of the master. This is a best practice recommendation
for MySQL replication.
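To make the scaling benefit above concrete, here is a minimal sketch of read/write routing, assuming the application itself decides which host each operation goes to. The class, the method and the host names are hypothetical; this is not something Alfresco provides, only an illustration of the idea.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical router: writes always go to the master, reads are spread
// over the slaves in round-robin order.
final class ReadWriteRouterSketch {
    private final String master;
    private final List<String> slaves;
    private final AtomicInteger next = new AtomicInteger();

    ReadWriteRouterSketch(String master, List<String> slaves) {
        this.master = master;
        this.slaves = slaves;
    }

    // Returns the host that should serve the operation. Falls back to the
    // master for reads when no slave is configured.
    String hostFor(boolean readOnly) {
        if (!readOnly || slaves.isEmpty()) {
            return master;
        }
        int index = Math.floorMod(next.getAndIncrement(), slaves.size());
        return slaves.get(index);
    }

    public static void main(String[] args) {
        // Host names are placeholders.
        ReadWriteRouterSketch router = new ReadWriteRouterSketch(
                "db-master.example", Arrays.asList("db-slave1.example", "db-slave2.example"));
        System.out.println("write -> " + router.hostFor(false));
        System.out.println("read  -> " + router.hostFor(true));
        System.out.println("read  -> " + router.hostFor(true));
    }
}
```

Since replication is asynchronous, a read routed to a slave may not yet see a write that was just made on the master, so this pattern only suits reads that tolerate slightly stale data.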
The Alfresco Implementation
Since Alfresco is a product by a 3rd party vendor, we have limited control over the application’s setup and configuration. Therefore, the scaling benefit cannot be used unless Alfresco itself supports splitting reads and writes between servers.
We tested failover by configuring the JDBC driver URL for failover; however, the application was not consistently stable once it had failed over to a read only system. This, again, is a result of the Alfresco implementation and architecture, which we cannot control.
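For reference, a failover URL for the MySQL JDBC driver (Connector/J) lists the master first, followed by the slave. The sketch below only builds such a URL as a string. The host names, port and database name are placeholders and not our actual configuration; failOverReadOnly and secondsBeforeRetryMaster are Connector/J connection properties, shown here with what we understand to be their default values.

```java
// Hypothetical helper that assembles a Connector/J failover URL.
final class FailoverUrlSketch {

    // The first host is the primary (master); the second is used when the
    // primary is unreachable. With failOverReadOnly=true the failed-over
    // connection is read only.
    static String failoverUrl(String primaryHost, String secondaryHost, int port, String database) {
        return "jdbc:mysql://" + primaryHost + ":" + port
                + "," + secondaryHost + ":" + port
                + "/" + database
                + "?failOverReadOnly=true"
                + "&secondsBeforeRetryMaster=30";
    }

    public static void main(String[] args) {
        // Placeholder hosts and database name.
        System.out.println(failoverUrl("db-master.example", "db-slave.example", 3306, "alfresco"));
    }
}
```

An application that writes as part of normal browsing will see those writes fail on a read only connection, which may explain behaviour like the instability described above.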
So we decided to set aside the failover option with the replication setup and to implement failover once we have another solution in place.
We changed our backup procedure to take backups from the slave server in order to make use of the new replication setup.
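As an illustration of pointing a backup at the slave, the sketch below assembles a mysqldump command line that targets the slave host. It assumes mysqldump is the backup tool, which this post does not state; the host, user and file names are placeholders, and credentials are deliberately left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that builds (but does not run) a mysqldump command
// aimed at the slave instead of the master.
final class SlaveBackupSketch {

    static List<String> dumpCommand(String slaveHost, String user, String outputFile) {
        List<String> command = new ArrayList<>();
        command.add("mysqldump");
        command.add("--host=" + slaveHost);        // connect to the slave, not the master
        command.add("--user=" + user);
        command.add("--single-transaction");       // consistent snapshot for InnoDB tables
        command.add("--all-databases");
        command.add("--result-file=" + outputFile);
        return command;
    }

    public static void main(String[] args) {
        // Placeholder values; the list could be handed to ProcessBuilder.
        List<String> command = dumpCommand("db-slave.example", "backup", "/tmp/alfresco-backup.sql");
        System.out.println(String.join(" ", command));
    }
}
```

Running the dump against the slave keeps the extra read load off the master, which is the point of the backup benefit listed earlier.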
The MySQL replication setup was automated using Chef scripts.
Mysql replication – http://dev.mysql.com/doc/refman/5.5/en/replication.html
Mysql JDBC driver documentation – http://dev.mysql.com/doc/refman/5.5/en/connector-j-reference-configuration-properties.html