Skip to Content

In the A Deeper Look At Sybase(i.e. the non-mobile products), in this episode we’ll take a look at Sybase Replication Server.

Sybase Replication Server -or just RepServer for its friends- is one of the hidden gems of Sybase.
The reason you may not have heard of RepServer is that this is one of those pieces of infrastructural software you tend to forget about because it is always running just fine. Database Administrators may have many issues to deal with every day, but Sybase RepServer tends not to to be among the things they worry about.
What doesn’t help its name recognition is that RepServer does not perform a particularly exciting job in the eyes of most people. Sybase Replication Server is a ‘data movement’ product, which, well, moves data from one place to another. More specifically, it captures database transactions in one database and then applies these to another database. Even to many IT folks, that may sound as interesting as watching paint dry. Myself, I do actually find replication quite an exciting topic (but maybe that just says a lot about me).

Reasons for using replication

This sort of requirement was in fact the inspiration for RepServer in the first place: A Deeper Look At Sybase: History of ASE approached Sybase in the early 1990’s to see how their requirements could be met, and this led to the product known as Sybase Replication Server (I believe the first RepServer version appeared in 1992).

Because RepServer captures only data changes, there is no impact on the applications already running on that database. This means that RepServer can be used in pretty much any database system. Some common uses of RepServer are to facilitate migrations of databases or applications that require significant downtime; by setting up a replica with Repserver, this downtime can be almost fully eliminated.
Because of its application transparency, Replication may also be seen as a mechanism to integrate different applications, by synchronising the data changes.

An increasingly popular use case for replication is for enabling real-time Business Intelligence. Business systems are typically OLTP-oriented with primary business events in the form of database transactions. Analytics typically does not co-exist well with OLTP workload in the same database, so a separate analytics database is used for that purpose; A Deeper Look At Sybase: Sybase IQ, Sybase’s column-oriented database optimized for analytics, is a classic example.
Architecturally, the challenge is to move that transactional data into the analytics database — preferably in (near) real-time, since the days when it was acceptable to run analytics on last week’s data are long gone.

In this context, RepServer has traditionally been used by many customers to replicate transactional data from Sybase ASE (Sybase’s OLTP database, which, BTW,

SAP on Sybase ASE

) to a data mart or data warehouse powered by A Deeper Look At Sybase: Sybase IQ. The problem here has been that IQ is optimized for bulk operations and does not particularly excel at inserts/updates of individual rows, which is precisely the type of workload that RepServer sends to the replicate database. The common solution for this problem was to replicate (with RepServer) those primary transactions into an intermediate ASE database that accumulates the incoming transactions; periodically, (e.g. every N minutes) those changes were then bulk-loaded into IQ from this ASE staging database (the latter step does not involve RepServer).
This has worked very well but things have become decidely more elegant with the latest and greatest RepServer version 15.6: here, the accumulation of replicated transactions is done by RepServer itself, and RepServer also initiates the periodic bulk-uploading of those changes into IQ. This removes the need for a separate ASE staging database as well as for an external process to regularly import the data into IQ: these tasks are now all transparently handled by the RepServer process itself, making real-time Business Intelligence a lot easier to achieve. 
(NB: should you wish to discuss this topic with a Sybase/SAP representative, ask about Replication Server’s Real-Time Loading option (RTL). Incidentally, this RTL option is also available for Oracle).

How RepServer works

RepServer replicates committed database transactions. In Sybase ASE, a built-in replication agent continuously monitors the database transaction log for newly committed transactions that have been marked for replication. When it finds one, the replication agent sends it to the RepServer process that runs somewhere outside ASE. RepServer will then forward the transaction to the designated replicate database and apply it.
It should be noted that that replicate database does not have to be Sybase ASE. In fact, neither does the primary. One of RepServer’s design strengths is that the replication agent sends the transaction to RepServer in a database-independent internal format. When a replicated transaction reaches the replicate database, the internal format is converted to the SQL dialect of that specific  database. This way, you can replicate from, say, a primary DB2 database to a replicate Oracle database as well as to a replicate Microsoft SQL Server database — and vice versa, since biderectional replication is also supported.
Supported primary database types include DB2, Oracle and MS SQL Server, where the replication agent will be a separate process running externally to these databases. These same databases can also be the target of replication, as can be SQL Anywhere, Sybase IQ, and some others. Perhaps not suprisingly, for best overall performance and minimal latency, here at Sybase we recommend, well, Sybase ASE as the preferred primary database, and ASE or IQ as the replicate database.

RepServer uses a publish-subscribe model. On the primary side, the entity to be replicated is published, and this can be subscribed to by one or more replicate databases. Such an entity can be an individual database table, or an entire database. The latter is often used for ‘warm standby’ configurations where all changes in a database are replicated to a standby. In addition, a stored procedure can be marked for replication, causing the execution of that procedure to be replicated with its actual parameter values.
As you would expect, by default an insert on the primary side is replicated into exactly the same insert on the replicate side. Perhaps less expected is that RepServer allows you to map a primary event to any custom action on the replicate side. For example, that primary insert could be replicated as the invocation of a custom stored procedure on the replicate side, which receives the values specified in the original insert as its parameter values. Or a delete on the primary side can be converted into the same delete plus an insert into a history table on the replicate side. RepServer just gives you total freedom on what replicate actions can be defined — as long as it can be expressed in SQL.

RepServer performs asynchronous replication. This means that when the database transaction in the primary database commits, it will be picked up and applied to the replicate database shortly afterwards, decoupled from the primary transaction. Consequently, when using asynchronous replication, there is always somelatency: some time will pass before a primary transaction gets been applied to the replicate side The challenge is to keep this latency as short as possible. In a well-designed and well-tuned replication system, latencies of no more than single-digit seconds are typical, even when the replicate database is on a WAN.
An unwillingness to accept latency is sometimes used as an argument in favor of a synchronous replication approach. Here, the idea is that the primary transaction does not commit until the transaction has committed in the replicate database as well. The big downside is that the replicate site must always be available: if it is not, transactions cannot commit at the primary site. In practice, this makes the overall system more vulnerable, since there are now two points of failure. RepServer’s asynchronous replication offers an optimal compromise between latency and availability.

Another aspect worth discussing briefly is that a typical replication system operates in a networked environment. For example, let’s assume the primary database runs on an HP-UX server in New York; the RepServer runs on a Linux box in the same data center; one replicate database run on an AIX server in Houston, Texas; and another replicate database runs on Linux in London, UK.
RepServer is designed under the assumption that it is normal for network connections to temporarily fail, and then come back into service. Once a connection fails, RepServer will automatically keep retrying until the connection is back, and then proceed replication where it left off.

RepServer vs. storage replication

It is important to mention here that RepServer performs transactional replication, which captures database transactions and applies these to the replicate or standby database. This should be distinguished from storage-level replication solutions (like EMC’s SRDF) which replicate changes to the disk devices. While both are ‘replication’ and, from a  high level, kinda do the same thing, it is important to realise that these are two fundamentally different approaches towards replication.
Storage replication can potentially be a bit simpler to set up and operate than transactional replication with RepServer, but there are two main implications you need to be aware of. First, with storage replication, the replicate side is essentially a cold standby: those replicated storage devices cannot be used for any purpose, including running a database server, until you decide to switch over to the replicate side. In contrast, with RepServer your replicate side is an actively running database server that you can use for other purposes while it is being replicated to. For example, it is not uncommon to see an ASE database replicated with RepServer to a standby database for high-availability purposes, and use that standby database to also run some reporting that would have been too much load to run on the primary database.
A second difference is that, should any device corruption occur on the primary side, this corruption will be literally replicated to the replicate storage as well. This may not be what you want, since now your standby database is corrupted too (the sort of thing that can happen thenis described here). With transactional replication by RepServer, this problem will not occur because the replicate database is updated through SQL statements so any device-level corruptions will not be duplicated.

Finally…

After singing RepServer’s many praises you may wonder if there may be any downsides?

Well, frankly, there are: I think pretty much everyone agrees that RepServer is not the easiest product to configure and manage. While that can also be an attraction -personally, I like RepServer ‘cos it forces me to think– I think everyone also agrees that RepServer needs to be made easier to use. The good news is that this is already happening, with the past few RepServer releases introducing new features for ease of management. It is Sybase’s intention to follow through on this direction in future RepServer releases.

If you are like me, and you feel all fired up now about this great piece of software technology, you could check out the Replication Server Design Guide which discusses a number of different possible architectures for a replication system. Or just check out some RepServer whitepapers and success stories . Or just do as many people do: forget about Replication Server’s existence in the confidence that it sits there at the foundations of many IT systems, working away perfectly all the time — as long as you tell your boss about RepServer first.

Next episode: A Deeper Look At Sybase: PowerBuilder and PowerDesigner

To report this post you need to login first.

12 Comments

You must be Logged on to comment or reply to a post.

  1. Markus Doehr
    We run (all) our systems on SAPs MaxDB and according to this blog only DB2, Oracle and MS-SQL are supported.

    AFAIK RepServer is used in certain versions of HANA to replicate data to the HANA box by capturing the transactions.

    Does that consequently mean that in case we would implement HANA at some point we would have first to migrate to a supported database?


    Markus

    (0) 
    1. Rob Verschoor Post author
      Support for HANA as a replicate database is planned, but not yet available (I cannot currently comment on when it will be). In fact, during Vishal Sikka’s TechEd 2010 keynote, a demo was shown where data was replicated into HANA in real-time; this was actually done with a prototype of RepServer-for-HANA.
      As for MaxDB, no replication agent for MaxDB is available today, so you could not use it as a primary database right now. Would you consider using RepServer if a MaxDB repagent was available?
      (0) 
      1. Markus Doehr
        That would be our approach of a smooth HANA implementation (when it’s available).

        So considering that we won’t be able to implement it, right?


        Markus

        (0) 
        1. Rob Verschoor Post author
          So long as there is no replication agent available for MaxDB, indeed you won’t be able to use MaxDB as a primary database in the context of Sybase Replication Server.
          (0) 
    1. Rob Verschoor Post author

      Gary,

      Storage replication like SRDF is disk-block replication: the storage unit replicates blocks of disk that are changed to a remote storage unit. Best Google on ‘SRDF’ for more technical information.  SRDF is EMC’s storage replication brand; IBM, Hitachi and others all have somethign similar.

      The thing to remember is that -apart from what is already written above- is that this type of replication does not look at database transactions, it just looks at changes to the disks. The reason why storage replication has become so popular is that t simplifies the overall IT picture compared with a product like Sybase Replication Server which requires more configuration, setup and monitoring, so storage replication is attractive to CIOs (despite the often significant price tag). The downsides (see above) are however rarely discussed.

      For more information on the transactional repplication by Replication Server, see the Design Guide (link at the end of the article).

      PS. I noticed the article had become messed up by the SCN content mgmt system; the HTML tags were all messed up which made it hard to read. It’s fixed now.

      (0) 
  2. Kenny Tan

    Hi Rob,

    I am an Oracle DBA and have recently started with Sybase. I have a question regarding the version of RepServer and ASE.

    We are planning to migrate our database to ASE 15.7. But our RepServer is running on 15.0.3.  Do we have to upgrade the RepServer to match ASE 15.7 as well?

    Thanks,

    Best regards,

    Kenny

    (0) 
    1. Dale Newth

      Hi Kenny,

      The Replication Server functionality closely mirrors the capabilities of ASE. You can use a newer verision of ASE with an older version of Replication Server, but would need to configure the replication agents in the ASE to ‘skip unsupported features’. This will prevent the ASE rep agent from sending the Replication Server information that the particular version of the Replication Server will not understand. Running your replication system with these different versions will limit you to the functionality of the 15.0.3 Replication Server. There may be new functionality in the newer version of Replication Server that you will want to take advantage of, but cannot do so until you upgrade your Replication Server.

      Regards,

      Dale Newth

      (0) 
  3. Kenny Tan

    Hi Dale, thanks for the information.

    Hi Bert, thanks for the link, will read up about replication from it.

    Regards,

    Kenny

    (0) 

Leave a Reply