Synchronization
Bysoft | January 10, 2012
Synchronization is a typical problem for e-commerce web sites. The online store has to be synchronized with a number of other applications that are more suited to handle different tasks: CRM, ERP, inventory, accounting, call center, warehouse, catalog, you name it.
Synchronization is a difficult problem. Though it sounds easy to whoever coined the term, it is in fact an ongoing challenge for developers once you take performance and consistency into consideration.
Consistency
Consistency means that the remote system and the e-commerce site always hold the same level of information. What is in the CRM is in the web site, and what is in the web site is also in the CRM.
Or not. Sometimes the synchronization is deliberately one-way: information comes down from one of the servers and is used by the second one. The second system must treat the information as read-only: it cannot modify anything, as the first system is the de facto master. This is typical of two kinds of situations. The first is when the e-commerce web site evolves as a side project to a larger internal system. This happens a lot when retailers move to the Web: they already have an existing system, so the e-commerce site takes all its information from the original system and displays it on the web. The other situation is when the internal system is seen as strategic, and the company does not want the web site to run directly on it, for performance, security or organizational reasons. Synchronization is set up so as to decouple the two.
This scheme is the simplest. Once it is in place, two-way synchronization rears its ugly head. Sooner or later, information from the web site will need to be made available to the other system: the CRM will need to see the updated information of any contact profile, just as the web site will need to have its contact information updated. Or the web site will produce new data, such as view counts for each product, that the ERP will want to use in order to plan the next round of purchasing.
One-way and two-way synchronization are the two main approaches. Hybrid solutions include double one-way synchronization (each server is master for a distinct set of data), synchronization with moderation (one of the servers is master and collects data for conflict resolution before pushing it back to the second server), and other variations.
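To make the "double one-way" hybrid concrete, here is a minimal sketch in Python. The field names and the ownership table are hypothetical, made up for illustration: each field has exactly one master, and the agreed record only ever takes a field from its master.

```python
# Hypothetical ownership table: each field has exactly one master system.
FIELD_MASTER = {
    "email": "web",          # assumed to be edited by customers online
    "address": "web",
    "credit_limit": "crm",   # assumed to be managed in the CRM only
    "segment": "crm",
}


def merge_record(web_record, crm_record):
    """Build the agreed record by taking each field from its master only."""
    sources = {"web": web_record, "crm": crm_record}
    merged = {}
    for field, master in FIELD_MASTER.items():
        if field in sources[master]:
            merged[field] = sources[master][field]
    return merged
```

With this layout there is no conflict to resolve: a value written by the system that is not the master for that field is simply ignored.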
One rule: keep this simple, as the second aspect of synchronization is going to be difficult.
Synchronous or asynchronous
The approach chosen above must also take into account how fast the information has to be transferred. The transfer is synchronous when both servers update their information at the same time. This is typically the case for a wire transfer: both banks validate the wire transfer, or neither does. There can be no “inconsistent” period during which one bank has sent the money and the other hasn’t acknowledged it.
Asynchronous transfer introduces a simple concept: delay. One of the servers will have the information and will relay it to the other later, so there will be a time difference between those two operations.
The main advantage of this approach is that it avoids transferring the traffic load to the remote resource. Basically, when the two servers are synchronous, any hit on one of them translates into another hit on the second server, so traffic on the first server is passed on to the second. When it comes to web applications, this is not a good idea.
Besides, synchronous transfer usually means working unit by unit: every time a customer updates their contact information, the information is sent to the CRM. Asynchronous systems, on the other hand, accept batch work. Instead of sending the requests one by one, the original server stacks several of them in a batch: the transfer runs less often, and probably faster.
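The batching idea can be sketched in a few lines of Python. This is only an illustration: `UpdateBatcher` and the `send_batch` callable are hypothetical names, standing in for whatever actually talks to the CRM.

```python
class UpdateBatcher:
    """Collects contact updates and sends them together (hypothetical helper)."""

    def __init__(self, send_batch):
        # send_batch is an assumed callable taking a list of (customer_id, data).
        self._send_batch = send_batch
        self._pending = {}

    def record(self, customer_id, data):
        # A later update for the same customer replaces the earlier one,
        # so only the latest state is kept for the next batch.
        self._pending[customer_id] = data

    def flush(self):
        """Send everything that is waiting; return the number of records sent."""
        if not self._pending:
            return 0
        batch = list(self._pending.items())
        # If send_batch raises, the pending updates are kept for the next run.
        self._send_batch(batch)
        self._pending.clear()
        return len(batch)
```

A scheduled job would call `flush()` at whatever interval the business accepts. Keeping only the latest update per customer is a design choice of this sketch, not a requirement of batching.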
Delays on my app!
Usually, when asynchronous transfer is presented to project owners, there is a bit of misunderstanding. The concept is often met with furious denial: I need this data right away! However, business timing most often allows for such a delay. In fact, it accepts it most of the time. Let us review some examples.
Updating catalog descriptions is usually a daily routine. It is rather infrequent that they have to be updated several times a day. Benign errors, such as spelling mistakes, may be fixed manually, while batch updates carrying several changes can wait for the next business day. As such, the update is best done at night, when traffic is low, so that all caches can be reset without pressure. Solution: asynchronous, once a day.
Contact information is indeed something that cannot wait a day. Once the information has been modified, it is often important to have it transferred promptly to the CRM. Most of the time, a delay of 5 minutes to 1 hour is still OK. For one, this allows the online user to change their mind and edit the contact again, which avoids a double send. Secondly, it is unusual to need up-to-the-second information in the CRM. Even if the CRM is on the verge of a major mailing, a 5-minute delay will not have a major impact. And if the contacts are processed manually in the CRM, 1 hour is even a normal processing delay: no need to rush. Solution: asynchronous, every 5 to 60 minutes.
Invoices and shipping orders are trickier. Shipping the same day requires the information in time, which prevents us from using a long delay. A 5-minute delay will still be OK, though: it allows batching without dropping any last-minute order. Adapting the schedule to run more often close to the shipping deadline, and less often far from it, may be a good idea: the frequency can be a variable function of time. Solution: asynchronous, variable frequency.
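A variable frequency can be as simple as a function that returns the waiting time before the next run. The sketch below is one possible shape, in Python. The thresholds (5 minutes, 1 hour, a 2-hour window before the cutoff) and the name `sync_interval` are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta


def sync_interval(now, cutoff,
                  near=timedelta(minutes=5),
                  far=timedelta(hours=1),
                  window=timedelta(hours=2)):
    """Return how long to wait before the next order synchronization.

    Assumed policy: run every `near` during the `window` that precedes the
    shipping cutoff, and every `far` the rest of the time.
    """
    remaining = cutoff - now
    if timedelta(0) <= remaining <= window:
        return near
    return far
```

For example, with a 16:00 shipping cutoff, a run at 9:00 would schedule the next one an hour later, while a run at 15:00 would schedule it 5 minutes later.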
Inventory checks are probably one of the only situations that need synchronous transfer. You don’t want to book a product that will be out of stock. Not only does availability have to be checked, but the booking also needs to be confirmed. The warehouse will handle the situation on its own: sending purchase orders automatically below certain stock levels, stopping sales a few items before stock runs out, etc. Solution: synchronous.
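The point about checking and booking together can be illustrated with a small Python sketch. The `Warehouse` class is an in-memory stand-in invented for this example. A real warehouse system would expose its own API, but the idea is the same: the check and the booking happen in one step, so two orders cannot both grab the last item.

```python
import threading


class Warehouse:
    """In-memory stand-in for the remote inventory system (an assumption)."""

    def __init__(self, stock):
        self._stock = dict(stock)
        self._lock = threading.Lock()

    def reserve(self, sku, quantity):
        """Check availability and book in one step; True only if booked."""
        with self._lock:
            available = self._stock.get(sku, 0)
            if quantity <= 0 or available < quantity:
                return False
            self._stock[sku] = available - quantity
            return True
```

The web site would call `reserve()` while the customer waits, and only confirm the order when it returns `True`.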
Capacity planning
Capacity planning is easier for asynchronous than for synchronous transfer. Since traffic is decoupled from synchronization, you can choose the size of the processing yourself. Ideally, each run deals with all the waiting objects and makes use of any batch command available on the remote system. Think of a bulk load (MySQL’s LOAD DATA INFILE, for example) rather than row-by-row INSERT INTO statements. Whenever the load is too much for the remote system, you can make the batches smaller or, strangely enough, run them in parallel (several smaller updates at the same time).
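As a rough illustration of sizing the batches, the Python sketch below pushes rows in chunks of a chosen size. It uses SQLite and `executemany` as a stand-in for the remote system's bulk command, and the `product` table and its columns are hypothetical.

```python
import sqlite3


def chunks(rows, size):
    """Yield successive slices of at most `size` rows."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def push_in_batches(conn, rows, batch_size=500):
    """Insert rows batch by batch; return the number of batches used."""
    count = 0
    for batch in chunks(rows, batch_size):
        with conn:  # one transaction per batch, committed on success
            conn.executemany(
                "INSERT INTO product (sku, description) VALUES (?, ?)", batch
            )
        count += 1
    return count
```

Lowering `batch_size` is the "make it smaller" knob. Running several such batches at the same time would be the parallel variant, which this sketch does not attempt.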
As for synchronous systems, there is only one hope: make the remote system fast enough. For this, one has to estimate the amount of traffic the web site will actually send to it. One suggestion is to use an equation such as this one:
Traffic * conversion * average_hits * security
Traffic is anything you want: average traffic, peak traffic, etc. Ideally, you will provide both, but this is often a luxury.
Conversion is the share of your traffic that will end up making a remote call. For example, your e-shop may have 1000 users a day but make 10 sales a day. Your conversion here is 1/100.
Average_hits is the number of hits each session makes on the remote servers. For example, each sale will usually hold 2 products, so it will need 2 checks on the remote servers. 2 is actually a good estimate here.
Security is the often overlooked safety factor that covers any unforeseen problem. Here, we may expect sudden sales peaks that bring 10 times more orders, so we can use 10 as security.
Finally, for our simple example, a traffic of 1000 visitors a day will require 1000 * 1/100 * 2 * 10 = 200 hits a day. If our expected peak traffic were 1000 visitors a second, we would need 200 hits a second on the remote server.
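The same arithmetic, written as a small Python function so that other scenarios can be plugged in. The function name `remote_hits` is made up for this sketch, and `Fraction` is used only to keep the 1/100 conversion exact.

```python
from fractions import Fraction


def remote_hits(traffic, conversion, average_hits, security):
    """Estimated hits on the remote system, per the same unit of time as traffic."""
    return traffic * conversion * average_hits * security


# The example from the text: 1000 visitors, 1/100 conversion,
# 2 checks per sale, safety factor of 10.
example = remote_hits(1000, Fraction(1, 100), 2, 10)
```

Whether the result reads as hits per day or hits per second depends only on the unit used for the traffic figure.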
Once you have the target traffic for the remote system, you must require the remote system to answer as fast as needed. This is an extremely important point, as usually no check is done whatsoever. With the target traffic, you can run performance tests and make sure that the remote server can cope with your traffic and that the delays are not too long. The performance of the remote system will reflect directly on your e-commerce web site.
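A very small check of this kind could look like the Python sketch below. It is a deliberately naive illustration: `call` stands for any function that performs one request on the remote system (an assumption), requests are sent one after the other, and a real performance test would send them concurrently at the target rate.

```python
import time


def measure_latency(call, requests):
    """Run `call()` `requests` times; return (worst, average) latency in seconds."""
    if requests <= 0:
        raise ValueError("requests must be positive")
    timings = []
    for _ in range(requests):
        start = time.perf_counter()
        call()
        timings.append(time.perf_counter() - start)
    return max(timings), sum(timings) / len(timings)


def fast_enough(call, requests, max_latency):
    """True if the slowest of the measured calls stayed under max_latency seconds."""
    worst, _average = measure_latency(call, requests)
    return worst <= max_latency
```

The acceptable `max_latency` is a business decision: it is the extra waiting time you are willing to add to each page that needs the remote system.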
Conclusion
In conclusion, do not underestimate synchronization. As usual in IT, such systems are prone to feature creep: at the beginning it seems obvious that one-way replication is sufficient, then a hybrid is set up, then full-blown two-way synchronization is expected, and finally performance is taken into account. Each phase has its own challenges.
Damien Seguy





