Integration series: Messaging

Last time we spoke about some integration methods we can use.

As we see, there are methods that are not so tight coupled, being able to generate lots of little data packages (like file transfer), easily synchronizable (like shared database), details of storage’s structure hidden from applications (unlike shared database) and being able to send data to invoke behavior in other app (like RPI) but with being resistant to failure (unlike RPI).

And here messaging comes to play. The rules are simple: you create message, send it to message channel and someone waiting for this kind of message will get it. While it has some problems on it’s own, it is reliable, fequent, fast and asynchronous and.

  1. Being asynchronous means you won’t block process while waiting for the result/answer. Calling app can continue with it’s work.
  2. Decoupling. Messages will be sent to message channel without knowing almost anything about receiver. The common interface are the types of messages sent, not the bidings between apps. It also allows separation integration developement from application developement.
  3. Frequent, small messages allow applications to behave almost immediatly by sending more messages.

And many more we’ll explore in the series. Why I will write a series on it? The main disadvantage of messaging is the learning curve. While other methods are fairly easy to use, messaging and async thinking is not something we’re used to. But once learned this concepts will help you not only when integrating lots of enormous applications. You can also apply it to “integrate” classes/functions/actors in your code.

Introduction to integration

I started to get more into integration and integration patterns. There are few reasons:

  • Open Settlers II will be created with integration with possible UI integration in mind
  • It will be helpful in my daily job
  • I feel that it’s an important topic in software engineering

Having this set up, let’s briefly talk about some integration methods.

File transfer

We want two (or more) applications to exchange data. We can use simplest solution – write it to file for others to read. (Almost?) every language non-esoteric lanuage has some file read/write function built in. It is also easy to do no matter what environment you’re working with. Coupling is not so tight as application devs can (should?) agree on common file format(s) to work with. Changes in code won’t change the communication as long as output file is the same. With json it’s easier than ever. Even with third party apps it’s still trivial to consume messages from software we don’t have influence on.

There are also some downsides, too. There is a lot of work with deciding on file structure, file processing. Not too mention storage place, naming conventions, delivering file (if one app doesn’t have rights to output location of other), times of reading/writing (and what will happen if one reads while the other writes). But it all is nothing compared to one big problem. Changes propagate slowly (one system can produce file overnight, after “collection” of other). Desynchronization is common and it’s easy for corrupted data be spread before any validation (if it’s even possible).

Shared database

Shared database is a remedy for the synchronization problem. All data is in one central database, so information propagates instantly. Databases also have transaction mechanism to prevent some reading/writing-while-writing errors. You also don’t need to worry about different file formats.

But it also comes with a price. It’s difficult to design a shared database. Usually tables are designed to fit different applications and are a pain to work with. Worse if we’re talking enterprise level solutions and some critial app – its needs will be put higher, making work for others harder. After creating database design there’s a tendency to leave it as it is – changes can be hard to follow. Another problem is third-party software. It will usually work with its own design and it may change with newer verions. Database itself can become a perfomance bottleneck

Remote Procedure Invocation

Sometimes sharing data is not enough, because data changes may require actions in different applications. Think changing address at goverment service – there are a lot of adustments and documents to be generated. Apps maintain integrity of data it owns. It also can modify it without affecting other appliactions. Multiple interfaces to CRUD data can be created (e.g. few methods to update data, depending on caller), which can prevent semantic dissonance and enforces encapsulation.

It may loosen the coupling, but it’s still quite tight. In particular doing things in particular order can lead to muddy mess. While developers know how to write procedures (it’s what we do all the time, right?) and it may seem like a good thing it’s actually not so good. It’s easy to forget that we’re not calling local procedure and that it will take more time or can fail due to multiple reasons. Due to this thinking also quite tight coupling arises (as stated before).

As always, there’s always a tradeoff. But do we have the best approach here? Or can we do even better? I’ll address these questions in the next post in series.