Service Oriented Architecture

Mon, Sep 1, 2014

Service Oriented Design with Ruby and Rails - Paul Dix

Introduction

Regular readers of my (irregular) posts will know that I enjoy making fun little JavaScript games on the side. However, my day-to-day work is generally spent building SaaS-style web applications, primarily within the Ruby on Rails ecosystem.

For the last 7+ years I worked on a single RoR application that grew from an experimental prototype into a beta product, through those first paying customers and many, many short iterative releases, up to a large, robust, and feature-rich application with thousands of users, supported by a company rapidly growing towards 100 employees.

Rails is a fantastic framework to get through those early years. It’s certainly not without its faults, but in terms of productivity and getting features to market it is hard to beat.

However, if we follow the vanilla Rails path we can easily end up with a large, monolithic application, and after a while begin to feel the creaks and groans of the framework and start to think a lot about the issue of scaling.

How are we going to…

scale to meet the demands of our users? Most production Rails applications already have a horizontally scalable, load-balanced set of web and application servers. They are also likely to have already off-loaded any heavy-duty computation to background processes via a queuing system (such as Resque). But if all of these processes are connected to the same database then ultimately that DB will become a bottleneck and limit the scalability of the system.
scale to meet the demands of our development team? Most development teams are already using a DVCS such as Mercurial or Git, and using feature and release branches to control their development process. They are also likely to have CI systems in place with a heavy emphasis on automated testing. But even with these tools in place, if the entire application is stored in a single codebase then it becomes harder and harder to visualise the component boundaries within the application, and harder to predict the side-effects of any single code change. Every additional feature carries an ongoing cost, and velocity decreases as those costs start to add up, despite (or perhaps because of) the growing number of developers on the team.

As much as we’d like to believe that we will have a million users overnight, we are more likely to face a long, slow grind, building up brand awareness and trying to increase our customer base. To achieve this we will spend a lot of time iterating on our product by adding and improving features. So a startup is more likely to struggle with the latter problem (scaling the development team) before having issues with the former (scaling the technical platform).

Once we start thinking about either of these issues, we inevitably start to think about breaking up our monolithic application into smaller components and we begin to see the phrase “Service Oriented Architecture” cropping up as the solution to all of our troubles, and who doesn’t like a silver bullet?

(werewolves? vampires? NO! It’s a trick question, there is no silver bullet)

What is SOA?

The phrase “Service Oriented Architecture” has been around for quite a while. More recently the term “Microservices” has also become popular. When discussing these terms it is easy to get distracted from the underlying patterns by a variety of potential technical implementations:

HTTP and REST
AMQP and Message Brokers such as RabbitMQ
SOAP, WSDL, and WS-*
Enterprise Service Bus
WCF

While the technologies differ, the underlying patterns are very similar and stem from the desire to break our application into smaller pieces that are simpler to manage, and easier to understand.

My personal definition of an SOA goes something like this…

A Service encapsulates a cohesive set of features from our business domain.
An Application uses services to provide an interface to our business domain (e.g. a UI or public API)
An SOA is a collection of applications and services that combine to implement our complete business domain.

(another meaningless SOA diagram)

An SOA is about breaking up our business domain using sensible boundaries that minimize coupling and maximize cohesion, and implementing each piece as a service with a well-defined API.
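To make that definition a little less abstract, here is a minimal in-process sketch of a service and an application that uses it. Every name in it (InvoiceService, DashboardApp, the invoicing domain itself) is made up for illustration, and in a real SOA the two would live in separate processes and talk over a transport rather than a direct method call.

```ruby
# Hypothetical example: all class names and the invoicing domain are
# invented for illustration only.

# A service encapsulates a cohesive slice of the business domain
# (here, invoicing) behind a small public API.
class InvoiceService
  Invoice = Struct.new(:id, :customer_id, :amount_cents)

  def initialize
    @invoices = {}
    @next_id = 1
  end

  # Public API: create an invoice and return it.
  def create_invoice(customer_id:, amount_cents:)
    invoice = Invoice.new(@next_id, customer_id, amount_cents)
    @invoices[invoice.id] = invoice
    @next_id += 1
    invoice
  end

  # Public API: total billed to one customer, in cents.
  def total_for(customer_id)
    @invoices.values
             .select { |invoice| invoice.customer_id == customer_id }
             .sum(&:amount_cents)
  end
end

# An application uses services to present the domain to the outside
# world (a UI or a public API). It knows nothing about how invoices
# are stored, only about the service's public methods.
class DashboardApp
  def initialize(invoice_service)
    @invoice_service = invoice_service
  end

  def summary_for(customer_id)
    total = @invoice_service.total_for(customer_id)
    format("Customer %d has been billed $%.2f", customer_id, total / 100.0)
  end
end
```

The useful property is that DashboardApp depends only on the two public methods, so the storage behind InvoiceService could change (or move to another machine) without the application noticing.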

In the book “Service Oriented Design with Ruby and Rails”, Paul Dix makes the case for Service Oriented Design by talking about the benefits of Isolation, Robustness, Scalability, Agility, Interoperability, and Reuse.

Many other people, more eloquent than I, have written their own descriptions and definitions for SOA, and I encourage you to read through as many of the related links found at the end of this article as you can make time for.

Defining Boundaries

In order to break up our business domain we need to be able to define sensible boundaries. This will always be heavily dependent on our particular domain and use cases. We are most likely to want to break up our application based on logical function. However, Paul Dix suggests alternate ways to analyze our business domain boundaries…

Perhaps based on data access patterns:

Which data has high read and low write frequency?
Which data has high write frequency?
Which joins occur most frequently?

… or design stability:

Which parts of the domain have clearly defined requirements vs ill defined?
Which parts of the domain need to be iterated frequently vs rarely?

How we define the boundaries of our services is a design problem, not a technical one. It is going to be one of the hardest parts of building an SOA, much more so than the technical issue of how to communicate between services. Unfortunately it is going to be very domain specific and therefore beyond the scope of these short articles. You can get lots of inspiration from the experts:

Postponing Distribution

Once we start down the distributed architecture path our individual components might get smaller and simpler, but our system as a whole becomes a lot more complex as we are forced to deal with more middleware, integration testing, deployment, monitoring, and the fallacies of distributed computing.

We should defer this additional complication as long as possible, especially for a new startup. Our early years should be about iterating on the product, proving its value, obtaining paying customers, measuring and learning about the domain, and evolving the product towards a mature feature set. We should certainly think about the future and steer towards a vision, while resisting BDUF (big design up front) and architecture astronauts. Instead, be agile and iterate towards that long-term vision. Choosing a distributed architecture is a critical decision, but it is one we should postpone for as long as possible. We should build it just-in-time, not just-in-case.

Of course this is a very subjective statement, and it will depend entirely on our product, our business, and the growth rate of our user base and development team.

Looking back at the 2 issues that might drive us to consider an SOA:

scaling to meet the demands of our users
scaling to meet the demands of our development team

We can postpone the former by actions such as:

Scale vertically by upgrading hardware.
Measure and optimize the hotspots in our application.
Run proactive performance tests.

We can postpone the latter by actions such as:

Define concrete boundaries within our monolith.
Extract libraries, gems and packages.
Document our development practices, common patterns, and style guides.
Do code reviews.
Do post mortems.
Write wiki articles.
Give brown-bag lightning talks.
Start a (tech) book club.
Go to meetups and conferences.
… in short, encourage growth and learning - educate our team.

But to play devil’s advocate - we must not postpone too long. The larger our monolith, the harder and more time consuming it will become to extract it into services. Again, this is a very subjective balancing act that, I think, comes with experience. If we let our monolith grow unchecked it will become very costly to refactor later, but if we take care to build it with well-defined boundaries in mind (e.g. general OO best practices) and refactor as we go then we should find the migration to an SOA a natural progression.

This turned into a very long-winded way of saying… don’t just dump all code into /lib. Even if we have a single application, we should take time to extract common code into libraries, gems, and packages, each with its own set of independent tests and a well-defined public API.
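As a rough sketch of what a boundary inside the monolith might look like, here is a tiny module with a deliberately narrow public API. The Pricing module, its method names, and the tax example are all hypothetical; the only real Ruby feature being shown is private_constant, which hides an implementation class from callers outside the module.

```ruby
# Hypothetical library extracted from a monolith. The module name,
# method names, and tax logic are assumptions made for illustration.
module Pricing
  # Public API: the only entry point callers are meant to use.
  def self.total_cents(subtotal_cents, tax_rate:)
    subtotal_cents + TaxCalculator.tax_cents(subtotal_cents, tax_rate)
  end

  # Implementation detail: free to change without breaking callers.
  class TaxCalculator
    def self.tax_cents(subtotal_cents, tax_rate)
      (subtotal_cents * tax_rate).round
    end
  end

  # Referencing Pricing::TaxCalculator from outside now raises NameError.
  private_constant :TaxCalculator
end
```

A module like this can be packaged as a gem with its own test suite, and if it ever needs to become a service, its public method is already a reasonable first draft of the service API.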

Communication Patterns

If we do decide that it is the right time for our application to be broken into services then we must decide on a communication platform, but before we do that we should think about the 3 common communication patterns we are most likely to need:

Synchronous Request/Response

The classic example is a user requesting a dashboard-style web page in our application: in order to serve that page, the application must request data from a separate service. It blocks, waiting for the response, because it cannot return a half-composed page to the user. A common additional requirement is to allow multiple requests in parallel and block until all responses have been returned.

Most discussions of services revolve around the request/response pattern.
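The "multiple requests in parallel, then block until all have returned" requirement can be sketched with nothing more than Net::HTTP and threads from the standard library. This is only an illustration: the fetch_all helper and the HTTP_FETCHER constant are names I have made up, the URLs in the usage note are placeholders, and a production version would also need timeouts and error handling.

```ruby
require "net/http"
require "uri"

# Default fetcher: a blocking HTTP GET that returns the response body.
HTTP_FETCHER = ->(url) { Net::HTTP.get(URI(url)) }

# Issue one request per URL in parallel and block until every response
# has arrived. Takes a hash of name => url and returns name => body.
# The fetcher is injectable so the helper can be exercised without a
# network (hypothetical design choice, not a library convention).
def fetch_all(urls_by_name, fetcher: HTTP_FETCHER)
  threads = urls_by_name.map do |name, url|
    Thread.new { [name, fetcher.call(url)] }
  end
  # Thread#value joins the thread and returns the block's result,
  # re-raising any exception that occurred inside it.
  threads.map(&:value).to_h
end
```

A dashboard action might then call something like fetch_all(stats: "http://stats.internal/summary", news: "http://news.internal/latest") and render the page once the returned hash is complete. The total wait is roughly that of the slowest service rather than the sum of all of them.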

Asynchronous Worker Queue

Another common pattern in distributed computing is to queue up some work to occur asynchronously in order to avoid blocking the main application and let the user continue working. This should be done with any computationally expensive or potentially slow feature that might otherwise block the main application.

Many production Rails applications are already using this pattern in the form of DelayedJob or Redis/Resque and so are already distributed, helping to meet one of our goals (scale to meet the demands of our users). However, if the worker process is loading the entire Rails environment to perform the work then that doesn’t necessarily help with our other goal (scale to meet the demands of our development team). A better architecture would be to implement the asynchronous workers independently of the main application.
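The shape of the worker queue pattern can be shown in-process using Ruby's built-in Queue and a single thread. To be clear, this is a stand-in and not how DelayedJob or Resque work internally: there the queue lives in a database or Redis and the worker is a separate process. The ReportWorker class and its method names are invented for this sketch.

```ruby
# In-process stand-in for a worker queue. Class and method names are
# hypothetical; a real system would use an external queue and a
# separate worker process.
class ReportWorker
  attr_reader :completed

  def initialize
    @jobs = Queue.new        # work waiting to be done
    @completed = Queue.new   # results, collected for inspection
    @thread = Thread.new { work }
  end

  # Called by the main application. Returns immediately, so the caller
  # is never blocked by the slow work itself.
  def enqueue(account_id)
    @jobs << account_id
    nil
  end

  # Stop accepting jobs, then wait for the worker to drain the queue.
  def shutdown
    @jobs.close
    @thread.join
  end

  private

  def work
    # Queue#pop blocks until a job arrives, and returns nil once the
    # queue has been closed and emptied, which ends the loop.
    while (account_id = @jobs.pop)
      @completed << "report for account #{account_id}"
    end
  end
end
```

Note that the worker only needs the job payload (an account id here), not the whole application environment, which is the property the paragraph above is arguing for.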

Asynchronous Publish/Subscribe

Finally, an under-used but very useful pattern is publish/subscribe, where the application publishes an event and any interested service can subscribe to be informed of that event. The publisher does not need to know who is listening, which allows for highly decoupled systems.
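Here is a minimal in-process sketch of the publish/subscribe idea. The EventBus class and the :user_signed_up event are hypothetical, and handlers here run synchronously in the publisher's thread, whereas a real broker would deliver events asynchronously and across processes. What carries over is the decoupling: the publisher never references its subscribers.

```ruby
# Hypothetical in-process event bus, for illustration only.
class EventBus
  def initialize
    # Each event name maps to the list of handlers subscribed to it.
    @subscribers = Hash.new { |hash, event| hash[event] = [] }
  end

  # Register a block to be called whenever the event is published.
  def subscribe(event, &handler)
    @subscribers[event] << handler
  end

  # Deliver the payload to every handler subscribed to the event.
  # Publishing an event with no subscribers is a harmless no-op.
  def publish(event, payload)
    @subscribers[event].each { |handler| handler.call(payload) }
  end
end
```

For example, a mailer and an audit log could each subscribe to :user_signed_up, and the signup code would simply call publish(:user_signed_up, { email: "a@example.com" }) without knowing either of them exists. Adding a third subscriber later requires no change to the publisher.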

Technology Choices

There are many ways to approach an SOA infrastructure, but within the Ruby and Rails community we tend to see the following two choices being considered:

SOA using HTTP - making an HTTP call to a Rack-based server is bread-and-butter for the RoR community, so this is the natural technology we reach for when thinking about the synchronous request/response communication pattern. However, we need to reach for additional technologies, such as Redis, Resque, or Sidekiq, if we want to implement the asynchronous communication patterns.
SOA using AMQP - using a message broker such as RabbitMQ is not as common in our community as using direct HTTP calls, but it is a great option for providing a common transport for all three of our desired communication patterns (and more), as well as providing a robust, managed broker to act as the hub for all internal communication.

A comparison of some specific technologies that we might choose when going down either of these paths is shown below:

Communication pattern	HTTP	AMQP
Synchronous request/response	Unicorn	RabbitMQ
Asynchronous worker queue	Redis + Resque	RabbitMQ
Asynchronous publish/subscribe	Redis	RabbitMQ

Up Next…

OK, so we’ve reached the point where we have decided to break our application into services and recognized the 3 main communication patterns of a distributed system. Can we finally get into something more practical? This introductory article has had a lot of theoretical talk, along with many links to further reading. We should pause here and reserve the practical talk for the next few posts.

In the remaining articles in this series I will get much more concrete and talk about the underlying infrastructure we might build to support an SOA, along with plenty of code samples.

Next up I’ll show how to implement the 3 communication patterns using HTTP-related technologies, followed by the same 3 patterns implemented using AMQP-related technologies.

In the meantime, here’s some more light reading…

Service Oriented Architecture:

Micro services:

Design and Architecture:

Books: