A recent client runs a large distributed system that processes global data in near real time. Because of high data volumes, the architecture requires that each host (physical or virtual) be allocated to one or more geographic regions. Geographic data is maintained in memory, and this partitioning allows the system to scale without exceeding the physical memory of any single host. I was asked to prototype a lightweight services framework that would route each service request to an appropriate host. In this case, the client was also interested in using plain old RMI for remote invocation.
The general approach is to assign servers based on some domain-specific criteria. For example, if geographic data can be associated with a Region ID, then we can assign hosts to regions. This is generally a many-to-many relationship: each host can service multiple regions (presumably a small subset), and a region may be serviced by multiple hosts (to support dynamic scaling). Given a service method invocation, the Region ID may be derived by examining the input parameters. In the simplest case, one of the parameters contains the Region ID. The general problem is harder and may involve data mapping driven by business rules. For the prototype, I used an annotation, @Locator, to mark the parameter (or field) containing the Region ID.
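As a rough sketch of the simplest case, the Region ID can be pulled out of an invocation by scanning the parameter annotations with plain JDK reflection. The @Locator declaration, the WeatherService interface and the RegionIdExtractor helper below are hypothetical stand-ins written for illustration; the prototype's actual annotation and lookup code may differ.

```java
import java.lang.annotation.Annotation;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

// Hypothetical marker annotation; must be retained at runtime to be visible to reflection.
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.PARAMETER, ElementType.FIELD})
@interface Locator {}

// Hypothetical service interface used only for illustration.
interface WeatherService {
    String forecast(@Locator String regionId, int days);
}

final class RegionIdExtractor {
    private RegionIdExtractor() {}

    // Returns the argument whose parameter carries @Locator, or null if none does.
    static Object regionIdOf(Method method, Object[] args) {
        Annotation[][] annotations = method.getParameterAnnotations();
        for (int i = 0; i < annotations.length; i++) {
            for (Annotation annotation : annotations[i]) {
                if (annotation instanceof Locator) {
                    return args[i];
                }
            }
        }
        return null;
    }
}
```

This only covers an annotated parameter. Supporting an annotated field inside a parameter object would need a second pass over the argument's declared fields.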
Once we have a strategy for determining the Region ID for each service invocation, we can implement a Service Locator to resolve the physical host address and port and route the request to that endpoint. Besides partitioning by geographic region, it’s easy to see how this concept could be applied to any domain-based affinity scheme.
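To make the many-to-many mapping concrete, here is a minimal in-memory sketch of a locator that resolves a Region ID to one of several endpoints. RegionServiceLocator, the round-robin selection and the URL strings are assumptions made for illustration, not the prototype's implementation.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicInteger;

final class RegionServiceLocator {
    // Region ID -> endpoints serving that region. The same endpoint may be
    // registered under several regions, which gives the many-to-many mapping.
    private final Map<String, List<String>> endpointsByRegion = new ConcurrentHashMap<>();
    // Per-region counter used to rotate through the endpoints.
    private final Map<String, AtomicInteger> cursors = new ConcurrentHashMap<>();

    void assign(String regionId, String endpointUrl) {
        endpointsByRegion
                .computeIfAbsent(regionId, key -> new CopyOnWriteArrayList<>())
                .add(endpointUrl);
    }

    // Picks the next endpoint for the region in round-robin order.
    String locate(String regionId) {
        List<String> endpoints = endpointsByRegion.get(regionId);
        if (endpoints == null || endpoints.isEmpty()) {
            throw new IllegalStateException("No host assigned to region " + regionId);
        }
        int next = cursors.computeIfAbsent(regionId, key -> new AtomicInteger()).getAndIncrement();
        return endpoints.get(Math.floorMod(next, endpoints.size()));
    }
}
```

Round-robin is just one possible policy. Since the operations team controls the assignments, a real locator would more likely read them from a central configuration store than from calls to assign.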
The high level design goals include:
- Utilize lightweight remoting with support for multiple protocols (e.g. HTTP, RMI)
- Centralize configuration of host to endpoint mapping. The assignment of hosts to regions is expected to be very dynamic and controlled by the operations team
- Centralize the service locator. Given the potential complexity of service location rules and data dependencies, the Service Locator should be a centralized resource
- Allow for simple configuration. It should be easy to add new services
- Provide service versioning. Allow for the availability of different versions and allow clients to specify compatible version(s)
Spring Remoting comes immediately to mind but does not meet the design goals. It is fairly simple to extend Spring’s RmiProxyFactoryBean to inject different endpoints at runtime. A Service Locator implementation can be injected into the extended proxy to provide a URL for each invocation. However, this approach requires a proxy to be configured for each service type and, on the server side, a corresponding Service Exporter, which violates the simple configuration goal. In addition, accessing a central Service Locator requires an extra network round trip for each service request. Caching could reduce that cost, but it quickly leads to unnecessary complexity.
An alternative is to go the ESB route and use content-based routing. I initially considered Mule, but apparently Mule does not provide an inbound RMI adapter. I envisioned a similar approach: the client invokes a server-side proxy that routes each service request to a target endpoint. Using Spring Remoting alone would require configuring a separate endpoint for each interface type, which would eventually become unmanageable.
In the end, Spring Integration (SI) provided an interesting solution.
SI provides a GatewayProxyFactoryBean, commonly used to transform a method invocation on any interface into a Message. A SimpleMessagingGateway is used to send a message with a payload containing the method parameters over a request channel, with the return value coming back as a response message over a reply channel. SI also provides messaging over RMI using RmiInboundGateway. These components were the basic building blocks for the solution.
What I was going for is an RPC-based approach in which the server and client sides share a common interface. It is important to note that SI in general, and the GatewayProxyFactoryBean in particular, is not designed for this purpose. Spring Remoting already provides an RPC-like implementation. SI is an implementation of Gregor Hohpe’s Enterprise Integration Patterns, which are based on message-based communication. As previously mentioned, I discarded Spring Remoting because the configuration would be complex and hard to manage.
A better approach is to convert each service invocation to a Message. In SI, a Message is an abstraction that contains Headers and a Payload. The message headers are represented as a Map and the Payload can be any object. Once the service invocation is transformed into a message, it is possible to process service request messages using a common endpoint and message handlers that perform the routing and remote invocations. Once the message is routed to the correct host, an operation on the service implementation, a POJO, is invoked and the result is returned via the reply channels.
At first, the GatewayProxyFactoryBean (GPFB) appeared to be very close to what I wanted, but not quite. I needed the same message payloads that GPFB provides (the method arguments in the request message and the return value in the reply message), but I also needed to enrich the message header with the service interface class name, the method name and the requested service version range. It turns out that the GPFB does set the method name in the message history headers, but it is very difficult to determine the service interface.
The latter is a characteristic of dynamic proxies in general. In Spring, the proxy factory bean’s getObject() method returns the proxy. The name of the proxy class is something like $Proxy1, so you can’t get the type directly by calling getClass().getName() (there is a way to discover the class name using reflection, but it is much more involved). Also, it is difficult to get a reference to the actual Spring factory bean and query its interface property (where are the leaky abstractions when you need them?). Every attempt to extend the GPFB to provide the interface type proved futile, so that was clearly the wrong approach. But it gave me an idea…
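The naming problem is easy to reproduce with a plain JDK dynamic proxy. The sketch below uses a hypothetical GeoService interface and a ProxyIntrospection helper (neither is from the prototype) to show that the proxy's class name says nothing about the service type, while the proxy class's interface list does. Proxies created by Spring may implement additional interfaces, so picking the right one there takes more care than this simple case suggests.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

// Hypothetical service interface used only for illustration.
interface GeoService {
    String lookup(String regionId);
}

final class ProxyIntrospection {
    private ProxyIntrospection() {}

    // Creates a JDK dynamic proxy whose handler just echoes the invoked method name.
    static GeoService newProxy() {
        InvocationHandler handler = (proxy, method, args) -> "stub:" + method.getName();
        return (GeoService) Proxy.newProxyInstance(
                GeoService.class.getClassLoader(),
                new Class<?>[] {GeoService.class},
                handler);
    }

    // Returns the name of the first interface implemented by a JDK dynamic proxy.
    static String serviceInterfaceName(Object proxy) {
        if (!Proxy.isProxyClass(proxy.getClass())) {
            throw new IllegalArgumentException("Not a JDK dynamic proxy: " + proxy.getClass());
        }
        return proxy.getClass().getInterfaces()[0].getName();
    }
}
```

Calling newProxy().getClass().getName() yields a generated name containing $Proxy, whereas serviceInterfaceName returns the name of GeoService.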
I implemented my own proxy factory bean (using the GPFB source as a template). This creates a dynamic proxy for the service interface. When a method is invoked on the proxy, it can create a message with the required attributes in the message header, set the message payload to the method parameters, and send that message to an inbound gateway. As it turns out, I did not even need to create the message myself, because the GPFB does all this for you. Internally, GPFB uses a SimpleMessagingGateway injected with an InboundMessageMapper and an OutboundMessageMapper. The inbound message mapper is very smart. For example, if your payload (method parameters) consists of a Map followed by an object that is not a Map, and you annotate the Map with @Headers (an SI annotation), then it will create the message for you, adding the map entries to the message header. I created an interface of that shape, used the GPFB to wire it into an SI messaging gateway, and injected that gateway into my proxy factory bean.
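A minimal sketch of the proxy side, using only JDK types rather than Spring Integration, might look like the following. ServiceGateway, ServiceProxyFactory, RouteService and the header names (serviceInterface, serviceMethod, serviceVersion) are hypothetical names chosen for illustration. In the prototype, the gateway role is played by the SI messaging gateway and the headers map would carry the @Headers annotation.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.HashMap;
import java.util.Map;

// Hypothetical gateway contract: a map of headers followed by the payload.
interface ServiceGateway {
    Object send(Map<String, Object> headers, Object payload);
}

// Hypothetical service interface used only for illustration.
interface RouteService {
    String route(String regionId, int priority);
}

final class ServiceProxyFactory {
    private ServiceProxyFactory() {}

    // Creates a proxy that turns each call into (headers, payload) and hands it to the gateway.
    static <T> T create(Class<T> serviceInterface, String versionRange, ServiceGateway gateway) {
        InvocationHandler handler = (proxy, method, args) -> {
            if (method.getDeclaringClass() == Object.class) {
                // Answer equals/hashCode/toString locally instead of sending them remotely.
                switch (method.getName()) {
                    case "equals": return proxy == args[0];
                    case "hashCode": return System.identityHashCode(proxy);
                    default: return serviceInterface.getName() + " proxy";
                }
            }
            // The factory knows the interface type, so it can record it in the headers.
            Map<String, Object> headers = new HashMap<>();
            headers.put("serviceInterface", serviceInterface.getName());
            headers.put("serviceMethod", method.getName());
            headers.put("serviceVersion", versionRange);
            return gateway.send(headers, args); // payload = the method arguments
        };
        return serviceInterface.cast(Proxy.newProxyInstance(
                serviceInterface.getClassLoader(),
                new Class<?>[] {serviceInterface},
                handler));
    }
}
```

The point of owning the factory is visible in the handler: the interface class is a constructor-time argument, so there is no need to recover it from the generated proxy class.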
The message is then sent via RMI (RmiOutboundGateway) so that the service endpoint can be determined and the service URL added to the message header. For the prototype, I wrote an AnnotationAwareServiceLocator to determine the endpoint URL, plus a ServiceURLHeaderEnricher and an AnnotationAwareServiceLocatorMessageProcessor to process the message.
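The enrichment step can be sketched without SI as a function from one immutable message to another. RoutedMessage, ServiceUrlEnricher and the header names regionId and serviceUrl are assumptions for illustration. They are simplified stand-ins, not SI's Message type or the prototype's ServiceURLHeaderEnricher.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Simplified stand-in for a message: immutable headers plus an arbitrary payload.
final class RoutedMessage {
    private final Map<String, Object> headers;
    private final Object payload;

    RoutedMessage(Map<String, Object> headers, Object payload) {
        this.headers = Collections.unmodifiableMap(new HashMap<>(headers));
        this.payload = payload;
    }

    Map<String, Object> getHeaders() { return headers; }
    Object getPayload() { return payload; }

    // Returns a copy with one extra header; the original message is left untouched.
    RoutedMessage withHeader(String name, Object value) {
        Map<String, Object> copy = new HashMap<>(headers);
        copy.put(name, value);
        return new RoutedMessage(copy, payload);
    }
}

final class ServiceUrlEnricher {
    static final String REGION_HEADER = "regionId"; // assumed header name
    static final String URL_HEADER = "serviceUrl";  // assumed header name

    private final Function<String, String> locator; // Region ID -> endpoint URL

    ServiceUrlEnricher(Function<String, String> locator) {
        this.locator = locator;
    }

    // Looks up the endpoint for the message's region and records it as a header.
    RoutedMessage enrich(RoutedMessage message) {
        Object regionId = message.getHeaders().get(REGION_HEADER);
        if (regionId == null) {
            throw new IllegalArgumentException("Message has no " + REGION_HEADER + " header");
        }
        return message.withHeader(URL_HEADER, locator.apply(regionId.toString()));
    }
}
```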
The message must now be routed to the service host via RMI. The RmiOutboundGateway can do this. Out of the box, however, I would have to statically configure an outbound RMI channel for every service endpoint. Instead, I opted to write a DynamicRmiOutboundGateway. This is a simple wrapper around the RmiOutboundGateway that configures it programmatically from the message header attributes. The prototype implementation creates a new instance of RmiOutboundGateway for each request and would benefit from caching them.
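The caching mentioned above could be as simple as one gateway per service URL, created lazily. This is a sketch under assumptions: OutboundGateway and CachingGatewayRouter are hypothetical names, and the factory function stands in for whatever code constructs and initializes an RmiOutboundGateway for a given URL.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical stand-in for an outbound gateway bound to a single remote URL.
interface OutboundGateway {
    Object sendAndReceive(Object message);
}

final class CachingGatewayRouter {
    private final Map<String, OutboundGateway> cache = new ConcurrentHashMap<>();
    private final Function<String, OutboundGateway> factory;

    CachingGatewayRouter(Function<String, OutboundGateway> factory) {
        this.factory = factory;
    }

    // Reuses the gateway for a URL if one exists; otherwise creates and caches it.
    Object route(String serviceUrl, Object message) {
        OutboundGateway gateway = cache.computeIfAbsent(serviceUrl, factory);
        return gateway.sendAndReceive(message);
    }

    int cachedGateways() {
        return cache.size();
    }
}
```

Because host-to-region assignments are expected to change often, a real cache would also need some way to evict gateways for endpoints that have been retired.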
At the service host, the message is received over an RmiInboundGateway (these all run in one JVM and are configured with different ports) and handled by a Service Activator, which also implements the GatewayAdapter interface to extract the message payload and header. At this point, we need to obtain an instance of the service class from the application context and invoke the named method. The ServiceGatewayAdapter uses reflection to do this (sorry about the crappy layout; any WordPress tips are appreciated).
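The reflective dispatch on the service host can be sketched as follows. This is not the ServiceGatewayAdapter source: ReflectiveServiceInvoker, EchoService and EchoServiceImpl are hypothetical, and a plain Map keyed by interface name stands in for the application context lookup.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.util.Map;

// Hypothetical service interface and implementation used only for illustration.
interface EchoService {
    String echo(String text, int times);
}

final class EchoServiceImpl implements EchoService {
    @Override
    public String echo(String text, int times) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < times; i++) {
            out.append(text);
        }
        return out.toString();
    }
}

final class ReflectiveServiceInvoker {
    // Interface name -> service instance; stands in for an application context lookup.
    private final Map<String, Object> servicesByInterface;

    ReflectiveServiceInvoker(Map<String, Object> servicesByInterface) {
        this.servicesByInterface = servicesByInterface;
    }

    // Finds a public method by name and argument count and invokes it.
    // Overloads with the same arity are not disambiguated in this sketch.
    Object invoke(String interfaceName, String methodName, Object[] args) {
        Object service = servicesByInterface.get(interfaceName);
        if (service == null) {
            throw new IllegalArgumentException("No service registered for " + interfaceName);
        }
        int argCount = (args == null) ? 0 : args.length;
        for (Method method : service.getClass().getMethods()) {
            if (method.getName().equals(methodName) && method.getParameterCount() == argCount) {
                try {
                    return method.invoke(service, args);
                } catch (InvocationTargetException e) {
                    throw new IllegalStateException("Service method failed", e.getCause());
                } catch (IllegalAccessException e) {
                    throw new IllegalStateException("Service method not accessible", e);
                }
            }
        }
        throw new IllegalArgumentException("No method " + methodName + " on " + interfaceName);
    }
}
```

The interface name and method name would come from the message headers and the argument array from the message payload.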
You can download the source code, build it with maven (or m2eclipse) and run the