Wednesday, December 23, 2009

Entity metadata: Java vs XML

The architecture of the last multi-tier project I worked on included a bunch of C# frontends connecting to a few Java services. There were a number of business domain entities, and the state of almost every entity was continuously mutated. Those mutations were mainly reactions to various external events and constituted the major part of the business logic hosted within the server-side processes. A frontend would subscribe to state changes for every entity it was interested in, which produced a stream of state-change notifications keeping the client's local cache in sync. Additionally, a number of operations were exposed to the client so that it could initiate business logic flows at the user's discretion. To keep the client side aware of the entity structure and the operations available on the server side, an interoperability layer was embedded on both sides. That layer had to be able to 1) marshal/unmarshal both entity state and commands, 2) know how to map entities to the database, and 3) define human-readable tags for certain properties. These concerns can typically be organized in a very uniform way, so most of the layer was generated as part of the build process. The overall model was described by a single XML file where entities and their related operations were declared according to a well-defined schema. A build task read the model and transformed it into a set of Java classes on the server side; a similar step ran during the client-side build.
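For illustration only, such a model file might have looked roughly like this (the schema and all names here are invented, not the project's actual format):

```xml
<model>
  <!-- hypothetical entity declaration: properties carry marshalling tags
       and database column mappings for the generated interop layer -->
  <entity name="Order">
    <property name="id" type="long" tag="order-id" column="ORDER_ID"/>
    <property name="customer" type="string" tag="customer" column="CUSTOMER_NAME"/>
  </entity>
  <!-- hypothetical operation a frontend could invoke -->
  <operation name="cancelOrder">
    <param name="orderId" type="long"/>
  </operation>
</model>
```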

After a couple of years with that project I became certain I would never vote for such an approach again. The reason was simple: it did not pay off very well (unless the goal is to entertain the team with a technology zoo). Consider what you typically had to do once the model evolved. First, you modified the XML. Second, you regenerated the interop layer from it. Only then could you use the new model in your application, with a high potential for naming conflicts. The latter problem was a perpetual curse, since most of the time the XML and the generated code were disconnected from each other. On top of that, the project was split into modules in a very odd way: the XML and the generated classes resided in different modules with separate versioning schemes, so you had to commit twice, publish twice to the repository, and upgrade two dependencies rather than one in every other project using the model. This seriously slowed down the whole process, even for a small change.

The whole experience left me with a strong belief that the overhead can be reduced significantly if the model is defined in Java instead of being externalized as XML. The solution is simple, as usual: use Java itself and don't mess with XML at all. Define entities as Java-bean interfaces. Annotate the properties to provide marshalling tags, database mapping hints, and whatever else your application might need. Start working on the business logic immediately, as the interfaces become available after an upgrade of a single dependency. As an extra bonus you get no name inconsistencies any more, since the compiler and other tools now ensure there is no mismatch. Any smart IDE can perform consistent project-wide refactoring when you need to rename a property, for example.
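To make this concrete, here is a minimal sketch of such an entity. The @Tag and @Column annotations are hypothetical stand-ins for whatever marshalling and mapping metadata a real project would define:

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Hypothetical annotation: human-readable tag used in the wire format and UI.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface Tag {
    String value();
}

// Hypothetical annotation: database column the property maps to.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface Column {
    String name();
}

// The entity is a plain Java-bean interface; all metadata lives in annotations,
// so the compiler and the IDE keep names consistent across the whole project.
interface Order {
    @Tag("order-id")
    @Column(name = "ORDER_ID")
    long getId();

    @Tag("customer")
    @Column(name = "CUSTOMER_NAME")
    String getCustomer();
    void setCustomer(String customer);
}
```

The annotations are retained at runtime on purpose: the generators for the interop layer read them through reflection instead of parsing an external XML file.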

The only major piece of work left is generating proper serialization/deserialization logic as well as the database mappings. Java class introspection handles this task perfectly, and the logic is hardly more sophisticated than producing the same result by parsing and processing XML. Generating the C# part also remains relatively simple, since as a metadata source there is no big difference between XML and Java reflection.
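A minimal sketch of that introspection-driven logic, using the standard java.beans.Introspector; the Customer interface and the map-based output are invented for illustration, with a real implementation encoding the pairs into the wire format or SQL instead:

```java
import java.beans.BeanInfo;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.lang.reflect.Method;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical entity interface; a real project would annotate it as described above.
interface Customer {
    String getName();
    int getAge();
}

class Marshaller {
    // Walk the bean properties of the entity interface and collect
    // name -> value pairs; a real marshaller would write them to the wire.
    static Map<String, Object> marshal(Object entity, Class<?> entityInterface)
            throws Exception {
        Map<String, Object> out = new LinkedHashMap<>();
        BeanInfo info = Introspector.getBeanInfo(entityInterface);
        for (PropertyDescriptor pd : info.getPropertyDescriptors()) {
            Method getter = pd.getReadMethod();
            if (getter != null) {               // skip write-only properties
                out.put(pd.getName(), getter.invoke(entity));
            }
        }
        return out;
    }
}

// Sample implementation used to exercise the sketch.
class CustomerImpl implements Customer {
    public String getName() { return "Alice"; }
    public int getAge() { return 30; }
}
```

The same introspection pass can read the annotations off each getter, so marshalling tags and column mappings come from exactly one place.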

I believe this approach can help keep a project agile and maintainable for quite a long time. And the major point is: do not overuse XML. In most cases it sucks.
