.life.work.toString(): 2009

Wednesday, December 23, 2009

Entity metadata: Java vs XML

The architecture of the last multi-tier project I worked on included a bunch of C# frontends connecting to a few Java services. There were a number of business domain entities, and the state of almost every entity was continuously mutated.

Gory details of java.lang.String interning

While exploring the JDK source code today I came to some degree of understanding of how interned strings are treated by the garbage collector (GC).

In the first versions of Java interned strings were not collected at all. They accumulated in the PermGen, so it was quite possible to end up with an OutOfMemoryError (OOME) very quickly when abusing the intern() call. The current version of the JVM maintains the string cache in a smarter way.

Contrary to what some people say, interned strings are not kept as weak references; the actual approach is different. During the first part of the mark-and-sweep phase the GC delegates to the static string table (a specialization of Hashtable) to get rid of all non-alive entries. These entries are not deleted; instead they are relinked from their hashtable bucket (the linked list they reside in) to the linked list of free entries. See the following call chain for details:

   BasicHashtable::free_entry(BasicHashtableEntry* entry) ←
   void Hashtable::unlink(BoolObjectClosure* is_alive) ←
   StringTable::unlink(BoolObjectClosure* cl)

One important observation here is that the memory taken by a freed entry is not deallocated. That means the more non-identical strings the application interns, the more PermGen memory is consumed. Correspondingly, if the JVM string table is used too intensively, for example by attempting to cache too many non-identical strings, it is easy to cause an OOME. On the other hand, when the cached strings are known to contain a large percentage of duplicates, interning combined with fine tuning of the PermGen may significantly reduce the overall memory consumption.
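To make the duplicates case more tangible, here is a minimal sketch of my own (the class and method names are made up for illustration, nothing here comes from the JDK sources). It builds many separately allocated strings drawn from a small set of values and counts how many distinct String objects remain with and without intern():

```java
import java.util.IdentityHashMap;
import java.util.Map;

// Hypothetical helper class, for illustration only.
final class InternSketch {

    private InternSketch() {
    }

    // Counts distinct String objects by identity (==), not by equals().
    static int distinctInstances(final String[] strings) {
        final Map<String, Boolean> seen = new IdentityHashMap<String, Boolean>();
        for (String s : strings) {
            seen.put(s, Boolean.TRUE);
        }
        return seen.size();
    }

    // Builds 'count' strings that take only 'distinctValues' different values.
    // Every string is created at runtime, so each one is a separate object
    // unless it is replaced by its interned representative.
    static String[] makeStrings(final int count, final int distinctValues, final boolean intern) {
        final String[] result = new String[count];
        for (int i = 0; i < count; i++) {
            final String s = new StringBuilder("value-").append(i % distinctValues).toString();
            result[i] = intern ? s.intern() : s;
        }
        return result;
    }
}
```

With 1000 strings over 10 values, distinctInstances() reports 1000 objects without interning and 10 with it. How much memory that actually saves depends on the JVM version and its settings, so it still has to be measured.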

Thursday, December 3, 2009

Tribute to C++

It's been a long time since I last did anything in C++. After many years with Java and C# I don't really feel like fiddling with tons of headers and source files without really good, fast navigation between types, methods, etc. One of the biggest advantages of Java/C# is that declaration and implementation are combined in one source file. That greatly simplifies navigation and refactoring (unless, of course, the application is designed so badly that it stops you from making any changes reliably).
However, knowledge of C++ still proves extremely helpful, for instance when I need to clarify some details of JVM operation. Whenever a question comes up for which no good answer is readily available (e.g. does the JVM really apply any optimizations to final methods?), I would rather dig into the JVM source code than waste my time reading many conflicting opinions on it. After all, trying to work out who's right and who's wrong is just a waste of time, so usually the better option is to make certain for yourself.

P.S. The JVM does apply optimizations to final methods. For example, see the Parse::optimize_inlining(), ciMethod::find_monomorphic_target() and methodOopDesc::can_be_statically_bound() methods (JDK6-6u18 sources). Whether these optimizations really boost your application's performance is best assessed by testing. That's the only reliable approach.

Wednesday, November 4, 2009

The S stands for Simple

Old discussion about SOAP - seems we still haven't moved far from that. :)

Wednesday, October 28, 2009

Benchmarking Java method performance

Very often there is a need to quickly estimate which of several method implementations performs better. A typical question is what fraction of a second a single invocation takes to complete. With Java (and any other JIT-based virtual machine) the estimation is especially difficult, because a certain number of initial invocations may run in interpreted mode before the code is switched to its compiled form, and other optimizations may kick in as well. Sometimes, though, the exact call duration is not of much interest. We would rather find out which implementation performs significantly better by comparing them in a simulated usage scenario. For that purpose I wrote a few classes that facilitate this kind of micro-benchmarking.
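The warm-up idea can be sketched roughly as follows. This is not the set of classes mentioned above, just a minimal illustration with made-up names: run the task a number of times first so the JIT gets a chance to compile it, then time a second batch and report the average.

```java
// Hypothetical helper, for illustration only.
final class MicroBenchSketch {

    private MicroBenchSketch() {
    }

    // Runs the task 'warmupRuns' times untimed, then 'measuredRuns' times timed,
    // and returns the average duration of one timed run in nanoseconds.
    static long averageNanos(final Runnable task, final int warmupRuns, final int measuredRuns) {
        if (measuredRuns <= 0) {
            throw new IllegalArgumentException("measuredRuns must be positive");
        }
        for (int i = 0; i < warmupRuns; i++) {
            task.run(); // lets the JVM interpret, profile and possibly compile the code
        }
        final long start = System.nanoTime();
        for (int i = 0; i < measuredRuns; i++) {
            task.run();
        }
        return (System.nanoTime() - start) / measuredRuns;
    }
}
```

Two implementations would then be compared by passing each one as a Runnable with the same run counts. The numbers are only indicative: the JIT may eliminate work whose result is never used, and GC pauses can distort a single measurement, so it is worth repeating the comparison a few times.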

Parsing command-line options - part 4

This is the last part of the story. This time we will search for a way to make our annotation work.

A hint for indenting with XMLStreamWriter

The default implementation of javax.xml.stream.XMLStreamWriter does not support output features such as indentation and multi-line output. Consequently the resulting text is always one long single-line string, which many of us would want to pretty-print for better readability. There is a solution bundled with Java 6, although it is a sort of internal Sun facility which may disappear one fine day. :-) The class is named com.sun.xml.internal.txw2.output.IndentingXMLStreamWriter; it is public and resides in rt.jar. Using it is a matter of a couple of lines in your code:

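        // needs javax.xml.stream.XMLOutputFactory, javax.xml.stream.XMLStreamWriter
        // and com.sun.xml.internal.txw2.output.IndentingXMLStreamWriter;
        // 'writer' is the java.io.Writer that receives the XML
        final XMLStreamWriter defaultWriter =
            XMLOutputFactory.newInstance().createXMLStreamWriter(writer);
        final IndentingXMLStreamWriter sw = new IndentingXMLStreamWriter(defaultWriter);
        sw.setIndentStep("    ");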

As shown above, it lets you choose the indent character sequence, which gives some flexibility.
The only problem I can think of is the end-of-line character, which is hard-coded internally as '\n'. I would prefer a system-dependent or, better, a user-provided line terminator instead.
P.S. Just discovered that this class is included only in the JRE, not in the JDK. That means one sure way to solve the original problem is to duplicate the class source in your application. There is nothing special in its logic: it simply decorates an instance of XMLStreamWriter with the required functionality. The source code for JDK 6 can be downloaded from this page.
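For the record, the decoration itself can be sketched without copying the Sun class. The following is my own rough take, not the logic of IndentingXMLStreamWriter: it uses a java.lang.reflect.Proxy instead of a hand-written wrapper to keep the code short, all names are made up, and the line terminator is passed in by the caller. It only handles element-only or text-only content; mixed content would get extra whitespace.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical sketch: adds line breaks and indentation around element events.
final class IndentingProxySketch implements InvocationHandler {

    private final XMLStreamWriter target;
    private final String indentStep;
    private final String lineEnd;
    private int depth = 0;
    private boolean started = false;         // has anything been written yet?
    private boolean hasChildElement = false; // did the current element get a child element?

    private IndentingProxySketch(XMLStreamWriter target, String indentStep, String lineEnd) {
        this.target = target;
        this.indentStep = indentStep;
        this.lineEnd = lineEnd;
    }

    static XMLStreamWriter wrap(XMLStreamWriter target, String indentStep, String lineEnd) {
        return (XMLStreamWriter) Proxy.newProxyInstance(
                IndentingProxySketch.class.getClassLoader(),
                new Class<?>[] { XMLStreamWriter.class },
                new IndentingProxySketch(target, indentStep, lineEnd));
    }

    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        final String name = method.getName();
        if ("writeStartElement".equals(name)) {
            if (started) {
                newLine(depth);
            }
            depth++;
            hasChildElement = false;
        } else if ("writeEmptyElement".equals(name)) {
            if (started) {
                newLine(depth);
            }
            hasChildElement = true;
        } else if ("writeEndElement".equals(name)) {
            depth--;
            if (hasChildElement) {
                newLine(depth); // closing tag of an element with children goes on its own line
            }
            hasChildElement = true; // the parent now has at least one child element
        }
        if (name.startsWith("write")) {
            started = true;
        }
        try {
            return method.invoke(target, args); // everything is forwarded unchanged
        } catch (InvocationTargetException e) {
            throw e.getCause();
        }
    }

    private void newLine(int level) throws XMLStreamException {
        target.writeCharacters(lineEnd);
        for (int i = 0; i < level; i++) {
            target.writeCharacters(indentStep);
        }
    }
}
```

Wrapping is then a single call, e.g. IndentingProxySketch.wrap(defaultWriter, "    ", System.getProperty("line.separator")). A plain hand-written decorator would be faster than reflection; the proxy just saves typing out every XMLStreamWriter method.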

Monday, October 19, 2009

Parsing command-line options - part 3

So we have defined two things: 1) the interface we want to use to access command-line values (a real-world example of usage), and 2) the annotation we want to apply to interface #1 to furnish it with option-specific information. It's time to start writing some code to bind them together.
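To give a rough idea of what such a binding can look like, here is a minimal sketch. It is not the code from this series: the annotation, the interface and the binder below are all made-up names, and only the "--name" and "--name=value" forms are handled. A dynamic proxy implements the interface and answers each call by looking up the option named in the method's annotation.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.HashMap;
import java.util.Map;

// Hypothetical annotation carrying the option name.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface Option {
    String name();
}

// Hypothetical interface through which an application reads its options.
interface CopyOptions {
    @Option(name = "verbose")
    boolean verbose();

    @Option(name = "target")
    String target();
}

// Hypothetical binder: parses the arguments and backs the interface with a proxy.
final class OptionBinder {

    private OptionBinder() {
    }

    @SuppressWarnings("unchecked")
    static <T> T bind(final Class<T> type, final String... args) {
        final Map<String, String> values = new HashMap<String, String>();
        for (String arg : args) {
            if (!arg.startsWith("--")) {
                continue; // positional arguments are ignored in this sketch
            }
            final int eq = arg.indexOf('=');
            if (eq < 0) {
                values.put(arg.substring(2), "true"); // a bare flag
            } else {
                values.put(arg.substring(2, eq), arg.substring(eq + 1));
            }
        }
        return (T) Proxy.newProxyInstance(type.getClassLoader(), new Class<?>[] { type },
                new InvocationHandler() {
                    public Object invoke(Object proxy, Method method, Object[] a) {
                        final Option option = method.getAnnotation(Option.class);
                        if (option == null) {
                            // also hit by toString()/hashCode()/equals() in this sketch
                            throw new UnsupportedOperationException(method.getName());
                        }
                        final String raw = values.get(option.name());
                        if (method.getReturnType() == boolean.class) {
                            return Boolean.parseBoolean(raw); // a missing flag reads as false
                        }
                        return raw; // a missing value reads as null
                    }
                });
    }
}
```

Usage would be along the lines of OptionBinder.bind(CopyOptions.class, args), after which options.verbose() and options.target() return the parsed values.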

Parsing command-line options - part 2

To set off with the further activities I have to capture the list of requirements first. This list will let me grasp the overall scope of work and possibly start with a set of unit-tests.

Enterprise Data Fabric is what we need?

I have just watched this presentation on Coherence by Cameron Purdy, a vice president at Oracle. Well, I think they have been moving in exactly the right direction so far. I have encountered similar but custom solutions in at least three big projects at different companies. I must admit that no other architecture fits modern business requirements equally well when delivering the most recent values across the enterprise is important (which is true for almost every modern application).

Frankly, I feel incredibly envious of the work they have been doing at Tangosol (now Oracle) :-). A quick search in Google and Wikipedia reveals that there is a term for the technology, Enterprise Data Fabric (EDF), which sounds good to me. I feel like writing my own implementation of EDF based on the experience I have gained with that kind of system.

Nowadays the biggest problem is that integration of systems has typically been done in a totally chaotic way. Even while some enterprise applications are using Coherence (or GemStone GemFire, or similar), others are not and never will be able to, due to major architectural flaws that make them incompatible with EDF. There are still so many applications designed on top of a relational database, for example. Honestly, almost every such system is the cr*ppiest legacy nightmare: totally inflexible, hard to maintain, far away from low-latency data delivery, and thus very inefficient.

Parsing command-line options - part 1

I favor metaprogramming. Annotations in Java and Attributes in C# fit this technique very well, making your code concise and simple.

A good exercise where metaprogramming may help is mixed GNU/POSIX-style option processing for command-line utilities. A descriptive explanation of this use case is given here.

A bug story

Ha-ha... I almost introduced undesired behavior for a Hibernate-managed class when I noticed on commit that I had forgotten to uncomment the line:
//@org.hibernate.annotations.Entity(mutable = false)
Guess how I have fixed it on the first go?
//@org.hibernate.annotations.Entity(mutable = true)
:)))

Tuesday, October 13, 2009

Oracle WTF

Maximum identifier length is 30 (!!!) bytes... Are they crazy? That's nonsense. Of course, renaming Rollback Segments to Undo Segments was much more important.

Monday, October 12, 2009

Quick way to create an uppercase letter-only string from a long value.

Today I had to find a quick way to convert the current time in milliseconds into an uppercase letter-only ([A-Z]+) string of the smallest possible length. java.util.UUID did not seem to fit the purpose well, as its string form consists of hexadecimal digits and hyphens and would require extra conversion, so I came up with the following:


    public static String makeUppercaseLetterOnlyStringFromCurrentTime() {
        final long ts = System.currentTimeMillis();
        // radix 26 yields the characters 0..9 and a..p
        final String chars = Long.toString(ts, 26);
        final StringBuilder sb = new StringBuilder(chars.length());
        for (int i = 0; i < chars.length(); i++) {
            final char c = chars.charAt(i);
            // digits 0..9 map to A..J and letters a..p map to K..Z,
            // so every base-26 digit gets its own uppercase letter
            sb.append((char) (Character.isDigit(c) ? c - '0' + 'A' : c - 'a' + 'K'));
        }
        return sb.toString();
    }

This is certainly neither the fastest nor the most efficient approach, but it is quite a straightforward one. :-)
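If the letters are meant to identify the timestamp uniquely, each base-26 digit needs its own letter, and then the conversion can also be reversed. A small sketch of that idea with an explicit long argument (the class and method names are made up for illustration, and the mapping 0..9 to A..J, a..p to K..Z is just one possible choice):

```java
// Hypothetical helper, for illustration only.
final class LetterCodecSketch {

    private LetterCodecSketch() {
    }

    // Encodes a non-negative long as uppercase letters, one letter per base-26 digit.
    static String encode(final long value) {
        if (value < 0) {
            throw new IllegalArgumentException("value must be non-negative");
        }
        final String digits = Long.toString(value, 26); // characters 0..9 and a..p
        final StringBuilder sb = new StringBuilder(digits.length());
        for (int i = 0; i < digits.length(); i++) {
            final char c = digits.charAt(i);
            sb.append((char) (Character.isDigit(c) ? c - '0' + 'A' : c - 'a' + 'K'));
        }
        return sb.toString();
    }

    // Reverses encode(): A..J become 0..9, K..Z become a..p, then parse in radix 26.
    static long decode(final String letters) {
        final StringBuilder sb = new StringBuilder(letters.length());
        for (int i = 0; i < letters.length(); i++) {
            final char c = letters.charAt(i);
            if (c < 'A' || c > 'Z') {
                throw new IllegalArgumentException("not an uppercase letter: " + c);
            }
            sb.append((char) (c <= 'J' ? c - 'A' + '0' : c - 'K' + 'a'));
        }
        return Long.parseLong(sb.toString(), 26);
    }
}
```

For example, encode(0) gives "A", encode(25) gives "Z" and encode(26) gives "BA", and decode(encode(x)) returns x for any non-negative x.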

Monday, October 5, 2009

Why use Oracle?

I was once extremely fascinated by the Oracle RDBMS and was quite a good DBA at that time (in the sense that I could properly set up and run an instance and quickly solve most issues). If I'm not mistaken, that was about 15 years ago.

Posting code snippets...

Eh, that's going to be tough. I thought of occasionally posting Java and C# code snippets with syntax coloring and nice formatting. Googling the problem shows that it's not a simple thing, though. Got to search for a good solution...
This is what the Java2Html Eclipse plugin produces when using inline fonts:


    public static boolean equals(final Object a, final Object b) {
        return a != null && b != null && a.equals(b);
    }

Looks fine so far... Later I'll switch to customized stylesheets.

Never thought I'd start a blog

Namaste!
