Wednesday, December 23, 2009

Entity metadata: Java vs XML

The architecture of the last multi-tier project I worked on included a bunch of C# frontends connecting to a few Java services. There were a number of business domain entities, and the state of almost every entity was continuously mutated.

Tuesday, December 15, 2009

Gory details of java.lang.String interning

While exploring the JDK source code today I came to some degree of understanding of how interned strings are treated by the garbage collector (GC).

In the first versions of Java, interned strings were not collected at all. They accumulated in the PermGen, so abusing the intern() call could very quickly end in an OutOfMemoryError (OOME). The current version of the JVM uses a smarter way to maintain the string cache.

Contrary to what some people say, interned strings are not kept as weak references; the actual approach is different. During the first part of the mark-and-sweep phase the GC delegates to the static string table (a specialization of Hashtable) to get rid of all non-alive entries. These entries are not deleted. Instead they are relinked from the hashtable bucket (the linked list they reside in) to the linked list of free entries. See the following call chain for details:
   BasicHashtable::free_entry(BasicHashtableEntry* entry) ←
   void Hashtable::unlink(BoolObjectClosure* is_alive) ←
   StringTable::unlink(BoolObjectClosure* cl)
One important observation here is that the memory taken by a freed entry is not deallocated. That means the more distinct strings the application interns, the more PermGen memory is consumed. Consequently, if the JVM string table is used too intensively, for example by attempting to cache too many distinct strings, it is easy to cause an OutOfMemoryError. On the other hand, when the cached strings are known to contain a large percentage of duplicates, interning combined with fine-tuning of PermGen may significantly reduce overall memory consumption.
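
To make the duplicate case concrete, here is a minimal sketch of what intern() does at the language level. The class name InternDemo, the helper fresh() and the sample string are made up for illustration; only String.intern() itself comes from the JDK. It shows nothing about the string table internals described above, only the visible effect: equal strings collapse to one canonical instance.

```java
public class InternDemo {
    // Hypothetical helper: builds a new String object at run time,
    // so the result is never the compile-time literal itself.
    public static String fresh(String s) {
        return new String(s.toCharArray());
    }

    public static void main(String[] args) {
        String a = fresh("order-status-NEW");
        String b = fresh("order-status-NEW");

        System.out.println(a == b);                   // false: two distinct objects
        System.out.println(a.equals(b));              // true: same characters
        System.out.println(a.intern() == b.intern()); // true: one canonical instance
    }
}
```

If the application keeps only the interned reference, the duplicate objects become ordinary garbage, which is where the memory saving comes from when many values repeat.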

Thursday, December 3, 2009

Tribute to C++

It's been a long time since I last did anything in C++. After many years with Java and C# I don't really feel like fiddling with tons of headers and source files without really good, fast navigation between types, methods, etc. One of the biggest advantages of Java/C# is that declaration and implementation are combined in one source file. That greatly simplifies navigation and refactoring (unless, of course, the application is designed so badly that it stops you from making any changes in a reliable fashion).
However, knowledge of C++ still proves extremely helpful, for instance when I need to clarify some details of how the JVM operates. Every time a question comes up for which there is no good answer readily available (e.g. does the JVM really apply any optimizations to final methods?), I would rather dig into the JVM source code than read many conflicting opinions on the question. After all, trying to work out who's right and who's wrong is just a waste of time. So usually the better option is to find out for yourself.

P.S. The JVM does apply optimizations to final methods. For example, see the Parse::optimize_inlining(), ciMethod::find_monomorphic_target() and methodOopDesc::can_be_statically_bound() methods (JDK6-6u18 sources). Whether these optimizations really boost your application's performance is best assessed with testing. That's the only reliable approach.
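
As a rough idea of what such a test might look like, here is a naive timing sketch. Everything in it (the class FinalCallDemo, the nested Plain and Sealed classes, the iteration counts) is hypothetical, and a hand-rolled loop like this is easily distorted by JIT warm-up and dead-code elimination, so treat any numbers it prints as a hint at best. A dedicated harness such as JMH would give more trustworthy results.

```java
public class FinalCallDemo {
    // Hypothetical classes: identical bodies, one method non-final, one final.
    static class Plain {
        int next(int x) { return x + 1; }
    }

    static class Sealed {
        final int next(int x) { return x + 1; }
    }

    public static long runPlain(int iterations) {
        Plain p = new Plain();
        long sum = 0;
        for (int i = 0; i < iterations; i++) {
            sum += p.next(i);
        }
        return sum;
    }

    public static long runSealed(int iterations) {
        Sealed s = new Sealed();
        long sum = 0;
        for (int i = 0; i < iterations; i++) {
            sum += s.next(i);
        }
        return sum;
    }

    public static void main(String[] args) {
        int n = 50000000;

        // Warm-up rounds so the JIT has a chance to compile both loops.
        for (int i = 0; i < 5; i++) {
            runPlain(n);
            runSealed(n);
        }

        long t0 = System.nanoTime();
        long r1 = runPlain(n);
        long t1 = System.nanoTime();
        long r2 = runSealed(n);
        long t2 = System.nanoTime();

        // Printing the sums keeps the results observable.
        System.out.println("non-final: " + (t1 - t0) / 1000000 + " ms, sum=" + r1);
        System.out.println("final:     " + (t2 - t1) / 1000000 + " ms, sum=" + r2);
    }
}
```

Measuring the real application under a realistic load remains the more meaningful test; a sketch like this only shows whether a difference is visible in isolation.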