In the earliest versions of Java, interned strings were not collected at all. They accumulated in PermGen, so it was quite possible to run into an OutOfMemoryError (OOME) very quickly by abusing the intern() call. Current JVM versions maintain the string cache in a smarter way.
Contrary to the common claim that interned strings are kept as weak references, the actual approach is different. During the first part of the mark-and-sweep phase, the GC delegates to the static string table (a specialization of Hashtable) to get rid of all entries that are no longer alive. These entries are not deleted; instead they are relinked from the hashtable bucket (the linked list they reside in) to the linked list of free entries (see the BasicHashtable::free_entry(BasicHashtableEntry* entry) ← void Hashtable::unlink(BoolObjectClosure* is_alive) ← StringTable::unlink(BoolObjectClosure* cl) call chain for details).
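The relinking described above can be illustrated with a toy model in Java. This is only a sketch, not the HotSpot code: the class FreeListTableSketch, its fields, and the fixed bucket count are made up for illustration, and reusing free entries for later insertions is an assumption of this model. It shows that unlinking moves dead entries onto a free list, so the number of allocated entries never shrinks.

```java
import java.util.function.Predicate;

// Toy model of a hashtable whose dead entries are moved to a free list
// instead of being deallocated. All names here are hypothetical.
public class FreeListTableSketch {
    // One node of a bucket chain; the same node type is reused on the free list.
    static final class Entry {
        String value;
        Entry next;
    }

    private final Entry[] buckets = new Entry[16];
    private Entry freeList;   // unlinked entries waiting to be reused
    private int allocated;    // entries ever allocated; never decreases

    private int indexFor(String s) {
        return (s.hashCode() & 0x7fffffff) % buckets.length;
    }

    // Returns the canonical copy of s, adding it if it is not present yet.
    public String intern(String s) {
        int i = indexFor(s);
        for (Entry e = buckets[i]; e != null; e = e.next) {
            if (e.value.equals(s)) {
                return e.value;
            }
        }
        Entry e = freeList;
        if (e != null) {
            freeList = e.next;   // reuse a previously freed entry
        } else {
            e = new Entry();     // no free entry available: grow the pool
            allocated++;
        }
        e.value = s;
        e.next = buckets[i];
        buckets[i] = e;
        return s;
    }

    // Moves every entry rejected by isAlive from its bucket chain to the
    // free list and returns how many entries were moved.
    public int unlink(Predicate<String> isAlive) {
        int moved = 0;
        for (int i = 0; i < buckets.length; i++) {
            Entry prev = null;
            Entry e = buckets[i];
            while (e != null) {
                Entry next = e.next;
                if (isAlive.test(e.value)) {
                    prev = e;
                } else {
                    if (prev == null) {
                        buckets[i] = next;
                    } else {
                        prev.next = next;
                    }
                    e.value = null;      // drop the string, keep the entry
                    e.next = freeList;
                    freeList = e;
                    moved++;
                }
                e = next;
            }
        }
        return moved;
    }

    public int allocatedEntries() {
        return allocated;
    }
}
```

In this model, interning three strings and then unlinking two of them leaves allocatedEntries() at 3: the entries are still there, just parked on the free list until a later intern call picks them up.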
One important observation here is that the memory taken by a freed entry is not deallocated. That means the more distinct strings the application interns, the more PermGen memory is consumed. Correspondingly, if the JVM string table is used too intensively, for example by attempting to cache too many distinct strings, it is easy to cause an OOME. On the other hand, when the cached strings are known to contain a large percentage of duplicates, interning combined with fine-tuning of PermGen may significantly reduce overall memory consumption.
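To make the duplicate-heavy case concrete, here is a small sketch that counts how many distinct String objects survive with and without interning. The class InternDuplicatesSketch, the "city-" values, and the counts are hypothetical and chosen only for illustration. It measures object identity, not actual heap or PermGen usage.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;

// Illustrative only: compares the number of distinct String instances
// held by a list when the values are interned versus when they are not.
public class InternDuplicatesSketch {
    // Counts distinct String objects by identity (==), not by equals().
    static int distinctInstances(List<String> strings) {
        IdentityHashMap<String, Boolean> seen = new IdentityHashMap<>();
        for (String s : strings) {
            seen.put(s, Boolean.TRUE);
        }
        return seen.size();
    }

    // Builds `count` strings drawn from only `distinct` different values.
    static List<String> load(int count, int distinct, boolean intern) {
        List<String> out = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            // Runtime concatenation creates a new String object on every pass.
            String s = "city-" + (i % distinct);
            out.add(intern ? s.intern() : s);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println("without intern: " + distinctInstances(load(10_000, 10, false)));
        System.out.println("with intern:    " + distinctInstances(load(10_000, 10, true)));
    }
}
```

With only 10 distinct values among 10,000 strings, the interned list holds 10 instances instead of 10,000. If nearly every value were distinct, interning would save nothing and would only add entries to the string table.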