The First Law of Optimization
There is an old joke: every man thinks that he is an expert in ***, sports,
and politics. Similarly, every programmer thinks that he or she is an expert
in optimization. Even those who just completed a 2-month crash course
in Java, have never studied CS, and have read more books by Donald Duck
than by Donald Knuth. Anyway, a lack of knowledge never
prevents us from discussing things we have no clue about: rocket science,
brain surgery, or micro-optimizations of code executed by the JVM (sure,
without ever learning the JVM's internal workings). By the way, what I'm
typing right now is no different in this regard 😉
Even the old grizzlies, the ones who know The Law, frequently go for
premature (micro-)optimizations. It itches sooooooo much! "Yeah, I'm an expert,
I know how things work, so I can predict that the bottleneck will be right here."
Nowadays applications are complex. Even if your application is just
several simple WebDynpro components, consider the complexity of the SAP WebAS
running the application, the complexity of the R/3 backend it accesses, and the
complexity of the database software used by R/3. So how can you safely
predict that precisely this micro part, like String concatenation or
preferring an array over java.util.ArrayList, affects overall
performance in any significant way? Do you have any mathematical theory behind
your claims? No? It is just evident?
Practice shows that such "evident" conclusions are wrong; worse,
following these conclusions leads to even more severe performance problems,
or makes fixing real performance problems later nearly
impossible due to ugly design decisions made for the sake of "pre-optimized
performance". In practice, real performance problems are far removed from
the predicted ones. And "practice", or "observation", or experiment is a
mandatory part of science. It's more "scientific"
and more real than any "self-evidence" coming from our speculations.
Actually, have you noticed that the majority of developers are more proud of
the micro-tricks they applied during application development than of the
complete application they created (the macro-level thing)? It is some psychological
phenomenon: we developers have an illusion of full control when we think that
we understand low-level things, while totally forgetting about the upper level,
the problem at large:
- An empty String in the JVM occupies 40 bytes of memory? Wow, that's cool to know!
This really helps me. This is more relevant for performance than
understanding how the generational garbage collector in a modern JVM works.
I'd rather cache all strings! 40 bytes! Who would have imagined!
- The Sun JVM has an undocumented class Unsafe that lets me alter the class of an
object at run-time? Amazing! Even better! Now I can fix all my class-cast errors!
It's more important than understanding how JVM class-loading really works.
Because class-loading always sucks. It's better to unpack all library JARs and
place the classes together in a single JAR, believe me. Or to place the JARs in
jre/lib/ext on the server.
- java.util.List.subList provides an updatable view for a range of elements?
I do not understand this; there is no value in such a feature.
Anyway, I optimized away all collection classes from my application
and replaced them with arrays. I have many tests, and all of them show
that arrays perform far better; I get a huge performance gain of approximately
3-4%, not to mention saving 100 bytes of memory.
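By the way, that "useless" subList deserves a second look: it is a live view, so structural changes write through to the backing list. A minimal sketch (the class and method names here are mine, just for illustration):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SubListDemo {
    // Removes the elements in [from, to) by clearing a subList view;
    // because subList is a view, the clear() writes through to the backing list.
    static List<Integer> removeRange(List<Integer> list, int from, int to) {
        list.subList(from, to).clear();
        return list;
    }

    public static void main(String[] args) {
        List<Integer> numbers = new ArrayList<>(Arrays.asList(1, 2, 3, 4, 5));
        System.out.println(removeRange(numbers, 1, 3)); // prints [1, 4, 5]
    }
}
```

Try doing that with a plain array in one line.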
I'd like to tell you several stories that… OK, the conclusion is up to you.
The Expensive Compiler Story
Notably, we try to find the lowest possible bit of code and then decide
in advance that precisely this is the root of all evil. The String class in Java is
an excellent target.
String concatenation is evil! Sure it is! It's evident! Let us
optimize concatenations, and our program will fly (in terms of speed) and
weigh as little as a fly (in terms of memory)!
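Before optimizing concatenation by hand, it is worth remembering what the compiler already does: since JDK 1.5, javac rewrites a `+` expression into a StringBuilder chain on its own (a sketch; the method names are my choosing):

```java
public class ConcatDemo {
    // What you write: javac compiles this single expression
    // into a StringBuilder chain behind your back.
    static String withPlus(String a, String b, int n) {
        return a + "-" + b + "-" + n;
    }

    // Roughly what the compiler emits for the expression above.
    static String withBuilder(String a, String b, int n) {
        return new StringBuilder().append(a).append('-').append(b)
                .append('-').append(n).toString();
    }

    public static void main(String[] args) {
        System.out.println(withPlus("foo", "bar", 42));    // prints foo-bar-42
        System.out.println(withBuilder("foo", "bar", 42)); // prints foo-bar-42
    }
}
```

Only repeated concatenation inside a loop creates a new builder per iteration; that is the one case where an explicit StringBuilder helps, and a profiler will tell you whether it even matters.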
In-memory Webster Dictionary
XML is a reality we live with. It's hard to say whether it is a standard,
a set of related standards, or a technology. Anyway, it is here,
and we use it, abuse it, and blame it periodically. So we need tools.
Probably the best-known tool for XML processing in Java is Jakarta
Xerces/Xalan. So the story is about Xerces.
As a side note, if you are not a Java-only developer, you probably
know that there are at least two versions of this XML parser/XSLT
processor duo: one developed in Java, the other in C++. Interestingly,
their first versions were developed almost at the same time. And
guess what? XalanJ outperforms the C++ version! Yes, this was partly
due to performance improvements in JDK 1.3, but the main reason is the
efficient algorithms used. That simple! I don't know for sure,
but it seems the Xalan C++ developers concentrated too much on
micro-optimizations rather than on algorithm research. You know:
overloaded new/delete operators, memory management, etc. It's "evident"
that such things may cause performance problems. And in fact they do,
if you pay all your attention to pre-optimizations.
But the XercesJ team created a "gem" of its own. As you know, parsing XML
yields a lot of String objects. It's evident that this "could
be" a problem. And problems should be fixed, better before they
ever appear. So the XercesJ team decided to intern() Strings:
there is only one cached copy of every String literal used as an
element or attribute name. And the numerous tests performed showed
a huge memory saving. Also, once a String is intern()-ed,
it is possible to compare strings with identity equality, i.e.
stringA == stringB, rather than value equality, stringA.equals(stringB)!
Yet another performance improvement! Compile, pack, upload, released!
If they were a commercial vendor, it would be time to open the champagne.
Who knows, maybe IBM in fact sent Dom Perignon to every developer…
Or should I say DOM Perignon? 😉
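The identity-equality trick itself is easy to see in isolation (a toy sketch, not Xerces code):

```java
public class InternDemo {
    public static void main(String[] args) {
        // Two equal-valued strings built at run time are distinct objects.
        String a = new String("element");
        String b = new String("element");

        System.out.println(a == b);                   // prints false: different objects
        System.out.println(a.equals(b));              // prints true: same value
        // intern() returns the one canonical copy, so identity comparison works.
        System.out.println(a.intern() == b.intern()); // prints true
    }
}
```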
But then developers started to use the Xerces library and
found that their applications ended with an OutOfMemoryError rather quickly.
And the reason was… Bingo! The calls to String.intern() that
Xerces used to "cache" Strings! The brilliant code that was supposed to
prevent memory consumption consumed all the memory on its own!
I will not explain the internal details of String.intern() here;
you can find enough information yourself. But in short, the JVM allocates
such Strings in a special memory area that has a very limited size and
rarely, if ever, reclaims this memory. So when the library is used as the
standard library of an application server, it very quickly fills up this
limited memory space with hundreds of thousands of unique names of XML
elements and attributes coming from the tens and hundreds of J2EE
applications running. Oh, it never actually reaches such numbers; the
error is thrown much sooner.
By the way, check the settings of your SAP WebAS with the ConfigTool.
You will quickly find the command-line argument for PermSize. It is defined
there to allocate a "permanent" memory space large enough to hold the
unique names (symbols) of classes, methods, fields, and string
constants coming from the hordes of classes loaded by SAP WebAS.
Xerces also uses this space. Actually, make that "Xerces used to",
because now Xerces has its own symbol cache in the "regular" heap.
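The principle of such a heap-based symbol cache is simple enough to sketch. This is not the actual Xerces SymbolTable, just the idea: canonicalize names in an ordinary map, which the garbage collector can reclaim together with the parser that owns it:

```java
import java.util.HashMap;
import java.util.Map;

public class SymbolCache {
    // Canonical copies live in the ordinary heap and die with this cache,
    // unlike intern()-ed strings, which the JVM may keep forever.
    private final Map<String, String> symbols = new HashMap<>();

    public String addSymbol(String name) {
        String canonical = symbols.get(name);
        if (canonical == null) {
            symbols.put(name, name);
            canonical = name;
        }
        return canonical;
    }

    public static void main(String[] args) {
        SymbolCache cache = new SymbolCache();
        String a = cache.addSymbol(new String("attribute"));
        String b = cache.addSymbol(new String("attribute"));
        System.out.println(a == b); // prints true: identity comparison still works
    }
}
```

You keep the identity-comparison trick and the single-copy memory saving, and you lose the "permanent" memory area time bomb.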
The Best Worst Library
What could be slower than Java reflection? Nothing, you say.
In JDK 1.1 there was a tens-of-times overhead for a reflective method invocation
compared to a direct invocation (up to 70x, if my memory serves me well).
In JDK 1.3 the overhead was reduced; 20x was the maximum observed value.
In JDK 1.4/1.5 it is only several times slower, 3-7x.
Hmmm… The final number is not so frightening anymore. So we get almost the
same performance as with direct invocation? Yes! Even if the speed of
reflective and direct access will never be exactly the same, the gap gets smaller
and smaller. The overhead is small enough to let "Scripting for the Java
Platform" emerge. The majority of scripting engines (Rhino, JRuby, Groovy)
use reflection. It simply has acceptable performance now.
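If you want to see the gap on your own JVM, a crude sketch like the following will do (no warm-up or JIT discipline, so treat the numbers as an illustration, not a benchmark; all names here are mine):

```java
import java.lang.reflect.Method;

public class ReflectionCost {
    public static int increment(int x) {
        return x + 1;
    }

    public static void main(String[] args) throws Exception {
        Method m = ReflectionCost.class.getMethod("increment", int.class);
        int n = 1_000_000;

        // Direct invocation.
        long t0 = System.nanoTime();
        int direct = 0;
        for (int i = 0; i < n; i++) direct = increment(direct);
        long t1 = System.nanoTime();

        // Reflective invocation of the same method.
        int reflective = 0;
        for (int i = 0; i < n; i++) reflective = (Integer) m.invoke(null, reflective);
        long t2 = System.nanoTime();

        // Same result either way; the ratio depends heavily on the JVM version.
        System.out.println("direct: " + (t1 - t0) + " ns, reflective: " + (t2 - t1) + " ns");
    }
}
```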
Do you remember yet another popular library, from a different domain,
that used reflection? Hint: object-to-relational mapping. Hint: it still
exists and is very popular.
The answer: it is Hibernate. When Gavin King started his project, he
clearly defined his goals. Besides supporting a wide range of
O/R scenarios (I mean options like projecting an inheritance hierarchy onto one
or several tables), and besides minimizing dependencies between custom applications
and his library, he had maximizing database access performance as the main one.
What? Isn't that a premature optimization? No. Because it is the raison d'être
of the library itself. Highly optimized generated SQL queries that
allow either eager or lazy loading, no N+1 read problems (if you developed with EJB 1.1,
you know what I'm talking about), full utilization of database-specific
SQL extensions. That is what the library is developed for.
On the other hand, the initial version of Hibernate didn't use bytecode mangling
(as in JDO) to "weave" Hibernate-specific persistence interfaces into
business objects. The generation of classes on-the-fly with CGLIB came
later. The first version used… yes, reflection, to create the necessary proxies.
See, Hibernate code sits between user input and database access
(I mean JDBC -> TCP/IP -> DB). The latter is very slow. Worse, if your
SQL is not optimized, it's terribly slow: orders of magnitude
slower than any reflection. So the time that really matters is database
access time. All the rest are just details that can be improved in the
future, if someone actually uses your library or product.
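The technique itself is easy to sketch with java.lang.reflect.Proxy: a dynamic proxy that defers the expensive "load" until the first method call, then dispatches via plain reflection. This is not Hibernate code; the Entity interface and the loader here are hypothetical:

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.function.Supplier;

public class LazyProxy {
    // Hypothetical persistent interface; real Hibernate proxies your own entity classes.
    public interface Entity {
        String getName();
    }

    // Wraps an expensive loader; the simulated database hit happens on the first call.
    public static Entity lazy(Supplier<Entity> loader) {
        InvocationHandler handler = new InvocationHandler() {
            private Entity target; // loaded on demand
            @Override
            public synchronized Object invoke(Object proxy, Method method, Object[] args)
                    throws Exception {
                if (target == null) target = loader.get(); // the "database round-trip"
                return method.invoke(target, args);        // plain reflection, as in early Hibernate
            }
        };
        return (Entity) Proxy.newProxyInstance(
                Entity.class.getClassLoader(), new Class<?>[] { Entity.class }, handler);
    }

    public static void main(String[] args) {
        Entity e = lazy(() -> () -> "loaded-on-demand");
        System.out.println(e.getName()); // prints loaded-on-demand
    }
}
```

Whatever the per-call reflection overhead is, it is noise next to the JDBC round-trip the proxy is postponing.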
Whether the decision was right, the conclusion is up to you; but
Hibernate is still here, continuously evolving, and plans to stay for long.
Don’t do it!
- Do not optimize
- Do not optimize if it makes the application architecture smell: it
is always simpler to later optimize parts of a well-thought-out application
with clean separation of concerns than monolithic, prematurely optimized
crap created on false assumptions.
- Do not optimize parts while developing: this is a) useless and b) unnecessary,
until profiling of the complete product proves otherwise, i.e. shows which
parts actually matter.
- Do not optimize algorithms: unless your product is a compiler, a
compression library, an XSLT processor, or anything else heavily
loaded with mathematics, only profiling of the complete application shows which
algorithm must be optimized. Better keep them readable and maintainable.
- Do not optimize data structures: unless you are programming micro-devices
with draconian memory restrictions, only profiling of the complete
application shows what causes large memory consumption. Better keep
them convenient: you agree that String is a more convenient structure
than a character array, don't you? Then use collection classes, at least
for objects; they are far more flexible than arrays.
- Do not optimize object construction: the JVM manages short-lived
objects better than immortal singletons or instances handed out by dumb
object pools.
- Do not optimize method invocation: in Java, do not put final on
every method. The JIT knows better than you what is effectively final, based
on run-time statistics, and behaves accordingly (the JIT watches method overrides).
- Do not alter class design for the sake of optimization: yes, invoking an
interface method in Java requires more bytecode instructions, but you
may say "it is slow" only after profiling; yes, delegating calls costs
more instructions than direct execution, but do not prefer inheritance
over composition merely as a premature optimization. Be aware that you will
face the need to refactor or re-compose your objects earlier and
more frequently than the need to optimize code. If ever.
And interfaces and composition are what enable an application to adapt to
new requirements, to survive specification changes, and to evolve from
version to version.
- Do not listen to anyone's advice about optimization: listen only to the
profiler. As an immediate follow-up, do not listen to my advice.
- Have you anything to add?