CORBA style was: [omniORB] CORBA::string_alloc(len) problem

David.Chung@USPTO.GOV David.Chung@USPTO.GOV
Thu, 15 Jul 1999 20:54:23 -0400


	For prototyping my application, I had to choose between Java and C++.
I had to weigh the simplicity and elegance of Java against the efficiency of C++.

	In the end, I chose C++.  The reason is that, despite Java's clean
structure, it is too slow and "heavy."  I have no doubt that one day we will
see JVMs that match C++ performance (and when that day comes, I will probably
port my code).  But fast and efficient code needs to be written today.

	My complaints are not about some Java APIs that one can casually
rewrite -- the problematic classes really go to the core of the JVM.
Rewriting these APIs means one has to dig deep into the JVM's innards and
worry about JNI and native code -- but then, why am I using Java?
	
	If Java were implemented right, I would love to
use Java.  But until then, well, I must put up with C++ syntax and
C++ CORBA.

---------------------------------------------------------------------------------

	Here is my basic list of gripes against Java:

	(1) Java's serialization mechanism.  If you try to store and restore
a large number of objects that are UNIQUE within the same address space, you
will realize that Java's serialization mechanism is not usable for scalable
operations.  This is true despite various optimizations -- SuperCede,
Symantec, or other vendors' JITs and native compilers.  The latest HotSpot
optimizer (by JavaSoft), of course, does not address the issues that go to
the core of Java's problems.  The best optimizations, by TowerJ or
Instantiations, fall short of what is needed.

	This is terrible, because Java relies heavily on streams for most of
its I/O.  But I/O is the efficiency bottleneck in almost all server
applications.  What is one to do -- reimplement most of Java's I/O in native
code?
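
	The pattern behind gripe (1) can be shown in miniature.  The sketch
below (class name SerializeMany is mine, not a core API) writes many unique
objects through one ObjectOutputStream; the stream remembers every object it
has written in order to preserve shared references, so its back-reference
table grows with the object count, and reset() is the only way to bound it:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;

// Sketch of the pattern criticized above: serializing a large number of
// unique objects through one ObjectOutputStream.  The stream's back-reference
// table grows with every object written; reset() clears it, at the cost of
// losing reference sharing across reset points.
public class SerializeMany {
    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        ObjectOutputStream out = new ObjectOutputStream(bytes);
        for (int i = 0; i < 10000; i++) {
            out.writeObject(new int[] { i });   // each array is a distinct object
            if (i % 1000 == 0) {
                out.reset();                    // bound the back-reference table
            }
        }
        out.close();
        System.out.println("wrote " + bytes.size() + " bytes for 10000 objects");
    }
}
```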

	(2) Java's basic global storage structures, such as Hashtable,
StringBuffer, and other types of "maps."  For server applications, Java's
Hashtable implementation is simply intolerable -- if you look at its locking
protocol, you will see that the relevant APIs lock the whole table instead
of locking individual hash buckets.  This forces all threads that want to
read or write the hashtable to be serialized.  Concurrency cannot be
achieved if one uses the core Java hashtable.  Java servers that use the
core hashtable (as shared memory) cannot be scalable.
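
	The whole-table locking is easy to demonstrate: every Hashtable
method is synchronized on the table object itself, so holding that one
monitor stalls every other thread, even a pure reader.  A minimal sketch
(class name TableLockDemo is mine):

```java
import java.util.Hashtable;

// Demonstrates that Hashtable's methods all synchronize on the table object:
// while one thread holds the table's monitor, even a read-only get() in
// another thread must wait.
public class TableLockDemo {
    public static void main(String[] args) throws InterruptedException {
        final Hashtable table = new Hashtable();
        table.put("key", "value");

        Thread reader = new Thread(new Runnable() {
            public void run() {
                table.get("key");   // blocks until main releases the monitor
                System.out.println("reader finished");
            }
        });

        synchronized (table) {      // same monitor Hashtable's methods take
            reader.start();
            Thread.sleep(200);      // give the reader time to block in get()
            System.out.println("reader still blocked: " + reader.isAlive());
        }
        reader.join();
    }
}
```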

	StringBuffer is synchronized, even though you might not use it for
shared memory.  This basically means StringBuffer operations are slow.  If
you want speed, well, you might as well reimplement these objects in native
code.  This is easy to test and compare against C++.  The same holds true
for the other buffers used by streams -- again, one must rewrite the stream
classes.
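
	The unsynchronized replacement need not even be native code to beat
StringBuffer's per-call monitor acquisition.  A pure-Java sketch of such a
buffer (FastBuffer is my name, not a core API) produces the same result
without taking a single lock:

```java
// A minimal unsynchronized append buffer of the kind the text suggests:
// StringBuffer acquires its monitor on every call; this class takes no locks
// at all, so it must not be shared between threads.
final class FastBuffer {
    private char[] buf = new char[16];
    private int len;

    void append(String s) {
        int n = s.length();
        if (len + n > buf.length) {                       // grow geometrically
            char[] bigger = new char[Math.max(buf.length * 2, len + n)];
            System.arraycopy(buf, 0, bigger, 0, len);
            buf = bigger;
        }
        s.getChars(0, n, buf, len);                       // copy without locking
        len += n;
    }

    public String toString() {
        return new String(buf, 0, len);
    }
}

public class BufferDemo {
    public static void main(String[] args) {
        FastBuffer fast = new FastBuffer();
        StringBuffer slow = new StringBuffer();
        for (int i = 0; i < 1000; i++) {
            fast.append("x");
            slow.append("x");
        }
        // Same contents, no synchronization on the FastBuffer path.
        System.out.println(fast.toString().equals(slow.toString()));
    }
}
```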

	Other "maps" are problematic too, with regard to performance.

	(3) No shared locking in Java.  Servers need asynchronous I/O and
concurrent reads, and in a server environment a shared locking mechanism
facilitates concurrency.  Java's monitor only locks in exclusive mode.
Again, one must write native code to implement shared locks.  And even if
one does write shared locks in native code, accessing them is much slower
than C++ shared locks, because you must use JNI to interface them to Java.
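
	One can of course build a shared/exclusive lock in pure Java out of
the exclusive monitors the language does provide.  A minimal sketch follows
(SharedLock is my name; no fairness, writers can starve) -- the complaint
stands that this is hand-rolled machinery the core language should supply:

```java
// A minimal shared (reader-writer) lock built from Java's exclusive monitors.
// Sketch only: no fairness policy, so a steady stream of readers can starve
// a waiting writer.
class SharedLock {
    private int readers;      // threads holding the lock in shared mode
    private boolean writer;   // true while one thread holds it exclusively

    synchronized void lockShared() throws InterruptedException {
        while (writer) wait();
        readers++;
    }
    synchronized void unlockShared() {
        if (--readers == 0) notifyAll();
    }
    synchronized void lockExclusive() throws InterruptedException {
        while (writer || readers > 0) wait();
        writer = true;
    }
    synchronized void unlockExclusive() {
        writer = false;
        notifyAll();
    }
}

public class SharedLockDemo {
    public static void main(String[] args) throws InterruptedException {
        SharedLock lock = new SharedLock();
        lock.lockShared();
        lock.lockShared();        // a second reader enters without blocking
        lock.unlockShared();
        lock.unlockShared();
        lock.lockExclusive();     // lock is free, so the writer acquires it
        lock.unlockExclusive();
        System.out.println("shared then exclusive: ok");
    }
}
```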

	(4) Garbage collector -- one is forced to rely on this for ALL Java
objects.  I wish I could choose which objects are taken care of by the
garbage collector, while forcing other objects to free memory.  This is bad
because during moments of heavy load, a server application must be able to
free the memory allocated for objects that are causing performance
degradation.

	(5) The JVM places "everything" on a heap.  This means I need to
preallocate all objects to optimize performance.  The result is huge memory
consumption.
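
	The preallocation strategy usually takes the shape of an object pool.
A minimal sketch (BufferPool is my name, not a core API): a fixed set of
buffers is allocated up front and recycled, so steady-state operation creates
no garbage -- at the cost of holding all that memory forever:

```java
import java.util.ArrayList;

// A minimal object pool: preallocate buffers and recycle them to keep the
// garbage collector out of the steady-state path, at the cost of permanently
// reserving the memory.
class BufferPool {
    private final ArrayList free = new ArrayList();
    private final int size;

    BufferPool(int count, int size) {
        this.size = size;
        for (int i = 0; i < count; i++) free.add(new byte[size]);
    }
    synchronized byte[] acquire() {
        if (free.isEmpty()) return new byte[size];   // pool exhausted: fall back
        return (byte[]) free.remove(free.size() - 1);
    }
    synchronized void release(byte[] buf) {
        free.add(buf);
    }
}

public class PoolDemo {
    public static void main(String[] args) {
        BufferPool pool = new BufferPool(2, 1024);
        byte[] a = pool.acquire();
        pool.release(a);
        byte[] b = pool.acquire();
        // The released buffer is handed back out instead of a fresh allocation.
        System.out.println("recycled: " + (a == b));
    }
}
```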

	(6) Java Native Interface.  With Java, one simply cannot write a
decently implemented TRANSACTIONAL lock, even using JNI.  A lock should be
capable of locking and unlocking on the order of one or two microseconds
(assuming a 200 MHz Pentium, NT, and the Symantec JIT).  With Java, the
best the lock can do is about 10 microseconds -- using JNI.  This is despite
optimizing path lengths, reducing all initializations to a minimum, and
breaking all the rules of object-oriented programming.

	Btw, writing JNI is like pulling teeth. 

	(7) Java is not really that portable.  Microsoft's scornful
comment "write once, debug everywhere" has a grain of truth.

	(8) Java's synchronization mechanism, I found, is buggy on NT.  I
tried to point this out to JavaSoft, but they simply would not listen.  You
can test the bugginess of Java's threads by (a) implementing locks using
their monitor, (b) creating several threads to compete for the locks, and
(c) running the code for a couple of days.  You will get a deadlock for
sure.  The bug has always been reproducible for me.
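
	The skeleton of that stress test looks like the following (class name
MonitorStress is mine; the iteration count is cut down so the sketch
terminates quickly, whereas the claimed deadlock took multi-day runs on an
NT-era JVM to surface):

```java
// Skeleton of the stress test described in (a)-(c): several threads compete
// for one monitor-based lock.  On a correct JVM this always completes; the
// claim above is that sufficiently long runs on NT-era JVMs wedged.
public class MonitorStress {
    static final Object lock = new Object();
    static int counter;

    public static void main(String[] args) throws InterruptedException {
        Thread[] threads = new Thread[4];
        for (int t = 0; t < threads.length; t++) {
            threads[t] = new Thread(new Runnable() {
                public void run() {
                    for (int i = 0; i < 10000; i++) {
                        synchronized (lock) {   // (a) lock via Java's monitor
                            counter++;          // (b) all threads compete here
                        }
                    }
                }
            });
            threads[t].start();                 // (c) ...then let it run
        }
        for (int t = 0; t < threads.length; t++) threads[t].join();
        System.out.println("counter = " + counter);  // 40000 if nothing wedged
    }
}
```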
	
	The deadlock never happened when I used Microsoft's
CRITICAL_SECTION or my own JNI-interfaced mutexes (written in assembly).
Given Microsoft's bad rep, I found this quite ironic.

	This type of bug is very, very bad.  The preceding setup to produce
a deadlock will rarely occur naturally on real servers; one will not often
encounter situations in which threads fight so competitively to acquire a
lock for such a long time.  Thus, the deadlock will not occur unless one is
very unlucky.  This means that if you wrote your code in Java, trusting
JavaSoft's JVM and thinking everything was all right, you might get a
deadlock once every few months or once a year due to the bug (under high
stress).  You can pretty much kiss your chances of finding the bug goodbye
(unless you already knew where to look for it).  This type of bug simply
will not be reproducible in real situations.

	This was as of June last year, so JavaSoft may have fixed the
problem since.  (I was using Java 1.1.6.)  JavaSoft eventually made obsolete
(deprecated, as they say) a number of thread-related APIs.  I doubt that the
bugs have gone away.  I did not test JDK 2.0; I have had enough (for now) of
Java bug testing.

	(9) Cleaning up Java's hung sockets is a pain in the neck.  If you
like pain, try using Java sockets, threads, and their synchronization
mechanism such that the threads are competing for resources and locks.
Even in simple programs, you can easily create bugs that appear only once
in a blue moon.  The problem here is that the streams are blocking; the
threads interact badly with them.
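
	The root of the problem is that read() on a socket stream blocks with
no timeout, and the usual escape hatch is to close the socket from another
thread, which aborts the blocked read.  A self-contained sketch (class name
UnblockRead is mine; the port and timing are illustrative only):

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;

// Sketch of the hung-socket cleanup: read() blocks forever because the peer
// never writes, and a second thread closes the socket to force the read to
// abort -- the blocking-stream/thread interaction complained about above.
public class UnblockRead {
    public static void main(String[] args) throws Exception {
        ServerSocket server = new ServerSocket(0);   // any free port
        final Socket client = new Socket("localhost", server.getLocalPort());
        Socket peer = server.accept();

        Thread closer = new Thread(new Runnable() {
            public void run() {
                try {
                    Thread.sleep(200);
                    client.close();   // yank the socket out from under read()
                } catch (Exception e) {
                }
            }
        });
        closer.start();

        try {
            client.getInputStream().read();   // blocks: the peer never writes
        } catch (IOException e) {
            // close() from the other thread typically lands here
        }
        System.out.println("read unblocked");

        closer.join();
        peer.close();
        server.close();
    }
}
```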

	(10) RMI is pretty slow.  (Even though it is very very elegant).

	The performance of the Java APIs related to graphics and RMI is not
as important as the performance of the APIs complained about above.  A Java
GUI is generally needed for writing client programs (for which performance
is not an issue), and RMI is used over the network (which tends to be the
performance bottleneck anyway).

	So, here is the question: which is worse?  Java's problematic APIs,
or C++'s inscrutable syntax and its related issues (which probably would
never come up if the C++ syntax were clean, like LISP or Java)?

	For me, I think C++ and CORBA are clearly better, despite all their
problems.  At least I can trace my bugs.  With Java, I need to (1) delve
into a huge amount of JavaSoft's source code and then (2) worry about
JavaSoft's licensing scheme.  Remember, if I distribute products based on a
modified form of JavaSoft's JVM, I need to pay a licensing fee.  Rephrased:
if I debug JavaSoft's JVM and recompile it in order to distribute it along
with my other programs written in Java, I need to pay royalties to JavaSoft.

	I do not enjoy debugging others' code and paying them at the same
time.
	

> -----Original Message-----
> From:	mdlandry@lincoln.midcoast.com [SMTP:mdlandry@lincoln.midcoast.com]
> Sent:	Thursday, July 15, 1999 5:57 PM
> To:	omniorb-list@uk.research.att.com
> Subject:	CORBA style was: [omniORB] CORBA::string_alloc(len) problem
> 
> Is there any wonder why developers swear off C++ and CORBA for more
> accessible technologies like Java and VB? You'd need a friggin' master's
> degree in CS to understand this discussion. I've got one and it still
> doesn't make much sense. What ever happened to the art of writing code
> that
> others can read (and compilers can translate)?
> 
> Sorry, I know this is off topic for this group but unless CORBA becomes
> more
> usable by the typical business programmer it will fade into the sunset
> like
> so many other good technologies. One look at a message like this and most
> programmers' eyes glaze over. This may explain why we see so many
> unsubscribe messages sent to the list.;-)
> 
> (Bruce, no offense intended.)
> 
> Bruce Visscher wrote:
> 
> > Just a nit.
> >
> > Stephen Coy wrote:
> > >
> > > You probably need to do something like:
> > >
> > >     char * myStr = CORBA::string_alloc(100);
> > >     String_var myStrVar = myStr;
> >
> > I assume you meant CORBA::String_var.
> >
> > Please don't do this.  This is copy initialization.  According to the
> > ISO C++ standard this is defined to create a temporary using the
> > converting constructor, CORBA::String_var::String_var(char*), then to
> > invoke the copy constructor from the temporary and finally to invoke the
> > String_var destructor on the temporary.  This winds up doing a strlen on
> > uninitialized memory in the copy constructor.
> >
> > Compilers are allowed to optimize away the temporary.  Most will.
> > However, the DEC C++ compiler on the VAX won't (it always does on Alpha
> > VMS;  I assume it does on Tru64 Unix).  Therefore, you will only have
> > problems on certain platforms (maybe just the VAX?).  But if you really
> > want to be portable you should use direct initialization syntax:
> >
> >         CORBA::String_var myStrVar(myStr);
> >
> > I realize there are even examples using copy initialization syntax like
> > the above in the CORBA standard.  IMHO, the (CORBA) standard is wrong.
> >
> > Bruce Visscher
> 
> 
>