This paper is subjective, and as such, is not meant to be a serious technical discussion of the various merits of these languages. It is my own personal opinion, and we all know what opinions are like.
The purpose of this paper is to provide a perspective for people planning to start their own projects in one or the other of these languages. As such, if you are reading this and have a different paper that you'd like to submit as a counterexample, I will gladly link to it.
Test | Java | Python | Comparison |
---|---|---|---|
Standard Output | 138.85 | 30.58 | Python 4.5X Faster than Java |
Hashtable | 17.0 | 8.22 | Python 2X Faster than Java |
I/O | 56.72 | 47.36 | Python 1.2X Faster than Java |
List | 5.94 | 14.32 | Java 2.4X Faster than Python |
Native Methods | 2.475 | 7.92 | Java 3.2X Faster than Python |
Interpreter Initialisation | 0.25 | 0.04 | Python 6.3X Faster than Java |
Object Allocation | 23.65 | 211.11 | Java 8.9X Faster than Python |
Interpreter Speed | 0.43 | 2.29 | Java 5.3X Faster than Python |
Again, I must stress that these are approximate times, which are specific to my computer, my Java version (JDK 1.1.7B, blackdown), my operating system (Debian GNU Linux 2.2), my python version (Python 1.5.2) and my idiosyncratic test method, which is not strenuous at all. I repeated the tests 3 times and averaged the results to get the numbers you see here. Sources to the tests are available here in .tar.gz format, so you can perform them yourself.
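
The "repeat three times and average" procedure described above can be sketched as follows. This is not the harness from the tarball; it is a minimal illustration in present-day Python 3 (not the 1.5.2 used for the measurements), and the function name `average_runtime` and the choice to discard the child's output are assumptions made for the example.

```python
import subprocess
import sys
import time


def average_runtime(command, repeats=3):
    """Run `command` `repeats` times and return the mean wall-clock seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        # Assumption: output is discarded so the terminal does not dominate the timing.
        subprocess.run(command, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL, check=True)
        timings.append(time.perf_counter() - start)
    return sum(timings) / len(timings)


if __name__ == "__main__":
    # Hypothetical usage: time an interpreter that starts up and does nothing.
    print(average_runtime([sys.executable, "-c", "pass"]))
```

Passing something like `["java", "NoTest"]` instead would time the Java side the same way, assuming a `java` launcher is on the path and the class has been compiled.
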
The overall conclusion one may draw from these numbers is that Java is generally faster. If it's that simple a conclusion you're looking for, though, I recommend using C++ as your language -- neither Java nor Python can hold a candle to a true systems-level language. (I originally thought of including C or C++ comparisons here, but the difference is truly laughable, especially with compiler optimisations.)
ConsoleTest | |
---|---|
Python | Java |
for x in xrange(1000000): print x |
public class ConsoleTest { public static void main(String[] args) { for (int i = 0; i < 1000000; i++) { System.out.println(i); } } } |
The console test was impressive to me: Java's performance was
astonishingly bad. Combined with the long initialisation time, this
makes Java a completely unsuitable language for streams-based
programming.
It might be worth noting here that System.out actually writes to stderr, which is highly confusing (and broken, IMHO) behaviour. Java was definitely not designed with shell scripting or piping in mind. |
Hashtest | |
---|---|
Python | Java |
for i in xrange(1000): x={} for j in xrange(1000): x[j]=i x[j] |
import java.util.Hashtable; public class HashTest { public static void main(String[] args) { for (int i = 0; i < 1000; i++) { Hashtable x = new Hashtable(); for (int j = 0; j < 1000; j++) { x.put(new Integer(i), new Integer(j)); x.get(new Integer(i)); } } } } |
Here is one of the many places that python benefits from a standard data structure being in C; the "{}" Hashtable is one of the primary reasons I decided to move Twisted Reality to python. |
IOTest | |
---|---|
Python | Java |
f=open('scratch','wb') for i in xrange(1000000): f.write(str(i)) f.close() |
import java.io.*; public class IOTest { public static void main(String[] args) { try { File f = new File("scratch"); PrintWriter ps = new PrintWriter(new OutputStreamWriter (new FileOutputStream(f))); for (int i = 0; i < 1000000; i++) { ps.print(String.valueOf(i)); } ps.close(); } catch(IOException ioe) { ioe.printStackTrace(); } } } |
Python's I/O is also marginally faster than Java's -- considering that python's interpreter is vastly slower, this is really impressive. To be fair, this is a super-naive implementation of file access in Java. It's stream-based, and not buffered. However, the point of this exercise was mostly to test the most 'natural' way of doing things in each language. |
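
The remark about buffering can be explored from the python side as well. The sketch below is written for present-day Python 3 rather than 1.5.2; `write_numbers` and `compare` are hypothetical helper names, and it writes to a temporary file instead of 'scratch'. It runs the same write loop once with buffering switched off and once with the default buffered writer, and returns both timings without assuming which one wins on a given machine.

```python
import os
import tempfile
import time


def write_numbers(path, count, buffering):
    """Write the decimal digits of 0..count-1 to `path`; return elapsed seconds."""
    start = time.perf_counter()
    # buffering=0 sends every write() straight to the OS (binary mode only);
    # buffering=-1 asks for the default buffered writer.
    with open(path, "wb", buffering=buffering) as f:
        for i in range(count):
            f.write(str(i).encode("ascii"))
    return time.perf_counter() - start


def compare(count=100000):
    """Return (unbuffered_seconds, buffered_seconds) for the same workload."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        unbuffered = write_numbers(path, count, buffering=0)
        buffered = write_numbers(path, count, buffering=-1)
    finally:
        os.remove(path)
    return unbuffered, buffered
```

Calling `compare()` a few times and averaging, as with the other tests, gives a rough idea of how much of an I/O benchmark is really a buffering benchmark.
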
ListTest | |
---|---|
Python | Java |
for i in xrange(1000): v=['a','b','c','d','e','f','g'] for j in xrange(1000): v.append(j) v[j] |
import java.util.Vector; public class ListTest { public static void main(String[] args) { for (int i = 0; i < 1000; i++) { Vector v = new Vector(); v.addElement("a"); v.addElement("b"); v.addElement("c"); v.addElement("d"); v.addElement("e"); v.addElement("f"); v.addElement("g"); for (int j = 0; j < 1000; j++) { v.addElement(new Integer(j)); v.elementAt(j); } } } } |
Java's list syntax is hideous. It turns out (to my surprise) that Vector performs better than the standard [] list in python... or at least, python's list doesn't outperform Vector by enough to overcome the interpreter speed difference. |

NativeTest | |
---|---|
Python | Java |
from pynative import * for i in xrange(1000000): hello() |
public class NativeTest { public native void nativeMethod(); static { System.loadLibrary("javanative"); } public static void main(String[] args) { NativeTest nt = new NativeTest(); for (int i = 0; i < 1000000; i++) { nt.nativeMethod(); } } } |
Python C Module | Java C Module | #include "Python.h" static PyObject* pynative_hello(self,args) PyObject *self; PyObject *args; { printf("Hello, world!\n"); Py_INCREF(Py_None); return Py_None; } static PyMethodDef NativeMethods[] = { {"hello", pynative_hello, METH_VARARGS}, {NULL, NULL}, /* Sentinel... what's this? */ }; void initpynative() { (void) Py_InitModule("pynative", NativeMethods); } |
--- Autogenerated NativeTest.h --- /* DO NOT EDIT THIS FILE - it is machine generated */ #include |
Python's native interface requires no header-file-generation phase. One can design native *objects*, not merely native code. Furthermore, native modules are introspectable, and require no python code to bind them. In short: the python native interface is vastly superior to Java's. Not only that, but the function-call overhead is lower (it's still slower than Java, but the interpreter speed difference is more than made up for). |
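
To illustrate what "introspectable" means here, the following sketch lists the functions a C-implemented module exposes, with no binding code written in python. The `pynative` module from the test is not assumed to be available, so the standard `math` module (implemented in C in CPython) stands in for it; `describe_native_module` is a hypothetical helper name, and the code targets present-day Python 3.

```python
import inspect
import math  # in CPython, `math` is implemented in C


def describe_native_module(module):
    """Return the sorted public names in `module` that are C-level builtins."""
    return sorted(
        name
        for name, value in vars(module).items()
        if not name.startswith("_") and inspect.isbuiltin(value)
    )


names = describe_native_module(math)
print(names[:5])            # a few of the native functions found
print(math.sqrt.__doc__)    # docstrings are supplied by the C code itself
```

The same helper should work on any compiled extension module once it is imported, since the interpreter treats native and pure-python modules alike for attribute lookup.
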
NoTest | |
---|---|
Python | Java |
| | public class NoTest { public static void main(String[] args){} } |
This, I think, is an interesting commentary on the design of both
languages. First of all, in order to do nothing in python, you write
exactly that -- nothing. In order to successfully start up and do
nothing in Java, you have to have a class definition with the correct
name, with a main method...
Aside from the philosophical ramifications of this code, there is the very practical issue of Java's large initialisation overhead. It takes too long to start a virtual machine to do anything like CGI or shell-scripting with Java. It seems to me that this is a pattern throughout the language: while it is designed to scale up, it seems poorly suited to scale *down* to anything smaller than a large application. I don't understand why they market it as a product for set-top boxes. |
ObjectTest | |
---|---|
Python | Java |
class ObjectTest: pass for i in xrange(1000): root=ObjectTest() for j in xrange(10000): root.next=ObjectTest() root=root.next |
public class ObjectTest { public ObjectTest next; public static void main(String[] args) { for (int i = 0; i < 1000; i++) { ObjectTest root = new ObjectTest(); for (int j = 0; j < 10000; j++) { root.next=new ObjectTest(); root=root.next; } } } } |
This test really surprised me. There's little I can say about it except that python needs to improve -- the allocation time on a completely empty class definition is unacceptably long. |
SpeedTest | |
---|---|
Python | Java |
for x in xrange(1000000): pass |
public class SpeedTest { public static void main(String[] args) { for (int i = 0; i < 1000000; i++); } } |
Java's interpreter is, as expected, vastly superior to Python's. This is the comparison most people are making when they say something like "Java is faster than python". Technically, this may be true (and in applications where speed is really a factor, and a lot of code is written in one of these languages, you WILL feel it) but the large base of native library code written in C for python will negate this for most small, and some large, applications. For example, if 90% of what you're doing is writing files to disk or over a network connection, you won't really care that Java does it in half the time, because you'll be waiting on the network. |
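
The "waiting on the network" argument is simple arithmetic, and a small sketch makes it concrete. The model below is an assumption made for illustration, not something measured: it supposes the I/O wait is identical in both languages and that only the remaining share of the run is slowed by the interpreter. The 5.3 figure is the interpreter-speed ratio from the results table; `relative_runtime` is a hypothetical helper name.

```python
def relative_runtime(io_fraction, interpreter_slowdown):
    """Runtime of the slower-interpreter program relative to the faster one (1.0).

    io_fraction: share of the faster program's runtime spent waiting on I/O,
                 assumed to be identical in both languages.
    interpreter_slowdown: how many times slower the interpreted part runs.
    """
    if not 0.0 <= io_fraction <= 1.0:
        raise ValueError("io_fraction must be between 0 and 1")
    compute_fraction = 1.0 - io_fraction
    return io_fraction + compute_fraction * interpreter_slowdown


# 5.3 is the interpreter-speed ratio from the results table.
for io_share in (0.0, 0.5, 0.9, 0.99):
    print(io_share, round(relative_runtime(io_share, 5.3), 2))
```

Under these assumptions a job that is 90% I/O wait runs about 1.43 times as long with the slower interpreter, rather than 5.3 times as long, which is the sense in which the interpreter gap stops mattering for I/O-bound work.
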
Java's documentation is generally better, and better organised (at least the API doc) but even though I've been using Java for several years, and python for only a month, I found it easy to remember all of the python necessary for this exercise without resorting to the module doc; I had to look up 3 things about IOStreams and URLs in the Java documentation.
Java programs are longer than their equivalent python programs. In fact, they're approximately 3 times longer, if you're counting bytes in source code (remembering, also, that python programs can run from sources and Java must be compiled first). Totals for the programs I wrote (source only) were: python, 921 bytes, Java, 2,742 bytes.
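
For anyone who wants to repeat the byte count on their own copy of the sources, a sketch along these lines would do it. The helper names `source_bytes` and `size_ratio` and the flat directory layout are assumptions for the example (the tarball's actual layout is not described here), and the code is present-day Python 3.

```python
import os


def source_bytes(directory, extension):
    """Sum the sizes in bytes of files in `directory` ending with `extension`."""
    total = 0
    for name in os.listdir(directory):
        if name.endswith(extension):
            total += os.path.getsize(os.path.join(directory, name))
    return total


def size_ratio(larger, smaller):
    """How many times bigger `larger` is than `smaller`."""
    return larger / smaller


# Totals quoted in the text: Java 2,742 bytes, python 921 bytes.
print(round(size_ratio(2742, 921), 2))  # just under 3
```

Hypothetical usage on an unpacked copy would be `size_ratio(source_bytes("tests", ".java"), source_bytes("tests", ".py"))`, where "tests" is a placeholder directory name.
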
Java is faster, and more suitable for "systems-level" programming. I believe that c -> c++ -> Java -> python establishes a nice continuum of systems-oriented, low-level programming to application-oriented, high-level programming.
I would love to hear from other people who have run the same test-cases and come up with different results.
From a language-design perspective, the next section won't be interesting; however, it's a very important look at the real-world consequences of choosing Java, unless you plan to write your own virtual machine (or use one of the free ones, like japhar or kaffe... which don't necessarily even implement the APIs that these bugs are in).
Want an arbitrary number of frames in your app? Sorry; you'd better keep track, and set a hard limit. I wouldn't count on re-using components either; add() seems to leak like a bitch too. Widget re-parenting appears to leak as well, but the leak is small enough to be livable.
Opening about 100 windows with a decent amount of widgets will crash a VM -- and I don't mean "at once". I mean at all. Open window, close window, open window, close window... repeat 100 times, and boom, your app is *permanently* out of memory, unless you've loaded all the swing classes in a special classloader and you're ballsy enough to go re-loading them all every time that happens.
The AWT does this too, but you could probably write an application that ran for longer than 20 minutes using it.
Even if you do decide to be brave and diddle the command-line options, the aforementioned problems with GC make it infeasible to allocate really large blocks of memory in Java without a ten-processor SPARCstation.
This problem is so bad that even third-party applications that purport to solve this problem, such as Zero-G software's "Install Anywhere" will sometimes fail, simply because it's impossible to determine certain information about the operating environment. (No slight to "Install Anywhere" -- it's a GREAT product, and works very well for their supported platforms!)
Note: Some people have brought it to my attention that the same thing could be said about python; however, I have not found it difficult at all to run my python scripts on Windows or MacOS; the packaging process for Java applications is highly platform-specific and usually requires an intimate knowledge of where you want it to run... well, unless you're not using ANY libraries at all, and your entire app is a single class-file. Java2's -jar option makes this slightly easier to do, but it's still painful (not to mention undocumented) to determine the version and capabilities of the installed VM.
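
For comparison, here is roughly what asking the python runtime about itself looks like. This is a sketch for present-day Python 3, not the 1.5.2 discussed in this paper, and the helper names `runtime_summary` and `require_version` are hypothetical; it says nothing about how the equivalent check would be done on a Java VM.

```python
import platform
import sys


def runtime_summary():
    """Collect the interpreter version and host platform in a plain dict."""
    return {
        "implementation": platform.python_implementation(),
        "version": tuple(sys.version_info[:3]),
        "platform": sys.platform,
        "system": platform.system(),
    }


def require_version(minimum):
    """Return True if the running interpreter is at least `minimum`, e.g. (3, 0)."""
    return tuple(sys.version_info[:len(minimum)]) >= tuple(minimum)


if __name__ == "__main__":
    print(runtime_summary())
    print(require_version((3, 0)))
```

A script can run a check like this at startup and fail with a clear message before it touches any platform-specific library.
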