MattHicks.com

Programming on the Edge

Java Delegates

Published by Matt Hicks on Tuesday, July 01, 2008
If you're searching the internet for the title of this post, you'll see a lot of people ranting about how Java doesn't have Delegate support. Though I have been one of those people, and continue to be frustrated that there is no native solution, that is not the purpose of this post.

I write this to say I've created what I consider to be a pretty clean alternative to native delegates in Java using reflection. I created this as a feature in jSeamless and have been using it heavily for a while, but after a few people asked about using my friendly Delegate support in non-jSeamless applications, I decided to remove it from jSeamless and make it a stand-alone API that anyone can use.

The Delegate implementation I find myself using more than anything else is MethodDelegate. It provides the ability to reference a specific method on a specific object that can be invoked. For example:
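A minimal sketch of the idea looks something like the following. Note that this is my illustration only: the constructor signature and class shape here are assumptions, not necessarily the actual jSeamless MethodDelegate API.

```java
import java.lang.reflect.Method;

// Hypothetical sketch of a reflection-based delegate -- the real
// jSeamless MethodDelegate API may differ from this.
public class MethodDelegate {
    private final Object target;   // the object to invoke against
    private final Method method;   // the method, resolved once up front

    public MethodDelegate(Object target, String methodName, Class<?>... parameterTypes) throws NoSuchMethodException {
        this.target = target;
        this.method = target.getClass().getMethod(methodName, parameterTypes);
    }

    // Invoke the referenced method on the referenced object
    public Object invoke(Object... args) throws Exception {
        return method.invoke(target, args);
    }
}
```

Once constructed, the delegate can be passed around and invoked without the caller knowing anything about the target object or method, e.g. `new MethodDelegate(myList, "size").invoke()`.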

Some Rules Can't Be Broken

Published by Matt Hicks on Wednesday, May 28, 2008
I'm a big lover of Reflection in Java. It's extremely powerful and lets you do things that technically shouldn't be allowed. For example, a private method cannot be invoked from outside its declaring class, by virtue of the "private" keyword. However, if you use reflection:

public class MyObject {
    private void doSomethingHidden() {
        System.out.println("I'm hidden and can't be invoked!");
    }
}


import java.lang.reflect.Method;

public class Test {
    public static void main(String[] args) throws Exception {
        MyObject o = new MyObject();

        Method method = MyObject.class.getDeclaredMethod("doSomethingHidden");
        method.setAccessible(true);
        method.invoke(o);
    }
}


You can invoke the method by setting accessible to true.

I was attempting a similar breaking of the rules, trying to invoke the superclass implementation of an overridden method:

public class ObjectA {
    public void doSomething() {
        System.out.println("I'm ObjectA!");
    }
}


public class ObjectB extends ObjectA {
    @Override
    public void doSomething() {
        System.out.println("I'm ObjectB! Yay!");
    }
}


import java.lang.reflect.Method;

public class Test {
    public static void main(String[] args) throws Exception {
        ObjectB b = new ObjectB();
        b.doSomething();

        Method method = ObjectA.class.getDeclaredMethod("doSomething");
        method.invoke(b);
    }
}


Unfortunately this always invokes the overriding implementation (ObjectB's) instead. It's a shame the only way to reach the superclass version is via super.doSomething() from within the overriding Class. Oh well, I guess it's best that you can't do it, as it would be a blatant violation of the intent of method overriding.

Fastest Deep Cloning

Published by Matt Hicks on Thursday, May 22, 2008
I recently needed to do deep cloning of my Java objects and began with the old-school style I've used for years: use ObjectOutputStream and ObjectInputStream to do a "poor man's" deep clone. However, the performance of this on large Objects is absolutely awful, not to mention the amount of memory and CPU it takes to accomplish. Even if I could get past all of that, the fact that every single Object in the chain MUST implement Serializable just pushed me over the edge...

I thought to myself, "Surely someone else realizes how dumb this is and has a better solution than having to implement Cloneable on everything", but alas, searching online didn't return much other than explanations of how to use Serialization to do deep cloning and a little tutorial on increasing the performance of Serialization cloning:

http://javatechniques.com/blog/faster-deep-copies-of-java-objects/

I found that interesting, but it really didn't provide that great a performance gain, and it still didn't overcome the other issues, so I set out to make a rounder wheel. Perhaps it's the years of Reflection I've got hammered into me, or just the fact that I think Reflection is awesome, but I turned to my old friend for a faster solution. It really is ironic that, with so many people refusing to use Reflection for performance reasons, I would turn to Reflection to increase performance, but let's see how this goes. :)

So, I took the benchmark test that is referenced in the article above and decided to apply it to my own test using Reflection and see the performance difference. The difference is staggering. After modifying the benchmark to do better timing (System.nanoTime instead of System.currentTimeMillis and a couple other tweaks) I bumped up the iterations to 100 and got the following results:

Unoptimized time (standard Serialization): 13,518 milliseconds

Optimized time (the tutorial Serialization): 12,941 milliseconds

Reflection time (my happy little test): 139 milliseconds

Yes, I did expect the performance to increase, but about 100 times faster is a startling increase in performance. So I bumped up the test another digit to see how well it scaled:

Unoptimized time (standard Serialization): 125,754 milliseconds

Optimized time (the tutorial Serialization): 121,434 milliseconds

Reflection time (my happy little test): 1,359 milliseconds

Okay, this seems to scale at least as well as the alternative and doesn't require the objects to be Serializable. The only problem is that there is substantially more code involved to make this work well, since there are peculiarities with certain objects (arrays, Maps, etc.) that have to be explicitly handled. To deal with this I created an interface, Cloner, with a single method:

public Object deepClone(Object original, Map<Object, Object> cache) throws Exception;

This is relatively self-explanatory, but the "cache" map is the only really complex part. It is necessary in order to handle cyclical references (e.g. ObjectA contains ObjectB, which contains a reference back to ObjectA) that would otherwise cause an endless loop until either memory is exhausted or, more commonly, a StackOverflowError is thrown. Upon instantiation of an Object I immediately assign the original Object as the key in the cache and the cloned Object as the value. At the beginning of cloning each Object I check whether a cache reference already exists and, if so, re-use it.

The Cloner instances are pre-defined in the CloneUtilities Class in a static Map&lt;Class&lt;?&gt;, Cloner&gt;. This not only abstracts the implementation details of cloning per specific case, but also allows extensibility for anyone who needs to handle special Objects in their own cloning scenarios.
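To make the approach concrete, here is a heavily simplified sketch of reflection-based deep cloning with a cycle cache. This is illustrative only and far less complete than the actual jCommon implementation: it omits the Cloner registry dispatch, handles only arrays, common immutables, and plain fields, and assumes cloned classes have a no-arg constructor.

```java
import java.lang.reflect.Array;
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Map;

// Simplified sketch only -- not the actual jCommon CloneUtilities code.
public class ReflectiveCloner {
    public static Object deepClone(Object original, Map<Object, Object> cache) throws Exception {
        if (original == null) return null;
        // Re-use an existing clone if we've already seen this object (cycle handling)
        Object cached = cache.get(original);
        if (cached != null) return cached;

        Class<?> clazz = original.getClass();
        if (clazz.isArray()) {
            int length = Array.getLength(original);
            Object copy = Array.newInstance(clazz.getComponentType(), length);
            cache.put(original, copy);   // register before descending so cycles terminate
            for (int i = 0; i < length; i++) {
                Array.set(copy, i, deepClone(Array.get(original, i), cache));
            }
            return copy;
        }
        // Treat common immutables as safe to share rather than clone
        if (original instanceof String || original instanceof Number || original instanceof Boolean) {
            return original;
        }
        // Assumes a no-arg constructor exists; the real implementation must do better
        Constructor<?> constructor = clazz.getDeclaredConstructor();
        constructor.setAccessible(true);
        Object copy = constructor.newInstance();
        cache.put(original, copy);
        // Field-by-field deep copy, walking up the class hierarchy
        for (Class<?> c = clazz; c != null; c = c.getSuperclass()) {
            for (Field f : c.getDeclaredFields()) {
                if (Modifier.isStatic(f.getModifiers())) continue;
                f.setAccessible(true);
                f.set(copy, deepClone(f.get(original), cache));
            }
        }
        return copy;
    }
}
```

Registering the clone in the cache *before* recursing into its contents is what makes cyclical references terminate: when the cycle comes back around, the half-built clone is found in the cache and re-used.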

If you're interested in seeing the final results of this, I have added it to my jCommon project so anyone can use it:

http://commonj.googlecode.com/svn/trunk/src/org/jcommon/clone/

I've included several cloning options in the CloneUtilities Class to allow testing between different options. However, it would seem that reflection is significantly faster than any other known alternative (apart from explicitly writing Cloneable support in your Objects and providing some extra functionality for deep cloning). There is still a lot to add to this in the long run, but I believe this is a good start.

Please drop me some feedback if you have any standard Cloner implementations to add.

EDIT: This post has received a lot of traffic and the jcommon project is no longer being hosted at the referenced URL; it can now be found here: jcommon on googlecode. You can also find a newer, yet less complete, version of this in the xjava project: clone package

Myth Assistant

Published by Matt Hicks on Monday, March 17, 2008
Though I actually wrote this a couple of years ago, I finally decided to contribute this utility back to the community. This tool provides a graphical remote connection to a MythTV database and can copy and delete recordings over a Samba share. It is written in Java and lets you search recordings, see details of recordings, copy single recordings or batches, and provides other features for remote access to recorded programs. It should work on Windows, Mac, and Linux. Hopefully other people can get as much use out of this tool as I have. Feel free to post a comment if you found it useful.

Screenshot of MythAssistant

MythAssistant.jar

MythAssistant.exe

MythAssistant-source.jar

Smarter Beans?

Published by Matt Hicks on Wednesday, January 30, 2008
In Java, Beans are an important part of good programming practices and good Object-Oriented coding. There has been a lot of discussion recently about monitoring changes on beans and other such functionality that is not inherently built into Java. With my Magic Beans project I've already done a lot of things along these lines, but recently I've been thinking more and more about how powerful beans could be if they extended outside the boundaries of the norm. What if you could deal with Beans like you do with Connections in SQL? What if you could create a transaction that would give you a new copy of a Bean and, when you are done with it, "commit" that bean back to the original? What if you had transactional monitoring of beans that goes beyond the normal Observer/Observable concepts?
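To illustrate the transaction idea, here is a minimal sketch of what such an API could look like. None of this is the actual Magic Beans API; the class, the shallow field copy, and the example Person bean are all my own illustration.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;

// Hypothetical sketch of a "bean transaction": hand out a working copy,
// and write the fields back to the original only on commit().
public class BeanTransaction<T> {
    private final T original;
    private final T workingCopy;

    @SuppressWarnings("unchecked")
    public BeanTransaction(T original) throws Exception {
        this.original = original;
        // Assumes a no-arg constructor; a real implementation would need more care
        Constructor<?> constructor = original.getClass().getDeclaredConstructor();
        constructor.setAccessible(true);
        this.workingCopy = (T) constructor.newInstance();
        copyFields(original, workingCopy);
    }

    public T bean() { return workingCopy; }   // mutate this copy freely

    public void commit() throws Exception {   // push changes back to the original
        copyFields(workingCopy, original);
    }

    // Shallow copy of declared fields only (ignores superclass fields for brevity)
    private static void copyFields(Object from, Object to) throws Exception {
        for (Field f : from.getClass().getDeclaredFields()) {
            f.setAccessible(true);
            f.set(to, f.get(from));
        }
    }
}

// Example bean used for illustration
class Person {
    private String name;
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
}
```

The original bean stays untouched while the working copy is mutated, which is exactly the Connection/transaction feel described above: changes become visible only when commit() is called.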

I keep coming back to beans as one of my biggest stumbling blocks for writing good and efficient code. Inevitably I'm going to have to revisit Magic Beans and finally create the end-all-be-all for Bean handling.

Multithreaded Unzip?

Published by Matt Hicks on Sunday, January 13, 2008
A co-worker and I were discussing the possible performance gain from extracting ZIP files with multiple threads rather than the typical single-threaded extraction that the majority (if not all) of mainstream archive utilities use. Given Java's great built-in ZIP file support it seemed rather trivial to give this a shot, and thus UberZip was born.

UberZip is my simple little sample Java command-line application to extract files, where you can specify the number of threads to utilize during extraction. My biggest headache is extracting Eclipse from the ZIP file you download, so I decided that would be an ideal test of my little program. I downloaded the Eclipse J2EE bundle (3.1.1), which is a happy 132 meg with thousands of files.
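The core idea can be sketched in a few lines: one ZipFile shared across a fixed pool of worker threads, with one extraction task submitted per entry. This is not the actual UberZip source (see the repository link below for that), just an illustration of the approach; error handling is deliberately minimal.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.util.Enumeration;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Sketch of multi-threaded ZIP extraction -- not the real UberZip code.
public class UberZipSketch {
    public static void extract(final File zip, final File destDir, int threads) throws Exception {
        final ZipFile zipFile = new ZipFile(zip);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            Enumeration<? extends ZipEntry> entries = zipFile.entries();
            while (entries.hasMoreElements()) {
                final ZipEntry entry = entries.nextElement();
                pool.submit(new Runnable() {
                    public void run() {
                        try {
                            File out = new File(destDir, entry.getName());
                            if (entry.isDirectory()) {
                                out.mkdirs();
                                return;
                            }
                            out.getParentFile().mkdirs();
                            // Each getInputStream call returns an independent stream,
                            // so multiple threads can read entries concurrently
                            InputStream in = zipFile.getInputStream(entry);
                            FileOutputStream fos = new FileOutputStream(out);
                            byte[] buffer = new byte[8192];
                            int read;
                            while ((read = in.read(buffer)) != -1) {
                                fos.write(buffer, 0, read);
                            }
                            fos.close();
                            in.close();
                        } catch (Exception e) {
                            throw new RuntimeException(e);
                        }
                    }
                });
            }
        } finally {
            pool.shutdown();
            pool.awaitTermination(1, TimeUnit.HOURS);
            zipFile.close();
        }
    }
}
```

Since extraction is largely disk-bound, the thread count is a tuning knob rather than a free win, which is exactly what the benchmark numbers below show.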

I need to test further with a machine that actually can better utilize multiple threads, but here are the stats for my AMD Athlon 64 3200+ (Hyperthreaded, so I actually get a very slight benefit) running on Windows XP:

14.7 seconds with 1 thread

13.4 seconds with 5 threads

11.6 seconds with 30 threads

Anything above 30 threads seems to actually create more overhead than gain.

Now, to be fair, I matched this up against the fastest unzipper I am aware of, 7zip (http://www.7-zip.org). It took right around 14 seconds to unzip the file inside 7zip, which falls nearly perfectly in line with the single-threaded execution of UberZip.

For those of you that would be interested in taking a look at this very simple example, I have committed the source to my public repository:

svn://captiveimagination.com/public/uberzip/trunk

Further, if you'd simply rather download the JAR or EXE I have uploaded copies of it for those of you that would like to unzip files "uber" fast. :)

uberzip.exe

uberzip.jar

After taking this to try out on my work machine (Dual Xeon dual-core 3.2 GHz processors + 4 gig of memory = 8 processors in Windows) I got the following stats:

7zip - 25 seconds
UberZip (1 thread) - 22.17 seconds
UberZip (5 threads) - 19.5 seconds
UberZip (30 threads) - 21.6 seconds

Oddly, it would seem about 5 threads is the "sweet spot" for this machine. Some of the results are a bit strange, and it would seem the hard drives on this machine don't perform quite as well as my home machine's, but it is obvious that with the right thread configuration multithreading can gain good performance over the single-threaded applications out there.

Perhaps someone else will find this information useful and turn UberZip into a product people can use. ;)

Update (2017.05.31)

I've completely re-written this functionality in Scala and posted it on GitHub: https://github.com/outr/uberzip.  It's faster than ever and more cleanly written.  Tested on the latest Eclipse ZIP (320 meg) it can unzip 2,981 files in 0.73 seconds. Doing the same test on the same machine with Linux unzip took 1.7 seconds.

Flash/Flex URLRequest Upload Security Hack

Published by Matt Hicks on Friday, December 28, 2007
It's been a while since I posted and after dealing with this horrid bug in Flash I figured it would be a great topic to get me back on track.

So, apparently there's this nifty little bug in Flash/ActionScript with the URLRequest object: when you attempt to do a file upload, it uses a completely different browser session than the web browser or any other requests made from within Flash (image loading, sound loading, resource loading, etc.). This isn't a problem unless you care anything about security...and unfortunately most of us do. The project I'm currently working on uses JCIFS (http://jcifs.samba.org/) for SSO (which is actually very nice, by the way), with a servlet filter configured in the web.xml to authenticate the user before they ever hit the page, so by the time they reach the Servlet they've been properly authenticated and I don't have to worry about rogue users.

Okay, so that is all well and good until I try to do a file upload. I try to establish a URLRequest to my JSLServlet (of course this is jSeamless) that I use for everything else, but this time I get a nifty little login prompt using basic authentication. Now, this is of course bad, but what makes it even worse is that Flash simply prompts you, then disregards the prompt and streams the upload into the outer darkness (where there is weeping and gnashing of teeth). From the Flash perspective the upload worked and completed fully (note: before I even enter anything into the login prompt), but the server never receives the upload at all. If I enter my credentials the prompt accepts them and disappears, but really doesn't do anything with them.

Okay, now for the hack. Now, for you programmers that are faint of heart, don't try this at home. For the rest of us that have to do the job no matter how dirty it might get, please forgive me for what follows...if there were another way I would have done it.

Alright, so I create a new Servlet filter called UploadOverrideFilter (yes, you see where this is going) that gets put in the filter order before the NtlmHttpFilter (the JCIFS filter for SSO) and detects whether the request is for an upload (endsWith("jslupload") in my case). What took me a bit of work to figure out is that every mechanism for letting the filter hand off directly to the Servlet is either barred or deprecated (ServletContext has methods for getting a Servlet, but they always return null since deprecation), so I pull out my ugly stick and turn JSLServlet into an evil beast by maintaining a static reference, "instance", that holds the initialized JSLServlet instance for the current context. Then I simply reference that static JSLServlet instance and explicitly call doPost(...) instead of passing off to the filter chain. See the source code for yourself:

import java.io.IOException;

import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class UploadOverrideFilter implements Filter {
    public void init(FilterConfig config) {
    }

    public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain) throws IOException, ServletException {
        if (((HttpServletRequest)request).getRequestURI().endsWith("jslupload")) {
            // Bypass the authentication filter chain and hand the upload
            // straight to the statically-referenced servlet instance
            JSLServlet.instance.doPost((HttpServletRequest)request, (HttpServletResponse)response);
        } else {
            chain.doFilter(request, response);
        }
    }

    public void destroy() {
    }
}

Yep, it's having to write code like this that will be the death of me...a very good reason why I prefer to stick with 100% Java whenever possible is so I don't have to write hacks like this.

Well, I guess it wasn't as bad as it could have been, but it's still a hack and it's things like this that keep me up at night in the cold sweats.