Archive for the 'Technical' Category

Total File Sizes by Extension

Tuesday, September 2nd, 2008

Every so often I have a brief love affair with awk. Today I got curious about the file sizes beneath a directory. In particular I wanted to see the totals by file extension. I did a quick search but came up with nothing. I decided that even if there is something out there to do the job, it'd be a lot more fun to do it myself. Tada:

ls -Rl | \
grep ^- | \
awk \
'{ split($9,e,"."); \
exts[e[length(e)==1?2:length(e)]]+=$5 } \
END \
{ for (ext in exts) printf "%10d %s\n", exts[ext], ext }' | \
sort

Yeah, there's an ugly hack in there to deal with file names that either have no extension or multiple dots in their name.

As an added bonus, looking at the previous post on awk got me all pissed off about "smart quotes" in WordPress blogs and the problems they cause when copying and pasting code examples. So, out they go.

Share

Interminably Long Timeouts for META-INF Under IIS

Wednesday, August 27th, 2008

How's that for a catchy title? To recap recent events: I'm working on speeding up a Java applet, sloppy code in applet libraries try to load resources from the server which then 404, you can avoid this by setting the codebase_lookup property to false in the applet tag, and finally eliminating 23 megs of invisible data can help speed up downloading. Now that we're all caught up, let's turn to today's adventure: "deployment nightmares" OR "why the hell doesn't my test environment match production?"

Applet Won't Load

I finally got the applet to the "good enough for government work" level of load-time performance. Understand that I don't even work on any of the code in the applet, I'm just trying to optimize what's there and how it's delivered from the server. Today was the day we decided to quietly deploy to production.

The first sign of a problem was when the Apple guys came into the office. The applet wouldn't load for them in either Safari, Firefox 2, or Firefox 3. However, it worked fine on every server they tried except the production server. While trying to figure out what was going on it turned out that the issue had to do with all machines using JRE 1.5 regardless of OS or browser. They all worked against every server except production.

Differences Between Production and Test Environments

In production we have some sort of load balancer, Tomcat is behind IIS, and it's an external network. Nothing in our test environment has a load balancer in front of it, only one machine has IIS but works fine, and my external EC2 deployment is obviously off our network. I'm not sure why we don't mirror as much of this as possible in our test environment, but we don't.

Now back to the bug. Turning off the load balancer had no effect. Eventually, someone let their browser sit long enough to see that the applet did in fact load. It just took around 10 minutes. I finally noticed that the Java console would hang on different non-existent resources it tried to load from the server. I used cURL to retrieve the URL and had to wait 2 minutes until it returned an empty reply. Most non-existent resources timed out immediately. Only URLs that contained META-INF or WEB-INF would hang.

Various 3rd partly libraries were trying to load odd things from the server as I mentioned previously. A few of these load attempts point at the META-INF directory. This only happens under 1.5 because I used the codebase_lookup parameter in the tag. Tomcat, Apache in front of Tomcat, and our internal IIS server all return immediately. The first two serve a custom 404 page while the IIS server sends an immediate empty reply.

WEB-INF and META-INF Protection

Both WEB-INF and META-INF are directories that you probably shouldn't be exposing. In fact, in most versions of the Tomcat Connector the connector will automatically 403 or 404 when any resource from those directories is requested. In our case, we were running an older version of the connector that just happened to have a bug that caused requests to either directory to take 2 minutes to timeout. A quick upgrade and an IIS service bounce fixed everything.

So the debugging lessons for the day are: use something like ngrep to watch your traffic, your test environment should mirror your production environment, applets under 1.5 sucks, and check your version numbers on third party libraries (and consider upgrading).

Share

Optimizing PNG File Sizes

Thursday, August 21st, 2008

For the past few weeks I've been working on improving the load time of an applet at work. Someone here noticed a while back that we seemed to have a ridiculous amount of image files in the applet. There are roughly 23 megabytes of images in the final jar which is around 14 megabytes in total size.

I opened a couple of the files in GIMP and resaved them to find that the final file was much smaller than the original. It turns out that the images were created using Fireworks and by default it puts some extra information in the PNG file, things like layers or palette information I believe. A few minutes of searching around on the internet and I found a wonderful tool named PNGOUT that could be used to losslessly optimize the size of PNG files. I used Cygwin (I'm on Windows at work) to run all of the PNG files through the utility via this command: find . -type f -name '*.png' -exec pngout.exe {} \; and waited a few minutes. The end result was as follows:

Before PNGOUT
Total size of images: 24,147,770 bytes
Total size of app jar: 14,086,540 bytes

After PNGOUT
Total size of images: 1,026,186 bytes
Total size of app jar: 4,335,560 bytes

Yikes. Not bad for a few minutes worth of work.

Share

More on applets and codebase_lookup

Friday, August 8th, 2008

As I mentioned in the last post I'm farting around with applets for work. You may remember that the applet was hammering the server whenever it couldn't find a resource in the jars. Of course, everything the applet needs is already in the jars so if it's not in them then it's not on the server either. It's all because the AppletClassLoader tries to load stuff from the codebase on the server if it fails to load it from the jars.

As part of the fix of setting codebase_lookup to false I did some very quick, unofficial benchmarking. The test environment was an EC2 deployment I've been messing around with–the smallest instance. I timed the startup time of the applet and counted the number of server hits during startup. To further minimize variablility, I did this after the jars had already been cached. This roughly corresponds to the startup time of a visitor to the site that has already successfully launched the applet. The results were as follows:

codebase_lookup – true (default setting)
Startup time: 35 seconds
Server hits: 442

codebase_lookup – false
Startup time: 7 seconds
Server hits: 34

When the the applet does hit the server to look stuff up in the code base, it doesn't just do it at startup. Some libraries that don't cache failed attempts to load a resource keep on hitting the server. In our app it's XFire and a lot of non-existent "aegis.xml" and "doc.xml" files as well as a whole bunch of "BeanInfo.class" attempts thanks to java.beans.Introspector. Each of these attempts takes a tiny bit of time and puts an annoying 404 in the access logs. It's hard to say what the distributed sluggishness of the app will be because of all these little attempts, but it's definitely non zero. It also has an affect on server side scalability when each client is potentially 13 times more chatty with the server than it needs to be.

An additional concern I have is that some of this cavalier attitude toward resource loading in these libraries also happens on the server side. How many failed getResourceAsStream attempts am I not seeing and what impact are they having on overall server performance? At the current traffic levels it's probably insignificant but the idea of that much inefficiency spread out throughout the app kind of bothers me.

Share

XFire, Applets, and 404s

Wednesday, August 6th, 2008

XFire Mapping Files

The application I work on uses XFire as a Java SOAP framework (this project has been replaced by CXF which is why this isn't a bug report with a patch) that is launched via Web Start. At some point we decided to allow it to be run as an applet as well. One of the weird things I ran into when we did this was a ton of 404 errors in the access logs on the server. The applet was repeatedly trying to load these "*.aegis.xml" files. These are mapping files used to configure how XFire maps your types to XML. However, you don't necessarily need these files if you elect to configure the mappings another way. XFire apparently still looks for the mapping file to allow you a mechanism for overriding other mapping methods. Everybody still here? Good.

Repeatedly Loading Non-Existent Resources

Since we don't use the mapping files, they're not in any of the jars that would be loaded by the applet. So what happens is XFire calls XMLClassMetaInfoManager.getDocument whenever it needs to figure out how to map one of your objects to XML. The manager then executes a getResourceAsStream to see if you have a mapping file (either as a primary or backup configuration). This winds up in the classloader which in the case of an applet is the AppletClassLoader. The class loader first tries to find the requested resource locally and if it's not there it tries to load it from the codebase. In our case it's not on the server so you get a 404 in the logs. The fact that the resource isn't available anywhere isn't a problem but the fact that this information isn't cached anywhere means that the next time something looks for the mapping file the whole thing is going to happen all over again.

The code in XMLClassMetaInfoManager.getDocument could easily add and check for a key with a null value in the event that it had previously tried to find the mapping and couldn't. This is what it does instead:

	public Document getDocument(Class clazz) {
		if (clazz == null)
			return null;
		Document doc = (Document) documents.get(clazz.getName());
		if (doc != null) {
			return doc;
		}
		String path = '/' + clazz.getName().replace('.', '/') + ".aegis.xml";
		InputStream is = clazz.getResourceAsStream(path);
		if (is == null) {
			log.debug("Mapping file : " + path + " not found.");
			return null;
		}
		log.debug("Found mapping file : " + path);
		try {
			doc = new StaxBuilder().build(is);
			documents.put(clazz.getName(), doc);
			return doc;
		} catch (XMLStreamException e) {
			log.error("Error loading file " + path, e);
		}
		return null;
	}

(Incidentally, I'm really liking the syntaxhighlighter WordPress plugin.) The AppletClassLoader is off the hook because you can easily say that its behavior is a feature, not a bug.

The Quick Fix

Like I said, XFire is in maintenance mode. The prospect of getting the source code, creating a vendor branch in the SCM, fixing / testing, and finally internally hosting the patched artifact didn't appeal to me on the limited time scale on which I was working. The quick fix (that apparently only works in JRE 1.6) is to set the codebase_lookup parameter to false in the applet tag.

Sadly, the 1.6 restriction means I probably won't get to live with this fix for long and will likely wind up either patching XFire or upgrading to CXF. CXF may or may not still have the problem, but at least I'd be writing a patch for an actively developed library.

To further complicate things, there are other incidents of 404s with resource bundles and XML parsers. The nice thing about the codebase_lookup solution is that it "fixes" those issues as well. If I can't rely on JRE 1.6 as a requirement I'll have a lot more work to do than just patching XFire.

Applets. Yay.

Update

It's probably actually XMLTypeCreator that is causing my problem, but the code is pretty much identical. There are also issues with XMLDocumentationBuilder.loadDocument. I don't blame XFire in particular. There are a lot of projects out there with code that uses getResourceAsStream knowing that it could fail and then not caching the fact that it failed. They then repeat the operation multiple times. This assumes that getResourceAsStream and other such operations are very cheap performance-wise or that it's not worth the effort or memory to cache failed attempts. I'm not sure I buy it in the first place but it certainly goes out the window when you're in an applet and every such call that fails locally then gets executed across the network back to your codebase.

Share

OpenX and My Own Stupidity

Thursday, July 31st, 2008

Within the product I work on we use OpenAds, which has been renamed to OpenX. Apparently OpenAds may have been somewhat of a pain to get installed since everyone cringed when I suggested we should make a test server for our ads. Since I'm always a fan of the latest and greatest, I decided to install OpenX 2.6.0 in a VM and see what the differences were. The great news is that it was a breeze to install and whatever was changed doesn't affect our product at all.

The really odd thing was that once I got it installed I couldn't see any images from its admin web site in Firefox (but I could in Internet Explorer). I searched around and found out from the internet just how stupid I am. I was trying to see images on an internally deployed online advertising server while I had AdBlock installed. The sad thing is this isn't the first time something like this has happened to me. Perhaps the third time will be the charm.

Share

Java, Private Members, and Dynamic Proxies

Friday, July 25th, 2008

I found this problem and solution and then backtracked to some Google search results to make sure I wasn't being completely stupid. The issue was that the equals method on one of my objects was returning false when I knew the two objects were equal. A quick trip to the debugger showed me that the method was using some of the private member variables of the provided object to determine equality. Something along the lines of if(this.name.equals(other.name)) (no that's not the whole method, yes I know how to write an equals method, etc). The problem is that if the object provided to the method is proxied by something like a CGLIBLazyInitializer, those private fields may not have values. So, you're best off using the getter method. Also, as another blog and its comments point out, you need to be careful when calling certain methods on the provided object. One example is that you can't rely on using the getClass() method and should use instanceof instead.

As the other post also points out, this is one of the possible odd side effects from using Hibernate, even though it's really an issue with the CGLIB dynamic proxy class(es).

Share

Cookies and Funky Characters

Thursday, July 24th, 2008

Today's weird problem had to do with a browser cookie not keeping the value I gave it. The cookie's value is an encrypted string. I noticed in the debugger that what was written out was not what was read in later. The decryption failed because the encrypted value retrieved from the cookie just didn't make sense. Apparently cookies are sent as HTTP headers and can only use US-ASCII characters minus whatever other special characters don't work (you can wade through several RFCs if you like). In my case it was trailing "="s that were being eaten.

The easiest solution (besides just not using those characters) is to encode the values. Something like base64 or using the URL encode/decode utility classes in Java would easily do the trick.

The more interesting thing is that we use other cookies to store all kinds of strings, sometimes internationalized funky character containing strings. Those Unicode characters also don't get handled very well in cookie values so there is a general need to encode/decode them. Maybe it's just safest to always encode on the way out and decode on the way in.

Chalk up another one for things I probably should have known but didn't. Yay!

Share

Migrating SQL Server 2005 to MySQL

Tuesday, July 22nd, 2008

Here's a quick note on migrating data from MS SQL Server to MySQL. I found out in a white paper entitled A Practical Guide to Migrating From Microsoft SQL Server to MySQL (free registration required) that the MySQL install (on Windows at least) includes a tool called the MySQL Migration Toolkit.

It has a simple wizard that steps you through all of the options and you're ready to migrate. You can even choose to migrate "live" or save the scripts to files. One minor annoyance is that you can't pick the name of the target database/schema. When migrating from a SQL Server database named "blahblah" with default schema of "dbo", the migration tool will create "blahblah_dbo" in MySQL. I thought I would just rename it only to find that support for RENAME DATABASE has been dropped.

Instead, I decided to just do a backup and restore said backup to a new schema. I then ran into a problem with one of our badly designed tables that has large (not huge) amounts of BLOB data. The restore died with an error saying "MySQL Server has gone away." I finally tracked down the issue which can be fixed with a small configuration change. Since I was testing this in MySQL for Windows, I just added max_allowed_packet=12M to the my.ini under the server directory and bounced the service. Everything worked nicely after that. Your size on that configuration option may vary, of course.

Share

Virtualization as Adoption Criteria

Wednesday, June 4th, 2008

Yesterday at work I got a question from a co-worker about doing something under AIX. I don't really work with AIX, but I managed to answer the question. This is for another project that I don't work on. It got me to wondering if I could run AIX in a VM on PC hardware, probably using a PPC emulator. I'll save you the suspense and tell you that I couldn't find a way to do it. There are a couple of projects that are sort of trying to do it but they haven't done it successfully that I could find. Those would be PearPC and QEMU. I'm sure one of these projects will get it working someday, as soon as some really capable programmer wants it badly enough. I briefly thought about suggesting purchasing an RS6000 from eBay but decided it wasn't a project I needed to poke my nose into.

This is not a rant against AIX, although in a world gone mad with a billion distros of Linux, OpenSolaris, and PC hardware that is criminally cheap and available I don't feel the need to brush up on that incredibly rusty skill set. No, this is more about that fact that I don't think I would ever choose to work with any operating system and/or software product that I can't run in a VM on PC hardware–it's just too damn handy these days, especially when it comes to QA. That of course includes any non-hacked up version of MacOS, not that they'll miss my business. Also, sadly, I don't always get to pick what I want to work with. Still, it's definitely something to consider when picking your stack and deployment/hosting environment.

Share