Archive for the ‘General’ Category

Attempting AJAX

Monday, October 23rd, 2006

So, AJAX is one of the more recent causes of excitement in the web-based application delivery world. The first major site to feature AJAX (as far as I know) was Google’s Gmail. Assuming that you are okay with writing tons of Javascript, it’s quite a nice way to spice up your application enough to make your users feel a little less like they’re using a web-based app.

I have to admit that I generally like to use the minimum of Javascript on my sites that can possibly work, because it usually doesn’t. Javascript is notorious for failing on various browsers and platforms with no specific rhyme or reason. Most often, it is because the application developer did not take the time to test and debug the Javascript on various browsers, or failed to use standards-compliant code and chose a single target browser (this more often happens when an application is targeted towards Microsoft Internet Explorer).

My rule of thumb is that Javascript should not be used unless there is some non-Javascript backup for the same functionality. Basically, Javascript may only be used to enhance the user interface; it cannot be used to drive the user interface.

We have a page that allows users to tick items off of a list. The list is paginated, so it can be quite long. They can either tick them or un-tick them, and the page reflects the current status of each item. There is one big problem with this type of interface: the user expects to interact with a standard widget (the checkbox), but the page itself does not respond to checking or unchecking the checkbox: you have to submit a form. This basically won’t work because users are not going to tolerate having to check these items and then click a button, especially when moving from one page to another. They might also tick an item and then leave the page entirely. There is no way to stop them, and the only way to capture the event of ticking the checkbox is to use Javascript.

I am unhappy with the Javascript-only solution because it breaks down when the user’s browser does not have Javascript available. The user might not have Javascript available because of the browser (think lynx), or because they have turned off Javascript for security reasons, or because their Javascript implementation is buggy and isn’t going to work for some reason. Another approach is required.

My solution in the past was to make the checkbox into an image that looks like a checkbox, but is actually a link. That link points to the URL that selects (or de-selects) the item in question, which then re-displays the page with the proper status update. This works very well, except for the fact that the page view often re-sets itself back to the top of the page. That is inconvenient when you want to select multiple items from the same page, and you have to scroll down the page to see them.

This is a perfect example of when I have a non-Javascript solution working that could be significantly improved through the use of Javascript.

Enter AJAX.

If all the planets are aligned (i.e. the user has Javascript available and enabled, and it supports AJAX, and nothing else goes wrong), we can use some AJAX magic to improve the user experience.
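As a rough sketch of the "enhance, don't drive" idea (not the actual code from this project): the checkbox image stays an ordinary link, and Javascript only takes over the click when an AJAX path is available. The names `enhanceToggleLink`, `sendToggle` and `navigate` are hypothetical, as is the use of a "checked"/"unchecked" class name to track the item's state.

```javascript
// Hypothetical helper: upgrade a plain toggle link to an in-page request.
// `sendToggle(url, onSuccess, onFailure)` and `navigate(url)` are assumed to
// be supplied by the caller; neither is part of any library.
function enhanceToggleLink(link, sendToggle, navigate) {
  if (typeof sendToggle !== "function") {
    return false; // No AJAX available: the link keeps working as a normal link.
  }
  link.onclick = function () {
    sendToggle(
      link.href,
      function () {
        // Flip the marker the page uses to draw the ticked/un-ticked image.
        link.className = link.className === "checked" ? "unchecked" : "checked";
      },
      function () {
        // The request failed, so fall back to the ordinary page load.
        navigate(link.href);
      }
    );
    return false; // Cancel the default navigation; the request handles it.
  };
  return true;
}
```

If the script never runs at all (lynx, Javascript turned off, a buggy implementation), nothing is attached and the link behaves exactly as it did before.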

AJAX is little more than a single (at least, from an AJAX developer’s point of view) very useful Javascript object called XMLHttpRequest. Its job is to make an asynchronous HTTP request to some URL and provide the response either as text or as an XML document. If you elect to use the response as an XML document, then you can use the standard DOM methods to traverse everything.
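A minimal sketch of such a request might look something like the following. The function names `createRequest` and `fetchXml` are made up for illustration, and the `factory` parameter is a hypothetical hook for supplying the request object yourself. The `ActiveXObject("Microsoft.XMLHTTP")` branch is the usual way to get the same object on MSIE 6 and earlier.

```javascript
// Create the request object. The ActiveXObject branch covers MSIE 6 and
// earlier, which do not have a native XMLHttpRequest.
function createRequest() {
  if (typeof XMLHttpRequest !== "undefined") {
    return new XMLHttpRequest();
  }
  if (typeof ActiveXObject !== "undefined") {
    return new ActiveXObject("Microsoft.XMLHTTP");
  }
  return null; // No AJAX support; use the non-Javascript path instead.
}

// Fetch `url` asynchronously and hand the parsed XML document to `onXml`.
// `factory` is a hypothetical parameter that lets the caller supply the
// request object; by default createRequest() is used.
function fetchXml(url, onXml, onError, factory) {
  var request = (factory || createRequest)();
  if (!request) {
    onError(0);
    return false;
  }
  request.onreadystatechange = function () {
    if (request.readyState !== 4) {
      return; // 4 means the response has been fully received.
    }
    if (request.status === 200 && request.responseXML) {
      onXml(request.responseXML);
    } else {
      onError(request.status);
    }
  };
  request.open("GET", url, true); // true = asynchronous
  request.send(null);
  return true;
}
```

The document passed to `onXml` is what you would then walk with the DOM methods; `responseText` is the property to read instead if you want the response as plain text.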

I started out by consulting Mozilla’s AJAX:Getting Started page. It gives a fairly straightforward example of, well, getting started with AJAX. Using the information presented on that page, I was able to get something up and working relatively quickly. They had even listed the changes I would have to make in order to use the code under Microsoft Internet Explorer, so I figured I was covered when I went to test on MSIE.

Unfortunately, there’s a relatively large caveat when using MSIE up through version 6 when working with XML: the document.getElementsByTagName function is not namespace-aware. That means that those of us who came out from under the non-namespace-using rock several years ago have to deal with some pretty stupid code in order to work around it.

At this point, AJAX pros are saying to themselves “why doesn’t this guy just use one of the dozen or so cross-platform AJAX libraries that are out there — then he won’t have this problem”. Well, I’ll tell you: because I wanted to solve the problem myself in order to understand what was going wrong. It did take quite a while, and I ended up using information presented on Apple’s Dynamic HTML and XML: The XMLHttpRequest object. This was the only place where I saw any mention of MSIE’s failure to support namespaces.

Working around non-namespace-aware Javascript is pretty ugly. Under normal circumstances, one would simply call document.getElementsByTagName and pass the “local name” of the element to that function. You’d get an array of nodes back and everything would be fine. But, since MSIE sees “foo:bar” as the local name (instead of just “bar”), you’d have to change your code to look for “foo:bar”. But, that wouldn’t work in browsers that are namespace-aware, and it’s difficult, if not impossible, to tell at runtime which way a browser will behave.

So, I was forced to implement my own function that loops through the children of a particular node and looks for matching elements. :(
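A function along those lines could look roughly like this. It is a sketch rather than the code I actually ended up with: the names `localNameOf` and `getElementsByLocalName` are invented here, and it walks all descendants rather than only direct children. It relies only on `childNodes`, `nodeType` and `nodeName`, and it compares whatever comes after the colon in the node name, so it gives the same answer whether the browser reports “bar” or “foo:bar”.

```javascript
// Return the part of a node name after the namespace prefix, if any:
// "foo:bar" -> "bar", "bar" -> "bar".
function localNameOf(nodeName) {
  var colon = nodeName.indexOf(":");
  return colon === -1 ? nodeName : nodeName.substring(colon + 1);
}

// Collect every descendant element of `node` whose local name matches.
function getElementsByLocalName(node, localName, found) {
  found = found || [];
  var children = node.childNodes;
  for (var i = 0; i < children.length; i++) {
    var child = children[i];
    if (child.nodeType !== 1) {
      continue; // 1 is ELEMENT_NODE; skip text, comments, and so on.
    }
    if (localNameOf(child.nodeName) === localName) {
      found.push(child);
    }
    getElementsByLocalName(child, localName, found);
  }
  return found;
}
```

Note that this version ignores the namespace URI entirely, so elements with the same local name in two different namespaces would both match. For a response format you control, that is probably acceptable.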

It occurred to me just now that I could probably get away with making two calls: one that uses the preferred method, and then, if the call returns no nodes, another that calls the same method with the namespace attached. The only problem with this option is that you have to know the text of the namespace that is being used. Typically, you only have to deal with the namespace URI instead of the actual value being used (such as “foo” in the example above). In this case, I’ll have to hard-code the namespace into the Javascript, which is non-ideal. My existing solution has no such restrictions, so I’ll probably keep it for the time being.
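For comparison, the two-call idea might be sketched like this, assuming browsers behave the way described above. The function name `getElementsEitherWay` is hypothetical, and the `prefix` argument is the hard-coded namespace text (such as “foo”) that makes this approach less appealing.

```javascript
// Try the bare local name first; if nothing comes back, retry with the
// prefix attached. `prefix` has to be hard-coded by the caller, which is
// exactly the drawback of this approach.
function getElementsEitherWay(doc, prefix, localName) {
  var nodes = doc.getElementsByTagName(localName);
  if (nodes.length === 0) {
    nodes = doc.getElementsByTagName(prefix + ":" + localName);
  }
  return nodes;
}
```

One more wrinkle: a document that genuinely contains no matching elements costs two lookups instead of one, and still returns an empty list.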

A special thanks to the MSIE team for once again stepping outside of the standard (which, in all fairness, may or may not have existed at the time of implementation) and spicing up my day.

Now, time to test on MSIE 7. And Opera. And Safari….

An Hour in Montepulciano

Thursday, August 10th, 2006

I’ve arrived in town an hour too early for the Internet joint to open. I came alone, so I have some time to kill. I’ve come to the top of Montepulciano to look west over some of the most picturesque scenes in Tuscany. Wait a minute: I take that back. Pretty much everything in Tuscany is picturesque.

Still, many tourists and locals alike come to this spot to take a look and take a breather: the hike to this spot definitely gets one’s heart pumping. As with Florence, the town itself is something to do; there’s no need to have a specific activity. I don’t shop unless there is something I need (note to self: don’t forget to get brown leather shoes). I don’t eat unless I am hungry, although since every third shop sells pasta, cheese and salami, it’s hard to avoid persistent hunger even when the stomach is left out of the conversation.

No, the city itself is a destination in itself. I could spend all day simply wandering its vicoli, smiling at whoever happens to look my way. Unfortunately, there are only two kinds of people that respond: shop keepers and old women. The shop keepers are, of course, trying to get you to come in. They usually speak in broken English because they can spot you by the look in your eye: il touristo americano. The old ladies smile the most, accompanied by the occasional buona sera.

A group of four Germans has arrived and they are taking turns with the camera. My heart starts to beat faster as I consider asking them if they want a photo together. I get nervous when communicating with people that aren’t expecting it. I’m pretty comfortable at brutalizing the Italian language in front of shop keepers. Though it makes their hearts ache, I’m sure they are used to tourists speaking to them loudly in their native tongue, finally settling on poor English as the lingua franca.

I casually interrupt in my best German, which is ignored at first. After the next picture, the gentleman holding the camera looks deliberately in my direction, as if to say “yeah, what?” I freeze, forgetting the nouns for the sentence I had constructed in my head during my earlier internal monologue. I ask if they speak English. He indicates that they do, and I ask if they want me to take their picture all together. He declines immediately, apologetically indicating that his camera is out of film. I know it is a lie, but I might not give my camera away as easily to some stranger typing on a huge laptop at the top of a hill, either. Oh, well. At least I offered.

Time to move to another location.

The main square in town is oddly calm. Most people are still eating or relaxing after their meals. The square itself has undergone some construction over the last few days in preparation for an amateur dramatic production that will be held this evening. Today’s additions include freshly cut tree limbs to act as trees for the set. There is still a pile of branches on the steps, awaiting their final placement. Nobody seems to be tending to them at the moment.

Throughout the streets today, I noticed flags that weren’t there before. There are at least 4 different flags, obviously demarking the various neighborhoods. Oddly, only two are visible from the piazza grande. I would have expected to see all of them converge on this central square. Perhaps neighborhood warfare has resulted in several takeovers over the years leading to the current situation. Perhaps not.

Piazza Grande in Montepulciano is dominated by two structures: the town hall, complete with clock, and the obligatory cathedral, complete with bell tower. Actually, both the tower and the clock have bells. In 20 minutes, I’ll find out if they are on the same schedule. The cathedral is similar to many Catholic cathedrals in Italy, except that it is one of the unfortunates that does not have a marble facade. Yesterday’s trip to Arezzo gave me the opportunity to see a very attractive cathedral that was not intended to have a marble facade. Instead, it had light brown, slightly reddish bricks, smooth all around (aside from the few that had cracked and had their street-facing portions fall off). The tower here was built as a masonry structure, and looks like it is finished. The naked cathedral (one that was intended to have a facade but never got one, or sometimes had it repossessed for a more important church) is unfortunately very ugly. The bricks are left intentionally ragged, a smattering of mortar on the outside to hold it all together. It almost looks forgotten, except for the obvious significance it has at the center of town.

A group of Americans was trying to figure out how to get up higher for a better view. My heart doesn’t beat so fast when considering whether or not to help Americans; I can judge their interest much more successfully given my relative mastery of the language. I tell them that it’s a pretty good view to walk up the clock tower, although they won’t let you go up to the top of the clock (it’s pretty small). You get a pretty good view from up there. (I’m waiting to go to the Internet place to download the software to create panoramas from my individual shots.) They thank me and head into the town hall. I leave to get online.

My Wife and I Went to Portland and All I Got Was This Poison Ivy

Sunday, May 28th, 2006

My sister-in-law had a conference to attend in Portland, Oregon, and she happened to have two companion fare-rate tickets available. What the heck, we said. We’ve never been to Oregon, before.

First of all, let me say that the travel gods were smiling on us for this trip. Not a drop of rain fell the entire time, and the temperature was in the high 70s and low 80s the entire time. We were definitely spoiled. Portland is an interesting town. Yes, town. Portland is, by the numbers, about the same as Washington, DC: they are within a few percent of each other in terms of population and land area. However, the city seemed small (that is, easy to explore on foot) and empty. We spent two days inside the city, including most of the area covered by the MAX, Portland’s mass transit system, and hardly met any people. I kept waiting for businesses to close and people to come streaming into the streets during lunch, or even at the end of the business day. Such events never came.

Walking around in downtown DC, the sidewalks are teeming with people. Sure, everyone is on a cell phone talking way louder than absolutely necessary, but they are there nonetheless. Being in Portland felt like being in a city that had been partly evacuated.

I mentioned to someone on the MAX that there were plenty of seats in the middle of the day, and that it was nice. She glanced around the train car and said, “well, around five o’clock, it’ll be pretty full”. We were on the train again around that time, and this time I had to stand, along with a reasonable number of people (I’m used to the Orange Line, which is pretty full most of the time, and totally crazy during rush hour). Another train came while we were still standing near the station, and it was practically empty. Very odd. Maybe we were there on some kind of city-wide vacation week.

Many stores didn’t seem to be open no matter when we went by. We wandered around most of the time we were in the city (no specific destinations or schedules). One evening, we were surprised to see that a gelato shop (mmm… gelato) was closed at maybe eight in the evening. The next day, we came by the same block at nine in the morning and it was also closed. Maybe they were only open for the lunch rush.

Maybe not. That same day, we went to the northwest corner of the city for two things: lunch and liquor. We stepped into one of the McMenamins pubs (one of the most prolific pub owners in the city) to get a pint and a sandwich, and experienced a continuation of our Twilight Zone-esque trip in this cavernous pub: there was one guy at a billiard table, one guy at the bar, and a group of 4 people in a booth. Those guys in the booth looked familiar: they were the kind of business-type folks that I’m used to seeing packed heel-to-toe in joints like this in DC.

The place was large enough to seat maybe 100 people, and it looked like there were more rooms if you kept walking into the establishment. We sat next to the foursome at a booth, next to the windows in the front — prime real estate for lunch as far as I was concerned. Where were these people? I think maybe Portland was experiencing a rash of body-snatchings or something.

Okay, enough about the emptiness of Portland. There were lots of interesting things to be found in Portland, including an electric car charging station and a distillery. I had never been to a distillery, and since we could take the train to one, we did. We went to the Clear Creek Distillery, which distills its own spirits right in the place we visited (meaning that we didn’t just go to a storefront or anything… we were in the place where they do everything).

Unbelievably, we found a whiskey that ought to be a whisky. A Scotch Whisky, that is. McCarthy’s Oregon Single Malt is a 3-year-old whiskey that is so smooth and peaty, I find it superior to many single-malt Scotches that I’ve had. Amazing.

Also amazing were the waterfalls (as well as the canyon walls producing them) along the Columbia River Gorge. We took many pictures while we were there, including some nice ones of the falls and some videos (like this one). I have yet to assemble some of the panoramic sets that I took, which I just remembered that I have to do. We toured the Bonneville Lock and Dam, which includes a fish ladder to allow fish to swim upstream, around the dam. The fish ladder features a narrow route where fish are counted by folks who live to count fish. Our visit occurred on a slow day (otherwise known as “not September”) and so the counter on duty told us all about identifying and counting fish. We even saw a couple of Chinook Salmon going by the window (pic and video). Mmmm… salmon. We ended up at a restaurant halfway up Mount Hood, where there was still snow on the ground. So much for 85-degree weather in Portland.

The Pacific Coast is also gorgeous (no pun intended). Again, we have some pictures of that outing as well. Unfortunately, I hadn’t charged my camera battery before we left so it crapped-out around four o’clock in the afternoon. We switched to my sister-in-law’s camera, but we had left it in the car since I had mine, so we didn’t get some really nice pictures that I would like to have gotten. Oh, well. At least I got this one of a tree eating me. At some point during these two hiking trips, I got poison ivy.

When I was a kid, I got poison ivy all the time. I was starting to become convinced that doctors just didn’t understand how poison ivy worked, and that I had actually been permanently infected with it, and that heat simply triggered it in my body. It didn’t help that I had no idea what it looked like. Despite my father’s repeated descriptions and demonstrations (with me at a distance, naturally), I simply couldn’t identify the plant. Even as recently as last summer, I went to the edge of a forested plot of land near my parents’ home and said “hey, that’s poison ivy, right? I’ll take care to avoid that.” He pointed out that what I had identified as poison ivy was actually just a harmless fern or whatever, but that several feet away lay the beast. For whatever reason, I simply cannot identify it properly. I can identify things that look like poison ivy, but I don’t think I have a single successful identification under my belt. What I do have under my belt is an itchy rash. Wow, does that sound bad: it’s on my hip, okay!?

On our last day there, Katie and I walked to the Portland Rose Garden, but due to severe scale management issues on the tourist’s map we were carrying, we thought it really sucked. We finally found it, but sadly May isn’t a good time for roses. So, we toured the Portland Rose Bush Garden, sans most of the flowers. After that, Katie used her superior orienteering skills to get us lost in Forest Park. Fortunately, the park is bounded on all sides by roads, so eventually we would have been rescued. When we stumbled across the Visitors’ Center (closed, of course), we were happy to find that trail maps were provided on the outside of the building, meaning we could take one and find our way back to the train.

Finally, we got down to business and started visiting wineries. Unless you’re there, stories about them are dreadfully boring, so I’ll spare the details. Suffice it to say that we toured several wineries, tasted lots of wine, got industry discounts on everything (thanks to Katie’s job at Lost Creek Winery here in Virginia), came home with nearly 40 bottles of wine, and ate lunch at a joint called Nacho Mama’s. Fortunately, we opted to ship 2 cases, so we lugged fewer than half of those home with us on the plane.

So, I’m going to go assemble those panoramas. I have only one decision to make: shall I have wine, or scotch?

Update: 2006-05-28 17:03 – Panoramas are available for the Columbia River Gorge, some random mountain, and Mt. Hood. I had scotch.

(Yet Another) Microsoft Internet Explorer Rendering Bug

Tuesday, March 21st, 2006

For years, standard operating procedure for developing a web application would be to design and implement it with Microsoft’s Internet Explorer as the test bed. You’d pick a version and tell everyone that they needed to use it and that was that. I have to admit that even I committed such sins.

These days, I want my pages to be usable on as wide a variety of web browsers as possible, so I use Mozilla Firefox for development, and then just check MSIE at the end to see if anything is amiss. Yet, with MSIE, something is almost always amiss…

I’ve had trouble with good-looking logos and mastheads for a long time. Back in the day, tables were the way to go. More recently, CSS is the preferred (and really the only) way to do layouts. The trouble is that MSIE has some schizophrenia when it comes to CSS. The folks that wrote MSIE implemented only part of the CSS specifications, and often took shortcuts whenever they wanted.

A few days ago, I noticed that MSIE was acting strangely when viewing one of my current projects. It appeared that the text in my masthead wasn’t being displayed. Perhaps I had some conflicting CSS styles that were giving my text the same background and foreground colors. I checked, and everything was okay. Reload after reload, application server restart after application server restart didn’t solve the problem. Mozilla Firefox never blinked an eye and rendered the (quite simple) page without a problem.

Then, I noticed that scrolling the page past the fold and back again would mysteriously reveal the text. That’s not something that can be done using CSS.

Check out this movie that demonstrates the problem:

Screenshot of MSIE Rendering Bug

It’s definitely a bug. The markup is validated correct strict XHTML 1.0 and the CSS is also spick-and-span clean.

The markup is fairly straightforward; there’s a div surrounding the entire masthead (both the topmost blue bar and the bar containing the tri-colored regions), and then a div containing everything within the topmost blue bar. The blue bar contains a form containing the login form. All the other text elements are plain-old h1 and h2 elements. The tri-colored regions live in their own div and are made up of the surrounding div (black background) and two nested div elements with appropriate background colors.

The styles are also straightforward; colors, margins, borders, etc. I didn’t even have to use the ‘line-height’ trick to get MSIE to display an empty div (for the tri-colored regions).

It turns out that the problem can be solved by adding a simple non-breaking space between the blue-bar and tri-colored-bar div elements. MSIE interprets this change by finally giving me what I wanted in the first place. Unfortunately, it adds a small vertical space before the tri-colored-bar which I would prefer not to have… it looks like unnecessary padding in the blue div.

I have a virtual machine running Microsoft Windows XP with MSIE 7 beta running on it, so I decided to comfort myself with the fact that MSIE 7 had probably fixed this bug. It hasn’t, at least not yet. I hope the MSIE engineers really try to get CSS right this time.

Terre Haute Tribune-Star Snubs Hometown School

Monday, March 13th, 2006

The Terre Haute Tribune-Star, a publication I rarely think of since I graduated from college, is running a story about how Indiana State University (in the ‘Haute) will be the “first public university in the state to require all students to have notebook computers”.

After mentioning ISU’s new plan, the article goes on to say that ISU is only “one of a handful of institutions nationally, including the University of North Carolina-Chapel Hill and Clemson University, to institute similar mobile computing initiatives”.

Apparently, the author of this article did not do their homework, at least not very well. The article seems to come directly from the Tribune-Star; it’s not an AP report or reprinted from another publication, so they should have known about this: Rose-Hulman Institute of Technology, a well-known university in town, has had a mandatory laptop program since the class of 1999 began as freshmen in 1995, making that program over 10 years old.

UNC-Chapel Hill appears to have discussed this back in 2004 and has probably implemented it since then. Clemson University started requiring laptops in 2003, and University of Denver (mentioned in the comments) implemented their program back in 1999.

The author not only completely misses the fact that Rose-Hulman has a laptop program, and that it is over 10 years old, but also that there is actually a rather long list of universities that implement laptop programs such as these. The statement that ISU is one of “a handful of institutions nationally” to implement laptop programs is preposterous.

Oddly enough, there is no attribution for the Tribune-Star piece.
I wouldn’t want my name on it, either.

Something’s been bugging me…

Friday, March 10th, 2006

For a few months, now, Katie and I have been repeatedly finding a certain bug in our house. Our building has a history of cockroach infestation, but we hardly get them way up on the 8th (top) floor of our building.

The Western Conifer Seed Bug

These are not cockroaches, at least I was pretty sure they weren’t. I don’t know a lot about bugs, so I wasn’t sure if these things were momma cockroaches or something else entirely.

Katie emits a blood-curdling scream whenever she sees a bug (like this house centipede that we found in our bathtub one day). Totally creepy, this thing has a 3 inch body (~7cm) and legs everywhere, which made it seem much bigger. Fortunately for us, the tub was too slick for it to escape and feed on our succulent brains.

Anyhow, we found one of our new friends, dead, on the floor, today. Usually, I end up killing these guys with plenty of paper towels so I don’t get icky bug juice on me. Besides, I’m no outdoorsman. Since we had a dead specimen that hadn’t been turned into a streaky mess in a paper towel, I decided to investigate him. I’m generally too creeped-out to take a look at a live one, especially when I don’t know what the hell it is. It might be a Peruvian Eye-Gasher for all I know.

I decided that I had to find out what these things were, because they were appearing every couple of days. If they were roaches, I was going to deploy bait traps every 18 inches until they stopped showing up.

I searched around teh intarweb for a while, and found a great site: whatsthatbug.com. This site is all about identifying creepy things that one finds in and around one’s house. It takes the form of submissions of the type “I photographed this thing on my wall/floor/cabinet/dog/toilet, and I want to know what it is”. They archive what looks to be every submission they’ve ever gotten, along with an explanation of what the thing is. You can search and/or browse, so I did a little of both.

Meanwhile, Katie was trying other sources of information (they appeared to be more academic in nature) with not too much luck. She found a few things that might have been related, but I wasn’t convinced. They looked too dissimilar to the corpse before me on my desk.

It didn’t take me long to find an entry about the Western Conifer Seed Bug, which is our bug. Some of the pictures of the WCSB on What’s That Bug’s True Bugs page are not very good (except the one that was taken from below, where it looks like the bug is either on a window or is skydiving at the photographer). So, I decided to take my own (see above).

For some reason, my camera wouldn’t focus properly on the bug when at close range, or even far range with a high zoom. I simply couldn’t get the camera to focus properly. So, I used my extensive knowledge of optics to engineer a solution: I put a large magnifying glass between the camera and the bug, and the results were very good.

So, the Western Conifer Seed Bug is nothing to fear, apparently. They’re still creepy, and I’m likely to kill those found in my home, especially if I find my wife standing on a chair screaming.

Hyperthreading CPUs and User Experience

Tuesday, November 29th, 2005

Brian has an article on his blog about Hyperthreaded CPUs and their effects on “the user experience”, by which I’m sure he means the typical response on a graphical desktop to a user’s actions — something like moving the mouse, dragging a window, opening-up a menu, etc.

I disagree with a few of his assertions… namely that HT itself is responsible for improving the user experience. For example, if you have a single (and non-HT) processor and you run some CPU-intensive process (such as a compiler, a complex graphical manipulation that doesn’t take advantage of your graphics processor, or a poorly written program that runs away with your CPU in a tight loop), that process is going to eat cycles that would otherwise be used to redraw your mouse pointer (hardware-drawn cursors went away with Microsoft Windows 3.1), draw the menus in your spreadsheet program, or drag your windows around the screen. This makes the responsiveness of your graphical desktop seem sluggish.

The reason this happens is that CPUs can only do one thing at a time. Fortunately, they typically do things reeeeeally fast, so you don’t notice that it’s only doing that one thing at a time because it switches tasks and does a little bit of work here and a little bit of work there, and it magically looks like everything is getting done “all at the same time”.

With HT, the CPU itself can actually do more than one thing at a time. Sure, the CPU still does that frenzied-switcheroo dance, except that it can (ostensibly, anyway) do work on two whole tasks at once. Brian mentions that HT isn’t as nice as actually having two equally-fast processors, but let’s ignore that fact for the moment.

I assert that the responsiveness of the graphical desktop has more to do with the way that the desktop functions than the way the CPU works. Evidence? Compare any version of Microsoft Windows with a similar machine running Linux and any one of the graphical desktops that run atop it. When you launch a program under Microsoft Windows, you get an hourglass mouse pointer, the computer churns for a while, and the program window eventually opens. The next time you do that, move the mouse around… try to open another application…. try to drag another window around. For the most part, your desktop will respond quite favorably. The mouse cursor will smoothly follow your hand motions, the windows will redraw, and the second application will also eventually open.

My experience with Linux is not the same. If I open an application, the mouse cursor immediately starts jerking around and loses its smoothness. With the mouse jerking around, the windows jerk around as well. Other apps will start, of course, but it’s really things like dropping menus down and moving the mouse that people notice.

Note to Linux zealots: I totally love Linux. I run it on everything except the computer that I use as my primary desktop, mostly because of games that I want to play. Yeah, Wine just doesn’t work for me. Get over it.

Anyway, these observations lead me to believe that Microsoft Windows, no doubt through some kind of unholy voodoo, has gone through great pains to schedule the user interface at the highest possible priority. Linux, in typical pragmatic style, has chosen not to hijack the CPU for such trivial details as turning your mouse pointer into the Energizer Bunny.

As for Brian’s compiler running in a virtual machine, it’s a shame that VMWare doesn’t properly expose both processors available to Microsoft Windows to the OS running in the virtual machine. I would expect that a decent virtualization environment would allow you to set the number of CPUs to expose to the guest OS. I would have expected his gentoo compile to be able to peg both of his virtual CPUs.

But back to CPU utilization versus user experience. I would bet that if he were using a threaded compiler (which almost doesn’t make any sense) directly in his Microsoft Windows environment, and compiling the same code (or at least performing a compile that was equally CPU-intensive), then both HT CPUs would be pegged, and he’d still be able to move the mouse around, click on things (with a slight delay), etc.

I think it comes down to scheduling. Your OS can always interrupt your compiler for any reason. Your compiler (well, really your VM) is probably scheduled in “normal” mode, whatever that means for your OS. I’m willing to bet dollars-to-doughnuts (mmmm… doughnuts) that Microsoft Windows’s graphical shell itself (explorer?) is scheduled at a higher-than-normal priority, or that all the UI calls that it makes are either running at the kernel level (which wouldn’t surprise me one bit for MS Windows, honestly) or at a higher-than-normal priority. It’s not the CPU, it’s the scheduler.

There are a lot of folks out there that say that HT is actually hurting performance. I haven’t read any of them, ’cause I’m honestly not that interested in looking at those numbers. After reading the ARS article linked above a few years ago, I thought that some really smart dudes got really high one day and had themselves a fantastic idea. I figured that it wasn’t as cool as the hype would suggest, but hey… why not squeeze as much out of the CPU as you possibly can? My gut reaction is that you can find data to either support or deride HT technology. I do know one thing: lots of Java developers were complaining in the past that HT CPUs would crash all the time with very strange errors, and turning off HT would solve their problems. >shrug<. You gotta do what you gotta do. Too bad those folks paid extra for their super-sexy HT processors.

I had a friend at Rose-Hulman that used to play Unreal Tournament with a couple of friends and me. He had just gotten a dual-CPU machine and decided to play with it a little: he created a dedicated server and set the processor affinity to his second processor (i.e. not the primary one). Then, he started UT in client mode so he could play it, and set the processor affinity for that process to the primary CPU. I’m not sure if it really made any difference compared to just running them separately with no tweaks, but it was an interesting idea.

When I heard that he had done that, I decided that since the OS itself actually needs very little CPU time to do its stuff, an OS that could monopolize a considerably lower-powered processor and then schedule all user tasks on a much higher-powered processor would be great. Super-fast memory allocation (not that it’s particularly slow in the first place), buffer management, DMA, etc. For most OSs, this also means that the various hardware drivers would run on a CPU that wasn’t being used for applications. That would speed up graphics processing since even a computer with the latest monster GPU still needs the graphics driver to actually send the data to the GPU to do the work.

Who knows. Maybe someone will steal my idea and make a jillion dollars. That would really suck for me.

Character Assassination

Friday, November 18th, 2005

At the dawn of (computer) time, someone decided that computers being able to deal with letters as well as numbers would be a great idea. And it turned out to be a big ol’ mess.

The problem is that you have to decide how to encode these letters (or characters) into numbers, which is the only thing that computers can handle. EBCDIC and ASCII were two of the first, and while EBCDIC has effectively died, ASCII has turned into a few (relatively compatible) standards such as US-ASCII and ISO-8859-1 (also called “Latin-1”). These jumbles of letters are called character sets, and they describe how to take the concept of a letter and turn it into one or more 8-bit bytes for processing within the computer.

One of the most flexible character sets is called UTF-8, and represents an efficient packing of bytes by only using the minimum necessary. For example, there are jillions of characters out there in human language if you take into account written languages like Chinese, Sanskrit, etc. We would need many bytes to represent all character possibilities (maybe 4 or 5), but UTF-8 has a trick up its sleeve that helps reduce the number of bytes taken up by common (read: ASCII) characters: those take a single byte each, and everything else takes two or more. It’s also completely backward-compatible with ASCII, which makes it super-handy to use in places where ASCII was already being used, and it’s time to add support for international characters.
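To make the "minimum necessary" idea concrete, here is a small sketch that counts how many bytes UTF-8 spends on a few characters. The class and method names are made up for illustration; the only library calls are String.getBytes and StandardCharsets from the standard JDK.

```java
import java.nio.charset.StandardCharsets;

public class Utf8Lengths {
    // Number of bytes UTF-8 needs to store the given text.
    static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // Unicode escapes are used so the source file's own encoding can't interfere.
        System.out.println(utf8Length("A"));       // plain ASCII letter: 1
        System.out.println(utf8Length("\u00A1"));  // inverted exclamation mark: 2
        System.out.println(utf8Length("\u4E2D"));  // a common Chinese character: 3
    }
}
```

Note that the inverted exclamation mark is part of Latin-1 but still costs two bytes in UTF-8; only the ASCII range gets the one-byte treatment.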

Now that the history lesson is over, it’s time to complain.

I’m writing an application in the Java programming language, which is generally highly touted as having excellent internationalization (or i18n) support: it has encoding and decoding capability for a number of different character sets (ASCII, UTF-8, Big5, Shift_JIS, any number of ISO-xyz-pdq encodings, etc.), natively uses Unicode (actually, UTF-16, which is a specific type of Unicode), and has some really sexy ways to localize (that’s the process of managing translations of your stuff into non-native languages — such as Spanish being non-native to me, an English speaker) content.

I was trying to do something very simple: get my application to accept a “funny” (or “international” or non-Latin-1… I’ll just say “funny”, since I don’t use those characters very often) character. I love the Spanish use of open-exclamation and open-question characters. They’re upside-down versions of ! and ? and precede questions and exclamations. It makes sense when you think about it. Anyhow, I was trying to successfully take the string “¡Bienvenidos!”, put it into my database, and get it back out successfully, using a web browser as the client and my own software to move the data back and forth.

It wasn’t working. Repeated submissions/views/re-submissions were resulting in additional characters being inserted before the “¡”. Funny stuff that I had clearly not entered.

I’ve done this before, but the mechanics are miserable and I pretty much block out the painful memories each time it happens.

The problem is that many pieces of code get their grubby little hands on the data between the time you type it on your keyboard and the time it gets into my database. Here is a short list of code that handles those characters, and where opportunities for cock-ups occur.

  • Keyboard controller. Your keyboard has to be able to “type” these characters correctly so that the operating system can read them. I can’t type a “¡” on my keyboard, so I need to take other steps.
  • Your operating system. MS-DOS in its default configuration in the US isn’t going to handle Kanji characters very well.
  • Your web browser. The browser has to take your characters and submit them in a request to the web server. Guess what? There’s a character encoding that is used in the request itself, which can complicate matters.
  • The web server, which may or may not perform any interpretation of the bytes being sent from the web browser.
  • The application server, which provides the code necessary to convert incoming request data into Java strings.
  • My database driver, which shuttles data back and forth between Java and the database server.
  • The database itself, which has to store strings and retrieve them.

I can pretty much absolve the keyboard and operating system at this point. If I can see the “¡” on the screen, I’m pretty happy. I can also be reasonably sure that the web browser knows what character I’m talking about, since it’s being displayed in the text area where I’m entering this stuff. My web server is actually ignoring request content and just piping it through to my app server. The database and driver should be okay, as I have specified that I want UTF-8 to be used both as the storage format of characters in the database, and for communication between the Java database driver and the database server.

That leaves 2 possibilities: the request itself (made by the web browser) or the application server (converts bytes into Java strings).

The first step in determining the problem is research: what happens when the web browser submits the form, and how is it accepted and converted into a Java string?

  1. The web browser creates a request by converting all the data in a form into bytes. It does this by using the content-type “application/x-www-form-urlencoded” and some character encoding. You can ignore the content-type for now.
  2. The request is sent to the server.
  3. The application uses the ServletRequest.getParameter method to get a String value for a request parameter.
  4. The application server reads the parameter out of the request using some character encoding, and converts it into a String.
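Steps 1 and 4 can be sketched in a few lines, with java.net.URLEncoder standing in for the browser and java.net.URLDecoder standing in for the application server. That substitution is an assumption made for illustration: real browsers and containers have their own code and may escape punctuation slightly differently. The class and helper names are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

public class FormEncodingDemo {
    // Step 1 (browser side): text -> bytes in some charset -> percent-escaped form data.
    static String encode(String text, String charset) {
        try {
            return URLEncoder.encode(text, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Step 4 (server side): percent-escaped form data -> bytes -> String in some charset.
    static String decode(String wire, String charset) {
        try {
            return URLDecoder.decode(wire, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String original = "\u00A1Bienvenidos!";
        String wire = encode(original, "UTF-8");
        System.out.println(wire); // %C2%A1Bienvenidos%21

        // Same charset on both ends: the text survives.
        System.out.println(decode(wire, "UTF-8").equals(original)); // true

        // Different charset on the server: one extra character appears.
        System.out.println(decode(wire, "ISO-8859-1").length()); // 14 (the original has 13)
    }
}
```

The two escaped bytes C2 A1 are one character in UTF-8 but two characters in ISO-8859-1, which is exactly the kind of disagreement the list above leaves room for.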

So, it looks like the possibilities for confusion are where the character sets are chosen. The W3C says that <form> elements can specify their preferred character set by using the accept-charset attribute. The default value for that attribute is “UNKNOWN”, which means that the browser is free to choose an arbitrary character set. A semi-tacit recommendation is that the browser use the character encoding that was used to provide the form (i.e. the charset of the current page) as the charset to use to make the request.

That seems relatively straightforward. My responses are currently using UTF-8 as their only charset, so the forms ought to be submitted as UTF-8. Perfect! “¡” ought to successfully be transmitted in UTF-8 format, and go straight-through to my database without ever being mangled. Since this wasn’t happening, there was obviously a problem. What character set *was* the browser using? A quick debug log message ought to help:

DEBUG - request charset=null 

Uh, oh. A null charset means that the app server has to do some of its own thinking, and that usually spells trouble.

Time to take a look at the ‘ole API specification. First stop, ServletRequest.getParameter(), which is the first place my code gets a crack at reading data. There’s no mention of charsets, but it does mention that if you’re using POST (which I am), that calling getInputStream or getReader before calling getParameter might cause problems. That’s a tip-off that one of those methods gets called in order to read the parameter values themselves. Since InputStreams don’t care about character sets (they deal directly with bytes), I can ignore that one. ServletRequest.getReader() claims to throw UnsupportedEncodingException if the encoding is (duh) unsupported, so it must be applying the encoding itself. There is no indication of how the API determines the charset to use.

The HTTP specification has a header field which can be used to communicate the charset to be used to decode the request. The header is “content-type”, and has the form: “Content-Type: major/minor; charset=[charset]”. I already mentioned that the content-type of a form submission was “application/x-www-form-urlencoded”, so I should expect something like “Content-Type: application/x-www-form-urlencoded; charset=UTF-8” to be included in the headers from the browser. Let’s have a look:

DEBUG - Header['host']=[deleted]
DEBUG - Header['user-agent']=Mozilla/5.0 [etc...]
DEBUG - Header['accept']=text/xml, [etc...]
DEBUG - Header['accept-language']=en-us,en;q=0.5
DEBUG - Header['accept-encoding']=gzip,deflate
DEBUG - Header['accept-charset']=ISO-8859-1,utf-8;q=0.7,*;q=0.7
DEBUG - Header['keep-alive']=300
DEBUG - Header['connection']=keep-alive
DEBUG - Header['referer']=[deleted]
DEBUG - Header['cookie']=JSESSIONID=[deleted]
DEBUG - Header['content-type']=application/x-www-form-urlencoded
DEBUG RequestDumper- Header['content-length']=121

Huh? The Content-Type line doesn’t contain a charset. That means that the application server is free to choose one arbitrarily. Again, the unspecified charset comes back to haunt me.
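For illustration, here is a simplified sketch of pulling the charset parameter out of a Content-Type value. It is not how Tomcat or any other container actually does it; the class and method names are hypothetical, and the parsing ignores corner cases of the header grammar.

```java
import java.util.Locale;

public class ContentTypeCharset {
    // Returns the charset parameter of a Content-Type value, or null if there is none.
    static String charsetOf(String contentType) {
        if (contentType == null) {
            return null;
        }
        String[] parts = contentType.split(";");
        // parts[0] is the media type itself; parameters start at index 1.
        for (int i = 1; i < parts.length; i++) {
            String part = parts[i].trim();
            if (part.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String value = part.substring("charset=".length()).trim();
                // Strip optional surrounding double quotes.
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value.length() == 0 ? null : value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // What I was hoping the browser would send:
        System.out.println(charsetOf("application/x-www-form-urlencoded; charset=UTF-8")); // UTF-8
        // What it actually sent:
        System.out.println(charsetOf("application/x-www-form-urlencoded")); // null
    }
}
```

With the header from the log above, there is simply nothing to find, which matches the null in the earlier debug message.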

So, the implication is that the web browser is submitting the form using UTF-8, but that the app server is choosing its own character set. Since things aren’t working, I’m assuming that it’s choosing incorrectly. Since the Servlet spec doesn’t say what to do in the absence of a charset in the request, only reading the code can help you figure out what’s going on. Unfortunately, Tomcat’s code is so byzantine, you don’t get very far into the request wrapping and facade classes before you go crazy.
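The symptom of extra characters piling up in front of the “¡” can be reproduced without any server at all. This sketch assumes the server falls back to ISO-8859-1 when no charset is given; that fallback is an assumption for the demonstration, and the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {
    // One submit/view cycle: the browser sends UTF-8 bytes,
    // and the server (by assumption) reads them as ISO-8859-1.
    static String misread(String text) {
        byte[] wire = text.getBytes(StandardCharsets.UTF_8);
        return new String(wire, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String original = "\u00A1Bienvenidos!";
        String once = misread(original);
        String twice = misread(once);

        System.out.println(original.length()); // 13
        System.out.println(once.length());     // 14: U+00C2 now sits in front of the inverted mark
        System.out.println(twice.length());    // 16: each non-ASCII character doubled again
    }
}
```

Every round trip turns each non-ASCII character into two, so the garbage grows with each re-submission while the ASCII part of the string stays untouched.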

So, you try other things. Maybe the app server is using the default file encoding for the environment (it happens to be “ANSI_X3.4-1968” for me). Setting the “file.encoding” system property changes the file encoding used in the system, so I tried that. No change. The last-ditch effort was to simply smack the request into submission by explicitly setting the character encoding in the request if none was provided by the client (in this case, the browser).

The best way to do this is with a servlet filter, which gets ahold of the request before it is processed by any servlet. I simply check for a null charset and set it to UTF-8 if it’s missing.

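import java.io.IOException;

import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;

public class EncodingFilter
    implements Filter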
{
    public static final String DEFAULT_ENCODING = "UTF-8";

    private String _encoding;

    /**
     * Called by the servlet container to indicate to a filter that it is
     * being put into service.
     *
     * @param config The Filter configuration.
     */
    public void init(FilterConfig config)
    {
	_encoding = config.getInitParameter("encoding");
	if(null == _encoding)
	    _encoding = DEFAULT_ENCODING;
    }

    protected String getDefaultEncoding()
    {
	return _encoding;
    }

    /**
     * Performs the filtering operation provided by this filter.
     *
     * This filter performs the following:
     *
     * Sets the character encoding on the request to that specified in the
     * init parameters, but only if the request does not already have
     * a specified encoding.
     *
     * @param request The request being made to the server.
     * @param response The response object prepared for the client.
     * @param chain The chain of filters providing request services.
     */
    public void doFilter(ServletRequest request,
			 ServletResponse response,
			 FilterChain chain)
	throws IOException, ServletException
    {
	request.setCharacterEncoding(getCharacterEncoding(request));

	chain.doFilter(request, response);
    }

    protected String getCharacterEncoding(ServletRequest request)
    {
	String charset=request.getCharacterEncoding();

	if(null == charset)
	    return this.getDefaultEncoding();
	else
	    return charset;
    }

    /**
     * Called by the servlet container to indicate that a filter is being
     * taken out of service.
     */
    public void destroy()
    {
    }
}

This filter has been written before: at least here and here.

It turns out that adding this filter solves the problem. It’s very odd that browsers are not notifying the server about the charset they used to encode their requests. Remember the “accept-charset” attribute from the HTML <form> element? If you specify that to be “ISO-8859-1”, Mozilla Firefox will happily submit using ISO-8859-1 and not tell the server which encoding was used. Same thing with Microsoft Internet Explorer.

I can understand why the browser might choose not to include the charset in the content type header because the server ought to “know” what to expect, since the browser is likely to re-use the charset from the page containing the form. But what if the form comes from one server and submits to another? Neither of these two browsers provides the charset if the form submits to a different page, so it’s not just an “optimization”… it’s an oversight.

There’s actually a bug in Mozilla related to this. Unfortunately, the fix for it was removed because of incompatibilities that the addition of the charset to the content type was causing. Since Mozilla doesn’t want to get the reputation that their browser doesn’t work very well, they decided to drop the charset. :(

The bottom line is that, due to some bad implementations out there that ruin things for everyone, I’m forced to use this awful forced-encoding hack. Fortunately, it “degrades” nicely if and when browsers start enforcing the HTTP specification a little better. My interpretation is that “old” implementations always expect ISO-8859-1 and can’t handle the “charset” portion of the header. Fine. But, if a browser is going to submit data in any format other than ISO-8859-1, then they should include the charset in the header. It’s the only thing that makes sense.

Ridin’ Along in my Automobile

Wednesday, September 21st, 2005

Over the Labor Day weekend, a few friends and I rented a cabin on the Shenandoah riverfront and whiled away the weekend grilling, playing cards, and laughing at Bill whistling on all-fours, and Kasey smaaashing bugs.

Fortunately for Bill and Kasey, there was something even funnier, and more unbelievable that we witnessed that weekend. That thing was this dude driving his pickup down the middle of the Shenandoah river.

Image of some dude driving his truck down a river.
No particular place to go

There’s a video that I got during the whole thing, but I still have to get it from the camera’s owner (only got the pictures so far). Follow the picture-link to see more pictures.

And yes, he was just driving down the river… as if it were a regular road, albeit with continuous speed bumps.

Update: Here’s a Google Maps link to the place where we were staying: Cabin Map. If you zoom out one level and choose the map/satellite hybrid, you can see the area of the river where this dude was. We were on the southwest side of the river, and the island in the map was directly across the river from us.

How old are you, really?

Sunday, August 7th, 2005

When a man sits with a pretty girl for an hour, it seems like a minute. But let him sit on a hot stove for a minute–and it’s longer than any hour. That’s relativity.

-Albert Einstein

Reckoning time has always been a problem for humans, it seems. We have argued over which calendar to use for quite a long time. Even worse is trying to figure out how long ago something happened.

Many “how long ago” questions can be answered with a certain degree of slop. For example, “how long ago was Jesus of Nazareth born?” could be answered, “about 2000 years ago”. “When was peace declared at the end of World War II?”, “60 years ago”. But what about a question to which the answer should be more specific, such as “how long ago was I born?” I want to know the years, months, and days for that figure, and here’s why.

As part of my continuing work with The Center for Promotion of Child Development Through Primary Care, I have to be able to display ages for patients that our doctors will be treating. More often than not, these patients are young, so we’re talking about newborns through adolescents. For the newborns, the number of months and days is very important, while the ages of adolescent patients are okay to round off to years and months, and maybe just years.

It turns out that it’s somewhat difficult to answer the question “how old are you?”. It doesn’t really seem all that hard, until you actually try to do it. The problem is that people disagree about a lot of things. For example, you won’t get much argument that there are 10 days separating 2000-01-01 and 2000-01-11, or that there is 1 month separating 2000-01-01 and 2000-02-01. But what about the date difference between 2000-01-31 and 2000-02-30? Is that 30 days or is it 1 month?
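A quick way to see the disagreement is to ask java.util.Calendar the same question two ways, starting from 2000-01-31: add one month, or add 30 days. This is only a sketch; the class and helper names are hypothetical, and UTC is used just to keep time zones out of the picture.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.TimeZone;

public class MonthAmbiguity {
    // Returns {year, month (1-based), day} after adding `amount` of `field` to the given date.
    static int[] plus(int year, int month, int day, int field, int amount) {
        Calendar c = new GregorianCalendar(TimeZone.getTimeZone("UTC"));
        c.clear();
        c.set(year, month - 1, day);
        c.add(field, amount);
        return new int[] {
            c.get(Calendar.YEAR), c.get(Calendar.MONTH) + 1, c.get(Calendar.DAY_OF_MONTH)
        };
    }

    public static void main(String[] args) {
        int[] oneMonth = plus(2000, 1, 31, Calendar.MONTH, 1);
        int[] thirtyDays = plus(2000, 1, 31, Calendar.DAY_OF_MONTH, 30);

        // "One month later" is pinned to the end of February: 2000-2-29
        System.out.println(oneMonth[0] + "-" + oneMonth[1] + "-" + oneMonth[2]);
        // "Thirty days later" lands in March: 2000-3-1
        System.out.println(thirtyDays[0] + "-" + thirtyDays[1] + "-" + thirtyDays[2]);
    }
}
```

Calendar silently picks one convention (clamp to the last day of the shorter month); another library, or another person, might reasonably pick a different one.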

Julian Bucknall is a guy who studies algorithms, at least as a hobby. He has a discussion of time reckoning in software including a sample implementation in C#. Although I appreciate his discussion (and created a few new unit tests based upon some of the problematic date ranges he presents), I don’t entirely agree with how he did his implementation. I happen to be using Java for my purposes, but I did my own implementation because I needed to, not because I’m just a Java wonk.

Before I start, those without a programming background have to realize that most programming languages have very poor tools for handling dates. Mostly they center around counting milliseconds since a certain date (usually 1970-01-01). This is great for quick calculations of numbers of days between events, since a day has a fixed number of milliseconds (1000 ms/sec * 60 sec/min * 60 min/hr * 24 hr/day = 86400000 ms/day).

For those of you who are too smart for your own good, I’m going to be ignoring leap seconds and things like that for the time being, since computers generally don’t handle those, anyway. If you want your computer’s time to be correct to the nearest leap-second mandated by the IERS, you should just manually adjust your clock whenever it’s convenient… no date library is going to worry about keeping a list of all leap-seconds ever added to civil time.

So, back to dates in software. Since the number of milliseconds in a day is fixed, and computers often represent dates as a number of milliseconds from a fixed date (generally known as the epoch), it’s very easy to calculate the difference between two dates as a number of days. For example, I was born on 1977-10-27. That means that I am 10146 days old (wow, that doesn’t seem like a lot…). But how many years, months, and days old am I?
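The day count is easy to check with exactly that millisecond arithmetic. In this sketch both dates are pinned to midnight UTC so that daylight-saving shifts can’t sneak in; the class and helper names are hypothetical.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.TimeZone;

public class DaysBetween {
    // 1000 ms/sec * 60 sec/min * 60 min/hr * 24 hr/day
    static final long MS_PER_DAY = 1000L * 60 * 60 * 24;

    // Milliseconds since the epoch for midnight UTC on the given date (month is 1-based).
    static long utcMidnight(int year, int month, int day) {
        Calendar c = new GregorianCalendar(TimeZone.getTimeZone("UTC"));
        c.clear();
        c.set(year, month - 1, day);
        return c.getTimeInMillis();
    }

    static long daysBetween(long earlierMs, long laterMs) {
        return (laterMs - earlierMs) / MS_PER_DAY;
    }

    public static void main(String[] args) {
        long born = utcMidnight(1977, 10, 27);
        long today = utcMidnight(2005, 8, 7);
        System.out.println(daysBetween(born, today)); // 10146
    }
}
```

One subtraction and one division: that is why days are the easy case, and why years and months are where the trouble starts.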

Fortunately, for discussion purposes, I’m writing this entry on 2005-08-07, which has both the day-of-month, as well as the month itself, less than the same numbers in my birth date (that is, 8 is less than 10, and 7 is less than 27). That’s good because it makes the math harder. If I had been born on 1977-08-01, then you could count on your fingers that I am 28 years, 0 months, and 6 days old. Since I was born later in the month and later in the year, there are all kinds of fun things that have to happen.

If you were to perform these calculations on your fingers, you’d probably start with the birth date and keep adding years until you couldn’t add them anymore without going over. You’d easily get to 27 and stop (if you had that many fingers). But then, you have to figure out what the differences are between the months and days. Exactly 27 years after my birth would be 2004-10-27. In order to get yourself to 2005-08-07, you need to add a bunch of months. If you add 10 months, you’ll get 2005-08-27, which is too much. So, you have to add 9 months instead, and then figure the days. Exactly 27 years and 9 months after my birth would be 2005-07-27. In order to get to today, you have to add days. If you add 11 days, you’ll get to 2005-08-07. Ta-da!
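Here is a rough Java sketch of that on-your-fingers procedure. It is my own illustration of the counting approach, not a port of anyone’s published implementation, and every name in it is hypothetical. Each candidate date is rebuilt from the birth date so that Calendar’s end-of-month clamping can’t accumulate between steps.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.TimeZone;

public class FingerCount {
    // Builds a UTC calendar date (month is 1-based here).
    static Calendar date(int year, int month, int day) {
        Calendar c = new GregorianCalendar(TimeZone.getTimeZone("UTC"));
        c.clear();
        c.set(year, month - 1, day);
        return c;
    }

    // A copy of `start` moved forward by the given years, months and days.
    static Calendar shifted(Calendar start, int years, int months, int days) {
        Calendar c = (Calendar) start.clone();
        c.add(Calendar.YEAR, years);
        c.add(Calendar.MONTH, months);
        c.add(Calendar.DAY_OF_MONTH, days);
        return c;
    }

    // Returns {years, months, days}: keep adding one more of each unit
    // until the next step would go past `today`.
    static int[] age(Calendar birth, Calendar today) {
        int years = 0, months = 0, days = 0;
        while (!shifted(birth, years + 1, 0, 0).after(today)) years++;
        while (!shifted(birth, years, months + 1, 0).after(today)) months++;
        while (!shifted(birth, years, months, days + 1).after(today)) days++;
        return new int[] { years, months, days };
    }

    public static void main(String[] args) {
        int[] a = age(date(1977, 10, 27), date(2005, 8, 7));
        // 27 years, 9 months, 11 days
        System.out.println(a[0] + " years, " + a[1] + " months, " + a[2] + " days");
    }
}
```

It gets the right answer for the example, but it clones and re-adds a calendar on every step of every loop, which is the kind of repeated work the counting approach invites.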

Now, that didn’t seem too bad, did it? Actually, an implementation which basically follows this on-your-fingers calculation is the one proposed by Julian Bucknall as well as many others on the web. I don’t like this implementation because it is computational overkill (you have to do lots of looping, and most Date object implementations that exist out there will re-calculate a bunch of stuff whenever you update a single field, such as the year or month). I actually wrote mine before I read his article, and I don’t have a C# compiler handy to run his algorithm through my test cases, so I can’t be sure that they yield the same results. At any rate, I have an implementation that should be a little more efficient and meets my needs.

Oh, one last note: we had been using a Java library called BigDate to do our date calculations. I knew it was going to be a pain in the neck to write our own, so we found a library that would do it for us. Unfortunately, it fails with Java Date objects representing dates before 1970-01-01. The author claims that his library handles dates prior to 1970 in contrast to Java’s Date, but it appears that he is wrong on two counts: Java’s Date class does, in fact, handle dates before 1970, and his library trips over them. I was able to use his library by passing-in the year, month, and date separately, but that required me to use deprecated methods in the Date API, and I was already starting to look down my nose at it, slightly. Just for the heck of it, I tried to use BigDate to calculate the date delta between a BCE date and today, and BigDate ignored the era, so I got the wrong answers there, too.
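The claim about java.util.Date is easy to check: dates before 1970 are just negative millisecond counts. A minimal sketch, with hypothetical class and method names and UTC chosen to keep the arithmetic unambiguous:

```java
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.TimeZone;

public class PreEpochDate {
    // Year (in UTC) of the instant represented by a java.util.Date.
    static int utcYear(Date date) {
        Calendar c = new GregorianCalendar(TimeZone.getTimeZone("UTC"));
        c.setTime(date);
        return c.get(Calendar.YEAR);
    }

    public static void main(String[] args) {
        // Exactly one day before the epoch.
        Date dayBefore = new Date(-86400000L);
        System.out.println(dayBefore.getTime() < 0); // true
        System.out.println(utcYear(dayBefore));      // 1969
    }
}
```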

So, I wrote my own implementation (in Java) that quickly calculates deltas for all three fields (I’m not concerned with time, just the date), possibly adjusts them for BCE dates, and then runs a fairly simple algorithm to move the date, then month and year to their correct values. We use a class called DiffDate which just stores a year, month, and date as a return value. I have one method that accepts a pair of Date objects, and one that accepts a pair of Calendars. Use of the Calendar avoids deprecation warnings during compilation, and offers two methods for client code, making it easier to use in situations that call for either Dates or Calendars.

    //
    // Copyright and licence notice: I intend for this code to be freely copied, edited, improved, etc.
    // Please give me (Chris Schultz, http://www.christopherschultz.net/) credit as the source of
    // this code, and let me know if you find ways to improve it.
    //
    public static DiffDate diffDates(Date earlier, Date later)
    {
      Calendar c_e = Calendar.getInstance();
      c_e.setTime(earlier);
      Calendar c_l = Calendar.getInstance();
      c_l.setTime(later);
      return diff(c_e, c_l);
    }

    public static DiffDate diff(Calendar earlier, Calendar later)
    {
      int y1 = earlier.get(Calendar.YEAR);
      int m1 = earlier.get(Calendar.MONTH);
      int d1 = earlier.get(Calendar.DATE);
      int y2 = later.get(Calendar.YEAR);
      int m2 = later.get(Calendar.MONTH);
      int d2 = later.get(Calendar.DATE);

      // Adjust years across eras (BC dates should be negative, here).
      if(java.util.GregorianCalendar.BC == earlier.get(Calendar.ERA))
        y1 = -y1;
      if(java.util.GregorianCalendar.BC == later.get(Calendar.ERA))
        y2 = -y2;

      int d_y = y2 - y1;
      int d_m = m2 - m1;
      int d_d;

      // Now that we've got deltas, start with the days and work backward
      // changing any negatives into positives, and rippling up to larger
      // fields.
      if(d2 >= d1)
      {
        d_d = d2 - d1; // Easy
      }
      else
      {
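        // Scratch calendar used only to look up month lengths. Pin it to
        // the 1st so that changing the month can never roll over into the
        // following month (e.g. "February 30th" becoming March 1st).
        Calendar work = (Calendar)later.clone();
        work.set(Calendar.DAY_OF_MONTH, 1);
        while(d1 > d2)
        {
          // Move backward through the months, adding a whole month
          // until we have enough days to cover the deficit.
          --m2;    // To track our progress through the months
          --d_m;   // Now, there's one less month between dates
          if(0 > m2)
          {
            // Wrapped into the previous year. d_y is not touched here:
            // d_m has gone negative, so the month/year adjustment below
            // borrows the year. Decrementing it here as well would count
            // the same year twice.
            work.set(Calendar.YEAR, work.get(Calendar.YEAR) - 1);
            m2 = Calendar.DECEMBER;
          }

          work.set(Calendar.MONTH, m2);
          d2 += work.getActualMaximum(Calendar.DAY_OF_MONTH);
        }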

        d_d = d2 - d1;
      }

      // Adjust the months and years
      while(0 > d_m)
      {
        d_m += 12;
        d_y -= 1;
      }

      return new DiffDate(d_y, d_m, d_d);
    }

The whole thing is very straightforward, with the notable exception of the big “else” block in the middle of the code. It is here where we handle cases when the earlier date has a day-of-month that is later in the month than the later date. In that case, we need to count backwards, enlisting the help of a Calendar object to give me the lengths of various months. That ‘work’ calendar actually exists only to help me with leap-year determination. I suppose I could have used the old “years evenly divisible by 4, except every 100, except every 400”, but that would have complicated my code even further, and, I think, been inaccurate for old dates because of changes to the calendar. Then again, I think that GregorianCalendar (the default calendar in my locale) has those same rules, so I’d get the same results in both cases. If you want to calculate dates in October of 1582, you’re on your own.
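For reference, the “divisible by 4, except every 100, except every 400” rule is a one-liner, and for modern years it agrees with GregorianCalendar.isLeapYear. The class and method names below are hypothetical.

```java
import java.util.GregorianCalendar;

public class LeapYearRule {
    // Divisible by 4, except century years, except century years divisible by 400.
    static boolean isLeapYear(int year) {
        return (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
    }

    public static void main(String[] args) {
        GregorianCalendar calendar = new GregorianCalendar();
        int[] years = { 1900, 2000, 2004, 2005 };
        for (int year : years) {
            // Print the hand-rolled answer next to the library's answer.
            System.out.println(year + ": " + isLeapYear(year)
                + " / " + calendar.isLeapYear(year));
        }
    }
}
```

The two only part ways for years before the 1582 switch-over, where GregorianCalendar falls back to the Julian every-fourth-year rule.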

You may have noticed, but this implementation does not handle time zones in any way. The reason is that this is intended to be for age calculation. If you were born in Sydney on 2000-01-01, then it might still have been 1999-12-31 in New York. However, you’re certainly not going to maintain your birthday to be 1999-12-31 when you’re in the US and 2000-01-01 when you’re in Sydney. Or, at least, we won’t ;)
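The Sydney example can be seen directly by viewing one instant through two time zones. The 00:30 birth time is invented for the demonstration, and the class and method names are hypothetical; the zone IDs are the standard tz database names that ship with the JDK.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.TimeZone;

public class BirthdayZones {
    // A made-up birth instant: half past midnight on 2000-01-01, Sydney time.
    static Calendar sydneyBirth() {
        Calendar c = new GregorianCalendar(TimeZone.getTimeZone("Australia/Sydney"));
        c.clear();
        c.set(2000, Calendar.JANUARY, 1, 0, 30);
        return c;
    }

    // Returns {year, month (1-based), day} of the same instant as seen in another zone.
    static int[] dateIn(Calendar instant, String zoneId) {
        Calendar c = new GregorianCalendar(TimeZone.getTimeZone(zoneId));
        c.setTimeInMillis(instant.getTimeInMillis());
        return new int[] {
            c.get(Calendar.YEAR), c.get(Calendar.MONTH) + 1, c.get(Calendar.DAY_OF_MONTH)
        };
    }

    public static void main(String[] args) {
        int[] ny = dateIn(sydneyBirth(), "America/New_York");
        System.out.println(ny[0] + "-" + ny[1] + "-" + ny[2]); // 1999-12-31
    }
}
```

For age calculation, the date on the birth certificate is the one that counts, so the diff code works on plain year, month, and day fields and leaves zones out of it.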

It occurs to me that I’d like to write an entirely new Date implementation for Java, to handle things like bizarre missing dates (like October 1582) and a few other things that bother me about the Date class, but it’s just not going to happen. There are too many APIs that already use Date (or Calendar) and they’re not likely to change. Also, one of the things that I haven’t liked about the APIs is that they were able to neither calculate nor store delta dates. I have solved both with a delta date implementation and a simple delta date class.
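The DiffDate class itself isn’t shown in this post. A minimal sketch of what such a holder might look like follows; only the class name and the three-int constructor come from the code above, while the field names, accessors, and toString format are assumptions for illustration.

```java
public class DiffDate {
    // Assumed field names; the real class may differ.
    private final int years;
    private final int months;
    private final int days;

    // Matches the call "new DiffDate(d_y, d_m, d_d)" in the diff method.
    public DiffDate(int years, int months, int days) {
        this.years = years;
        this.months = months;
        this.days = days;
    }

    public int getYears()  { return years; }
    public int getMonths() { return months; }
    public int getDays()   { return days; }

    // Hypothetical display format.
    public String toString() {
        return years + " years, " + months + " months, " + days + " days";
    }

    public static void main(String[] args) {
        System.out.println(new DiffDate(27, 9, 11)); // 27 years, 9 months, 11 days
    }
}
```

Keeping it immutable means a delta can be handed around and displayed without anyone accidentally changing it.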

So, how old are you, exactly? My code says that I’m 27 years, 9 months, and 11 days old. But I feel much younger than that.