Sunday, June 9, 2013

Unicode

One of the requirements in my current project is the use of a few scientific symbols. They're Unicode characters, such as the degree sign (°), or the macrons in Māori place names. Simple, right?

Not simple. A JSP application has two places where it needs to declare that the content is UTF-8. First, you need to add this page directive to the top of every JSP, which sets the Content-Type in the HTTP header:

<%@ page contentType="text/html" pageEncoding="UTF-8" %>

Then you need a Spring filter in web.xml, which sets the character encoding on each request (and, with forceEncoding, on the response) so that submitted form data is decoded as UTF-8:

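 <filter>
        <filter-name>encodingFilter</filter-name>
        <filter-class>org.springframework.web.filter.CharacterEncodingFilter</filter-class>
        <init-param>
            <param-name>encoding</param-name>
            <param-value>UTF-8</param-value>
        </init-param>
        <init-param>
            <param-name>forceEncoding</param-name>
            <param-value>true</param-value>
        </init-param>
 </filter>
 <!-- Assumed mapping: the filter does nothing until it is mapped; here it covers every request -->
 <filter-mapping>
        <filter-name>encodingFilter</filter-name>
        <url-pattern>/*</url-pattern>
 </filter-mapping>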

So far so good; POST data seems to arrive properly encoded at the servlet. Then I sent it to the Oracle database, and lo! The database does not like it. It accepts the data, but returns random ASCII characters when the data is fetched again. Oracle is ridiculous; we all know that, but it is so incredibly ridiculous that to change the character set, we would need to completely re-install Oracle (!!). That's not going to happen. Possible things to try include:

  • Use NVARCHAR2 columns (not recommended), and call pstmt.setFormOfUse(2, oracle.jdbc.OraclePreparedStatement.FORM_NCHAR) for every such field you bind. The method setFormOfUse(...) is an Oracle extension to JDBC. Alternatively you can use JDBC 4.0's setNString() and getNString() methods.
  • There's a JDBC environment setting (-Doracle.jdbc....) somewhere for the Oracle JDBC thin drivers. The OCI drivers might work better. Newer versions might work better. I thought this was too painful to investigate.
  • Don't use Strings, but rather use BLOBs or raw data. Seriously; I was tempted.

Or... just store HTML in the database. You can encode Unicode characters using org.apache.commons.lang.StringEscapeUtils.escapeHtml(), which makes them ASCII, and then use unescapeHtml() to restore them, or just send it straight to the browser and let the browser do it.
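To show roughly what that escaping does, here is a minimal sketch that avoids the commons-lang dependency. The class name HtmlAscii and its escape() method are made up for illustration; the real escapeHtml() may emit named entities such as &deg; where this sketch always uses numeric ones.

```java
// Hypothetical stand-in for StringEscapeUtils.escapeHtml(); not the library's code.
final class HtmlAscii {

    /** Escapes markup characters and turns every non-ASCII code point into a numeric entity. */
    static String escape(String text) {
        StringBuilder out = new StringBuilder(text.length() * 2);
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp); // step over surrogate pairs as one code point
            switch (cp) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                default:
                    if (cp < 0x80) {
                        out.append((char) cp);                   // plain ASCII passes through
                    } else {
                        out.append("&#").append(cp).append(';'); // e.g. U+00B0 becomes &#176;
                    }
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // \u0101 is a-macron, \u00B0 is the degree sign
        System.out.println(escape("M\u0101ori 25\u00B0C")); // prints M&#257;ori 25&#176;C
    }
}
```

The result is pure 7-bit ASCII, so it survives a database that mangles anything else, and a browser renders it without any unescaping step.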

Except for newline characters. Oh how I hate them. Sometimes they're one character, sometimes they're two; browsers ignore them, except when they're in a <textarea> or <pre>. So currently I'm converting newline character pairs into <br />, and back again.
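The conversion can be sketched as below. The class and method names are hypothetical, and this version treats CRLF, a lone CR and a lone LF all as one line break, which is an assumption on my part rather than a description of the project's actual code.

```java
// Hypothetical helper for round-tripping line breaks between a <textarea> and stored HTML.
final class LineBreaks {

    /** Replaces each CRLF pair, lone CR or lone LF with a single <br />. */
    static String toHtml(String text) {
        // "\r\n" is listed first so a CRLF pair becomes one break, not two
        return text.replaceAll("\\r\\n|\\r|\\n", "<br />");
    }

    /** Turns <br>, <br/> and <br /> back into a plain LF for editing. */
    static String fromHtml(String html) {
        return html.replaceAll("<br\\s*/?>", "\n");
    }
}
```

If the text is also HTML-escaped, the escaping has to happen before toHtml(), otherwise the inserted <br /> tags get escaped as well.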

This leads to yet another problem: you can add a "maxlength" attribute to a textarea, but that counts characters before they're converted into 7-bit clean HTML. Once they hit the database, chances are they will be longer than the field size, meaning that you need to return a recoverable error nicely to the user, and also provision about twice as much space as you think you'll need for these fields.
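One way to catch the overflow before the database does is to compute the stored length up front. This is only a sketch: the names are invented, and it assumes numeric entities (&#NNN;) for every non-ASCII character, so the count will differ if your escaper emits named entities.

```java
// Hypothetical pre-insert check: how long will the text be once it is escaped?
final class EscapedLength {

    /** Length of the text after escaping markup and non-ASCII characters as entities. */
    static int escapedLength(String text) {
        int length = 0;
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            if (cp == '&') {
                length += 5;                                   // &amp;
            } else if (cp == '<' || cp == '>') {
                length += 4;                                   // &lt; or &gt;
            } else if (cp == '"') {
                length += 6;                                   // &quot;
            } else if (cp < 0x80) {
                length += 1;                                   // plain ASCII
            } else {
                length += 3 + Integer.toString(cp).length();   // "&#" + digits + ";"
            }
        }
        return length;
    }

    /** True if the escaped text fits in a column of the given size. */
    static boolean fits(String text, int columnSize) {
        return escapedLength(text) <= columnSize;
    }
}
```

For example, the 10-character string "M\u0101ori 25\u00B0C" needs 20 characters once escaped, which is where the "about twice" rule of thumb comes from; text that is mostly non-ASCII grows far more than that.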

Why can't it all "just work"? Why can't everything be UTF-8, by default, and just work? Why couldn't HTML choose a newline character that would always appear in the output rather than using <p> and <br /> everywhere? People can use word-wrapping text editors, surely. Why couldn't they make text editors smarter and use HTML like the rest of the browser does?

Sunday, July 11, 2010

A Unix joke

Unix (well, Linux) is great. If it isn't obscure or obfuscated, it's just plain tricky.

I was just making a new user (for a system process) and I wanted to set some environment variables, so I did:

cp -a /etc/skel/.* .

...and I ended up with a copy of /etc! I was confounded, until a very observant colleague pointed out that ".*" will also match ".."!

From now on, I'm going to be doing an "ls -ld" with my path names before I do a "rm -rf"!! Also, never do "rm -rf .*" to remove all hidden files. A pattern such as ".[!.]*" is safer, because it skips both "." and "..".

Sunday, May 9, 2010

Java and Object.

When is an object not an object?

When it's a Java primitive data type.

One of the nice things about Smalltalk is that everything is an Object. Every object in a Smalltalk system understands "printString" or "copy". But in Java, integers and booleans, two of the most commonly used objects (which they aren't...) have to be handled completely differently.

Neither integers nor booleans have any methods whatsoever. You can't do "1.toString()". You can't do "1.compareTo(2)". I find this annoying.

Java tries to remedy this by providing Object wrappers such as Integer, Boolean and so forth for the primitive types. Then they hacked the language to automatically convert between the primitive and the wrapper (autoboxing). The wrappers, however, aren't transparent and are a leaky abstraction.
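A small sketch of where the abstraction leaks (the class and method names are made up for illustration): "==" on wrappers compares references rather than values, and unboxing a null wrapper blows up at runtime.

```java
// Hypothetical demo class; each method isolates one autoboxing surprise.
final class BoxingLeaks {

    /** Compares references, not values: only reliable for small cached values. */
    static boolean sameReference(Integer a, Integer b) {
        return a == b;
    }

    /** Compares values, which is what you nearly always want. */
    static boolean sameValue(Integer a, Integer b) {
        return a.equals(b);
    }

    /** Unboxing calls intValue() behind the scenes, so a null wrapper throws. */
    static boolean unboxingThrows(Integer boxed) {
        try {
            int value = boxed; // implicit boxed.intValue()
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(sameReference(127, 127));   // true: values from -128 to 127 are cached
        System.out.println(sameReference(1000, 1000)); // usually false, depending on the JVM's cache
        System.out.println(sameValue(1000, 1000));     // true
        System.out.println(unboxingThrows(null));      // true
    }
}
```

None of this can happen with a plain int, which is exactly why the wrapper and the primitive are not interchangeable.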

Take, for example, SQL PreparedStatements. I find myself doing:

q.setString(1, "hello");
q.setFloat(2, 1.25f);
and then:
q.setInteger(3, 45);

But there is no "setInteger" method on a PreparedStatement. It is "setInt", probably because it is passed an "int" rather than an "Integer" as a parameter.
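You can check the naming without a database by asking the interface itself. This is just a reflection sketch with an invented class name; it lists the public "set..." methods that java.sql.PreparedStatement declares or inherits.

```java
import java.lang.reflect.Method;
import java.sql.PreparedStatement;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper that collects the setter names on PreparedStatement.
final class SetterNames {

    /** Returns the sorted names of all public methods starting with "set". */
    static Set<String> setterNames() {
        Set<String> names = new TreeSet<>();
        for (Method method : PreparedStatement.class.getMethods()) {
            if (method.getName().startsWith("set")) {
                names.add(method.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        Set<String> names = setterNames();
        System.out.println(names.contains("setInt"));     // true
        System.out.println(names.contains("setInteger")); // false
    }
}
```

The setters are named after the primitive (setInt, setBoolean, setFloat) or the class (setString, setObject), so the name you need depends on which side of the primitive/object divide the type sits.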

Tuesday, May 4, 2010

HTML Checkboxes.

From the HTML 4.01 specification:
checkboxes
Checkboxes (and radio buttons) are on/off switches that may be toggled by the user. A switch is "on" when the control element's checked attribute is set. When a form is submitted, only "on" checkbox controls can become successful.

I kid you not. If you "uncheck" a checkbox, it will not be submitted to your web application. Your application would need to know which checkboxes were on the page. If your application generated those checkboxes from a source that it can't reproduce (or doesn't want to, considering how much effort that is), your application can't determine which checkboxes had their values changed from checked to unchecked.

There are two alternatives I know of: use drop-down lists, or make your own checkboxes with JavaScript and images.
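A third option worth sketching is a hidden marker field: next to each checkbox named, say, "news", the page also emits a hidden input named "_news", which is always submitted. The server can then tell "unchecked" from "never on the page". Everything below (class name, the "_" prefix, the field names) is an assumption for illustration; as far as I know, Spring MVC's form tags use a similar underscore-prefixed marker.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical server-side helper; params has the shape of ServletRequest.getParameterMap().
final class CheckboxMarkers {

    static final String MARKER_PREFIX = "_"; // assumed naming convention for the hidden fields

    /** Maps each checkbox that was on the page to whether it was submitted as checked. */
    static Map<String, Boolean> checkboxStates(Map<String, String[]> params) {
        Map<String, Boolean> states = new LinkedHashMap<>();
        for (String name : params.keySet()) {
            if (name.startsWith(MARKER_PREFIX) && name.length() > MARKER_PREFIX.length()) {
                String checkbox = name.substring(MARKER_PREFIX.length());
                // The marker proves the checkbox existed; its own parameter is present only when checked
                states.put(checkbox, params.containsKey(checkbox));
            }
        }
        return states;
    }
}
```

With "_news", "news" and "_spam" submitted, the result is news=true and spam=false, without the application having to regenerate the list of checkboxes.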

Monday, May 3, 2010

Installing Eclipse

Eclipse: oh yes, the great productivity tool of Java programmers world-wide. It's big, it's slow, and if you install the wrong plug-in, it'll become unstable and you have to reinstall it.

A couple of weeks ago, I was fed up with my installation of Eclipse. It wasn't working; every few minutes, it would pop up an error. If I dismissed that error (which was some internal error in Eclipse), it would pop up again a few minutes later. On top of that, the GTK widgets had all been broken since I upgraded to Ubuntu 9.04.

So I reinstalled. Sure, reinstalling works fine after you've downloaded 224 friggen' megabytes. Eclipse is just a glorified text editor; how the heck does it get that big?

After the download, it didn't have the two main features I needed: Subversion connectivity, and a JSP editor. But that's okay. Eclipse has a fantastic feature: you can go to "Help" and then "Install New Software". If you enjoy spraying acid on your raw wounds, you'll enjoy trying to use this, the world's most infuriating installation tool.

So you open up this dialog. It says "There is no site selected". Okay - I can handle this. I pull down the lists of sites and... I... er... what the heck are all these URLs? I want a JSP page editor. So I Google. I take an educated guess... I think I want the Web Tools Platform, but I have no idea which component provides the JSP page editor. Furthermore, when I do try to install something, it gives a meaningless error about a missing dependency:


Cannot complete the install because one or more required items could not be found.
Software being installed: Eclipse Web Developer Tools 3.1.1.v200908120400-7R77FStEVw2z07WtDz-OZrhL5C-3 (org.eclipse.wst.web_ui.feature.feature.group 3.1.1.v200908120400-7R77FStEVw2z07WtDz-OZrhL5C-3)
Missing requirement: org.eclipse.wst.server.discovery 1.0.1.v20100210 requires 'bundle org.eclipse.equinox.p2.engine [1.0.100,2.0.0)' but it could not be found
Cannot satisfy dependency:
From: WST Server UI 3.1.1.v20090727b-7B3DPAtJZRZxULQRALIM4256 (org.eclipse.wst.server_ui.feature.feature.group 3.1.1.v20090727b-7B3DPAtJZRZxULQRALIM4256)
To: org.eclipse.wst.server.discovery [1.0.1.v20100210]
Cannot satisfy dependency:
From: Eclipse Web Developer Tools 3.1.1.v200908120400-7R77FStEVw2z07WtDz-OZrhL5C-3 (org.eclipse.wst.web_ui.feature.feature.group 3.1.1.v200908120400-7R77FStEVw2z07WtDz-OZrhL5C-3)
To: org.eclipse.wst.server_ui.feature.feature.group [3.1.1.v20090727b-7B3DPAtJZRZxULQRALIM4256]


So what the heck am I meant to install to make this POS install? What's more, I don't even know which version of Eclipse I'm using (Ganymede? Equinox?) because there isn't anywhere in Eclipse that I can find this information.

Eclipse is the crappiest POS I've ever had the discomfort of using. It is a bloated 224MB pig that doesn't even have a basic JSP page editor or Subversion connector out of the box. Surely these are two very commonly used components?