MS SQL Server, JDBC, and Unicode?

T

Tom Anderson

Greets yalls,

Has anyone made SQL Server work with unicode in java?

I'm working on a system which wants to put unicode in a database. It does
this fine with Oracle, but we haven't been able to make it do so when the
database is SQL Server - and indeed the manufacturers of the system list
this as something that it won't do. Anything that isn't on the current
code page turns into a question mark.

Our columns are nvarchar, and sendStringParametersAsUnicode is true in the
JDBC URL. Is there more than this we need to do?

I've come across mention of a syntax which looks like N'this is a unicode
string' for writing unicode literals in SQL. Do i need to do that? How do
i do that if i'm using PreparedStatements?

We're using the MS driver. An alternative would be the open source jTDS -
any idea if that will fix the problem?

Thanks,
tom
 
J

joe.no_junk

Greets yalls,

Has anyone made SQL Server work with unicode in java?

I'm working on a system which wants to put unicode in a database. It does
this fine with Oracle, but we haven't been able to make it do so when the
database is SQL Server - and indeed the manufacturers of the system list
this as something that it won't do. Anything that isn't on the current
code page turns into a question mark.

Our columns are nvarchar, and sendStringParametersAsUnicode is true in the
JDBC URL. Is there more than this we need to do?

I've come across mention of a syntax which looks like N'this is a unicode
string' for writing unicode literals in SQL. Do i need to do that? How do
i do that if i'm using PreparedStatements?

We're using the MS driver. An alternative would be the open source jTDS -
any idea if that will fix the problem?

Thanks,
tom

Which driver are you using now? By default, the driver will send
string data (parameter values) to the DBMS as 16-bit characters,
as you'd want. Are you saying that you have good strings in your
Java client, but on insert, the DBMS has it wrong? Or are you saying
that the DBMS can have good data, but on extracting, it's bad, or
that insert+extract gets bad stuff? Show the JDBC code.
 
A

Arne Vajhøj

Tom said:
Has anyone made SQL Server work with unicode in java?

I'm working on a system which wants to put unicode in a database. It
does this fine with Oracle, but we haven't been able to make it do so
when the database is SQL Server - and indeed the manufacturers of the
system list this as something that it won't do. Anything that isn't on
the current code page turns into a question mark.

Our columns are nvarchar, and sendStringParametersAsUnicode is true in
the JDBC URL. Is there more than this we need to do?

I've come across mention of a syntax which looks like N'this is a
unicode string' for writing unicode literals in SQL. Do i need to do
that? How do i do that if i'm using PreparedStatements?

We're using the MS driver. An alternative would be the open source jTDS
- any idea if that will fix the problem?

I can't get it not to work.

:)

The following is tested with the MS driver (driver for 2000
against 2000, but I expect 2005 against 2005 to work identical):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.Statement;

public class Unicode {
public static void main(String[] args) throws Exception {
Class.forName("com.microsoft.jdbc.sqlserver.SQLServerDriver");
// SQLServer 2000
Connection con =
DriverManager.getConnection("jdbc:microsoft:sqlserver://localhost;DatabaseName=Test",
"sa", "");
Statement stmt = con.createStatement();
stmt.executeUpdate("CREATE TABLE unifun (id INTEGER NOT NULL,
data NVARCHAR(50), PRIMARY KEY(id))");
stmt.executeUpdate("INSERT INTO unifun VALUES(1,N'ÆØÅæøå the
wrong way')");
PreparedStatement pstmt = con.prepareStatement("INSERT INTO
unifun VALUES(?,?)");
pstmt.setInt(1, 2);
pstmt.setString(2, "ÆØÅæøå the correct way");
pstmt.executeUpdate();
ResultSet rs = stmt.executeQuery("SELECT id,data FROM unifun");
while(rs.next()) {
System.out.println(rs.getInt(1) + " : " + rs.getString(2));
}
rs.close();
stmt.executeUpdate("DROP TABLE unifun");
stmt.close();
con.close();
}
}

Arne
 
E

Erland Sommarskog

Arne said:
The following is tested with the MS driver (driver for 2000
against 2000, but I expect 2005 against 2005 to work identical):

I don't know much about JDBC, but I don't think Microsoft JDBC driver
for SQL 2000 has a very good reputation. The SQL 2005 is likely to be
better.
Statement stmt = con.createStatement();
stmt.executeUpdate("CREATE TABLE unifun (id INTEGER NOT NULL,
data NVARCHAR(50), PRIMARY KEY(id))");
stmt.executeUpdate("INSERT INTO unifun VALUES(1,N'ÆØÅæøå the
wrong way')");
PreparedStatement pstmt = con.prepareStatement("INSERT INTO
unifun VALUES(?,?)");
pstmt.setInt(1, 2);
pstmt.setString(2, "ÆØÅæøå the correct way");
pstmt.executeUpdate();
ResultSet rs = stmt.executeQuery("SELECT id,data FROM unifun");
while(rs.next()) {
System.out.println(rs.getInt(1) + " : " + rs.getString(2));
}

First of all, are you saying that ÆØÅæøå does not come back correctly?
That would be even stranger, since I would expect your ANSI code page
to be 1252, in which case things cannot go wrong.

In any case, did you look in Management Studio or Query Analyzer to see how
the data looks like? That is, is data mutilated on input or on output?
You can also use Profiler to see what the driver sends to SQL Server.

--
Erland Sommarskog, SQL Server MVP, (e-mail address removed)

Links for SQL Server Books Online:
SQL 2008: http://msdn.microsoft.com/en-us/sqlserver/cc514207.aspx
SQL 2005: http://msdn.microsoft.com/en-us/sqlserver/bb895970.aspx
SQL 2000: http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx
 
T

Tom Anderson

Which driver are you using now?

One from Microsoft, downloaded a couple of weeks ago. I don't have access
to the machines with the driver on right now, so i can't be more specific,
i'm afraid.
By default, the driver will send string data (parameter values) to the
DBMS as 16-bit characters, as you'd want. Are you saying that you have
good strings in your Java client, but on insert, the DBMS has it wrong?
That.

Or are you saying that the DBMS can have good data, but on extracting,
it's bad, or that insert+extract gets bad stuff? Show the JDBC code.

I can't - i didn't write it, and don't have the source. This is the crux
of the problem - i don't know if the problem is in SQL Server and/or its
drivers, or the way the web app is using it. But if it's doing it wrong,
why does it work with Oracle? It's a mystery!

tom
 
T

Tom Anderson

Tom said:
Has anyone made SQL Server work with unicode in java?

I can't get it not to work.

:)

The following is tested with the MS driver (driver for 2000
against 2000, but I expect 2005 against 2005 to work identical):

public class Unicode {
public static void main(String[] args) throws Exception {
Class.forName("com.microsoft.jdbc.sqlserver.SQLServerDriver"); // SQLServer 2000
Connection con = DriverManager.getConnection("jdbc:microsoft:sqlserver://localhost;DatabaseName=Test", "sa", "");
Statement stmt = con.createStatement();
stmt.executeUpdate("CREATE TABLE unifun (id INTEGER NOT NULL, data NVARCHAR(50), PRIMARY KEY(id))");
stmt.executeUpdate("INSERT INTO unifun VALUES(1,N'?????? the wrong way')");
PreparedStatement pstmt = con.prepareStatement("INSERT INTO unifun VALUES(?,?)");
pstmt.setInt(1, 2);
pstmt.setString(2, "?????? the correct way");
pstmt.executeUpdate();
ResultSet rs = stmt.executeQuery("SELECT id,data FROM unifun");
while(rs.next()) {
System.out.println(rs.getInt(1) + " : " + rs.getString(2));
}
rs.close();
stmt.executeUpdate("DROP TABLE unifun");
stmt.close();
con.close();
}
}

Silly question, but those ?s were unicode characters before you pasted
this into usenet, right?

Thanks for doing this, Arne - i should probably have tried it myself. It
eliminates one area of doubt about the problem, but still leaves me none
the wiser as to why the system won't do unicode right. Maybe it's
constructing SQL strings internally, rather than using PreparedStatements,
and not using the N'?' syntax. I really don't think that's the case,
though - i've seen evidence from debugging and stack traces that
PreparedStatements are indeed used.

tom
 
G

Greg Miller

Silly question, but those ?s were unicode characters before you pasted
this into usenet, right?

Those unicode characters came through my news server just fine. Maybe
it's your news read that doesn't understand them.
 
A

Arne Vajhøj

Tom said:
Tom said:
Has anyone made SQL Server work with unicode in java?

I can't get it not to work.

:)

The following is tested with the MS driver (driver for 2000
against 2000, but I expect 2005 against 2005 to work identical):

public class Unicode {
public static void main(String[] args) throws Exception {
Class.forName("com.microsoft.jdbc.sqlserver.SQLServerDriver");
// SQLServer 2000
Connection con =
DriverManager.getConnection("jdbc:microsoft:sqlserver://localhost;DatabaseName=Test",
"sa", "");
Statement stmt = con.createStatement();
stmt.executeUpdate("CREATE TABLE unifun (id INTEGER NOT NULL,
data NVARCHAR(50), PRIMARY KEY(id))");
stmt.executeUpdate("INSERT INTO unifun VALUES(1,N'?????? the
wrong way')");
PreparedStatement pstmt = con.prepareStatement("INSERT INTO
unifun VALUES(?,?)");
pstmt.setInt(1, 2);
pstmt.setString(2, "?????? the correct way");
pstmt.executeUpdate();
ResultSet rs = stmt.executeQuery("SELECT id,data FROM unifun");
while(rs.next()) {
System.out.println(rs.getInt(1) + " : " + rs.getString(2));
}
rs.close();
stmt.executeUpdate("DROP TABLE unifun");
stmt.close();
con.close();
}
}

Silly question, but those ?s were unicode characters before you pasted
this into usenet, right?

Yes.

I am surprised that they did no go through.
Thanks for doing this, Arne - i should probably have tried it myself. It
eliminates one area of doubt about the problem, but still leaves me none
the wiser as to why the system won't do unicode right. Maybe it's
constructing SQL strings internally, rather than using
PreparedStatements, and not using the N'?' syntax. I really don't think
that's the case, though - i've seen evidence from debugging and stack
traces that PreparedStatements are indeed used.

We will need more info to trouble shoot.

code
what is being inserted
what is being selected out

Arne
 
A

Arne Vajhøj

Greg said:
Those unicode characters came through my news server just fine. Maybe
it's your news read that doesn't understand them.

8 bit should not be a problem in usenet posts, but things happen.

It would be easy to replace them with \unnnn though.

Arne
 
A

Arne Vajhøj

Erland said:
I don't know much about JDBC, but I don't think Microsoft JDBC driver
for SQL 2000 has a very good reputation. The SQL 2005 is likely to be
better.

Yes.

But if it works with the bad driver, then it is a fair assumption that
it will also work with the good driver.

But not the other way around.
First of all, are you saying that ÆØÅæøå does not come back correctly?

No.

I am saying that it work perfectly for me.

With the intention of getting the OP to check differences between
what he is doing and what I am doing.
That would be even stranger, since I would expect your ANSI code page
to be 1252, in which case things cannot go wrong.

Things can always go wrong.

But you think the test would be better with a unicode value > 255 ?

Arne
 
A

Arne Vajhøj

Arne said:
Things can always go wrong.

But you think the test would be better with a unicode value > 255 ?

Just tried with:

stmt.executeUpdate("INSERT INTO unifun
VALUES(1,N'\u20AC\u20AC\u20AC the wrong way')");
PreparedStatement pstmt = con.prepareStatement("INSERT INTO
unifun VALUES(?,?)");
pstmt.setInt(1, 2);
pstmt.setString(2, "\u20AC\u20AC\u20AC the correct way");
pstmt.executeUpdate();
ResultSet rs = stmt.executeQuery("SELECT id,data FROM unifun");
while(rs.next()) {
System.out.println(rs.getInt(1) + " : " + rs.getString(2));
}

and it displays euro signs nicely.

Arne
 
E

Erland Sommarskog

Tom said:
Silly question, but those ?s were unicode characters before you pasted
this into usenet, right?

And even after. The characters displayed correctly in my newsreader, so
did Arne's name. So there seems to be a problem at your end.

Whether that says anything about your database problems I don't know.

--
Erland Sommarskog, SQL Server MVP, (e-mail address removed)

Links for SQL Server Books Online:
SQL 2008: http://msdn.microsoft.com/en-us/sqlserver/cc514207.aspx
SQL 2005: http://msdn.microsoft.com/en-us/sqlserver/bb895970.aspx
SQL 2000: http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx
 
T

Tom Anderson

And even after. The characters displayed correctly in my newsreader, so
did Arne's name. So there seems to be a problem at your end.

Whether that says anything about your database problems I don't know.

Touche.

Er, imagine an acute accent on the e there.

tom
 
T

Tom Anderson

Those unicode characters came through my news server just fine. Maybe
it's your news read that doesn't understand them.

True. Apologies for my mistaken assumption here.

tom
 
T

Tom Anderson

We will need more info to trouble shoot.

code
what is being inserted
what is being selected out

All of that is under the hood where i can't really get at it, sadly.

Although i could trap the queries and the results with the SQL Server
profiler.

And i could run the app under a debugger and breakpoint all the
interesting methods, to see what's actually being called.

If i can find time to work on this (full unicode support is not a high
priority right now), and if i get an answer, i'll post my findings here,
in case anyone's interested.

tom
 
A

Arne Vajhøj

Tom said:
All of that is under the hood where i can't really get at it, sadly.

Although i could trap the queries and the results with the SQL Server
profiler.

And i could run the app under a debugger and breakpoint all the
interesting methods, to see what's actually being called.

More info is definitely needed.

Right now it is like trying to catch a black cat in a dark room
blindfolded.

Arne
 
A

Arne Vajhøj

Dan said:
I want to second Erland's suggestion to capture the actual SQL with
Profiler. Also, check the JDBC sendStringParametersAsUnicode setting to
make sure it is set to true.

I would be worth trying.

Arne
 
L

Lew

Arne said:
Right now it is like trying to catch a black cat in a dark room
blindfolded.

Open can of tuna.
Place open tuna inside cloth bag.
Maintain loose grip on drawstring.
When you feel cat inside the bag, pull on drawstring, securing cat.

Simple.
 
A

Arne Vajhøj

Lew said:
Open can of tuna.
Place open tuna inside cloth bag.
Maintain loose grip on drawstring.
When you feel cat inside the bag, pull on drawstring, securing cat.

Simple.

Can you backtranslate analogu->real world ?

Arne
 
T

Tom Anderson

Can you backtranslate analogu->real world ?

Obviously, one would stuff the SQL Server machine with tuna.

However, in this case, the server is on a virtual machine. Not sure how to
proceed here.

tom
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,743
Messages
2,569,478
Members
44,899
Latest member
RodneyMcAu

Latest Threads

Top