Ordering on Hierarchical Dot Notation

D

dar7yl

I have a SQL database (MySql) containing a table with a key column which
uses "Hierarchical .Dot Notation".
vis:
1
1.1
1.2
2
2.1
2.2
2.10
3
10
10.1
11
...

The problem is that this column is proving very difficult to sort naturally
after the values get up to 10. They sort like this:
1
1.1
1.2
10
10.1
11
2
2.1
2.10
2.2
3
....

I want to solve this without having to change the original column. The
users still want to query the database using the natural representation.

Basically, I can see of three methods:

1) Provide an auxillary column, and when adding records, set up this field
coded to sort in the desired order. For instance, extract each level as an
integer and append it to a variable byte array (stored as a BLOB). Then
include "ORDER BY Aux" in the query.

2) Query the database unordered, and sort the ResultSet myself.

3) Code a function module in MySql which can compare two strings in
hierarchical format, and use that function in the query.

Option 1 involves up-front application changes, expecially when adding data,
and would have to be performed for each existing and future application.
Non-controlled applications (for instance a stand-alone sql query, or a
report generator) can use the table except for adding records. Storage
representation limits number of items per level (for instance 255 items for
a BYTE array).

Option 2 also involves up-front application changes and precludes using the
table in non-controlled applications.

Option 3 involves minimal application impact, but involves delving deep into
MySql's operation. Also, it's not portable across db's.

Any thoughts on this would be appreciated.

regards,
Dar7yl
 
H

Heiner Kücker

dar7yl wrote
I have a SQL database (MySql) containing a table with a key column which
uses "Hierarchical .Dot Notation".
vis:
1
1.1
1.2
2
2.1
2.2
2.10
3
10
10.1
11
...

The problem is that this column is proving very difficult to sort naturally
after the values get up to 10. They sort like this:
1
1.1
1.2
10
10.1
11
2
2.1
2.10
2.2
3
...

I want to solve this without having to change the original column. The
users still want to query the database using the natural representation.

Basically, I can see of three methods:

1) Provide an auxillary column, and when adding records, set up this field
coded to sort in the desired order. For instance, extract each level as an
integer and append it to a variable byte array (stored as a BLOB). Then
include "ORDER BY Aux" in the query.

2) Query the database unordered, and sort the ResultSet myself.

3) Code a function module in MySql which can compare two strings in
hierarchical format, and use that function in the query.

Option 1 involves up-front application changes, expecially when adding data,
and would have to be performed for each existing and future application.
Non-controlled applications (for instance a stand-alone sql query, or a
report generator) can use the table except for adding records. Storage
representation limits number of items per level (for instance 255 items for
a BYTE array).

Option 2 also involves up-front application changes and precludes using the
table in non-controlled applications.

Option 3 involves minimal application impact, but involves delving deep into
MySql's operation. Also, it's not portable across db's.

Any thoughts on this would be appreciated.

regards,
Dar7yl

Use leading zeros for your order numbers.

public String fillLeadingZeros( int pValue , int pDigits )
{
StringBuffer strBuff = new StringBuffer( "" + pValue );
while ( strBuff.length() < pDigits )
{
strBuff.insert( 0 , '0' );
}
return strBuff.toString();
}

Format all values with the method above before
cocatenating and storing in database.

--
Heiner Kuecker
Internet: http://www.heinerkuecker.de http://www.heiner-kuecker.de
JSP WorkFlow PageFlow Page Flow FlowControl Navigation: http://www.control-and-command.de
Java Expression Formula Parser: http://www.heinerkuecker.de/Expression.html
CnC Template Technology http://www.heinerkuecker.de/Expression.html#templ
 
D

dar7yl

Heiner Kücker said:
Use leading zeros for your order numbers.

public String fillLeadingZeros( int pValue , int pDigits )
{
StringBuffer strBuff = new StringBuffer( "" + pValue );
while ( strBuff.length() < pDigits )
{
strBuff.insert( 0 , '0' );
}
return strBuff.toString();
}

Thanks, I thought of that as well (a variation on option 1). However, the
requirement is that the original column does not change, as it is being used
for legacy operations. I cannot force my users to conform to an artificial
format. In terms of efficiency, it is still more appropriate for me to
convert directly to binary instead of a string. Especially the use of
StringBuffer, which is notoriously slow.

Also, how many zeros do I insert? The binary method gives me flexability to
go for up to 255 items per level for byte blobs, to 64K for word blobs, and
4.2M for long blobs.

regards,
Dar7yl
 
H

Harald Fuchs

dar7yl said:
I have a SQL database (MySql) containing a table with a key column which
uses "Hierarchical .Dot Notation".
vis:
1
1.1
1.2
2
2.1
2.2
2.10
3
10
10.1
11
...
The problem is that this column is proving very difficult to sort naturally
after the values get up to 10. They sort like this:
1
1.1
1.2
10
10.1
11
2
2.1
2.10
2.2
3
...
I want to solve this without having to change the original column. The
users still want to query the database using the natural representation.
Basically, I can see of three methods:
1) Provide an auxillary column, and when adding records, set up this field
coded to sort in the desired order. For instance, extract each level as an
integer and append it to a variable byte array (stored as a BLOB). Then
include "ORDER BY Aux" in the query.

Don't denormalize unless you have no other choice.
2) Query the database unordered, and sort the ResultSet myself.

Don't do something in the application what can be done by the DBMS.
3) Code a function module in MySql which can compare two strings in
hierarchical format, and use that function in the query.
Option 3 involves minimal application impact, but involves delving deep into
MySql's operation. Also, it's not portable across db's.

That's right, but most DBs should have string handling functions. In MySQL:

SELECT whatever
FROM yourtbl
ORDER BY substring_index(col,'.',1) + 0,
CASE
WHEN instr(col,'.') THEN substring_index(col,'.',-1) + 0
ELSE 0
END
 
D

dar7yl

Thanks, Harald.
I tried your solution, but it didn't work. There is no iteration or
recursion to compare at each level. I'm sure that it is possible to create
an efficient expression to do this comparison, and you have at least given
me a starting point to persue this further.
..
Btw, I had already implemented option 1, storing a byte for every index at
each level. However, I was mistaken about the TINYBLOB type, which I
thought was synonomous with VARCHAR.

I ended up using VARCHAR(255) BINARY. This almost sorts naturally, but I
have to add a trailing null, because [2,1,4] orders after [2,1,4,1] for some
strange reason, thus I have to force it to [2,1,4,0] so the storage ends up
taking (levels+1) bytes.

Question: is VARCHAR BINARY signed or unsigned? This makes the difference
whether I can store 255 or 127 items per level. (not that it really matters,
typical usage is <20 items per level)

regards,
Dar7yl.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,768
Messages
2,569,574
Members
45,051
Latest member
CarleyMcCr

Latest Threads

Top