Filter a martix?? I need your help, please!

V

Vinc_get31

Hi everyone!

I have 2 matrices [XYZ]: The 1 is smaller than the 2.
I would like to select in the 2nd only lines which correspond to
couples [XY] of 1st...

Is that possible? And how?
However, I need a very optimized code, because I work with very very
big matrice (250000*3), and this operation had to be quick...

Thanks,
Regards,
Vinc'


A= [1 1 7]
[1 2 5]
[2 4 7]
[3 5 8]

B= [1 1 2]
[1 2 3]
[1 3 4]
[1 4 3]
[1 5 0]
[2 1 7]
[2 2 2]
[2 3 7]
[2 4 1]
[2 5 7]
[3 1 1]
[3 2 4]
[3 3 7]
[3 4 0]
[3 5 0]

And I want, with the function I'm looking for, this result :
C=f(B)= [1 1 2]
[1 2 3]
[2 4 1]
[3 5 0]
 
C

Christian Gollwitzer

Vinc_get31 said:
Hi everyone!

I have 2 matrices [XYZ]: The 1 is smaller than the 2.
I would like to select in the 2nd only lines which correspond to
couples [XY] of 1st...
This is known as a "join" operation in databases. If the matrices are
sorted (which they apparently are in your representation), there is a
quite fast algorithm of the order O(N+M), where N and M are the size of
the sparse matrices.


1) i=1; j=1;
2) while ((i<=length(A)) && (j<=length(B))
3) compare A and B
4) decide:
5) A is greater: increment j
6) B[j] is greater: increment i
7) they are equal : output B[j] and increment j
8) endwhile


this simply steps with two indices through both matrices simultaneously,
it is somewhat similiar two mergesort.
 
C

Christian Gollwitzer

Vinc_get31 said:
Hi everyone!

I have 2 matrices [XYZ]: The 1 is smaller than the 2.
I would like to select in the 2nd only lines which correspond to
couples [XY] of 1st...
This is known as a "join" operation in databases. If the matrices are
sorted (which they apparently are in your representation), there is a
quite fast algorithm of the order O(N+M), where N and M are the size of
the sparse matrices.


1) i=1; j=1;
2) while ((i<=length(A)) && (j<=length(B))
3) compare A and B
4) decide:
5) A is greater: increment j
6) B[j] is greater: increment i
7) they are equal : output B[j] and increment j
8) endwhile


this simply steps with two indices through both matrices simultaneously,
it is somewhat similiar two mergesort.
 
K

Karl Heinz Buchegger

Vinc_get31 said:
A= [1 1 7]
[1 2 5]
[2 4 7]
[3 5 8]

B= [1 1 2]
[1 2 3]
[1 3 4]
[1 4 3]
[1 5 0]
[2 1 7]
[2 2 2]
[2 3 7]
[2 4 1]
[2 5 7]
[3 1 1]
[3 2 4]
[3 3 7]
[3 4 0]
[3 5 0]

And I want, with the function I'm looking for, this result :
C=f(B)= [1 1 2]
[1 2 3]
[2 4 1]
[3 5 0]


Well. First of all. Make sure things are sorted! Searching is
always easier and faster if things are sorted. That's why
phone books are sorted :)

Sort your B matrix according to the first and second column.
Your posted example already has the correct sort order. I don't
know if this was intended or if it is just a concidence, that's
why I address this issue.

There are multiple strategies you could use.

A general one is this:

Take a row from the A matrix -> 2 4 7
You are looking for a row in the B vector which starts with a 2.
So do a binary search to locate a row that starts with 2

Eg. The binary search in the first column will bring you to:

[1 1 2]
[1 2 3]
[1 3 4]
[1 4 3]
[1 5 0]
[2 1 7]
[2 2 2]
[2 3 7] <-
[2 4 1]
[2 5 7]
[3 1 1]
[3 2 4]
[3 3 7]
[3 4 0]
[3 5 0]

that is a row [ 2 3 7 ] which starts with 2. Since you are looking for [ 2 4 7 ]
and 4 is greater then 3, do a search backwards until you either find a row
which starts with [ 2 4 . ] or the first column is no longer a 2 or the
second row element is greater then 4 (remember: the entries are sorted !)

In the second case and third case, there is no row which starts with [2 4 .]

If on the other hand the binary search dropped you at an entry [ 2 5 7 ]
and you are still searching for [ 2 4 7 ], then you know that the row, if
it is in the matrix is before [ 2 5 7 ]. So continue search frontwards until
you either find a row that fits or the second column has a value smaller then 4,
or the first column has a value smaller then 2. In this cases there is no
row that fits your needs.

Basically that's it. Repeat the whole process for the next search row and
collect the results.

If you think this is complicated: not at all. You have done the very same thing
hundreds of times: Whenever you search a phone number in a phone book, you do
the very same thing: First locate the page which contains the last name, eg
by opening the book in the middle and deciding if the searched name is before
of after that page. Repeat that process until you find a page which contains the
last name (that's the binary search step). If there are multiple first names with
the same last name, just pick one and decide if the searched first name is in front
of or after the one you search for.

You see: coming up with algorithms is often a question of watching carfully what
you do in real life :)

Various variations are possible. Eg. if the numbers in your matrix are limited
to lets say the a range of 0 to 9, then an additional array containing the
starting indices of those numbers in the first row may replace the binary
search step. (That would be equivalent with having an index in the phone book:
you are looking for a last name that starts with C. Don't worry searching the
whole book, all names starting with C are on page 78ff)
 
P

Patrick Kowalzick

Hi Vinc,

the speed of the search depends on the structure of the data. Your Vector B
has always blocks of five elements and the increment between the blocks i
constant (1). The second argument you showed here is always 1 to 5. Is this
always like that or just in your example you posted?

If it is always like that, you could just calculate the position and there
is no search needed. But I assume you would store the data just in a two
dimensional field.

I think you have to describe your data a little bit more exact.

Regards,
Patrick
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,755
Messages
2,569,536
Members
45,020
Latest member
GenesisGai

Latest Threads

Top