How to check record duplication before saving?

Frank Tsao · Apr 11, 2008

Hi, everyone,

I am a beginner in ruby programming and I have a minor question about
data checking, if someone who could give me a suggestion that would be
great.

The system needs to prevent a duplicated record happened in database,
the record is consisted of two fields which are column A and column B.
Supposing a record has been saved in database that the A=1, B=1.
Afterward, the same set of record can not be accepted again, for
example, record: A=1, B=2 and A=2, B=1 both are acceptable. In other
words, record: A=1, B=1 can not be saved in database.

So how to check and prevent a new record that has the same columns with
same values tend to be saved in database?

Frank

Dave Thomas · Apr 11, 2008

So how to check and prevent a new record that has the same columns
with
same values tend to be saved in database?

Try creating a unique database index that spans the two columns.

Regards

Dave

Phillip Gawlowski · Apr 11, 2008

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Frank Tsao wrote:
| Hi, everyone,
|
| I am a beginner in ruby programming and I have a minor question about
| data checking, if someone who could give me a suggestion that would be
| great.
|
| The system needs to prevent a duplicated record happened in database,
| the record is consisted of two fields which are column A and column B.
| Supposing a record has been saved in database that the A=1, B=1.
| Afterward, the same set of record can not be accepted again, for
| example, record: A=1, B=2 and A=2, B=1 both are acceptable. In other
| words, record: A=1, B=1 can not be saved in database.
|
| So how to check and prevent a new record that has the same columns with
| same values tend to be saved in database?

Most database systems can handle that, be defining a UNIQUE constraint
on columns.

However, that means that A = 1 and B = 2 won't work.

Different methods to access databases have different ways to handle
these kinds of collisions, or at least different ways to implement the
collision detection.

(For example, you can fetch all records, and check if they are already
set, but that is rather bad for memory consumption and speed, but might
be enough in your case.)

So, if you can tell us how you access your database (and what it is), we
can help you out much better.

- --
Phillip Gawlowski
Twitter: twitter.com/cynicalryan

~ - You know you've been hacking too long when...
...your brain keeps hallucinating random "system error: collision with
stack heap" or "Guru Meditation Mode # Three billion and fifty-two,
press left button to continue" or even "This is not a DOS disk." error
messages.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.8 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkf/zngACgkQbtAgaoJTgL/2mwCfU4Ndpir25pE11+V9JyTYvAmB
CAkAn0N6erBRi8ISLPD/ptHkM/GhFMGU
=oV1Q
-----END PGP SIGNATURE-----

Dave Thomas · Apr 11, 2008

Most database systems can handle that, be defining a UNIQUE constraint
on columns.

However, that means that A = 1 and B = 2 won't work.

Sure it will- you define the unique index to span the two columns, and
then only non-unoque combinations of those columns will fail.

Dave

RubyTalk · Apr 11, 2008

If you are using Rails and migrations

add_index

table, [:column1, :column2], :unique => true)

if not something like this sql

CREATE UNIQUE INDEX index_name ON table(column1,column2);

Becker

Frank Tsao · Apr 12, 2008

If you are using Rails and migrations

add_indextable, [:column1, :column2], :unique => true)

if not something like this sql

CREATE UNIQUE INDEX index_name ON table(column1,column2);

Becker

Thanks for your help, but the situation in A=1, B=2 or A=2, B=1 that
wont be accepted if using unique.

My point is that when record(A=1, B=1) was existed in database, any new
record(A=1, B=1) can not be valided. They can be (A=1, B=2), (A=1, B=3)
or (A=5, B=1) etc.

Frank Tsao · Apr 12, 2008

Dave said:
Try creating a unique database index that spans the two columns.

Regards

Dave

Thanks for your help, but the situation in A=1, B=2 or A=2, B=1 that
wont be accepted if using unique.

My point is that when record(A=1, B=1) was existed in database, any new
record(A=1, B=1) can not be valided. They can be (A=1, B=2), (A=1, B=3)
or (A=5, B=1) etc.

Frank Tsao · Apr 12, 2008

Thanks for your help. I just used Instant Rails 2.0 for developing.
Mongrel and Sqlite.

Phillip Gawlowski · Apr 12, 2008

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Frank Tsao wrote:
| Thanks for your help. I just used Instant Rails 2.0 for developing.
| Mongrel and Sqlite.

As far as I can see, you have to iterate of all the present records to
check for uniqueness of values, since SQLite3 doesn't allow for a
multi-column uniqueness.

With MySQl and other RDBMSes, Dave Thomas' idea should work (I don't
know, since I never had to handle this situation yet myself).

- --
Phillip Gawlowski
Twitter: twitter.com/cynicalryan

Rule of Open-Source Programming #13:

Your first release can always be improved upon.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.8 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkf/8hUACgkQbtAgaoJTgL88EgCfWkL9pAPPKoEJ/8fBNWdaoa48
PbUAoINqSXEudAti8l6/YlauOAgPp1/f
=ZaP/
-----END PGP SIGNATURE-----

Todd Benson · Apr 12, 2008

If you are using Rails and migrations

add_indextable, [:column1, :column2], :unique => true)

if not something like this sql

CREATE UNIQUE INDEX index_name ON table(column1,column2);

Becker

Click to expand...

Thanks for your help, but the situation in A=1, B=2 or A=2, B=1 that
wont be accepted if using unique.

My point is that when record(A=1, B=1) was existed in database, any new
record(A=1, B=1) can not be valided. They can be (A=1, B=2), (A=1, B=3)
or (A=5, B=1) etc.

You probably have a primary key constraint on one of the columns
(unless you are using kindergarten SQLite). Example without correct
caps...

create table stuff (
colA int not null,
colB int not null,
primary key (colA, colB)
)

...or if you succumb to the idea integer IDs for the primary key are a
good idea...

create table stuff (
id int not null primary key,
colA int not null,
colB int not null,
unique (colA, colB)
)

Not tested.

Todd

Todd Benson · Apr 12, 2008

You probably have a primary key constraint on one of the columns
(unless you are using kindergarten SQLite).

That probably sounded bad. Sorry.

Example without correct
caps...

create table stuff (
colA int not null,
colB int not null,
primary key (colA, colB)
)

...or if you succumb to the idea integer IDs for the primary key are a
good idea...

create table stuff (
id int not null primary key,
colA int not null,
colB int not null,
unique (colA, colB)
)

In rails, I should point out I _have_ to use the integer ID for my
primary key, which irks me to no end because it goes against
everything I know about set theory. That whining aside, I might try
to fix it, but the underlying rails code is intimidating.

Todd

RubyTalk · Apr 12, 2008

I like to use guids for ids.

create_table

name,{:id=>false}) do |t|
t.column

id,:string,:limit=>64)
end
execute(sql_statement_to_create_primary_key)

Becker

ara.t.howard · Apr 12, 2008

In rails, I should point out I _have_ to use the integer ID for my
primary key, which irks me to no end because it goes against
everything I know about set theory. That whining aside, I might try
to fix it, but the underlying rails code is intimidating.

no db actually deals with sets - try to do a query and get the results
back in different order each time. in practice they are *ordered*
tuples of key value pairs and every db i've used is written with this
implicit understanding. having an id is no different that having an
'updated_at' field - it simply provides a total ordering which is
arbitrary. fighting the id is just swimming upstream to nowhere - i
know as i did it for a long time. ;-)

a @ http://codeforpeople.com/

Rick Tessner · Apr 12, 2008

Hi,

Excerpts from shuhao.tsao's message of Fri Apr 11 16:02:10 -0700 2008:

Thanks for your help, but the situation in A=1, B=2 or A=2, B=1 that
wont be accepted if using unique.

My point is that when record(A=1, B=1) was existed in database, any new
record(A=1, B=1) can not be valided. They can be (A=1, B=2), (A=1, B=3)
or (A=5, B=1) etc.

You could try creating a third column. Let's suppose that you know that
all A & B values are 6 digits or less, create a unique string column
that will then contain the value of A & B sorted, zero padded and
joined.

u = [a, b].sort.map { |i| "%06d" %i }.join

So, the database would contain 3 columns: a, b, u with u being defined
as unique in the database.

So, with a = 1; b = 2 we get u = "000001000002"

or with a = 2; b = 1 we also get u = "000001000002"

Todd Benson · Apr 13, 2008

no db actually deals with sets - try to do a query and get the results back
in different order each time. in practice they are *ordered* tuples of key
value pairs and every db i've used is written with this implicit
understanding. having an id is no different that having an 'updated_at'
field - it simply provides a total ordering which is arbitrary. fighting
the id is just swimming upstream to nowhere - i know as i did it for a long
time. ;-)

a @ http://codeforpeople.com/

I understand that, for example, postgresql and mysql have labeled all
things with an integer. But, in the table itself, umm, doesn't make
much sense...

create table my_set (
id int not null primary key,
a int,
b int,
);

insert into my_set values (1, 1, 1);

insert into my_set values (2, 1, 1);

It gets even worse with built in functions like auto_increment, where
you've tossed all identity of the tuple to the wind (think of moving
such a table to another db).

I'm still a firm believer in using logical primary keys instead of
arbitrary ones anyway, even if it takes a little more effort, but I'm
willing to listen to other ways

Todd

Todd Benson · Apr 13, 2008

I understand that, for example, postgresql and mysql have labeled all
things with an integer. But, in the table itself, umm, doesn't make
much sense...

create table my_set (

id int not null primary key,
a int,
b int,
);

insert into my_set values (1, 1, 1);

insert into my_set values (2, 1, 1);

It gets even worse with built in functions like auto_increment, where
you've tossed all identity of the tuple to the wind (think of moving
such a table to another db).

I'm still a firm believer in using logical primary keys instead of
arbitrary ones anyway, even if it takes a little more effort, but I'm
willing to listen to other ways

I'll clarify a little (I may have told this little story before). I
once worked for a company that had complete confidence in its ERP
database (to be nameless). Suddenly, one day, there's about a half
mil missing for no real reason except for what the reports said.
Having been embezzled before, they wanted to find the culprit. The
culprit turned out to be the database schema (and they were _so_ ready
for a witch hunt).

Todd

Todd Benson · Apr 13, 2008

I'll clarify a little (I may have told this little story before). I
once worked for a company that had complete confidence in its ERP
database (to be nameless). Suddenly, one day, there's about a half
mil missing for no real reason except for what the reports said.
Having been embezzled before, they wanted to find the culprit. The
culprit turned out to be the database schema (and they were _so_ ready
for a witch hunt).

Todd

Wow. That sounded incriminating, didn't it! My apologies for the
noise, but it's true. The company I worked for lost a bunch of money
because of a culmination of practices. They were paying for new part
revs and the db didn't reflect it. "Bad" shipments were dropped/lost
in the shipping part of the database. Also, it was full of tables
called: 12400010, 12400011, etc. and the relations were not even
close to sane. The "help" was paid about $1000/hr to set it up.

Todd

ara.t.howard · Apr 13, 2008

It gets even worse with built in functions like auto_increment, where
you've tossed all identity of the tuple to the wind (think of moving
such a table to another db).

yeah that's valid. i have at least one db where every id is a
globally unique uuid - and by globally i mean according to rfc. still

olddb.id

is a valid concept...

a @ http://codeforpeople.com/

ara.t.howard · Apr 13, 2008

The "help" was paid about $1000/hr to set it up.

guess we have something to learn from them! ;-)

a @ http://codeforpeople.com/

Todd Benson · Apr 13, 2008

guess we have something to learn from them! ;-)

Yeah, it's funny. That may sound like an exaggeration, but after you
add up the flight cost and motel stay, it probably was more like
$1200/hr. The guy worked about 4 hours per day for a total of 3 or 4
days.

How can I guarantee that the all callback functions of the first Ajax API call have finished executing before initiating the 2 call in JavaScript?	2	Oct 30, 2023
Upgrading Company's Internal Record Keeping Systems	0	Sep 24, 2021
how to remove code duplication	23	Aug 11, 2008
How to try a range of hex values in C# code ?	0	Nov 19, 2022
How to check / list gem dependency before install anything?	2	Jul 29, 2009
How to fix ssl.SSLError: [SSL: WRONG_VERSION_NUMBER] wrong version number (_ssl.c:2570)	0	Jul 28, 2023
virus scan documents before saving to database	0	Oct 18, 2006
BindingSource, Archiving Old record before saving new changes	1	Mar 14, 2006

How to check record duplication before saving?

Frank Tsao

Dave Thomas

Phillip Gawlowski

Dave Thomas

RubyTalk

Frank Tsao

Frank Tsao

Frank Tsao

Phillip Gawlowski

Todd Benson

Todd Benson

RubyTalk

ara.t.howard

Rick Tessner

Todd Benson

Todd Benson

Todd Benson

ara.t.howard

ara.t.howard

Todd Benson

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads