Why can't I xor strings?

dataangel

I wrote a function to compare whether two strings are "similar" because
I'm using Python to make a small text adventure engine and I want it
to be responsive to slight misspellings, like "inevtory" and the like. To
save time the first thing my function does is check if I'm comparing an
empty vs. a non-empty string, because they are to be never considered
similar. Right now I have to write out the check like this:

if str1 or str2:
    if not str1 or not str2:
        return 0

Because Python won't let me do str1 ^ str2. Why does this only work for
numbers? Just treat empty strings as 0 and nonempty as 1.

Regardless of whether this is the best implementation for detecting if
two strings are similar, I don't see why xor for strings shouldn't be
supported. Am I missing something? In particular, I think it'd be cool to
have "xor" as opposed to "^". The caret would return the resulting
value, while "xor" would act like and/or do and return the one that was
true (if any).
 
Paul Rubin

dataangel said:
To save time the first thing my function does is check if
I'm comparing an empty vs. a non-empty string, because they are to be
never considered similar.
Right now I have to write out the check like this:

if str1 or str2:
    if not str1 or not str2:
        return 0

How about:

def nonempty(s):
    return (len(s) > 0)

and then

if nonempty(str1) != nonempty(str2):
    return 0

Of course there are other ways to write the same thing. Python 2.3 has
a "bool" function which, when called on a string, returns true iff the
string is nonempty.
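
As a rough sketch of the bool-based variant mentioned above (the function name exactly_one_empty is made up for illustration and does not appear in the thread):

```python
def exactly_one_empty(str1, str2):
    """Return True when exactly one of the two strings is empty."""
    # bool("") is False and bool() of any non-empty string is True,
    # so the two truth values differ only when exactly one is empty.
    return bool(str1) != bool(str2)


print(exactly_one_empty("inventory", ""))          # True
print(exactly_one_empty("inventory", "inevtory"))  # False
print(exactly_one_empty("", ""))                   # False
```

Inside a similarity function this could serve as the early-exit test, returning 0 whenever it is true.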
 
Byron

Hi DataAngel,

Welcome to the Python community!

From reading your post, it sounds like you want to use XOR
encryption, since there are other ways to quickly and easily compare
strings to see if they are equal.

To compare two strings, use the following:

name = "Steven"
name2 = "Steven"
print(name == name2)

The result will be "True." However, if my initial guess is correct that
you are wanting to use XOR for strings, it is probably because you are
wanting a quick way of encrypting data so that basic users won't be able
to view the information. If this is what you are wanting, I might have
something for you...

Let me know,

Byron
 
Byron

Hi DataAngel,

Welcome to the Python community!

From reading your post, it sounds like you are wanting to use XOR for
"encryption." The reason why I say this is because there are many other
ways of comparing the content of two strings to see if they are exactly
the same. An example of how to do such a thing is listed below:

# How to compare two strings to see if they are exact matches.
name = "Steven"
name2 = "Steven"
print(name == name2)


The result will be "True."

However, the only reason that one might want to use XOR for a string is
if he or she wanted to use XOR-based encryption in order to keep data
semi-private.

Let me know,

Byron
 
Grant Edwards

Byron said:
Hi DataAngel,

Welcome to the Python community!

From reading your post, it sounds like you are wanting to use XOR
encryption

No, he wants to do an exclusive or of the boolean values of the
strings.

Byron said:
The result will be "True." However, if my initial guess is correct that
you are wanting to use XOR for strings, it is probably because you are
wanting a quick way of encrypting data so that basic users won't be able
to view the information.

No, he explained exactly what he was trying to do, and it had
nothing to do with encryption. He wants to know if exactly one
(1) of the strings is the empty string.
 
Byron

Grant said:
No, he explained exactly what he was trying to do, and it had
nothing to do with encryption. He wants to know if exactly one
(1) of the strings is the empty string.

Hi Grant,

Oops, that's what happens when one skims through a post at light
speed... lol :)

Byron
---
 
Andrew Dalke

Grant Edwards:
No, he explained exactly what he was trying to do, and it had
nothing to do with encryption. He wants to know if exactly one
(1) of the strings is the empty string.

BTW, another way to get that is

if bool(str1) + bool(str2) == 1:
    print("one and only one of them was empty")


Andrew
 
Steven Bethard

Grant Edwards said:
No, he wants to do an exclusive or of the boolean values of the
strings.

I'm guessing the OP has already guessed this solution from the variety already
provided, but the direct translation of this statement would be:

bool(x) ^ bool(y)

for example:
True
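
A minimal sketch of that expression with made-up values (the sample strings below are assumptions, not taken from the original post):

```python
x = "inevtory"   # hypothetical non-empty input
y = ""           # hypothetical empty input

# bool() maps each string to True or False; ^ applied to two bools
# acts as a logical xor and yields a bool.
one_empty = bool(x) ^ bool(y)
both_nonempty = bool(x) ^ bool("inventory")

print(one_empty)      # True
print(both_nonempty)  # False
```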


Steve
 
Grant Edwards

Steven Bethard said:
I'm guessing the OP has already guessed this solution from the
variety already provided, but the direct translation of this
statement would be:

bool(x) ^ bool(y)

:)

It dawned on me later that perhaps the jump from what I wrote
to the code you wrote wasn't as obvious as I thought.
 
Jeremy Bowers

dataangel said:
Regardless of whether this is the best implementation for detecting if
two strings are similar, I don't see why xor for strings shouldn't be
supported. Am I missing something?

The basic problem is that there is no obvious "xor on string" operation
*in general*, even if you stipulate "bitwise". In particular, what does it
mean to bitwise xor two different length strings?

"In general" is the key, here. You actually don't have strings,
conceptually, you have "tokens" or "commands" or something that happen to
be strings. I recommend creating a class to match this concept by
subclassing str... or even just write it directly without subclassing
string. You can then make __xor__(self, other) for that class do whatever
makes sense with your concept.

http://docs.python.org/ref/numeric-types.html

I can almost guarantee that you will later find other things to put into
that class. It may very well clean up a lot of code.
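
One way that suggestion might look, as a sketch only. The class name Command and the choice to make ^ mean "exactly one side is empty" are assumptions for illustration; the thread does not specify either.

```python
class Command(str):
    """Hypothetical str subclass for player input (name is illustrative)."""

    def __xor__(self, other):
        # Logical xor on emptiness: True when exactly one side is empty.
        return bool(self) != bool(other)

    # Lets a plain str appear on the left-hand side: "" ^ Command("look")
    __rxor__ = __xor__


print(Command("inventory") ^ Command(""))   # True
print(Command("inventory") ^ "inevtory")    # False
print("" ^ Command("look"))                 # True
```

Because the meaning of ^ lives in the class, a different concept (say, character-wise xor) could define it differently without touching str itself.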
 
Grant Edwards

Jeremy Bowers said:
The basic problem is that there is no obvious "xor on string" operation

Sure there is. Strings have a boolean value, and the xor
operation on boolean values is well-defined.
 
David Bolen

Grant Edwards said:
Sure there is. Strings have a boolean value, and the xor
operation on boolean values is well-defined.

That's an operation, but I'm not sure that's the obvious one. For my
part, if I saw "string1 ^ string2" I'd probably expect a byte by byte
xor with the result being a new string.

It doesn't feel natural to me to have my strings suddenly interpreted
as a new data type based on the operation at hand. Logical operators
work that way but not numerics (it would be in the same vein as string
+ number interpreting the string as a number - that way lies Perl :))
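
To make the ambiguity concrete, here is a sketch of the character-wise reading described above. The helper name xor_chars and the truncate-to-the-shorter-string rule are assumptions for illustration; neither comes from the thread or from Python itself.

```python
def xor_chars(s1, s2):
    """Character-wise xor of two strings; an illustrative helper only."""
    # Assumption: zip() stops at the shorter string, so any extra
    # characters in the longer one are silently dropped. Padding or
    # raising an error would be equally defensible choices.
    return "".join(chr(ord(a) ^ ord(b)) for a, b in zip(s1, s2))


masked = xor_chars("look", "key!")
print(repr(masked))
print(xor_chars(masked, "key!"))   # look
```

The need to pick a rule for unequal lengths is exactly the kind of decision that makes a built-in string ^ hard to define.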

-- David
 
Grant Edwards

David Bolen said:
That's an operation, but I'm not sure that's the obvious one.
For my part, if I saw "string1 ^ string2" I'd probably expect
a byte by byte xor with the result being a new string.

Only because Python lacks a logical xor operator, so you're
used to thinking of ^ as a bitwise operator. What if you saw

string1 xor string2?

Wouldn't you expect it to be equivalent to

(string1 and (not string2)) or ((not string1) and string2)

David Bolen said:
It doesn't feel natural to me to have my strings suddenly
interpreted as a new data type based on the operation at hand.
Logical operators work that way but not numerics

I don't know what you mean by that. Nobody seems to have a
problem with "and" "or" and "not" operators using the truth
values of strings. What is there about "xor" that precludes it
from behaving similarly?

David Bolen said:
(it would be in the same vein as string + number interpreting
the string as a number - that way lies Perl :))

I don't see that at all.
 
Stephen Waterbury

David said:
That's an operation, but I'm not sure that's the obvious one. For my
part, if I saw "string1 ^ string2" I'd probably expect a byte by byte
xor with the result being a new string.

.... but you'd get a traceback. ;) As pointed out earlier in this
thread, what works is "bool(string1) ^ bool(string2)", which
certainly doesn't violate the law of least astonishment.
Why would anything else be needed?
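
A small sketch of both behaviours, using made-up sample strings (the variable values are assumptions for illustration):

```python
string1 = "inventory"
string2 = ""

try:
    string1 ^ string2              # str does not define the ^ operator
    raised = False
except TypeError:
    raised = True

print(raised)                          # True
print(bool(string1) ^ bool(string2))   # True
```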
 
Ville Vainio

Grant> I don't know what you mean by that. Nobody seems to have a
Grant> problem with "and" "or" and "not" operators using the truth
Grant> values of strings. What is there about "xor" that
Grant> precludes it from behaving similarly?

It's just that logical xor is an extremely rare beast. I would
probably prefer to see the operation expanded to the more typical
and-or-nots in real code.
 
Paul Rubin

Ville Vainio said:
It's just that logical xor is an extremely rare beast. I would
probably prefer to see the operation expanded to the more typical
and-or-nots in real code.

I think != is appropriate in this situation.
if bool(x) != bool(y): ...
 
Fredrik Lundh

Ville said:
Grant> I don't know what you mean by that. Nobody seems to have a
Grant> problem with "and" "or" and "not" operators using the truth
Grant> values of strings. What is there about "xor" that
Grant> precludes it from behaving similarly?

It's just that logical xor is an extremely rare beast. I would
probably prefer to see the operation expanded to the more typical
and-or-nots in real code.

"imp" and "eqv", anyone?

</F>
 
Andrew Dalke

Grant said:
What if you saw

string1 xor string2?

Wouldn't you expect it to be equivalent to

(string1 and (not string2)) or ((not string1) and string2)

I would expect it to give a syntax error.

If not, and 'xor' did become a new boolean operator
in Python I would expect it to act more like

xor_f(x, y)

where 'xor_f' the function is defined as

def xor_f(x, y):
    x = bool(x)
    y = bool(y)
    return (x and not y) or (not x and y)


Why the distinction? In your code you call bool on
an object at least once and perhaps twice. The
truth of an object should only be checked once. You
also have asymmetric return values. Consider

s1     s2     s1 xor s2
"A"    "B"    False
"A"    ""     True
""     "B"    "B"
""     ""     ""

Esthetics suggest that either "A"/"" return "A" or that
""/"B" return True. Mine does the latter. Yours does
neither. Probably the Pythonic way, assuming 'xor'
can be considered Pythonic, is to return the object
which gave the True value, like this

def xor_f(x, y):
    bx = bool(x)
    by = bool(y)
    if bx:
        if not by:
            return x
        return False
    else:
        if by:
            return y
        return False

In any case, 'xor' the binary operator is rarely
needed and as you've shown can be interpreted in
a couple different ways. Each alone weighs against
it. Both together make it almost certainly a bad
idea for Python.
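
The competing readings can be put side by side in a short sketch. The function names xor_expr, xor_bool, and xor_obj are made up for this comparison; the bodies follow the and/or/not expansion and the two variants discussed above, with the third returning the truthy operand itself.

```python
def xor_expr(s1, s2):
    # The and/or/not expansion; operands may be evaluated for truth twice.
    return (s1 and (not s2)) or ((not s1) and s2)


def xor_bool(x, y):
    # Always returns a bool; each truth value is computed once.
    x = bool(x)
    y = bool(y)
    return (x and not y) or (not x and y)


def xor_obj(x, y):
    # Returns the single truthy operand, or False if both or neither are truthy.
    bx = bool(x)
    by = bool(y)
    if bx and not by:
        return x
    if by and not bx:
        return y
    return False


for s1, s2 in [("A", "B"), ("A", ""), ("", "B"), ("", "")]:
    print(repr(s1), repr(s2),
          repr(xor_expr(s1, s2)), repr(xor_bool(s1, s2)), repr(xor_obj(s1, s2)))
```

All three agree on truthiness for every input pair and differ only in which object comes back.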


Andrew
 
