How to generate k+1 length strings from a list of k length strings?

G

Girish Sahani

I have a list of strings all of length k. For every pair of k length
strings which have k-1 characters in common, i want to generate a k+1
length string(the k-1 common characters + 2 not common characters).
e.g i want to join 'abcd' with bcde' to get 'abcde' but i dont want to
join 'abcd' with 'cdef'
Currently i'm joining every 2 strings, then removing duplicate characters
from every joined string and finally removing all those strings whose
length != k+1.Here's the code i've written:

for i in range(0,len(prunedK) - 1,1):
if k in range(1,len(prunedK),1) & i+k <= len(prunedK) -1:
colocn = prunedK + prunedK[i+k]
prunedNew1.append(colocn)
continue
for string in prunedNew1:
stringNew = withoutDup(string)
prunedNew.append(stringNew)
continue

But this one is quite bad in the time aspect :(.
Thanks in advance,
girish
 
M

MTD

Try this:

def k2k1(string1, string2):
for c in string1:
string2 = string2.replace(c,"",1)

if len(string2) == 1:
string1 += string2

return string1

print k2k1("abcd", "ebcd")
 
M

MTD

actually, minor fix:
Try this:

def k2k1(string1, string2):
for c in string1:
string2 = string2.replace(c,"",1)

if len(string2) == 1:
string1 += string2
else:
string1 = ""
 
M

MTD

So yeah, just to put it all together, try this. From your two Ks, it
either returns K+1 if it can or an empty string.

def k2k1(string1, string2):
for c in string1:
string2 = string2.replace(c,"",1)

if len(string2) == 1:
string1 += string2
else:
string1 = ""

return string1


Testing:

print k2k1("abcdadd", "abceadd")

gives:
abcdadde
 
J

Jon Clements

Are you asking the question, "Which pairs of strings have one character
different in each?", or "Which pairs of strings have a substring of
len(string) - 1 in common?".

Jon.
 
B

Boris Borcic

Girish said:
I have a list of strings all of length k. For every pair of k length
strings which have k-1 characters in common, i want to generate a k+1
length string(the k-1 common characters + 2 not common characters).
e.g i want to join 'abcd' with bcde' to get 'abcde' but i dont want to
join 'abcd' with 'cdef'
Currently i'm joining every 2 strings, then removing duplicate characters
from every joined string and finally removing all those strings whose
length != k+1.

Hum, since your code is not syntactically correct, anything will run faster :)
I'd favor the following, that I find most readable

sets = map(set,list_of_strings)
res = set(''.join(sorted(s1|s2)) for s1 in sets for s2 in sets if len(s1^s2)==2)

unless performance is really an issue

Here's the code i've written:
for i in range(0,len(prunedK) - 1,1):
if k in range(1,len(prunedK),1) & i+k <= len(prunedK) -1:
colocn = prunedK + prunedK[i+k]
prunedNew1.append(colocn)
continue
for string in prunedNew1:
stringNew = withoutDup(string)
prunedNew.append(stringNew)
continue

But this one is quite bad in the time aspect :(


how do you know ?
Thanks in advance,
girish

you should do your own homework
 
M

MTD

Jon said:
Are you asking the question, "Which pairs of strings have one character
different in each?", or "Which pairs of strings have a substring of
len(string) - 1 in common?".

Jon.

I imagine it's the former because the latter is trivially easy, I mean
_really_ trivially easy.
 
B

bearophileHUGS

Boris Borcic:
I'd favor the following, that I find most readable
sets = map(set,list_of_strings)
res = set(''.join(sorted(s1|s2)) for s1 in sets for s2 in sets if len(s1^s2)==2)

I think there can be written more readable code. For my programs I
usually prefer simpler code, that (if possible) even a children can
understand. So I can debug, modify and improve it better & faster.

Bye,
bearophile
 
B

bearophileHUGS

I think there can be written more readable code. For my programs I
usually prefer simpler code, that (if possible) even a children can
understand. So I can debug, modify and improve it better & faster.

Debugged:
I think it can be written more readable code.
In this newsgroup sometimes I have tried to post 'clever' code, but for
my programs I (if possible) prefer simpler code, that even a child can
understand. So I can debug, modify and improve it faster.

Sorry, I was tired,
bearophile
 
B

Boris Borcic

Boris Borcic:

I think there can be written more readable code.

readability, of course, is in the eye of the beholder... and I find this code
*much* easier to recognize as a realisation of the description made by the OP,
than the code he himself offered - if you care to take a look at both.

For my programs I
usually prefer simpler code,

I challenge you to write simpler code to do the equivalent.
> that (if possible) even a children can
> understand.

what child ? one that is trained just like *you* think children should start, I
guess.
> So I can debug, modify and improve it better & faster.

Sure, but the case is we each were *distinct* children.
 
B

bearophileHUGS

Boris Borcic:
I challenge you to write simpler code to do the equivalent.

I don't do challenges. I too have written the code to solve that
problem, it wasn't much different from your one (it uses a generator
function xpairs, to yeild a scan of the different pairs, about half the
square, it uses symmetric_difference, etc), but it's on different
lines, with variables (object names), etc. And maybe comments too.
It's not that difficult to improve the readability of a quite long
line, you can start splitting it.

Sure, but the case is we each were *distinct* children.

Every person is different, so every person defines the style he/she
likes and understands more.
But there can be defined some "coding suggestions", useful for most
people, to avoid the most common errors, and one of such errors is to
write very long lines, full of a lot of stuff, like your one.

Bye,
bearophile
 
G

Girish Sahani

Yes it is the former of course.Common elements could be in any order in
both strings.
Thanks Marc :)...though we need to add an if for checking the length of
the string (k+1).
 
B

bearophileHUGS

Boris Borcic:
Pfff... and you don't do real debates either.

Different nations have different values and different cultures, in mine
challenges are often seen as things for children, and a waste of time
for adults (probably in USA challenges are appreciated more).

Bye,
bearophile
 
B

Boris Borcic

Boris Borcic:

Different nations have different values and different cultures, in mine
challenges are often seen as things for children, and a waste of time
for adults (probably in USA challenges are appreciated more).

(1) This is becoming very ridiculous. What do the USA have to do with it ?

(2) Different nations have different languages, and a good rule of thumb when
debating with someone of unknown linguistic origins is not to focus on the
connotations you percieve to a single word you abstract from its context.

(3) Different intellectual communities have different values, in mine it is the
rule not to make abstract claims if not ready to put them to experimental test -
and it's less a matter of (passing or failing) tests than a matter of making
clear the rule of translation between one's own abstract language and objective
facts. And by these means, ultimately, to measure the divergence in languages.

(4) As concerns the matter of the case, you made broad and abstract claims while
implicitely referring to some doctrine(s), and I concisely formulated the demand
that you back them up with (according to you) observable facts. I note that
instead you chose to articulate your doctrine while avoiding an occasion to
probe its adequacy.

(5) I find your systematic enrollment of children to your cause quite
inappropriate.
 
B

Boris Borcic

It's not that difficult to improve the readability of a quite long
line, you can start splitting it.

The point is that it is observer and context dependent whether

res = set(''.join(sorted(X|Y))
for X in sets
for Y in sets
if len(X^Y)==2)

is measurably easier to read than

res = set(''.join(sorted(X|Y)) for X in sets for Y in sets if len(X^Y)==2)

*and* that I acknowledged it by writing "I find most readable"
while you denied it by speaking of "the readability" - and similar language.
 
S

Sibylle Koczian

Girish said:
Yes it is the former of course.Common elements could be in any order in
both strings.
Thanks Marc :)...though we need to add an if for checking the length of
the string (k+1).

No, why? Every character from string1 is removed from string2 at most
once. And string2 is added to string1 if and only if it has length 1; if
not, string1 is set to the empty string. So either string1 is empty or
it has length k+1. Or did you mean checking if string1 is empty?
 
B

Boris Borcic

Hum, since your code is not syntactically correct, anything will run

in fact it parses, I was fooled by this line


I was fooled because the form "k in range(" tends to imply a for statement ;
in the case of an if statement, one usually uses chained comparisons instead.

If you know i>=0 and i,k are integers (as is implied by context), you could
simply write something like:

if 0 < k < i+k < len(prunedK) :


in the contrary case,
assuming your line does what you want, you should really write it as

if 1 <= k < len(prunedK) and i+k <= len(prunedK)-1 :

or more concisely

if 1 <= k < len(prunedK) >= i+k+1 :

or even

if i+k < len(prunedK) > k > 0 :
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,756
Messages
2,569,534
Members
45,007
Latest member
OrderFitnessKetoCapsules

Latest Threads

Top