# How to generate k+1 length strings from a list of k length strings?

Discussion in 'Python' started by Girish Sahani, Jun 8, 2006.

1. ### Girish SahaniGuest

I have a list of strings all of length k. For every pair of k length
strings which have k-1 characters in common, i want to generate a k+1
length string(the k-1 common characters + 2 not common characters).
e.g i want to join 'abcd' with bcde' to get 'abcde' but i dont want to
join 'abcd' with 'cdef'
Currently i'm joining every 2 strings, then removing duplicate characters
from every joined string and finally removing all those strings whose
length != k+1.Here's the code i've written:

for i in range(0,len(prunedK) - 1,1):
if k in range(1,len(prunedK),1) & i+k <= len(prunedK) -1:
colocn = prunedK + prunedK[i+k]
prunedNew1.append(colocn)
continue
for string in prunedNew1:
stringNew = withoutDup(string)
prunedNew.append(stringNew)
continue

But this one is quite bad in the time aspect .
girish
Girish Sahani, Jun 8, 2006

2. ### MTDGuest

Try this:

def k2k1(string1, string2):
for c in string1:
string2 = string2.replace(c,"",1)

if len(string2) == 1:
string1 += string2

return string1

print k2k1("abcd", "ebcd")
MTD, Jun 8, 2006

3. ### MTDGuest

actually, minor fix:

MTD wrote:
> Try this:
>
> def k2k1(string1, string2):
> for c in string1:
> string2 = string2.replace(c,"",1)
>
> if len(string2) == 1:
> string1 += string2

else:
string1 = ""

>
> return string1
>
> print k2k1("abcd", "ebcd")
MTD, Jun 8, 2006
4. ### MTDGuest

So yeah, just to put it all together, try this. From your two Ks, it
either returns K+1 if it can or an empty string.

def k2k1(string1, string2):
for c in string1:
string2 = string2.replace(c,"",1)

if len(string2) == 1:
string1 += string2
else:
string1 = ""

return string1

Testing:

gives:
MTD, Jun 8, 2006
5. ### Jon ClementsGuest

Are you asking the question, "Which pairs of strings have one character
different in each?", or "Which pairs of strings have a substring of
len(string) - 1 in common?".

Jon.

Girish Sahani wrote:
> I have a list of strings all of length k. For every pair of k length
> strings which have k-1 characters in common, i want to generate a k+1
> length string(the k-1 common characters + 2 not common characters).
> e.g i want to join 'abcd' with bcde' to get 'abcde' but i dont want to
> join 'abcd' with 'cdef'
> Currently i'm joining every 2 strings, then removing duplicate characters
> from every joined string and finally removing all those strings whose
> length != k+1.Here's the code i've written:
>
> for i in range(0,len(prunedK) - 1,1):
> if k in range(1,len(prunedK),1) & i+k <= len(prunedK) -1:
> colocn = prunedK + prunedK[i+k]
> prunedNew1.append(colocn)
> continue
> for string in prunedNew1:
> stringNew = withoutDup(string)
> prunedNew.append(stringNew)
> continue
>
> But this one is quite bad in the time aspect .
> girish
Jon Clements, Jun 8, 2006
6. ### Boris BorcicGuest

Girish Sahani wrote:
> I have a list of strings all of length k. For every pair of k length
> strings which have k-1 characters in common, i want to generate a k+1
> length string(the k-1 common characters + 2 not common characters).
> e.g i want to join 'abcd' with bcde' to get 'abcde' but i dont want to
> join 'abcd' with 'cdef'
> Currently i'm joining every 2 strings, then removing duplicate characters
> from every joined string and finally removing all those strings whose
> length != k+1.

Hum, since your code is not syntactically correct, anything will run faster
I'd favor the following, that I find most readable

sets = map(set,list_of_strings)
res = set(''.join(sorted(s1|s2)) for s1 in sets for s2 in sets if len(s1^s2)==2)

unless performance is really an issue

Here's the code i've written:
>
> for i in range(0,len(prunedK) - 1,1):
> if k in range(1,len(prunedK),1) & i+k <= len(prunedK) -1:
> colocn = prunedK + prunedK[i+k]
> prunedNew1.append(colocn)
> continue
> for string in prunedNew1:
> stringNew = withoutDup(string)
> prunedNew.append(stringNew)
> continue
>
> But this one is quite bad in the time aspect

how do you know ?

> girish

you should do your own homework
Boris Borcic, Jun 8, 2006
7. ### MTDGuest

Jon Clements wrote:
> Are you asking the question, "Which pairs of strings have one character
> different in each?", or "Which pairs of strings have a substring of
> len(string) - 1 in common?".
>
> Jon.

I imagine it's the former because the latter is trivially easy, I mean
_really_ trivially easy.
MTD, Jun 8, 2006
8. ### Guest

Boris Borcic:
> I'd favor the following, that I find most readable
> sets = map(set,list_of_strings)
> res = set(''.join(sorted(s1|s2)) for s1 in sets for s2 in sets if len(s1^s2)==2)

I think there can be written more readable code. For my programs I
usually prefer simpler code, that (if possible) even a children can
understand. So I can debug, modify and improve it better & faster.

Bye,
bearophile
, Jun 8, 2006
9. ### Guest

> I think there can be written more readable code. For my programs I
> usually prefer simpler code, that (if possible) even a children can
> understand. So I can debug, modify and improve it better & faster.

Debugged:
I think it can be written more readable code.
In this newsgroup sometimes I have tried to post 'clever' code, but for
my programs I (if possible) prefer simpler code, that even a child can
understand. So I can debug, modify and improve it faster.

Sorry, I was tired,
bearophile
, Jun 8, 2006
10. ### Boris BorcicGuest

wrote:
> Boris Borcic:
>> I'd favor the following, that I find most readable
>> sets = map(set,list_of_strings)
>> res = set(''.join(sorted(s1|s2)) for s1 in sets for s2 in sets if len(s1^s2)==2)

>
> I think there can be written more readable code.

readability, of course, is in the eye of the beholder... and I find this code
*much* easier to recognize as a realisation of the description made by the OP,
than the code he himself offered - if you care to take a look at both.

For my programs I
> usually prefer simpler code,

I challenge you to write simpler code to do the equivalent.

> that (if possible) even a children can
> understand.

what child ? one that is trained just like *you* think children should start, I
guess.

> So I can debug, modify and improve it better & faster.

Sure, but the case is we each were *distinct* children.

>
> Bye,
> bearophile
>
Boris Borcic, Jun 8, 2006
11. ### Guest

Boris Borcic:

> I challenge you to write simpler code to do the equivalent.

I don't do challenges. I too have written the code to solve that
problem, it wasn't much different from your one (it uses a generator
function xpairs, to yeild a scan of the different pairs, about half the
square, it uses symmetric_difference, etc), but it's on different
lines, with variables (object names), etc. And maybe comments too.
It's not that difficult to improve the readability of a quite long
line, you can start splitting it.

> Sure, but the case is we each were *distinct* children.

Every person is different, so every person defines the style he/she
likes and understands more.
But there can be defined some "coding suggestions", useful for most
people, to avoid the most common errors, and one of such errors is to
write very long lines, full of a lot of stuff, like your one.

Bye,
bearophile
, Jun 8, 2006
12. ### Girish SahaniGuest

Re: How to generate k+1 length strings from a list of klengthstrings?

Yes it is the former of course.Common elements could be in any order in
both strings.
Thanks Marc ...though we need to add an if for checking the length of
the string (k+1).
>
> Jon Clements wrote:
>> Are you asking the question, "Which pairs of strings have one character
>> different in each?", or "Which pairs of strings have a substring of
>> len(string) - 1 in common?".
>>
>> Jon.

>
> I imagine it's the former because the latter is trivially easy, I mean
> _really_ trivially easy.
>
> --
> http://mail.python.org/mailman/listinfo/python-list
>
Girish Sahani, Jun 9, 2006
13. ### Boris BorcicGuest

wrote:
> Boris Borcic:
>
>> I challenge you to write simpler code to do the equivalent.

context ?

>
> I don't do challenges.

Pfff... and you don't do real debates either.

> I too have written the code to solve that
> problem,

Why ? When ?
Boris Borcic, Jun 9, 2006
14. ### Guest

Boris Borcic:
> > I don't do challenges.

>
> Pfff... and you don't do real debates either.

Different nations have different values and different cultures, in mine
challenges are often seen as things for children, and a waste of time
for adults (probably in USA challenges are appreciated more).

Bye,
bearophile
, Jun 9, 2006
15. ### Boris BorcicGuest

wrote:
> Boris Borcic:
>>> I don't do challenges.

>> Pfff... and you don't do real debates either.

>
> Different nations have different values and different cultures, in mine
> challenges are often seen as things for children, and a waste of time
> for adults (probably in USA challenges are appreciated more).

(1) This is becoming very ridiculous. What do the USA have to do with it ?

(2) Different nations have different languages, and a good rule of thumb when
debating with someone of unknown linguistic origins is not to focus on the
connotations you percieve to a single word you abstract from its context.

(3) Different intellectual communities have different values, in mine it is the
rule not to make abstract claims if not ready to put them to experimental test -
and it's less a matter of (passing or failing) tests than a matter of making
clear the rule of translation between one's own abstract language and objective
facts. And by these means, ultimately, to measure the divergence in languages.

(4) As concerns the matter of the case, you made broad and abstract claims while
implicitely referring to some doctrine(s), and I concisely formulated the demand
that you back them up with (according to you) observable facts. I note that
instead you chose to articulate your doctrine while avoiding an occasion to

(5) I find your systematic enrollment of children to your cause quite
inappropriate.

>
> Bye,
> bearophile
>
Boris Borcic, Jun 9, 2006
16. ### Boris BorcicGuest

wrote:

> It's not that difficult to improve the readability of a quite long
> line, you can start splitting it.

The point is that it is observer and context dependent whether

res = set(''.join(sorted(X|Y))
for X in sets
for Y in sets
if len(X^Y)==2)

is measurably easier to read than

res = set(''.join(sorted(X|Y)) for X in sets for Y in sets if len(X^Y)==2)

*and* that I acknowledged it by writing "I find most readable"
while you denied it by speaking of "the readability" - and similar language.
Boris Borcic, Jun 9, 2006
17. ### Sibylle KoczianGuest

Re: How to generate k+1 length strings from a list of k lengthstrings?

Girish Sahani schrieb:
> Yes it is the former of course.Common elements could be in any order in
> both strings.
> Thanks Marc ...though we need to add an if for checking the length of
> the string (k+1).
>

No, why? Every character from string1 is removed from string2 at most
once. And string2 is added to string1 if and only if it has length 1; if
not, string1 is set to the empty string. So either string1 is empty or
it has length k+1. Or did you mean checking if string1 is empty?

--
Dr. Sibylle Koczian
Universitaetsbibliothek, Abt. Naturwiss.
D-86135 Augsburg
e-mail : -Augsburg.DE
Sibylle Koczian, Jun 9, 2006
18. ### Boris BorcicGuest

> Hum, since your code is not syntactically correct, anything will run
> faster

in fact it parses, I was fooled by this line

>> if k in range(1,len(prunedK),1) & i+k <= len(prunedK) -1:

I was fooled because the form "k in range(" tends to imply a for statement ;
in the case of an if statement, one usually uses chained comparisons instead.

If you know i>=0 and i,k are integers (as is implied by context), you could
simply write something like:

if 0 < k < i+k < len(prunedK) :

in the contrary case,
assuming your line does what you want, you should really write it as

if 1 <= k < len(prunedK) and i+k <= len(prunedK)-1 :

or more concisely

if 1 <= k < len(prunedK) >= i+k+1 :

or even

if i+k < len(prunedK) > k > 0 :
Boris Borcic, Jun 9, 2006