how to duplicate array entries

S

Sebastian

Hi there,

I have an array x=[1,2,3]

Is there an operator which I can use to get the result
[1,1,1,2,2,2,3,3,3] ?

I tried x*3, which resulted in [1,2,3,1,2,3,1,2,3]
I also tried [[b,b,b] for b in x] which led to [[1,2,3],[1,2,3],
[1,2,3]], but this isn't what I want either.

Cheers, Sebastian
 
S

Sebastian

I also tried [[b,b,b] for b in x] which led to [[1,2,3],[1,2,3],
[1,2,3]]

Sorry, I have to correct myself. The quoted line above resulted in
[[1,1,1],[2,2,2],[3,3,3]] of course!

Cheers, Sebastian
 
C

Chris Rebert

Hi there,

I have an array  x=[1,2,3]

Is there an operator which I can use to get the result
[1,1,1,2,2,2,3,3,3] ?

I tried x*3, which resulted in [1,2,3,1,2,3,1,2,3]
I also tried [[b,b,b] for b in x] which led to [[1,2,3],[1,2,3],
[1,2,3]], but this isn't what I want either.

from itertools import chain, repeat
n = 3
stretched = list(chain(*[repeat(item, n) for item in x]))

Cheers,
Chris
 
P

Paul Rudin

Sebastian said:
Hi there,

I have an array x=[1,2,3]

In python such an object is called a "list".

(In cpython it's implemented as an automatically resizable array.)
Is there an operator which I can use to get the result
[1,1,1,2,2,2,3,3,3] ?

There's no operator that will give you that directly - but there are
plenty of one-liners that will yield that list.
e.g:
list(itertools.chain(*([x]*3 for x in [1,2,3])))
[1, 1, 1, 2, 2, 2, 3, 3, 3]
 
G

Gary Herron

Paul said:
Hi there,

I have an array x=[1,2,3]

In python such an object is called a "list".

(In cpython it's implemented as an automatically resizable array.)

Is there an operator which I can use to get the result
[1,1,1,2,2,2,3,3,3] ?

There's no operator that will give you that directly - but there are
plenty of one-liners that will yield that list.
e.g:

list(itertools.chain(*([x]*3 for x in [1,2,3])))
[1, 1, 1, 2, 2, 2, 3, 3, 3]

List comprehension also works nicely for this problem, and may be
clearer to some.
>>> x = [1,2,3]
>>> print [i for i in x for k in range(3)]
[1, 1, 1, 2, 2, 2, 3, 3, 3]

Gary Herron
 
S

Steven D'Aprano

Hi there,

I have an array x=[1,2,3]

You have a list. Python has an array type, but you have to "import array"
to use it.

Is there an operator which I can use to get the result
[1,1,1,2,2,2,3,3,3] ?

Not an operator, but you can do it easily with a function. Here's the
simple version:
.... L = []
.... for item in items:
.... L.extend([item]*count)
.... return L
....
[1, 1, 1, 2, 2, 2, 3, 3, 3]



Here's a version which is short and simple enough to use in-line, but
will be slow for large lists:

x = [1,2,3]
count = 3
sum([[item]*count for item in x], [])
[1, 1, 1, 2, 2, 2, 3, 3, 3]


Finally, here's a nasty hack that you should never, ever, ever use for
any reason except to win a bet:

[locals()['_[1]'].extend([item]*(count-1)) or item for item in x]
[1, 1, 1, 2, 2, 2, 3, 3, 3]
 
A

Alf P. Steinbach

* Paul Rudin:
Sebastian said:
I have an array x=[1,2,3]

In python such an object is called a "list".

(In cpython it's implemented as an automatically resizable array.)

I don't think the OP's terminology needs correction.

A Python "list" is an array functionality-wise.

If one isn't observant of that fact then one ends up with O(n^2) time for the
simplest things.

Using the term "array" accentuates and clarifies this most important aspect.

Using the misleading term "list", even if that's the type's name in Python,
hides this most important aspect, and so is not, IMHO, a Good Idea except where
it really matters that it's a 'list' array as opposed to, say, a 'tuple' array.

Is there an operator which I can use to get the result
[1,1,1,2,2,2,3,3,3] ?

There's no operator that will give you that directly - but there are
plenty of one-liners that will yield that list.
e.g:
list(itertools.chain(*([x]*3 for x in [1,2,3])))
[1, 1, 1, 2, 2, 2, 3, 3, 3]

And I think it's worth noting that, for the general case, this notation is also
hiding a gross inefficiency, first constructing sub-arrays and then copying them
and joining them.

It doesn't even buy clarity.

So, just
.... a = []
.... for item in s:
.... for i in range( n ):
.... a.append( item )
.... return a
....
>>> repeat_items_in( [1, 2, 3], 3 ) [1, 1, 1, 2, 2, 2, 3, 3, 3]
>>> _

And if one absolutely feels like trading some efficiency and clarity for some
more functional-programming expression like thing (I don't understand why people
desire that!), just replace the 'append' line with a 'yield' and then write

list( repeat_items_in( [1, 2, 3], 3 ) )

Re the thing I don't understand: it's the same in C++, people using hours on
figuring out how to do something very simple in an ungrokkable indirect and
"compiled" way using template metaprogramming stuff, when they could just write
a simple 'for' loop and be done with in, say, 3 seconds, and much clearer too!


Cheers,

- Alf
 
S

Steven D'Aprano

* Paul Rudin:
Sebastian said:
I have an array x=[1,2,3]

In python such an object is called a "list".

(In cpython it's implemented as an automatically resizable array.)

I don't think the OP's terminology needs correction.

A Python "list" is an array functionality-wise.

If one isn't observant of that fact then one ends up with O(n^2) time
for the simplest things.

Well that's certainly not true. Some operations may be O(N**2), but
others are not: list.append() is amortized O(N) and for individual
appends, may be can be as fast as O(1).

Using the term "array" accentuates and clarifies this most important
aspect.

But Python lists are designed to behave as lists. Just because CPython
implements them using arrays doesn't make them arrays. Other Python
implementations might use other implementations...

If the creator of CLPython is out there, perhaps might like to comment on
whether he uses Lisp linked-lists for the Python list type?

Using the misleading term "list", even if that's the type's name in
Python, hides this most important aspect, and so is not, IMHO, a Good
Idea except where it really matters that it's a 'list' array as opposed
to, say, a 'tuple' array.

Or an "array" array.

<type 'array.array'>


Is there an operator which I can use to get the result
[1,1,1,2,2,2,3,3,3] ?

There's no operator that will give you that directly - but there are
plenty of one-liners that will yield that list. e.g:
list(itertools.chain(*([x]*3 for x in [1,2,3])))
[1, 1, 1, 2, 2, 2, 3, 3, 3]

And I think it's worth noting that, for the general case, this notation
is also hiding a gross inefficiency, first constructing sub-arrays and
then copying them and joining them.

I wouldn't call that a gross inefficiency -- that's a gross exaggeration
unless count is very large, and even then, possibly not that large. Only
one such sub-array (sub-list) exists at any time. (Unless I am grossly
misinformed.)


It doesn't even buy clarity.

Not if you're unused to the functional, iterator-based idioms, no.

But if you are, it does.


And if one absolutely feels like trading some efficiency and clarity for
some more functional-programming expression like thing (I don't
understand why people desire that!),

I don't understand why you assume that functional forms are necessarily
less efficient and clear than non-functional. Which is easier to read?
6

versus
total = 0
for i in [1, 2, 3]:
.... total += i
....
6


[...]
Re the thing I don't understand: it's the same in C++, people using
hours on figuring out how to do something very simple in an ungrokkable
indirect and "compiled" way using template metaprogramming stuff, when
they could just write a simple 'for' loop and be done with in, say, 3
seconds, and much clearer too!

Amen to that brother!

It's the obsession with one-liners and the desire for a single built-in
command to do every imaginable task.
 
S

Sebastian

Thank you for your answers! I actually implemented it using for loops
before I posted here, but I was curious if there is a more elegant
solution (judging from the post, Alf will probably say, that for loops
are already elegant).

Sebastian
 
M

Munir

I have an array  x=[1,2,3]
Is there an operator which I can use to get the result
[1,1,1,2,2,2,3,3,3] ?

I tried x*3, which resulted in [1,2,3,1,2,3,1,2,3]

Have you tried:

y = x*3
y.sort()

Munir
 
P

Peter Otten

Alf said:
Re the thing I don't understand: it's the same in C++, people using hours
on figuring out how to do something very simple in an ungrokkable indirect
and "compiled" way using template metaprogramming stuff, when they could
just write a simple 'for' loop and be done with in, say, 3 seconds, and
much clearer too!

Most of that stuff doesn't end in code meant to do anything important. It's
more like gymnastics that helps you keep your mind in shape.
Or so I would hope.
items = [1, 2, 3]
result = 3*len(items)*[None]
result[::3] = result[1::3] = result[2::3] = items
result
[1, 1, 1, 2, 2, 2, 3, 3, 3]
[1, 1, 1, 2, 2, 2, 3, 3, 3]

;)

Peter
 
A

Alf P. Steinbach

* Steven D'Aprano:
* Paul Rudin:
I have an array x=[1,2,3]
In python such an object is called a "list".

(In cpython it's implemented as an automatically resizable array.)
I don't think the OP's terminology needs correction.

A Python "list" is an array functionality-wise.

If one isn't observant of that fact then one ends up with O(n^2) time
for the simplest things.

Well that's certainly not true. Some operations may be O(N**2), but
others are not: list.append() is amortized O(N) and for individual
appends, may be can be as fast as O(1).

The second sentence may or may not be true. I don't know of any fundamental
'list' operations that have quadratic time. Is there?

The first sentence is just baffling -- what on Earth is the "that" that you
think is not true?

OK, I can guess (correct me if I'm guessing wrong, please): you think I'm
talking about elementary operations. I'm not. I'm talking about algorithmic
complexity for loops doing e.g. insertions.

But Python lists are designed to behave as lists.

No, I'm sorry, they're not.

A Python 'list' has de facto constant time indexing, or "random access".

A linked list -- what the informal "list" means in programming -- does not
have constant time indexing.

A linked list has constant time insertion.

A Python 'list' has de facto linear time insertion (except when used as cursor
gap array, of course, or e.g. implementing a linked list on top, such things).

So in short, a Python 'list' has all the properties of arrays, and none of the
properties of linked lists.

Just because CPython
implements them using arrays doesn't make them arrays. Other Python
implementations might use other implementations...

No, I'm sorry, not without screwing up existing Python programs. Indexing is
assumed as constant time in every CPython program. That is, in your own words,
but here correct, that's "certainly not true". ;-)

No (sensible) programming language allows a language implementation to change
the running time of common loops from linear to quadratic.

It would be decidedly un-pythonic. ;-)

If the creator of CLPython is out there, perhaps might like to comment on
whether he uses Lisp linked-lists for the Python list type?

If he does then you're talking about a different language than the one that
CPython implements: constant time indexing is very different from linear time.
It doesn't matter if some bananas are called oranges. They don't turn into
oranges no matter what they're called.

Or an "array" array.

For example, yes. These different kinds of arrays have different restrictions:
can't be used as dictionary key, can't be modified, has fixed item type. And
when talking about such characteristics the type name can be relevant.

<type 'array.array'>


Is there an operator which I can use to get the result
[1,1,1,2,2,2,3,3,3] ?
There's no operator that will give you that directly - but there are
plenty of one-liners that will yield that list. e.g:

list(itertools.chain(*([x]*3 for x in [1,2,3])))
[1, 1, 1, 2, 2, 2, 3, 3, 3]
And I think it's worth noting that, for the general case, this notation
is also hiding a gross inefficiency, first constructing sub-arrays and
then copying them and joining them.

I wouldn't call that a gross inefficiency -- that's a gross exaggeration
unless count is very large, and even then, possibly not that large. Only
one such sub-array (sub-list) exists at any time. (Unless I am grossly
misinformed.)

I'm sorry but to the best of my knowledge you're misinformed.

Unless there's some pretty advanced lazy evaluation involved the * operator has
to collect the subarrays into an array formal argument for the 'chain' routine.

And at that point they all exist at the same time.

.... print( type( poff ) )
.... print( poff )
....
>>> a = [1, 2, 3]
>>> knurre( *(3*[x] for x in a) )

It doesn't even buy clarity.

Not if you're unused to the functional, iterator-based idioms, no.

But if you are, it does.

He he -- see above, with 99.x certainty you *misunderstood* the code.

That's *not* clear code.

That's, hereby (almost) proven :), code that makes even experienced programmers
misunderstand what's going on!

And if one absolutely feels like trading some efficiency and clarity for
some more functional-programming expression like thing (I don't
understand why people desire that!),

I don't understand why you assume that functional forms are necessarily
less efficient and clear than non-functional. Which is easier to read?
print sum([1,2,3])
6

versus
total = 0
for i in [1, 2, 3]:
... total += i
... 6

No, here I very much agree with you: the first is bestest. :)

I like abstraction.

But not for its own sake: there has to be some increase in clarity, some details
that the abstraction removes from consideration, instead of introducing
additional details for consideration!

[...]
Re the thing I don't understand: it's the same in C++, people using
hours on figuring out how to do something very simple in an ungrokkable
indirect and "compiled" way using template metaprogramming stuff, when
they could just write a simple 'for' loop and be done with in, say, 3
seconds, and much clearer too!

Amen to that brother!

It's the obsession with one-liners and the desire for a single built-in
command to do every imaginable task.

He he... :)


Cheers,

- Alf
 
C

Chris Rebert

* Steven D'Aprano:
* Paul Rudin:


I have an array  x=[1,2,3]

In python such an object is called a "list".

(In cpython it's implemented as an automatically resizable array.)

I don't think the OP's terminology needs correction.

A Python "list" is an array functionality-wise.

If one isn't observant of that fact then one ends up with O(n^2) time
for the simplest things.

Well that's certainly not true. Some operations may be O(N**2), but others
are not: list.append() is amortized O(N) and for individual appends, may be
can be as fast as O(1).

The second sentence may or may not be true. I don't know of any fundamental
'list' operations that have quadratic time. Is there?

The first sentence is just baffling  --  what on Earth is the "that" that
you think is not true?

OK, I can guess (correct me if I'm guessing wrong, please): you think I'm
talking about elementary operations. I'm not. I'm talking about algorithmic
complexity for loops doing e.g. insertions.

But Python lists are designed to behave as lists.

No, I'm sorry, they're not.

A Python 'list' has de facto constant time indexing, or "random access".

A linked list  --  what the informal "list" means in programming

Eh, it's a bit context-dependent. The abstract data type definition is
a superset that includes both linked lists and dynamic arrays. FWIW,
Java likewise uses "list" in its ADT sense.

Cheers,
Chris
 
A

Alf P. Steinbach

* Chris Rebert:
* Steven D'Aprano:
On Mon, 11 Jan 2010 08:56:36 +0100, Alf P. Steinbach wrote:

* Paul Rudin:

I have an array x=[1,2,3]
In python such an object is called a "list".

(In cpython it's implemented as an automatically resizable array.)
I don't think the OP's terminology needs correction.

A Python "list" is an array functionality-wise.

If one isn't observant of that fact then one ends up with O(n^2) time
for the simplest things.
Well that's certainly not true. Some operations may be O(N**2), but others
are not: list.append() is amortized O(N) and for individual appends, may be
can be as fast as O(1).
The second sentence may or may not be true. I don't know of any fundamental
'list' operations that have quadratic time. Is there?

The first sentence is just baffling -- what on Earth is the "that" that
you think is not true?

OK, I can guess (correct me if I'm guessing wrong, please): you think I'm
talking about elementary operations. I'm not. I'm talking about algorithmic
complexity for loops doing e.g. insertions.

Using the term "array" accentuates and clarifies this most important
aspect.
But Python lists are designed to behave as lists.
No, I'm sorry, they're not.

A Python 'list' has de facto constant time indexing, or "random access".

A linked list -- what the informal "list" means in programming

Eh, it's a bit context-dependent. The abstract data type definition is
a superset that includes both linked lists and dynamic arrays.

Assuming you're talking about some abstract type definition that's in some PEP
somewhere (there's no abstract data type in the language specification, it's all
informal) then that's a deficiency of the specification, since the type is
designed around indexing operations. Perhaps the designers thought it would be
"obvious", that no-one could mistake it for other than what it is? Anyway, that
doesn't make it context-dependent: if true, it only makes it poorly specified.

FWIW, Java likewise uses "list" in its ADT sense.

I'm sorry, I'm unfamiliar with that Java terminology (but then, reportedly, many
Java programmers think that Java has "pass by reference", so nothing coming from
that direction will surprise me very much!). The Java language specification has
a section about arrays, none about lists AFAICS. Do you have a reference?


Cheers & hth.,

- Alf
 
C

Chris Rebert

* Chris Rebert:

Assuming you're talking about some abstract type definition that's in some
PEP somewhere

No, I mean the computer science definition/term:
http://en.wikipedia.org/wiki/List_(computer_science)
I'm sorry, I'm unfamiliar with that Java terminology (but then, reportedly,
many Java programmers think that Java has "pass by reference", so nothing
coming from that direction will surprise me very much!). The Java language
specification has a section about arrays, none about lists AFAICS. Do you
have a reference?

http://java.sun.com/j2se/1.4.2/docs/api/java/util/List.html

Cheers,
Chris
 
A

Alf P. Steinbach

* Chris Rebert:
No, I mean the computer science definition/term:
http://en.wikipedia.org/wiki/List_(computer_science)

Note that the default meaning is a list with the characteristics of a linked list.

The abstract data type specified in that article is a bit more restricted than
the usual meaning of list -- as the article notes, the ADT it presents is
equivalent to an abstract stack, and it's essentially the Lisp view of lists,
not only just linked list but a restricted view of linked lists. To understand
it you have to keep in mind that such an ADT is a simplification, for the
purpose of specifying logical functionality and nothing else. The algorithmic
efficiency is simply not specified, but is implied.

Unfortunately, as noted there, the article doesn't cite any references or sources...

Here's one:

http://www.itl.nist.gov/div897/sqg/dads/HTML/list.html


Thanks. I didn't know that. It's a very dirty hodgepodge interface with
*optional* methods that throw exceptions if not implemented and apparently no
constraints on algorithmic complexity for various methods. As such, it's a very
good example why 'list' should not be conflated with 'array'. It leads to such
monstrosities where neither correctness nor any kind of efficiency is guaranteed
:)

Cheers, & thanks,


- Alf

PS: Please do not mail me copies of your replies to the group. A mail copy means
that I may have to answer your article twice, as in this case.
 
S

Steven D'Aprano

The second sentence may or may not be true. I don't know of any
fundamental 'list' operations that have quadratic time. Is there?

I should hope not, but you seem to indicate that there are, or could be
-- you were the one who made the claim of O(N**2).

If you're talking about functions someone might write themselves, well
there is no limit on the awfulness of that. No doubt you're aware of
Bogosort, which has performance O(N!) instead of O(N*log N). Or the
(sadly very common) "Polish Painter Algorithm".

With a bit of work, one might even come up with a sometimes-O(N**2)
algorithm based on dicts:


def increment(d, key):
"""Add one to the value of dict d[key]"""
for k in d.keys():
value = d.pop(key)
if k == key:
value = value+1
d[key] = value

If all the keys hash to the same values, CPython dict lookup degenerates
to O(N) due to the key collisions. If you then remove each key from the
internal linked list, and re-add it to the end, then you might end up
with something as bad as O(N**2).

Of course the above is a stupidly bad function, but it's not as bad as
some things I've seen in real life.

The first sentence is just baffling -- what on Earth is the "that"
that you think is not true?

OK, I can guess (correct me if I'm guessing wrong, please): you think
I'm talking about elementary operations. I'm not. I'm talking about
algorithmic complexity for loops doing e.g. insertions.

I'm sorry that I baffled you, but I was responding to your implication
that failure to realise that Python lists are implemented as arrays will
likely lead to people inadvertently writing O(N**2) algorithms when they
need not.

Of course you are correct in the sense that there is no limit to the
awfulness to code that can be written by somebody sufficiently
incompetent, ignorant, inexperienced or unlucky. But as far as typical
behaviour goes, I do not believe that the risks of writing such O(N**2)
algorithms is any higher than it otherwise would be if they were called
arrays. In fact, calling them "arrays" might give some people the false
impression that every single append requires an expensive resize
operation, which is not the case.

I will give you one thing: if you assume that Python lists are linked
lists, then you might expect list.insert(0, x) to be O(1) and list to
be O(N). But in fact they are O(N) and O(1) respectively. Since indexing
is typically far more common than insertion, this is a good thing.

But since Python lists are quite rich, and already provide methods for
things like reversal and sorting, I just don't see that the risk of
writing poorly performing functions is any greater because they are
called "lists" rather than "arrays". Particularly if you treat them as an
unspecified implementation of abstract lists, rather than assuming they
are specifically linked-lists.


No, I'm sorry, they're not.

A Python 'list' has de facto constant time indexing, or "random access".

A linked list -- what the informal "list" means in programming --
does not have constant time indexing.

A linked list is a specific implementation of an even more general
abstract data type, the list. This is why people specify "linked list"
rather than "list". Abstract types are never defined in terms of
performance, because of course you can't say what the performance of an
abstract thing is, only of functional behaviour.

Both arrays and linked lists (as well as hash tables, as in Lua, or
trees) can be used to implement an abstract list type, and they have
different performance characteristics. A list in this sense is defined by
behaviour (fundamental operations provided), not performance
characteristics.


[...]
No, I'm sorry, not without screwing up existing Python programs.

That's the choice that the implementer makes when doing something,
anything, different from CPython.

Indexing is assumed as constant time in every CPython program.

*Every* program? Really? I've written many programs that don't assume
anything about indexing except that it is "fast enough".

In the same way people are often indifferent between hash tables with
constant-time lookup and binary trees with O(log N) lookup.


That is,
in your own words, but here correct, that's "certainly not true". ;-)

No (sensible) programming language allows a language implementation to
change the running time of common loops from linear to quadratic.

Unless the Python reference manually specifically guarantees such
performance characteristics, it is entirely an implementation choice
whether to optimize for insertions or indexing or something else all
together.

Of course in practice CPython is the proverbial 300lb gorilla in the
room, and people's expectations will be shaped by it, so many
implementers will likely try to match CPython's performance
characteristics.


It would be decidedly un-pythonic. ;-)



If he does then you're talking about a different language than the one
that CPython implements: constant time indexing is very different from
linear time. It doesn't matter if some bananas are called oranges. They
don't turn into oranges no matter what they're called.

But both bananas and oranges are fruit, and if the menu says "Fruit of
the Day", the chef is free to choose between oranges or bananas. Unless
the menu promises that the fruit will be round, juicy, made up of
segments, rich in Vitamin C and low in potassium, there's nothing wrong
with supplying bananas.

Likewise, both linked-lists and arrays are suitable for implementing
abstract lists, and unless the language reference promises CPython's
performance characteristics the implementer is free to vary them.



[...]
list(itertools.chain(*([x]*3 for x in [1,2,3])))
[1, 1, 1, 2, 2, 2, 3, 3, 3]
And I think it's worth noting that, for the general case, this
notation is also hiding a gross inefficiency, first constructing
sub-arrays and then copying them and joining them.

I wouldn't call that a gross inefficiency -- that's a gross
exaggeration unless count is very large, and even then, possibly not
that large. Only one such sub-array (sub-list) exists at any time.
(Unless I am grossly misinformed.)

I'm sorry but to the best of my knowledge you're misinformed.

Unless there's some pretty advanced lazy evaluation involved the *
operator has to collect the subarrays into an array formal argument for
the 'chain' routine.

And at that point they all exist at the same time.

I think you are correct, to be honest I didn't notice the * operator and
was thinking that chain was being called on an iterator argument, not
collected into a tuple. Oh well, that's the downside of using operators
instead of named functions :/
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,744
Messages
2,569,483
Members
44,901
Latest member
Noble71S45

Latest Threads

Top