Bash-like brace expansion

Peter Waller · Mar 24, 2009

Okay, I got fed up with there not being any (obvious) good examples of
how to do bash-like brace expansion in Python, so I wrote it myself.
Here it is for all to enjoy!

If anyone has any better solutions or any other examples of how to do
this, I'd be glad to hear from them.

#~ BraceExpand.py - Bash-like brace expansion in Python
#~ Copyright (C) 2009 <[email protected]>

#~ This program is free software: you can redistribute it and/or modify
#~ it under the terms of the GNU Affero General Public License as
#~ published by the Free Software Foundation, either version 3 of the
#~ License, or (at your option) any later version.

#~ This program is distributed in the hope that it will be useful,
#~ but WITHOUT ANY WARRANTY; without even the implied warranty of
#~ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
#~ GNU Affero General Public License for more details.

#~ You should have received a copy of the GNU Affero General Public License
#~ along with this program. If not, see <http://www.gnu.org/licenses/>.

import re
from itertools import imap

class NoBraces(Exception): pass

def BraceExpand(text, ordering = []):
    """Recursively brace expand text in a bash-like fashion.

    See 'man bash' under the heading 'Brace Expansion'

    Example: "/data/file{1..10}.root" expands to file1.root ... file9.root
    (note that, unlike bash, this code excludes the upper bound of a range)

    Ordering allows changing the order of iteration. The rules for this are
    a little complicated. Ordering is a list of boolean values.
    If it is possible to expand the postamble, the first value from the list
    is removed, and used to determine whether the postamble should be iterated
    over before iterating over the amble. The list is then passed down
    recursively to the next Brace.

    What does this code do?

    It is simpler than it looks.

    There are three main steps:
      1) Split the string into three parts, consisting of pre-amble, amble and
         post-amble. (This requires keeping track of nested {}'s.)
      2) In the amble, figure out which commas are not stuck in {}'s, and
         split the string by those commas.
      3) For each part of this split string, add together the pre-amble, the
         string and the post-amble. Call BraceExpand() on this string to deal
         with nested braces and any expansion required in the postamble.

    Other things this code does which make it look more complicated:
      * Expand ranges along the lines of {1..10}
      * Allow for re-arranging the order of iteration

    Todo/Not Implemented:
      * Escaping/quoting

    Example C implementation from bash (inspiration for this):
      http://www.oldlinux.org/Linux.old/bin/old/bash-1.11/braces.c
    """

    def FindMatchedBraces(text, position = -1):
        "Search for nested start and end brace in text starting from position"
        braceDepth = 0
        nextClose = -1
        first = True

        # Search for a {
        # is it closer than the nearest } ?
        #   Yes : increase brace depth
        #   No  : decrease brace depth
        # When we reach braceDepth == 0, we have found our close brace
        while braceDepth or first:

            nextOpen = text.find("{", position+1)
            nextClose = text.find("}", position+1)

            if first and nextOpen >= 0:
                startBrace = nextOpen
                first = False

            if nextOpen < nextClose and nextOpen >= 0:
                braceDepth += 1
                position = nextOpen
            elif nextClose >= 0:
                braceDepth -= 1
                position = nextClose
            else:
                raise NoBraces()

        return startBrace, position

    try: start, end = FindMatchedBraces(text)
    except NoBraces:
        # There are no braces! Nothing to expand!
        return [text]

    # Split the text into three bits, '{pre,,post}amble'. The 'pre' is anything
    # before expansion, the '' is the bit that needs expanding and gluing to
    # the pre and the post. After gluing together, we can recursively expand
    # again
    preamble = text[:start]
    amble = text[start+1:end]
    postamble = text[end+1:]

    def BareCommaSearcher(amble):
        "Search for commas which are not encapsulated in {}"

        haveBraces = True
        try: start, end = FindMatchedBraces(amble)
        except NoBraces: haveBraces = False

        position = -1
        while True:
            position = amble.find(",", position+1)

            if position < 0:
                # We didn't find any comma after 'position', finish searching.
                break

            if haveBraces and start < position < end:
                # We're inside some braces, skip to the end of them, find the
                # next set.
                position = end

                haveBraces = True
                try: start, end = FindMatchedBraces(amble, position)
                except NoBraces: haveBraces = False

                continue

            yield position

        # Reached the end of the string. (Conveniently "text"[pos:None] is the
        # same as "text"[pos:])
        yield None

    parts = []
    # The dots are escaped so that only a literal ".." separates the bounds
    rangeMatcher = re.compile(r"^(\d+)\.\.(\d+)$")

    # Find each segment of the amble, splitting by 'bare' commas
    position = 0
    for commaPos in BareCommaSearcher(amble):
        # Get a bit between bare commas
        part = amble[position:commaPos]

        # Found a matched range, expand it!
        matchedRange = rangeMatcher.match(part)
        if matchedRange:
            matchedIndices = map(int, matchedRange.groups())
            parts.extend(imap(str,xrange(*matchedIndices)))
        else:
            parts.append(part)

        if commaPos is None:
            # Avoid None + 1 at the end of iteration. (Testing against None
            # rather than truthiness, since a comma at index 0 is valid.)
            break

        position = commaPos + 1

    # Is the postamble expandable? (Does it contain braces?)
    postambleExpandable = True
    try: start, end = FindMatchedBraces(postamble)
    except NoBraces: postambleExpandable = False

    # Check to see if we should iterate over postamble first, also sort out
    # iteration ordering
    postambleFirst = ordering
    thisPostambleFirst = False
    if postambleExpandable and postambleFirst:
        postambleFirst = list(postambleFirst)
        thisPostambleFirst = postambleFirst.pop(0)

    result = []

    if postambleExpandable and thisPostambleFirst:
        # Iterate over expanded postamble first
        for postpart in BraceExpand(postamble):
            for part in parts:
                part = "".join([preamble, part, postpart])
                result.extend(BraceExpand(part, postambleFirst))

    else:
        for part in parts:
            part = "".join([preamble, part, postamble])
            result.extend(BraceExpand(part, postambleFirst))

    return result

if __name__ == "__main__":
    from pprint import pprint
    pprint(BraceExpand("electron_{n,{pt,eta,phi}[{1,2}]}", ordering = [True]))

    #~ pprint(BraceExpand("Myfile{1,3..10}.root"))

    #~ pprint(BraceExpand("{pre,,post}amble"))

    #~ pprint(BraceExpand("amble{a,b,}}"))
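
The posted code targets Python 2 (it uses imap and xrange). For comparison, here is a minimal Python 3 sketch of the same three steps from the docstring: split into pre-amble, amble and post-amble, split the amble on bare commas, then glue and recurse. The names find_group, split_bare_commas and brace_expand are hypothetical, the ordering feature is left out, and ranges are assumed to be inclusive as in bash, which differs from the posted code. Escaping and quoting are not handled here either.

```python
import re

# Hypothetical helper names; a compact sketch, not the code from the post.
_RANGE = re.compile(r"^(\d+)\.\.(\d+)$")


def find_group(text):
    """Return (start, end) of the first balanced {...} group, or None."""
    depth = 0
    start = -1
    for i, ch in enumerate(text):
        if ch == "{":
            if depth == 0:
                start = i
            depth += 1
        elif ch == "}" and depth > 0:
            depth -= 1
            if depth == 0:
                return start, i
    return None


def split_bare_commas(amble):
    """Split on commas that are not nested inside braces."""
    parts, depth, current = [], 0, []
    for ch in amble:
        if ch == "," and depth == 0:
            parts.append("".join(current))
            current = []
            continue
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
        current.append(ch)
    parts.append("".join(current))
    return parts


def brace_expand(text):
    """Expand braces recursively; ranges are inclusive (an assumption)."""
    group = find_group(text)
    if group is None:
        return [text]
    start, end = group
    preamble, amble, postamble = text[:start], text[start + 1:end], text[end + 1:]
    results = []
    for part in split_bare_commas(amble):
        match = _RANGE.match(part)
        if match:
            low, high = map(int, match.groups())
            step = 1 if low <= high else -1
            choices = [str(n) for n in range(low, high + step, step)]
        else:
            choices = [part]
        for choice in choices:
            # Recursing on the glued string handles nesting and the postamble.
            results.extend(brace_expand(preamble + choice + postamble))
    return results


print(brace_expand("electron_{n,{pt,eta,phi}[{1,2}]}"))
```

Unlike bash, this sketch also expands a single-item group such as {a}, and it gives up on a string whose first open brace is never closed.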

Peter Waller · Mar 24, 2009

Okay, yuck. I didn't realise that posting would mangle the code so
badly. Is there any better way to attach code? I'm using Google
Groups.

bearophileHUGS · Mar 24, 2009

Peter Waller:

Is there any better way to attach code?

This is a widely used place (but read the "contract"/disclaimer
first):
http://code.activestate.com/recipes/langs/python/

Bye,
bearophile

Tino Wildenhain · Mar 24, 2009

Peter said:
Okay, I got fed up with there not being any (obvious) good examples of
how to do bash-like brace expansion in Python, so I wrote it myself.
Here it is for all to enjoy!

If anyone has any better solutions or any other examples of how to do
this, I'd be glad to hear from them.

It may be a funny experiment but I really fail to see the value in your
proposal.

The simple {foo} expansion you mention should be quite easily handled
with re.sub and a function as argument, so it takes not much more than
a few lines of code.
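
As a rough illustration of the re.sub approach, here is a minimal sketch that reads {foo} as a named placeholder to be looked up in a dictionary. That reading is an assumption, and it is a different problem from bash's {a,b} alternation. The substitute function and the values mapping are hypothetical names.

```python
import re

# Hypothetical lookup table; the names and values are illustrative only.
values = {"foo": "spam", "bar": "eggs"}


def substitute(template, mapping):
    """Replace each {name} with mapping[name]; unknown names are left as-is."""
    def replace(match):
        name = match.group(1)
        # match.group(0) is the whole "{name}" text, kept when name is unknown.
        return mapping.get(name, match.group(0))
    return re.sub(r"\{(\w+)\}", replace, template)


print(substitute("{foo} and {bar}, but not {baz}", values))
```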

It could be interesting to have {foo#bar} and {foo%bar} as well, but
again I don't think the whole thing would be very useful anyway, given
that the %(foo)s form works quite well and has a host of options (for
aligning, for example).
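
For reference, a small sketch of the %(name)s form with alignment options. The record dictionary and its field names are made up for illustration.

```python
# Hypothetical record; the keys and values are illustrative only.
record = {"name": "pt", "value": 3.14159}

# "-8s" left-aligns the string in 8 columns; "8.2f" right-aligns the
# number in 8 columns with 2 decimal places.
line = "%(name)-8s|%(value)8.2f" % record
print(line)
```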

Cheers
Tino

Peter Waller · Mar 24, 2009

Heh, thanks.

Unit tests did cross my mind. I was kicking myself for not starting
out with them: there were several regressions during development, and
there could well still be lurking corner cases.

I've since heard that a 'better way' would be to use pyparsing. Also,
I saw that python has dropped the idea of having recursive regular
expressions at the moment.

http://bugs.python.org/msg83993

Maybe I might re-implement this with pyparsing and some unit tests.

Paul McGuire · Mar 24, 2009

Maybe I might re-implement this with pyparsing and some unit tests.

In your pyparsing efforts, you might draw some insights from this
regex inverter (that is, given an re such as "[AB]\d", returns "A0"
through "B9") on the pyparsing wiki: http://pyparsing.wikispaces.com/file/view/invRegex.py.
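
The core idea of such an inverter can be sketched without pyparsing: once each position of the pattern is reduced to its set of candidate characters, the matching strings are the Cartesian product of those sets. The sketch below hard-codes the sets for "[AB]\d" instead of parsing the pattern, so it only illustrates the enumeration step and is not the code from the wiki.

```python
import re
import string
from itertools import product

# Hand-written stand-in for the parsed form of the regex "[AB]\d":
# one string of candidate characters per position. A real inverter
# would derive this from the pattern; here it is simply assumed.
positions = ["AB", string.digits]

matches = ["".join(chars) for chars in product(*positions)]

# Sanity check: every generated string matches the original pattern.
assert all(re.fullmatch(r"[AB]\d", m) for m in matches)
print(matches[0], matches[-1], len(matches))
```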

-- Paul

Peter Waller · Mar 26, 2009

I've since heard that a 'better way' would be to use pyparsing. Also,
I saw that python has dropped the idea of having recursive regular
expressions at the moment.

I've written a pyparsing parser which is very much shorter and nicer
than the original parser. I have not yet figured out how to process
the result though; I'm having a hard time dealing with the recursion
in my brain. Here it is anyway, in case someone smarter than me can
figure out how to turn the output of this into the relevant list of
expansions.

from pyparsing import (Suppress, Optional, CharsNotIn, Forward, Group,
                       ZeroOrMore, delimitedList)

lbrack, rbrack = map(Suppress, "{}")

optAnything = Group(Optional(CharsNotIn("{},")))

braceExpr = Forward()

listItem = optAnything ^ Group(braceExpr)

bracketedList = Group(lbrack + delimitedList(listItem) + rbrack)

braceExpr << optAnything + ZeroOrMore(bracketedList + optAnything)

result = braceExpr.parseString(text).asList()
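
One possible way to turn a parse tree into the list of expansions is a short recursive function built on itertools.product. The sketch below does not consume the asList() output of the grammar above. It assumes a simpler, hypothetical tree shape (a str is a literal, a list is a sequence, a tuple is a set of alternatives), so the parser's nested lists would first need to be mapped onto that shape.

```python
from itertools import product


def expand(node):
    """Return all strings described by a node.

    Assumed (hypothetical) tree shape, not the pyparsing output itself:
      str   -> a literal
      list  -> a sequence whose pieces are concatenated
      tuple -> a set of alternatives from a {a,b,...} group
    """
    if isinstance(node, str):
        return [node]
    if isinstance(node, tuple):
        # Alternatives: collect the expansions of each choice in order.
        return [s for choice in node for s in expand(choice)]
    # Sequence: every combination of the pieces, leftmost varying slowest.
    return ["".join(combo) for combo in product(*(expand(piece) for piece in node))]


# Hand-built tree for "electron_{n,{pt,eta,phi}[{1,2}]}"
tree = ["electron_", ("n", [("pt", "eta", "phi"), "[", ("1", "2"), "]"])]
print(expand(tree))
```

The recursion only ever has two cases to combine: alternatives concatenate their result lists, and sequences take the product of them.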
