String formatting with the format string syntax

  • Thread starter Andre Alexander Bell

Andre Alexander Bell

Hello,

I'm used to writing something like this in Python:

 >>> s = 'some text that says: %(hello)s'

and then having a dictionary like

 >>> english = { 'hello': 'hello' }

and getting the formatted output like this:

 >>> s % english

Occasionally I want to extract the field names from the template string.
I used to write a class like

class Extractor(object):
    def __init__(self):
        self.keys = []
    def __getitem__(self, key):
        self.keys.append(key)
        return ''

and use it like this:

 >>> e = Extractor()
 >>> res = s % e
 >>> e.keys
['hello']

Now Python has the format method for string formatting, with more
advanced handling. So I could just as well write

 >>> s = 'some text that says: {hello!s}'
 >>> s.format(hello='hello')

My question is: if I have a string template which uses the newer
format string syntax, how do I best extract the field information?

I found the str._formatter_parser() method which I could use like this:

keys = []
for (a, key, c, d) in s._formatter_parser():
    if key:
        keys.append(key)

Is there a more elegant solution?
What are a, c, d?
Where can I find additional information on this method?
Should one use a method whose name starts with an _?
Couldn't it change at any time?

Thanks for any help


Andre
 
Miki

You can use ** syntax:

 >>> english = {'hello': 'hello'}
 >>> s.format(**english)

Andre said:
Hello,

I'm used to writing something like this in Python:

 >>> s = 'some text that says: %(hello)s'

and then have a dictionary like

 >>> english = { 'hello': 'hello' }

and get the formatted output like this:

 >>> s % english

Occasionally I want to extract the field names from the template string.
I used to write a class like

class Extractor(object):
     def __init__(self):
         self.keys = []
     def __getitem__(self, key):
         self.keys.append(key)
         return ''

and use it like this:

 >>> e = Extractor()
 >>> res = s % e
 >>> e.keys
['hello']

Now Python has the format method for string formatting, with more
advanced handling. So I could just as well write

 >>> s = 'some text that says: {hello!s}'
 >>> s.format(hello='hello')

My question is: if I have a string template which uses the newer
format string syntax, how do I best extract the field information?

I found the str._formatter_parser() method which I could use like this:

keys = []
for (a, key, c, d) in s._formatter_parser():
     if key:
         keys.append(key)

Is there a more elegant solution?
What are a, c, d?
Where can I find additional information on this method?
Should one use a method whose name starts with an _?
Couldn't it change at any time?

Thanks for any help

Andre
 
Thomas Jollans

Miki said:
You can use ** syntax:

 >>> english = {'hello': 'hello'}
 >>> s.format(**english)

No, you can't. This only works with dicts, not with arbitrary mappings, or
dict subclasses that try to do some kind of funny stuff.
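
For what it's worth: as far as I know, `**` does accept other mappings, but it copies their items into a plain dict before `format` sees them, so a custom `__getitem__` like the one in Extractor is never called. On Python 3.2 and later, `str.format_map()` uses the mapping directly instead. A minimal sketch, assuming a hypothetical `KeyRecorder` class (not from the original post) that relies on `dict.__missing__`:

```python
class KeyRecorder(dict):
    """Hypothetical helper: records every key the template asks for."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def __missing__(self, key):
        # Called by dict lookup for keys that are not present.
        self.seen.append(key)
        return ''


rec = KeyRecorder()
result = 'some text that says: {hello!s}'.format_map(rec)
print(rec.seen)
```

Returning '' only works while the template has no format spec that rejects a string (for example `{n:d}`), so this is a sketch of the idea rather than a general extractor.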
 
Andre Alexander Bell

Miki said:
You can use ** syntax:

Thanks for your answer. Actually your answer tells me that my example
was misleading. Consider the template

s = 'A template with {variable1} and {variable2} placeholders.'

I'm seeking a way to extract the named placeholders, i.e. the names
'variable1' and 'variable2' from the template. I'm not trying to put in
values for them.

I hope this is clearer.

Thanks again


Andre
 
Peter Otten

 >>> s = 'A template with {variable1} and {variable2} placeholders.'
 >>> [name for _, name, _, _ in s._formatter_parser() if name is not None]
['variable1', 'variable2']

Peter
 
Peter Otten

Peter said:
 >>> s = 'A template with {variable1} and {variable2} placeholders.'
 >>> [name for _, name, _, _ in s._formatter_parser() if name is not None]
['variable1', 'variable2']

Caveat: the format spec may contain names, too.
Here's an attempt to take that into account:

def extract_names(t, recurse=1):
    for _, name, fmt, _ in t._formatter_parser():
        if name is not None:
            yield name
        if recurse and fmt is not None:
            for name in extract_names(fmt, recurse-1):
                yield name

 >>> t = "before {one:{two}{three}} after"
 >>> list(extract_names(t))
['one', 'two', 'three']

Don't expect correct results for illegal formats:

 >>> list(extract_names("{one:{two:{three}}}"))
['one', 'two']
 >>> "{one:{two:{three}}}".format(one=1, two=2, three=3)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: Max string recursion exceeded

Duplicate names may occur: a template that uses one three times yields
['one', 'one', 'one'].

Positional arguments are treated like names:

 >>> list(extract_names("{0} {1} {0}"))
['0', '1', '0']
 >>> list(extract_names("{} {} {}"))
['', '', '']

Peter
 
Andre Alexander Bell

Peter said:
def extract_names(t, recurse=1):
    for _, name, fmt, _ in t._formatter_parser():
        if name is not None:
            yield name
        if recurse and fmt is not None:
            for name in extract_names(fmt, recurse-1):
                yield name

Thanks Peter, I very much like this generator solution. It will work for
all situations I can currently think of.

However, one thing remains. It is based on the _formatter_parser method.
As I wrote in my original post, the leading _ suggests to me that it
had better not be used. So if using this method is completely OK, why
does it start with _, and why is it almost undocumented?
Or did I miss something, some docs somewhere?

Best regards


Andre
 
Peter Otten


Sorry, I really should have read your original post carefully/completely. It
would have spared me from finding _formatter_parser() independently...

I personally would not be too concerned about the leading underscore, but
you can use

string.Formatter().parse(template)

instead.
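
To come back to the earlier question about a, c and d: according to the `string` module documentation, `Formatter.parse()` yields `(literal_text, field_name, format_spec, conversion)` tuples. As far as I know, `str._formatter_parser()` is no longer available on `str` in Python 3.3 and later, which is another reason to prefer the public method. A small sketch with a made-up template (the template is only an illustration, not from the thread):

```python
import string

template = 'Total: {amount:>8.2f} for {name!r}.'

# Each tuple is (literal_text, field_name, format_spec, conversion).
parsed = list(string.Formatter().parse(template))
for literal_text, field_name, format_spec, conversion in parsed:
    print(repr(literal_text), repr(field_name), repr(format_spec), repr(conversion))

# parsed is:
# [('Total: ', 'amount', '>8.2f', None),
#  (' for ', 'name', '', 'r'),
#  ('.', None, None, None)]
```

The last tuple shows why the "name is not None" test is needed: trailing literal text comes back with no field attached.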

Peter
 
Andre Alexander Bell

Peter said:
I personally would not be too concerned about the leading underscore, but
you can use

string.Formatter().parse(template)

instead.

Thanks for this pointer. I like it this way. So if I now combine your
generator with your suggestion, I end up with something like this:

import string

def extract_names(t, recurse=1):
    for _, name, fmt, _ in string.Formatter().parse(t):
        if name is not None:
            yield name
        if recurse and fmt is not None:
            for name in extract_names(fmt, recurse-1):
                yield name

Pretty cool. Thanks a lot.
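
If the caveats mentioned earlier in the thread matter (duplicate names, positional fields showing up as '0' or ''), a variant along these lines might help. This is only a sketch: the function name `unique_field_names` is made up, and trimming 'user.name' or 'rows[0]' down to the argument name is an extra assumption about what is wanted.

```python
import re
import string


def unique_field_names(template, recurse=1):
    """Return the named fields of template in first-seen order, without repeats."""
    names = []
    for _, field_name, format_spec, _ in string.Formatter().parse(template):
        if field_name is not None:
            # 'user.name' or 'rows[0]' -> keep only the part that names the argument.
            base = re.split(r'[.\[]', field_name, maxsplit=1)[0]
            # Skip auto-numbered ('') and explicitly numbered ('0', '1', ...) fields.
            if base and not base.isdigit() and base not in names:
                names.append(base)
        if recurse and format_spec:
            # A format spec may itself contain replacement fields, one level deep.
            for nested in unique_field_names(format_spec, recurse - 1):
                if nested not in names:
                    names.append(nested)
    return names


print(unique_field_names('A template with {variable1} and {variable2} placeholders.'))
```

It returns a list rather than a generator so that the duplicate check stays simple.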

Andre
 
