Is it possible to get string from function?

Roy Smith · Jan 16, 2014

I realize the subject line is kind of meaningless, so let me explain

I've got some unit tests that look like:

class Foo(TestCase):
def test_t1(self):
RECEIPT = "some string"

def test_t2(self):
RECEIPT = "some other string"

def test_t3(self):
RECEIPT = "yet a third string"

and so on. It's important that the strings be mutually unique. In the
example above, it's trivial to look at them and observe that they're all
different, but in real life, the strings are about 2500 characters long,
hex-encoded. It even turns out that a couple of the strings are
identical in the first 1000 or so characters, so it's not trivial to do
by visual inspection.

So, I figured I would write a meta-test, which used introspection to
find all the methods in the class, extract the strings from them (they
are all assigned to a variable named RECEIPT), and check to make sure
they're all different.

Is it possible to do that? It is straight-forward using the inspect
module to discover the methods, but I don't see any way to find what
strings are assigned to a variable with a given name. Of course, that
assignment doesn't even happen until the function is executed, so
perhaps what I want just isn't possible?

It turns out, I solved the problem with more mundane tools:

grep 'RECEIPT = ' test.py | sort | uniq -c

and I could have also solved the problem by putting all the strings in a
dict and having the functions pull them out of there. But, I'm still
interested in exploring if there is any way to do this with
introspection, as an academic exercise.

Roy Smith · Jan 16, 2014

Ben Finney said:
That looks like a poorly defined class.

Are the test cases pretty much identical other than the data in those
strings?

No, each test is quite different. The only thing they have in common is
they all involve a string representation of a transaction receipt. I
elided the actual test code in my example above because it wasn't
relevant to my question.

Chris Angelico · Jan 16, 2014

So, I figured I would write a meta-test, which used introspection to
find all the methods in the class, extract the strings from them (they
are all assigned to a variable named RECEIPT), and check to make sure
they're all different.

In theory, it should be. You can disassemble the function and find the
assignment. Check out Lib/dis.py - or just call it and process its
output. Names of local variables are found in
test_t1.__code__.co_names, the constants themselves are in
test_1.__code__.co_consts, and then it's just a matter of matching up
which constant got assigned to the slot represented by the name
RECEIPT.

But you might be able to shortcut it enormously. You say the strings
are "about 2500 characters long, hex-encoded". What are the chances of
having another constant, somewhere in the test function, that also
happens to be roughly that long and hex-encoded? If the answer is
"practically zero", then skip the code, skip co_names, and just look
through co_consts.

class TestCase:
pass # not running this in the full environment

class Foo(TestCase):
def test_t1(self):
RECEIPT = "some string"

def test_t2(self):
RECEIPT = "some other string"

def test_t3(self):
RECEIPT = "yet a third string"

def test_oops(self):
RECEIPT = "some other string"

unique = {}
for funcname in dir(Foo):
if funcname.startswith("test_"):
for const in getattr(Foo,funcname).__code__.co_consts:
if isinstance(const, str) and const.endswith("string"):
if const in unique:
print("Collision!", unique[const], "and", funcname)
unique[const] = funcname

This depends on your RECEIPT strings ending with the word "string" -
change the .endswith() check to be whatever it takes to distinguish
your critical constants from everything else you might have. Maybe:

CHARSET = set("0123456789ABCDEF") # or use lower-case letters, or
both, according to your hex encoding

if isinstance(const, str) and len(const)>2048 and set(const)<=CHARSET:

Anything over 2KB with no characters outside of that set is highly
likely to be what you want. Of course, this whole theory goes out the
window if your test functions can reference another test's RECEIPT;
though if you can guarantee that this is the *first* such literal (if
RECEIPT="..." is the first thing the function does), then you could
just add a 'break' after the unique[const]=funcname assignment and
it'll check only the first - co_consts is ordered.

An interesting little problem!

ChrisA

Roy Smith · Jan 16, 2014

Chris Angelico said:
So, I figured I would write a meta-test, which used introspection to
find all the methods in the class, extract the strings from them (they
are all assigned to a variable named RECEIPT), and check to make sure
they're all different.
[...]

Click to expand...

But you might be able to shortcut it enormously. You say the strings
are "about 2500 characters long, hex-encoded". What are the chances of
having another constant, somewhere in the test function, that also
happens to be roughly that long and hex-encoded?

The chances are exactly zero.

If the answer is "practically zero", then skip the code, skip
co_names, and just look through co_consts.

That sounds like it should work, thanks!

Of course, this whole theory goes out the
window if your test functions can reference another test's RECEIPT;

No, they don't do that.

Chris Angelico · Jan 16, 2014

The chances are exactly zero.

That sounds like it should work, thanks!

No, they don't do that.

Awesome! Makes it easy then.

ChrisA

Steven D'Aprano · Jan 16, 2014

I've got some unit tests that look like:

class Foo(TestCase):
def test_t1(self):
RECEIPT = "some string"

def test_t2(self):
RECEIPT = "some other string"

def test_t3(self):
RECEIPT = "yet a third string"

and so on. It's important that the strings be mutually unique. In the
example above, it's trivial to look at them and observe that they're all
different, but in real life, the strings are about 2500 characters long,
hex-encoded. It even turns out that a couple of the strings are
identical in the first 1000 or so characters, so it's not trivial to do
by visual inspection.

Is the mapping of receipt string to test fixed? That is, is it important
that test_t1 *always* runs with "some string", test_t2 "some other
string", and so forth?

If not, I'd start by pushing all those strings into a global list (or
possible a class attribute. Then:

LIST_OF_GIANT_STRINGS = [blah blah blah] # Read it from a file perhaps?
assert len(LIST_OF_GIANT_STRINGS) == len(set(LIST_OF_GIANT_STRINGS))

Then, change each test case to:

def test_t1(self):
RECEIPT = random.choose(LIST_OF_GIANT_STRINGS)

Even if two tests happen to pick the same string on this run, they are
unlikely to pick the same string on the next run.

If that's not good enough, if the strings *must* be unique, you can use a
helper like this:

def choose_without_replacement(alist):
random.shuffle(alist)
return alist.pop()

class Foo(TestCase):
def test_t1(self):
RECEIPT = choose_without_replacement(LIST_OF_GIANT_STRINGS)

All this assumes that you don't care which giant string matches which
test method. If you do, then:

DICT_OF_GIANT_STRINGS = {
'test_t1': ...,
'test_t2': ...,
} # Again, maybe read them from a file.

assert len(list(DICT_OF_GIANT_STRINGS.values())) == \
len(set(DICT_OF_GIANT_STRINGS.values()))

You can probably build up the dict from the test class by inspection,
e.g.:

DICT_OF_GIANT_STRINGS = {}
for name in Foo.__dict__:
if name.startswith("test_"):
key = name[5:]
if key.startswith("t"):
DICT_OF_GIANT_STRINGS[name] = get_giant_string(key)

I'm sure you get the picture. Then each method just needs to know it's
own name:

class Foo(TestCase):
def test_t1(self):
RECEIPT = DICT_OF_GIANT_STRINGS["test_t1"]

which I must admit is much easier to read than

RECEIPT = "...2500 hex encoded characters..."

Peter Otten · Jan 16, 2014

Roy said:
I realize the subject line is kind of meaningless, so let me explain

I've got some unit tests that look like:

class Foo(TestCase):
def test_t1(self):
RECEIPT = "some string"

def test_t2(self):
RECEIPT = "some other string"

def test_t3(self):
RECEIPT = "yet a third string"

and so on. It's important that the strings be mutually unique. In the
example above, it's trivial to look at them and observe that they're all
different, but in real life, the strings are about 2500 characters long,
hex-encoded. It even turns out that a couple of the strings are
identical in the first 1000 or so characters, so it's not trivial to do
by visual inspection.

So, I figured I would write a meta-test, which used introspection to
find all the methods in the class, extract the strings from them (they
are all assigned to a variable named RECEIPT), and check to make sure
they're all different.

Is it possible to do that? It is straight-forward using the inspect
module to discover the methods, but I don't see any way to find what
strings are assigned to a variable with a given name. Of course, that
assignment doesn't even happen until the function is executed, so
perhaps what I want just isn't possible?

It turns out, I solved the problem with more mundane tools:

grep 'RECEIPT = ' test.py | sort | uniq -c

and I could have also solved the problem by putting all the strings in a
dict and having the functions pull them out of there. But, I'm still
interested in exploring if there is any way to do this with
introspection, as an academic exercise.

Instead of using introspection you could make it explicit with a decorator:

$ cat unique_receipt.py
import functools
import sys
import unittest

_receipts = {}
def unique_receipt(receipt):
def deco(f):
if receipt in _receipts:
raise ValueError(
"Duplicate receipt {!r} in \n {} and \n {}".format(
receipt, _receipts[receipt], f))
_receipts[receipt] = f
@functools.wraps(f)
def g(self):
return f(self, receipt)
return g
return deco

class Foo(unittest.TestCase):
@unique_receipt("foo")
def test_t1(self, RECEIPT):
pass

@unique_receipt("bar")
def test_t2(self, RECEIPT):
pass

@unique_receipt("foo")
def test_t3(self, RECEIPT):
pass

if __name__ == "__main__":
unittest.main()
$ python unique_receipt.py
Traceback (most recent call last):
File "unique_receipt.py", line 19, in <module>
class Foo(unittest.TestCase):
File "unique_receipt.py", line 28, in Foo
@unique_receipt("foo")
File "unique_receipt.py", line 11, in deco
receipt, _receipts[receipt], f))
ValueError: Duplicate receipt 'foo' in
<function test_t1 at 0x7fc8714af5f0> and
<function test_t3 at 0x7fc8714af7d0>

Roy Smith · Jan 16, 2014

Steven D'Aprano said:
Is the mapping of receipt string to test fixed? That is, is it important
that test_t1 *always* runs with "some string", test_t2 "some other
string", and so forth?

Yes.

Is it possible to get some informations from a document in Google Docs and show it on my website ?	0	Nov 19, 2022
Is it possible monkey patch like this?	4	Dec 18, 2012
Is it possible to make a unittest decorator to rename a method from"x" to "testx?"	10	Aug 8, 2013
I have to finish this code for my assignment but I cant figure out how to solve it	1	Jun 27, 2023
Copy string from 2D array to a 1D array in C	1	Nov 1, 2023
Get await function in loop to finish before script ends	0	Oct 14, 2021
Is it possible to use python to get True Full Duplex on a Serial port?	21	Aug 14, 2009
How to get education and coding job coming from abroad starting new in the US? Advice of courses or places to look?	2	May 18, 2023

Is it possible to get string from function?

Roy Smith

Roy Smith

Chris Angelico

Roy Smith

Chris Angelico

Steven D'Aprano

Peter Otten

Roy Smith

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads