Read C++ enum in Python

Ludo · Aug 18, 2009

Hello,

I work on a very large project that combines C++ packages with pieces
of Python code.

I've been googling for days, but everything I find seems far too
complicated for what I want to do.

What I need is to read, in Python, the enum definitions provided by the
header file of a C++ package.
Of course I could open the .h file, read the enum and transcribe it by
hand into a .py file, but the package is regularly updated, and so is
the enum.

My question is simple. Is there:
- either a simple way in Python to read the .h file, retrieve the C++
enum and make it accessible in my Python script,
- or a simple tool (in the long term it would be run automatically
when the C++ package is compiled) that generates from the .h file a .py
file containing the Python definitions of the enums?

Thank you for any suggestions.

MRAB · Aug 18, 2009

Speaking personally, I'd parse the .h file using a regular expression
(re module) and generate a .py file. Compilers typically have a way of
letting you run external scripts (eg batch files in Windows or, in this
case, a Python script) when an application is compiled.
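
As a rough illustration of that idea, here is a minimal sketch of a generator that turns header text into the source of a .py module. The function names (strip_comments, header_to_python) and the ENUMNAME_MEMBER naming scheme are assumptions made for this example, not something from the C++ package. It only handles named enums whose explicit values are plain decimal or hex integer literals; anonymous enums are skipped and constant expressions such as 1 << 2 will raise ValueError.

```python
import re

# Named enums only: "enum Name { ... }" or "enum class Name : type { ... }".
ENUM_RE = re.compile(r'\benum\s+(?:class\s+)?(\w+)[^{;]*\{([^}]*)\}')


def strip_comments(text):
    """Remove /* ... */ and // ... comments (string literals are not considered)."""
    text = re.sub(r'/\*.*?\*/', ' ', text, flags=re.DOTALL)
    return re.sub(r'//[^\n]*', ' ', text)


def header_to_python(header_text):
    """Return Python source defining one constant per enumerator."""
    lines = []
    for enum_name, body in ENUM_RE.findall(strip_comments(header_text)):
        next_value = 0
        for item in body.split(','):
            item = item.strip()
            if not item:
                continue  # tolerate a trailing comma
            name, _, value = item.partition('=')
            if value.strip():
                next_value = int(value.strip(), 0)  # base 0 accepts 0x.. literals
            lines.append('%s_%s = %d' % (enum_name.upper(),
                                         name.strip().upper(), next_value))
            next_value += 1
    return '\n'.join(lines) + '\n'


if __name__ == '__main__':
    sample = 'enum Color { RED, GREEN = 5, BLUE };'
    print(header_to_python(sample))
```

A build step could then call header_to_python on the contents of the .h file and write the result to a .py file each time the C++ package is compiled.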

AggieDan04 · Aug 19, 2009

Try something like this:

import re

# filename is the path to the C++ header file
file_data = open(filename).read()
# Remove comments and preprocessor directives
file_data = ' '.join(line.split('//')[0].split('#')[0]
                     for line in file_data.splitlines())
file_data = ' '.join(re.split(r'\/\*.*\*\/', file_data))
# Look for enums: In the first { } block after the keyword "enum"
enums = [text.split('{')[1].split('}')[0]
         for text in re.split(r'\benum\b', file_data)[1:]]

for enum in enums:
    last_value = -1
    for enum_name in enum.split(','):
        if '=' in enum_name:
            enum_name, enum_value = enum_name.split('=')
            enum_value = int(enum_value, 0)
        else:
            enum_value = last_value + 1
        last_value = enum_value
        enum_name = enum_name.strip()
        print('%s = %d' % (enum_name, enum_value))
    print('')
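
For the first option in the question (access from a running script rather than a generated file), the same parsing idea can build enum objects directly. The following is only a sketch under some assumptions: it relies on the standard enum module (Python 3.4 or later), the helper name load_enums is made up for this example, and it handles only named enums with integer-literal values.

```python
import re
from enum import IntEnum


def load_enums(header_text):
    """Parse named C++ enums and return {enum name: IntEnum class}."""
    # Drop /* ... */ and // ... comments before looking for enums.
    text = re.sub(r'/\*.*?\*/|//[^\n]*', ' ', header_text, flags=re.DOTALL)
    enums = {}
    for name, body in re.findall(r'\benum\s+(\w+)\s*\{([^}]*)\}', text):
        members = {}
        value = -1
        for item in body.split(','):
            item = item.strip()
            if not item:
                continue  # tolerate a trailing comma
            member, _, explicit = item.partition('=')
            # An explicit value resets the counter; otherwise count up by one.
            value = int(explicit, 0) if explicit.strip() else value + 1
            members[member.strip()] = value
        enums[name] = IntEnum(name, members)
    return enums


if __name__ == '__main__':
    Color = load_enums('enum Color { RED, GREEN = 5, BLUE };')['Color']
    print(Color.GREEN.value, Color(6).name)
```

The script would call load_enums(open(filename).read()) at start-up, so the Python side always follows the current header without a separate generation step.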

Mark Tolonen · Aug 19, 2009

MRAB said:
Speaking personally, I'd parse the .h file using a regular expression
(re module) and generate a .py file. Compilers typically have a way of
letting you run external scripts (eg batch files in Windows or, in this
case, a Python script) when an application is compiled.

This is what 3rd party library pyparsing is great for:

--------begin code----------
from pyparsing import *

# sample string with enums and other stuff
sample = '''
stuff before

enum hello {
Zero,
One,
Two,
Three,
Five=5,
Six,
Ten=10
}

in the middle

enum blah
{
alpha,
beta,
gamma = 10 ,
zeta = 50
}

at the end
'''

# syntax we don't want to see in the final parse tree
_lcurl = Suppress('{')
_rcurl = Suppress('}')
_equal = Suppress('=')
_comma = Suppress(',')
_enum = Suppress('enum')

identifier = Word(alphas,alphanums+'_')
integer = Word(nums)

enumValue = Group(identifier('name') + Optional(_equal + integer('value')))
enumList = Group(enumValue + ZeroOrMore(_comma + enumValue))
enum = _enum + identifier('enum') + _lcurl + enumList('list') + _rcurl

# find instances of enums ignoring other syntax
for item, start, stop in enum.scanString(sample):
    id = 0
    for entry in item.list:
        if entry.value != '':
            id = int(entry.value)
        print('%s_%s = %d' % (item.enum.upper(), entry.name.upper(), id))
        id += 1
--------------end code------------

Output:
HELLO_ZERO = 0
HELLO_ONE = 1
HELLO_TWO = 2
HELLO_THREE = 3
HELLO_FIVE = 5
HELLO_SIX = 6
HELLO_TEN = 10
BLAH_ALPHA = 0
BLAH_BETA = 1
BLAH_GAMMA = 10
BLAH_ZETA = 50

-Mark

Neil Hodgson · Aug 19, 2009

AggieDan04:

file_data = open(filename).read()
# Remove comments and preprocessor directives
file_data = ' '.join(line.split('//')[0].split('#')[0]
                     for line in file_data.splitlines())
file_data = ' '.join(re.split(r'\/\*.*\*\/', file_data))

For some headers I tried it didn't work until the .* was changed to a
non-greedy .*? to avoid removing from the start of the first comment to
the end of the last comment.

file_data = ' '.join(re.split(r'\/\*.*?\*\/', file_data))
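
To make the difference concrete, here is a small self-contained comparison on a made-up one-line input (the source string is just an illustration, not taken from any real header):

```python
import re

# Two block comments with an enum between them, all on one line,
# as happens after the header's lines have been joined with spaces.
source = "int a; /* first */ enum E { X }; /* second */ int b;"

# Greedy: .* runs from the first "/*" to the LAST "*/".
greedy = ' '.join(re.split(r'/\*.*\*/', source))

# Non-greedy: .*? stops at the first "*/" after each "/*".
lazy = ' '.join(re.split(r'/\*.*?\*/', source))

print(greedy)  # the enum between the two comments is gone
print(lazy)    # only the comments are removed; the enum survives
```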

Neil

Bill Davy · Aug 19, 2009

Mark Tolonen said:
This is what 3rd party library pyparsing is great for:

Python and pythoneers are amazing!

Ludo · Aug 19, 2009

Neil Hodgson wrote:

For some headers I tried it didn't work until the .* was changed to a
non-greedy .*? to avoid removing from the start of the first comment to
the end of the last comment.

file_data = ' '.join(re.split(r'\/\*.*?\*\/', file_data))

Thank you! I'll adopt it!

Cheers.

Mark Tolonen · Aug 20, 2009

[snip]

Paul McGuire (pyparsing author) reminded me that:

enum.ignore(cppStyleComment)

before scanString will skip commented out sections as well.

-Mark
