Attempting to convert a C .h file to an Assembly .inc file

Percival

Hello, I am relatively new to Perl and would like to build a
translator from C .h files to NASM assembly .inc files. The only
things that I will translate are constants, structures, and
possibly unions.

For those who don't know C, here is the problem. Constants are defined
in two ways,

#define MyConstant 500

/\ Don't worry about that one, I got that part of the program done.

enum {
CONSTANT = 0, CONSTANT2, CONSTANT3
};

Here the = is optional, enumerators are separated by commas (whitespace
is not significant), and each constant without an explicit value is one
more than the constant before it, starting from 0. Example:
enum { Red, White, Blue};
Sets Red equal to 0, White to 1, and Blue to 2.

enum { Red, White = 5, Blue};
Sets Red equal to 0, White to 5, and Blue to 6.

enum { Red = 10, White = 2, Blue};
Sets Red equal to 10, White to 2, and Blue to 3.

And constants should be translated to my assembler like so:

MyConstant equ 500
CONSTANT equ 0
CONSTANT2 equ 1
CONSTANT3 equ 2

The #define statement is straightforward, and I have that part of the
program complete (a #define always starts at the beginning of a line,
and so forth).
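
For readers following along, the #define case described above could look roughly like the sketch below. The helper name define_to_equ is hypothetical, and it assumes simple object-like macros with no trailing comments; function-like macros are skipped rather than translated.

```perl
use strict;
use warnings;

# Turn a simple object-like "#define NAME value" line into NASM "NAME equ value".
# Function-like macros (NAME directly followed by "(") and defines without a
# value do not match the pattern and yield undef (assumed to be unwanted here).
sub define_to_equ {
    my ($line) = @_;
    return unless $line =~ /^\s*#\s*define\s+(\w+)\s+(\S.*?)\s*$/;
    return "$1 equ $2";
}

for my $line ('#define MyConstant 500',
              '#  define MAX_LEN 0x40',
              '#define SQR(x) ((x)*(x))') {
    my $equ = define_to_equ($line);
    print "$equ\n" if defined $equ;
}
```

Because the pattern requires whitespace right after the macro name, SQR(x) falls through and only the two plain constants are printed.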

But I do not have a plan for handling C's enum. What may help is that
newlines and tabs are all treated as whitespace in C, a run of
whitespace counts the same as a single space, and whitespace before and
after any character that isn't allowed in symbols is ignored. '=',
',', '{' and '}' are probably the only such characters that will appear
in an enum. So:
enum {
abc = 5,
def = 10,
ghi = 11
};

Is the same as:
enum{abc=5,def=10,ghi=11};
Is the same as

enum

{

abc= 5, def=10,

ghi =11 }


;
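
One way to make the layout irrelevant is to split the text into tokens and never look at whitespace at all. The sketch below is only an illustration: the helper name tokens is made up, and it assumes comments have already been removed and that only identifiers, numbers and the characters = , { } ; occur.

```perl
use strict;
use warnings;

# Split C source into identifiers/numbers and single punctuation characters.
# Whitespace (spaces, tabs, newlines) is never captured, so layout vanishes.
sub tokens {
    my ($text) = @_;
    return $text =~ /\w+|[={},;]/g;
}

my @layouts = (
    "enum {\n  abc = 5,\n  def = 10,\n  ghi = 11\n};",
    "enum{abc=5,def=10,ghi=11};",
    "enum\n\n{\n\nabc= 5, def=10,\n\nghi =11 }\n\n\n;",
);

# Each layout prints the same line of space-separated tokens.
print join(' ', tokens($_)), "\n" for @layouts;
```

A parser that walks this token list only has to deal with one shape of input, whichever way the header was formatted.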

Thanks for your help.
Percival
 
Tassilo v. Parseval

h2xs, the utility that creates a Perl module skeleton from a C-header,
is capable of parsing enums. The parsing looks like this:

no warnings 'uninitialized';

# Remove C comments (/* ... */); string and character literals are kept
$src =~ s#/\*[^*]*\*+([^/*][^*]*\*+)*/|("(\\.|[^"\\])*"|'(\\.|[^'\\])*'|.[^/"'\\]*)#$2#gs;

# Each iteration of the while loop handles one complete enum block:
while ($src =~ /(\benum\s*([\w_]*)\s*\{\s([\s\w=,]+)\})/gsc) {
    my ($enum_name, $enum_body) =
        $1 =~ /enum\s*([\w_]*)\s*\{\s([\s\w=,]+)\}/gs;
    my $val = 0;
    for my $item (split /,/, $enum_body) {
        my ($key, $declared_val) = $item =~ /(\w*)\s*=\s*(.*)/;
        $val = length($declared_val) ? $declared_val : 1 + $val;
        # $key is now the constant name, $val its value
    }
}

For that to work, it's necessary to slurp the whole header into $src. It
doesn't work if you try to process the file linewise. Also, C comments
(that may show up in enums, too) have to be stripped.
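
Slurping can be sketched as follows. The helper name slurp is hypothetical, and an in-memory file handle stands in for a real header so the example runs on its own.

```perl
use strict;
use warnings;

# Read everything from a file handle into one string by locally undefining
# the input record separator $/ for the duration of the read.
sub slurp {
    my ($fh) = @_;
    local $/;
    return scalar <$fh>;
}

# With a header on disk this would be: open my $fh, '<', $path or die "$path: $!";
my $header = "enum {\n  abc = 5, /* first */\n  def\n};\n";
open my $fh, '<', \$header or die "cannot open in-memory handle: $!";
my $src = slurp($fh);
close $fh;

# The enum spans several lines, so only the slurped string holds it whole.
print "whole enum visible\n" if $src =~ /\benum\s*\{[^}]*\}/;
```

Reading line by line would hand the pattern "enum {" and "};" in separate strings, which is why the multi-line match needs the whole file at once.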

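Having a closer look at the above, I think it doesn't catch all cases.
Most notably, it fails on:

enum id { CONSTANT };

That is: it fails for any enumerator written without an explicit
"= value", because the item pattern requires an "=", so $key ends up
undefined. The \s right after \{ also demands a whitespace character
there, so "enum{abc=5,...}" is not matched at all.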
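
A possible way around that failure case is sketched below. This is not the h2xs code: the function name parse_enums is made up, comments are assumed to be stripped already, and values are assumed to be integer literals (decimal, possibly negative, or non-negative hex/octal), not arbitrary C expressions.

```perl
use strict;
use warnings;

# Return [name, value] pairs for every enum block found in $src.
sub parse_enums {
    my ($src) = @_;
    my @constants;
    while ($src =~ /\benum\b\s*\w*\s*\{([^}]*)\}/g) {
        my $body = $1;
        my $next = 0;    # value an enumerator gets when it has no "= value"
        for my $item (split /,/, $body) {
            # "NAME" or "NAME = VALUE"; blank items (trailing comma) are skipped
            my ($name, $raw) = $item =~ /^\s*(\w+)\s*(?:=\s*(-?\w+))?\s*$/
                or next;
            my $val = !defined $raw ? $next
                    : $raw =~ /^0/  ? oct($raw)    # 0x.. hex and 0.. octal
                    :                 $raw;
            push @constants, [$name, $val];
            $next = $val + 1;
        }
    }
    return @constants;
}

my $src = "enum{Red,White=5,Blue};\nenum id { CONSTANT };\nenum { A = 0x10, B, };";
printf "%s equ %d\n", @$_ for parse_enums($src);
```

The "=" part of the item pattern is optional, and the counter starts at 0 and is set to the last value plus one after every enumerator, so single-key enums and enums without whitespace after the brace both come out as NASM equ lines.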

Tassilo
 
Percival

Thank you very much, I'll find a way to deal with that failure case.

Thanks for the program, it looks like it does what I need! Just one
last question... doesn't \w include _? So [\w_] is redundant and plain
\w would do. But you are the expert, not me :)

Percival
 
Tassilo v. Parseval

That's correct, \w includes _. But looking at the big picture of h2xs,
this is really one of the minor nits. h2xs, despite being in the core,
is one of the ugliest and worst-written pieces of Perl you can find. It
makes a lot of script kiddies look like decent programmers, really.

That's because over the years one feature after the other was added by
different people (the above enum thing was added by me based on code I
received from another porter). But if you seek a challenge for your
refactoring skills, h2xs is the right place. It's really the Mount
Everest of refactoring (with the first ascent not having happened yet).

Tassilo
 
