Extract a function from C code?

Robert Schultz

I have a C/C++ file that I simply want to 'extract' a function from.
Something like: extract <function name> <c or cpp file>

I want it to return from the beginning of the function, to the end.

I tried cscope, and that will find the function, but it won't tell me
how many lines it is or extract it for me.

Any ideas on what I could use?

It would also be awesome if I could choose to extract struct
definitions, multi-line macro defs, etc.

 
Eric Sosman

If by "extract a function" you mean "move the
function's source lines out of one file and into
another," an editor is your best bet. Be warned
that the source lines of a function may not be
useful when separated from the #include's and other
declarations of their original file.

As for "extracting" struct definitions and macros,
once again an editor seems the best bet. What are
you trying to do? Break up an everything-thrown-
together source file into manageable pieces? If so,
purely mechanical tools -- tools that operate without
understanding of the relationships between the various
pieces -- won't do a very good job.
 
red floyd

If you're on a *nix system, look into "csplit".

We're OT now, so I won't say any more.
 
CBFalconer

Use an editor. But it won't work on a C/C++ file, because no such
language exists. There are C source files, and there are C++
source files, and there are AWK source files, but there are NO
C/C++ source files, alas.
 
Arthur J. O'Dwyer

I have a C/C++ file that I simply want to 'extract' a function from.
Something like: extract <function name> <c or cpp file>

I want it to return from the beginning of the function, to the end.

I tried cscope, and that will find the function, but it won't tell me
how many lines it is or extract it for me.

I don't know what "cscope" does, but if it will tell you where the
function begins, then you can just take everything from that line,
to the next '{' character, to the matching '}' character; that will
give you all the lines you want, barring evil preprocessor tricks
such as

#define RBRACE }
void foo(void) {
RBRACE

It would also be awesome if I could choose to extract struct
definitions, multi-line macro defs, etc.

Sounds like you're looking for a C parser. ;-) In the worst case,
there are lex/yacc grammars for C90 and maybe C99 available; you can
learn lex/yacc and use those grammars to extract the stuff you want.

HTH,
-Arthur
 
Cedric LEMAIRE

You could use CodeWorker, a parsing tool and a source code generator.

Copy the following script into a file called "function_extractor.cwp":

function_extractor ::=
    => if _ARGS.empty() error("function name expected on the command line");
    #ignore(C++)
    ->[
        // from here, copy what is parsed to the output file
        #implicitCopy
        // name of the function
        #readIdentifier:sFunctionName
        #nextStep
        // function we are looking for?
        #check(sFunctionName == _ARGS#front)
        // in C++, the function could be template
        ['<' ignore_template_clause '>']?
        '('
        // read parameters
        ignore_parenthesis
        ')'
        // in C++, ignore 'throw' or 'const' ...
        [#readIdentifier | '(' ignore_parenthesis ')']*
        // read the block of instructions
        '{'
        block
        '}'
    ]
    ;

ignore_template_clause ::= [#readCString | #readCChar | '<' ignore_template_clause '>' | ~'>']*;
ignore_parenthesis ::= [#readCString | #readCChar | '(' ignore_parenthesis ')' | ~')']*;
block ::= [#readCString | #readCChar | '{' block '}' | ~'}']*;

Then, type the command:

codeworker -translate function_extractor.cwp <your-C-file> <your-output-file> -args <function-name>

It doesn't copy the return type of the function. If you need it, and
if we can assume that the return type occupies the space between the
beginning of the line and the name of the function, the script doesn't
need to parse the type properly, so I can improve it very easily.

It would also be awesome if I could choose to extract struct
definitions, multi-line macro defs, etc.

It is simpler to extract a macro definition or a 'struct' declaration.
If you don't want to write those scripts yourself, because you don't want to
invest time in CodeWorker, I can write them for you.
 
Blue River

Or you could use flex on Linux.
 
Robert Schultz

Well, see, I've written some C functions of my own over the past 10
years, in numerous projects, on numerous platforms.
The functions have evolved: more have been added, features have
been added, bugs have been fixed.
But silly me, I didn't do all my updates to a single file.

So now I have about 300 files, all named Utility.c, containing more or
less the same functions, but each in a different degree of 'being
right'.

I wanted to avoid going through each file and each function by hand.
So I was hoping to automate a process that would extract
function XYZ from every one of those files.
Then I would use diff to compare the extracted functions to see what
sort of changes are where.
I can't use diff on the original 300 because the files as a whole vary
too much line by line (functions arranged in different orders).

So now you see why I wanted an automated method: a little
automation could greatly reduce the amount of manual labor required.

Any ideas?
 
Eric Sosman

Hmmm... I still think an editor may be your best
friend, provided it's programmable. Emacs comes to mind,
because it already has the ability to match opening and
closing braces without being fooled by their presence in
literals, comments, and such. A moderate amount of Lisp
programming, I think, could build upon this to give you a
command that would start from any position in a file and
work both forward and backward until it found the "ends"
of a function definition, and then copy that region out to
a new file. The tricky part might be finding the actual
start of the function -- if you've located the `foo' in
`foo(int x, int y)' you still need to work backward to find
where `static unsigned long *foo ...' begins. (Possible
algorithm: Work backwards from `foo' until you find a
closing brace or a semicolon or a preprocessor directive,
then forward again until you find a non-comment. This
can be fooled by overenthusiastic use of the preprocessor,
but so can pretty much everything else.)
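
For illustration, the "work backward, then forward" idea could be sketched roughly like this in Python rather than Lisp. The function name find_definition_start is made up for this example, and the sketch assumes you already know the offset of the function's name in the file (from cscope or a plain search). It is deliberately naive: a ';' or '}' inside a comment above the function will fool it, and it knows nothing about the preprocessor beyond "a line starting with # is a directive".

```python
import re

# Whitespace, /* ... */ comments and // comments, repeated.
_SKIP = re.compile(r"(?:\s+|/\*.*?\*/|//[^\n]*)*", re.DOTALL)

def find_definition_start(src, name_pos):
    """Approximate the start of the definition whose name begins at name_pos."""
    i = name_pos
    # Step 1: walk backward until the previous character is '}' or ';',
    # or until we are just past a preprocessor directive line.
    while i > 0:
        prev = src[i - 1]
        if prev in "};":
            break
        if prev == "\n":
            line_start = src.rfind("\n", 0, i - 1) + 1
            if src[line_start:i - 1].lstrip().startswith("#"):
                break
        i -= 1
    # Step 2: walk forward over whitespace and comments to the first token.
    return _SKIP.match(src, i).end()

src = (
    "#include <stdio.h>\n"
    "/* helper */\n"
    "static unsigned long *foo(int x, int y)\n"
    "{\n"
    "    return 0;\n"
    "}\n"
)
start = find_definition_start(src, src.index("foo("))
print(src[start:].splitlines()[0])
```

Combined with a brace matcher for the other end, this gives the two offsets that bound the region to copy out.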

Even if you manage to automate most or even all of the
extraction step, I still think you're going to have to do
a lot of manual inspection. You'll get spurious diffs from

/* a.c */
int func(int x, int y)

/* b.c */
int func(int col, int row)

and the like. It might be a good idea to run everything
through a source beautifier to eliminate some formatting
differences.
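
As a rough illustration of the normalize-then-diff step, here is a small Python sketch. It is only a crude stand-in for a real source beautifier: it collapses runs of whitespace and drops blank lines, and it will also collapse whitespace inside string literals, which a real beautifier would not. The names normalize and diff_functions are invented for this example.

```python
import difflib
import re

def normalize(code):
    """Collapse whitespace runs and drop blank lines (crude beautifier stand-in)."""
    lines = (re.sub(r"\s+", " ", line).strip() for line in code.splitlines())
    return [line for line in lines if line]

def diff_functions(a, b, name_a="a.c", name_b="b.c"):
    """Return a unified diff of two function bodies after normalization."""
    return list(difflib.unified_diff(normalize(a), normalize(b),
                                     name_a, name_b, lineterm=""))

tabs = "int func(int x, int y)\n{\n\treturn x+y;\n}\n"
spaces = "int func(int x, int y)\n{\n    return x+y;\n\n}\n"
renamed = "int func(int col, int row)\n{\n    return col+row;\n}\n"

print(diff_functions(tabs, spaces))            # formatting-only change: empty diff
print("\n".join(diff_functions(tabs, renamed)))  # renamed parameters still show up
```

Formatting-only differences disappear, but renamed parameters like the a.c / b.c pair above are still reported, so the manual inspection does not go away.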

The good part of all this, I think, is that you're
learning why "re-use by cut'n'paste" is unattractive ...
Good luck!
 
Michael Wojcik

I don't know what "cscope" does, but if it will tell you where the
function begins, then you can just take everything from that line,
to the next '{' character, to the matching '}' character;

Provided they're not in comments, or string literals, or character
literals, or #define directives, or sections of code that are
conditionally-compiled out...

that will
give you all the lines you want, barring evil preprocessor tricks
such as

#define RBRACE }
void foo(void) {
RBRACE

Yeah, or those, too.

Sounds like you're looking for a C parser.

I think so. And it needs to handle preprocessing (and, to be fully
correct, translation phases 1 (mapping to the source character set and
trigraph translation) and 2 (source line splicing), first), with
the same environmental conditions (e.g. predefined macros) in effect
as when the program is translated by the actual implementation.

Sure, in most cases, just matching {} characters works fine. Whether
it suffices in this case is something the OP will have to judge. It
isn't the sort of thing I'd be comfortable automating; I'd do it by
hand, in vim, using brace-matching as a first approximation but
checking to make sure I had selected the lines I really wanted.
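
For anyone who does want a first approximation in code, a brace matcher that at least skips comments, string literals and character literals might look something like the Python sketch below. The name find_matching_brace is made up for this example. It does no preprocessing at all, so the RBRACE trick, braces inside #define bodies, and conditionally-compiled-out code will still fool it.

```python
def find_matching_brace(src, open_pos):
    """Return the index of the '}' matching the '{' at open_pos, or -1.

    Braces inside comments, string literals and character literals are
    ignored. No preprocessing is done.
    """
    depth = 0
    i = open_pos
    n = len(src)
    while i < n:
        two = src[i:i + 2]
        ch = src[i]
        if two == "/*":                      # block comment
            end = src.find("*/", i + 2)
            if end == -1:
                return -1
            i = end + 2
        elif two == "//":                    # line comment
            end = src.find("\n", i)
            if end == -1:
                return -1
            i = end + 1
        elif ch in "\"'":                    # string or character literal
            i += 1
            while i < n and src[i] != ch:
                i += 2 if src[i] == "\\" else 1   # skip escaped characters
            i += 1
        elif ch == "{":
            depth += 1
            i += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return i
            i += 1
        else:
            i += 1
    return -1

sample = (
    "int f(void) {\n"
    "  char c = '}';\n"
    "  puts(\"}\"); /* } */\n"
    "  if (c) { return 1; } // }\n"
    "  return 0;\n"
    "}\n"
)
start = sample.index("{")
end = find_matching_brace(sample, start)
print(sample[start:end + 1])
```

Checking the selected lines by eye afterwards is still a good idea, for the reasons given above.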

 
John Bode

Yeah, something like this basically screams for a compiler front-end
to recognize individual translation units, struct and union
definitions, typedefs, etc. (at least to do it right).
 
Cedric LEMAIRE

It doesn't seem too hard:
* first, establish the names of all functions automatically,
  across the 300 'Utility.c' files,
* second, remove by hand the function names that aren't of
  interest to you,
* third, extract the implementations function by function from
  the 300 files:
    * retain all the different implementations,
    * put them in different output files,
    * run a diff on each of them (good luck!).

These tasks are easy to implement with CodeWorker (see the extraction
script, for instance). If you are interested, I can help you.
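
Independently of the tool used for the extraction itself, the "retain all different implementations" step could be sketched like this in Python. This is not CodeWorker code. The function group_variants and the file paths are hypothetical, and the sketch assumes the text of one function has already been extracted from each file by some other means.

```python
from collections import defaultdict

def group_variants(extracted):
    """Group file paths by the whitespace-normalized text of one function.

    extracted maps a file path to the source text of the function taken
    from that file. Returns (normalized_text, paths) pairs, with the most
    widespread variant first.
    """
    groups = defaultdict(list)
    for path, text in extracted.items():
        key = " ".join(text.split())     # ignore layout differences only
        groups[key].append(path)
    return sorted(groups.items(), key=lambda item: -len(item[1]))

variants = group_variants({
    "proj1/Utility.c": "int sq(int x) { return x*x; }",
    "proj2/Utility.c": "int sq(int x)\n{\n    return x*x;\n}",
    "proj3/Utility.c": "int sq(int x) { return x * x; }",
})
for text, paths in variants:
    print(len(paths), paths)
```

With 300 files, the number of groups is the number of distinct versions that actually need to be diffed, which is likely to be far fewer than 300.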
 
Robert Schultz

AWESOME!!!! THANK YOU SO MUCH!
This is great.
codeworker is EXACTLY what I was looking for!
Thanks again!
 
Cedric LEMAIRE

You are welcome.

If you need some help implementing the extractors for 'struct' and
'#define', do not hesitate to contact me by email.
 
