Making a C Refactoring Program

pingu219

Hi, I'm currently building a high-level refactoring program for C,
written in Java, and I was wondering if there are any good parsers (or
some other alternative) that can read C files at the function or
global level and allow transformations to the code. In other words,
something whose API would let me move a function out of one file and
into another, etc.

At the moment I'm using ANTLR to generate a C lexer/parser from a C
grammar file, and then using that lexer/parser to read all the code
into an internal tree structure, but that seems a bit excessive...

Cheers

P.S. Also, any references to similar software out there that maps out C
(or some other language) projects would be greatly appreciated.
 
santosh

Since you are doing this in Java, maybe you can adapt the following
software to your needs?

<http://www.eclipsecon.com/articles/Article-LTK/ltk.html>

Also:

<http://www.refactoring.com/tools.html>
<http://www.joanju.com/dist/docs/tree_manip.html>
 
Chris Dollin

> Hi I'm currently in the midst of building a C high-level refactoring
> program in Java but I was wondering if there are any good parsers (or
> some other alternative) which are able to read in C files at a
> function or global level and allow transformations to the code, in
> other words it will allow me to swap a function out of one file and
> into another using the api etc...

The preprocessor (and C's context-sensitive parsing) make this rather
harder than one would hope. Also note that you need to preserve (and
possibly modify) the comments from the source; a refactoring that loses
all your function headers would be ... unsatisfactory.
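
One common way to cope with the context sensitivity mentioned here is to let the lexer consult a table of typedef names, so that something like `T * x;` can be classified as a declaration or as an expression. The following is only a sketch under assumptions: `TypedefTable` is a hypothetical helper, not part of ANTLR or any other library, and it ignores the preprocessor completely.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: tracks which identifiers currently name a type.
class TypedefTable {
    // One map per open scope; TRUE = typedef name, FALSE = ordinary identifier.
    private final Deque<Map<String, Boolean>> scopes = new ArrayDeque<>();

    TypedefTable() {
        scopes.push(new HashMap<>()); // file scope
    }

    void enterScope() {
        scopes.push(new HashMap<>());
    }

    void exitScope() {
        if (scopes.size() > 1) {
            scopes.pop(); // never pop the file scope
        }
    }

    void declareTypedef(String name) {
        scopes.peek().put(name, Boolean.TRUE);
    }

    // An inner declaration such as "int T;" hides an outer typedef T.
    void declareOrdinary(String name) {
        scopes.peek().put(name, Boolean.FALSE);
    }

    boolean isTypeName(String name) {
        for (Map<String, Boolean> scope : scopes) { // innermost scope first
            Boolean kind = scope.get(name);
            if (kind != null) {
                return kind;
            }
        }
        return false; // undeclared identifiers are not type names
    }
}
```

A parser action would call `declareTypedef` when it finishes a typedef declaration, and the lexer would ask `isTypeName` before deciding which token kind to emit for an identifier.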

> ATM I'm using ANTLR to generate a C Lexer/parser from a C grammar file
> and then using that lexer/parser to read in all the code into an
> internal tree structure but that seems abit excessive...

I don't see that there's much alternative. You need at least enough
structure to do declaration and use analysis. It is /possible/ that
you can do this on the fly, since C doesn't have, um, backward
declarations [1]. I don't speak ANTLR, but what I'd expect to be
able to do is associate with some production

Spoo ::= Argle "TOKEN" Bargle

an action that says "take the results of parsing Spoo, Argle, and
Bargle, given TOKEN, and produce a new result". If those results
are bits of AST, the result can be another AST -- or it might be
some /evaluation/ of the AST in context [ie, application of refactoring
rules]. The grammar need not care; only the actions. You may need
to do some fast dancing to get a clean & useful type structure.
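
The idea that the grammar need not care what the actions produce can be sketched in plain Java, without ANTLR. Everything here is hypothetical and for illustration only: `SpooActions`, `SpooParser`, `SpooNode`, `AstActions`, and `CountActions` are made-up names, and the "parser" is a toy that just splits on whitespace.

```java
// Hypothetical action interface for the production  Spoo ::= Argle "TOKEN" Bargle.
// R is whatever the actions choose to produce: an AST node, a count, ...
interface SpooActions<R> {
    R leaf(String text);                      // result for an Argle or a Bargle
    R spoo(R argle, String token, R bargle);  // result for the whole Spoo
}

// Toy stand-in for a generated parser: expects "argle TOKEN bargle".
// It knows the shape of the production but not what the results mean.
class SpooParser {
    static <R> R parse(String input, SpooActions<R> actions) {
        String[] parts = input.trim().split("\\s+");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected: Argle TOKEN Bargle");
        }
        return actions.spoo(actions.leaf(parts[0]), parts[1], actions.leaf(parts[2]));
    }
}

// Minimal AST node: a label plus optional children.
class SpooNode {
    final String label;
    final SpooNode left, right;

    SpooNode(String label, SpooNode left, SpooNode right) {
        this.label = label;
        this.left = left;
        this.right = right;
    }

    @Override
    public String toString() {
        return left == null ? label : "(" + label + " " + left + " " + right + ")";
    }
}

// Actions that build a tree.
class AstActions implements SpooActions<SpooNode> {
    public SpooNode leaf(String text) { return new SpooNode(text, null, null); }
    public SpooNode spoo(SpooNode a, String token, SpooNode b) { return new SpooNode(token, a, b); }
}

// Actions that evaluate on the fly instead: count nodes without building a tree.
class CountActions implements SpooActions<Integer> {
    public Integer leaf(String text) { return 1; }
    public Integer spoo(Integer a, String token, Integer b) { return a + b + 1; }
}
```

The same `SpooParser.parse` call works with either set of actions; only the result type changes, which is where the "clean & useful type structure" becomes the hard part.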

[1] By "backward declarations" I mean declarations that may follow
use, as for example BCPL's `let D1 and D2 ... and Dn` declarations
where the names introduced by the Di are in the scope of the bodies
of the Di, or the common-in-functional-languages `where` declaration
of the form `body where declaration`, the body being in the scope
of that following declaration.

--
"The whole apparatus had the look of having been put together with the
most frantic haste a fanatically careful technician could muster."
/Jack of Eagles/

Hewlett-Packard Limited, registered no: 690597 England
registered office: Cain Road, Bracknell, Berks RG12 1HN
 
pingu219

Hi, thanks for all the replies.

I know it's going to be hard performing lower-level code refactoring
with C, and if I did attempt it, it would really only be with
preprocessed code. A lot of refactoring patterns don't seem to suit
structured programming languages very well; you're left with only a
small subset of fairly basic refactorings that are still applicable.

I'm focusing more on the higher-level, file-level refactorings. For
example, if a certain function gets used a lot in a particular file,
then it may be better to move that function to that file if it's
currently in a different one; the same goes for global declarations, etc.
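
As a rough illustration of that file-level heuristic, here is a sketch in Java. It assumes the per-file call counts have already been extracted somehow (by a parser or a dependency tool); `MoveSuggester` and `suggestHome` are hypothetical names, not part of any existing tool.

```java
import java.util.Map;

// Hypothetical helper for the "move a function to where it is used most" heuristic.
class MoveSuggester {
    // callsByFile maps a file name to the number of calls to the function made
    // from that file. Returns the file the function should live in: the current
    // home unless some other file calls it strictly more often.
    static String suggestHome(String homeFile, Map<String, Integer> callsByFile) {
        String best = homeFile;
        int bestCount = callsByFile.getOrDefault(homeFile, 0);
        for (Map.Entry<String, Integer> entry : callsByFile.entrySet()) {
            if (entry.getValue() > bestCount) {
                best = entry.getKey();
                bestCount = entry.getValue();
            }
        }
        // Ties between two other files are broken by map iteration order.
        return best;
    }
}
```

A real tool would also have to weigh what the function itself depends on (static helpers, file-local globals), which this sketch ignores.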

So I would really only need basic parsing capability or, better still,
an existing basic program that lays out the dependencies. I looked
at CDepend, but the original author mentioned that it hadn't been
updated in a while, so it would be better to use objdump, though that's
a little too sparse...

BTW, does anyone know whether there's any research or literature that
focuses purely on the higher-level refactorings rather than basic
low-level code refactorings? Or is that going to be refactoring in an
entirely different context? I'm finding it hard to pin down the right
term for it.

Cheers and again thanks for the help so far.
 
Chris Dollin

> Hi thanks for all the replies.
>
> I know it's going to be hard performing lower-level code refactoring
> with C and if I did attempt it, it would really only be with
> preprocessed code.

Doesn't that limit its utility rather? If it's programmers steering
the refactoring -- and ye gods, I'd love to have a refactoring editor
for C, and something that did useful non-refactoring changes too --
then it's unpreprocessed code they're looking at.

> Alot of refactoring patterns don't seem to suit
> structured programming languages very well

?!

Do you mean that some refactoring patterns only apply in the OO world?

> , you're left with only a
> basic subset of fairly basic refactorings that are still applicable.

Surely there's lots of useful refactorings one can apply to C programs.

Let's see. Rename (possibly the most useful simple refactoring ever).
I'd give this one an Alpha. Extract function. Inline function. Extract
local, inline local. Invert if-then-else. Publish function (make it
public, put a declaration in the .h file). Perish function (make it
static, remove it from the .h file, complain if it's still used in the
codebase). Introduce global struct. Move global into global struct.
Move global out of global struct. Replace global with parameter pointer.
Add parameter to function; remove parameter from function; reorder
function arguments; change parameter type if consistent. Move
function(s) to different compilation unit.
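
To make one of these concrete, the "complain if it's still used in the codebase" part of Perish function could be sketched in Java as a precondition check. This assumes the set of files that reference the function is already known; `PerishCheck` and `blockers` are hypothetical names used only for illustration.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical precondition check for "Perish function".
class PerishCheck {
    // Returns the files, other than the defining file, that still use the
    // function. An empty result means it can be made static and its
    // declaration removed from the .h file.
    static Set<String> blockers(String definingFile, Set<String> filesUsingFunction) {
        Set<String> others = new TreeSet<>(filesUsingFunction); // sorted copy for stable reports
        others.remove(definingFile);
        return others;
    }
}
```

If `blockers` returns a non-empty set, the tool would report those files instead of applying the change.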

[Some of those need support from test-sets because otherwise you can't
tell if you may have unseeingly altered behaviour. I don't see this
as a problem, because I want those testsets anyway.]

I use pretty much all of those in Java development, except for the
'global struct' stuff which I invented because I want to do it to
an existing C++ program which is fuller of globals than the oceans
are of water, and the invert if-then-else which I want want want.

--
"I know it was late, but Mountjoy never bothers, so long as it's the
full two thousand words." /Archer's Goon/

 
