space/tab conversion utility?

Grant Edwards

Is there any utility to convert Python sources from space-based
block indentation to tab-based?

You can use "expand" to convert from tabs->spaces, but "unexpand"
isn't bright enough to do the reverse.
 
Brian van den Broek

Grant Edwards said unto the world upon 28/07/2004 18:12:
I don't see how that's useful.




"unexpand" will do dumb conversion, but what's required is
something that understands Python block syntax. Something
similar to C's "indent" program.

Hi,

OK, sorry. I'm still pretty new to Python and programming in general. But
I would have thought that the recipe I pointed to could be used as the
basis of a script that would uniformly replace all leading tabs with a set
number of spaces, thus preserving relative indents. (On the assumption
that the code being transformed is all tabs.)

But getting your response made me read the recipe more carefully, and now
I think I see why you doubt its usefulness for your task.

Still, the approach of replacing only leading tabs seems to me like it
would work. I did up a script and from my testing it appears to preserve
block structure. But I hesitate to post code; one gaffe a day is enough ;-)

If there's a subtlety that is blowing past me in my newbieness,
enlightenment would be appreciated. On the other hand, if you'd like to
see the code, I'd be happy to share.

Anyway, sorry if I wasted your time.

Best,

Brian vdB
 
Grant Edwards

OK, sorry. I'm still pretty new to Python and programming in general. But
I would have thought that the recipe I pointed to could be used as the
basis of a script that would uniformly replace all leading tabs with a set
number of spaces, thus preserving relative indents. (On the assumption
that the code being transformed is all tabs.)

1) The conversion I need to do is in the other direction: spaces->tabs.

2) Simply converting all leading spaces to the right number of
tabs (unexpand knows how to do that) isn't correct. Only
the spaces that are block-indent spaces should be converted.

For example:

1 ....if (asfsadfsadf and qwerqwrwer and
2 ........(asdf or qwer)):
3 ........doThis()
4 ........doThat()

Let's say we want to convert the lines above (indented with
groups of 4 spaces) into tab-indented code. In line 2, only the
first 4 spaces should be converted into a tab. The second set
of 4 spaces aren't block-indent spaces; they're visual alignment
spaces used to line up the sub-expression, and they need to
remain 4 spaces regardless of the indentation level chosen by
the person viewing the tab-indented code.
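
The distinction described above can be sketched with the standard tokenize module. This is only an illustration under stated assumptions: the input is indented with spaces only and tokenizes cleanly, and the name spaces_to_tabs is made up for this example. It records the indent widths of the open blocks for each line, converts only those widths to tabs, and keeps any remaining leading spaces as alignment.

```python
import io
import tokenize

def spaces_to_tabs(source):
    """Turn block-indent spaces into tabs; leave alignment spaces alone."""
    levels = []        # indent widths of the blocks currently open
    stack_for = {}     # row number -> indent widths that apply to that row
    pending = []       # comment-only rows waiting for the next statement
    protected = set()  # rows that start inside a multi-line token (e.g. a string)
    ignore = (tokenize.NL, tokenize.NEWLINE, tokenize.ENDMARKER)
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        row = tok.start[0]
        if tok.type == tokenize.INDENT:
            levels.append(len(tok.string))
        elif tok.type == tokenize.DEDENT:
            levels.pop()
        elif tok.type == tokenize.COMMENT and row not in stack_for:
            stack_for[row] = list(levels)
            pending.append(row)
        elif tok.type not in ignore:
            stack_for.setdefault(row, list(levels))
            # A comment just before an indented block belongs to that block.
            for r in pending:
                if len(levels) > len(stack_for[r]):
                    stack_for[r] = list(levels)
            pending.clear()
        protected.update(range(row + 1, tok.end[0] + 1))

    out = []
    for row, line in enumerate(source.split("\n"), start=1):
        stack = stack_for.get(row)
        if stack is None or row in protected:
            out.append(line)  # blank line or string contents: leave untouched
            continue
        body = line.lstrip(" ")
        col = len(line) - len(body)
        block = [w for w in stack if w <= col]  # block levels this line reaches
        base = block[-1] if block else 0
        out.append("\t" * len(block) + " " * (col - base) + body)
    return "\n".join(out)
```

Applied to the four-line example (wrapped in a function so that it tokenizes on its own), line 2 comes out as one tab followed by 4 spaces, and lines 3 and 4 as two tabs each.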
Still, the approach of replacing only leading tabs seems to me
like it would work. I did up a script and from my testing it
appears to preserve block structure. But I hesitate to post
code; one gaffe a day is enough ;-)

If there's a subtlety that is blowing past me in my
newbieness, enlightenment would be appreciated. On the other
hand, if you'd like to see the code, I'd be happy to share.

The problem is that sometimes lines contain leading spaces that
shouldn't be converted to tabs.
 
Brian van den Broek

Grant Edwards said unto the world upon 28/07/2004 19:15:
1) The conversion I need to do is in the other direction: spaces->tabs.

Doh! :-[
2) Simply converting all leading spaces to the right number of
tabs (unexpand knows how to do that) isn't correct. Only
the spaces that are block-indent spaces should be converted.



The problem is that sometimes lines contain leading spaces that
shouldn't be converted to tabs.

Hi Grant,

thanks for the explanation. Since you probably have better things to do
than play "spot the newbie's confusion" and I am fresh out of feet to
stick in my mouth, let's leave it there ;-)

Best,

Brian vdB
 
Tim Peters

[Grant Edwards, wants to convert spaces to tabs]
...
2) Simply converting all leading spaces to the right number of
tabs (unexpand knows how to do that) isn't correct. Only
the spaces that are block-indent spaces should be converted.
....

reindent.py is in your Python distribution, and is the state of the
art for "intelligent" conversion of tab-infected files to
space-celebrating ones. I understand that's not the direction you
want, but it is the *code* you want to start from. Doing a good job
on this is harder than anyone believes until they've failed at least
once, and the problem is indeed dealing with "semantically
insignificant" whitespace in a pleasant way, trying to guess the
author's intent about visual appearance. reindent.py is aware of the
*semantic* indentation level of each line, and you could fiddle its
internals to convert just the semantically significant leading spaces
to hard tabs.
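
To see where a "semantic indentation level" can come from, the standard tokenize module emits INDENT and DEDENT tokens at block boundaries and nothing of the kind for continuation lines. The following is a rough sketch of that idea only; it is not reindent.py's code, and semantic_levels is a name invented here.

```python
import io
import tokenize

def semantic_levels(source):
    """Return (line_number, depth) for the first line of each statement."""
    depth = 0
    new_statement = True
    levels = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.INDENT:
            depth += 1
        elif tok.type == tokenize.DEDENT:
            depth -= 1
        elif tok.type == tokenize.NEWLINE:
            new_statement = True  # a logical line just ended
        elif tok.type in (tokenize.NL, tokenize.COMMENT, tokenize.ENDMARKER):
            continue
        elif new_statement:
            levels.append((tok.start[0], depth))
            new_statement = False
    return levels
```

Continuation lines never show up in the result, which is what separates block indentation from leading whitespace used purely for alignment.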
 
Grant Edwards

thanks for the explanation. Since you probably have better
things to do than play "spot the newbie's confusion" and I am
fresh out of feet to stick in my mouth, let's leave it there
;-)

We were all newbies at everything sometime and are newbies at
something all the time. ;)
 
Grant Edwards

[Grant Edwards, wants to convert spaces to tabs]
...
2) Simply converting all leading spaces to the right number of
tabs (unexpand knows how to do that) isn't correct. Only
the spaces that are block-indent spaces should be converted.
...

reindent.py is in your Python distribution, and is the state
of the art for "intelligent" conversion of tab-infected files
to space-celebrating ones. I understand that's not the
direction you want, but it is the *code* you want to start
from. [...]

That does indeed sound like the right starting point. The
other option I can think of would be to hack up Jed's
Python-mode so that it attempts to automatically detect whether
tabs or spaces should be used.

Or I could go work on my car.
 
Martin Bless

Tim Peters said:
reindent.py is in your Python distribution, and is the state of the
art for "intelligent" conversion of tab-infected files to
space-celebrating ones.

I'm carrying a question around for quite a while now and it seems to
fit into this context. The issue is "configuration files".

I'm not really happy with "configparser" for some reasons and would
much rather use Python code itself as the natural and preferred way to
write configuration data.

The idea is to read the configuration file, which should be Python
code, and accept only statements that can't be harmful. A grammar
somehow like this (as a sketch, written before breakfast):

value := "string constant" | "num constant" | ...
seq := (value*) | [value*]
assignment := name = seq | value
.... and so on. To be elaborated.

In words: have assignments in the config file and accept only values
and structures that can do no harm. I'm not familiar with tokenizing
and python source code parsing yet.

My question:

Do you think I could accomplish this using standard Python means?
I'm thinking of the tokenize module, which I haven't used yet.

Will it be possible / easy / difficult?

Martin
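
One possible shape for this in current Python versions, offered as a sketch rather than a vetted solution: it uses the standard ast module instead of tokenize, accepts only top-level "name = value" statements, and relies on ast.literal_eval to reject anything that is not a plain literal or a container of literals. The function name load_config is made up for the example.

```python
import ast

def load_config(text):
    """Parse 'name = literal' statements and reject everything else."""
    config = {}
    for node in ast.parse(text).body:
        is_simple_assignment = (
            isinstance(node, ast.Assign)
            and len(node.targets) == 1
            and isinstance(node.targets[0], ast.Name)
        )
        if not is_simple_assignment:
            raise ValueError(f"line {node.lineno}: only 'name = value' is allowed")
        # literal_eval raises ValueError for calls, names, attribute access, etc.
        config[node.targets[0].id] = ast.literal_eval(node.value)
    return config
```

With this, a file containing name = "demo" and ports = [80, 443] loads into a dictionary, while an import statement or a value such as open("f") raises ValueError.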
 
Pierre-Frédéric Caillaud

- Open file in editor which uses tabs (like Scite set to use tabs)
- Ctrl-A
- Tab
- Shift-Tab
- Et voilà...

This just indents and dedents the whole program. A side-effect is that
the editor converts spaces to tabs. It also works the other way.

If your program is correctly indented, it'll stay that way with tabs.
 
Catalin Marinas

Grant Edwards said:
So that I can submit patches to a Python application whose
developer uses tabs. My Python editor uses spaces.

You have different options to the diff command for this:

-E --ignore-tab-expansion Ignore changes due to tab expansion.
-b --ignore-space-change Ignore changes in the amount of white space.
-w --ignore-all-space Ignore all white space.
-B --ignore-blank-lines Ignore changes whose lines are all blank.

Otherwise, a simple sed script might do it (convert 4 spaces to one
tab for example):

sed -e "s/    /\t/g"

Catalin
 
Grant Edwards

You have different options to the diff command for this:

-E --ignore-tab-expansion Ignore changes due to tab expansion.
-b --ignore-space-change Ignore changes in the amount of white space.
-w --ignore-all-space Ignore all white space.
-B --ignore-blank-lines Ignore changes whose lines are all blank.

But the lines of code I added will still be indented with
spaces, and the program's author uses tabs.
Otherwise, a simple sed script might do it (convert 4 spaces
to one tab for example):

sed -e "s/    /\t/g"

Nope, that doesn't work for reasons already discussed. Even if
you restrict it to spaces at the beginning of the line (e.g.
unexpand -t4 --first-only) it still breaks things.
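
A small illustration of the breakage, assuming 4-space indentation; naive_unexpand is a hypothetical stand-in that converts every full group of leading spaces, not the real unexpand program.

```python
def naive_unexpand(line, width=4):
    """Turn every full group of `width` leading spaces into one tab."""
    lead = len(line) - len(line.lstrip(" "))
    tabs, rest = divmod(lead, width)
    return "\t" * tabs + " " * rest + line[lead:]

continuation = "        (c or d)):"      # 4 block-indent + 4 alignment spaces
naive = naive_unexpand(continuation)     # becomes two tabs
wanted = "\t" + "    " + "(c or d)):"    # one tab, alignment spaces kept

# Viewed with 8-column tabs, the naive result drifts away from the "if" line.
naive_column = naive.expandtabs(8).index("(")
wanted_column = wanted.expandtabs(8).index("(")
```

With tabs shown 8 columns wide, the naive result puts the sub-expression 8 columns past the start of the "if" line instead of 4, which is the misalignment the earlier example was about.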
 
