What language to manipulate text files

ross

I want to do some tricky text file manipulation on many files, but have
only a little programming knowledge.

What are the ideal languages for the following examples?

1. Starting from a certain folder, look in the subfolders for all
filenames matching *FOOD*.txt. Any files matching in each folder should
be copied to a new subfolder within the current folder called EATING
with a new name of *FOOD*COPY.txt

2. Process each file as follows:
Here is a simplified example of what I want as input and output.

------------------------------------- input
.......................... 'several unknown lines of text file
Get apples from apples shop
Get oranges from oranges shop
Get plums from plums shop
Get pears from pears shop
Eat from apples, oranges,
plums, pears 'whitespace at start of line is unimportant
.......................... 'more unknown lines of text file
Chapter 1
Several lines of text about apples in here
Chapter 2
Several lines of text about oranges in here
Chapter 3
Several lines of text about plums in here
Chapter 4
Several lines of text about pears in here

------------------------------------- output
.......................... 'several unknown lines of text file
Get apples from apples shop
Get oranges from oranges shop
Get plums from plums shop
Get pears from pears shop
Get bagels from bagels shop 'the Get lines...
Get donuts from donuts shop 'can be in any order
Eat from apples, bagels, oranges,
plums, donuts, pears 'whitespace at start of line is unimportant
.......................... 'more unknown lines of text file
Chapter 1
Several lines of text about apples in here
Chapter 2
Several lines of text about bagels in here
Chapter 3
Several lines of text about oranges in here
Chapter 4
Several lines of text about plums in here
Chapter 5
Several lines of text about donuts in here
Chapter 6
Several lines of text about pears in here

Summary:
I have added two new items to Get;
I have put them into the comma-delimited list after searching for a
particular fruit to put each one after;
The Chapters are renumbered to match their position in the
comma-delimited list.
The "several lines of text" about each new item can be pulled from a
new_foods.txt file (or a bagels.txt and a donuts.txt file).

My first objective is to process the files as described.
My second objective is to learn the best language for this sort of text
manipulation. The language should run on Windows 98, XP and Linux.

Would Python be best, or would a macro-scripting thing like AutoHotKey
work?
I thought about Perl, but think I would learn bad habits and have hard
to read code.

Thanks, Ross
 
Terry Hancock

ross said:
I want to do some tricky text file manipulation on many files, but have
only a little programming knowledge. [...]

Would Python be best, or would a macro-scripting thing like AutoHotKey
work?
I thought about Perl, but think I would learn bad habits and have hard
to read code.

Both Perl and Python are *extremely* good at this kind of work. This is
pretty much what inspired Perl, and Python implements most of the same
toolset. You will solve many of these kinds of problems using "regular
expressions" (built-in first-class object in Perl, created from strings in
Python using the "re" module).

No surprise, of course, that I would choose Python, mainly because of what
it provides beyond regular expressions. Many simple cases can be handled
with string methods in Python (check the Sequence types information in the
built-ins section of the Library Reference; also look at the "string" module,
though it's usually easier to use the string methods approach).

You will probably end up with more readable code using Python and
take less time to develop sufficient proficiency to do the job with it.
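
To make the distinction concrete, here is a minimal sketch of both approaches applied to lines like the ones in the question. The variable names and the helper function bump are made up for illustration; only the string methods and the "re" module calls are standard Python.

```python
import re

line = "Get apples from apples shop"

# String methods handle the simple, fixed-shape cases.
if line.startswith("Get "):
    item = line.split()[1]          # second word of the line

# A regular expression helps when the shape varies
# (leading whitespace, any chapter number).
match = re.match(r"\s*Chapter\s+(\d+)\s*$", "  Chapter 12")
chapter_number = int(match.group(1))


# re.sub accepts a function, which is handy for renumbering.
def bump(m):
    return "Chapter %d" % (int(m.group(1)) + 1)


renumbered = re.sub(r"Chapter (\d+)", bump, "Chapter 3")
```

The string-method version is usually easier to read back later; the regular expression earns its keep once whitespace and numbers start to vary.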
 
Roose

Why do people keep asking what language to use for certain things in the
Python newsgroup? Obviously the answer is going to be biased.

Not that it's a bad thing because I love Python, but it doesn't make sense
if you honestly want an objective opinion.

R
 
Michael Hoffman

ross said:
I want to do some tricky text file manipulation on many files, but have
only a little programming knowledge.

What are the ideal languages for the following examples?

1. Starting from a certain folder, look in the subfolders for all
filenames matching *FOOD*.txt. Any files matching in each folder should
be copied to a new subfolder within the current folder called EATING
with a new name of *FOOD*COPY.txt

This should get you started:

import errno
from path import path # http://www.jorendorff.com/articles/python/path/

dst_dirpath = path("EATING")

# create dst_dirpath
try:
    dst_dirpath.makedirs() # make destination directory and its parents
except OSError, err: # error!
    # fine if it already exists and is a directory; otherwise re-raise
    if err.errno != errno.EEXIST or not dst_dirpath.isdir():
        raise

for filepath in path(".").walkfiles("*FOOD*.txt"):
    infile = open(filepath)
    # open the copy for writing
    outfile = open(dst_dirpath.joinpath(filepath.namebase+"_COPY.txt"), "w")

    # ...do processing here...
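
One way the processing step might look is sketched below. It is only a sketch under several assumptions that are not stated in the question: the quoted annotations in the example ('whitespace at start of line is unimportant, and so on) are not in the real files, the "Eat from" list continues onto the next line whenever a line ends with a comma, the chapters run to the end of the file, and the chapters appear in the same order as the comma-delimited list. The function name add_foods and its parameters are hypothetical.

```python
import re

GET_RE = re.compile(r"\s*Get (\w+) from \w+ shop")
CHAPTER_RE = re.compile(r"\s*Chapter \d+\s*$")


def add_foods(lines, additions, new_text):
    """Return a new list of lines with the extra foods worked in.

    lines     -- the file contents, one string per line, no newlines
    additions -- list of (new_item, existing_item_to_follow) pairs
    new_text  -- dict mapping each new item to its list of chapter lines
    """
    # 1. Locate the landmarks: last Get line, the Eat list, first chapter.
    last_get = max(i for i, line in enumerate(lines) if GET_RE.match(line))
    eat_start = next(i for i, line in enumerate(lines)
                     if line.strip().startswith("Eat from"))
    eat_end = eat_start
    while lines[eat_end].rstrip().endswith(","):  # list continues below
        eat_end += 1
    first_chapter = next(i for i, line in enumerate(lines)
                         if CHAPTER_RE.match(line))

    # 2. Parse the comma-delimited list and insert the new items.
    eat_text = " ".join(line.strip() for line in lines[eat_start:eat_end + 1])
    items = [s.strip() for s in eat_text[len("Eat from"):].split(",")]
    old_items = list(items)
    for new_item, after in additions:
        items.insert(items.index(after) + 1, new_item)

    # 3. Split the chapter section into one body per old item.
    bodies = {}
    chapter_index = -1
    for line in lines[first_chapter:]:
        if CHAPTER_RE.match(line):
            chapter_index += 1
            bodies[old_items[chapter_index]] = []
        else:
            bodies[old_items[chapter_index]].append(line)
    bodies.update(new_text)

    # 4. Reassemble, renumbering the chapters to follow the new list.
    out = lines[:last_get + 1]
    out += ["Get %s from %s shop" % (name, name) for name, _ in additions]
    out += lines[last_get + 1:eat_start]
    out.append("Eat from " + ", ".join(items))   # written as a single line
    out += lines[eat_end + 1:first_chapter]
    for number, item in enumerate(items, 1):
        out.append("Chapter %d" % number)
        out += bodies[item]
    return out
```

Inside the loop above it could be called along the lines of add_foods(infile.read().splitlines(), [("bagels", "apples"), ("donuts", "plums")], new_text), with the result joined by newlines and written to outfile. The new_text dict would be filled beforehand from new_foods.txt or from bagels.txt and donuts.txt.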

ross said:
My first objective is to process the files as described.
My second objective is to learn the best language for this sort of text
manipulation. The language should run on Windows 98, XP and Linux.

Would Python be best, or would a macro-scripting thing like AutoHotKey
work?

Personally, I'd use Python, but what do you expect when you ask here?
 
ross

Roose said:
Why do people keep asking what language to use for certain things in the
Python newsgroup? Obviously the answer is going to be biased.

Not that it's a bad thing because I love Python, but it doesn't make sense
if you honestly want an objective opinion.

R

What usenet group is it best to ask in then?
Is there one where people have good knowledge of many scripting
languages?

Ross
 
beliavsky

ross said:
What usenet group is it best to ask in then?
Is there one where people have good knowledge of many scripting
languages?

"What programming language is best for x" questions can be asked in
comp.programming and/or comp.lang.misc, and possibly in a
domain-specific newsgroup if it exists, for example
sci.math.num-analysis if x = scientific computing. The resulting
debates contain both heat and light :).
 
Brian

Hi Roose,

Actually, it is a good thing because it allows those who know the Python
language to be able to show the benefits and weaknesses of the language.
Sure, the attitude here will be "Yes, it's a great language." Yet, at
the same time, it also lets the poster see potential benefits of Python
that he or she may not have been aware of.

If we don't let others know about the benefits of Python, who will?

Brian
 
Dan Christensen

ross said:
What are the ideal languages for the following examples?

1. Starting from a certain folder, look in the subfolders for all
filenames matching *FOOD*.txt. Any files matching in each folder should
be copied to a new subfolder within the current folder called EATING
with a new name of *FOOD*COPY.txt

Bash?

mkdir -p EATING; for f in *FOOD*.txt; do cp "$f" "EATING/${f%.txt}COPY.txt"; done

Or "mmv", a linux utility:

mmv -c '*FOOD*.txt' 'EATING/#1FOOD#2COPY.txt'

For the rest, I would personally choose Python.
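
For comparison, the one-liners above only look at the current folder. A rough Python sketch that also walks the subfolders, using nothing but the standard library, might look like the following. It assumes one reading of the request, namely that every folder containing a match gets its own EATING subfolder; the function name copy_food_files and its parameters are invented for this example. Note that os.walk also visits the starting folder itself, and that fnmatch is case-sensitive on Linux but not on Windows.

```python
import fnmatch
import os
import shutil


def copy_food_files(root, pattern="*FOOD*.txt", dest_name="EATING"):
    """Copy matching files in each folder under root into that folder's
    dest_name subfolder, adding COPY before the extension."""
    copied = []
    for folder, subfolders, filenames in os.walk(root):
        # Do not descend into destination folders left by an earlier run.
        if dest_name in subfolders:
            subfolders.remove(dest_name)
        matches = fnmatch.filter(filenames, pattern)
        if not matches:
            continue
        dest = os.path.join(folder, dest_name)
        if not os.path.isdir(dest):
            os.makedirs(dest)
        for name in matches:
            base, ext = os.path.splitext(name)
            target = os.path.join(dest, base + "COPY" + ext)
            shutil.copy(os.path.join(folder, name), target)
            copied.append(target)
    return copied
```

Calling copy_food_files(".") from the starting folder returns the list of copies it made, which is convenient for checking the result before running the text processing on them.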

Dan
 
Jim

Roose said:
Why do people keep asking what language to use for certain things in the
Python newsgroup? Obviously the answer is going to be biased.

Not that it's a bad thing because I love Python, but it doesn't make sense
if you honestly want an objective opinion.

It will, however, have the side effect of helping people who google for
it tomorrow. I've often found an answer, several months old, that people
on a group had taken the trouble to write out patiently, and it was a
big help to me. In this case I can imagine a person who has heard that
Python is in a class of languages like Perl and Ruby, and who googles
around with some keywords to get some idea of whether it can solve
their problem.

Jim
 
Cameron Laird

beliavsky said:
"What programming language is best for x" questions can be asked in
comp.programming and/or comp.lang.misc, and possibly in a
domain-specific newsgroup if it exists, for example
sci.math.num-analysis if x = scientific computing. The resulting
debates contain both heat and light :).

comp.lang.python is actually a fine place to ask such questions,
I submit, for reasons the original poster could not have known:
clp includes quite a few deeply-experienced commentators, and the
ethos of clp favors accuracy over invective far more than some
other newsgroups nominally better focused on general questions.
 
ross

I tried Bash on Cygwin, but did not know enough about setting up the
environment to get it working.
Instead I got an excellent answer from alt.msdos.batch which used the
FOR IN DO command.
My next job is to learn Python.
Ross
 
