Searching for text

robinsiebler

I have a batch of files that I am trying to search for specific text in
a specific format. Each file contains several items I want to search
for.

Here is a snippet from the file:
....
/FontName /ACaslonPro-Semibold def
/FontInfo 7 dict dup begin
/Notice (Copyright 2000 Adobe Systems Incorporated. All Rights
Reserved.Adobe Caslon is either a registered trademark or a trademark
of Adobe Systems Incorporated in the United States and/or other
countries.) def
/Weight (Semibold) def
/ItalicAngle 0 def
/FSType 8 def
....

I want to search the file until I find '/FontName /ACaslonPro-Semibold'
and then jump forward 7 lines where I expect to find '/FSType 8'. I
then want to continue searching from *that* point forward for the next
FontName/FSType pair. Unfortunately, I haven't been able to figure out
how to do this in Python, although I could do it fairly easily in a
batch file. Would someone care to enlighten me?
 
Tim Chase

I want to search the file until I find '/FontName /ACaslonPro-Semibold'
and then jump forward 7 lines where I expect to find '/FSType 8'. I
then want to continue searching from *that* point forward for the next
FontName/FSType pair. Unfortunately, I haven't been able to figure out
how to do this in Python, although I could do it fairly easily in a
batch file. Would someone care to enlighten me?

found_fontname = False
font_search = '/FontName /ACaslonPro-Semibold'
type_search = '/FSType 8'
with open('foo.txt') as f:
    for line in f:
        if font_search in line:
            found_fontname = True
        if found_fontname and type_search in line:
            print('doing something with %s' % line)
            # reset to look for font_search
            found_fontname = False


or, you could

sed -n '/\/FontName \/ACaslonPro-Semibold/,/\/FSType 8/{/\/FSType 8/p}'

You omit what you want to do with the results when you find
them...or what should happen when they both appear on the same
line (though you hint that they're a couple lines apart, you
don't define this as a "this is always the case" sort of scenario)

-tkc
 
robinsiebler

You omit what you want to do with the results when you find
them...or what should happen when they both appear on the same
line (though you hint that they're a couple lines apart, you
don't define this as a "this is always the case" sort of scenario)

I don't do anything, per se. I just need to verify that I find the
FontName/FSType pair. And they *always* have to be in the same
location in relation to each other, i.e. they should never appear on
the same line or any closer/farther from each other.
 
robinsiebler

The other thing I failed to mention is that I need to ensure that I
find the fsType *before* I find the next FontName.
 
Tim Chase

The other thing I failed to mention is that I need to ensure that I
find the fsType *before* I find the next FontName.

found_fontname = False
font_search = '/FontName /ACaslonPro-Semibold'
type_search = '/FSType 8'
with open('foo.txt') as f:
    for line in f:
        if font_search in line:
            if found_fontname:
                # a second FontName arrived before any FSType
                print("Uh, oh!")
            else:
                found_fontname = True
        if found_fontname and type_search in line:
            print('doing something with %s' % line)
            # reset to look for font_search
            found_fontname = False

and look for it to report "Uh, oh!" where it has found another
"/FontName /ACaslonPro-Semibold".

You can reduce your font_search to just '/FontName' if that's all
you care about, or if you just want any '/FontName' inside an
'/ACaslonPro-Semibold' block, you can tweak it to be something like

with open('foo.txt') as f:
    for line in f:
        # check first, so the opening FontName line doesn't flag itself
        if found_fontname and '/FontName' in line:
            print("Uh, oh!")
        if font_search in line:
            found_fontname = True
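
If you'd rather have the results back as data than as printed messages, the same flag logic can be wrapped in a function. This is only a rough sketch: the function name, its defaults, and the choice to return 1-based line numbers are my own assumptions, not anything from the posts above. It takes any iterable of lines, so you can try it on a list before pointing it at a file.

```python
def find_unmatched_fontnames(lines,
                             font_search='/FontName /ACaslonPro-Semibold',
                             type_search='/FSType 8'):
    """Return the 1-based line numbers of FontName lines that are not
    followed by an FSType line before the next FontName (or end of input).

    Hypothetical helper for illustration only.
    """
    unmatched = []
    pending = None  # line number of a FontName still waiting for its FSType
    for number, line in enumerate(lines, 1):
        if font_search in line:
            if pending is not None:
                # the previous FontName never got its FSType
                unmatched.append(pending)
            pending = number
        elif pending is not None and type_search in line:
            pending = None  # pair completed, look for the next FontName
    if pending is not None:
        # input ended while still waiting for an FSType
        unmatched.append(pending)
    return unmatched


sample = [
    '/FontName /ACaslonPro-Semibold def',
    '/Weight (Semibold) def',
    '/FSType 8 def',
    '/FontName /ACaslonPro-Semibold def',  # no FSType follows this one
    '/ItalicAngle 0 def',
    '/FontName /ACaslonPro-Semibold def',
    '/FSType 8 def',
]
result = find_unmatched_fontnames(sample)
```

An empty list means every FontName was paired. Note that this only checks ordering, not the exact distance between the two lines.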

-tkc
 
Simon Forman

robinsiebler said:
The other thing I failed to mention is that I need to ensure that I
find the fsType *before* I find the next FontName.

Given these requirements, I'd formulate the script something like this:


f = open(filename)

# Lines that must separate the FontName line from the FSType line;
# adjust this to match the layout of your files.
NUM_LINES_BETWEEN = 7

Fo = '/FontName /ACaslonPro-Semibold'
FS = '/FSType 8'


def checkfile(f):
    # Get a (index, line) generator on the file.
    G = enumerate(f)

    for i, line in G:

        # make sure we don't find a FSType
        if FS in line:
            print('Found FSType without FontName %i' % i)
            return False

        # Look for FontName.
        if Fo in line:
            print('Found FontName at line %i' % i)

            try:

                # Check the next 7 lines for NO FSType
                # and NO FontName
                n = NUM_LINES_BETWEEN
                while n:
                    i, line = next(G)

                    if FS in line:
                        print('Found FSType prematurely at %i' % i)
                        return False

                    if Fo in line:
                        print("Found '%s' before '%s' at %i" %
                              (Fo, FS, i))
                        return False
                    n -= 1

                # Make sure there's a FSType.
                i, line = next(G)

                if FS in line:
                    print('Found FSType at %i' % i)

                elif Fo in line:
                    print("Found '%s' instead of '%s' at %i" %
                          (Fo, FS, i))
                    return False

                else:
                    print('FSType not found at %i' % i)
                    return False

            except StopIteration:
                print('File ended before FSType found.')
                return False

    return True


if checkfile(f):
    # File passes...
    pass


Be sure to close your file object when you're done with it. And you
might want fewer or different print statements.
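
To illustrate both points, here is a rough variant that closes the file with a `with` block and collects problems in a list instead of printing them. Treat it as a sketch under assumptions: the names `check_lines` and `check_file`, the message strings, and the meaning of `offset` (FSType is expected exactly `offset` lines after FontName) are mine, so set `offset` to whatever your files actually use. Unlike the script above, it reads the whole file into memory and does not flag a stray FSType that has no FontName before it.

```python
def check_lines(lines, font='/FontName /ACaslonPro-Semibold',
                fstype='/FSType 8', offset=7):
    """Return a list of (line_number, message) tuples, empty if all is well.

    Hypothetical helper: FSType must sit exactly `offset` lines after
    each FontName line, with neither marker on the lines in between.
    """
    lines = list(lines)
    problems = []
    for i, line in enumerate(lines):
        if font not in line:
            continue
        target = i + offset
        if target >= len(lines):
            problems.append((i + 1, 'file ended before FSType'))
        elif any(font in l or fstype in l for l in lines[i + 1:target]):
            problems.append((i + 1, 'FontName or FSType found too early'))
        elif fstype not in lines[target]:
            problems.append((i + 1, 'FSType missing at expected offset'))
    return problems


def check_file(path, **kwargs):
    # The with block closes the file even if check_lines raises.
    with open(path) as f:
        return check_lines(f, **kwargs)
```

For a batch of files you could loop over the paths and keep only the ones where `check_file(path)` returns a non-empty list.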

HTH

Peace,
~Simon
 
