more search and replace

ishamid

[Total novice]

A follow-up on my last email ("search and replace"). I am trying to
convert an OOo xml source (content.xml) to TeX. It's a bibliography and
thus very predictable/regular/simple etc. Each entry looks roughly like
this (simplified):

====================================
<text:p text:style-name="ID">[<text:sequence text:ref-name="refAutoNr3"
text:name="AutoNr" text:formula="ooow:AutoNr+1"
style:num-format="1">4</text:sequence></text:p>
<text:p text:style-name="Standard">Ben</text:p>
<text:p text:style-name="reference">
<text:span text:style-name="T10">Article</text:span>.,
<text:span text:style-name="Style2">Journal</text:span>,
volume, issue, year.
</text:p>
<text:p text:style-name="reference"/>
<text:p text:style-name="reference"/>
====================================

I. The first line was discussed in my last email. Basically, each line of
this type (the numbers vary) needs to be converted to

====
\head
====
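
A minimal sketch of step I in Ruby, assuming every numbered line is a text:p element whose style-name is exactly "ID", as in the sample above. The constant and method names are hypothetical and not part of the script in the PS.

```ruby
# Hypothetical helper: replace each "ID" paragraph with a bare \head.
# The /m flag lets ".*?" run across the line breaks inside the element.
ID_PARAGRAPH = /<text:p\s+text:style-name=(["'])ID\1>.*?<\/text:p>/m

def replace_id_paragraphs(xml)
  # The block form avoids backslash handling in a replacement string.
  xml.gsub(ID_PARAGRAPH) { '\head' }
end

sample = <<~XML
  <text:p text:style-name="ID">[<text:sequence text:ref-name="refAutoNr3"
  text:name="AutoNr" text:formula="ooow:AutoNr+1"
  style:num-format="1">4</text:sequence></text:p>
  <text:p text:style-name="Standard">Ben</text:p>
XML

puts replace_id_paragraphs(sample)
```

Other paragraphs are left untouched, so this step can run before the general paragraph handling.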

II.
====================================
<text:p text:style-name="P6">Jim</text:p>
<text:p text:style-name="P8">Michael</text:p>
<text:p text:style-name="Standard">Ben</text:p>
====================================

Replace each with the name followed by a blank line:

====================================
Jim

Michael

Ben
====================================
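
A minimal sketch of step II in Ruby, assuming each name sits in its own single text:p element and that the style name (P6, P8, Standard) does not matter here. The names PLAIN_PARAGRAPH and unwrap_paragraphs are hypothetical.

```ruby
# Hypothetical helper: strip the <text:p ...> wrapper and keep the text,
# followed by a newline so that each name ends up on its own paragraph.
PLAIN_PARAGRAPH = /<text:p\s+text:style-name=(["'])[^"']*\1>(.*?)<\/text:p>/m

def unwrap_paragraphs(xml)
  xml.gsub(PLAIN_PARAGRAPH) { "#{$2.strip}\n" }
end

sample = <<~XML
  <text:p text:style-name="P6">Jim</text:p>
  <text:p text:style-name="P8">Michael</text:p>
  <text:p text:style-name="Standard">Ben</text:p>
XML

puts unwrap_paragraphs(sample)
```

Because the pattern accepts any style name, it would also unwrap the "ID" paragraphs, so those need to be handled first.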

III. <text:span text:style-name="T10">Article</text:span>

If style-name="T10", the output should be, e.g., {\bf Article};
if style-name="Style2", the output should be, e.g., {\it Journal}.
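
A minimal sketch of step III in Ruby, assuming the style-to-markup mapping is kept in a hash of [before, after] pairs, much like the inline hash in the script in the PS. SPAN_STYLES and convert_spans are hypothetical names; unknown styles are passed through without markup.

```ruby
# Hypothetical style table: style-name => [text before, text after].
SPAN_STYLES = {
  'T10'    => ['{\bf ', '}'],
  'Style2' => ['{\it ', '}']
}.freeze

SPAN = /<text:span\s+text:style-name=(["'])(.*?)\1>(.*?)<\/text:span>/m

def convert_spans(xml)
  xml.gsub(SPAN) do
    style, text = $2, $3
    before, after = SPAN_STYLES.fetch(style, ['', ''])
    "#{before}#{text}#{after}"
  end
end

puts convert_spans('<text:span text:style-name="T10">Article</text:span>')
puts convert_spans('<text:span text:style-name="Style2">Journal</text:span>')
```

With the script in the PS, the same effect should be reachable by adding entries such as doc.inline['T10'] = ['{\bf ','}'] next to the existing T1 and T2 lines.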

IV. So the final output should be something like

====================================
\head Ben

{\bf Article}, {\it Journal}, volume, issue, year.

====================================
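
A rough sketch of how steps I to III could be combined in Ruby to get close to this output, assuming the entry layout from the sample at the top. All names are hypothetical. Note that the "." after the Article span in the sample is carried over unchanged, so the result reads {\bf Article}., until that is cleaned up separately.

```ruby
# Hypothetical style table, modelled on the script's inline hash.
ENTRY_SPAN_STYLES = {
  'T10'    => ['{\bf ', '}'],
  'Style2' => ['{\it ', '}']
}.freeze

def convert_entry(xml)
  out = xml.dup
  # I. The numbered "ID" paragraph becomes \head, joined to the next paragraph.
  out.gsub!(/<text:p\s+text:style-name=(["'])ID\1>.*?<\/text:p>\s*/m) { '\head ' }
  # Empty paragraphs such as <text:p text:style-name="reference"/> are dropped.
  out.gsub!(/<text:p[^>]*\/>\s*/) { '' }
  # III. Spans are wrapped according to their style; unknown styles pass through.
  out.gsub!(/<text:span\s+text:style-name=(["'])(.*?)\1>(.*?)<\/text:span>/m) do
    style, text = $2, $3
    before, after = ENTRY_SPAN_STYLES.fetch(style, ['', ''])
    "#{before}#{text}#{after}"
  end
  # II. Remaining paragraphs become their text on one line plus a blank line.
  out.gsub!(/<text:p\s[^>]*>(.*?)<\/text:p>\s*/m) { $1.split.join(' ') + "\n\n" }
  out
end

entry = <<~XML
  <text:p text:style-name="ID">[<text:sequence text:ref-name="refAutoNr3"
  text:name="AutoNr" text:formula="ooow:AutoNr+1"
  style:num-format="1">4</text:sequence></text:p>
  <text:p text:style-name="Standard">Ben</text:p>
  <text:p text:style-name="reference">
  <text:span text:style-name="T10">Article</text:span>.,
  <text:span text:style-name="Style2">Journal</text:span>,
  volume, issue, year.
  </text:p>
  <text:p text:style-name="reference"/>
  <text:p text:style-name="reference"/>
XML

puts convert_entry(entry)
```

The order matters in this sketch: the ID and empty paragraphs are handled before the general paragraph rule, which would otherwise swallow them.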

I hope to get enough info here to be able to finish this myself. I
assume finishing my script would only take one of you guys 15 or 20
minutes ;-) If I'm not able to get things working quickly (trying to
learn Ruby and do my work at the same time) I will be happy to pay one
of you for an hour or so of work (I'm up against a deadline).

THANK YOU
Idris

PS For reference, here is the script I'm trying to modify for this OOo
bibliography:

=====================================
class OpenOffice

# using an xml parser is overkill and we need to regexp anyway

attr_accessor :display, :inline, :translate

def initialize
@data = nil
@file = ''
@display = Hash.new
@inline = Hash.new
@translate = Hash.new
end

def load(filename)
if not filename.empty? and FileTest.file?(filename) then
begin
@data, @file = IO.read(filename), filename
rescue
@data, @file = nil, ''
end
else
@data, @file = nil, ''
end
end

def save(filename='')
if filename.empty? then
filename = "clean-#{@file}"
end
if f = open(filename,'w') then
f.puts(@data)
f.close
end
end

def convert
@translations = Hash.new
@translate.each do |k,v|
@translations[/#{k}/] = v
end
if @data then
@data.gsub!(/<\?.*?\?>/) do
# remove
end
@data.gsub!(/<!--.*?-->/) do
# remove
end
# keep only the document body; /m lets "." match newlines
@data.gsub!(/.*?<(office:text).*?>(.*?)<\/\1>.*/mi) do
'\starttext' + "\n" + $2 + "\n" + '\stoptext'
end

@data.gsub!(/<(office:font-face-decls|office:automatic-styles|text:sequence-decls).*?>.*?<\/\1>/mi) do
# remove
end

@data.gsub!(/<text:span.*?text:style-name=([\'\"])(.*?)\1>(.*?)<\/text:span>/) do
tag, text = $2, $3
if inline[tag] then
(inline[tag][0]||'') + clean_display(text) +
(inline[tag][1]||'')
else
clean_display(text)
end
end
@data.gsub!(/<text:p[^>]*?\/>/) do
# remove
end

@data.gsub!(/<text:p.*?text:style-name=([\'\"])(.*?)\1>(.*?)<\/text:p>/) do
tag, text = $2, $3
if display[tag] then
"\n" + (display[tag][0]||'') + clean_inline(text) +
(display[tag][1]||'') + "\n"
else
"\n" + clean_inline(text) + "\n"
end
end
@data.gsub!(/\t/,' ')
@data.gsub!(/^ +$/,'')
@data.gsub!(/\n\n+/moi,"\n\n")
end
end

def clean_display(str)
str.gsub!(/&quot;(.*?)&quot;/) do
'\quotation {' + $1 + '}'
end
str
end

def clean_inline(str)
@translations.each do |k,v|
str.gsub!(k,v)
end
str
end

end

def convert(filename)

doc = OpenOffice.new

doc.display['P1'] = ['\chapter{','}']
doc.display['P2'] = ['\startparagraph'+"\n","\n"+'\stopparagraph']
doc.display['P3'] = doc.display['P2']

doc.inline['T1'] = ['','']
doc.inline['T2'] = ['{\sl ','}']

doc.translate['¬'] = 'XX'
doc.translate['&apos;'] = '`'

doc.load(filename)

doc.convert

doc.save
end

filename = ARGV[0]

# fall back to content.xml when no file name is given on the command line
filename = 'content.xml' if not filename or filename.empty?

convert(filename)
=====================================
 
Jeremy McAnally

Are you using OOo 2.0.4? I know it has a TeX/BibTeX export feature now...

It's not Ruby, but it should work (unless you're using this with some
sort of automated system). :)

--Jeremy

ishamid

Hi Jeremy,

> Are you using OOo 2.0.4? I know it has a TeX/BibTeX export feature now...

Wow, I did not know this, but...

> It's not Ruby, but it should work (unless you're using this with some
> sort of automated system). :)

I use ConTeXt, not LaTeX, and the two are really different, so...

I am sending a note to the ConTeXt developers list about this; maybe
some of them can port the OOo LaTeX filters to ConTeXt. In the meantime
I think it's best to finish that script...

Thank you very much for letting me know about OOo and LaTeX!

Best
Idris
 
ishamid

Hi Jeremy,

I checked it out; the source is way too messy for my purposes; it will
be much easier to convert the xml to ConTeXt than the LaTeX to ConTeXt.

Thnx again
Idris
 
