Newbie: working with a text file and converting to xml

Discussion in 'Ruby' started by Adam Teale, Dec 6, 2006.

  1. Adam Teale

    Adam Teale Guest

    hi Guys,

    I have a tab-delimited text file that I would like to convert into an
    xml file that can be read/imported into Apple's Final Cut Pro.

    The text file is 2 columns.
    The first column is the time (timecode)
    The second column is text (for sub-titling)

    I thought this might be a good starting project to get into Ruby

    Any suggestions on how I might approach this?

    Thanks!

    Adam Teale

    --
    Posted via http://www.ruby-forum.com/.
    Adam Teale, Dec 6, 2006
    #1

  2. > I have a tab-delimited text file that I would like to convert into an
    > xml file that can be read/imported into Apple's Final Cut Pro.
    >
    > The text file is 2 columns.
    > The first column is the time (timecode)
    > The second column is text (for sub-titling)
    >
    > I thought this might be a good starting project to get into Ruby
    >
    > Any suggestions on how I might approach this?


    look at XMLBuilder and FasterCSV

    Set up FasterCSV to use a tab as the delimiter instead of the comma,
    use it to read the input, and then use XMLBuilder to output
    <timecode>data</timecode><sub-title>data</sub-title>

    should be fairly simple, or you can avoid libraries and do it by
    yourself to learn more about ruby without getting bogged down in 3rd
    party libs

    x = Builder::XmlMarkup.new(:target => $stdout, :indent => 1)
    x.instruct!
    x.timecode data
    x.tag!("sub-title", data)

    etc
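
For the do-it-yourself route, here is a minimal sketch with no libraries at all. The helper names (xml_escape, line_to_item) are made up for illustration, and the item/timecode/subtitle element names are placeholders, not Final Cut Pro's real schema. The one thing worth not skipping is escaping &, < and > in the subtitle text, otherwise a subtitle like "Tom & Jerry" produces invalid XML.

```ruby
# Characters that must be escaped inside XML text content.
XML_ESCAPES = { "&" => "&amp;", "<" => "&lt;", ">" => "&gt;" }.freeze

# Hypothetical helper: escape the XML-special characters in a string.
def xml_escape(text)
  text.to_s.gsub(/[&<>]/, XML_ESCAPES)
end

# Hypothetical helper: turn one "timecode<TAB>subtitle" line into an element.
# The element names are placeholders, not the Final Cut Pro format.
def line_to_item(line)
  timecode, subtitle = line.chomp.split("\t")
  "<item><timecode>#{xml_escape(timecode)}</timecode>" \
    "<subtitle>#{xml_escape(subtitle)}</subtitle></item>"
end

puts line_to_item("00:00:42:21\tDurbar Square")
```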

    Kev
    Kevin Jackson, Dec 6, 2006
    #2

  3. Adam Teale

    Peter Szinek Guest

    Adam Teale wrote:
    > hi Guys,
    >
    > I have a tab-delimited text file that I would like to convert into an
    > xml file that can be read/imported into Apple's Final Cut Pro.
    >
    > The text file is 2 columns.
    > The first column is the time (timecode)
    > The second column is text (for sub-titling)


    Could you send us 2 example files? I guess the text file format is
    obvious (but better to work with a real-life example) but I am not so
    sure about the Final Cut Pro XML (or is it just a plain simple XML?)

    Until then, check out this code:

    ============================================================
    input = <<INPUT
    0.12 Salut, Foo!
    0.15 Hola Bar! Did you see Baz?
    0.22 I guess he is hanging around with Fluff and Ork.
    INPUT

    template = <<TEMPLATE
    <timecode>TIMECODE</timecode>
    <sub-titling>SUB-TITLING</sub-titling>
    TEMPLATE

    result = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>\n<subtitles>\n"

    input.split(/\n/).each do |line|
      data = line.split(/\t/)
      result += template.sub('TIMECODE'){data[0]}.sub('SUB-TITLING'){data[1]}
    end

    result += '</subtitles>'

    puts result
    ============================================================

    output:

    <?xml version="1.0" encoding="ISO-8859-1"?>
    <subtitles>
    <timecode>0.12</timecode>
    <sub-titling>Salut, Foo!</sub-titling>
    <timecode>0.15</timecode>
    <sub-titling>Hola Bar! Did you see Baz?</sub-titling>
    <timecode>0.22</timecode>
    <sub-titling>I guess he is hanging around with Fluff and Ork.</sub-titling>
    </subtitles>


    Cheers,
    Peter

    __
    http://www.rubyrailways.com
    Peter Szinek, Dec 6, 2006
    #3
  4. Adam Teale

    Adam Teale Guest

    Hi Kev & Peter!

    Thanks for responding so quickly!

    The text file looks pretty much like that

    00:00:30:13 Swayambhunath Temple: building started 460AD
    00:00:42:21 Durbar Square
    00:01:05:06 Driving to Trisuli River for Rafting
    00:01:55:22 Day 1 Trekking: Pokhara to Tirkhedhunga (1540m)
    00:02:20:20 Day 2 Trekking: Tirkhedhunga to Ghorephani (2750m)
    00:02:33:19 Day 3 Trekking: Ghorephani to Ghandruk (1940m)
    00:02:42:04 Day 4 Trekking: Ghandruk to Pothana (1900m)
    00:03:10:13 Day 5 Trekking: Pothana to Phedi (1130m)

    It'll take a while for your example to filter down into my brain - when
    it does I'll get back to you about it.

    Awesome!

    Thank you so much!

    Adam




    --
    Posted via http://www.ruby-forum.com/.
    Adam Teale, Dec 6, 2006
    #4
  5. Adam Teale

    Peter Szinek Guest

    Adam Teale wrote:
    > The text file looks pretty much like that


    Then it should be fine, as long as there are no tabs in the second
    column. Tabs there would not be an unsolvable problem, but the code I
    sent you would not handle them.
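
If tabs in the subtitle column ever do turn up, one possible way around it is the optional limit argument of String#split. This is only a sketch; split_row is a made-up helper name and the sample line with an extra tab is invented for the example.

```ruby
# Hypothetical helper: split a row into at most two fields, so any
# further tabs stay inside the subtitle text.
def split_row(line)
  line.chomp.split("\t", 2)
end

# Invented sample line with a stray tab inside the subtitle.
line = "00:01:55:22\tDay 1 Trekking:\tPokhara to Tirkhedhunga (1540m)"

timecode, subtitle = split_row(line)
puts timecode
puts subtitle.inspect
```

Without the limit, the same line splits into three fields and the text after the second tab is silently dropped when only two variables are assigned.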

    > It'll take a while for your example to filter down into my brain - when
    > it does I'll get back to you about it.


    Sure!

    >
    > Awesome!

    Yeah, Ruby is awesome! I am a beginner, too (picked up Ruby a few months
    ago) and though I have very limited time to learn it, I can do a lot of
    things already. The learning curve is really steep.

    Cheers,
    Peter

    __
    http://www.rubyrailways.com
    Peter Szinek, Dec 6, 2006
    #5
  6. Adam Teale

    Adam Teale Guest

    Hi Peter,

    I saved your code and called it convert.rb. I ran it (replacing
    'filename' with the path of my text file - was that right to do?)

    i got this error:
    convert.rb:1: unknown regexp options - atal

    any ideas?

    also, do you know if there is any way to run a script from the
    command line, like this?
    ./convert.rb mytextfile.txt
    i made a shell script that used this kind of thing - it took the input
    file as something like $ARGV (i think - sorry i'm a super newbie!!)
    make sense?

    Thanks Peter!

    Adam




    --
    Posted via http://www.ruby-forum.com/.
    Adam Teale, Dec 6, 2006
    #6
  7. Adam Teale

    Peter Szinek Guest

    Adam Teale wrote:
    > Hi Peter,
    >
    > I saved your code and called it convert.rb. I ran it (replacing
    > 'filename' with the path of my text file - was that right to do?)
    >
    > i got this error:
    > convert.rb:1: unknown regexp options - atal
    >
    > any ideas?

    I guess you are referring to Paul's solution since I did not use any
    files :) In any case, could you paste the code here (convert.rb) so I
    can check what's going on?

    > also, do you know if thereis any way to run a script from the
    > commandline like?:
    > ./convert.rb mytextfile.txt


    Sure. The array called ARGV contains all the command line options.

    ------ test.rb
    #!/usr/bin/ruby
    puts ARGV[0]
    puts ARGV[1]
    ------

    ./test.rb foo bar

    will output

    ----
    foo
    bar
    ----
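
For a converter that needs exactly one input file, it can help to check ARGV before using it. A small sketch, where input_path is a made-up helper name and the usage text is just an example. Note that running a script as ./convert.rb also needs the #!/usr/bin/ruby line at the top and the file marked executable (chmod +x convert.rb); otherwise run it as ruby convert.rb mytextfile.txt.

```ruby
# Hypothetical helper: return the first command-line argument,
# or raise with a usage hint when none was given.
def input_path(argv)
  path = argv[0]
  raise ArgumentError, "usage: convert.rb INPUT_FILE" if path.nil?
  path
end

# Called here with a literal array; a real script would pass ARGV.
puts input_path(["mytextfile.txt"])
```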

    Cheers,
    Peter

    __
    http://www.rubyrailways.com
    Peter Szinek, Dec 6, 2006
    #7
  8. Adam Teale

    Adam Teale Guest

    doh! Sorry guys!

    Peter - thanks for the ARGV tips!

    I think i have Paul's script going using the ARGV
    ---------------------------------------------------
    #!/usr/bin/ruby -w

    data = File.read(ARGV[0])

    output = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>\n"

    data.each_line do |line|
      timecode,subtitle = line.strip.split("\t")
      xml = "<item><timecode>#{timecode}</timecode><subtitle>#{subtitle}</subtitle></item>"
      output += xml + "\n"
    end

    File.open("output.xml","w") { |f| f.write output }
    ---------------------------------------------------


    However it only outputs the first line from my txt file:
    ---------------------------------------------------
    <?xml version="1.0" encoding="ISO-8859-1"?>
    <item><timecode>00:00:30:13</timecode><subtitle>Swayambhunath Temple: building started 460AD
    00:00:42:21</subtitle></item>
    ---------------------------------------------------

    Apologies for my newbieness!

    Cheers guys!

    Adam




    --
    Posted via http://www.ruby-forum.com/.
    Adam Teale, Dec 6, 2006
    #8
  9. Adam Teale

    Peter Szinek Guest

    Hi,
    >
    > However it only outputs the first line from my txt file:
    > ---------------------------------------------------
    > <?xml version="1.0" encoding="ISO-8859-1"?>
    > <item><timecode>00:00:30:13</timecode><subtitle>Swayambhunath Temple:
    > building started 460AD
    > 00:00:42:21</subtitle></item>
    > ---------------------------------------------------

    Hmm, strange. I cut'n'pasted this code and the data from your previous
    mail, and for me it works perfectly (as do all of Paul's other
    solutions). Are you sure your input txt file is OK?

    Are you on Mac? Maybe there can be something with the line breaks?
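
If line breaks are the culprit, a likely cause (an assumption, since we have not seen the actual file) is that the file uses lone carriage returns, as older Mac applications wrote them. Ruby then sees the whole file as a single line. A sketch of one way to cope, with normalize_newlines as a made-up helper name:

```ruby
# Hypothetical helper: convert CRLF (Windows) and lone CR (classic Mac)
# line endings to LF so that each_line sees every row.
def normalize_newlines(text)
  text.gsub(/\r\n?/, "\n")
end

# Invented sample data with CR-only line endings.
raw = "00:00:30:13\tSwayambhunath Temple\r00:00:42:21\tDurbar Square\r"

normalize_newlines(raw).each_line do |line|
  timecode, subtitle = line.chomp.split("\t")
  puts "#{timecode} -> #{subtitle}"
end
```

In the script this would mean wrapping the File.read call, along the lines of normalize_newlines(File.read(ARGV[0])).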

    > Apologies for my newbieness!

    No need to apologize. In no time, *you* will be answering others'
    questions :)

    Peter

    __
    http://www.rubyrailways.com
    Peter Szinek, Dec 6, 2006
    #9
  10. Adam Teale

    Adam Teale Guest

    Ah thanks Peter - yes, on OSX - you are right, there is something funny
    with the line breaks! Weird!

    now i just have to work out how to add all the FCP xml stuff in there

    I appreciate all your help & encouraging words!!





    --
    Posted via http://www.ruby-forum.com/.
    Adam Teale, Dec 6, 2006
    #10
