convert text documents to XML

Discussion in 'XML' started by ddddddd, Aug 4, 2009.

  1. ddddddd

    ddddddd Guest

    Hello sir,

    I am Shaji from Kerala, India. I am a programmer, and I have a
    question, which I explain below.

    We have to develop a literature database. It needs to include various
    features such as search by author, search by title, search by
    publication year, etc.

    What we normally do is first store all the literature details in a
    database (author (AU), title (TI), abstract (AB), publication year,
    etc.). To store this information we need to create a literature entry
    form and enter the details through that form. This method works, but
    it is very tedious, since we have around 6000 literature records.

    So is there any other method?
    We are downloading these records from some external websites, and all
    of them are in the same format. I am including the record format
    below; please check it.

    [In the record format below, TI means title, AU means author,
    AB means abstract, KW means keywords, PY means publication year,
    etc.]

    Please check the records below and suggest any new methods.
    Can we convert them into XML directly?


    Literature record format

    Record: 1

    TI- A Survey of Phytophthora Species on Hainan Island of South China.
    AU- Hui-cai Zeng1
    AU- Hon-hing Ho2
    AU- Fuy-Cong Zheng3
    JN- Journal of Phytopathology
    PD- Jan2009, Vol. 157 Issue 1, p33-39
    PG- 7p
    DT- 20090101
    PT- Article
    AB- During the period 1997–2007, a comprehensive study of the
    occurrence and distribution of Phytophthora species was conducted on
    Hainan Island of South China. To date, 14 species of Phytophthora have
    been recovered and their distribution determined. Phytophthora
    nicotianae ( =P. parasitica) is the most important species attacking a
    wide variety of crops, followed by Phytophthora capsici and
    Phytophthora citrophthora. In contrast to Phytophthora colocasiae
    attacking taro leaves throughout the entire island, Phytophthora
    cyperi was found only once on Digitaria ciliaris in Danzhou. It is of
    interest to note that Phytophthora heveae, Phytophthora katsurae and
    Phytophthora insolita are commonly found in forest soil/water of
    protected mountains without causing any plant diseases. Although
    Phytophthora species are usually terrestrial or found in fresh water,
    one isolate of Phytophthora resembling closely the asexual isolates of
    P. insolita in Hainan was obtained from decaying Rhizophora leaves
    submerged in seawater. An unidentified Phytophthora species producing
    non-papillate; internally proliferating sporangia was isolated from
    the soil in which Ceriops tagel and Bruguiera serangula were growing
    in a salt water shrimp farm. [ABSTRACT FROM AUTHOR]
    AB- Copyright of Journal of Phytopathology is the property of
    Blackwell Publishing Limited and its content may not be copied or
    emailed to multiple sites or posted to a listserv without the
    copyright holder's express written permission. However, users may
    print, download, or email articles for individual use. This abstract
    may be abridged. No warranty is given about the accuracy of the copy.
    Users should refer to the original published version of the material
    for the full abstract. (Copyright applies to all Abstracts.)
    DE- PHYTOPHTHORA
    DE- CROPS
    DE- PLANTS -- Wounds & injuries
    DE- PLANT quarantine
    GE- HAINAN Island (China)
    GE- CHINA
    KW- marine isolates
    KW- Phytophthora capsici
    KW- Phytophthora cinnamomi
    KW- Phytophthora citrophthora
    KW- Phytophthora heveae
    KW- Phytophthora insolita
    KW- Phytophthora katsurae
    KW- Phytophthora nicotianae
    AD- 1The Institute of Bioscience and Biotechnology, Chinese Academy of
    Tropical Agricultural Sciences, Haikou 571101, Hainan, China
    AD- 2Department of Biology, State University of New York, New Paltz,
    New York 12561, USA
    AD- 3College of Environment and Plant Protection, Hainan University,
    Baodoa Xincun, Danzhou City, Hainan 571737, China
    IS- 09311785
    DI- 10.1111/j.1439-0434.2008.01441.x
    AN- 35655828
    UR- http://search.ebscohost.com/login.aspx?direct=true&db=a9h&AN=35655828&site=ehost-live

    Record: 2

    TI- Effects of Some Plant Materials on Phytophthora Blight
    (Phytophthora capsici Leon.) of Pepper.
    AT- Bazi Bitkisel Materyallerin Biberde Phytophthora Yanikligi
    ( Phytophthora capsici Leon.)'na Etkileri.
    AU- Demirci, Fikret1
    AU- Dolar, F. Sara1
    JN- Turkish Journal of Agriculture & Forestry
    PD- 2006, Vol. 30 Issue 4, p247-252
    PG- 6p
    DT- 20060801
    PT- Article
    AB- Effects of dried garlic, peppermint, cabbage, lentil, alfalfa,
    onion, radish, and garden cress plant materials on Phytophthora blight
    (Phytophthora capsici Leon.) of pepper were determined, in both in
    vitro and in vivo conditions. Extracts of the plant materials were
    used in vitro. The plant materials were extracted in ethanol and were
    added to corn meal agar (CMA) at 5 and 10 µg ml<sup>-1</sup>. The
    extracts of alfalfa, garlic, cabbage, and peppermint reduced colony
    diameter of P. capsici on corn meal agar between 3.46% and 13.73%,
    whereas mycelial growth of P. capsici was increased by onion, radish,
    garden cress, and lentil extracts. The plant materials inhibitory to
    mycelial growth of P. capsici were incorporated into soil inoculated
    with P. capsici, in pots and also in the field, in order to determine
    their effects on Phytophthora blight severity. The severity of
    Phytophthora blight of pepper was markedly reduced by cabbage, garlic,
    and alfalfa materials by 15.3%, 39.8% and 46.9%, respectively, in pot
    trials. No significant effect of peppermint on disease severity was
    found. In the field infested with P. capsici, disease severity
    decreased with cabbage, garlic, and alfalfa by 89.5%, 40%, and 10.7%,
    respectively. Peppermint slightly increased the disease severity
    (3.4%). In this study, dry cabbage, garlic, and alfalfa materials were
    effective in reducing the severity of disease caused by P. capsici, in
    both in vitro and in vivo conditions. (English) [ABSTRACT FROM
    AUTHOR]
    AB- Kurutulmuş sarımsak, nane, lahana, mercimek, yonca, soğan, turp ve
    tere bitki artıklarının biberde Phytophthora yanıklığı (Phytophthora
    capsici Leon.)'na etkileri in vitro ve in vivo koşullarda
    belirlenmiştir, in vitro koşullardaki çalışmalarda bitki
    materyallerinin ekstraktları kullanılmıştır. Bitki materyalleri etil
    alkolde ekstrakte edilmiş ve mısır unu agara 5 ve 10 ug mi4 dozlarında
    ilave edilmiştir. Yonca, sarımsak, lahana ve nane, P. capsici misel
    gelişimini % 3.46 ila % 13.73 oranında azaltırken, soğan, turp, tere
    ve mercimek ekstraktları, misel gelişimini artırmıştır. P. capsici nin
    misel gelişimine engelleyici etkisi olan bitki materyalleri, biberde
    Phytophthora yanıklığı hastalığının şiddetine etkilerinin belirlenmesi
    amacıyla, içinde P. capsici ile inokule edilmiş toprak bulunan
    saksılara ve tarla toprağına ilave edilmiştir. Saksı denemelerinde
    lahana, sarımsak ve yonca artıkları biberde Phytophthora yanıklığı
    hastalığının şiddetini sırasıyla %15.3, %39.8 ve %46.9 oranında
    azaltmıştır. Nanenin ise hastalık şiddetine önemli bir etkisi
    olmamıştır. Tarla koşullarında lahana, sarımsak ve yonca P. capsici
    nin hastalık şiddetini sırasıyla %89.5, % 40 ve %10.7 oranında
    azaltmış, nane ise %3.4 oranında artırmıştır. Bu çalışmada kuru
    lahana, sarımsak ve yonca materyalleri in vitro ve in vivo koşullarda
    P. capsici ye karşı etkili bulunmuştur (Turkish) [ABSTRACT FROM AUTHOR]
    AB- Copyright of Turkish Journal of Agriculture & Forestry is the
    property of Scientific and Technical Research Council of Turkey and
    its content may not be copied or emailed to multiple sites or posted
    to a listserv without the copyright holder's express written
    permission. However, users may print, download, or email articles for
    individual use. This abstract may be abridged. No warranty is given
    about the accuracy of the copy. Users should refer to the original
    published version of the material for the full abstract. (Copyright
    applies to all Abstracts.)
    DE- PHYTOPHTHORA diseases
    DE- FUNGAL diseases of plants
    DE- GARLIC
    DE- ALFALFA
    SU- PEPPERMINT
    KW- pepper
    KW- Phytophthora capsici
    KW- Phytophthora blight
    KW- plant materials
    KW- biber
    KW- bitkisel materyaller
    KW- Phytophthora capsici
    KW- Phytophthora yanıklığı
    LK- English; Turkish
    AD- 1University of Ankara, Faculty of Agriculture, Plant Protection
    Department, Ankara -- Turkey
    IS- 1300011X
    AN- 22865585
    UR- http://search.ebscohost.com/login.aspx?direct=true&db=a9h&AN=22865585&site=ehost-live


    please reply
    ddddddd, Aug 4, 2009
    #1

  2. > can we convert it into xml directly?

    You could certainly write code to parse this document format and produce
    XML output. Of course, you have to decide what XML markup you're going
    to use to represent the data, but after doing that it's not
    significantly harder to produce XML than to put the data into a database.
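
    As a rough illustration of the parsing half, here is a minimal Python
    sketch. It assumes that every field starts with a two-letter tag
    followed by "- ", that any other non-blank line continues the previous
    field, and that each record begins with a "Record: N" line. The name
    parse_records and the shortened sample text are made up for this
    example.

```python
import re
from collections import defaultdict

# A field line looks like "TI- Some title"; other lines continue the last field.
FIELD = re.compile(r"^([A-Z]{2})- ?(.*)$")
RECORD = re.compile(r"^Record:\s*\d+\s*$")


def parse_records(text):
    """Return a list of records; each maps a tag to the list of its values."""
    records = []
    current = None    # the record being filled
    last_tag = None   # the tag that continuation lines belong to
    for raw in text.splitlines():
        line = raw.strip()
        if RECORD.match(line):
            current = defaultdict(list)
            records.append(current)
            last_tag = None
            continue
        if current is None or not line:
            continue
        m = FIELD.match(line)
        if m:
            last_tag = m.group(1)
            current[last_tag].append(m.group(2))
        elif last_tag is not None:
            # Wrapped line: glue it onto the value we are still reading.
            current[last_tag][-1] += " " + line
    return [dict(r) for r in records]


# Shortened sample in the posted format (abstract cut for brevity).
sample = """\
Record: 1

TI- A Survey of Phytophthora Species on Hainan Island of South China.
AU- Hui-cai Zeng1
AU- Hon-hing Ho2
AB- During the period 1997-2007, a comprehensive study of the
occurrence and distribution of Phytophthora species was conducted.
"""

for rec in parse_records(sample):
    print(rec["TI"], rec["AU"])
```

    Repeated tags such as AU, KW and AB are kept as lists, so nothing is
    lost when a record has several authors or more than one abstract.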

    I would recommend taking advantage of existing XML APIs -- DOM or SAX --
    which will take care of the details of XML syntax and let you focus on
    the structure of the XML document. DOM is probably easier for a beginner
    to learn, since it's a tree structure; SAX is event-based and requires
    that you maintain more state in your own code, but would be more
    efficient for this sort of flow-through format conversion.
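
    To give a feel for the event-based style on the output side, here is a
    small sketch using Python's standard xml.sax.saxutils.XMLGenerator,
    which turns SAX events into serialized XML. The element names
    (literature, record, field) and the function name write_records are
    invented for this example; the real markup is whatever you decide on.

```python
import io
from xml.sax.saxutils import XMLGenerator
from xml.sax.xmlreader import AttributesImpl

# Hypothetical markup:
# <literature><record><field tag="TI">...</field></record></literature>
NO_ATTRS = AttributesImpl({})


def write_records(records, out):
    """Stream records (dicts of tag -> list of values) to `out` as XML."""
    gen = XMLGenerator(out, encoding="utf-8")
    gen.startDocument()
    gen.startElement("literature", NO_ATTRS)
    for rec in records:
        gen.startElement("record", NO_ATTRS)
        for tag, values in rec.items():
            for value in values:
                gen.startElement("field", AttributesImpl({"tag": tag}))
                gen.characters(value)  # escapes &, < and > for us
                gen.endElement("field")
        gen.endElement("record")
    gen.endElement("literature")
    gen.endDocument()


records = [{"TI": ["Effects of Some Plant Materials"],
            "JN": ["Turkish Journal of Agriculture & Forestry"]}]
buf = io.StringIO()
write_records(records, buf)
print(buf.getvalue())
```

    Because each record is written as soon as it is available, nothing
    forces you to hold all 6000 records in memory, and the library takes
    care of escaping characters such as the "&" in the journal name.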

    There may be off-the-shelf tools which can do the conversion for you,
    but I haven't used any... and I'd consider this a fairly trivial bit of
    programming so I wouldn't bother looking for one.

    You might also want to contact whoever maintains the database you're
    trying to access via the web, and see if they already offer a service
    which returns the data in XML form rather than text form.
    Joe Kesselman, Aug 4, 2009
    #2

  3. ddddddd

    Daniel Guest

    On Aug 3, 11:12 pm, ddddddd wrote:
    > Hello sir,
    >
    > We have to develop a literature database. It needs to include various
    > features such as search by author, search by title, search by
    > publication year, etc.
    >
    > What we normally do is first store all the literature details in a
    > database (author (AU), title (TI), abstract (AB), publication year,
    > etc.). To store this information we need to create a literature entry
    > form and enter the details through that form. This method works, but
    > it is very tedious, since we have around 6000 literature records.
    >
    > So is there any other method?
    > We are downloading these records from some external websites, and all
    > of them are in the same format. I am including the record format
    > below; please check it.
    >
    > [In the record format below, TI means title, AU means author,
    > AB means abstract, KW means keywords, PY means publication year,
    > etc.]
    >
    > Please check the records below and suggest any new methods.
    > Can we convert them into XML directly?

    Have a look at the open source project ServingXML at http://servingxml.sourceforge.net/,
    and check out some of the examples in the Examples link. You should
    be able to define a resources script to convert these records to XML
    easily.

    -- Daniel
    Daniel, Aug 21, 2009
    #3
  4. ddddddd

    Peter Flynn Guest

    ddddddd wrote:
    > Hello sir,
    >
    > I am Shaji from Kerala, India. I am a programmer, and I have a
    > question, which I explain below.
    >
    > We have to develop a literature database. It needs to include various
    > features such as search by author, search by title, search by
    > publication year, etc.
    >
    > What we normally do is first store all the literature details in a
    > database (author (AU), title (TI), abstract (AB), publication year,
    > etc.). To store this information we need to create a literature entry
    > form and enter the details through that form. This method works, but
    > it is very tedious, since we have around 6000 literature records.
    >
    > So is there any other method?


    Yes. Use a bibliographic database package like JabRef (free), or EndNote
    / ProCite / ReferenceManager (expensive), or even Zotero (plugin for
    Firefox). All of these can export the data in many different formats.

    > We are downloading these records from some external websites, and all
    > of them are in the same format. I am including the record format
    > below; please check it.

    [...]
    > TI- A Survey of Phytophthora Species on Hainan Island of South China.
    > AU- Hui-cai Zeng1

    [...]

    That looks like RIS format or one of its derivatives. All the
    bibliographic databases above should be able to open files of that
    format, and you can then export to something like MODS (XML).
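
    If you would rather script the conversion than go through a reference
    manager, the sketch below shows roughly what a mapping to MODS could
    look like in Python. It is only a partial mapping, it has not been
    validated against the MODS schema, and it assumes the record has
    already been read into a dict of tag -> list of values. The names
    record_to_mods and q are made up for this example, and the choice of
    which tag goes to which MODS element is my assumption.

```python
import xml.etree.ElementTree as ET

MODS_NS = "http://www.loc.gov/mods/v3"
ET.register_namespace("", MODS_NS)  # serialize MODS as the default namespace


def q(name):
    """Qualify an element name with the MODS namespace."""
    return "{%s}%s" % (MODS_NS, name)


def record_to_mods(rec):
    """Build a <mods> element from a dict of tag -> list of values."""
    mods = ET.Element(q("mods"))
    for title in rec.get("TI", []):
        info = ET.SubElement(mods, q("titleInfo"))
        ET.SubElement(info, q("title")).text = title
    for author in rec.get("AU", []):
        name = ET.SubElement(mods, q("name"), {"type": "personal"})
        ET.SubElement(name, q("namePart")).text = author
    for abstract in rec.get("AB", []):
        ET.SubElement(mods, q("abstract")).text = abstract
    for keyword in rec.get("KW", []):
        subject = ET.SubElement(mods, q("subject"))
        ET.SubElement(subject, q("topic")).text = keyword
    for doi in rec.get("DI", []):
        ET.SubElement(mods, q("identifier"), {"type": "doi"}).text = doi
    return mods


# Values taken from Record 1 in the original post.
rec = {
    "TI": ["A Survey of Phytophthora Species on Hainan Island of South China."],
    "AU": ["Hui-cai Zeng1", "Hon-hing Ho2"],
    "KW": ["marine isolates", "Phytophthora capsici"],
    "DI": ["10.1111/j.1439-0434.2008.01441.x"],
}
print(ET.tostring(record_to_mods(rec), encoding="unicode"))
```

    Note that the author names in the posted records carry a trailing
    affiliation digit ("Zeng1"); this sketch copies them as they are, so
    you would probably want to strip that digit first.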

    ///Peter
    Peter Flynn, Aug 21, 2009
    #4