Memory problem with XML::DOM::Parser???

Markus Mohr · Jun 5, 2004

He, everybody,

I'm having a big problem when it comes to parsing a large file with
the ActiveState XML-DOM 1.43 XML-Parser: It consumes a hell of a lot
of memory, raises the CPU of the commputer to 100 % and takes a very
long time to handle files of "merely" 500 kB size.

Is there any way to speed things up?

Sincerely

Markus Mohr

Ben Morrow · Jun 5, 2004

Quoth (e-mail address removed):

He, everybody,

I'm having a big problem when it comes to parsing a large file with
the ActiveState XML-DOM 1.43 XML-Parser: It consumes a hell of a lot
of memory, raises the CPU of the commputer to 100 % and takes a very
long time to handle files of "merely" 500 kB size.

Is there any way to speed things up?

I would have a look to see if XML::LibXML2 or XML::Xerces could be used
instead. Unfortunately their APIs are both different from XML:

OM's,
but they should be substantially faster. XML:

OM does its DOM
processing in Perl, based on the callbacks provided by the Expat XML
parser; the other two libraries parse, build the DOM and manipulate it
directly in C(++).

Now, what Perl could really do with is a standard DOM API like
XML::SAX...

Ben

Ben Morrow · Jun 5, 2004

Quoth Ben Morrow said:
I would have a look to see if XML::LibXML2

^^
Of course, I just meant XML::LibXML... the C lib is called libxml2.

Ben

Markus Mohr · Jun 6, 2004

Quoth (e-mail address removed):

I would have a look to see if XML::LibXML2 or XML::Xerces could be used
instead. Unfortunately their APIs are both different from XML:OM's,
but they should be substantially faster. XML:OM does its DOM
processing in Perl, based on the callbacks provided by the Expat XML
parser; the other two libraries parse, build the DOM and manipulate it
directly in C(++).

Now, what Perl could really do with is a standard DOM API like
XML::SAX...

Ben

Thought so myself, but have no apparent idea on how to use the syntax
correctly. Even reading the book XML & Perl (O'Reilly) did not really
enlightne my brain.

Now, here is the code, and that's prety all I have to master.

Do you think there is anything to do about rwriting this piece of code
for XML::LibXML2?

Sincerely

Markus Mohr

------- Code sample -------
#!/usr/bin/perl -w

#------------------------------------------------------------------------------#
# CFilter.pm
#
#
#
# Modul für die Filter-Funktionen des Client im Zusammenspiel mit
CGUI.pm und #
# CXML.pm
#
#------------------------------------------------------------------------------#

package CFilter;

#------------------------------------------------------------------------------#
# Interne Versionierung
#
#------------------------------------------------------------------------------#
use vars qw/$VERSION $TIMESTAMP/;

# $VERSION = 1.0;
# $TIMESTAMP = 20030321;
# $VERSION = 1.1;
# $TIMESTAMP = 20030627;
# $VERSION = "1.5.4";
# $TIMESTAMP = 20040505;
$VERSION = "1.5.5.build.2";
$TIMESTAMP = 20040604;

#------------------------------------------------------------------------------#
# Laden der internen Module (1)
#
#------------------------------------------------------------------------------#
# XML:

OM ist ein CPAN Modul und existiert in dieser Form nicht auf
# ActiveState fuer die Version 5.8.0, sondern lediglich fuer die
Version 5.6.1.
# Fuer alle Versionen gueltig ist aber die Version auf cpan.perl.org.
# XML:

OM muss auf dem Client-Rechner installiert sein!
# Seit Mai 2004 gibt es auch eine ActiveState-Version 1.43.
use XML:

OM;

#------------------------------------------------------------------------------#
# Laden der externen Module (1)
#
#------------------------------------------------------------------------------#
use CXML;

# Pragmata
use diagnostics;
use strict;

use open ':utf8';

return 1;

#------------------------------------------------------------------------------#
# Subroutine, um eine Anfrage in eine Patientenakte umzuwandeln
#
#------------------------------------------------------------------------------#
sub import_anfrage ($$) {
my ( $self, $anfrage, $konfiguration ) = @_;
print "\nDie ANFRAGE wird imporiert:\n";
print "---------------------------\n";

open( TEMP, ">./anf_temp.anf" );
print TEMP $anfrage;
close TEMP;

# Wir legen ein neues XML-Objekt an, das alte wird verworfen
my $xml = CXML->new();
$xml->construct_xml($konfiguration);
$xml = $konfiguration->get_value('xml');
my $xml_root = $xml->{'root'};

# Die Anfrage wird in ein XML-Dokument geparst
print "Debug: -> Die ANFRAGE wird gePARSt.\n";
my $anfrage_parser = new XML:

OM:

arser( KeepCDATA => 1 );
my $anfrage_doc = $anfrage_parser->parsefile("./anf_temp.anf");

unlink("./anf_temp.anf");
print "Debug: -> Die ANFRAGE ist FERTIG gePARSt.\n";

# Die Anfrage ist Teil der neuen EPA
my $anfrage_root = $anfrage_doc->getElementsByTagName('ANFRAGE');
$anfrage_root = $anfrage_root->item(0);
$anfrage_root->setOwnerDocument( $xml->{'doc'} );
my $nodes = $xml_root->getElementsByTagName('anfragen');
my $node = $nodes->item(0);
$node->appendChild($anfrage_root);

# Anschliessend die Daten der Anfrage in die EPA übertragen
foreach my $type (qw( PATIENT ARZT INSTITUTION UNTERSUCHUNG DIAGNOSE
ANAMNESE MASSNAHME soziomedizinischedaten )) {
my $anfrage = $xml_root->getElementsByTagName('ANFRAGE');
$anfrage = $anfrage->item(0);
for my $element ( $anfrage->getElementsByTagName($type) ) {
my $destination_parent;
SWITCH: for ($type) {
/PATIENT/ && do { $destination_parent = 'patient';
last; };
/ARZT/ && do { $destination_parent = 'arztliste';
last; };
/INSTITUTION/ && do { $destination_parent =
'paramedizinischeliste'; last; };
/UNTERSUCHUNG/ && do { $destination_parent = 'untersuchungen';
last; };
/DIAGNOSE/ && do { $destination_parent = 'diagnosen';
last; };
/ANAMNESE/ && do { $destination_parent = 'anamnesen';
last; };
/MASSNAHME/ && do { $destination_parent = 'massnahmen';
last; };

#/soziomedi/ && do { $destination_parent = $xml_root;
last; };
}

$destination_parent =
$xml_root->getElementsByTagName($destination_parent);
$destination_parent = $destination_parent->item(0);
my $source = $element->cloneNode(1);

$destination_parent = $xml_root if ( $type =~ /soziomed/ );

# print "Vorher:\n",
$destination_parent->toString, "\n";
#print $destination_parent->toString;
#print "\n";

# print "TYPE: $type DP $destination_parent CT ",
$source->toString, "\n";
#print $source->toString;
#print "\n";

$destination_parent->appendChild($source);
$destination_parent->removeChild(
$destination_parent->getElementsByTagName('soziomedizinischedaten')->item(0)
) if ( $type =~ /soziomed/ );

# print "\nJetzt:\n", $destination_parent->toString, "\n";
#print $destination_parent->toString;
#print "\n";

# Bei den Daten des Patienten die alten (= leeren) Daten
entfernen
if ( $type eq 'PATIENT' ) {
my $old_data = $destination_parent->getFirstChild;
$destination_parent->removeChild($old_data);
}

# Altes Element aus der Anfrage entfernen
my $source_parent = $element->getParentNode();
$source_parent->removeChild($element);
}
}
$anfrage_doc->dispose;
return;
}

#------------------------------------------------------------------------------#
# Subroutine, um eine vom Server geholte Antwort einzulesen:
#
# Der ANFRAGEnde holt die Antwort vom ANTWORTenden
#
#------------------------------------------------------------------------------#
sub import_antwort ($$) {
my ( $self, $antwort, $konfiguration ) = @_;
print "\nDie ANTWORT wird importiert:\n";
print "----------------------------\n";

open( TEMP, ">./antwort_temp.ant" );
print TEMP $antwort;
close TEMP;

# Die Antwort wird in ein XML-Dokument geparst
print "Debug: -> Die ANTWORT wird gePARSt.\n";
my $antwort_parser = new XML:

OM:

arser( KeepCDATA => 1 );
my $antwort_doc =
$antwort_parser->parsefile("./antwort_temp.ant");

unlink("./antwort_temp.ant");
print "Debug: -> Die ANTWORT ist FERTIG gePARSt.\n";

# Unser aktuelles lokales Dokument holen
my $xml = $konfiguration->get_value('xml');
my $xml_root = $xml->{'root'};
print "Debug: -> Das aktuelle lokale Dokument wird geholt.\n";

# Daten der Antwort durchsehen
# Patientendaten sind ja bereits enthalten und müssen deshalb nicht
# gesondert nochmal uebernommen werden
foreach my $type (qw(ANAMNESE UNTERSUCHUNG DIAGNOSE MASSNAHME
ANFRAGE)) {
print "Debug: -> ITEM: $type\n";

# Die einzelnen Antwort (Remote) Items durchgehen
REMOTEITEM:
for my $remote_element ( $antwort_doc->getElementsByTagName($type)
) {
my $data = $remote_element->toString;

# ID herausfinden
$data =~ /<id>(\d+)<\/id>/;
my $remote_id = $1;
my $is_new_item = 1;
my $local_parent;
print "Debug: -> Das REMOTE-ELEMENT ID $remote_id ist
vorhanden.\n";

# Gibt es überhaupt schon entsprechende Items? Wenn nicht, ist
das Item auf
# jeden Fall neu und wird angehängt
my @local_element_node_list =
$xml_root->getElementsByTagName($type);
if ( scalar(@local_element_node_list) eq 0 ) {

#
# Es gibt keine lokalen Items
#
print "Debug: -> Es gibt keine lokalen ITEMs (= NO LOCAL).\n";
my $parent_tag = lc($type) . "en";
$local_parent =
$xml_root->getElementsByTagName($parent_tag)->item(0);
&attach_item( $xml, $type, $remote_element, $local_parent ) if
$local_parent;
next REMOTEITEM;
}
else {

#
# Es gibt lokale Items
#
print "Debug: -> Es gibt lokale ITEMs (= LOCAL).\n";

# Die lokalen Items einzeln durchgehen und checken
for my $local_element (@local_element_node_list) {
my $local_data = $local_element->toString;

# Lokale ID herausfinden
$local_data =~ /<id>(\d+)<\/id>/;
my $local_id = $1;

# Und das Parent-Element bestimmen
$local_parent = $local_element->getParentNode;

# Stimmt die lokale ID mit der remote ID überein?
if ( $local_id eq $remote_id ) {

#
# Die IDs stimmen ueberein
#
print "Debug: -> Die IDs stimmen \x84berein (= ID
MATCH).\n";

# Prüfen ob der Inhalt abweicht
if ( $local_data ne $data ) {

#
# Der Inhalt weicht ab
#
print "Debug: -> Der Inhalt stimmt nicht \x84berein (=
CONTENT MISMATCH).\n";

# Ueberpruefen, wer der Ersteller ist (arztid)
$data =~ /<arztid>(\d+)<\/arztid>/;
my $remote_arztid = $1;
$local_data =~ /<arztid>(\d+)<\/arztid>/;
my $local_arztid = $1;
if ( $remote_arztid eq $local_arztid ) {

#
# Der Ersteller ist der gleiche - Item wurde
veraendert
#

# Deklaration des Anfragezeitpunktes
foreach my $type ('anfragezeitpunkt') {
print "Debug: -> ITEM: $type\n";

# Die einzelnen Antwort (Remote) Items durchgehen
REMOTEITEM:
for my $remote_element (
$antwort_doc->getElementsByTagName($type) ) {

#my $data = $remote_element->toString;
my $anf_time = $remote_element->toString;
$anf_time =~ s/.*<anfragezeitpunkt>//g;
$anf_time =~ s/<\/anfragezeitpunkt>.*//g;
$anf_time =~ s/<//g;
$anf_time =~ s/>//g;
$anf_time =~ s/\///g;
$anf_time =~ s/]]//g;
$anf_time =~ s/
/\n/g; #
<Return-Taste>
$anf_time =~ s/ / /g; #
<Tab-Taste>

# Deklaration des Antwortzeitpunktes
foreach my $type ('antwortzeitpunkt') {

# Die einzelnen Antwort (Remote) Items
durchgehen
REMOTEITEM:
for my $remote_element (
$antwort_doc->getElementsByTagName($type) ) {

my $antw_time = $remote_element->toString;
$antw_time =~ s/.*<antwortzeitpunkt>//g;
$antw_time =~ s/<\/antwortzeitpunkt>.*//g;
$antw_time =~ s/<//g;
$antw_time =~ s/>//g;
$antw_time =~ s/\///g;
$antw_time =~ s/]]//g;
$antw_time =~ s/
/\n/g; #
<Return-Taste>
$antw_time =~ s/ / /g; #
<Tab-Taste>

# Online-Display der Daten der Anfrage
$local_data =~ s/.*?<anfragetext>/Die Anfrage
wurde erstellt am $anf_time.\n\n/g;
$local_data =~ s/<\/anfragetext>.*//g;
$local_data =~ s/.*ANFRAGE//g;
$local_data =~ s/ANFRAGE.*?//g;
$local_data =~ s/<//g;
$local_data =~ s/>//g;
$local_data =~ s/\///g;
$local_data =~ s/!\[CDATA\[//g;
$local_data =~ s/]]//g;
$local_data =~ s/
/\n/g;
# <Return-Taste>
$local_data =~ s/ / /g;
# <Tab-Taste>

# Online-Display der Daten der Antwort
$data =~ s/.*?<antworttext>/Die Antwort wurde
erstellt am $antw_time.\n\n/g;
$data =~ s/<\/antworttext>.*//g;
$data =~ s/.*ANFRAGE/Es wurde kein Antworttext
angegeben.\n\n/g;
$data =~ s/ANFRAGE.*?//g;
$data =~ s/<//g;
$data =~ s/>//g;
$data =~ s/\///g;
$data =~ s/!\[CDATA\[//g;
$data =~ s/]]//g;
$data =~ s/
/\n/g;
# <Return-Taste>
$data =~ s/ / /g;
# <Tab-Taste>
}
}
}
}
print "Debug: -> Die Ersteller sind identisch - das
ITEM wurde ver\x84ndert (= CREATOR MATCH).\n";

# Zur Pruefung vorlegen und den Benutzer entscheiden
lassen,
# ob die Aenderungen angenommen werden
print "Debug: -> Benutzerentscheidung auf Akzeptanz
der \x8enderungen (= CREATE ACCEPT WINDOW).\n";
CGUI->create_accept( substr( $data, 0, 1000 ), substr(
$local_data, 0, 1000 ), $konfiguration, \$is_new_item );
print "Debug: -> SOLL $is_new_item.\n";

# $is_new_item ist 2 wenn der Benutzer die Änderungen
als neues Item möchte
# $is_new_item ist 1 wenn der Benutzer die Änderungen
akzeptiert
# $is_new_item ist 0 wenn der Benutzer die Änderungen
ablehnt
if ( $is_new_item eq 0 ) { next REMOTEITEM; }
elsif ( $is_new_item eq 1 ) {
$local_parent->removeChild($local_element); &attach_item( $xml, $type,
$remote_element, $local_parent ); }
elsif ( $is_new_item eq 2 ) { &attach_item( $xml,
$type, $remote_element, $local_parent ); }
}
else {

#
# Ersteller sind nicht identisch
#
print "Debug: -> Die Ersteller sind nicht identisch (=
CREATOR MISMATCH).\n";

# Item wurde parallel erstellt - in lokale Akte
übernehmen
&attach_item( $xml, $type, $remote_element,
$local_parent );
} # ENDIF Prüfung ArztID
}
else {

#
# Die Inhalte stimmen überein, Item ist unverändert -
keine Aktion nötig
#
print "Debug: -> Die Inhalte sind identisch - das ITEM
ist unver\x84ndert (= CONTENT MATCH).\n";
next REMOTEITEM;
} # ENDIF Prüfung Inhaltsgleichheit
} # ENDIF Prüfung ID-Gleichheit
print "Debug: -> Die IDs sind nicht identisch (= ID
MISMATCH).\n";
} # Ende Durchlauf lokaler Items
#
# Alle lokalen Items durchgegangen - Antwort-Item ist neu
#
&attach_item( $xml, $type, $remote_element, $local_parent );
} # ENDIF Lokale Items vorhanden
} # Ende Durchlauf der Items von der Antwort
} # Ende Durchlauf der einzelnen Items
$antwort_doc->dispose;
return 1;
}
------- Code sample -------

I'm sorry most of the comments are in German. The two modules are the
only sources where it comes to using the parsing process:
"Import_anfrage" and "import_antwort". The context is that
sent-to-the-server medical queries containing textual and image
information will have to be processed by the parser for retrieving.
The whole thing is part of a client-server-client teleconsultation
system which enables client 1 to send some query to client 2 over the
server. Client 2 retrieves the data from the server and needs to parse
them, formulates and answer, and client 1, again, has to retrieve the
answer from the server thus needing to parse it again.

Ben Morrow · Jun 6, 2004

Quoth (e-mail address removed):

Now, here is the code, and that's prety all I have to master.

Do you think there is anything to do about rwriting this piece of code
for XML::LibXML2?

------- Code sample -------
#!/usr/bin/perl -w

<standard moan>
use strict;
use warnings;

#------------------------------------------------------------------------------#
# CFilter.pm
#
#
#
# Modul für die Filter-Funktionen des Client im Zusammenspiel mit
CGUI.pm und #
# CXML.pm
#
#------------------------------------------------------------------------------#

Big box comments like this really don't help readability; and info about
what the module is and does should be put in POD so it can be read later
more easily.

use CXML;

What is this module? It's not on CPAN, so I presume it's yours? By the
looks of things this will need rewriting as well.

# Pragmata
use diagnostics;
use strict;

Oh right, you've got it down here... use strict and warnings should come
first.

use open ':utf8';

If you say

use open ':encoding(utf8)';

you will get better error handling and fallback facilities when the data
isn't valid.

return 1;

Don't do this... put it at the end.

sub import_anfrage ($$) {
my ( $self, $anfrage, $konfiguration ) = @_;
print "\nDie ANFRAGE wird imporiert:\n";
print "---------------------------\n";

open( TEMP, ">./anf_temp.anf" );
print TEMP $anfrage;
close TEMP;

You don't need to do this. XML:

OM and XML::LibXML can both parse XML
from a string (though I admit that in the case of XML:

OM the
documentation is less than clear...).

# Wir legen ein neues XML-Objekt an, das alte wird verworfen
my $xml = CXML->new();
$xml->construct_xml($konfiguration);
$xml = $konfiguration->get_value('xml');
my $xml_root = $xml->{'root'};

Here is your first problem. CXML objects appear to contain XML:

OM
objects; AFAIK there is no way to transfer a node from an XML::LibXML
tree to an XML:

OM tree short of serialising it and re-parsing. This
means you will have to modify CXML to use XML::LibXML (or whatever) as
well.

# Die Anfrage wird in ein XML-Dokument geparst
print "Debug: -> Die ANFRAGE wird gePARSt.\n";

Debug messages like this are better sent to stderr with warn.

unlink("./anf_temp.anf");

.... or die translate_to_German("couldn't delete auf_temp.anf: $!");

# Die Anfrage ist Teil der neuen EPA
my $anfrage_root = $anfrage_doc->getElementsByTagName('ANFRAGE');
$anfrage_root = $anfrage_root->item(0);
$anfrage_root->setOwnerDocument( $xml->{'doc'} );
my $nodes = $xml_root->getElementsByTagName('anfragen');
my $node = $nodes->item(0);
$node->appendChild($anfrage_root);

All of this stuff will be the same with XML::LibXML, once you have your
CXML object using the same DOM library.

In theory, as the DOM provides a specification of the methods etc., you
should simply be able to switch 'XML::LibXML' for 'XML:

OM' throughout
and it'll all be fine... it won't, of course (life's never that simple),
but the changes required shouldn't be major.

Ben

Markus Mohr · Jun 7, 2004

Quoth (e-mail address removed):

<standard moan>
use strict;
use warnings;

Big box comments like this really don't help readability; and info about
what the module is and does should be put in POD so it can be read later
more easily.

What is this module? It's not on CPAN, so I presume it's yours? By the
looks of things this will need rewriting as well.

Oh right, you've got it down here... use strict and warnings should come
first.

If you say

use open ':encoding(utf8)';

you will get better error handling and fallback facilities when the data
isn't valid.

Don't do this... put it at the end.

You don't need to do this. XML:OM and XML::LibXML can both parse XML
from a string (though I admit that in the case of XML:OM the
documentation is less than clear...).

Here is your first problem. CXML objects appear to contain XML:OM
objects; AFAIK there is no way to transfer a node from an XML::LibXML
tree to an XML:OM tree short of serialising it and re-parsing. This
means you will have to modify CXML to use XML::LibXML (or whatever) as
well.

Debug messages like this are better sent to stderr with warn.

... or die translate_to_German("couldn't delete auf_temp.anf: $!");

All of this stuff will be the same with XML::LibXML, once you have your
CXML object using the same DOM library.

In theory, as the DOM provides a specification of the methods etc., you
should simply be able to switch 'XML::LibXML' for 'XML:OM' throughout
and it'll all be fine... it won't, of course (life's never that simple),
but the changes required shouldn't be major.

Ben

Okay, Ben, thank you very much. Here is the complete code for
"CXML.pm" for your interest. Of course, it contains XML:

OM
statements.

Can you have a look at this file as well?

-----------------------------------------------
#!/usr/bin/perl -w

#------------------------------------------------------------------------------#
# CXML.pm
#
#
#
# Modul für die XML-Funktionen des Clients
#
#------------------------------------------------------------------------------#

package CXML;

#------------------------------------------------------------------------------#
# Interne Versionierung
#
#------------------------------------------------------------------------------#
use vars qw/$VERSION $TIMESTAMP/;
# $VERSION = 1.0;
# $TIMESTAMP = 20030321;
# $VERSION = 1.1;
# $TIMESTAMP = 20030627;
# $VERSION = "1.5.4";
# $TIMESTAMP = 20040505;
# $VERSION = "1.5.5.build.1";
# $TIMESTAMP = 20040521;
$VERSION = "1.5.5.build.2";
$TIMESTAMP = 20040604;

#------------------------------------------------------------------------------#
# Laden der internen Module (1)
#
#------------------------------------------------------------------------------#
# XML:

OM ist ein CPAN Modul und existiert in dieser Form nicht auf
# ActiveState, sondern nur unter cpan.perl.org.
# XML:

OM muss daher auf dem Client-Rechner installiert sein!
use XML:

OM;

#------------------------------------------------------------------------------#
# Laden der externen Module (0)
#
#------------------------------------------------------------------------------#

# Pragmata
use diagnostics;
use strict;
use locale;

# use open ':utf8';

return 1;

#------------------------------------------------------------------------------#
# Subroutine zum Anlegen einer neuen "Fallmappe"
#
#------------------------------------------------------------------------------#
sub new {
my $self = {};

$self->{doc} = XML:

OM:

ocument->new();
$self->{xml} = $self->{doc}->createXMLDecl( '1.0', 'UTF-8' );
$self->{root} = undef;
$self->{type} = undef;
$self->{template} = undef;
$self->{arzt} = undef;

bless($self);
return $self;
}

#------------------------------------------------------------------------------#
# Subroutine, um die XML-Struktur aus dem XML-Rootfile und den
referenzierten #
# Dateien zu generieren
#
#------------------------------------------------------------------------------#
sub construct_xml ($) {
my ( $self, $konfiguration ) = @_;

my $rootfile = $konfiguration->get_value('xmlrootfile');
my $gui = $konfiguration->get_value('gui');
my @xmlfiles = $rootfile;
my %xmlroots;
my %xmldocs;

foreach my $current_file (@xmlfiles) {
if ( -r $current_file ) {
if ($gui) { $gui->set_status( 52, $current_file );
$gui->{main}->Busy( -recurse => 1 ); }
else { print CText->get( $konfiguration, 52, $current_file
), "\n"; }
open( XML, $current_file ) or die CText->get(
$konfiguration, 1001, $current_file );
my @file = <XML>;
my $line_tot = @file;
close(XML);

# Für jede XML-Datei einen Datenbaum erstellen
my $xml_cur_doc = XML:

OM:

ocument->new();
$xml_cur_doc->createXMLDecl( '1.0', 'UTF-8' );
my $xml_cur_roo = undef;
my @parent_list = ();

# Die XML-Datei auswerten
for ( my $line_cur = 0 ; $line_cur < $line_tot ;
$line_cur++ ) {
SWITCH: for ( $file[$line_cur] ) {

# Importierte XML-Schemata vormerken und später
einlesen
/include schemaLocation=\"([\w|\.]+)\"/ && do { my
$filename = substr( $current_file, 0, rindex( $current_file, "/" ) + 1
) . $1; push ( @xmlfiles, $filename ); last; };

# Ein </element>-Tag schliesst ein Wrapper-Element
/<\/.*element>/ && do { my $x = shift
(@parent_list); last; };

# Referenz auf weitere Elemente/Datei überspringen
/element ref=\"(\w+)\"/ && do { last; };

# Normales Element - unter seinem Parent einordnen
und den Typ speichern
/element name=\"(\w+)\".*type=\"\w*?:*(\w+)\"/ &&
do {
my $child = $xml_cur_doc->createElement($1);
my ($parent) = @parent_list;
$parent->appendChild($child);
$self->{type}{$1} = $2;
last;
};

# Komplexes oder Wrapper-Element
/element name=\"(\w+)\"/ && do {
my $element = $1;

# Falls in den nächsten Zeilen "complexType"
und "Content" stehen ist es ein komplexes Element
if ( $file[ $line_cur + 1 ] =~ /complexType/
&& $file[ $line_cur + 2 ] =~ /Content/ ) {
my $child =
$xml_cur_doc->createElement($element);
my ($parent) = @parent_list;
$parent->appendChild($child);
my @enum_values = ();

until ( $file[ $line_cur - 1 ] =~
/<\/.*element\>/ ) {
if ( $file[$line_cur] =~ /enumeration
value=\"(.*?)\"/ ) { push ( @enum_values, $1 ); }
$line_cur++;
}
$self->{type}{$element} = "enum";
$self->{enum}{$element} = [@enum_values];
last;

# Ansonsten ist es ein Wrapper-Element das
als Parent fungiert
}
else {
my $parent =
$xml_cur_doc->createElement($element);
if ( defined $xml_cur_roo ) { my
($preparent) = @parent_list; $preparent->appendChild($parent); }
else { $xml_cur_roo = $parent; }
unshift ( @parent_list, $parent );
last;
}
}
}
}

# Das erzeugte XML-Dokument für diese Datei in einem Hash
ablegen - Index ist der Dateiname
$self->{template}{doc}{$current_file} = $xml_cur_doc;
$self->{template}{root}{$current_file} = $xml_cur_roo;

}
else {
die CText->get( $konfiguration, 1001, $current_file );
}
}

$self->{template}{root}{$rootfile}->setOwnerDocument( $self->{doc}
);
$self->{root} = $self->{template}{root}{$rootfile};

# In die Konfiguration die Referenz auf das XML-Objekt ablegen
$konfiguration->set_value( 'xml', $self );

# Einen Patienten anlegen
CXML->insert( $konfiguration, 'pat' );

if ($gui) { $gui->set_status(53); $gui->{main}->Unbusy; }
return;
}

#------------------------------------------------------------------------------#
# Subroutine, um XML-File aus einer Datei einzulesen
#
#------------------------------------------------------------------------------#
sub read($$) {
my ( $self, $file, $konfiguration ) = @_;

if ( $konfiguration->get_value('gui') ) { return if
$konfiguration->get_value('gui')->create_confirm( CText->get(
$konfiguration, 950 ), $konfiguration ); }
if ( $konfiguration->get_value('gui') ) {
$konfiguration->get_value('gui')->set_status(54); }

my $parser = new XML:

OM:

arser( KeepCDATA => 1, ErrorContext =>
2 );
my $doc = $parser->parsefile($file);
unless ($doc) {
warn CText->get( $konfiguration, 1002 );

# Logfileeintrag
if ( defined $konfiguration->get_value('log') ) { CTools->log(
$konfiguration, 904 ); }
}
my $xml = $konfiguration->get_value('xml');
$xml->{'doc'} = $doc;
$xml->{'root'} = $doc;
close(XML);

# Nach dem Import werden in der internen Datendarstellung die
Umlaute als
# Umlaute und nicht codiert gefuehrt
for my $child ( $xml->{'root'}->getElementsByTagName('*') ) {
if ( $child->toString =~ /<!\[CDATA\[(.*?&#\d{3};.*?)\]\]>/ )
{
my $childdata = $1;
my @list = $child->getElementsByTagName('*');
if ( $#list eq -1 ) {
$childdata = CXML->code($childdata);
my $value_node =
$xml->{'doc'}->createCDATASection($childdata);
my $fc = $child->getFirstChild;
$child->replaceChild( $value_node, $fc );
}
}
}

# Logfileeintrag
if ( defined $konfiguration->get_value('log') ) { CTools->log(
$konfiguration, 905 ); }
return;
}

#------------------------------------------------------------------------------#
# Subroutine, um die XML-Struktur in eine Datei zu schreiben
#
#------------------------------------------------------------------------------#
sub write($$) {
my ( $self, $konfiguration ) = @_;

# Dateinamen ermitteln, dazu ID ds angemeldeten Arztes, Vorname,
Nachname
# und Geburtsdatum ermitteln
my $arzt_id = $konfiguration->get_value('uid');
my $pat_data = CXML->extract_flattened( 'VCARDMOD', 'PATIENT', 0,
1, $konfiguration );
my $soz_data = CXML->extract_flattened( 'soziomedizinischedaten',
'', 0, 0, $konfiguration );
$pat_data =~
/<id>(\d+)<\/id>.*?<vorname><!\[CDATA\[(.*?)\]\]><\/vorname>.*?<nachname><!\[CDATA\[(.*?)\]\]><\/nachname>/;
return 0 unless $1 && $2 && $3;
my $file = join ( "-", ( $arzt_id, $1, $2, $3 ) );
$soz_data =~ /<geburtszeitpunkt>(\d+)-(\d+)-(\d+)\s/;
return 0 unless $1 && $2 && $3;
$file .= "-$3-$2-$1.epa";

# Datei zum schreiben öffnen
open( XML, ">$file" ) or die CText->get( $konfiguration, 1001,
$file );

# EPA auf Festplatte bringen
print XML xmlcode(
$konfiguration->get_value('xml')->{'root'}->toString );
close(XML);
if ( defined $konfiguration->get_value('log') ) { CTools->log(
$konfiguration, 906 ); }
return 1;
}

#------------------------------------------------------------------------------#
# Subroutine, um die bezeichnete Patientenakte von der Festplatte zu
entfernen #
#------------------------------------------------------------------------------#
sub delete($$) {
my ( $self, $filename, $konfiguration ) = @_;
if ( $konfiguration->get_value('gui') ) { return if
$konfiguration->get_value('gui')->create_confirm( CText->get(
$konfiguration, 951 ), $konfiguration ); }

unlink $filename;

if ( defined $konfiguration->get_value('log') ) { CTools->log(
$konfiguration, 907, $filename ); }
if ( $konfiguration->get_value('gui') ) {
$konfiguration->get_value('gui')->set_status(55); }
return;
}

#------------------------------------------------------------------------------#
# Subroutine, um einen bestehenden Ast des XML-Schemas in einen
anderen zu #
# kopieren bzw. zu bewegen
#
#------------------------------------------------------------------------------#
sub copy ($$$$$$$) {
my ( $self, $from_major, $from, $from_id, $to, $to_id,
$to_element, $remove_source, $konfiguration ) = @_;

my $xml = $konfiguration->get_value('xml');
my $xmlroot = $xml->{'root'};

# Source-Bereich finden
for my $source ( $xmlroot->getElementsByTagName($from_major) ) {
for my $source ( $source->getElementsByTagName($from) ) {

# Anschliessend die ID des Bereiches suchen
for my $id_node ( $source->getElementsByTagName('id') ) {

# Und feststellen ob es die gewünschte ID ist
if ( $id_node->getFirstChild->toString =~
/^$from_id$/i ) {

# Falls dem so sein sollte, den Bereich (rekursiv)
kopieren...
my $nodecopy = $source->cloneNode(1);

# Eventuell die alte Node entfernen
if ($remove_source) { my $source_parent =
$source->getParentNode(); $source_parent->removeChild($source); }

# ...anschliessen den Zielabschnitt suchen
for my $destination (
$xmlroot->getElementsByTagName($to) ) {

# Anschliessend die ID des Ziels suchen
for my $id_node (
$destination->getElementsByTagName('id') ) {

# Und feststellen ob es die gewünschte ID
ist
if ( $id_node->getFirstChild->toString =~
/^$to_id$/i ) {
for my $destination (
$destination->getElementsByTagName($to_element) ) {

# Falls dem so sein sollte, den
kopierten Bereich anfügen

$destination->appendChild($nodecopy);
return;
}
}
}
}
}
}
}
}
return;
}

#------------------------------------------------------------------------------#
# Subroutine, um ein Wert-Child zu entfernen
#
#------------------------------------------------------------------------------#
sub remove ($$$$) {
my ( $self, $abschnitt, $keyword, $element, $nr, $konfiguration )
= @_;

# Das XML-Objekt holen
my $xml = $konfiguration->get_value('xml');
my $xmlroot = $xml->{'root'};
my ( $abs, @values );

# Zuerst den Abschnitt suchen (z.B. 'anfragen')
if ($abschnitt) { ( $abs, @values ) =
$xml->{'root'}->getElementsByTagName($abschnitt); }
else { $abs = $xml->{'root'}; }

# Anschliessend den Unterabschnitt suchen (z.B. eine 'ANFRAGE')
my $nodes = $abs->getElementsByTagName($keyword);

# Falls es mehrere Unterabschnitte gibt den gewünschten auswählen
my $teil = $nodes->item($nr);
if ( defined $teil ) {

# Rekursiv im Unterabschnitt nach dem Tag dessen Wert gelöscht
werden soll suchen (z.B. fachrichtung)
for my $elem ( $teil->getElementsByTagName( $element, 1 ) ) {

# Falls dieser Tag ein Wert-Kind besitzt dieses löschen
my $value_child = $elem->getFirstChild;
$elem->removeChild($value_child) if defined $value_child;
}
}

return;
}

#------------------------------------------------------------------------------#
# Subroutine, um die uebergebene XML-Struktur in das Gesamtschema
einzupflegen #
#------------------------------------------------------------------------------#
sub insert($$$) {
my ( $self, $konfiguration, $insert_this, $parent ) = @_;

my $xml = $konfiguration->get_value('xml');
my $xmlroot = $xml->{'root'};
my $xmlrootfile = $konfiguration->get_value('xmlrootfile');

SWITCH: for ($insert_this) {
/pat/ && do { $insert_this = "PATIENT"; $parent =
"patient"; last; };
/arz/ && do { $insert_this = "ARZT"; $parent =
"arztliste"; last; };
/par/ && do { $insert_this = "INSTITUTION"; $parent =
"paramedizinischeliste"; last; };
/ana/ && do { $insert_this = "ANAMNESE"; $parent =
"anamnesen"; last; };
/unt/ && do { $insert_this = "UNTERSUCHUNG"; $parent =
"untersuchungen"; last; };
/dia/ && do { $insert_this = "DIAGNOSE"; $parent =
"diagnosen"; last; };
/mas/ && do { $insert_this = "MASSNAHME"; $parent =
"massnahmen"; last; };
/anf/ && do { $insert_this = "ANFRAGE"; $parent =
"anfragen"; last; };
}

for my $parent_element ( $xmlroot->getElementsByTagName($parent) )
{

# Dateinamen für das einzusetzende Datenblatt bestimmen, da
der
# Dateiname als Index dient
my $insert_file = $xmlrootfile;
$insert_file =~ s/(.*\/)\w+(\.xsd)/$1$insert_this$2/i;

# Neue ID für das Element generieren
my $newid = 1;
my %oldid;

# Vorhandenen ID-Nodes suchen
my @id_nodes = $parent_element->getElementsByTagName( 'id', 1
);
my $id_anz = $#id_nodes;

# Wenn die Anzahl -1 ist, gibt es keine IDs
unless ( $id_anz < 0 ) {

# Vorhandene IDs auslesen
foreach my $id_node (@id_nodes) {
if ( $id_node->hasChildNodes ) { my $id =
$id_node->getFirstChild->toString; $oldid{$id} = 1; }
}
$newid++ while $oldid{$newid};
}

# Neues Element erzeugen
my $newelement =
$xml->{template}{root}{$insert_file}->cloneNode(1);
$newelement->setOwnerDocument( $xml->{doc} );

# Und die ID des Elementes setzen
for my $id_node ( $newelement->getElementsByTagName( 'id', 1 )
) { my $id_value = $xml->{doc}->createTextNode($newid);
$id_node->appendChild($id_value); }

# Falls die Arzt-ID als Erzeuger-ID gesetzt werden kann, dies
tun
for my $id_node ( $newelement->getElementsByTagName( 'arztid',
1 ) ) { my $id_value = $xml->{doc}->createTextNode(
$konfiguration->get_value('uid') ); $id_node->appendChild($id_value);
}

# Neues Element in den XML-Baum einpflegen
$parent_element->appendChild($newelement);
}

return;
}

#------------------------------------------------------------------------------#
# Subroutine, um den Wert eines Elementes in einem XML-Abschnitt zu
aendern #
#------------------------------------------------------------------------------#
sub update($$$$$$) {
my ( $self, $element, $wert, $keyword, $abschnitt, $nr,
$konfiguration ) = @_;

print "ICH SOLL UPDATEN: ELEMENT $element AUF WERT $wert KEYWORD
$keyword ABSCHNITT $abschnitt NR $nr...\n";

# Das XML-Objekt holen
my $xml = $konfiguration->get_value('xml');

# Variablen der Subroutine deklarieren
my $count = 0;

# Den jeweiligen Abschnitt suchen
if ($abschnitt) { my @abschnitte =
$xml->{'root'}->getElementsByTagName($abschnitt); $abschnitt =
$abschnitte[0]; }
else { $abschnitt = $xml->{'root'}; }

print "SUCHE TEILABSCHNITT $nr, bin bei $count...\n";

# Suchen wir nach dem Element, das mit dem Keyword bezeichnet wird
for my $teil ( $abschnitt->getElementsByTagName($keyword) ) {

# Prüfen ob wir auch das richtige Keyword gefunden haben (z.B.
die VCARDMOD des 3. Arztes)
if ( $count eq $nr ) {

print "GEFUNDEN!\n";

# ...und dann den Tag sofern er existiert
for my $zieltag ( $teil->getElementsByTagName($element) )
{

print "ZIELTAG GEFUNDEN\n";

# Das Werte-Element des Zieltags erzeugen
my $new_value_element = CXML->create_value_element(
$wert, $zieltag, $xml, $konfiguration ) if $wert;

# Hat das Zielelement bereits ein Value-Kind?
if ( defined $zieltag->getFirstChild ) {

print "HAT CHILD\n";

# Wenn es bereits ein Value-Kind gibt das alte
ersetzen bzw. löschen falls Wert '' ist
my $old_value_element = $zieltag->getFirstChild;
if ( defined $wert && defined $new_value_element
&& $wert ) {

print "REPLACED\n";
$zieltag->replaceChild( $new_value_element,
$old_value_element );
}
else {
$zieltag->removeChild($old_value_element);

print "REMOVED\n";
}
}
else {
$zieltag->appendChild($new_value_element) if $wert
&& defined $new_value_element;

print "APPEND\n";
}
}
return 1;
}

# Wir gehen weiter in der Liste und suchen das naechste
Vorkommen (z. B.
# die VCARDMOD des naechsten Arztes)
$count++;
}

print "FERTIG!\n";

return 0;
}

#------------------------------------------------------------------------------#
# Subroutine, um aus dem Gesamtschema den durch das Keyword
beschriebenen Teil #
# auszulesen
#
#------------------------------------------------------------------------------#
sub extract($$$$) {
my ( $self, $keyword, $abschnitt, $nr, $konfiguration ) = @_;

# Das XML-Objekt holen
my $xml = $konfiguration->get_value('xml');
my $abs;

# Variablen der Subroutine deklarieren
my $count = 0;
my @values;

if ($abschnitt) { ( $abs, @values ) =
$xml->{'root'}->getElementsByTagName($abschnitt); }
else { $abs = $xml->{'root'}; }

my $nodes = $abs->getElementsByTagName($keyword);

my $total = $nodes->getLength;

my $teil = $nodes->item($nr);

my %valueshash;
if ( defined $teil ) {

# Alle Nodes aus dem Abschnitt holen
sub getallchildnodes {
my ( $node, $parentname, $valuesref, $hashref ) = @_;

# Child-Nodes jeder Node durchlaufen
foreach my $child ( $node->getChildNodes ) {

# Falls auch diese Node Kinder hat, rekursiv
durchlaufen
getallchildnodes( $child, $child->getNodeName,
$valuesref, $hashref ) if $child->hasChildNodes;

# Keine Kinder? Dann ist es eine Wert-Node - Wert
ermitteln bzw. '' setzen falls nicht initialisiert
my $value = defined $child->getNodeValue ?
$child->getNodeValue : '';

# Den Wert speichern unter dem Namen der Eltern-Node
unless ( defined $hashref->{ $child->getNodeName } &&
$hashref->{ $child->getNodeName } ne '' ) { $hashref->{
$child->getNodeName } = $value; }
unless ( defined $hashref->{$parentname} &&
$hashref->{$parentname} ne '' ) { $hashref->{$parentname} = $value; }
}
}

# Rekursive Funktion anstossen
getallchildnodes( $teil, $teil->getNodeName, \@values,
\%valueshash );
}

# Hash neu strukturieren
foreach my $tagname ( keys %valueshash ) { push ( @values, {
$tagname => $valueshash{$tagname} } ) unless $tagname =~
/cdata-section/; }

# Array mit der Gesamtzahl und den Werten zurückliefern
return ( $total, @values );
}

#------------------------------------------------------------------------------#
# Subroutine um aus dem Gesamtschema einen Teil als Scalar auszugeben
#
#------------------------------------------------------------------------------#
sub extract_flattened ($$$$$) {
my ( $self, $keyword, $abschnitt, $nr, $id, $konfiguration ) = @_;

# Das XML-Objekt holen
my $xml = $konfiguration->get_value('xml');
my ( $abs, @temp );

# Variablen der Subroutine deklarieren
my $count = 0;

if ($abschnitt) { ( $abs, @temp ) =
$xml->{'root'}->getElementsByTagName($abschnitt); }
else { $abs = $xml->{'root'}; }

my $nodes = $abs->getElementsByTagName($keyword);

# Suchen wir nach einer Nr oder einer ID?
if ( $id == 0 ) {
my $teil = $nodes->item($nr);
if ( defined $teil ) { return $teil->toString; }
}
else {

foreach my $teil ( @{$nodes} ) { return $teil->toString if
$teil->toString =~ /<id>$id<\/id>/; }
}

return;
}

#------------------------------------------------------------------------------#
# Subroutine, um die gesamte EPA als String zu dumpen
#
#------------------------------------------------------------------------------#
sub extract_all_flattened ($) {
my ( $self, $konfiguration ) = @_;
return $konfiguration->get_value('xml')->{'root'}->toString;
}

#------------------------------------------------------------------------------#
# Subroutine, um die Daten für den Menuebutton einer Anfrage zu
generieren #
#------------------------------------------------------------------------------#
sub get_payload_data ($$$$$$) {
my ( $self, $nr, $konfiguration, $existref, $picksref, $beref ) =
@_;

# Das XML-Objekt holen
my $xml = $konfiguration->get_value('xml');
my $xmlroot = $xml->{'root'};

# Die vorhandenen Anamnesen, Untersuchungen, Diagnosen und
Massnahmen suchen
foreach my $area (qw/ANAMNESE UNTERSUCHUNG DIAGNOSE MASSNAHME/) {
foreach my $entity ( $xmlroot->getElementsByTagName($area) ) {

# Nach der ID suchen
my $entity_text = $entity->toString;
$entity_text =~ /<id>(\d+)<\/id>/;
my $id = $1;
push @{ $existref->{"\L$area\E"} }, $id;
$picksref->{"\L$area\E"}->{$id} = 0;
SWITCH: for ($area) {
/ANAMNESE/ && do {
$entity_text =~
/<text>.*?CDATA\[(.*?)\].*?<\/text>.*?<anamnesezeitpunkt>(\d+)\D(\d+)\D(\d+)\s/;
my ( $b, $j, $m, $t ) = ( $1, $2, $3, $4 );
$b = "Anamnese" unless $b;
$j = "??" unless $j;
$m = "??" unless $m;
$t = "??" unless $t;
$beref->{'anamnese'}->{$id} = substr( code($b), 0,
40 ) . " ($t.$m.$j)";
last;
};
/UNTERSUCHUNG/ && do {
$entity_text =~
/<untersuchungsbezeichnung>(.*?)<\/untersuchungsbezeichnung>.*?<untersuchungszeitpunkt>(\d+)\D(\d+)\D(\d+)\s/;
my ( $b, $j, $m, $t ) = ( $1, $2, $3, $4 );
$b = "Untersuchung" unless $b;
$j = "??" unless $j;
$m = "??" unless $m;
$t = "??" unless $t;
$beref->{'untersuchung'}->{$id} = substr(
code($b), 0, 40 ) . " ($t.$m.$j)";
last;
};
/DIAGNOSE/ && do {
$entity_text =~
/<diagnosetyp>.*?CDATA\[(.*?)\].*?<\/diagnosetyp>.*?<diagnosezeitpunkt>(\d+)\D(\d+)\D(\d+)\s/;
my ( $b, $j, $m, $t ) = ( $1, $2, $3, $4 );
$b = "Diagnose" unless $b;
$j = "??" unless $j;
$m = "??" unless $m;
$t = "??" unless $t;
$beref->{'diagnose'}->{$id} = substr( code($b), 0,
40 ) . " ($t.$m.$j)";
last;
};
/MASSNAHME/ && do {
$entity_text =~
/<bezeichnung>.*?CDATA\[(.*?)\].*?<\/bezeichnung>.*?<zeitpunktbeginn>(\d+)\D(\d+)\D(\d+)\s/;
my ( $b, $j, $m, $t ) = ( $1, $2, $3, $4 );
$b = "Massnahme" unless $b;
$j = "??" unless $j;
$m = "??" unless $m;
$t = "??" unless $t;
$beref->{'massnahme'}->{$id} = substr( code($b),
0, 40 ) . " ($t.$m.$j)";
last;
};
}
}
}

# Feststellen, welche Daten als zu senden gespeichert sind
my $anfrageliste = $xmlroot->getElementsByTagName('ANFRAGE');
my $anfrage = $anfrageliste->item($nr);
my $datentransmit =
$anfrage->getElementsByTagName('datentransmit')->item(0) if defined
$anfrage;
my $idstring = $datentransmit->toString if defined
$datentransmit;

while ( defined $idstring && $idstring =~ /<(\w+)id>(\d+)</ ) {
my $type = $1;
my $id = $2;
$picksref->{$type}->{$id} = 1;
$idstring =~ s/<($type)id>$id<\/($type)id>//;
}

return;
}

#------------------------------------------------------------------------------#
# Subroutine, um die Daten für den Menuebutton einer Anfrage
festzulegen #
#------------------------------------------------------------------------------#
sub set_payload_data ($$$$) {
my ( $self, $nr, $konfiguration, $typ, $id, $status ) = @_;

# Das XML-Objekt holen
my $xml = $konfiguration->get_value('xml');
my $xmlroot = $xml->{'root'};

# Feststellen, welche Daten als zu senden gespeichert sind
my $anfrageliste = $xmlroot->getElementsByTagName('ANFRAGE');
my $anfrage = $anfrageliste->item($nr);
my $datentransmit =
$anfrage->getElementsByTagName('datentransmit')->item(0) if defined
$anfrage;
my $tagname = $typ . "id";

if ($status) {

# ID zur Payload hinzufügen
my $value_element = $xml->{'doc'}->createElement($tagname);
my $value_child = $xml->{'doc'}->createTextNode("$id");
$datentransmit->appendChild($value_element);
$value_element->appendChild($value_child);
}
else {

# Die ID aus der Payload wieder entfernen
foreach my $child (
$datentransmit->getElementsByTagName($tagname) ) {
$datentransmit->removeChild($child) if $child->toString =~
/>$id</;
}
}

return;
}

#------------------------------------------------------------------------------#
# Subroutine, um die Daten für die Direktauswahl zu generieren
#
#------------------------------------------------------------------------------#
sub get_directpick_data ($$) {
my ( $self, $konfiguration, $layout, $optionsref ) = @_;

# Das XML-Objekt holen
my $xml = $$konfiguration->get_value('xml');
my $xmlroot = $xml->{'root'};

# Die vorhandenen Abschnitte suchen
my $area;
$area = "ARZT" if $$layout eq "arz";
$area = "INSTITUTION" if $$layout eq "par";
$area = "ANAMNESE" if $$layout eq "ana";
$area = "UNTERSUCHUNG" if $$layout eq "unt";
$area = "DIAGNOSE" if $$layout eq "dia";
$area = "MASSNAHME" if $$layout eq "mas";
$area = "ANFRAGE" if $$layout eq "anf";

# Laufende Nummer mitzaehlen
my $nr = 0;
foreach my $entity ( $xmlroot->getElementsByTagName($area) ) {
my $entity_text = $entity->toString;
my $text;
SWITCH: for ($area) {
/ARZT/ && do {
$entity_text =~ /<vorname>.*?CDATA\[(.*?)\]/;
my $v = ( $1 ? $1 : "??" );
$entity_text =~ /<nachname>.*?CDATA\[(.*?)\]/;
my $n = ( $1 ? $1 : "??" );
$text = "$v $n";
last;
};
/INSTITUTION/ && do { # dgraf
$entity_text =~ /<vorname>.*?CDATA\[(.*?)\]/;
my $v = ( $1 ? $1 : "??" );
$entity_text =~ /<nachname>.*?CDATA\[(.*?)\]/;
my $n = ( $1 ? $1 : "??" );
$text = "$v $n";
last;
};
/ANAMNESE/ && do {
$entity_text =~ /<text>.*?CDATA\[(.*?)\]/;
my $t = ( $1 ? $1 : "Anamnese" );
$entity_text =~
/<anamnesezeitpunkt>(\d+)\D(\d+)\D(\d+)\s/;
my $d = ( $1 ? "$3.$2.$1" : "??.??.??" );
$text = substr( code($t), 0, 40 ) . " $d";
last;
};
/UNTERSUCHUNG/ && do {
$entity_text =~
/<untersuchungsbezeichnung>.*?CDATA\[(.*?)\]/;
my $t = ( $1 ? $1 : "Untersuchung" );
$entity_text =~
/<untersuchungszeitpunkt>(\d+)\D(\d+)\D(\d+)\s/;
my $d = ( $1 ? "$3.$2.$1" : "??.??.??" );
$text = substr( code($t), 0, 40 ) . " $d";
last;
};
/DIAGNOSE/ && do {
$entity_text =~ /<diagnosetyp>.*?CDATA\[(.*?)\]/;
my $t = ( $1 ? $1 : "Diagnose" );
$entity_text =~
/<diagnosezeitpunkt>(\d+)\D(\d+)\D(\d+)\s/;
my $d = ( $1 ? "$3.$2.$1" : "??.??.??" );
$text = substr( code($t), 0, 40 ) . " $d";
last;
};
/MASSNAHME/ && do {
$entity_text =~ /<bezeichnung>.*?CDATA\[(.*?)\]/;
my $t = ( $1 ? $1 : "Massnahme" );
$entity_text =~
/<zeitpunktbeginn>(\d+)\D(\d+)\D(\d+)\s/;
my $d = ( $1 ? "$3.$2.$1" : "??.??.??" );
$text = substr( code($t), 0, 40 ) . " $d";
last;
};
/ANFRAGE/ && do {
$entity_text =~ /<anfragetext>.*?CDATA\[(.*?)\]/;
my $t = ( $1 ? $1 : "Anfrage" );
$entity_text =~
/<anfragezeitpunkt>(\d+)\D(\d+)\D(\d+)\s/;
my $d = ( $1 ? "$3.$2.$1" : "??.??.??" );
$text = substr( code($t), 0, 40 ) . " $d";
last;
};
}

# Umlaute und andere Sonderzeichen entfernen
$text =~ tr/äöüßÄÖÜ/aousAOU/;
$text =~ s/&#\d{2,};//g;
push @{$optionsref}, [ $text, $nr++ ];
}
return;
}

#------------------------------------------------------------------------------#
# Subroutine, um die Daten der Belege einer Untersuchung auszulesen
#
#------------------------------------------------------------------------------#
sub get_untbelege($$) {
my ( $self, $konfiguration, $nr ) = @_;

# Das XML-Objekt holen
my $xml = $konfiguration->get_value('xml');
my $xmlroot = $xml->{'root'};

# Array fuer Daten der Belege
my @belegdaten;

# Zuerst den Abschnitt 'UNTERSUCHUNG' suchen und die richtige
Nummer nehmen
my $untersuchungen =
$xmlroot->getElementsByTagName("UNTERSUCHUNG");
my $untersuchung = $untersuchungen->item($nr);

# In der Untersuchung alle Belege durchgehen
foreach my $beleg ( $untersuchung->getElementsByTagName("beleg") )
{
my $result;
$result->{'beschreibung'} =
$beleg->getElementsByTagName("beschreibung")->item(0)->getFirstChild->getNodeValue
if
$beleg->getElementsByTagName("beschreibung")->item(0)->hasChildNodes;
$result->{'daten'} =
$beleg->getElementsByTagName("daten")->item(0)->getFirstChild->getNodeValue
if $beleg->getElementsByTagName("daten")->item(0)->hasChildNodes;
push ( @belegdaten, $result ) if defined
$result->{'beschreibung'} && $result->{'beschreibung'};
}
return @belegdaten;
}

#------------------------------------------------------------------------------#
# Subroutine, um Untersuchungsbelege eingeben zu koennen
#
#------------------------------------------------------------------------------#
sub insert_untbeleg($$) {
my ( $self, $konfiguration, $nr, $beschreibung, $daten ) = @_;

# UTF8-Coding in XML Numerischen Character Encoding wandeln
$beschreibung =~ s/([^\x20-\x7F])/'&#' . ord($1) . ';'/gse;

# Die Encodings in Umlaute wandeln
$beschreibung = code($beschreibung);

# Das XML-Objekt holen
my $xml = $konfiguration->get_value('xml');
my $xmlroot = $xml->{'root'};

# Zuerst den Abschnitt 'Belege' suchen und die richtige Nummer
nehmen
# Da jede Untersuchung genau einen Abschnitt 'Belege' hat, ist die
Nr. der
# Belege mit der Nr. der Untersuchung identisch...
my $untersuchungen = $xmlroot->getElementsByTagName("belege");
my $belege = $untersuchungen->item($nr);

# Anschliessend einen Beleg erzeugen
my $belegelement = $xml->{'doc'}->createElement('beleg');
my $beschreibungselement =
$xml->{'doc'}->createElement('beschreibung');
my $datenelement = $xml->{'doc'}->createElement('daten');
my $beschreibungsnode =
$xml->{'doc'}->createCDATASection($beschreibung);
my $datennode =
$xml->{'doc'}->createCDATASection($daten);

# Daten an die Elemente anfügen
$beschreibungselement->appendChild($beschreibungsnode);
$datenelement->appendChild($datennode);

# Elemente unter dem Beleg anfügen
$belegelement->appendChild($beschreibungselement);
$belegelement->appendChild($datenelement);

# Beleg anfügen
$belege->appendChild($belegelement);

return;
}

#------------------------------------------------------------------------------#
# Subroutine, um Untersuchungs-Belege zu loeschen
#
#------------------------------------------------------------------------------#
sub delete_untbeleg($$) {
my ( $self, $konfiguration, $nr, $belegnr ) = @_;

# Das XML-Objekt holen
my $xml = $konfiguration->get_value('xml');
my $xmlroot = $xml->{'root'};

# Zuerst den Abschnitt 'Belege' suchen und die richtige Nummer
nehmen
# Da jede Untersuchung genau einen Abschnitt 'Belege' hat, ist die
Nr. der
# Belege mit der Nr. der Untersuchung identisch ....
my $untersuchungen = $xmlroot->getElementsByTagName("belege");
my $belege = $untersuchungen->item($nr);

# Anschliessend den zu loeschenden Beleg finden
my $belegliste = $belege->getElementsByTagName("beleg");
my $beleg = $belegliste->item($belegnr);

# Beleg entfernen
$belege->removeChild($beleg);

return;
}

#------------------------------------------------------------------------------#
# Subroutine, um encodierte Sonderzeichen zu decodieren bzw.
umgekehrt; #
# abhaengig davon, ob &# ... ; in dem String vorkommt oder nicht
#
#------------------------------------------------------------------------------#
sub code ($) {
my ( $self, $char ) = @_;
$char = $self unless defined $char;
if ( defined $char && $char =~ /&#\d+;/ ) {

# Ampersand rauswerfen
$char =~ s/&?#195;//g;
$char =~ s/&//g;

# Lower Bit UTF
$char =~ s/&?#159;/ß/g;
$char =~ s/&?#164;/ä/g;
$char =~ s/&?#182;/ö/g;
$char =~ s/&?#188;/ü/g;
$char =~ s/&?#132;/Ä/g;
$char =~ s/&?#150;/Ö/g;
$char =~ s/&?#156;/Ü/g;

# XML Numeric Character Encoding
$char =~ s/&?#223;/ß/g;
$char =~ s/&?#228;/ä/g;
$char =~ s/&?#246;/ö/g;
$char =~ s/&?#252;/ü/g;
$char =~ s/&?#196;/Ä/g;
$char =~ s/&?#214;/Ö/g;
$char =~ s/&?#220;/Ü/g;
}

return $char;
}

#------------------------------------------------------------------------------#
# Subroutine, um Umlaute nach XML-Entitaeten zu konvertieren
#
#------------------------------------------------------------------------------#
sub xmlcode ($) {
my $char = shift;

$char =~ s/ß/ß/g;
$char =~ s/ä/ä/g;
$char =~ s/ö/ö/g;
$char =~ s/ü/ü/g;
$char =~ s/Ä/Ä/g;
$char =~ s/Ö/Ö/g;
$char =~ s/Ü/Ü/g;

return $char;
}

#------------------------------------------------------------------------------#
# Subroutine, um ein korrektes Element für den Wert eines Tag zu
erstellen und #
# ggf. Korrekturfilter anzuwenden auf importierte Daten
#
# Die Ueberpruefung von int/long hat z. Zt. keinen tieferen Zweck und
dient #
# nur der Sicherheit, dass alle Werte korrekt sind (Begrenzungswerte
sind fuer #
# "signed"-Typen eines typischen C/C++-Compilers auf x86 Architektur
ausgelegt #
# - ggf. anpassen)
#
#------------------------------------------------------------------------------#
sub create_value_element ($$$) {
my ( $self, $import_value, $zieltag, $xml, $konfiguration ) = @_;

my $type = $xml->{type}{ $zieltag->getTagName };
my $value_node = undef;

# UTF8-Coding in XML Numerischen Character Encoding wandeln
$import_value =~ s/([^\x20-\x7F])/'&#' . ord($1) . ';'/gse unless
$type eq 'base64Binary';

# Die Encodings in Umlaute wandeln
$import_value = code($import_value) unless $type eq
'base64Binary';

SWITCH: for ($type) {
/string/ && do {
if ( $import_value =~ /[\w]/ ) { $value_node =
$xml->{'doc'}->createCDATASection($import_value) }
last;
};
/dateTime/ && do {
if ( $import_value =~ /[\d|-]+/ ) { $value_node =
$xml->{'doc'}->createTextNode( date_iso($import_value) ) }
last;
};
/dateTimefix/ && do {
if ( $import_value =~ /[\d|-]+/ ) { $value_node =
$xml->{'doc'}->createTextNode( date_iso($import_value) ) }
last;
};
/long/ && do {
if ( $import_value =~ /\d*/ && $import_value < 2147483647
) { $value_node = $xml->{'doc'}->createTextNode($import_value) }
last;
};
/int/ && do {
if ( $import_value =~ /\d*/ && $import_value < 32767 ) {
$value_node = $xml->{'doc'}->createTextNode($import_value) }
last;
};
/enum/ && do {

foreach my $value_allowed ( @{ $xml->{enum}{
$zieltag->getTagName } } ) {
if ( $import_value eq $value_allowed ) { $value_node =
$xml->{'doc'}->createTextNode($import_value); }
elsif ( convert_value( $import_value, $value_allowed )
) { $value_node = $xml->{'doc'}->createTextNode($value_allowed); }
}
last;
};
/base64Binary/ && do { $value_node =
$xml->{doc}->createCDATASection($import_value); last; };
die CText->get( $konfiguration, 1003, $_ );
}

return $value_node;
}

#------------------------------------------------------------------------------#
# Subroutine, um einen Wert auf einen Enumerationswert hinzubiegen,
falls #
# maeglich - TRUE zurückgeben, falls das geht, ansonsten FALSE
#
#------------------------------------------------------------------------------#
sub convert_value ($$) {
my ( $iv, $av ) = @_;
SWITCH: for ($av) {
/männlich/ && do {
if ( $iv =~ /^m/ ) { $iv = 1; last; }
};
/weiblich/ && do {
if ( $iv =~ /^[wf]/ ) { $iv = 1; last; }
};
$iv = 0;
}
return $iv;
}

#------------------------------------------------------------------------------#
# Subroutine, um ein Datum in ein ISO-konformes Format "YYYY-MM-TT
HH:MM:SS" #
# zu bringen
#
#------------------------------------------------------------------------------#
sub date_iso ($) {
my ( $self, $date ) = @_;
$date = $self unless defined $date;
my $iso = "";

if ( $date =~ /(\d{1,2})\.(\d{1,2})\.(\d{2}\d*)(.*)/ ) {

# Deutsches Datum
my $time = $4;
$iso = "$3-$2-$1 ";
if ( defined $time && $time =~
/(\d{1,2})\D(\d{1,2})\D(\d{1,2})/ ) { $iso .= "$1:$2:$3"; }
else { $iso .= "00:00:00"; }
}
elsif ( $date =~ /(\d{4})\D(\d{2})\D(\d{2})
(\d{2})\D(\d{2})\D(\d{2})/ ) {

# ISO-Datum
my $iso = "$3.$2.$1 ($4:$5:$6)";
}
return $iso;
}
-----------------------------------------------

Sincerely

Markus

Ben Morrow · Jun 7, 2004

Quoth (e-mail address removed):

Okay, Ben, thank you very much. Here is the complete code for
"CXML.pm" for your interest. Of course, it contains XML:OM
statements.

Can you have a look at this file as well?

No, *you* look at it. AFAICS, there is very little in there that won't
work if you simply change 'XML:

OM' to 'XML::LibXML' throughout. When
you've done this, and fixed the obvious differences (e.g. LibXML takes
the xml version and encoding in the document constructor rather than in
a separate method call) *then* post back if you have any insurmountable
problems, with a *SMALL* complete example showing what it is you can't
figure out.

Ben

Tad McClellan · Jun 7, 2004

[ snip 100 lines of upside-down full-quoted text ]

Okay, Ben, thank you very much. Here is the complete code for
"CXML.pm" for your interest.

[ snip over 1000 lines of code!! ]

Please learn how to properly compose a followup.

Have you seen the Posting Guidelines that are posted here frequently?

Markus Mohr · Jun 7, 2004

[ snip 100 lines of upside-down full-quoted text ]

Okay, Ben, thank you very much. Here is the complete code for
"CXML.pm" for your interest.

Click to expand...

[ snip over 1000 lines of code!! ]

Please learn how to properly compose a followup.
Willingly.

Have you seen the Posting Guidelines that are posted here frequently?

No.

Sincerely

Markus Mohr

little problem with xml::dom::parser	0	Jan 22, 2008
problem of python whitespace XML dom	0	Jan 13, 2016
KeepCDATA => 1 in XML::DOM	5	Jun 17, 2004
How to avoid Out of Memory Errors when dealing with a large XML file?	2	Jan 10, 2011
nbsp and xml::dom::parser	6	Feb 10, 2006
OutOfMemoryException With XML DOM	4	Feb 21, 2008
My experience with XML::DOM VS XML::LibXML	21	Dec 1, 2005
pugXML parser - dumping to XML	2	Nov 16, 2008

Memory problem with XML::DOM::Parser???

Markus Mohr

Ben Morrow

Ben Morrow

Markus Mohr

Ben Morrow

Markus Mohr

Ben Morrow

Tad McClellan

Markus Mohr

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads