Regular expression to match an XML tag

Karthik

Hi All,

I am trying to match an XML tag using JS regular expressions. The
pattern I am using is

pattern="/(<" + tagname + ">)" + "(*)" + "(<." + tagname +
">/g)";

where I want to replace the tagname variable with the name of the tag
which I want to search for. Unfortunately this doesn't work. If I
replace the tagname variable with the actual tag's name it works.
Any idea how to fix this issue?

If any of you could post a script that could do this it would be
great.

Thanks
Karthik
 
Karthik

Hi All,

Modified the pattern to
var patt="(<" + tagname + ">)" + "(*)" + "(<." + tagname +
">)";

without the initial / and ending /g. Still no go...
 
Karthik

Here is the full script.
Here str is just temporary storage; I will actually be applying the
pattern to the source of the HTML page of the "current window"
object.

<html>
<body>

<script type="text/javascript">
var tagname="ContentId";
var result="";
var str = "&lt;ContentId&gt;12345&lt;/ContentId&gt;";
var patt="(&lt;" + tagname + "&gt;)" + "(*)" + "(&lt;." + tagname +
"&gt;)";
//var patt=/(&lt;ContentId&gt;)([\d]*)/g
document.write(patt + " &nbsp PAttern <BR>");
document.write(str + "<BR>");
var patt2=new RegExp(patt);

result=patt2.exec(str);
document.write(result + " Result &nbsp <BR>");
document.write(RegExp.$2);
</script>

</body>
</html>
 
Karthik

Got the expression...

Here it is:
var regexpr = new RegExp("(&lt;" + tagname + "&gt;)([A-Z]*[[a-z]*[0-9]*)" +
    "(&lt;." + tagname + "&gt;)");
Apply exec with this pattern to any string, HTML source, or XML file, and it
will fetch the values between the tags.
One word of warning though: if the tag has got child tags, it will
retrieve all the child tags also :)

Thanks
Karthik
 
Jeremy

Karthik said:
Got the expression...

Here it is:
var regexpr = new RegExp("(&lt;" + tagname + "&gt;)([A-Z]*[[a-z]*[0-9]*)" +
    "(&lt;." + tagname + "&gt;)");
Apply exec with this pattern to any string, HTML source, or XML file, and it
will fetch the values between the tags.
One word of warning though: if the tag has got child tags, it will
retrieve all the child tags also :)

Thanks
Karthik

Using regular expressions alone will never really get you a robust
parser. For example, "<foo>bar<afoo>" would match your current
expression, even though <afoo> doesn't close <foo>.
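
A minimal sketch of that false positive. It assumes plain `<` and `>` instead of the entity references and treats the doubled `[` as a typo; `buildPattern` is a hypothetical helper name, not something from the posted script.

```javascript
// Hypothetical helper: the posted expression, with plain < and >
// and the doubled "[" assumed to be a typo.
function buildPattern(tagname) {
  return new RegExp("(<" + tagname + ">)([A-Z]*[a-z]*[0-9]*)(<." + tagname + ">)");
}

var pattern = buildPattern("foo");

// The "." before the tag name accepts any character, not just "/".
var wellFormed = pattern.exec("<foo>bar</foo>");
var malformed = pattern.exec("<foo>bar<afoo>");

console.log(wellFormed !== null); // true
console.log(malformed !== null);  // true, although <afoo> closes nothing
```

Replacing the `.` with an escaped `/` would reject this particular input, but it would not make the expression a parser.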

You want to search through the current document for a certain tag?
Wouldn't it be easier to use the DOM for this purpose?
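
A rough sketch of the DOM route, assuming the markup is available as a DOM document (for example `document` in a browser, or the result of `DOMParser`). `getTagValues` is a hypothetical helper name, and the `<root>` wrapper in the usage lines is made up for illustration.

```javascript
// Collect the text content of every element with the given tag name.
// `doc` is any DOM Document, e.g. `document` in a browser.
function getTagValues(doc, tagname) {
  var elements = doc.getElementsByTagName(tagname);
  var values = [];
  for (var i = 0; i < elements.length; i++) {
    values.push(elements[i].textContent);
  }
  return values;
}

// Browser-only usage, guarded so the sketch also loads outside a browser.
if (typeof DOMParser !== "undefined") {
  var xml = new DOMParser().parseFromString(
    "<root><ContentId>12345</ContentId></root>", "application/xml");
  console.log(getTagValues(xml, "ContentId"));
}
```

Nothing here depends on how the tag name is spelled inside a pattern, so the escaping and nesting problems of the regular expression do not arise.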

Jeremy
 
Bart Van der Donck

Karthik said:
var regexpr = new RegExp("(&lt;" + tagname + "&gt;)([A-Z]*[[a-z]*[0-9]*)" +
    "(&lt;." + tagname + "&gt;)");
Apply exec with this pattern to any string, HTML source, or XML file, and it
will fetch the values between the tags.
One word of warning though: if the tag has got child tags, it will
retrieve all the child tags also :)

And that's only the very beginning :)

Take a look at

http://groups.google.com/group/comp.lang.perl.misc/browse_frm/thread/795b006db41efc7b/

to get an idea of the complexity of real XML string parsing.

Do yourself a favour and load it into an XML parser.
 
Thomas 'PointedEars' Lahn

Karthik said:
Got the expression...

Not at all, you don't.

Here it is:
var regexpr = new RegExp("(&lt;" + tagname + "&gt;)([A-Z]*[[a-z]*[0-9]*)" +
    "(&lt;." + tagname + "&gt;)");
Apply exec with this pattern to any string, HTML source, or XML file, and it
will fetch the values between the tags.

Only if the content is ASCII-alphanumeric. XML, however, is UTF-8-safe.

One word of warning though: if the tag has got child tags, it will
                               ^^^^^^^^^^^^^^^^^^^^^^^^^^
retrieve all the child tags also :)
                 ^^^^^^^^^^

http://www.w3.org/TR/REC-html40/intro/sgmltut.html#h-3.2.1 (esp. the last,
green-colored paragraph)

It will _not_ match any child _elements_, as you have explicitly excluded
their start tags from the content of the `tagname' element, assuming that
the double `[' was but a typo (if it was not, the expression would match `['
in the content as well). Why you escape `<' and `>' remains a mystery;
further assuming that you use it within an XHTML `script' element (where
declaring it as CDATA would have sufficed to avoid the character entity
references), the possible match would be

<foo>abc<bar>def</bar>ghi</foo>
^^^^^^^^^^

However, that match is discarded because `ar' does not match `fo'.
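
Both observations can be checked with a short sketch. It assumes plain `<` and `>` in place of the entity references and keeps the doubled `[` exactly as posted; the variable names are only illustrative.

```javascript
// The expression as posted (doubled "[" kept), with plain < and > assumed.
var tagname = "foo";
var posted = new RegExp("(<" + tagname + ">)([A-Z]*[[a-z]*[0-9]*)(<." + tagname + ">)");

// Child element: the content classes cannot consume "<bar>", so no match.
var nested = posted.exec("<foo>abc<bar>def</bar>ghi</foo>");

// Because of the doubled "[", a literal "[" in the content is accepted.
var bracket = posted.exec("<foo>a[b</foo>");

console.log(nested);     // null
console.log(bracket[2]); // "a[b"
```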

The Chomsky hierarchy, taught in computer science classes, tells us that
it is usually not possible to use (only) a regular grammar, such as the one
regular expressions are based on, to parse a context-free language, such as
SGML-based markup: every regular language is context-free, but not
every context-free language is regular.

Therefore, if you need to parse the markup as such instead of accessing
the corresponding DOM objects, you are looking for a non-deterministic
pushdown automaton (which can parse those languages), implemented as an XML
parser (such as DOMParser in Gecko-based UAs). If you don't want
to use such an external API, it is possible to combine the efficiency of
regular expression matching with the reliability of an NPDA in your own code.
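
One way that combination might look, as a rough sketch only: a regular expression finds the tags and an explicit stack tracks the nesting. It is not a full XML parser. It assumes simple tags and ignores comments, CDATA sections, processing instructions, and attribute values containing `>`. `extractElements` is a hypothetical name.

```javascript
// Sketch: the regex tokenizes tags, the stack (the "pushdown" part)
// keeps track of which elements are currently open.
function extractElements(source, tagname) {
  var tagPattern = /<(\/?)([A-Za-z_][\w.:\-]*)[^>]*?(\/?)>/g;
  var stack = [];   // open elements: { name, contentStart }
  var results = []; // inner markup of each element named `tagname`
  var match;
  while ((match = tagPattern.exec(source)) !== null) {
    var isClose = match[1] === "/";
    var name = match[2];
    var isSelfClosing = match[3] === "/";
    if (isClose) {
      var open = stack.pop();
      if (!open || open.name !== name) {
        throw new Error("Mismatched closing tag </" + name + ">");
      }
      if (name === tagname) {
        results.push(source.slice(open.contentStart, match.index));
      }
    } else if (isSelfClosing) {
      if (name === tagname) results.push("");
    } else {
      stack.push({ name: name, contentStart: tagPattern.lastIndex });
    }
  }
  if (stack.length > 0) {
    throw new Error("Unclosed tag <" + stack[stack.length - 1].name + ">");
  }
  return results;
}

console.log(extractElements("<foo>abc<bar>def</bar>ghi</foo>", "foo"));
// [ 'abc<bar>def</bar>ghi' ]
```

Unlike the single expression, this rejects `<foo>bar<afoo>` with an error, because the stack is not empty when the input ends.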

http://en.wikipedia.org/wiki/Chomsky_hierarchy


PointedEars
 
