Regex question; match <br> after opening tag

Peter J. Holzer · Feb 19, 2011

Yes it is.

I wasn't actually sure whether an unescaped "<" within an attribute
value is allowed (it's forbidden in XML), but James Clark's SGML parser
accepts it and http://www.isgmlug.org/sgmlhelp/g-sg16.htm suggests it.
In any case I'm sure that an unescaped ">" is allowed, and that's the
one which breaks the proposed solution.

That is XHTML.

Even in XML, an unescaped ">" within an attribute is allowed, so

 element" />

is valid XHTML and breaks the proposed solution.
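
To make the failure concrete, here is a minimal sketch. The proposed solution itself is not quoted in this post, so the pattern below is only an assumption about its general shape (a tag matched as "<", anything that is not ">", then ">"), and the img tag is a made-up example.

```perl
use strict;
use warnings;

# Hypothetical tag: the alt attribute contains an unescaped ">".
my $tag = '<img src="x.png" alt="a > b" />';

# Assumed naive pattern (not the one from the thread): a tag is "<",
# then any run of characters that are not ">", then ">".
my ($naive) = $tag =~ /(<[^>]*>)/;

# The match stops at the ">" inside the attribute value,
# so only part of the tag is captured.
print "whole tag: $tag\n";
print "matched:   $naive\n";
```

The pattern cannot tell a ">" inside a quoted attribute value from the one that closes the tag, which is why a real parser is the safer tool here.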

XHTML is not the same as HTML.

ACK. Although I tend to use HTML compatible XHTML[1] instead of HTML.

hp

[1] http://www.w3.org/TR/xhtml1/#guidelines

Jason · Feb 22, 2011

I don't have it installed, so found it on <http://search.cpan.org> and
scanned its docs - there's a handy list of all its methods at the top.
Based on its name, toString() looked like it might be relevant to what
you were trying to do, so I checked the full description of it to make
sure.

I've invested quite a few points in the Looking Stuff Up skill over the
years, and found that it's a pretty good investment.

sherm--

So...

I'm using HTML::HTML5::Parser now, and while it works fine on catching
opening or whatever without a closing , I'm still not
sure how I should use this to remove the opening or trailing <br> in a
string like:

$text = " Test This is fine.";

Which should be converted to:

$text = "Test This is fine.";

Or, like:

$text = "This is fine. Test ";

Which should be converted to:

$text = "This is fine. Test";

Just to reiterate, this is coming from a message board post, so these
strings are just basic samples. I'm trying to remove any <br> that
comes at the beginning (or end) of the string, even if it follows (or
precedes) another tag that is acceptable.

Peter J. Holzer · Feb 22, 2011

[...]

I'm using HTML::HTML5::Parser now, and while it works fine on catching
opening or whatever without a closing , I'm still not
sure how I should use this to remove the opening or trailing <br> in a
string like:

As Sherm said, HTML::HTML5::Parser returns an XML::LibXML::Document
object, so you can use all the methods of XML::LibXML::Document (and
XML::LibXML::Node, which is a superclass of XML::LibXML::Document) to
manipulate the tree.

For example:

* findnodes to find your br elements
* nextNonBlankSibling and previousNonBlankSibling to check if they are
the last or first nonblank element of their parent.
* unbindNode or removeChild to delete them
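
A minimal sketch of how these methods could fit together. To stay self-contained it parses a small well-formed fragment with XML::LibXML directly instead of going through HTML::HTML5::Parser; the sample markup and the helper name strip_edge_brs are made up for illustration. The XPath uses local-name() on the assumption that the parsed document may put its elements in the XHTML namespace, where a plain //br would not match.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical helper: drop every br element that is the first or last
# non-blank child of its parent.
sub strip_edge_brs {
    my ($doc) = @_;

    # local-name() matches br whether or not it carries a namespace.
    my @brs = $doc->findnodes('//*[local-name()="br"]');

    # Forward pass: remove leading br elements. Going in document order
    # means a run of several leading br elements is removed one by one.
    my @kept;
    for my $br (@brs) {
        if (defined $br->previousNonBlankSibling) {
            push @kept, $br;
        }
        else {
            $br->unbindNode;
        }
    }

    # Reverse pass: remove trailing br elements among the survivors.
    for my $br (reverse @kept) {
        $br->unbindNode unless defined $br->nextNonBlankSibling;
    }

    return $doc;
}

# Made-up sample input, not the strings from the original post.
my $doc = XML::LibXML->load_xml(
    string => '<div><br/>Test<br/>This is fine.<br/></div>'
);
strip_edge_brs($doc);
print $doc->documentElement->toString, "\n";
```

The two passes are there so that several br elements in a row at either edge are all removed; a single loop would leave some of a trailing run behind. The br between the two pieces of text is left alone because it has non-blank siblings on both sides.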

hp

ccc31807 · Feb 22, 2011

XHTML is not the same as HTML.

I was thinking of running the HTML code through the W3C validator. I
almost always do so, and try my best to achieve the green light.

http://validator.w3.org/

Sometimes I settle for less, but to my thinking (not following the
precise definitions but just my habits) anything that passes the
validator is valid HTML and anything that doesn't isn't.

I understand that this is a subject that people can have very
different opinions on. My opinion is that, whenever possible, HTML
should pass the validator, but I don't insist that others have the
same opinion.

CC.
