stuck on a REGEX (\S[^\s/>]*)

darrel · Jul 12, 2004

I'm trying to find the opening < and the text of a tag (without the
attributes or closing tags)

This is what I'm using:

(\S[^\s/>]*)

Which, I think, reads as:

(any number of non-whitespace characters [up to a space, /, or >])

Is that correct? I can't get it to work.

If my text is:

<tag

then it returns "<tag" which is what I want.

However, if I have:

<tag/ or <tag>

it instead matches "/" or ">" respectively.

Why?

mikeb · Jul 12, 2004

In my brief testing, when run against "<tag/" it first matches "<tag" -
then the next match is "/". The second match matches "/" because it
matches the \S character class.

Post some examples of how you want the regex to behave, and maybe
someone can help put one together.
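
The scanning behavior described above can be reproduced with a short sketch. It uses Python's `re` module as a stand-in (the thread appears to be about .NET, so this is an assumption, but the pattern uses only basic syntax that should behave the same way there). The helper name `all_matches` is made up for illustration.

```python
import re

# Pattern from the thread: one non-whitespace character, followed by any run
# of characters that are not whitespace, "/", or ">".
pattern = re.compile(r"(\S[^\s/>]*)")

def all_matches(text):
    """Return every non-overlapping match, scanning left to right."""
    return pattern.findall(text)

print(all_matches("<tag"))   # ['<tag']
print(all_matches("<tag/"))  # ['<tag', '/']
print(all_matches("<tag>"))  # ['<tag', '>']
```

The negated class only restricts the characters after the first one. The leading \S accepts any non-whitespace character, so once the scan resumes after "<tag", a lone "/" or ">" is a valid match on its own.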

darrel · Jul 12, 2004

mikeb said:
In my brief testing, when run against "<tag/" it first matches "<tag" -
then the next match is "/". The second match matches "/" because it
matches the \S character class.

But shouldn't this: [^/] stop it from doing that?

Here's how I want the regex to behave:

I want to find the first 'word' in the string. This would be any number of
characters in a row up to (but not including) a space, a new line, a /, or a >.
So in this:

"hello there, how are you"

it should match 'hello'

in this:

"<blockquote>hello there, how are you"

it should match '<blockquote'

Thanks!

-Darrel
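
One possible way to get the behavior described above, sketched in Python's `re` module (an assumption, since the thread does not name a language here). The pattern `[^\s/>]+` and the helper `find_first_word` are illustrative choices, not something proposed in the thread. The idea is to apply the same exclusions to every character, including the first, and to ask for a single match only.

```python
import re

# Hypothetical pattern: one or more characters that are not whitespace,
# "/", or ">". Unlike \S[^\s/>]*, the first character is restricted too.
first_word = re.compile(r"[^\s/>]+")

def find_first_word(text):
    """Return the first run of allowed characters, or None if there is none."""
    m = first_word.search(text)
    return m.group(0) if m else None

print(find_first_word("hello there, how are you"))              # hello
print(find_first_word("<blockquote>hello there, how are you"))  # <blockquote
```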

darrel · Jul 12, 2004

darrel said:
But shouldn't this: [^/] stop it from doing that?

Aha. Mike, you are correct!

Here's what's happening. If this is my text:

<blockquote>monkey</blockquote>

and this is my Regex:

\S[^>]*

It returns these matches:

<blockquote

monkey</blockquote

So, it's returning the last match, I suppose. This is where I get lost. How
do I get it to ONLY return the first match?
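
To see where each match starts, here is a minimal sketch in Python's `re` module (an assumption; the regex engine used in the thread may report matches differently). It lists every match of \S[^>]* together with its starting offset.

```python
import re

pattern = re.compile(r"\S[^>]*")
text = "<blockquote>monkey</blockquote>"

# Collect (start offset, matched text) for every non-overlapping match.
spans = [(m.start(), m.group(0)) for m in pattern.finditer(text)]
for start, matched in spans:
    print(start, repr(matched))
# 0 '<blockquote'
# 11 '>monkey</blockquote'
# 30 '>'
```

In this run the scan produces several matches, not one "last" match. Each later match begins at a ">" because \S accepts it as a first character, and [^>]* then runs up to the next ">".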

darrel · Jul 12, 2004

Got it!

The problem was the very next group I was using.

I had this:

(\S[^\s/>]*)

but had to add another group:

((\s|\n[^\S>]*)|(>))

which checks for whitespace/new lines OR the closing > of a tag.

-Darrel

Guest · Jul 12, 2004

Use the Match class of the regular expression object:

Dim m As Match = yourRegEx.Match(inputString)

m will hold the first match only.
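
For comparison, a rough Python equivalent of the same idea, assuming Python's `re` module as a stand-in for the .NET classes mentioned above. `re.search` returns only the first match, or `None` when nothing matches. The helper name `first_match` is made up for illustration.

```python
import re

pattern = re.compile(r"(\S[^\s/>]*)")

def first_match(text):
    """Return the text of the first match only, or None if there is none."""
    m = pattern.search(text)
    return m.group(1) if m else None

print(first_match("<blockquote>monkey</blockquote>"))  # <blockquote
print(first_match("<tag/"))                            # <tag
```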

