Attempting to parse free-form ANSI text.


Michael B. Trausch

Alright... I am attempting to find a way to parse ANSI text from a
telnet application. However, I am experiencing a bit of trouble.

What I want to do is have all ANSI sequences _removed_ from the output,
save for those that manage color codes or text presentation (in short,
the ones of the form ESC[#m, with additional #s separated by ; characters).
The ones that are left, the color codes, I want to act on and remove
from the text stream before displaying the text.

I am using wxPython's TextCtrl as output, so when I "get" an ANSI color
control sequence, I want to basically turn it into a call to wxWidgets'
TextCtrl.SetDefaultStyle method for the control, adding the appropriate
color/brightness/italic/bold/etc. settings to the TextCtrl until the
next ANSI code comes in to alter it.

It would *seem* easy, but I cannot seem to wrap my mind around the idea.
:-/

I have a source tarball up at http://fd0man.theunixplace.com/Tmud.tar
which contains the code in question. In short, the information is
coming in over a TCP/IP socket that is traditionally connected to with a
telnet client, so things can be broken mid-line (or even mid-control
sequence). If anyone has any ideas as to what I am doing, expecting, or
assuming that is wrong, I would be delighted to hear it. The code that
is not behaving as I would expect it to is in src/AnsiTextCtrl.py, but I
have included the entire project as it stands for completeness.

Any help would be appreciated! Thanks!

-- Mike
 

Frederic Rentsch

I have no experience with reading from TCP/IP. But looking at your
program with a candid mind, I'd say that it is written to process a chunk
of data in memory. If, as you say, the chunks you get from TCP/IP may
start and end anywhere and, presumably, you pass each chunk through
AppendText, then you have a synchronization problem, as each call resets
your escape flag, even if the new chunk starts in the middle of an
escape sequence. Perhaps you should cut off incomplete escapes at the
end and prepend them to the next chunk.

And:

if(len(buffer) > 0):
    wx.TextCtrl.AppendText(self, buffer)   <<< Are you sure text goes into the same place as the controls?

if(len(AnsiBuffer) > 0):
    wx.TextCtrl.AppendText(self, AnsiBuffer)   <<< You say you want to strip the control sequences


Frederic
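The "cut off incomplete escapes and prepend them to the next chunk" idea can be sketched with the standard `re` module alone. This is only an illustration, not code from Tmud: the names `split_incomplete_tail` and `ChunkJoiner` are made up here, and the sketch assumes that only sequences of the form ESC [ digits/semicolons letter need to be kept whole.

```python
import re

# An escape sequence that has started but not finished at the very end of
# a chunk: a lone ESC, or ESC [ followed by digits/semicolons with no
# final letter yet. \Z anchors the match at the true end of the string.
_PARTIAL_TAIL = re.compile(r'\x1b(?:\[[\d;]*)?\Z')

def split_incomplete_tail(chunk):
    """Return (complete_part, incomplete_tail) for one chunk of text."""
    m = _PARTIAL_TAIL.search(chunk)
    if m is None:
        return chunk, ''
    return chunk[:m.start()], chunk[m.start():]

class ChunkJoiner:
    """Hypothetical helper: holds back a truncated escape between calls."""

    def __init__(self):
        self.held = ''

    def feed(self, chunk):
        # Prepend whatever was held back last time, then hold back any
        # new truncated escape found at the end of the combined data.
        data = self.held + chunk
        complete, self.held = split_incomplete_tail(data)
        return complete

joiner = ChunkJoiner()
first = joiner.feed('abc\x1b[3')    # the escape is cut off mid-sequence
second = joiner.feed('1mred')       # the rest arrives in the next chunk
```

Each value returned by `feed` contains only whole escape sequences, so whatever parses the colors never sees a sequence that was split by the network.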
 

Paul McGuire

Here is a pyparsing-based scanner/converter, along with some test code at
the end. It takes care of partial escape sequences, and strips any
sequences of the form
"<ESC>[##;##;...<alpha>", unless the trailing alpha is 'm'.
The pyparsing project wiki is at http://pyparsing.wikispaces.com.

-- Paul

from pyparsing import *

ESC = chr(27)
escIntro = Literal(ESC + '[').suppress()
integer = Word(nums)

colorCode = Combine(escIntro +
                    Optional(delimitedList(integer, delim=';')) +
                    Suppress('m')).setResultsName("colorCode")

# define search pattern that will match non-color ANSI command
# codes - these will just get dropped on the floor
otherAnsiCode = Suppress(Combine(escIntro +
                                 Optional(delimitedList(integer, delim=';')) +
                                 oneOf(list(alphas))))

partialAnsiCode = Combine(Literal(ESC) +
                          Optional('[') +
                          Optional(delimitedList(integer, delim=';') +
                                   Optional(';')) +
                          StringEnd()).setResultsName("partialCode")
ansiSearchPattern = colorCode | otherAnsiCode | partialAnsiCode


# preserve tabs in incoming text
ansiSearchPattern.parseWithTabs()

def processInputString(inputString):
    lastEnd = 0
    for t, start, end in ansiSearchPattern.scanString(inputString):
        # pass inputString[lastEnd:start] to wxTextControl - font styles
        # were set in parse action
        print(inputString[lastEnd:start])

        # process color codes, if any:
        if t.getName() == "colorCode":
            if t:
                print("<change color attributes to %s>" % t.asList())
            else:
                print("<empty color sequence detected>")
        elif t.getName() == "partialCode":
            print("<found partial escape sequence %s, tack it on front of next>" % t)
            # return partial code, to be prepended to the next string
            # sent to processInputString
            return t[0]
        else:
            # other kind of ANSI code found, do nothing
            pass

        lastEnd = end

    # pass inputString[lastEnd:] to wxTextControl - this is the last bit
    # of the input string after the last escape sequence
    print(inputString[lastEnd:])


test = """\
This is a test string containing some ANSI sequences.
Sequence 1: ~[10;12m
Sequence 2: ~[3;4h
Sequence 3: ~[4;5m
Sequence 4; ~[m
Sequence 5; ~[24HNo more escape sequences.
~[7""".replace('~',chr(27))

leftOver = processInputString(test)


Prints:
This is a test string containing some ANSI sequences.
Sequence 1:
<change color attributes to ['1012']>

Sequence 2:

Sequence 3:
<change color attributes to ['45']>

Sequence 4;
<change color attributes to ['']>

Sequence 5;
No more escape sequences.

<found partial escape sequence ['\x1b[7'], tack it on front of next>
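For comparison, a similar split into plain text, color codes, and other codes can be sketched with the standard `re` module instead of pyparsing. This is an alternative illustration rather than Paul's code: the function name `scan` and the event tuples are invented here, truncated sequences at the end of the input are not handled, and the parameters are kept as separate integers.

```python
import re

# ESC [ parameters final-letter, where the parameters are digits
# separated by ';' (possibly empty).
CSI = re.compile(r'\x1b\[([\d;]*)([A-Za-z])')

def scan(text):
    """Return a list of ('text', str) and ('sgr', [int, ...]) events.

    Sequences whose final letter is not 'm' are dropped silently.
    """
    events = []
    pos = 0
    for m in CSI.finditer(text):
        if m.start() > pos:
            events.append(('text', text[pos:m.start()]))
        params, final = m.groups()
        if final == 'm':
            # An empty parameter list (ESC[m) is a reset, i.e. code 0.
            codes = [int(p) for p in params.split(';') if p] or [0]
            events.append(('sgr', codes))
        pos = m.end()
    if pos < len(text):
        events.append(('text', text[pos:]))
    return events

sample = 'A\x1b[10;12mB\x1b[3;4hC\x1b[mD'
events = scan(sample)
# events == [('text', 'A'), ('sgr', [10, 12]), ('text', 'B'),
#            ('text', 'C'), ('sgr', [0]), ('text', 'D')]
```

A caller would append each 'text' event to the control and change the current style on each 'sgr' event.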
 

Frederic Rentsch

Paul said:
Prints:
This is a test string containing some ANSI sequences.
Sequence 1:
<change color attributes to ['1012']>
I doubt we should concatenate numbers.
Sequence 2:

Sequence 3:
<change color attributes to ['45']>

Sequence 4;
<change color attributes to ['']>

Sequence 5;
No more escape sequences.

<found partial escape sequence ['\x1b[7'], tack it on front of next>
Another one of Paul's elegant pyparsing solutions. To satisfy my own
curiosity, I tried to see how SE stacked up and devoted more time than I
really should to finding out. In the end I don't know if it was worth
the effort, but having made it I might as well just throw it in.


The most recent version of SE is now 2.3 with a rare malfunction
corrected. (SE from http://cheeseshop.python.org/pypi/SE/2.2 beta)
 

Frederic Rentsch

Sorry about the line wrap mess in the previous message. I try again with
another setting:

Frederic

 

Paul McGuire

Frederic Rentsch said:
Prints:
This is a test string containing some ANSI sequences.
Sequence 1:
<change color attributes to ['1012']>
I doubt we should concatenate numbers.

Oof! No doubt. I should read my test output more closely!

This is an artifact of using Combine to only recognize escape sequences with
no intervening whitespace. Removing Combine solves this problem.

-- Paul
 

Michael B. Trausch

Frederic said:
Michael said:
Alright... I am attempting to find a way to parse ANSI text from a
telnet application. However, I am experiencing a bit of trouble.
[snip]

I have no experience with reading from TCP/IP. But looking at your
program with a candid mind I'd say that it is written to process a chunk
of data in memory.

That would be correct... that's the only way that I can think of to do
it, since the chunks come in from the network to a variable.

Frederic said:
If, as you say, the chunks you get from TCP/IP may
start and end anywhere and, presumably you pass each chunk through
AppendText, then you have a synchronization problem, as each call resets
your escape flag, even if the new chunk starts in the middle of an
escape sequence. Perhaps you should cut off incomplete escapes at the
end and prepend them to the next chunk.

The question would be -- how does one determine if it is incomplete or
not? The answer might lie in the previous response to this, where there
is another ANSI python module that works with the text. Actually, it's
possible that my entire approach is faulty -- I am, after all, rather a
newbie programmer -- and this is my first go at an application that does
something other than "Hello, world!"

Frederic said:
And:

if(len(buffer) > 0):
    wx.TextCtrl.AppendText(self, buffer)   <<< Are you sure text goes into the same place as the controls?

What do you mean? This function is in AnsiTextCtrl (so, it is
AnsiTextCtrl.AppendText). I have derived AnsiTextCtrl from wx.TextCtrl,
so I think that if I need to call the parent's function, I would do so
directly, with 'self' being the current object, and buffer being the
text to add from the network.

When I call AppendText on an instance of a text control (e.g.,
AnsiTextCtrl.AppendText(buffer)), I don't need to provide the object
that needs to be worked on. It seems to work okay; but I know that just
because something works okay, that doesn't mean it is right. So, am I
doing it wrong?

Frederic said:
if(len(AnsiBuffer) > 0):
    wx.TextCtrl.AppendText(self, AnsiBuffer)   <<< You say you want to strip the control sequences

Yes, in the end I do. Right now, I had to see what was going on. I
figure that once I do figure out how to get it working, I can see the
color changes when the ANSI code shows up in the buffer, and they should
match up with the little chart that I have of the codes here.
Hopefully. And then I can stop printing the extra crud on here.

-- Mike
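On the question of calling the parent's method directly: the difference between `instance.method(arg)` and `ParentClass.method(instance, arg)` can be shown without wx at all. The classes below are stand-ins invented for this sketch (`Base` plays the role of wx.TextCtrl, `Filtering` the role of AnsiTextCtrl), so it only illustrates the Python calling convention, not wxPython behavior.

```python
class Base:
    """Stand-in for the parent control; it just records appended text."""

    def __init__(self):
        self.lines = []

    def append_text(self, text):
        self.lines.append(text)


class Filtering(Base):
    """Stand-in for the derived control that filters before appending."""

    def append_text(self, text):
        cleaned = text.replace('\r', '')
        # Looking the method up on the class gives a plain function, so
        # the instance has to be passed explicitly as the first argument.
        Base.append_text(self, cleaned)


ctrl = Filtering()
# Called on an instance, Python supplies 'self' automatically and picks
# the override in Filtering, which then delegates to Base.
ctrl.append_text('hello\r\n')
```

Calling `self.append_text(cleaned)` inside the override would recurse into the override itself, which is why the parent class is named explicitly (or reached through `super()`).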
 

Dennis Lee Bieber

Steve said:
Don't give up, attach it as a file!

Which might be acceptable on a mailing list, but might be
problematic on a "text" newsgroup... Though one attachment a year might
not be noticed by those providers with strict "binaries in binary groups
only" <G>
 

Frederic Rentsch

Steve said:
Don't give up, attach it as a file!

regards
Steve

Thank you for the encouragement!

Frederic


The following code does everything Mike needs to do, except interact with wx. It is written to run standing alone. To incorporate it in Mike's class the functions would be methods and the globals would be instance attributes. Running it does this:
Sequence 1 Valid code, invalid numbers: \x1b[10;12mEnd of sequence 1
Sequence 2 Not an 'm'-code: \x1b[30;4;77hEnd of sequence 2
Sequence 3 Color setting code: \x1b[30;45mEnd of sequence 3
Sequence 4 Parameter setting code: \x1b[7mEnd of sequence 4
Sequence 5 Color setting code spanning calls: \x1b[3"""
Sequence 6 Invalid code: \x1b[End of sequence 6
Sequence 7 A valid code at the end: \x1b[9m
"""

This is a test string containing some ANSI sequences.
Sequence 1 Valid code, invalid numbers: >>!!!Ignoring unknown number 10!!!<< >>!!!Ignoring unknown number 1!!!<< End of sequence 1
Sequence 2 Not an 'm'-code: End of sequence 2
Sequence 3 Color setting code: >>setting foreground BLACK<< >>setting background MAGENTA<< End of sequence 3
Sequence 4 Parameter setting code: >>Calling parameter setting function 7<< End of sequence 4
Sequence 5 Color setting code spanning calls: >>setting foreground GREY<< >>setting background GREEN<< End of sequence 5
Sequence 6 Invalid code: nd of sequence 6
Sequence 7 A valid code at the end: >>Calling parameter setting function 9<<


#################


import re

def init ():   # This would have to be added to __init__ ()

    import SE   # SEL is less import overhead but doesn't have interactive development features (not needed in production versions)

    global output   #-> For testing
    global Pre_Processor, digits_re, Colors, truncated_escape_hold   # global -> instance attributes

    # Screening out invalid characters and all ansi escape sequences except those controlling color
    grit = '\n'.join (['(%d)=' % i for i in range (128,255)]) + ' (13)= '   # Makes 127 fixed expressions plus deletion of \r
    # Regular expression r'[\x80-\xff\r]' would work fine but is four times slower than 127 fixed expressions
    all_escapes   = r'\x1b\[\d*(;\d*)*[A-Za-z]'
    color_escapes = r'\x1b\[\d*(;\d*)*m'
    Pre_Processor = SE.SE ('%s ~%s~= ~%s~==' % (grit, all_escapes, color_escapes))   # SEL.SEL for production
    # 'all_escapes' also matches what 'color_escapes' matches. With identical regular expression matches the last regex definition applies.

    # Isolating digits.
    digits_re = re.compile (r'\d+')

    # Color numbers to color names
    Colors = SE.SE ('''
30=BLACK 40=BLACK
31=RED 41=RED
32=GREEN 42=GREEN
33=YELLOW 43=YELLOW
34=BLUE 44=BLUE
35=MAGENTA 45=MAGENTA
36=CYAN 46=CYAN
37=GREY 47=GREY
39=GREY 49=BLACK
<EAT>
''')

    truncated_escape_hold = ''   #-> self.truncated_escape_hold
    output = ''   #-> For testing only



# What follows replaces all others of Mike's methods in class AnsiTextCtrl(wx.TextCtrl)

def process_text (text):

    global output   #-> For testing
    global truncated_escape_hold, digits_re, Pre_Processor, Colors

    purged_text = truncated_escape_hold + Pre_Processor (text)
    truncated_escape_hold = ''   # The held fragment is used up; without this reset it would be prepended again on every later call
    # Text is now clean except for color codes, which begin with ESC

    ansi_controlled_sections = purged_text.split ('\x1b')
    # Each section starts with a color control, except the first one (leftmost split-off)

    if ansi_controlled_sections:
        #-> self.AppendText(ansi_controlled_sections [0])   #-> For real
        output += ansi_controlled_sections [0]   #-> For testing
        for section in ansi_controlled_sections [1:]:
            if section == '': continue
            try: escape_ansi_controlled_section, data = section.split ('m', 1)
            except ValueError:   # Truncated escape
                truncated_escape_hold = '\x1b' + section   # Restore ESC removed by split ('\x1b')
            else:
                escapes = escape_ansi_controlled_section.split (';')
                for escape in escapes:
                    try: number = digits_re.search (escape).group ()
                    except AttributeError:
                        output += ' >>!!!Invalid number %s!!!<<< ' % escape   #-> For testing
                        continue
                    _set_wx (number)
                #-> self.AppendText(data)   #-> For real
                output += data   #-> For testing


def _set_wx (n):

    global output   # For testing only
    global Colors

    int_n = int (n)
    if 0 <= int_n <= 9:
        #-> self._number_to_method [n] (self)   #-> For real (dictionary lookup; the stored functions are plain functions, so self is passed by hand)
        output += ' >>Calling parameter setting function %s<< ' % n   #-> For testing
        return
    color = Colors (n)
    if color:
        if 30 <= int_n < 50:
            if 40 <= int_n:
                #-> self.AnsiBGColor = color   #-> For real
                output += ' >>setting background %s<< ' % color   #-> For testing
            else:
                #-> self.AnsiFGColor = color   #-> For real
                output += ' >>setting foreground %s<< ' % color   #-> For testing
            return
    output += ' >>!!!Ignoring unknown number %s!!!<< ' % n   #-> For testing



#-> For real requires this in addition:
#->
#-> # Methods controlled by 'm' code 0 to 9: # Presumably 'm'?
#->
#-> def _0 (self):
#->     self.AnsiFGColor = 'GREY'
#->     self.AnsiBGColor = 'BLACK'
#->     self.AnsiFontSize = 9
#->     self.AnsiFontFamily = wx.FONTFAMILY_TELETYPE
#->     self.AnsiFontStyle = wx.FONTSTYLE_NORMAL
#->     self.AnsiFontWeight = wx.FONTWEIGHT_NORMAL
#->     self.AnsiFontUnderline = False
#->
#-> def _1 (self): self.AnsiFontWeight = wx.FONTWEIGHT_BOLD
#-> def _2 (self): self.AnsiFontWeight = wx.FONTWEIGHT_LIGHT
#-> def _3 (self): self.AnsiFontStyle = wx.FONTSTYLE_ITALIC
#-> def _4 (self): self.AnsiFontUnderline = True
#-> def _5 (self): pass
#-> def _7 (self): self.AnsiFGColor, self.AnsiBGColor = self.AnsiBGColor, self.AnsiFGColor
#-> def _8 (self): self.AnsiFGColor = self.AnsiBGColor
#-> def _9 (self): pass
#->
#->
#-> _number_to_method = {
#->     '0' : _0,
#->     '1' : _1,
#->     '2' : _2,
#->     '3' : _3,
#->     '4' : _4,
#->     '5' : _5,
#->     '7' : _7,
#->     '8' : _8,
#->     '9' : _9,
#-> }
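The number-to-setting dispatch above can also be tried out without SE or wx. The following is a minimal standard-library sketch: the `Style` class and `apply_sgr` function are hypothetical names introduced for illustration, the defaults (GREY on BLACK) are taken from the `_0` method above, and only a handful of codes are covered.

```python
from dataclasses import dataclass, replace

# Index 0..7 corresponds to codes 30..37 (foreground) and 40..47 (background).
COLOR_NAMES = ['BLACK', 'RED', 'GREEN', 'YELLOW',
               'BLUE', 'MAGENTA', 'CYAN', 'GREY']

@dataclass(frozen=True)
class Style:
    fg: str = 'GREY'
    bg: str = 'BLACK'
    bold: bool = False
    italic: bool = False
    underline: bool = False

def apply_sgr(style, code):
    """Return a new Style with one 'm' code applied; unknown codes are ignored."""
    if code == 0:                       # reset everything to the defaults
        return Style()
    if code == 1:
        return replace(style, bold=True)
    if code == 3:
        return replace(style, italic=True)
    if code == 4:
        return replace(style, underline=True)
    if code == 7:                       # reverse video: swap the two colors
        return replace(style, fg=style.bg, bg=style.fg)
    if 30 <= code <= 37:
        return replace(style, fg=COLOR_NAMES[code - 30])
    if code == 39:                      # default foreground
        return replace(style, fg='GREY')
    if 40 <= code <= 47:
        return replace(style, bg=COLOR_NAMES[code - 40])
    if code == 49:                      # default background
        return replace(style, bg='BLACK')
    return style

style = Style()
for code in (1, 31, 44):                # as in ESC[1;31;44m
    style = apply_sgr(style, code)
```

In the real control, the resulting `Style` would be translated into a text attribute and handed to SetDefaultStyle before the next run of text is appended.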
 

Frederic Rentsch

Dennis said:
Which might be acceptable on a mailing list, but might be
problematic on a "text" newsgroup... Though one attachment a year might
not be noticed by those providers with strict "binaries in binary groups
only" <G>

The comment isn't lost on me, all the more so as it is pushing at an
open door. The funny thing is that I verified my settings by sending the
message to myself and it looked fine. Then I sent it to the newsgroup and
it was messed up again. I will work some more on my settings.

Frederic
 
