Some text processing questions

Eli Bendersky

Hello all,

In an effort to read and process non-trivial command files for a
transactor in a testbench, I came upon a couple of questions regarding
text / string processing in VHDL:

1) It seems to be impossible to pass "line" variables into functions,
therefore making it difficult to implement such useful functions as
is_whitespace_only(line) and is_comment(line). For the meantime I made
them accept strings and read the line into a string. Isn't there a way
around this ? How do you prefer to work with "line" type vars ?

2) VHDL's strong typedness makes it difficult to work with strings in
a flexible way. For instance, the following "would be" code doesn't
really work:

var msg := string;
....
....
if (something) then
msg := "hello";
else
msg := "bye";
end if;

Because "msg" can't be declared as an unconstrained string. However,
to give it length would mean to constrain all strings assignable to it
to this length. If I say:

var msg := string(1 to 5);

Then I can assign "hello" to it (length 5) but not "bye". This is
quite painful. How can I solve this problem ?

In general, could you post useful code snippets for string / text
processing, and / or point to non-standard libraries you use ? I found
txt_util (http://www.stefanvhdl.com/vhdl/vhdl/txt_util.vhd) pretty
helpful for making some operations less painful.

TIA
Eli
 
Jonathan Bromley

In an effort to read and process non-trivial command files for a
transactor in a testbench,

Before I try to answer your question, it's worth noting that many
engineers have simply given up on this; instead they let the
VHDL compiler do the work for them. Your transactor provides
a bunch of useful procedures, and then you create a test case
not in plain text but in procedural VHDL, calling those procedures.
It's usually quite easy to write your set of procedures so
that a succession of calls to them looks almost like a script.
And of course you have the full power of the language for
doing conditionals, loops and what-have-you.

Anyways, after that digression:
I came upon a couple of questions regarding
text / string processing in VHDL:

No complete answers, I'm afraid, because text processing in VHDL
somewhat sucks; but there *are* good answers to at least some
of your questions.
1) It seems to be impossible to pass "line" variables into functions,
therefore making it difficult to implement such useful functions as
is_whitespace_only(line) and is_comment(line). For the meantime I made
them accept strings and read the line into a string. Isn't there a way
around this ?

Given a line variable L, pass the string value (L.all) as a string
parameter to the function. You can also access individual characters
in L by writing subscripts or slices:

L(1) is the first character
L(L'length) is the last character
L(1 to 3) is the first 3 characters
L.all is equivalent to L(1 to L'length)

A LINE is merely a pointer to a string, making it possible to
create strings of arbitrary length. Not nicely, but possible.

Your string-testing functions can, of course, have unconstrained
input parameters. That is one thing that VHDL does really
beautifully.
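
A minimal sketch of such string-testing functions, taking an unconstrained string so they can be called as is_comment(L.all). The package name line_tests is hypothetical, and treating "--" as the comment marker of the command file is an assumption; the thread does not say what the comment syntax is.

```vhdl
package line_tests is
  -- True if s is empty or contains only spaces and tabs.
  function is_whitespace_only(s : string) return boolean;
  -- True if the first non-blank characters of s are "--"
  -- (assumed comment marker, not taken from the thread).
  function is_comment(s : string) return boolean;
end package line_tests;

package body line_tests is

  function is_whitespace_only(s : string) return boolean is
  begin
    for i in s'range loop
      if s(i) /= ' ' and s(i) /= HT then
        return false;
      end if;
    end loop;
    return true;  -- also reached when s has a null range
  end function is_whitespace_only;

  function is_comment(s : string) return boolean is
    -- Normalise the index range so that t(i + 1) is always the next character.
    alias t : string(1 to s'length) is s;
  begin
    for i in t'range loop
      if t(i) /= ' ' and t(i) /= HT then
        -- First non-blank character: "--" must start here. The boolean
        -- "and" short-circuits, so t(i + 1) is only read when it exists.
        return t(i) = '-' and i < t'length and t(i + 1) = '-';
      end if;
    end loop;
    return false;  -- empty or blank-only string
  end function is_comment;

end package body line_tests;
```

Because the parameter is unconstrained, the same functions accept L.all, a slice such as L(1 to 3), or a string literal.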
2) VHDL's strong typedness makes it difficult to work with strings in
a flexible way. For instance, the following "would be" code doesn't
really work:

In fact, it doesn't work at all, because you can't create a variable
of unconstrained type...
var msg := string;
...
...
if (something) then
msg := "hello";
else
msg := "bye";
end if;

Sure, but again line variables can come to your rescue:

procedure copy_string_to_line(L: inout line; S: in string) is
begin
deallocate(L); --- has no effect if L was already null
write(L, S);
end;
....
if (difficult) then
copy_string_to_line(L, "hard");
else
copy_string_to_line(L, "exceptionally easy");
end if;

And don't forget that you can create functions that return
unconstrained strings. So, having created an unknown-length
string in your line variable L, you could do...

impure function make_string_from_something(.....) return string is
variable L: line;
...
begin
--- mess around until you have the right stuff in L
return L.all;
end;

Impure function, because it probably needs to call some
procedures such as WRITE(...). Yuck.

However, this code fragment represents a memory leak because L
goes out of scope on function return, but its contents have not been
deallocated. (At least, that's my understanding. I don't think
the LRM says anything about how to deal with this.) This is,
to put it mildly, a pain; you want to deallocate L *after*
returning the value it references. Yuck again. If you can
set an upper limit on the length of L, there is a hack^wfix:

impure function ....
variable L: line;
variable N: natural;
variable result: string (1 to MAX);
begin
... --- manufacture variable-length result in L
N := L'length;
assert N <= MAX; --- paranoia
result(1 to N) := L.all;
--- storage referenced by L will not be recovered when
--- L goes out of scope, so we must deallocate it here:
deallocate(L);
return(result(1 to N));
--- storage for "result" is automatically deallocated
--- when it goes out of scope
end;

I'd LOVE to know a better answer to that one... anybody???
Maybe it's better to give up on functions, and pass things
to and from procedures instead. Makes your package's
API nastier, though.

I'll say it once more... c'mon, folks, where's the ability to
overload assignment in VHDL? That would make it SOOOO much
easier to write stuff like string processing packages.....
--
Jonathan Bromley, Consultant

DOULOS - Developing Design Know-how
VHDL * Verilog * SystemC * e * Perl * Tcl/Tk * Project Services

Doulos Ltd., 22 Market Place, Ringwood, BH24 1AW, UK
(e-mail address removed)
http://www.MYCOMPANY.com

The contents of this message may contain personal views which
are not the views of Doulos Ltd., unless specifically stated.
 
Jim Lewis

Eli said:
1) It seems to be impossible to pass "line" variables into functions,
therefore making it difficult to implement such useful functions as
is_whitespace_only(line) and is_comment(line). For the meantime I made
them accept strings and read the line into a string. Isn't there a way
around this ? How do you prefer to work with "line" type vars ?
It needs to be an inout variable.
2) VHDL's strong typedness makes it difficult to work with strings in
a flexible way. For instance, the following "would be" code doesn't
really work:

var msg := string;
...
...
if (something) then
msg := "hello";
else
msg := "bye";
end if;

Because "msg" can't be declared as an unconstrained string. However,
to give it length would mean to constrain all strings assignable to it
to this length. If I say:

var msg := string(1 to 5);

Then I can assign "hello" to it (length 5) but not "bye". This is
quite painful. How can I solve this problem ?
Instead I use subprograms for transactions. A few other things
I consider hard in file based testbenches:
. Hard to make them reactive (respond to a signal transitioning).
Without this, a small disturbance in the design (add a clock
latency in anything you interact with) will require a change
in your file.
. Hard to have any useful language constructs without creating
a mini-language (that is interpreted rather than compiled).
. Clock based or small transaction based IO is slow.

My rule of thumb is to use textio for reading large transaction
values (like a network packet, a video image, ...). Use VHDL
for small transaction processing (like CpuRead, CpuWrite,
SendCharacter, ...).

Note others have solved these problems, and like the file based
approach - I prefer a different method.

Cheers,
Jim
 
Jim Lewis

Eli,
It needs to be an inout variable.

I did not note you said function until I read Jonathan's post.

Have you tried variable with in for a function?
I suspect you are right, but that would seem to
be a language limitation that needs to go.
Otherwise, how would I write,
is_empty

I would hate to have to test: L'length = 0 all the time.
Not too readable.

OTOH, you can make your other subprograms work by changing
their functionality to:
remove_whitespace
remove_comment

This is what the read procedures do.

Create a constant that defines your maximum length token.

variable message : string (1 to MAX_TOKEN_LEN) ;

then you can read character by character using text until it fails.
Alternately, put your variable length tokens first in the file and
use VHDL's built-in read[file, string] to read them from the file
(not a line, so don't do readline until after).

As I mentioned, I don't read files except for big data sets
which are values and not tokens, so what I describe above
for reading tokens will have the normal bugs and flaws.
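
One possible reading of the character-by-character approach, as a sketch only. The package name token_reader, the subtype token_t, the value 32 for MAX_TOKEN_LEN and the choice of space and tab as separators are all assumptions made for illustration.

```vhdl
use std.textio.all;

package token_reader is
  constant MAX_TOKEN_LEN : positive := 32;  -- assumed upper limit
  subtype token_t is string(1 to MAX_TOKEN_LEN);

  -- Removes the next blank-delimited token from L. The characters are
  -- returned in token(1 to len); len = 0 means no token was left.
  procedure read_token(variable L     : inout line;
                       variable token : out   token_t;
                       variable len   : out   natural);
end package token_reader;

package body token_reader is

  procedure read_token(variable L     : inout line;
                       variable token : out   token_t;
                       variable len   : out   natural) is
    variable c : character;
    variable n : natural := 0;
  begin
    token := (others => ' ');
    len   := 0;

    -- Skip leading blanks; the null test guards every dereference of L.
    while L /= null and L'length > 0 loop
      exit when L(L'left) /= ' ' and L(L'left) /= HT;
      read(L, c);
    end loop;

    -- Copy characters until the next blank, end of line, or a full buffer.
    while L /= null and L'length > 0 and n < MAX_TOKEN_LEN loop
      exit when L(L'left) = ' ' or L(L'left) = HT;
      read(L, c);
      n        := n + 1;
      token(n) := c;
    end loop;

    len := n;
  end procedure read_token;

end package body token_reader;
```

A token longer than MAX_TOKEN_LEN is split here: the remainder stays in the line and comes back as the next token, which may or may not be what a given command format wants.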

Cheers,
Jim
 
Jonathan Bromley

Have you tried variable with in for a function?

That would make good sense ("const ref") but can't be done;
function parameters must be of class constant. Ouch.
I suspect you are right, but that would seem to
be a language limitation that needs to go.
Agreed.

Otherwise, how would I write,
is_empty

You can't :) Not even if the function is impure. This is
something you may like to look at for VHDL++; something
halfway between a function and a procedure - allowing
side-effects and maybe inout or output arguments, but
unable to consume simulation time. Right now the
function/procedure distinction conflates these two
issues, and it's kinda inconvenient.
OTOH, you can make your other subprograms work by changing
their functionality to:
remove_whitespace
remove_comment

This is what the read procedures do.

You can do all that sort of stuff, but it remains a fact that
the idiom

target := func(args);

is exceedingly convenient and rather more readable than
the procedure-driven equivalent.

 
Jonathan Bromley

Given a line variable L, pass the string value (L.all) as a string
parameter to the function. You can also access individual characters
in L by writing subscripts or slices:
[...]

What I *forgot* to mention - and was forcibly reminded of
when I tried some of this myself a few moments ago - is
that you cannot make *any* reference through a null
pointer, so passing L.all to a function - or even enquiring
about L'length - will crash the simulator if L is null. On
the other hand, it *is* OK to put an empty string into
a line variable:

write(L, string'(""));

and you will then find that L'length=0 as expected;
L'range is "1 to 0", a null range.
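
Since a function cannot take a line parameter, one way to keep the null check in a single place is a small procedure. This is a sketch only; the package name line_guard and the procedure name line_is_empty are hypothetical.

```vhdl
use std.textio.all;

package line_guard is
  -- Sets empty to true when L is null or designates a zero-length string.
  procedure line_is_empty(variable L     : inout line;
                          variable empty : out   boolean);
end package line_guard;

package body line_guard is

  procedure line_is_empty(variable L     : inout line;
                          variable empty : out   boolean) is
  begin
    if L = null then
      empty := true;          -- never dereference a null access value
    else
      empty := L'length = 0;  -- safe: L designates a string object
    end if;
  end procedure line_is_empty;

end package body line_guard;
```

The caller then tests the boolean before touching L.all or L'length.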

I never said it was going to be easy or nice :)
 
Eli Bendersky

Create a constant that defines your maximum length token.
variable message : string (1 to MAX_TOKEN_LEN) ;

then you can read character by character using text until it fails.
Alternately, put your variable length tokens first in the file and
use VHDL's built-in read[file, string] to read them from the file
(not a line, so don't do readline until after).

As I mentioned, I don't read files except for big data sets
which are values and not tokens, so what I describe above
for reading tokens will have the normal bugs and flaws.

I don't understand. How can this help me write code like:

variable name: string; -- incorrect declaration

....
....
if (something) then
name := "jim";
else
name := "john";
end if;

assert false report "Hello " & name & ", nice to meet you !";

Is there no way to do this in VHDL at all ? I can pad "jim" with a
space to be of length 4, but that will screw up with the printout.

Tx
Eli
 
Eli Bendersky

Before I try to answer your question, it's worth noting that many
engineers have simply given up on this; instead they let the
VHDL compiler do the work for them. Your transactor provides
a bunch of useful procedures, and then you create a test case
not in plain text but in procedural VHDL, calling those procedures.
It's usually quite easy to write your set of procedures so
that a succession of calls to them looks almost like a script.
And of course you have the full power of the language for
doing conditionals, loops and what-have-you.

Anyways, after that digression:
[...]

Thanks for the detailed reply. It's a real shame what you're saying,
since I think that a text based command file is in many ways more
versatile than commands coded into VHDL. Moreover, when coupled with a
monitor that logs the outputs to another file, it allows complex
result analysis with an external tool (like Perl) that has only two
text files to work on. This would be more difficult with the
transactor commands coded into VHDL.

Eli
 
Jim Lewis

Eli
Create a constant that defines your maximum length token.

variable message : string (1 to MAX_TOKEN_LEN) ;

then you can read character by character using text until it fails.
Alternately, put your variable length tokens first in the file and
use VHDL's built-in read[file, string] to read them from the file
(not a line, so don't do readline until after).

As I mentioned, I don't read files except for big data sets
which are values and not tokens, so what I describe above
for reading tokens will have the normal bugs and flaws.

I don't understand. How can this help me write code like:

variable name: string; -- incorrect declaration

...
...
if (something) then
name := "jim";
else
name := "john";
end if;

assert false report "Hello " & name & ", nice to meet you !";

Is there no way to do this in VHDL at all ? I can pad "jim" with a
space to be of length 4, but that will screw up with the printout.

OOPs. I thought you wanted to read a token from a file
and was not paying attention when you switched the problem:

variable buf : line ;
...
if (something) then
write(buf, string'("jim")) ;
else
write(buf, string'("john"));
end if;

assert false report "Hello " & buf.all & ", nice to meet you !";
deallocate(buf) ;

I should have also looked at Jonathan's post first, as he
showed something similar. Note that write allows you
to build multiple things up into the string and that you
need to deallocate it to clear it.
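
A small sketch of building one message from several write calls and clearing the buffer first. The package name message_builder, the procedure name build_attempt_message and the message wording are hypothetical.

```vhdl
use std.textio.all;

package message_builder is
  -- Replaces the contents of buf with "attempt <attempt> of <total>".
  procedure build_attempt_message(variable buf     : inout line;
                                  constant attempt : in    natural;
                                  constant total   : in    natural);
end package message_builder;

package body message_builder is

  procedure build_attempt_message(variable buf     : inout line;
                                  constant attempt : in    natural;
                                  constant total   : in    natural) is
  begin
    deallocate(buf);                  -- clear any earlier text
    write(buf, string'("attempt "));  -- each write appends to the line
    write(buf, attempt);              -- integer overload of write
    write(buf, string'(" of "));
    write(buf, total);
  end procedure build_attempt_message;

end package body message_builder;
```

The caller can then use buf.all in a report statement and deallocate buf once the message has been printed.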

Cheers,
Jim
 
Andy

I'm with Jonathan on this one... Do your stimulus and your analysis in
vhdl. text-io is for data input/output (images, etc.) only.

Andy

Before I try to answer your question, it's worth noting that many
engineers have simply given up on this; instead they let the
VHDL compiler do the work for them. Your transactor provides
a bunch of useful procedures, and then you create a test case
not in plain text but in procedural VHDL, calling those procedures.
It's usually quite easy to write your set of procedures so
that a succession of calls to them looks almost like a script.
And of course you have the full power of the language for
doing conditionals, loops and what-have-you.
Anyways, after that digression:

[...]

Thanks for the detailed reply. It's a real shame what you're saying,
since I think that a text based command file is in many ways more
versatile than commands coded into VHDL. Moreover, when coupled with a
monitor that logs the outputs to another file, it allows complex
result analysis with an external tool (like Perl) that has only two
text files to work on. This would be more difficult with the
transactor commands coded into VHDL.

Eli
 
KJ

Eli Bendersky said:
I don't understand. How can this help me write code like:

variable name: string; -- incorrect declaration

...
...
if (something) then
name := "jim";
else
name := "john";
end if;

assert false report "Hello " & name & ", nice to meet you !";

Is there no way to do this in VHDL at all ? I can pad "jim" with a
space to be of length 4, but that will screw up with the printout.

Not that this is a great way, but it works

function foo(Something: boolean) return STRING is
begin
if Something then
return("Jim");
else
return("john");
end if;
end function foo;

....

assert false report "Hello " & foo(TRUE) & ", nice to meet you !";
assert false report "Hello " & foo(FALSE) & ", nice to meet you !";

And that produces the following results

# ** Error: Hello john, nice to meet you !
# ** Error: Hello Jim, nice to meet you !

KJ
 
Ralf Hildebrandt

Eli Bendersky schrieb:

1) It seems to be impossible to pass "line" variables into functions,
therefore making it difficult to implement such useful functions as
is_whitespace_only(line) and is_comment(line). For the meantime I made
them accept strings and read the line into a string. Isn't there a way
around this ? How do you prefer to work with "line" type vars ?

Not a complete solution for this, but maybe string processing with
Unix/C-compatible functions may be another option:
<http://bear.ces.cwru.edu/vhdl/>.

Ralf
 
KJ

Eli Bendersky said:
Hello all,

In an effort to read and process non-trivial command files for a
transactor in a testbench, I came upon a couple of questions regarding
text / string processing in VHDL:
As has already been mentioned by others on this thread, the use of text
files for detailed control of testbenches has its own problems that
basically cause you to define your own custom language so using an already
accepted standard language is probably a better choice.

At the other extreme, controlling all testbench parameters via simulator
command line arguments that pass the parameters through to the testbench
allows for much cleaner code that is fully integrated. Personally, I find
the mega-long command line strings rather ugly to edit and maintain, but
ultimately they are probably the most time-efficient when you're
actively working on the project. I tend to like to have a separate text
file that contains all of the parameters and have the testbench read it in
to define what goes on for that particular run. It's clunkier than using
command line arguments but the resulting testbench code is essentially
identical (except for the extra code needed to read in the text and set the
parameters up). When testing smaller sub-units I tend to simply code up the
possibilities right in the testbench so that it doesn't need any parameters
at all.

All of the above might be off the beaten path of what you were looking
for, and I'm not quite sure what you mean by 'non-trivial command file for a
transactor' and whether that means some sort of thing to control an
interface which consists of a whole set of signals or something completely
different. If by that you mean something like the detailed control of the
various signals required to perform a PCI bus write transaction (just as an
example), then I would say don't take the text file down to that level
at all; that protocol should be written into the testbench. At a higher
level you'd interact with this testbench code either directly or at an even
higher level via parameters you'd pass it that would ultimately come from
either command line or text file parameters....in any case, don't try to
code bus protocols (if that's what you mean by transactors) into a text
file, it's far too fragile and prone to mistakes.
2) VHDL's strong typedness makes it difficult to work with strings in
a flexible way. For instance, the following "would be" code doesn't
really work:

var msg := string;
...
...
if (something) then
msg := "hello";
else
msg := "bye";
end if;

Because "msg" can't be declared as an unconstrained string. However,
to give it length would mean to constrain all strings assignable to it
to this length.
As I posted further down the thread though, a function can return an
unconstrained string which can then be used to build up your final resulting
string.

function foo(something:boolean) return string is
begin
if (something) then
return("hello");
else
return("bye");
end if;
end function foo;

The other thing I've found useful is recursion to build up that string.
Let's say you had to create a text image of an array of some type. I would
start with a function that can create a text image of a single element of
that type, this can generally be done without having to explicitly define
any string lengths (like below)
....
return(integer'image(X.some_integer) & LF &
heximage(X.some_std_logic_vector));

Then make a function that takes as an argument an array of these things and
recursively call the above function concatenating the two strings together
using the '&' operator.
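
A sketch of that recursion, using a plain array of integers in place of the record type hinted at above. The package name array_image, the type int_array and the ", " separator are assumptions chosen to keep the example small.

```vhdl
package array_image is
  type int_array is array (natural range <>) of integer;
  -- Comma-separated text image of every element of a.
  function image(a : int_array) return string;
end package array_image;

package body array_image is

  function image(a : int_array) return string is
    -- Local copy with a known 1-based ascending range.
    constant v : int_array(1 to a'length) := a;
  begin
    if v'length = 0 then
      return "";
    elsif v'length = 1 then
      return integer'image(v(1));
    else
      -- Image of the first element, then recurse on the remainder.
      return integer'image(v(1)) & ", " & image(v(2 to v'length));
    end if;
  end function image;

end package body array_image;
```

No string length is declared anywhere; each call simply returns whatever length the concatenation produces.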

Again, I'm not quite sure what types of text functionality you're really
looking for here so the above might not be at all applicable, they are just
things that I've found useful in working with strings while trying to avoid
hard coded string lengths and the problems that they cause.

The key point to keep in mind is that a function can return an unconstrained
length string so any place where you're faced with having to return either
"This string" or "That longer string" (i.e. different length strings) you
'should' be able to embed that in a function and use the function result at
the appropriate point to build up your text string result.
In general, could you post useful code snippets for string / text
processing, and / or point to non-standard libraries you use ? I found
txt_util (http://www.stefanvhdl.com/vhdl/vhdl/txt_util.vhd) pretty
helpful for making some operations less painful.

I've used Ben Cohen's image_pkg
http://members.aol.com/vhdlcohen/vhdl/vhdlcode/image_pb.vhd which on the
surface seems to do similar things to what you use.

Though not standard, most simulators probably provide some form of C-like
string manipulation functions like strcpy, strlen, strcat, etc. The problem
though is the 'not standard' part so portability may be an issue. They can
be useful, but can frequently be avoided by the techniques I mentioned
above.

Not sure if this is of any help or not for your particular application, but
this thread was interesting reading.

Kevin Jennings
 
Eli Bendersky

I have another practical need - initialize a constant array of strings
which are of different lengths. Here, again, is a hypothetical (non
working code):

....
type string_array is array(natural range <>) of string;
....
constant names_mapping: string_array := (0 => "hi", 1 => "bye");
....

This doesn't work because I can't make a type declaration of
unconstrained string types. I must constrain it thus (for example):

type string_array is array(natural range <>) of string(1 to 5);

But then all my constant strings must be of length 5:

constant names_mapping: string_array := (0 => "hi   ", 1 => "bye  ");

Needless to say, this is ugly.

Enlightened by a new understanding of the LINE type from this thread,
I tried to make the array of lines, but that fails for various reasons
(it seems that lines can only be used as variables). Is there any way
to achieve what I want with a constant ? The following works:

function names_mapping(n: natural) return string is
begin
case n is
when 0 => return "hi";
when 1 => return "bye";
...
end case;
end ...

But it's: (1) longer to type, (2) less efficient, (3) clunkier.

Any ideas are welcome.
 
Jonathan Bromley

I have another practical need - initialize a constant array of strings
which are of different lengths. Here, again, is a hypothetical (non
working code):

...
type string_array is array(natural range <>) of string;
...
constant names_mapping: string_array := (0 => "hi", 1 => "bye");

Yes... painful. One little feature of VHDL comes to your rescue,
though: Accessing a constant array looks exactly like
calling a function. So how about this...

subtype padded_string is string(1 to 9);
type string_array is array(natural range <>) of padded_string;
constant padded_names_mapping: string_array := (
0 => "hi       ",
1 => "greetings",
...
);
function names_mapping(n: natural) return string is
begin
return string_strip_trailing_spaces(padded_names_mapping(n));
end;

and then it's only necessary to write
function string_strip_trailing_spaces(s: string) return string;
which is easy enough.
There's no particular point in parameterising the maximum string
length, since you need to write out the string literals with the
right length anyhow.
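
For reference, a minimal sketch of string_strip_trailing_spaces as declared above. The package name string_strip is hypothetical.

```vhdl
package string_strip is
  -- Returns s without its trailing space characters.
  function string_strip_trailing_spaces(s : string) return string;
end package string_strip;

package body string_strip is

  function string_strip_trailing_spaces(s : string) return string is
    alias t : string(1 to s'length) is s;  -- normalise the index range
  begin
    -- Scan backwards for the last character that is not a space.
    for i in t'length downto 1 loop
      if t(i) /= ' ' then
        return t(1 to i);
      end if;
    end loop;
    return "";  -- s was empty or consisted only of spaces
  end function string_strip_trailing_spaces;

end package body string_strip;
```

Spaces inside the text are kept, so a padded entry such as "hi there " would come back as "hi there".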


Oh, here's another idea: use an array of LINE variables, but
then write a procedure to initialise them cleanly:

type line_array_t is array(natural range <>) of line;
type line_set is access line_array_t;
variable string_mapping: line_set;

impure function get_name(i: natural) return string is
begin
return string_mapping(i).all;
end;

procedure set_name(i: natural; s: string) is
begin
... make sure string_mapping(i) exists; create it if not
... if there was anything in it, deallocate it
write(string_mapping(i), s);
end;

Now I can create my set of "constants" thus:

set_name(0, "hi");
set_name(1,"Mulligatawny");
....

and read them easily:

assert alls_well report "Error from " & get_name(3);

which is not too bad.

Oh, and you could make "string_mapping" a shared variable,
so that many processes could access it.
You can also implement associative arrays in this way.

I'm not trying to say all this is good - it isn't; as I said
in my first response, it sucks - but with a bit of ingenuity
you can usually get somewhere close to what you want,
in my experience.

HTH
 
Eli Bendersky

Yes... painful. One little feature of VHDL comes to your rescue,
though: Accessing a constant array looks exactly like
calling a function. So how about this...

subtype padded_string is string(1 to 9);
type string_array is array(natural range <>) of padded_string;
constant padded_names_mapping: string_array := (
0 => "hi       ",
1 => "greetings",
...
);
function names_mapping(n: natural) return string is
begin
return string_strip_trailing_spaces(padded_names_mapping(n));
end;

and then it's only necessary to write
function string_strip_trailing_spaces(s: string) return string;
which is easy enough.
There's no particular point in parameterising the maximum string
length, since you need to write out the string literals with the
right length anyhow.

Oh, but this is even less efficient than a function with an internal
"case" - the stripping should be done each time the function is
called.
Oh, here's another idea: use an array of LINE variables, but
then write a procedure to initialise them cleanly:

type line_array_t is array(natural range <>) of line;
type line_set is access line_array_t;
variable string_mapping: line_set;

impure function get_name(i: natural) return string is
begin
return string_mapping(i).all;
end;

procedure set_name(i: natural; s: string) is
begin
... make sure string_mapping(i) exists; create it if not
... if there was anything in it, deallocate it
write(string_mapping(i), s);
end;

Now I can create my set of "constants" thus:

set_name(0, "hi");
set_name(1,"Mulligatawny");
...

and read them easily:

assert alls_well report "Error from " & get_name(3);

which is not too bad.

Oh, and you could make "string_mapping" a shared variable,
so that many processes could access it.

Yes, this looks closer to what I was thinking of. The initialization
is only done once, but then we can get the fastest access time to the
table. I forgot about shared variables, so I thought that making it a
variable would only limit it to one process.
You can also implement associative arrays in this way.

I'm not trying to say all this is good - it isn't; as I said
in my first response, it sucks - but with a bit of ingenuity
you can usually get somewhere close to what you want,
in my experience.

I wonder if Ada has the same problems. Such painful string processing
isn't good for a modern programming language.

Thanks for the insights

Eli
 
M. Hamed

Create a set of VHDL procedures
Create your command file
Create a PERL processor to convert your command file to VHDL code
using a test bench template.
RUN the VHDL through the simulator.
 
