Java and awk (jawk)

Sascha Effert

Hello,

I am implementing a key/value store, which maps Strings to Strings.
This store needs some kind of filters. These filters have to support
regular expressions, but they also have to be able to interpret the
key as a number (if it is one) to get ranges.

I would like to support awk as the filter language, as it can do anything I
want. I know there is jawk[1], a Java implementation of awk. My
problem is that the homepage only describes how to use it from the
command line, not how to use it from Java code. I would like to use the
library, give it the script and the input as a String or an
InputStream, and get the output as an OutputStream or a
String. Has anybody here used jawk who can give me a hint on how to do
it?

[1] http://jawk.sourceforge.net/

bests

Sascha Effert
 
Lew

Sascha said:
I am implementing a key/value store, which maps Strings to Strings.
This store needs some kind of filters. These filters have to support
regular expressions, but they also have to be able to interpret the
key as a number (if it is one) to get ranges.

I would like to support awk as the filter language, as it can do anything I
want.

So can Java.

Sascha said:
I know there is jawk[1], a Java implementation of awk. My
problem is that the homepage only describes how to use it from the
command line, not how to use it from Java code. I would like to use the
library, give it the script and the input as a String or an
InputStream, and get the output as an OutputStream or a
String. Has anybody here used jawk who can give me a hint on how to do
it?

What would the filter do, exactly, and why is Java not good for that?
 
Tom Anderson

Sascha Effert said:
I am implementing a key/value store, which maps Strings to Strings. This
store needs some kind of filters. These filters have to support regular
expressions, but they also have to be able to interpret the key as a
number (if it is one) to get ranges.

I would like to support awk as the filter language, as it can do anything I
want. I know there is jawk[1], a Java implementation of awk. My
problem is that the homepage only describes how to use it from the
command line, not how to use it from Java code. I would like to use the
library, give it the script and the input as a String or an InputStream,
and get the output as an OutputStream or a String. Has anybody here
used jawk who can give me a hint on how to do it?

Not me. But you can get the source, so get it, and figure out how the main
class does it, then do that yourself.

I've had a quick look, and it doesn't look easy: the
central interpreter class is org.jawk.backend.AVM, and its only real
entrypoint is a method:

public int interpret(AwkTuples tuples)

Which has no obvious way to return a value - it's the moral equivalent of
'main'. If you delve a bit deeper, you find things like the PRINT opcode
being hardwired to System.out; this is clearly not code that was designed
for embedding.

I'm not saying you can't do it, just that it's going to require some major
hackery to wrap it in a facade that will let you use it as a filter.
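
One blunt way to start such a facade, given that output goes to System.out, is to swap the standard output stream around the call. This is only a minimal sketch under a loud assumption: that the interpreter looks up System.out each time it prints rather than caching it. The Runnable stands in for whatever jawk entry point ends up being called (not shown, since I have not verified it), and the StdoutCapture class and its capture method are made-up names.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;

final class StdoutCapture {
    // Runs the task with System.out redirected into a buffer and returns what it printed.
    // System.out is global to the JVM, so calls are serialized on the class lock;
    // anything other threads print during the call is captured as well.
    static synchronized String capture(Runnable task) {
        PrintStream original = System.out;
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        PrintStream redirected;
        try {
            redirected = new PrintStream(buffer, true, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
        System.setOut(redirected);
        try {
            task.run(); // stand-in for the (unverified) call into the interpreter
        } finally {
            System.setOut(original); // always restore the real stream
            redirected.flush();
        }
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String output = capture(() -> System.out.print("key1 value1"));
        System.out.println("captured: " + output);
    }
}
```

The same trick with System.setIn would feed the input. It works for a single-threaded filter call, but it is a workaround rather than a real embedding API.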

I strongly suspect that writing something far simpler from scratch (i.e. a
language which can express regular expressions and integer ranges) will be
an easier way of reaching the goal that is important to you.

tom
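
To give an idea of how small such a thing can be in plain Java, here is a minimal sketch of a filter that is either a regular expression or an inclusive integer range over the key. The KeyFilter class and its regex, range and select methods are made-up names for illustration; nothing here comes from jawk, and a real filter language would still need a parser on top.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Predicate;
import java.util.regex.Pattern;

final class KeyFilter {
    // Matches keys that the regular expression matches in full.
    static Predicate<String> regex(String expression) {
        Pattern pattern = Pattern.compile(expression);
        return key -> pattern.matcher(key).matches();
    }

    // Matches keys that parse as a whole number within [low, high];
    // keys that are not numbers never match.
    static Predicate<String> range(long low, long high) {
        return key -> {
            try {
                long value = Long.parseLong(key);
                return low <= value && value <= high;
            } catch (NumberFormatException e) {
                return false;
            }
        };
    }

    // Returns the entries whose key passes the filter, in the store's iteration order.
    static Map<String, String> select(Map<String, String> store, Predicate<String> filter) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : store.entrySet()) {
            if (filter.test(entry.getKey())) {
                result.put(entry.getKey(), entry.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> store = new LinkedHashMap<>();
        store.put("7", "seven");
        store.put("42", "forty-two");
        store.put("user-1", "alice");
        System.out.println(select(store, range(10, 100)));
        System.out.println(select(store, regex("user-.*").or(range(0, 9))));
    }
}
```

Because the filters are ordinary Predicate values, they can be combined with and, or and negate without any extra code.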
 
John B. Matthews

Sascha Effert said:
I am implementing a key/value store, which maps Strings to Strings.
This store needs some kind of filters. These filters have to support
regular expressions, but they also have to be able to interpret the
key as a number (if it is one) to get ranges.

I would like to support awk as the filter language, as it can do anything
I want. I know there is jawk[1], a Java implementation of awk. My
problem is that the homepage only describes how to use it from the
command line, not how to use it from Java code. I would like to use
the library, give it the script and the input as a String or an
InputStream, and get the output as an OutputStream or a
String. Has anybody here used jawk who can give me a hint on how to
do it?

[1] http://jawk.sourceforge.net/

I have to agree with Tom: the home page says "Jawk can be invoked via
the JSR 223 scripting API (J2SE 6)." Awkwardly, I don't see the manifest
entries required by the JAR Service Provider specification:

<http://java.sun.com/developer/technicalArticles/J2SE/Desktop/scripting/>
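
For what it's worth, JSR 223 engines are discovered through a provider-configuration file named META-INF/services/javax.script.ScriptEngineFactory inside the jar, and a quick way to see what a given classpath actually offers is to ask ScriptEngineManager. A minimal sketch follows; the engine name "jawk" is an assumption about what such an engine would register under, and the ListScriptEngines class and hasEngine method are made-up names.

```java
import javax.script.ScriptEngine;
import javax.script.ScriptEngineFactory;
import javax.script.ScriptEngineManager;

final class ListScriptEngines {
    // True if some engine visible to the manager registers the given short name.
    static boolean hasEngine(String name) {
        ScriptEngine engine = new ScriptEngineManager().getEngineByName(name);
        return engine != null;
    }

    public static void main(String[] args) {
        // Print every engine factory the service lookup found on the classpath.
        for (ScriptEngineFactory factory : new ScriptEngineManager().getEngineFactories()) {
            System.out.println(factory.getEngineName() + " " + factory.getNames());
        }
        // "jawk" is a guessed name; check the printed list for the real one, if any.
        System.out.println("jawk available: " + hasEngine("jawk"));
    }
}
```

If the jar is on the classpath and nothing awk-like shows up in the list, the engine is not registered and cannot be reached through this API as shipped.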

I have to agree with Lew: You haven't mentioned anything that can't be
done with Java.
 
Daniel Pitts

Sascha Effert said:
Hello,

I am implementing a key/value store, which maps Strings to Strings.
This store needs some kind of filters. These filters have to support
regular expressions, but they also have to be able to interpret the
key as a number (if it is one) to get ranges.

Lucene handles this by converting numbers into zero-padded strings.
That way, the strings' lexical order is isomorphic to the numeric order.

Which raises the question: why are you implementing this yourself rather
than using an existing library such as Lucene?
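
The zero-padding idea can be sketched in a few lines of plain Java. This is not Lucene's API, just the general technique: the fixed width of 10 digits, the ZeroPaddedKeys class and its pad and range methods are assumptions made for illustration, and negative numbers are left out because they would need a different encoding.

```java
import java.util.Locale;
import java.util.NavigableMap;
import java.util.TreeMap;

final class ZeroPaddedKeys {
    // Assumed fixed width; it must cover the largest key you expect,
    // otherwise longer numbers sort in the wrong place.
    static final int WIDTH = 10;

    // Encodes a non-negative number so that String order equals numeric order.
    static String pad(long number) {
        if (number < 0) {
            throw new IllegalArgumentException("negative keys need a different encoding");
        }
        return String.format(Locale.ROOT, "%0" + WIDTH + "d", number);
    }

    // Returns the entries whose padded key lies in [low, high].
    static NavigableMap<String, String> range(TreeMap<String, String> store, long low, long high) {
        return store.subMap(pad(low), true, pad(high), true);
    }

    public static void main(String[] args) {
        TreeMap<String, String> store = new TreeMap<>();
        store.put(pad(7), "seven");
        store.put(pad(42), "forty-two");
        store.put(pad(1000), "one thousand");
        System.out.println(range(store, 10, 999).values());
    }
}
```

With unpadded keys, "9" sorts after "10" as a String; after padding, a sorted map can answer a numeric range query with an ordinary sub-map lookup.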
 
Sascha Effert

So my problem is that I have to build an API which gives the user of
the API the ability to select entries with some kind of language. It does
not matter to me whether the language is awk or just regular
expressions; it just has to cover all my use cases. I don't think
Lucene gives me such a language, so I think I will have to
implement my own language, one that knows regular expressions and can
handle int values. Or is there perhaps a way to handle the int ranges
using only regular expressions?

bests

Sascha Effert
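
For reference, a regular expression can describe an integer range, but only by spelling out the digit patterns, and the expression has to be rewritten for every new pair of bounds. A small sketch for the range 10 to 25, set against parsing the key and comparing numbers; the RangeByRegex class and its method names are made up for illustration.

```java
import java.util.regex.Pattern;

final class RangeByRegex {
    // 10..19 is "1[0-9]" and 20..25 is "2[0-5]"; matches() requires the whole key to match.
    static final Pattern TEN_TO_TWENTY_FIVE = Pattern.compile("1[0-9]|2[0-5]");

    static boolean inRangeByRegex(String key) {
        return TEN_TO_TWENTY_FIVE.matcher(key).matches();
    }

    // The same check done numerically; works unchanged for any bounds.
    static boolean inRangeByParsing(String key, int low, int high) {
        try {
            int value = Integer.parseInt(key);
            return low <= value && value <= high;
        } catch (NumberFormatException e) {
            return false; // not a number, so not in any range
        }
    }

    public static void main(String[] args) {
        for (String key : new String[] {"9", "10", "25", "26", "abc"}) {
            System.out.println(key + ": regex=" + inRangeByRegex(key)
                    + " parsed=" + inRangeByParsing(key, 10, 25));
        }
    }
}
```

The two checks agree on plainly written numbers but differ on keys such as "010", which parses to 10 yet does not match the pattern. For arbitrary bounds chosen by the user, parsing and comparing is much easier to get right.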
 
Lew

Sascha said:
So my problem is that I have to build an API which gives the user of
the API the ability to select entries with some kind of language. It does
not matter to me whether the language is awk or just regular
expressions; it just has to cover all my use cases. I don't think
Lucene gives me such a language, so I think I will have to
implement my own language, one that knows regular expressions and can
handle int values. Or is there perhaps a way to handle the int ranges
using only regular expressions?

The question on the table that you have not answered is why Java is not suitable.
 
Simon Brooke

Lew said:
The question on the table that you have not answered is why Java is not
suitable.

Go easy on the guy. He's posting from Germany but he writes English as
though his first language is French - in any case, his first language is
not English!

So, Sascha, first, as everyone is saying, there's good regexp handling in
Java and everything else you need for general purpose programming, so an
idiomatic Java solution is probably the best answer.

Converting jawk to play nicely as an embedded filter would be a major
exercise - but it would be a good learning exercise, and, if you
contributed your solution back, a good thing for the community. You could
do this. However, unless you have a lot of spare time, I would not
recommend it.
 
Lew

Simon said:
Go easy on the guy. He's posting from Germany but he writes English as
though his first language is French - in any case, his first language is
not English!

How is that relevant? No one is faulting his English.

Simon said:
So, Sascha, first, as everyone is saying, there's good regexp handling in
Java and everything else you need for general purpose programming, so an
idiomatic Java solution is probably the best answer.

Go easy on the guy. His first language is not English!
 
steph

Sascha said:
So my problem is that I have to build an API which gives the user of
the API the ability to select entries with some kind of language. It does
not matter to me whether the language is awk or just regular
expressions; it just has to cover all my use cases. I don't think
Lucene gives me such a language, so I think I will have to
implement my own language, one that knows regular expressions and can
handle int values. Or is there perhaps a way to handle the int ranges
using only regular expressions?

bests

Sascha Effert

Awk and regular expressions are not two alternative ways to filter: awk
uses regular expressions to select lines and then performs actions on
them. Awk is a programming language.

If you want your users to be able to select keys by a pattern, you should
use java.util.regex.Pattern, which implements a very common form of
regular expression similar to that used by Perl.
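
A minimal sketch of that, assuming the store is (or can be viewed as) a Map<String, String>; the PatternKeyFilter class and its matchingKeys method are made-up names for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Pattern;

final class PatternKeyFilter {
    // Returns the keys that the user-supplied expression matches in full.
    // Pattern.compile throws PatternSyntaxException if the expression is invalid.
    static List<String> matchingKeys(Map<String, String> store, String userExpression) {
        Pattern pattern = Pattern.compile(userExpression);
        List<String> keys = new ArrayList<>();
        for (String key : store.keySet()) {
            if (pattern.matcher(key).matches()) {
                keys.add(key);
            }
        }
        return keys;
    }

    public static void main(String[] args) {
        Map<String, String> store = new TreeMap<>();
        store.put("user-1", "alice");
        store.put("user-22", "bob");
        store.put("group-1", "admins");
        System.out.println(matchingKeys(store, "user-[0-9]+"));
    }
}
```

matches() requires the whole key to match; use find() instead if a match anywhere inside the key should count.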
 
