tiniest SQL + tiniest app-server


tnorgd

Dear Group,

I have a command-line based Java program that does a lot of stuff, and
a single run takes a couple of days. I would like to monitor its
progress in a smarter way than just reading the output text file.

I would like to make it write to a SQL database; then I could look at
charts through a web browser. I run tests on my laptop, so I would
need the smallest possible SQL database. Ideally a single jar file I
could launch in a terminal. Moreover, I need a web server, preferably
also a single jar.

I am sure there is such stuff already waiting to be used...

Kindest regards,
Dominik
 

Roedy Green

tnorgd said:
I would like to make it write to a SQL database; then I could look at
charts through a web browser. I run tests on my laptop, so I would
need the smallest possible SQL database. Ideally a single jar file I
could launch in a terminal. Moreover, I need a web server, preferably
also a single jar.

See http://mindprod.com/jgloss/sqlvendors.html and look for the
embedded engines. Derby is pretty small, and it comes with the JDK.
 

John B. Matthews

tnorgd said:
I have a command-line based Java program that does a lot of stuff, and
a single run takes a couple of days. I would like to monitor its
progress in a smarter way than just reading the output text file.

I would like to make it write to a SQL database; then I could look at
charts through a web browser. I run tests on my laptop, so I would
need the smallest possible SQL database. Ideally a single jar file I
could launch in a terminal. Moreover, I need a web server, preferably
also a single jar.

I am sure there is such stuff already waiting to be used...

I use Tomcat as a web server and servlet container. You can modify one
of the existing examples to display your text file, while you learn
about database connectivity. I like H2 Database for its small footprint;
I use JFreeChart for creating charts.
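
As a rough, untested sketch of what the logging side could look like
with embedded H2 (the table and column names are invented, and the H2
jar has to be on the classpath):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.Statement;

public class ProgressLogger {
    private final Connection conn;

    public ProgressLogger(String dbFile) throws Exception {
        // "jdbc:h2:<file>" opens (or creates) a file-based embedded database.
        conn = DriverManager.getConnection("jdbc:h2:" + dbFile);
        Statement st = conn.createStatement();
        st.execute("CREATE TABLE IF NOT EXISTS progress("
                + "ts TIMESTAMP DEFAULT CURRENT_TIMESTAMP, "
                + "step INT, detail VARCHAR(255))");
        st.close();
    }

    public void log(int step, String detail) throws Exception {
        // One row per progress event; the web side can SELECT these rows
        // and hand them to JFreeChart.
        PreparedStatement ps = conn.prepareStatement(
                "INSERT INTO progress(step, detail) VALUES(?, ?)");
        ps.setInt(1, step);
        ps.setString(2, detail);
        ps.executeUpdate();
        ps.close();
    }

    public void close() throws Exception {
        conn.close();
    }
}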
 

Tom Anderson

tnorgd said:
I have a command-line based Java program that does a lot of stuff, and
a single run takes a couple of days. I would like to monitor its
progress in a smarter way than just reading the output text file.

I would like to make it write to a SQL database; then I could look at
charts through a web browser. I run tests on my laptop, so I would
need the smallest possible SQL database. Ideally a single jar file I
could launch in a terminal.

Derby, HSQL, H2.

Derby is part of Java 1.6, so if you're using that, it will give you
the smallest code footprint. If not, I'm afraid I don't know the sizes
of the jars off the top of my head, although I remember being
surprised by how big Derby was back when it was Cloudscape. In terms
of RAM footprint, according to its own propaganda, H2 is generally a
bit smaller than Derby:

http://www.h2database.com/html/performance.html

and faster than either Derby or HSQL.
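
For what it's worth, an embedded Derby connection is just a JDBC URL;
a rough, untested sketch (the database name is invented, and it
assumes derby.jar from the JDK's db/lib directory is on the
classpath):

import java.sql.Connection;
import java.sql.DriverManager;

public class DerbyDemo {
    public static void main(String[] args) throws Exception {
        // ";create=true" makes Derby create the progressdb directory on
        // first use; later runs with the same URL simply reopen it.
        Connection conn = DriverManager.getConnection(
                "jdbc:derby:progressdb;create=true");
        System.out.println("Connected to "
                + conn.getMetaData().getDatabaseProductName());
        conn.close();
    }
}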

However, I would seriously question your need for a relational
database. They're the default choice for data storage these days, but
for no good reason. You only actually need one when (a) you have
strong requirements about data integrity, concurrency,
transactionality, etc, *and* (b) you need a highly general query
interface. For a situation where one process appends records to a log
and another reads from it, you don't.

Personally, I'd stick with a text file (which could mean CSV or XML),
or possibly a directory full of text files. Less disk footprint, less
memory and processor overhead, and honestly no harder to write. You
can still write a webapp to browse the data, but back it with the
filesystem. Plus, you can look at the data with a text editor, grep,
etc, which eases development and gives you flexibility.
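
A minimal, untested sketch of that kind of logging (the file name and
record layout are invented): each progress event becomes one
comma-separated line appended to a plain text file.

import java.io.FileWriter;
import java.io.IOException;
import java.io.PrintWriter;

public class CsvLog {
    // Appends one "timestamp,step,detail" line; opening the file in append
    // mode means the monitoring side can read it at any point during the run.
    public static void append(String file, int step, String detail)
            throws IOException {
        PrintWriter out = new PrintWriter(new FileWriter(file, true));
        try {
            out.println(System.currentTimeMillis() + "," + step + "," + detail);
        } finally {
            out.close();
        }
    }
}
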
tnorgd said:
Moreover, I need a web server, preferably also a single jar.

The two best-known web servers in the Java world are Tomcat and
Jetty. My impression is that Jetty is smaller.

tom
 

Arne Vajhøj

tnorgd said:
I have a command-line based Java program that does a lot of stuff, and
a single run takes a couple of days. I would like to monitor its
progress in a smarter way than just reading the output text file.

I would like to make it write to a SQL database; then I could look at
charts through a web browser. I run tests on my laptop, so I would
need the smallest possible SQL database. Ideally a single jar file I
could launch in a terminal. Moreover, I need a web server, preferably
also a single jar.

I am sure there is such stuff already waiting to be used...

There are plenty of embedded databases to pick from:
Cloudscape/Derby/JavaDB, Hypersonic SQL/HSQLDB/H2,
BDB Java Edition, McKoi, etc.

An HTTP server is a different animal, though. In general I am
rather happy with Tomcat, but I think that Jetty may be
better for embedded purposes.

http://wiki.eclipse.org/Jetty/Tutorial/Embedding_Jetty

documents how easy it is.

Attached below is my own small demo example. It does not
get much simpler than that.

Arne

==================================

import java.io.IOException;
import java.io.PrintWriter;
import java.util.HashMap;
import java.util.Map;

import javax.servlet.ServletException;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import org.eclipse.jetty.server.Request;
import org.eclipse.jetty.server.Server;
import org.eclipse.jetty.server.handler.AbstractHandler;

public class WebServer extends AbstractHandler {
    // Hit count per request path (a plain HashMap is fine for a demo).
    private Map<String, Integer> counter = new HashMap<String, Integer>();
    public void handle(String target, Request req,
                       HttpServletRequest httpreq,
                       HttpServletResponse httpresp)
            throws IOException, ServletException {
        int n;
        String path = httpreq.getPathInfo();
        if (counter.containsKey(path)) {
            n = counter.get(path) + 1;
        } else {
            n = 1;
        }
        httpresp.setContentType("text/html");
        httpresp.setStatus(HttpServletResponse.SC_OK);
        PrintWriter pw = httpresp.getWriter();
        pw.println("<html>");
        pw.println("<head>");
        pw.println("<title>Hit counter</title>");
        pw.println("</head>");
        pw.println("<body>");
        pw.println("<h1>Hit counter</h1>");
        pw.println("<p>" + path + " has " + n + " hits</p>");
        pw.println("</body>");
        pw.println("</html>");
        pw.flush();
        counter.put(path, n);
        // Tell Jetty this request has been handled, so no other handler runs.
        req.setHandled(true);
    }
    public static void main(String[] args) throws Exception {
        // Embedded Jetty: listen on port 8080 and send everything to this handler.
        Server server = new Server(8080);
        server.setHandler(new WebServer());
        server.start();
        server.join();
    }
}
 

Arne Vajhøj

Tom Anderson said:
However, I would seriously question your need for a relational
database. They're the default choice for data storage these days, but
for no good reason. You only actually need one when (a) you have
strong requirements about data integrity, concurrency,
transactionality, etc, *and* (b) you need a highly general query
interface. For a situation where one process appends records to a log
and another reads from it, you don't.

Personally, I'd stick with a text file (which could mean CSV or XML),
or possibly a directory full of text files. Less disk footprint, less
memory and processor overhead, and honestly no harder to write. You
can still write a webapp to browse the data, but back it with the
filesystem. Plus, you can look at the data with a text editor, grep,
etc, which eases development and gives you flexibility.

Even if a database is not needed, I would still suggest one unless
there are good reasons not to use one.

A database is a lot more extensible if more features are needed
later, *and* it really does not matter much. Or to put it another
way: if the CPU/memory/IO overhead of using a database is too high,
then flat files are out of the question.

Arne
 

Tom Anderson

Arne Vajhøj said:
Even if a database is not needed, I would still suggest one unless
there are good reasons not to use one.

A database is a lot more extensible if more features are needed
later, *and* it really does not matter much.

I wouldn't give up the easy manual inspection, greppability, lack of
dependencies, etc that I get from flat files unless I had a concrete
reason to do so.

Arne Vajhøj said:
Or to put it another way: if the CPU/memory/IO overhead of using a
database is too high, then flat files are out of the question.

Huh? You think a database is *faster* than flat files?

tom
 

Arne Vajhøj

Tom Anderson said:
I wouldn't give up the easy manual inspection, greppability, lack of
dependencies, etc that I get from flat files unless I had a concrete
reason to do so.

I would say that SQL gives better query capabilities than grep.

Tom Anderson said:
Huh? You think a database is *faster* than flat files?

No. That was not what I wrote.

I am saying that if:

load_database(number_records) - load_flatfiles(number_records)

is big enough to affect total application performance, then
number_records is so big that you definitely want to go with a
database for manageability reasons.

Arne
 

Arne Vajhøj

It can be, in a real world example, where the amount of data is not tiny.

Opening and closing files are rather expensive operations, so
accessing many flat files can and will perform worse than a database.

For a database to be faster than a single file, it needs to be pretty
big to benefit from the fact that the database can write to multiple
disks in parallel.

Arne
 

Lew

Arne said:
Opening and closing files are rather expensive operations, so
accessing many flat files can and will perform worse than a database.

For a database to be faster than a single file, it needs to be pretty
big to benefit from the fact that the database can write to multiple
disks in parallel.

It depends on the usage patterns. RDBMSes store data in a structured, indexed
fashion. Individual items can be fetched somewhat directly without scanning
every byte of data. Established products have engineered in tremendous
amounts of optimization and tunability. For raw single-scan access to an
entire file, flat files will win of course. For structured, query-type
access, even just for simple associative mapping, database systems start to
become faster than flat-file scan-and-match might be.

As the usage gets more complex, databases win even more, and that's
just in the performance dimension. There are storage mechanisms more
efficient for known queries than relational tables, but few that
maintain good performance together with flexible ad hoc access.
Programmer productivity is important; I'd hate to roll my own
flat-file equivalent of an INNER JOIN. Data integrity is paramount:
structured data storage and associated logging mechanisms allow much
higher reliability than typical flat-file architectures.

Data integrity affects throughput, if you amortize the downtime for repairing
corrupt datastores over the operational time. Time spent doing nothing at all
really kills your average throughput.

Yes, the question on the table is performance, and databases hold their own in
that arena under the sort of use they get. Performance isn't really how many
megabytes you can push per second, but how quickly a result set can become
useful in your code, and databases have it all over flat files for that. But
remember that other factors matter - getting wrong answers twice as fast helps
no one.
 

Martin Gregorie

Lew said:
It depends on the usage patterns. RDBMSes store data in a structured,
indexed fashion. Individual items can be fetched somewhat directly
without scanning every byte of data. Established products have
engineered in tremendous amounts of optimization and tunability. For
raw single-scan access to an entire file, flat files will win of course.
For structured, query-type access, even just for simple associative
mapping, database systems start to become faster than flat-file
scan-and-match might be.

It also depends on the hit rate on the data set during a program run.
Remembering back to my mainframe days when both machines and disks were
slow and memory was too expensive to allow more caching than simple
double-buffering plus one or two index block buffers, we used a rule of
thumb something like this:

Hit rate   Request ordering   Access method
========   ================   ======================
<5%        n/a                random access
5-20%      sorted             ISAM with keyed access
20-100%    sorted             serial processing

By 'hit rate' I mean the percentage of records in the file that will be
accessed and/or updated during the run. There's also an assumption that
the file is too big to be held entirely in main memory.

Faster disks with good caching in the drive and controllers tilt the
balance toward random or keyed access. More caching in main memory or
(much) larger storage space per read/write head tilts the balance toward
serial access. If the file can be read entirely into memory from a fast
filing system then serial access is the hands-down winner regardless of
the hit rate.

A decent RDBMS will work out the above for itself and select the
access strategy along similar lines, though this isn't obvious to the
many programmers (and some DBAs) who never ask the RDBMS to explain
its query strategy.
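
As a side note, the exact syntax for that is product-specific; in H2,
for instance, prefixing a query with EXPLAIN returns the chosen plan.
A rough, untested sketch (the database and table names are just
illustrative):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;

public class ShowPlan {
    public static void main(String[] args) throws Exception {
        Connection conn = DriverManager.getConnection("jdbc:h2:./progress");
        // EXPLAIN makes H2 return its query plan as text instead of the rows.
        ResultSet rs = conn.createStatement().executeQuery(
                "EXPLAIN SELECT * FROM progress WHERE step > 100");
        while (rs.next()) {
            System.out.println(rs.getString(1));
        }
        conn.close();
    }
}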
 

steph

tnorgd said:
Dear Group,

I have a command-line based Java program that does a lot of stuff, and
a single run takes a couple of days. I would like to monitor its
progress in a smarter way than just reading the output text file.

I would like to make it write to a SQL database; then I could look at
charts through a web browser. I run tests on my laptop, so I would
need the smallest possible SQL database. Ideally a single jar file I
could launch in a terminal. Moreover, I need a web server, preferably
also a single jar.

I am sure there is such stuff already waiting to be used...

Kindest regards,
Dominik

It looks like an interesting case for using JMX.
By developing a few MBeans (managed beans) you will be able to see
attributes (counters, etc.) and, if inspired, modify behavior (number
of workers, etc.).
This is now native in the JRE: with the proper JVM arguments and the
JConsole program, there is no need for a web server, a DB server, or
big developments.

See http://java.sun.com/javase/technologies/core/mntr-mgmt/javamanagement/

Tell me if it can fit.
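
A minimal, untested sketch of that route (all names invented): expose
the run's progress as a standard MBean, register it with the platform
MBean server built into the JRE, and watch the attribute from
JConsole.

import java.lang.management.ManagementFactory;
import javax.management.ObjectName;

// Standard MBean: the interface must be named <class name> + "MBean".
interface ProgressMBean {
    int getCompletedSteps();
}

public class Progress implements ProgressMBean {
    private volatile int completedSteps;

    public int getCompletedSteps() {
        return completedSteps;
    }

    void stepDone() {
        completedSteps++;
    }

    public static void main(String[] args) throws Exception {
        Progress p = new Progress();
        ManagementFactory.getPlatformMBeanServer()
                .registerMBean(p, new ObjectName("demo:type=Progress"));
        // Stand-in for the long-running job; attach JConsole to this
        // process and browse to demo:type=Progress to watch the count grow.
        while (true) {
            Thread.sleep(1000);
            p.stepDone();
        }
    }
}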
 
