Help with my first project at my first job: how to read a strange file

matt

Java guys:
This is my first project at my first job, so please help if you can.
I am working with a text file in a strange format. The file has a lot
of whitespace between the last line of valid text and the end of the
file, and it is updated frequently: new valid text is appended after
the existing valid text and overwrites some of the whitespace.
I need to feed this file as input to an application, but the
application only accepts files without such whitespace. The application
needs to read the file frequently to see whether new text has been
appended and, if so, get the appended text.
My initial solution is to convert the original file into a new file in
which the whitespace is truncated, so that the application can read the
new file.

Possibility 1:
loop
    read one line of text from the original file
    write this line to the new file
until the long line of whitespace is read
close both files

In this case, which class and method should I use, especially for
detecting the whitespace?

Possibility 2:
The previous approach is not smart because the same text is read and
written every time the file is read. Is there a way to check each time
whether the file has been updated and then write only the update to
the new file? In C, for example, a file pointer knows the position of
the last read. Can I do the same in Java or C#? Or are there other
ways to do it?

Possibility 3:
Very unlikely but smarter: read the file as a stream, truncate the
whitespace inside the stream, then feed the stream directly into the
application. It is unlikely because I cannot change the source code of
the application.

Are there any other possibilities for solving this problem?
For all the possibilities, please tell me which class and method I
should use. Sample code and websites would be extremely helpful.
Thanks a lot!
 
Paul Lutus

matt said:
Java guys:
This is my first project at my first job, so please help if you can.
I am working with a text file in a strange format. The file has a lot
of whitespace between the last line of valid text and the end of the
file, and it is updated frequently: new valid text is appended after
the existing valid text and overwrites some of the whitespace.
I need to feed this file as input to an application, but the
application only accepts files without such whitespace. The application
needs to read the file frequently to see whether new text has been
appended and, if so, get the appended text.

First things first. Explain exactly what the file consists of. Do not
describe it, show it.

Second, for what is probably true about your file, use String.trim() on the
text lines that you read from the file.
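
As a rough illustration of the String.trim() suggestion (a sketch, not code posted in the thread): trim() strips whitespace from both ends of a string, so a line that consists only of padding becomes empty. The class name TrimDemo, the helper isPadding, and the sample strings are made up for this example.

```java
class TrimDemo {

    // True when a line holds nothing but padding (spaces, tabs, CR and similar).
    static boolean isPadding(String line) {
        return line.trim().isEmpty();
    }

    public static void main(String[] args) {
        String logLine = "2004-10-21 13:55:50 10.10.1.6 - 10.10.1.3 80 GET / - 403     ";
        String padding = "                    ";

        // The brackets make it visible that the trailing spaces are gone.
        System.out.println("[" + logLine.trim() + "]");
        System.out.println(isPadding(logLine));   // prints false
        System.out.println(isPadding(padding));   // prints true
    }
}
```

Keep in mind that trim() also removes leading whitespace, which matters if the application cares about spaces at the start of a line.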
 
Alex Kizub

You didn't mention the OS. This could probably be solved with OS tools alone.
On UNIX-like systems, for example, with grep, tail -f, awk... combined with | and >.

Java has other features, but since you can't change the application and can
only change the file (which is not a good solution in itself), here are some
suggestions.

Use java.io.RandomAccessFile.
With the seek method you can set the position you have already reached, and
with the length() method you can get the length of the newly opened file.
Then, I suggest, read the file with read(byte[] b), copy the non-whitespace
bytes to another array, and write it to the new file.
Pretty easy.

BTW, with java.io.File you can get the last modification time and
decide whether you need to reread the file.

Good luck in your new job.
Alex Kizub.
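
A minimal sketch of the RandomAccessFile approach described above, under the assumption that the padding is made of plain space bytes (0x20) and that the unread part of the file fits in memory. Instead of dropping every whitespace byte, it only cuts off the trailing run of spaces, so spaces and line breaks inside the valid text survive. PaddedFileReader, validLength, and readFrom are hypothetical names.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.Arrays;

class PaddedFileReader {

    // Returns how many of the first len bytes are real data, i.e. the index
    // just past the last byte that is not a space (0x20).
    static int validLength(byte[] buf, int len) {
        int end = len;
        while (end > 0 && buf[end - 1] == ' ') {
            end--;
        }
        return end;
    }

    // Reads everything from offset to the end of the file and drops the
    // trailing padding. Assumes the remaining part is small enough for one array.
    static byte[] readFrom(File file, long offset) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            long remaining = raf.length() - offset;
            if (remaining <= 0) {
                return new byte[0];
            }
            byte[] buf = new byte[(int) remaining];
            raf.seek(offset);       // jump to the position reached last time
            raf.readFully(buf);
            return Arrays.copyOf(buf, validLength(buf, buf.length));
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java PaddedFileReader <file>");
            return;
        }
        byte[] data = readFrom(new File(args[0]), 0L);
        System.out.println(data.length + " bytes of valid data");
    }
}
```

The offset passed to readFrom is an ordinary long, so it can be kept in a variable between calls even though a new RandomAccessFile object is created each time.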
 
Alex Hunsley

Paul Lutus said:
First things first. Explain exactly what the file consists of. Do not
describe it, show it.

OP: To add to that: as well as pasting the text of the file (or a
pertinent example), a hexdump to show the actual bytes would be useful.
(Or else we don't know what the whitespace we're seeing actually is -
tabs, spaces, weird chars etc.)

alex
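
For anyone unsure how to produce a hexdump, here is one possible way to do it in Java itself. This is only a sketch: the class name HexDump and the layout (offset, sixteen hex bytes, printable ASCII) are choices made for this example, and it reads the whole file into memory, so it suits small samples rather than huge logs.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

class HexDump {

    // Formats data as rows of 16 bytes: offset, hex values, printable ASCII.
    static String dump(byte[] data) {
        StringBuilder out = new StringBuilder();
        for (int row = 0; row < data.length; row += 16) {
            out.append(String.format("%08x ", row));
            StringBuilder text = new StringBuilder();
            for (int i = row; i < row + 16; i++) {
                if (i < data.length) {
                    int b = data[i] & 0xff;
                    out.append(String.format(" %02x", b));
                    // Show printable ASCII as-is and everything else as a dot.
                    text.append(b >= 0x20 && b < 0x7f ? (char) b : '.');
                } else {
                    out.append("   ");   // keep the ASCII column aligned
                }
            }
            out.append("  |").append(text).append("|\n");
        }
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java HexDump <file>");
            return;
        }
        System.out.print(dump(Files.readAllBytes(Paths.get(args[0]))));
    }
}
```

In the output, a space shows up as 20, a tab as 09, and a Windows line ending as 0d 0a, which answers the question of what the padding really is.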
 
matt

here is a sample of the file.

2004-10-21 12:51:06 10.10.1.6 - 10.10.1.3 80 GET /test.html:1256 - 404 Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.0)
2004-10-21 12:51:10 10.10.1.6 - 10.10.1.3 80 GET /test.html:12534 - 404 Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.0)
2004-10-21 13:55:50 10.10.1.6 - 10.10.1.3 80 GET / - 403 Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.0)
2004-10-21 13:55:53 10.10.1.6 - 10.10.1.3 80 GET /aseere - 404 Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.0)
2004-10-21 13:55:57 10.10.1.6 - 10.10.1.3 80 GET /aseegegeg - 404 Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.0)
2004-10-21 13:56:03 10.10.1.6 - 10.10.1.3 80 GET /aseegegggeegergeg - 404 Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.0)
2004-10-21 13:56:07 10.10.1.6 - 10.10.1.3 80 GET /asegegeegegegegggeegergeg - 404 Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.0)
2004-10-21 13:59:08 10.10.1.6 - 10.10.1.3 80 GET /grgejihghewgho - 404 Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.0)
2004-10-21 13:59:15 10.10.1.6 - 10.10.1.3 80 GET /grgejihghewghos - 404 Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.0)
2004-10-21 13:59:18 10.10.1.6 - 10.10.1.3 80 GET /grgejihghewghossaf - 404 Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.0)
2004-10-21 14:03:05 10.10.1.6 - 10.10.1.3 80 GET /12 - 404 Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.0)
2004-10-21 14:03:09 10.10.1.6 - 10.10.1.3 80 GET /1234 - 404 Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.0)
2004-10-21 14:09:24 10.10.1.6 - 10.10.1.3 80 GET /1234gt - 404 Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.0)
2004-10-21 14:21:17 10.10.1.6 - 10.10.1.3 80 GET /1234gtffffff - 404 Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.0)


This is an IIS log file. I am not sure how to make a hexdump, but the
whitespace consists of space characters. Thanks a lot.
 
matt

The OS is mainly Windows. I doubt Windows has a tool powerful enough to
do the job automatically.
I need to open and close the file frequently because another process
is writing to it, so Java creates a new object each time I open the
file. In that case, can the program remember the position I set last
time?
If yes, can I do the same thing with BufferedReader instead of
RandomAccessFile? I am not sure whether RandomAccessFile easily lets
me keep the newline characters and the whitespace within the valid
text; I need these characters in the application. length() does not
help much, since the file size does not change when new text is
written into the file: the new text overwrites the whitespace but does
not change the file size.
Thanks again!
 
Paul Lutus

matt said:
here is a sample of the file.

/ ... snip unformatted text sample
This is an IIS log file. I am not sure how to make a hexdump, but the
whitespace consists of space characters.

1. If you want to become successful in computer programming, you need to
adopt more strict thinking and reporting standards. For example, when you
write sentences, pretend that your prose will be compiled by an "English"
compiler, one that, just like a Java compiler, requires you to put in all
the words you believe to be unimportant.

2. To solve this specific problem, even to describe the problem, you have to
tighten up your thinking and writing.

3. Tell us exactly what you plan to do with the file, and exactly what is in
it. The above pasted text leaves out some crucial information, because it
is unformatted text.

On carefully reading your posts, I must ask how much experience you have
writing Java programs. Have you considered reading the file's lines and
trimming whitespace from the lines before further processing? If you did
this, the original problem you state would most likely be solved (assuming
your description is correct).

You may be able to simply read input lines, trim whitespace using
String.trim(), and emit the same lines, like this:

java MyProgramName < inputfile | destination_application

5. Where is your code? I ask because we do not ordinarily write code to
solve problems posted by programmers who do not post their own code, and we
also do not do this, on principle, for students who do not post code.
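
The pipeline mentioned above assumes a small filter program that reads standard input and writes standard output. A sketch of what such a filter might look like follows; TrimFilter and copyTrimmed are hypothetical names, and dropping lines that are blank after trimming is an assumption about what the destination application wants.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.Reader;
import java.io.Writer;

class TrimFilter {

    // Copies every line from in to out with surrounding whitespace removed,
    // skipping lines that contain nothing but whitespace.
    static void copyTrimmed(Reader in, Writer out) throws IOException {
        BufferedReader reader = new BufferedReader(in);
        PrintWriter writer = new PrintWriter(out);
        String line;
        while ((line = reader.readLine()) != null) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                writer.println(trimmed);   // uses the platform line separator
            }
        }
        writer.flush();
    }

    public static void main(String[] args) throws IOException {
        copyTrimmed(new InputStreamReader(System.in),
                    new OutputStreamWriter(System.out));
    }
}
```

It would be run in the same way as the command line shown earlier: java TrimFilter < inputfile | destination_application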
 
Alex Kizub

I just wonder why people never tire of reinventing the wheel.
There are so many good log analyzers...

Here you are:

import java.io.*;

public class LogAnalyzer {

    public static void main(String[] args) throws Exception {
        System.out.println("LogAnalyzer starts");

        File logFile = new File("abc.log");
        long seek = 0;       // file offset just past the last line already read
        long lastModified;   // modification time of the file at the last read

        // First pass: read everything that is already in the file.
        // Note that readLine() returns a run of padding spaces as a line too.
        RandomAccessFile ra = new RandomAccessFile(logFile, "r");
        String rez;
        while ((rez = ra.readLine()) != null) {
            System.out.println(rez);
            seek = ra.getFilePointer();
        }
        ra.close();
        lastModified = logFile.lastModified();

        // Check for changes about once a second (199 polls in total).
        int step = 200;
        while (--step > 0) {
            Thread.sleep(1000);
            if (lastModified != logFile.lastModified()) {
                // Reopen the file and continue from the remembered offset.
                ra = new RandomAccessFile(logFile, "r");
                ra.seek(seek);
                while ((rez = ra.readLine()) != null) {
                    System.out.println("added=" + rez);
                    seek = ra.getFilePointer();
                }
                ra.close();
                lastModified = logFile.lastModified();
            }
        }

        System.out.println("LogAnalyzer ends");
    }
}


I hope you know what to do with a String other than print it.
Alex Kizub.

 
matt

Thanks a lot! This code clearly illustrates how to use the last-modified
time and the file pointer. It is really helpful. I will modify it to
handle the whitespace at the end and to write the output to a new file.
I will also change the sleep(), because it may cause starvation of the
other process that writes to the same file.
By the way, could you point me to some websites where I can find Java code?
Thanks again!
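
One way the modification described here might look, offered as a sketch rather than a finished tool. It assumes that every valid log line ends with a newline and that the padding contains no newline, so "everything up to the last newline" is the valid text. LogTail and copyNewLines are hypothetical names, and the 1 MiB cap per call is an arbitrary choice.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.RandomAccessFile;

class LogTail {

    // Appends the complete lines found at or after offset in source to target
    // and returns the offset to use on the next call.
    static long copyNewLines(File source, File target, long offset) throws IOException {
        try (RandomAccessFile in = new RandomAccessFile(source, "r");
             OutputStream out = new FileOutputStream(target, true)) {  // true = append
            long available = in.length() - offset;
            if (available <= 0) {
                return offset;
            }
            // Read at most 1 MiB per call; a single line longer than that
            // would never be copied by this sketch.
            byte[] buf = new byte[(int) Math.min(available, 1 << 20)];
            in.seek(offset);
            int n = in.read(buf);
            if (n <= 0) {
                return offset;
            }
            // Keep only the bytes up to the last newline, so neither the
            // padding nor a half-written line reaches the new file.
            int end = n;
            while (end > 0 && buf[end - 1] != '\n') {
                end--;
            }
            out.write(buf, 0, end);
            return offset + end;
        }
    }
}
```

The returned offset is a plain long, so it can be kept in a variable between polls even though the file is closed and reopened each time; spaces and line breaks inside the valid text are copied unchanged.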
 
