Reading log and saving data to DB


Guy Tamir

Hi all,

I have an Ubuntu server running NGINX that logs data for me.
I want to write a Python script that reads my customized logs and, after
a little rearrangement, saves the new data into my DB (PostgreSQL).

The process should run about every 5 minutes, and I'm expecting large chunks of data in some of those 5-minute windows.

My plan for achieving this is to install Python on the server, write a script and add it to cron.

My question is: what is the simplest way to do this?
Should I use any Python frameworks?
For my Python app I'm using Django, but on this server I just need to read a file, do some manipulation and save to the DB.

If any part of my plan seems troubling in any way, I'd love to hear about it.

Regards,
Guy
 

marduk

Rarely do I put "framework" and "simplest way" in the same set.

I would do 1 of 2 things:

* Write a simple script that reads lines from stdin and writes to the
db. Make sure it gets started by init before nginx does, and pipe the
output of tail -F -n 0 on the log file into that script. Don't worry
about the 5-minute cron.

* Similar to above but if you want to use cron also store in the db the
offset of the last byte read in the file, then when the cron job kicks
off again seek to that position + 1 and begin reading, at EOF write the
offset again.

This is irrespective of any log rotating that is going on behind the
scenes, of course.
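
The second option can be sketched roughly as follows. This is only a minimal illustration, not a finished component: the function name read_new_lines is made up for the example, and loading and saving the offset in the db is left to the caller. It uses f.seek() with the position of the next unread byte, so no "+ 1" is needed, and it treats a file that is shorter than the saved offset as rotated or truncated (an assumption that covers the simple rotation case only).

```python
import os


def read_new_lines(path, offset):
    """Return (complete_lines, new_offset) for data appended after `offset`.

    `offset` is the byte position of the first unread byte, as saved by
    the previous run (0 on the very first run).
    """
    size = os.path.getsize(path)
    if size < offset:
        # The file is shorter than our saved position: assume it was
        # rotated or truncated and start again from the beginning.
        offset = 0

    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read()

    # Only consume up to the last newline, so a half-written line is
    # picked up on the next run instead of being parsed incomplete.
    end = data.rfind(b"\n") + 1
    lines = data[:end].decode("utf-8", errors="replace").splitlines()
    return lines, offset + end
```

A cron job would load the saved offset, call read_new_lines(), write the rows, and store the returned offset, ideally in the same transaction as the rows so that a crash does not count lines twice.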
 

Dennis Lee Bieber

Is the log coming from NGINX or (since you mention Django below) coming
solely from the Django application?

If the logging is from the Django application only, you should be able
to have it connect to the database and write directly to it.
 

Guy Tamir

The log is from NGINX.
 

Guy Tamir

I'm not sure I understood the first option, or what it means to run the script before nginx.

The second option sounds more like what I had in mind.
Aren't there any existing components like this that I can use?
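
For reference, the first option (a script fed by tail -F) might look something like the sketch below. The names consume and parse_line, the "|" separator and the batch size are all assumptions made for the example, since the actual log format is not given in this thread.

```python
def parse_line(line):
    # Hypothetical format: fields separated by "|". Replace this with
    # whatever the customized nginx log format actually is.
    return tuple(line.rstrip("\n").split("|"))


def consume(stream, handle_batch, batch_size=500):
    """Read lines from `stream` and pass parsed rows on in batches.

    Returns the number of rows handed to `handle_batch`.
    """
    batch = []
    count = 0
    for line in stream:
        if not line.strip():
            continue  # skip blank lines
        batch.append(parse_line(line))
        if len(batch) >= batch_size:
            handle_batch(batch)
            count += len(batch)
            batch = []
    if batch:
        # Flush whatever is left when the stream ends.
        handle_batch(batch)
        count += len(batch)
    return count


# Intended use (the paths are placeholders):
#   tail -F -n 0 /path/to/access.log | python script.py
# with script.py calling consume(sys.stdin, some_db_writer).
```

Because tail -F never closes its output, a partly filled batch stays in memory until enough lines arrive; a real script would probably also want to flush on a timer.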

Since the log fills up quickly, I'm having trouble reading that much data and writing it all to the DB in a reasonable amount of time.

The table receiving the new data is somewhat complex. Its purpose is to store data about ads shown from my app; the fields are (ad_id, user_source_site, user_location, day_date, specific_hour, views, clicks).
Each row is unique on the first 5 fields, since I need to show different types of stats.

Because each new log line may or may not already have a row in the DB, I have to run an upsert (update or insert) for each line.

This leads to very poor performance.
Do you have any ideas about how I can make this script more efficient?
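
One possible direction, sketched here under several assumptions, is to add up views and clicks in memory per (ad_id, user_source_site, user_location, day_date, specific_hour) key first, and then send one upsert per distinct key instead of one per log line. The table name ad_stats, the "view"/"click" event kinds and the helper names are invented for the example. The SQL relies on PostgreSQL's INSERT ... ON CONFLICT (available from 9.5) and on a unique constraint over the five key columns; flush() assumes psycopg2 is installed.

```python
from collections import defaultdict

# Assumed table name and unique constraint; adjust to the real schema.
UPSERT_SQL = """
INSERT INTO ad_stats
    (ad_id, user_source_site, user_location, day_date, specific_hour,
     views, clicks)
VALUES %s
ON CONFLICT (ad_id, user_source_site, user_location, day_date, specific_hour)
DO UPDATE SET views = ad_stats.views + EXCLUDED.views,
              clicks = ad_stats.clicks + EXCLUDED.clicks
"""


def aggregate(events):
    """Collapse parsed log events into one row per distinct key.

    Each event is (ad_id, site, location, day, hour, kind), where kind
    is "view" or "click" (a hypothetical parsed form of a log line).
    """
    totals = defaultdict(lambda: [0, 0])  # key -> [views, clicks]
    for ad_id, site, location, day, hour, kind in events:
        counts = totals[(ad_id, site, location, day, hour)]
        if kind == "view":
            counts[0] += 1
        elif kind == "click":
            counts[1] += 1
    return [key + (views, clicks) for key, (views, clicks) in totals.items()]


def flush(cursor, rows):
    """Send all aggregated rows in a few multi-row statements."""
    # Imported here so aggregate() can be used without psycopg2.
    from psycopg2.extras import execute_values
    execute_values(cursor, UPSERT_SQL, rows)
```

With busy ads, many log lines in one 5-minute window share the same key, so the number of rows sent to the DB can be far smaller than the number of lines read.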
 
