OT: squid-type cache for XML?

Discussion in 'HTML' started by CptDondo, Nov 14, 2006.

  1. CptDondo

    CptDondo Guest

    OK, this is OT for this group, but I really have no idea where to post this.

    I am working on a project where a 'client' periodically queries a number
    of 'servers'. The exchanges are done using XML.

    There is one client and an awful lot of servers (hundreds), and
    bandwidth is limited. It can take hours for the client to query all of
    the servers in round-robin fashion. (We can't use exception reporting
    or have the servers report for technical reasons.)

    My solution is to develop intermediate proxy-cache boxes, which would
    query servers in their subnet and cache the results. The client then
    would only need to query the proxies.

    This seems like a pretty simple idea, and there are solutions out there
    for HTML proxies doing this sort of thing.

    Is anyone aware of anything out there for XML queries?
    CptDondo, Nov 14, 2006
    #1

  2. Benjamin Niemann

    Hello,

    CptDondo wrote:

    > OK, this is OT for this group, but I really have no idea where to post
    > this.
    >
    > I am working on a project where a 'client' periodically queries a number
    > of 'servers'. The exchanges are done using XML.
    >
    > There is one client and an awful lot of servers (hundreds), and
    > bandwidth is limited. It can take hours for the client to query all of
    > the servers in round-robin fashion. (We can't use exception reporting
    > or have the servers report for technical reasons.)
    >
    > My solution is to develop intermediate proxy-cache boxes, which would
    > query servers in their subnet and cache the results. The client then
    > would only need to query the proxies.
    >
    > This seems like a pretty simple idea, and there are solutions out there
    > for HTML proxies doing this sort of thing.
    >
    > Is anyone aware of anything out there for XML queries?


    Proxies like squid work on the protocol level (HTTP) - they do not care what
    kind of data is being transferred.
    If you are using HTTP to fetch the XML data, then you should be able to use
    any generic HTTP proxy including squid.
    Just make sure that the data is cacheable at all: proper HTTP headers, data
    is fetched using GET, not POST...
    You could install cron jobs on or near the proxy servers, which pull the
    data (via the proxy) and just drop it, to make sure the data is in the
    cache when your client comes around. A simple bash script with lots
    of 'wget -O /dev/null http://...' might be sufficient.
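
A minimal sketch of the same cache-warming idea in Python rather than bash, in case that is easier to run from cron. The function names, the proxy address and the server URLs in the comments are all hypothetical; the only assumption is that the servers answer plain HTTP GET requests and that the proxy accepts ordinary forward-proxy requests.

```python
import urllib.request


def build_proxy_opener(proxy_url):
    """Return an opener that sends HTTP requests through proxy_url.

    proxy_url is a placeholder such as "http://proxy.example:3128".
    """
    handler = urllib.request.ProxyHandler({"http": proxy_url})
    return urllib.request.build_opener(handler)


def warm_cache(urls, opener, timeout=30):
    """GET every URL through the opener and throw the body away.

    The point is only to get the proxy to store the response.
    Returns a dict mapping each URL to its HTTP status, or to the
    exception raised if the fetch failed.
    """
    results = {}
    for url in urls:
        try:
            with opener.open(url, timeout=timeout) as response:
                response.read()  # discard, like wget -O /dev/null
                results[url] = response.status
        except OSError as exc:  # URLError and HTTPError are OSError subclasses
            results[url] = exc
    return results


# Possible use from a cron job (addresses are made up):
#   opener = build_proxy_opener("http://proxy.example:3128")
#   warm_cache(["http://10.0.1.17/status.xml"], opener)
```

Whether the proxy actually keeps the response still depends on the caching headers the servers send, so this only helps once the responses are cacheable.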

    HTH

    --
    Benjamin Niemann
    Email: pink at odahoda dot de
    WWW: http://pink.odahoda.de/
    Benjamin Niemann, Nov 14, 2006
    #2

  3. CptDondo

    CptDondo Guest

    Benjamin Niemann wrote:

    > Proxies like squid work on the protocol level (HTTP) - they do not care what
    > kind of data is being transferred.
    > If you are using HTTP to fetch the XML data, then you should be able to use
    > any generic HTTP proxy including squid.
    > Just make sure that the data is cacheable at all: proper HTTP headers, data
    > is fetched using GET, not POST...
    > You could install cron jobs on or near the proxy servers, which pull the
    > data (via the proxy) and just drop it, to make sure the data is in the
    > cache when your client comes around. A simple bash script with lots
    > of 'wget -O /dev/null http://...' might be sufficient.


    That's a neat idea....

    I don't control the client so I'll have to see if the XML-over-HTTP will
    work, but at least I can talk intelligently to my (human) client about
    the issue.... :)

    --Yan
    CptDondo, Nov 14, 2006
    #3
  4. Andy Dingley

    Andy Dingley Guest

    CptDondo wrote:

    > It can take hours for the client to query all of
    > the servers in round-robin fashion.


    > My solution is to develop intermediate proxy-cache boxes,


    This isn't proxying (and so I don't think Squid will help).

    If you had a squillion clients querying one server with an identical
    request, then you could cache that. Your problem, though, is one
    client querying lots of endpoints -- effectively many totally separate
    requests. You can't cache that - even worse, you might cache it,
    and all the "servers" would appear to report the same result!

    You might be able to proxy this by setting up proxies (custom-written
    but simple) somewhere that had good bandwidth to the servers, then
    reported their results in some "denser" fashion to the client. This
    isn't a transparent proxy though.
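
One way such a custom intermediary might report "denser" results is to bundle every server's reply into a single document, keyed by server address so the replies can't be mistaken for one another. This is only a sketch: the `bundle` and `server` element names, the `status` attribute and the function name are invented for illustration, and the real client would have to be taught to read whatever envelope is chosen.

```python
import xml.etree.ElementTree as ET


def bundle_responses(responses):
    """Wrap many per-server XML replies in one envelope document.

    responses maps a server address to its XML text, or to None
    if that server could not be reached.
    """
    root = ET.Element("bundle")  # hypothetical envelope element
    for address, xml_text in sorted(responses.items()):
        entry = ET.SubElement(root, "server", {"address": address})
        if xml_text is None:
            entry.set("status", "unreachable")
            continue
        try:
            entry.append(ET.fromstring(xml_text))
            entry.set("status", "ok")
        except ET.ParseError:
            # Keep a record of the failure instead of dropping the server.
            entry.set("status", "malformed")
    return ET.tostring(root, encoding="unicode")
```

The client then makes one request per intermediary instead of one per server, which is where the saving would come from.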

    Chances are that you could even do this in-house, maybe even by just
    re-writing the client to be multi-threaded. Is it really bandwidth
    that's the problem here, or latency?
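
A rough sketch of what the multi-threaded variant could look like, assuming the polling loop can be rewritten at all. `poll_all` and the `fetch` callable are hypothetical names; `fetch` stands in for whatever the real client does to query one server. Running requests in parallel only helps if the hours are spent waiting on round trips (latency); it does nothing if the link itself is saturated.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def poll_all(servers, fetch, max_workers=20):
    """Query every server concurrently instead of round-robin.

    fetch(server) performs a single query and returns its reply.
    Returns a dict mapping each server to its reply, or to the
    exception raised while querying it.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(fetch, server): server for server in servers}
        for future in as_completed(futures):
            server = futures[future]
            try:
                results[server] = future.result()
            except Exception as exc:  # one dead server must not stop the rest
                results[server] = exc
    return results
```

The `max_workers` value of 20 is arbitrary; on a thin link it would need tuning so that the parallel requests don't simply compete for the same bandwidth.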
    Andy Dingley, Nov 14, 2006
    #4
  5. Toby Inkster

    Toby Inkster Guest

    CptDondo wrote:

    > I am working on a project where a 'client' periodically queries a number
    > of 'servers'. The exchanges are done using XML.
    >
    > There is one client and an awful lot of servers (hundreds), and
    > bandwidth is limited. It can take hours for the client to query all of
    > the servers in round-robin fashion.


    XML is fairly bulky. Have you thought of compressing the entire exchange
    with gzip? Ought to reduce bandwidth by about 60% or so.
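
The actual saving depends heavily on the data, so it is worth measuring on a real reply before counting on any particular figure. A small sketch using Python's standard gzip module follows; the sample document and the `compression_saving` name are made up for illustration.

```python
import gzip


def compression_saving(xml_bytes):
    """Return the fraction of bytes saved by gzip (0.6 means 60% smaller)."""
    compressed = gzip.compress(xml_bytes)
    return 1 - len(compressed) / len(xml_bytes)


# A made-up, repetitive reply standing in for a real one.
sample = (
    "<readings>"
    + "".join(
        '<reading sensor="s%d" value="%d" unit="C"/>' % (i, 20 + i % 5)
        for i in range(200)
    )
    + "</readings>"
).encode("utf-8")

if __name__ == "__main__":
    print("saving: %.0f%%" % (100 * compression_saving(sample)))
```

If the exchange runs over HTTP, compression can often be negotiated with the standard Accept-Encoding and Content-Encoding headers rather than done by hand, provided both ends support it.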

    --
    Toby A Inkster BSc (Hons) ARCS
    Contact Me ~ http://tobyinkster.co.uk/contact
    Toby Inkster, Nov 14, 2006
    #5