README file for SquidClamav Version 3.9

SquidClamav - A Clamav Antivirus Redirector for Squid -
	(http://sourceforge.net/projects/squidclamav/)

REQUIREMENT:
------------

You need at least libcurl 7.12.1 and a standard regex library. Both should
already be installed on modern distributions.


INSTALLATION:
-------------

Please see the INSTALL file for installation instructions. For an express
install, just run the following:

	./configure
	make
	make install
	cp squidclamav.conf.dist /etc/squidclamav.conf
	touch /var/log/squidclamav.log
	chown squid /var/log/squidclamav.log

and edit /etc/squidclamav.conf to match your needs.

SQUID 2.5 CONFIGURATION:
------------------------

To integrate squidclamav into your Squid cache, edit the squid.conf
file and set the following.

In the ACL definitions you should have declared:

        acl localhost src 127.0.0.1/255.255.255.255
        acl to_localhost dst 127.0.0.0/8

In the http_access definitions you should have declared the following:

        http_access deny to_localhost
        http_access allow localhost
        redirector_access deny localhost

and in the redirect section the following:

        redirect_program /usr/local/squidclamav/bin/squidclamav
        redirect_children 15

If you have heavy traffic and enough memory, set redirect_children to a
higher value.

SQUID 2.6 / 2.7 / 3.0 CONFIGURATION:
------------------------------------

Squid 2.6 significantly changed the redirector directives in the
configuration file. To integrate squidclamav into your Squid cache, edit
the squid.conf file and set the following.

In the ACL definitions you should have declared:

        acl localhost src 127.0.0.1/255.255.255.255
        acl to_localhost dst 127.0.0.0/8

In the http_access definitions you should have declared the following:

        http_access deny to_localhost
        http_access allow localhost
        url_rewrite_access deny localhost

and in the redirect section the following:

        url_rewrite_program /usr/local/bin/squidclamav
        url_rewrite_children 15

If you have heavy traffic and enough memory, set url_rewrite_children to a
higher value.


SQUIDCLAMAV CONFIGURATION:
--------------------------

By default, the configuration file is now located at:

  /etc/squidclamav.conf

If you need another path, give it as a command line argument.

You need to create this file from scratch with the help of
squidclamav.conf.dist and the following instructions:

Squidclamav Patterns:

Lines in the squidclamav.conf file take one of the following forms:

	regex|regexi pattern
or
	abort|aborti pattern
or
	content|contenti pattern
or
	abortcontent|abortcontenti pattern
or
	whitelist pattern
or
	redirect cgi_url_redirection

Full regex matching is made available by the use of the GNU Regex library.
It also supports pattern buffers.

There are two levels at which squidclamav virus scanning can be configured.
The first is the URL stage; the related configuration options are regex,
abort and whitelist. The second is the HTTP header stage, based on the
Content-Type; the related configuration options are content and
abortcontent. At this level squidclamav needs to send a HEAD request to the
remote server.

URL stage:

	regex|regexi pattern => Virus scan at pattern match on URL.
	abort|aborti pattern => No virus scan at pattern match on URL.
	whitelist pattern => No virus scan for an entire site.

Content-Type stage:

	content|contenti pattern => Virus scan at pattern match on content type.
	abortcontent|abortcontenti pattern => No virus scan at pattern match on content type.
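
The directive syntax above can be illustrated with a minimal Python sketch.
It is not SquidClamav's own parser (SquidClamav is written in C), and the
name parse_patterns is hypothetical. It assumes that the keyword variants
ending in "i" request case-insensitive matching and that 'whitelist' is
matched case-sensitively. Python's re module stands in for GNU regex, which
behaves the same way for the simple patterns used in this README.

```python
import re

# Pattern keywords from the syntax list above; the "i" variants are
# assumed to request case-insensitive matching.
PATTERN_KEYWORDS = ("regex", "abort", "content", "abortcontent")


def parse_patterns(text):
    """Group compiled patterns by base keyword; skip comments and other options."""
    rules = {name: [] for name in PATTERN_KEYWORDS + ("whitelist",)}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue
        keyword, pattern = parts
        if keyword in rules:
            rules[keyword].append(re.compile(pattern))
        elif keyword.endswith("i") and keyword[:-1] in PATTERN_KEYWORDS:
            rules[keyword[:-1]].append(re.compile(pattern, re.IGNORECASE))
    return rules


sample = r"""
regexi  ^.*\.exe$
abort   ^.*\/$
content ^application\/.*$
#whitelist www.eicar.org
"""
rules = parse_patterns(sample)
print({name: len(found) for name, found in rules.items()})
```

Other options such as 'redirect' or 'logfile' are simply ignored by this
sketch, and it says nothing about the order in which SquidClamav itself
evaluates the rules.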


Let's say you want ClamAV to scan files with the extensions .exe, .com
and .zip, regardless of case. Here are the lines you may include:

	regexi  ^.*\.exe$
	regexi  ^.*\.com$
	regexi  ^.*\.zip$
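
To see what these patterns match, the short Python sketch below tries them
against a few made-up URLs (the host names are placeholders). Python's re
module is used as a stand-in for GNU regex, 'regexi' is modeled with
re.IGNORECASE, and would_scan is a hypothetical helper name.

```python
import re

# The three regexi patterns above; IGNORECASE models the "i" suffix.
patterns = [re.compile(p, re.IGNORECASE)
            for p in (r"^.*\.exe$", r"^.*\.com$", r"^.*\.zip$")]


def would_scan(url):
    """Return True if any of the patterns matches the URL."""
    return any(p.search(url) for p in patterns)


for url in ("http://downloads.example.org/setup.EXE",
            "http://downloads.example.org/archive.zip",
            "http://downloads.example.org/setup.exe?mirror=2",
            "http://downloads.example.org/readme.txt"):
    print(url, "->", "scan" if would_scan(url) else "no match")
```

Because of the trailing '$', a URL that carries a query string after the
extension is not matched. Conversely, a bare site URL such as
http://www.example.com ends in ".com" and therefore does match the second
pattern.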

Now let's say you don't want to check image and HTML files. Then you
should include the following lines in the configuration file:

	aborti ^.*\.gif$
	aborti ^.*\.png$
	aborti ^.*\.jpg$
	abort ^.*\.html$
	abort ^.*\.htm$

If you don't want to check directory listings or the default index.html,
add the following line:

	abort ^.*\/$

You may want to enable virus scanning based on content type, for example
for all 'application/*' files:

	content ^application\/.*$

will scan all files with this content-type.

Some downloads don't have an extension, so you may want to skip scanning
for certain content types. Proceed as follows:

	abortcontenti ^.*application\/x-javascript.*$

This will abort virus scanning for JavaScript files with an unknown
extension. It is also useful if you experience too much CPU usage on some
content types. For example:

	abortcontenti ^.*application\/x-mms-framed.*$


If you want to disable both the virus scan and the squidguard call for a
given URL or domain, use the 'whitelist' keyword. It is provided as a
convenience; using the 'aborti' keyword has the same effect. Use it as
follows:

	whitelist www.trustdomain.com

Use this if you experience problems with squidclamav on malformed sites.

Here is the configuration I use:

	abort mappy.com
	abort ^.*\.pdf$
	abort ^.*\.js$
	abort ^.*\.html$
	abort ^.*\.css$
	abort ^.*\.xml$
	abort ^.*\.xsl$
	abort ^.*\.jsp$
	abort ^.*\.jsp\?.*$
	aborti ^.*servlet.*$
	abort ^.*\.ico$
	aborti ^.*\.gif$
	aborti ^.*\.png$
	aborti ^.*\.jpg$
	aborti ^.*\.swf$
	abortcontenti ^.*application\/x-mms-framed.*$
	abortcontenti ^.*application\/x-javascript.*$
	content ^.*application\/.*$
	#whitelist www.eicar.org

When a virus is found, squidclamav redirects the request to a CGI program.
You must specify the URL of this CGI with the redirect directive as
follows:

	redirect http://proxy.domain.com/cgi-bin/clwarn.cgi

Squidclamav will pass to this CGI the following parameters:

	url=ORIGINAL_HTTP_REQUEST
	virus=NAME_OF_THE_VIRUS
	source=DOWNLOADER_IP_ADDRESS
	user=DOWNLOADER_IDENT
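
As an illustration of how a warning page could read these parameters, here
is a small Python sketch that splits a redirect URL of the kind shown in the
TESTING SQUIDCLAMAV section of this README. It is not the clwarn.cgi
referenced above; warning_params is a hypothetical helper.

```python
from urllib.parse import urlsplit, parse_qs


def warning_params(redirect_url):
    """Return the url, virus, source and user parameters as a dict."""
    query = parse_qs(urlsplit(redirect_url).query)
    return {key: query.get(key, [""])[0]
            for key in ("url", "virus", "source", "user")}


sample = ("http://theproxy.com/cgi-bin/clwarn.cgi"
          "?url=http://www.eicar.org/download/eicar.com"
          "&source=192.168.1.3&user=mylog"
          "&virus=stream:+Eicar-Test-Signature+FOUND")
params = warning_params(sample)
print(params["virus"])
```

In the sample the original URL is passed without extra encoding, so this
naive parsing would cut it short if the original URL itself contained an
'&' character.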

Virus scanning can be stopped on large files by configuring clamd (see clamd.conf).

You can override the URL and port of your Squid proxy as follows:

	proxy http://127.0.0.1:3128

This is the default value. To disable downloading through the proxy in
special cases, set this option to 'none'. Note that you should always use
the localhost interface to prevent loops. See the Squid ACL section in the
INSTALL file.

The log file can be set with:

	logfile /var/log/squidclamav.log

This is now the default.

If you need to change the path to the configuration file, either edit
path.h, change the default value, then recompile and reinstall, or pass the
path as the first parameter of the rewrite program in squid.conf.

You can specify a timeout on the libcurl connection used when squidclamav
tries to download the file by using the 'timeout' configuration option.

Some HTTP servers send malformed headers (especially for ads), and
libcurl is then unable to return a valid content-type or other header
information. If you want to force a virus scan of these URLs, set the
'force' option to 1.

To show time statistics of URL processing you must set the 'stat' option
to 1.

You can now chain squidclamav with another redirector such as SquidGuard.
The chained program is called before the antivirus scanner. To do that,
set the 'squidguard' configuration directive to the path of the
redirector.

Internet sites now make heavy use of HTTP redirects, and you may not want
to waste time following a long chain of Location headers. By default
squidclamav stops after 10 redirects. Use the 'maxredir' configuration
option to adjust this value.

Some HTTP servers refuse clients that are not using IE. Squidclamav fetches
the HTTP headers with the libcurl user agent, so you may want to disguise
the curl download with the 'useragent' configuration option as follows:

	useragent Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)

You can force SquidClamav to reread its configuration at runtime by
sending the SIGUSR1 signal to all running processes. This saves you from
reconfiguring Squid each time you modify squidclamav.conf, for example when
you switch debug mode on or off. You still need to reconfigure Squid if you
want to modify the following configuration options: squidguard, maxredir,
timeout and useragent. Use the following command line to proceed:

	killall -10 squidclamav


CONFIGURING CLAMD CONNECTION:
-----------------------------

There are 3 configuration options to set up the connection to the clamd
daemon.

If you use a local Unix socket, just set clamd_local to the socket path
as follows:

	clamd_local /tmp/clamd

If you use a TCP socket, set clamd_ip and clamd_port as follows:

	clamd_ip 192.168.1.5
	clamd_port 3310

DO NOT set clamd_local if you want to use a TCP socket!
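
The choice between the two connection modes can be sketched in Python as
follows. This is an illustration, not SquidClamav's code: clamd_endpoint and
ping are hypothetical helpers, the precedence of clamd_local over the TCP
settings is an assumption drawn from the warning above, and falling back to
port 3310 when clamd_port is missing is also an assumption. The PING command
and its PONG reply are part of clamd's own protocol.

```python
import socket


def clamd_endpoint(conf):
    """Pick the clamd address from squidclamav.conf-style options.

    Assumption: a non-empty clamd_local wins over clamd_ip/clamd_port.
    """
    if conf.get("clamd_local"):
        return socket.AF_UNIX, conf["clamd_local"]
    return socket.AF_INET, (conf["clamd_ip"], int(conf.get("clamd_port", 3310)))


def ping(conf, timeout=5.0):
    """Send PING to clamd and return True if it answers PONG."""
    family, address = clamd_endpoint(conf)
    with socket.socket(family, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        sock.connect(address)
        sock.sendall(b"PING\n")
        return sock.recv(16).strip() == b"PONG"


tcp = clamd_endpoint({"clamd_ip": "192.168.1.5", "clamd_port": "3310"})
local = clamd_endpoint({"clamd_local": "/tmp/clamd", "clamd_ip": "192.168.1.5"})
print(tcp, local)
```

Calling ping() with the same options you put in squidclamav.conf is a quick
way to check that clamd is reachable before testing squidclamav itself.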


TESTING SQUIDCLAMAV:
--------------------

Once you have installed and configured squidclamav and modified the Squid
configuration, the best way to check that squidclamav works properly is to
test it. For detailed output, set the debug option to 1 in squidclamav.conf.
For more debug traces, set the debug option to 2.

Open a terminal on your proxy server and run squidclamav. This gives output
like the following:

	root@theproxy# squidclamav 
	SquidClamav running as UID 0: writing logs to stderr
	Thu ... 2008 LOG Reading configuration from /etc/squidclamav.conf
	Thu ... 2008 LOG Chaining with /usr/local/squidGuard/bin/squidGuard
	Thu ... 2008 LOG SquidClamav (PID 7012) started
	Thu ... 2008 bidirectional pipe to squidGuard childs ready...

At this point squidclamav is waiting for squid input. The input line consists
of four fields:

	URL ip-address/fqdn ident method

For example, let's check slashdot:

	http://www.slashdot.org/ 192.168.1.3 mylog GET

As this site doesn't contain any virus :-) squidclamav simply returns an
empty line. Now, to test the ClamAV antivirus, type the following entry:

	http://www.eicar.org/download/eicar.com 192.168.1.3 mylog GET

The result must be a redirection to clwarn.cgi as follows:

	Thu ... 2008 LOG Redirecting URL to: http://theproxy.com/cgi-bin/clwarn.cgi?url=http://www.eicar.org/download/eicar.com&source=192.168.1.3&user=mylog&virus=stream:+Eicar-Test-Signature+FOUND
	http://theproxy.com/cgi-bin/clwarn.cgi?url=http://www.eicar.org/download/eicar.com&source=192.168.1.3&user=mylog&virus=stream:+Eicar-Test-Signature+FOUND 192.168.1.3 mylog GET

This last line is the request returned to squid.
Type Ctrl+C to quit.
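
The four-field input line described in this section can be split with a few
lines of Python, which may help when preparing test inputs or reading logs.
This is only a sketch: Request and parse_request are hypothetical names, and
the handling of the optional "/fqdn" part follows the "ip-address/fqdn"
notation above.

```python
from collections import namedtuple

Request = namedtuple("Request", "url ip fqdn ident method")


def parse_request(line):
    """Split 'URL ip-address/fqdn ident method' into named fields."""
    url, client, ident, method = line.split()
    # The fqdn part is optional; partition() leaves it empty when absent.
    ip, _, fqdn = client.partition("/")
    return Request(url, ip, fqdn or None, ident, method)


req = parse_request("http://www.slashdot.org/ 192.168.1.3 mylog GET")
print(req.ip, req.method)
```

A line with more or fewer than four fields raises ValueError in this sketch
rather than being silently accepted.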


FEEDBACK:
---------

If you find it useful, I'd like to know - please send email
to gilles@darold.net

ACKNOWLEDGEMENT:
----------------

Many thanks to all the great contributors:

	- Leonardo Humberto Liporati from www.ig.com.br
	- Dale Laushman from The Uptime Group
	- Rainer Schoepf from Proteosys.com

and all others who helped me build a useful and reliable product.


COPYRIGHT:
----------

This project is a modified version of the excellent Squirm Redirector for
Squid, maintained by Chris Foote and copyrighted as follows:

        Copyright (C) 1998 Chris Foote & Wayne Piekarski

The original Squirm version used was squirm-1.0betaB. Some other parts are
cut and paste from the ex1.c program given in the ClamAv distribution and
are copyrighted: Copyright (C) 2002 - 2004 Tomasz Kojm

All other code: Copyright (C) 2005-2008 Gilles Darold


LICENSE:
--------

    This program is free software; you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation; either version 2 of the License, or
    (at your option) any later version.

    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    GNU General Public License for more details.

    You should have received a copy of the GNU General Public License
    along with this program; if not, write to the Free Software
    Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.

Please see the file COPYING in this directory for full copyright
information.

