Thursday, September 10, 2009

How To Block Ads And Banners In SafeSquid Proxy Server

Ads and banners not only consume bandwidth unnecessarily, but also distract users and can be exceedingly irritating at times. Some ads and banners keep rotating and fetching new content, consuming bandwidth in the background.
SafeSquid can be configured to either blank out these ads and banners, or replace them with a custom html page or an image. In this tutorial I will explain how you can replace ads and banners with a custom html page. This requires configuring 3 sections, viz. Templates, Profiles and URL redirecting.
The first thing to do is to design an html page to replace the ads and banners. Since many ads and banners are displayed in a small area of the page, the html page that you design to replace them should be as small as possible.
Here is a sample :
Copy your custom html file, which we will call ads.html in this tutorial, to the SafeSquid template directory. By default, the directory is located at /opt/safesquid/safesquid/templates/. You can verify this from the SafeSquid Interface => Config => Templates Section. 
SafeSquid's Templates section allows you to add your own custom templates, or messages, to be displayed when a page is blocked by a filter, instead of SafeSquid's default message. The replacement file could be an html page, an image, an audio / video file, an executable, etc. This is a very powerful tool that we will explain in a future tutorial.
After you have copied the file to the templates directory, you need to define the file in the Templates section (create a template), and give it a name by which it will be identified in the other sections. To do this, open the SafeSquid Interface and go to Config => Templates. Click on Add under the Template sub-section and add the new template as shown below:
Option          Value
Enabled         Yes
Comment         Template to replace ads and banners
Profiles        (blank)
Name            replace-ad-banner
File            /opt/safesquid/safesquid/templates/ads.html
Mime type       text/html
Response code   302
Type            File
Parsable        Yes
The explanation for the various fields above can be found at http://www.safesquid.com/html/portal.php?page=24 and will be covered in a future tutorial.
Now the file ads.html can be used as a template in SafeSquid, and has been named replace-ad-banner. We will later use it in the URL Redirecting section.
The next thing to do is to identify the ads and banners that appear in web pages, so that they can be replaced. They could either be fetched from a remote Ad Server, or be located on the same web server as the page. In the former case, once the Ad Servers are identified, it is easy to identify the content being fetched from them. In the latter case, the link to the content usually has the words ad, ads, adv, advert, banner, banners, etc. in the file part of the URL, e.g. d7.zedo.com/ads2/*, *.googlesyndication.com/pagead/show_ads.js. So, if we can filter out such URLs, we can replace them with our custom template.
SafeSquid allows the use of Perl Compatible Regular Expressions (PCRE), hence we can create a single rule that can cover multiple words, strings or expressions. Go to Config => Profiles and create the following two rules:
Profile to identify Ad Servers:
Option            Value
Enabled           True
Comment           Identify content from Ad Servers
Host              (^ad(|s|v|server)\.|adtag\.|targetsearches\.com|webconnect\.net|imgis\.com|atwola\.com|fastclick\.net|abz\.com|tribalfusion\.com|advertising\.com|atdmt\.com|spinbox\.(com|net)|linkexchange\.com|hitbox\.com|doubleclick\.net|valueclick\.com|click2net\.com|mediaplex\.com|247media\.com|clickagents\.com|adbutler\.com|qkimg\.net|realmedia\.com|us\.a1\.yimg\.com|clickheretofind\.com|images\.cybereps\.com|adbureau\.net|sfads\.osdn\.com|adflow\.com|adprofs\.com|zedo\.com|digitalmedianet\.com|ad-flow\.com|/adsync/|adtech\.de|netdirect\.nl|rcm-images\.amazon\.com|pamedia\.com|msads\.net|valuead\.com|smartadserver\.com|thisbanner\.com|aaddzz\.com|scripps\.com|ru4\.com|adtrix\.net|falkag\.net)
Time match mode   text/html
Added profiles    Ad-Server-Content
The above rule analyzes the Host part of URLs to verify whether the content is being served from any of the Ad Servers listed in the Host field, and if a positive match is found, applies the profile Ad-Server-Content to that content. (A URL is made up of protocol://host/file, e.g. http://www.safesquid.com/html).
The Host field in the above rule is a regular expression. Host names are separated with a pipe (|). In regular expressions, a '.' is a special character - a single character wildcard. A '\' before a '.' specifies that it is to be interpreted as the character '.' and not as a wildcard. The expression begins with ^ad(|s|v|server)\. The ^ anchors the match to the start of the host, so this will match hosts that begin with ad., ads., adv. or adserver., e.g. ad.indiatimes.com, ads.asiafriendfinder.com, adv.elbuscador.com, etc. You can also add additional hosts to the expression.
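To get a feel for how the Host expression behaves before entering it in SafeSquid, it can be tried out in a small script. The following is only a sketch: it uses Python's re module rather than PCRE (the two agree for the constructs used here), it uses an abbreviated version of the expression, and it assumes that SafeSquid searches for the expression anywhere in the host rather than requiring a full match. The function name is_ad_host is made up for this illustration.

```python
import re

# Abbreviated version of the Host expression from the profile above.
# The full rule lists many more ad servers.
HOST_PATTERN = re.compile(
    r"(^ad(|s|v|server)\.|adtag\.|doubleclick\.net|zedo\.com|atdmt\.com)"
)


def is_ad_host(host: str) -> bool:
    """Return True if the host matches the (abbreviated) ad-server expression.

    Assumption: the expression is searched for anywhere in the host,
    so only the first alternative is tied to the start by its ^ anchor.
    """
    return HOST_PATTERN.search(host) is not None


if __name__ == "__main__":
    for host in [
        "ad.indiatimes.com",         # starts with "ad."
        "ads.asiafriendfinder.com",  # starts with "ads."
        "d7.zedo.com",               # contains "zedo.com"
        "www.safesquid.com",         # no alternative applies
        "download.com",              # contains "ad." but not at the start
    ]:
        print(host, is_ad_host(host))
```

The last host shows why the ^ anchor matters: download.com contains the text ad. but does not begin with it, so it is not treated as an Ad Server.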
Profile to identify expressions in file part of URL:
Option            Value
Enabled           True
Comment           Template to replace ads and banners
File              /(adimages/|banner(|s)/|ad(|s|v|(|_)banner(|s))/|adx/|sponsors/|advert(ising|s|)/|adcycle/|track/|promo/|adspace/|admentor/|image\.ng/|ajrotator/|adview.php|clickthru|affiliates|banmat(\.cgi|.\.cgi)|adproof/|bannerfarm/|BannerAds/|banner_|sponsorid|servfu.pl|RealMedia/|pagead/|adsync/|_ad_|adceptdelivery.cgi)
Time match mode   text/html
Added profiles    Ad-Banner
The above rule analyzes the file part of URLs to verify if they contain any of the expressions specified in the file field of the rule, and if a positive match is found, applies the profile Ad-Banner to the content.
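The File expression can be checked in the same way. The sketch below again uses Python's re module as a stand-in for PCRE and an abbreviated version of the expression. It assumes that the "file part" corresponds to the path of the URL and that the expression is searched for anywhere in it. The helper name is_ad_path and the example.com URLs are invented for the illustration.

```python
import re
from urllib.parse import urlsplit

# Abbreviated version of the File expression from the profile above.
FILE_PATTERN = re.compile(
    r"/(adimages/|banner(|s)/|ad(|s|v|(|_)banner(|s))/|adx/|sponsors/|pagead/)"
)


def is_ad_path(url: str) -> bool:
    """Return True if the path of the URL matches the (abbreviated) expression.

    Assumption: the "file part" of a URL is taken to be its path.
    """
    path = urlsplit(url).path
    return FILE_PATTERN.search(path) is not None


if __name__ == "__main__":
    for url in [
        "http://pagead2.googlesyndication.com/pagead/show_ads.js",
        "http://www.example.com/banners/top.gif",
        "http://www.example.com/ad_banner/1.gif",
        "http://www.example.com/roads/map.html",  # "ads/" without a leading "/"
        "http://www.safesquid.com/html",
    ]:
        print(url, is_ad_path(url))
```

Because every alternative sits behind the leading /, a directory such as /roads/ is not mistaken for /ads/.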
We can now use the added profiles Ad-Server-Content and Ad-Banner, which are applied to positive matches, to redirect the matching requests to our custom html page ads.html. To achieve this, go to Config => URL redirecting. The URL redirecting section allows you to redirect requests for a specific URL to another URL. This is a very powerful feature, and is mostly used to create redundancy for web servers when SafeSquid is used in reverse proxy mode.
Verify that the section is enabled - Enabled = Yes, click on Add under Redirect sub-section and add the following rule:
Option         Value
Enabled        True
Comment        Redirect specified profiles to template replace-ad-banner
Profiles       Ad-Server-Content,Ad-Banner
URL            /.*
Redirect       http://safesquid.cfg/template/replace-ad-banner
Port           0
302 redirect   false
Applies to     both
Simply put, this rule will redirect all the requests that carry either of the profiles Ad-Server-Content or Ad-Banner to the template replace-ad-banner, which is the name we gave to our custom html page - ads.html. I will cover the URL redirecting section, and the explanation for the various fields, in a future tutorial.
We are now ready to test the results of the above configurations. Open the browser and visit a website that has lots of ads and banners, e.g. www.in.indiatimes.com. The ads and banners should now be replaced with the custom html.
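The decision made by the redirect rule can be pictured with a few lines of Python. This is only a model of the behaviour described above, not SafeSquid's actual implementation. It assumes that the rule applies when a request carries at least one of the listed profiles, and that the URL field is a regular expression searched for in the requested URL. The function name redirect_for is hypothetical.

```python
import re
from typing import Iterable, Optional

# Values taken from the redirect rule above.
REDIRECT_PROFILES = frozenset({"Ad-Server-Content", "Ad-Banner"})
URL_PATTERN = re.compile(r"/.*")
REDIRECT_TARGET = "http://safesquid.cfg/template/replace-ad-banner"


def redirect_for(url: str, profiles: Iterable[str]) -> Optional[str]:
    """Return the redirect target if the rule applies to the request, else None.

    Assumptions: one matching profile is enough, and the URL expression
    may match anywhere in the requested URL.
    """
    if not REDIRECT_PROFILES.intersection(profiles):
        return None  # request carries none of the listed profiles
    if URL_PATTERN.search(url) is None:
        return None  # URL does not match the rule's URL expression
    return REDIRECT_TARGET


if __name__ == "__main__":
    print(redirect_for(
        "http://ads.indiatimes.com/ads.dll/genptypead?slotid=1942",
        {"Ad-Server-Content"},
    ))
    print(redirect_for("http://www.safesquid.com/html", set()))
```

Since the URL expression /.* matches any URL containing a slash, it is the profiles that do the real selection here.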

You can also verify which URLs are being redirected by checking the SafeSquid logs. Click on View log entries in the Top Menu of the interface. You will see a lot of entries. To show only the entries for URL redirecting, type redirect in the Regular expression match field, and click on Submit. This will list entries similar to this:
2008 06 06 13:06:45 [19] redirect: request for http://ads.indiatimes.com/ads.dll/genptypead?slotid=1942 to http://safesquid.cfg/template/replace-ad-banner
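If you save the log entries to a file, the same filtering can be done offline. The sketch below relies only on the layout of the sample entry above ("redirect: request for <URL> to <URL>"); other entries may look different, so treat the pattern as an assumption. The second item in LOG_LINES is a placeholder, not real SafeSquid output, and the function name redirected_urls is made up for the illustration.

```python
import re
from typing import Iterable, List, Tuple

LOG_LINES = [
    # Sample entry quoted above.
    "2008 06 06 13:06:45 [19] redirect: request for "
    "http://ads.indiatimes.com/ads.dll/genptypead?slotid=1942 "
    "to http://safesquid.cfg/template/replace-ad-banner",
    # Placeholder standing in for any unrelated entry.
    "(any other log entry)",
]

# Assumed layout, based only on the sample entry.
REDIRECT_ENTRY = re.compile(
    r"redirect: request for (?P<source>\S+) to (?P<target>\S+)"
)


def redirected_urls(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (requested URL, redirect target) pairs for redirect entries."""
    pairs = []
    for line in lines:
        match = REDIRECT_ENTRY.search(line)
        if match:
            pairs.append((match.group("source"), match.group("target")))
    return pairs


if __name__ == "__main__":
    for source, target in redirected_urls(LOG_LINES):
        print(source, "->", target)
```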
