Surfer Protection Program's
Technical Manual
December 1, 2000
Version 1.0
Intelligent Software Modeling, Inc.
Copyright © 2000 Intelligent Software Modeling, Inc. All Rights Reserved.
Table of Contents
- Introduction
- Purpose
- Acknowledgements
- Overview
- Detailed Product Description
- Product Performance Features
- Product Privacy Features
- The Configuration File
- The file syntax
- The configuration options
- PORT
- Accept_Mask_Connection
- Max_Thread_Pool_Size
- Default_Proxy
- Cntrl_Msg_Suffix
- History
- Block_URI_Image
- File_Blocked_URLs, File_not_Blocked_URLs, and File_Header_Filters
- Site Blocking and Privacy Files
- Site Blocking Files
- Privacy (Header Filter) Files
- The User Interface
- Full Copyright Statement
In early 2000, the FCC gave its permission to advertisers to
create a web tracking database and tie it to other databases
containing specific people's names and addresses, thereby making
it possible for advertisers to associate individuals' surfing
activities with their identities. The FCC is leaving it to the
advertising industry to police itself. Currently, the industry
offers the option to "opt-out" of their web tracking
database. This means that, by default, companies are allowed to
gather and use information about you until you expressly tell
them to stop doing so. In many cases, the process of telling them
to stop is not clearly posted or easy to accomplish successfully.
Rather than having to "opt-out," we feel users
should be able to selectively "opt-in," and actively
decide how much, if any, of their
individual Internet browsing habits should be documented. We
built Surfer Protection Program to allow users to control what
personal information is revealed as they browse the Internet.
We strongly recommend reading the User Manual first. It provides a good introductory overview of the features of Surfer Protection Program. This document describes the format and structure of the startup files.
This product includes software developed by the University of
California, Berkeley and its contributors. Specifically, a POSIX
compliant regular expression package written by Henry Spencer is
used.
We would like to thank George
J. Carrette who maintains a web site where, among other
things, he offers his port to the WIN32 environment of Henry
Spencer's regular expression package. (George also provides his
port of the Free Software Foundation's regular expressions
package.)
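This document describes Surfer Protection Program's startup
files. The document is broken into the following sections:
the Detailed Product Description, The Configuration File,
Site Blocking and Privacy Files, and The User Interface.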
The Surfer Protection Program is an HTTP/1.1 proxy that
intercepts, rejects or modifies all requests sent by the browser.
To use Surfer Protection Program, your browser must let you
direct its requests to an HTTP proxy. It is very
likely that your browser can be configured to talk to a proxy,
unless you are using a free ISP. Most free ISPs, e.g.,
Juno, Excite's FreeLane, AltaVista Free Access, and Freeserve,
override the browser's proxy configuration and will not work
with Surfer Protection Program.
Using the Surfer Protection Program, you can:
- on a site by site basis, modify the HTTP request from the
  browser to the web server to
  - modify/delete the following 8 fields: Accept,
    Accept-Charset, Accept-Encoding, Accept-Language,
    Cookie, From, Referer (sic), User-Agent.
  - delete the Via field.
- on a site by site basis, modify the HTTP response from the
  web server to the browser to
  - delete the Set-Cookie field.
- block designated sites (use a set of block-site rules and
  permit-site rules).
- on a site by site basis, forward the request to a
  designated HTTP proxy.
- review the history of message requests and responses and quickly
  see the requests that were blocked, and the individual
  fields that were deleted or modified.
- override any blocked request to see what the request
  will return.
The Surfer Protection Program is designed to speed up the page downloading process:
- Block designated sites - some sites are unrelated to the
main body of the page, e.g. ads, sponsors, affiliates,
etc.
- Implement the HTTP/1.1 protocol - persistent connections.
Also implements a pool of connections for the requested web
servers that will remain open for a few seconds in case
the browser is not HTTP/1.1 compliant or chooses to use a
few threads to fill a page.
- Domain Name caching with periodic refreshing.
- Multi-threaded design (lets the computer handle more than one request at a time).
This file defines several options that affect Surfer
Protection Program's behavior. These options are read during
startup and whenever the Reset interface is invoked. All options
will reset, except for the Port option. The program must
be shut down and restarted to reset the Port option.
The file is composed of a set of lines. A line may be blank or
contain a comment which is delimited with a '#'. Multiple spaces
are ignored.
The option names are not case-sensitive. Each option name must be followed by a colon in the file. Following the colon is the
option's value, one or more blanks, or a comment. Some options have
default values assigned to them. A default value is used if the
option is deleted or commented out of the configuration file.
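The syntax rules above can be illustrated with a small reader. This is only a sketch of the rules as described in this section, not the program's actual parser. The function name parse_config and the DEFAULTS table are our own names, and we assume that an option present with a blank value is stored as an empty string.

```python
# Default values taken from the option descriptions in this manual.
DEFAULTS = {
    "port": "8284",
    "accept_mask_connection": "127.0.0.1",
    "max_thread_pool_size": "6",
    "history": "5",
}


def parse_config(text):
    """Read 'Name: value' lines; '#' starts a comment; names ignore case."""
    options = dict(DEFAULTS)
    for raw_line in text.splitlines():
        # Drop the comment part, then leading and trailing blanks.
        line = raw_line.split("#", 1)[0].strip()
        if ":" not in line:
            continue  # blank line or comment-only line
        name, value = line.split(":", 1)
        # Option names are not case-sensitive; multiple spaces are ignored.
        options[name.strip().lower()] = " ".join(value.split())
    return options


sample = """
# sample configuration
PORT: 8300        # the browser must use this port
history:   0
Default_Proxy:    # no value defined
# Max_Thread_Pool_Size: 12
"""
config = parse_config(sample)
```

In this sample, Max_Thread_Pool_Size is commented out, so the default of 6 is used, while Default_Proxy is present with no value.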
The configuration options are:
- PORT
- The browser must be given this value to communicate with the
proxy.
- Initial configuration file's value: 8284
- Default value: 8284
- Legal values: Any number between 5000 and 10000 may be used.
- Accept_Mask_Connection
- This is a mask, a validation check, that is applied to every connection attempt, that is, every attempt by another program to talk with Surfer Protection Program.
With the value set to "127.0.0.1", only requests from browsers on the same machine as the one running Surfer Protection Program will be accepted. You should not change this value unless you are making Surfer Protection Program a proxy server for multiple clients on a network.
- Initial configuration file's value: 127.0.0.1
- Default value: 127.0.0.1
- Legal values: any legal Internet address mask. Do not
change this value unless you know what you are doing.
- Max_Thread_Pool_Size
- This value defines the number of threads Surfer Protection Program is permitted to use. The default works well with a 56k dial-up connection and a 266 MHz processor. Here are some guidelines if you do set it:
- If the machine's CPU and disk are working hard (a slow processor, a slow disk, etc.), then lower the thread count.
- If the machine's Internet connection is running at about full speed, then lower the thread count to stop unnecessary system thread switching.
- If the CPU and disk are idle, and the Internet connection has room, then raise the thread count.
- Initial configuration file's value: 6
- Default value: 6
- Legal values: 1-30 times the number of CPUs.
- Default_Proxy
- This is a proxy address of the form <host>:<port>. When Surfer Protection Program sends a request, it will use the first address found in the following:
- a proxy value from a site privacy rule that matches the address. See Site Blocking and Privacy Files.
- the default proxy value, defined by this, Default_Proxy, option.
- the address of the web server that controls the web page.
- Initial configuration file's value: (There is no value defined.)
- Default value: (There is no value defined.)
- Legal values: Any valid hostname:port of an HTTP proxy.
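The lookup order described for Default_Proxy can be sketched as follows. This is an illustration only. The function name choose_next_hop and its parameters are hypothetical, and we assume port 80 when the URI names no port.

```python
def choose_next_hop(uri, rule_proxy=None, default_proxy=None):
    """Return the 'host:port' the request would be sent to."""
    # 1. A proxy value from a matching site privacy rule comes first.
    if rule_proxy:
        return rule_proxy
    # 2. Otherwise use the Default_Proxy option, if it has a value.
    if default_proxy:
        return default_proxy
    # 3. Otherwise contact the web server named in the URI directly.
    target = uri[len("http://"):] if uri.lower().startswith("http://") else uri
    host = target.split("/", 1)[0]
    return host if ":" in host else host + ":80"  # assumed default HTTP port


via_rule = choose_next_hop("http://example.com/a", "127.0.0.1:8000", "proxy.example:3128")
via_default = choose_next_hop("http://example.com/a", None, "proxy.example:3128")
direct = choose_next_hop("http://example.com/a")
```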
- Cntrl_Msg_Suffix
- This option is used only if Surfer Protection Program's
control message syntax interferes with a web page's URI.
The control message syntax is "++" by default. If we set the value of this option to "spp" then the control message syntax would be "++spp".
- Initial configuration file's value: (There is no value defined.)
- Default value: (There is no value defined.)
- Legal values: any word using letters, numbers, symbols on a
keyboard. This value is case sensitive.
- History
- This option indicates the total number of page requests that
will be retained in the archive for the user to review. The sub-requests
that compose a page are not counted.
The algorithm that categorizes a request as a parent page or a sub-page is a trade-off between the usefulness of the categorization and the speed of making it. It uses only the fields of the message to make this determination. At times, the algorithm is inaccurate. It may categorize a sub-page request as a page request, but it will never categorize a page request as a sub-page request.
- Initial configuration file's value: 5
- Default value: 5
- Legal values: any number between 0 and 200 inclusive.
- To maximize proxy speed, set this value to 0.
- Block_URI_Image
- The value of this option is an image that will be returned
when a request is blocked. Use any image you'd like. Very small
images that scale without distortion work best. If there is no
value for this option, then a blocked page response in text
format is returned. Most browsers will ignore the text response
when an image was expected.
[Transparent.gif | spp32pix.gif ]
- Initial configuration file's value: spp32pix.gif
- Default value: no value is defined for the default.
- Legal values: any image file, Transparent.gif and spp32pix.gif are provided with the program.
- File_Blocked_URLs,
File_not_Blocked_URLs, and File_Header_Filters
- These three options define files that contain rules for
blocking, never blocking, and privacy customizations,
respectively. The value of each option is a list of files
that contain the rules. Full paths or paths relative to the
install directory may be used. These options let the user sort the
patterns into different files (by type, by author, etc.)
- The configuration file comes with two files defined for each
option. The first file listed for each option starts with the
prefix "my-". These files are empty. As you use Surfer
Protection Program's user interface to add filters, the
newly added filters will go into the appropriate file starting
with a "my-" prefix.
- Initial configuration file's value:
- for File_Blocked_URLs:
- rules/my-sites-blocked.ini
- rules/sites-blocked.ini
- for File_not_Blocked_URLs:
- rules/my-sites-allowed.ini
- rules/sites-allowed.ini
- for File_Header_Filters:
- rules/my-msg-filters.ini
- rules/msg-filters.ini
- The files with Header Filter Rules are order dependent.
If more than one rule matches a URI, the last rule in the file
will be applied. This lets you define the more general patterns
at the top of the file and the more specific patterns at the
bottom. For multiple files, the first file listed is defined as
the most specific pattern and the last file listed is the most
general pattern. Anything you define using Surfer Protection
Program's user interface gets added to the my-msg-filters.ini
file and will be more specific than what is in the msg-filters.ini
file.
- Default value: no value is defined for the default for any of
the three options.
- Legal values: any properly formatted file defined in
Site Blocking and Privacy Files.
All three files use POSIX regular expressions for pattern
matching. For the Block Files and the Never Block Files, the only
content is a list of regular expressions.
The pattern matching algorithm skips over the "http://"
part of the URI. So, the very first character that can be matched
is the first letter after "http://".
The Never Block and the Block files contain a list of regular
expressions. When a URI that matches the regular expression is
requested, a Block-Flag is set or cleared as required by the file.
There is no order to the entries within a file.
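The blocking check described above might look like the following sketch. It is not the program's code. We assume that a Never Block match clears a flag set by a Block match, and that a pattern may match anywhere in the text after "http://". Python's re module stands in for the POSIX regular expression package, so only patterns valid in both are used. The names strip_scheme and is_blocked and the sample patterns are hypothetical.

```python
import re


def strip_scheme(uri):
    """Skip the leading 'http://' so matching starts at the next character."""
    return uri[len("http://"):] if uri.lower().startswith("http://") else uri


def is_blocked(uri, block_patterns, never_block_patterns):
    target = strip_scheme(uri)
    # Set the Block-Flag if any Block pattern matches.
    block_flag = any(re.search(p, target) for p in block_patterns)
    # Clear it again if any Never Block pattern matches (assumed precedence).
    if any(re.search(p, target) for p in never_block_patterns):
        block_flag = False
    return block_flag


# Hypothetical rule lists, one regular expression per entry.
blocks = [r"^ads\.", r"/banners/"]
never = [r"^ads\.example\.org/"]
```

For example, is_blocked("http://ads.example.com/x.gif", blocks, never) is true, while the same call for "http://ads.example.org/logo.gif" is false because the Never Block pattern matches.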
The files with Header Filter Rules are order dependent.
If more than one rule matches a URI, the last rule in the file
will be applied. This lets you define the more general patterns
at the top of the file and the more specific patterns at the
bottom. For multiple files, the first file listed is defined as
the most specific pattern and the last file listed is the most
general pattern. Anything you define that gets added to the my-msg-filters.ini
file will be more specific than what is in the msg-filters.ini
file.
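The ordering rules for Header Filter files can be sketched like this. It is one possible reading of the rules above, not the program's implementation: files are searched in the order they are listed, and within a file the last matching rule is used. We assume the first file that contains a match decides the result. The name select_rule and the sample rules are hypothetical, and Python's re module stands in for the POSIX package.

```python
import re


def select_rule(target, rule_files):
    """Pick the rule for 'target' (the URI text after 'http://').

    rule_files is a list of files in configuration order; each file is a
    list of (pattern, rule) pairs in the order they appear in that file.
    """
    for rules in rule_files:                    # first listed file is most specific
        for pattern, rule in reversed(rules):   # last matching rule in a file wins
            if re.search(pattern, target):
                return rule
    return None                                 # no rule applies


# Hypothetical contents of my-msg-filters.ini and msg-filters.ini.
my_rules = [(r"^www\.example\.com/private/", "my-private")]
shipped_rules = [(r".", "general"), (r"example\.com", "example")]
files = [my_rules, shipped_rules]
```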
The format for the header customization is a set of Entries.
Each entry consists of a set of tags and values. The tags with
their legal values are:
- URI: <any regular expression - ignore http://>
- To-Server-Block: <zero or more of: Accept Accept-Charset Accept-Encoding
Accept-Language Cookie From Referer Via >
- From-Server-Block: <none or: Set-Cookie>
- Use-Proxy: <host:port, that is host-colon-port,
an example is 127.0.0.1:8000>
- Referer: http://SurferProtectionProgram.com/ (an
example; see the next paragraph)
Additionally, for any value of the To-Server-Block tag
except for the Via value, you may use the value as a tag
and provide your own custom value for it. In the above example,
we show the Referer value being customized in this manner.
You should not need to customize these fields. If you do
customize a field, remember to remove the value from the To-Server-Block
entry. Otherwise, the block will win out over the custom value
definition.
The tags are not case sensitive; the values are case sensitive.
The URI tag is mandatory. This tells the proxy which
URI to apply the customizations to. A legal entry would be just
the URI tag with a regular expression. Such an entry lets any
request that matches the regular expression go through the proxy
without any modifications.
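The effect of an entry on a request can be sketched as follows. This is an illustration under stated assumptions, not the program's code. The function name apply_entry and its parameters are hypothetical, and we assume a custom value only replaces a field that the browser already sent.

```python
def apply_entry(headers, to_server_block=(), custom=None):
    """Return the request headers after applying one entry.

    headers:         dict of field name -> value sent by the browser
    to_server_block: field names listed in the To-Server-Block tag
    custom:          dict of field name -> custom value (e.g. Referer)
    """
    blocked = {name.lower() for name in to_server_block}
    custom = {name.lower(): value for name, value in (custom or {}).items()}
    result = {}
    for name, value in headers.items():
        key = name.lower()            # tags are not case sensitive
        if key in blocked:
            continue                  # the block wins over a custom value
        result[name] = custom.get(key, value)
    return result


request = {
    "Accept": "*/*",
    "Cookie": "id=1",
    "Referer": "http://a.test/",
    "User-Agent": "X",
}
custom_referer = {"Referer": "http://SurferProtectionProgram.com/"}
blocked_both = apply_entry(request, ["Cookie", "Referer"], custom_referer)
customized = apply_entry(request, ["Cookie"], custom_referer)
```

In the first call the Referer field is dropped even though a custom value is given, because Referer is still listed in To-Server-Block. In the second call the custom Referer value replaces the one sent by the browser.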
The user interface is described in the Quick Start document.
Copyright © Intelligent
Software Modeling, Inc. (2000). All Rights Reserved.
This document and translations of it may be copied and
furnished to others, and derivative works that comment on or
otherwise explain it or assist in its implementation may be
prepared, copied, published and distributed, in whole or in part,
without restriction of any kind, provided that the above
copyright notice and this paragraph are included on all such
copies and derivative works. However, this document itself may
not be modified in any way, such as by removing the copyright
notice.
The limited permissions granted above are perpetual and will
not be revoked by Intelligent
Software Modeling, Inc., or its successors or assigns.
This document and the information contained herein is provided
on an "as is" basis. Intelligent
Software Modeling, Inc., disclaims all warranties,
expressed or implied, including but not limited to any warranty
that the use of the information herein will not infringe any
rights or any implied warranties of merchantability or fitness
for a particular purpose.