Friday, January 1, 2010

What are Internet Filters?

(First in a series of five articles on Internet Filters)

Before exploring the Free Speech problems potentially surrounding the use of Internet Filters on library computers, it is important to explain a little about how they work.

The General Idea
Internet Filters are designed to keep computer users from displaying certain kinds of internet content.  Some filters are designed as parental controls, intended to keep children from accessing sexually explicit websites from a computer at home.  Some filtering programs are designed more for a corporate environment, intended to keep employees from using company computers to access sexually explicit websites, or from otherwise wasting valuable company time.  The Children's Internet Protection Act (CIPA), which mandates filters in school and public libraries that accept certain federal funds, has created a market for yet another variant on the same basic idea, one with an emphasis on a library's goals.  There are many filtering products available on the market, and while they are similar in general structure, no two are exactly alike.

How Filters Work
Internet Filters monitor every website a computer user attempts to access, whether the user makes the request deliberately or the computer generates it automatically (for example, when one webpage automatically connects to another).  In each instance, the filter program decides whether to allow or disallow access to the requested webpage.  While the details vary from program to program, filters generally make this decision based on three kinds of information:
  • Black Lists.  Lists of websites to which access is always denied. 
  • White Lists.  Lists of websites to which access is always permitted.
  • Text Algorithms.  Word patterns that indicate access should be denied.
Filters generally begin by checking to see if a requested webpage is on the Black List.  If the page is on the Black List, access is denied, and no further analysis is needed.  If the requested page is not on the Black List, the program can then check the White List.  If the page is on the White List, access is permitted without further analysis.  If the requested webpage is on neither list, the program must examine the text (words and characters) on the requested page, and must use the Text Algorithms to estimate whether access should be allowed or disallowed.  The Text Algorithm will look for certain words or phrases, their frequency, and their placement relative to each other, and might consider words in multiple languages.
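
The checking order described above can be sketched in a few lines of Python.  This is only an illustration under stated assumptions, not how any particular product works: the list entries, the flagged terms and their weights, the threshold, and the function names are all invented for the example.  The sketch also matches whole sites by host name, where a real filter might list individual pages as well, and its "Text Algorithm" only counts words, ignoring placement and language.

```python
from urllib.parse import urlparse

# Hypothetical example data. Real products ship proprietary lists
# that are updated regularly by the vendor.
BLACK_LIST = {"blocked.example"}
WHITE_LIST = {"library.example"}

# Made-up flagged terms and weights standing in for a Text Algorithm.
FLAGGED_TERMS = {"explicit": 3, "adult": 1}
BLOCK_THRESHOLD = 4  # made-up cutoff for denying access


def text_score(page_text):
    """Sum the weights of flagged terms, counting every occurrence."""
    words = page_text.lower().split()
    return sum(FLAGGED_TERMS.get(word, 0) for word in words)


def decide(url, page_text):
    """Return 'deny' or 'allow', checking Black List, White List, then text."""
    host = urlparse(url).hostname or ""
    if host in BLACK_LIST:
        return "deny"   # listed sites are denied without further analysis
    if host in WHITE_LIST:
        return "allow"  # listed sites are permitted without further analysis
    # On neither list: fall back to estimating from the page text.
    return "deny" if text_score(page_text) >= BLOCK_THRESHOLD else "allow"
```

With this made-up data, decide("http://unknown.example/", "a page about gardening") returns "allow", because the site is on neither list and the text contains no flagged terms.  Note that a White List entry is never scored at all, so the order of the checks matters as much as the lists themselves.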

Where do the Lists and Algorithms Come From?
Black Lists, White Lists, and Text Algorithms are the intellectual property of the company that produces a given Internet Filter program.  The company has employees who spend endless hours analyzing internet traffic and reviewing the content of the more commonly accessed websites.  The company trains employees to categorize websites according to criteria the company believes its customers want.  Since the content available on the web is always changing, these lists must keep changing as well.  The buyer of an Internet Filter program typically pays a fee to subscribe to regular updates to the company's Black Lists and White Lists.

Because the content available on the internet is vast, comprising billions of pages, and because it is constantly changing, no company can come close to reviewing every website.  For this reason, the producers of filtering programs also develop and maintain Text Algorithms that the filter will use to evaluate web pages not on the lists.  Employees study the words and phrases that appear on websites they've reviewed, and from that analysis they develop patterns and programming logic that can be applied to the automatic evaluation of pages that have not yet been reviewed by human beings. 
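
To make the idea of deriving word patterns from reviewed pages concrete, here is a deliberately crude Python sketch.  It is an assumption-laden illustration, not any vendor's method: the function name, the sample data, and the ratio cutoff are all hypothetical, and it automates with simple word counts what the paragraph above describes as human analysis.

```python
from collections import Counter


def learn_flagged_terms(reviewed_pages, min_ratio=2.0):
    """Pick words that are much more common on blocked pages.

    reviewed_pages is a list of (page_text, was_blocked) pairs, standing in
    for pages that human reviewers have already categorized.
    """
    blocked_counts = Counter()
    allowed_counts = Counter()
    for text, was_blocked in reviewed_pages:
        target = blocked_counts if was_blocked else allowed_counts
        target.update(text.lower().split())
    # Adding 1 to the allowed count avoids dividing by zero for words
    # that never appear on an allowed page.
    return {
        word
        for word, count in blocked_counts.items()
        if count / (allowed_counts[word] + 1) >= min_ratio
    }


# Hypothetical reviewed pages, invented for the example.
reviewed = [
    ("explicit explicit content", True),
    ("explicit photos", True),
    ("library content hours", False),
]
flagged = learn_flagged_terms(reviewed)
```

Even in this toy version, the result depends entirely on which pages were reviewed and how they were categorized, which is one reason different products can reach different conclusions about the same unreviewed page.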

While different filtering products may agree with each other on allowing or disallowing access to many websites, they don't agree on everything.  Each company's criteria of acceptability, its record of which sites it has or has not reviewed, and its Text Algorithms are proprietary and often kept private or even secret.  They are far from identical.  It is a certainty, then, that some websites disallowed by some filtering products will be allowed by others.

Companies producing filtering programs compete with each other for customers, at least in part on the basis of the customers' perceptions of the effectiveness of each product.  There is room in the market for products with different emphases, because different users have different sensitivities as to what they think should be allowed or disallowed.  A corporation seeking to control employee use of the internet has objectives different from those of parents trying to protect their children at home, and different parents have different ideas about what they want their children to be allowed to access or prevented from accessing.

Private Choices and Public Policy
A critical issue that will be expanded upon in subsequent articles in this series is that most Internet Filters are designed for use in private settings.  The development of Internet Filters is a business in which companies design products to please their buyers, and most of the buyers are private individuals or private corporations.

In a private setting, few constitutional issues arise.  At home, a parent has a right to implement any restrictions he or she wishes.  Free Speech issues are more relevant in a corporate setting, but even that is essentially a private matter, since the corporation owns the computers that employees use, pays for the internet connection, and pays employees to perform specific tasks that don't usually require general access to the entire web. 

A very different set of legal issues applies, though, when the internet connection is paid for with tax dollars and is made available to the public within a government agency like a public library.  In such a setting the First Amendment applies, and that means that both adults and minors have Free Speech rights to receive information.  Filters that are perfectly legal in a private setting can easily infringe on the rights of adults in a public setting, and might even infringe on the somewhat narrower rights of minors.

Future articles in this series:
Internet Filters Underblock and Overblock
Can Internet Filters Identify Obscene Images?
Internet Filters and Email, Chats, and Attachments
Internet Filters: The Constitutional Headache
