6.805/STS085: Click "OK" to Send Your Personal Information

Lucy Borodavkina

Paper for MIT 6.805/STS085: Ethics and Law on the Electronic Frontier, Fall 1999

1. Executive Summary

As the use of the Internet grows, more and more computers are online 24 hours a day. Any program running on such a networked computer can access the net using conventional communication protocols, with no special intervention from the user. That, together with the advent of "smart" software such as email readers that can display graphics and HTML, worries privacy advocates, because nothing stops these programs or email readers from transmitting users' personal information to remote sites without the user ever knowing it.

There are many ways for a company to collect information about its consumers. It can go the direct route and ask people to fill out questionnaires. Or it might require certain data in exchange for its services. But perhaps the worst of all is the kind of data it can collect without consumers' knowledge or intervention (in fact, in some ways it is impossible for a person to keep that data secret with the current setup). For example, a company might send an email with hidden HTML tags[1], such that if one reads it in HTML-capable software (which most email software is these days), some site gets a "hit" and the administrators know that a particular user has read her email, without her even suspecting it. Or it might build communication channels into its software. For example, a program that plays someone's music might be transmitting her play-list back to its maker. Scary possibilities? Not at all -- it is a scary reality. Moreover, companies like Barnes and Noble and Amazon.com freely admit to resorting to email tricks[2], arguing that they are doing it for consumers' benefit.

True, there is some benefit to the consumer from companies collecting aggregate data. For example, a software maker might want to know how their software is used so that they can speed up the most common bottlenecks. Or a company might reconsider discontinuing tech support for an old version of their software if their reports indicate that a large number of people are still using it.

So what is more important -- consumer privacy or consumer benefits? In fact, no absolute answer exists. Instead, each consumer should be given a choice. Rather than secretly transmitting information, a program should first clearly specify which information it collects and ask the user for permission to transmit it (this can be done on a one-time basis). To ensure that software makers take this route, there needs to be legislation covering the various aspects of background and secret data collection using such technology.

This paper argues for privacy legislation that would protect consumers from secret data collection through software means. Furthermore, it outlines the general structure for such legislation, addressing the points relevant to data collection where the user's input is not technologically required (for the purposes of this paper, I will consider only this kind of data collection, because addressing all the issues raised by all kinds of data collection is outside my scope). The proposed legislation takes into consideration the fact that information collection may be beneficial to the consumer, and so suggests the safeguards that must be put in place to prevent the abuse of data collection. The title ("Click 'OK' to Send Your Personal Information") is not satire -- instead, it is the goal of the suggested legislation: that user input be required before any private information is transmitted to the collection point.

The main principles behind the proposed legislation are the same as the widely accepted principles concerning fair information practices[3]:

The right of individuals to know what information is being collected about them.
The right to know how that information is being used.
The right to take legal action if it is misused.

But the legislation needs to go even further. In addition to the principles listed above, it needs to include the following:

The right of individuals to refuse the collection of private information.
The right to access collected personal data and modify it in case any inaccuracies are present.
The agencies collecting private information should be prohibited from using it for any purpose other than those specified at the time of collection without explicit permission from the individuals involved.

Only with all of the above safeguards in place can a person be assured that her personal data will not be wrongfully collected through stealth communications or misused by those who collect it.

By making Internet communications safer for individuals, we will encourage people to participate in Internet transactions, thus furthering Internet commerce and online communication. Being able to retain control over personal data in the age of digital communication should not be an exceptional privilege; it should be the rule. Until legislation concerning data collection is implemented, the situation is unlikely to change, because companies have a lot of inertia and no incentive to change their data collection policies.

2. Introduction: privacy in the digital age

So you got your new computer and it came with the "two years of unlimited Internet access" deal; or, maybe, you just moved to school where your dorm room is wired for Ethernet and you can leave your computer connected all you want; or, perhaps, your area is now covered by a cable modem provider, and you are on your way to speedy access for as long as you want at one monthly fee; or, maybe, you are a first-time Internet user who decided to try that "big net" you've been hearing about after receiving one of the free AOL CDs, which gives you 200 free hours and then unlimited access for one monthly fee. Whichever the case might be, you are feeling pretty good -- you can access any site you want, you can spend hours browsing quirky or educational web sites, you can download new programs, purchase your holiday presents, talk with other "internetters" via email or chat, or you can even do your taxes over the net or, instead, sit there idly twiddling your thumbs while watching a streaming video coming from some site abroad. In fact, you are starting to notice that, as you learn about the possibilities, the net seems to learn about you as well -- after you purchased that set of screwdrivers at ShopOnline.com, a lot of banner ads for power tools started to appear on seemingly unrelated web sites; and, after you spent quite a while exploring the Nokia web site, you are suddenly bombarded with cell phone advertisements. Of course, sometimes the correlation is not obvious -- those banners with Julia Child's face that you commonly see on different sites are quite a surprise to you -- you've never been interested in cooking, and the only time you read anything about it was when you received an email from Barnesandnoble.com about a discount on cookbooks[4].

Are you really being watched or are you imagining things? No, you are not a victim of an overactive imagination -- in fact, almost anything you do on the Internet can be, and is, monitored and recorded in various databases, which might contain not only your shopping history, but also your click-through history, your preferences, your email address, and your private data, such as your Social Security Number and information about your bank accounts.

Assume that, being tired of the constant "spying," you decide to stick to not giving out any personal information online. Even if you stick to only using the programs on your desktop, you are not any safer -- as long as your computer is connected to the Internet, any program running on your desktop can contact any machine on the Internet and transmit any information it wants. In fact, they often do so. And, currently, such transmissions are completely legal -- writers of the software are under no obligation to inform the users that the information is being collected and transmitted. In fact, a lot of them claim that it's done for users' benefit -- after all, there is no reason to bother the users when all the collection can be done without their involvement.

Is this the kind of interconnected world you want to live in? A world where the programs on your computer are more dangerous than the hackers trying to break in from outside, where anything you do can be watched without your having the slightest suspicion that your programs are "bugged", and where tomorrow your "clickstream data" -- all the web pages you looked at -- might turn up as public information? If not, read on: this paper argues for the need for legislation to regulate stealth data collection and proposes guidelines for such regulation, modeled after the European Data Directive[5].

2.1 History of Privacy legislation (The Privacy Act[6], etc)

Privacy rights, although not explicitly stated in the Constitution, remain an issue of significant concern to American citizens, and privacy rights are something that Americans have come to rely on[6].

In 1890 Louis D. Brandeis and Samuel D. Warren published their historic article "The Right to Privacy" in the Harvard Law Review, arguing that individuals have "the right to be let alone." That was the beginning of the articulated legal theory on privacy rights.

In the real world, there are some protections that make privacy easier to achieve. For example, it is impossible for someone to follow a person around a shop, recording her every move, without her noticing. But in the online world, where such information collection is accepted and standard, it happens all the time.

Users often rely on implied privacy because of their expectations of how things work in the real world. For example, because we expect privacy from the US Postal Service, people expect privacy in email as well. Indeed, the Electronic Communications Privacy Act was passed specifically to apply privacy protection to traffic in public messaging systems[7]. Unfortunately, this carryover in expectations is most often not justified in the context of cyberspace -- the possibilities are different there, and companies often take advantage of these new possibilities to pry into consumers' private information.

3. Persistent computing and privacy violations

As the Internet becomes more pervasive, and a web browser is no longer a program a user might fire up once in a while but something people rely on all the time[8], more and more people want their computers connected to the Internet around the clock. As cable modems[9], high-speed DSL lines, and Internet service providers that offer unlimited usage grow ever more popular, we can expect the trend to continue -- towards an increase in the number of computers that are online all the time, thus allowing any program running on them to access the net at any point.

Companies or software creators that want to find out how their programs or services are being used can go about it in two ways. The first is to ask the users for direct input, by making them fill out a form or provide information in some other way. Although those questions can sometimes be unnecessarily invasive[10], and although it is not uncommon for a company to refuse to provide a particular service unless the user discloses certain information, at least with direct questioning a user knows exactly what information they are giving out, and, should they wish to do so, they can contact the company and complain about its data collection policies or inquire what the collected data is going to be used for.

Far more insidious is the second way: information collection through "hidden" channels -- where no direct user input is required and, in the majority of cases, the user is not even aware that she is being monitored. These covert channels can be as simple as the information that a web browser routinely sends out with each request, such as the originating computer's IP address, the platform it is running on, and even the user's email address if one is entered, or as complicated as desktop software using special protocols layered on top of standard TCP/IP to communicate with the originating server and report information about the users and the way they use their desktop software.
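To make the simplest of these channels concrete, here is a minimal sketch of what a server can read off a single web request. The request text, the addresses, and the function names are all hypothetical, made up for illustration; the From header is part of HTTP, but whether a given browser actually sends it is an assumption.

```python
# Hypothetical request, roughly what a 1999-era browser might send.
RAW_REQUEST = (
    "GET /catalog/tools.html HTTP/1.0\r\n"
    "User-Agent: Mozilla/4.7 [en] (WinNT; U)\r\n"
    "Referer: http://www.example.com/search?q=screwdrivers\r\n"
    "From: reader@example.com\r\n"
    "\r\n"
)


def parse_headers(header_block):
    """Turn 'Name: value' lines into a dict keyed by lower-case name."""
    headers = {}
    for line in header_block.split("\r\n"):
        if not line:
            break  # a blank line ends the header section
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers


def disclosed_fields(raw_request, client_ip):
    """List what the server learns from one request, with no user input."""
    request_line, _, header_block = raw_request.partition("\r\n")
    _method, path, _version = request_line.split(" ")
    headers = parse_headers(header_block)
    return {
        "ip_address": client_ip,  # taken from the connection itself
        "requested_page": path,
        "platform": headers.get("user-agent"),
        "previous_page": headers.get("referer"),
        "email": headers.get("from"),  # present only if the browser sends it
    }


if __name__ == "__main__":
    for field, value in disclosed_fields(RAW_REQUEST, "192.0.2.10").items():
        print(f"{field}: {value}")
```

None of these fields requires the user to type or click anything beyond following a link.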

3.1 Hidden information channels

There are several standard clandestine ways of collecting the information. Among them are web "cookies," hidden HTML tags (both in web pages and email) and desktop software communicating with some servers by using standard protocols. And all of the above methods can be used in conjunction with each other to gather even more data.

3.1.1 Cookies and Hidden HTML tags

Cookies were originally designed by Netscape[11] so that the server could place some information on the user's computer that can be retrieved in later sessions to facilitate "user recognition." They can be used in e-commerce to store users' "shopping baskets", to personalize web search engines and web portals[12], to allow users to participate in web contests, and generally to do anything that requires repeated user recognition. In most cases, not only does the storage of the personal information go unnoticed, so does access to it[13].

It is thus possible for users to be tracked without their knowledge -- if every site they click on contains DoubleClick ads[14] that read the same cookie, DoubleClick can record all of a user's clicks, as well as make sure that the ads are tailored to that particular user.
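The mechanism can be sketched in a few lines. This is a toy model under stated assumptions, not a description of how DoubleClick's servers actually work: the class names, the "ads.example" host, and the site URLs are all invented for illustration.

```python
import itertools


class AdServer:
    """Hypothetical ad network that serves banners on many unrelated sites."""

    def __init__(self):
        self._next_id = itertools.count(1)
        self.profiles = {}  # cookie value -> pages on which an ad was shown

    def serve_ad(self, page_url, cookie=None):
        """Log the page that embedded the ad; issue a cookie on first contact."""
        if cookie is None:
            cookie = f"id-{next(self._next_id)}"
        self.profiles.setdefault(cookie, []).append(page_url)
        return cookie


class Browser:
    """Keeps one cookie per server and returns it on every later request."""

    def __init__(self):
        self.cookie_jar = {}

    def visit(self, page_url, ad_server, ad_host="ads.example"):
        cookie = self.cookie_jar.get(ad_host)
        self.cookie_jar[ad_host] = ad_server.serve_ad(page_url, cookie)


if __name__ == "__main__":
    network = AdServer()
    browser = Browser()
    for page in ("http://shop.example/screwdrivers",
                 "http://news.example/today",
                 "http://phones.example/models"):
        browser.visit(page, network)
    print(network.profiles)
```

Because the browser returns the same cookie to the ad server no matter which site embedded the ad, three visits to three unrelated sites end up in a single profile.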

On the latest versions of the Netscape and Internet Explorer web browsers, it is possible to turn off cookies altogether. However, some sites will not give any useful information without a cookie. In fact, any site that has pages based on Microsoft ASP technology[15] will be nothing but a collection of broken links unless cookies are turned on. Moreover, with the current browsers, it is not possible to turn cookies on only for several sites (that, perhaps, have a clear privacy policy statement and do not associate with DoubleClick or the like), while leaving them off for everyone else.

The one other cookie setting (besides "on" and "off") is "Warn me before accepting a cookie."[16] However, the warnings that one gets fail the notification clause that we specified in our requirements -- instead of giving the user a clear idea of what kind of information a cookie contains and what things the server will be able to gather, the standard notification reads as follows[17]:

The server photo.net 
wishes to set a cookie that
will be sent only back to itself 
The name and value of the cookie are: 
last_visit=944894101

This cookie will persist until Thu Dec. 31 20:00:00 2009

As you can see, this is not very informative. Perhaps, the clearest information it gives is about the expiration date. But, in fact, this is one of the more informative cookie warnings around -- usually even the name of the cookie is not something that can be parsed as plain English.
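Even this cookie tells more than its warning suggests. The sketch below assumes, purely as a guess that the warning itself does not confirm, that the value of last_visit is a Unix timestamp in seconds; the function name is made up for illustration.

```python
from datetime import datetime, timezone


def decode_cookie(pair):
    """Split 'name=value' and read the value as seconds since 1 Jan 1970 UTC."""
    name, _, value = pair.partition("=")
    # Assumption: the value is a Unix timestamp; the warning does not say so.
    return name, datetime.fromtimestamp(int(value), tz=timezone.utc)


if __name__ == "__main__":
    name, when = decode_cookie("last_visit=944894101")
    print(name, when.isoformat())
```

Under that assumption the value decodes to 11 December 1999, 06:35:01 UTC, so the server is recording exactly when the user last came by, a fact the warning dialog never spells out.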

Sometimes users think that they can detect pages that set cookies from such "click-watching" companies as DoubleClick by spotting the ad banners from DoubleClick and its associates. However, that is not at all the case. Often a cookie from a different server is hiding behind a clandestine HTML tag -- an image tag that loads a graphic sized one pixel by one pixel, which users cannot detect but which the browser still loads. These hidden images have long been used to gather site visitor information, and now they are acquiring even more insidious spy functions.
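Although such images are invisible on screen, they are easy to spot in the page source. The following is a rough sketch of a detector, assuming the hidden image declares width and height of 1 and is loaded from a different host than the page; the class name and the sample page are hypothetical, and a real tracker could evade so simple a check.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Hypothetical page served from www.example.com.
PAGE = """
<html><body>
<img src="/logo.gif" width="120" height="60">
<img src="http://www.mybannerads.com/sync.gif" width="1" height="1">
</body></html>
"""


class HiddenImageFinder(HTMLParser):
    """Collects one-pixel images that are fetched from another server."""

    def __init__(self, page_host):
        super().__init__()
        self.page_host = page_host
        self.suspects = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr = dict(attrs)
        tiny = attr.get("width") == "1" and attr.get("height") == "1"
        host = urlparse(attr.get("src") or "").hostname
        third_party = host is not None and host != self.page_host
        if tiny and third_party:
            self.suspects.append(attr["src"])


if __name__ == "__main__":
    finder = HiddenImageFinder("www.example.com")
    finder.feed(PAGE)
    print(finder.suspects)
```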

3.1.2 Hidden mark-up tags in email

Over the years, email readers have gotten "smarter" -- instead of dealing with plain text only, they now handle images, text rich with color, pictures, and even animation; they load URLs and parse HTML; they function as much more than simple email programs. The result of these improvements? Not only an increase in email volume but also the large privacy holes that email now contains.

One of the favorite tricks of email marketers is to place hidden HTML tags in the email they send out. That way they can monitor exactly when the email has been read by a particular user, what kind of email reading software that user has and whether it is graphics-capable, and other information that the browser might send out; and use that data to directly target the customers in their next advertisement.

The basic technique is for an HTML message to include a graphic that is loaded from a Web server belonging to a banner ad company. This image is specified using a standard HTML IMG tag. For example, this IMG tag will fetch a graphic named "SYNC.GIF" from a Web server belonging to MyBannerAds.com (a fictitious company):

<img src="http://www.mybannerads.com/sync.gif">

The tag can appear anywhere in the message, and the graphics file, SYNC.GIF, will be fetched and displayed when the email message is read.

In addition, if cookies are enabled in the Web browser and a cookie is present on the computer for MyBannerAds.com, the cookie will be sent to the server[18]. This is very disturbing indeed: now we have a direct link between the email address and the information that a cookie can provide. Therefore, the cookie is no longer "un-identifiable" -- now it is most definitely personal information that is transferred to the server.

Moreover, often these graphics are hidden by the same technique as hidden graphics in web pages -- they are made to be so tiny as to be unnoticed by the reader. And so the person reading the email never even suspects that her computer is secretly communicating with some server, transferring along the information about her preferences.
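How the link between an email address and a cookie might be made on the server side can be sketched as follows. Everything specific here is an assumption for illustration: the per-recipient "rcpt" token in the image URL, the cookie named "id", and the mailing list contents are invented, and real marketers may use a different scheme.

```python
from urllib.parse import parse_qs, urlparse

# Hypothetical mailing list: the token placed in each recipient's image URL.
MAILING_LIST = {"a91f": "reader@example.com"}


def parse_cookie_header(header):
    """Split 'name=value; name2=value2' into a dict."""
    cookies = {}
    for item in header.split(";"):
        name, sep, value = item.strip().partition("=")
        if sep:
            cookies[name] = value
    return cookies


def link_identity(image_url, cookie_header, mailing_list):
    """Pair the email address behind a token with the browsing cookie."""
    query = parse_qs(urlparse(image_url).query)
    token = query.get("rcpt", [None])[0]
    cookies = parse_cookie_header(cookie_header)
    return mailing_list.get(token), cookies.get("id")


if __name__ == "__main__":
    request_url = "http://www.mybannerads.com/sync.gif?rcpt=a91f"
    print(link_identity(request_url, "id=id-1; lang=en", MAILING_LIST))
```

One image request is enough: from then on, every page view logged under that cookie can be filed under a real email address.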

Do you think that it is only the unsolicited marketers who engage in such practices? Think again! Quite respectable companies use these tactics and are not at all ashamed of them.

Barnesandnoble.com, the online branch of the famous book retailer, sends out email that contains "invisible" image tags within the email which are used to inform the sender as to when, and how, that e-mail was read (if the email is read within an html-compatible reader).

When contacted by Lauren Weinstein, a PRIVACY forum moderator and a staunch supporter of online privacy, Barnesandnoble.com did not deny the allegation[19]. Instead, the representative admitted that they had been testing the use of these "sniffers", and felt that the use was completely legitimate, among other reasons, because "everybody did it." She said that the tags were used to determine if consumers were using HTML-capable email software, and, if so, they would start sending them graphics-rich and brand-building email. When asked why they assumed that anybody who has an email reader capable of HTML would like to receive graphics-rich email and why the company had not simply asked the users, the representative felt that the users could not be relied upon to "know" whether or not their software had the capability. In addition, the representative pointed out that Amazon.com, their main competitor, used the same strategy and that it was "standard industry practice."

While it might very well be a "standard industry practice," that only implies that others are violating consumer privacy too; it is not an excuse to do so. "But everybody is doing it" should not be an excuse for harmful and invasive actions, which is why we need legislation to protect us from such trend-setters.

3.2 Program-to-web site communication channels

It is becoming increasingly popular for software packages that seem to have nothing to do with the Internet to establish a clandestine link back to servers to pass along a variety of information or to establish hidden control channels[20]. Companies often advertise add-ons that make software remote-controlled. This is especially popular for "demo" packages -- the ones that you can download and run on your computer, but every time you do, information is sent back to the server, and you get reminders to "Buy now." Moreover, lately even installers contain nothing but a shell capable of contacting a server, which then provides the actual package[21]. This is billed as a benefit to consumers because the server can then provide the latest version of the software. And, while it is true that the benefits are often there, it is still inexcusable that most of this information transfer occurs without users' knowledge. Surely, if these add-ons are improvements, users will be able to realize that and "opt in."

A good example of a Trojan-horse-like information collector is the Comet Cursor[22]. The Comet Cursor is a free plug-in that one can download for the Netscape and Internet Explorer browsers. It changes the cursor's appearance by flashing and moving, and can adopt the cursors of compatible Web sites. What can possibly be dangerous about such a thing? The truth is that each cursor contains a GUID (globally unique identifier), which is used to monitor the user's web site visits and transmit the information back to Comet Systems[23]. In essence, for every user of the Comet Cursor, Comet Systems has a complete account of the sites that user has visited.

The Comet Cursor has been immensely popular with users (who did not suspect that something as innocent as a changing cursor could be monitoring their every click). Moreover, even if someone decides not to download the cursor or uses the uninstaller (which takes additional downloading), for every web site that utilizes the Comet Cursor features they visit with Internet Explorer, Internet Explorer will "nag" them to download the plug-in. According to Comet Systems' Privacy Information page, that is not a problem -- one can disable this nagging. How? By allowing Comet Systems to place a special cookie that will then be presented to the nagging sites. And, of course, if the user has turned off cookies to prevent the privacy violations associated with them, she is out of luck in preventing "reminder" windows from popping up at practically every site she visits.

This information collection by a seemingly innocuous free program running in conjunction with the browser is exactly the kind of trapdoor that privacy advocates worry about. But does it bother the writers of the privacy statement at Comet Systems? Not at all: "We have never asked for a user name or email address, in order to respect the privacy of our users," states the web page. Surely, privacy is not defined only by whether the email address is revealed. And, as discussed above, should Comet Systems decide to sell its database of user browsing patterns, it will have no trouble finding buyers who will be able to match each profile with an actual email address through cookies (which the Comet Systems web page also sets) and other technologies.

Despite its seeming egregiousness, the Comet Cursor is only a minor example compared to the kind of information other programs can collect and transmit. A lot of shareware and commercial software that can be downloaded from the web now contacts the "home" web site on each startup -- to identify whether a new version or a patch has been released, or simply to sync up information with the web site or remind the user to register or buy a new version. Needless to say, there is nothing stopping these programs from transmitting any other kind of information.

In another recent scandal, it was found that Real Networks, the maker of the RealPlayer music software, had implanted a unique ID in each player. In addition, the player communicated with the server every time a new CD was played, sending along not only the titles of the music played (and how many times it had been played before), but also the unique ID assigned to that user. And, of course, Real Networks already has an email address and other personal information to associate with this information. When questioned about its practices, Real Networks claimed that it did nothing wrong because it did not sell or give out user information. But it did admit that there is nothing that would prevent it from doing so in the future.

A lot of software developers claim that information collection is to the benefit of the consumer -- real usage patterns allow developers to better focus on customers' needs. But what kind of information might be considered useful to a developer? While one might only need to know how many times the program is used daily, the other might wonder what purposes it is used for, or what kind of transactions are recorded in it.

Network firewalls do not protect against these Trojan informers, either. The protocol of choice for such activities is HTTP[24] -- the standard web protocol, which is ordinarily allowed to go through the firewall[25], especially when the connection is made from the "inside." Unfortunately, connecting from a user's computer no longer means connecting with the user's permission.
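A toy model shows why. The rule below is a simplified assumption about a typical firewall policy (allow outbound connections to the standard HTTP port 80, refuse everything else); the program names are made up, and real firewalls have far richer rule sets.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Connection:
    program: str     # which desktop program opened the connection
    direction: str   # "outbound" (from inside) or "inbound"
    dest_port: int


def firewall_allows(conn):
    """Assumed policy: allow outbound web traffic, refuse everything else.

    Note that the rule never looks at conn.program. A packet filter sees
    addresses and ports, not which program is talking or whether the user
    asked it to.
    """
    return conn.direction == "outbound" and conn.dest_port == 80


if __name__ == "__main__":
    attempts = [
        Connection("web browser", "outbound", 80),
        Connection("music player reporting a play-list", "outbound", 80),
        Connection("outside intruder", "inbound", 23),
    ]
    for conn in attempts:
        verdict = "allowed" if firewall_allows(conn) else "blocked"
        print(f"{conn.program}: {verdict}")
```

The browser and the background reporter are indistinguishable to such a rule, so both get through.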

3.2.3 Lies about which information is collected

Program developers need not have malicious intent to violate someone's privacy. They might genuinely be concerned with improving their software. However, once the information is collected, what is to stop them from using it for other purposes? In addition, sometimes the developers themselves do not understand the risks to which they are exposing users -- for example, most of this data is sent in cleartext, and a malicious eavesdropper can easily figure out the data format and listen for such transmissions to build her own database. Certainly, making users agree to the transmission of information is not going to get rid of the eavesdropper problem, but at least the user will be aware of what she is risking and what kind of information might end up in the wrong hands. All in all, that is far preferable to a secretive program that transmits information behind the user's back.

4. Why might we want them to keep collecting the data

Companies often claim that their data collection is done solely with the consumer in mind. And, indeed, the data they gather can help them serve the users better. They might decide to support a particular product, or concentrate on a particular item that seems problematic to consumers. In addition, they might target their efforts in software development at the biggest "hot spots." The aggregate data can prove to be useful in finding trends in current online commerce and identifying possibilities for future additions and improvements. Targeted services and targeted notices are not that bad if they mean that a user is receiving less junk email and is bothered less by irrelevant information. The problem with data collection is not the collection itself, but that it is done without consumers' knowledge or permission. If the benefits are apparent, there is no doubt that permission will be granted, should companies only ask.

5. How to resolve the conflict

5.1 Need for legislation

The ground-breaking legislation (and, sadly, perhaps the only reason that the US is even thinking of the possibility of regulating information collection) comes from Europe. It is the European Data Protection Directive, enacted throughout the European Union[26]. The Directive controls the gathering and dissemination of information by granting the consumer the aforementioned rights to refuse collection, to know how the information is to be used, and to be able to control and change the information once it has been collected. In particular, the Directive states that "in order to be lawful, the processing of personal data must in addition be carried out with the consent of the data subject..."

The reason that this legislation so worries companies in the United States is that, according to the Directive, "the transfer of personal data to a third country which does not ensure an adequate level of protection must be prohibited." What that means is that, unless the US implements comparable safeguards, a lot of US companies will be prohibited from doing business with citizens of the European Union, which will be a noticeable blow to online commerce. In light of that, policymakers in the US have tried to argue that the US does not need legislation because it has industry self-regulation, which is enough to uphold data protection standards. But, unfortunately, self-regulation has not been able to solve any of the problems.

5.1.1 Self-regulation is not an answer

The opponents of privacy legislation claim that either self-regulation or technical solutions are the answer to the privacy problems on the Internet. However, both have proven to be effective only where companies choose to utilize them, and the biggest violators are those who do not subscribe to them.

Supporters of self-regulation claim that legislation will impose undue hardship on companies and stunt the growth of online commerce[27]. In addition, they argue that self-regulation is sufficient to provide the necessary protections to consumers.

The best hope for self-regulation advocates (and their best defense) is organizations like TRUSTe and the Council of Better Business Bureaus, which were created to help sites implement privacy policies and to enforce those policies.

TRUSTe[28] certifies and polices sites' privacy policies. Its members are required to tell their visitors what information about them is gathered and how that information is used. In return for their compliance, TRUSTe rewards them with a seal of approval that sites can post.

However, in March of 1999, TRUSTe refused to pursue an audit of one of its biggest donors[29], the Microsoft Corporation, in an awkward decision that demonstrated that there can be no effective regulation as long as those doing the policing are being paid by those who are being policed.

TRUSTe said that, while Microsoft did compromise consumer privacy and trust with an identifying number in its Windows 98 operating system, it found no privacy violations involving information collected through the company's Web site. An answer, indeed. Is this the self-regulation we have heard about? The kind where a regulator has to come up with a feeble excuse in order not to prosecute the violator?

The same excuse was given in the most recent UID scandal, involving Real Networks. Once again, TRUSTe claimed that it could not prosecute because it was not the web site that was doing the offending collection, but the actual software, and software is out of TRUSTe's scope. Then whose scope is it in? Who is to regulate it and make sure that our private information is not seeping out through the innocent-looking desktop programs?

The key concept in the European model is "enforceability." The European Union is concerned that data subjects have rights that are enshrined in explicit rules, and that they can go to an authority in case a violation occurs[30]. The United States, on the other hand, has avoided general data protection rules in favor of specific sectoral laws governing, for example, video rental records and financial privacy. The problem with this approach is that it requires either that new legislation be introduced with each new technology or that consumers rely on the industry to self-regulate, which rarely happens.

As far as technological solutions go, there are several proposed ones. Among them is the Platform for Privacy Preferences, or P3P[31], developed by the World Wide Web Consortium (W3C)[32]. The system is designed to let consumers tell their Web browsers how much information they are willing to give out. The sites would then disclose how much information they collect, and the browser would warn a user if there is a discrepancy between the site's expectations and the user's preferences.
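The comparison the browser would perform can be sketched in miniature. This is only an illustration of the idea, assuming a site's disclosure and a user's preferences are both plain sets of data categories; the category names and function names are invented and are not the actual P3P vocabulary or policy format.

```python
def find_discrepancies(site_collects, user_allows):
    """Return the data items a site collects that the user has not approved."""
    return sorted(set(site_collects) - set(user_allows))


def browser_decision(site_collects, user_allows):
    """Warn when the site's declared practices exceed the user's preferences."""
    extra = find_discrepancies(site_collects, user_allows)
    if extra:
        return "warn: site also collects " + ", ".join(extra)
    return "ok"


if __name__ == "__main__":
    user_allows = {"postal_code"}
    print(browser_decision({"postal_code"}, user_allows))
    print(browser_decision({"postal_code", "email", "clickstream"}, user_allows))
```

The check is only as good as the site's own disclosure: a site that declares nothing, or declares falsely, passes it untroubled.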

P3P is an admirable effort, however, it once again relies on sites complying with it; and, to truly be useful, it requires ubiquitous implementation and worldwide acceptance. As yet, that has not happened, and it is doubtful that, keeping in mind the size of the Internet, we can expect the majority of the sites to adopt P3P in the near enough future.

Massachusetts Representative Edward Markey, who is a privacy advocate and a strong believer in technology, commented: "Personal privacy should not bend to the latest technology. But technology should be developed with privacy in mind"[33]. It is this comment that I would like everyone to take as a slogan for the campaign for privacy.

It is legislation that must address the egregious intrusions into personal privacy that technology has helped to produce. Guided by laws designed to protect the consumer while allowing companies to innovate and gather the information they truly need to do the best job they can, software developers can create a new brand of web sites and software -- the kind that answers users' needs without needlessly intruding into their personal lives.

5.1.2 Legal solution can balance different sides

There are arguments for allowing data-gathering and against it. Ultimately, the consumer should be given a choice to disclose or not disclose the information, and the service provider should be given a choice to withhold the service in case the consumer refuses to provide the information. Self-regulation cannot help us achieve all the goals. Instead, legislation is the answer that can balance both sides of the information war, and make sure that everyone is given due process and a path for recourse.

5.1.3 Legislation currently in Congress

To be sure, there are politicians who realize the importance of privacy and of letting people have control of their own information. In fact, a number of them have introduced bills in Congress that might curb the current flagrant violations.

Possibly the strongest Internet privacy regulation in the US could come from the Electronic Privacy Bill of Rights Act of 1999[34], which was introduced in the House on November 10, 1999, by Mr. Markey and Mr. Luther. Among its findings is the affirmation that "it is important to establish personal privacy rights and industry obligations now so that consumers have confidence that their personal privacy is fully protected on our Nation's telecommunications networks."

The bill on the whole is a very admirable work and, if it passes, it will be a very significant accomplishment in the fight for privacy. The bill provides for all-important factors such as requiring the user's permission before private information is collected, full disclosure of how the gathered information will be used, the user's access to her data so that she can correct collection errors, and the right to legal recourse.

In fact, it may be considered one of the most significant legislative moves for online privacy in the US. If passed, it will establish regulations on commercial services comparable to those imposed in Europe by the European Data Directive.

Unfortunately, the bill addresses only web site operation and online services. The term "online services" is vague, and can be construed not to apply to desktop software communicating with servers (after all, such software is not an online service). Also, this bill does not apply to non-commercial venues.

The biggest blind spot is probably the lack of a definition for the term "private data." The bill talks about "identifiable private data," but it is not clear what falls into that category. Companies such as Comet Systems and Real Networks claim that they do not violate individual privacy because they do not gather "identifiable" data (or, at least, do not release it). But what is "identifiable" is often in the eye of the beholder. While IP addresses might be fairly unidentifiable for AOL[35] users, they are a direct link to individuals whose computers have static IP addresses, as is the case in most educational and professional institutions, in workplaces, and even in cable modem environments. So an IP address is impersonal information only in very narrow cases. The same goes for the clickstream data collected through cookies -- as described earlier in the paper, while cookies themselves might not contain personal information, it is not hard to link them to actual email addresses, thus producing a very personal profile, which, according to DoubleClick, still might fall under "not individually identifiable data."

5.2 Proposed legislation -- Digital Information Privacy Act[36]

To resolve the problems outlined above, I would like to propose new legislation -- call it the Digital Information Privacy Act of 2000 -- that will cover the issues of information collection and dissemination. [37]

The basic principles behind the legislation are that personal information must be:

obtained fairly and lawfully
used only for the original specified purpose
adequate, relevant, and not excessive to purpose
accurate and up to date
destroyed after its purpose is completed

In fact, this bill should be very similar to the Electronic Privacy Bill of Rights Act of 1999. I would call the actual Act a model, if not for one oversight: it covers only commercial services and web sites. Everything -- not only commercial sites, but also desktop software and any other digital communication -- must be covered by the privacy protection act. Real Networks should not be free to collect information about user preferences simply because it is not an online service.

In addition, the legislation should be stricter in defining the kind of information that is protected -- it is not only driver's license, credit card, and Social Security numbers that need to be protected. Things like IP addresses, email addresses, and other electronic footprints can be quite telling and should not be left out of the scope of the legislation. Until everything is covered, consumers cannot be certain that their privacy is not violated every time their computers are connected to the Internet.

The Digital Information Privacy Act should be enforced by the Federal Trade Commission, which already has established procedures to deal with similar regulations (such as privacy laws for video rentals and telephone calls). Likewise, any complaint against a company will be directed to the FTC for investigation and possible sanctions.

5.4.2 Pathways for new legislation

A number of politicians are starting to realize the pressing need for privacy legislation that would protect the consumer, not just place faith in self-regulation. Massachusetts Representative Edward Markey voiced his support for legislation that would put basic rules on the books to protect consumers in the digital age. He said his new proposal would encompass three basic principles: "the right of individuals to know what information is being collected about them online, the right to know how that information is being used, and the right to take legal action if it is misused."[38] These are the same principles that this paper takes as requirements.

In turn, one can only hope that more lawmakers will subscribe to Markey's point of view. If they do, then the legislation has a good chance of being passed, despite the powerful lobbying from various interest groups. And, if it passes, it will be a great step toward preserving individual privacy in the digital age.

6. Conclusion

Whether we notice it or not, blatant privacy violations occur online all the time. In fact, most of the time the violators count on the clandestine nature of their information collection to protect them from consumer outrage. Instead of asking users for permission, online services and desktop programs work behind the users' backs, gathering data and sending it back to the servers. When questioned about it, companies deny any wrongdoing, claiming that "everyone is doing it."

This situation, and "everyone is doing it" as an excuse, are not the path to global electronic commerce; they are only a deterrent to it. As long as a user cannot be assured that her privacy is not violated, there will be thousands of people refusing (and rightly so) to do business over the Internet for fear that their information will end up being sold or misused, or that they will be tracked by "bugged" email or web pages.

To remedy this problem, legislation is needed. No self-regulation is going to be strict enough, or enforced vigorously enough, to do an adequate job. The proposed legislation should follow in the footsteps of the bill H.R.3321, currently before Congress, which proposes a bill of rights for electronic privacy. But even stricter guarantees are needed to protect users from undue invasion of their privacy by web site and software developers who strive to gather information they have neither a need for nor a right to.

References

[1] The Cookie Leak Security Hole in HTML Email messages. <http://www.tiac.net/users/smiths/privacy/cookleak.htm>

[2] "Spies" in Your Software? A PRIVACY Forum Special Report -- 11/1/99 Lauren Weinstein.

[3] Federal Trade Commission. Privacy Online: A Report to Congress. 1998.

[4] I made up the examples in this paragraph, but they are not far from reality -- in fact, Barnes and Noble does monitor whether the recipient of their email has read it, and they can set cookies on the user's computer when their web site is accessed.

[5] European Data Protection Directive. 1998. <http://europa.eu.int/eur-lex/en/lif/dat/1995/en_395L0046.html>

[6] Atkinson, J. Right to Privacy in the Age of Telecommunication. <http://www.tscm.com>

[7] Branscomb, A.W. Cyberspaces: Familiar Territory or Lawless Frontiers. <http://www.ascusc.org/jcmc/vol12/issue1/intro.html>

[8] The Privacy Act of 1974, 5 U.S.C. § 552a (1988)

[9] Indeed, one (in)famous software company claims that a web browser is an integral part of the operating system.

[10] See, for example, Mediaone Broadband Services. <http://www.mediaone.net>. Unfortunately, I can't quote their web site because it requires JavaScript, which I have turned off. If I turn on JavaScript and try to view the web site, my browser immediately crashes. Either way, readers will have to investigate for themselves.

[11] Companies often ask for financial information to find out what "class" their customers fit in. They might also ask for things like credit card number when there is no apparent need for one or even a social security number.

[12] Netscape <http://www.netscape.com>

[13] R. Smith. 11/30/1999. The Cookie Leak Security Hole in HTML Email messages.

[14] For example, Yahoo's "My Yahoo" site. <http://www.yahoo.com>

[15] DoubleClick, Inc. <http://www.doubleclick.com>

[16] Microsoft Active Server Pages. I actually discovered this fact in my own browsing. After encountering frustratingly many sites that claimed to be a content-rich page or an online store, where all I could see were links that led to broken pages, I started suspecting something. But, once the cookies were turned back on, suddenly, the pages "came alive" -- the broken links were no longer broken and I could browse the store catalog. And this happened every time I got to an .asp page. So, while I know that in theory it is possible to build an ASP page that does not rely on cookies, in practice most web site developers choose to rely heavily on cookies.

[17] In Netscape Navigator 4.5 the options one gets are: 1) Accept cookies; 2) Accept cookies that will be sent only to the originating server; 3) Do not accept cookies. With any of the above choices you can also click on the "Warn me before accepting a cookie" button.

[18] Taken from the 6.805 online discussion forum.

[19] R. Smith. 11/30/1999. The Cookie Leak Security Hole in HTML Email messages.

[20] L. Weinstein. 11/21/1999. Barnesandnoble.com Defends Use of Invasive "Mail Sniffers." PRIVACY forum digest.

[21] L. Weinstein. 11/1/1999. "Spies" in Your Software? A PRIVACY Forum Special Report.

[22] Most of MIT IS installers work that way -- instead of downloading the actual installer, a user gets a small program capable of collecting the network information and contacting the main server which then sends over the actual installer. See, for example, Eudora 4.0 installer for Macintosh.

[23] Comet Systems. <http://www.cometcursor.com>

[24] Chris Oakes. 11/30/1999. Wired News: "Mouse Pointer Records Clicks." <http://www.wired.com/news/technology/0,1282,32788,00.html>

[25] The Comet Systems Privacy Information page. <http://www.nerve.com/Phillips/1.html>

[26] HTTP: HyperText Transfer Protocol. See W3C <http://www.w3c.org> for further specifications.

[27] L. Weinstein. 11/1/1999. "Spies" in Your Software? A PRIVACY Forum Special Report.

[28] The Platform for Privacy Preferences 1.0 (P3P1.0) Specification <http://www.w3.org/TR/1999/WD-P3P-19991102>

[29] World Wide Web Consortium. <http://www.w3c.org>

[30] Jeri Clausing. "Lawmaker Plans Bill to Protect Consumer Privacy Online." Technology Cybertimes. April 8, 1999.

[31] Directive 95/46/EC of the European Parliament and of the Council of 24 October 1995 on the protection of individuals with regard to the processing of personal data and on the free movement of such data. Official Journal L 281, 23/11/1995 p. 0031-0050.

[32] Pacific Research Institute. Governing Internet Privacy: A Free-market Primer.

[33] TRUST-e. <http://www.truste.org>

[34] Clausing, J. 3/22/1999. Privacy Watchdog Declines to Pursue Microsoft, a Backer. Technology Cybertimes.

[35] Global Internet Liberty Campaign. Privacy and Human Rights: An International Survey of Privacy Laws and Practices.

[36] H.R. 3321. Electronic Privacy Bill of Rights Act of 1999 (Introduced in the House) on 11/10/1999.

[37] America Online, Inc. <http://www.aol.com>

[38] Digital Information Privacy Act -- DIPA is probably not the best acronym, but it makes me appreciate how hard it is to come up with good names for legislation. My first attempt resulted in Privacy in the Digital Age (PDA) Act.

[39] When I proposed Information Privacy as the topic for my final paper, I was planning on describing in fair detail what the legislation should look like to remedy the situation. However, I have since read the Electronic Privacy Bill of Rights Act of 1999 and have to admit that it does a great job of covering all the necessary aspects. So, instead of rephrasing it, I have decided to focus on the differences between it and the need for protection as I see it -- the need for more than just commercial services to be covered by the Act.

[40] Jeri Clausing. "Lawmaker Plans Bill to Protect Consumer Privacy Online." Technology Cybertimes. April 8, 1999.
