PICS Developers' Workshop Summary
Held June 20-21, MIT
Notes by Paul Resnick (presnick@research.att.com)
Agenda
9:10 Welcome and Introductions
10:00 PICS History (Weitzner and Miller)
10:15 PICS Spec Overview (Resnick)
11:00 Break
11:15 Standards Process (Miller)
11:30 Afternoon Preview (Resnick, Schloss, Miller, Kotok)
12:00 Lunch (provided)
1:30 Breakout sessions:
1) Operating with PICS-1.1
2) Protocol evolution
3:30 Break
4:00 Reports back
5:00 Adjourn
Introductions and briefings on the current status of the
specifications and the public policy process took most of the
morning. More than 40 people attended the first PICS developers'
workshop, representing xx companies and organizations. Danny Weitzner
discussed ramifications of the Philadelphia court injunction against
enforcing the CDA. It is likely that the government will appeal to
the Supreme Court. It is also possible that the New York court,
hearing a similar case, may come to a different conclusion than the
Philadelphia court. Further legislation in the U.S. is also a
possibility, once the current CDA status becomes final. Catherine
Soubeyrand summarized a recently passed French law that requires
Internet Service Providers to do two things: 1) filter materials
deemed illegal by an appointed Government labeling body; 2)
give subscribers access to filtering technology so that they can
choose what else they want to block.
I walked the group through the specs. Many questions came up, but
since I was presenting, I don't have notes (anyone who took notes,
please send them and I'll link them in here.) Several questions were
redirected to the afternoon breakout session on operating with the
1.1 specs.
A few points of terminology kept tripping us up, so here's a quick
glossary that we agreed to follow as best we could for the rest of
the day:
- As technical terms, we use label and rating
interchangeably. Depending on the audience, one may communicate more
effectively than the other.
- A rating system or rating vocabulary is the
dimensions and scales used for labeling.
- A rating service is the entity that provides the labels.
We say that RSACi and SafeSurf are rating services, even though
webmasters make their own decisions about which RSACi or SafeSurf
labels apply to their documents.
- A label bureau is an http server that responds to
requests for labels independent of documents.
Jim Miller discussed options for moving PICS from de facto to de jure
standard, including IETF, ISO, and IEEE. Most participants felt that
it was not worth the effort, though there were a few dissenters. The
group voiced confidence in W3C to make appropriate decisions about when
and whether to forward the work to an official standards body.
In the afternoon, we split into two breakout groups. One discussed
the current state of implementations, the other future protocol
evolution.
Working With the 1.1 Specs
The current implementations group surveyed what features are
currently being implemented, decided to create several web pages
to keep tabs on the developer community's progress, and identified
several areas where additional work is needed.
Status
Keeping in mind that not all developers were present (regrettably,
the invitations went out only 3 weeks prior to the workshop), the
following appears to be the current status:
- Several implementors are using the filename extension .rat for
local storage of a service description (the MIME type
application/pics-service). No one claimed to be using anything else.
From here on, everyone is encouraged to use .rat.
- No one is paying attention to icons associated with scales or
values in service descriptions, and neither SafeSurf nor RSACi is
providing icons. Developers agreed that the specification needed to
be more specific about icons, specifying a size (e.g., 48x48) and an
encoding (e.g., gif) in order to be useful. Rating services are
hereby warned that effort put into creating beautiful icons
associated with numeric values may be wasted effort.
- The PICS extensions do not rely on implementation of full PEP.
The additional PICS headers (Protocol-Request, Protocol, PICS-Label)
will work with http/1.0 and http/1.1. Servers and clients should be
prepared for PICS headers in either of these versions of http.
- SafeSurf is not providing expiration dates for its labels and
RSACi is providing 1-year expiration dates. No one knew of any
instances of expiration times in the two-minute range.
- Security features. RSACi and SafeSurf are not signing labels or
providing MD5 document hashes. None of the clients have yet
implemented the ability to decode signatures or MD5 hashes.
- Generally, clients are providing restricted Boolean filtering
rules (profiles). For example, Microsoft provides only an implicit
AND (violence < 3 AND literary quality < 2). Some clients permitted
rules based on labels from more than one rating service. There were
also a few variants on requiring there to be at least one label
versus requiring that every service have labeled the item. In the
other room, they were deciding to define a standard format for
profiles, so this discussion will no doubt continue in that forum.
- Clients were mixed in what they do to look for labels when a
document does not have one in a META element. Two implementors said
that they give up and assume that no label is present. One said that
his software checks for a "site" label in the site home page. That
is, if the document URL is "http://www.greatdocs.com/foo/bar/bat.htm", and
there is no label in the document (or in the http header stream
accompanying the document), the software GETs
"http://www.greatdocs.com/" and looks for a generic label in the META
element of that page. Ray Soular from SafeSurf argued that it would
be better to look for a generic label for the immediate directory,
"http://www.greatdocs.com/foo/bar/". He has been telling sites that
rate themselves through SafeSurf that they should make generic labels
on a per directory basis. He also pointed out that many people get
only a single directory, not an entire domain name, and so looking
for a generic site label would be too generic. A lively discussion
ensued.
In the end, those who were not looking up the hierarchy for
generic labels at all seemed to carry the day, since an extra
connection and document download may be a significant performance
penalty, especially since it happens only after processing the
original document. It was agreed that tools are needed so that
webmasters can enter wildcard labels for directories and have the
server automatically send out the labels with requested documents,
rather than counting on the client to go looking. Several kinds of
tools may help:
- Http servers that put labels in the http header stream could
look in files up the hierarchy for generic labels, or, better yet, in
a local labels database. Two http server developers were present who
are planning to add support for labels in the http header stream.
- Popular web authoring tools could build in support for adding labels and
propagating generic labels through all the specific documents.
- New stand-alone tools could look for generic labels and
propagate them through all the documents in a directory. Several
participants indicated that they could find or write such tools for
specific platforms.
It was also suggested that the recent distributed
indexing workshop held by W3C had dealt with similar problems of
finding the home directory for a page, and may have come up with
interesting solutions.
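The two lookup strategies debated above (site home page versus immediate directory) differ only in which URLs a client or server-side tool would check for a generic label. The following is a minimal sketch, not taken from any implementation described at the workshop; the function name generic_label_candidates is hypothetical. It lists the candidate URLs from the nearest directory up to the site root, using the example URL from the discussion.

```python
from urllib.parse import urlsplit, urlunsplit


def generic_label_candidates(document_url):
    """Return URLs that might carry a generic label for document_url.

    The nearest directory comes first and the site root comes last.
    This only builds the list; it does not fetch anything.
    """
    parts = urlsplit(document_url)
    segments = [s for s in parts.path.split("/") if s]
    # Drop the final segment when it names the document itself.
    if segments and not parts.path.endswith("/"):
        segments = segments[:-1]

    candidates = []
    while True:
        path = "/" + "/".join(segments) + ("/" if segments else "")
        candidates.append(urlunsplit((parts.scheme, parts.netloc, path, "", "")))
        if not segments:
            break
        segments = segments[:-1]
    return candidates


for url in generic_label_candidates("http://www.greatdocs.com/foo/bar/bat.htm"):
    print(url)
# http://www.greatdocs.com/foo/bar/
# http://www.greatdocs.com/foo/
# http://www.greatdocs.com/
```

A per-directory policy would stop at the first entry, and a per-site policy would use only the last. A server or stand-alone tool could walk the same list against local files, which avoids the extra client connection that concerned the group.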
New Resources
We agreed to keep lists of resources that will be useful to
developers. To spread the burden, these lists will be maintained by
various people, with the PICS page linking to them. Since the people
maintaining these pages may need to include links to competitors'
products, those who are maintaining the pages all agreed to be:
"fair, equitable, and speedy." If you are on this list, please send
me a URL as soon as possible, and I will make the links. On the page
that you create, please indicate submission instructions so that
people can send you additional links. Please indicate also that
pics-ask@w3.org is a good place to send an "appeal" if anyone feels
that the page is not being run fairly, equitably, and speedily.
There will be lists of:
More Work Needed
We identified several areas where additional work is needed:
- Convince the major HTTP server vendors to pass labels in the
header stream. IBM's server will have this feature, and Robert Thau
offered to put a limited version into the Apache server, but it would
be nice to get it into all the major servers, so that webmasters can
move away from embedding labels in documents.
- Tools for rating are needed, as noted above.
- Test suites are needed. The developers asked W3C to create a
test suite as part of the reference code. Jim Miller agreed in
principle, but suggested that any help member companies provide will
speed the process.
- A common language for profiles is needed, so that they can be
easily downloaded and installed, saving parents from having to set
the volume on each of the rating dimensions. This should be taken
care of by the new profile format that will be developed.
- An NNTP extension for requesting labels in netnews, similar to
the HTTP extension already defined, may be useful. No working group
has yet formed on this, although individuals are thinking about it.
Protocol Extensions
(Slightly edited version of summary provided by Alan Kotok)
Jim Miller discussed the question of whether there needed to be a
way of asking a rating bureau for lists of documents matching certain
ratings. The initial conclusion was, no, not yet. But this turned
around later, since a subcommittee was solicited to develop this idea.
Jim Miller then discussed the question of whether the protocol
needed to allow arbitrary text strings as values of ratings. He
pointed out that all known justifications for this requirement could
be met in other ways. It was agreed to postpone this for a more
compelling argument. However, in the discussion, a problem with text
strings being language-specific was identified. Proposed solutions
to that problem were (1) including language identifiers to tag the
strings (where one or more such strings were provided), and (2) having
multiple "ratings" retrieved using the yet-to-be-widely-adopted
language preference part of HTTP.
In the discussion about interfacing PICS to Search Services, we
discussed both sending the filtering criteria to the Search Service,
and the Search Service supplying enough information for the browser
to do filtering.
We agreed that the latter was better for the strict purpose of filtering, but
the former was required for other reasons, as well: PICS may well convey
many kinds of "ratings", some of which have nothing to do with filtering,
but which may well be useful as search criteria. Therefore it is desirable
that there be a standard protocol to convey PICS-based criteria to search
services, both for use in guiding searches, and avoiding the problem of
"here's the first 10 responses, but 9 of them were censored by your
browser."
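The "9 of them were censored" problem can be illustrated with a toy comparison. This is only a sketch under invented assumptions: the hit list, the single "violence" scale, the page size, and the function names are all hypothetical, and no real search service or PICS protocol message is modeled.

```python
def acceptable(ratings, limits):
    """True if every limited scale is rated and within its maximum."""
    return all(
        scale in ratings and ratings[scale] <= maximum
        for scale, maximum in limits.items()
    )


def browser_side(hits, limits, page_size):
    """Service returns an unfiltered page; the browser then drops hits."""
    first_page = hits[:page_size]
    return [h for h in first_page if acceptable(h["ratings"], limits)]


def service_side(hits, limits, page_size):
    """Service receives the criteria and filters before paging."""
    passing = [h for h in hits if acceptable(h["ratings"], limits)]
    return passing[:page_size]


# Invented data: one hit in ten is rated 0 for violence, the rest 4.
hits = [
    {"url": f"http://example.com/{i}", "ratings": {"violence": 0 if i % 10 == 0 else 4}}
    for i in range(200)
]
limits = {"violence": 2}

print(len(browser_side(hits, limits, 10)))  # 1
print(len(service_side(hits, limits, 10)))  # 10
```

With these made-up numbers the browser-side approach leaves one usable result on the first page, while sending the criteria to the service yields a full page.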
Some issues raised:
- How is a document composed from many URLs conveyed? It was claimed that
HTML now has a "section" identifier.
- Browsers always tell servers which PICS labels they want. What happens if
the Search engine doesn't have such a label? Were all embedded labels
extracted? Are they forwarded?
- If there isn't a label, and the browser goes to the label bureau to get a
label, it may have changed since the crawler found the document. Then what?
- The distributed search and indexing workshop in May discussed META info,
schemes, home and title page definitions, document-based robot instructions. A summary of that workshop can be found at http://www.w3.org/pub/WWW/Search/9605-Indexing-Workshop/
pics-profiles working group
A new working group was formed, with work to be conducted by mailing
list. If you would like to be on the mailing list, send email to Kevin Fink saying why you'd like to
participate.
The pics-profiles working group will specify a format for describing
a PICS preference profile. A preference profile indicates what labels
are required in order for a URL to be "acceptable" for a particular
user. The profile will have to indicate which services' labels
should be consulted, and what constitutes an acceptable label. It may
also include information such as the user name and password
associated with the profile, or rules about which profiles apply to
which users. Details are still to be determined.
It is believed that the preference profile will also be sufficient
for communicating queries to label bureaus, such as "find all labels
above 3 on scale A and between 2 and 4 on scale B." Enough people had
this hunch that we decided to make a single working group for both
functions. We'll watch carefully to make sure the eventual preference
profile format also is adequate as a query language.
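To make the hunch concrete, here is a minimal sketch of one profile structure used both ways: as a filter on a single document's ratings and as a query over a set of labels. It is not the format the working group will define, which is still undetermined. The Range class, the matches and query functions, and the sample labels are hypothetical. Every constraint must hold, which corresponds to the implicit AND that several clients reported.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Range:
    """Bounds on one scale; None means unbounded on that side."""
    low: Optional[float] = None
    high: Optional[float] = None
    low_exclusive: bool = False

    def contains(self, value):
        if self.low is not None:
            if value < self.low or (self.low_exclusive and value == self.low):
                return False
        if self.high is not None and value > self.high:
            return False
        return True


def matches(profile, ratings):
    """Filter use: every scale in the profile must be rated and in range."""
    return all(
        scale in ratings and bounds.contains(ratings[scale])
        for scale, bounds in profile.items()
    )


def query(profile, labels):
    """Query use: return the URLs whose ratings satisfy the profile."""
    return [url for url, ratings in labels.items() if matches(profile, ratings)]


# "above 3 on scale A and between 2 and 4 on scale B"
profile = {"A": Range(low=3, low_exclusive=True), "B": Range(low=2, high=4)}

# Invented labels, keyed by URL.
labels = {
    "http://example.com/one": {"A": 4, "B": 3},
    "http://example.com/two": {"A": 3, "B": 3},
    "http://example.com/three": {"A": 5, "B": 5},
    "http://example.com/four": {"A": 5},
}

print(query(profile, labels))  # ['http://example.com/one']
```

In this sketch an unrated scale counts as a failure. The workshop noted that clients differ on that point, so a real format would need to make the choice explicit.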
The work of this group will be carried out exclusively by email, at
least for the time being. Jonathan Brezin from IBM has agreed to take
primary responsibility for creating an initial proposal that we can
all respond to. Anyone else is also welcome to suggest options or
full proposals.