\documentstyle[rfc,fancyheadings,times]{cernman}
\lhead[]{June 1993}
\chead{WWW Server Guide}
\rhead[June 1993]{}
\lfoot[\thepage]{Berners-Lee}
\rfoot[Berners-Lee]{\thepage}
\cfoot{}
\pagestyle{fancy}
\begin{document}
% First page special
\thispagestyle{plain}
\begin{tabular*}{\textwidth}{@{}l@{\extracolsep{\fill}}r@{}}
Tim Berners-Lee, CERN\\[0.5cm]
\end{tabular*}

\begin{center}
\Large\bf\sf
WWW Server Guide\\[1cm]
\large A Guide to WWW Servers\\[1cm]
\end{center}
% --------------------------------------------------------


\chapter{W3 Server Software}A W3 server, like the FTP daemon,
is a program which responds to
an incoming TCP connection and provides
a service to the caller.  There are
many varieties of W3 server software
to serve different forms of data.
\section{Basic W3 servers}
\begin{DL}{allow this much space}
\item[CERN server
] The basic W3 daemon
program serves files already in hypertext
or plain text.  This daemon is also
used as a basis for many other
types of servers and gateways.  Platforms:
unix, VMS.
\item[NCSA server
] A server for files, written
in C, public domain.  Runs on top
of a gopher-style database just like
"gopherd". Platforms: unix.
\item[GN
] A single server providing both
HTTP and Gopher access to the same
data. In C, General Public License.
Designed to help servers transition
from gopher to WWW.  Platforms: unix.
\item[Perl server
] From Marc VanHeyningen
at Indiana University. Written in
perl. Platforms: unix.
\item[Plexus
] Tony Sanders's server, originally
based on Marc VanHeyningen's but incorporating
much more, including an Archie
gateway.  Platforms: unix.
\item[MacHTTP
] Server for the Macintosh.
\item[REXX for VM
] A server consisting of
a small C program which passes control
to a server written in REXX.
\end{DL}
Whatever server you are running,
you will probably be interested in:
\begin{itemize}
\item Tools for information providers
\item Style Guide for Online Hypertext
\end{itemize}
\section{Writing a new server}The basic daemon is often used as a basis
for a more specific server for a
given application.  A server which
allows a world of data to be seen
as part of the W3 universe is known
as a gateway.  (Most servers could
therefore be regarded as gateways,
but the term implies some conversion
or mapping between dissimilar worlds.)
For short tutorials with examples,
see:
\begin{itemize}
\item Writing a server in C
\item Writing a server as a script
\end{itemize}It is a good idea to pick the basic
daemon or one of the servers in the
list as a starting point when making
a new server.
\section{Other servers and Gateways}These are servers which provide data
extracted from other systems. They
are built using code from the basic
daemon, or scripts. See
\begin{itemize}
\item List of Gateways available.
\end{itemize}
Tim BL


\section{About documents generated from hypertext}Paper manuals generated from hypertext
are made for convenience, for example
for reading when one has no computer
to turn to.  We have tried to make
the hypertext into fairly conventional
paper documents, but they may seem
a little strange in some ways.\par 
All the links have been removed.
Therefore, it is worth looking at
the table of contents to see what
there is in the manual.  Something
which is not explained in place may
be explained in detail elsewhere.\par 
We have tried to keep related matter
together, but sometimes necessarily
you might have to check the table
of contents to find it.\par 
Please remember that these are for
the most part "living documents".
That is, they are constantly changing
to reflect current knowledge. If
you see a statement such as "Product
xxx does not support this feature",
remember that it was the case when
the document was generated, and may
not be the same now.   So if in doubt,
check the online version. Of course,
the living document may be out of
date too, in which case it is helpful
to mail its author.


\chapter{WWW Server user guide}The basic WWW server allows files
and directories in a file system
to be served to the world as menu
trees, multimedia, and/or hypertext.\par 
The http daemon, httpd, is a general
server program which runs a W3 protocol,
"HTTP".   This is a TCP/IP based
protocol running by convention on
port 80.
\section{In this guide}
\begin{DL}{allow this much space}
\item[Distribution
] How to get the code.
\item[Compilation
] The daemon is compiled
in the same way as the library and
line mode browser; see WWW distributed
code.
\item[Installation
] How to install a server
under the unix internet daemon
\item[Options
] Command line options at run
time
\item[Rule File
] The format of a rule file.
By default, /etc/httpd.conf
\item[Etiquette
] Conventions you should
follow to make life smoother
\item[Debugging
] If it doesn't seem to work
\item[Known bugs
] and improvements desired
\item[Change History
] List of improvements
made and bugs fixed.
\end{DL}

\section{Related documents}
\begin{DL}{allow this much space}
\item[HTML specification
] A description
of the hypertext markup language
used for representing menus, etc
\item[HTTP specification
] A description of
the protocol used by the server.
\end{DL}


\section{Status of basic WWW server}A basic fast information server for
files.
\begin{DL}{allow this much space}
\item[Author
] TBL
\item[Status:
] Version  2 available by anonymous
FTP, with no index search but file
access, name mapping and security
filter, ability to act as gateway
for anything in the WWW library's
repertoire, including WAIS.
\item[Plans:
] A version which will allow
general unix users to set up an index
search daemon. As index search tools
are not generally available, we may
use the NeXT Digital Librarian or
WAIS as a basis.
\item[Platforms
] Unix, VMS, VM/CMS (VM/XA).
\item[Next Milestone:
] Run shell scripts
to implement virtual documents and
searches.
\item[More information:
] User guide, Bug
list, Internals, Change history.
\item[Wider scope:
] W3 servers, Other WWW
software
\end{DL}
Features include
\begin{itemize}
\item Installation under inetd or run stand-alone
\item Can be run stand-alone by normal
user
\item Automatically generates hypertext
view of directory tree
\item Uses "README" files to document directory
listings
\item Handles multiple formats of the same
file, and selects the format appropriate for
the client's capabilities
\item Document name to filename mapping
for longer-lived document names
\item Can act as gateway for WAIS, news,
etc if needed
\item Provides access
authorization
\end{itemize}


\section{WorldWideWeb CERN-distributed code}See the CERN copyright.  This is
the README file which you get when
you unwrap one of our tar files.
These files contain information about
hypertext, hypertext systems, and
the WorldWideWeb project. If you
have taken this with a .tar file,
you will have only a subset of the
files.\par 
THIS FILE IS A VERY ABRIDGED VERSION
OF THE INFORMATION AVAILABLE ON THE
WEB.   IF IN DOUBT, READ THE WEB
DIRECTLY. If you have not got ANY
browser installed yet, do this by
telnet to info.cern.ch (no username
or password).\par 
Files from info.cern.ch are also
mirrored on ftp.ripe.net.
\subsection{Archive Directory structure}Under /pub/www , besides this README
file, you'll find bin , src and doc
directories.  The main archives are
as follows:
\begin{DL}{allow this much space}
\item[bin/xxx/bbbb
] Executable binaries
of program bbbb for system xxx. Check
what's there before you bother compiling.
(Note HP700/8800 series is "snake")
\item[bin/next/WorldWideWeb\_v.vv.tar.Z
]
The Hypertext Browser/editor for
the NeXT (binary).
\item[src/WWWLibrary\_v.vv.tar.Z
] The W3
Library. All source, and Makefiles
for selected systems.
\item[src/WWWLineMode\_v.vv.tar.Z
] The Line
mode browser: all source, and Makefiles
for selected systems. Requires the
Library.
\item[src/WWWDaemon\_v.vv.tar.Z
] The HTTP
daemon, and WWW-WAIS  gateway programs.
Source.  Requires the Library.
\item[src/WWWMailRobot\_v.vv.tar.Z
] The Mail
Robot.
\item[doc/WWWBook.tar.Z
] A snapshot of our
internal documentation. We prefer
you to access this on line; see
warnings below.
\end{DL}

\subsection{Basic WWW software installation from
source}This applies to the line mode client
and the server.  Below, \$prod means
LineMode or Daemon depending on which
you are building.
\subsubsection{Generated Directory structure}The tar files are all designed to
be unwrapped in the same (this) directory.
They create different parts of a
common directory tree under that
directory. There may be some duplication.
They also generate a few files in
this directory: README.*, Copyright.*,
and some installation instructions
(.txt).\par 
The directory structure is, for product
\$prod  and machine \$WWW\_MACH
\begin{DL}{allow this much space}
\item[WWW/\$prod/Implementation
] Source files
for a given product
\item[WWW/\$prod/Implementation/CommonMakefile
]
The machine-independent parts of
the Makefile for this product
\item[WWW/\$prod/\$WWW\_MACH/
] Area for compiling
for a given system
\item[WWW/All/\$WWW\_MACH/Makefile.include
]
The machine-dependent parts of the
makefile for any product
\item[WWW/All/Implementation/Makefile.product
]
A makefile which includes both parts
above and so can be used from any
product, any machine.
\end{DL}

\subsubsection{Compilation on already supported
platforms}You must get the WWWLibrary tar file
as well as the products you want
and unwrap them all from the same
directory.\par 
You must define the environment variable
WWW\_MACH to be the architecture of
your machine (sun4, decstation, rs6000,
sgi, snake, etc.).\par 
In directory WWW, type BUILD.
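\par 
As a sketch of the steps above, assuming a csh-style shell, a Sun 4
machine, and that the tar files have already been fetched into the
current directory (the version numbers are left as v.vv, and the
daemon is chosen only as an example product):
\begin{verbatim}
		> zcat WWWLibrary_v.vv.tar.Z | tar xf -
		> zcat WWWDaemon_v.vv.tar.Z | tar xf -
		> setenv WWW_MACH sun4
		> cd WWW
		> BUILD
\end{verbatim}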
\subsubsection{Compilation on new platforms}If your machine is not on the list:
\begin{itemize}
\item Make up a new subdirectory of that
name under WWW/\$prod and WWW/All,
copying the contents of a basically
similar architecture's directory.
\item Check the  WWW/All/\$WWW\_MACH/Makefile.include
for suitable directory and flag definitions.
\item Check the file tcp.h for the system-specific
include file coordinates, etc.  
\item Send any changes you have to make
back to www-request@info.cern.ch
for inclusion into future releases.
\item Once you have this set up, type BUILD.
\end{itemize}
\subsection{NeXTStep Browser/Editor}The browser for the NeXT consists of the
files contained in the application
directory WWW/Next/Implementation/WorldWideWeb.app
and is supplied compiled. When you install
the app, you may want to configure
the default page, WorldWideWeb.app/default.html.
This must point to some useful information!
You should keep it up to date with
pointers to info on your site and
elsewhere. If you use the CERN home
page, note there is a link at the
bottom to the master copy on our
server.   You should set up the address
of your local news server with
\begin{verbatim}                      dwrite WorldWideWeb NewsHost  news

\end{verbatim}
replacing the last word with the
actual address of your news host.
See Installation instructions .
\subsection{Line Mode browser}Binaries of this for some systems
are available in /pub/www/bin/ .
The binaries can be picked up, set
executable, and run immediately.\par 
If there is no binary, see "Installation
from source" above.\par 
(See Installation notes.)  Do the
same thing (in the same directory)
to the WWWLibrary\_v.vv.tar.Z file
to get the common library.\par 
You will have an ASCII printable
manual in the file WWW/LineMode/Defaults/line-mode-guide.txt
which you can print out at this stage.
This is a frozen copy of some of
the online documentation.\par 
When you install the browser, you
may configure a default page. This
is /usr/local/lib/WWW/default.html
for the line mode browser. This must
point to some useful information!
You should keep it up to date with
pointers to info on your site and
elsewhere. If you use the CERN home
page note there is a link at the
bottom to the master copy on our
server.\par 
Some basic documentation on the browser
is delivered with the home page in
the directory WWW/LineMode/Defaults.
A separate tar file of that directory
(WWWLineModeDefaults.tar.Z) is available
if you just want to update that.\par 
The rest of the documentation is
in hypertext, and so will be readable
most easily with a browser. We suggest
that after installing the browser,
you browse through the basic documentation
so that you are aware of the options
and customisation possibilities, for
example.
\subsection{Server}The server can be run very simply
under the internet daemon, to export
a file directory tree as a browsable
hypertext tree.  Binaries are available
for some platforms; otherwise follow
the instructions above for compiling
and then go on to "Installing the
basic W3 server".
\subsection{XMosaic}XMosaic is an X11/Motif W3 browser.\par 
The sources and binaries are distributed
separately from FTP.NCSA.UIUC.EDU,
in /Web/xmosaic.  Binaries are
available for some platforms.  If
you have to build from source, check
the README in the distribution.\par 
The binaries can be picked up, uncompressed,
set "executable" and run immediately.
\subsection{Viola browser for X11}Viola is an X11 application for reading
global hypertext.  If a binary is
available for your machine, in /pub/www/bin/.../viola*,
then take that and also the Viola
"apps" tar file which contains the
scripts you will need.\par 
To generate this from source, you
will need both the W3 library and
the Viola source files.  There is
an Imakefile with the viola source
directory. You will need to generate
the XPA and XPM libraries and the
W3 library before you make viola
itself.
\subsection{Documentation}In the /pub/www/doc directory are
a number of articles, preprints and
guides on the web. \par 
See the online WWW bibliography for
a list of these and other articles,
books, etc. and also the list of
WWW Manuals available in text and
postscript form.
\subsection{General}Your comments will of course be most
appreciated, on code, or information
on the web which is out of date or
misleading. If you write your own
hypertext and make it available by
anonymous ftp or using a server,
tell us and we'll put some pointers
to it in ours. Thus spreads the web...
Tim Berners-Lee\par WorldWideWeb project\par CERN, 1211 Geneva 23, Switzerland\par Tel: +41 22 767 3755; Fax: +41 22
767 7155; email: timbl@info.cern.ch


\section{Installing the basic WWW server}If using unix, for the simplest
method see Installation under the
Internet Daemon.\par 
There are special instructions if
you are installing under VMS . \par 
The usual way to install a daemon
is to either run it from the bootstrap
command file (for example /etc/rc)
so that it runs continuously, or
to set up the internet daemon (inetd)
to run it when a call comes in. \par 
See a csh script which does everything
below for unix BSD systems but which
you should modify with care for your
own system.\par 
Note: With  version 2.0 on, a rule
file is no longer essential if you
want to just export a directory tree.\par 
The installation normally requires
superuser status, but it is possible
to run httpd from a terminal session
as a normal user.
\subsection{Access authorization}
See quick guide on how
to set up access authorization (for versions 2.12
and newer).
\subsection{Log file}If a log file is required, make
sure that the user name under which
the daemon is run has the right
to write the file.


\subsection{Privileged ports}The TCP/IP port numbers below 1024 are special in that normal users
are not allowed to run servers on them.  This is a security feature,
in that if you connect to a service on one of these ports you are
fairly sure that you have the real thing, and not a fake which some
hacker has put up for you.\par 
The normal port number for W3 servers is port 80, which is such a
port. (This number is assigned by the Internet Assigned Numbers Authority,
IANA).\par 
When you run a server as a test from a non-privileged account, you
will normally test it on other ports, typically 2784 or 5000.
\subsubsection{Under unix}The inet daemon (running as root) can listen for incoming connections
on port 80 and pass them down to a process with a safer uid for the
server itself. Of course, you have to be root to set up the inet daemon.
\subsubsection{Under VMS}Under UCX, the process running as a server needs BYPASS privilege
to listen to ports below 1024.  This might mean you have to install
the server.  With other TCP/IP packages, privilege of some sort is
similarly required.\par 



\subsection{Under VMS}The daemon runs just as under unix,
for which the rest of the documentation
was written.  These instructions
are my ideas about how to run it
under VMS but it is a long time since
I did anything like this, so please
tell me what is wrong. We don't have
effort available to distribute HTTPD.EXE,
sorry.\par 
Compilation of the daemon for VMS
requires taking the library and Daemon
source files from the unix release,
copying them all onto the VMS system,
and compiling them all.  The object files
from the Library should go into a
libwww.olb file.  The object files
from the Daemon should be linked
together and with the libwww.olb
library.\par 
When compiling the sources, you must
use a compiler flag to specify whether
you have Multinet, UCX or Wollongong
TCP/IP. (cf rebuilding the line mode
browser ).  The flags should be one
of:
\begin{verbatim}		/DEF=MULTINET
		/DEF=WIN_TCP
		/DEF=UCX

\end{verbatim}

\subsubsection{Running}The daemon works with document names
which look like unix-style filenames.
At the point of reading a file, these
are converted into VMS-style filenames.
\subsubsection{Testing it}Suppose you have compiled and linked
httpd successfully. You write a "welcome.html"
file as an introduction to your server
for those from outside, and you put
it in some suitable directory which
you wish to export, say sys\$disk:\lbrack my.public\rbrack welcome.html.\par 
You run it as an ordinary user on
a port over 1024 from a terminal
window.
\begin{verbatim}		httpd :== $sys$disk:[my.directory]httpd.exe
		httpd -p 8000 -v "/sys$disk/my/public"

\end{verbatim}
Note that the directory to be exported
is given in unix style. Don't panic.
Watch the trace (enabled by the -v
option). The server should end up
waiting for a message.\par 
From another terminal window, you
test the server, giving the internet
node name of your machine in place
of mynode.dom.ain and the same port
number.   We assume you have the
line mode browser installed.  You
could test it with a GUI browser,
but the trace might be more difficult
to find.
\begin{verbatim}		www -v "http://mynode.dom.ain:8000/welcome.html"

\end{verbatim}
You should now get your welcome page
displayed on the terminal. There
will be a lot of trace as well which
may make it almost unreadable, but
once it works you can of course run
the server and the client without
the -v option.
\subsubsection{Installing properly.}Check whether your TCP/IP brand contains
an inetd daemon. If it does, that
is great, you just run it under the
inetd daemon following the manufacturer's
instructions.  Set the daemon up
to run on any TCP connection to port
80.  (The service name for port 80
is http.)  In this case the only command
line parameter which you will need
to pass to httpd is the directory
name.  Omit the port number to tell
httpd that it is running under the
inet daemon. If you find that this
daemon is too slow (very possible
under VMS), then switch to using
the method below.\par 
If you don't have an Inet daemon,
then you have to run the daemon as
a detached process.  To do this you
have to add something to one of the
many VMS boot startup files like
SYSTARTUP.COM or some such.  You need
to be the system manager to do this,
and if you are, you probably know
where you personally like to put
these things.  The command line should
be as in the example when you tested
it, except the port should be 80
(not 8000), and there should be no
trace requested.\par 
In practice it seems that under VMS
you always have to start a DCL environment
to run a server, because if you just
detach HTTPD you can't pass it any
parameters.  So you use the usual
trick of running loginout.exe to
create a DCL environment:
\begin{verbatim}		$RUN/DETACH/IN=SYS$EXE:HTTPD.COM/OUT=SYS$TEMP:HTTPD.LOG -
 		    SYS$SYSTEM:LOGINOUT.EXE

\end{verbatim}
where HTTPD.COM is
\begin{verbatim}		$ httpd :== $sys$disk:[my.directory]httpd.exe
		$ httpd -p 80 "/sys$disk/my/public"

\end{verbatim}
Check that out and tell me if it
doesn't work...


\subsection{Installing a daemon under inetd}This is how to set up the internet
daemon (inetd) to run your HTTPD
server whenever a request comes in.
  (These steps are the same for any
daemon under unix: you will probably
find a similar thing has been done
for the FTP daemon, ftpd, for example.)
\subsubsection{Step 1}Copy the daemon program or shell
script (httpd in this example) into
a suitable directory such as /usr/etc.
Protect it from anyone writing to
it except root.
\subsubsection{Step 2}Put "http" in the /etc/services file,
or use the name of a specific service
of your own if you want to use
a special port number.\par 
(Exceptions: on a NeXT, see using
the NetInfoManager. On any machine
running NIS (yellow pages), see special
instructions.) \par 
For example,
\begin{verbatim}http		80/tcp			# WorldWideWeb server

\end{verbatim}

\subsubsection{Step 3}Put a line in the internet daemon
configuration file, /etc/inetd.conf.
For example,
\begin{verbatim}http	stream	tcp	nowait	nobody	/usr/etc/httpd		httpd /Public

\end{verbatim}
(That was all one line.) Here "http"
is used as a link between the services
file and inetd.conf: it could have
been any identifier. "nobody" is
the user name under which you want
the daemon to run, which determines
what privileges it has for example
to read data. "/usr/etc/httpd" is
the actual file name of the server.
The rest of the line is the arguments
passed to httpd: arg0 is the program
name, "httpd",  by convention. Here
the argument "/Public"  is the directory
tree to be exported. This is in fact
the default if no directory is given.
See command line syntax for more
details. \par 
Note: The inetd.conf format varies
from system to system. If in doubt,
copy the format of other lines in
your existing inetd.conf. For example,
under ultrix there is no user name
field; everything runs as root.\par 
Note: there seems to be, on the NeXT
at least, a limit of 4 arguments
passed across by inetd!
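\par 
As an illustration of using a service name of your own, the two entries
below would run the same daemon on a special port. The service name
"mywww" and the port number 8000 are hypothetical, chosen only for this
sketch; the remaining fields are copied from the examples above.
In /etc/services:
\begin{verbatim}
mywww		8000/tcp		# test WorldWideWeb server
\end{verbatim}
and in /etc/inetd.conf:
\begin{verbatim}
mywww	stream	tcp	nowait	nobody	/usr/etc/httpd		httpd /Public
\end{verbatim}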
\subsubsection{Step 4}When you have updated inetd.conf,
find out which process is running
inetd, and send it a "HUP" signal.
On BSD unix this looks like the following
(for System V, use ps -el instead of ps aux):
\begin{verbatim}		
		> ps aux | grep inetd | grep -v grep
		root        85   0.0  0.9 1.24M  304K ?  S     0:01 /usr/etc/inetd
		> kill -HUP 85
		>


\end{verbatim}

\subsubsection{Test it}Test the server with the line mode
browser by giving its address explicitly:
\begin{verbatim}			www http://myhost.dom.ain/welcome.html

\end{verbatim}
This assumes that you have a file
"welcome.html" in your exported directory.
If it doesn't work, you have probably
missed something. See notes on debugging.


\subsection{Using NIS (Yellow pages)}If your machine is running Sun's
"Network Information Service", originally
known as "yellow pages", read this.\par 
You must:
\begin{itemize}
\item First make an addition to the /etc/services
file just as for a normal unix system.
\item Then, change directory to /var/yp
and type "make".
\end{itemize}This will load the /etc/services
file into the yellow pages information
system.\par 
Some people have found that they
needed to reboot the system afterward
for the change to take effect.


\subsection{Adding a service on the NeXT}The NeXT uses the "netinfo" database
instead of the /etc/services file.
This is managed with the /NextAdmin/NetInfoManager
application. Here's how to add the
service "www":
\begin{itemize}
\item Start the NetInfoManager by double-clicking
on its icon.
\item If you are operating in a cluster,
 open either your local domain (/hostname)
or if you have authority, the whole
cluster domain (/). If you're not
in a cluster,  just use the domain
you are presented with.
\item Select "services" from the browser
tree.
\item Select "ftp" from the list of services
\item Select "duplicate" from the edit
menu.
\item Select "copy of ftp" and double-click
on its icon to get the property editor.

\item Click on "name" and then on the
value "copy of ftp". Change this
to "www" by typing "www" in the window
at the bottom, and hitting return.
\item Click on "port", and then on the
value "21". Change it to "80".
\item Use "Directory:Save" menu (Command/s)
to save the result. You will have
to give a root password or netinfo
manager password.
\end{itemize}


\section{The Rule File}The rule file (configuration file)
defines how the WWW software will
translate a request into a document
name.   For a server, it allows one
to provide an extra level of  name
mapping above that given by links
in the file system. It allows, for
example, out of date names to be mapped
onto their more recent counterparts.\par 
For the client, it allows access
to certain servers to be remapped,
for example to caching servers or to
local copies of the same information.\par 
The rule file also allows access
to be restricted.  This is essential,
to prevent, for example, unauthorized
access to your password file.\par 
By default, the rule file /etc/httpd.conf
is loaded, unless specified otherwise
with the -R or -r options.\par 
See also: example rule files, Old
format for software before 2.0,
Setting up gateways, Firewall gateways.
\subsection{Mapping and filtering}Each line consists of an operation
code and one or two parameters, referred
to as the template and the result.
Anything on a line after and including
a hash sign (\#) is ignored, as are
empty lines.\par 
The server uses the top rule first,
then EACH SUCCESSIVE RULE  unless
told otherwise by PASS or FAIL. The
operation codes are as follows
\begin{DL}{allow this much space}
\item[map template result
] If the address
matches the template, use the result
string from now on for future rules.
\item[pass template
] If the address matches
the template, use it as it is, processing
no further rules.
\item[pass template result
] If the address
matches the template, use the result
string as it is, processing no further
rules.
\item[fail template
] If the address matches
the template, prohibit access, processing
no further rules.
\end{DL}
The template string may contain at
most one wildcard asterisk ("*").
The result string may have one wildcard
only if the template has one.\par 
When matching,
\begin{itemize}
\item Rules are scanned from the top of
the file to the bottom.
\item If a request matches a "map" template
exactly, the result string is used
instead of the original string and
applied to successive rules.
\item If the request matches a "map" template
with wildcard, then the text of the
request which matches the wildcard
is inserted in place of the wildcard
in the result string to form the
translated request. If the result
string has no wildcard, it is used
as it is.
\item When a map substitution takes place,
the rule scan continues with the
next rule using the new string in
place of the request.  This is not
the case if a pass or fail is matched:
they terminate the rule scan.
\end{itemize}
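As an illustration of the scanning order, consider the following rule
file fragment. The document names and directories are made up for this
sketch.
\begin{verbatim}
map    /old/*     /notes/*                # rename, then keep scanning
pass   /notes/*   file:/u/john/public/*   # accept and stop
fail   *                                  # refuse anything else
\end{verbatim}
A request for /old/intro.html matches the first rule with the wildcard
standing for "intro.html", so it becomes /notes/intro.html and the scan
continues. The second rule then matches, giving
file:/u/john/public/intro.html, and no further rules are processed. A
request for /etc/passwd matches neither of the first two rules and is
refused by the third.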
\subsubsection{Access Authorization}From version 2.12 on, the daemon
supports access authorization, which
introduces two new rules,
{\tt protect} and {\tt defprot}. They have the
following syntax:
\begin{verbatim}
        defprot <template> <setupfile> <uid.gid>
        protect <template> <setupfile> <uid.gid>
\end{verbatim}


\begin{DL}{allow this much space}
\item[$<$setupfile$>$
] The pathname of the protection
setup file, which sets up the
actual protection parameters.
\par 
The setup file can be omitted from a {\tt protect} rule, but it is
obligatory in a {\tt defprot} rule. If the setup file is omitted, it
is not possible to give the $<$uid.gid$>$ part
either.
\item[$<$uid.gid$>$
] The Unix user id and group id
(either by name or by number, separated by comma) to which the server
should change when serving the request. These are only meaningful when
the server is running as {\tt root}.
\par 
These can be omitted, in which case they default to {\tt nobody} and
{\tt nogroup}.
\end{DL}
\par 
See also the full
description of {\tt protect} and {\tt defprot}.
\par 
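As a minimal sketch, a rule protecting one part of the document tree
might look like this. The template and the setup file name are
hypothetical, and the uid.gid part is omitted so that the defaults
described above apply.
\begin{verbatim}
        protect /private/*  /etc/httpd-private.setup
\end{verbatim}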
\subsection{Suffix definitions}As well as any mapping lines in the
rule file, the rule file may be used
to define the data types of files
with particular suffixes.  The syntax is
\begin{verbatim}		suffix  <suffix>  <representation> <encoding> [ <quality> ]

\end{verbatim}
for example:
\begin{verbatim}		suffix  .pc     text/plain          7bit	1.0
		suffix	*.*     application/binary  binary	0.1
		suffix	*	text/plain	    7bit


\end{verbatim}
The parameters are as follows:
\begin{DL}{allow this much space}
\item[$<$suffix$>$
] The last part of the filename.
There are two special cases. "*.*"
matches to all files which have not
been matched by any explicit suffixes
but do contain a dot. "*" by itself
matches to any file which does not
match any other suffix.
\item[$<$representation$>$
] A MIME "content-type"
style description of the representation
actually in use in the file.  See
the HTTP spec.  This need not be
a real MIME type; it will only
be used if it matches a type given
by a client.
\item[$<$encoding$>$
] A MIME content transfer
encoding type.  Much more limited
in variety than representations,
basically whether the file is ASCII
(7bit or 8bit) or binary. A few other
encodings are allowed, and this may
be extended to compression.
\item[$<$quality$>$
] Optional. A floating point
number between 0.0 and 1.0 which
determines the relative merits of
files xxx.* which differ in their
suffix only, when a link to xxx.multi
is being resolved.  Defaults to 1.0.
\end{DL}
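As a sketch of how the quality value might be used, suppose the same
document is kept in two formats, report.html and report.txt (hypothetical
file names), with these suffix lines:
\begin{verbatim}
		suffix  .html   text/html     8bit   1.0
		suffix  .txt    text/plain    7bit   0.5
\end{verbatim}
When a link to report.multi is resolved, both files are candidates. For a
client which accepts both types the hypertext version has the higher
quality and would be preferred, while a client which accepts only
text/plain would be given report.txt.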

\subsection{Presentation definitions}In the rule file for a client, you
can define the presentation of a
given data type. The syntax is
\begin{verbatim}		presentation   <representation>  <command-string>
\end{verbatim}
where the parameters are
\begin{DL}{allow this much space}
\item[$<$representation$>$
] A MIME-style content
type. You can use regular MIME types,
such as image/jpeg, or your own extensions
which start with x-, such as image/x-tiff,
application/x-my-app.  See also above.
\item[$<$command string$>$
] The command needed
to display a temporary file of this
type.  A "\%s" within this string
will be replaced with the name of
the temporary file.  Note that if
any file suffix has been specified
as corresponding to this representation,
then the temporary file will be
given that suffix (or the first if there
is a choice).
\end{DL}
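For illustration, a client rule file might contain lines like the
following. The viewer programs xv and ghostview are only assumptions
about what is installed on the client machine.
\begin{verbatim}
		presentation   image/jpeg               xv %s
		presentation   application/postscript   ghostview %s
\end{verbatim}
Here the "\%s" is replaced by the name of the temporary file holding
the image or document.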



\subsection{Allowing Directory Browsing}Sometimes one has a large body of
information and no desire to write
or generate hypertext for it.  In
this case, the WWW daemon may be
set up to present the directory structure
of existing files as a hypertext
tree.\par 
The rule file is still used in exactly
the same way to map document names
onto directory names.  When a document
name is allowed and it corresponds
to a directory, then the behaviour
of the httpd server depends on the
command line options given.
\subsubsection{Controlling access}If -dn is given, the access is denied.\par 
If -dy is given, or -ds is given and
a file named ".www\_browsable" exists
in the directory, then browsing is
allowed.  Note that -ds is the default
if neither -dn nor -dy is specified.
\subsubsection{Inclusion of README files}It is common practice to put a file
named README into a directory containing
instructions or notices to be read
by anyone new to the directory. The
http server will by default embed
any README file in the hypertext
version of a directory.\par 
If the -dr option is given, README
files are not included but a link
is included in the listing as for
other files.\par 
The -db and -dt options control whether
the README file is included at the
top, above the listing (-dt, the
default) or at the bottom, below
the listing (-db).  To put them at
the top is normal, but they might
be better at the bottom if they are
very long.\par 
These features are available in httpd
version 0.9b or later.
Tim BL


\subsection{Rule file examples}A basic rule file for the http daemon
might look like this (it looked different
before version 2.0):
\begin{verbatim}
pass    /          file:/u/john/welcome.html
pass    /*         file:/u/john/public/*
fail   *
\end{verbatim}
The first line maps the root document
onto a specific document about the
server, and accepts it.  (see etiquette
about the welcome page)\par 
The second line maps all document
names onto filenames in a particular
directory and accepts them.\par 
The third line disallows access to
all other documents. (There won't
be any in this case because of
the mapping, but it's wise to put
it in for later.)
\subsubsection{Second example}
\begin{verbatim}
map    /            /tnotes/welcome.html
map    /tnotes/*    file:/u/john/public/*
map    /seminars/*  file:/u/jane/seminars/*
pass   file:/u/john/public/*
pass   file:/u/jane/seminars/*.html
fail   *
\end{verbatim}
The first line maps the root document
onto a specific document about the
server.   Because it is "map" and
not "pass",  it DOESN'T accept it
but passes it on for further mapping
by lines further down.\par 
The second line maps all document
names starting with /tnotes/ onto
filenames in a particular directory
where john maintains the technical
notes. If someone else takes over
the technical notes, we can change
this. Here we are starting to distinguish
between document names and file names.
This can be carried much further
if necessary, but one level of mapping
is enough to allow for changes of
administration of different areas.\par 
The third line separately maps the
seminar information into Jane's directory.\par 
The fourth and fifth lines enable
access to anything in John's "public"
directory, and any .html file in
Jane's "seminars" directory tree.
Note here that the * maps to any
sequence INCLUDING SLASHES so all
files in any subdirectory of /u/jane/seminars
will be enabled so long as they end
in .html.\par 
The bottom line will pick up for
example any attempt to use the server
to access non-html files in Jane's
seminars directory.
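
To make the order of evaluation concrete, here is a small Python sketch
that walks a document name through the rules of the second example. It
is a model of the behaviour described above, not the daemon's own code
(which is in C): the names {\tt match}, {\tt translate} and
{\tt RULES} are invented for the illustration, and refusing a name
that matches no rule at all is an assumption.

```python
def match(template, name):
    """Return the text matched by the single '*' wildcard, '' for an
    exact match without a wildcard, or None if there is no match.
    The wildcard matches any sequence of characters, including slashes."""
    if "*" not in template:
        return "" if template == name else None
    prefix, suffix = template.split("*", 1)
    if (len(name) >= len(prefix) + len(suffix)
            and name.startswith(prefix) and name.endswith(suffix)):
        return name[len(prefix):len(name) - len(suffix)]
    return None


def translate(rules, name):
    """Apply the rules top to bottom; return the accepted name or None."""
    for rule in rules:
        operation, template = rule[0], rule[1]
        result = rule[2] if len(rule) > 2 else None
        wild = match(template, name)
        if wild is None:
            continue                      # rule does not apply
        if operation == "fail":
            return None                   # access refused
        if result is not None:
            name = result.replace("*", wild, 1)
        if operation == "pass":
            return name                   # accepted, stop here
        # "map": carry on with the rewritten name
    return None                           # assumption: no match means refusal


RULES = [
    ("map", "/", "/tnotes/welcome.html"),
    ("map", "/tnotes/*", "file:/u/john/public/*"),
    ("map", "/seminars/*", "file:/u/jane/seminars/*"),
    ("pass", "file:/u/john/public/*"),
    ("pass", "file:/u/jane/seminars/*.html"),
    ("fail", "*"),
]

if __name__ == "__main__":
    for document in ("/", "/seminars/1993/talk.html", "/seminars/notes.txt"):
        print(document, "->", translate(RULES, document))
```

Tracing a name through such a model by hand is a quick way to check a
rule file before installing it.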
\subsubsection{Configuration file for a WAIS gateway}The httpd daemon can be used as a
WAIS gateway if it has been compiled
with the necessary options and linked
with the freeWAIS software. A suitable
configuration file is
\begin{verbatim}
map     /*      wais://*
pass    wais://*
fail    *
\end{verbatim}


\section{Server Command Line}The command line syntax for the basic
www server allows a number of options
and an optional directory argument.
\begin{verbatim}			httpd  [options] [directory]

\end{verbatim}
The directory argument, if present,
indicates the directory to be exported.
(Version 2.0 and later only.)  If
not present, either a rule file is
used to export combinations of
directories, or else the default
is to export the "/Public" directory
tree.
\subsection{Examples}
\begin{verbatim}       			httpd -p 80  -dyt /ftp/pub

\end{verbatim}
This exports the entire /ftp/pub
tree with browsable directories and
README files included at the top
of directory listings.
\begin{verbatim} 			httpd

\end{verbatim}
This command in the inetd configuration
file inetd.conf exports the /Public
directory tree.  This tree may contain
soft links to other directory trees.
\begin{DL}{allow this much space}
\item[-dn
] Disable directory browsing. An
attempt to access a directory will
generate an error response.
\item[-dy
] Enable directory browsing.  Directories
are returned as hypertext documents.
See browsing directories. This is
the default.
\item[-ds
] Enable directory browsing only
for directories containing a file
named ".www\_browsable".
\item[-dt
] For any browsable directory which
contains a README file, include the
text of the README file at the top
of the document before the listing.
This is the default.
\item[-db
] As -dt but put the README at
the bottom, after the listing.  The
-db and -dt options may be combined
with -dy as -dyb, -dty etc.
\item[-dr
] Disables the README inclusion
feature.
\item[-l  file
] Log all calls to the given
file. The file is appended to if
it already exists.
\item[-p port
] Specify the port number.
If this option is not given, the
daemon assumes that it has been run
by inetd, and uses stdin and stdout
as its communication channel. Note
that port numbers under 1024 are
privileged.
\item[-v
] Verbose mode. Copious trace messages
are written to the standard output
stream. Mainly for debugging.
\item[-r file
] Load a rule file. The rules
are added after any rules already
loaded.  Inhibits the loading of
the default rule file.
\item[-R
] Do not use. Inhibit the loading
of the default rule file.  Warning:
running without a rule file  normally
poses a security problem.  It won't
work in general as only the path
part of a URL is input into the rule
file, and a fully qualified URL (with
file: in front for example) is required
on output.
\end{DL}

Tim BL


\section{Debugging the daemon}Suppose you think you have installed
a W3 server but it doesn't work.
That is, you have followed the installation
instructions and the test at the
end fails. Here we assume you have
used port 80.  If you have a situation
not handled by this problem-solving
guide, please mail me.\par 
Type
\begin{verbatim}	www http://myhost.domain:80/


\end{verbatim}
What happens?
\begin{itemize}
\item "Cannot connect to information server"
message, "Unable to access document"
or some other generic-sounding error
message 
\item An empty document is displayed
\item A document containing the words "Document
address invalid or access not authorised",
or some "Error 500" message is displayed

\item A document is displayed, but not
what you wanted the server to give
in response to that document name
(/)
\end{itemize}
Tim BL


\subsection{Document address invalid}You have accessed a W3 server and
you get back a message "Document
address invalid or access not authorized",
or some other error message from
the server.\par 
The 1.x server does not (originally
for security reasons) distinguish
between a document which does not
exist, and one to which you are not
allowed access.  However, most servers
are public servers which allow access
to anyone, so if you are following
a bona fide link, this could mean
\begin{itemize}
\item You have been passed a bad document
address. If you are following a link,
check with the author of the document
which contained the link. 
\item The document has been moved. Check
with the server administrator. You
should be able to find out who runs
the server by going to the welcome
page (type "g /" with the line mode
browser) and seeing a link to information
about the maintainers.
\end{itemize}If you are the server administrator,
and you can't  understand why the
daemon refuses to deliver the file,
\begin{itemize}
\item Check the rule file if you have one.
 Think out the way the document name
will be mapped successively by each
line, and what the result will be.
Checking the trace below may help
clarify this.
\item Run the daemon with trace from a
terminal session to get trace information
\end{itemize}
Tim BL


\subsection{Can't connect to server}There is more information you can get.  Use the "verbose" option on
the browser to find out what went wrong:
\begin{verbatim}			www -v http://myhost.domain:80/

\end{verbatim}
What do you get? A load of trace messages. There are several cases.
\begin{itemize}
\item The browser can't look up the name of the host. If it can, it will
display a "Parsed address as" message. If not, try fixing your name
server or /etc/hosts file, or quoting the IP number of the host in
decimal notation (like 128.141.77.45) instead.
\item The browser can get to the host but gets "Connection refused" status
back.
\item Your browser gets an error number but prints "error message not translated".
This is because when it was compiled on your platform it didn't know
what form the error message table took. Try the same thing from a
unix platform for example.
\item You get some network error like "network unreachable". Depending on
whether the IP network is your responsibility or not, and your attitude
to life, either fix it,  try again in an hour's time, or complain
to someone.
\end{itemize}
Tim BL


\subsection{"Connection Refused"}The browser tries to connect to the daemon but gets this status in
the trace. \par 
This means that no one was listening on that port number. Check the
port numbers match between server and client.  Make sure you specify
the port number explicitly in the document address for www.\par 
If you are running the daemon without the inet daemon, (with the -a
option) then try running it from the terminal with -v as well.  The
trace for the server should say "socket, bind and listen all ok".
If it does, and you still get "connection refused", then you must
be talking to the wrong host (or, conceivably, different ethernet
adapters on the same host).\par 
If you are running with the inet daemon, then check both the services
file (/etc/services) or database (yellow pages, netinfo) if your system
uses it,  and the /etc/inetd.conf file. Check the service name matches
between these two.\par 
Did you remember to kill -HUP the inet daemon when you changed the
inetd.conf file?\par 
Try running the daemon from a shell window to see better what happens.
Tim BL


\subsection{You get an Empty document}The document sent back is empty,
but there is no error message.\par 
The inet daemon has started a process
to run your server but it immediately
failed.  Possibilities include:
\begin{itemize}
\item The daemon may not be in the file
specified, or may not be executable
by the specified user (or, if a user
id is not specified in your variety
of inetd.conf, root)
\item You have written your own daemon
and it crashes.
\item You are using ours and it crashes
(mail us!)
\end{itemize}Try running the daemon from a terminal
window to see what happens.
Tim BL


\subsection{Bad output from the daemon}These are some ideas:
\begin{itemize}
\item Try running the server from the terminal .
\item Check the HTML source the daemon produces with
\end{itemize}
\begin{verbatim}	www -source http://myhost.domain:80/

\end{verbatim}

\begin{itemize}
\item Try telnetting to the daemon and simulating the client:
\end{itemize}
\begin{verbatim}
	> telnet myhost.domain 80
	Connected to myhost.domain on port 80
	Escape is ^[
	GET /documentname


\end{verbatim}

Tim BL


\subsection{Telnetting to a server}Most implementations of telnet allow you to specify a port number.
Under unix this is often just a second parameter, under VMS a /PORT
option.\par 
The HTTP protocol is a telnet protocol, so you can simulate it just
by typing things in.  This will help you to see exactly what is being sent
back, and it will confirm that it really is the server, not the browser,
which has a problem.\par 
Here is an example. (You type "telnet..." and  "GET ...").
\begin{verbatim}	> telnet myhost.domain 80
	Connected to myhost.domain on port 80
	Escape is ^[
	GET /documentname
	<PLAINTEXT>
	Document name "/documentname" invalid.

\end{verbatim}
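
The same exchange can be scripted when telnet is not convenient. The
following Python sketch is an illustration under stated assumptions:
the function names {\tt build\_request} and {\tt fetch} are invented
here, the host name is the placeholder used above, and ending the
request line with CR LF is an assumption about what the server expects.

```python
import socket


def build_request(document_name):
    """Return the bytes of a simple one-line GET request."""
    return ("GET " + document_name + "\r\n").encode("ascii")


def fetch(host, port, document_name, timeout=10.0):
    """Send the request and return everything the server sends back."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(build_request(document_name))
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:          # server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)


if __name__ == "__main__":
    # Placeholder host, as in the telnet example above.
    print(fetch("myhost.domain", 80, "/documentname").decode("latin-1"))
```

As with telnet, what comes back is the raw reply, so any difference
from what the browser shows points at the browser rather than the
server.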


\subsection{Running under shell}You don't have to run the daemon
under inetd if it doesn't work.
You can run it from a shell session.\par 
If the daemon is httpd, then run
it from your terminal, with a different
port number like 8000.  You use the
-p option.
\begin{verbatim}		httpd -p 8000

\end{verbatim}
Note: You must be root (under VMS,
have some privilege) to run with
a port number below 1024. \par 
If you select a port above 1024,
then you can run as a normal user.
 This way, anyone can publish files
on the net. However, it isn't very
reliable, as your server will not
automatically come back up if the
machine is rebooted. In the long
term it is best to install it under
"inetd".\par 
You can't use a port number which
has been used by a daemon process
recently, so you may have to switch
port number if you {\char94}C and restart
the daemon.  When it is running like
this, you can read the trace messages
and use a debugger on it if necessary.
(See also: telnetting to the server.)
\subsubsection{Debugging using Trace}If you can't understand why a server
refuses to give back a document,
then run with the -v option to get
trace.  You will see the daemon setting
up the rules for translating requests
into local URLs, and you will see
its attempt to access the file (assuming
you map requests onto files).
\begin{verbatim}		httpd -v -p 8000

\end{verbatim}
Try to access the document from a
client using another terminal window.
Look at the trace printout.  It will
probably explain what is happening.
 If it includes specific messages
below, follow them to detailed help.
\begin{itemize}
\item Can't find internet hostname `'
\end{itemize}If you still can't figure out the
problem, mail your local guru help
desk or if desperate www-request@info.cern.ch
ENCLOSING a copy of that trace.
\subsubsection{Even simpler}For testing a daemon very simply,
without using a client, you can make
the terminal be the client.  With
httpd, or if the server is a shell
script "myserver", try just running
it with the terminal and typing GET
/documentname into its input:
\begin{verbatim}			> httpd
			GET /

\end{verbatim}
Try it with the -v option if what
comes back isn't a formatted document.
Tim BL


\section{The basic W3 server:  Internals}This describes the generic hypertext
daemon (server) program. The daemon
is part of the WWW project. See also:
\begin{itemize}
\item User guide . 
\item Bugs and Features
\item Other servers
\end{itemize}The hypertext daemon, like the ftp
daemon, is a program which responds
to an incoming tcp connection and
provides a service to the caller.
\subsection{Sources}A compilation option (SELECT) controls
whether more than one connection
can be handled at a time. This is
a function of whether the TCP/IP
implementation beneath the application
has a working "select()" routine.
If  it is not true, this implementation
services one connection, then drops
it before accepting another one.
In neither case does the daemon concurrently
serve two clients, nor does it fork
off a process to do that.\par 
The basic server loop is in the file
HTDaemon.c .  A separate module (
for example HTRetrieve.c ) contains
the code to handle one request. Various
specific versions of this may be
written for different flavours of
server. Also used are various modules
of WWW common code.  The httpd released
from CERN uses almost the entire
W3 library and can therefore access
any object which a browser running
on that machine can access, and return
it as HTML or some other format.
Tim BL


\section{Bugs and Improvements needed}Improvements to be made in the HTTP
daemon program are as follows. (See
also Features.)
\begin{itemize}
\item Call shell scripts to perform searches
on directory trees or documents.
\item The HTRetrieve() routine ought to
be able to pick up the user node
and userid, etc...
\item Ought to have chroot option. (wwwww
July 93)
\end{itemize}
Tim BL


\section{Daemon features: Update history}History list for the WWW daemon .
(See also bugs ).  Many other changes
to the daemon are in fact changes
to the common code library.
\subsection{2.12  11 October 93}
\begin{itemize}
\item First release with access
authorization.
\end{itemize}
\subsection{2.06  7 June 93}
\begin{itemize}
\item Bug fix: Load error 500 returned
as proper HTTP status, not as simple
document.
\item WAIS gateway now caches source files
again.
\item Bug fix: Daemon used to try to display
graphics file locally on the server
when the client couldn't display
them!  Cause of much confusion  :-)
\end{itemize}
\subsection{2.05}
\begin{itemize}
\item Big bug fix in local file directory
handling .. didn't work in 2.04!
\end{itemize}
\subsection{2.04  28 April 93}
\begin{itemize}
\item With the properly compiled libwww
library, this daemon will operate
as a WAIS, news etc gateway if so
configured.
\item WAIS gateway operation bug fix.
\end{itemize}
\subsection{2.03-beta: unreleased}
\begin{itemize}
\item Bug fix: operation with no rule file
didn't work as expected.
\end{itemize}
\subsection{2.02-beta: 17 March 93}
\begin{itemize}
\item Misleading error trace removed. 
\item Compiled on HP, SGI, Sun, DEC, NeXT
and binaries available
\item Binary handling fixed in library.
\item Reference to missing HTDirRead.h
removed.
\item Assumes that user can handle files
of unknown format (application/binary).
\end{itemize}
\subsection{2.00-alpha  15 Mar 93}
\begin{itemize}
\item Simple command line $--$ with no parameters,
exports the /Public directory.
\item Multiformat handling $--$ see library
changes for 2.0.  Links to .multi
filenames resolve to any file with
same root, any recognised extension.
\end{itemize}
\subsection{Unreleased 0.9b}
\begin{itemize}
\item Bug fix: If a PASS or FAIL line in
the configuration file acted on a
single document id (ie no wildcard)
then it crashed the daemon. (HTRules.c,
17-Jun-92, TBL).
\end{itemize}
\subsection{Sept 1991 v0.3}
\begin{itemize}
\item Bug fix: Plain text files were returned
to be parsed as SGML, causing them
to come out as garbage. (Mike Sendall)
\end{itemize}
\subsection{August 1991 v 0.2}
\begin{itemize}
\item -R option now suppresses default
rule file.
\item Rule file format changed completely.
Now allows authorisation of specific
paths only.
\end{itemize}
\subsection{June 1991 version 0.1}
\begin{itemize}
\item -r and -R options for rules
\item Default address is now for Inet daemon
working. (29 June)
\item -l option to log to a file.
\item -a option for address other than
default
\end{itemize}
Tim BL


\subsection{Internet hostname `'}Sounds as though you are running
a server which has a bad rule file.
(This error also happened with pre-2.07
servers when they weren't given a
rule file).\par 
You need something like
\begin{verbatim}		pass     /*    file:/my/directory/*

\end{verbatim}
in your rule file.  The "file:" bit
is important as it shows that the
rest is a local filename. If you
don't put that, then the server can
output this message in attempting
to access the document over the net
without a hostname.
Tim BL\par 


\section{Access Authorization Overview}

\begin{DL}{allow this much space}

\item[Status of this documentation:]

This is the documentation of WWW
telnet-level Access Authorization as implemented in October 1993
({\tt Basic} scheme, part of the WWW
Common Library).  It also contains proposals for encryption-level
protection ({\tt Pubkey} scheme proposal and RIPEM based
proposal).

\end{DL}

\subsection{Quick References}
\begin{DL}{allow this much space}

\item[AA protocol in a nutshell:]
Protocol examples.

\item[Setup in a nutshell:]
a quick manual on how to set
up protection in the CERN Daemon.

\end{DL}

\subsection{Complete Documentation}
\begin{DL}{allow this much space}

\item[CERN Server:]
Protection setup user manual.

\item[Scheme specifications:]
Basic Protection Scheme ({\tt Basic})
as implemented in October 1993, Public Key
Protection Scheme ({\tt Pubkey}) proposal, and RIPEM based
proposal by Tony
Sanders.

\item[Details:]
Browser side AA and Server side AA.

\item[See also:]
Vocabulary.

\end{DL}

\subsubsection{AA Testing Page}

\par 
AL 14 October 1993


\subsection{Access Control List File}

The Access Control List File ({\tt .www\_acl})
contains access information for the files
in that directory. It is of the following format:
\begin{verbatim}
        template: method,method,...: group,user,group,...
\end{verbatim}


\begin{DL}{allow this much space}

\item[{\tt template}] is the name of a file in that directory
(not containing the path). It may contain one wildcard {\tt *}
like any template in the WWW server rule file. If there are entries with
references to files in other directories they are completely
ignored.

\item[{\tt method,method,...}] is a list of methods allowed.

\item[{\tt group,user,group,...}] lists the groups and users
allowed to execute those methods for files matching the
{\tt template}.  The group list can also have IP number templates,
and in fact the group definition syntax is
exactly as in the group file.

\end{DL}

There may be many entries for each file, for example the following is
valid:
\begin{verbatim}
        * : get,put : ari,tim,robert
        * : get     : group1,group2
\end{verbatim}

Should {\tt put} imply {\tt get}?
\par 

If an entry for a file is missing, the file is considered to be
completely protected from everybody, and it is never served to
anybody.\par 
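
The entry format and the "no entry means no access" rule can be
illustrated with a short Python sketch. It is a simplified model, not
the server's implementation: the function names are invented for the
example, and the last field is split on plain commas, so parenthesised
lists and IP address templates from the group definition syntax are
not handled here.

```python
def matches(template, name):
    """Match a file name against a template with at most one '*'."""
    if "*" not in template:
        return template == name
    prefix, suffix = template.split("*", 1)
    return (len(name) >= len(prefix) + len(suffix)
            and name.startswith(prefix) and name.endswith(suffix))


def parse_acl_line(line):
    """Split 'template: method,...: member,...' into its three fields."""
    template, methods, members = [part.strip() for part in line.split(":", 2)]
    return (template,
            [m.strip().lower() for m in methods.split(",") if m.strip()],
            [g.strip() for g in members.split(",") if g.strip()])


def allowed_members(acl_lines, filename, method):
    """Collect the users and groups that some entry allows to use
    method on filename.  An empty result means access is refused."""
    found = []
    for line in acl_lines:
        if not line.strip():
            continue
        template, methods, members = parse_acl_line(line)
        if matches(template, filename) and method.lower() in methods:
            found.extend(members)
    return found


ACL = [
    "* : get,put : ari,tim,robert",
    "* : get     : group1,group2",
]

if __name__ == "__main__":
    print(allowed_members(ACL, "index.html", "put"))
```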

Note 1:
If an Access Control List File exists it is {\em always} consulted,
even when there is no {\tt protect} rule saying so.  In other
words, an existing ACL file turns protection on (and then there must
be a matched {\tt defprot} rule).\par 

Note 2:
If there is a {\tt protect} rule protecting a directory, but
there is no corresponding ACL {\em and} there is no
{\tt Mask-Group} definition in the protection setup file, the
situation is handled as if the ACL were empty (i.e. contained no entries for
any files), so access will be forbidden.\par 

AL 14 October 1993


\subsection{Basic Protection Scheme}

The Basic Protection Scheme consists of the following steps:
\begin{itemize}
\item  (Server sends an Unauthorized status).
\item  Client authenticates himself.
\item  Server checks authentication and authorization.
\item  If the previous step was successful, the document is sent normally by the
server.
\item  The document is received normally by the client.
\end{itemize}
If the server protection hierarchy is clear and the browser
sophisticated enough to figure out right away if a document is
protected, the first step is visited very seldom (possibly only once)
during the entire browsing session for each protected server.\par 


\subsubsection{Step 1:  Server Sends an Unauthorized Status}

Once a server receives a request without an
{\tt Authorization:} field to access a document that is
protected, it sends an {\tt Unauthorized 401} status code, and
a set of {\tt WWW-Authenticate:} fields containing valid
authentication schemes and their scheme-specific parameters.\par 

In the {\tt Basic} scheme the reply is as follows:
\begin{verbatim}
        HTTP/1.0 401 Unauthorized -- authentication failed
        WWW-Authenticate: Basic realm="CollabName"
\end{verbatim}

where realm specifies the password file used; the same server can use
different password files for different trees of documents (this is the
{\tt server-id} specified in the CERN server protection setup
file). The client can thus figure out which password to use at any given
time.\par 


\subsubsection{Step 2: Client Authenticates Himself}

After receiving the {\tt Unauthorized} status code, the browser
prompts for user name and password (if they are not already given by
the user), and constructs a string containing those two separated by a
colon:
\begin{verbatim}
        username:password
\end{verbatim}

This string is then encoded into printable
characters, and sent along with the next request in the
{\tt Authorization:} field as follows:
\begin{verbatim}
        Authorization: Basic encoded_string
\end{verbatim}
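
As an illustration of this step, the following Python sketch builds
the field. The text above only says that the string is encoded into
printable characters; the use of base64 here is an assumption (it is
the encoding later standardised for the Basic scheme), and the function
name and the example credentials are invented.

```python
import base64


def authorization_field(username, password):
    """Build an Authorization: field for the Basic scheme.

    Assumption: the printable encoding is base64.
    """
    raw = (username + ":" + password).encode("utf-8")
    return "Authorization: Basic " + base64.b64encode(raw).decode("ascii")


if __name__ == "__main__":
    print(authorization_field("ari", "secret"))
```

Note that the encoding is trivially reversible, which is why the
scheme offers no protection against someone listening to the traffic.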

\par 


\subsubsection{Step 3:  Server Checks Authentication and Authorization}

When the server receives a request to access a document protected by
the Basic Scheme, and the request is a full
request containing an {\tt Authorization:} field which
contains the Basic Scheme information, it will execute the following
Access Request Validation Procedure:
\begin{itemize}

\item The server receives an {\tt Authorization:} field with the
scheme name {\tt Basic} and encoded authorization string.

\item  If the scheme name is wrong, access is denied, and an
{\tt Unauthorized 401} status with a
{\tt WWW-Authenticate:} field containing the appropriate scheme
name ({\tt Basic}) and realm name is sent back (as if no
authorization information was given).

\item  If the scheme name is correct the authorization string is decoded.

\item  If the access information is correct, the result should have two
fields separated by a colon, of which at least the first must be
non-empty (there can be a username without a password).

\item  If not, access is denied, and an {\tt Unauthorized 401}
status with the appropriate {\tt WWW-Authenticate:} field is sent
back.

\item  Otherwise, username and password are checked for validity
against the password file.

\item  If the username-password pair is incorrect, access is denied with
an {\tt Unauthorized 401} status and
{\tt WWW-Authenticate:} field etc.

\item  If the username-password pair is correct, the server checks if
the user and connecting IP address are members of the {\tt mask-group}
(if specified in the protection setup file), using the group file.

\item  The server then looks for an entry for the requested file in the
corresponding Access Control List
File, which is in the same directory as the file to be accessed,
named {\tt .www\_acl} (if any).

\item  If there is neither a {\tt mask-group} nor an ACL, or if the ACL exists
but there is no entry for that file, access is denied with a
{\tt Forbidden 403} status code.

\item  If there is an ACL entry for it, the server checks if the user and
connecting IP address belong to the list of groups and users allowed
to access it (using the group file).

\item  If not, an {\tt Unauthorized 401} status etc. is sent.

\item  Otherwise, the server checks if the requested file exists.

\item  If not, a {\tt Not found 404} status is sent back.

\item  Otherwise access is allowed, and the server sends the
document normally to the browser.

\end{itemize}
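
The order of the checks in this procedure can be modelled compactly.
The following Python sketch is a simplification under stated
assumptions, not the server's code: the password file is replaced by a
dictionary, the ACL lookup by a ready-made set of allowed users, the
{\tt mask-group} and IP address checks are left out, base64 is assumed
as the printable encoding, and returning 200 for success is an
assumption. All names are invented for the example.

```python
import base64


def validate(auth_field, passwords, acl_members, file_exists):
    """Return a status number for one request.

    auth_field   value of the Authorization: field, or None if absent
    passwords    dict of username -> password (stands in for the password file)
    acl_members  set of users allowed by the ACL entry, or None if no entry
    file_exists  whether the requested file exists
    """
    if not auth_field:
        return 401                       # no authorization information
    scheme, _, encoded = auth_field.partition(" ")
    if scheme != "Basic":
        return 401                       # wrong scheme name
    try:
        decoded = base64.b64decode(encoded.strip(), validate=True).decode("utf-8")
    except ValueError:
        return 401                       # string cannot be decoded
    username, colon, password = decoded.partition(":")
    if not colon or not username:
        return 401                       # need "username:password"
    if passwords.get(username) != password:
        return 401                       # bad username-password pair
    if acl_members is None:
        return 403                       # no ACL entry for the file
    if username not in acl_members:
        return 401                       # user not allowed by the entry
    if not file_exists:
        return 404
    return 200                           # document is sent normally
```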
\par 

See also the discussion about Basic
Protection Scheme.\par 

AL 14 October 1993


\subsection{Discussion About the Basic Protection Scheme}

Because the password flies (almost) unencrypted through the Internet,
anyone who is listening to the Internet traffic can find out people's
user names and corresponding passwords. Thus this kind of a
telnet-level protection only protects from accidental viewing of
classified documents.\par 

You might think of this as a door. If you really want to get in, you
can always break the lock, and get what you want, but the bottom line
is that you have then broken something, and that is wrong.\par 

Thus the Basic Protection Scheme only provides the means of telling
people that a document is a protected one; it does not prevent the
document from being accessed by someone who wants it badly enough to
go through the trouble of listening to the Internet traffic, finding
out which printable encoding scheme we use, and decoding your username
and password.\par 

However, using IP address masking together with usernames {\it ("only
these people from these internet addresses")} makes it more secure,
because an intruder would also have to have access to the machine
having the required IP address. \par 


AL 14 October 1993


\subsection{Browser Side Access Authorization Description}

The exact browser side Access Authorization procedures are described
in the corresponding protection scheme specification:
\begin{itemize}
\item  Basic Protection Scheme
({\tt Basic}) and
\item Public Key Protection
    Scheme ({\tt Pubkey})
\end{itemize}

During a browsing session the client side keeps track of the hosts,
schemes, and the corresponding usernames and passwords.  Because the
browser keeps track of this authorization information, on subsequent
requests to servers that it has contacted already during a particular
browsing session, the browser can automatically send the authorization
information
\begin{itemize}
\item without first failing to access the document, and
\item without having to re-prompt for the username and password from the user.
\end{itemize}


\subsubsection{How Does the Browser Know When to Send AA Info}

The protected documents are to be collected into directories of
protected documents. In those directories there should be only
protected documents, all of which are protected by the same scheme.
The browser can then use this assumption to make the decision about
whether to send authorization information along with the request:
\begin{quote}
	If the server replies 401 (Unauthorized) for some file, every
	other file in that directory and in its subdirectories is
	considered protected by that same server (and scheme).
\end{quote}
The 'directory' in this context means what seems to be a directory
when examining a given URL.\par 
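
A browser could keep track of protected directories along the
following lines. This Python sketch is only one possible reading of
the rule above: the class name {\tt ProtectedTrees}, its methods and
the example URLs are invented, and the "directory" is taken to be
everything in the URL path up to and including the last slash.

```python
from urllib.parse import urlsplit


class ProtectedTrees:
    """Remember which URL directories answered 401, with their credentials."""

    def __init__(self):
        self._trees = []   # list of (host, directory prefix, credentials)

    @staticmethod
    def _directory(path):
        # Everything up to and including the last slash of the path.
        return path[:path.rfind("/") + 1] or "/"

    def remember(self, url, credentials):
        """Record that url was protected and which credentials worked."""
        parts = urlsplit(url)
        self._trees.append((parts.netloc, self._directory(parts.path), credentials))

    def credentials_for(self, url):
        """Return stored credentials if url lies in a known protected
        directory or one of its subdirectories, else None."""
        parts = urlsplit(url)
        for host, prefix, credentials in self._trees:
            if parts.netloc == host and parts.path.startswith(prefix):
                return credentials
        return None
```

With such a table the browser can send the {\tt Authorization:} field
on the first request for any document below a directory it already
knows to be protected.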

AL 15 October 1993


\subsection{Security Hole in the Unix Finger Daemon}

On some systems, the finger daemon, {\tt fingerd}, was run
under user-id zero (root). In this case a user could make his
{\tt .plan} file just a link to a read-protected file.
Then by fingering himself he could access that file.\par 

AL 14 October 1993


\subsection{Group File}

Each user may belong to zero or more groups, and a group may contain
zero or more users and/or other groups. Groups are just abbreviations
for long lists of users. Group names can be referenced in the protection setup
file (in the {\tt mask-group} field), and in the ACL file (the last
field in each line). \par 

\subsubsection{Group Declaration}

Each line in the group file contains information about one group, and
the format is as in the following example (this is called a {\it group
declaration}):
\begin{verbatim}
        groupname: user1,user2,group1,user3,group2
\end{verbatim}

That is, the groupname is followed by a colon followed by a
comma-separated list of usernames and/or groupnames in arbitrary order
(this list is called a {\it group definition}). \par 

A groupname must be defined before it is referenced (and a groupname
is not defined inside its own definition).  An undefined reference is
treated as a username.  This guarantees the absence of circular
structures in the group hierarchy.\par 


\subsubsection{Syntax of Group Definition Part}

The group definition part appears not only in the group file, but also
\begin{itemize}

\item  in the {\tt mask-group} field in the protection setup file, and

\item  as the last item on each line of the ACL file.

\end{itemize}
Group definitions are in their simplest form just one user or group
name, or a comma-separated list of them. \par 


\paragraph{IP Address Masks}

Any group definition may contain an IP address restriction like:
\begin{itemize}
\item  $<$I$>$"anybody from these IP addresses"$<$/I$>$
\item  $<$I$>$"this user from these addresses"$<$/I$>$
\item  $<$I$>$"these users and groups from these addresses"$<$/I$>$
\end{itemize}
An IP address restriction starts with an at sign $<$CODE$>$@$<$/CODE$>$ and is
followed by an IP number template. In an IP template each of the four parts
may contain one wildcard character, $<$CODE$>$*$<$/CODE$>$. \par 

An IP address restriction can stand on its own, in which case it allows anyone
from a matching address:
\begin{verbatim}
    cern_site: @128.141.*.*
\end{verbatim}

However, it can also immediately follow a user or group name, in which
case these users are only allowed if they connect from a matching
address:
\begin{verbatim}
    ari_at_work: luotonen@128.141.8.187
\end{verbatim}
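Matching an address against an IP template can be sketched in Python as follows. The function name $<$CODE$>$ip\_matches$<$/CODE$>$ is hypothetical, and the sketch assumes that each of the four parts of a template is either the wildcard $<$CODE$>$*$<$/CODE$>$ or a literal number; the daemon itself may accept more.\par 

```python
def ip_matches(template: str, address: str) -> bool:
    """Return True if a dotted address matches a template such as 128.141.*.*."""
    template_parts = template.split(".")
    address_parts = address.split(".")
    if len(template_parts) != 4 or len(address_parts) != 4:
        return False
    # Assumption: a template part is either "*" or a literal number.
    return all(t == "*" or t == a
               for t, a in zip(template_parts, address_parts))
```

With this sketch the template $<$CODE$>$128.141.*.*$<$/CODE$>$ accepts $<$CODE$>$128.141.8.187$<$/CODE$>$ and rejects $<$CODE$>$128.142.8.187$<$/CODE$>$.\par 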


\paragraph{Lists of Names and IP Address Templates}

It is possible to make a list of users and groups, and IP addresses,
and combine them all together with parentheses:
\begin{verbatim}
    cern_hackers: (luotonen,timbl)@(128.141.8.187, 128.141.244.101)
\end{verbatim}


\paragraph{Continuation Line}

Long group definitions can be split across multiple lines after any comma
in the group definition:
\begin{verbatim}
    wizards: marca, sanders, kevin, dave, montulli, timbl,
             cailliau, hallam, jak
    hackers: marca@141.142.*.*, sanders@153.39.*.*,
             (luotonen, timbl, hallam)@128.141.*.*,
             cailliau@(128.141.201.162, 128.141.248.119)
\end{verbatim}
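The syntax described above (names, parenthesized lists, $<$CODE$>$@$<$/CODE$>$ address parts and continuation lines) can be parsed with a short Python sketch. All function names here are hypothetical, error handling is omitted, and the sketch assumes that a line ending in a comma always continues on the next line.\par 

```python
def join_continuations(lines):
    """Merge every line that ends with a comma with the line that follows it."""
    joined, buffer = [], ""
    for line in lines:
        buffer += line.strip()
        if buffer.endswith(","):
            continue  # the definition continues on the next line
        if buffer:
            joined.append(buffer)
        buffer = ""
    if buffer:
        joined.append(buffer)
    return joined


def split_top_level(text):
    """Split on commas that are not inside parentheses."""
    parts, depth, current = [], 0, ""
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append(current.strip())
            current = ""
        else:
            current += ch
    if current.strip():
        parts.append(current.strip())
    return parts


def parse_list(text):
    """Turn 'a' or '(a, b)' into a list of items; empty text gives []."""
    text = text.strip()
    if text.startswith("(") and text.endswith(")"):
        text = text[1:-1]
    return [item.strip() for item in text.split(",") if item.strip()]


def parse_declaration(line):
    """Return (groupname, [(names, address_templates), ...])."""
    name, _, definition = line.partition(":")
    items = []
    for item in split_top_level(definition):
        names, _, addresses = item.partition("@")
        items.append((parse_list(names), parse_list(addresses)))
    return name.strip(), items
```

Each item of the result pairs a list of user or group names with a list of address templates; an empty name list means "anybody from these addresses", and an empty template list means "from anywhere".\par 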


See also: Password file.\par 

AL 14 October 1993


\subsection{Discussion About Unix Links}

Usually WWW servers providing protected information also want to
provide public information. Since the information about which document
files are protected cannot reside in the same file as the document
itself, the Unix links (both soft and hard) pose a serious safety
problem, because with them it is possible to make a file appear in
some other directory than where it really resides.\par 


\subsubsection{Description of How to Override the Protection}

\begin{itemize}
\item Make a document.
\item Make a Unix (soft or hard) link to the protected document
    (has to be on the same machine, of course).
\item From your own document make a hypertext link to the newly
    created Unix link.
\item You can now access the protected document by following the
    hypertext link in your own document.
\end{itemize}

As you can see, in order to gain illegal access to protected
documents, the person has to have an account on the same machine where
the document resides. Also, the Unix link must be placed in the tree served by the real
WWW server on that machine, not just by a privately run copy of it
(because a private copy would not have Unix read access to the protected
documents). Thus not just anybody can override the access
authorization system, even though the weak spot exists.\par 

However, the worst thing about this is that it is not just
the creator of the Unix link who gains access to the protected data,
but in fact every person who can access that file (link) through the
Web (and that is the entire world).\par 

The problem originates from the fact that the WWW server has Unix
access to both protected and public documents, and it has to resolve
whether a requested document is in fact protected or public; the underlying Unix
file system certainly doesn't make that any easier.\par 

Unix links have caused similar trouble
before, too.


\subsubsection{Solution in CERN Daemon}

Obviously the simplest and safest solution is to run the server under
a user-id that has access to the documents of one collaboration but
not to any others. Because the server has to be able to serve documents
of multiple collaborations, it first runs as $<$CODE$>$root$<$/CODE$>$ and
sets its process user and group ids just before serving the request.
\par 

AL 14 October 1993


\subsection{Discussion about Passwords in Access Authorization}

There are a number of ways in which a user can identify himself to a
system; some of them are better than the others in some way, but worse
in another.\par 

In order to build a completely bulletproof protection scheme
into the WWW system we would have to construct a system so heavy that it
would ruin everything that WWW stands for: $<$EM$>$fast$<$/EM$>$ access to a
$<$EM$>$large$<$/EM$>$ amount of data from $<$EM$>$anywhere$<$/EM$>$ in the world
$<$EM$>$easily.$<$/EM$>$ The Web is not a place to keep state secrets.
However, it should provide at least some level of protection.\par 

There are three methods worth considering in this connection:
\begin{itemize}
\item username and password identification
\item single key encryption
\item public key encryption
\end{itemize}

If we use the public key encryption, the user must keep the private
key in some file somewhere accessible to the WWW browser. Since some
platforms do $<$EM$>$not$<$/EM$>$ provide $<$EM$>$any$<$/EM$>$ kind of protection of
data at all, this method by itself is not sufficient.\par 

Moreover, WWW users must be able to access the Web from anywhere
in the world. This implies that they would have to carry a diskette
or some other medium containing the private key with them.
The users would also have to worry about the medium being
compatible with the platform they are temporarily using. Otherwise the
private key would have to be transmitted via an insecure channel,
which is why we need the encryption system in the first place.\par 

Even if we should end up using public key or some other advanced encryption
method as a means of authentication, one problem still remains: a
chain is only as strong as its weakest link. In other words, no matter
how safe the WWW system itself is, the platforms using it cannot all
guarantee that the user is who he says he is.\par 

For example, someone might break into a machine and use someone else's
public-private key pair, and the WWW system could not do anything
about it.\par 

Since most of the platforms use only a simple username-password
method, it will suffice for the protection needs of the WWW system, at
least for the time being. We shall call this telnet-level protection
scheme the Basic Access Authorization Scheme,
or $<$CODE$>$Basic$<$/CODE$>$ for short.\par 

Later on, an enhanced version of this (a combination of all the three
methods mentioned above) will be implemented, called the Public Key Access Authorization Scheme, or
$<$CODE$>$Pubkey$<$/CODE$>$ for short.\par 

AL 14 October 1993


\subsection{Password File}

The information about users and their passwords is kept in a password
file of the server. Each line in the
password file contains information about one user, in the following
format:
\begin{verbatim}
        username:password:real name and maybe other information
\end{verbatim}


The $<$CODE$>$password$<$/CODE$>$ field is encrypted with the C library
$<$CODE$>$crypt()$<$/CODE$>$ function.  This makes it compatible with the Unix
password file ($<$CODE$>$/etc/passwd$<$/CODE$>$). The password file can be
maintained with the $<$CODE$>$htadm$<$/CODE$>$
program.\par 

The password file should not reside in the served tree of documents, or it
should be carefully checked that the rule file prevents it from being
accessed via the WWW server.\par 

There must not be duplicate entries for the same username, and
a username must never contain colons.\par 
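The format and the two constraints above can be checked with a small Python sketch. The function name $<$CODE$>$parse\_password\_file$<$/CODE$>$ is hypothetical, the encrypted password is treated as an opaque string, and the sketch assumes (this is not stated above) that the free-form last field may itself contain colons.\par 

```python
def parse_password_file(lines):
    """Map each username to (encrypted_password, other_information)."""
    users = {}
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue
        # Split on the first two colons only. A username cannot contain a
        # colon, so the first colon always ends the username.
        fields = line.split(":", 2)
        if len(fields) < 2:
            raise ValueError(f"line {number}: expected username:password[:info]")
        username, password = fields[0], fields[1]
        info = fields[2] if len(fields) == 3 else ""
        if username in users:
            raise ValueError(f"line {number}: duplicate entry for {username!r}")
        users[username] = (password, info)
    return users
```

Rejecting duplicates at load time, rather than silently keeping the first or the last entry, makes a broken password file visible immediately.\par 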

See also: Group file.\par 

AL 14 October 1993


\subsection{Selecting the Encryption Methods for the pubkey Protection Scheme}

There are two encryption methods needed to implement the Public Key Protection Scheme. We need a
conventional single key method, where the same key both encrypts and
decrypts (for encrypting and decrypting the server reply: the headers
and the document itself), and a public key method (used for encrypting
user's identification information and his encryption key).\par 

The reason for using two encryption methods is that public
key encryption is too slow for large amounts of data (documents), so
the documents have to be encrypted with a single key method. But the key
has to be sent over an insecure channel, and the way to do this
securely is to use a public key method.\par 


\subsubsection{Single Key Methods to Consider}

The following single key encryption methods are worth considering:
\begin{DL}{allow this much space}
\item[DES] Patent in the U.S.
\item[IDEA] Patent in Europe, no license fee for noncommercial use.
\end{DL}

I suggest that DES encryption be used, since there are so many
different implementations all over the world that it is easy to plug
one in, provided clear hooks are left in the WWW Common Library code.\par 


\subsubsection{Public Key Methods to Consider}

The following public key encryption methods are worth considering:
\begin{DL}{allow this much space}
\item[RSA] Rivest-Shamir-Adleman, patent in the U.S.

\item[Rabin] Public Key Partners claim their
patent covers all public key cryptography.
\end{DL}

 \par 

AL 14 October 1993


\subsection{Public Key Protection Scheme Proposal}

In the Basic Protection Scheme the password
travels through the net unencrypted, which is not a very good idea. One
solution to this is to encrypt the username and password with the public key
of the server.\par 

Furthermore, the documents might be classified or copyrighted in such
a way that they need to be encrypted, too, while transferring them
through the Internet.\par 

The Public Key Protection Scheme consists of the following steps:
\begin{itemize}
\item  (Server sends an Unauthorized status).
\item  Client authenticates himself.
\item  Server checks authentication and authorization.
\item  Server sends an encrypted reply.
\item  Client decrypts the reply from server.
\end{itemize}


\subsubsection{
Step 1:  Server Sends an Unauthorized 401 Status}

On reception of a request to access a protected document the Public
Key Protection Scheme works like the Basic Protection
Scheme, except that the WWW server also sends its public key in the
$<$CODE$>$WWW-Authenticate:$<$/CODE$>$ header field of the reply:
\begin{verbatim}
        HTTP/1.0 401 Unauthorized -- authentication failed
        WWW-Authenticate: Basic realm="CollabName",
                                key="encodedPublicKey"
\end{verbatim}


$<$B$>$Note:$<$/B$>$ If the client has
already given the $<$CODE$>$Authorization:$<$/CODE$>$ field with the request,
the scheme continues at step 3: server checks
authentication and authorization.\par 


\subsubsection{
Step 2:  Client Authenticates Himself}

After having received the $<$CODE$>$Unauthorized$<$/CODE$>$ status code (or
otherwise knowing from a previous request to the server that it
requires authorization information when accessing the desired file),
the browser prompts for the username and password (unless already given),
generates a random encryption key, then concatenates the user name,
password, browser's IP address, timestamp and the generated encryption
key, with colons as separators:
\begin{verbatim}
    username:password:browser_inet_address:timestamp:browser_key
\end{verbatim}

The browser then encrypts the resulting string with the server's public key and encodes it into printable characters.\par 

The client then sends the encrypted string along with the
next request in the $<$CODE$>$Authorization:$<$/CODE$>$ field as follows:
\begin{verbatim}
        Authorization: Pubkey encrypted_string
\end{verbatim}
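The client side of this step can be sketched in Python as follows. The function name $<$CODE$>$build\_authorization$<$/CODE$>$ is hypothetical; the public key encryption is left to a caller-supplied function $<$CODE$>$encrypt$<$/CODE$>$, and the timestamp format (integer seconds), the key format (16 hexadecimal digits) and the use of base64 as the printable encoding are assumptions made only for this illustration.\par 

```python
import base64
import secrets
import time


def build_authorization(username, password, browser_address, encrypt, now=None):
    """Return (value of the Authorization: field, fresh browser key)."""
    if ":" in username:
        raise ValueError("username must not contain colons")
    timestamp = int(time.time() if now is None else now)
    # A new random key is generated for every request (assumed format: hex text).
    browser_key = secrets.token_hex(8)
    plain = ":".join(
        [username, password, browser_address, str(timestamp), browser_key]
    )
    # `encrypt` stands in for encryption with the server's public key.
    ciphertext = encrypt(plain.encode("ascii"))
    printable = base64.b64encode(ciphertext).decode("ascii")
    return "Pubkey " + printable, browser_key
```

The browser keeps the returned key in memory so that it can decrypt the server's reply to this one request.\par 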


$<$B$>$Note:$<$/B$>$
Although the browser's encryption key exists only in its memory and in
the server's memory, and is encrypted with the server's public key
while it travels through the network, the same key should not be used
twice. A new key should be generated even when accessing the same
server, thus reducing the possibility of the encryption key being
cracked.\par 

The browser's encryption key is concatenated with the identification
information $<$EM$>$before$<$/EM$>$ encryption to guarantee that even if
someone captures the authorization string it will be useless, because
using it will produce undecryptable results. Thus replaying is
possible from the same internet address as the original request during
the (short) time when the timestamp is valid, but it is useless.\par 


\subsubsection{
Step 3:  Server Checks Authentication and Authorization}

When the server receives an access request to a document protected by
the Public Key Access Authorization Scheme, and the request is a full
request containing $<$CODE$>$Authorization:$<$/CODE$>$ field which
contains the Public Key Scheme authorization information, it will
execute the same Access Request
Validation Procedure as in $<$CODE$>$Basic$<$/CODE$>$ scheme with the
following exceptions and additions:

\begin{itemize}
\item  The authorization string is decrypted with the server's private key
after decoding it from printable characters.

\item  If the access information is correct, the result should be five fields
with colons as field separators. Those fields contain the username,
password, internet address, timestamp and browser's encryption key,
respectively.

\item  The IP address is checked against the actual requesting address. If they do not
match, access is forbidden ($<$CODE$>$403$<$/CODE$>$ status code).

\item  The timestamp is checked against the current server time. If it is not within
limits, access is denied because of failing authentication
($<$CODE$>$401$<$/CODE$>$ status code). The server also sends a
$<$CODE$>$WWW-Server-Time:$<$/CODE$>$ field giving the browser its current
time (this removes the need for synchronized clocks). \par 

\end{itemize}
\par 
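The additional checks listed above can be sketched in Python. The function name $<$CODE$>$check\_pubkey\_fields$<$/CODE$>$ and the tolerance $<$CODE$>$max\_skew$<$/CODE$>$ are hypothetical, the timestamp is assumed to be an integer number of seconds, and returning $<$CODE$>$401$<$/CODE$>$ for a string that does not split into five fields is an assumption. The password itself would still be verified as in the $<$CODE$>$Basic$<$/CODE$>$ scheme.\par 

```python
def check_pubkey_fields(decrypted, requesting_address, server_time, max_skew=300):
    """Return (status, fields) for a decrypted Pubkey authorization string."""
    fields = decrypted.split(":")
    if len(fields) != 5:
        return 401, None  # decryption failed or the data is malformed
    username, password, address, timestamp, browser_key = fields
    if address != requesting_address:
        return 403, None  # the request came from a different address
    try:
        sent_at = int(timestamp)
    except ValueError:
        return 401, None
    if abs(server_time - sent_at) > max_skew:
        return 401, None  # timestamp outside the accepted window
    return 200, (username, password, browser_key)
```

A status of $<$CODE$>$200$<$/CODE$>$ here only means that these extra checks passed; the username and password returned still have to go through the normal validation procedure.\par 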


\subsubsection{
Step 4: Server Sends An Encrypted Reply}

In the Public Key Scheme, if the client is allowed to access the
document, the reply from the server may be encrypted. The server replies with
the usual status line, immediately followed by the
$<$CODE$>$DEK-Info:$<$/CODE$>$,
$<$CODE$>$Key-Info:$<$/CODE$>$ and
$<$CODE$>$MIC-Info:$<$/CODE$>$ fields
(almost as in RFC1421):
\begin{verbatim}
        HTTP/1.0 200 Document follows
        DEK-Info: DES-CBC,BFF968AA74691AC1
        Key-Info: DES_ECB,DJSFo7dSDFf34hKHFD8234jDFf2bfasdf832DF3nZ
        MIC-Info: MD5,
         LDKJF3kr34hfDuf23r98FBk38ftDFP9873hbrFDp9gb23kfDPF2b3JfKeL7G
         DLkwtDICl234FJi9834kjfslk
        ... other headers and the encrypted document follow ...
\end{verbatim}


The document body is not encoded into printable characters, but is
pure binary as output by the encryption procedure. This is to save
time, space and bandwidth. \par 


\subsubsection{
Step 5: Client Decrypts The Reply From Server}

When the client receives a reply with
$<$CODE$>$DEK-Info:$<$/CODE$>$,
$<$CODE$>$Key-Info:$<$/CODE$>$ and
$<$CODE$>$MIC-Info:$<$/CODE$>$ fields, it knows
that the body is encrypted. These fields are used to decrypt the
document as described in RFC1421.\par 

The further discussion about the Public
Key Scheme considers the possible encryption methods to use.\par 

AL 14 October 1993



\subsection{Discussion about the Public Key Protection Scheme}

Implemented in this way, Access Authorization does not violate
the implementation guideline requiring that there be no
sessions between client and server.\par 


The Public Key Scheme in itself is independent of the encryption
methods selected. The only requirements are that one is a public key
cryptosystem, and the other one is a single key cryptosystem, and that
server and client agree on the cryptosystems used.  See the discussion
about possible encryption methods.\par 

AL 14 October 1993


$<$B$>$Origin: $<$/B$>$
This is the file $<$CODE$>$THEORY$<$/CODE$>$ in $<$CODE$>$rpem.tar.Z,$<$/CODE$>$
available from $<$CODE$>$dcssparc.cl.msu.edu (35.8.1.6).$<$/CODE$>$
\par 
$<$B$>$Edited into HTML by: $<$/B$>$
AL
\par 

\subsection{Description of the Rabin Public Key Cryptosystem}

Here are some messages from Marc Ringuette and Bennet Yee concerning
the Rabin system.  They provide a succinct description of the system,
and statements concerning its public domainness.
\par 
Note that the version of the Rabin system I/we have implemented is not
exactly as described in Rabin's papers, so I may be giving him short
shrift here.  We/I use the Berlekamp square root algorithm
(which is very much different than the exponentiation that RSA uses) in
order to be sure that no one at RSA can claim this is an RSA ripoff.
I think it's safe to say that this square root algorithm, coupled with
the Chinese Remainder Theorem, is the "magic" that makes this whole
system work.
\par 
\begin{verbatim}
-------- Messages follow ---------------------------------------

Date: Fri, 24 Aug 1990 11:26-EDT
From: Marc.Ringuette@DAISY.LEARNING.CS.CMU.EDU
To: Mark Riordan <riordanmr@clvax1.cl.msu.edu>
Subject: Re: Royalty-free public key algorithm wanted
\end{verbatim}

Happy news - I have something for you.  My friend Bennet Yee introduced
me to it, and it's a simple PK technique, provably as hard as factoring,
that is probably equivalent to or better than RSA.  It's not patented
as far as I know...but I haven't written away to the author yet.
\par 
It was invented by Michael Rabin, and goes like this:
\begin{itemize}
\item     The private key is a pair of large random primes, as for RSA

\item     The encryption function is squaring/square root modulo pq.  Squaring
    is easy $--$ modular multiplication $--$ but taking a square root modulo
    pq is as hard as factoring.  Once you know the factors, though, it
    is possible.

\item     So to encrypt a short message with the public key, square the message
    modulo pq.

\item     To decrypt it, take the four square roots modulo pq, and choose the correct
    one somehow.
\end{itemize}
In a practical system, you use this function to encrypt a one-time key for
DES or some other private-key system, then encrypt the rest of the message
with the private key system.
\par 

p.s. Here's a brief proof that the method is as hard as factoring:
\par 
Assume you can take arbitrary square roots modulo pq.  If a number has a
square root (1 out of 4 numbers do), then it has 4 square roots, two distinct
ones and their negations mod pq.
\par 
To factor pq, choose a random number, square it, and take the square root.
With 50\% probability, you will obtain the other distinct square root.  From
these you can derive the factoring (damn, I can't quite remember how - was
it the Chinese Remainder Theorem, or some sort of GCD?).  I can fill in
the details sometime if you want.
\par 

\begin{verbatim}
Return-Path: <Marc.Ringuette@DAISY.LEARNING.CS.CMU.EDU>
Received: from DAISY.LEARNING.CS.CMU.EDU by clvax1.cl.msu.edu with SMTP ;
          Thu, 13 Sep 90 14:09:28 EDT
Date: Thu, 13 Sep 1990 14:06-EDT
From: Marc.Ringuette@DAISY.LEARNING.CS.CMU.EDU
To: ceblair@ux1.cso.uiuc.edu, riordanmr@clvax1.cl.msu.edu
Subject: Re: Is Rabin cryptosystem covered by patents?
\end{verbatim}

I just got mail from Michael Rabin, saying that his technique is in the
public domain.  Yay!
\par 

Bennet Yee adds:
\begin{verbatim}
Date: Sun, 28 Apr 91 22:06:12 EDT
From: Bennet.Yee@PLAY.MACH.CS.CMU.EDU
\end{verbatim}


Rabin's protocol is equivalent to factoring:  Suppose you have a procedure P
which, given a quadratic residue, gives one of its square roots mod pq.  The
four square roots of a quadratic residue y=x{\char94}2 mod pq are -x, x, -gamma x,
gamma x, where gamma is the nontrivial square root of unity mod pq.
\par 
Aside:  you can find gamma if you know p and q by using the Chinese
Remainder Theorem (CRT) and solving the system of equations
\begin{verbatim}
        x = -1 mod p
        x = 1 mod q
\end{verbatim}

\lbrack  You can see where the other square roots of unity come from:  they are the
other possible patterns of signs on the 1's in the system of eqns for CRT. \rbrack 
\par 
Now, given P, you choose a random r between 1 and pq-1 inclusive and compute
y = P(r{\char94}2).  With 1/2 probability, y = +/- gamma r.  Since you knew r, you
can find g = y/r = +/- gamma.  Now g-1 is either 0 mod q or 0 mod p,
so GCD(g-1,pq) will give you p or q.
\par 
\lbrack  To find 1/r mod pq, use EGCD:  The extended Euclidean algorithm, given
m,n, will find GCD(m,n) as well as the pair a,b such that am+bn=GCD(m,n).
When GCD(m,n)=1, we have a=1/m mod n. \rbrack 
\par 
Note that this can be simplified a little, since with very high probability r
does not divide pq:  r(g-1) = r(y/r - 1) = y - r, so GCD(y-r,pq) will work
just as well.  If r divides pq, you've already (accidentally) factored the
modulus.
\begin{verbatim}
-------- End of Messages -----------------------------------------------
\end{verbatim}
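Bennet Yee's reduction can be tried out on a toy modulus with the following Python sketch. The function names are hypothetical, and the square root procedure P is played by an exhaustive search that is usable only for very small moduli; the sketch uses the simplified form GCD(y-r,pq) from the last message.\par 

```python
import math
import random


def brute_force_oracle(n):
    """Toy stand-in for P: the smallest square root, found by exhaustive search."""
    def oracle(c):
        return next(x for x in range(n) if x * x % n == c)
    return oracle


def factor_with_oracle(n, sqrt_oracle, rng=random):
    """Find a nontrivial factor of n = pq, given a square root procedure."""
    while True:
        r = rng.randrange(1, n)
        shared = math.gcd(r, n)
        if shared != 1:
            return shared          # r itself already shares a factor with n
        y = sqrt_oracle(r * r % n)
        d = math.gcd((y - r) % n, n)
        if 1 < d < n:
            return d               # y was +/- gamma * r
        # Otherwise y was +/- r, which reveals nothing; try another r.
```

Each attempt succeeds with probability about one half, so only a few calls to the square root procedure are needed on average.\par 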


Let me add a few words about "choosing the correct root somehow".  If
there's one square root of X mod pq, then there are four square roots.
In general, it's not obvious which of the four square roots is the
original message.
\par 
H. C. Williams devised a modification of the Rabin system which allows
the cryptographer to decide definitively which of the four square roots
is the original message.  I started to implement Williams' variation
(see the code in cippkg.c that has been \#if'ed out), but decided that
his variation made the system look too much like RSA.  The RSA system
is great, but I don't want their lawyers after me.
\par 
So, the question remains:  how should we distinguish which of four
candidates is the original plaintext?  I decided upon a brute force
approach:  I add 64 bits of redundant information to a message before
encrypting it.  The 64 bits are simply the first 64 bits of the
message.  If the message is less than 64 bits long, it is repeated as
necessary to fill out the 64 bits.  When the ciphertext is decrypted,
the correct plaintext can be detected (with a probability of error of
2{\char94}-64, I assume) by looking for the redundancy.
\par 
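The redundancy technique can be illustrated with a self-contained Python sketch (Python 3.8 or later, for the modular inverse). It is not the code in cippkg.c: instead of the Berlekamp algorithm it takes square roots by a single exponentiation, which works only because both primes are chosen congruent to 3 mod 4. The primes are two Mersenne primes used purely for illustration, all names are hypothetical, and messages are assumed to be exactly 128 bits wide.\par 

```python
P = 2**89 - 1     # secret primes, both congruent to 3 mod 4 (Mersenne primes)
Q = 2**127 - 1
N = P * Q         # public key
MASK64 = 2**64 - 1


def add_redundancy(message):
    """Append the first 64 bits of a 128-bit message to the message."""
    if not 0 <= message < 2**128:
        raise ValueError("this sketch handles 128-bit messages only")
    return (message << 64) | (message >> 64)


def encrypt(message):
    """Square the padded message modulo N."""
    return pow(add_redundancy(message), 2, N)


def square_roots(c):
    """All four square roots of c modulo N, using the secret factors."""
    root_p = pow(c, (P + 1) // 4, P)   # valid because P = 3 mod 4
    root_q = pow(c, (Q + 1) // 4, Q)   # valid because Q = 3 mod 4
    # Chinese Remainder Theorem: a is root_p mod P and 0 mod Q, b the reverse.
    a = root_p * Q * pow(Q, -1, P)
    b = root_q * P * pow(P, -1, Q)
    r1, r2 = (a + b) % N, (a - b) % N
    return {r1, N - r1, r2, N - r2}


def decrypt(c):
    """Return the root whose last 64 bits repeat the first 64 message bits."""
    for root in square_roots(c):
        message, tail = root >> 64, root & MASK64
        if message < 2**128 and tail == message >> 64:
            return message
    raise ValueError("no root carries the expected redundancy")
```

Decryption computes all four roots and keeps the one with the matching redundancy; a wrong root passes the check only by coincidence.\par 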
This technique is ugly because it does not *guarantee* unique
detection of the correct root (though 2{\char94}-64 is good enough for me),
and also because it wastes bits.  However, the waste of bits isn't as
bad as it looks.
\par 
Messages in the Rabin system have to be broken up into chunks of size
(just less than) pq.  But since p and q need to be rather large
in order to provide adequate security, each chunk of the
message should be several hundred bits or more in size.
Using 64 bits of that to discriminate amongst
the square roots is not much overhead.  Plus,
public key systems are typically used only to encipher a message key
for a more conventional (and much faster) secret key system.  The
message key is typically much smaller than several hundred bits,
so there's plenty of room left over for redundancy.
\par 


\subsubsection{Selected References}

\begin{itemize}
\item M. O. Rabin, "Digitized signatures and public-key functions as
   intractable as factorization," MIT Lab. for Computer Science,
   Technical Report LCS/TR-212, 1979.
   \lbrack I've not located this paper myself and have instead relied upon
   references to it in other papers and upon Marc Ringuette's
   description.\rbrack 

\item H. C. Williams, "A Modification of the RSA Public-Key Encryption
   Procedure," IEEE Transactions on Information Theory, Vol IT-26,
   No. 6, November 1980.
   \lbrack I decided not to use this because it looked too RSA-like.\rbrack 

\item Trygve Nagell, Introduction to Number Theory.  New York:
   Chelsea Publishing Company, 1964.
   \lbrack Basic number theory text, better for cryptographic purposes
   than most.  See esp. the chapter "Theory of Quadratic Residues".\rbrack 

\item Henk C. A. van Tilborg, An Introduction to Cryptology.  Boston:
   Kluwer Academic Publishers, 1988.
   \lbrack Especially strong on public key systems.  Comes with handy
   appendices on number theory and the theory of finite fields.\rbrack 

\item Jennifer Seberry and Josef Pieprzyk, Cryptography:  An Introduction
   to Computer Security.  Sydney, Australia:  Prentice Hall, 1989.
   \lbrack More easily readable than most similar books, with more of
   an eye toward applications.  Contains complete C source to
   a DES implementation.  So much for DES being a secret.\rbrack 
\end{itemize}
\par 

Mark Riordan,   riordanmr@clvax1.cl.msu.edu,    late April 1991


\subsection{AA Additions to Rule File}

Access Authorization brings two additional rules to the rule file:
$<$CODE$>$protect$<$/CODE$>$ and $<$CODE$>$defprot$<$/CODE$>$. They have the same
syntax:
\begin{verbatim}
        defprot <template> <setupfile> <uid.gid>
        protect <template> <setupfile> <uid.gid>
\end{verbatim}


\begin{DL}{allow this much space}

\item[$<$CODE$>$$<$template$>$$<$/CODE$>$] is the usual template used in
the rule file to match against the requested URL.

\item[$<$CODE$>$$<$setupfile$>$$<$/CODE$>$] is the pathname of the protection setup file which sets up the
actual protection parameters.\par 

The setup file can be omitted from a $<$CODE$>$protect$<$/CODE$>$ rule, but it is
obligatory in a $<$CODE$>$defprot$<$/CODE$>$ rule. If the setup file is omitted it
is not possible to give the $<$CODE$>$$<$uid.gid$>$$<$/CODE$>$ part
either.\par 

\item[$<$CODE$>$$<$uid.gid$>$$<$/CODE$>$] are the Unix user id and group id
(either by name or by number, separated by a dot) to which the server
should change when serving the request. These are only meaningful when
the server is running as $<$CODE$>$root$<$/CODE$>$.\par 

These can be omitted, in which case they default to $<$CODE$>$nobody$<$/CODE$>$ and
$<$CODE$>$nogroup$<$/CODE$>$. Either part may also be omitted by itself, as
long as it is kept in mind that the dot belongs to the group id part:
\begin{verbatim}
	user.group        user        .group
\end{verbatim}

are all valid.

\end{DL}
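The rule syntax and the defaults described above can be captured in a short Python sketch. The function names are hypothetical and the sketch checks only the constraints stated in this section; it does not resolve names to numeric ids.\par 

```python
def parse_uid_gid(text=""):
    """Split a <uid.gid> field into (user, group), applying the defaults."""
    # The dot belongs to the group part: "user", ".group" and "user.group"
    # are all valid, and a missing part falls back to its default.
    user, _, group = text.partition(".")
    return (user or "nobody", group or "nogroup")


def parse_protection_rule(line):
    """Parse a protect or defprot line into a dictionary."""
    words = line.split()
    if not words or words[0] not in ("protect", "defprot"):
        raise ValueError("not a protect/defprot rule")
    if len(words) < 2 or len(words) > 4:
        raise ValueError("expected: rule <template> [<setupfile> [<uid.gid>]]")
    rule, template = words[0], words[1]
    setupfile = words[2] if len(words) > 2 else None
    if rule == "defprot" and setupfile is None:
        raise ValueError("defprot requires a setup file")
    user, group = parse_uid_gid(words[3] if len(words) > 3 else "")
    return {"rule": rule, "template": template, "setupfile": setupfile,
            "user": user, "group": group}
```

Because the fields are positional, the $<$CODE$>$$<$uid.gid$>$$<$/CODE$>$ part can only be read when a setup file precedes it, which is exactly the restriction stated above.\par 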


\subsubsection{The defprot Rule}

$<$CODE$>$defprot$<$/CODE$>$ rule specifies the default protection setup file
and process uid and gid.\par 

$<$CODE$>$defprot$<$/CODE$>$ by itself $<$I$>$does not$<$/I$>$ protect anything, but
if protection is later on turned on by
\begin{itemize}
\item either an existing access
control list file
\item or a $<$CODE$>$protect$<$/CODE$>$ rule without setup file name
\end{itemize}
the protection settings of the $<$CODE$>$defprot$<$/CODE$>$ rule are used. Rule
translation continues normally after a $<$CODE$>$defprot$<$/CODE$>$ rule. If
another $<$CODE$>$defprot$<$/CODE$>$ rule is matched it overrides the
previous one.\par 


\subsubsection{The protect Rule}

The $<$CODE$>$protect$<$/CODE$>$ rule tells the server that the document matching
the template is protected. If the protection setup
file is not specified it is taken from the previously matched
$<$CODE$>$defprot$<$/CODE$>$ rule. It is an error if no $<$CODE$>$defprot$<$/CODE$>$ rule has matched
before.\par 

Rule translation continues normally, but the document is served in
protected mode: either an access control list file
$<$CODE$>$(.www\_acl)$<$/CODE$>$ must be found in the same directory as the
document, or a mask must be present in protection setup file, (or
both) and in addition, of course, the requirements in mask/ACL must be
met (i.e. the user/IP number must belong to an allowed group).\par 

If another $<$CODE$>$protect$<$/CODE$>$ rule is matched it overrides the
previous one.\par 

$<$B$>$Note:$<$/B$>$ Even without $<$CODE$>$protect$<$/CODE$>$ rule protection is
enabled if there is an Access Control
List in the same directory as the requested file.\par 

The reason the $<$CODE$>$protect$<$/CODE$>$ rule exists is that it makes it
possible to declare that an entire hierarchy of files is protected, so that
if for some reason the ACL is missing, protected
files are not exposed.\par 

It can also be used to avoid having ACLs altogether when
$<$CODE$>$Mask-Group$<$/CODE$>$ is set in the protection setup file.\par 


\subsubsection{Examples}
\begin{verbatim}
    defprot  *               /WWW/httpd.prot
    protect  /priv/*         /WWW/priv/httpd.prot         foo.bar
    protect  /priv/secret/*  /WWW/priv/secret/httpd.prot  foo.bar
    fail     *.prot
    map      /*              file:/WWW/*
    fail     *
\end{verbatim}

This setup uses protection setup files in the top-level directory for
each different protection level (this doesn't need to be the case).
When accessing "private" and "secret" files the server sets its
process user and group id to $<$CODE$>$foo$<$/CODE$>$ and $<$CODE$>$bar$<$/CODE$>$.
Otherwise it is running as $<$CODE$>$nobody$<$/CODE$>$ in
$<$CODE$>$nogroup.$<$/CODE$>$

The $<$CODE$>$fail$<$/CODE$>$ rule explicitly fails every request to access any
protection setup files (however, they need not be called
$<$CODE$>$httpd.prot$<$/CODE$>$).\par 

AL 14 October 1993


\subsection{
Server Side Access Authorization Description}

The exact server side Access Authorization procedures are described in
the corresponding protection scheme specification:
\begin{itemize}
\item  Basic Protection Scheme
    ($<$CODE$>$basic$<$/CODE$>$) and
\item Public Key Protection
    Scheme ($<$CODE$>$pubkey$<$/CODE$>$)
\end{itemize}
\par 

Because the Unix file system with (soft
and hard) links makes it easy to access a file from a
directory other than the one where the file actually resides, the server needs to use
the Unix file system protections to its advantage. Therefore, the Unix
file system must provide the protection between the collaborations
using the same machine, and the server sets its process uid and gid
according to which set of files is currently being served.\par 


\subsubsection{
Forking and Process uid and gid}


The server can be standalone, in which case it $<$CODE$>$fork$<$/CODE$>$s
another copy of itself and after that sets its user and group ids.
(Forking is necessary because once a process has set its user-id to
something other than $<$CODE$>$root$<$/CODE$>$ it cannot change back.) If the
server is run by $<$CODE$>$inetd$<$/CODE$>$ (inet daemon) there is no need for
forking.  \par 
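The fork-then-change-ids pattern can be sketched in Python for a Unix system. This is an illustration rather than the daemon's own code: the function names are hypothetical, and the parent here simply waits for the child, whereas a real standalone server would go back to accepting connections.\par 

```python
import os


def drop_privileges(uid, gid, setgid=os.setgid, setuid=os.setuid):
    """Give up root. The group id goes first: once the user id is no
    longer root, the process may not be allowed to change its group."""
    setgid(gid)
    setuid(uid)


def serve_in_child(handle_request, uid, gid):
    """Fork; the child changes ids and serves, the parent keeps its ids."""
    pid = os.fork()
    if pid == 0:
        # Child process: never return into the caller's code.
        try:
            drop_privileges(uid, gid)
            handle_request()
            os._exit(0)
        except Exception:
            os._exit(1)
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)
```

Only the child changes its ids, so the parent remains $<$CODE$>$root$<$/CODE$>$ and can serve the next request under a different user and group.\par 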

If users on the server machine can be trusted, files can have world (or
group) read permission, and the server can run as nobody (or with an
appropriate group id). In this case there is no need to fork even when
running standalone.\par 


AL 14 October 1993


\subsection{Server's Public And Private Keys}

The server's public and private keys must remain the same for a reasonably
long time because, in principle, every time the keys are changed it is
likely that there are one or two clients just waiting for their users
to type in their usernames and passwords. When they have finished,
the authorization string is encrypted with the $<$EM$>$old$<$/EM$>$ public key,
thus leading to an $<$CODE$>$Unauthorized$<$/CODE$>$ status from the server
although the user may well be authorized.\par 

The server might accept data encrypted with either of the keys for a
while, but this would introduce state to the server, and would
complicate things too much for something that is really not that
vital.\par 

Furthermore, if the keys keep changing all the time (say once a
minute, or even every ten minutes) the browser will practically always
have to first fail trying to access a document to get the new public
key, and then use it to encrypt the authorization information again
(of course generating a new encryption key, because otherwise the
material to be encrypted with the public key would be exactly the same
as encrypted with the old key and thus compromise the safety of the
system, because having two different encryptions of the same message
makes it easier to break).\par 

Since public key encryption can be considered rather safe for a period
of even years, it is reasonable to say that the server need not
change its public and private keys more often than, say, every couple of
weeks.\par 

On Suns, if the server is run by $<$CODE$>$inetd$<$/CODE$>$ which only starts
the server when someone is requesting a connection to it, i.e., the
server is not running all the time, there may be a separate program
updating the keys either regularly (run by $<$CODE$>$cron$<$/CODE$>$), or
during the system init (run from $<$CODE$>$/etc/rc.local).$<$/CODE$>$\par 

On other platforms, especially those not providing multiple processes,
the key update has to be done either once at server startup or, if
the server is not restarted often enough (and why would it be?), by the
server itself at regular intervals.\par 

The private key must be kept under the WWW server pseudo-account's home
directory, in a directory with no world or group permissions, and in a
file that has no world or group permissions either.\par 
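
The permission requirements can be sketched as follows for a Unix system.
The directory name {\tt keys}, the file name {\tt private.key} and the
function name are hypothetical; only the permission bits (owner-only
access to both the directory and the file) come from the text above.

```python
import os
import stat


def write_private_key(home_dir, key_bytes, dir_name="keys", file_name="private.key"):
    """Store key_bytes under home_dir/dir_name with owner-only permissions."""
    key_dir = os.path.join(home_dir, dir_name)
    os.makedirs(key_dir, mode=0o700, exist_ok=True)
    # makedirs honours the umask and leaves an existing directory as it is,
    # so set the directory mode explicitly.
    os.chmod(key_dir, 0o700)

    key_path = os.path.join(key_dir, file_name)
    # Create the file with mode 0600 so it is never group- or world-accessible.
    fd = os.open(key_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "wb") as key_file:
        # Tighten the mode in case the file already existed with looser bits.
        os.fchmod(key_file.fileno(), 0o600)
        key_file.write(key_bytes)
    return key_path
```

Creating the file with the restrictive mode from the start, rather than
writing it and changing the mode afterwards, avoids a window in which the
key would be readable by other users.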

AL 14 October 1993


\subsection{Printable Encoding}

Encoding into printable characters is done as defined in RFC 1421.\par 
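
The printable encoding of RFC 1421 is what is now commonly called base64,
written out in lines of 64 characters (the last line may be shorter). The
following is a minimal sketch that assumes Python's standard
{\tt base64} module produces the same alphabet and padding as the RFC;
the function names are hypothetical.

```python
import base64


def printable_encode(data, line_length=64):
    """Encode bytes as a list of printable lines of at most line_length characters."""
    encoded = base64.b64encode(data).decode("ascii")
    return [encoded[i:i + line_length] for i in range(0, len(encoded), line_length)]


def printable_decode(lines):
    """Reverse printable_encode: join the lines and decode them back to bytes."""
    return base64.b64decode("".join(lines))
```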

AL 14 October 1993


\subsection{Vocabulary}




\subsubsection{A}
\begin{DL}{allow this much space}

\item[AA
]Access Authorization

\item[Asymmetric Cryptography
]Cryptography in which two keys are used: one encrypts
and the other decrypts. The roles of the two keys can also be
{\bf reversed}.  Moreover, something that has been encrypted with one of the
keys can be decrypted {\bf only} with the other one.

\end{DL}
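
The two-key property described under Asymmetric Cryptography can be
illustrated with a toy RSA key pair. RSA is used here only as a familiar
example of an asymmetric scheme; the numbers are far too small to be
secure, and all names in the sketch are hypothetical.

```python
# Toy RSA parameters; far too small to be secure, chosen only for illustration.
p, q = 61, 53
n = p * q                    # modulus shared by both keys
phi = (p - 1) * (q - 1)
e = 17                       # public exponent
d = pow(e, -1, phi)          # private exponent (modular inverse, Python 3.8+)


def apply_key(message, exponent):
    """Encrypt or decrypt an integer message smaller than n with one key."""
    return pow(message, exponent, n)


message = 65
# Encrypt with the public key, decrypt with the private key.
assert apply_key(apply_key(message, e), d) == message
# Vice versa: encrypt with the private key, decrypt with the public key.
assert apply_key(apply_key(message, d), e) == message
# Applying the same key twice does not recover the message.
assert apply_key(apply_key(message, e), e) != message
```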


\subsubsection{D}
\begin{DL}{allow this much space}
\item[DEK
]Data Encryption Key. A single (symmetric) key used to encrypt the
document (but not the headers) sent by the server.
\end{DL}


\subsubsection{H}
\begin{DL}{allow this much space}

\item[HTTP
]HyperText Transfer Protocol.

\end{DL}


\subsubsection{P}
\begin{DL}{allow this much space}

\item[PEM
]Privacy Enhanced Mail.

\item[Private Key
]The secret component of the two keys used in public key encryption.

\item[Public Key
]The public component of the two keys used in public key encryption.

\item[Public Key Cryptography
]See Asymmetric Cryptography.

\end{DL}

\subsubsection{S}
\begin{DL}{allow this much space}

\item[Secret Key
]The single key used in symmetric
cryptography.

\item[Symmetric Cryptography
]Encryption and decryption done by the same (secret) key.

\end{DL}


\subsubsection{U}
\begin{DL}{allow this much space}
\item[URL
]Universal Resource Locator.
\end{DL}


\par 

AL 14 October 1993


\begin{verbatim}
Network Working Group                                            J. Linn
Request for Comments: 1421                    IAB IRTF PSRG, IETF PEM WG
Obsoletes: 1113                                            February 1993


           Privacy Enhancement for Internet Electronic Mail:
        Part I: Message Encryption and Authentication Procedures

Status of this Memo

   This RFC specifies an IAB standards track protocol for the Internet
   community, and requests discussion and suggestions for improvements.
   Please refer to the current edition of the "IAB Official Protocol
   Standards" for the standardization state and status of this protocol.
   Distribution of this memo is unlimited.

Acknowledgements

   This document is the outgrowth of a series of meetings of the Privacy
   and Security Research Group (PSRG) of the IRTF and the PEM Working
   Group of the IETF.  I would like to thank the members of the PSRG and
   the IETF PEM WG, as well as all participants in discussions on the
   "pem-dev@tis.com" mailing list, for their contributions to this
   document.

1.  Executive Summary

   This document defines message encryption and authentication
   procedures, in order to provide privacy-enhanced mail (PEM) services
   for electronic mail transfer in the Internet.  It is intended to
   become one member of a related set of four RFCs.  The procedures
   defined in the current document are intended to be compatible with a
   wide range of key management approaches, including both symmetric
   (secret-key) and asymmetric (public-key) approaches for encryption of
   data encrypting keys.  Use of symmetric cryptography for message text
   encryption and/or integrity check computation is anticipated. RFC
   1422 specifies supporting key management mechanisms based on the use
   of public-key certificates.  RFC 1423 specifies algorithms, modes,
   and associated identifiers relevant to the current RFC and to RFC
   1422.  RFC 1424 provides details of paper and electronic formats and
   procedures for the key management infrastructure being established in
   support of these services.

   Privacy enhancement services (confidentiality, authentication,
   message integrity assurance, and non-repudiation of origin) are
   offered through the use of end-to-end cryptography between originator
   and recipient processes at or above the User Agent level.  No special
   processing requirements are imposed on the Message Transfer System at


Linn                                                            [Page 1]

RFC 1421        Privacy Enhancement for Electronic Mail    February 1993


   endpoints or at intermediate relay sites.  This approach allows
   privacy enhancement facilities to be incorporated selectively on a
   site-by-site or user-by-user basis without impact on other Internet
   entities.  Interoperability among heterogeneous components and mail
   transport facilities is supported.

   The current specification's scope is confined to PEM processing
   procedures for the RFC-822 textual mail environment, and defines the
   Content-Domain indicator value "RFC822" to signify this usage.
   Follow-on work in integration of PEM capabilities with other
   messaging environments (e.g., MIME) is anticipated and will be
   addressed in separate and/or successor documents, at which point
   additional Content-Domain indicator values will be defined.

2.  Terminology

   For descriptive purposes, this RFC uses some terms defined in the OSI
   X.400 Message Handling System Model per the CCITT Recommendations.
   This section replicates a portion of (1984) X.400's Section 2.2.1,
   "Description of the MHS Model: Overview" in order to make the
   terminology clear to readers who may not be familiar with the OSI MHS
   Model.

   In the [MHS] model, a user is a person or a computer application.  A
   user is referred to as either an originator (when sending a message)
   or a recipient (when receiving one).  MH Service elements define the
   set of message types and the capabilities that enable an originator
   to transfer messages of those types to one or more recipients.

   An originator prepares messages with the assistance of his or her
   User Agent (UA).  A UA is an application process that interacts with
   the Message Transfer System (MTS) to submit messages.  The MTS
   delivers to one or more recipient UAs the messages submitted to it.
   Functions performed solely by the UA and not standardized as part of
   the MH Service elements are called local UA functions.

   The MTS is composed of a number of Message Transfer Agents (MTAs).
   Operating together, the MTAs relay messages and deliver them to the
   intended recipient UAs, which then make the messages available to the
   intended recipients.

   The collection of UAs and MTAs is called the Message Handling System
   (MHS).  The MHS and all of its users are collectively referred to as
   the Message Handling Environment.


3.  Services, Constraints, and Implications

   This RFC defines mechanisms to enhance privacy for electronic mail
   transferred in the Internet. The facilities discussed in this RFC
   provide privacy enhancement services on an end-to-end basis between
   originator and recipient processes residing at the UA level or above.
   No privacy enhancements are offered for message fields which are
   added or transformed by intermediate relay points between PEM
   processing components.

   If an originator elects to perform PEM processing on an outbound
   message, all PEM-provided security services are applied to the PEM
   message's body in its entirety; selective application to portions of
   a PEM message is not supported. Authentication, integrity, and (when
   asymmetric key management is employed) non-repudiation of origin
   services are applied to all PEM messages; confidentiality services
   are optionally selectable.

   In keeping with the Internet's heterogeneous constituencies and usage
   modes, the measures defined here are applicable to a broad range of
   Internet hosts and usage paradigms.  In particular, it is worth
   noting the following attributes:

        1.  The mechanisms defined in this RFC are not restricted to a
            particular host or operating system, but rather allow
            interoperability among a broad range of systems.  All
            privacy enhancements are implemented at the application
            layer, and are not dependent on any privacy features at
            lower protocol layers.

        2.  The defined mechanisms are compatible with non-enhanced
            Internet components.  Privacy enhancements are implemented
            in an end-to-end fashion which does not impact mail
            processing by intermediate relay hosts which do not
            incorporate privacy enhancement facilities.  It is
            necessary, however, for a message's originator to be
            cognizant of whether a message's intended recipient
            implements privacy enhancements, in order that encoding and
            possible encryption will not be performed on a message whose
            destination is not equipped to perform corresponding inverse
            transformations.  (Section 4.6.1.1.3 of this RFC describes a
            PEM message type ("MIC-CLEAR") which represents a signed,
            unencrypted PEM message in a form readable without PEM
            processing capabilities yet validatable by PEM-equipped
            recipients.)

        3.  The defined mechanisms are compatible with a range of mail
            transport facilities (MTAs).  Within the Internet,
            electronic mail transport is effected by a variety of SMTP
            [2] implementations.  Certain sites, accessible via SMTP,
            forward mail into other mail processing environments (e.g.,
            USENET, CSNET, BITNET).  The privacy enhancements must be
            able to operate across the SMTP realm; it is desirable that
            they also be compatible with protection of electronic mail
            sent between the SMTP environment and other connected
            environments.

        4.  The defined mechanisms are compatible with a broad range of
            electronic mail user agents (UAs).  A large variety of
            electronic mail user agent programs, with a corresponding
            broad range of user interface paradigms, is used in the
            Internet.  In order that electronic mail privacy
            enhancements be available to the broadest possible user
            community, selected mechanisms should be usable with the
            widest possible variety of existing UA programs.  For
            purposes of pilot implementation, it is desirable that
            privacy enhancement processing be incorporable into a
            separate program, applicable to a range of UAs, rather than
            requiring internal modifications to each UA with which PEM
            services are to be provided.

        5.  The defined mechanisms allow electronic mail privacy
            enhancement processing to be performed on personal computers
            (PCs) separate from the systems on which UA functions are
            implemented.  Given the expanding use of PCs and the limited
            degree of trust which can be placed in UA implementations on
            many multi-user systems, this attribute can allow many users
            to process PEM with a higher assurance level than a strictly
            UA-integrated approach would allow.

        6.  The defined mechanisms support privacy protection of
            electronic mail addressed to mailing lists (distribution
            lists, in ISO parlance).

        7.  The mechanisms defined within this RFC are compatible with a
            variety of supporting key management approaches, including
            (but not limited to) manual pre-distribution, centralized
            key distribution based on symmetric cryptography, and the
            use of public-key certificates per RFC 1422.  Different
            key management mechanisms may be used for different
            recipients of a multicast message.  For two PEM
            implementations to interoperate, they must share a common
            key management mechanism; support for the mechanism defined
            in RFC 1422 is strongly encouraged.


   In order to achieve applicability to the broadest possible range of
   Internet hosts and mail systems, and to facilitate pilot
   implementation and testing without the need for prior and pervasive
   modifications throughout the Internet, the following design
   principles were applied in selecting the set of features specified in
   this RFC:

        1.  This RFC's measures are restricted to implementation at
            endpoints and are amenable to integration with existing
            Internet mail protocols at the user agent (UA) level or
            above, rather than necessitating modifications to existing
            mail protocols or integration into the message transport
            system (e.g., SMTP servers).

        2.  The set of supported measures enhances rather than restricts
            user capabilities.  Trusted implementations, incorporating
            integrity features protecting software from subversion by
            local users, cannot be assumed in general.  No mechanisms
            are assumed to prevent users from sending, at their
            discretion, messages to which no PEM processing has been
            applied. In the absence of such features, it appears more
            feasible to provide facilities which enhance user services
            (e.g., by protecting and authenticating inter-user traffic)
            than to enforce restrictions (e.g., inter-user access
            control) on user actions.

        3.  The set of supported measures focuses on a set of functional
            capabilities selected to provide significant and tangible
            benefits to a broad user community.  By concentrating on the
            most critical set of services, we aim to maximize the added
            privacy value that can be provided with a modest level of
            implementation effort.

   Based on these principles, the following facilities are provided:

        1.  disclosure protection,

        2.  originator authenticity,

        3.  message integrity measures, and

        4.  (if asymmetric key management is used) non-repudiation of
            origin,

   but the following privacy-relevant concerns are not addressed:

        1.  access control,


        2.  traffic flow confidentiality,

        3.  address list accuracy,

        4.  routing control,

        5.  issues relating to the casual serial reuse of PCs by
            multiple users,

        6.  assurance of message receipt and non-deniability of receipt,

        7.  automatic association of acknowledgments with the messages
            to which they refer, and

        8.  message duplicate detection, replay prevention, or other
            stream-oriented services

4.  Processing of Messages

4.1  Message Processing Overview

   This subsection provides a high-level overview of the components and
   processing steps involved in electronic mail privacy enhancement
   processing.  Subsequent subsections will define the procedures in
   more detail.

4.1.1  Types of Keys

   A two-level keying hierarchy is used to support PEM transmission:

        1.  Data Encrypting Keys (DEKs) are used for encryption of
            message text and (with certain choices among a set of
            alternative algorithms) for computation of message integrity
            check (MIC) quantities.  In the asymmetric key management
            environment, DEKs are also used to encrypt the signed
            representations of MICs in PEM messages to which
            confidentiality has been applied. DEKs are generated
            individually for each transmitted message; no
            predistribution of DEKs is needed to support PEM
            transmission.

        2.  Interchange Keys (IKs) are used to encrypt DEKs for
            transmission within messages.  Ordinarily, the same IK will
            be used for all messages sent from a given originator to a
            given recipient over a period of time.  Each transmitted
            message includes a representation of the DEK(s) used for
            message encryption and/or MIC computation, encrypted under
            an individual IK per named recipient.  The representation is
            associated with Originator-ID and Recipient-ID fields
            (defined in different forms so as to distinguish symmetric
            from asymmetric cases), which allow each individual
            recipient to identify the IK used to encrypt DEKs and/or
            MICs for that recipient's use.  Given an appropriate IK, a
            recipient can decrypt the corresponding transmitted DEK
            representation, yielding the DEK required for message text
            decryption and/or MIC validation.  The definition of an IK
            differs depending on whether symmetric or asymmetric
            cryptography is used for DEK encryption:

                 2a. When symmetric cryptography is used for DEK
                     encryption, an IK is a single symmetric key shared
                     between an originator and a recipient.  In this
                     case, the same IK is used to encrypt MICs as well
                     as DEKs for transmission.  Version/expiration
                     information and IA identification associated with
                     the originator and with the recipient must be
                     concatenated in order to fully qualify a symmetric
                     IK.

                 2b. When asymmetric cryptography is used, the IK
                     component used for DEK encryption is the public
                     component [8] of the recipient.  The IK component
                     used for MIC encryption is the private component of
                     the originator, and therefore only one encrypted
                     MIC representation need be included per message,
                     rather than one per recipient.  Each of these IK
                     components can be fully qualified in a Recipient-ID
                     or Originator-ID field, respectively.
                     Alternatively, an originator's IK component may be
                     determined from a certificate carried in an
                     "Originator-Certificate:" field.

4.1.2  Processing Procedures

   When PEM processing is to be performed on an outgoing message, a DEK
   is generated [1] for use in message encryption and (if a chosen MIC
   algorithm requires a key) a variant of the DEK is formed for use in
   MIC computation.  DEK generation can be omitted for the case of a
   message where confidentiality is not to be applied, unless a chosen
   MIC computation algorithm requires a DEK.  Other parameters (e.g.,
   Initialization Vectors (IVs)) as required by selected encryption
   algorithms are also generated.

   One or more Originator-ID and/or "Originator-Certificate:" fields are
   included in a PEM message's encapsulated header to provide recipients
   with an identification component for the IK(s) used for message
   processing.  All of a message's Originator-ID and/or "Originator-
   Certificate:" fields are assumed to correspond to the same principal;
   the facility for inclusion of multiple such fields accomodates the
   prospect that different keys, algorithms, and/or certification paths
   may be required for processing by different recipients.  When a
   message includes recipients for which asymmetric key management is
   employed as well as recipients for which symmetric key management is
   employed, a separate Originator-ID or "Originator-Certificate:" field
   precedes each set of recipients.

   In the symmetric case, per-recipient IK components are applied for
   each individually named recipient in preparation of ENCRYPTED, MIC-
   ONLY, and MIC-CLEAR messages. A corresponding "Recipient-ID-
   Symmetric:" field, interpreted in the context of the most recent
   preceding "Originator-ID-Symmetric:" field, serves to identify each
   IK.  In the asymmetric case, per-recipient IK components are applied
   only for ENCRYPTED messages, are independent of originator-oriented
   header elements, and are identified by "Recipient-ID-Asymmetric:"
   fields.  Each Recipient-ID field is followed by a "Key-Info:" field,
   which transfers the message's DEK encrypted under the IK appropriate
   for the specified recipient.

   When symmetric key management is used for a given recipient, the
   "Key-Info:" field following the corresponding "Recipient-ID-
   Symmetric:" field also transfers the message's computed MIC,
   encrypted under the recipient's IK. When asymmetric key management is
   used, a "MIC-Info:" field associated with an "Originator-ID-
   Asymmetric:" or "Originator-Certificate:" field carries the message's
   MIC, asymmetrically signed using the private component of the
   originator.  If the PEM message is of type ENCRYPTED (as defined in
   Section 4.6.1.1.1 of this RFC), the asymmetrically signed MIC is
   symmetrically encrypted using the same DEK, algorithm, encryption
   mode and other cryptographic parameters as used to encrypt the
   message text, prior to inclusion in the "MIC-Info:" field.

4.1.2.1  Processing Steps

   A four-phase transformation procedure is employed in order to
   represent encrypted message text in a universally transmissible form
   and to enable messages encrypted on one type of host computer to be
   decrypted on a different type of host computer.  A plaintext message
   is accepted in local form, using the host's native character set and
   line representation.  The local form is converted to a canonical
   message text representation, defined as equivalent to the inter-SMTP
   representation of message text.  This canonical representation forms
   the input to the MIC computation step (applicable to ENCRYPTED, MIC-
   ONLY, and MIC-CLEAR messages) and the encryption process (applicable
   to ENCRYPTED messages only).


   For ENCRYPTED PEM messages, the canonical representation is padded as
   required by the encryption algorithm, and this padded canonical
   representation is encrypted. The encrypted text (for an ENCRYPTED
   message) or the unpadded canonical form (for a MIC-ONLY message) is
   then encoded into a printable form.  The printable form is composed
   of a restricted character set which is chosen to be universally
   representable across sites, and which will not be disrupted by
   processing within and between MTS entities. MIC-CLEAR PEM messages
   omit the printable encoding step.

   The output of the previous processing steps is combined with a set of
   header fields carrying cryptographic control information.  The
   resulting PEM message is passed to the electronic mail system to be
   included within the text portion of a transmitted message. There is
   no requirement that a PEM message comprise the entirety of an MTS
   message's text portion; this allows PEM-protected information to be
   accompanied by (unprotected) annotations.  It is also permissible for
   multiple PEM messages (and associated unprotected text, outside the
   PEM message boundaries) to be represented within the encapsulated
   text of a higher-level PEM message. PEM message signatures are
   forwardable when asymmetric key management is employed; an authorized
   recipient of a PEM message with confidentiality applied can reduce
   that message to a signed but unencrypted form for forwarding purposes
   or can re-encrypt that message for subsequent transmission.

   When a PEM message is received, the cryptographic control fields
   within its encapsulated header provide the information required for
   each authorized recipient to perform MIC validation and decryption of
   the received message text.  For ENCRYPTED and MIC-ONLY messages, the
   printable encoding is converted to a bitstring.  Encrypted portions
   of the transmitted message are decrypted.  The MIC is validated.
   Then, the recipient PEM process converts the canonical representation
   to its appropriate local form.

4.1.2.2  Error Cases

   A variety of error cases may occur and be detected in the course of
   processing a received PEM message. The specific actions to be taken
   in response to such conditions are local matters, varying as
   functions of user preferences and the type of user interface provided
   by a particular PEM implementation, but certain general
   recommendations are appropriate. Syntactically invalid PEM messages
   should be flagged as such, preferably with collection of diagnostic
   information to support debugging of incompatibilities or other
   failures.  RFC 1422 defines specific error processing requirements
   relevant to the certificate-based key management mechanisms defined
   therein.


   Syntactically valid PEM messages which yield MIC failures raise
   special concern, as they may result from attempted attacks or forged
   messages.  As such, it is unsuitable to display their contents to
   recipient users without first indicating the fact that the contents'
   authenticity and integrity cannot be guaranteed and then receiving
   positive user confirmation of such a warning.  MIC-CLEAR messages
   (discussed in Section 4.6.1.1.3 of this RFC) raise special concerns,
   as MIC failures on such messages may occur for a broader range of
   benign causes than are applicable to other PEM message types.

4.2  Encryption Algorithms, Modes, and Parameters

   For use in conjunction with this RFC, RFC 1423 defines the
   appropriate algorithms, modes, and associated identifiers to be used
   for encryption of message text with DEKs.

   The mechanisms defined in this RFC incorporate facilities for
   transmission of cryptographic parameters (e.g., pseudorandom
   Initializing Vectors (IVs)) with PEM messages to which the
   confidentiality service is applied, when required by symmetric
   message encryption algorithms and modes specified in RFC 1423.

   Certain operations require encryption of DEKs, MICs, and digital
   signatures under an IK for purposes of transmission.  A header
   facility indicates the mode in which the IK is used for encryption.
   RFC 1423 specifies encryption algorithm and mode identifiers and
   minimum essential support requirements for key encryption processing.

   RFC 1422 specifies asymmetric, certificate-based key management
   procedures based on CCITT Recommendation X.509 to support the message
   processing procedures defined in this document. Support for the key
   management approach defined in RFC 1422 is strongly recommended.  The
   message processing procedures can also be used with symmetric key
   management, given prior distribution of suitable symmetric IKs, but
   no current RFCs specify key distribution procedures for such IKs.

4.3  Privacy Enhancement Message Transformations

4.3.1  Constraints

   An electronic mail encryption mechanism must be compatible with the
   transparency constraints of its underlying electronic mail
   facilities.  These constraints are generally established based on
   expected user requirements and on the characteristics of anticipated
   endpoint and transport facilities.  An encryption mechanism must also
   be compatible with the local conventions of the computer systems
   which it interconnects.  Our approach uses a canonicalization step to
   abstract out local conventions and a subsequent encoding step to
   conform to the characteristics of the underlying mail transport
   medium (SMTP).  The encoding conforms to SMTP constraints.  Section
   4.5 of RFC 821 [2] details SMTP's transparency constraints.

   To prepare a message for SMTP transmission, the following
   requirements must be met:

        1.  All characters must be members of the 7-bit ASCII character
            set.

        2.  Text lines, delimited by the character pair <CR><LF>, must
            be no more than 1000 characters long.

        3.  Since the string <CR><LF>.<CR><LF> indicates the end of a
            message, it must not occur in text prior to the end of a
            message.

   Although SMTP specifies a standard representation for line delimiters
   (ASCII <CR><LF>), numerous systems in the Internet use a different
   native representation to delimit lines.  For example, the <CR><LF>
   sequences delimiting lines in mail inbound to UNIX systems are
   transformed to single <LF>s as mail is written into local mailbox
   files.  Lines in mail incoming to record-oriented systems (such as
   VAX VMS) may be converted to appropriate records by the destination
   SMTP server [3].  As a result, if the encryption process generated
   <CR>s or <LF>s, those characters might not be accessible to a
   recipient UA program at a destination which uses different line
   delimiting conventions.  It is also possible that conversion between
   tabs and spaces may be performed in the course of mapping between
   inter-SMTP and local format; this is a matter of local option.  If
   such transformations changed the form of transmitted ciphertext,
   decryption would fail to regenerate the transmitted plaintext, and a
   transmitted MIC would fail to compare with that computed at the
   destination.

   The conversion performed by an SMTP server at a system with EBCDIC as
   a native character set has even more severe impact, since the
   conversion from EBCDIC into ASCII is an information-losing
   transformation.  In principle, the transformation function mapping
   between inter-SMTP canonical ASCII message representation and local
   format could be moved from the SMTP server up to the UA, given a
   means to direct that the SMTP server should no longer perform that
   transformation.  This approach has a major disadvantage: internal
   file (e.g., mailbox) formats would be incompatible with the native
   forms used on the systems where they reside.  Further, it would
   require modification to SMTP servers, as mail would be passed to SMTP
   in a different representation than it is passed at present.


4.3.2  Approach

   Our approach to supporting PEM across an environment in which
   intermediate conversions may occur defines an encoding for mail which
   is uniformly representable across the set of PEM UAs regardless of
   their systems' native character sets.  This encoded form is used (for
   specified PEM message types) to represent mail text in transit from
   originator to recipient, but the encoding is not applied to enclosing
   MTS headers or to encapsulated headers inserted to carry control
   information between PEM UAs.  The encoding's characteristics are such
   that the transformations anticipated between originator and recipient
   UAs will not prevent an encoded message from being decoded properly
   at its destination.

   Four transformation steps, described in the following four
   subsections, apply to outbound PEM message processing:

4.3.2.1  Step 1: Local Form

   This step is applicable to PEM message types ENCRYPTED, MIC-ONLY, and
   MIC-CLEAR.  The message text is created in the system's native
   character set, with lines delimited in accordance with local
   convention.

4.3.2.2  Step 2: Canonical Form

   This step is applicable to PEM message types ENCRYPTED, MIC-ONLY, and
   MIC-CLEAR.  The message text is converted to a universal canonical
   form, similar to the inter-SMTP representation [4] as defined in RFC
   821 [2] and RFC 822 [5]. The procedures performed in order to
   accomplish this conversion are dependent on the characteristics of
   the local form and so are not specified in this RFC.

   PEM canonicalization assures that the message text is represented
   with the ASCII character set and "<CR><LF>" line delimiters, but does
   not perform the dot-stuffing transformation discussed in RFC 821,
   Section 4.5.2.  Since a message is converted to a standard character
   set and representation before encryption, a transferred PEM message
   can be decrypted and its MIC can be validated at any type of
   destination host computer.  Decryption and MIC validation are
   performed before any conversions which may be necessary to transform
   the message into a destination-specific local form.
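
   The canonicalization step might be sketched as follows for a system
   whose local form is a text string with "\n", "\r", or "\r\n" line
   delimiters.  The function name and the treatment of a final
   undelimited line are assumptions made for illustration; this RFC does
   not specify the conversion procedure.

```python
def canonicalize(local_text: str) -> bytes:
    """Return ASCII octets with every line terminated by <CR><LF>."""
    # Map the line delimiters this sketch knows about onto a single form.
    normalized = local_text.replace("\r\n", "\n").replace("\r", "\n")
    lines = normalized.split("\n")
    if lines and lines[-1] == "":
        lines.pop()   # text ended with a delimiter; do not add an empty line
    # Assumption: a final line lacking a delimiter is given one.
    # encode("ascii") raises UnicodeEncodeError for non-ASCII characters.
    return "".join(line + "\r\n" for line in lines).encode("ascii")

# No dot-stuffing is applied: a line consisting of "." is left unchanged.
print(canonicalize("first\n.\nlast\n"))   # b'first\r\n.\r\nlast\r\n'
```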

4.3.2.3  Step 3: Authentication and Encryption

   Authentication processing is applicable to PEM message types
   ENCRYPTED, MIC-ONLY, and MIC-CLEAR.  The canonical form is input to
   the selected MIC computation algorithm in order to compute an
   integrity check quantity for the message.  No padding is added to the
   canonical form before submission to the MIC computation algorithm,
   although certain MIC algorithms will apply their own padding in the
   course of computing a MIC.

   Encryption processing is applicable only to PEM message type
   ENCRYPTED.  RFC 1423 defines the padding technique used to support
   encryption of the canonically-encoded message text.

4.3.2.4  Step 4: Printable Encoding

   This printable encoding step is applicable to PEM message types
   ENCRYPTED and MIC-ONLY.  The same processing is also employed in
   representation of certain specifically identified PEM encapsulated
   header field quantities as cited in Section 4.6.  Proceeding from
   left to right, the bit string resulting from step 3 is encoded into
   characters which are universally representable at all sites, though
   not necessarily with the same bit patterns (e.g., although the
   character "E" is represented in an ASCII-based system as hexadecimal
   45 and as hexadecimal C5 in an EBCDIC-based system, the local
   significance of the two representations is equivalent).

   A 64-character subset of International Alphabet IA5 is used, enabling
   6 bits to be represented per printable character.  (The proposed
   subset of characters is represented identically in IA5 and ASCII.)
   The character "=" signifies a special processing function used for
   padding within the printable encoding procedure.

   To represent the encapsulated text of a PEM message, the encoding
   function's output is delimited into text lines (using local
   conventions), with each line except the last containing exactly 64
   printable characters and the final line containing 64 or fewer
   printable characters.  (This line length is easily printable and is
   guaranteed to satisfy SMTP's 1000-character transmitted line length
   limit.) This folding requirement does not apply when the encoding
   procedure is used to represent PEM header field quantities; Section
   4.6 discusses folding of PEM encapsulated header fields.
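
   The folding rule for encapsulated text can be sketched in a few lines
   of Python.  The function name is hypothetical; the width of 64 is the
   value stated above.

```python
from typing import List

def fold(encoded: str, width: int = 64) -> List[str]:
    """Split encoder output into lines of exactly `width` characters,
    with the final line holding whatever remains (1 to `width` characters)."""
    return [encoded[i:i + width] for i in range(0, len(encoded), width)]

lines = fold("A" * 130)
print([len(line) for line in lines])   # [64, 64, 2]
```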

   The encoding process represents 24-bit groups of input bits as output
   strings of 4 encoded characters. Proceeding from left to right across
   a 24-bit input group extracted from the output of step 3, each 6-bit
   group is used as an index into an array of 64 printable characters.
   The character referenced by the index is placed in the output string.
   These characters, identified in Table 1, are selected so as to be
   universally representable, and the set excludes characters with
   particular significance to SMTP (e.g., ".", "<CR>", "<LF>").




   Special processing is performed if fewer than 24 bits are available
   in an input group at the end of a message.  A full encoding quantum
   is always completed at the end of a message.  When fewer than 24
   input bits are available in an input group, zero bits are added (on
   the right) to form an integral number of 6-bit groups.  Output
   character positions which are not required to represent actual input
   data are set to the character "=".  Since all canonically encoded
   output is an integral number of octets, only the following cases can
   arise: (1) the final quantum of encoding input is an integral
   multiple of 24 bits; here, the final unit of encoded output will be
   an integral multiple of 4 characters with no "=" padding, (2) the
   final quantum of encoding input is exactly 8 bits; here, the final
   unit of encoded output will be two characters followed by two "="
   padding characters, or (3) the final quantum of encoding input is
   exactly 16 bits; here, the final unit of encoded output will be three
   characters followed by one "=" padding character.


   Value Encoding  Value Encoding  Value Encoding  Value Encoding
       0 A            17 R            34 i            51 z
       1 B            18 S            35 j            52 0
       2 C            19 T            36 k            53 1
       3 D            20 U            37 l            54 2
       4 E            21 V            38 m            55 3
       5 F            22 W            39 n            56 4
       6 G            23 X            40 o            57 5
       7 H            24 Y            41 p            58 6
       8 I            25 Z            42 q            59 7
       9 J            26 a            43 r            60 8
      10 K            27 b            44 s            61 9
      11 L            28 c            45 t            62 +
      12 M            29 d            46 u            63 /
      13 N            30 e            47 v
      14 O            31 f            48 w         (pad) =
      15 P            32 g            49 x
      16 Q            33 h            50 y

                  Printable Encoding Characters
                             Table 1
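
   The encoding procedure and Table 1 can be sketched directly in
   Python.  The function name is hypothetical.  The result is compared
   against the standard library's base64 module, which uses the same
   64-character alphabet and "=" padding.

```python
import base64

# Table 1, indexed by 6-bit value.
ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)

def printable_encode(data: bytes) -> str:
    out = []
    for i in range(0, len(data), 3):
        group = data[i:i + 3]
        # Left-justify the octets in a 24-bit quantum, adding zero bits
        # on the right when fewer than 3 octets remain.
        quantum = int.from_bytes(group, "big") << (8 * (3 - len(group)))
        # 3 octets need 4 characters, 2 octets need 3, 1 octet needs 2.
        needed = len(group) + 1
        for k in range(4):
            if k < needed:
                index = (quantum >> (18 - 6 * k)) & 0x3F
                out.append(ALPHABET[index])
            else:
                out.append("=")   # position carries no input data
    return "".join(out)

for sample in (b"M", b"Ma", b"Man"):
    assert printable_encode(sample) == base64.b64encode(sample).decode("ascii")
    print(sample, printable_encode(sample))
```

   The three samples exercise the three end-of-message cases listed
   above: 8 bits (two "=" characters), 16 bits (one "="), and 24 bits
   (no padding).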


4.3.2.5  Summary of Transformations

   In summary, the outbound message is subjected to the following
   composition of transformations (or, for some PEM message types, a
   subset thereof):

         Transmit_Form = Encode(Encrypt(Canonicalize(Local_Form)))




   The inverse transformations are performed, in reverse order, to
   process inbound PEM messages:

       Local_Form = DeCanonicalize(Decipher(Decode(Transmit_Form)))

   Note that the local form and the functions to transform messages to
   and from canonical form may vary between the originator and recipient
   systems without loss of information.
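
   The composition and its inverse can be sketched as below.  This is
   an illustration only: the XOR "cipher" is a placeholder with no
   security value (PEM encryption algorithms and padding are defined in
   RFC 1423), the local form is assumed to use "\n" line delimiters, and
   line folding of the encoded output is omitted.  All function names
   are hypothetical.

```python
import base64

def canonicalize(local_form: str) -> bytes:
    # Assumes the local delimiter is "\n" and the text is ASCII.
    return local_form.replace("\n", "\r\n").encode("ascii")

def decanonicalize(canonical: bytes) -> str:
    return canonical.decode("ascii").replace("\r\n", "\n")

def toy_cipher(data: bytes, key: int = 0x5A) -> bytes:
    # Placeholder only: XOR with a fixed octet is its own inverse.
    return bytes(octet ^ key for octet in data)

def to_transmit_form(local_form: str) -> str:
    # Transmit_Form = Encode(Encrypt(Canonicalize(Local_Form)))
    return base64.b64encode(toy_cipher(canonicalize(local_form))).decode("ascii")

def to_local_form(transmit_form: str) -> str:
    # Local_Form = DeCanonicalize(Decipher(Decode(Transmit_Form)))
    return decanonicalize(toy_cipher(base64.b64decode(transmit_form)))

message = "Hello\nworld\n"
assert to_local_form(to_transmit_form(message)) == message
```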

4.4  Encapsulation Mechanism

   The encapsulation techniques defined in RFC-934 [6] are adopted for
   encapsulation of PEM messages within separate enclosing MTS messages
   carrying associated MTS headers. This approach offers a number of
   advantages relative to a flat approach in which certain fields within
   a single header are encrypted and/or carry cryptographic control
   information.  As far as the MTS is concerned, the entirety of a PEM
   message will reside in an MTS message's text portion, not the MTS
   message's header portion. Encapsulation provides generality and
   segregates fields with user-to-user significance from those
   transformed in transit.  All fields inserted in the course of
   encryption/authentication processing are placed in the encapsulated
   header.  This facilitates compatibility with mail handling programs
   which accept only text, not header fields, from input files or from
   other programs.

   The encapsulation techniques defined in RFC-934 are consistent with
   existing Internet mail forwarding and bursting mechanisms.  These
   techniques are designed so that they may be used in a nested manner.
   The encapsulation techniques may be used to encapsulate one or more
   PEM messages for forwarding to a third party, possibly in conjunction
   with interspersed (non-PEM) text which serves to annotate the PEM
   messages.

   Two encapsulation boundaries (EBs) are defined for delimiting
   encapsulated PEM messages and for distinguishing encapsulated PEM
   messages from interspersed (non-PEM) text.  The pre-EB is the string
   "-----BEGIN PRIVACY-ENHANCED MESSAGE-----", indicating that an
   encapsulated PEM message follows.  The post-EB is either (1) another
   pre-EB indicating that another encapsulated PEM message follows, or
   (2) the string "-----END PRIVACY-ENHANCED MESSAGE-----" indicating
   that any text that immediately follows is non-PEM text.  A special
   point must be noted for the case of MIC-CLEAR messages, the text
   portions of which may contain lines which begin with the "-"
   character and which are therefore subject to special processing per
   RFC-934 forwarding procedures.  When the string "- " must be
   prepended to such a line in the course of a forwarding operation in
   order to distinguish that line from an encapsulation boundary, MIC
   computation is to be performed prior to prepending the "- " string.
   Figure 1 depicts the encapsulation of a single PEM message.
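
   A receiving program might locate encapsulated messages using the
   boundary rules above, as in the following Python sketch.  The
   function name and the abbreviated sample message are hypothetical,
   and the sketch does not undo the "- " prepending applied by RFC-934
   forwarding.

```python
PRE_EB = "-----BEGIN PRIVACY-ENHANCED MESSAGE-----"
END_EB = "-----END PRIVACY-ENHANCED MESSAGE-----"

def split_messages(lines):
    """Return a list of (header_lines, text_lines), one per PEM message."""
    collected = []
    current = None                      # None while reading non-PEM text
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == PRE_EB:
            if current is not None:     # a pre-EB also ends the prior message
                collected.append(current)
            current = []
        elif line == END_EB:
            if current is not None:
                collected.append(current)
            current = None
        elif current is not None:
            current.append(line)
        # Lines seen while current is None are interspersed non-PEM text.
    # A message with no post-EB before end of input is dropped by this sketch.
    result = []
    for message in collected:
        if "" in message:               # first blank line ends the header
            cut = message.index("")
            result.append((message[:cut], message[cut + 1:]))
        else:                           # header only, no encapsulated text
            result.append((message, []))
    return result

sample = [
    "Forwarding note (non-PEM text)",
    PRE_EB,
    "Proc-Type: 4,MIC-CLEAR",
    "Content-Domain: RFC822",
    "",
    "Hello",
    END_EB,
    "Trailing non-PEM text",
]
messages = split_messages(sample)
print(messages)
```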

   This RFC places no a priori limits on the depth to which such
   encapsulation may be nested nor on the number of PEM messages which
   may be grouped in this fashion at a single nesting level for
   forwarding.  An implementation compliant with this RFC must not
   preclude a user from submitting or receiving PEM messages which
   exploit this encapsulation capability.  However, no specific
   requirements are levied upon implementations with regard to how this
   capability is made available to the user.  Thus, for example, a
   compliant PEM implementation is not required to automatically detect
   and process encapsulated PEM messages.

   In using this encapsulation facility, it is important to note that it
   is inappropriate to forward directly to a third party a message that
   is ENCRYPTED because recipients of such a message would not have
   access to the DEK required to decrypt the message.  Instead, the user
   forwarding the message must transform the ENCRYPTED message into a
   MIC-ONLY or MIC-CLEAR form prior to forwarding.  Thus, in order to
   comply with this RFC, a PEM implementation must provide a facility to
   enable a user to perform this transformation, while preserving the
   MIC associated with the original message.

   If a user wishes PEM-provided confidentiality protection for
   transmitted information, such information must occur in the
   encapsulated text of an ENCRYPTED PEM message, not in the enclosing
   MTS header or PEM encapsulated header. If a user wishes to avoid
   disclosing the actual subject of a message to unintended parties, it
   is suggested that the enclosing MTS header contain a "Subject:" field
   indicating that "Encrypted Mail Follows".


   Encapsulated Message

       Pre-Encapsulation Boundary (Pre-EB)
           -----BEGIN PRIVACY-ENHANCED MESSAGE-----

       Encapsulated Header Portion
           (Contains encryption control fields inserted in plaintext.
           Examples include "DEK-Info:" and "Key-Info:".
           Note that, although these control fields have line-oriented
           representations similar to RFC 822 header fields, the set
           of fields valid in this context is disjoint from those used
           in RFC 822 processing.)

       Blank Line
           (Separates Encapsulated Header from subsequent
           Encapsulated Text Portion)

       Encapsulated Text Portion
           (Contains message data encoded as specified in Section 4.3.)

       Post-Encapsulation Boundary (Post-EB)
           -----END PRIVACY-ENHANCED MESSAGE-----


                   Encapsulated Message Format
                            Figure 1


   If an integrity-protected representation of information which occurs
   within an enclosing header (not necessarily in the same format as
   that in which it occurs within that header) is desired, that data can
   be included within the encapsulated text portion in addition to its
   inclusion in the enclosing MTS header.  For example, an originator
   wishing to provide recipients with a protected indication of a
   message's position in a series of messages could include within the
   encapsulated text a copy of a timestamp or message counter value
   possessing end-to-end significance and extracted from an enclosing
   MTS header field.  (Note: mailbox specifiers as entered by end users
   incorporate local conventions and are subject to modification at
   intermediaries, so inclusion of such specifiers within encapsulated
   text should not be regarded as a suitable alternative to the
   authentication semantics defined in RFC 1422 and based on X.500
   Distinguished Names.) The set of header information (if any) included
   within the encapsulated text of messages is a local matter, and this
   RFC does not specify formatting conventions to distinguish replicated
   header fields from other encapsulated text.

4.5  Mail for Mailing Lists

   When mail is addressed to mailing lists, two different methods of
   processing can be applicable: the IK-per-list method and the IK-per-
   recipient method.  Hybrid approaches are also possible, as in the
   case of IK-per-list protection of a message on its path from an
   originator to a PEM-equipped mailing list exploder, followed by IK-
   per-recipient protection from the exploder to individual list
   recipients.

   If a message's originator is equipped to expand a destination mailing
   list into its individual constituents and elects to do so (IK-per-
   recipient), the message's DEK (and, in the symmetric key management
   case, MIC) will be encrypted under each per-recipient IK and all such
   encrypted representations will be incorporated into the transmitted
   message.  Note that per-recipient encryption is required only for the
   relatively small DEK and MIC quantities carried in the "Key-Info:"
   field, not for the message text which is, in general, much larger.
   Although more IKs are involved in processing under the IK-per-
   recipient method, the pairwise IKs can be individually revoked and
   possession of one IK does not enable a successful masquerade of
   another user on the list.

   If a message's originator addresses a message to a list name or
   alias, use of an IK associated with that name or alias as an entity
   (IK-per-list), rather than resolution of the name or alias to its
   constituent destinations, is implied. Such an IK must, therefore, be
   available to all list members. Unfortunately, it implies an
   undesirable level of exposure for the shared IK, and makes its
   revocation difficult.  Moreover, use of the IK-per-list method allows
   any holder of the list's IK to masquerade as another originator to
   the list for authentication purposes.

   Pure IK-per-list key management with asymmetric cryptography (with a
   common private key shared among multiple list members) is
   particularly disadvantageous, as it fails to preserve the forwardable
   authentication and non-repudiation characteristics which are provided
   for other messages in the asymmetric environment.  Use of a hybrid
   approach with a PEM-capable exploder is
   therefore particularly recommended for protection of mailing list
   traffic when asymmetric key management is used; such an exploder
   would reduce (per discussion in Section 4.4 of this RFC) incoming
   ENCRYPTED messages to MIC-ONLY or MIC-CLEAR form before forwarding
   them (perhaps re-encrypted under individual, per-recipient keys) to
   list members.




4.6  Summary of Encapsulated Header Fields

   This section defines the syntax and semantics of the encapsulated
   header fields to be added to messages in the course of privacy
   enhancement processing.

   The fields are presented in three groups.  Normally, the groups will
   appear in encapsulated headers in the order in which they are shown,
   though not all fields in each group will appear in all messages.  The
   following figures show the appearance of small example encapsulated
   messages.  Figure 2 assumes the use of symmetric cryptography for key
   management.  Figure 3 illustrates an example encapsulated ENCRYPTED
   message in which asymmetric key management is used.


   -----BEGIN PRIVACY-ENHANCED MESSAGE-----
   Proc-Type: 4,ENCRYPTED
   Content-Domain: RFC822
   DEK-Info: DES-CBC,F8143EDE5960C597
   Originator-ID-Symmetric: linn@zendia.enet.dec.com,,
   Recipient-ID-Symmetric: linn@zendia.enet.dec.com,ptf-kmc,3
   Key-Info: DES-ECB,RSA-MD2,9FD3AAD2F2691B9A,
             B70665BB9BF7CBCDA60195DB94F727D3
   Recipient-ID-Symmetric: pem-dev@tis.com,ptf-kmc,4
   Key-Info: DES-ECB,RSA-MD2,161A3F75DC82EF26,
             E2EF532C65CBCFF79F83A2658132DB47

   LLrHB0eJzyhP+/fSStdW8okeEnv47jxe7SJ/iN72ohNcUk2jHEUSoH1nvNSIWL9M
   8tEjmF/zxB+bATMtPjCUWbz8Lr9wloXIkjHUlBLpvXR0UrUzYbkNpk0agV2IzUpk
   J6UiRRGcDSvzrsoK+oNvqu6z7Xs5Xfz5rDqUcMlK1Z6720dcBWGGsDLpTpSCnpot
   dXd/H5LMDWnonNvPCwQUHt==
   -----END PRIVACY-ENHANCED MESSAGE-----

          Example Encapsulated Message (Symmetric Case)
                            Figure 2


   Figure 4 illustrates an example encapsulated MIC-ONLY message in
   which asymmetric key management is used; since no per-recipient keys
   are involved in preparation of asymmetric-case MIC-ONLY messages,
   this example should be processable for test purposes by arbitrary PEM
   implementations.

   Fully-qualified domain names (FQDNs) for hosts, appearing in the
   mailbox names found in entity identifier subfields of "Originator-
   ID-Symmetric:" and "Recipient-ID-Symmetric:" fields, are processed in
   a case-insensitive fashion.  Unless specified to the contrary, other
   field arguments (including the user name components of mailbox names)
   are to be processed in a case-sensitive fashion.
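
   The case-handling rule for mailbox names can be sketched as follows.
   The function name is hypothetical, and the sketch assumes mailbox
   names of the simple form user@host.

```python
def same_mailbox(a: str, b: str) -> bool:
    """Compare two mailbox names: the host (FQDN) part is compared
    case-insensitively, the user name part case-sensitively."""
    user_a, _, host_a = a.rpartition("@")
    user_b, _, host_b = b.rpartition("@")
    return user_a == user_b and host_a.lower() == host_b.lower()

print(same_mailbox("linn@zendia.enet.dec.com", "linn@ZENDIA.ENET.DEC.COM"))  # True
print(same_mailbox("linn@zendia.enet.dec.com", "Linn@zendia.enet.dec.com"))  # False
```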

   In most cases, numeric quantities are represented in header fields as
   contiguous strings of hexadecimal digits, where each digit is
   represented by a character from the ranges "0"-"9" or upper case
   "A"-"F".  Since public-key certificates and quantities encrypted




   -----BEGIN PRIVACY-ENHANCED MESSAGE-----
   Proc-Type: 4,ENCRYPTED
   Content-Domain: RFC822
   DEK-Info: DES-CBC,BFF968AA74691AC1
   Originator-Certificate:
    MIIBlTCCAScCAWUwDQYJKoZIhvcNAQECBQAwUTELMAkGA1UEBhMCVVMxIDAeBgNV
    BAoTF1JTQSBEYXRhIFNlY3VyaXR5LCBJbmMuMQ8wDQYDVQQLEwZCZXRhIDExDzAN
    BgNVBAsTBk5PVEFSWTAeFw05MTA5MDQxODM4MTdaFw05MzA5MDMxODM4MTZaMEUx
    CzAJBgNVBAYTAlVTMSAwHgYDVQQKExdSU0EgRGF0YSBTZWN1cml0eSwgSW5jLjEU
    MBIGA1UEAxMLVGVzdCBVc2VyIDEwWTAKBgRVCAEBAgICAANLADBIAkEAwHZHl7i+
    yJcqDtjJCowzTdBJrdAiLAnSC+CnnjOJELyuQiBgkGrgIh3j8/x0fM+YrsyF1u3F
    LZPVtzlndhYFJQIDAQABMA0GCSqGSIb3DQEBAgUAA1kACKr0PqphJYw1j+YPtcIq
    iWlFPuN5jJ79Khfg7ASFxskYkEMjRNZV/HZDZQEhtVaU7Jxfzs2wfX5byMp2X3U/
    5XUXGx7qusDgHQGs7Jk9W8CW1fuSWUgN4w==
   Key-Info: RSA,
    I3rRIGXUGWAF8js5wCzRTkdhO34PTHdRZY9Tuvm03M+NM7fx6qc5udixps2Lng0+
    wGrtiUm/ovtKdinz6ZQ/aQ==
   Issuer-Certificate:
    MIIB3DCCAUgCAQowDQYJKoZIhvcNAQECBQAwTzELMAkGA1UEBhMCVVMxIDAeBgNV
    BAoTF1JTQSBEYXRhIFNlY3VyaXR5LCBJbmMuMQ8wDQYDVQQLEwZCZXRhIDExDTAL
    BgNVBAsTBFRMQ0EwHhcNOTEwOTAxMDgwMDAwWhcNOTIwOTAxMDc1OTU5WjBRMQsw
    CQYDVQQGEwJVUzEgMB4GA1UEChMXUlNBIERhdGEgU2VjdXJpdHksIEluYy4xDzAN
    BgNVBAsTBkJldGEgMTEPMA0GA1UECxMGTk9UQVJZMHAwCgYEVQgBAQICArwDYgAw
    XwJYCsnp6lQCxYykNlODwutF/jMJ3kL+3PjYyHOwk+/9rLg6X65B/LD4bJHtO5XW
    cqAz/7R7XhjYCm0PcqbdzoACZtIlETrKrcJiDYoP+DkZ8k1gCk7hQHpbIwIDAQAB
    MA0GCSqGSIb3DQEBAgUAA38AAICPv4f9Gx/tY4+p+4DB7MV+tKZnvBoy8zgoMGOx
    dD2jMZ/3HsyWKWgSF0eH/AJB3qr9zosG47pyMnTf3aSy2nBO7CMxpUWRBcXUpE+x
    EREZd9++32ofGBIXaialnOgVUn0OzSYgugiQ077nJLDUj0hQehCizEs5wUJ35a5h
   MIC-Info: RSA-MD5,RSA,
    UdFJR8u/TIGhfH65ieewe2lOW4tooa3vZCvVNGBZirf/7nrgzWDABz8w9NsXSexv
    AjRFbHoNPzBuxwmOAFeA0HJszL4yBvhG
   Recipient-ID-Asymmetric:
    MFExCzAJBgNVBAYTAlVTMSAwHgYDVQQKExdSU0EgRGF0YSBTZWN1cml0eSwgSW5j
    LjEPMA0GA1UECxMGQmV0YSAxMQ8wDQYDVQQLEwZOT1RBUlk=,
    66
   Key-Info: RSA,
    O6BS1ww9CTyHPtS3bMLD+L0hejdvX6Qv1HK2ds2sQPEaXhX8EhvVphHYTjwekdWv
    7x0Z3Jx2vTAhOYHMcqqCjA==

   qeWlj/YJ2Uf5ng9yznPbtD0mYloSwIuV9FRYx+gzY+8iXd/NQrXHfi6/MhPfPF3d
   jIqCJAxvld2xgqQimUzoS1a4r7kQQ5c/Iua4LqKeq3ciFzEv/MbZhA==
   -----END PRIVACY-ENHANCED MESSAGE-----

    Example Encapsulated ENCRYPTED Message (Asymmetric Case)
                            Figure 3




   using asymmetric algorithms are large in size, use of a more space-
   efficient encoding technique is appropriate for such quantities, and
   the encoding mechanism defined in Section 4.3.2.4 of this RFC,
   representing 6 bits per printed character, is adopted for this
   purpose.

   Encapsulated headers of PEM messages are folded using whitespace per
   RFC 822 header folding conventions; no PEM-specific conventions are
   defined for encapsulated header folding.  The example shown in Figure
   4 shows (in its "MIC-Info:" field) an asymmetrically encrypted
   quantity in its printably encoded representation, illustrating the
   use of RFC 822 folding.
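
   A receiver could unfold such a field and recover the encoded quantity
   as in the following Python sketch, which uses the "MIC-Info:" field
   of Figure 4 as input.  The function name is hypothetical, and
   discarding the whitespace introduced by folding before decoding is an
   assumption of this sketch.

```python
import base64

def unfold(header_lines):
    """Join RFC 822 continuation lines (those starting with space or tab)
    onto the field line that precedes them."""
    fields = []
    for line in header_lines:
        if line[:1] in (" ", "\t") and fields:
            fields[-1] += " " + line.strip()
        else:
            fields.append(line.rstrip())
    return fields

mic_info_lines = [
    "MIC-Info: RSA-MD5,RSA,",
    " jV2OfH+nnXHU8bnL8kPAad/mSQlTDZlbVuxvZAOVRZ5q5+Ejl5bQvqNeqOUNQjr6",
    " EtE7K2QDeVMCyXsdJlA8fA==",
]

(field,) = unfold(mic_info_lines)
name, _, value = field.partition(":")
mic_alg, sig_alg, encoded_mic = [part.strip() for part in value.split(",", 2)]

# Assumption: whitespace left over from folding is not part of the quantity.
signed_mic = base64.b64decode("".join(encoded_mic.split()))
print(name, mic_alg, sig_alg, len(signed_mic))
```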

   In contrast to the encapsulated header representations defined in RFC
   1113 and its precursors, the field identifiers adopted in this RFC do
   not begin with the prefix "X-" (for example, the field previously
   denoted "X-Key-Info:" is now denoted "Key-Info:") and such prefixes
   are not to be emitted by implementations conformant to this RFC.  To
   simplify transition and interoperability with earlier
   implementations, it is suggested that implementations based on this
   RFC accept incoming encapsulated header fields carrying the "X-"
   prefix and act on such fields as if the "X-" were not present.
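
   The suggested transition behavior amounts to a small normalization on
   input, sketched here in Python.  The function name is hypothetical.

```python
def normalize_field_name(name: str) -> str:
    """Treat an incoming "X-"-prefixed field name as if the prefix
    were absent; other names pass through unchanged."""
    if name.startswith("X-"):
        return name[2:]
    return name

print(normalize_field_name("X-Key-Info"))   # Key-Info
print(normalize_field_name("Key-Info"))     # Key-Info
```

   Only the input path is affected; per the text above, a conformant
   implementation does not emit the prefixed form.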




   -----BEGIN PRIVACY-ENHANCED MESSAGE-----
   Proc-Type: 4,MIC-ONLY
   Content-Domain: RFC822
   Originator-Certificate:
    MIIBlTCCAScCAWUwDQYJKoZIhvcNAQECBQAwUTELMAkGA1UEBhMCVVMxIDAeBgNV
    BAoTF1JTQSBEYXRhIFNlY3VyaXR5LCBJbmMuMQ8wDQYDVQQLEwZCZXRhIDExDzAN
    BgNVBAsTBk5PVEFSWTAeFw05MTA5MDQxODM4MTdaFw05MzA5MDMxODM4MTZaMEUx
    CzAJBgNVBAYTAlVTMSAwHgYDVQQKExdSU0EgRGF0YSBTZWN1cml0eSwgSW5jLjEU
    MBIGA1UEAxMLVGVzdCBVc2VyIDEwWTAKBgRVCAEBAgICAANLADBIAkEAwHZHl7i+
    yJcqDtjJCowzTdBJrdAiLAnSC+CnnjOJELyuQiBgkGrgIh3j8/x0fM+YrsyF1u3F
    LZPVtzlndhYFJQIDAQABMA0GCSqGSIb3DQEBAgUAA1kACKr0PqphJYw1j+YPtcIq
    iWlFPuN5jJ79Khfg7ASFxskYkEMjRNZV/HZDZQEhtVaU7Jxfzs2wfX5byMp2X3U/
    5XUXGx7qusDgHQGs7Jk9W8CW1fuSWUgN4w==
   Issuer-Certificate:
    MIIB3DCCAUgCAQowDQYJKoZIhvcNAQECBQAwTzELMAkGA1UEBhMCVVMxIDAeBgNV
    BAoTF1JTQSBEYXRhIFNlY3VyaXR5LCBJbmMuMQ8wDQYDVQQLEwZCZXRhIDExDTAL
    BgNVBAsTBFRMQ0EwHhcNOTEwOTAxMDgwMDAwWhcNOTIwOTAxMDc1OTU5WjBRMQsw
    CQYDVQQGEwJVUzEgMB4GA1UEChMXUlNBIERhdGEgU2VjdXJpdHksIEluYy4xDzAN
    BgNVBAsTBkJldGEgMTEPMA0GA1UECxMGTk9UQVJZMHAwCgYEVQgBAQICArwDYgAw
    XwJYCsnp6lQCxYykNlODwutF/jMJ3kL+3PjYyHOwk+/9rLg6X65B/LD4bJHtO5XW
    cqAz/7R7XhjYCm0PcqbdzoACZtIlETrKrcJiDYoP+DkZ8k1gCk7hQHpbIwIDAQAB
    MA0GCSqGSIb3DQEBAgUAA38AAICPv4f9Gx/tY4+p+4DB7MV+tKZnvBoy8zgoMGOx
    dD2jMZ/3HsyWKWgSF0eH/AJB3qr9zosG47pyMnTf3aSy2nBO7CMxpUWRBcXUpE+x
    EREZd9++32ofGBIXaialnOgVUn0OzSYgugiQ077nJLDUj0hQehCizEs5wUJ35a5h
   MIC-Info: RSA-MD5,RSA,
    jV2OfH+nnXHU8bnL8kPAad/mSQlTDZlbVuxvZAOVRZ5q5+Ejl5bQvqNeqOUNQjr6
    EtE7K2QDeVMCyXsdJlA8fA==

   LSBBIG1lc3NhZ2UgZm9yIHVzZSBpbiB0ZXN0aW5nLg0KLSBGb2xsb3dpbmcgaXMg
   YSBibGFuayBsaW5lOg0KDQpUaGlzIGlzIHRoZSBlbmQuDQo=
   -----END PRIVACY-ENHANCED MESSAGE-----

     Example Encapsulated MIC-ONLY Message (Asymmetric Case)
                            Figure 4
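
   Because the encapsulated text of a MIC-ONLY message is encoded but
   not encrypted, the text portion of Figure 4 can be recovered with any
   decoder for the printable encoding.  The sketch below uses Python's
   standard base64 module, which implements the same alphabet; the
   variable names are hypothetical.

```python
import base64

# Encapsulated text lines copied from Figure 4.
figure4_text = (
    "LSBBIG1lc3NhZ2UgZm9yIHVzZSBpbiB0ZXN0aW5nLg0KLSBGb2xsb3dpbmcgaXMg"
    "YSBibGFuayBsaW5lOg0KDQpUaGlzIGlzIHRoZSBlbmQuDQo="
)

canonical = base64.b64decode(figure4_text)

# The result is in canonical form: ASCII text with <CR><LF> delimiters.
for line in canonical.decode("ascii").split("\r\n"):
    print(repr(line))
```

   The decoded octets are four <CR><LF>-terminated lines: "- A message
   for use in testing.", "- Following is a blank line:", an empty line,
   and "This is the end.".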


4.6.1  Per-Message Encapsulated Header Fields

   This group of encapsulated header fields contains fields which occur
   no more than once in a PEM message, generally preceding all other
   encapsulated header fields.

4.6.1.1  Proc-Type Field

   The "Proc-Type:" encapsulated header field, required for all PEM
   messages, identifies the type of processing performed on the
   transmitted message.  Only one "Proc-Type:" field occurs in a
   message; the "Proc-Type:" field must be the first encapsulated header
   field in the message.

   The "Proc-Type:" field has two subfields, separated by a comma.  The
   first subfield is a decimal number which is used to distinguish among
   incompatible encapsulated header field interpretations which may
   arise as changes are made to this standard.  Messages processed
   according to this RFC will carry the subfield value "4" to
   distinguish them from messages processed in accordance with prior PEM
   RFCs.  The second subfield assumes one of a set of string values,
   defined in the following subsections.
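
   Parsing of the field's two subfields can be sketched as follows.  The
   function name is hypothetical; the set of specifiers is the one
   defined in the subsections below.

```python
PROC_TYPES = {"ENCRYPTED", "MIC-ONLY", "MIC-CLEAR", "CRL"}

def parse_proc_type(value: str):
    """Split a "Proc-Type:" field value into (version, specifier)."""
    version_text, _, specifier = value.partition(",")
    version = int(version_text.strip())    # first subfield: decimal number
    specifier = specifier.strip()          # second subfield: string value
    if specifier not in PROC_TYPES:
        raise ValueError("unknown Proc-Type specifier: %r" % specifier)
    return version, specifier

print(parse_proc_type("4,ENCRYPTED"))   # (4, 'ENCRYPTED')
```

   A caller would additionally compare the version against the value "4"
   to detect messages processed under prior PEM RFCs.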

4.6.1.1.1  ENCRYPTED

   The "ENCRYPTED" specifier signifies that confidentiality,
   authentication, integrity, and (given use of asymmetric key
   management) non-repudiation of origin security services have been
   applied to a PEM message's encapsulated text.  ENCRYPTED messages
   require a "DEK-Info:" field and individual Recipient-ID and "Key-
   Info:" fields for all message recipients.

4.6.1.1.2  MIC-ONLY

   The "MIC-ONLY" specifier signifies that all of the security services
   specified for ENCRYPTED messages, with the exception of
   confidentiality, have been applied to a PEM message's encapsulated
   text. MIC-ONLY messages are encoded (per Section 4.3.2.4 of this RFC)
   to protect their encapsulated text against modifications at message
   transfer or relay points.

   Specification of MIC-ONLY, when applied in conjunction with certain
   combinations of key management and MIC algorithm options, permits
   certain fields which are superfluous in the absence of encryption to
   be omitted from the encapsulated header.  In particular, when a
   keyless MIC computation is employed for recipients for whom
   asymmetric cryptography is used, "Recipient-ID-Asymmetric:" and
   "Key-Info:" fields can be omitted.  The "DEK-Info:" field can be
   omitted for all "MIC-ONLY" messages.

4.6.1.1.3  MIC-CLEAR

   The "MIC-CLEAR" specifier represents a PEM message with the same
   security service selection as for a MIC-ONLY message.  The set of
   encapsulated header fields required in a MIC-CLEAR message is the
   same as that required for a MIC-ONLY message.

   MIC-CLEAR message processing omits the encoding step, defined in
   Section 4.3.2.4 of this RFC, that protects a message's encapsulated
   text against modifications within the MTS.  As a result, a MIC-CLEAR
   message's text can be read by recipients lacking access to PEM
   software, even though such recipients cannot validate the message's
   signature. The canonical encoding discussed in Section 4.3.2.2 is
   performed, so interoperation among sites with different native
   character sets and line representations is not precluded so long as
   those native formats are unambiguously translatable to and from the
   canonical form.  (Such interoperability is feasible only for those
   characters which are included in the canonical representation set.)

   Omission of the printable encoding step implies that MIC-CLEAR
   message MICs will be validatable only in environments where the MTS
   does not modify messages in transit, or where the modifications
   performed can be determined and inverted before MIC validation
   processing.  Failed MIC validation on a MIC-CLEAR message does not,
   therefore, necessarily signify a security-relevant event; as a
   result, it is recommended that PEM implementations reflect to their
   users (in a suitable local fashion) the type of PEM message being
   processed when reporting a MIC validation failure.

   A case of particular relevance arises for inbound SMTP processing on
   systems which delimit text lines with local native representations
   other than the SMTP-conventional <CR><LF>.  When mail is delivered to
   a UA on such a system and presented for PEM processing, the <CR><LF>
   has already been translated to local form.  In order to validate a
   MIC-CLEAR message's MIC in this situation, the PEM module must
   recanonicalize the incoming message in order to determine the inter-
   SMTP representation of the canonically encoded message (as defined in
   Section 4.3.2.2 of this RFC), and must compute the reference MIC
   based on that representation.

4.6.1.1.4  CRL

   The "CRL" specifier indicates a special PEM message type, used to
   transfer one or more Certificate Revocation Lists.  The format of PEM
   CRLs is defined in RFC 1422.  No user data or encapsulated text
   accompanies an encapsulated header specifying the CRL message type; a
   correctly-formed CRL message's PEM header is immediately followed by
   its terminating message boundary line, with no blank line
   intervening.

   Only three types of fields are valid in the encapsulated header
   comprising a CRL message.  The "CRL:" field carries a printable
   representation of a CRL, encoded using the procedures defined in
   Section 4.3.2.4 of this RFC. "CRL:" fields may (as an option) be
   followed by no more than one "Originator-Certificate:" field and any
   number of "Issuer-Certificate:" fields. The "Originator-Certificate:"
   and "Issuer-Certificate:" fields refer to the most recently previous
   "CRL:" field, and provide certificates useful in validating the
   signature included in the CRL.  "Originator-Certificate:" and
   "Issuer-Certificate:" fields' contents are the same for CRL messages
   as they are for other PEM message types.
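
   The field ordering described for CRL messages can be checked with a
   small state machine.  The sketch below is illustrative only: it
   assumes the caller passes the field names that follow the
   "Proc-Type:" field, with the trailing colon removed, and it assumes
   that an "Originator-Certificate:" field, when present, precedes the
   "Issuer-Certificate:" fields of its group.  The function name is
   hypothetical.

```python
def valid_crl_field_sequence(names):
    """Return True if `names` follows the CRL header ordering rules."""
    state = "start"
    for name in names:
        if name == "CRL":
            state = "crl"            # begins a new group
        elif name == "Originator-Certificate" and state == "crl":
            state = "orig"           # at most one per "CRL:" field
        elif name == "Issuer-Certificate" and state != "start":
            state = "issuer"         # any number may follow
        else:
            return False             # unexpected field or bad order
    return state != "start"          # at least one "CRL:" field needed
```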

4.6.1.2  Content-Domain Field

   The "Content-Domain:" encapsulated header field describes the type of
   content which is represented within a PEM message's encapsulated
   text.  It carries one string argument, whose value is defined as
   "RFC822" to indicate processing of RFC-822 mail as defined in this
   specification.  It is anticipated that additional "Content-Domain:"
   values will be defined subsequently, in additional or successor
   documents to this specification. Only one "Content-Domain:" field
   occurs in a PEM message; this field is the PEM message's second
   encapsulated header field, immediately following the "Proc-Type:"
   field.

4.6.1.3  DEK-Info Field

   The "DEK-Info:" encapsulated header field identifies the message text
   encryption algorithm and mode, and also carries any cryptographic
   parameters (e.g., IVs) used for message encryption.  No more than one
   "DEK-Info:" field occurs in a message; the field is required for all
   messages specified as "ENCRYPTED" in the "Proc-Type:" field.

   The "DEK-Info:" field carries either one argument or two arguments
   separated by a comma.  The first argument identifies the algorithm
   and mode used for message text encryption.  The second argument, if
   present, carries any cryptographic parameters required by the
   algorithm and mode identified in the first argument.  Appropriate
   message encryption algorithms, modes and identifiers and
   corresponding cryptographic parameters and formats are defined in RFC
   1423.
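
   As an illustration of the one-or-two argument structure, the sketch
   below splits a "DEK-Info:" field body into its algorithm/mode
   identifier and its optional parameter string.  The function name is
   hypothetical, and the sample values in the usage note are
   placeholders; the authoritative identifiers and parameter formats
   are those of RFC 1423.

```python
def parse_dek_info(body: str):
    """Split a DEK-Info field body into (algorithm, parameters)."""
    parts = [part.strip() for part in body.split(",")]
    if len(parts) not in (1, 2) or not parts[0]:
        raise ValueError("DEK-Info carries one or two arguments")
    algorithm = parts[0]
    # The second argument is present only if the algorithm needs it.
    parameters = parts[1] if len(parts) == 2 else None
    return algorithm, parameters
```

   For example, parse_dek_info("DES-CBC,F8143EDE5960C597") separates
   the identifier "DES-CBC" from the parameter string that follows it.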

4.6.2  Encapsulated Header Fields Normally Per-Message

   This group of encapsulated header fields contains fields which
   ordinarily occur no more than once per message.  Depending on the key
   management option(s) employed, some of these fields may be absent
   from some messages.

4.6.2.1  Originator-ID Fields

   Originator-ID encapsulated header fields identify a message's
   originator and provide the originator's IK identification component.
   Two varieties of Originator-ID fields are defined, the "Originator-
   ID-Asymmetric:" and "Originator-ID-Symmetric:" fields.  An
   "Originator-ID-Symmetric:" header field is required for all PEM
   messages employing symmetric key management.  The analogous
   "Originator-ID-Asymmetric:" field, for the asymmetric key management
   case, is used only when no corresponding "Originator-Certificate:"
   field is included.

   Most commonly, only one Originator-ID or "Originator-Certificate:"
   field will occur within a message. For the symmetric case, the IK
   identification component carried in an "Originator-ID-Symmetric:"
   field applies to processing of all subsequent "Recipient-ID-
   Symmetric:" fields until another "Originator-ID-Symmetric:" field
   occurs.  It is illegal for a "Recipient-ID-Symmetric:" field to occur
   before a corresponding "Originator-ID-Symmetric:" field has been
   provided.  For the asymmetric case, processing of "Recipient-ID-
   Asymmetric:" fields is logically independent of preceding
   "Originator-ID-Asymmetric:" and "Originator-Certificate:" fields.

   Multiple Originator-ID and/or "Originator-Certificate:" fields may
   occur in a message when different originator-oriented IK components
   must be used by a message's originator in order to prepare a message
   so as to be suitable for processing by different recipients. In
   particular, multiple such fields will occur when both symmetric and
   asymmetric cryptography are applied to a single message in order to
   process the message for different recipients.

   Originator-ID subfields are delimited by the comma character (","),
   optionally followed by whitespace.  Section 5.2, Interchange Keys,
   discusses the semantics of these subfields and specifies the alphabet
   from which they are chosen.

4.6.2.1.1  Originator-ID-Asymmetric Field

   The "Originator-ID-Asymmetric:" field contains an Issuing Authority
   subfield, and then a Version/Expiration subfield.  This field is used
   only when the information it carries is not available from an
   included "Originator-Certificate:" field.

4.6.2.1.2  Originator-ID-Symmetric Field

   The "Originator-ID-Symmetric:" field contains an Entity Identifier
   subfield, followed by an (optional) Issuing Authority subfield, and
   then an (optional) Version/Expiration subfield.  Optional
   "Originator-ID-Symmetric:" subfields may be omitted only if rendered
   redundant by information carried in subsequent "Recipient-ID-
   Symmetric:" fields, and will normally be omitted in such cases.

4.6.2.2  Originator-Certificate Field

   The "Originator-Certificate:" encapsulated header field is used only
   when asymmetric key management is employed for one or more of a
   message's recipients.  To facilitate processing by recipients (at
   least in advance of general directory server availability), inclusion
   of this field in all messages is strongly recommended.  The field
   transfers an originator's certificate as a numeric quantity,
   comprised of the certificate's DER encoding, represented in the
   header field with the encoding mechanism defined in Section 4.3.2.4
   of this RFC.  The semantics of a certificate are discussed in RFC
   1422.

4.6.2.3  MIC-Info Field

   The "MIC-Info:" encapsulated header field, used only when asymmetric
   key management is employed for at least one recipient of a message,
   carries three arguments, separated by commas.  The first argument
   identifies the algorithm under which the accompanying MIC is
   computed.  The second argument identifies the algorithm under which
   the accompanying MIC is signed.  The third argument represents a MIC
   signed with an originator's private key.

   For the case of ENCRYPTED PEM messages, the signed MIC is, in turn,
   symmetrically encrypted using the same DEK, algorithm, mode and
   cryptographic parameters as are used to encrypt the message's
   encapsulated text.  This measure prevents unauthorized recipients
   from determining whether an intercepted message corresponds to a
   predetermined plaintext value.

   Appropriate MIC algorithms and identifiers, signature algorithms and
   identifiers, and signed MIC formats are defined in RFC 1423.

   A "MIC-Info:" field will occur after a sequence of fields beginning
   with an "Originator-ID-Asymmetric:" or "Originator-Certificate:" field
   and followed by any associated "Issuer-Certificate:" fields.  A
   "MIC-Info:" field applies to all subsequent recipients for whom
   asymmetric key management is used, until and unless overridden by a
   subsequent "Originator-ID-Asymmetric:" or "Originator-Certificate:"
   and corresponding "MIC-Info:".

4.6.3  Encapsulated Header Fields with Variable Occurrences

   This group of encapsulated header fields contains fields which will
   normally occur variable numbers of times within a message, with
   numbers of occurrences ranging from zero to non-zero values which are
   independent of the number of recipients.

4.6.3.1  Issuer-Certificate Field

   The "Issuer-Certificate:" encapsulated header field is meaningful
   only when asymmetric key management is used for at least one of a
   message's recipients.  A typical "Issuer-Certificate:" field would
   contain the certificate containing the public component used to sign
   the certificate carried in the message's "Originator-Certificate:"
   field, for recipients' use in chaining through that certificate's
   certification path.  Other "Issuer-Certificate:" fields, typically
   representing higher points in a certification path, also may be
   included by an originator.  It is recommended that the "Issuer-
   Certificate:" fields be included in an order corresponding to
   successive points in a certification path leading from the originator
   to a common point shared with the message's recipients (i.e., the
   Internet Certification Authority (ICA), unless a lower Policy
   Certification Authority (PCA) or CA is common to all recipients).
   More information on certification paths can be found in RFC 1422.

   The certificate is represented in the same manner as defined for the
   "Originator-Certificate:" field (transporting an encoded
   representation of the certificate in X.509 [7] DER form), and any
   "Issuer-Certificate:" fields will ordinarily follow the "Originator-
   Certificate:" field directly.  Use of the "Issuer-Certificate:" field
   is optional even when asymmetric key management is employed, although
   its incorporation is strongly recommended in the absence of alternate
   directory server facilities from which recipients can access issuers'
   certificates.

4.6.4  Per-Recipient Encapsulated Header Fields

   The encapsulated header fields in this group appear for each of an
   "ENCRYPTED" message's named recipients.  For "MIC-ONLY" and "MIC-
   CLEAR" messages, these fields are omitted for recipients for whom
   asymmetric key management is employed in conjunction with a keyless
   MIC algorithm but the fields appear for recipients for whom symmetric
   key management or a keyed MIC algorithm is employed.
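
   The rule above reduces to a small decision function.  The sketch
   below is one possible reading, with hypothetical parameter names;
   it covers only the three message types named in this section and
   rejects any other value.

```python
def recipient_fields_present(proc_type: str, symmetric: bool,
                             keyed_mic: bool) -> bool:
    """Decide whether per-recipient fields appear for one recipient."""
    if proc_type == "ENCRYPTED":
        return True                      # always present
    if proc_type in ("MIC-ONLY", "MIC-CLEAR"):
        # Omitted only for asymmetric key management combined with a
        # keyless MIC algorithm.
        return symmetric or keyed_mic
    raise ValueError("message type not covered by this sketch")
```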

4.6.4.1  Recipient-ID Fields

   A Recipient-ID encapsulated header field identifies a recipient and
   provides the recipient's IK identification component.  One
   Recipient-ID field is included for each of a message's named
   recipients. Section 5.2, Interchange Keys, discusses the semantics of
   the subfields and specifies the alphabet from which they are chosen.
   Recipient-ID subfields are delimited by the comma character (","),
   optionally followed by whitespace.

   For the symmetric case, all "Recipient-ID-Symmetric:" fields are
   interpreted in the context of the most recent preceding "Originator-
   ID-Symmetric:" field.  It is illegal for a "Recipient-ID-Symmetric:"
   field to occur in a header before the occurrence of a corresponding
   "Originator-ID-Symmetric:" field.  For the asymmetric case,
   "Recipient-ID-Asymmetric:" fields are logically independent of a
   message's "Originator-ID-Asymmetric:" and "Originator-Certificate:"
   fields.  "Recipient-ID-Asymmetric:" fields, and their associated
   "Key-Info:" fields, are included following a header's originator-
   oriented fields.
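
   The context rule for the symmetric case can be illustrated by
   walking the header in order and pairing each "Recipient-ID-
   Symmetric:" field with the most recent preceding "Originator-ID-
   Symmetric:" field.  This is a sketch under stated assumptions: the
   header has already been split into (name, body) pairs with the
   trailing colon removed from each name, and the function name is
   hypothetical.

```python
def pair_symmetric_ids(fields):
    """Pair each symmetric recipient with its governing originator.

    `fields` is a list of (field_name, field_body) tuples in the order
    in which they occur in the encapsulated header.
    """
    current_originator = None
    pairs = []
    for name, body in fields:
        if name == "Originator-ID-Symmetric":
            current_originator = body
        elif name == "Recipient-ID-Symmetric":
            if current_originator is None:
                # Illegal: no corresponding originator field yet.
                raise ValueError("Recipient-ID-Symmetric before "
                                 "Originator-ID-Symmetric")
            pairs.append((current_originator, body))
    return pairs
```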

4.6.4.1.1  Recipient-ID-Asymmetric Field

   The "Recipient-ID-Asymmetric:" field contains, in order, an Issuing
   Authority subfield and a Version/Expiration subfield.

4.6.4.1.2  Recipient-ID-Symmetric Field

   The "Recipient-ID-Symmetric:" field contains, in order, an Entity
   Identifier subfield, an (optional) Issuing Authority subfield, and an
   (optional) Version/Expiration subfield.

4.6.4.2  Key-Info Field

   One "Key-Info:" field is included for each of a message's named
   recipients.  In addition, it is recommended that PEM implementations
   support (as a locally-selectable option) the ability to include a
   "Key-Info:" field corresponding to a PEM message's originator,
   following an Originator-ID or "Originator-Certificate:" field and
   before any associated Recipient-ID fields, but inclusion of such a
   field is not a requirement for conformance with this RFC.

   Each "Key-Info:" field is interpreted in the context of the most
   recent preceding Originator-ID, "Originator-Certificate:", or
   Recipient-ID field; normally, a "Key-Info:" field will immediately
   follow its associated predecessor field. The "Key-Info:" field's
   argument(s) differ depending on whether symmetric or asymmetric key
   management is used for a particular recipient.

4.6.4.2.1  Symmetric Key Management

   When symmetric key management is employed for a given recipient, the
   "Key-Info:" encapsulated header field transfers four items, separated
   by commas: an IK Use Indicator, a MIC Algorithm Indicator, a DEK and
   a MIC.  The IK Use Indicator identifies the algorithm and mode in
   which the identified IK was used for DEK and MIC encryption for a
   particular recipient.  The MIC Algorithm Indicator identifies the MIC
   computation algorithm used for a particular recipient.  The DEK and
   MIC are symmetrically encrypted under the IK identified by a
   preceding "Recipient-ID-Symmetric:" field and/or prior "Originator-
   ID-Symmetric:" field.

   Appropriate symmetric encryption algorithms, modes and identifiers,
   MIC computation algorithms and identifiers, and encrypted DEK and MIC
   formats are defined in RFC 1423.

4.6.4.2.2  Asymmetric Key Management

   When asymmetric key management is employed for a given recipient, the
   "Key-Info:" field transfers two quantities, separated by a comma.
   The first argument is an IK Use Indicator identifying the algorithm
   and mode in which the DEK is asymmetrically encrypted.  The second
   argument is a DEK, asymmetrically encrypted under the recipient's
   public component.

   Appropriate asymmetric encryption algorithms and identifiers, and
   encrypted DEK formats are defined in RFC 1423.
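
   The two "Key-Info:" layouts can be told apart when parsing, as the
   sketch below shows.  Note the assumption: this example selects the
   layout by counting arguments, whereas an implementation would
   normally know the key management mode from the preceding
   Originator-ID, "Originator-Certificate:", or Recipient-ID field.
   The function and key names are hypothetical.

```python
def parse_key_info(body: str) -> dict:
    """Split a Key-Info field body into named items."""
    # strip() also removes folding whitespace around each item.
    items = [item.strip() for item in body.split(",")]
    if len(items) == 4:
        ik_use, mic_algorithm, dek, mic = items
        return {"management": "symmetric", "ik_use": ik_use,
                "mic_algorithm": mic_algorithm,
                "encrypted_dek": dek, "encrypted_mic": mic}
    if len(items) == 2:
        ik_use, dek = items
        return {"management": "asymmetric", "ik_use": ik_use,
                "encrypted_dek": dek}
    raise ValueError("Key-Info carries either four or two items")
```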

5.  Key Management

   Several cryptographic constructs are involved in supporting the PEM
   message processing procedure.  A set of fundamental elements is
   assumed.  Data Encrypting Keys (DEKs) are used to encrypt message
   text and (for some MIC computation algorithms) in the message
   integrity check (MIC) computation process.  Interchange Keys (IKs)
   are used to encrypt DEKs and MICs for transmission with messages.  In
   a certificate-based asymmetric key management architecture,
   certificates are used as a means to provide entities' public
   components and other information in a fashion which is securely bound
   by a central authority.  The remainder of this section provides more
   information about these constructs.

5.1  Data Encrypting Keys (DEKs)

   Data Encrypting Keys (DEKs) are used for encryption of message text
   and (with some MIC computation algorithms) for computation of message
   integrity check quantities (MICs).  In the asymmetric key management
   case, they are also used for encrypting signed MICs in ENCRYPTED PEM
   messages.  It is strongly recommended that DEKs be generated and used
   on a one-time, per-message, basis.  A transmitted message will
   incorporate a representation of the DEK encrypted under an
   appropriate interchange key (IK) for each of the named recipients.

   DEK generation can be performed either centrally by key distribution
   centers (KDCs) or by endpoint systems.  Dedicated KDC systems may be
   able to implement stronger algorithms for random DEK generation than
   can be supported in endpoint systems.  On the other hand,
   decentralization allows endpoints to be relatively self-sufficient,
   reducing the level of trust which must be placed in components other
   than those of a message's originator and recipient.  Moreover,
   decentralized DEK generation at endpoints reduces the frequency with
   which originators must make real-time queries of (potentially unique)
   servers in order to send mail, enhancing communications availability.

   When symmetric key management is used, one advantage of centralized
   KDC-based generation is that DEKs can be returned to endpoints
   already encrypted under the IKs of message recipients rather than
   providing the IKs to the originators.  This reduces IK exposure and
   simplifies endpoint key management requirements.  This approach has
   less value if asymmetric cryptography is used for key management,
   since per-recipient public IK components are assumed to be generally
   available and per-originator private IK components need not
   necessarily be shared with a KDC.
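
   Decentralized, one-time DEK generation at an endpoint might look
   like the sketch below.  It is illustrative only: the 8-octet lengths
   are an assumption chosen to suit a DES-sized key and IV, no parity
   adjustment is performed, and the function name is hypothetical.
   The appropriate algorithms and parameter formats are defined in RFC
   1423.

```python
import secrets


def new_message_keying(key_len: int = 8, iv_len: int = 8):
    """Generate a fresh DEK and IV for exactly one message."""
    # secrets draws from the operating system's cryptographic random
    # source; a new pair is generated for every outgoing message.
    dek = secrets.token_bytes(key_len)
    iv = secrets.token_bytes(iv_len)
    return dek, iv
```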

5.2  Interchange Keys (IKs)

   Interchange Key (IK) components are used to encrypt DEKs and MICs.
   In general, IK granularity is at the pairwise per-user level except
   for mail sent to address lists comprising multiple users.  In order
   for two principals to engage in a useful exchange of PEM using
   conventional cryptography, they must first possess common IK
   components (when symmetric key management is used) or complementary
   IK components (when asymmetric key management is used).  When
   symmetric cryptography is used, the IK consists of a single
   component, used to encrypt both DEKs and MICs.  When asymmetric
   cryptography is used, a recipient's public component is used as an IK
   to encrypt DEKs (a transformation invertible only by a recipient
   possessing the corresponding private component), and the originator's
   private component is used to encrypt MICs (a transformation
   invertible by all recipients, since the originator's certificate
   provides all recipients with the public component required to perform
   MIC validation).

   This RFC does not prescribe the means by which interchange keys are
   made available to appropriate parties; such means may be centralized
   (e.g., via key management servers) or decentralized (e.g., via
   pairwise agreement and direct distribution among users).  In any
   case, any given IK component is associated with a responsible Issuing
   Authority (IA).  When certificate-based asymmetric key management, as
   discussed in RFC 1422, is employed, the IA function is performed by
   a Certification Authority (CA).

   When an IA generates and distributes an IK component, associated
   control information is provided to direct how it is to be used.  In
   order to select the appropriate IK(s) to use in message encryption,
   an originator must retain a correspondence between IK components and
   the recipients with which they are associated.  Expiration date
   information must also be retained, in order that cached entries may
   be invalidated and replaced as appropriate.

   Since a message may be sent with multiple IK components identified,
   corresponding to multiple intended recipients, each recipient's UA
   must be able to determine that recipient's intended IK component.
   Moreover, if no corresponding IK component is available in the
   recipient's database when a message arrives, the recipient must be
   able to identify the required IK component and identify that IK
   component's associated IA.  Note that different IKs may be used for
   different messages between a pair of communicants.  Consider, for
   example, one message sent from A to B and another message sent (using
   the IK-per-list method) from A to a mailing list of which B is a
   member.  The first message would use IK components associated
   individually with A and B, but the second would use an IK component
   shared among list members.

   When a PEM message is transmitted, an indication of the IK components
   used for DEK and MIC encryption must be included.  To this end,
   Originator-ID and Recipient-ID encapsulated header fields provide
   (some or all of) the following data:

        1.  Identification of the relevant Issuing Authority (IA
            subfield)

        2.  Identification of an entity with which a particular IK
            component is associated (Entity Identifier or EI subfield)

        3.  Version/Expiration subfield

   In the asymmetric case, all necessary information associated with an
   originator can be acquired by processing the certificate carried in
   an "Originator-Certificate:" field; to avoid redundancy in this case,
   no "Originator-ID-Asymmetric:" field is included if a corresponding
   "Originator-Certificate:" appears.

   The comma character (",") is used to delimit the subfields within an
   Originator-ID or Recipient-ID.  The IA, EI, and version/expiration
   subfields are generated from a restricted character set, as
   prescribed by the following BNF (using notation as defined in RFC
   822, Sections 2 and 3.3):

   IKsubfld       :=       1*ia-char

   ia-char        :=       DIGIT / ALPHA / "'" / "+" / "(" / ")" /
                           "." / "/" / "=" / "?" / "-" / "@" /
                           "%" / "!" / '"' / "_" / "<" / ">"


   An example Recipient-ID field for the symmetric case is as follows:

   Recipient-ID-Symmetric: linn@zendia.enet.dec.com,ptf-kmc,2

   This example field indicates that IA "ptf-kmc" has issued an IK
   component for use on messages sent to "linn@zendia.enet.dec.com",
   and that the IA has provided the number 2 as a version indicator for
   that IK component.

   An example Recipient-ID field for the asymmetric case is as follows:

   Recipient-ID-Asymmetric:
    MFExCzAJBgNVBAYTAlVTMSAwHgYDVQQKExdSU0EgRGF0YSBTZWN1cml0eSwgSW5j
    LjEPMA0GA1UECxMGQmV0YSAxMQ8wDQYDVQQLEwZOT1RBUlk=,66

   This example field includes the printably encoded BER representation
   of a certificate's issuer distinguished name, along with the
   certificate serial number 66 as assigned by that issuer.
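
   The two example fields above can be parsed, and their subfields
   checked against the ia-char alphabet, along the following lines.
   This is a sketch rather than a definitive parser: the names are
   hypothetical, whitespace inside the field body is treated as folding
   and discarded, and an omitted optional subfield is assumed to appear
   either as an empty string between commas or not at all.

```python
import string

# The ia-char alphabet from the BNF: DIGIT, ALPHA and listed symbols.
IA_CHARS = frozenset(string.ascii_letters + string.digits
                     + "'+()./=?-@%!\"_<>")

# Subfield layout per field name: (key, required) in order.
LAYOUTS = {
    "Recipient-ID-Symmetric": (("entity_identifier", True),
                               ("issuing_authority", False),
                               ("version_expiration", False)),
    "Recipient-ID-Asymmetric": (("issuing_authority", True),
                                ("version_expiration", True)),
}


def parse_recipient_id(field: str):
    """Return (field_name, {subfield_key: value_or_None})."""
    name, sep, body = field.partition(":")
    name = name.strip()
    if not sep or name not in LAYOUTS:
        raise ValueError("not a Recipient-ID field")
    # Split on commas, then drop whitespace within each subfield.
    values = ["".join(part.split()) for part in body.split(",")]
    layout = LAYOUTS[name]
    if len(values) > len(layout):
        raise ValueError("too many subfields")
    values += [""] * (len(layout) - len(values))
    parsed = {}
    for (key, required), value in zip(layout, values):
        if not value:
            if required:
                raise ValueError("missing subfield: " + key)
            parsed[key] = None
        elif set(value) <= IA_CHARS:
            parsed[key] = value
        else:
            raise ValueError("character outside ia-char in " + key)
    return name, parsed
```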

5.2.1  Subfield Definitions

   The following subsections define the subfields of Originator-ID and
   Recipient-ID fields.

5.2.1.1  Entity Identifier Subfield

   An entity identifier (used only for "Originator-ID-Symmetric:" and
   "Recipient-ID-Symmetric:" fields) is constructed as an IKsubfld.
   More restrictively, an entity identifier subfield assumes the
   following form:

                      <user>@<domain-qualified-host>

   In order to support universal interoperability, it is necessary to
   assume a universal form for the naming information.  For the case of
   installations which transform local host names before transmission
   into the broader Internet, it is strongly recommended that the host
   name as presented to the Internet be employed.

5.2.1.2  Issuing Authority Subfield

   An IA identifier subfield is constructed as an IKsubfld.  This RFC
   does not define this subfield's contents for the symmetric key
   management case. Any prospective IAs which are to issue symmetric
   keys for use in conjunction with this RFC must coordinate assignment
   of IA identifiers in a manner (centralized or hierarchic) which
   assures uniqueness.

   For the asymmetric key management case, the IA identifier subfield
   will be formed from the ASN.1 BER representation of the distinguished
   name of the issuing organization or organizational unit.  The
   distinguished encoding rules specified in Clause 8.7 of
   Recommendation X.509 ("X.509 DER") are to be employed in generating
   this representation.  The encoded binary result will be represented
   for inclusion in a transmitted header using the procedure defined in
   Section 4.3.2.4 of this RFC.

5.2.1.3  Version/Expiration Subfield

   A version/expiration subfield is constructed as an IKsubfld.  For the
   symmetric key management case, the version/expiration subfield format
   is permitted to vary among different IAs, but must satisfy certain
   functional constraints.  An IA's version/expiration subfields must be
   sufficient to distinguish among the set of IK components issued by
   that IA for a given identified entity.  Use of a monotonically
   increasing number is sufficient to distinguish among the IK
   components provided for an entity by an IA; use of a timestamp
   additionally allows an expiration time or date to be prescribed for
   an IK component.

   For the asymmetric key management case, the version/expiration
   subfield's value is the hexadecimal serial number of the certificate
   being used in conjunction with the originator or recipient specified
   in the "Originator-ID-Asymmetric:" or "Recipient-ID-Asymmetric:"
   field in which the subfield occurs.

5.2.2  IK Cryptoperiod Issues

   An IK component's cryptoperiod is dictated in part by a tradeoff
   between key management overhead and revocation responsiveness.  It
   would be undesirable to delete an IK component permanently before
   receipt of a message encrypted using that IK component, as this would
   render the message permanently undecipherable.  Access to an expired
   IK component would be needed, for example, to process mail received
   by a user (or system) which had been inactive for an extended period
   of time.  In order to enable very old IK components to be deleted, a
   message's recipient desiring encrypted local long term storage should
   transform the DEK used for message text encryption via re-encryption
   under a locally maintained IK, rather than relying on IA maintenance
   of old IK components for indefinite periods.

6.  User Naming

   Unique naming of electronic mail users, as is needed in order to
   select corresponding keys correctly, is an important topic and one
   which has received (and continues to receive) significant study.  For
   the symmetric case, IK components are identified in PEM headers
   through use of mailbox specifiers in traditional Internet-wide form
   ("user@domain-qualified-host"). Successful operation in this mode
   relies on users (or their PEM implementations) being able to
   determine the universal-form names corresponding to PEM originators
   and recipients.  If a PEM implementation operates in an environment
   where addresses in a local form differing from the universal form are
   used, translations must be performed in order to map between the
   universal form and that local representation.

   The use of user identifiers unrelated to the hosts on which the
   users' mailboxes reside offers generality and value.  X.500
   distinguished names, as employed in the certificates of the
   recommended key management infrastructure defined in RFC 1422,
   provide a basis for such user identification. As directory services
   become more pervasive, they will offer originators a means to search
   for desired recipients which is based on a broader set of attributes
   than mailbox specifiers alone. Future work is anticipated in
   integration with directory services, particularly the mechanisms and
   naming schema of the Internet OSI directory pilot activity.

7.  Example User Interface and Implementation

   In order to place the mechanisms and approaches discussed in this RFC
   into context, this section presents an overview of a hypothetical
   prototype implementation.  This implementation is a standalone
   program which is invoked by a user, and lies above the existing
   UA sublayer.  In the UNIX system, and possibly in other environments
   as well, such a program can be invoked as a "filter" within an
   electronic mail UA or a text editor, simplifying the sequence of
   operations which must be performed by the user.  This form of
   integration offers the advantage that the program can be used in
   conjunction with a range of UA programs, rather than being
   compatible only with a particular UA.

   When a user wishes to apply privacy enhancements to an outgoing
   message, the user prepares the message's text and invokes the
   standalone program, which in turn generates output suitable for
   transmission via the UA.  When a user receives a PEM message, the UA
   delivers the message in encrypted form, suitable for decryption and
   associated processing by the standalone program.

   In this prototype implementation, a cache of IK components is
   maintained in a local file, with entries managed manually based on
   information provided by originators and recipients.  For the
   asymmetric key management case, certificates are acquired for a
   user's PEM correspondents; in advance of, and/or in addition to,
   retrieval of certificates from directories, they can be extracted
   from the "Originator-Certificate:" fields of received PEM messages.

   The IK/certificate cache is, effectively, a simple database indexed
   by mailbox names.  IK components are selected for transmitted
   messages based on the originator's identity and on recipient names,
   and corresponding Originator-ID, "Originator-Certificate:", and
   Recipient-ID fields are placed into the message's encapsulated
   header.  When a message is received, these fields are used as a basis
   for a lookup in the database, yielding the appropriate IK component
   entries.  DEKs and cryptographic parameters (e.g., IVs) are generated
   dynamically within the program.
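
   A cache of the kind described might be sketched as below for the
   symmetric case.  Everything here is hypothetical and simplified: the
   class and attribute names are invented for illustration, entries are
   held in memory rather than in a local file, mailbox names are
   matched exactly, and the key material itself is omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IKEntry:
    mailbox: str             # universal-form mailbox name (index key)
    issuing_authority: str   # IA subfield
    version: str             # Version/Expiration subfield


class IKCache:
    """A simple database of IK entries indexed by mailbox name."""

    def __init__(self):
        self._entries = {}

    def add(self, entry: IKEntry) -> None:
        self._entries[entry.mailbox] = entry

    def recipient_id_fields(self, recipients):
        """Build Recipient-ID-Symmetric fields for an outgoing message."""
        fields = []
        for mailbox in recipients:
            entry = self._entries[mailbox]   # KeyError if not cached
            fields.append("Recipient-ID-Symmetric: %s,%s,%s" % (
                entry.mailbox, entry.issuing_authority, entry.version))
        return fields

    def lookup(self, field_body: str):
        """Find the entry named by a received Recipient-ID field body."""
        mailbox = field_body.split(",")[0].strip()
        return self._entries.get(mailbox)
```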

   Options and destination addresses are selected by command line
   arguments to the standalone program.  The function of specifying
   destination addresses to the privacy enhancement program is logically
   distinct from the function of specifying the corresponding addresses
   to the UA for use by the MTS.  This separation results from the fact
   that, in many cases, the local form of an address as specified to a
   UA differs from the Internet global form as used in "Originator-ID-
   Symmetric:" and "Recipient-ID-Symmetric:" fields.

8.  Minimum Essential Requirements

   This section summarizes particular capabilities which an
   implementation must provide for full conformance with this RFC.

   RFC 1422 specifies asymmetric, certificate-based key management
   procedures to support the message processing procedures defined in
   this document; PEM implementation support for these key management
   procedures is strongly encouraged.  Implementations supporting these
   procedures must also be equipped to display the names of originator
   and recipient PEM users in the X.500 DN form as authenticated by the
   procedures of RFC 1422.

   The message processing procedures defined here can also be used with
   symmetric key management techniques, though no RFCs analogous to RFC
   1422 are currently available to provide correspondingly detailed
   description of suitable symmetric key management procedures.  A
   complete PEM implementation must support at least one of these
   asymmetric and/or symmetric key management modes.

   A full implementation of PEM is expected to be able to send and
   receive ENCRYPTED, MIC-ONLY, and MIC-CLEAR messages, and to receive
   CRL messages.  Some level of support for generating and processing
   nested and annotated PEM messages (for forwarding purposes) is to be
   provided, and an implementation should be able to reduce ENCRYPTED
   messages to MIC-ONLY or MIC-CLEAR for forwarding. Fully-conformant
   implementations must be able to emit Certificate and Issuer-
   Certificate fields, and to include a Key-Info field corresponding to
   the originator, but users or configurers of PEM implementations may
   be allowed the option of deactivating those features.

9.  Descriptive Grammar

   This section provides a grammar describing the construction of a PEM
   message.

   ; PEM BNF representation, using RFC 822 notation.

   ; imports field meta-syntax (field, field-name, field-body,
   ; field-body-contents) from RFC-822, sec. 3.2
   ; imports DIGIT, ALPHA, CRLF, text from RFC-822
   ; Note: algorithm and mode specifiers are officially defined
   ; in RFC 1423

   <pemmsg> ::= <preeb>
                <pemhdr>
                [CRLF <pemtext>]   ; absent for CRL message
                <posteb>

   <preeb> ::= "-----BEGIN PRIVACY-ENHANCED MESSAGE-----" CRLF
   <posteb> ::= "-----END PRIVACY-ENHANCED MESSAGE-----" CRLF / <preeb>

   <pemtext> ::= <encbinbody>      ; for ENCRYPTED or MIC-ONLY messages
               / *(<text> CRLF)    ; for MIC-CLEAR

   <pemhdr> ::= <normalhdr> / <crlhdr>

   <normalhdr> ::=  <proctype>

               <contentdomain>
               [<dekinfo>]         ; needed if ENCRYPTED
               (1*(<origflds> *<recipflds>)) ; symmetric case --
                            ; recipflds included for all proc types
               / ((1*<origflds>) *(<recipflds>)) ; asymmetric case --
                            ; recipflds included for ENCRYPTED proc type


   <crlhdr> ::= <proctype>
               1*(<crl> [<cert>] *(<issuercert>))

   <asymmorig> ::= <origid-asymm> / <cert>

   <origflds> ::= <asymmorig> [<keyinfo>] *(<issuercert>)
                  <micinfo>                        ; asymmetric
                  / <origid-symm> [<keyinfo>]      ; symmetric

   <recipflds> ::= <recipid> <keyinfo>

   ; definitions for PEM header fields

   <proctype> ::= "Proc-Type" ":" "4" "," <pemtypes> CRLF
   <contentdomain> ::= "Content-Domain" ":" <contentdescrip> CRLF
   <dekinfo> ::= "DEK-Info" ":" <dekalgid> [ "," <dekparameters> ] CRLF
   <symmid> ::= <IKsubfld> "," [<IKsubfld>] "," [<IKsubfld>]
   <asymmid> ::= <IKsubfld> "," <IKsubfld>
   <origid-asymm> ::= "Originator-ID-Asymmetric" ":" <asymmid> CRLF
   <origid-symm> ::= "Originator-ID-Symmetric" ":" <symmid> CRLF
   <recipid> ::= <recipid-asymm> / <recipid-symm>
   <recipid-asymm> ::= "Recipient-ID-Asymmetric" ":" <asymmid> CRLF
   <recipid-symm> ::= "Recipient-ID-Symmetric" ":" <symmid> CRLF
   <cert> ::= "Originator-Certificate" ":" <encbin> CRLF
   <issuercert> ::= "Issuer-Certificate" ":" <encbin> CRLF
   <micinfo> ::= "MIC-Info" ":" <micalgid> "," <ikalgid> ","
                  <asymsignmic> CRLF
   <keyinfo> ::= "Key-Info" ":" <ikalgid> "," <micalgid> ","
                 <symencdek> "," <symencmic> CRLF     ; symmetric case
                 / "Key-Info" ":" <ikalgid> "," <asymencdek>
                 CRLF                                ; asymmetric case
   <crl> ::= "CRL" ":" <encbin> CRLF

   <pemtypes> ::= "ENCRYPTED" / "MIC-ONLY" / "MIC-CLEAR" / "CRL"

   <encbinchar> ::= ALPHA / DIGIT / "+" / "/" / "="
   <encbingrp> ::= 4*4<encbinchar>
   <encbin> ::= 1*<encbingrp>
   <encbinbody> ::= *(16*16<encbingrp> CRLF) [1*16<encbingrp> CRLF]
   <IKsubfld> ::= 1*<ia-char>
   ; Note: "," removed from <ia-char> set so that Orig-ID and Recip-ID
   ; fields can be delimited with commas (not colons) like all other
   ; fields
   <ia-char> ::=  DIGIT / ALPHA / "'" / "+" / "(" / ")" /
                  "." / "/" / "=" / "?" / "-" / "@" /
                  "%" / "!" / '"' / "_" / "<" / ">"
   <hexchar> ::= DIGIT / "A" / "B" / "C" / "D" / "E" / "F"
                                                      ; no lower case


   ; This specification defines one value ("RFC822") for
   ; <contentdescrip>: other values may be defined in future in
   ; separate or successor documents
   ;
   <contentdescrip> ::= "RFC822"

   ; The following items are defined in RFC 1423
   ;  <dekalgid>
   ;  <dekparameters>
   ;  <micalgid>
   ;  <ikalgid>
   ;  <asymsignmic>
   ;  <symencdek>
   ;  <symencmic>
   ;  <asymencdek>


NOTES:

     [1]  Key generation for MIC computation and message text encryption
          may either be performed by the sending host or by a
          centralized server.  This RFC does not constrain this design
          alternative.  Section 5.1 identifies possible advantages of a
          centralized server approach if symmetric key management is
          employed.

     [2]  Postel, J., "Simple Mail Transfer Protocol", STD 10,
          RFC 821, August 1982.

     [3]  This transformation should occur only at an SMTP endpoint, not
          at an intervening relay, but may take place at a gateway
          system linking the SMTP realm with other environments.

     [4]  Use of a canonicalization procedure similar to that of SMTP
          was selected because its functions are widely used and
          implemented within the Internet mail community, not for
          purposes of SMTP interoperability with this intermediate
          result.

     [5]  Crocker, D., "Standard for the Format of ARPA Internet Text
          Messages", STD 11, RFC 822, August 1982.

     [6]  Rose, M. T. and Stefferud, E. A., "Proposed Standard for
          Message Encapsulation", RFC 934, January 1985.

     [7]  CCITT Recommendation X.509 (1988), "The Directory -
          Authentication Framework".


     [8]  Throughout this RFC we have adopted the terms "private
          component" and "public component" to refer to the quantities
          which are, respectively, kept secret and made publicly
          available in asymmetric cryptosystems.  This convention is
          adopted to avoid possible confusion arising from use of the
          term "secret key" to refer to either the former quantity or to
          a key in a symmetric cryptosystem.

Patent Statement

   This version of Privacy Enhanced Mail (PEM) relies on the use of
   patented public key encryption technology for authentication and
   encryption.  The Internet Standards Process as defined in RFC 1310
   requires a written statement from the Patent holder that a license
   will be made available to applicants under reasonable terms and
   conditions prior to approving a specification as a Proposed, Draft or
   Internet Standard.

   The Massachusetts Institute of Technology and the Board of Trustees
   of the Leland Stanford Junior University have granted Public Key
   Partners (PKP) exclusive sub-licensing rights to the following
   patents issued in the United States, and all of their corresponding
   foreign patents:

      Cryptographic Apparatus and Method
      ("Diffie-Hellman")............................... No. 4,200,770

      Public Key Cryptographic Apparatus
      and Method ("Hellman-Merkle").................... No. 4,218,582

      Cryptographic Communications System and
      Method ("RSA")................................... No. 4,405,829

      Exponential Cryptographic Apparatus
      and Method ("Hellman-Pohlig").................... No. 4,424,414

   These patents are stated by PKP to cover all known methods of
   practicing the art of Public Key encryption, including the variations
   collectively known as El Gamal.

   Public Key Partners has provided written assurance to the Internet
   Society that parties will be able to obtain, under reasonable,
   nondiscriminatory terms, the right to use the technology covered by
   these patents.  This assurance is documented in RFC 1170 titled
   "Public Key Standards and Licenses".  A copy of the written assurance
   dated April 20, 1990, may be obtained from the Internet Assigned
   Number Authority (IANA).


   The Internet Society, Internet Architecture Board, Internet
   Engineering Steering Group and the Corporation for National Research
   Initiatives take no position on the validity or scope of the patents
   and patent applications, nor on the appropriateness of the terms of
   the assurance.  The Internet Society and other groups mentioned above
   have not made any determination as to any other intellectual property
   rights which may apply to the practice of this standard. Any further
   consideration of these matters is the user's own responsibility.

Security Considerations

   This entire document is about security.

Author's Address

   John Linn

   EMail: 104-8456@mcimail.com



\end{verbatim}


\chapter{A Shell Server for HTTP}The HTTP protocol is very simple.
The following is an example of a
server program written in sh:
\begin{verbatim}#! /bin/sh
read get docid
echo "<TITLE>$docid</TITLE>"
echo Here is the data

\end{verbatim}
The docid may have a trailing carriage
return to be stripped off on some
systems. You can modify that script
to produce the data you actually
want. The HTML syntax for marked-up
text is fairly simple, but if you
want just to send plain text, then
just send the $<$PLAINTEXT$>$ tag first, or convert the text with a sed script:
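\begin{verbatim}#! /bin/sh
# The request line looks like "GET docid"; keep the second word.
read get docid
# Quote the name so the shell does not split or expand it.
sed -f txt2html.sed "$docid"

\end{verbatim}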
or in csh
\begin{verbatim}#! /bin/csh
# Read one request line and split it into words.
set request = ( `echo $<` )
if ($#request < 2) exit
sed -f txt2html.sed $request[2]

\end{verbatim}
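The trailing carriage return mentioned above can be stripped before the
docid is used. The following is only a sketch, assuming a POSIX sh; the
function name parse\_docid is made up for this illustration and is not part
of any W3 distribution.
\begin{verbatim}
#! /bin/sh
# Sketch only: parse_docid is a hypothetical helper.
# Given a request line such as "GET /some/doc<CR>", print the docid
# (the second word) with any carriage return removed.
parse_docid() {
    printf '%s\n' "$1" | tr -d '\r' | {
        read -r method docid rest
        printf '%s\n' "$docid"
    }
}
\end{verbatim}
In a server script, the request line would first be read into a variable
and then passed to this function.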
When you have written your script,
set the execute bit and then configure
the inet daemon to run it. A few
more examples:
\begin{itemize}
\item A sh script to generate a menu for
files in a directory
\item An awk script to generate a menu from
a list of files.
\item A perl script for all kinds of stuff
on the ASIS server
\item The shell script of the Hytelnet
gateway
\end{itemize}If you know the perl language, then
that is a powerful (if otherwise
incomprehensible) language with which
to hack together a server. \par 
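On many unix systems, configuring the inet daemon to run a script, as
mentioned above, means adding one line to /etc/services and one to
/etc/inetd.conf. The lines below are only a sketch: the service name
www-sh, the user nobody and the script path are assumptions made for this
example, and the exact fields vary between systems, so check the inetd
manual page on yours.
\begin{verbatim}
# /etc/services: give the port a name (80 is the usual HTTP port)
www-sh          80/tcp

# /etc/inetd.conf: start the script for each incoming connection
# (service name, user and path are hypothetical)
www-sh  stream  tcp  nowait  nobody  /usr/local/etc/www-sh  www-sh
\end{verbatim}
The inet daemon normally has to be told to reread its configuration (for
example by sending it a hangup signal) before the change takes effect.\par 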
See also a case study of mapping
a database onto the web.\par 
All contributions to these examples
welcome!
Tim BL


\section{Making a server}Here is a run-through of what is needed to make a www server, with
examples from a suggested server for the HEPDATA base of Mike Whalley.
See also etiquette.\par 
Basically, to make the data available, you make a server which is
a modified version of your program. When a user follows a link to
HEPDATA (or runs a command to jump straight there), the client program
opens a connection to a server program on a VM machine (say, but could
be VMS or unix). The server in turn runs your program.\par 
Let me just describe the essence of the changes needed so that you
can get an idea of how much effort would be involved.\par 
The first thing you do is to make up an arbitrary naming method for
anything which HEPDATA can display.  In this I include the welcome
page, any menu, any article, any help text.  Typically one invents
a hierarchical naming scheme, like
\begin{verbatim}
    /HEPDATA                 The first "welcome" menu
    /HEPDATA/HELP            The top-level help
    /HEPDATA/HELP/REAC       The help on the reaction database
    /HEPDATA/REAC            The reaction database itself
    /HEPDATA/REAC?P+PBAR     List of reactions involving p and pbar (?)
    /HEPDATA/DATA/RD125V687  Some article (say)
\end{verbatim}
You do this because, whereas an interactive user follows a path through
the program, the W3 user calls the program once for each thing. There
is no "state" information. This allows one to make a hypertext link
to any part of the scheme and jump back in again later. For example,
one might want to quote an article, or the reaction database, or a
particular list of reactions.\par 
Now all you do is modify the program so that, given a name above,
it will return the required document.  This means basically turning it from
a sequence the user goes through into a set of conditionals to isolate
each of the individual cases above. Apart from the output formatting,
the data retrieval code is unchanged.  Many of the options
in fact mean mapping the name onto a fixed file's name; it is the
searches which have to activate real code.\par 
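The set of conditionals described above might be sketched in sh as
follows. This is an illustration only, not the HEPDATA server: the function
name route and the file names it prints are invented, and a real server
would call the data retrieval code instead of echoing.
\begin{verbatim}
#! /bin/sh
# Hypothetical dispatcher: print what would be done for a given name.
route() {
    case "$1" in
        /HEPDATA)            echo "file welcome.html" ;;
        /HEPDATA/HELP)       echo "file help.html" ;;
        /HEPDATA/HELP/REAC)  echo "file help-reac.html" ;;
        /HEPDATA/REAC)       echo "file reac.html" ;;
        # Anything after "?" is a search and needs real code.
        /HEPDATA/REAC[?]*)   echo "search ${1#*[?]}" ;;
        /HEPDATA/DATA/*)     echo "article ${1#/HEPDATA/DATA/}" ;;
        *)                   echo "unknown" ;;
    esac
}
\end{verbatim}
Each call stands alone, so no state has to be kept between requests.\par 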
The hypertext trick you need is used in the menus. Where an option
is normally output to the screen, you have to tell the client what
to ask for if the user selects that option. For example, in the main
menu /HEPDATA you have an option which gives the help. You would represent
this "anchor" as
\begin{verbatim}<A NAME=4 HREF=/HEPDATA/HELP> Help </A>

\end{verbatim}

"Help" is all that is displayed, with some indication
that it is an option. If the user chooses it (clicks a mouse on it, or chooses
it by number, depending on which client he has) then the client asks the
server for /HEPDATA/HELP. ("A" is for "anchor", "HREF" is for "hypertext
reference".)\par 
For the index searches, it is just as simple. When the server sends the
text called /HEPDATA/REAC it also sends a special tag. This tells
the client to enable a FIND command, or find panel etc (depending
on the client). You don't have to do any human interface work. The
client automatically comes back with a search coded up in the form
/HEPDATA/REAC?P+PBAR etc. Your server in turn returns a menu (say)
with pointers to the data which has been found.\par 
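On the server side, the search words have to be recovered from the part of
the name after the question mark. A minimal sketch in sh, assuming that
words are separated by plus signs as in the example above; the function
name search\_words is hypothetical.
\begin{verbatim}
#! /bin/sh
# Hypothetical helper: print the search words of a name, separated
# by spaces, or an empty line if the name is not a search.
search_words() {
    case "$1" in
        *[?]*) printf '%s\n' "${1#*[?]}" | tr '+' ' ' ;;
        *)     echo "" ;;
    esac
}
\end{verbatim}
The words printed would then be handed to the existing search code.\par 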
You can also put some formatting tags (like headings) which will make
the data look really nice on a window system.\par 

Tim BL


\chapter{W3 and HTML Tools}These tools, part of the available
WWW software, are for the management of
W3 servers, generation of hypertext,
etc.
\section{Generating HTML}
\begin{DL}{allow this much space}
\item[List of filters
] and converters between
various formats and HTML, collected
by Richard Brandwein.
\item[Mail Archive to HTML
] Make that mail
archive available on the web. Markus.Stumpf@Informatik.TU-Muenchen.DE
\item[Framemaker interface
] There are some
tar files on the anonymous FTP archive
on file://info.cern.ch/www/src which
allow FRAMEmaker to be used as a
W3 tool. Dan Connolly, Convex. Includes
MIF HTML translation.
\item[Generating HTML
] These are scripts
for generating SGML hypertext from
things like directory listings, etc.
Also, for checking and correcting
dubious HTML.
\item[WP5.1 to HTML
] WordPerfect 5.1 to
HTML conversion
\item[LaTeX to HTML
] Code from Nikos Drakos,
Computer Based Learning Unit, University
of Leeds.
\end{DL}

\section{Editing HTML}
\begin{DL}{allow this much space}
\item[BBEdit Extensions
] Allow easier editing
of HTML files with BBEdit on the
Mac.
\item[NeXTStep editor
] WYSIWYG hypertext.
\item[html-mode for Emacs
] Not wysiwyg but
useful.
\end{DL}

\section{Generating things from HTML}
\begin{DL}{allow this much space}
\item[Plain text
]Use the line mode browser,
www with options like -n and -na
or -listrefs. 
\item[LaTeX
]There are some scripts around
to generate LaTeX or variations from
HTML.  Other sed scripts can be used
to combine documents at various levels
into one big book.
\end{DL}

\section{Analysing Log Files}
\begin{DL}{allow this much space}
\item[Server log analysis
] Analysing server
logs requires first of all changing
the numeric internet node numbers
into domain names. httpd-analyse.c
is a program to do that. Feed the
results through awk and grep of your
choice!  Some documentation on the
program.
\item[Server log analysis
] Getsites.c is
a program which generates reports
on a weekly or monthly basis.
\end{DL}

\section{Web Wanderers}
\begin{DL}{allow this much space}
\item[Web-roaming  robot etc
] Guido van
Rossum's knobot code in "Python"
language.
\item[Web Checker
] James Pitkow's web checking
robot
\end{DL}

\section{Public WWW Access Services}
\begin{DL}{allow this much space}
\item[Telnet server
] Setting up a service
machine for anonymous users to log
in to a www client.
\item[Mail Robot
] A program to return any
information in the web
by electronic mail.
\end{DL}

Tim BL


\section{HTMLGeneration}Here are some example files you can
use for generating HTML from lists
of files and other things.
\begin{DL}{allow this much space}
\item[RTF to HTML
] Convert RTF (using specific
styles) into HTML.
\item[fix-html.pl  
]written by Dan Connolly,
is a perl script to legitimize old
HTML files into SGML-abiding HTML
(as per the DTD that Dan created).
\item[texi2html
]Lionel Cons's converter from
Gnu TeXInfo format.
\item[text2html.sed
] A sed script to turn
plain text into plain-looking valid
HTML markup so that it will be rendered
just as it was.
\item[ls2html.awk 
]is an awk script which
will just take a list of names and
generate a menu.
\item[dir2html 
]is a shell script which
generates a menu of pointers to files
with particular suffixes in a set
of directories. It also includes
a README file at the head of the
hypertext list if one exists.
\item[htn2html.c
] See the Hytelnet gateway
for the program to convert hytelnet
data into HTML.
\item[findrefs.pl
] Written by Ari Lemmke,
finds references http:... in plain
text files and generates anchors
out of them.
\item[LaTeX to HTML
] LaTeX to HTML converter
program by Nikos Drakos. Not only
does it successfully show the more
complex LaTeX formatting, for example
for mathematics, but it also has
a set of iconic images, which are
included for navigation, and to mark
footnotes and references.
\end{DL}
You can make any variations on these
you like of course. \lbrack CERN does not
accept any responsibility for things
quoted in these lists\rbrack .


\section{Updating the Newsgroup lists}To update some of the news pages
automatically you must be logged
on to the news server or have the
news directories mounted.\par 
 Carl mentioned that you must be
a member of the UNIX group news (otherwise
you won't have permission to read
the news directories) but that doesn't
seem to be necessary for these functions.
\subsection{UpdateGroups}This script updates the list of newsgroups.
For the overview list , it saves
everything before the "Others" heading,
and adds on a list of pointers to
newsgroup stems not already mentioned
in the saved hypertext.\par 
For each stem, it saves any comment
before the glossary list of groups,
and then regenerates that list of
groups.
\subsection{NewsPage\_Update (old)}The script NewsPage\_Update creates
complete lists of active groups for
the following groups: alt, bionet,
bit, biz, cern, ch, comp, eunet,
gnu, news, rec, sci, soc, talk, vmsnet.
It does this by writing the header
in explicitly for each group, and
then generating a list of subgroups
using FindGroups.\par 
For comp and news, a full list is
placed in fullcomp.html and fullnews.html.
The files comp.html and news.html
are formatted by hand already, and
so are not touched by the script.\par 
NewsPage\_Update works by writing
some HTML text into a file for each
group to be updated, called \lbrack newsgroup\_name\rbrack .html.new,
then calling the script FindNewsGroups.
 This checks the file /usr/local/lib/news/newsgroups
for the groups within the current
group which are active.  Finally
the new file is renamed to remove
the .new.\par 
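The write-then-rename step described above can be sketched as follows.
This is not the actual NewsPage\_Update or FindNewsGroups script: the
function name update\_page, its arguments and the HTML it writes are
assumptions for illustration. It does assume the usual layout of the
newsgroups file, one group per line, with the name followed by white space
and a description.
\begin{verbatim}
#! /bin/sh
# Hypothetical sketch. Usage: update_page STEM NEWSGROUPS_FILE OUTPUT_DIR
update_page() {
    stem=$1 groups=$2 out=$3/$1.html
    {
        echo "<TITLE>Newsgroups under $stem</TITLE>"
        echo "<DL>"
        # Keep only the groups whose name starts with "STEM.".
        grep "^$stem\." "$groups" | while read -r name descr; do
            echo "<DT><A HREF=news:$name>$name</A>"
            echo "<DD>$descr"
        done
        echo "</DL>"
    } > "$out.new" &&
    # Rename only once the new file is complete.
    mv "$out.new" "$out"
}
\end{verbatim}
Because the page is replaced in one step, readers do not see a
half-written list while the script is running.\par 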
The list of stems to search, and
their titles and any other comment
is hardcoded into the NewsPage\_Update
script, and the list is DUPLICATED
in Others\_Update.
\subsection{Others\_Update}The Others\_Update script finds stems
which are not included in the Overview.html
file, but which are active.  This
list of which groups not to include
is hardcoded into the script.  For
each group, it calls GrpCreate. 
This adds the name to OtherGroups/Overview.
   It then runs FindNewsGroups for
each group.
\subsubsection{NOTE}Once the script has completed, all
the .new groups must be renamed manually
to remove the .new extension.
\subsection{GrpCreate}This reads a newsgroup stem name
from stdin.\par 
It then creates the top of a file
for the list of groups with that
stem. This will be called \$\{nn\}.html.new,
where \$\{nn\} is the stem name. Unfortunately
there is no way to get a description
of the stem to include in this file.
However, if the .html file already
exists, it will use everything up
to and excluding the first DL tag
from the .html file for the .html.new
file. Therefore, everything above
the DL tag may be hand edited.\par 
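Keeping everything above the first DL tag can be done with a short sed
command. The following is a sketch rather than the code GrpCreate uses: the
function name keep\_head is hypothetical, and it assumes the DL tag is not
preceded by other text on its line, since the whole of that line is dropped.
\begin{verbatim}
#! /bin/sh
# Hypothetical helper: print the lines of an HTML file that come
# before the first line containing a DL tag.
keep_head() {
    sed -n -e '/<[Dd][Ll]/q' -e p "$1"
}
\end{verbatim}
The output would be written to the top of the .html.new file before the
list of groups is regenerated.\par 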
GrpCreate adds a pointer from OtherGroups/Overview.html.new
to the .html file.\par 
The .html file is renamed .html.old,
and the .html.new becomes .html,
with diffs being stored in a .diffs
file under the date.
\begin{verbatim}
.\" Macros for HTML
.\" Jim Davis 6 Nov 92
.ps 12
.in 5
.de B
..
.de R
..
.de H1
.ti -5
.ps 18
\fB\\$1\fR
.ps 12
.br
..
.de H2
.ti -3
.ps 14
\fB\\$1\fR
.ps 12
.br
..
.de H3
\\$1
.br
..
.de H4
\\$1
..
.de H5
\\$1
..
.de H6
\\$1
..
.de H7
\\$1
..
.de H8
\\$1
..
.de H9
\\$1
..
.de DL
.in +5
..
.de DE
.in -5
..
.de DT
.ti -3
* \\$1
..
.de DD
.br
..
\end{verbatim}

\begin{verbatim}
Date: Wed, 4 Nov 1992 16:48:34 -0500
From: Jim Davis <davis@dri.cornell.edu>
To: wei@xcf.berkeley.edu, www-talk@nxoc01.cern.ch
Subject: improved printing of WWW files
\end{verbatim}
If you can't quite manage to live without hardcopy, you may
wish sometimes to print WWW files.  I have written a couple
of scripts to do this.  They are particularly useful with
Pei Wei's excellent Viola WWW browser.
\par 
A tar archive is available for anonymous FTP:
\par 
dri.cornell.edu/pub/davis/print-www.tar
\par 
It contains:
\begin{verbatim}
README
print-www
print-www.l
html-to-latex
html2latex.sed (modified version of original CERN version)
\end{verbatim}
The hardest part was writing the perl script to obtain documents
via http protocol - turns out you can't just run pipes through telnet.
\par 
The conversion from HTML to LaTeX is not really robust yet - 
this  is doubly hard since there is no guarantee that the HTML
is legal.  But at least it works for my test cases.  No doubt
it will be improved in time.
\par 
best wishes


\chapter{Gateway Software}See also: W3 server software, W3
client software.\par 
These are servers which provide data
extracted from other systems. They
are built using code from the basic
daemon, or scripts.
\begin{DL}{allow this much space}
\item[ACEDB gateway (see also the French
version)
] ACEDB is the database program
written for the nematode genome project.
\item[FIND gateway 
]for CERN/VM XFIND which
calls a REXX exec to get the information
from the XFIND system running on
the CERNVM mainframe.
\item[Hytelnet gateway
] A gateway to Peter
Scott's list of telnet sites
\item[News Indexer
]Indexes a news spool file
using gateway to "ni".  Mitchel Charity,
MIT.
\item[VMS Help gateway
] This allows any
VMS help files to be made available
to WWW clients. Runs on VAX/VMS.
\item[WAISGate
] A gateway to information
available using the W.A.I.S. protocol.
\item[DCLServer
] A server for VMS systems
which allows you to write a gateway
to your own favorite information
system using DCL.
\item[System33
] A (big) csh script server
providing data including Xerox System33
documents, man pages in plain text,
phone numbers, etc. etc...!
\item[Oracle
] A generic server to Oracle.
Could be used as a basis for gateways
to specific Oracle databases.
\item[Geography
] Gateway to the Geography
server at U Michigan
\item[TechInfo
] TechInfo is the CWIS from
MIT.  A gateway exists thanks to
Linda Murphy/Upenn.
\end{DL}

Tim BL


\section{Geography gateway}
Wed, 18 Nov 1992 \par Jim Davis $<$davis@dri.cornell.edu$>$\par 
Here is a quickly hacked up gateway from WWW to the University of
Michigan Geography server.  It expects one argument, a WWW doc id.
It ignores the "pathname", extracts the search words, then passes
those to the server.  It does NOT parse the data returned by the server
(that is an improvement yet to be done) but you can understand the
output.\par 
To use this, you would need to have an HTTP server running someplace
where you can attach this gateway.  I can provide the very simple
HTTP server I use here, but this subject is already documented in
the WWW online documentation.\par 
Source code in perl


\section{The WWW TechInfo gateway}This is a gateway built using the
basic server code, plus one source
file in C. Thanks to Linda Murphy
of the University of Pennsylvania for
the techinfo code.
\begin{itemize}
\item The gateway data as running at CERN
\item The source file
\end{itemize}
Tim BL


\section{The W.A.I.S. - WWW gateway}This is an example of a WWW server
and a WAIS client. It is just the
regular httpd daemon linked with:
\begin{itemize}
\item a version of the libwww library which
was compiled with the DIRECT\_WAIS
option, and includes the HTWAIS module;
\item the freeWAIS libraries from CNIDR.
\end{itemize}See a summary of some data available
through the gateway.
\subsection{WSRC files}The gateway keeps a cache of WAIS
"source" files. These are files describing
WAIS servers. They are normally picked
up automatically by searching a "directory
of servers" index. Once the gateway
has picked up a description of a
server, it uses the description
to describe the server to those who
follow links to it. (See the HTWSRC
module of libwww)\par 
These source files are parsed, and
are kept in the directory /usr/local/lib/WAIS
under the server name, port, and
database name.
Tim BL\par 


{\em Warning: this no longer works with HTTP 1.0. This is a known bug.}
\section{VMS Help server}This server can provide WWW users
with any information stored in VMS
Help format.
\subsubsection{Additional information available}
\begin{DL}{allow this much space}
\item[Try me!
]An example server running
at CERN
\item[Status
]The current state, pointers
to more information
\end{DL}

JFG


\subsection{Gateway to VMS Help: Internals}These are technical and installation notes about the gateway to VMS
Help. Please send bug reports and suggestions to Jean-Francois Groff
(jfg@cernvax.cern.ch).
\subsubsection{Sources}The program consists of the generic daemon HTDaemon.c, and a special
function, stored in VMSHelpGate.c, to retrieve VMS Help data and
convert it to HTML.
\subsubsection{Installation}The files you need are as follows. You should customise them, putting
in your own directory names:
\begin{DL}{allow this much space}
\item[launchgate.com
]Runs the server as a detached process. Put a call to
this from your sys\$startup procedure, wherever that is. This detaches
a job to use www\_server.com as input, and a log file as output.
\item[www\_server.com
]The server command file, a wrapper for the actual server
executable.  In this file, set the temporary directory for the storage
of a cache of .HLP files. This file runs the executable.
\item[test.com
]Here is just an example of  a file to build and test the server.
\item[descrip.mms
]This is an MMS file to build the executable. If you don't
have MMS, you may be able to figure out from looking at it which commands
you should use.  You can find a machine running MMS and generate the
equivalent .com files. See comments at the top of this file on how
to run it.
\end{DL}
The source files and executable .EXE are currently (October 92) available
on HEP  decnet in vxcrna::disk\$d1:\lbrack jfg.www...\rbrack .  Note also you can
pick up the master sources from dxcern:: automatically by running
\par 
MMS /MACRO=(U=DXCERN::).\par 
If you are not in HEP decnet, you should find the sources in the WWWDaemon\_v.vv.tar.Z
file in the distribution. See the README file.\par 

JFG


\subsection{VMS Help server Bugs}
This is a list of known bugs and desired improvements. Don't let it shrink too
fast: send your bug reports and suggestions to
Jean-Francois Groff (jfg@cernvax.cern.ch).
\begin{itemize}
\item The keyword search works fine on any number of levels down, but then the
generic daemon doesn't know how deep the server went, so anchor names lack the
intermediate levels. Solution: generate anchor names relative to the input path
(before '?').
\item DANGER: Attempts to access VMS topics with a weird name like ":=" will
crash the server because VMS will try to create a .HLP file with an invalid file
specification due to these special characters. Solution: Make a good escaping
system (that works with VMS and Un*x styles as well). Crude and bulletproof
solution: Ignore any offending topic name!
\item Reference to another help library through @ will only search SYS\$HELP
for the corresponding .HLB file.
\item We need an overview page that lists all help libraries available.
\end{itemize}
JFG


\subsection{VMS Help server Features}This lists the main features of the VMS Help gateway, with improvements
in reverse chronological order. Help make it grow fast: send your
bug reports and suggestions to Jean-Francois Groff (jfg@cernvax.cern.ch).
\subsubsection{Experimental gateway 0.4 -- 2 Oct 91}
\begin{itemize}
\item Accepts user queries by number or by name. In the latter case, can
go down several levels, for instance, from the main help page: "cc
/lib" will go to topic CC, subtopic /LIBRARY.
\item On invocation with only //node:port/HELP, displays the contents of
the standard VMS Help library SYS\$HELP:HELPLIB.HLB (function lis\_to\_html).
\item Address format: //node:port/HELP/\lbrack @library/\rbrack \lbrack topic\lbrack /subtopic\rbrack *\rbrack 
\end{itemize}
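The address format above can be taken apart with a few parameter
expansions. The sketch below is in sh for illustration only (the gateway
itself is written in C), the function name parse\_help\_path is made up,
and it ignores the complication that VMS qualifiers such as /LIBRARY
themselves begin with a slash.
\begin{verbatim}
#! /bin/sh
# Hypothetical helper: split /HELP/[@library/][topic[/subtopic]*]
# into a library name and a space-separated list of topics.
# An empty library stands for the standard help library.
parse_help_path() {
    rest=${1#/HELP}      # drop the fixed /HELP prefix
    rest=${rest#/}
    library=
    case "$rest" in
        @*/*) library=${rest%%/*}; library=${library#@}; rest=${rest#*/} ;;
        @*)   library=${rest#@}; rest= ;;
    esac
    topics=$(printf '%s' "$rest" | tr '/' ' ')
    printf 'library=%s topics=%s\n' "$library" "$topics"
}
\end{verbatim}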
JFG


\chapter{Style Guide}This guide is designed to help you
create a hypertext database which effectively
communicates your knowledge to the
reader.  It has been prepared in
the light of comments by readers,
and many demands by providers of
online documentation.   Some of the
points made may be influenced by
personal preference, and some may
be common sense, but a collection
of points has been demanded, and
so here it is.\par 
The guide is designed to be read
sequentially, but feel free to depart
from this.  The sections are as follows:
\begin{itemize}
\item Introduction
\item Overall structure of your work
\item Within each document
\item Test your document
\item Background reading
\item Reader comments
\end{itemize}
\section{This document is open to comment}Suggestions are strongly invited:
if you think of anything, mail it
to timbl@info.cern.ch, mentioning
the Style Guide for Online Hypertext
or its URL.
Tim BL


\section{Introduction}You are going to write (or generate)
some online hypertext. Because
hypertext is potentially unconstrained,
you are a little daunted. Do not
be. You can write a document as simply
as you like.  In many ways, the simpler
the better.\par 
You will be writing a number of separate
files.  These files will be linked
to each other, and to external documents,
to make your final work.\par 
You may think of your work as a "document",
and if it were on paper, then you
would call it that.  In the online
case though, we tend to refer to
each individual file as a document.
A  document may correspond, in the
book analogy, to a section or a subsection,
or even a footnote. In this guide,
we'll refer to the whole collection
as a work.\par 
The document is the unit by which
information is picked up.  At any
one time, a document is completely
loaded into the reader's computer.
It is also normally the amount you
edit at any one time, though with
a good editor you will probably have
a number of documents open at a time.\par 
The section on structure discusses
how you organize your material into
documents.   Another section discusses
how to organise your material within
a document.\par 
(Up to overview, on to structure)
Tim BL


\section{Structure}If you have in mind a body of information
to put across to your reader, you
probably have a mental organisation
for it.  Normally this is a sort
of hierarchical tree, like the chapters
of a book if you were to write a
book.\par 
Keep this structure.  It helps readers
to have a tree structure as a basis
for the book: it gives them a feeling
of knowing where they are.   You
can also use this structure for organising
your files in directories.\par 
You should also bear in mind:
\begin{itemize}
\item The reader's preconceived structure
\item The idea of overlapping trees
\item How big to make each document
\end{itemize}(Up to overview, back to Introduction,
on to: writing each document)
Tim BL


\subsection{The reader's structure}Remember always the audience for
whom you are writing.  If they are
novices in the subject,  it will
normally help if you are firm about
the structure of your work, so that
they can learn the structure of the
knowledge itself.   For example,
if you feel that the subject falls
into three distinct areas, then that
is an important thing to teach.\par 
If, however, your readers will already
have some knowledge in the subject,
then they will already have formed
their own structure for it.  In this
case they will consciously or subconsciously
know where they expect to find things.
If your structure is different from
theirs,  enforcing it too strongly
will confuse them and put them off.\par 
You may in this case have to resist
a strong tendency to put across your
own structure strongly and to the
detriment of all others.  There are
two solutions.\par 
If you have a single well-defined
audience in mind, who will share
a similar world view, then try to
write exactly for that world view
rather than yours.  \par 
If you are simultaneously writing
for more than one group, then you
must provide for both. \par 
When you make a reference,  qualify
it with a clue to allow some people
to skip it. For example, "If you
really want to know how it works
inside, see the Internals guide",
or "A step-by-step introduction is
in the tutorial".\par 
Provide links for both readers' views.
Your work will be more connected
than a simple tree, but with proper
qualification, no one should get lost.
\par 
Provide two separate tree "roots".
For example, you can write a step-by-step
tutorial and a functionally directed
reference tree for the same data.
Both will at the lowest level have
the same data, but while the first
will deal with the simple things
first, the second may be functionally
grouped.   This is just like having
several indexes to a book.  The tutorial
might also include information which
the reference work does not.\par 
(Up to overview, back to Introduction,
on to: writing each document)
Tim BL


\subsection{Overlapping Trees}Here is an example of a work (describing
some programming functions, say)
with two separate structures:
\begin{verbatim}			Tutorial			Reference
			   |				    |
		  Let's do it together		       -----------------
		from simple to difficult	      |			|
			    |			by Functional      Alphabetical
			    |			    group	     by name
		  Task oriented examples	      |			|
			    |			       -----------------
			    |				    |
		  Examples of use of		   Syntax definition for
		  specific functions   <-------->    specific functions

\end{verbatim}
The novice user starts at the top
left, and works his way down. Where
he needs specific details, he will
get down to the examples and from
them a link to the underlying definitive
descriptions of each. As far as he
is concerned, he is reading a tree-structured
work.   In fact, he is reading the
same information as the expert who,
coming in to check on one particular
function, then looks up an example
of its use.\par 
(Up to structure, back
to user's structure, on to: document
size)
Tim BL


\subsection{How big to make each document}The most important point here is
that a document should put across
a well-defined concept.  It is not
generally worth splitting one idea
arbitrarily into two bits in order
to make the bits smaller.  Nor is
it a good idea to put together ideas
which are really separate just to
make a bigger document.\par 
A document can be as small as a footnote.\par 
There are two upper limits on a document's
size.  One is that long documents
will take longer to transfer, and
so a reader will not be able to simply
jump to it and back as fast as he
or she can think.  This depends a
lot on the link speed of course.\par 
The other limit is the difficulty
for a reader to scroll through large
documents. Readers with character
based terminals don't generally read
more than a few screens.  They often
only absorb what is on the first
screen, as if that is not interesting
they won't be bothered to scroll
down.  Readers are also put off by
being left at the top of a large
document.\par 
Readers with graphic interfaces generally
scroll through long documents with
a scroll bar.   When the scroll bar
is moved a small amount, the document
should move a sufficiently small
amount so that some of the original
window-full is still left in the
window.  This allows the reader to
scan the document. If the document
is any bigger, then it is basically
unreadable, in that any movement
of the scroll bar will lose the
place and leave the reader disoriented.\par 
An advantage of longer documents
is that they are easier for readers
with scroll bars to read through in
an uninterrupted flow, if that is
how the document is written.\par 
Also,  one doesn't have to go to
the trouble of making (or generating)
so many links and keeping them up
to date if things are altered.  If
making the links is a problem, just
settle for one link to a contents
page.  Some browsers have "next"
and "previous" buttons to allow a
document to be browsed serially according
to a list.\par 
(In fact, one can normally scroll
up and down explicitly page by page,
but this gives the same feeling
as the terminal interface.)\par 
A rough guide, then, for the size
of a document is:
\begin{itemize}
\item For online help, menus giving access
to other things: small enough to
fit on 24 lines.  Check this by using
a terminal browser.
\item For textual documents, of the order
of half a letter-sized (A4) page
to 5 pages.
\end{itemize}(Up to structure, back to overlapping
trees, on to: within each document)
Tim BL


\section{Within each document}This section of the style guide deals
with the layout of text within a
"document", the unit of retrieval
of information on the web.\par 
To be completed.\par 
You should try to:
\begin{itemize}
\item Sign your work
\item Give its status
\item Make links into context
\item Use context-free document titles
\item Format device-independently
\item Write for the printed work too
\item Write readable text despite the links
\item Avoid talking about mechanics
\end{itemize}(Up to overview, back to structure,
on to testing)
Tim BL


\subsection{Sign It!}An important aspect of information
which helps keep it up to date is
that one can trace its author.  Doing
this with hypertext is easy $--$ all
you have to do is put a link to a
page about the author (or simply
to the author's phone book entry).\par 
Make a page for yourself with your
mail address and phone number. At
the bottom of files for which you
are responsible, put a small note
$--$ say just your initials $--$ and
link it to that page. The address
style (typically right justified)
is useful for this.\par 
Your author page is also a convenient
place to put any disclaimers, copyright
notices, etc. which law or convention
requires. It saves cluttering up the
messages themselves with a long signature.\par 
If you are using the NeXT hypertext
editor, then you can put this link
from your default blank page so that
it turns up on the bottom of each
new document.\par 
( up , back to ..., on to  giving
your document's status)


\subsection{The status of your document}Some information is definitive, some
is hastily put together and incomplete.
Both are useful to readers, so do
not be shy to put information up
which is incomplete or out of date
$--$ it may be the best there is. However,
do remember to state what the status
is. When was it last updated? Is
it complete? What is its scope? For
a phone book for example, what set
of people are in it?\par 
Not every document needs a status
declaration, if  there is something
in the overview page of the work
which covers it. \par 
You can of course also give a feel
for the status of the text by its
language ... bad spelling, missing
capitals, and relaxed grammar all
indicate informal notes.     Careful
use of verbs such as "shall" and
"should", and the introduction of
Long Capitalised Noun Phrases (LCNPs)
will give at least the impression
of an ISO standard.  ;-)
\subsubsection{Date it}In some cases it can be useful to
put creation dates and last modified
dates on your work.  (Note that this
is the sort of thing which one could
make a server do automatically with
a little programming). \par 
Figure out whether putting one might
later save the reader from following
out of date information.\par 
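The automatic dating mentioned above might be sketched as follows. This is only an illustration, not part of any existing server: the function names, the date format, and the choice of an ADDRESS element to hold the note are all assumptions made for the example.

```python
import os
import time


def format_note(mtime_seconds):
    """Format a modification time (seconds since the epoch, UTC) as a note."""
    stamp = time.strftime("%Y-%m-%d", time.gmtime(mtime_seconds))
    return "Last modified: " + stamp


def stamp_text(text, mtime_seconds):
    """Return the document text with a dated ADDRESS line appended."""
    if not text.endswith("\n"):
        text += "\n"
    return text + "<ADDRESS>" + format_note(mtime_seconds) + "</ADDRESS>\n"


def stamp_file(path):
    """Read a file and return its text stamped with the file's own date."""
    with open(path, encoding="utf-8") as source:
        text = source.read()
    return stamp_text(text, os.path.getmtime(path))
```

A server could call something like the last function just before sending a file, so that the date never has to be maintained by hand.\par 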
(back to Sign It, On to links into
context )


\subsection{Linking to context}A major difference between writing
part of a serial text, and an online
document, is that your readers may
have jumped in from anywhere.   Even
though you have only made links to
it from one place, any other person
may want to refer to that particular
point, and will then make a link to
that particular part of your work
from their own. So  you can't rely
on your reader having followed your
path through your work.\par 
Of course if you are writing a tutorial,
it will be important to keep the
flow from one document to the next
in the order you intended for its
primary audience.   You may not wish
to cater specially for those who
jump in out of the blue, but it is
wise to leave them with enough clues
so as not to be hopelessly lost.
Some ways of doing this are:
\begin{itemize}
\item Watch that your text and vocabulary
stand by themselves. Starting a document
with "The next thing we consider
is..." or "The only solution to this
problem is..." will certainly confuse.
\item Sometimes the opening words refer
to the context, and can be linked
to background information.   For
example, in the WWW project documentation,
the first occurrence of the acronym
WWW is often linked back to the central
project document.
\item The navigation hints at the top or
bottom of the document can give explicit
pointers.  Examples are at the bottom
of this document.  
\end{itemize}It can also be useful to imagine
as you are writing that you yourself
may wish to reuse the document some
day.\par 
(Part of style guide for online hypertext.
Up to Writing each document, on
to Title tag)
Tim BL


\subsection{Device Independence}The hypertext you write is stored
in the HTML language, which does not
contain information about the fonts
and paragraph shapes and spacing
which should be used for displaying
the document.\par 
This gives great advantages in that
your document will be rendered successfully
on whatever platform it is viewed,
including a plain text terminal.\par 
You should be aware that different
clients do use different spacing
and fonts.   You should be careful
to use the structuring elements such
as headers and lists in the way in
which they were intended.  If you
don't like the rendering on your
particular client, don't try to fix
it by using inappropriate elements,
or trying for example to force extra
spacing with empty elements.  This
may well end up being interpreted
differently by other clients and
looking very strange.  You can in
many cases configure how the client displays
each element.\par 
For example:
\begin{itemize}
\item Always use heading levels in order,
with one heading level 1 at the top
of the document, and if necessary
several level 2 headings, and then
if necessary several level 3 headings
under each level 2 heading.  If you
don't like the way heading level
2 is formatted, fix it on your client,
don't just skip to heading level
3.
\item Don't put extra spaces or blank lines
into your text to pad it out, except
in preformatted (PRE) sections.
\item Don't refer in your text to facets
of particular browsers.   Asking
someone to "click here" won't make
sense without a mouse, just as asking
someone to "select a link by number"
will betray the fact that you were
using the line mode browser.  Just
leave a link.  The instructions get
boring as the user will normally
know how to select a link.
\end{itemize}See also: testing your document.\par 
Following these guidelines you may
find that the end result does not
appear on your screen exactly as
you would like, but your readers
will probably be happier.\par 
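The first guideline above, on using heading levels in order, can be checked mechanically. The following is a minimal sketch, assuming the document is available as a string of HTML; the class and function names are invented for this example, and the wording of the messages is arbitrary.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collect the levels of H1..H6 elements in document order."""

    def __init__(self):
        super().__init__()
        self.levels = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lower-cases tag names, so <H2> arrives as "h2".
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self.levels.append(int(tag[1]))


def heading_problems(html):
    """Return a list of messages describing misuse of heading levels."""
    collector = HeadingCollector()
    collector.feed(html)
    collector.close()
    levels = collector.levels
    problems = []
    if not levels or levels[0] != 1 or levels.count(1) != 1:
        problems.append("expected exactly one level 1 heading, at the top")
    # A heading may go deeper by at most one level at a time.
    for previous, current in zip(levels, levels[1:]):
        if current > previous + 1:
            problems.append(
                "heading level jumps from %d to %d" % (previous, current))
    return problems
```

An empty list means the headings are nested as recommended; a document which skips from level 1 straight to level 3 is reported.\par 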
(Part of the Style Guide for Online
Hypertext .  Up to within each document
, back to , on to printable hypertext)
Tim BL


\subsection{Printable hypertext}In an ideal world, paper might not
be necessary.  In a next to ideal
world, one would have enough time
to write a hypertext version of a
document and also to completely reauthor
a paper version.  In the real world,
you will probably want to generate
any printed documents and online
documents from the same file.\par 
Suppose the HTML files will be the
master, and you will generate the
printable from this, by translation
into TeX, etc.\par 
If you might one day want to do this,
try to avoid references in the text
to online aspects.  "See the section
on device independence" is better
than "For more on device independence,
click here".  In fact we are talking
about a form of device independence.\par 
Unfortunately the recommended practices
of signing each document and giving
navigational links  tend to mess
up the printable copy, though one
can of course develop ways of stripping
them out if they follow a common
format.\par 
(Up to: within each document; back
to device independence, on to readable
text)
Tim BL


\section{Test your document}In a way your hypertext is like a
book, which you should have proof-read.
In a way, it is like a program which
you should have tested.  At least
get someone from the group for which
you wrote the document to read it
and give you some feedback.  Other
ideas are:
\begin{itemize}
\item Read the document with several different
client programs, to ensure that you
have formatted it in a device independent
way.
\item Monitor the readership of your document.
You can do this by analysing the
server log files.    You may find
that some parts are not being read,
perhaps because people are looking
in the wrong place for them.  You
may see that people often follow
a path and backtrack. If you can
guess what they were looking for,
you can make the clues around the
link more helpful.  (Remember to
keep log information confidential
until you have removed user information
from it.)
\item Make it clear whether you will accept
criticism or suggestions from your
readers, and how they should send
it.
\item Ask people to solve problems using
the document, and report on their
success. If they fail, find out what
they were looking for, and whether it
was in the document at all.
\end{itemize}
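The log analysis suggested above might start from something like the sketch below. The log format is an assumption made purely for illustration (one request per line, with whitespace-separated fields for host, time, method and path); real server logs may well differ. The host field is dropped as soon as a line is read, in keeping with the advice on confidentiality.

```python
from collections import Counter


def anonymise(line):
    """Drop the leading host field, returning (method, path) or None."""
    fields = line.split()
    if len(fields) < 4:
        return None            # not a request line in the assumed format
    _host, _when, method, path = fields[:4]
    return method, path


def count_reads(log_lines):
    """Count how many times each document path was fetched."""
    hits = Counter()
    for line in log_lines:
        request = anonymise(line)
        if request is not None and request[0] == "GET":
            hits[request[1]] += 1
    return hits


def unread(all_paths, hits):
    """List the documents of the work which were never fetched."""
    return sorted(path for path in all_paths if hits[path] == 0)
```

Documents which never appear in the counts are candidates for better links or clearer clues; the most frequently read ones are where polishing time is best spent.\par 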
\subsection{How much testing?}Testing takes time.    The decision
of how  much testing you do is based
on the quality of the document you
wish to provide.  You are balancing
your readers' time and effort against
yours.   If your document is "selling"
an idea, or if you are selling the
document or providing a service,
you will want  to make it as easy
as possible for the reader.   If
many people will read your work,
a little of your time will save a
lot of theirs.\par 
If however you are documenting some
obscure part of a system in which
no one other than yourself is likely
to be interested,  or if you feel
that your readers are lucky to have
anything available at all, there
is no point wasting time testing
it.  In the event of someone needing
the information, they might have
to go to some extra trouble  to follow
several links to find what they want,
and then to understand what you have
written.  This may be the most efficient
way of working.  I emphasize this
because there is a great deal of information
which is for a fleeting moment in
people's minds, or is hastily scribbled
down on some file, and which may
be important to posterity.  It is
better for this information to be
available even in unpolished form
than for it to be hidden out of embarrassment
for its form.   Before electronic
technology, the effort of publishing
was such that this information was
never seen, and it was a waste, and
considered an insult to one's
readers, to publish something which
was not of high quality.  Nowadays,
there is "publishing" at all levels,
and both high quality and hasty documents
have their value.    It is important,
though, to make it clear what the
quality of a document is when making
a reference to it, to avoid disappointment.\par 
Monitoring the server log files will
tell you which documents are really
being read.  You can use your time
most efficiently to improve the quality
of those.  Of course, analysing the
server log files also takes time!\par 
(Part of the Style Guide for Online
Hypertext. Back to Within each document,
On to Background reading)
Tim BL


\section{Background reading}Some other documents which may be
of relevance, if you are reading
the Style Guide for Online Hypertext:
\begin{itemize}
\item The HTML Specification and references
from it
\item A Beginner's Guide to writing HTML
\item World-Wide Web server software -
a list of pointers
\item Web Etiquette $--$ for Server Administrators
\end{itemize}(Back to testing, on to ...)


\chapter{Mail Robot}The mail robot is a program which
will accept incoming mail and allow
remote users to:
\begin{itemize}
\item Subscribe to mailing lists (and unsubscribe)
\item Retrieve information given a W3 address
(URL)
\end{itemize}Originally from UC Berkeley, an enhanced
robot is distributed as part of the
world-wide web global information
initiative. Further information available
is:
\begin{DL}{allow this much space}
\item[Help
] The help file for users of the
robot service
\item[Installation
] Installation instructions
for unix system managers
\item[Bugs
] Lists of improvements requested
or needed.
\item[Change history
] A list of features
introduced and bugs fixed.
\item[See also
]Other WWW software
\end{DL}


\section{Using the W3 mailing robot}This robot maintains the W3 mailing
lists, and allows W3 documents to
be retrieved on request.\par 
You can subscribe or unsubscribe
to any of the various WWW mailing
lists by sending email to the robot
"listserv@info.cern.ch" $--$ see the
commands listed below.\par 
If you have any problems, requests
or questions for a human being, mail
"www-request@info.cern.ch". Lists
are:
\begin{DL}{allow this much space}
\item[www-announce
] Anyone interested in
WWW, who would like information about
new releases or new online data available.
Please refrain from posting administrivia
to this large list!
\item[www-talk
] Developers of WWW code,
or those interested in discussions
of technical details
\end{DL}
You can also find information on
WWW (as well as many other things!)
by telnetting to info.cern.ch (no
username, no password).\par 
If you want to pick up the WWW software,
then use anonymous FTP to info.cern.ch
and look in directory /pub/www. Subdirectories
are src for the latest source packages,
bin for executables for various machines,
doc for "paper copies" of articles
on WWW in PostScript and ASCII form.
To read the latest documentation,
use WWW!
\subsection{Commands}The commands understood by the listserv
program are:
\begin{DL}{allow this much space}
\item[HELP
] lists this file.  This is also
sent whenever a message to listserv
is received from which no valid command
could be parsed.
\item[HELP groupname
] lists a brief description
of the group requested.
\item[ADD listname
] Add yourself to the
list
\item[DELETE listname
] take yourself off
the list
\item[ADD address listname
] Add yourself
with a given mail address to the
given list. The address must not
contain spaces!
\item[DELETE address listname
] Remove the
given name from the given list. For
all ADD/DELETE commands, mail is
sent to the address given to confirm
the add or delete operation.
\item[SEND document-address
] returns a document
with the requested W3 address.
\item[STOP
] Stop processing requests: ignore
the rest of the message. Needed if
you send a signature on the end of
your message (or if some gateway
adds one). If in doubt, use it.
\end{DL}
A command must be the first word
on each line in the message.  Lines
which do not start with a command
word are ignored.  If no commands
were found in the entire message,
this help file will be returned to
you. A single message may contain
multiple commands; a separate response
will be sent for each.
\subsubsection{Examples}
\begin{verbatim}
	add www-announce

	add me@host.uni.edu www-announce

	delete me@host.uni.edu www-talk
	
	send http://info.cern.ch/hypertext/DataSources/bySubject/Overview.html

\end{verbatim}

\subsection{Subscription}If you are not sending mail from
your preferred mail address, then
you can use the second form of the
command to give your mail address.
If you are not on the internet, please
convert your address into arpa style.
(For example, UK users please use
international ordering joe@host.ac.uk.)
Just specify the mailbox, without
any spaces.\par 
If you omit the 'address' the command
will assume the mailbox that is in
the From: line of the message.  Note
that SUBSCRIBE is a synonym for ADD;
UNSUBSCRIBE for DELETE.\par 
Please note that it IS possible to
add or delete someone else's subscription
to a mailing list.  This facility
is provided so that subscribers may
alter their own subscriptions from
a new or different computer account.
There is therefore some potential
for abuse; we have chosen to limit
this by mailing a confirmation notification
of any addition or deletion to the
address added or deleted including
a copy of the message which requested
the operation.  At least you can
find out who's doing it to you.\par 
Note that although you would mail
submissions to a mailing list by
addressing mail to e.g., www-talk@info.cern.ch,
in a subscription request you specify
the name of the list simply (without
the @hostname part) as in the first
example above.
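\par The command rules described above can be summarised in a short sketch. This is not the robot's actual source (which is written in C); it is an illustration in Python, and the function name, the returned data structure, and the case-insensitive matching of command words (inferred from the lower-case examples) are assumptions.

```python
# Synonyms named in the text; command words are matched case-insensitively,
# which is an assumption based on the lower-case examples.
SYNONYMS = {"SUBSCRIBE": "ADD", "UNSUBSCRIBE": "DELETE"}
COMMANDS = {"HELP", "ADD", "DELETE", "SEND", "STOP"}


def parse_message(body, from_address):
    """Return a list of (command, arguments) tuples for one mail message."""
    requests = []
    found_command = False
    for line in body.splitlines():
        words = line.split()
        if not words:
            continue
        command = SYNONYMS.get(words[0].upper(), words[0].upper())
        if command not in COMMANDS:
            continue                  # lines without a command word are ignored
        found_command = True
        if command == "STOP":
            break                     # ignore the rest, e.g. a signature
        args = words[1:]
        if command in ("ADD", "DELETE"):
            if len(args) == 1:        # short form: use the From: mailbox
                args = [from_address, args[0]]
            elif len(args) != 2:
                continue              # malformed request; skipped in this sketch
        requests.append((command, tuple(args)))
    if not found_command:
        return [("HELP", ())]         # no command at all: send the help file
    return requests
```

For example, a message containing the line "add www-announce" from joe@host.ac.uk yields one ADD request for that address and list, and any signature placed after a STOP line is never examined.\par 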
\subsection{Retrieving documents}The SEND command (or the WWW command
which is equivalent) returns the
document with the given W3 address,
subject to certain restrictions.
Hypertext documents are formatted
to 72 character width, with links
numbered. A separate list at the
end gives the document-addresses
of the related documents.\par 
If the document is hypertext, its
links will be marked by numbers in
brackets, and a list of document
addresses by number will be appended
to the message. In this way, you
can navigate through the web, albeit
only at mail speed.\par 
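To make the format of a returned document concrete, here is a rough sketch of the numbering scheme. It is not the line mode browser's code, only an illustration: the names are invented, only anchors with an HREF attribute are handled, and all other markup is simply discarded.

```python
import textwrap
from html.parser import HTMLParser


class LinkNumberer(HTMLParser):
    """Replace each anchor by its text followed by a number in brackets."""

    def __init__(self):
        super().__init__()
        self.pieces = []       # text fragments, in document order
        self.addresses = []    # link targets, in order of appearance
        self.pending = False   # True while inside an anchor with an HREF

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            self.pending = href is not None
            if self.pending:
                self.addresses.append(href)

    def handle_endtag(self, tag):
        if tag == "a" and self.pending:
            self.pieces.append("[%d]" % len(self.addresses))
            self.pending = False

    def handle_data(self, data):
        self.pieces.append(data)


def format_for_mail(html, width=72):
    """Return wrapped text with numbered links and a trailing address list."""
    parser = LinkNumberer()
    parser.feed(html)
    parser.close()
    text = " ".join("".join(parser.pieces).split())   # collapse white space
    body = textwrap.fill(text, width=width)
    refs = ["[%d] %s" % (number, address)
            for number, address in enumerate(parser.addresses, 1)]
    return body + "\n\n" + "\n".join(refs) + "\n"
```

A reader who receives such a message can pick an address from the list at the end and pass it to SEND in the next request.\par 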
If you don't know where to start,
try asking for one of
\begin{verbatim}
 http://info.cern.ch./hypertext/DataSources/bySubject/Overview.html
 http://info.cern.ch./hypertext/DataSources/bySubject/Physics/HEP.html
 http://info.cern.ch./hypertext/WWW/TheProject.html

\end{verbatim}
for lists of further pointers.
\subsection{CAUTIONARY NOTE}As the robot gives potential mail
access to a *vast* amount of information,
we must emphasise that the service
should not be abused. Examples of
appropriate use would be:
\begin{itemize}
\item Accessing any information about W3
itself;
\item Accessing any CERN and/or physics-related
or network development related information;
\end{itemize}Examples of INappropriate use would
be:
\begin{itemize}
\item Attempting to retrieve binaries or
.tar files or anything more than
directory listings or short ASCII
files from FTP archive sites;
\item Reading internet newsgroups which
your site doesn't take;
\item Repeated automatic use;
\end{itemize}There is currently a 1000 line limit
on any returned file. We don't want
to overload other people's mail relays
or our server. We reserve the right
to withdraw the service at any time.
We are currently monitoring all use
of the server, so your reading will
not initially enjoy privacy. End
of cautionary note.\par 
Enjoy!
The W3 team at CERN  (www-bug@info.cern.ch)


\section{Installation}Here are the steps necessary to install
the Mail Robot product on your unix
system.
\subsection{Customisation}Set up the variables in listserv.h
and CommonMakefile to suit your site.
\begin{DL}{allow this much space}
\item[POSTMASTER
] The address from which
messages appear to come. Why not
listserv? Perhaps to prevent mail
loops.
\item[SECUREWWW
] The executable W3 line
mode browser (v1.3 or later, so as
to have the -listrefs option). This
is a separate product. For security,
www should be writable only by root.
\item[SERVERDIR
] The directory in which
you want to put your mailing lists
and help about them.
\end{DL}

\subsection{Compile the programs}Everything compiled on AEM's MicroVax
II running ULTRIX 3.0 then TBL's
NeXT without any problem at all.
Your results may vary.
\subsection{Create your SERVDIR}Create this directory wherever you specified in listserv.h.
Install a HELP file, perhaps using
the example-files/HELP in this directory
as a template.
\subsection{Set up an alias "listserv"}Make an alias in your /etc/aliases
(or /etc/sendmail/aliases, whatever
you have) that points to this program,
for example:
\begin{verbatim}
		listserv:	"|/usr/local/mail/listserv"
		robot:		"|/usr/local/mail/listserv"


\end{verbatim}

\subsection{For each mailing list}Create a name.info file giving a
bit of information about that mailing
list. See the *.info files in the
example-files subdirectory.\par 
Create a name file in the same directory,
consisting of email addresses one
to a line of subscribers to a group.
If it is for a brand-new group, create
an empty file. Remember that this
file must be writable by the mail
daemon. The name of the file is just
the name of the group.\par 
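The list file format described above (one address per line, no spaces within an address) is simple enough to sketch. The following is an illustration only, not the robot's own code; the function names are invented here.

```python
def add_address(addresses, address):
    """Return the subscriber list with the address added if it is new."""
    if not address or any(char.isspace() for char in address):
        raise ValueError("address must not contain spaces")
    if address in addresses:
        return list(addresses)
    return list(addresses) + [address]


def delete_address(addresses, address):
    """Return the subscriber list without the given address."""
    return [entry for entry in addresses if entry != address]


def read_list(path):
    """Read a list file: one address per line, blank lines ignored."""
    with open(path, encoding="ascii") as source:
        return [line.strip() for line in source if line.strip()]


def write_list(path, addresses):
    """Write the subscriber list back, one address per line."""
    with open(path, "w", encoding="ascii") as target:
        target.writelines(entry + "\n" for entry in addresses)
```

An empty file is simply an empty list, which is why a brand-new group needs nothing more than an empty file that the mail daemon can write.\par 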
Depending on how you have your mailing
lists set up, you may need to add
an alias to the /etc/aliases file
for each of the mailing lists. For
example:
\begin{verbatim}	real-recipes: :include:/usr/local/mail/maillists/recipes

\end{verbatim}
So mail sent to real-recipes actually
goes to each of the subscribers listed
in /usr/local/mail/maillists/recipes.
\subsection{Install listserv}Install in the appropriate directory.
 Edit the CommonMakefile and then
\begin{verbatim}		make install

\end{verbatim}

\subsection{Run newaliases}This gets sendmail to read the changes
in /etc/aliases.
\begin{verbatim}		newaliases

\end{verbatim}

\subsection{Try it out}Send mail to listserv with body
\begin{verbatim}
		HELP

\end{verbatim}
for example.  You should get a plain
text version of the help file.


\section{Mail Robot}This is a "listserv" type program
which maintains mailing lists, and
allows W3 documents to be retrieved
by electronic mail.
\begin{DL}{allow this much space}
\item[Author:
] Various, modified by TBL.
\item[Status:
] Source available  by anonymous
FTP. (Oct 92)
\item[Current version:
] 1.0
\item[Platforms:
] Unix only.
\item[More information:
] Overview, Bugs,
change history.
\end{DL}


\section{Bugs}This is a list of bugs in or improvements
desired in the Mail Robot. See also
the list of bug fixes.
\begin{itemize}
\item The INDEX command ought to be implemented,
but for some reason always returns
an empty list.  Occasionally it seems
to work.
\end{itemize}


\section{Change History}Changes to the Mail Robot, in reverse
chronological order:
\subsection{October 1992}TBL added information retrieval possibility
using WWW. Release as an unsupported
W3 product to those who ask for it.
\subsection{1991}TBL rewrote str.c (used to overwrite
its arguments).
\subsection{AEM}A. E. Mossberg, aem@mthvax.cs.miami.edu
made a couple of minor changes, to make
it slightly less UCSD-specific. He
also added a README, and example
files in the subdirectory example-files.

\subsection{Origin}Note this is NOT the bitnet LISTSERV
program. The term "mail robot" is
used to attempt to prevent confusion
between these two products, which
have different functionality although
they do basically the same sort of
thing.\par 
This was the UCSD listserv program,
which AEM retrieved from ucsd.edu
by anonymous ftp, and which TBL retrieved from
ftp.eff.org.  As retrieved from file://ftp.eff.org/pub/listserv2.shar,
it consisted of the following files:
\begin{verbatim}    			README
    			Makefile
    			commands.c
    			listserv.h
    			main.c
    			str.c
    			subscribe.c

\end{verbatim}

\end{document}