srv/loop.nw

3067 lines
114 KiB
Text

% -*- mode: noweb; noweb-default-code-mode: lisp-mode; -*-
\documentclass[a4paper,11pt]{article}
\usepackage[text={6.75in,10in},centering]{geometry}
\usepackage{graphicx}
\usepackage{microtype}
\DisableLigatures[-]{family=tt*}
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
\usepackage{csquotes}
\usepackage{babel}
\usepackage{etoolbox}
\AtBeginEnvironment{quote}{\small}
\AtBeginEnvironment{verbatim}{\small}
%% \usepackage[backend=biber]{biblatex}
%% \addbibresource{refs.bib}
%% \renewcommand{\cite}{\parencite}
% \usepackage[hyperref]{xcolor}
\usepackage[colorlinks=true,citecolor=]{hyperref} % linkcolor=red
\usepackage{amsmath,amsthm,amssymb}
\allowdisplaybreaks
\usepackage{lmodern}
\usepackage{noweb}
\noweboptions{longchunks,smallcode}
\DeclareMathOperator{\mdc}{mdc}
\DeclareMathOperator{\gcdext}{gcdext}
\DeclareMathOperator{\remainder}{remainder}
\DeclareMathOperator{\quotient}{quotient}
\DeclareMathOperator{\diff}{diff}
\def\nwendcode{\endtrivlist \endgroup}
\let\nwdocspar=\par
%% Popular words.
\newcommand{\lxxp}{{\tt loop}}
\newcommand{\Lp}{{\tt LOOP}}
\newcommand{\lp}{\Lp}
\newcommand{\bug}{{\em bug}}
\newcommand{\symlink}{{\em symbolic link}}
\newcommand{\symlinks}{\symlink s}
\title{\Lp\\
{a circle out of fashion}}
\date{January 2024}
\begin{document}
\pdfbookmark[1]{Introduction}{intro}
\fontfamily{cmr}\selectfont
\maketitle
%\setlength{\parskip}{3pt}
%\setlength{\parindent}{0pt}
\Lp\ is an out-of-fashion program, used as medium of communication by
antiquated people. \Lp\ members insist that technical communication
be made in writing and not in a hurry. To give you an idea, they
write \Lp\ in jurassic technology. You wouldn't pay them any
attention.
%
\begin{verbatim}
Drunk and dressed in their best brown baggies and their platform soles
They don't give a damn about any trumpet-playing band
It ain't what they call rock and roll
-- Mark Knopfler, 1978.
\end{verbatim}
It's easy to make a conference on the Internet. E-mail works. When
we write an e-mail to various friends to discuss a certain subject, we
form a circle. When we decide to add another person to this
quickly-made conference by e-mail, sometimes we use a well-known
idiom---``adding John to the loop''. We add John's e-mail address to
the list of destinaries. So long as everyone replies to everyone,
John, too, will start getting all the messages. If anyone violates
this rule of replying to everyone involved, the loop is broken.
\begin{figure}[!htb]
\centering \includegraphics[width=0.8\linewidth]{images/gnus-summary.png}
\caption{Gnus, a news reader embedded in the GNU EMACS text editor.}
\label{fg:gnus}
\end{figure}
\begin{figure}[!htb]
\centering \includegraphics[width=0.8\linewidth]{images/tbird-summary.png}
\caption{Thunderbird, a news reader produced by the Mozilla Foundation.}
\label{fg:bird}
\end{figure}
\begin{figure}[!htb]
\centering \includegraphics[width=0.8\linewidth]{images/sylpheed-summary.png}
\caption{Sylpheed, a news reader produced by Hiroyuki Yamamoto.}
\label{fg:sylpheed}
\end{figure}
There are surely inconveniences in using e-mail as conference medium.
For example, after John has been added to the loop, he is not able to
leave by his own account. He needs to ask everyone involved to stop
writing to him. This is usually easy to do, but instead people tend
to ask for more technology such as mailing lists. A mailing list is
nothing by an automated version of this idea of writing to various
people at once. When the mailing address is written to, a program
resends the message to all subscribers of the mailing list and, this
way, the conference takes place.
But \lp\ has nothing to do with e-mail. \Lp\ uses a communication
strategy---called a ``protocol''---that is even older than the web
itself. The web started out around 1989--1990 and the protocol
\lp\ uses begins its history in 1979, the year in which AT\&T released
UNIX V7, which was UUCP-ready, an acronym that stands for for UNIX to
UNIX copy. It's both a protocol and a set of programs for copying
files between UNIX systems. People begun exchanging messages by
copying files between machines. The idea eventually evolved into a
protocol called NNTP---Network News Transfer Protocol---, which is the
protocol used by \lp. (The protocol used to be called Netnews
Transfer Protocol.) Since e-mail was already daily practice of
members of the Internet back then, many things from e-mail were taken
by the NNTP designers. So, an NNTP message looks a lot like an e-mail
message and the two---NNTP and SMTP (the protocol used by
e-mail)---can often mingle seamlessly. The impression we get from
using NNTP is that we're sending e-mail to a certain group of people.
It's as though the message goes into a collective mailbox and anyone
interested in that mailbox reads the messages there. If anyone would
like to reply to a message, they do so and, this way, communication
flows among the interested crowd. If anyone would like to leave the
group, nothing is needed---the person just doesn't go back to read any
more messages. Unlike mailing lists, there is no need to formally
commit to reading one of these collective mailboxes and no need to
formally notify anyone or any system that you're not interested in
that group any longer. These collective mailboxes are called ``news
groups'' and are often written as ``newsgroups''. And the messages
posted to these newsgroups are called ``articles''.
Just like e-mail and the web, network news is an open protocol.
Anyone could write a program capable of speaking NNTP. There are many
NNTP-aware programs. You could write your own. Figures
\ref{fg:gnus}--\ref{fg:sylpheed} show a few programs for reading
network news via NNTP.
\section*{Principles for a discussion group}\label{principles}
\pdfbookmark[1]{Principles for a discussion group}{principles} We
believe a discussion group should be small and grow slowly. By
``slowly'', we mean that each member comes in through an invitation.
This way, the group being closed by definition, we keep spam out and
give members a certain sense of privilege. A discussion group should
be formed by interested people. If a participant doesn't log-in for a
certain period of time, \lp\ locks the participant's account---see
Section \ref{sec:inactive-users}. The account can be reactivated, but
it will take asking another participant (with an active account) to do
so. In other words, there's an encouragement for an uninterested
member not to come back to the \lp. The idea is to keep a certain
cohesion in the discussion groups. When an account is locked or
unlocked, an article is posted to the group {\tt local.control.news},
so everyone knows who is leaving and arriving. This way, participants
get to have an idea of who is reading them.
Each invitation comes with a certain responsibility: it's possible to
see who invited who. If {\tt BOB} misbehaves, everyone gets to see
that {\tt ALICE} doesn't have nice friends. The {\tt USERS} command
shows the relationship graph:
%
\begin{verbatim}
USERS
200 List of current users:
ANONYMOUS, last seen on Fri Mar 8 19:01:56 2024, invited (ALICE)
ALICE, last seen on Sun Mar 10 21:25:45 2024, invited (BOB CARLA)
BOB, last seen on Sun Mar 10 21:17:30 2024, invited nobody
CARLA, last seen on Sun Mar 10 18:30:48 2024, invited nobody
\end{verbatim}
We conjecture that a discussion group tends to prosper when each
member feels as important as any other member. So we think each
member should have as much managerial power as any other. In an
attempt to realize this ideal, each member is able to not only invite
other people---see the [[CREATE-ACCOUNT]] command---, but also to
create new groups---see the [[CREATE-GROUP]] command.
Despite this introduction, this is not a tutorial on the history of
the Internet or how NNTP works. This is the source code of \lp.
Hereafter, our conversation continues in Lisp. Understanding how
\lp\ is made is only necessary if you intend to modify it. If you
just want to use the system, you probably should stop right here.
\section*{How to install}
See
\href{https://git.antartida.xyz/loop/srv/raw/branch/main/README}{[[README]]}
in \lp's source code.
\section{Implementation strategy}\label{sec:design}
Anything a user sends to the \lp\ is wrapped in a [[request]] and
any command processing must produce a [[response]]:
%
\begin{verbatim}
request ---> process ---> response.
\end{verbatim}
%
The request arrives, \lp\ interprets it, finds out which command
should look at the request and then dispatches the the request to the
right command. The command process the request and returns a
response. Then \lp\ takes this response and sends it out back to the
user through the network using [[send-response!]]. That's all there
is to \lp. Everything else is just the details of this strategy.
Before you investigate the source code of \lp, you should have {\tt
SBCL} installed---the Common Lisp compiler we've been using---and
the package [[:loop]]. This way you can interact with the system:
%
\begin{verbatim}
CL-USER> (ql:quickload :loop)
To load "loop":
Load 1 ASDF system:
loop
; Loading "loop"
..................
(:LOOP)
CL-USER> (in-package :loop)
#<PACKAGE "LOOP">
LOOP> (make-request "HELP")
#S(REQUEST :VERB "HELP" :ARGS NIL :SAID NIL)
\end{verbatim}
%
The procedure [[make-request]] constructs a [[request]]. When a user
connects to the \lp, each command sent by the user is packaged inside
a [[request]] structure like this one above. To see how \lp\ answers
this request for help, we can invoke the procedure [[dispatch]], which
is responsible to find the correct procedure to reply to the request.
\begin{verbatim}
LOOP> (dispatch (make-request :verb "HELP"))
#S(RESPONSE
:CODE 400
:DATA "unrecognized command"
:REQUEST #S(REQUEST :VERB "HELP" :ARGS NIL :SAID NIL)
:MULTI-LINE NIL)
\end{verbatim}
It turns that \lp\ doesn't recognize the command. That's because we
only loaded the [[:loop]] package. The reason it doesn't recognize
the command is that the command table has not been built. Let's build
it and try again:
\begin{verbatim}
LOOP> (set-up-tables!)
(("ARTICLE"
. #S(COMMAND
:FN #<FUNCTION CMD-ARTICLE>
:VERB "ARTICLE"
:DESCRIPTION "fetches full articles"))
[...]
("HELP"
. #S(COMMAND
:FN #<FUNCTION CMD-HELP>
:VERB "HELP"
:DESCRIPTION "displays this menu")) [...])
LOOP> (dispatch (make-request :verb "HELP"))
#S(RESPONSE
:CODE 200
:DATA (87 104 97 [...] 112)
:REQUEST #S(REQUEST :VERB "HELP" :ARGS NIL :SAID NIL)
:MULTI-LINE YES)
\end{verbatim}
That's better. These numbers we see in the response are the bytes in
the response. You can get a string version of these numbers:
\begin{verbatim}
LOOP> (bytes->string (response-data (dispatch (make-request :verb "HELP"))))
"What's on the menu today?
ARTICLE fetches full articles
AUTHINFO makes me trust you
BODY fetches an article body
GROUP sets the current group
HEAD fetches article headers
HELP displays this menu
LIST lists all groups
MODE handles the mode request from clients
NEXT increments the article pointer
POST posts your article
QUIT politely says good-bye
XDD displays your state of affairs
XOVER fetches the overview database of a group"
\end{verbatim}
This is the text that the user sees when they ask for {\tt HELP}. In
other words, the field {\tt data} in a [[response]] stores the data to
be delivered back to the user. The program {\tt nc}---for
``netcat''---that we use below is capable of opening a TCP connection
and handling that connection to our keyboard so that we can interact
with the \lp. You can effectively achieve the same thing using a
program such as {\tt telnet}.
\begin{verbatim}
C:\>nc antartida.xyz 119
200 Welcome! Say ``help'' for a menu.
help
200 What's on the menu today?
ARTICLE fetches full articles
AUTHINFO makes me trust you
BODY fetches an article body
GROUP sets the current group
HEAD fetches article headers
HELP displays this menu
LIST lists all groups
MODE handles the mode request from clients
NEXT increments the article pointer
POST posts your article
QUIT politely says good-bye
XDD displays your state of affairs
XOVER fetches the overview database of a group
.
quit
205 Good-bye.
\end{verbatim}
Despite \lp\ being made to talk to computer programs such as the ones
illustrated in Figures~\ref{fg:gnus}--\ref{fg:sylpheed}, it's
perfectly possible for a user to interact with \lp\ directly using a
keyboard and a command-line tool such as {\tt nc} or {\tt telnet}. In
fact, \lp\ takes advantage of that to be {\em hackable}---see also
Section~\ref{sec:repl}. Commands such as [[CREATE-ACCOUNT]],
[[CREATE-GROUP]], [[PASSWD]] are not part of the NNTP protocol, so
users need to know how to use {\tt nc} or {\tt telnet} to take
advantage of all of \lp's capabilities.
\section{NNTP protocol}
An Internet protocol is usually defined by a document whose tradition
calls RFC---for ``request for comments''---, an acronym suggested by
Stephen D.~Crocker, who wrote RFC 1 back in 1969, when ARPANET
connected four computers. NNTP is defined by RFCs 977, 2980, 3977,
4643 and 5536. RFC 977 was the first and replaced by 3977. Still,
reading RFC 977 is interesting precisely because it gives us a
historical account of the protocol, making it easier to understand the
evolution of the system. The objective of RFC 2980 was to implement
new ideas to the NNTP protocol---to extend the protocol. RFC 3977
adopts some of these extensions. RFC 4643 also extends RFC
2980---addressing concerns with authentication.
\section{It's a network server}
The \lp\ program is a network server, that is, it serves connections
on the network. However, instead of implementing the handling of TCP
connections, we use what is called a ``TCP superserver'' or ``TCP
wrappers'' or perhaps just ``TCP server''. The idea is---a program
called ``TCP server'' waits for connections from the network. When a
client arrives, the TCP server handles the network connection to \lp.
From \lp's perspective, the client is just another keyboard talk to it
directly. This strategy simplifies the implementation of \lp. Both
programs---the TCP server and \lp---have nothing to do with one
another, but their work together make the system work. This
implementation strategy is typically found in UNIX programs.
%
\begin{quote}\small
This is the Unix philosophy. Write programs that do one thing and do
it well. Write programs to work together. Write programs that handle
text streams, because that is a universal interface.\\ --- Doug
McIlroy, 1989, interviewed by Michael S.~Mahoney.
\end{quote}
%
The TCP server just just one thing---listens for new connections and
handles them to the interested program. It does the handling and does
it well. \Lp, on the other hand, concerns itself with the NNTP
protocol and does not worry about handling network connections. This
way, they work together. And \lp\ handles only a text stream, which
is why it's so easy to connect a keyboard to it and interact with it
through the command line as illustrated in Section~\ref{sec:design}.
\section{REPL for \lp}\label{sec:repl}
There's a REPL to \lp's procedures---always useful.
\begin{verbatim}
%./loop --repl
LOOP> (+ 1 1)
2
LOOP> (list-groups)
(local.control.news local.test)
\end{verbatim}
The REPL can also be use in the middle of an NNTP conversation, which
is useful for debugging. All authenticated users can use it, making
passwords only a barrier to spammers, not to your circle of friends,
since everyone can find a way to log in with anyone's account, change
their passwords or just do whatever they want to \lxxp\ process or
databases.
As the comment below explains, we need to [[clear-output]] in the
[[*standard-input*]] to avoid the following problem:
%
\begin{verbatim}
repl
LOOP> quit
200 Okay, no more REPL hacking.
400 I beg your pardon?
\end{verbatim}
%
What happens is that [[read]] consumes just the symbol \verb|quit|, so
there remains a \verb|\n| or \verb|\r\n| in the [[*standard-input*]],
which we need to wipe out, or [[nntp-read-line]] will consume an empty
line, which is what causes that 400 response.
<<Loop's REPL>>=
(defun repl (r)
(in-package :loop)
(loop
(print/finish "LOOP> ")
(handler-case
(let ((expr (read)))
(if (eq 'quit expr)
(progn
;; At this point, there's \r\n still in the stdin due
;; to read having returned a symbol followed by a line
;; termination, so we must wipe that out or nntp-read-line
;; will find an empty line after this.
(clear-input)
(return
(make-response
:code 200 :request r
:data "Okay, that's enough hacking for today.")))
(println "~a" (eval expr))))
(end-of-file ()
(print/finish "^D~%")
(uiop:quit 0))
(interactive-interrupt ()
(print/finish "^C~%")
(uiop:quit 0))
(t (c)
(print/finish
"Oops: ~a~%"
(str:collapse-whitespaces
(str:replace-all (string #\linefeed) " " (fmt "~a" c))))))))
@ %def repl
\section{Representation of a client}
How do we represent a client? A client is typically reading a group
and an article; it's has authenticated itself or not yet. So we need
a global structure to annonate the client's state.
<<Representation of a client>>=
(defstruct client group (article 1) (username "ANONYMOUS") (auth? 'no))
(defparameter *client* (make-client))
@ %def client *client*
\section{Representation of a command}
What does a client typically tell \lp? A client typically sends
commands. Commands typically need arguments. Each command is
dispatched to a procedure that answers it---it's the purpose of
\verb|fn| in the command. Together, all commands make up a table of
commands, which is essentially what the user sees when ask for
\verb|HELP|. (Be aware that some clients use the output of
\verb|HELP|. For example, Gnus v5.13.)
<<Representation of commands>>=
(defstruct command fn verb description)
(defparameter *commands-assoc* nil)
@
<<Table of commands>>=
(defun table-of-commands ()
`(("GROUP" ,#'cmd-group "sets the current group")
("NEXT" ,#'cmd-next "increments the article pointer")
("HELP" ,#'cmd-help "displays this menu")
("LIST" ,#'cmd-list "lists all groups")
("AUTHINFO" ,#'cmd-authinfo "makes me trust you")
("LOGIN" ,#'cmd-login "shorter interface to AUTHINFO")
("HEAD" ,#'cmd-head "fetches article headers")
("MODE" ,#'cmd-mode "handles the mode request from clients")
("BODY" ,#'cmd-body "fetches an article body")
("POST" ,#'cmd-post "posts your article")
("ARTICLE" ,#'cmd-article "fetches full articles")
("XOVER" ,#'cmd-xover "fetches the overview database of a group")
("CREATE-GROUP" ,#'cmd-create-group
"creates a new group so you can discuss your favorite topic")
("CREATE-ACCOUNT",#'cmd-create-account
"creates an account so you can invite a friend")
("PASSWD" ,#'cmd-passwd "changes your password")
("USERS" ,#'cmd-list-users "lists all users")
("DD" ,#'cmd-dd "[d]isplays [d]ata: your state of affairs")
("QUIT" ,#'cmd-quit "politely says good-bye")
("DATE" ,#'cmd-date "displays the current date at this server")
("UNLOCK-ACCOUNT" ,#'cmd-unlock-account "unlocks an account")
("REPL" ,#'cmd-repl "lets you hack away")))
(defun set-up-tables! ()
(labels ((build-commands-assoc (ls)
(if (null ls)
nil
(cons (apply #'make-command-pair (car ls))
(build-commands-assoc (cdr ls)))))
(make-command-pair (name fn desc)
(cons name (make-command :fn fn :verb name :description desc))))
(setf *commands-assoc*
(sort
(build-commands-assoc (table-of-commands))
#'string-lessp :key #'car))))
(defun get-command (key)
(let ((cmd (assoc key *commands-assoc* :test #'string=)))
(labels ((unrecognized-command ()
(make-command
:fn #'(lambda (r)
(make-response
:code 400
:data "unrecognized command"
:request r))
:verb 'unrecognized
:description "a command for all commands typed wrong")))
(or (cdr cmd) (unrecognized-command)))))
@ %def *commands-assoc* set-up-tables! get-command
\section{Representation of requests and responses}
Each command is given through a text line written by the user. Let's
call this text line the [[request]]. When \lp\ parses the request, it
will extract (from the request) a verb and some arguments. We will
take a verbatim copy of everything the user has said, possibly for
debugging purposes.
How do we represent a [[response]]? A [[response]] is always a
reaction to a [[request]]. The NNTP protocol always specifies an
integer as the code to a response, which is what we call the
\verb|code| in the response. Long responses end with a period and we
mark such responses with [[multi-line]].
<<Representation of requests and responses>>=
(defstruct request verb args said)
(defstruct response code data request multi-line)
@
<<Procedures for requests and responses>>=
(defun empty-response () (make-response :code 400 :data "I beg your pardon?"))
(defun prepend-response-with (message r)
(make-response
:code (response-code r)
:data (data message (crlf) (response-data r))
:multi-line (response-multi-line r)
:request (response-request r)))
@ %def request response make-response make-request empty-response prepend-response-with
Here's how to send a [[response]] to a client.
<<Procedures for requests and responses>>=
(defun append-crlf-if-needed (seq)
(cond
((stringp seq)
(append-crlf-if-needed (string->bytes seq)))
((listp seq)
(append seq
(when (not (= (car (last seq)) 10))
(list 13 10))))
(t (error (format nil "append-crlf-if-needed: unsupported type: ~a" (type-of seq))))))
(defun send-response! (r)
(let ((bs (data (integer->string (response-code r)) " "
(append-crlf-if-needed (response-data r)))))
(my-write bs *standard-output*)
(stderr ">>> ~a" (bytes->string (ucs-2->ascii bs))))
(when (response-multi-line r)
(let ((bs (data "." (crlf))))
(my-write bs *standard-output*)
(stderr ">>> ~a" (bytes->string (ucs-2->ascii bs)))))
(force-output)
r)
@ %def send-response!
The Windows Console---the one we use when we run {\tt cmd.exe}---makes
the Lisp procedure [[write-sequence]] produce UCS-2. Windows will do
what we need of [[write-sequence]] if we're not operating \lp\ on the
Windows Console. Recall that we do not run \lp\ on Windows Console,
so whatever happens because of it is of little importance to us.
However, we implement \lp\ on Windows and so it's convenient for us
that \lp\ and Windows play nicely with each other. A cheap solution
here is to simply convert the bytes to a string if \lp\ is directly
connected to an interactive Lisp stream. This way we effectively
eliminate the UCS-2 encoding used by Windows. It is perfectly fine
for us to destroy the encoding of articles while we're writing \Lp.
It is not fine, however, when it's running in production. But, in
production, {\tt (interactive-stream-p s)} will always be false. How
else should we handle this?
<<Procedures for requests and responses>>=
(defun my-write (ls-of-bytes s)
(if (interactive-stream-p s)
(write-sequence (mapcar #'code-char ls-of-bytes) s)
(write-sequence ls-of-bytes s)))
@ %def my-write
\section{Main loop}
Every command consumes a [[request]] and produces a [[response]]. If
any procedure always produces a [[response]], then delivering a
[[response]] to the user is a matter of sending a string composed of
the [[response]] code concatenated with the [[response]] data.
What does \lp\ do? It repetitively reads a line from the user,
processes that line and always replies something back. Then \lp\ is
back at waiting for the user to say something else again. If the user
says {\tt QUIT}, then we should identify it and terminate \lp's
execution. That's even [[send-response!]] returns the [[request]]
itself---so we can cascade actions based on a user's request.
<<Main loop>>=
(defun main-loop ()
(loop
(let* ((bs (nntp-read-line))
(ln (bytes->string (ucs-2->ascii bs))))
(let ((r (send-response! (dispatch-line ln))))
(when (response-quit? r)
(return))))))
(defun request-quit? (r) (and r (string= 'quit (request-verb r))))
(defun response-quit? (r) (and r (request-quit? (response-request r))))
(defun server-start ()
(set-up-tables!)
(send-banner!)
(main-loop))
(defun main ()
(let ((app (cli/command)))
(clingon:run app)))
(defun send-banner! ()
(send-response!
(make-response
:code 200
:data "<<Welcome message>>")))
@ %def main main-loop
\noindent It's always useful to know which version exactly we're
dealing with:
%
\begin{verbatim}
%./loop.exe
200 Welcome! I am LOOP 9575ac2. Say ``help'' for a menu.
\end{verbatim}
%
So we put a release tag on \lp's banner.
<<Welcome message>>=
Welcome! I am <<Name>> <<Version>>. Say ``help'' for a menu.
@
\noindent We take the opportunity and describe \lp's package,
information which we also use in [[loop.asd]].
<<Name>>=
LOOP
@
<<Description>>=
An NNTP server for a circle of friends.
@
<<Version>>=
a89e088
@
\section{Parsing of requests}
The commands themselves we call {\tt verbs} and everything else the
user types we call {\tt args}. Observe that upper and lower case
letters are equivalent in request verbs.
<<Procedures for requests and responses>>=
(defun parse-request (r)
(let* ((collapsed-s (str:collapse-whitespaces (request-said r)))
(ls (str:split " " collapsed-s :omit-nulls 'please)))
;; What are we going to do with a null request?
(cond ((null ls) (make-request :said (request-said r)))
(t (let ((verb (car ls))
(args (cdr ls)))
(make-request :said (request-said r)
:verb (str:upcase verb)
:args args))))))
@ %def parse-request
\section{Parsing of command-line arguments}
We're using the clingon library as per Vincent Dardel suggestion in
``The Common Lisp Cookbook''. We begin with writing a description of
the program and options it understands. XXX: notice I don't know how
to support a two-argument option, so I hacked a solution away. What
we need to is to implement a new option. The library is extensible.
<<Command-line parsing>>=
(defun cli/options ()
(list
(clingon:make-option
:string
:description "creates a new account"
:long-name "create-account"
:key :create-account)
(clingon:make-option
:string
:description "changes password"
:long-name "change-passwd"
:key :change-passwd)
(clingon:make-option
:flag
:description "lists accounts"
:short-name #\l
:long-name "list-accounts"
:key :list-accounts)
(clingon:make-option
:flag
:description "runs a REPL right now"
:short-name #\r
:long-name "repl"
:key :repl)
(clingon:make-option
:flag
:description "disables the NNTP REPL"
:long-name "disable-nntp-repl"
:key :disable-nntp-repl)
(clingon:make-option
:flag
:description "logging (on stderr)"
:long-name "logging"
:key :logging)))
@
We implement first the procedures that handle options that represent
an entire program. For example, saying [[--list-accounts]] is like
running a program [[./list-accounts]].
<<Command-line parsing>>=
(defun cli/list-accounts ()
(println (str:join (crlf-string) (list-users))))
(defun cli/create-account (username args)
(let ((invited-by (car args)))
(cond ((null invited-by)
(println "Must specify who invites the new account."))
((get-account username)
(println "Username account ``~a'' already exists." username))
((not (get-account invited-by))
(println "Invited-by account ``~a'' doesn't exist." invited-by))
(t
(multiple-value-bind (okay? pass-or-error) (new-account! username invited-by)
(if okay?
(progn (println "Okay, account ``~a'' created with password ``~a''."
username pass-or-error)
(notify-user-created username))
(println "Sorry, ~a." pass-or-error)))))))
(defun cli/change-passwd (username args)
(let* ((random-passwd (random-string 6))
(given-passwd (car args))
(new-passwd (or given-passwd random-passwd)))
(if (not (get-account username))
(println "No such account ``~a''." username)
(multiple-value-bind (okay? problem) (change-passwd! username new-passwd) (if okay?
(println "Okay, account ~a now has password ``~a''." username new-passwd)
(println "Sorry, could not change password: ~a." problem))))))
@ %def cli/change-passwd cli/list-accounts cli/create-account
Now let's write the main procedure in command-line parsing. Notice
that because of the design of the [[clingon]] library, command-line
parsing becomes the main procedure of \lp. In other words, \lp's
service starts with [[server-start]].
<<Command-line parsing>>=
(defun cli/main-with-handlers (cmd)
(handler-case
(cli/main cmd)
(end-of-file ()
(print/finish "^D~%")
(uiop:quit 0))
(interactive-interrupt ()
(print/finish "^C~%")
(uiop:quit 0))))
(defun cli/main (cmd)
(read-accounts!)
(connect-index! "message-id.db")
(create-index!)
(let ((args (clingon:command-arguments cmd))
(run-server t)
(repl (clingon:getopt cmd :repl))
(ca (clingon:getopt cmd :create-account))
(pa (clingon:getopt cmd :change-passwd))
(la (clingon:getopt cmd :list-accounts))
(logging (clingon:getopt cmd :logging))
(disable-nntp-repl (clingon:getopt cmd :disable-nntp-repl)))
(setf *debug* logging)
(when (or ca pa la)
(setf run-server nil))
(when la
(cli/list-accounts))
(when ca
(cli/create-account ca args))
(when pa
(cli/change-passwd pa args))
(when repl
(repl (make-request :verb "repl" :args '(command-line))))
(when disable-nntp-repl
(setq *enable-nntp-repl* nil))
(when run-server
(server-start))))
(defun cli/command ()
(clingon:make-command
:name "loop"
:description "<<Description>>"
:version "<<Version>>"
:license "GPL v3"
:options (cli/options)
:handler #'cli/main-with-handlers))
@ %def cli/options cli/command
\section{Request dispatching mechanism}
Dispatching requests means consuming one and invoking the correct
procedure that will process the request. The invoked procedure must
produce a [[response]]. The work of dispatching is just delivering
the task to an response-producing procedure and then raising the
response to whoever needs to catch it. For example,
[[response-quit?]] is used by [[main-loop]] to identify when the user
has issued {\tt QUIT}, in which case we terminate [[main-loop]].
<<Dispatching of commands>>=
(defun dispatch (r)
(let* ((verb (request-verb r)))
(if (null verb)
(empty-response)
(funcall (command-fn (get-command verb)) r))))
(defun dispatch-line (ln)
(dispatch (parse-request (make-request :said ln))))
@ %def dispatch dispatch-line
\section{Representation and parsing of articles}
An article is made of two parts, the head and the body. We do need to
parse the head, but we never parse the body: we don't want to
intervene in anything that a user might be doing in the body of the
article. The headers, however, are mostly under the jurisdiction of
the server. This decision is due to the fact servers must read
headers. For example, how do we know to which groups an article was
posted? We look at the header {\tt newsgroups}. So, the server must
understand the encoding of headers. Therefore, we assume ASCII
encoding of all headers that we need to parse.
The member [[headers]] of the structure [[article]] is just a string,
while body is a vector of bytes. To get a list of pairs out of the
set of all headers of an article, we can ask [[parse-headers]]. Yes,
I should've called the member [[headers]] as [[head]] and not
[[headers]] because both the word ``headers'' and its plural used here
suggest a list of parsed headers. We're going to rename this in due
time. %% TODO
<<Representation of articles>>=
(defstruct article headers body)
@
<<How to parse articles>>=
(defun parse-article (v)
(let ((parts (split-vector (vector 13 10 13 10) v nil :limit 2)))
(make-article :headers (map 'string #'code-char (car parts)) :body (cadr parts))))
(defun hs-space-collapsed (hs)
(cl-ppcre:regex-replace-all (format nil "~a[ ~a]+" (crlf-string) #\tab) hs " "))
(defun hs-lines (lines) (str:split (crlf-string) lines))
(defun parse-header (header)
(let* ((h (str:collapse-whitespaces header))
(pos (search ":" h)))
(when (null pos)
(throw 'article-syntax-error
(values nil (format nil "missing colon in header |~a|" h))))
(when (<= (length h) (+ 2 pos))
(throw 'article-syntax-error
(values nil (format nil "empty header ~a" h))))
(multiple-value-bind (key val)
(values (subseq h 0 pos) (subseq h (+ 2 pos)))
(cons (str:downcase key) val))))
(defun parse-headers (hs)
(let ((ls (hs-lines (hs-space-collapsed hs))))
(mapcar #'(lambda (h) (parse-header h)) ls)))
(defun string-integer? (s) (ignore-errors (parse-integer s)))
@ %def parse-article parse-headers
We now write some procedures that we use when we're build the {\em
overview} of the command \verb|XOVER|.
<<How to parse articles>>=
(defun get-header-from-article (h a)
(get-header h (parse-headers (article-headers (parse-article a)))))
(defun get-header (key hs)
(let ((pair (assoc key hs :test #'string=)))
(if pair (cdr pair) "")))
(defun fetch-headers (g i)
(let* ((a-string (fetch-article g i))
(a-parsed (parse-article a-string))
(headers (parse-headers (article-headers a-parsed))))
(enrich-headers headers a-string)))
(defun enrich-headers (hs a)
(append hs
`(("line-count" . ,(format nil "~a" (nlines (article-body (parse-article a)))))
("byte-count" . ,(format nil "~a" (length a))))))
(defun nlines (v) (length (split-vector (crlf) v nil)))
@ %def get-header fetch-headers
\section{How to extract articles from the database}
Notice that we do not care about which encoding is used in articles.
We just read the article bytes and handle them to the client. It's
the article viewer's---the NNTP client's, that is---responsibility of
interpreting such bytes. That's why we call [[read-sequence]] here.
<<How to parse articles>>=
(defun fetch-article (g i)
(in-groups
(read-file-raw (format nil "~a/~a" g i))))
(defun read-file-raw (path)
(with-open-file
(in path
:element-type '(unsigned-byte 8)
:if-does-not-exist nil)
(when in
(let* ((size (sb-posix:stat-size (sb-posix:stat path)))
(a (make-array size)))
(read-sequence a in)
a))))
(defun fetch-body (g i)
(article-body (parse-article (fetch-article g i))))
@ %def fetch-article
The purpose of [[encode-body]] is to eventually worry about the
appearance of a period on a line by itself in the middle of an
article. Since ``\verb|.\r\n|'' is part of the NNTP protocol, we must
handle this gracefully---but notice we have not done anything about
that so far. So we are essentially writing a bug right here.
<<How to parse articles>>=
(defun encode-body (a) a)
@ %def encode-body
The procedures [[extract-mid]] and [[lookup]] also belong belong in
this section. Notice that I also wrote [[mid-by-name]], which should
merge with [[extract-mid]]. I think I also wrote more
redundancies---perhaps in the implementatio nof [[xover]]---for not
using [[lookup]]. I need to seek out all such places and organize. %% TODO
<<How to parse articles>>=
(defun extract-mid (a)
(lookup "message-id" (parse-headers (article-headers (parse-article a)))))
(defun lookup (key table)
(cdr (assoc key table :test #'string=)))
@ %def extract-mid lookup
\section{Commands}
\subsection{{\tt HELP}}
When someone asks for help, we present a table of commands. The table
construction is made by [[menu]]. The procedure [[menu]] was one of
the first things I wrote. (This is my first program written in Common
Lisp.) I didn't want to get involved with the famous [[loop]] macro,
so I used recursion in [[menu]]\footnote{I'd like to isolate these
auxiliary procedures inside a single function that uses them. Common
Lisp offers me [[labels]], but [[labels]] don't seem so helpful when
I'm at the REPL. When I use [[defun]], I'm able to always invoke the
procedure at the REPL, but that's not so with [[labels]]. I guess the
use of [[labels]] is when the procedure is so trivial that we have no
reason to think we're doing to debug it.}
<<Command help>>=
(defun cmd-help (r)
(let ((lines (menu *commands-assoc*)))
(prepend-response-with
"What's on the menu today?"
(make-response :code 200 :multi-line 'yes
:request r
:data (str:join (crlf-string) lines)))))
(defun menu (ls)
(loop for item in ls
collect (display-fn item)))
(defun display-fn (cmd-pair)
(let ((cmd (cdr cmd-pair)))
(format nil "~A ~A"
(command-verb cmd)
(command-description cmd))))
@
\subsection{{\tt AUTHINFO}}\label{sec:authinfo}
The implementation of {\tt AUTHINFO}. When we connect to
\lp\ directly from a keyboard, it's a bit painful to authenticate with
two commands---{\tt AUTHINFO user} and {\tt AUTHINFO pass}. So we
also implemented the command {\tt LOGIN}---see
Section~\ref{sec:login}. To check the user's password, we use the
procedure [[pass?]] that's defined in the implementation of {\tt
PASSWD}. Perhaps we should have called it {\tt
is-password-correct?} or something more obvious.
<<authinfo error message>>=
Syntax error. Say ``authinfo USER /your-name/'' then ``authinfo PASS /your-pass/''.
@
<<Command authinfo>>=
(defun cmd-authinfo (r)
(let* ((args (mapcar #'str:upcase (request-args r))))
(cond
((not (= (length args) 2))
(bad-input r "No, no: I take exactly two arguments."))
(t
(multiple-value-bind (cmd arg) (apply #'values args)
(cond
((string= cmd "USER")
(setf (client-username *client*) arg)
(make-response :code 381 :request r
:data (format nil "Hey, ~a, please tell us your password." arg)))
((string= cmd "PASS")
(if (authinfo-check (client-username *client*) arg)
(progn
(log-user-in!)
(make-response
:code 281 :request r
:data (fmt "Welcome, ~a." (client-username *client*))))
(make-response :code 400 :request r :data "Sorry. Wrong password.")))
(t (make-response :code 400 :request r :data "<<authinfo error message>>"))))))))
(defun authinfo-check (username passwd)
(pass? username passwd))
(defun auth? ()
(eq 'yes (client-auth? *client*)))
(defun log-user-in! ()
(setf (client-auth? *client*) 'yes)
(let ((u (get-account (client-username *client*))))
(setf (account-seen u) (get-universal-time)))
(write-accounts!))
@ %def auth? log-user-in!
\subsection{{\tt CREATE-ACCOUNT}}
We allow authenticated members to invite their friends, which creates
a tree of people. ({\em An idea}. We could envision that each tree
trunk manages the rest of the tree underneath it. So I invite you, I
could change your password, say, or handle any problems you might
have. This decentralizes system administration, easing the support
burden.)
The name of each user must conform to the expression
<<Form of user names>>=
^[^\\s]+$
@
Same as in @<<Form of newsgroup names@>>. We'll let users create
whatever complicated user names they want. If they can type it up,
it's their problem.
<<Command create-account>>=
(defun user-name-conforms? (u)
(conforms-to? u "<<Form of user names>>"))
<<Command create-account>>=
(defun cmd-create-account (r)
(with-auth
(with-n-args 1 r
(let* ((args (mapcar #'str:upcase (request-args r)))
(username (car args)))
(multiple-value-bind (username pass-or-error) (new-account! username (client-username *client*))
(if (not username)
(make-response :code 400 :request r
:data (fmt "~a. Choose a new name." pass-or-error))
(progn
(notify-user-created username)
(make-response :code 200 :request r
:data (fmt "Okay, account ~a created with password ``~a''."
username pass-or-error)))))))))
@
<<Representation of accounts>>=
(defstruct account username seen last-post friends pass pass-locked pass-locked-why creation)
(defparameter *accounts* nil)
@ %def *accounts*
<<Command create-account>>=
(defun read-accounts! ()
(let ((*package* (find-package '#:loop)))
(with-open-file
(s "accounts.lisp"
:direction :input)
(setq *accounts* (read s))))
*accounts*)
(defun string->array (s)
(make-array (length s)
:element-type '(unsigned-byte 8)
:initial-contents (map 'vector #'char-code s)))
(defun string->sha256 (s)
(let ((d (ironclad:make-digest :sha256)))
(ironclad:produce-digest (ironclad:update-digest d (string->array s)))))
(defun new-account! (username invited-by)
(let* ((u (str:upcase username))
(p (random-string 6))
(a (make-account :username u
:pass (string->sha256 (str:upcase p))
:creation (get-universal-time))))
(multiple-value-bind (okay? reason) (user-name-conforms? u)
(declare (ignore reason))
(cond ((not okay?)
(values nil (fmt "username must conform to <<Form of user names>>")))
((get-account u)
(values nil (fmt "account ~a already exists" u)))
(t
(push u (account-friends (get-account invited-by)))
(push a *accounts*)
(write-accounts!)
(values (str:upcase username) p))))))
@ %def CREATE-ACCOUNT new-account!
Notice that we have a race condition in [[write-accounts]]. What is
the problem? Two processes in parallel may ask for the writing of
[[accounts.lisp]]. The process that loses the race will have its
modifications lost. What do we need to do? Either we use file
locking or we do something smarter without a real file locking
mechanism. It's not clear to me what is possible here, but this is
definitely a problem that we need to solve.
<<Command create-account>>=
(defun write-accounts! ()
(let ((name
(loop
(let* ((tmp (random-string 10))
(name (format nil "~a.tmp" tmp)))
(when
(ignore-errors
(with-open-file
(s name
:direction :output
:if-exists :error
:if-does-not-exist :create)
(write *accounts* :stream s)))
(return name))))))
(if (ignore-errors (rename-file name "accounts.lisp"))
(values t *accounts*)
(values nil (format nil "could not rename ~a to accounts.lisp" name)))))
(defun get-account (username)
(loop for u in *accounts*
do (when (string= (str:upcase username) (account-username u))
(return u))))
@ %def read-accounts! write-accounts! get-account
\subsection{{\tt UNLOCK-ACCOUNT}}
Inactive accounts are removed or locked---see Section
\ref{sec:inactive-users}. When an account is locked, any member can
unlock it.
<<Command create-account>>=
(defun cmd-unlock-account (r)
(with-auth
(with-n-args 1 r
(let* ((args (mapcar #'str:upcase (request-args r)))
(username (car args)))
(cond ((not (get-account username))
(make-response :code 400 :request r
:data (fmt "No such account ~a." username)))
((not (locked? username))
(make-response :code 400 :request r
:data (fmt "Can't unlock ~a because it's not locked." username)))
(t
(unlock-account! username)
(notify-user-unlocked username)
(make-response :code 200 :request r
:data (fmt "Okay, account ~a unlocked." username))))))))
(defun unlock-account! (username)
(let ((u (get-account username)))
(cond ((not u)
(values nil "no such account"))
((not (locked? username))
(values nil "account isn't locked"))
(t
(setf (account-pass u) (account-pass-locked u))
(setf (account-pass-locked u) nil)
(setf (account-pass-locked-why u) nil)))))
@ %def unlock-account!
\subsection{{\tt LOGIN}}\label{sec:login}
Besides {\tt AUTHINFO} in Section~\ref{sec:authinfo}, we also
implement a more convenient command for authenticationwhen we interact
with \lp\ through a command-line interface. Instead of having to say
two commands, we can just say {\tt login user password}.
<<Command login>>=
(defun cmd-login (r)
(let* ((args (mapcar #'str:upcase (request-args r))))
(cond
((not (= (length args) 2))
(bad-input r "Usage: login your-username your-password"))
(t
(multiple-value-bind (name pass) (apply #'values args)
(cond
((pass? name pass)
(log-user-in-as! name)
(make-response :code 200 :request r
:data (fmt "Welcome, ~a." name)))
(t (make-response :code 400 :request r
:data (fmt "Wrong password.")))))))))
(defun log-user-in-as! (name)
(setf (client-username *client*) name)
(log-user-in!))
@ %def log-user-in-as!
\subsection{{\tt PASSWD}}\label{sec:passwd}
A change of password is made with {\tt PASSWD current new}. Observe
that we are duplicating code from other command procedures. I think
there is a macro emerging here called [[with-upcase-args]]. %% TODO
<<Command passwd>>=
(defun cmd-passwd (r)
(with-auth
(let* ((args (mapcar #'str:upcase (request-args r))))
(cond
((not (= (length args) 2))
(bad-input r "Usage: passwd current-password new-password"))
(t
(multiple-value-bind (cur new) (apply #'values args)
(cond
((pass? (client-username *client*) cur)
(multiple-value-bind (okay? problem) (change-passwd! (client-username *client*) new)
(if okay?
(make-response :code 200 :request r
:data "You got it. Password changed.")
(make-response :code 500 :request r
:data (fmt "Sorry: ~a" problem)))))
(t (make-response :code 400 :request r
:data (fmt "Sorry. Wrong password."))))))))))
(defun pass? (username pass)
(let ((u (get-account username)))
(and
u
(cond ((null (account-pass u)) nil)
((integerp (account-pass u))
(eq (sxhash pass) (account-pass u)))
((arrayp (account-pass u))
(equalp (string->sha256 pass) (account-pass u)))
(t (error "pass? expected to find INTEGERP or ARRAYP but found ~a" (type-of (account-pass u))))))))
(defun change-passwd! (username newpass)
(let ((u (get-account username)))
(when (not u)
(error "I could not find account ~a." username))
(setf (account-pass u) (string->sha256 (str:upcase newpass)))
(write-accounts!)))
@ %def PASSWD pass? change-passwd!
\subsection{{\tt USERS}}\label{sec:users}
The tree of users and their friends is public. Anyone can know who
invited who.
<<Command users>>=
(defun cmd-list-users (r)
(with-auth
(prepend-response-with
"List of current users:"
(make-response
:code 200 :request r :multi-line 'yes
:data (str:join (crlf-string) (list-users))))))
(defun size-of-longest-username ()
(loop for u in *accounts*
maximizing (length (account-username u))))
(defun list-users ()
(mapcar
#'(lambda (row) (cadr row))
(sort
(loop for u in *accounts*
collect (list
(account-username u)
(fmt "~v@a~a, ~a, invited ~a"
(size-of-longest-username)
(account-username u)
(if (locked? (account-username u))
(fmt " (account locked: ~a)"
(account-pass-locked-why u))
"")
(if (last-time-seen (account-username u))
(fmt "last seen on ~a" (last-time-seen (account-username u)))
"never logged in")
(or (account-friends u) "nobody"))))
#'string<= :key #'(lambda (row) (car row)))))
(defun universal-to-human (s)
(format-timestring
nil
(universal-to-timestamp s)
:format +asctime-format+))
(defun last-time-seen (username)
(let ((u (get-account username)))
(if u (let ((s (account-seen u)))
(if s (universal-to-human s))))))
@ %def list-users last-time-seen size-of-longest-username
\subsection{{\tt LIST}}\label{sec:list}
The database of groups and articles is a UNIX directory. We just need
to discover which directories exist and produce a listing. The heavy
work here is finding the index interval of articles in the group. (I
think we should already be optimizing this by merely caching the
information in a file that is read at start-up. I think we should
even cache the overview of the group.) %% TODO
<<Command list>>=
(defstruct group name high low)
(defun cmd-list (r)
(prepend-response-with
"Get in the loop! Lots to choose from."
(make-response :code 215 :multi-line 'yes
:data (str:join (crlf-string) (build-groups-lines (build-groups-structs)))
:request r)))
(defun build-groups-lines (ls)
(reverse
(mapcar
#'(lambda (g)
(format nil "~a ~a ~a y" (group-name g) (group-high g) (group-low g)))
ls)))
(defun build-groups-structs ()
(let ((ret-ls nil))
(dolist (g (list-groups) ret-ls)
(multiple-value-bind (low high len) (group-high-low g)
(declare (ignore len))
(setf ret-ls (cons (make-group :name g :high high :low low) ret-ls))))))
(defun between? (x from to)
(<= from x to))
(declaim (inline between?))
(defun filesize (path)
(sb-posix:stat-size
(sb-posix:stat path)))
(defun zero-file? (path)
(= (filesize path) 0))
(defun temporary-article? (path)
(or (zero-file? path)
(cl-ppcre:scan "\.tmp$" (namestring path))))
(defun article-ready? (path)
(not (temporary-article? path)))
<<Workaround for list-directories and list-files>>
(defun get-articles (g &optional from to)
(in-groups ;; We might want to optimize this some day. That's a
;; problem to be studied.
(let ((as (articles->integers
(remove-if #'temporary-article? (loop-list-files (truename g))))))
(sort (remove-if-not
#'(lambda (x) (between? x (or from x) (or to x)))
as)
#'<))))
(defun group-high-low (g)
(let* ((articles (get-articles g))
(sorted-ints (sort articles #'<)))
(values (or (car sorted-ints) 0)
(or (car (last sorted-ints)) 0)
(length sorted-ints))))
(defun articles->integers (ls)
(remove-if #'null
(mapcar #'(lambda (g)
(ignore-errors
(parse-integer (basename (uiop:unix-namestring g)))))
ls)))
(defun list-groups ()
(let ((groups (in-groups (loop-list-directories (truename ".")))))
(sort (mapcar #'(lambda (g) (basename (uiop:unix-namestring g))) groups)
#'string-lessp)))
(defun last-char (s) (char s (1- (length s))))
@
In [[basename]], what we want to do is---if the path ends with a
slash, we invoke [[pathname]]. Otherwise, we invoke {\tt
file-namestring}.
<<Command list>>=
(defun basename (path)
(let ((s (str:collapse-whitespaces path)))
(if (char= #\/ (last-char s))
(car (last (pathname-directory s)))
(file-namestring s))))
@ %def get-articles group-high-low
To list directories and files, I have been using
\href{https://github.com/Shinmera/filesystem-utils/tree/master}{{\tt
filesystem-utils}} by Yukari Hafner. I found an issue with both
[[list-directories]] and [[list-files]] in a fresh install of
FreeBSD 14.2 and in a Debian 8.11 codename jessie. The issue is
that the [[#+cffi]] chunk of the source code incorrectly produced
[[NIL]]. (Dramatically, the same was not true in a FreeBSD
14.1.) The source code had an alternative chunk of code for
[[#-cffi]] and I discovered that this alternative worked on these
systems I tested. So, as a workaround, I incorporate these
procedures below using the chunk [[#-cffi]] to get \Lp\ working
on these systems.
<<Workaround for list-directories and list-files>>=
(defun loop-directory* (directory &rest args &key &allow-other-keys)
#+allegro (apply #'directory directory :directories-are-files NIL :follow-symbolic-links NIL args)
#+(or clozure digitool) (apply #'directory directory :follow-links NIL args)
#+clisp (apply #'directory directory :circle T :if-does-not-exist :ignore args)
#+(or cmucl scl) (apply #'directory directory :follow-links NIL :truenamep NIL args)
#+lispworks (apply #'directory directory :link-transparency NIL args)
#+sbcl (apply #'directory directory :resolve-symlinks NIL args)
#-(or allegro clozure digitool clisp cmucl scl lispworks sbcl)
(apply #'directory directory args))
(defun loop-list-files (directory)
(let ((directory (pathname-utils:to-directory directory)))
(let* ((directory (pathname-utils:pathname* directory))
(entries
(ignore-errors
(loop-directory*
(merge-pathnames pathname-utils:*wild-file* directory)))))
(remove-if #'directory-p entries))))
(defun loop-list-directories (directory)
(let ((directory (pathname-utils:to-directory directory)))
(let* (#-(or abcl cormanlisp genera xcl)
(wild (merge-pathnames
#-(or abcl allegro cmucl lispworks sbcl scl xcl)
pathname-utils:*wild-directory*
#+(or abcl allegro cmucl lispworks sbcl scl xcl) "*.*"
directory))
(dirs
#+(or abcl xcl) (system:list-directory directory)
#+cormanlisp (cl::directory-subdirs directory)
#+genera (handler-case (loop for (p . k) in (fs:directory-list directory)
when (eql :directory k) collect p)
(fs:directory-not-found () nil))
#+clozure (ignore-errors (directory* wild :directories T :files NIL))
#+mcl (ignore-errors (directory* wild :directories T))
#-(or abcl xcl cormanlisp genera clozure mcl) (loop-directory* wild)))
(loop for path in dirs
when (directory-p path)
collect (pathname-utils:force-directory path)))))
@ %def loop-list-directories loop-list-files
\subsection{{\tt GROUP}}\label{sec:group}
We just need to verify if the group exists and modify [[*client*]].
<<Command group>>=
(defun cmd-group (r)
(with-auth
(with-n-args 1 r
(let ((g (car (request-args r))))
(with-group g r
(set-group! g)
(multiple-value-bind (low high len) (group-high-low g)
(let ((ln (format nil "~a ~a ~a ~a" len low high g)))
(setf (client-article *client*) low)
(make-response :code 211 :request r :data ln))))))))
(defun group? (g)
(in-groups
(ignore-errors (directory-p g))))
(defun xgroup? (g)
(directory-p g))
(defun set-group! (g)
(setf (client-group *client*) g))
@ %def group?
Why have I written {\tt group?} and {\tt xgroup?}? There's probably a
clean-up task here. %% TODO
\subsection{{\tt BODY}, {\tt HEAD}, {\tt ARTICLE} e {\tt NEXT}}
\label{sec:typical-cmds}
Here we have an illustration of the expressive power of first-class
procedures. We want to implement the commands {\tt BODY}, {\tt HEAD},
{\tt ARTICLE} and {\tt NEXT}. The ways to invoke them are (1) with no
argument, (2) with a single integer argument and (3) with a text
argument. In case (1), we use the number of the current
article---which is kept in [[*client*]]. In case (2), the NNTP client
specifies the article number. In case (3), the NNTP client specifies
the exact {\tt message-id}, which forces us to query the index---see
Section~\ref{sec:index} for the implementation of the index.
<<Commands head, body, article>>=
(defun typical-cmd-head-body-article (r fn-name)
(with-auth
(with-group-set
(let ((args (request-args r)))
(cond ((null args)
(funcall fn-name r (client-group *client*) (client-article *client*)))
((= 1 (length args))
(let* ((n-or-mid (car args)))
(cond ((string-integer? n-or-mid)
(funcall fn-name r (client-group *client*) n-or-mid))
(t (multiple-value-bind (group n-str) (lookup-index n-or-mid)
(if (and group n-str)
(funcall fn-name r group n-str)
(bad-input r (format nil "Unknown article ~a." n-or-mid))))))))
(t (bad-input r "No, no: it takes at most two arguments.")))))))
(defun cmd-head (r)
(typical-cmd-head-body-article r #'head-response))
(defun cmd-body (r)
(typical-cmd-head-body-article r #'body-response))
(defun cmd-article (r)
(typical-cmd-head-body-article r #'article-response))
(defun article-response (r g i)
(typical-cmd-response 220 r g i #'(lambda (a) (encode-body a))))
(defun head-response (r g i)
(typical-cmd-response 221 r g i #'(lambda (a) (article-headers (parse-article a)))))
(defun body-response (r g i)
(typical-cmd-response 222 r g i #'(lambda (a) (encode-body (article-body (parse-article a))))))
@
In processing such commands, we need to verify the existence of files
{\em et cetera}. The procedure [[fetch-article]] returns [[nil]] when
the article requested doesn't exist and it throws
[[sb-posix:syscall-error]] due to [[sb-posix:stat-size]], which we
need to find the size of the article. We need to know the file size
not only to allocate an array at the right size, but also because we
must provide the size when producing the \verb|OVERVIEW|. If a
problem such as [[sb-posix:syscall-error]] appears, we just inform the
client and terminate the request---nothing else to do.
XXX: instead of only catching [[sb-posix:syscall-error]], we should
catch anything else, reporting the error. Otherwise, we will blow up
in case of some unexpected error, which might not be a bad idea---as
long as we can log these errors and get a report later on of what's
going on so we can improve the code. I still don't really know what
to do here. Let's leave it as it is. The original idea is to put a
[[t]]-case in the [[handler-case]] below and just log the error
instead of crashing completely. We can simulate the catching of an
unexpected condition by signaling it from fetch-article as a test.
This type of situation should have a testing routine as well. So,
yeah, first give yourself another read of the [[lisp-unit]]
documentation, then how to handle conditions properly and then come
back to this to-do item.
<<Commands head, body, article>>=
(defun typical-cmd-response (code r g i get-data)
(handler-case
(let ((a (fetch-article g i)))
(cond ((null a)
(make-response
:code 400 :request r
:data (format nil "article ~a/~a does not exist" g i)))
(t
(prepend-response-with
(format nil "~a ~a" i (extract-mid a))
(make-response :multi-line 'yes :code code
:request r :data (funcall get-data a))))))
(sb-posix:syscall-error (c)
(make-response
:code 400 :request r
:data (format nil "article ~a/~a: ~a" g i c)))))
@ %def typical-cmd-response
The command \verb|NEXT| has a slight different semantics.
<<Commands head, body, article>>=
(defun cmd-next (r)
(with-auth
(let ((g (client-group *client*))
(n-cur (client-article *client*)))
(cond
((not g) (bad-input :code 412 r "must say GROUP first"))
(t (multiple-value-bind (low high len) (group-high-low g)
(declare (ignore low len))
(cond ((= n-cur high) (bad-input r "you are at the last article already"))
(t (article-next! r g)))))))))
(defun article-next! (r g)
(setf (client-article *client*) (1+ (client-article *client*)))
(let ((cur (client-article *client*)))
(make-response :code 223
:request r
:data (format nil "~a ~a" cur (mid-by-name g cur)))))
(defun mid-by-name (g name)
(extract-mid (fetch-article g name)))
@ %def cmd-next article-next! mid-by-name
\subsection{{\tt XOVER}}\label{sec:xover}
The procedure [[cmd-xover]] is used to figure out what the user said.
Once we have that figured out, we invoke [[xover]], which finishes the
work. Notice that when the argument [[to]] from [[xover]] is [[NIL]],
then the user is asking for articles indexed from the integer [[fr]]
to the last one.
<<Command xover>>=
(defun cmd-xover (r)
(with-auth
(with-group-set
(let ((args (request-args r)))
(cond ((null args)
(xover r (client-article *client*) (client-article *client*)))
((= 1 (length args))
(multiple-value-bind (s v)
(cl-ppcre:scan-to-strings "([0-9]+)([-]?)([0-9]*)" (car args))
(cond
((not s) (make-response :code 502 :request r :data "bad syntax"))
(t (let ((fr (parse-integer (aref v 0)))
(hifen (aref v 1))
(to (ignore-errors (parse-integer (aref v 2)))))
(when (not (string= hifen "-"))
(setq to fr))
(xover r fr to))))))
(t (make-response :code 502 :request r :data "bad syntax")))))))
(defun xover (r from to)
(assert (client-group *client*))
(let* ((g (client-group *client*))
(ls (get-articles g from to)))
(cond ((= 0 (length ls))
(make-response :code 420 :request r :data "no articles in the range"))
(t
(prepend-response-with
"Okay, your overview follows..."
(make-response
:code 224 :request r :multi-line 'yes
:data (str:join
(crlf-string)
(loop for i in ls
collect (xover-format-line
i
(remove-if-not
#'(lambda (h)
(member (car h) (xover-headers)
:test #'string=))
(fetch-headers g i)))))))))))
(defun xover-format-line (i hs)
(str:concat (format nil "~a~a" i #\tab)
(str:join #\tab
(mapcar #'(lambda (h) (get-header h hs))
(xover-headers)))))
(defun xover-headers ()
'("subject" "from" "date" "message-id" "references" "line-count" "byte-count"))
@
\subsection{{\tt MODE READER}}\label{sec:mode-reader}
So, we're always in reader mode, so we just ignore this command.
<<Command mode reader>>=
(defun cmd-mode (r) ;; Whatever.
(make-response :code 200 :request r :data "Sure thing."))
@ %def
\subsection{{\tt DATE}}\label{sec:date}
It's always useful to know the time and date at a computer. We should
surely format it a bit better than what {\tt now} does.
<<Command date>>=
(defun cmd-date (r)
(make-response :code 201
:request r
:data
(format-timestring nil (now))))
@ %def
\subsection{{\tt QUIT}}\label{sec:quit}
The use of {\tt QUIT} has a conection to [[main-loop]]: when the user
says {\tt QUIT}, [[main-loop]] must terminate.
<<Command quit>>=
(defun cmd-quit (r)
(make-response :code 205 :data "Good-bye." :request r))
@ %def
\subsection{{\tt DD}}\label{sec:dd}
The command {\tt DD} means ``[d]isplay client [d]ata structures''. It
shows to the client the internal state of how the server sees it.
I've used only for debugging and it's not really useful any longer.
I'm going to remove this very soon.
<<Command dd>>=
(defun cmd-dd (r)
(make-response :code 200 :data (format nil "state: ~a" *client*) :request r))
@ %def
\subsection{{\tt POST}}\label{sec:post}
If the client says {\tt POST}, then we continue to read line after
line until we find \verb|".\r\n"|. Having done that, we must check
whether we have a conformant article at hands. The definition of
conformant is given by [[conforms?]]. In a few words, the article
must have \verb|message-id|, \verb|subject|, \verb|from|,
\verb|newsgroups|. If the client doesn't provide us with a
\verb|message-id|, then \lp\ adds one. (Similarly for \verb|date|.)
<<Command post>>=
<<Does an article conform?>>
(defun suggest-message-id (&optional (n 20))
(format nil "<~a@loop>" (random-string n)))
(defun random-string (size)
(let* ((universe "abcdefghijklmnopqrstuvwxyz")
(len (length universe))
(state (make-random-state t))
mid)
(dotimes (c size)
(setq mid (cons (char universe (random len state)) mid)))
(coerce mid 'string)))
@
Sometimes we parse an article and sometimes we want to undo that
parsing. Am I doing something wrong? I wonder.
<<Command post>>=
(defun unparse-article (parsed)
(data
(let ((ls))
(dolist (h (parse-headers (article-headers parsed)))
(setq ls (cons (data (str:capitalize (car h)) ": " (cdr h) (crlf)) ls)))
(nreverse ls))
(crlf)
(article-body parsed)))
@ %def unparse-article
If an article being posted has no \verb|message-id| or \verb|date|,
\lp\ provides these headers. We kill these two rabbits with
[[ensure-header]], but we should probably make a table of headers and
procedures that would generate such headers if they're missing. Right
now, however, we have only these two to worry about.
<<Command post>>=
(defun ensure-header (h fn bs)
(let* ((headers (parse-headers (article-headers (parse-article bs)))))
(if (lookup h headers)
bs
(unparse-article
(make-article
:headers
(str:join (crlf-string)
(mapcar #'(lambda (h)
(format nil "~a: ~a" (car h) (cdr h)))
(cons (cons h (funcall fn)) headers)))
:body (article-body (parse-article bs)))))))
(defun get-date ()
(multiple-value-bind (s m h day mon year dow dst-p tz)
(get-decoded-time)
(declare (ignore dow dst-p))
(format nil "~4,'0d-~2,'0d-~2,'0d ~2,'0d:~2,'0d:~2,'0d GMT~a"
year mon day h m s (- tz))))
(defun ensure-mid (bs)
(ensure-header "message-id" #'suggest-message-id bs))
(defun ensure-date (bs)
(ensure-header "date" #'get-date bs))
@ %def ensure-mid ensure-date
Now it's time to look at the header \verb|newsgroups|. XXX: we need
to rewrite this because we have that plan of verifying everything
there is to verify up front in [[conforms?]]. So when we invoke
[[post]], there's nothing else to verify. We're verifying in two
places at the same time.
The name of each group must conform to the expression
<<Form of newsgroup names>>=
^[^\\s/]+$
@
In other words, let group names go wild. They cannot contain a slash
or a space of any kind anywhere on the name---more literally: they
must begin with any character that's not a space and must have at
least one character. The problem wish slashes is that each group will
be a directory on a UNIX file system, so we cannot let slashes appear.
People should have total freedom in naming groups. If users create
groups that mess up the local organization, then people should discuss
the matter and find a solution. Let's let people mess it up instead
of trying to stop them.
<<Command post>>=
(defun cmd-post (r)
(with-auth
(send-response!
(make-response
:code 340
:data (format nil "Okay, go ahead. Suggested message-id ~a."
(suggest-message-id))))
(let* ((bs (nntp-read-article)))
(multiple-value-bind (okay? error) (conforms? bs)
(cond ((not okay?)
(make-response
:code 400 :request r
:data (format nil "Sorry. Your article doesn't conform: ~a." error)))
(t (multiple-value-bind (code reply) (post bs)
(make-response :code code :request r :data reply))))))))
@
It's time to write the action of posting. One thing to keep in mind
is cross-posting. First, notice that we're---so far---duplicating
articles on the file system. (We will undo that once we reimplement
our index. More to follow.) More importantly, we cannot let the user
post to any group if one of the groups is incorrectly named---for
example, when the group doesn't exist. Why don't we post to the ones
that are correct and warn the user of the ones that are incorrect?
Because that is not prudent. The user could be trying to publish news
to be received at the same time by various groups. We would make such
plans all go down the drain.
We collect a list of newsgroups that don't exist (or whose names do
not conform for any reason). If we find any such group, then we
refuse posting and return a 400 code with a message describing which
group names failed. Otherwise we save the article.
<<Command post>>=
(defun post (bs)
(let ((ngs (newsgroups-header->list
(get-header "newsgroups" (parse-headers
(article-headers
(parse-article bs))))))
(ngs-dont-exist))
(dolist (g ngs)
(if (or (not (group-name-conforms? g))
(not (group? g)))
(push g ngs-dont-exist)))
(if (zerop (length ngs-dont-exist))
(progn
(dolist (ng ngs)
(let ((a (ensure-date (ensure-mid bs))))
(save-article-insist ng (get-next-article-id ng) a (extract-mid a))
(update-last-post-date! (client-username *client*))))
(values 240 (data "Thank you! Your article has been saved.")))
(values 400 (data "Sorry. We did not post your article to any newsgroup because "
"the " (word-plural (length ngs-dont-exist) "newsgroup") " "
(str:join ", " (sort ngs-dont-exist #'string<))
" just " (word-plural (length ngs-dont-exist) "doesn't") " exist.")))))
(defun newsgroups-header->list (s)
(mapcar #'(lambda (n) (str:trim (string-downcase n))) (str:split "," s)))
@ %def post
XXX: notice we parse the article again to extract information from it
that we need during [[post]]. That's not only a waste of
time---because we already did that---, but it makes [[post]] a lot
less generic. Perhaps [[conforms?]] should return a data structure
that contains all that [[post]] needs. Then [[post]] consumes that
and saves the article more easily. That's a better idea. I think
[[post]] should not even use variables such as [[*client*]]. The
username to which to update the last-seen date should be included in
the data structure.
<<Command post>>=
(defun update-last-post-date! (username)
(let ((u (get-account username)))
(setf (account-last-post u) (get-universal-time))))
@ %def update-last-post-date!
The job of [[save-article-try]] is to atomically allocate an article
name in a group. The atomicity is achieved with [[:if-exists]] in the
[[with-open-file]] macro. When [[if:-exists]] is used, the {\tt open}
system call uses the flag \verb|O_EXCL|, given us an atomic operation.
So, the first [[with-open-file]] allocates the name. If successful,
we take our time writing the article to the temporary file and we
atomically rename it at the end. We should close [[name]] before
trying to [[rename-no-extention]]---on UNIX systems, renaming an open
target might be okay, but that's not allowed on Windows systems. Even
though we have no interest in running \Lp\ on Windows, closing before
renaming it sounds more like the Right Thing to do.
<<Command post>>=
(defun rename-no-extension (old new)
(rename-file old (make-pathname :name new :type :unspecific)))
(defun save-article-try (name-try bs)
(let ((name (format nil "~a" name-try))
(tmp (format nil "~a.tmp" name-try)))
(with-open-file
(s name
:direction :output
:if-exists nil ;; an atomic operation
:if-does-not-exist :create)
(when (null s)
(progn
(stderr "warning: save-article-try: ~a exists~%" name)
(return-from save-article-try 'name-exists))))
(with-open-file
(s tmp
:direction :output
:if-exists :error
:if-does-not-exist :create
:element-type '(unsigned-byte 8))
(write-sequence bs s))
(rename-no-extension tmp name)))
@ %def save-article-try
The procedure [[save-article-insist]] insists on calling
[[save-article-try]] until it finds an article name that has not been
allocated. Notice that the argument [[name]] is an integer, so
[[name]] is incremented at each iteration.
<<Command post>>=
(defun save-article-insist (g name a message-id)
(loop for name from name do
(in-dir (format nil "groups/~a/" g)
(when (not (eql 'name-exists (save-article-try name a)))
(return (values name (insert-index message-id g (fmt "~a" name))))))))
(defun get-next-article-id (g)
(multiple-value-bind (low high len) (group-high-low g)
(declare (ignore low len))
(1+ high)))
@ %def save-article-insist get-next-article-id
{\bf How to read lines in the NNTP protocol?} We've implemented the
most trivial strategy possible. It's also the slowest. What I think
we need to do here is to use [[vector-push-extend]]. But this is to
be done in [[nntp-read-line]]. I hope to be able to get a faster
procedure in [[nntp-read-line]] and keep [[nntp-read-article]] as it
is. This is important to speed up posting. For instance, if we allow
attachments (which we don't), the performance penalty is clearly
noticeable. %%TODO
<<Command post>>=
(defun nntp-read-article (&optional acc)
;; Returns List-of Byte.
(let* ((ls (ucs-2->ascii (nntp-read-line))))
(cond ;; 46 == (byte #\.)
((equal (list 46) ls) (flatten (add-crlf-between acc)))
(t (nntp-read-article (append acc (list ls)))))))
@ %def nntp-read-article
The NNTP protocol establishes that line termination is done with
\verb|\r\n|, but it's useful to support UNIX line terminations, too,
because we are using the command-line and interacting directly with
the server using tools that will not always send \verb|\r\n| as line
termination. For example, when someone is typing directly from the
keyboard and insert an empty line, we need the {\tt (and acc ...)}
because sometimes the list [[acc]] comes out empty, but an empty line
never comes from the NNTP protocol because there's is always a {\tt
CR} before {\tt LF}.
<<Command post>>=
(defun nntp-read-line (&optional (s *standard-input*) acc)
;; Returns List-of Byte.
(let ((x (read-byte s)))
(cond ((or (null x) (= x 10))
(let ((bs (and acc (nreverse (if (= (car acc) 13) (cdr acc) acc)))))
(stderr "<<< ~a~%" (bytes->string (ucs-2->ascii bs)))
bs))
(t (nntp-read-line s (cons x acc))))))
(defun list->bytes (ls)
(mapcar #'data->bytes ls))
(defun vector->bytes (v)
(mapcar #'data->bytes (coerce v 'list)))
(defun data->bytes (d)
(cond ((null d) nil)
((integerp d) (list d))
((stringp d) (string->bytes d))
((consp d) (list->bytes d))
((vectorp d) (vector->bytes d))
(t (error (format nil "type ~a is not supported" (type-of d))))))
(defun add-crlf-between (ls-of-ls)
;; Add \r\n to each ``line''. Returns List-of Byte.
(mapcar #'(lambda (ls) (append ls (list 13 10))) ls-of-ls))
(defun string->bytes (s)
(map 'list #'char-code s))
(defun bytes->string (ls)
(map 'string #'code-char ls))
@ %def nntp-read-line nntp-read-article
{\bf Does the article conform?} RFC 5536 obliges every article to
have exactly \verb|date|, \verb|from|, \verb|message-id|,
\verb|newsgroups|, \verb|path| and \verb|subject| headers. The
headers \verb|path| is \Lp's responsibility, but it's used only in a
network of servers---so we're leaving that out for now. The header
\verb|message-id| is added by \Lp\ if the client doesn't write it
itself.
Let's criticize the writing of [[conforms?]]. We have a [[catch]]
here and a [[throw]] in [[parse-headers]]. We also have a [[return]]
here. It's getting hard to read this procedure because it's not easy
to know that a procedure has to return a certain value to match the
expectation of another procedure. I don't remember what [[catch]]
does. I need to review this and then add the explanation for myself.
If I don't remember how this works, other beginners won't know it
either. %% TODO
<<Does an article conform?>>=
(defun conforms? (bs)
(catch 'article-syntax-error ;; parse-headers might throw
(let ((headers (parse-headers (article-headers (parse-article bs)))))
(let ((result (dolist (h (headers-required-from-clients))
(when (not (lookup h headers))
(return (format nil "missing the /~a/ header" h)))))
(content-type (get-header "content-type" headers)))
(cond
((stringp result) (values nil result))
((not (text/plain? content-type))
(values nil (format nil "content-type must be plain/text, but it's ~a" content-type)))
(t (values t nil)))))))
(defun text/plain? (header-s)
;; I say T when S begins with "text/plain" or when S is "".
(let* ((s (str:collapse-whitespaces header-s))
(needle "text/plain")
(len (min (length needle) (length s))))
(or (zerop len)
(and (<= (length needle) (length s))
(string= needle s :end1 len :end2 len)))))
(defun headers-required-from-clients ()
'("from" "newsgroups" "subject"))
@ %def conforms?
Notice that up to this point we've only verified if the necessary
headers are present. The \verb|newsgroups| header is a direct
influence to the article storage. For instance, \verb|newsgroups|
must mention only groups that exist. When we save the article, we
check each group. If at least one group exists, we save the article;
if at least one group doesn't exist, we report to the user all groups
that don't exist, but we do save the article if at least one does
exist. That's probably not the best thing to do. We should probably
warn the user that one group doesn't exist because that could make all
the difference to the user. For instance, someone might decide not to
post at all if they can't cross post to all the groups they wish to.
One typo in one group name and the article would be posted to some
groups, but not to the misstyped one. We need to change this. %% TODO
Also, do notice that to simplify matters we're duplicating articles
cross-posted. What we should do is write the article to the first
group in the list of \verb|newsgroups| and then make a symbolic link
to all others. The problem is that I don't know how to do that on
Windows. I'm not sure if Windows supports symbolic links at all. We
could perhaps duplicate articles only when on Windows. %% TODO
\subsection{{\tt CREATE-GROUP}}
We allow every user to create their own groups. (Are we crazy?) When
someone craetes a group, we post an article to {\tt
local.control.news} notifying everyone that a new group has been
created. People then have a chance to subscribe to the new group. We
assume that when someone creates a group, it's either no problem at
all or it has been discussed with the community beforehand.
<<Command create-group>>=
(defun cmd-create-group (r)
(with-n-args 1 r
(let ((g (string-downcase (car (request-args r)))))
(multiple-value-bind (okay? reason)
(group-name-conforms? g)
(if (not okay?)
(make-response :code 580 :request r
:data (format nil "group name does not conform: ~a" reason))
(progn
(multiple-value-bind (path created?)
(in-groups (ensure-directories-exist (concatenate 'string g "/")))
(declare (ignore created?))
(if (not path)
(make-response :code 581 :request r
:data (format nil "could not create group ~a"
(if (group? g)
"because it already exists"
"but we don't know why---sorry!")))
(progn
(notify-group-created g)
(make-response :code 280 :request r
:data (format nil "group ~a created" g)))))))))))
(defun group-name-conforms? (g)
(conforms-to? g "<<Form of newsgroup names>>"))
@ %def CREATE-GROUP group-name-conforms?
\subsection{{\tt REPL}}
\lp\ is totally {\em hackable}. Users can say {\tt repl} to have
complete control over their \lxxp\ process.
<<Command repl>>=
(defun cmd-repl (r)
(if *enable-nntp-repl*
(with-auth
(repl r))
(make-response
:code 400
:data "The REPL has been *explicitly* disabled by the sysadmin. :("
:request r)))
@
If your users are not the hacker-type, you can disable the NNTP REPL
with the command-line option [[--disable-nntp-repl]]. We decide not
to hide the command in the list of commands given by saying {\tt HELP}
to \lp\ because this way users are advertised about the commands that
exist---they could be having fun, but their sysadmin doesn't think
they're skilled enough.
<<Global variable that decides whether to enable the NNTP REPL>>=
(defparameter *enable-nntp-repl* t)
@
\section{Publication of news}
If you're interested in being notified about what's going on in the
\lp, then subscribe to {\tt local.control.news}. Group creation,
invitations {\em et cetera} are published there.
<<Broadcasting>>=
(defun notify-group-created (g)
(post-notification
:subject (fmt "new group ~a by ~a" g (client-username *client*))
:body (fmt "Blame ~a for the group ~a just created." (client-username *client*) g)))
(defun notify-user-created (u)
(post-notification
:subject (fmt "new account ~a by ~a" u (client-username *client*))
:body (fmt "Blame ~a for inviting ~a." (client-username *client*) u)))
(defun notify-user-unlocked (u)
(let ((guilty (client-username *client*)))
(post-notification
:subject (fmt "account ~a unlocked by ~a" u guilty)
:body (fmt "Blame ~a for unlocking ~a." guilty u))))
(defun post-notification (&key subject body)
(in-groups (ensure-directories-exist "local.control.news/"))
(when (group? "local.control.news")
(let ((a (make-news :subject subject :body body)))
(post (concatenate 'vector (article-headers a) (crlf) (article-body a) (crlf))))))
(defun make-news (&key subject body)
(make-article
:headers (data
(add-crlf-between
(mapcar
#'(lambda (p) (data (format nil "~a: ~a" (car p) (cdr p))))
`(("from" . "Loop")
("subject" . ,subject)
("newsgroups" . "local.control.news")))))
:body (data body)))
@ %def notify-group-created notify-user-created
\section{Algorithm of {\tt split-vector}}
How should we describe the algorithm of [[split-vector]]? The
objective is to consume lists of bytes such as
[[(1 2 3 13 10 4 5 6 13 10 7 8 9)]] and produce lists of bytes
such as [[((1 2 3) (4 5 6) (7 8 9))]]. We use [[search]] to find [[13 10]].
When we find this line termination, we have its position in the
consumed list. With the position, we collect the line and we iterate
searching for the next line.
<<How to split a stream into lines>>=
(defun split-vector (delim v acc &key limit (so-far 1))
(let ((len (length v)))
(split-vector-helper delim v len acc limit so-far 0)))
(defun split-vector-helper (delim v len acc limit so-far start)
(if (zerop len)
acc
(let ((pos (search delim v :start2 start :end2 len)))
(cond ((or (not pos) (and limit (= so-far limit)))
(nreverse (cons (subseq v start len) acc)))
(t (split-vector-helper
delim
v
len
(cons (subseq v start (or pos len)) acc)
limit
(1+ so-far)
(+ pos (length delim))))))))
@ %def split-vector
\section{Article index}\label{sec:index}
Every NNTP server needs to have an index of articles. Each article is
indexed by its message-id. For example, the article
%
\begin{verbatim}
From: Eva Lu <evalu@tor.soy>
Newsgroups: comp.unix
Subject: xv6: an introduction
Date: Wed, 24 Jul 2024 22:42:50 -0300
Message-ID: <87plr25js5.fsf@tor.soy>
Cancel-Lock: sha1:82E8hqL1D1WQ9xc+CfnEYbFAaJo=
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
\end{verbatim}
%
is indexed by the header {\tt message-id}. If you ask for article
{\tt <87plr25js5.fsf@tor.soy>}, \Lp\ will tell you that you can find
the article in group {\tt comp.unix} and its numeric ID is 37. Note
that the number 37 is not inside the article, but only in the name of
the file stored in the file system. The index, therefore, knows in
which file is each {\tt message-id}. This fact implies that you
cannot rename files in the file system---of course not: you'd changing
identifiers inside a database. If you have rename a file, you will
need to rebuild the index. Given that the index is an SQL table, you
can adjust the index relative to the any files you may need to rename
for whatever reason. You can also rebuild the entire index by reading
the file system---unless you have a lot of files, that's probably the
easiest thing to do.
The use of [[*default-database*]] by the library [[clsql]] is very
convenient for us: we don't need to specify with which database we're
working. Since we work with only one, we pretty much never need to
specify anything.
<<Reference to the database>>=
(defparameter *default-database* nil)
@ %def *default-database*
<<How to create and connect to the index>>=
(defun connect-index! (filename)
(setq *default-database* (clsql:connect (list filename) :database-type :sqlite3)))
(defun create-index! ()
(clsql:execute-command "create table if not exists indices
(id varchar(1000), grp varchar(1000), article varchar(300))")
(clsql:execute-command "create unique index if not exists idx_id_1
on indices (id)"))
(defun drop-create-index! ()
(clsql:execute-command "drop table if exists indices")
(create-index!))
@ %def create-index! drop-create-index! connect-index!
Of course, the creation and connection of the index must occur before
[[main-loop]], so it takes place in [[main]].
When someone requests an article, it's either by its numeric index but
then the client has already chosen a group, or it's by its {\tt
message-id}. We don't need to tell the client to which groups the
article belongs; we just give the entire article to the client. It
is, therefore, the client's responsibility what to do with the
article. However, to fetch an article, we need to know where in the
database (in the file system) is the article; in other words, we must
know one group in which the article was stored. This implies that the
index must know at least one group. We've decided to always index the
first group in the {\tt newsgroups} header. So the index's anatomy is
$(m, g, i)$, where $m$ is the {\tt message-id}, $g$ is the name of the
group and $i$, is the name of the article in the file system. This
also defines the anatomy of the SQL table.
Should we store more information in the index? Not really. If we
need anything about an article, we can get it after we fetch it from
the file system. For example, suppose that a search command wishes to
display the fact that article was posted in various groups. Suppose
further the command has already located in the index an article to be
displayed. This means the command has the {\tt message-id} and one of
the groups in which the article was posted. The command is then able
to fetch the entire article from the file system. Now it's a matter
of reading the article itself to know almost everything there is to
know about it. (It's also interesting that we keep the index thin
because we need to allow it to grow to great sizes.)
%% (clsql:create-table "INDICES" '(([id] (string 1000)) ([grp] (string 1000)) ([article] (string 300))))
%% (clsql:execute-command "create table if not exists indices (id varchar(1000), grp varchar(1000), article varchar(300))")
%% LOOP> (clsql:create-index "idx_id_1" :on "indices" :attributes "id" :unique "id")
%% ; No value
%% (clsql:execute-command "CREATE UNIQUE INDEX if not exists idx_id_1 ON INDICES (ID)")
%% CL-USER> (clsql:list-attributes "indices")
%% ("ID" "GRP" "ARTICLE")
%% CL-USER> (clsql:insert-records :into "indices" :attributes '(id grp article) :values '("<87plr25js5.fsf@tor.soy>" "comp.unix" 37))
%% ; No value
%% CL-USER> (clsql:select 'id 'grp 'article :from "indices")
%% (("<87plr25js5.fsf@tor.soy>" "comp.unix"))
%% ("ID" "GRP")
\section{Essential operations relative to the index}
Here's how to query the index or how to insert a new article into it.
If [[insert-index]] returns [[nil]], then it's because it found no
errors at all. The return value, therefore, indicates which error
ocurred.
We don't consider an error that an article has already been added to
the index. For instance, we write multiple messages to the storage
when someone cross-posts, but we'll add just a single record to the
index, of course. So, an article already indexed is normal situation.
Sure---in the future, we will not duplicate articles in storage; we
will make symbolic links. We don't do that right now because Windows
doesn't really support symbolic links.
<<Essential operations relative to the index>>=
(defun insert-index (m g i)
(handler-case
(clsql:insert-records
:into "indices"
:attributes '(id grp article)
:values (list (str:trim m) (str:trim g) (str:trim i)))
(clsql-sys:sql-database-data-error (c)
(cond ((= (slot-value c 'clsql-sys::error-id) 19)
'already-indexed)
(t
; We should log this error.
;(format t "other error: ~a" (slot-value c 'clsql-sys::database-message))
'sql-error)))
(:no-error ()
nil)))
(defun lookup-index (mid)
(let* ((found (clsql:select [grp] [article]
:from [indices]
:where [= [id] (str:trim mid)]))
(article (first found))
(grp (first article))
(art (second article)))
(when found
(values grp art))))
@ %def insert-index lookup-index
\section{Procedure to import the index from the file system}
%% get group:
%% (first (last (pathname-directory (car (in-groups (directory "**/*"))))))
%% get article name:
%% (pathname-name (car (in-groups (directory "**/*"))))
%% get all articles
%% LOOP> (in-groups (directory "**/*"))
%% get the newsgroup header from the article
%% LOOP> (get-header-from-article "newsgroups" (fetch-article "comp.lang.lisp" "1"))
When we build the index from scratch, it's important to have a
procedure capable of reading all articles in the file system and index
them one by one. This is what we implement right here. For each news
group, we get the name of the file relative to the article---that's
the numeric ID of the article---and its {\tt message-id}. That's all
we need to index it.
<<How to create an index from the file system>>=
(defun index-from-fs! ()
(loop for path in (in-groups (directory "**/*"))
do (let* ((g (str:trim (first (last (pathname-directory path)))))
(i (str:trim (pathname-name path)))
(m (str:trim (extract-mid (fetch-article g i)))))
(when (> (length m) 0)
(format t "article ~a/~a indexed by ~a~%" g i m)
(insert-index m g i)))))
(defun remake-index-from-fs ()
(drop-create-index!)
(index-from-fs!))
@
Here's a program to build the index from a UNIX shell.
<<build-index-from-fs.lisp>>=
<<Quicklisp loading preamble>>
(ql:quickload :loop :silent t)
(in-package #:loop)
(connect-index! "message-id.db")
(drop-create-index!)
(index-from-fs!)
(format t "Index built.~%")
@
Be careful when using this program: it will build the database
[[message-id.db]], which is an operation that needs to be done only
once. Here's how to use it:
%
\begin{verbatim}
%pwd
/home/dbastos/loop
%sbcl --script build-index-from-fs.lisp
Index built.
%ls -l message-id.db
-rw-r--r-- 1 dbastos wheel 65536 Aug 26 13:32 message-id.db
\end{verbatim}
\section{Deletion and locking of inactive accounts}\label{sec:inactive-users}
We now implement some of the \hyperref[principles]{principles} exposed
earlier on page~\pageref{principles}. The program
@<<cron-remove-inactive-users.lisp@>> would be run by {\tt cron} every
day (at midnight, say). It checks all accounts that are inactive and
either locks them (to be deleted later) or deletes them {\em for
good}. If you want to keep accounts forever, just don't run the
program. XXX: our idea is to also delete {\em for good} all accounts
that are locked (by the same period of time), but we have not yet done
that.
<<cron-remove-inactive-users.lisp>>=
<<Quicklisp loading preamble>>
(ql:quickload :loop :silent t)
(in-package #:loop)
(read-accounts!)
(connect-index! "message-id.db")
(remove-inactive-users!)
(write-accounts!)
@ %def cron-remove-inactive-users.lisp
The entire program is really [[remove-inactive-users!]].
<<How to remove inactive users>>=
(defun remove-inactive-users! ()
(loop for u in *accounts* do
(let ((username (account-username u)))
(cond ((and (not (locked? username))
(inactive-from-never-logged-in? username))
(post-notification
:subject (fmt "account ~a removed by Loop" username)
:body (fmt "~a didn't log in a first time (for ~a month~a) since account creation."
username *months-never-logged-in*
(plural *months-never-logged-in* "s")))
(remove-account! username)
(format t "Removed ~a due to never logging in.~%" username))
((and (not (locked? username))
(inactive-from-last-seen? username))
(post-notification
:subject (fmt "account ~a locked by Loop" username)
:body (fmt "~a disappeared for over ~a month~a."
username *months-inactive-allowed*
(plural *months-inactive-allowed* "s")))
(lock-account! username
(fmt "disappeared for over ~a months"
*months-inactive-allowed*))
(format t "Locked ~a due to long-time-no-see.~%" username))))))
@ %def remove-inactive-users!
To remove an account, we need to first remove the username (to be
removed) from anyone's list of friends. So, this involves scanning
the entire list of accounts. Also, notice that delete ``may modify
{\em sequence}''. More important is to understand that we really must
{\tt setf} the return, otherwise we might find the deletion did not
take effect---for example, when deleting the first element of a list.
(XXX: this deserves a better explanation, but if you know how linked
lists are implemented in C, say, then you're likely well aware of how
it works.)
<<How to remove inactive users>>=
(defun remove-account! (username)
(loop for u in *accounts* do
(setf (account-friends u)
(delete username (account-friends u) :test #'equal)))
(setf *accounts*
(delete-if #'(lambda (a) (equal (account-username a) username))
*accounts*)))
(defun lock-account! (username why)
(let ((u (get-account username)))
(setf (account-pass-locked u) (account-pass u))
(setf (account-pass u) "locked")
(setf (account-pass-locked-why u) why)))
@ %def remove-account!
Accounts that do not have a creation date up until today---Tue Sep 17
21:37:18 ESAST 2024---will have its creation dates migrated to the
\Lp\ epoch, which is January 1st 2024, the exact month in which
\Lp\ was written. But notice that this migration is done only once.
New system administrators of \Lp\ will never need to run this. (We do
not remove this set of source code chunks because they serve as an
example of how to a migration like that.)
<<How to migrate accounts without a creation date>>=
(defun loop-epoch ()
(encode-timestamp 0 0 0 0 1 1 2024))
(defun migrate-add-creation-and-post-date! ()
(read-accounts!)
(loop for u in *accounts*
do (if (not (account-creation u))
(setf (account-creation u) (timestamp-to-universal (loop-epoch)))
(setf (account-last-post u) (account-seen u))))
(write-accounts!))
@ %def migrate-add-creation-and-post-date! loop-epoch
Here's a program to run the migration in a UNIX shell.
<<migrate-add-creation-date.lisp>>=
<<Quicklisp loading preamble>>
(ql:quickload :loop :silent t)
(in-package #:loop)
(migrate-add-creation-and-post-date!)
(format t "Accounts rewritten.~%")
@ %def migrate-add-creation-date.lisp
Now we write the procedures that discover what accounts are inactive.
<<Definition of maximum allowed inactive periods>>=
(defparameter *months-inactive-allowed* 3)
(defparameter *months-never-logged-in* 1)
@
<<How to enumerate inactive accounts>>=
(defun user-inactive? (username)
(or (inactive-from-never-logged-in? username)
(inactive-from-last-seen? username)))
(defun inactive-from-never-logged-in? (username)
(let ((u (get-account username)))
(if (ever-logged-in? username)
NIL
(inactive-from? username *months-never-logged-in*
#'(lambda () (account-creation u))))))
(defun locked? (username)
(equal "locked" (account-pass (get-account username))))
(defun inactive-from-last-post? (username)
(let ((last-post (account-last-post (get-account username)))
(creation (account-creation (get-account username))))
(inactive-from? username *months-inactive-allowed*
(if last-post
#'(lambda () last-post)
#'(lambda () creation)))))
(defun inactive-from-last-seen? (username)
(let* ((u (get-account username))
(last-seen (account-seen u))
(creation (account-creation u)))
(inactive-from? username *months-inactive-allowed*
(if last-seen
#'(lambda () last-seen)
#'(lambda () creation)))))
(defun inactive-from? (username months timestamp-source)
(declare (ignore username))
(timestamp<
(timestamp+
(universal-to-timestamp
(funcall timestamp-source)) months :month)
(now)))
(defun ever-logged-in? (username)
(account-seen (get-account username)))
(defun never-logged-in? (username)
(not (ever-logged-in? username)))
(defun list-inactive-users ()
(loop for u in *accounts* do
(format t "Username ~a is inactive? ~a~%"
(account-username u)
(user-inactive? (account-username u)))))
@ %def list-inactive-users
\section{Macros to ease writing}\label{sec:dsl}
These macros make up a tiny language to ease the writing of \lp. For
example, when we need to access the group database, we use
[[in-groups]]. When a certain command demands authentication, we use
[[with-auth]].
<<Macros>>=
(defmacro in-dir (dir &rest body)
`(let ((*default-pathname-defaults* (truename ,dir)))
(uiop:with-current-directory (,dir)
,@body)))
@
Notice that we set [[*default-pathname-defaults*]] and we set the
process' current working directory. That's not necessary because Lisp
always uses [[*default-pathname-defaults*]] and does not care about
the current working directory. We did this out of the fact that we
used to invoke [[renameat2]] through the [[cffi]], but we don't use it
any more.
<<Macros>>=
(defmacro in-groups (&rest body) `(in-dir "groups/" ,@body))
(defun in-group-lambda (g fn) (in-dir g (funcall fn)))
(defmacro in-group (g &rest body)
`(in-group-lambda ,(fmt "groups/~a/" g) #'(lambda () ,@body)))
(defmacro with-group (g r &rest body)
(let ((g-var (gensym))
(r-var (gensym)))
`(let ((,g-var ,g)
(,r-var ,r))
(if (not (group? ,g-var))
(make-response :code 411 :request ,r-var
:data (format nil "no such group ``~a''" ,g-var))
(progn ,@body)))))
(defmacro with-n-args (n r &rest body)
(let ((args-var (gensym))
(message-var (gensym))
(n-var n))
`(let ((,args-var (request-args r))
(,message-var ,(fmt "bad arguments: needs exactly ~a" n-var)))
(if (not (= ,n-var (length ,args-var)))
(make-response :code 400 :request ,r :data ,message-var)
(progn ,@body)))))
(defmacro with-group-set (&rest body)
(let ((g-var (gensym)))
`(let ((,g-var (client-group *client*)))
(if (not ,g-var)
(bad-input r "must say GROUP first")
,@body))))
(defmacro with-auth (&rest body)
`(if (not (auth?))
(make-response :code 400 :data "You must authenticate first.")
(progn ,@body)))
@ %def in-groups with-group with-n-args with-group-set with-auth
\section{Other procedures}
Of small importance, they have nothing. Notice that [[ucs-2->ascii]]
is iseful only in Windows systems---and just for development. The
procedure destructively converts UCS-2 to ASCII, so it's only really
useful when we're converting an implicitly ASCII-content in the form
of UCS-2. Despite the name UCS-2, notice it is UTF-16. The name UCS
stands for ``Universal Character Set'' and I speculate the number 2
means 2 bytes. So our conversion is just removing the first byte.
<<Little procedures>>=
(defun conforms-to? (s re &optional error-msg)
"Does string S conform to regular expression RE?"
(let ((okay? (cl-ppcre:scan-to-strings re s)))
(if okay?
(values t nil)
(values nil (or error-msg (fmt "must match ~a" re))))))
(defun print/finish (&rest args)
(apply #'format (cons t args))
(finish-output))
(defun word-plural (n word)
(let ((table '(("doesn't" . "don't")
("newsgroup" . "newsgroups"))))
(let ((w (assoc word table :test #'string=)))
(when (not w) (error "word not found"))
(if (< n 2) (car w) (cdr w)))))
(defun plural (v suffix)
(if (> v 1) suffix ""))
(eval-when (:compile-toplevel :load-toplevel :execute)
(defun fmt (cstr &rest args)
(apply #'format nil (list* cstr args))))
(defun out (stream &rest args)
(apply #'format (cons stream args)))
(defun stderr (&rest args)
(when *debug*
(apply #'out (cons *error-output* args))
(finish-output *error-output*)))
(defun stdout (&rest args)
(apply #'out (list* *standard-output* args)))
(defun println (&rest args)
(apply #'out (list* *standard-output* (str:concat (car args) "~%") (cdr args))))
(defun enumerate (ls &optional (first-index 0))
(loop for e in ls and i from first-index
collect (cons i e)))
(defun ucs-2->ascii (bs)
;; I'm a Windows user.
#-win32 bs #+win32 (remove-if #'zerop bs))
(defun bad-input (r msg &key code)
(make-response :code (or code 400) :data msg :request r))
(defun integer->string (n)
(format nil "~a" n))
(defun mkstr (&rest args) ;; a utility
(with-output-to-string (s)
(dolist (a args) (princ a s))))
(defun data (&rest args) ;; a utility
(flatten (map 'list #'data->bytes args)))
(defun crlf ()
(vector 13 10))
(defun crlf-string ()
(format nil "~c~c" #\return #\linefeed))
(defun flatten (obj)
(do* ((result (list obj))
(node result))
((null node) (delete nil result))
(cond ((consp (car node))
(when (cdar node) (push (cdar node) (cdr node)))
(setf (car node) (caar node)))
(t (setf node (cdr node))))))
(defmacro mac (&rest body)
`(macroexpand-1 ,@body))
@ %def bad-input crlf mkstr data crlf-string flatten ucs-2->ascii enumerate
\section{Tests}
I studied the minimum to be able to add these tests as we comprehend
better the direction in which we're going. A test system is essential
for us to trust we can move forward without breaking important past
decisions in the code. XXX: we should not include these tests in
production code as we are doing right now. Divert them to a tests
package or something that makes more sense. To run the tests, you
need to invoke [[lisp-unit:run-tests]]. I believe that invoking it in
a script will make all tests run. Oh, that should be included the
binary, too, so that we can always test an archived project. That's a
good idea. Make it a clingon option on the command line. Of course,
by including tests in the executable, we should isolate all the tests
here. I know how---just define a test package and isolate it all in
it. Packages are for namespace isolation. Lisp's ``systems'' are the
type of packaging meant for loading. Lisp's ``packages'' are merely
namespace isolation. Our test package should use \lp's package, so
that we can use any of the procedures under testing. When we define a
variable, it will be defined in the test package, not in \lp's, but
when we are in \lp's package, we see no names of the test package.
It's simple and a good solution.
When testing, it's important for us not to clutter a production
system---we will want to run tests on production systems. So what we
need to do is to wrap any file system modification to a certain other
directory in which \lp's tests will find the groups directory in
place. Making that happen is as simple as changing
[[*default-pathname-defaults*]].
<<Test procedures>>=
(setq lisp-unit:*print-failures* t)
(define-test dispatching
(assert-true (equalp (empty-response) (dispatch (make-request)))))
(defun unix->nntp (s)
"I substitute \n for \r\n"
(str:replace-all (fmt "~a" #\linefeed) (crlf-string) s))
(defvar a-post (unix->nntp "From: root
Message-id: <pwtdldytefplntosymvo@loop>
Subject: test
Newsgroups: local.test
Quickest test of the West.
"))
(defvar a-bad-post (unix->nntp "From: root
Message-id: <pwtdldytefplntosymvp@loop>
Subject: a bad test
Newsgroups: local.test, bad.newsgroup
A bad test from the biggest mouth of the south.
"))
(define-test post-wrong-newsgroup
(multiple-value-bind (code msg) (post (string->bytes a-bad-post))
(declare (ignore msg))
(assert-true (equal code 400))))
(define-test post-okay
(read-accounts!)
(connect-index! "test.db")
(create-index!)
(setq *client* (make-client :username "ROOT" :auth? 'yes))
(multiple-value-bind (code msg) (post (string->bytes a-post))
(declare (ignore msg))
(assert-true (equal code 240)))
(clsql:disconnect))
@ %def
XXX: we got a problem with test [[post-okay]]. We're getting an
execution error, but we can't see any error message. The posting is
taking place---here in the REPL at least.
\section{How to produce the binary executable}
Just say {\tt make loop} to your shell.
<<build-exe.lisp>>=
<<Quicklisp loading preamble>>
(ql:quickload :loop)
(sb-ext:save-lisp-and-die #P"loop"
:toplevel #'loop:main
:executable t
:save-runtime-options t)
@
<<Quicklisp loading preamble>>=
#-quicklisp
(let ((quicklisp-init (merge-pathnames "quicklisp/setup.lisp"
(user-homedir-pathname))))
(when (probe-file quicklisp-init)
(load quicklisp-init)))
@ %def quicklisp
\section{How to get a log of \lp's communication}
If you invoke \lxxp\ with option [[--logging]], you get logging on
[[stderr]]:
%
\begin{verbatim}
$ ./loop -s --logging > /dev/null
@>>> 200 Welcome! Say ``help'' for a menu.
quit
@<<< quit
@>>> 205 Good-bye.
\end{verbatim}
But remember that \lp\ runs various processes in parallel, so the
question now is how to unify all processes' logs into a single one.
We can't use
\href{https://cr.yp.to/daemontools/multilog.html}{multilog}, for
instance. The easiest solution now is to send all messages to {\tt
syslog}. FreeBSD and GNU systems come with a program called {\tt
logger}, which is able to write messages to {\tt syslog}. If you're
using \href{https://cr.yp.to/daemontools/svscan.html}{{\tt svscan}},
then you can use a program such as
%
\begin{verbatim}
$ cat log/run
#!/bin/sh
exec /usr/bin/logger -i -t loop
\end{verbatim}
%
as your {\tt log/run} script. See {\tt logger(1)} for more
information. Using {\tt logger(1)} means you need to set up {\tt
syslog}, too. By the default, {\tt logger(1)} will use the {\em
facility} {\tt user} and the {\em level} {\tt notice}. So you can
specify in {\tt syslog.conf} the selector {\tt user.notice} and
specify a log file such as {\tt /var/log/loop.log}. See {\tt
syslog.conf(5)} and {\tt newsyslog(1)} for more information.
\section{Package {\tt loop.lisp} as the compiler needs it}
We now put together all source code chunks in the order the compiler
needs to read it. One thing to keep in mind here is---I wonder if
people that might read this source code would read the literate
programming \LaTeX\ output or would they read [[loop.lisp]] directly.
For literate programmers, it doesn't matter how [[loop.lisp]] turns
out because only the compiler reads [[loop.lisp]]. But if we care
about anyone who might read [[loop.lisp]], then we should perhaps tell
our literate programming tools to generate a nice-looking file. For
instance, I declare global variables in the chunks where it's used.
But for someone reading [[loop.lisp]] directly, it is perhaps better
if they would see all global variables at the top of the file. That's
something to think about.
<<loop.lisp>>=
;;; -*- mode: LISP; syntax: COMMON-LISP; package: LOOP; -*-
(eval-when (:compile-toplevel :load-toplevel :execute)
(ql:quickload
'(<<List of packages to be loaded>>)
:silent t))
(clsql:file-enable-sql-reader-syntax)
(defpackage #:loop
(:use :common-lisp :local-time)
(:import-from :lisp-unit define-test assert-true)
(:import-from :org.shirakumo.filesystem-utils
directory-p list-directories list-files)
(:import-from :sb-sys interactive-interrupt)
(:export :main))
(in-package #:loop)
<<Global variables>>
<<Table of commands>>
<<Macros>>
<<Little procedures>>
<<Loop's REPL>>
<<Procedures for requests and responses>>
<<Essential operations relative to the index>>
<<How to create and connect to the index>>
<<How to remove inactive users>>
<<How to enumerate inactive accounts>>
<<How to migrate accounts without a creation date>>
<<How to split a stream into lines>>
<<How to create an index from the file system>>
<<How to parse articles>>
<<Dispatching of commands>>
<<Command authinfo>>
<<Command mode reader>>
<<Commands head, body, article>>
<<Command xover>>
<<Command group>>
<<Command list>>
<<Command help>>
<<Command quit>>
<<Command date>>
<<Command post>>
<<Command create-group>>
<<Command create-account>>
<<Command login>>
<<Command passwd>>
<<Command users>>
<<Command dd>>
<<Command repl>>
<<Broadcasting>>
<<Command-line parsing>>
<<Main loop>>
<<Test procedures>>
@ %def
<<Global variables>>=
(defparameter *debug* nil)
<<Representation of accounts>>
<<Representation of a client>>
<<Representation of requests and responses>>
<<Reference to the database>>
<<Representation of commands>>
<<Representation of articles>>
<<Definition of maximum allowed inactive periods>>
<<Global variable that decides whether to enable the NNTP REPL>>
@
On which packages do we depend?
<<List of packages to be loaded>>=
:lisp-unit :str :uiop :cl-ppcre :local-time :clsql-sqlite3 :clingon
:filesystem-utils :ironclad/digest/sha256
@
The \lp\ system definition:
<<loop.asd>>=
;;; -*- Mode: LISP; Syntax: COMMON-LISP; -*-
(asdf:defsystem :<<Name>>
:version "<<Version>>"
:description "<<Description>>"
:depends-on (<<List of packages to be loaded>>)
:components ((:file "loop")))
@
\section{Other source code}
The shell script {\tt format-def} is invoked whenever we build any
lisp source code. That's to format the source code a bit better for
readers that will be reading it directly. It is not what we do. We
read the documentation in PDF format and we work on the NOWEB file
{\tt loop.nw}. But we know that potential readers will not do the
same and will hack {\tt loop.lisp} directly. Paying respect to these
readers, we try to format Lisp source code as best as possible. So we
do two things: first, we produce the final source code in an order
that should produce no warnings during compilation; second, we make
sure there's one and only one blank line between procedure or macro
definition. We don't add a blank line between global variables.
The following shell script does the job. The first {\tt sed} program
finds our definitions of interest and inserts a new blank line before
the definition. Such action makes function definitions separated by
two blank lines in some cases. We then remove the excess with the
second program. Notice we need the {\tt -E} option because we're
using the {\tt |} metacharacter.
The second program find a blank line as its first step. Then we say
{\tt N} to expand the pattern space to include the next line. Then we
delete the {\em first} blank line and not the second---that's what the
{\tt D} command does. This strategy is explained by Dale Dougherty
and Arnold Robbins in ``sed \& awk'' second edition, pages 112--114.
<<format-def>>=
#!/bin/sh
usage()
{
printf 'usage: %s [file.lisp]\n' $0
exit 1
}
sed -E '/^\(defun |\(defmacro /{
i\
}' "$@" | \
sed '/^[ \t]*$/{
N
/^[ \t]*\n$/D
}'
@
When we make a new release of \lp, we like to name its version as the
tip of the source code repository. We get the information usually
with a command line such as
%
\begin{verbatim}
$ git log --oneline | head -1 | awk '{print $1}'
52663d1
\end{verbatim}
%
To include this version string in the executable, we need to make it
part of the source code. We get help from {\tt sed} once again. As
the usage explains, we invoke it as {\tt ./make-release 52663d1
loop.nw}. The script then rewrites {\tt loop.nw} with the string in
the body of the chunk @<<Version@>>. The {\tt sed} program is
straightforward: locate the chunk definition, move down a line, change
that line and that's all.
<<make-release>>=
#!/bin/sh
usage()
{
printf 'usage: %s tag file\n' $0
exit 1
}
test $# -lt 2 && usage
tag="$1"; shift
sed "/<<Version>>=/ {
n;
c\\
$tag
}" "$@"
@
\section*{Index of chunks}
\nowebchunks
\section*{Index of names}
\nowebindex
\end{document}