srv/loop.nw

% -*- mode: noweb; noweb-default-code-mode: lisp-mode; -*-
\documentclass[a4paper,11pt]{article}
\usepackage[text={6.75in,10in},centering]{geometry}

\usepackage{graphicx}

\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
\usepackage{csquotes}
\usepackage[brazil]{babel}

\usepackage{etoolbox}
\AtBeginEnvironment{quote}{\small}
\AtBeginEnvironment{verbatim}{\small}

%% \usepackage[backend=biber]{biblatex}
%% \addbibresource{refs.bib}
%% \renewcommand{\cite}{\parencite}
\usepackage[hyperref]{xcolor}
\usepackage[colorlinks=true,citecolor=]{hyperref} % linkcolor=red

\usepackage{amsmath,amsthm,amssymb}
\allowdisplaybreaks
\usepackage{lmodern}
\usepackage{noweb}
\noweboptions{brazil,longchunks,smallcode}
\DeclareMathOperator{\mdc}{mdc}
\DeclareMathOperator{\gcdext}{gcdext}
\DeclareMathOperator{\remainder}{remainder}
\DeclareMathOperator{\quotient}{quotient}
\DeclareMathOperator{\diff}{diff}

\def\nwendcode{\endtrivlist \endgroup}
\let\nwdocspar=\par

%% Popular words.
\newcommand{\lxxp}{{\em loop}}
\newcommand{\Lp}{{\tt LOOP}}
\newcommand{\lp}{\Lp}
\newcommand{\bug}{{\em bug}}
\newcommand{\symlink}{{\em symbolic link}}
\newcommand{\symlinks}{\symlink s}

\title{\Lp\\
  {a circle out of fashion}}
\date{January 2024}
\begin{document}
\fontfamily{cmr}\selectfont
\maketitle
%\setlength{\parskip}{3pt}
%\setlength{\parindent}{0pt}

\Lp\ is an out-of-fashion program, used as medium of communication by
antiquated people.  \Lp\ members insist that technical communication
be made in writing and not in a hurry.  That's how backwards they are.
To give you an idea, they write \Lp\ in Lisp---jurassic technology.
We surely wouldn't pay them any attention.
%
\begin{verbatim}
         Drunk and dressed in their best brown baggies and their platform soles
         They don't give a damn about any trumpet-playing band
         It ain't what they call rock and roll
                                                        -- Mark Knopfler, 1978.
\end{verbatim}

It's easy to make a conference on the Internet.  E-mail works.  When
we write an e-mail to various friends to discuss a certain subject, we
form a circle.  When we decide to add another person to this
quickly-made conference by e-mail, sometimes we use a well-known
idiom---``adding John to the loop''.  We add John's e-mail address to
the list of destinaries.  So long as everyone replies to everyone,
John, too, will start getting all the messages.  If anyone violates
this rule of replying to everyone involved, the loop is broken.

There are surely inconveniences in using e-mail as conference medium.
For example, after John has been added to the loop, he is not able to
leave by his own account.  He needs to ask everyone involved to stop
writing to him.  This is usually easy to do, but instead people tend
to ask for more technology such as mailing lists.  A mailing list is
nothing by an automated version of this idea of writing to various
people at once.  When the mailing address is written to, a program
resends the message to all subscribers of the mailing list and, this
way, the conference takes place.

But \lp\ has nothing to do with e-mail.  \Lp\ uses a communication
strategy---called a ``protocol''---that is even older than the web
itself.  The web started out around 1989--1990 and the protocol
\lp\ uses was conceived in 1979 and implemented in 1980.  The name of
the protocol used by \lp\ is NNTP---Network News Transfer Protocol.
Since e-mail was already daily practice of members of the Internet
back then, many things from e-mail were taken by the NNTP designers.
So, an NNTP message looks a lot like an e-mail message and the
two---NNTP and SMTP (the protocol used by e-mail)---can often mingle
seamlessly.  The impression we get from using NNTP is that we're
sending e-mail to a certain group of people.  It's as though the
message goes into a collective mailbox and anyone interested in that
mailbox reads the messages there.  If anyone would like to reply to a
message, they do so and, this way, communication flows among the
interested crowd.  If anyone would like to leave the group, nothing is
needed---the person just doesn't go back to read any more messages.
Unlike mailing lists, there is no need to formally commit to reading
one of these collective mailboxes and no need to formally notify
anyone or any system that you're not interested in that group any
longer.  These collective mailboxes are called ``news groups'' and are
often written as ``newsgroups''.  And the messages posted to these
news groups are called ``articles''.

Just like e-mail and the web, network news is an open protocol.
Anyone could write a program capable of speaking NNTP.  There are many
NNTP-aware programs.  You could write your own.  Figures
\ref{fg:gnus}--\ref{fg:sylpheed} show a few programs for reading
network news via NNTP.

\begin{figure}[!htb]
  \centering \includegraphics[width=0.8\linewidth]{images/gnus-summary.png}
  \caption{Gnus, a news reader embedded in the GNU EMACS text editor.}
  \label{fg:gnus}
\end{figure}

\begin{figure}[!htb]
  \centering \includegraphics[width=0.8\linewidth]{images/tbird-summary.png}
  \caption{Thunderbird, a news reader produced by the Mozilla Foundation.}
  \label{fg:bird}
\end{figure}

\begin{figure}[!htb]
  \centering \includegraphics[width=0.8\linewidth]{images/sylpheed-summary.png}
  \caption{Sylpheed, a news reader produced by Hiroyuki Yamamoto.}
  \label{fg:sylpheed}
\end{figure}

{\bf Principles for a discussion group}. We believe a discussion group
should be small and grow slowly.  By ``slowly'', we mean that each
member comes in through an invitation.  This way, the group being
closed by definition, we keep spam out and give members a certain
sense of privilege.

A discussion group should be formed by interested people.  If a
participant doesn't log-in for a certain period of time, \lp locks the
participant's account---see Section \ref{sec:inactive-users}.  The
account can be reactivated, but it will take asking another
participant (with an active account) to do so.  In other words,
there's an encouragement for an uninterested member not to come back
to the \lp.  The idea is to keep a certain cohesion in the discussion
groups.  When an account is locked or unlocked, an article is posted
to the group {\tt local.control.news}, so everyone knows who is
leaving and arriving.  This way, participants get to have an idea of
who is reading them.

Each invitation comes with a certain responsibility: it's possible to
see who invited who.  If {\tt BOB} misbehaves, everyone gets to see
that {\tt ALICE} doesn't have nice friends.  The {\tt USERS} command
shows the relationship graph:
%
\begin{verbatim}
USERS
200 List of current users:
ANONYMOUS, last seen on Fri Mar  8 19:01:56 2024, invited (ALICE)
    ALICE, last seen on Sun Mar 10 21:25:45 2024, invited (BOB CARLA)
      BOB, last seen on Sun Mar 10 21:17:30 2024, invited nobody
    CARLA, last seen on Sun Mar 10 18:30:48 2024, invited nobody
\end{verbatim}

We conjecture that a discussion group tends to prosper when each
member feels as important as any other member.  So we think each
member should have as much managerial power as any other.  In an
attempt to realize this ideal, each member is able to not only invite
other people---see the [[CREATE-ACCOUNT]] command---, but also to
create new groups---see the [[CREATE-GROUP]] command.

Despite this introduction, this is not a tutorial on the history of
the Internet or how NNTP works.  This is the source code of \lp.
Hereafter, our conversation continues in Lisp.  Understanding how
\lp\ is made is only necessary if you intend to modify it.  If you
just want to use the system, you probably should stop right here.

\section{The implementation strategy}\label{sec:design}

Anything a user sends to the \lp\ is wrapped in a [[request]] and
any command processing must produce a [[response]]:
%
\begin{verbatim}
                          request  --->  process  --->  response.
\end{verbatim}
%
The request arrives, \lp\ interprets it, finds out which command
should look at the request and then dispatches the the request to the
right command.  The command process the request and returns a
response.  Then \lp\ takes this response and sends it out back to the
user through the network using [[send-response!]].  That's all there
is to \lp.  Everything else is just the details of this strategy.

Before you investigate the source code of \lp, you should have {\tt
  SBCL} installed---the Common Lisp compiler we've been using---and
the package [[:loop]].  This way you can interact with the system:
%
\begin{verbatim}
CL-USER> (ql:quickload :loop)
To load "loop":
  Load 1 ASDF system:
    loop
; Loading "loop"
..................
(:LOOP)
CL-USER> (in-package :loop)
#<PACKAGE "LOOP">
LOOP> (make-request "HELP")
#S(REQUEST :VERB "HELP" :ARGS NIL :SAID NIL)
\end{verbatim}
%
The procedure [[make-request]] constructs a [[request]].  When a user
connects to the \lp, each command sent by the user is packaged inside
a [[request]] structure like this one above.  To see how \lp\ answers
this request for help, we can invoke the procedure [[dispatch]], which
is responsible to find the correct procedure to reply to the request.

\begin{verbatim}
LOOP> (dispatch (make-request :verb "HELP"))
#S(RESPONSE
   :CODE 400
   :DATA "unrecognized command"
   :REQUEST #S(REQUEST :VERB "HELP" :ARGS NIL :SAID NIL)
   :MULTI-LINE NIL)
\end{verbatim}

It turns that \lp\ doesn't recognize the command.  That's because we
only loaded the [[:loop]] package.  The reason it doesn't recognize
the command is that the command table has not been built.  Let's build
it and try again:

\begin{verbatim}
LOOP> (set-up-tables!)
(("ARTICLE"
  . #S(COMMAND
       :FN #<FUNCTION CMD-ARTICLE>
       :VERB "ARTICLE"
       :DESCRIPTION "fetches full articles"))
[...]
("HELP"
  . #S(COMMAND
       :FN #<FUNCTION CMD-HELP>
       :VERB "HELP"
       :DESCRIPTION "displays this menu")) [...])

LOOP> (dispatch (make-request :verb "HELP"))
#S(RESPONSE
   :CODE 200
   :DATA (87 104 97 [...] 112)
   :REQUEST #S(REQUEST :VERB "HELP" :ARGS NIL :SAID NIL)
   :MULTI-LINE YES)
\end{verbatim}

That's better.  These numbers we see in the response are the bytes in
the response.  You can get a string version of these numbers:

\begin{verbatim}
LOOP> (bytes->string (response-data (dispatch (make-request :verb "HELP"))))
"What's on the menu today?
ARTICLE fetches full articles
AUTHINFO makes me trust you
BODY fetches an article body
GROUP sets the current group
HEAD fetches article headers
HELP displays this menu
LIST lists all groups
MODE handles the mode request from clients
NEXT increments the article pointer
POST posts your article
QUIT politely says good-bye
XDD displays your state of affairs
XOVER fetches the overview database of a group"
\end{verbatim}

This is the text that the user sees when they ask for {\tt HELP}.  In
other words, the field {\tt data} in a [[response]] stores the data to
be delivered back to the user.  The program {\tt nc}---for
``netcat''---that we use below is capable of opening a TCP connection
and handling that connection to our keyboard so that we can interact
with the \lp.  You can effectively achieve the same thing using a
program such as {\tt telnet}.

\begin{verbatim}
C:\>nc antartida.xyz 119
200 Welcome! Say ``help'' for a menu.
help
200 What's on the menu today?
ARTICLE fetches full articles
AUTHINFO makes me trust you
BODY fetches an article body
GROUP sets the current group
HEAD fetches article headers
HELP displays this menu
LIST lists all groups
MODE handles the mode request from clients
NEXT increments the article pointer
POST posts your article
QUIT politely says good-bye
XDD displays your state of affairs
XOVER fetches the overview database of a group
.
quit
205 Good-bye.
\end{verbatim}

But keep in mind that \lp\ was not made to talk to users directly.
\Lp\ was made to talk to your NNTP client, programs such as the ones
illustrated by Figures \ref{fg:gnus}--\ref{fg:sylpheed}.  That's why
we see these numbers in the responses given by \lp.  These numbers are
there to help clients understand how the conversation is going.  Each
specific such number is determined by the NNTP protocol.  But, despite
the protocol being made for machines to talk to each other, it's
perfectly possible for a user to interact with \lp\ directly using a
keyboard and a command-line tool such as {\tt nc} or {\tt telnet}.  In
fact, \lp\ takes advantage of that to be hackable.  \Lp\ was written
so that it can talk to NNTP clients---such as Gnus, Sylpheed {\em et
  cetera}---but also to users directly.  Commands such as
[[CREATE-ACCOUNT]], [[CREATE-GROUP]], [[PASSWD]] are not part of the
NNTP protocol, so users need to know how to use {\tt nc} or {\tt
  telnet} to take advantage of all of \lp's capabilities.

\section{The NNTP protocol}

An Internet protocol is usually defined by a document whose tradition
calls RFC---for ``[r]equest [f]or [c]omments''.  The NNTP protocol is
defined by RFCs 977, 2980, 3977, 4643 and 5536.  RFC 977 was the first
and replaced by 3977.  Still, reading RFC 977 is interesting precisely
because it gives us a historical account of the protocol, making it
easier to understanding the evolution of the system.  The objective of
RFC 2980 was to implement new ideas to the NNTP protocol---to extend
the protocol.  RFC 3977 adopts some of these extensions.  RFC 4643
also extends RFC 2980---addressing concerns with authentication.

\section{It's a network server}

The \lp\ program is a network server, that is, it serves connections
on the network.  However, instead of implementing the handling of TCP
connections, we use what is called a ``TCP superserver'' or ``TCP
wrappers'' or perhaps just ``TCP server''.  The idea is---a program
called ``TCP server'' waits for connections from the network.  When a
client arrives, the TCP server handles the network connection to \lp.
From \lp's perspective, the client is just another keyboard talk to it
directly.  This strategy simplifies the implementation of \lp.  Both
programs---the TCP server and \lp---have nothing to do with one
another, but their work together make the system work.  This
implementation strategy is typically found in UNIX programs.
%
\begin{quote}\small
This is the Unix philosophy. Write programs that do one thing and do
it well. Write programs to work together. Write programs that handle
text streams, because that is a universal interface.\\ -- Doug
McIlroy, 1989, interviewed by Michael S.~Mahoney.
\end{quote}
%
The TCP server just just one thing---listens for new connections and
handles them to the interested program.  It does the handling and does
it well.  \Lp, on the other hand, concerns itself with the NNTP
protocol and does not worry about handling network connections.  This
way, they work together.  And \lp\ handles only a text stream, which
is why it's so easy to connect a keyboard to it and interact with it
through the command line as illustrated in Section~\ref{sec:design}.

\section{The representation of a client}

How do we represent a client?  A client is typically reading a group
and an article; it's has authenticated itself or not yet.  So we need
a global structure to annonate the client's state.

<<Representation of a client>>=
(defstruct client group (article 1) (username "ANONYMOUS") (auth? 'no))
(defparameter *client* (make-client))
@ %def client *client*


\section{The representation of a command}

What does a client typically tell \lp?  A client typically sends
commands.  Commands typically need arguments.  Each command is
dispatched to a procedure that answers it---it's the purpose of
\verb|fn| in the command.  Together, all commands make up a table of
commands, which is essentially what the user sees when ask for
\verb|HELP|.  (Be aware that some clients use the output of
\verb|HELP|.  For example, Gnus v5.13.)

<<Table of commands>>=
(defstruct command fn verb description)
(defparameter *commands-assoc* nil)

(defun table-of-commands ()
  `(("GROUP" ,#'cmd-group "sets the current group")
    ("NEXT" ,#'cmd-next "increments the article pointer")
    ("HELP" ,#'cmd-help "displays this menu")
    ("LIST" ,#'cmd-list "lists all groups")
    ("AUTHINFO" ,#'cmd-authinfo "makes me trust you")
    ("LOGIN" ,#'cmd-login "shorter interface to AUTHINFO")
    ("HEAD" ,#'cmd-head "fetches article headers")
    ("MODE" ,#'cmd-mode "handles the mode request from clients")
    ("BODY" ,#'cmd-body "fetches an article body")
    ("POST" ,#'cmd-post "posts your article")
    ("ARTICLE" ,#'cmd-article "fetches full articles")
    ("XOVER" ,#'cmd-xover "fetches the overview database of a group")
    ("CREATE-GROUP" ,#'cmd-create-group
                    "creates a new group so you can discuss your favorite topic")
    ("CREATE-ACCOUNT",#'cmd-create-account
                     "creates an account so you can invite a friend")
    ("PASSWD" ,#'cmd-passwd "changes your password")
    ("USERS" ,#'cmd-list-users "lists all users")
    ("DD" ,#'cmd-dd "[d]isplays [d]ata: your state of affairs")
    ("QUIT" ,#'cmd-quit "politely says good-bye")
    ("DATE" ,#'cmd-date "displays the current date at this server")
    ("UNLOCK-ACCOUNT" ,#'cmd-unlock-account "unlocks an account")))

(defun set-up-tables! ()
  (labels ((build-commands-assoc (ls)
             (if (null ls)
                 nil
                 (cons (apply #'make-command-pair (car ls))
                       (build-commands-assoc (cdr ls)))))
           (make-command-pair (name fn desc)
             (cons name (make-command :fn fn :verb name :description desc))))
    (setf *commands-assoc*
          (sort
           (build-commands-assoc (table-of-commands))
           #'string-lessp :key #'car))))

(defun get-command (key)
  (let ((cmd (assoc key *commands-assoc* :test #'string=)))
    (labels ((unrecognized-command ()
               (make-command :fn #'(lambda (r)
                                     (make-response :code 400
                                                    :data "unrecognized command"
                                                    :request r))
                             :verb 'unrecognized
                             :description "a command for all commands typed wrong")))
      (or (cdr cmd) (unrecognized-command)))))
@ %def *commands-assoc* set-up-tables! get-command

\section{The representation of requests and responses}

Each command is given through a text line written by the user.  Let's
call this text line the [[request]].  When \lp\ parses the request, it
will extract (from the request) a verb and some arguments.  We will
take a verbatim copy of everything the user has said, possibly for
debugging purposes.

How do we represent a [[response]]?  A [[response]] is always a
reaction to a [[request]].  The NNTP protocol always specifies an
integer as the code to a response, which is what we call the
\verb|code| in the response.  Long responses end with a period and we
mark such responses with [[multi-line]].

<<Representation of requests and responses>>=
(defstruct request verb args said)
(defstruct response code data request multi-line)

(defun empty-response () (make-response :code 400 :data "I beg your pardon?"))
(defun prepend-response-with (message r)
  (make-response
   :code (response-code r)
   :data (data message (crlf) (response-data r))
   :multi-line (response-multi-line r)
   :request (response-request r)))
@ %def request response make-response make-request empty-response prepend-response-with

Here's how to send a [[response]] to a client.

<<Representation of requests and responses>>=
(defun append-crlf-if-needed (seq)
  (cond
    ((stringp seq)
     (append-crlf-if-needed (string->bytes seq)))
    ((listp seq)
     (append seq
             (when (not (= (car (last seq)) 10))
               (list 13 10))))
    (t (error (format nil "append-crlf-if-needed: unsupported type: ~a" (type-of seq))))))

(defun send-response! (r)
  (let ((bs (data (integer->string (response-code r)) " "
                  (append-crlf-if-needed (response-data r)))))
    (my-write bs *standard-output*)
    (stderr ">>> ~a" (bytes->string (ucs-2->ascii bs))))
  (when (response-multi-line r)
    (let ((bs (data "." (crlf))))
      (my-write bs *standard-output*)
      (stderr ">>> ~a" (bytes->string (ucs-2->ascii bs)))))
  (force-output)
  r)
@ %def send-response!

The Windows Console---the one we use when we run {\tt cmd.exe}---makes
the Lisp procedure [[write-sequence]] produce UCS-2.  Windows will do
what we need of [[write-sequence]] if we're not operating \lp\ on the
Windows Console.  Recall that we do not run \lp\ on Windows Console,
so whatever happens because of it is of little importance to us.
However, we implement \lp\ on Windows and so it's convenient for us
that \lp\ and Windows play nicely with each other.  A cheap solution
here is to simply convert the bytes to a string if \lp\ is directly
connected to an interactive Lisp stream.  This way we effectively
eliminate the UCS-2 encoding used by Windows.  It is perfectly fine
for us to destroy the encoding of articles while we're writing \Lp.
It is not fine, however, when it's running in production.  But, in
production, {\tt (interactive-stream-p s)} will always be false.  How
else should we handle this?

<<Representation of requests and responses>>=
(defun my-write (ls-of-bytes s)
  (if (interactive-stream-p s)
      (write-sequence (mapcar #'code-char ls-of-bytes) s)
      (write-sequence ls-of-bytes s)))
@ %def my-write

\section{The parsing of requests}

The commands themselves we call {\tt verbs} and everything else the
user types we call {\tt args}.  Observe that upper and lower case
letters are equivalent in request verbs.

<<Representation of requests and responses>>=
(defun parse-request (r)
  (let* ((collapsed-s (str:collapse-whitespaces (request-said r)))
         (ls (str:split " " collapsed-s :omit-nulls 'please)))
    ;; What are we going to do with a null request?
    (cond ((null ls) (make-request :said (request-said r)))
          (t (let ((verb (car ls))
                   (args (cdr ls)))
               (make-request :said (request-said r)
                             :verb (str:upcase verb)
                             :args args))))))
@ %def parse-request

\section{The main loop}

Every command consumes a [[request]] and produces a [[response]].  If
any procedure always produces a [[response]], then delivering a
[[response]] to the user is a matter of sending a string composed of
the [[response]] code contacated with the [[response]] data.

What does \lp\ do?  It repetitively reads a line from the user,
processes that line and always replies something back.  Then \lp\ is
back at waiting for the user to say something else again.  If the user
says {\tt QUIT}, then we should identify it and terminate \lp's
execution.  That's even [[send-response!]] returns the [[request]]
itself---so we can cascade actions based on a user's request.

<<Main loop>>=
(defun main-loop ()
  (let* ((bs (nntp-read-line))
         (ln (bytes->string (ucs-2->ascii bs))))
    (if ln
        (let ((r (send-response! (dispatch-line ln))))
          (when (not (response-quit? r))
            (main-loop)))
        (progn
          (stderr "eof~%")
          'eof))))

(defun request-quit? (r) (and r (string= 'quit (request-verb r))))
(defun response-quit? (r) (and r (request-quit? (response-request r))))

(defun main ()
  (send-banner!)
  (set-up-tables!)
  (read-accounts!)
  (connect-index! "message-id.db")
  (create-index!)
  (main-loop))

(defun send-banner! ()
  (send-response!
   (make-response :code 200 :data "Welcome! Say ``help'' for a menu.")))
@ %def main main-loop

\section{The request dispatching mechanism}

Dispatching requests means consuming one and invoking the correct
procedure that will process the request.  The invoked procedure must
produce a [[response]].  The work of dispatching is just delivering
the task to an response-producing procedure and then raising the
response to whoever needs to catch it.  For example,
[[response-quit?]] is used by [[main-loop]] to identify when the user
has issued {\tt QUIT}, in which case we terminate [[main-loop]].

<<Dispatching of commands>>=
(defun dispatch (r)
  (let* ((verb (request-verb r)))
    (if (null verb)
        (empty-response)
        (funcall (command-fn (get-command verb)) r))))

(defun dispatch-line (ln)
  (dispatch (parse-request (make-request :said ln))))
@ %def dispatch dispatch-line

\section{The representation and parsing of articles}

An article is made of two parts, the head and the body.  We do need to
parse the head, but we never parse the body: we don't want to
intervene in anything that a user might be doing in the body of the
article.  The headers, however, are mostly under the jurisdiction of
the server.  This decision is due to the fact servers must read
headers.  For example, how do we know to which groups an article was
posted?  We look at the header {\tt newsgroups}.  So, the server must
understand the encoding of headers.  Therefore, we assume ASCII
encoding of all headers that we need to parse.

The member [[headers]] of the structure [[article]] is just a string,
while body is a vector of bytes.  To get a list of pairs out of the
set of all headers of an article, we can ask [[parse-headers]].  Yes,
I should've called the member [[headers]] as [[head]] and not
[[headers]] because both the word ``headers'' and its plural used here
suggest a list of parsed headers.  We're going to rename this in due
time. %% TODO

<<Representation and parsing of articles>>=
(defstruct article headers body)

(defun parse-article (v)
  (let ((parts (split-vector (vector 13 10 13 10) v nil :limit 2)))
    (make-article :headers (map 'string #'code-char (car parts)) :body (cadr parts))))

(defun hs-space-collapsed (hs)
  (cl-ppcre:regex-replace-all (format nil "~a[ ~a]+" (crlf-string) #\tab) hs " "))

(defun hs-lines (lines) (str:split (crlf-string) lines))

(defun parse-header (header)
  (let* ((h (str:collapse-whitespaces header))
         (pos (search ":" h)))
    (when (null pos)
      (throw 'article-syntax-error
        (values nil (format nil "missing colon in header |~a|" h))))
    (when (<= (length h) (+ 2 pos))
      (throw 'article-syntax-error
        (values nil (format nil "empty header ~a" h))))
    (multiple-value-bind (key val)
        (values (subseq h 0 pos) (subseq h (+ 2 pos)))
      (cons (str:downcase key) val))))

(defun parse-headers (hs)
  (let ((ls (hs-lines (hs-space-collapsed hs))))
    (mapcar #'(lambda (h) (parse-header h)) ls)))

(defun string-integer? (s) (ignore-errors (parse-integer s)))
@ %def parse-article parse-headers

We now write some procedures that we use when we're build the {\em
  overview} of the command \verb|XOVER|.

<<Representation and parsing of articles>>=
(defun get-header-from-article (h a)
  (get-header h (parse-headers (article-headers (parse-article a)))))

(defun get-header (key hs)
  (let ((pair (assoc key hs :test #'string=)))
    (if pair (cdr pair) "")))

(defun fetch-headers (g i)
  (let* ((a-string (fetch-article g i))
         (a-parsed (parse-article a-string))
         (headers (parse-headers (article-headers a-parsed))))
    (enrich-headers headers a-string)))

(defun enrich-headers (hs a)
  (append hs
    `(("line-count" . ,(format nil "~a" (nlines (article-body (parse-article a)))))
     ("byte-count" . ,(format nil "~a" (length a))))))

(defun nlines (v) (length (split-vector (crlf) v nil)))
@ %def get-header fetch-headers

\section{How to extract articles from the database}

Notice that we do not care about which encoding is used in articles.
We just read the article bytes and handle them to the client.  It's
the article viewer's---the NNTP client's, that is---responsibility of
interpreting such bytes.  That's why we call [[read-sequence]] here.

<<Representation and parsing of articles>>=
(defun fetch-article (g i)
  (in-groups
   (read-file-raw (format nil "~a/~a" g i))))

(defun read-file-raw (path)
  (let* ((size (sb-posix:stat-size (sb-posix:stat path)))
         (a (make-array size)))
    (with-open-file (in path :element-type '(unsigned-byte 8))
      (read-sequence a in)
      a)))

(defun fetch-body (g i)
  (article-body (parse-article (fetch-article g i))))
@ %def fetch-article

The purpose of [[encode-body]] is to eventually worry about the
appearance of a period on a line by itself in the middle of an
article.  Since ``\verb|.\r\n|'' is part of the NNTP protocol, we must
handle this gracefully---but notice we have not done anything about
that so far.  So we are essentially writing a bug right here.

<<Representation and parsing of articles>>=
(defun encode-body (a) a)
@ %def encode-body

The procedures [[extract-mid]] and [[lookup]] also belong belong in
this section.  Notice that I also wrote [[mid-by-name]], which should
merge with [[extract-mid]].  I think I also wrote more
redundancies---perhaps in the implementatio nof [[xover]]---for not
using [[lookup]].  I need to seek out all such places and organize. %% TODO

<<Representation and parsing of articles>>=
(defun extract-mid (a)
  (lookup "message-id" (parse-headers (article-headers (parse-article a)))))
(defun lookup (key table)
  (cdr (assoc key table :test #'string=)))
@ %def extract-mid lookup

\section{The commands}

\subsection{{\tt HELP}}

When someone asks for help, we present a table of commands.  The table
construction is made by [[menu]].  The procedure [[menu]] was one of
the first things I wrote.  (This is my first program written in Common
Lisp.)  I didn't want to get involved with the famous [[loop]] macro,
so I used recursion in [[menu]]\footnote{I'd like to isolate these
auxiliary procedures inside a single function that uses them.  Common
Lisp offers me [[labels]], but [[labels]] don't seem so helpful when
I'm at the REPL.  When I use [[defun]], I'm able to always invoke the
procedure at the REPL, but that's not so with [[labels]].  I guess the
use of [[labels]] is when the procedure is so trivial that we have no
reason to think we're doing to debug it.} %% TODO

<<Help command>>=
(defun cmd-help (r)
  (let ((lines (menu *commands-assoc*)))
    (prepend-response-with
     "What's on the menu today?"
     (make-response :code 200 :multi-line 'yes
                    :request r
                    :data (str:join (crlf-string) lines)))))
(defun menu (ls)
  (if (null ls)
      nil
      (cons (display-fn (car ls)) (menu (cdr ls)))))

(defun display-fn (cmd-pair)
  (let ((cmd (cdr cmd-pair)))
    (format nil "~A ~A"
            (command-verb cmd)
            (command-description cmd))))
@

\subsection{{\tt AUTHINFO}}\label{sec:authinfo}

The implementation of {\tt AUTHINFO}.  When we connect to
\lp\ directly from a keyboard, it's a bit painful to authenticate with
two commands---{\tt AUTHINFO user} and {\tt AUTHINFO pass}.  So we
also implemented the command {\tt LOGIN}---see
Section~\ref{sec:login}.  To check the user's password, we use the
procedure [[pass?]] that's defined in the implementation of {\tt
  PASSWD}.  Perhaps we should have called it {\tt
  is-password-correct?}  or something more obvious.

<<authinfo error message>>=
Syntax error. Say ``authinfo USER /your-name/'' then ``authinfo PASS /your-pass/''.
@

<<Authinfo command>>=
(defun cmd-authinfo (r)
  (let* ((args (mapcar #'str:upcase (request-args r))))
    (cond
      ((not (= (length args) 2))
       (bad-input r "No, no: I take exactly two arguments."))
      (t
       (multiple-value-bind (cmd arg) (apply #'values args)
         (cond
           ((string= cmd "USER")
            (setf (client-username *client*) arg)
            (make-response :code 381 :request r
                           :data (format nil "Hey, ~a, please tell us your password." arg)))
           ((string= cmd "PASS")
            (if (authinfo-check (client-username *client*) arg)
                (progn
                  (log-user-in!)
                  (make-response
                   :code 281 :request r
                   :data (fmt "Welcome, ~a." (client-username *client*))))
                (make-response :code 400 :request r :data "Sorry. Wrong password.")))
           (t (make-response :code 400 :request r :data "<<authinfo error message>>"))))))))

(defun authinfo-check (username passwd)
  (pass? username passwd))

(defun auth? ()
  (eq 'yes (client-auth? *client*)))

(defun log-user-in! ()
  (setf (client-auth? *client*) 'yes)
  (let ((u (get-account (client-username *client*))))
    (setf (account-seen u) (get-universal-time)))
  (write-accounts!))
@ %def auth? log-user-in!

\subsection{{\tt CREATE-ACCOUNT}}

We allow authenticated members to invite their friends.

%% A propósito, estamos removendo a conta {\tt ROOT} de exibição.  O que
%% significa que {\tt ROOT} não nem mesmo se conectar ao \Lp.  Se
%% desejarmos que {\tt ROOT} se conecte, talvez a gente possa fazer
%% código especialmente pra gerenciar a conta dele.  Fazemos assim pra
%% não permitir que usuários tenham qualquer chance de
%%
%% (remove-if #'(lambda (u) (equal "ROOT" (account-username u)))
%%                        (read s))

<<Command create-account>>=
(defun cmd-create-account (r)
  (with-auth
      (with-n-args 1 r
        (let* ((args (mapcar #'str:upcase (request-args r)))
               (username (car args)))
          (multiple-value-bind (username pass-or-error) (new-account! username)
            (if (not username)
                (make-response :code 400 :request r
                               :data (fmt "~a. Choose a new name." pass-or-error))
                (progn
                  (notify-user-created username)
                  (make-response :code 200 :request r
                                 :data (fmt "Okay, account ~a created with password ``~a''."
                                            username pass-or-error)))))))))

(defparameter *accounts* nil)
(defstruct account username seen last-post friends pass pass-locked pass-locked-why creation)

(defun read-accounts! ()
  (let ((*package* (find-package '#:loop)))
    (with-open-file
        (s "accounts.lisp"
           :direction :input)
      (setq *accounts* (read s))))
  *accounts*)

(defun new-account! (username)
  (let* ((u (str:upcase username))
         (p (random-string 6))
         (a (make-account :username u
                          :pass (sxhash (str:upcase p))
                          :creation (get-universal-time))))
    (if (get-account u)
        (values nil (fmt "account ~a already exists" u))
        (let ((c (get-account (client-username *client*))))
          (push u (account-friends c))
          (push a *accounts*)
          (write-accounts!)
          (values (str:upcase username) p)))))
@ %def CREATE-ACCOUNT

Notice that we have a race condition in [[write-accounts]].  What is
the problem?  Two processes in parallel may ask for the writing of
[[accounts.lisp]].  The process that loses the race will have its
modifications lost.  What do we need to do?  Either we use file
locking or we do something smarter without a real file locking
mechanism.  It's not clear to me what is possible here, but this is
definitely a problem that we need to solve.

<<Command create-account>>=
(defun write-accounts! ()
  (let ((name
          (loop
            (let* ((tmp (random-string 10))
                   (name (format nil "~a.tmp" tmp)))
              (when
                  (ignore-errors
                   (with-open-file
                       (s name
                          :direction :output
                          :if-exists :error
                          :if-does-not-exist :create)
                     (write *accounts* :stream s)))
                (return name))))))
    (if (ignore-errors (rename-file name "accounts.lisp"))
        (values t *accounts*)
        (values nil (format nil "could not rename ~a to accounts.lisp" name)))))

(defun get-account (username)
  (loop for u in *accounts*
        do (when (string= (str:upcase username) (account-username u))
             (return u))))
@ %def read-accounts! write-accounts! get-account

\subsection{{\tt UNLOCK-ACCOUNT}}

Inactive accounts are removed or locked---see Section
\ref{sec:inactive-users}.  When an account is locked, any member can
unlock it.

<<Command create-account>>=
(defun cmd-unlock-account (r)
  (with-auth
      (with-n-args 1 r
        (let* ((args (mapcar #'str:upcase (request-args r)))
               (username (car args)))
          (cond ((not (get-account username))
                 (make-response :code 400 :request r
                                :data "No such account ~a." username))
                ((not (locked? username))
                 (make-response :code 400 :request r
                                :data (fmt "Can't unlock ~a because it's not locked." username)))
                (t
                 (unlock-account! username)
                 (notify-user-unlocked username)
                 (make-response :code 200 :request r
                                :data (fmt "Okay, account ~a unlocked." username))))))))

(defun unlock-account! (username)
  (let ((u (get-account username)))
    (cond ((not u)
           (values nil "no such account"))
          ((not (locked? username))
           (values nil "account isn't locked"))
          (t
           (setf (account-pass u) (account-pass-locked u))
           (setf (account-pass-locked u) nil)
           (setf (account-pass-locked-why u) nil)))))
@ %def unlock-account!

\subsection{{\tt LOGIN}}\label{sec:login}

Besides {\tt AUTHINFO} in Section~\ref{sec:authinfo}, we also
implement a more convenient command for authenticationwhen we interact
with \lp\ through a command-line interface.  Instead of having to say
two commands, we can just say {\tt login user password}.

<<Command login>>=
(defun cmd-login (r)
  (let* ((args (mapcar #'str:upcase (request-args r))))
    (cond
      ((not (= (length args) 2))
       (bad-input r "Usage: login your-username your-password"))
      (t
       (multiple-value-bind (name pass) (apply #'values args)
         (cond
           ((pass? name pass)
            (log-user-in-as! name)
            (make-response :code 200 :request r
                           :data (fmt "Welcome, ~a." name)))
           (t (make-response :code 400 :request r
                             :data (fmt "Wrong password.")))))))))

(defun log-user-in-as! (name)
  (setf (client-username *client*) name)
  (log-user-in!))
@ %def log-user-in-as!

\subsection{{\tt PASSWD}}\label{sec:passwd}

A change of password is made with {\tt PASSWD current new}.  Observe
that we are duplicating code from other command procedures.  I think
there is a macro emerging here called [[with-upcase-args]]. %% TODO

<<Command passwd>>=
(defun cmd-passwd (r)
  (with-auth
      (let* ((args (mapcar #'str:upcase (request-args r))))
        (cond
          ((not (= (length args) 2))
           (bad-input r "Usage: passwd current-password new-password"))
          (t
           (multiple-value-bind (cur new) (apply #'values args)
             (cond
               ((pass? (client-username *client*) cur)
                (multiple-value-bind (okay? problem) (change-passwd! (client-username *client*) new)
                  (if okay?
                      (make-response :code 200 :request r
                                     :data "You got it. Password changed.")
                      (make-response :code 500 :request r
                                     :data (fmt "Sorry: ~a" problem)))))
               (t (make-response :code 400 :request r
                                 :data (fmt "Sorry. Wrong password."))))))))))

(defun pass? (username pass)
  (let ((u (get-account username)))
    (and u
         (eq (sxhash pass) (account-pass u)))))

(defun change-passwd! (username newpass)
  (let ((u (get-account username)))
    (when (not u)
      (error "I could not find account ~a." username))
    (setf (account-pass u) (sxhash newpass))
    (write-accounts!)))

@ %def PASSWD pass? change-passwd!

\subsection{{\tt USERS}}\label{sec:users}

The tree of users and their friends is public.  Anyone can know who
invited who.

<<Command users>>=
(defun cmd-list-users (r)
  (with-auth
      (prepend-response-with
       "List of current users:"
       (make-response
        :code 200 :request r :multi-line 'yes
        :data (str:join (crlf-string) (list-users))))))

(defun size-of-longest-username ()
  (loop for u in *accounts*
        maximizing (length (account-username u))))

(defun list-users ()
  (read-accounts!)
  (mapcar (lambda (row) (cadr row))
          (sort
           (loop for u in *accounts*
                 collect (list (account-username u)
                               (fmt "~v@a~a, ~a, invited ~a"
                                    (size-of-longest-username)
                                    (account-username u)
                                    (if (locked? (account-username u))
                                        (fmt " (account locked: ~a)"
                                             (account-pass-locked-why u))
                                        "")
                                    (if (last-time-seen (account-username u))
                                        (fmt "last seen on ~a" (last-time-seen (account-username u)))
                                        "never logged in")

                                    (or (account-friends u) "nobody"))))
           #'string<= :key (lambda (row) (car row)))))

(defun universal-to-human (s)
  (format-timestring
   nil
   (universal-to-timestamp s)
   :format +asctime-format+))

(defun last-time-seen (username)
  (let ((u (get-account username)))
    (if u (let ((s (account-seen u)))
            (if s (universal-to-human s))))))
@ %def list-users last-time-seen size-of-longest-username

\subsection{{\tt LIST}}\label{sec:list}

The database of groups and articles is a UNIX directory.  We just need
to discover which directories exist and produce a listing.  The heavy
work here is finding the index interval of articles in the group.  (I
think we should already be optimizing this by merely caching the
information in a file that is read at start-up.  I think we should
even cache the overview of the group.) %% TODO

<<Command list>>=
(defstruct group name high low)

(defun cmd-list (r)
  (prepend-response-with
   "Get in the loop! Lots to choose from."
   (make-response :code 215 :multi-line 'yes
                  :data (str:join (crlf-string) (build-groups-lines (build-groups-structs)))
                  :request r)))

(defun build-groups-lines (ls)
  (reverse
   (mapcar
    #'(lambda (g)
        (format nil "~a ~a ~a y" (group-name g) (group-high g) (group-low g)))
    ls)))

(defun build-groups-structs ()
  (let ((ret-ls nil))
    (dolist (g (list-groups) ret-ls)
      (multiple-value-bind (low high len) (group-high-low g)
        (declare (ignore len))
        (setf ret-ls (cons (make-group :name g :high high :low low) ret-ls))))))

(defun between? (x from to)
  (<= from x to))
(declaim (inline between?))

(defun filesize (path)
  (sb-posix:stat-size
   (sb-posix:stat path)))

(defun zero-file? (path)
  (= (filesize path) 0))

(defun temporary-article? (path)
  (or (zero-file? path)
      (cl-ppcre:scan "\.tmp$" (namestring path))))

(defun article-ready? (path)
  (not (temporary-article? path)))

(defun get-articles (g &optional from to)
  (in-groups ;; We might want to optimize this some day.  Most likely,
             ;; though, we'll not be using directories. That's a
             ;; problem to be studied.
   (let ((as (articles->integers
              (remove-if #'temporary-article? (cl-fad:list-directory g)))))
     (sort (remove-if-not
            #'(lambda (x) (between? x (or from x) (or to x)))
            as)
           #'<))))

(defun group-high-low (g)
  (let* ((articles (get-articles g))
         (sorted-ints (sort articles #'<)))
    (values (or (car sorted-ints) 0)
            (or (car (last sorted-ints)) 0)
            (length sorted-ints))))

(defun articles->integers (ls)
  (remove-if #'null
             (mapcar #'(lambda (g)
                         (ignore-errors
                          (parse-integer (basename (uiop:unix-namestring g)))))
                     ls)))

(defun list-groups ()
  (let ((groups (in-groups (cl-fad:list-directory "."))))
    (sort (mapcar #'(lambda (g) (basename (uiop:unix-namestring g))) groups)
          #'string-lessp)))

(defun last-char (s) (char s (1- (length s))))
@

In [[basename]], what we want to do is---if the path ends with a
slash, we invoke [[pathname]].  Otherwise, we invoke {\tt
  file-namestring}.

<<Command list>>=
(defun basename (path)
  (let ((s (str:collapse-whitespaces path)))
    (if (char= #\/ (last-char s))
        (car (last (pathname-directory s)))
        (file-namestring s))))
@ %def get-articles group-high-low

\subsection{{\tt GROUP}}\label{sec:group}

We just need to verify if the group exists and modify [[*client*]].

<<Command group>>=
(defun cmd-group (r)
  (with-auth
      (with-n-args 1 r
        (let ((g (car (request-args r))))
          (with-group g r
            (set-group! g)
            (multiple-value-bind (low high len) (group-high-low g)
              (let ((ln (format nil "~a ~a ~a ~a" len low high g)))
                (setf (client-article *client*) low)
                (make-response :code 211 :request r :data ln))))))))

(defun group? (g)
  (in-groups
   (cl-fad:directory-exists-p g)))

(defun xgroup? (g)
  (cl-fad:directory-exists-p g))

(defun set-group! (g)
  (setf (client-group *client*) g))
@ %def group?

Why have I written {\tt group?} and {\tt xgroup?}?  There's probably a
clean-up task here. %% TODO

\subsection{{\tt BODY}, {\tt HEAD}, {\tt ARTICLE} e {\tt NEXT}}
\label{sec:typical-cmds}

Here we have an illustration of the expressive power of first-class
procedures.  We want to implement the commands {\tt BODY}, {\tt HEAD},
{\tt ARTICLE} and {\tt NEXT}.  The ways to invoke them are (1) with no
argument, (2) with a single integer argument and (3) with a text
argument.  In case (1), we use the number of the current
article---which is kept in [[*client*]].  In case (2), the NNTP client
specifies the article number.  In case (3), the NNTP client specifies
the exact {\tt message-id}, which forces us to query the index---see
Section~\ref{sec:index} for the implementation of the index.

<<Commands head, body, article>>=
(defun typical-cmd-head-body-article (r fn-name)
  (with-auth
      (with-group-set
          (let ((args (request-args r)))
            (cond ((null args)
                   (funcall fn-name r (client-group *client*) (client-article *client*)))
                  ((= 1 (length args))
                   (let* ((n-or-mid (car args)))
                     (cond ((string-integer? n-or-mid)
                            (funcall fn-name r (client-group *client*) n-or-mid))
                           (t (multiple-value-bind (group n-str) (lookup-index n-or-mid)
                                (if (and group n-str)
                                    (funcall fn-name r group n-str)
                                    (bad-input r (format nil "Unknown article ~a." n-or-mid))))))))
                  (t (bad-input r "No, no: it takes at most two arguments.")))))))

(defun cmd-head (r)
  (typical-cmd-head-body-article r #'head-response))
(defun cmd-body (r)
  (typical-cmd-head-body-article r #'body-response))
(defun cmd-article (r)
  (typical-cmd-head-body-article r #'article-response))

(defun article-response (r g i)
  (typical-cmd-response 220 r g i #'(lambda (a) (encode-body a))))
(defun head-response (r g i)
  (typical-cmd-response 221 r g i #'(lambda (a) (article-headers (parse-article a)))))
(defun body-response (r g i)
  (typical-cmd-response 222 r g i #'(lambda (a) (encode-body (article-body (parse-article a))))))
@

When we process such commands, it's typical that we need to verify the
existence of files {\em et cetera}.  The procedure that might throw
[[sb-posix:syscall-error]] is [[sb-posix:stat-size]], which we use to
know how many bytes are there in an article, a necessary task in
producing the \verb|OVERVIEW|.

<<Commands head, body, article>>=
(defun typical-cmd-response (code r g i get-data)
  (let ((a (handler-case (fetch-article g i)
             (sb-posix:syscall-error (c)
               (make-response :code 400 :request r
                              :data (format nil "article ~a/~a: ~a" g i c)))
             (sb-ext:file-does-not-exist (c)
               (declare (ignore c))
               (make-response :code 400 :request r
                              :data (format nil "article ~a/~a does not exist" g i))))))
    (cond ((typep a 'response) a)
          (t (prepend-response-with
              (format nil "~a ~a" i (extract-mid a))
              (make-response :multi-line 'yes :code code
                             :request r :data (funcall get-data a)))))))
@ %def typical-cmd-response

The command \verb|NEXT| has a slight different semantics.

<<Commands head, body, article>>=
(defun cmd-next (r)
  (with-auth
      (let ((g (client-group *client*))
            (n-cur (client-article *client*)))
        (cond
          ((not g) (bad-input :code 412 r "must say GROUP first"))
          (t (multiple-value-bind (low high len) (group-high-low g)
               (declare (ignore low len))
               (cond ((= n-cur high) (bad-input r "you are at the last article already"))
                     (t (article-next! r g)))))))))

(defun article-next! (r g)
  (setf (client-article *client*) (1+ (client-article *client*)))
  (let ((cur (client-article *client*)))
    (make-response :code 223
                   :request r
                   :data (format nil "~a ~a" cur (mid-by-name g cur)))))

(defun mid-by-name (g name)
  (extract-mid (fetch-article g name)))
@ %def cmd-next article-next! mid-by-name

\subsection{{\tt XOVER}}\label{sec:xover}

The procedure [[cmd-xover]] is used to figure out what the user said.
Once we have that figured out, we invoke [[xover]], which finishes the
work.  Notice that when the argument [[to]] from [[xover]] is [[NIL]],
then the user is asking for articles indexed from the integer [[fr]]
to the last one.

<<Command xover>>=
(defun cmd-xover (r)
  (with-auth
      (with-group-set
          (let ((args (request-args r)))
            (cond ((null args)
                   (xover r (client-article *client*) (client-article *client*)))
                  ((= 1 (length args))
                   (multiple-value-bind (s v)
                       (cl-ppcre:scan-to-strings "([0-9]+)([-]?)([0-9]*)" (car args))
                     (cond
                       ((not s) (make-response :code 502 :request r :data "bad syntax"))
                       (t (let ((fr (parse-integer (aref v 0)))
                                (hifen (aref v 1))
                                (to (ignore-errors (parse-integer (aref v 2)))))
                            (when (not (string= hifen "-"))
                              (setq to fr))
                            (xover r fr to))))))
                  (t (make-response :code 502 :request r :data "bad syntax")))))))

(defun xover (r from to)
  (assert (client-group *client*))
  (let* ((g (client-group *client*))
         (ls (get-articles g from to)))
    (cond ((= 0 (length ls))
           (make-response :code 420 :request r :data "no articles in the range"))
          (t
           (prepend-response-with
            "Okay, your overview follows..."
            (make-response
             :code 224 :request r :multi-line 'yes
             :data (str:join
                    (crlf-string)
                    (loop for i in ls
                          collect (xover-format-line
                                   i
                                   (remove-if-not
                                    #'(lambda (h)
                                        (member (car h) (xover-headers)
                                                :test #'string=))
                                    (fetch-headers g i)))))))))))
(defun xover-format-line (i hs)
  (str:concat (format nil "~a~a" i #\tab)
              (str:join #\tab
                        (mapcar #'(lambda (h) (get-header h hs))
                                (xover-headers)))))
(defun xover-headers ()
  '("subject" "from" "date" "message-id" "references" "line-count" "byte-count"))
@

\subsection{{\tt MODE READER}}\label{sec:mode-reader}

So, we're always in reader mode, so we just ignore this command.

<<Command mode reader>>=
(defun cmd-mode (r) ;; Whatever.
  (make-response :code 200 :request r :data "Sure thing."))
@ %def

\subsection{{\tt DATE}}\label{sec:date}

It's always useful to know the time and date at a computer.  We should
surely format it a bit better than what {\tt now} does.

<<Command date>>=
(defun cmd-date (r)
  (make-response :code 201
                 :request r
                 :data
                 (format-timestring nil (now))))
@ %def

\subsection{{\tt QUIT}}\label{sec:quit}

The use of {\tt QUIT} has a conection to [[main-loop]]: when the user
says {\tt QUIT}, [[main-loop]] must terminate.

<<Command quit>>=
(defun cmd-quit (r)
  (make-response :code 205 :data "Good-bye." :request r))
@ %def

\subsection{{\tt DD}}\label{sec:dd}

The command {\tt DD} means ``[d]isplay client [d]ata structures''.  It
shows to the client the internal state of how the server sees it.
I've used only for debugging and it's not really useful any longer.
I'm going to remove this very soon.

<<Command dd>>=
(defun cmd-dd (r)
  (make-response :code 200 :data (format nil "state: ~a" *client*) :request r))
@ %def

\subsection{{\tt POST}}\label{sec:post}

If the client says {\tt POST}, then we continue to read line after
line until we find \verb|".\r\n"|.  Having done that, we must check
whether we have a conformant article at hands.  The definition of
conformant is given by [[conforms?]].  In a few words, the article
must have \verb|message-id|, \verb|subject|, \verb|from|,
\verb|newsgroups|.  If the client doesn't provide us with a
\verb|message-id|, then \lp\ adds one.  (Similarly for \verb|date|.)

<<Command post>>=
<<Does an article conform?>>
(defun suggest-message-id (&optional (n 20))
  (format nil "<~a@loop>" (random-string n)))

(defun random-string (size)
  (let* ((universe "abcdefghijklmnopqrstuvwxyz")
         (len (length universe))
         (state (make-random-state t))
         mid)
    (dotimes (c size)
      (setq mid (cons (char universe (random len state)) mid)))
    (coerce mid 'string)))
@

Sometimes we parse an article and sometimes we want to undo that
parsing.  Am I doing something wrong?  I wonder. %% TODO

<<Command post>>=
(defun unparse-article (parsed)
  (data
   (let ((ls))
     (dolist (h (parse-headers (article-headers parsed)))
       (setq ls (cons (data (str:capitalize (car h)) ": " (cdr h) (crlf)) ls)))
     (nreverse ls))
   (crlf)
   (article-body parsed)))
@ %def unparse-article

If an article being posted has no \verb|message-id| or \verb|date|,
\lp\ provides these headers.  We kill these two rabbits with
[[ensure-header]], but we should probably make a table of headers and
procedures that would generate such headers if they're missing.  Right
now, however, we have only these two to worry about.

<<Command post>>=
(defun ensure-header (h fn bs)
  (let* ((headers (parse-headers (article-headers (parse-article bs)))))
    (if (lookup h headers)
        bs
        (unparse-article
         (make-article
          :headers
          (str:join (crlf-string)
                    (mapcar (lambda (h)
                              (format nil "~a: ~a" (car h) (cdr h)))
                            (cons (cons h (funcall fn)) headers)))
          :body (article-body (parse-article bs)))))))

(defun get-date ()
  (multiple-value-bind (s m h day mon year dow dst-p tz)
      (get-decoded-time)
    (declare (ignore dow dst-p))
    (format nil "~4,'0d-~2,'0d-~2,'0d ~2,'0d:~2,'0d:~2,'0d GMT~a"
            year mon day h m s (- tz))))

(defun ensure-mid (bs)
  (ensure-header "message-id" #'suggest-message-id bs))
(defun ensure-date (bs)
  (ensure-header "date" #'get-date bs))
@ %def ensure-mid ensure-date

Now it's time to look at the header \verb|newsgroups|.  (XXX: Our code
here is a bit confusing, but I don't know the best to do here, so I'm
going ahead unpretentiously.)  If we get approved by [[conforms?]],
then we verify the list of newsgroups right away.

The name of each group must conform to the expression

<<Form of newsgroup names>>=
^([a-z0-9]+)
@ %def the-form-of-newsgroup-names

I think people should have total freedom in naming groups.  If users
create groups that mess up the local organization, then people should
discuss the matter and find a solution.  Let's let people mess it up
instead of trying to stop them---the way of the hacker.

<<Command post>>=
(defun newsgroups-header->list (s)
  (mapcar (lambda (n) (str:trim (string-downcase n))) (str:split "," s)))

(defun cmd-post (r)
  (with-auth
      (send-response!
       (make-response :code 340
                      :data (format nil "Okay, go ahead. Suggested message-id ~a."
                                    (suggest-message-id))))
    (let* ((bs (nntp-read-article)))
      (multiple-value-bind (okay? error) (conforms? bs)
        (if (not okay?)
            (make-response :code 400 :request r
                           :data (format nil "Sorry. Your article doesn't conform: ~a." error))
            (multiple-value-bind (code reply) (post bs)
              (make-response :code code :request r :data reply)))))))

(defun post (bs)
  (let ((ngs (newsgroups-header->list
              (get-header "newsgroups" (parse-headers
                                        (article-headers
                                         (parse-article bs))))))
        ngs-dont-exist)
    (dolist (ng ngs)
      (if (and (group-name-conforms? ng)
               (group? ng))
          (progn
            (let ((a (ensure-date (ensure-mid bs))))
              (save-article-insist ng (get-next-article-id ng) a (extract-mid a))
              (update-last-post-date! (client-username *client*))))
          (push ng ngs-dont-exist)))
    (if (zerop (- (length ngs) (length ngs-dont-exist)))
        (values 400 "Sorry. There was not a single valid newsgroup specified.")
        (values 240 (data "Thank you! Your article has been saved."
                          (when ngs-dont-exist
                            (data " However, the groups "
                                  (str:join ", " (sort ngs-dont-exist #'string<))
                                  " just don't exist.")))))))
@ %def post

XXX: Oh, have a look at that.  We accept the article even if there are
invalid groups.  We should not do that.  A user might only want to
post at all if his message is cross-posted to a few groups.  A user
might easily mistype a group name.  The Right Thing here is more
likely to stop posting completely with an error message telling the
user to either remove the invalid group of type it up properly.

<<Command post>>=
(defun update-last-post-date! (username)
  (let ((u (get-account username)))
    (setf (account-last-post u) (get-universal-time))))
@ %def update-last-post-date!

If [[save-article-try]] returns [[NIL]], then [[probe-file]]
has found an article with name [[name-try]], that is, the procedure
is only successful if [[name-try]] is not yet taken and the writing
takes place successfully.

<<Command post>>=
(defun rename-no-extension (old new)
  (rename-file old (make-pathname :name new :type :unspecific)))

(defun save-article-try (name-try bs)
  (let ((name (format nil "~a" name-try))
        (tmp (format nil "~a.tmp" name-try)))
    (with-open-file
        (s name
           :direction :output
           :if-exists :error ;; an atomic operation
           :if-does-not-exist :create))
    ;(format t "save-article-try: ~a~%" name)
    (with-open-file
        (s tmp
           :direction :output
           :if-exists :error
           :if-does-not-exist :create
           :element-type '(unsigned-byte 8))
      (write-sequence bs s))
    (rename-no-extension tmp name)))
@

The procedure [[save-article-insist]] can return [[NIL]] and still
have perfectly done its job: it's possible for [[insert-index]] to
return [[NIL]] because [[message-id]] may already exist in the index,
but that may be no error---for example, when cross-posting.  The
strategy is to write the article using [[name-try]].  If it's not
possible to write it because of a [[sb-ext:file-exists]] condition,
then we try the new name {\tt (1+ name-try)} and we repeat these
attempts until we make it.  If other condition appears, we let it
propagate up the stack.  If we get to the second [[let]], it's because
the article has been saved successfully, so we finish with whatever it
is that [[insert-index-or-log-failure]] must do.

<<Command post>>=
(defun save-article-insist (g name a message-id)
  (loop for name from name do
    (in-dir (format nil "groups/~a/" g)
            (handler-case
                (save-article-try name a)
              (sb-ext:file-exists ()
                ;; We might want to log the fact.
                ;(format t "name ~a already exists...~%" name)
                )
              (:no-error (new before after) ;; the return values from return-file
                (declare (ignore new before after))
                (return (values name (insert-index message-id g (fmt "~a" name)))))))))

(defun get-next-article-name (g)
  (format nil "~a" (get-next-article-id g)))

(defun get-next-article-id (g)
  (multiple-value-bind (low high len) (group-high-low g)
    (declare (ignore low len))
    (1+ high)))
@

{\bf How to read lines in the NNTP protocol?}  We've implemented the
most trivial strategy possible.  It's also the slowest.  What I think
we need to do here is to use [[vector-push-extend]].  But this is to
be done in [[nntp-read-line]].  I hope to be able to get a faster
procedure in [[nntp-read-line]] and keep [[nntp-read-article]] as it
is. This is important to speed up posting.  For instance, if we allow
attachments (which we don't), the performance penalty is clearly
noticeable. %%TODO

<<Command post>>=
(defun nntp-read-article (&optional acc)
  ;; Returns List-of Byte.
  (let* ((ls (ucs-2->ascii (nntp-read-line))))
    (cond ;; 46 == (byte #\.)
      ((equal (list 46) ls) (flatten (add-crlf-between acc)))
      (t (nntp-read-article (append acc (list ls)))))))
@ %def nntp-read-article

The NNTP protocol establishes that line termination is done with
\verb|\r\n|, but it's useful to support UNIX line terminations, too,
because we are using the command-line and interact directly with the
server using tools that will not always send \verb|\r\n| as line
termination.  For example, when someone is typing directly from the
keyboard and insert an empty line, we need the {\tt (and acc ...)}
because sometimes the list [[acc]] comes out empty.  But an empty line
never comes from the NNTP protocol because there's is always a {\tt
  CR} before {\tt LF}, but that's not true when someone is using the
keyboard directly.

<<Command post>>=
(defun nntp-read-line (&optional (s *standard-input*) acc)
  ;; Returns List-of Byte.
  (let ((x (read-byte s)))
    (cond ((or (null x) (= x 10))
           (let ((bs (and acc (nreverse (if (= (car acc) 13) (cdr acc) acc)))))
             (stderr "<<< ~a~%" (bytes->string (ucs-2->ascii bs)))
             bs))
          (t (nntp-read-line s (cons x acc))))))

(defun list->bytes (ls)
  (mapcar #'data->bytes ls))

(defun vector->bytes (v)
  (mapcar #'data->bytes (coerce v 'list)))

(defun data->bytes (d)
  (cond ((null d) nil)
        ((integerp d) (list d))
        ((stringp d) (string->bytes d))
        ((consp d) (list->bytes d))
        ((vectorp d) (vector->bytes d))
        (t (error (format nil "type ~a is not supported" (type-of d))))))

(defun add-crlf-between (ls-of-ls)
  ;; Add \r\n to each ``line''. Returns List-of Byte.
  (mapcar (lambda (ls) (append ls (list 13 10))) ls-of-ls))

(defun string->bytes (s)
  (map 'list #'char-code s))

(defun bytes->string (ls)
  (map 'string #'code-char ls))
@ %def nntp-read-line nntp-read-article

{\bf Does the article conform?}  RFC 5536 obliges every article to
have exactly \verb|date|, \verb|from|, \verb|message-id|,
\verb|newsgroups|, \verb|path| and \verb|subject| headers.  The
headers \verb|path| is \Lp's responsibility, but it's used only in a
network of servers---so we're leaving that out for now.  The header
\verb|message-id| is added by \Lp\ if the client doesn't write it
itself.

Let's criticize the writing of [[conforms?]].  We have a [[catch]]
here and a [[throw]] in [[parse-headers]].  We also have a [[return]]
here.  It's getting hard to read this procedure because it's not easy
to know that a procedure has to return a certain value to match the
expectation of another procedure. I don't remember what [[catch]]
does.  I need to review this and then add the explanation for myself.
If I don't remember how this works, other beginners won't know it
either. %% TODO

<<Does an article conform?>>=
(defun conforms? (bs)
  (catch 'article-syntax-error ;; parse-headers might throw
    (let ((headers (parse-headers (article-headers (parse-article bs)))))
      (let ((result (dolist (h (headers-required-from-clients))
                      (when (not (lookup h headers))
                        (return (format nil "missing the /~a/ header" h)))))
            (content-type (get-header "content-type" headers)))
        (cond
          ((stringp result) (values nil result))
          ((not (text/plain? content-type))
           (values nil (format nil "content-type must be plain/text, but it's ~a" content-type)))
          (t (values t nil)))))))

(defun text/plain? (header-s)
  ;; I say T when S begins with "text/plain" or when S is "".
  (let* ((s (str:collapse-whitespaces header-s))
         (needle "text/plain")
         (len (min (length needle) (length s))))
    (or (zerop len)
        (and (<= (length needle) (length s))
             (string= needle s :end1 len :end2 len)))))

(defun headers-required-from-clients ()
  '("from" "newsgroups" "subject"))
@ %def conforms?

Notice that up to this point we've only verified if the necessary
headers are present.  The \verb|newsgroups| header is a direct
influence to the article storage.  For instance, \verb|newsgroups|
must mention only groups that exist.  When we save the article, we
check each group.  If at least one group exists, we save the article;
if at least one group doesn't exist, we report to the user all groups
that don't exist, but we do save the article if at least one does
exist.  That's probably not the best thing to do.  We should probably
warn the user that one group doesn't exist because that could make all
the difference to the user.  For instance, someone might decide not to
post at all if they can't cross post to all the groups they wish to.
One typo in one group name and the article would be posted to some
groups, but not to the misstyped one.  We need to change this. %% TODO

Also, do notice that to simplify matters we're duplicating articles
cross-posted.  What we should do is write the article to the first
group in the list of \verb|newsgroups| and then make a symbolic link
to all others.  The problem is that I don't know how to do that on
Windows.  I'm not sure if Windows supports symbolic links at all.  We
could perhaps duplicate articles only when on Windows.  %% TODO

\subsection{{\tt CREATE-GROUP}}

We allow every user to create their own groups.  (Are we crazy?)  When
someone craetes a group, we post an article to {\tt
  local.control.news} notifying everyone that a new group has been
created.  People then have a chance to subscribe to the new group.  We
assume that when someone creates a group, it's either no problem at
all or it has been discussed with the community beforehand.

<<Command create-group>>=
(defun cmd-create-group (r)
  (with-n-args 1 r
    (let ((g (string-downcase (car (request-args r)))))
      (multiple-value-bind (okay? reason)
          (group-name-conforms? g)
        (if (not okay?)
            (make-response :code 580 :request r
                           :data (format nil "group name does not conform: ~a" reason))
            (progn
              (multiple-value-bind (path created?)
                  (in-groups (ensure-directories-exist (concatenate 'string g "/")))
                (declare (ignore created?))
                (if (not path)
                    (make-response :code 581 :request r
                                   :data (format nil "could not create group ~a"
                                                 (if (group? g)
                                                     "because it already exists"
                                                     "but we don't know why---sorry!")))
                    (progn
                      (notify-group-created g)
                      (make-response :code 280 :request r
                                     :data (format nil "group ~a created" g)))))))))))

(defun group-name-conforms? (g)
  (let ((okay? (cl-ppcre:scan-to-strings "<<Form of newsgroup names>>" g)))
    (if okay?
        (values t nil)
        (values nil "must match <<Form of newsgroup names>>"))))
@ %def CREATE-GROUP group-name-conforms?

\section{The publication of news}

If you're interested in being notified about what's going on in the
\lp, then subscribe to {\tt local.control.news}.  Group creation,
invitations {\em et cetera} are published there.

<<Broadcasting>>=
(defun notify-group-created (g)
  (post-notification
   :subject (fmt "new group ~a by ~a" g (client-username *client*))
   :body (fmt "Blame ~a for the group ~a just created." (client-username *client*) g)))

(defun notify-user-created (u)
  (post-notification
   :subject (fmt "new account ~a by ~a" u (client-username *client*))
   :body (fmt "Blame ~a for inviting ~a." (client-username *client*) u)))

(defun notify-user-unlocked (u)
  (let ((guilty (client-username *client*)))
    (post-notification
     :subject (fmt "account ~a unlocked by ~a" u guilty)
     :body (fmt "Blame ~a for unlocking ~a." guilty u))))

(defun post-notification (&key subject body)
  (in-groups (ensure-directories-exist "local.control.news/"))
  (when (group? "local.control.news")
    (let ((a (make-news :subject subject :body body)))
        (post (concatenate 'vector (article-headers a) (crlf) (article-body a) (crlf))))))

(defun make-news (&key subject body)
  (make-article
   :headers (data
             (add-crlf-between
              (mapcar
               (lambda (p) (data (format nil "~a: ~a" (car p) (cdr p))))
               `(("from" . "Loop")
                 ("subject" . ,subject)
                 ("newsgroups" .  "local.control.news")))))
   :body (data body)))
@ %def notify-group-created notify-user-created

\section{The algorithm of {\tt split-vector}}

How should we describe the algorithm of [[split-vector]]?  The
objective is to consume lists of bytes such as
[[(1 2 3 13 10 4 5 6 13 10 7 8 9)]] and produce lists of bytes
such as [[((1 2 3) (4 5 6) (7 8 9))]].  We use [[search]] to find [[13 10]].
When we find this line termination, we have its position in the
consumed list.  With the position, we collect the line and we iterate
searching for the next line.

<<How to split a stream into lines>>=
(defun split-vector (delim v acc &key limit (so-far 1))
  (let ((len (length v)))
    (split-vector-helper delim v len acc limit so-far 0)))

(defun split-vector-helper (delim v len acc limit so-far start)
  (if (zerop len)
      acc
      (let ((pos (search delim v :start2 start :end2 len)))
        (cond ((or (not pos) (and limit (= so-far limit)))
               (nreverse (cons (subseq v start len) acc)))
              (t (split-vector-helper
                  delim
                  v
                  len
                  (cons (subseq v start (or pos len)) acc)
                  limit
                  (1+ so-far)
                  (+ pos (length delim))))))))
@ %def split-vector

\section{The index article}\label{sec:index}

Every NNTP server needs to have an index of articles.  Each article is
indexed by its message-id.  For example, the article
%
\begin{verbatim}
    From: Eva Lu <evalu@tor.soy>
    Newsgroups: comp.unix
    Subject: xv6: an introduction
    Date: Wed, 24 Jul 2024 22:42:50 -0300
    Message-ID: <87plr25js5.fsf@tor.soy>
    Cancel-Lock: sha1:82E8hqL1D1WQ9xc+CfnEYbFAaJo=
    MIME-Version: 1.0
    Content-Type: text/plain; charset=utf-8
    Content-Transfer-Encoding: 8bit
\end{verbatim}
%
is indexed by the header {\tt message-id}.  If you ask for article
{\tt <87plr25js5.fsf@tor.soy>}, \Lp\ will tell you that you can find
the article in group {\tt comp.unix} and its numeric ID is 37.  Note
that the number 37 is not inside the article, but only in the name of
the file stored in the file system.  The index, therefore, knows in
which file is each {\tt message-id}.  This fact implies that you
cannot rename files in the file system---of course not: you'd changing
identifiers inside a database.  If you have rename a file, you will
need to rebuild the index.  Given that the index is an SQL table, you
can adjust the index relative to the any files you may need to rename
for whatever reason.  You can also rebuild the entire index by reading
the file system---unless you have a lot of files, that's probably the
easiest thing to do.

The use of [[*default-database*]] by the library [[clsql]] is very
convenient for us: we don't need to specify with which database we're
working.  Since we work with only one, we pretty much never need to
specify anything.

<<Reference to the database>>=
(defvar *default-database* nil)
@ %def *default-database*

<<How to create and connect to the index>>=
(defun connect-index! (filename)
  (setq *default-database* (clsql:connect (list filename) :database-type :sqlite3)))

(defun create-index! ()
  (clsql:execute-command "create table if not exists indices
     (id varchar(1000), grp varchar(1000), article varchar(300))")
  (clsql:execute-command "create unique index if not exists idx_id_1
    on indices (id)"))

(defun remake-index! ()
  (clsql:execute-command "drop table if exists indices")
  (create-index!))
@ %def create-index! remake-index! connect-index!

Of course, the creation and connection of the index must occur before
[[main-loop]], so it takes place in [[main]].

When someone requests an article, it's either by its numeric index but
then the client has already chosen a group, or it's by its {\tt
  message-id}.  We don't need to tell the client to which groups the
article belongs; we just give the entire article to the client.  It
is, therefore, the client's responsibility what to do with the
article.  However, to fetch an article, we need to know where in the
database (in the file system) is the article; in other words, we must
know one group in which the article was stored.  This implies that the
index must know at least one group.  We've decided to always index the
first group in the {\tt newsgroups} header.  So the index's anatomy is
$(m, g, i)$, where $m$ is the {\tt message-id}, $g$ is the name of the
group and $i$, is the name of the article in the file system.  This
also defines the anatomy of the SQL table.

Should we store more information in the index?  Not really.  If we
need anything about an article, we can get it after we fetch it from
the file system.  For example, suppose that a search command wishes to
display the fact that article was posted in various groups.  Suppose
further the command has already located in the index an article to be
displayed.  This means the command has the {\tt message-id} and one of
the groups in which the article was posted.  The command is then able
to fetch the entire article from the file system.  Now it's a matter
of reading the article itself to know almost everything there is to
know about it.  (It's also interesting that we keep the index thin
because we need to allow it to grow to great sizes.)

%% (clsql:create-table "INDICES" '(([id] (string 1000)) ([grp] (string 1000)) ([article] (string 300))))
%% (clsql:execute-command "create table if not exists indices (id varchar(1000), grp varchar(1000), article varchar(300))")

%% LOOP> (clsql:create-index "idx_id_1" :on "indices" :attributes "id" :unique "id")
%% ; No value
%% (clsql:execute-command "CREATE UNIQUE INDEX if not exists idx_id_1 ON INDICES (ID)")

%% CL-USER> (clsql:list-attributes "indices")
%% ("ID" "GRP" "ARTICLE")

%% CL-USER> (clsql:insert-records :into "indices" :attributes '(id grp article) :values '("<87plr25js5.fsf@tor.soy>" "comp.unix" 37))
%% ; No value

%% CL-USER> (clsql:select 'id 'grp 'article :from "indices")
%% (("<87plr25js5.fsf@tor.soy>" "comp.unix"))
%% ("ID" "GRP")

\section{Essential operations relative to the index}

Here's how to query the index or how to insert a new article into it.
If [[insert-index]] returns [[nil]], then it's because it found no
errors at all.  The return value, therefore, indicates which error
ocurred.

We don't consider an error that an article has already been added to
the index.  For instance, we write multiple messages to the storage
when someone cross-posts, but we'll add just a single record to the
index, of course.  So, an article already indexed is normal situation.
Sure---in the future, we will not duplicate articles in storage; we
will make symbolic links.  We don't do that right now because Windows
doesn't really support symbolic links.

<<Essential operations relative to the index>>=
(defun insert-index (m g i)
  (handler-case
      (clsql:insert-records
       :into "indices"
       :attributes '(id grp article)
       :values (list (str:trim m) (str:trim g) (str:trim i)))
    (clsql-sys:sql-database-data-error (c)
      (cond ((= (slot-value c 'clsql-sys::error-id) 19)
             'already-indexed)
            (t
             ; We should log this error.
             ;(format t "other error: ~a" (slot-value c 'clsql-sys::database-message))
             'sql-error)))
    (:no-error ()
      nil)))

(defun lookup-index (mid)
  (let* ((found (clsql:select [grp] [article]
                              :from [indices]
                              :where [= [id] (str:trim mid)]))
         (article (first found))
         (grp (first article))
         (art (second article)))
    (when found
      (values grp art))))
@ %def insert-index lookup-index

\section{A procedure to import the index from the file system}

%% get group:
%% (first (last (pathname-directory (car (in-groups (directory "**/*"))))))

%% get article name:
%% (pathname-name (car (in-groups (directory "**/*"))))

%% get all articles
%% LOOP> (in-groups (directory "**/*"))

%% get the newsgroup header from the article
%% LOOP> (get-header-from-article "newsgroups" (fetch-article "comp.lang.lisp" "1"))

When we build the index from scratch, it's important to have a
procedure capable of reading all articles in the file system and index
them one by one.  This is what we implement right here.  For each news
group, we get the name of the file relative to the article---that's
the numeric ID of the article---and its {\tt message-id}.  That's all
we need to index it.

<<How to create an index from the file system>>=
(defun index-from-fs! ()
  (loop for path in (in-groups (directory "**/*"))
        do (let* ((g (str:trim (first (last (pathname-directory path)))))
                  (i (str:trim (pathname-name path)))
                  (m (str:trim (extract-mid (fetch-article g i)))))
             (when (> (length m) 0)
               (format t "article ~a/~a indexed by ~a~%" g i m)
               (insert-index m g i)))))

(defun remake-index-from-fs ()
  (remake-index!)
  (index-from-fs!))
@

Here's a program to build the index from a UNIX shell.

<<build-index-from-fs.lisp>>=
(load "~/.sbclrc")
(ql:quickload :loop :silent t)
(in-package #:loop)
(connect-index! "message-id.db")
(remake-index!)
(index-from-fs)
(format t "Index built.~%")
@

Be careful when using this program: it will build the database
[[message-id.db]], which is an operation that needs to be done only
once.  Here's how to use it:
%
\begin{verbatim}
%pwd
/home/dbastos/loop

%sbcl --script build-index-from-fs.lisp
Index built.

%ls -l message-id.db
-rw-r--r--  1 dbastos  wheel  65536 Aug 26 13:32 message-id.db
\end{verbatim}

\section{Deletion and locking of inactive accounts}\label{sec:inactive-users}

In [[remove-friend]], note that [[username]] is the account name and
[[friend]] is the name of the account being removed.  Notice as well
that we only know who invited the person after we can get a hold of
the account in [[accounts.lisp]].  This means we must scan each
account to delete an account---we can't delete an account and still
leave the account as someone's friend.

The program [[cron-remove-inactive-users.lisp]] can be executed every
day at midnight, say.

<<cron-remove-inactive-users.lisp>>=
(load "~/.sbclrc")
(ql:quickload :loop :silent t)
(in-package #:loop)
;; (format t *default-pathname-defaults*)
(read-accounts!)
(connect-index! "message-id.db")
(remove-inactive-users!)
(write-accounts!)
@ %def cron-remove-inactive-users.lisp

In [[remove-account]], we probably should use [[delete-if]] as well on
the list of friends since it is effectively what we are doing there
with [[setf]]. %% TODO

<<How to remove inactive users>>=
(defun remove-inactive-users! ()
  (loop for u in *accounts* do
    (let ((username (account-username u)))
      (format t "Username: ~a~%" username)
      (cond ((and (not (locked? username))
                  (inactive-from-never-logged-in? username))
             (post-notification
              :subject (fmt "account ~a removed by Loop" username)
              :body (fmt "~a didn't log in a first time (for ~a month~a) since account creation."
                         username *months-never-logged-in*
                         (plural *months-never-logged-in* "s")))
             (remove-account! username)
             (format t "Removed ~a due to never logging in.~%" username))
            ((and (not (locked? username))
                  (inactive-from-last-seen? username))
             (post-notification
              :subject (fmt "account ~a locked by Loop" username)
              :body (fmt "~a disappeared for over ~a month~a."
                         username *months-inactive-allowed*
                         (plural *months-inactive-allowed* "s")))
             (lock-account! username
                            (fmt "disappeared for over ~a months"
                                 *months-inactive-allowed*))
             (format t "Locked ~a due to long-time-no-see.~%" username))))))

(defun remove-account! (username)
  (loop for u in *accounts* do
    (delete-if #'(lambda (x) (equal x username)) (account-friends u)))
  (delete-if #'(lambda (x) (equal username (account-username x))) *accounts*))

(defun lock-account! (username why)
  (let ((u (get-account username)))
    (setf (account-pass-locked u) (account-pass u))
    (setf (account-pass u) "locked")
    (setf (account-pass-locked-why u) why)))

(defun remove-friend (username friend)
  (remove-if #'(lambda (x) (equal x friend))
             (account-friends (get-account username))))
@ %def remove-account! remove-friend

Accounts that do not have a creation date up until today---Tue Sep 17
21:37:18 ESAST 2024---will have its creation dates migrated to the
\Lp\ epoch, which is January 1st 2024, the exact month in which
\Lp\ was written.  But notice that this migration is done only once.
New system administrators of \Lp\ will never need to run this.

<<How to migrate accounts without a creation date>>=
(defun loop-epoch ()
  (encode-timestamp 0 0 0 0 1 1 2024))

(defun migrate-add-creation-and-post-date! ()
  (read-accounts!)
  (loop for u in *accounts*
        do (if (not (account-creation u))
               (setf (account-creation u) (timestamp-to-universal (loop-epoch)))
               (setf (account-last-post u) (account-seen u))))
  (write-accounts!))
@ %def migrate-add-creation-and-post-date! loop-epoch

Here's a program to run the migration in a UNIX shell.

<<migrate-add-creation-date.lisp>>=
(load "~/.sbclrc")
(ql:quickload :loop :silent t)
(in-package #:loop)
(migrate-add-creation-and-post-date!)
(format t "Accounts rewritten.~%")
@ %def migrate-add-creation-date.lisp

Now we write the procedures that discover what accounts are inactive.

<<How to enumerate inactive accounts>>=
(defparameter *months-inactive-allowed* 3)
(defparameter *months-never-logged-in* 1)

(defun user-inactive? (username)
  (or (inactive-from-never-logged-in? username)
      (inactive-from-last-seen? username)))

(defun inactive-from-never-logged-in? (username)
  (let ((u (get-account username)))
    (if (ever-logged-in? username)
        NIL
        (inactive-from? username *months-never-logged-in*
                        #'(lambda () (account-creation u))))))

(defun locked? (username)
    (equal "locked" (account-pass (get-account username))))

(defun inactive-from-last-post? (username)
  (let ((last-post (account-last-post (get-account username)))
        (creation (account-creation (get-account username))))
    (inactive-from? username *months-inactive-allowed*
                    (if last-post
                        #'(lambda () last-post)
                        #'(lambda () creation)))))

(defun inactive-from-last-seen? (username)
  (let* ((u (get-account username))
         (last-seen (account-seen u))
         (creation (account-creation u)))
    (inactive-from? username *months-inactive-allowed*
                    (if last-seen
                        #'(lambda () last-seen)
                        #'(lambda () creation)))))

(defun inactive-from? (username months timestamp-source)
  (timestamp<
   (timestamp+
    (universal-to-timestamp
     (funcall timestamp-source)) months :month)
   (now)))

(defun ever-logged-in? (username)
  (account-seen (get-account username)))

(defun never-logged-in? (username)
  (not (ever-logged-in? username)))

(defun list-inactive-users ()
  (loop for u in *accounts* do
        (format t "Username ~a is inactive? ~a~%"
                (account-username u)
                (user-inactive? (account-username u)))))
@ %def list-inactive-users

\section{A special-purpose language to ease writing}\label{sec:dsl}

These macros make up a tiny language to ease the writing of \lp.  For
example, when we need to access the group database, we use
[[in-groups]].  When a certain command demands authentication, we use
[[with-auth]].

<<Macros>>=
(defmacro in-dir (dir &rest body)
  `(let ((*default-pathname-defaults* (truename ,dir)))
     (uiop:with-current-directory (,dir)
       ,@body)))

(defmacro in-groups (&rest body) `(in-dir "groups/" ,@body))

(defun in-group-lambda (g fn) (in-dir g (funcall fn)))

(defmacro in-group (g &rest body)
  `(in-group-lambda ,(fmt "groups/~a/" g) (lambda () ,@body)))

(defmacro with-group (g r &rest body)
  (let ((g-var (gensym))
        (r-var (gensym)))
    `(let ((,g-var ,g)
           (,r-var ,r))
       (if (not (group? ,g-var))
           (make-response :code 411 :request ,r-var
                          :data (format nil "no such group ``~a''" ,g-var))
           (progn ,@body)))))

(defmacro with-n-args (n r &rest body)
  (let ((args-var (gensym))
        (message-var (gensym))
        (n-var n))
    `(let ((,args-var (request-args r))
           (,message-var ,(fmt "bad arguments: needs exactly ~a" n-var)))
       (if (not (= ,n-var (length ,args-var)))
           (make-response :code 400 :request ,r :data ,message-var)
           (progn ,@body)))))

(defmacro with-group-set (&rest body)
  (let ((g-var (gensym)))
    `(let ((,g-var (client-group *client*)))
       (if (not ,g-var)
           (bad-input r "must say GROUP first")
           ,@body))))

(defmacro with-auth (&rest body)
  `(if (not (auth?))
       (make-response :code 400 :data "You must authenticate first.")
       (progn ,@body)))

@ %def in-groups with-group with-n-args with-group-set with-auth

\section{Other procedures}

Of small importance, they have nothing.  Notice that [[ucs-2->ascii]]
is iseful only in Windows systems---and just for development.  The
procedure destructively converts UCS-2 to ASCII, so it's only really
useful when we're converting an implicitly ASCII-content in the form
of UCS-2.  Despite the name UCS-2, notice it is UTF-16.  The name UCS
stands for ``Universal Character Set'' and I speculate the number 2
means 2 bytes.  So our conversion is just removing the first byte.

<<Little procedures>>=
(defun plural (v suffix)
  (if (> v 1) "s" ""))

(defun debug? () nil)

(eval-when (:compile-toplevel :load-toplevel :execute)
  (defun fmt (cstr &rest args)
    (apply #'format nil (list* cstr args))))

(defun stderr (&rest args)
  (when (debug?)
      (apply #'format (cons *error-output* args))))

(defun enumerate (ls &optional (first-index 0))
  (loop for e in ls and i from first-index
    collect (cons i e)))

(defun ucs-2->ascii (bs)
  ;; I'm a Windows user.
  #-win32 bs #+win32 (remove-if #'zerop bs))

(defun bad-input (r msg &key code)
  (make-response :code (or code 400) :data msg :request r))

(defun integer->string (n)
  (format nil "~a" n))

(defun mkstr (&rest args) ;; a utility
  (with-output-to-string (s)
    (dolist (a args) (princ a s))))

(defun data (&rest args) ;; a utility
  (flatten (map 'list #'data->bytes args)))

(defun crlf ()
  (vector 13 10))

(defun crlf-string ()
  (format nil "~c~c" #\return #\linefeed))

(defun flatten (obj)
  (do* ((result (list obj))
        (node result))
       ((null node) (delete nil result))
    (cond ((consp (car node))
           (when (cdar node) (push (cdar node) (cdr node)))
           (setf (car node) (caar node)))
          (t (setf node (cdr node))))))

(defmacro mac (&rest body)
  `(macroexpand-1 ,@body))
@ %def bad-input crlf mkstr data crlf-string flatten ucs-2->ascii enumerate

\section{Tests}

I studied the minimum to be able to add these tests as we comprehend
better the direction in which we're going.  A test system is essential
for us to trust we can move forward without breaking past decisions in
the code.

<<Test procedures>>=
(setq lisp-unit:*print-failures* t)
(define-test first-test-of-the-west
  (assert-equal 0 0))

(define-test requests
  (let ((nil-request-1 (make-request))
        (nil-request-2 (make-request :said "  ")))
    (assert-true (request=? nil-request-1 (parse-request nil-request-1)))
    (assert-true (request=? nil-request-2 (parse-request nil-request-2)))
    (assert-true (request=? nil-request-1 nil-request-2))))

(define-test commands
  (let ((ht (make-hash-table))
        (c1 (make-command :fn #'cons :verb 'c1 :description "cons cmd"))
        (c2 (make-command :fn #'list :verb 'c2 :description "list cmd")))))

(define-test dispatching
  (assert-true (equalp (empty-response) (dispatch (empty-request)))))
@ %def

\section{How to produce the binary executable}

Just say {\tt make exe} to your shell.

<<build-exe.lisp>>=
(load "~/.sbclrc")
(ql:quickload :loop :silent t)
(sb-ext:save-lisp-and-die #P"loop.exe"
                         :toplevel #'loop:main
                         :executable t)
@

\section{How to update the remote server}\label{sec:live}

We automate here the process of updating and compilation of a new
version of \lp.  It's certain that what we document here is specific
to a single UNIX system, but what's important is that you (dear
reader) see exactly what must be done to go live with the system.

The system is composed of the Lisp package [[loop]].
The first thing to do is copy the files of each package to their
destinations in the remote server.  The system depends on
[[quicklisp]] and we use the directory called [[local-projects]] as
the repository of our packages.  So we just need ask {\tt ssh} to copy
the files.  We begin with [[make]] to extract all source files from
[[loop.nw]], which is the master source code of \lp.
%
\begin{verbatim}
%scp loop.asd loop.lisp me@remote:quicklisp/local-projects/loop
loop.asd                  | 0 kB |   0.2 kB/s | ETA: 00:00:00 | 100%
loop.lisp                 | 37 kB |  37.5 kB/s | ETA: 00:00:00 | 100%
\end{verbatim}

Files copied.  Now it's time to produce the executabler from the newly
installed source code.  To produce the executable, we run
[[build-exe.lisp]].  I'm going to demonstrate how to run this from my
own development machine.  Since I'm running Windows, I use [[plink]]
and not [[ssh]].
%
\begin{verbatim}
%scp build-exe.lisp me@remote:loop/
build-exe.lisp            | 0 kB |   0.2 kB/s | ETA: 00:00:00 | 100%

%plink -ssh me@remote cd loop/ && sbcl --script loop/build-exe.lisp \
  echo "Executable built."
\end{verbatim}

Produce the executable is sufficient because we're using Daniel
J. Bernstein's [[tcpserver]].  After replacing the executable in the
file system, new TCP connections will invoke the new executable while
older connections still alive will keep using the older executable
already loaded in memory.  There's nothing to restart, in other
words.

The target [[live]] in the [[Makefile]] automates the steps that we
have just described.  Have a look at the [[Makefile]], which is not
included here in this literate document.  With this automation, we
update the remote system with:
%
\begin{verbatim}
%make live
scp loop.asd loop.lisp \
   dbastos@antartida.xyz:quicklisp/local-projects/loop
loop.asd                  | 0 kB |   0.2 kB/s | ETA: 00:00:00 | 100%
loop.lisp                 | 37 kB |  37.5 kB/s | ETA: 00:00:00 | 100%
scp build-exe.lisp \
   dbastos@antartida.xyz:loop/
build-exe.lisp            | 0 kB |   0.2 kB/s | ETA: 00:00:00 | 100%
plink -ssh dbastos@antartida.xyz cd loop/ && \
   sbcl --script build-exe.lisp && \
           echo "Executable built."
Executable built.
\end{verbatim}

Yes, we could parameterize the command with the address of the remote
server and remote path to the installation.  But perhaps we will
always be the ones using this system, so we will delay this task until
further notice. %% TODO

\section{The package {\tt loop.lisp} as the compiler needs it}

We now put together all source code chunks in the order the compiler
needs to read it.  By the way, you see this call to
[[enable-sql-reader-syntax]]?  We need it at the top-level of any file
that uses the SQL syntax from [[clsql]].  You can see an illustration
of the syntax in, for example, [[lookup-index]].

One thing to keep in mind here is---I wonder if people that might read
this source code would read the literate programming \LaTeX\ output or
would they read [[loop.lisp]] directly.  For literate programmers, it
doesn't matter how [[loop.lisp]] turns out because only the compiler
reads [[loop.lisp]].  But if we care about anyone who might read
[[loop.lisp]], then we should perhaps tell our literate programming
tools to generate a nice-looking file.  For instance, I declare global
variables in the chunks where it's used.  But for someone reading
[[loop.lisp]] directly, it is perhaps better if they would see all
global variables at the top of the file.  That's something to think
about.

<<loop.lisp>>=
;;; -*- mode: LISP; syntax: COMMON-LISP; package: LOOP; -*-
(eval-when (:compile-toplevel :load-toplevel :execute)
  (ql:quickload
   '(:lisp-unit
     :str
     :uiop
     :cl-fad
     :cl-ppcre
     :local-time
     :iterate
     :clsql-sqlite3)
   :silent t))

(clsql:enable-sql-reader-syntax)

(defpackage #:loop
  (:use :common-lisp :local-time)
  (:import-from :lisp-unit define-test)
  (:import-from :iterate iter)
  (:export :main))

(in-package #:loop)

<<How to remove inactive users>>
<<How to enumerate inactive accounts>>
<<How to migrate accounts without a creation date>>
<<Reference to the database>>
<<How to create and connect to the index>>
<<Essential operations relative to the index>>
<<Little procedures>>
<<Macros>>
<<Representation of a client>>
<<Table of commands>>
<<Representation of requests and responses>>
<<Main loop>>
<<How to split a stream into lines>>
<<Representation and parsing of articles>>
<<Dispatching of commands>>
<<Authinfo command>>
<<Command mode reader>>
<<Commands head, body, article>>
<<Command xover>>
<<Command group>>
<<Command list>>
<<Help command>>
<<Command quit>>
<<Command date>>
<<Command post>>
<<Command create-group>>
<<Command create-account>>
<<Command login>>
<<Command passwd>>
<<Broadcasting>>
<<Command users>>
<<Command dd>>
<<Test procedures>>
<<How to create an index from the file system>>
@ %def

<<*>>=
<<loop.lisp>>
<<loop.asd>>
<<build-exe.lisp>>
<<build-index-from-fs.lisp>>
@

<<loop.asd>>=
;;; -*- Mode: LISP; Syntax: COMMON-LISP; -*-
(asdf:defsystem :loop
  :version "0.0.1"
  :description "An NNTP server for a group of friends."
  :depends-on (:lisp-unit :str :uiop :cl-fad :cl-ppcre)
  :components ((:file "loop")))
@ %def :loop

\section{The UNIX service}

We use the {\tt tcpserver} program by Daniel J. Bernstein from the
package \href{https://cr.yp.to/ucspi-tcp.html}{\tt ucspi-tcp}.

<<path to the service>>=
/home/dbastos/loop
@ %def

<<port number>>=
119
@ %def

<<run>>=
#!/bin/sh
echo loop
cd <<path to the service>>
exec <<path to the service>>/tcpserver -HR 0.0.0.0 <<port number>> <<path to the service>>/loop.exe
@ %def

How should you update the server if you modified the source-code?  See
Section~\ref{sec:live}.

\section{The writing process}

The program {\tt latexmk} is iseful when I'm writing \LaTeX\ in
general, but to get the attention of {\tt latexmk} we need to rewrite
{\tt loop.tex}.  So what I do while writing \lp\ is to have a
program---called \href{https://github.com/sjl/peat}{[[peat]]} by Steve
Losh---monitor the NOWEB source code {\tt loop.nw} effectively
invoking [[latexmk]] whenever {\tt loop.nw} is modified.  Have a look
at the target [[livedoc]] in the [[Makefile]].

\section{Why isn't {\tt Makefile} in {\tt loop.nw}}

I don't include {\tt Makefile} in the literate source code because I
use [[make]] to drive the literate programming tools.  It is true that
we could include the {\tt Makefile}, then run {\tt noweb} once to
extract the {\tt Makefile} from {\tt loop.nw} and then use [[make]]
after that.  However, I prefer to build a package that's totally
independent from the literate programming tools because, more often
than not, literate programming tools are usually unavailable in the
typical UNIX system out there.  This way, the package we offer the
public can be considered a typical UNIX source code package and
programmers need only worry about literate programming tools if they
decide to modify the source code.

The way I particularly run {\tt noweb} is always by asking for
specific chunks to be extracted.  So the command line I'd usually
write is, for example,
%
\begin{verbatim}
build-exe.lisp: loop.nw
	(any tangle -Rbuild-exe.lisp < loop.nw > build-exe.tmp || \
            (rm build-exe.tmp && exit 1)) && \
	    mv build-exe.tmp build-exe.lisp
\end{verbatim}
%
In other words, I dump the chunk into a temporary file so that I don't
destroy the previous version of the source code unless the extraction
produces no error.  This is too long of a command line and should be
issued by [[make]] itself.

\section*{Index of chunks}
\nowebchunks

\section*{Index of names}
\nowebindex

\end{document}