It is often useful to read RFCs of the Internet Engineering Task Force (IETF) when trying to understand a networking issue or differences between vendor implementations. RFCs can be read in a web browser via the RFC Editor web site (RFC URLs look like https://www.rfc-editor.org/rfc/rfc<NUMBER>). The problem with this approach is RFC pagination, at least for RFC numbers below 8650, which web browsers do not support well: the header and footer lines, included as normal text in an RFC, disturb the reading flow. If only a web browser is available, the PDF rendering of such an RFC is often a better choice than the HTML version.
Starting with the number 8650, RFCs are published as XML files that are not intended to be read directly, but require rendering into a different format, e.g., HTML or PDF. While a text rendering exists, it may omit figures and diagrams. Thus it seems safer to read those newer RFCs in the official PDF or HTML format. The PDF rendering uses pagination, while neither the HTML nor the text rendering is paginated. The official HTML rendering can be found at the RFC Editor web site (the URLs look like https://rfc-editor.org/rfc/rfc<NUMBER>).
When using a GNU/Linux system I prefer to retrieve the RFC text version with GNU wget, (slightly) reformat the text using a small AWK script and sed, and display it inside an appropriately sized XTerm (or other terminal emulator window) using the less pager. This allows flipping through the pages much like a printed RFC.
If the system supports UTF-8, e.g. using a relatively recent GNU/Linux distribution with default configuration, this method automatically supports the use of UTF-8 as found in, e.g., RFC 8187.
This can be done using the rfc-reader script found on this page:
rfc-reader <NUMBER>
For new RFCs published in XML with an official (or original) HTML rendering, a more appropriate way to read an RFC on a GNU/Linux desktop system is to use xdg-open from xdg-utils to open the HTML version in a web browser:
xdg-open https://www.rfc-editor.org/rfc/rfc<NUMBER>.html
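The choice between the two reading methods can be sketched as a small Bash helper. This is only an illustration: rfc_view_command is a hypothetical name, not part of rfc-reader, and the threshold 8650 is taken from the observation above that RFCs from this number on are published in the new format.

```shell
#!/usr/bin/env bash
# Sketch: print the command line suited to a given RFC number.
# rfc_view_command is a hypothetical helper, not part of rfc-reader.
rfc_view_command() {
    local number=$1
    # Reject anything that is not a plain decimal number.
    case $number in
        ''|*[!0-9]*) echo "usage: rfc_view_command NUMBER" >&2; return 2 ;;
    esac
    # Force base 10 so that leading zeros (e.g. 0791) are not read as octal.
    number=$((10#$number))
    if (( number >= 8650 )); then
        # New-format RFC: official HTML rendering in the default browser.
        printf 'xdg-open https://www.rfc-editor.org/rfc/rfc%d.html\n' "$number"
    else
        # Classic paginated text RFC: use the paginated reader.
        printf 'rfc-reader %d\n' "$number"
    fi
}
```

The helper only prints the command instead of running it, so the result can be inspected first; running it would be a matter of passing the output to the shell.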
I have written a small Bash script called rfc-reader to comfortably read RFC documents in text format on GNU/Linux. It uses the method described above. It relies on GNU Wget to download files. When opening a new terminal window, it relies on the X Window System (also known as X Windows or just X) and works best with XTerm, but GNOME Terminal can work, too.
Besides the download link on this page, it is part of my collection of single-file tools in a git repository on GitHub. A change log for rfc-reader can be found in the form of the git commit log.
Since I am using GNU/Linux and prefer the GNU versions of Awk and sed, I am only testing with those implementations. Non-GNU implementations are often inferior and their use may break rfc-reader. Since version 0.40 rfc-reader requires the GNU Bash shell.
rfc-reader [OPTION...] [{rfc|bcp|fyi|ien|std}][{-|.| }]NUMBER[.txt]
rfc-reader [OPTION...] {I-D.|draft-}DRAFT_NAME[-DRAFT_NUMBER][.txt]
rfc-reader [OPTION...]
rfc-reader                        # print version, copyright, help
rfc-reader 1                      # downloads and displays RFC 1
rfc-reader rfc1                   # downloads and displays RFC 1
rfc-reader rfc 1                  # downloads and displays RFC 1
rfc-reader rfc-1                  # downloads and displays RFC 1
rfc-reader rfc.1                  # downloads and displays RFC 1
rfc-reader rfc1.txt               # downloads and displays RFC 1
rfc-reader rfc0001.txt            # downloads and displays RFC 1
rfc-reader bcp78                  # downloads and displays BCP 78
rfc-reader fyi3                   # downloads and displays FYI 3
rfc-reader ien137                 # downloads and displays IEN 137
rfc-reader std1                   # downloads and displays STD 1
rfc-reader draft-rep-wg-topic-00  # downloads & displays REP I-D
rfc-reader I-D.rep-wg-topic-00    # downloads & displays REP I-D
rfc-reader draft-rep-wg-topic     # downloads & displays REP I-D
rfc-reader I-D.rep-wg-topic       # downloads & displays REP I-D
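To illustrate how the many spellings of a document name above can map to the same document, here is a minimal Bash sketch. The function parse_doc_name is hypothetical and is not taken from rfc-reader, which may parse its arguments differently; Internet-Draft names are deliberately not handled here.

```shell
#!/usr/bin/env bash
# Sketch: normalize a document name such as "rfc-1" or "bcp78" to "SERIES NUMBER".
# parse_doc_name is a hypothetical helper; rfc-reader itself may parse differently.
parse_doc_name() {
    local arg="$*"    # "rfc 1" given as two words becomes one string
    local re='^(rfc|bcp|fyi|ien|std)?[-. ]?([0-9]+)(\.txt)?$'
    if [[ $arg =~ $re ]]; then
        # A bare number is treated as an RFC number; 10# strips leading zeros.
        printf '%s %d\n' "${BASH_REMATCH[1]:-rfc}" "$((10#${BASH_REMATCH[2]}))"
    else
        echo "not a recognized document name: $arg" >&2
        return 1
    fi
}
```

With this sketch, parse_doc_name rfc0001.txt and parse_doc_name rfc 1 both print "rfc 1", while a draft name is rejected with a non-zero exit status.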
Some GNU/Linux distributions include packages containing RFCs. rfc-reader looks for RFC files in the following directories, used by OpenSUSE and Debian/Ubuntu respectively:
/usr/share/doc/rfc
/usr/share/doc/RFC/links
If you know of additional locations used by GNU/Linux distributions or other operating systems to provide local RFC copies, please let me know.
If one of the directories $HOME/.rfc-reader/cache or $XDG_CACHE_DIR/rfc-reader (if $XDG_CACHE_DIR is unset or set to the empty string, $HOME/.cache is used instead) exists and is readable, rfc-reader will look for RFC files there. If one of the directories is writable, rfc-reader will use it to save downloaded RFC files. If both directories are writable, rfc-reader will prefer $XDG_CACHE_DIR/rfc-reader as download cache, but will still read RFCs from both directories.
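The preference order for the download cache described above can be sketched as follows. This is an illustration under the stated rules only; select_download_cache is a hypothetical function name and not the code used inside rfc-reader.

```shell
#!/usr/bin/env bash
# Sketch: pick the download cache following the preference described above.
# select_download_cache is a hypothetical helper, not rfc-reader's own code.
select_download_cache() {
    # Unset or empty XDG_CACHE_DIR falls back to $HOME/.cache.
    local xdg_dir="${XDG_CACHE_DIR:-$HOME/.cache}/rfc-reader"
    local home_dir="$HOME/.rfc-reader/cache"
    local dir
    # The XDG-style directory is tried first, so it wins if both are writable.
    for dir in "$xdg_dir" "$home_dir"; do
        if [[ -d $dir && -w $dir ]]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    return 1    # no writable cache directory: downloads cannot be saved
}
```

Neither directory is created by this sketch; as described above, a cache is only used if it already exists.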
If a file from a series with changing contents for a constant document number is found in the directory selected as download cache, a refresh of the file is attempted. If this fails (e.g., while offline), the cached copy is used. Currently (2022-08-14) I know that the contents of Best Current Practice (BCP) and Internet Standard (STD) documents can change for a given document number.
The download cache, if available, is used for support files as well, e.g., the list of Internet Drafts (all_id.txt) used to determine the latest version of an Internet Draft, if no specific version is requested.
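How a list of drafts can yield the latest version may be sketched like this. The sketch rests on assumptions that are not confirmed by this page: that each relevant line of the list starts with the full draft name followed by a two-digit version (e.g. draft-rep-wg-topic-02) as its first whitespace-separated field. latest_draft is a hypothetical helper and not the lookup used by rfc-reader.

```shell
#!/usr/bin/env bash
# Sketch: find the highest version of a draft in a list file.
# ASSUMPTION: the first field of each line is NAME-NN (NN = two-digit version).
# latest_draft is a hypothetical helper, not rfc-reader's own code.
latest_draft() {
    local name=$1 list=$2
    awk -v name="$name" '
        # Consider only entries that start with "NAME-".
        index($1, name "-") == 1 {
            version = substr($1, length(name) + 2)
            # Skip longer names such as NAME-ext-01; compare two-digit strings.
            if (version ~ /^[0-9][0-9]$/ && (found == "" || version > best)) {
                best = version
                found = $1
            }
        }
        END { if (found != "") print found; else exit 1 }
    ' "$list"
}
```

Called as latest_draft draft-rep-wg-topic all_id.txt, the sketch prints the highest-numbered matching entry, or exits with a non-zero status if there is none.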
If rfc-reader is started without an X Window System environment, i.e., with empty or unset DISPLAY environment variable, instead of opening a new terminal window of appropriate size, rfc-reader will use the terminal it is started in. If the RFC is paginated, and the terminal is too small to show a full RFC page inside the selected pager, rfc-reader will exit with an error.
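The size check can be sketched as below. The numbers are taken from this page (58 lines per page, 72 characters per line, plus one line for the status line of the pager); page_fits is a hypothetical helper and not the check implemented in rfc-reader.

```shell
#!/usr/bin/env bash
# Sketch: decide whether a terminal can show a full 58-line RFC page.
# page_fits is a hypothetical helper; the numbers follow the text of this page.
page_fits() {
    local rows=$1 cols=$2
    local page_lines=58 page_cols=72
    # One extra row is needed for the status line of the pager.
    (( rows >= page_lines + 1 && cols >= page_cols ))
}
```

In an interactive terminal, the current dimensions can be passed in using tput, e.g. page_fits "$(tput lines)" "$(tput cols)" || echo "terminal too small" >&2.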
XTerm: -fn c-9x18 -fg green -bg black -bd green -g 72x59
less (inside XTerm): -M -S
GNOME Terminal: --hide-menubar --disable-factory --geometry 72x59
less (inside GNOME Terminal): -M -S
Starting rfc-reader as a background job does not work well with GNOME Terminal. There seems to be a race between setting the terminal dimensions and starting the pager inside the terminal. The result is a wrong output size inside the pager. As a workaround you can start rfc-reader in the foreground, suspend it (usually CTRL+Z in the terminal it was started from), and then make it a background process with bg.
The problem mentioned above does not occur with XTerm.
The RFCs use a format that works well when printed on paper. It uses pages with headers and footers. All pages are roughly the same size, one which fits the paper size of most text printers (but actual RFC page size varies slightly). Defining page breaks and thus pagination for all of the RFC is actually necessary to ensure that ASCII art drawings, diagrams, and tables are not affected by page breaks automatically inserted when printing.
Pagination, specifically using header and footer lines, does not work well with the variable size of screens, terminal windows, and fonts. So-called pagers (like less) usually adapt to the current screen or window size and allow continuous scrolling through the text, losing the benefits of pagination. Worse, headers and footers interrupt reading instead of unobtrusively providing reference information. Therefore something a little more sophisticated than just viewing the original RFC text file with a pager inside an arbitrarily sized terminal window is needed. The good news is that simple adjustments to both the RFC text file and the terminal window suffice.
It should be noted that some pagers, e.g., some versions of more and pg, support pausing on Form Feed characters. Using such a pager in a sufficiently large terminal allows reading an RFC page-by-page. With more -c, only the current page is displayed. Sadly, the more comfortable pagers less (de-facto default on GNU/Linux) and most do not support pausing on Form Feed.
A text document comprised of pages with a fixed number of lines, the first and last used as header and footer, looks good in a pager providing exactly the same number of lines. Since RFCs have a defined maximum number of lines (58) and maximum number of characters per line (72) [see RFC 2223], using a text display area of this size should work fine, but not all pages in an RFC have the same length. For printing, the form feed character (FF, ASCII code 12) ends a page. Modern pagers usually do not use this convention. Therefore my rfc-reader script fills all pages to be of equal length (58 lines), allowing paginated reading on screen, and opens a terminal window of just the right size wherein the re-formatted text is shown inside a pager.
To display an RFC in text format with a pager inside a sufficiently large terminal window, the form feed character should be expanded to fill the terminal (excluding the status line of the pager). When using a shell that sets the LINES environment variable, e.g. GNU Bash, this can be done as follows (using less as pager):
awk -v n_lines="${LINES}" '!/\f/ {line++; print}; /\f/ {for (i=line; i+1<n_lines; i++) print ""; line=0}; END {for (i=line; i+1<n_lines; i++) print ""}' RFC.txt | sed '/^$/!s/^/ /' | less -MS
Alternatively, you can use env to unset the DISPLAY environment variable and use rfc-reader. rfc-reader then cannot open a new terminal and will instead try to use the existing one in which it was started:
env -u DISPLAY rfc-reader RFC
While I prefer to read paginated RFC documents in a page-based viewer, you might prefer to remove pagination from the RFC and read the contents as one long stream of text. To do this you can use the rfcstrip tool (not to be confused with the different rfcstrip mentioned in RFC 8407 section 3.11) in combination with a pager. Since rfcstrip removes the page breaks (i.e., form feed characters) along with running headers and footers, and uses a sophisticated squeezing of empty lines, the results constitute nice input for any pager, including less, without need for specific options:
rfcstrip RFC.txt | less
See the RFC Format Change FAQ for an overview and links to the details of the planned changes to the format of (new) RFC documents. Information pertaining to rfc-reader and reading new-format RFCs on text-based systems can be found below.
RFC 6949 drops the pagination requirements (section 3.3), thus the rfc-reader script will lose some of its usefulness for new RFCs some time in the future, but will remain very helpful for RFCs published according to RFC 2223. This includes the already published well-paginated RFCs since there is no plan to re-publish them in the new format.
RFC 7990 names XML using the xml2rfc v3 vocabulary as described in RFC 7991 the canonical format for future RFC documents. While XML is a text-based format, it is not conveniently readable as-is.
A plain-text version of the XML RFC shall be produced as one of several RFC publication formats. This plain-text version is intended to be a low-fidelity version of the RFC, possibly omitting figures instead of using ASCII art as known from previous RFCs. This plain-text version might still use pagination, i.e., form feed characters, with a maximum of 58 lines per page, but omit page headers and footers (see RFC 7994, but RFC 7990 specifies that unpaginated plain-text will be created, and RFC 6949 retired the pagination requirement). Thus rfc-reader might still be able to correctly format the new plain-text RFCs, but with little, if any, benefit over just using a pager that ignores the meaning of form feed characters. Perhaps the best way to view those new-style plain-text RFCs will be to remove the form feeds, reduce consecutive empty lines to a single empty line (often called squeezing), and use a pager to view the result. This squeezing is done automatically by rfc-reader if the RFC text does not contain form feed characters.
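The rule "squeeze only if there are no form feeds" can be sketched as follows. This is not the code of rfc-reader; squeeze_if_unpaginated is a hypothetical helper, and it assumes a cat implementation that supports the -s (squeeze) option, as GNU cat does.

```shell
#!/usr/bin/env bash
# Sketch: squeeze repeated empty lines only in text without form feeds.
# squeeze_if_unpaginated is a hypothetical helper, not rfc-reader's own code.
# ASSUMPTION: cat supports -s (true for GNU cat).
squeeze_if_unpaginated() {
    local file=$1
    if grep -q $'\f' "$file"; then
        cat "$file"       # paginated text: pass through unchanged
    else
        cat -s "$file"    # unpaginated text: squeeze runs of empty lines
    fi
}
```

The output can then be piped into any pager, e.g. squeeze_if_unpaginated RFC.txt | less.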
As of October 2019, the RFC document format change no longer lies in the future.
The rfcstrip script can remove basic pagination from such a text rendering. It can remove classical pagination as well. Thus a script is already available.
Since basic pagination means just the addition of form feed characters, it can be stripped out using a POSIX compatible tr:
tr -d '\f' < RFC.txt
But it seems as if the new text rendering of RFC documents in canonical XML format does not use any pagination, not even basic pagination. To prepare the text version for printing on a text printer, it may help to use pr to add pagination. This, of course, risks introducing page breaks inside of figures, since pr does not analyze the content to avoid this.
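A possible pr invocation is sketched below. The options -F (separate pages with form feeds), -l (page length) and -h (header text) are standard pr options; the page length of 58 follows the classic RFC page size. paginate_text is a hypothetical wrapper. Note that pr adds its own page header, so fewer than 58 lines of RFC text fit on each page, and, as said above, page breaks may land inside figures.

```shell
#!/usr/bin/env bash
# Sketch: add simple pagination to unpaginated text read from standard input.
# paginate_text is a hypothetical wrapper around pr.
paginate_text() {
    # $1: text to show in the page header, e.g. "RFC 8655"
    # -F: separate pages with form feeds, -l 58: page length in lines
    pr -F -l 58 -h "$1"
}
```

Usage could look like paginate_text "RFC 8655" < RFC.txt > RFC-paginated.txt, with the result sent to a text printer.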
There is also a simple Python script called pagerfc, available under a free software license, that adds pagination to the unpaginated text rendering of an RFC. Currently (2024-04-08), this script does not seem to consider artwork when inserting page breaks.
The free software xml2rfc Python script can still create pagination, at least for Internet-Drafts, even though the RFC text renderings created by the RFC Editor are not paginated. It could be possible to use it to create a paginated text rendering from the XML version of an RFC, too.
Looking at RFC 8655, pagination seems to have been dropped from the text version: at least the text version of this RFC accessed on 2019-10-25 does not contain any form feed characters. The same holds for RFC 8651, the first RFC published using the new format. The lowest-numbered RFC using the new format seems to be RFC 8650.
The plain-text version of an RFC will no longer be restricted to ASCII, but will use UTF-8 encoding, with the (at best useless for UTF-8) Unicode Byte Order Mark (BOM) prepended. It is probably best to remove the BOM before viewing the text-format RFC.
Viewing the (possibly crippled, because figures and diagrams might be omitted) plain-text version of a future RFC could thus be done as follows:
sed '1s/^'$'\uFEFF''//;/^[[:space:]]*\f[[:space:]]*$/d' RFC.txt | less -s
(The above works with GNU sed, uses sed to remove the BOM as well as lines with a form feed character possibly surrounded by white space used for pagination, and uses a function of the less pager to squeeze blank lines. The rfc-reader script automatically removes a prepended BOM in UTF-8 encoding.)
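The sed command above relies on the shell expanding $'\uFEFF' to the UTF-8 encoded BOM. A variant that names the three BOM bytes explicitly is sketched below; it assumes GNU sed (for the \xHH escapes and \f) and runs sed in the C locale so that the bytes are matched one by one. strip_bom_and_ff is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: remove a leading UTF-8 BOM (bytes EF BB BF) and form feed lines.
# strip_bom_and_ff is a hypothetical helper.
# ASSUMPTION: GNU sed, which understands \xHH and \f in regular expressions.
strip_bom_and_ff() {
    LC_ALL=C sed '1s/^\xEF\xBB\xBF//; /^[[:space:]]*\f[[:space:]]*$/d'
}
```

It reads standard input, so it can be combined with a pager as before: strip_bom_and_ff < RFC.txt | less -s.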
As RFC 7994 notes in its Security Considerations section, “[U]nintended changes to the text as a result of the transformation from the base XML file could in turn corrupt a standard, practice, or critical piece of information about a protocol.”
The PDF/A-3 (see RFC 7995) and HTML (see RFC 7992 and RFC 7993) versions of those future RFCs, rendered from the canonical XML version, are probably the only realistic options left to read all content of a future RFC, including figures and diagrams. This, of course, requires the use of complex software and graphical displays.
The 50th anniversary of the RFC series in 2019 thus marks the end of an era.