Network Working Group
Internet Drafts
January 1995
This document is an Internet-Draft. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."
To learn the current status of any Internet-Draft, please check the "1id-abstracts.txt" listing contained in the Internet-Drafts Shadow Directories on ds.internic.net (US East Coast), nic.nordu.net (Europe), ftp.isi.edu (US West Coast), or munnari.oz.au (Pacific Rim).
Distribution of this memo is unlimited. Please send comments to the <pgp-bugs@mit.edu> mailing list.
PGP was created by Philip Zimmermann and first released, in Version 1.0, in 1991. Subsequent versions have been designed and implemented by an all-volunteer collaborative effort under the design guidance of Philip Zimmermann. PGP and Pretty Good Privacy are trademarks of Philip Zimmermann.
This document describes versions 2.x of PGP. Specifically, versions 2.6 and 2.7 conform to this specification. Version 2.3 conforms to this specification with minor differences.
A new release of PGP, known as PGP 3.0, is anticipated in 1995. To the maximum extent possible, this version will be upwardly compatible with version 2.x. At a minimum, PGP 3.0 will be able to read messages and signatures produced by version 2.x.
Although signatures normally are found attached to the message or file that they sign, this is not always the case: detached signatures are supported. A detached signature may be stored and transmitted separately from the message it signs. This is useful in several contexts. A user may wish to maintain a separate signature log of all messages sent or received. A detached signature of an executable program can detect subsequent virus infection. Finally, detached signatures can be used when more than one party must sign a document, such as a legal contract. Each person's signature is independent and therefore is applied only to the document. Otherwise, signatures would have to be nested, with the second signer signing both the document and the first signature, and so on.
Both digital signature and confidentiality services may be applied to the same message. First, a signature is generated for the message and prepended to the message. Then, the message plus signature is encrypted using a conventional session key. Finally, the session key is encrypted using public-key encryption and prepended to the encrypted block.
The scheme used for this purpose is radix-64 conversion. Each group of three bytes of binary data is mapped into four ASCII characters. This format also appends a CRC to detect transmission errors. This radix-64 conversion, also called ASCII Armor, is a wrapper around binary PGP messages and protects them during transmission over non-binary channels, such as Internet email.
The following table defines the mapping. The characters used are the upper- and lower-case letters, the digits 0 through 9, and the characters + and /. The carriage-return and linefeed characters aren't used in the conversion, nor is the tab or any other character that might be altered by the mail system. The result is a text file that is "immune" to the modifications inflicted by mail systems.
   6-bit  character     6-bit  character     6-bit  character     6-bit  character
   value  encoding      value  encoding      value  encoding      value  encoding
     0       A           16       Q           32       g           48       w
     1       B           17       R           33       h           49       x
     2       C           18       S           34       i           50       y
     3       D           19       T           35       j           51       z
     4       E           20       U           36       k           52       0
     5       F           21       V           37       l           53       1
     6       G           22       W           38       m           54       2
     7       H           23       X           39       n           55       3
     8       I           24       Y           40       o           56       4
     9       J           25       Z           41       p           57       5
    10       K           26       a           42       q           58       6
    11       L           27       b           43       r           59       7
    12       M           28       c           44       s           60       8
    13       N           29       d           45       t           61       9
    14       O           30       e           46       u           62       +
    15       P           31       f           47       v           63       /
                                                                 (pad)      =
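The mapping in the table can be sketched in a few lines of Python. This is an illustration only, not PGP's own code: the names ALPHABET and radix64_encode are hypothetical, and the handling of a final short group (zero-fill, then "=" padding) is assumed to follow the usual radix-64 convention.

```python
import string

# The 64 characters of the table, indexed by 6-bit value (0 -> 'A', 63 -> '/').
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

def radix64_encode(data: bytes) -> str:
    """Map each group of three bytes to four characters from the table."""
    out = []
    for i in range(0, len(data), 3):
        group = data[i:i + 3]
        # Left-align the group in a 24-bit word, zero-filling any missing bytes.
        word = int.from_bytes(group, "big") << (8 * (3 - len(group)))
        chars = [ALPHABET[(word >> shift) & 0x3F] for shift in (18, 12, 6, 0)]
        # A final group of 1 or 2 bytes yields 2 or 3 significant characters;
        # the remainder is filled with the pad character.
        keep = len(group) + 1
        out.append("".join(chars[:keep]) + "=" * (4 - keep))
    return "".join(out)
```

For instance, the three bytes of the ASCII text "Man" encode to the four characters "TWFu".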
It is possible to use PGP to convert any arbitrary file to ASCII Armor. When this is done, PGP tries to compress the data before it is converted to radix-64.
ASCII Armor is created by concatenating the following data:
The Armor Headers are pairs of strings that can give the user or the receiving PGP program some information about how to decode or use the message. The Armor Headers are a part of the armor, not a part of the message, and hence should not be used to convey any important information, since they can be changed in transport.
The format of an Armor Header is that of a key-value pair, using the encoding of RFC-822 headers. PGP should consider improperly formatted Armor Headers to be corruption of the ASCII Armor. Unknown keys should be reported to the user, but so long as the RFC-822 formatting is correct, PGP should continue to process the message. Currently defined Armor Header keys include "Version" and "Comment", which give the PGP version used to encode the message and a user-defined comment, respectively.
The Armor Checksum is a 24-bit CRC converted to four bytes of radix-64 encoding, prepending an equal-sign (=) to the four-byte code. The CRC is computed by using the generator 0x864CFB and an initialization of 0xB704CE. The accumulation is done on the data before it is converted to radix-64, rather than on the converted data. For more information on CRC functions, the reader is asked to look at chapter 19 of the book "C Programmer's Guide to Serial Communications," by Joe Campbell.
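A minimal Python sketch of this checksum follows. The function names are hypothetical. The code assumes the usual reading of the generator, with an implicit 25th bit so that the full polynomial is 0x1864CFB, and it assumes the three CRC bytes are radix-64 encoded most significant byte first.

```python
import base64

CRC24_INIT = 0xB704CE   # initialization value given in the text
CRC24_POLY = 0x864CFB   # generator given in the text (top bit implicit)

def crc24(data: bytes) -> int:
    """Accumulate a 24-bit CRC over the binary (pre-armor) data."""
    crc = CRC24_INIT
    for byte in data:
        crc ^= byte << 16
        for _ in range(8):
            crc <<= 1
            if crc & 0x1000000:
                # Clear the overflow bit and apply the generator.
                crc ^= 0x1000000 | CRC24_POLY
    return crc & 0xFFFFFF

def armor_checksum(data: bytes) -> str:
    """Return '=' followed by the four radix-64 characters of the CRC."""
    return "=" + base64.b64encode(crc24(data).to_bytes(3, "big")).decode("ascii")
```

With no input bytes the CRC is simply the initialization value, so armor_checksum(b"") yields "=twTO".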
The Armor Tail is composed in the same manner as the Armor Headerline, except the string "BEGIN" is replaced by the string "END".
Literal byte strings are written from left to right, with pairs of hex nibbles separated by spaces, enclosed by angle brackets: for instance, <05 ff 07> is a byte string of length 3 whose bytes have numeric values 5, 255, and 7 in that order. All numbers in this document outside angle brackets are written in decimal.
The byte string of length 0 is called "empty" and written <>.
Definition. A whole number field is any byte string. It is stored in radix-256 MSB-first format. This means that a whole number field of length N with bytes b_0 b_1 ... b_{N-2} b_{N-1} in that order has value
b_0 * 256^{N-1} + b_1 * 256^{N-2} + ... + b_{N-2} * 256 + b_{N-1}.
Examples. The byte string <00 0D 64 11 00 00> is a valid whole number field with value 57513410560. The byte string <FF> is a valid whole number field with value 255. The byte string <00 00> is a valid whole number field with value 0. The empty byte string <> is a valid whole number field with value 0.
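The definition can be checked against these examples with a short Python sketch. The function name is hypothetical. The loop applies the radix-256 formula one byte at a time, which is equivalent to Python's built-in int.from_bytes(field, "big").

```python
def whole_number_value(field: bytes) -> int:
    """Value of a whole number field: radix-256, most significant byte first."""
    value = 0
    for b in field:
        # Shifting the running value by one byte and adding the next byte
        # is Horner's rule for b_0 * 256^(N-1) + ... + b_(N-1).
        value = value * 256 + b
    return value

# The examples from the text:
assert whole_number_value(bytes.fromhex("000D64110000")) == 57513410560
assert whole_number_value(bytes.fromhex("FF")) == 255
assert whole_number_value(bytes.fromhex("0000")) == 0
assert whole_number_value(b"") == 0
```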
Definition. A multiprecision field is the concatenation of two fields:
Some implementations may limit the possible range of B. The implementor must document which values of B are allowed by an implementation.
Examples. The byte string <00 00> is a valid multiprecision integer with value 0. The byte string <00 03 05> is a valid multiprecision field with value 5. The byte strings <00 03 85> and <00 00 00> are not valid multiprecision fields. The byte string <00 09 01 ff> is a valid multiprecision field with value 511. The byte string <01 00 80 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 07> is a valid multiprecision field with value 2^255 + 7.
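The following Python sketch is one reading that is consistent with all of the examples above. It is an assumption inferred from those examples, not a restatement of the definition: the first two bytes are taken as a bit count B, the remaining bytes as a whole number field of length (B+7)/8 rounded down, and the value must have exactly B significant bits. The function name is hypothetical.

```python
def parse_multiprecision(field: bytes) -> int:
    """Parse a multiprecision field and return its value, or raise ValueError."""
    if len(field) < 2:
        raise ValueError("missing 2-byte bit count")
    bit_count = int.from_bytes(field[:2], "big")   # B
    body = field[2:]
    if len(body) != (bit_count + 7) // 8:
        raise ValueError("body length does not match bit count")
    value = int.from_bytes(body, "big")            # V
    # Assumed rule: B is exactly the number of significant bits in V.
    if value.bit_length() != bit_count:
        raise ValueError("bit count is not the number of significant bits")
    return value
```

Under this reading, <00 03 85> is rejected because hex 85 has eight significant bits rather than three, and <00 00 00> is rejected because a bit count of 0 leaves no room for a value byte.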
Definition. A string field is the concatenation of the following:
Examples: <05 48 45 4c 4c 4f> is a valid string field which would normally be displayed as the string HELLO. <00> is a valid string field which would normally be displayed as the empty string. <01 00> is a valid string field.
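These examples are consistent with a one-byte length followed by that many bytes of string data. The Python sketch below assumes that layout, which is inferred from the examples only. The function name is hypothetical.

```python
def parse_string_field(field: bytes) -> bytes:
    """Return the string bytes of a string field, or raise ValueError."""
    if not field:
        raise ValueError("missing length byte")
    length = field[0]
    if len(field) != 1 + length:
        raise ValueError("length byte does not match the data that follows")
    return field[1:]

# The examples from the text:
assert parse_string_field(bytes.fromhex("0548454c4c4f")) == b"HELLO"
assert parse_string_field(bytes.fromhex("00")) == b""
assert parse_string_field(bytes.fromhex("0100")) == b"\x00"
```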
Definition. A time field is a whole number field of length 4, with value V. The time represented by the time field is the one-second interval beginning V seconds after 1970 Jan 1 00:00:00 GMT.
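As an illustration, a time field can be converted to a calendar time as follows. This is a Python sketch with a hypothetical function name, and it treats GMT as UTC.

```python
from datetime import datetime, timezone

def time_field_to_datetime(field: bytes) -> datetime:
    """Interpret a 4-byte time field as seconds after 1970 Jan 1 00:00:00 GMT."""
    if len(field) != 4:
        raise ValueError("a time field is exactly 4 bytes long")
    seconds = int.from_bytes(field, "big")   # whole number field, MSB first
    return datetime.fromtimestamp(seconds, tz=timezone.utc)
```

For example, the time field <00 01 51 80> has value 86400 and therefore denotes the second beginning at 1970 Jan 2 00:00:00 GMT.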
Definition. A packet structure field is a byte string of length 1, 2, 3, or 5. Its first byte is the cipher type byte (CTB), with bits labeled 76543210, 7 the most significant bit and 0 the least significant bit. As indicated below, the length of the packet structure field is determined by the CTB.
CTB bits 76 have values listed in the following table:
      10 - normal CTB
      11 - reserved for future experimental work
      all others - reserved
CTB bits 5432, the "packet type bits", have values listed in the following table:
      0001 - public-key-encrypted packet
      0010 - signature packet
      0101 - secret-key certificate packet
      0110 - public-key certificate packet
      1000 - compressed data packet
      1001 - conventional-key-encrypted packet
      1011 - literal data packet
      1100 - keyring trust packet
      1101 - user id packet
      1110 - comment packet (*)
      all others - reserved
CTB bits 10, the "packet-length length bits", have values listed in the following table:
      00 - 1-byte packet-length field
      01 - 2-byte packet-length field
      10 - 4-byte packet-length field
      11 - no packet length supplied, unknown packet length
As indicated in this table, depending on the packet-length length bits, the remaining 1, 2, 4, or 0 bytes of the packet structure field are a "packet-length field". The packet-length field is a whole number field. The value of the packet-length field is defined to be the value of the whole number field.
A value of 11 is currently used in one place: on compressed data. That is, a compressed data block currently looks like <A3 01 . . .>, where <A3>, binary 10 1000 11, is an indefinite-length packet. The proper interpretation is "until the end of the enclosing structure", although it should never appear outermost (where the enclosing structure is a file).
Options marked with an asterisk (*) are not implemented yet; PGP 2.6.2 will never output this packet type.
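The three CTB tables above can be combined into a small decoder. The following Python sketch uses hypothetical names, and it rejects any CTB whose bits 76 are not 10, which is only one possible policy for the reserved values.

```python
# Packet type bits (CTB bits 5432) as listed in the table above.
PACKET_TYPES = {
    0b0001: "public-key-encrypted packet",
    0b0010: "signature packet",
    0b0101: "secret-key certificate packet",
    0b0110: "public-key certificate packet",
    0b1000: "compressed data packet",
    0b1001: "conventional-key-encrypted packet",
    0b1011: "literal data packet",
    0b1100: "keyring trust packet",
    0b1101: "user id packet",
    0b1110: "comment packet",
}

# Packet-length length bits (CTB bits 10) -> size of the packet-length field.
# None stands for "no packet length supplied".
LENGTH_FIELD_BYTES = {0b00: 1, 0b01: 2, 0b10: 4, 0b11: None}

def parse_ctb(ctb: int):
    """Return (packet type name, packet-length field size) for a normal CTB."""
    if (ctb >> 6) & 0b11 != 0b10:
        raise ValueError("not a normal CTB")
    packet_type = (ctb >> 2) & 0b1111
    length_bits = ctb & 0b11
    return PACKET_TYPES.get(packet_type, "reserved"), LENGTH_FIELD_BYTES[length_bits]
```

Applied to the byte <A3> discussed above, parse_ctb(0xA3) reports a compressed data packet with no packet length supplied.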
Definition. A number ID field is a whole number field of length 8. The value of the number ID field is defined to be the value of the whole number field.
A version number of 2 or 3 is currently allowed for each packet format. New versions will probably be numbered sequentially up from 3. For backwards compatibility, implementations will usually be expected to support version N of a packet whenever they support version N+1. Version 255 may be used for experimental purposes.
A packet is the concatenation of the following:
Other characteristics of the packet are determined by the type of the packet. See the definitions of particular packet types for further details. The CTB packet-type bits inside the packet structure always indicate the packet type.
Note that packets may be nested: one digital envelope may be placed inside another. For example, a conventional-key-encrypted packet contains a disguised packet, which in turn might be a compressed data packet.
If the default option of compression is chosen, then the block consisting of the literal data packet and the signature packet is compressed to form a compressed data packet.
If compression has been used, then conventional encryption is applied to the compressed data packet formed from the compression of the signature packet and the literal data packet. Otherwise, conventional encryption is applied to the block consisting of the signature packet and the literal data packet. In either case, the cyphertext is referred to as a conventional-key-encrypted data packet.
Definition. A literal data packet is the concatenation of the following fields:
Field (c) suggests a filename. Field (d) should be the time at which the file was last modified, or the time at which the data packet was created, or 0.
Note that only field (e) of a literal data packet is fed to a message-digest function for the formation of a signature. The exclusion of the other fields ensures that detached signatures are exactly the same as attached signatures prefixed to the message. Detached signatures are calculated on a separate file that has none of the literal data packet header fields.
Signatures have different meanings. For example, a signature might mean "I wrote this document," or "I received this document." A signature packet includes a "classification" which expresses its meaning.
Definition. A signature packet, version 2 or 3, is the concatenation of the following fields:
A message digest algorithm reads a byte string of any length, and writes a byte string of some fixed length, as indicated in the table above.
The input to the message digest algorithm is the concatenation of some "primary input" and some "appended input."
The appended input is specified by field (c), which gives a number of bytes to be taken from the following fields: (d), (e), and so on. Typically the number is 5, for fields (d) and (e), or 7, for fields (d), (e), and (f). Any field not included in the appended input is not "signed" by field (k).
The primary input is determined by the signature classification byte (d). Byte (d) is one of the following hex numbers, with these meanings:
Signature packets are used in two different contexts. One (signature type 00 or 01) is of text (either the contents of a literal packet or a separate file), while types 10 through 1F appear only in key files, after the keys and user IDs that they sign. Type 20 appears in key files, after the keys that it signs, and type 30 also appears after a key/userid combination. Type 40 is intended to be a signature of a signature, as a notary seal on a signed document.
The output of the message digest algorithm is a message digest, or hash code. Field (k) contains the cyphertext produced by encrypting the message digest with the signer's private key. Field (j) contains the first two bytes of the unencrypted message digest. This enables the recipient to determine whether the correct public key was used to decrypt the message digest for authentication, by comparing this plaintext copy of the first two bytes with the first two bytes of the decrypted digest. These two bytes also serve as a 16-bit frame check sequence for the message.
The public-key-encryption algorithm is specified by the public-key cryptosystem (PKC) number of field (h). The following PKC numbers are currently defined:
A PKC number identifies both a public-key encryption method and a signature method. Both of these methods are fully defined as part of the definition of the PKC number. Some cryptosystems are usable only for encryption, or only for signatures; if any such PKC numbers are defined in the future, they will be marked appropriately.
PGP versions 2.3 and later encode the MD into a PKCS-format signature string, which has the following format:
      MSB               .   .   .                  LSB
       0   1   <FF>(n bytes)   0   ASN(18 bytes)   MD(16 bytes)
See RFC1423 for an explanation of the meaning of the ASN string. It is the following 18-byte hex value:
3020300c06082a864886f70d020505000410
Enough bytes of <FF> padding are added to make the length of this whole string equal to the number of bytes in the modulus.
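The layout of this string can be sketched in Python as follows. The function name and the modulus_len parameter are hypothetical. The sketch covers only the formatting step: it takes an already computed 16-byte message digest and does not perform the private-key operation that produces the signature itself.

```python
# The 18-byte ASN string quoted above.
ASN = bytes.fromhex("3020300c06082a864886f70d020505000410")

def pkcs_signature_string(digest: bytes, modulus_len: int) -> bytes:
    """Build 0 1 <FF>(n bytes) 0 ASN MD, exactly modulus_len bytes long."""
    if len(digest) != 16:
        raise ValueError("the MD field is 16 bytes long")
    # Three fixed bytes (0, 1, and the 0 separator) plus ASN plus MD.
    pad_len = modulus_len - 3 - len(ASN) - len(digest)
    if pad_len < 0:
        raise ValueError("modulus too short for this format")
    return b"\x00\x01" + b"\xff" * pad_len + b"\x00" + ASN + digest
```

With a 128-byte (1024-bit) modulus, this gives n = 128 - 37 = 91 bytes of <FF> padding.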
It is understood that some machines allow the user to set the internal clock to any time; however, this field should denote the current time as accurately as is reasonably possible.
The value of field (f) is the "validity period," a number of days. It is currently unused. It should be written as 0.
Definition. A compressed data packet is the concatenation of the following fields:
A compression number selects a compression algorithm for use in compressed data packets. The following compression numbers are currently defined.
Definition. A conventional-key-encrypted data packet is the concatenation of the following fields:
Definition. A conventional-encryption type byte is a single byte which defines the algorithm in use. It is possible that the algorithm in use may require further definition, such as key-length. It is up to the implementor to document the supported key-length in such a situation.
Definition. A public-key-encrypted packet, version 2 or 3, is the concatenation of the following fields:
The value of field (c) is the ID of K.
Note that the packet does not actually identify K: two keys may have the same ID, by chance or by malice. Normally it will be obvious from the context which key K was used to create the packet. But sometimes it is not obvious. In this case field (c) is useful. If, for example, a reader has created several keys, and receives a message, then he should attempt to decrypt the message only with the key whose ID matches the value of field (c). If he has accidentally generated two keys with the same ID, then he must attempt to decrypt the message with both keys, but this case is highly unlikely to occur by chance.
PGP version 2.3 and later encode the DEK into an MPI using the following format:
      MSB               .   .   .                            LSB
       0   2   RND(n bytes)   0   ALG(1 byte)   DEK(k bytes)   CSUM(2 bytes)
ALG refers to the algorithm byte for the secret key algorithm used to encrypt the data packet. The DEK is the actual Data Encryption Key, and its size depends on the encryption algorithm defined by ALG. For the IDEA encryption algorithm, type byte 1, the DEK is 16 bytes long. CSUM is a 16-bit checksum of the DEK, used to determine that the correct private key was used to decrypt this packet. The checksum is computed as the 16-bit sum of the bytes in the DEK. RND is random padding that expands the string to fill the size of the RSA public key that is used to encrypt the whole string.
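A Python sketch of this layout follows. The function names and the modulus_len parameter are hypothetical, and the RSA encryption step is not shown. Two details are assumptions rather than statements from the text: the padding bytes are kept nonzero so that the 0 separator can be located unambiguously, and CSUM is stored most significant byte first, like a whole number field.

```python
import os

def dek_checksum(dek: bytes) -> int:
    """16-bit sum of the bytes in the DEK."""
    return sum(dek) & 0xFFFF

def encode_dek(dek: bytes, alg: int, modulus_len: int) -> bytes:
    """Build 0 2 RND(n bytes) 0 ALG DEK CSUM, exactly modulus_len bytes long."""
    tail = bytes([alg]) + dek + dek_checksum(dek).to_bytes(2, "big")
    pad_len = modulus_len - 3 - len(tail)
    if pad_len < 0:
        raise ValueError("modulus too short for this DEK")
    # Assumption: padding bytes are nonzero. Mapping 0..255 onto 1..255 this
    # way is slightly biased, which is acceptable for an illustration only.
    rnd = bytes(b % 255 + 1 for b in os.urandom(pad_len))
    return b"\x00\x02" + rnd + b"\x00" + tail
```

For a 16-byte IDEA key (ALG 1) and a 128-byte modulus, the tail occupies 19 bytes, leaving n = 106 bytes of random padding.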
Definition. A public key packet is the concatenation of the following fields:
Definition. A user ID packet is the concatenation of the following fields:
William Stallings
Comp-Comm Consulting
P. O. Box 2405
Brewster, MA 02631
EMail: stallings@ACM.org
Philip Zimmermann
Boulder Software Engineering
3021 Eleventh Street
Boulder, Colorado 80304 USA
Phone: +1-303-541-0140
EMail: prz@acm.org