PkCrack - README

0. Version info

This is the README file for pkcrack 1.2.3. Version 1.2.3 is another bugfix release with no new features.

1. Disclaimer

This program may or may not do what you think it does. It may or may not do what its documentation (including this file) tells you it does. Use at your own risk! The author may not be held liable for any damage caused by running this program. Or any other damage, for that matter.

In particular, you are required to make backup copies of any valuable data that might be destroyed by running this program. It's a good idea to have backups, anyway.

This program was written for people who have encrypted their own files and forgotten their passphrases, or for people who have been the victim of some 'practical joke'. In any case it is for people who have a legitimate right (in whatever sense) to gain access to the encrypted data. It is not meant as a tool for wannabe hackers who try to steal other people's intellectual property!

2. Copyright

This package was written and is copyright by Peter Conrad <conrad@unix-ag.uni-kl.de>. Commercial use in any form is strictly forbidden! You may use parts of the code in your own programs for non-commercial use, provided you clearly state where you got it. Do not release software using parts of the code without the author's explicit consent.

3. What is this?

This package implements an algorithm for breaking the PkZip cipher that was devised by Eli Biham and Paul Kocher. A paper describing the attack is included in this package as "pkzip.ps.gz".

Since an astonishingly large number of people request the package every day, I have decided to release this program as "CardWare". I don't remember who coined that term, but its meaning is simple:

If you like the program, send me a postcard. Picture postcards from the area where you live are preferred. On the card you may write anything you like, e. g. how much you like the program, what a great person I am, or whatever comes to your mind. Be creative! :-)
Electronic postcards don't count.

My snail mail address is:

Peter Conrad
Am Heckenberg 1
56727 Mayen

Germany

4. Requirements

ANSI compatible C-compiler (gcc is fine)
about 33MB of (virtual) memory
Note that most of the memory is used only during the first cycles of the key-reduction stage. It runs fine with 16MB RAM (and sufficient swapspace). It should work with less than 16MB, but I don't really recommend this.
patience :-)
PGP or GPG (if you want to check the signatures)

5. Building pkcrack

(You can skip this section if you downloaded the pre-compiled binaries for Windows or other systems.)

Unpack the package by entering

   zcat pkcrack-<version>.tar.gz | tar xvf -

This will produce a directory "pkcrack-<version>". Cd into that directory:


   cd pkcrack-<version>

Since you're reading this file you probably have done that already.

The sources are kept in the subdirectory "src". Change into that directory, then enter "make" to build the programs. If you're not using GNU make you'll probably want to remove the "if DJGPP ... endif" lines in src/Makefile to avoid problems.

The program was written and tested under Linux, Solaris and 32-bit Windows, the latter using DJGPP-2.03 as well as cygwin. Other people have built and tested the programs under Windows and OS/2 using different compilers. Some Makefiles for these are included in the src directory.

If you want to port this to other platforms you should check the definitions of some data types:

byte	should have a range of [0..255]		(unsigned char)
ushort	should have a range of [0..65535]	(unsigned short)
uword	should have a range of [0..(2^32-1)]	(unsigned int)
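
When porting, a quick sanity check of these ranges might look like the sketch below. The pk_ prefixed type names and the function pk_types_ok are made up for this example (the prefix only avoids clashes with system headers); the real definitions live in pkctypes.h.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-ins for byte, ushort and uword. */
typedef unsigned char  pk_byte;
typedef unsigned short pk_ushort;
typedef unsigned int   pk_uword;

/* Returns 1 if all three types have exactly the ranges listed above,
 * 0 otherwise. Converting -1 to an unsigned type yields its maximum. */
static int pk_types_ok(void)
{
    return (pk_byte)-1   == 255u
        && (pk_ushort)-1 == 65535u
        && (pk_uword)-1  == 4294967295u;
}
```

If pk_types_ok() returns 0 on your platform, pick different underlying types until it returns 1.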

I can't think of other important changes right now. Please inform me of successful ports to other platforms; I may include them in the Makefile.

6. Using pkcrack

With version 1.2 pkcrack has become one of those "fire & forget" programs. If you are a hacker of the experimental kind you may want to look at the "more complete instructions" below. Otherwise, stick to the "simple instructions".

The first thing you have to know is that this program applies a known-plaintext attack to an encrypted file. A known-plaintext attack recovers a password using the encrypted file and (part of) the unencrypted file. Please note that cryptographers use the word 'plaintext' for any kind of unencrypted data, not necessarily readable ASCII text.

Before you ask why somebody would want to know the password when they already know the plaintext, think of the following situations:

Usually there's a large number of files in a ZIP-archive. Usually all these files are encrypted using the same password. So if you know one of the files, you can recover the password and decrypt the other files.
You need to know only a part of the plaintext (at least 13 bytes). Many files have commonly known headers, like DOS .EXE-files. Knowing a reasonably long header you can recover the password and decrypt the entire file.

Back to the program.

6.1 Simple Instructions

You need two files:

1. the ZIP-archive which you want decrypted, and
2. another ZIP-archive, containing at least one of the files from the encrypted archive in unencrypted form. This one has to be compressed with the same compression method used for the encrypted file.

Now, enter

  pkcrack -C encrypted-ZIP -c ciphertextname -P plaintext-ZIP 
          -p plaintextname -d decrypted_file -a

(This is supposed to be a single line. I've wrapped it because it doesn't fit in 80 chars. "Real computer scientists never comment their code - the identifiers are so long they can't afford the disc space." :-/ )

encrypted-ZIP: is the name (and path) of the encrypted ZIP-archive (see 1. above)
ciphertextname: is the name of the file in the archive, for which you have the plaintext
plaintext-ZIP: is the name (and path) of the ZIP-archive containing the compressed plaintext (see 2. above)
plaintextname: is the name of the file in the archive containing the known plaintext
decrypted_file: is the name of a file to which the decrypted archive will be written

All you have to do now is wait a little. Depending on the size of the plaintext and the speed of your computer, the program should terminate after about an hour. If the plaintext is very short (less than 100 bytes or so), it will probably take a lot longer.

After pkcrack is finished, you will find the decrypted archive in the file decrypted_file. You can unzip it using any unzip-program, e. g. pkunzip under DOS, or unzip under UNIX.

If decrypted_file doesn't exist, or if unzipping it produces CRC-errors, there are several things that may have gone wrong:

Maybe you used the wrong plaintext. Find better plaintext.
Maybe you used the correct plaintext but used the wrong compression method. Compress the plaintext with a different method.
Maybe pkcrack found more than one set of matching 'keys' and used the wrong one to decrypt the file. Examine the output of pkcrack, and use the sets of key[012]-values with the "zipdecrypt" program to decrypt the file.
Maybe something else went wrong. Now you're in trouble. Read the "more complete instructions" and try to understand why the program failed. Read the FAQ below.

6.2 More Complete Instructions

This section will explain some of the more esoteric options of pkcrack, as well as some of the other programs in this package.

Just like in the "simple instructions" you need two files. Not necessarily ZIP-archives, though. You can specify any file containing nothing but plain data, i. e. no ZIP-headers, and no other fancy stuff. In that case, simply don't specify the -C or -P options, but only the -c and -p options.

So there are two possible ways to tell pkcrack where to find the ciphertext:

-C encrypted-ZIP -c ciphertextname (see above), or
-c ciphertextname

As I said, in the latter case ciphertextname is the name of a file containing nothing but encrypted data.

Analogously there are two possible ways to specify the plaintext:

-P plaintext-ZIP -p plaintextname (see above), or
-p plaintextname

In the latter case, plaintextname is the name of a file containing nothing but (compressed) plaintext.

Note that PkZip prepends 12 random bytes to the compressed data before encryption, so the ciphertext file has to be 12 bytes longer than the plaintext file. If you know only part of the plaintext, the plaintext file can be even shorter. Usually, though, a difference of more or less than 12 bytes in the file sizes indicates wrong plaintext, or wrong compression method of the plaintext.
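
The size rule can be checked before starting a long run. The following is only a sketch; the helper pk_size_check and the constant PK_HEADER_LEN are names invented for this example and are not part of pkcrack.

```c
#include <assert.h>
#include <stddef.h>

#define PK_HEADER_LEN 12  /* random bytes PkZip prepends before encrypting */

/* Hypothetical helper: classify a ciphertext/plaintext size pair.
 * Returns  0 if the sizes differ by exactly 12 bytes (full plaintext),
 *          1 if the plaintext is shorter than that (partial plaintext),
 *         -1 if the plaintext is too long to fit the ciphertext. */
static int pk_size_check(size_t cipher_len, size_t plain_len)
{
    if (cipher_len < PK_HEADER_LEN)
        return -1;
    size_t payload = cipher_len - PK_HEADER_LEN;
    if (plain_len == payload)
        return 0;
    return plain_len < payload ? 1 : -1;
}
```

A result of 1 is fine if you really know only part of the file; otherwise it hints at wrong plaintext or the wrong compression method.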

If you want to extract data from a ZIP-archive, you can use the "extract" program contained in this package. Invoke it by entering

  extract ZIP-name name-in-ZIP

This will extract the (possibly encrypted and/or compressed) data stored in the archive ZIP-name under the name name-in-ZIP, and write those data to the file name-in-ZIP in the current directory. If name-in-ZIP contains the names of subdirectories you have to make sure these directories exist below the current directory before you start the extract program.

Another option used in the "simple instructions" above is the -d decrypted_file option. It tells pkcrack to decrypt the archive specified with the -C option and write the decrypted results to the file decrypted_file. Naturally, it can only be used in conjunction with the -C option.

If you do not specify the -d option, pkcrack will try to find a PkZip-password when it has found a set of keys. If it finds a password, you can use it to decrypt the archive with the pkunzip-program. If it doesn't, you can use the set of keys found by pkcrack with the zipdecrypt program contained in this package to decrypt the archive. Zipdecrypt must be called as follows:

  zipdecrypt key0 key1 key2 encrypted_archive decrypted_archive

where key0, key1, key2 is a set of keys found by pkcrack, encrypted_archive is the name of the archive to be decrypted, and decrypted_archive is the name of the file to which the archive will be written by zipdecrypt.
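
For the curious, here is a rough sketch of what decrypting with a set of keys amounts to. It follows the cipher as described in the ZIP application note, not the actual code in zipdecrypt.c (which also has to rewrite the archive headers). The names pk_keys, crc32_step, pk_update_keys, pk_stream_byte, pk_decrypt and pk_encrypt are made up for this example, and it assumes the three keys describe the cipher state just before the 12 prepended bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { uint32_t key0, key1, key2; } pk_keys;

/* One CRC-32 step (reflected polynomial 0xEDB88320), computed bit by
 * bit so that no lookup table has to be initialized. */
static uint32_t crc32_step(uint32_t crc, uint8_t b)
{
    crc ^= b;
    for (int i = 0; i < 8; i++)
        crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    return crc;
}

/* Advance the cipher state with one plaintext byte. */
static void pk_update_keys(pk_keys *k, uint8_t plain)
{
    k->key0 = crc32_step(k->key0, plain);
    k->key1 = (k->key1 + (k->key0 & 0xffu)) * 134775813u + 1u;
    k->key2 = crc32_step(k->key2, (uint8_t)(k->key1 >> 24));
}

/* Next keystream byte (called key3 in the paper). */
static uint8_t pk_stream_byte(const pk_keys *k)
{
    uint32_t temp = (k->key2 | 2u) & 0xffffu;
    return (uint8_t)((temp * (temp ^ 1u)) >> 8);
}

/* Decrypt buf in place; the keys advance on the recovered plaintext. */
static void pk_decrypt(pk_keys *k, uint8_t *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        buf[i] ^= pk_stream_byte(k);
        pk_update_keys(k, buf[i]);
    }
}

/* Encrypt buf in place; only here to show the round trip. */
static void pk_encrypt(pk_keys *k, uint8_t *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        uint8_t p = buf[i];
        buf[i] = p ^ pk_stream_byte(k);
        pk_update_keys(k, p);
    }
}
```

Note that no password appears anywhere: the three key values are all the cipher needs, which is why zipdecrypt works even when no password can be found.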

One option to pkcrack has not been mentioned yet: with -o offset you can specify an offset of the plaintext data into the ciphertext. This is for the special case that the known plaintext starts somewhere in the middle of the encrypted data. The default value for offset is 0, i. e. the 12 encrypted random bytes are not to be included in the offset.
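
To illustrate how I read the offset: plaintext byte i lines up with ciphertext byte 12 + offset + i, and XORing the two gives one byte of keystream. The sketch below assumes exactly that alignment; pk_keystream and PK_HEADER_LEN are hypothetical names, not functions from the pkcrack sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PK_HEADER_LEN 12  /* encrypted random bytes at the start */

/* Hypothetical helper: writes plain_len keystream bytes to out.
 * Returns 0 on success, -1 if the plaintext does not fit into the
 * ciphertext at the given offset. */
static int pk_keystream(const uint8_t *cipher, size_t cipher_len,
                        const uint8_t *plain, size_t plain_len,
                        long offset, uint8_t *out)
{
    long start = PK_HEADER_LEN + offset;
    if (start < 0 || (size_t)start > cipher_len ||
        plain_len > cipher_len - (size_t)start)
        return -1;
    for (size_t i = 0; i < plain_len; i++)
        out[i] = cipher[(size_t)start + i] ^ plain[i];
    return 0;
}
```

With the default offset of 0 the first known plaintext byte is matched against the 13th ciphertext byte, i. e. the first byte after the random header.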

There is another possible application of the offset: the so-called "random" bytes aren't that random. Older versions of PkZip used the CRC-checksum of the file as the last 2 "random" bytes, newer versions use "only" one byte of CRC. In that case you have to prepend the known CRC-bytes before the known plaintext file, and specify a negative offset (e. g. -1 if you know the last of the "random" bytes). This feature has not been tested very thoroughly. Be warned.
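
A sketch of preparing such an extended plaintext for the one-byte case follows. It assumes, as the ZIP application note describes, that the last "random" byte is the high-order byte of the file's CRC-32; please verify this for your archive before relying on it. The helper pk_prepend_check_byte is a made-up name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: writes the assumed check byte (high-order byte
 * of the CRC-32) followed by the known plaintext into out, which must
 * have room for plain_len + 1 bytes. Returns the new length. */
static size_t pk_prepend_check_byte(uint32_t crc,
                                    const uint8_t *plain, size_t plain_len,
                                    uint8_t *out)
{
    out[0] = (uint8_t)(crc >> 24);
    memcpy(out + 1, plain, plain_len);
    return plain_len + 1;
}
```

The resulting buffer would then be written to a file and passed to pkcrack as plaintext together with -o -1.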

What remains to be explained is the findkey program and its counterpart, the makekey program. You can use findkey to find a PkZip-password for a set of key[012]-values found by pkcrack. This is exactly what pkcrack does if you do not specify the -d option. Periodically, findkey prints information about its progress which can be used to restart the program at a later time. The information printed is of the form

  10: xx, or
  11: xxxx, or
  12: xxxxxx and so on.

To restart findkey, enter

  findkey key0 key1 key2 pwdlen initvalue

where key0, key1, key2 is a set of keys found by pkcrack, pwdlen is 10 or 11 or 12 (depending on the point where you want to resume), and initvalue is the "xx" printed by findkey. pwdlen and initvalue are optional parameters.

Note that findkey will take very long to find long passwords. There are 255 times as many possible passwords with a pwdlen of 11 as with a pwdlen of 10. It is probably wiser to use the zipdecrypt program instead.

The makekey program can be used to generate key0, key1 and key2 from a given password. You will not need it to break an encrypted file, but it's a nice tool for playing around with encrypted files.
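
What makekey computes can be sketched as follows. This is the key setup as given in the ZIP application note, written from scratch for this README; it is not copied from makekey.c or keystuff.c, and the names pk_keys, crc32_step, pk_update_keys and pk_init_keys are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { uint32_t key0, key1, key2; } pk_keys;

/* One CRC-32 step (reflected polynomial 0xEDB88320), bit by bit. */
static uint32_t crc32_step(uint32_t crc, uint8_t b)
{
    crc ^= b;
    for (int i = 0; i < 8; i++)
        crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    return crc;
}

/* Advance the cipher state with one byte. */
static void pk_update_keys(pk_keys *k, uint8_t c)
{
    k->key0 = crc32_step(k->key0, c);
    k->key1 = (k->key1 + (k->key0 & 0xffu)) * 134775813u + 1u;
    k->key2 = crc32_step(k->key2, (uint8_t)(k->key1 >> 24));
}

/* Start from the fixed initial values and feed in the password,
 * one byte at a time. password must be NUL-terminated. */
static void pk_init_keys(pk_keys *k, const char *password)
{
    k->key0 = 0x12345678u;
    k->key1 = 0x23456789u;
    k->key2 = 0x34567890u;
    for (size_t i = 0; password[i] != '\0'; i++)
        pk_update_keys(k, (uint8_t)password[i]);
}
```

findkey has to run this process backwards, which is why long passwords take so long to recover.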

7. Some details

Here is a short description of the source-files:

crc.c: This file contains functions for calculating CRC-32 checksums. The CRC-polynomial used is defined in crc.h. A lookup-table is used, which has to be initialized first.
crc.h: Header file for crc.c - this file contains macros for computing CRC-checksums using a lookup-table which has to be initialized using a function in crc.c.
debug.c, debug.h: These two files define some functions that can help in debugging the stage1 and stage2 functions. They require the encryption keys to be known, and can check if all the intermediate values are computed correctly.
exfunc.c: This file contains a function for reading data from a ZIP-archive into memory.
extract.c: This file contains the main() function of the "extract" program, which may be used to extract data from a ZIP-archive and write it to a file.
findkey.c: This program tries to find a PkZip-password for a given initial state of key0, key1 and key2. In the current version it prints information about the progress of the search to stdout every couple of minutes. You can use that information for resuming the search at a later time.
headers.h: This headerfile contains declarations of several data types used in ZIP-archives.
keystuff.c: This file contains functions for initializing and updating the internal state of the PkZip cipher.
keystuff.h: This is a header file for keystuff.c
main.c: This file contains the main() function of the PkZip-cracker. It reads the ciphertext and plaintext files and makes calls to the actual cracking stages.
makekey.c: This file contains the main() function of the makekey program.
mktmptbl.c: This file contains a function for initializing a lookup-table that is used for finding "temp" values for a given "key3" (refer to the paper if you want to know what "temp" and "key3" are).
mktmptbl.h: This is a header file for mktmptbl.c
pkcrack.h: This header file contains some constants used in the program and some global variables from main.c
pkctypes.h: This header file contains the type definitions for byte, ushort and uword. This is probably something you should change when you port the software to another platform.
readhead.c: This file contains several functions for reading and parsing headers in a ZIP-archive. Refer to the file "appnote.iz.txt" for details.
stage1.c: This file implements stage 1 of the cracking process, namely finding initial values for key2_n and reducing the number of possible values. See sections 3.1 and 3.2 of the paper.
stage1.h: This is a header file for stage1.c
stage2.c: This file implements stage2 of the cracking process, namely creating lists of key2-values, calculating the corresponding key1 and key0 values, decrypting the 12 prepended bytes and finally calling stage3. See sections 3.3, 3.4 and 3.5 of the paper.
stage2.h: This is a header file for stage2.c
stage3.c: This file implements stage 3 of the cracking process, namely finding a PkZip-password for a given internal state of key0, key1 and key2. It re-uses some code from stage2.c. See section 3.6 of the paper.
stage3.h: This is a header file for stage3.c
writehead.c: This file is the counterpart of readhead.c - it contains functions for writing headers in a ZIP-archive.
zdmain.c: This file contains the main function of the zipdecrypt program.
zipdecrypt.c: This file contains a function for decrypting a ZIP-archive with a given set of key[012] values. It produces a ZIP-archive which can be unzipped using pkunzip under DOS or unzip under UNIX. I wrote this because stage 3 (password generation) takes a couple of eons for finding long passwords.

For further information on the attack refer to the paper describing the algorithm (pkzip.ps.gz). Information on the format of pkzip-archives is contained in the file appnote.iz.txt. Additional information can be found in the section "Further Reading" below.

That's it. Some of the less obvious sections of the code are commented. Most aren't.

8. Hints

From a person wishing to remain anonymous:

> I had asked Dimitri for the source to the Win32 version
> of PkCrack because I wanted to change it to allow it
> to scan an encrypted PKSFX-style self-extracting
> Zip file.
> 
> However, after a while, I figured out
> that you can use PkZipFix to convert a Zip .EXE to
> a regular .Zip, which will work with PkCrack.
> PkCrack didn't recognize the .EXE file.

9. Frequently Asked Questions

Q: "When I run the program it says something about increasing constants and rebuilding. What is that supposed to mean?"

A: The program uses an internal array to store intermediate values. During the first cycles of the key-reduction-stage the number of intermediate values can exceed the size of the array. Since the program cannot increase the size of the array it prints an error message and stops. There are two ways to handle that problem:

If you have lots of plaintext you can simply strip off some of the plaintext bytes from the end of the plaintext file. This *may* help.
Increase the value of the constant KEY2SPACE in the file pkcrack.h by at least (1<<21) (that's 2M entries with 4 bytes each = 8MB), and recompile the program. Obviously, you need the sources and a decent C-compiler to do this.
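
The arithmetic behind that number, as a tiny sketch. The function name key2space_extra_bytes is made up; the 4 bytes per entry are taken from the text above.

```c
#include <assert.h>
#include <stddef.h>

/* Extra memory in bytes needed when the array grows by `entries`
 * entries of 4 bytes each. */
static size_t key2space_extra_bytes(size_t entries)
{
    return entries * 4u;
}
```

For example, key2space_extra_bytes((size_t)1 << 21) gives 8388608 bytes, i. e. 8MB.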

Q: "Shall I use compressed or uncompressed plaintext?"

A: You have to use plaintext compressed with exactly the same method that was used to compress the ciphertext. So if the ciphertext is uncompressed, use uncompressed plaintext. If the ciphertext is shrunk (or imploded, or mega-hyper-special-compressed), shrink (or implode, or mega-hyper-special-compress) your plaintext. A good indicator of the correct compression method is the size of the compressed plaintext: it has to be 12 bytes shorter than the ciphertext (that's for pkzip -v output; unzip -v under UNIX should report the same compressed sizes for plain- and ciphertext). I have heard that there are subtle differences in the compression methods in different versions of pkzip, so be sure to use the correct version for compressing the plaintext.

Q: "You said there were 12 random bytes prepended to the ciphertext. Does the ciphertext input to pkcrack have to include these 12 bytes?"

A: Yes.

Q: "But where shall I get the plaintext for the random bytes?"

A: You don't have to. Use only the plaintext, without anything prepended to it.

Q: "Hi, can you decrypt a file for me? I've attached it below." [followed by a ton of MIME-encoded junk]

A: PLEASE DON'T SEND ME YOUR ZIP ARCHIVES UNLESS I ASK YOU FOR THEM. Thanks.

Q: "I have an encrypted self-extracting archive. Can pkcrack break it?"

A: pkcrack cannot handle self-extracting archives. But there is a simple workaround: use pkzipfix to convert the self-extracting archive into a normal ZIP archive. Then, break the ZIP archive with pkcrack.

10. Further Reading

PKWare's original description of the ZIP file format:
http://www.pkware.com/products/white_papers/white_appnote.html

InfoZIP's modified version:
ftp://ftp.info-zip.org/pub/infozip/doc/appnote-011203-iz.zip

Some observations by Paul Kocher:
http://www.bokler.com/zipcrack.txt

An interesting improvement on the original attack:
http://www.woodmann.com/fravia/mike_zipattacks.htm
