To discuss this paper and the subject of security in general, subscribe to the mcgtalk list.

Ed Gerck
Copyright © 1999 by E. Gerck and MCG, first published
on Jan 5, 1999 in the mcgtalk list server.
Archived at http://mcwg.org/mcgmirror/unicity.htm
All rights reserved, free copying and citation allowed
with source and author reference.
The objective of this communication is thus to address the subject of security from this perspective - which can be summarized in two questions: (i) how can one protect one's data against a brute-force attacker with almost unlimited resources? and (ii) can this be done under the current secret-key export-free bit-length restrictions of the WA?
Apparently, the answer to both questions together must be negative - as answered by the very existence of the WA itself. After all, the very purpose of the WA limitations on cryptography depends on a negative answer.
However, in 1949 Claude E. Shannon, an American scientist, published an application of his Information Theory (then called "Communication Theory") to Cryptography [Sha49] - which, in this author's opinion and as discussed here, can be used to derive positive and practical answers to both questions (i) and (ii) above, together. This will also show that the Wassenaar Arrangement is a moot issue for security and privacy concerns - notwithstanding all the other, different issues that it may raise [EFF98].
To answer the questions, we need first to revisit the concept of "unicity".
Shannon [Sha49] defined "unicity distance" (hereafter, "n") as the least amount of plaintext which can be uniquely deciphered from the corresponding ciphertext - given unbounded resources by the attacker. The "amount" of plaintext (i.e., "n") can be measured in any units the user may find convenient, such as bits, bytes, letters, symbols, etc. Actually, Shannon used "letters" in his paper.
Of further importance - and often ignored or even contradicted by declarations in the literature such as "any cipher can be attacked by exhaustively trying all possible keys" - I want to call attention to the fact that any cipher (including 56-bit DES) can be theoretically secure against any attacker - even an attacker with unbounded resources - when the cipher is used within its unicity. Not only the One-Time Pad (also called XOR-cipher or Vernam cipher). Thus, there is indeed a theoretically secure defense even against brute-force attacks, which is to work within the unicity limit of the cipher. And it works for any cipher - irrespective of key-length or encryption method used.
It is important to note, as the literature has also not been very neat in this regard, that unicity always refers to the plaintext. However, it may also be applied to indicate the least amount of ciphertext which needs to be intercepted in order to attack the cipher - within the ciphertext/plaintext granularity. For example, for a simple XOR-cipher (Vernam cipher), one byte of ciphertext links back to one byte of plaintext - so a unicity of n bytes implies n bytes of ciphertext. For DES, the ciphertext must be considered in blocks of 8 bytes - so a unicity of n bytes implies the corresponding number of whole 8-byte blocks (i.e., n rounded up to a multiple of 8 bytes).
Further, the present results are discussed for English and for compressed English text, but they could equally have been discussed for any natural language, or for any protocol or document format - including fax, modem signals or specialized protocols tunneled within the encryption [see Ger99, Section 2].
Please note that it does not matter how the attacker may try to decipher - he can of course use brute force and try out all keys, or he can use shortcuts; it is his choice and he is entirely free to use any method he desires.
The main point is that if the intercepted message has less than "n" (bits, bytes, letters, etc.) of plaintext, then it cannot be uniquely deciphered from an analysis of its ciphertext - even by trying all keys. Conversely, any message that is larger than "n" could theoretically be broken - albeit that may require some large (actually, unspecified) amount of work.
I call attention to the fact that the denominator of the formula above may be zero in some cases of interest - currently, at least of theoretical interest. This means that the formula will fail to provide an answer in such cases. I expect to address this situation in a new and expanded formulation of unicity, which will follow several aspects to be advanced below - especially in regard to the Sections on identification of unicity, different unicity definitions and "added trust". The interplay between unicity and trust will also be further explored in future messages.
In the first case we (the attacker) trust that:
The answer to both questions is the unicity, in this case. In the general case, the first answer is the unicity and the second answer is the least number of cipher blocks which link back to the unicity. With the given data, the unicity is calculated as:
In the second case, we add the trust that the plaintext is in English. What changes? Our trust on f - which is now:
However, I note that the above value is only correct for very long text - certainly not for 8-character text. Thus, it will be considered here as an approximation, as usually done in the literature - for lack of a tabulated entropy value for 8-character English text. The value H(English) = 1.2 bits/character will be used throughout this exposition so that, at least, the results will be consistent with one another. This also corresponds to a worst-case analysis in security - since the entropy value for 8-character English text should be larger than the value for very long English text.
The calculation is:
Thus, the added trust in f' (vis-a-vis f) has decreased the unicity approximately 5 times.
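For the record, the second-case figure can be checked with a short sketch of the standard unicity formula n = H(K)/D (values as in the text: H(K) = 56 bits for a DES-sized key, 8 bits/character maximum, H(English) = 1.2 bits/character; the function name is illustrative):

```python
def unicity(key_bits, max_bits_per_char, source_bits_per_char):
    """Shannon unicity: least plaintext (in characters) that can be
    uniquely deciphered, n = H(K) / D with per-character redundancy D."""
    redundancy = max_bits_per_char - source_bits_per_char
    if redundancy == 0:
        # The denominator may be zero (see the remark above): no answer.
        raise ZeroDivisionError("zero redundancy - formula gives no answer")
    return key_bits / redundancy

# Second case: the attacker trusts the plaintext to be English.
n_english = unicity(56, 8, 1.2)
print(round(n_english, 1))  # -> 8.2 characters
```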
An example may further help clarify the differences, and the benefits of compression - albeit 100% known to the attacker.
First, it is interesting to comment on compression/decompression as an encoding/decoding operation in comparison with encryption/decryption as an encipherment/decipherment operation.
Encipherment/decipherment operations are deterministic by themselves in their values (even though they may include random seeds). However, they are usually so complex, and involve so many different possibilities as a function of the keys and of all possible messages (which are unknown by definition), that one must - and is justified to - resort to probabilistic methods in order to describe average values over the variety of possible variables and outcomes. This is what we have done by defining the entropy of the keys in a cipher system, H(K), as well as the entropy of the actual message, H(M), and the maximum possible entropy of the plaintext M.
To contrast, lossless compression/decompression can be seen as a reversible deterministic encoding which essentially forms a 1:1 partial function, such that given a valid encoding one can uniquely and directly determine the valid decoding, and vice-versa - possibly not by a linear transformation but by an affine transformation, which is still a 1:1 partial function and deals only with deterministic values. Thus, we may represent compression simply by a deterministic factor which scales bits/character to bits/compressed-character.
Following the arguments above, one can view the compression argument of this Section in an alternative way which may be more illuminating. Suppose we use 2x compression. Then the same amount of English information (i.e., information as what you do not expect; surprise) is present before and after compression - but now in half the space. Thus, 2 characters = 1 compressed-character. Of course, compression has limits - the limit of compression is the original entropy of the plaintext - but 2x compression is usually a feasible goal for English text.
But, must we really consider the compression method known to the attacker?
Yes. In cryptography, especially with commercial programs, one usually assumes that only the secret key is not known (even though possible statistics about the secret-key may be known, such as the fact that most UNIX users do not use anything besides lowercase letters and a few symbols such as !$%&amp; in their passwords). Thus, one must assume that all details, at least about the decompression algorithm (i.e., the compression algorithm may be secret in special one-way communication protocols), are either known or can be decompiled (e.g., the RC4 cipher algorithm released on the Usenet).
This makes it simple for the attacker to set up an interface module that translates valid compressed-text into plaintext, such that given one it is immediately possible to know the other - deterministically, unambiguously, uniquely - by simply inverting the affine transformation. Of course, no knowledge about the underlying plaintext language statistics is needed for this. To identify which compression algorithm is in use, the attacker can rely upon publicly-known and necessarily present "magic numbers" or "information tags", as discussed in Section 2 of [Ger99].
For special protocols, in high-security applications, the "magic numbers" can also be implicit or encoded with a random seed in each use. However, the encoded data itself may betray its presence by its very structure, size, range, randomness, redundancy, etc. This is a problem in identification and, as given in [Ger98], "to identify is to look for connections" - not just to look for an identity connection with a set of "magic numbers". The attacker can thus rely on any intrinsic connections he may find or try.
To look for such intrinsic connections, it is interesting to note that the plaintext statistics are indeed translated into the compressed-text statistics, while of course possibly warped by the affine transformation - yet still recognizable through the affine transformation rules, which form the 1:1 partial function. So, clever analysis can even allow the attacker to work directly in compressed space, if one so wants, when analyzing a cipher. For example, Huffman coding assigns the most probable character to the code "0", the second most likely to the code "10", and so on (here, one may also have "1" and "01", but that is irrelevant). Thus, for English, we know immediately what they are: "0" is " " (space) and "10" is "e". And so on, also using digrams, trigrams, etc. for the dicode, tricode, etc. probabilities. The same applies to other compression methods such as Lempel-Ziv, pkzip, zoo, gzip, etc. - not to mention the familiar headers and "magic numbers" that identify each compression method (see the respective comments in Section 2 of [Ger99]). For example, several compression methods rely on "autocode", which is a name for "uniquely decodable code" - i.e., an easily distinguished target.
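The Huffman observation above can be illustrated with a minimal coder (a sketch on an assumed toy frequency table, not any particular compressor; real English frequencies would be tabulated from a corpus). It shows the general property being relied upon: more probable symbols never get longer codewords than less probable ones.

```python
import heapq
from itertools import count

def huffman_codes(freqs):
    """Build a Huffman code for {symbol: frequency}; returns {symbol: bits}."""
    tie = count()  # tie-breaker so the heap never compares dicts
    heap = [(f, next(tie), {sym: ""}) for sym, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)
        f2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + b for s, b in c1.items()}
        merged.update({s: "1" + b for s, b in c2.items()})
        heapq.heappush(heap, (f1 + f2, next(tie), merged))
    return heap[0][2]

# Toy frequencies (assumed for illustration): space and "e" dominate English.
freqs = {" ": 0.19, "e": 0.10, "t": 0.07, "a": 0.06, "o": 0.06, "n": 0.06}
codes = huffman_codes(freqs)
print(sorted(codes.items(), key=lambda kv: len(kv[1])))
```

With any Huffman construction, the most frequent symbol (here, space) receives one of the shortest codewords - which is exactly the statistical handle the attacker exploits in compressed space.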
Now, for the compression example, I will use the last case of the example in Section 3 and include a new trust-point after f' - i.e., added public information which the attacker knows and can rely upon:
As we can see from the above, 2x compression has increased English text unicity more than twice - actually, 22% more than the compression ratio - vis-a-vis the second case.
Thus, this is not a question of efficiency alone - it is a fundamental consequence of compression, since compression must increase the text's entropy per compressed-character. But, of course, compression also adds to communication efficiency, since there are fewer bytes to send for the same amount of text.
The benefit of compression is to increase unicity, and it does not depend on obscurity of the compression algorithm.
In an alternative view, compression allows you to tunnel through brute force with more plaintext than in the case without compression.
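The figures above can be reproduced numerically (a sketch; it assumes, as in the text, 8 bits/character maximum, H(English) = 1.2 bits/character, and ideal 2x compression, i.e., 2.4 bits per compressed-character; the exact gain comes out at roughly 21-22% beyond the compression ratio):

```python
H_K = 56.0       # DES key entropy, in bits
MAX = 8.0        # maximum bits per (compressed-)character
H_ENG = 1.2      # bits per English character, as assumed in the text

n_plain = H_K / (MAX - H_ENG)        # unicity without compression (~8.2 chars)
n_comp = H_K / (MAX - 2 * H_ENG)     # unicity in compressed-characters (~10)
n_comp_in_english = 2 * n_comp       # 1 compressed-char = 2 English chars: 20

gain_over_ratio = n_comp_in_english / (2 * n_plain)
print(round(n_plain, 1), round(n_comp_in_english, 1), round(gain_over_ratio, 2))
# -> 8.2 20.0 1.21
```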
It may seem that the more the attacker trusts information about the intercepted message, the lower the unicity. It seems intuitive that the more the attacker knows, the less he would need to discover.
This is, however, provably wrong.
Added open trust - publicly known to all, including the attacker - can either increase or decrease unicity, as has been proved and exemplified in Sections 3 and 4 above.
Of course, to measure and understand this effect one needs to refer unicity to the same units of plaintext - for example, in characters of English text. Otherwise, we would be comparing apples with speedboats.
There is much more to the subject of added "open trust" than this brief introduction can contain, when one wants to use it to increase the security of cipher systems. Another example is the recent results reported by Tony Bartoletti in the mcgtalk list server [MCG99], on plaintext hardening before encryption. However, for the moment, suffice it to say that, demonstrably, added open trust can increase unicity - even when publicly known to all - and works as a type of "open-key".
Or, in other words, adequately adding open trust - i.e., adding Open-Keys - proved to decrease what the cipher system might disclose at most for a given length of intercepted ciphertext, under proper unicity conditions - notwithstanding the public's (and the attacker's) full knowledge about such trust. With Open-Keys, the more the attacker knows, the safer the system is.
Of course, the same feature is trivially true for "closed trust" - i.e., that which is trusted but is not publicly known, such as a "secret-key". And this is the basis for the current WA export limitations [WA98], which limit the amount of "closed trust" one may have or require. That the same feature, but not the same limitation, applies to "open trust" is very noteworthy - and not controllable.
The concept that emerges from such discussions is that, since Open-Keys can provably enhance the security of too-short secret-keys, it is worthwhile to investigate the eventual limits of such synergy. For example, is there a maximum limit to the size P of an English message that can be sent within unicity conditions for a secret-key bit-length N and an open-key bit-length M? What are the trade-offs?
The use of open trust should not, however, be confused with "open ignorance" - where "ignorance" can also be seen as "absent trust" or "Formless trust" (cf. "Formless" as discussed in [Ger98] and "Formless trust" in trustref list discussions). This will be discussed in the next item.
5.2. Absent Trust and Unknown-Keys - 70-bit DES
What happens when everyone ignores a secret but only one person has trusted information that may lead to that secret? Then we expect that one person to be able to discover the secret sooner - by using that trust - all other resources being equivalent. But if the resources are not equivalent - can we still favor one in lieu of the others? Yes, if we balance trust and ignorance to counter the resource imbalance, as this item demonstrates in two different scenarios.
First, I will consider increasing the security of Triple-DES when used with just one 56-bit key - leading to an effective 59-bit key-length. Second, I will consider a more general case, describing a protocol called M-DES - for which the effective key-length is (56 + M) bits. For M = 14, a practical case, this implies a 70-bit key-length. All these methods are export-free under the WA - since the only secret-key has 56 bits. The other security bits are initially ignored by recipient and attackers alike.
Consider the cipher protocol for one-key Triple-DES [NIST99], where the cipher is given by Encrypt[K1, Decrypt[K1, Encrypt[K1, Plaintext]]] = Encrypt[K1, Plaintext] (so that one-key Triple-DES is equivalent to the usual DES) and is represented by the letters EDE. Now, consider a (yet hypothetical) newer Triple-DES Standard, where all 8 combinations (EEE, EED, DDE, ..., DDD) and all key orderings K1, K2, K3 would be allowed and post-selected by the sender, Bob, after algorithm negotiation (Triple-DES) with Alice. This corresponds to added uncertainty. And even if the attacker knows or supposes that one-key Triple-DES is being used (though a Triple-DES cipher was negotiated), there is still an open total of 8 possible choices with K1 = K2 = K3 for the one-key Triple-DES case. But Alice, who would have the right key from Bob (e.g., sent by a public-key cipher), would only have to test the first DES block to see which scheme was randomly chosen by Bob - for which she just needs to try out one block 4 times, on average. An attacker would also need to try out an average of four combinations, but each one with an average of half the corresponding key-space of 2^56 - which would at least multiply by 4x the average attack workload for one-key DES, due to the combination uncertainty of EEE ... DDD.
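The mode-uncertainty arithmetic can be sketched quickly (the extended standard is hypothetical, as above; note that the exact expectation for a sequential search over 8 equiprobable choices is 4.5 trials, close to the average of 4 used in the text):

```python
from itertools import product

# The 8 hypothetical stage combinations (E = encrypt, D = decrypt).
combos = ["".join(c) for c in product("ED", repeat=3)]

# Alice, who has the key, finds Bob's random choice by trial on one block;
# sequential search over 8 equiprobable choices takes (8 + 1) / 2 trials.
expected_trials = sum(range(1, 9)) / 8      # exact expectation: 4.5

# The attacker's brute-force workload grows by the number of combinations
# he must cover, each over half the 2^56 key-space on average.
workload_factor = len(combos) / 2           # at least ~4x one-key DES
print(combos)
print(expected_trials, workload_factor)
```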
In the above example, no one knows (neither Alice nor anyone else) which of the 8 possible schemes Bob did use - everyone openly ignores it. Thus, this situation can be defined as "absent trust" or "open ignorance".
The concept of open ignorance can be seen as defining an "unknown key". And it can be applied not only to the cipher algorithm itself, as done above, but also to the ciphertext, as discussed in the next paragraph - and in combination. This is in contrast to the usual technique of adding "salt" (random data) to the plaintext, or to the algorithm's bit manipulation as done in the "crypt" procedure for UNIX passwords [MOV97, p. 393-394]. These forms of adding variability do not actually benefit from open ignorance to make a brute-force attack more difficult, because the "salt" is publicly known - it is not an unknown key. Actually, their main purpose is to make a dictionary attack more difficult [MOV97]. Further, even if unknown, "salt" added to the plaintext would not increase the attacker's workload for a brute-force attack.
The effect of open ignorance can be exemplified in simple one-key 56-bit DES encryption, where an unknown key is applied to the ciphertext and imposes a large decryption uncertainty. As will be shown next, it can easily bootstrap key-length by a large factor - without breaking any export restriction on the 56-bit limitation.
Consider a hypothetical M-DES Standard (hereafter, M-DES) which would specify a total of 2^M predefined and publicly known, different 64-bit blocks of data, for various choices of M = 0, 1, ... Now, when Bob wants to send a message to Alice, he needs to negotiate an M-DES cipher with Alice, which includes a 56-bit secret-key [Note1] and a publicly defined value for M, say M = 14. The value of M is chosen by Bob according to his security needs [Note2], and accepted (or not) by Alice. To use M-DES, Bob would privately choose one 64-bit "key" at will (from the public list of 2^14 "keys") and then repeatedly use this choice to XOR-encode [Note3], one by one, each of the 64-bit DES ciphertext blocks calculated with the 56-bit secret-key he shares with Alice - without disclosing his choice (i.e., anyone else would ignore the choice). Thus, Bob's choice is an "unknown key" to anyone else, including Alice - providing open ignorance. An attacker would have a workload (2^14)/2 = 8,192 times higher on average, for all 56-bit secret-key combinations to be tried - which would effectively raise the DES key-length from 56 to 70 bits. But the intended recipient Alice, who knows the 56-bit secret-key, would just have to try out an average of 8,192 DES decryptions of one 8-byte block in order to find out which "unknown key" was chosen by Bob - a small initial overhead which can be averaged out over different sessions that use the same "unknown key" (albeit with possibly different DES secret-keys, see below).
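A minimal end-to-end sketch of this flow follows. It is NOT real DES - the standard library has no DES, so a toy 64-bit Feistel cipher built on SHA-256 stands in; a small M is used for speed, and all names and values (shared_key, choice, N0) are illustrative. The block structure, the public sequential "unknown key" list, Bob's XOR-encoding and Alice's trial search are as described in the text:

```python
import hashlib

BLOCK = 8  # 64-bit blocks, as in DES

def _round_fn(key, half, rnd):
    return hashlib.sha256(key + half + bytes([rnd])).digest()[:4]

def toy_encrypt(key, block):
    """4-round Feistel toy cipher over 64-bit blocks (a stand-in for DES)."""
    left, right = block[:4], block[4:]
    for rnd in range(4):
        left, right = right, bytes(a ^ b for a, b in
                                   zip(left, _round_fn(key, right, rnd)))
    return left + right

def toy_decrypt(key, block):
    left, right = block[:4], block[4:]
    for rnd in reversed(range(4)):
        left, right = bytes(a ^ b for a, b in
                            zip(right, _round_fn(key, left, rnd))), left
    return left + right

def xor64(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

# Public part: 2^M sequentially numbered 64-bit "unknown keys" (M tiny here).
M, N0 = 6, 1
PUBLIC_LIST = [(N0 + i).to_bytes(BLOCK, "big") for i in range(2 ** M)]

shared_key = b"56bitKY"            # the 56-bit secret-key (7 bytes, stand-in)
choice = 37                        # Bob's private pick - disclosed to no one
plaintext = b"Attack a"            # one 8-byte block of English
sent = xor64(toy_encrypt(shared_key, plaintext), PUBLIC_LIST[choice])

# Alice: trial-decrypt the first block against each public "key", keeping the
# decryptions that look like valid plaintext (printable ASCII here).
def looks_valid(block):
    return all(32 <= b < 127 for b in block)

candidates = [i for i, k in enumerate(PUBLIC_LIST)
              if looks_valid(toy_decrypt(shared_key, xor64(sent, k)))]
assert choice in candidates        # the true choice always survives
print(candidates)
```

The true choice always appears among the candidates; a rare false survivor (for such a small M) would be weeded out by checking a second block - the two-block argument of [Ger99].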
Of course, the same example can be used with the extended one-key Triple-DES discussed before, but increasing the perceived key-length by 3 additional bits (i.e., for the 8 possible combinations EEE ... DDD) - to 73 bits for one-key Triple-DES. For two-key and three-key DES, additional ignorance can come from an undisclosed key order.
On a practical question: how can 2^14 blocks of 64 bits be defined and known worldwide without doubt? This is a total of 16,384 blocks of 8 bytes each. If stored as a table, they would take approximately 130 Kbytes. But they can also be defined as an algorithm - in much less space. For example, as recursive applications of a secure hash function such as SHA-1 [MOV97] on a simple and known seed such as "0", or as the output of a well-known long-cycle random-number generator with known seed. But, since ignorance does not pertain to the blocks themselves but rather to which block is chosen, there is a much simpler solution. If the decryption algorithm is a random cipher (see Section 2) at least over a subset of the plaintext space (as it must be for some subset - for example, DES is a random cipher over a 56-bit plaintext space [Ger99]), then the needed 16,384 (i.e., 2^14) blocks of 64 bits each can be formed by simple sequential numbering from N_o = 1 to (N_o + 16,384), where N_o is publicly known and fixed - e.g., as a 64-bit number in the case of DES.
To increase ignorance to the full extent of the 64-bit space allowed by the ciphertext block, without the corresponding overhead, two dialogue parties could use a trusted channel (with the needed trust evaluated as a function of risk and cost for each possible channel: personal contact, mail, fax, phone, a previous e-mail, etc.) in order to establish a mutual long-term secret N_o = (1 to 2^(64 - M)). Then, N_o is used together with a publicly-known ignorance limit of 2^M, where M &lt;&lt; 64, in the M-DES protocol (exemplified by the M = 14 case given above). The mutual long-term secret should be different for each dialogue party, increasing security. This would effectively increase the M-DES bit-length to 120 bits - from a protocol secret-key of only 56 bits - defining a (64 - M)-bit key besides the 56-bit DES key and the unknown-key with M bits. However, since this third key needs to be secret (for M = 14, this key would have 50 bits - too large for low-latency discovery), it would require the user to enter it or read it from a file, floppy, etc. - which might not be allowed by a strict interpretation of [WA98]. However, publicly known M and N_o, as given by the previous example, are of course not "secret" and are allowed by [WA98].
One question is the usual problem of key repetition in XOR-encoding [Bek82] when the message to be enciphered is longer than the key - which is clearly unsafe for non-random (e.g., English) messages. However, note that DES ciphertext encrypted with the same DES key is itself random from block to block - even if only one bit of plaintext changes from block to block [Ger99]. Thus, the unknown-key can be kept constant in the XOR-encoding, since the message itself (i.e., the DES ciphertext) that will be XOR-encoded is either constant or random. In other words, an unlimited number of DES ciphertext 8-byte blocks can be securely XOR-encrypted by a constant unknown-key of only one 8-byte block.
A related question is collision after the final M-DES XOR encipherment, which can occur in two different ways - and which, however, only affects the attacker's knowledge of whether a message was repeated, not of what the message is. In the worst case of equal plaintexts before DES, given 2^M choices (for example: 16,384) for the unknown-key, it is a simple mathematical problem [Bek82] to calculate the maximum allowed number of repeated uses of the same 56-bit DES key that would guarantee a near-zero probability of randomly choosing the same unknown-key for that DES secret-key in different sessions - thus deriving worst-case reuse limits of the same DES secret-key to avoid XOR-ciphertext collision, for each M. The same question can be asked for its dual: calculating (in the worst case of equal DES plaintexts) the maximum allowed number of repeated uses of the same unknown key, if the DES secret-key is changed for each session. Clearly, since M &lt;&lt; 56, the last question can allow the dialogue parties to strongly reduce the average ignorance computation overhead for a chosen M, by spreading it over several sessions - while the first question allows key negotiation to be adequately scheduled for a chosen M. But neither collision event is of practical concern if every session which randomly chooses a DES key (from 2^56) also randomly chooses the unknown key (from 2^M), and vice-versa - even for small M.
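The "simple mathematical problem" for the reuse limit is the standard birthday computation; a sketch follows (the 0.1% collision tolerance is an assumed example, not a value from the text):

```python
def collision_prob(sessions, choices):
    """Birthday problem: probability that at least two of `sessions`
    uniformly random unknown-key picks (out of `choices`) coincide."""
    p_distinct = 1.0
    for i in range(sessions):
        p_distinct *= (choices - i) / choices
    return 1.0 - p_distinct

def max_reuses(choices, tolerance):
    """Largest session count keeping collision probability <= tolerance."""
    s = 1
    while collision_prob(s + 1, choices) <= tolerance:
        s += 1
    return s

# For M = 14 (16,384 public unknown-keys) and an assumed 0.1% tolerance:
print(max_reuses(2 ** 14, 1e-3))   # -> 6 sessions per DES secret-key
```

As the text notes, this scheduling question disappears if both the DES key and the unknown key are drawn fresh and independently for each session.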
But how does Bob decide on M? Implicitly - because Bob is not interested in M per se. Bob's concern is security versus cost, for which Bob needs to define a threat model. Let us suppose that Bob's threat model is dominated by a brute-force attack for X years under currently estimated military technology - this directly implies a minimum key-length K0 needed to defend against such a threat within an acceptable confidence level (e.g., 99.999% per key). Obviously, Bob wants an effective key-length that is at least equal to K0 - which value he inputs to M-DES. To find the effective key-length, M-DES solves a decision problem which depends on at least two independent variables: plaintext statistics and M. For given statistics and M, is the effective key-length larger than K0 - yes or no? If not, and if possible, increase M. Some of these issues are discussed in [Note2] and lead to the following table for some values:
MESSAGE        Entropy (bits/char)   M    Effective key-length (bits)
English text   1.2                   0    56
English text   1.2                   14   70
random text    8                     0    (120) - full unicity [Note2]
These and other cryptanalysis issues of M-DES, including higher key-length modes with N_o public or private, will be further addressed elsewhere, since they are outside the scope of this paper - on unicity. For public discussions, also forwarded from other fora and private messages, please see the mcgtalk list server [MCG99].
To summarize the M-DES scheme, the objective is to slow down the attacker much more than the penalty imposed on Alice and Bob - so much more that the attack becomes infeasible within a given threat model. This is granted because:
Therefore, open ignorance can much improve limited key-length protocols, as exemplified for DES - raising them even above the 64-bit limit of controlled export [WA98] for symmetric ciphers such as DES - in a practical setting. This is also useful because DES is a very well-tested cipher, with few and known weak points [Ger99] which can be well accounted for - except its low 56-bit key-length, which can be solved as above.
Thus, the above exposition is not only a "proof of principle" that key-length export restrictions can be easily foiled, but also offers a practical way to achieve worldwide standards for 70-bit cryptography, or much more - today. Future expositions will show how this can be extended to (for example) 184-bit security, or even to theoretically unbreakable security, still with only a 56-bit secret-key.
In general, the method can be applied as a "meta-method" to any block cipher system - not only to DES - defining an "M-X" method for cipher X. To summarize the overall scheme, the open ignorance technique depends on two parts:
In conclusion, the usefulness of unicity is perhaps not only as a measure to help "grade" cipher systems (where it may miserably fail if used as originally defined - see Section 2), but, more interestingly, as a quantity inversely related to what a cipher system may disclose at most - when any number of its cipher messages are attacked with unlimited computational resources, including time. In this, open trust was shown to play an essential role in increasing unicity - as an effective but open key, which however does not change the perceived secret-key bit-length. This was contrasted with open ignorance, which increases the perceived secret-key bit-length by an added uncertainty that is shared by all and leveraged by the original secret-key bit-length itself.
However, it is important to note that "open ignorance" adds uncertainty in order to protect the cipher by increasing the attack workload within limited resources, while the Open-Key adds certainty in order to protect the cipher from attacks that can use any (unlimited) amount of resources. Also, with open ignorance there is an increase in the perceived secret-key bit-length, which does not happen with Open-Keys - notwithstanding, security can be increased by either method. As a further difference, with Open-Keys the more the attacker knows, the safer the system is - while with Unknown-Keys the more everyone needs to know, the safer the system is.
Thus, both methods described in this Section work in complement - therefore, it is feasible to use them in conjunction in order to obtain increased security gains over each method in isolation. For example, improving upon the 70-bit effective figure obtained above for 56-bit secret-key DES and 14 bits of open ignorance - by adding x bits of open trust. This will be discussed elsewhere.
First, I note that it is hard to define what it is "to look valid" - we do not know what a valid plaintext is. So, we must expect difficulties in distinguishing between a false plaintext and a correct one, since it is axiomatic that we do not know the plaintext.
In other words, we have an identification problem.
Using the Identification Theory outlined in [Ger98], I can consider the four possible identifications of all possible decryptions, defined by identification level I2, as:
It also serves to exemplify the generalized concept of identification - "to identify is to look for coherence" - which allows identification to be defined without a previously required identity. This removes the circularity implied by the usual definition (see [Ger98]) and allows an intrinsic treatment of identification - as needed here.
To start, I can rephrase Shannon's unicity (hereafter, Unicity1) as:
This immediately allows us to define another unicity, as the "least upper bound" of Ambiguous:
It is easy to prove that Unicity1 = Unicity2 for a random cipher, by writing the probability expression that corresponds to the "maximum amount of plaintext that gives just one expected false decipherment" and transforming it into Unicity1's terms.
This resolves a difference in the literature, by showing that, for random ciphers, it is equivalent to consider "the least amount of plaintext that can be deciphered" or the "maximum amount of plaintext that gives just one expected false decipherment". Since the last case may be easier to calculate for some systems, one may choose either.
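The proof can be sketched with the standard spurious-key count (a sketch, assuming a random cipher and the notation used above, with D the per-character redundancy):

```latex
% Each of the 2^{H(K)} - 1 wrong keys survives n intercepted characters
% with probability about 2^{-nD} for a random cipher, so
\[
  \mathrm{E}[\text{false decipherments}]
  \;\approx\; \bigl(2^{H(K)} - 1\bigr)\, 2^{-nD}
  \;\approx\; 2^{H(K) - nD}.
\]
% Setting this expectation equal to one false decipherment (Unicity2):
\[
  H(K) - nD = 0
  \quad\Longrightarrow\quad
  n = \frac{H(K)}{D},
\]
% which is exactly Shannon's expression for Unicity1.
```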
It is also possible (and useful for security) to define Unicity3 through Unicity6, by following down the DAOF scale:
"Unicity4 is the maximum amount of Obscure plaintext which can be deciphered from the corresponding ciphertext  given unbounded resources by the attacker"
"Unicity5 is the least amount of Obscure plaintext which can be deciphered
from the corresponding ciphertext  given
unbounded resources by the attacker"
"Unicity6 is the maximum amount of Formless plaintext which can be deciphered from the corresponding ciphertext  given unbounded resources by the attacker"
But, as mentioned in Section 1, a theoretically secure defense against a brute-force attack on DES is to work within the unicity limit of DES.
This defense requires, however, that the cipher satisfy the "random cipher" condition within its entire message space. This was originally formulated by Shannon [Sha49] and, in the case of DES (an 8-byte block cipher), essentially requires the condition below:
However, as shown in [Ger99], the above statement is not entirely correct - and, even if it were, it would not allow DES to be considered a "Random Cipher" over more than one block.
Thus, current unicity calculations for DES, which simply ignore that the validity conditions for the unicity formula itself are denied, are in error and need to be revisited - as I consider next.
When one uses compression (the best case, as seen in Section 4), different authors quote DES unicity to be from 16 to 20 bytes of English. For uncompressed English, a value of 8.2 bytes is often quoted. These numbers are indeed easy to obtain. For example, if I directly use the calculation given in Sections 3 and 4 above, DES unicity would indeed be 8.2 bytes for uncompressed English and 20 bytes for 2x-compressed English - since DES uses 56-bit keys and enciphers bytes (although in 8-byte blocks).
As another example, quoting from Schneier [Sch98]:
"The unicity distance grows as the redundancy of the plaintext shrinks. For compressed files, the redundancy might be 2.5, or three blocks of DES ciphertext".
Our interest is to study DES under brute-force attack. Under this condition, the discussion in [Ger99] can be summarized in three observations:
ii. There is almost no expected plaintext ambiguity when inverting DES to tentatively decipher a ciphertext from English: one expects to observe either "garbage" or readable (thus, valid) English. This is called the "one-block impossibility" in [Ger99].
iii. If a second block is used, with only the same keys preselected for the first block, "near correct" messages effectively do not occur, which means that the key can be found even with unsophisticated or incomplete text analysis. This is called the "two-block impossibility" in [Ger99].
The idea is that the least amount of "Obscure readable and valid text" should already be enough, when compared with "Obscure garbage", as given in item (ii) above. In other words, the attacker just needs to identify, with some degree of success, whether English is probably present or not; he does not need to know which English words were used in order to recover the key. This leads to what can be called an "Identification Attack", with least information. Of course, once the key is known, any message that uses that key can be directly deciphered, irrespective of message length.
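As an illustration only (the function name, character set and threshold below are this editor's assumptions, not taken from [Ger99]), the identification step needs nothing more than a cheap test that flags a trial decryption as "probably English" rather than garbage:

```python
# Sketch of the identification step in a brute-force Identification Attack:
# the attacker does not reconstruct the message, only decides whether a
# trial decryption looks like English or like garbage.
ENGLISH_BYTES = set(b"abcdefghijklmnopqrstuvwxyz"
                    b"ABCDEFGHIJKLMNOPQRSTUVWXYZ .,")

def probably_english(block: bytes, threshold: float = 0.9) -> bool:
    """Flag a trial decryption whose bytes are mostly English characters."""
    hits = sum(1 for b in block if b in ENGLISH_BYTES)
    return hits / len(block) >= threshold

# A brute-force loop would call this once per key trial on a single
# 8-byte block, keeping only the (rare) keys that pass, e.g.:
#   candidates = [k for k in key_space if probably_english(des_decrypt(k, c))]
print(probably_english(b"and then"))                        # readable English block
print(probably_english(b"\x07\xf3Z\x91\x00\xc4\x8e\x1b"))   # random garbage
```

The point of the sketch is that the test is cheap and needs no dictionary, in line with the "known statistics only" reading of the attack given later in this section.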
The question now is which unicity concept one should use as the least amount of information needed to break DES. Unicity5, as given in Section 6, seems more appropriate than Unicity1.
Since DES works on 64-bit blocks, this issue is not so important for defining the least amount of ciphertext that needs to be intercepted: 64 bits is certainly more than enough. However, it is crucial to the system's security to know how many English characters one is allowed to transmit under provable security conditions, i.e., such that the system cannot be broken by a (given and possible) brute-force attack. Of course, a low unicity also strongly reduces the workload of the attack, since less ciphertext is needed, and this may lead to simplifications in the computations.
This needs to be emphasized. Under current technology, 56-bit DES is only secure if used within its unicity, which is expectedly much lower than calculated by Shannon's formula, on two counts: (i) the cryptographic formula does not apply and plaintext unicity must be used; (ii) the plaintext unicity is not Unicity1 but the much lower Unicity5. Also, the attack workload could be much less than hitherto imagined.
To estimate Unicity5, already defined in Section 6, I will use the given values of H(English) = 1 to 1.5 bits per character and define Unicity5 = 2^1 to 2^1.5 = 2 to 3 characters. I will use the largest value, as a worst case, to define the detection threshold.
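The arithmetic of this estimate, as a sketch:

```python
import math

# Unicity5 estimate as used above: 2**H characters, with H(English)
# taken as 1 to 1.5 bits per character.
low = 2 ** 1.0    # 2 characters
high = 2 ** 1.5   # ~2.83 characters

print(low, high)
print(math.ceil(high))  # 3 characters: the worst-case detection threshold
```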
Thus, the unicity of DES for English text is approximately 3 bytes, compressed or not. Not 20 bytes. The difference between a name ... and just the initials.
This also means that, to brute-force DES, one needs just one ciphertext block. Not three, as given in [Sch98]. This is a decrease of 20:3 in the number of plaintext bytes that need to be known for each key before the cipher is broken.
But, what is the significance of this result for DES?
First, the ratio of 20:3 implies at least a 3x increase in decryption throughput, because one can now tentatively decipher only one 8-byte DES block per key trial, not 3 blocks (24 bytes). Attacks can be more efficient.
However, this point is of relative relevance, since the attack is still not trivial.
Far more important is the consequence that 20 bytes of compressed text are not safe against a brute-force attack on DES, and they never actually were, so we can imagine the consequences. Rekeying is also not very useful to block a brute-force attack, since one would need to rekey DES after every two letters, i.e., changing 56 bits every 16 bits.
These considerations thus immediately impact the assumed security of SSL, S/MIME and other protocols that use DES. They also mean that even the small refuge provided by unicity to resist brute-force attacks on text encrypted with DES, as strongly compressed as one may wish, is gone.
Further, it is interesting to note the self-consistency achieved in the argumentation. The hypothesis that DES can be attacked by brute force has led to a DES unicity of 3 bytes, which supports the hypothesis. In other words, the low unicity of 3 bytes decreases by 3x the workload of a brute-force attack and thus makes it more viable, when compared to the usual figure of 20 bytes.
As a final remark, the term "3-letter" Identification Attack as used against DES should be understood in the sense of frequency tables for 1 letter, 2 letters and, at most, 3 letters. Except in the case of privileged information that can be used as probable plaintext (for example, OIDs, "magic numbers" as used in protocols, covert channels, etc.), one should view the term "3-letter" as denoting an attack using only known statistics, perhaps even without a dictionary.
For further details on the DES-specific considerations and probability calculations pertinent to the "Identification Attack", please see [Ger99].
The paper shows that the low end of the security-per-key-bit scale is occupied by DES. Indeed, DES English messages can be brute-force attacked over a plaintext space of only 3 letters, in an Identification Attack, instead of the currently assumed 20 letters. This result immediately impacts the assumed security of SSL, S/MIME and other protocols that use DES. It further shows that rekeying is not of much use with DES, even if done out-of-band, since one would have to rekey after every two bytes of text.
The paper also revisits the very concept of unicity and presents new unicity types which result from the application of Identification Theory, when identification levels are understood as cardinal numbers. It also provides practical examples, showing how unicity can be used as a figure of merit to evaluate different cryptographic systems and how the results can be used to guide the design of safer systems.
In particular, the idea that added open trust can increase security, even if known and certain to the attacker, was shown and calculated. This leads to a new security paradigm, for which:
Further, in order to increase brute-force attack workload, the paper describes a concept called "open ignorance" that deals with Unknown-Keys and a very high workload differential, as viewed by the intended recipient vis-a-vis an attacker, to discover the unknown key. The concept of Unknown-Keys is distinct from the usual resource of adding uncertainty (e.g., "salt") to the plaintext or to the algorithm's bit manipulation, which mainly aims to bar dictionary attacks but offers no gain against a brute-force attack. This leads to another new security paradigm, for which:
The Open-Key/Unknown-Key concepts may also allow the use of smaller secret-keys, which can have other applications besides improving export-limited cipher systems, such as in smartcards, digital signatures, authentication, non-repudiation, etc.
Thus, the above exposition is not only a "proof of principle" that key-length export restrictions are a technically moot issue, but it also offers a practical way to achieve worldwide standards for 70-bit cryptography, or much more, today. This can positively impact Internet e-commerce and interoperation.
This work is being done within the stated objectives of the MCG, the Meta-Certificate Group, an Internet open group on Certification and Security, currently with participants from 28 countries: to use public discussions to develop a worldwide security lingua franca in a set of Open Internet Standards and Open Source APIs (Application Programming Interfaces) that will empower programmers and users worldwide, in order to fully address the current security, privacy and restricted-interoperation issues of the Internet, as well as cost. It is important to note that this is not intended as an attempt to violate any law in any country, but to allow the technical concepts of privacy and security to be as transparent as possible in regard to any national legislation in existence or in planning.
[EFF98] Cracking DES, reference in http://www.eff.org/descracker/
[Ger98] Gerck, E., "Identification Theory", in two parts: http://mcwg.org/mcgmirror/coherence.txt and http://mcwg.org/mcgmirror/coherence2.txt
[Ger99] Gerck, E., "Some NonRandom DES Characteristics and an Identification Attack on DES", in http://mcwg.org/mcgmirror/nrdes.htm
[MCG99] mcgtalk archive at http://mcwg.org/mcgmirror/cgibin/lwgmcg/MCGTALK/archives/
[MOV97] A. Menezes et al., Handbook of Applied Cryptography, CRC Press, New York, 1997.
[NBS77] NBS FIPS PUB 46, "Data Encryption Standard", National Bureau of Standards, U.S. Department of Commerce, Jan 1977.
[NIST99] NIST FIPS 46-3, Draft specifying the use of Triple DES and deprecating the use of DES, Jan 1999; see for example http://jya.com/nist011599.txt
[Note1] It is not relevant here how Alice knows the DES secret-key she shares with Bob, and why the attacker does not. This will be the subject of another exposition, as it pertains to a similar discussion of security for asymmetric cryptography: public/private-key pairs are also limited by the WA. For the moment, just suppose that Bob and Alice met and Alice received the secret-key (or, for example, that they exchanged floppies with thousands of 56-bit DES keys to be used in sequence).
[Note2] If the plaintext data to be sent by Bob is truly random and unrelated to any other data (a very particular case), then Bob will choose M = 0, because the attacker cannot decide what is a valid plaintext (see the discussion in Section 6, above), and Alice needs to accept the decryption based on her trust that the secret-key is correct. Note that in this case, for truly random 64-bit data, the key-length is formally 120 bits but the cipher is theoretically secure since M = H(M): we have full unicity, so security is at the maximum possible level even with M = 0. However, if the plaintext is random but constitutes a private-key, then Bob should choose M > 0, because the private-key can be tested against its public-key counterpart and thus be detectable by an attacker as a valid plaintext. Further, if Bob wants to send the same truly random and unrelated plaintext data twice, but does not want the attacker to know that the message is being repeated, then Bob could also send, together with the message, some additional non-random plaintext that Alice could detect (for example, any message that can be detected by Unicity5; see Section 6) and use M > 0.
[Note3] Note that the XOR-encoding will use a fixed key but a random "message" for each block, since the DES ciphertext is a "message" to the XOR. Thus, the XOR-encoding is secure, since the "message" has a full entropy of 8 bits/byte [Ger99]. However, any other (secure) encryption method can be used with MDES, such as RC4 or RC5, or even DES, although DES is limited to 56 bits of key-length. Further, the option that involves the fewest assumptions is surely the one that only uses XOR, which can also be theoretically analyzed, in contrast to RC4, RC5, DES, etc.
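A minimal sketch of the XOR-encoding step described in this note (only the block and key sizes are taken from the text; the function name and the sample hex values are this editor's, for illustration):

```python
# XOR-encode each 8-byte DES ciphertext block with a fixed 64-bit key.
# From the XOR's point of view the DES ciphertext is the "message"; since
# that "message" has full entropy (8 bits/byte), the fixed key can be
# reused per block in the sense discussed in the note.
def xor_encode(des_block: bytes, fixed_key: bytes) -> bytes:
    assert len(des_block) == len(fixed_key) == 8
    return bytes(a ^ b for a, b in zip(des_block, fixed_key))

key2 = bytes.fromhex("0f1e2d3c4b5a6978")   # hypothetical fixed 64-bit key
block = bytes.fromhex("8ca64de9c1b123a7")  # stand-in for one DES ciphertext block

encoded = xor_encode(block, key2)
assert xor_encode(encoded, key2) == block  # XOR with the same key inverts itself
```

Being its own inverse, the same function serves both the sender and the intended recipient, which is part of why this option "involves the fewest assumptions".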
[Sha49] Shannon, C. Communication Theory of Secrecy Systems. Bell Syst. Tech. J., vol. 28, pp. 656-715, 1949. See also http://www3.edgenet.net/dcowley/docs.html for readable scanned images of the complete original paper and Shannon's definition of "unicity distance" on page 693.
[Sch98] Schneier B., CRYPTOGRAM, December 15, 1998, email newsletter, http://www.counterpane.com. See also Schneier, B., Applied Cryptography, J. Wiley & Sons, 1996.
[WA98] http://www.wassenaar.org