To discuss this paper and the subject of security in general, subscribe to the mcgtalk list.

Ed Gerck
Copyright © 1999 by E. Gerck and MCG, first published
on Jan 13, 1999 in the mcgtalk list server.
Archived at http://mcwg.org/mcgmirror/nrdes.htm
All rights reserved, free copying and citation allowed
with source and author reference.
However, DES and Triple DES security may change if DES has non-random characteristics that could be exploited in new attacks. One of the objectives of this paper is thus the study of non-random DES characteristics, in order to begin to assess such risks, which are ignored nowadays.
Indeed, even though it has been around for a long time, since the 1970s, no significant weakness has been found in DES [Bek82, MOV97] except its key length of 56 bits. The few anomalies that were found are restricted to very special cases, such as the complementation property for key and plaintext, which leads to complemented ciphertext, and the handful of weak and semi-weak keys. The only significant attack on DES is thus still an exhaustive key search, also called the brute-force method, which exploits its rather small key length when viewed by an attacker with large enough resources [EFF98], resources that are still modest in comparison with those of other potential attackers.
Even though sometimes possible, as when attacking an automatic email encryption engine, a DES known-plaintext attack is a very special case: the known plaintext is simply used with tentative keys until the intercepted ciphertext is produced, which directly defines the correct key. In the general case, however, one must ask what happens when the attacker only knows the ciphertext plus routing information, as when intercepting a communication on the Internet. The first important question in such a ciphertext-only attack is then: "What is a correct key?". This question is usually answered by looking at the set of possible decipherments, at the set of impossible decipherments, or at both. However, this exposition shows that for DES almost all decipherments done with wrong keys are impossible for natural or compressed English statistics (as an example). So, brute-force DES attacks, even when only the ciphertext is known, are much simplified as discussed; in fact they can be 3x faster than hitherto imagined [MOV97, Sch98].
Why is that so?
In a ciphertext-only attack, for example over the Internet, the attacker may also have good clues to the message's language (for example, that the message is in English) from the message's routing headers, addresses, etc. And, as Shannon first pointed out [Sha49], the use of any additional encoding before encryption, such as compression or even a transposition cipher, can be accounted for, since the underlying language statistics can still be reached. In this situation, also with compression, one can use the concept of cipher unicity [Sha49, Ger99], which measures the least amount of plaintext that needs to be deciphered in order to unambiguously define the plaintext and, hence, its key. Thus, to answer the question "What is a correct key?" one must first answer the question "What is the cipher's unicity?". However, to answer this question one needs to first define over which message and key spaces the cipher can be classified as a "random cipher". This was originally formulated by Shannon [Sha49] and, in the case of DES (an 8-byte block cipher with a 64-bit plaintext space), essentially requires DES to obey the condition:
The above statement of DES random characteristics by Menezes and similar declarations by other authors [Bek82, Sch98] seem justified by practice and are often used to justify treating DES as a "Random Cipher". For example, this assumption is used in DES unicity calculations using Shannon's formula [Sha49], both for uncompressed and compressed text by Schneier [Sch98], in one block and also over more than one block [MOV97, Sch98].
However, this exposition shows that the usual statement about DES randomness, as quoted above from [MOV97], is not entirely correct and, even if it were, would not allow DES to be considered a "Random Cipher" over more than one block.
A first consequence is that current unicity calculations for DES (and some other ciphers) are too large and need to be revised.
A second consequence is that DES cannot really hide the plaintext statistics, since almost nothing that is tentatively deciphered is close to the expected underlying language statistics of the plaintext except the one expected solution. This observation unfortunately provides a practical, one-block algorithmic solution to the attacker's problem of separating all the (2^{56} - 1) wrong keys from the only correct (but unknown) one.
I note that the reported results have nothing to do with the obvious fact that one has "only" 2^{56} possible plaintexts in DES decryption, which follows from the key also having 56 bits. Rather, they have to do with the much larger number of (2^{64} - 2^{56}) ~ 2^{64} impossible decryptions and their influence on the possible decryption statistics, which were so far ignored. This leads to an "Identification Attack" on DES, with two different impossibility constraints:
ii. The second impossibility is differential and defines that a coincidental result of two English-looking plaintext statistics is impossible for two ciphertext DES blocks encrypted with one key but decrypted with a wrong key. This defines a second impossibility, the "two-block impossibility". The paper will provide a probability number for "impossible".
Further, the present results are discussed for English and for compressed English text, but they could equally well have been discussed for any natural language, for any protocol, or for any document format, including email, fax, modem signals or specialized protocols tunneled within the encryption.
In particular, a natural language such as English was used to exemplify the concepts because natural languages are a harder target to model, since they are essentially non-deterministic in all their aspects except alphabet. In contrast, a Microsoft Word .doc file is essentially deterministic in alphabet, grammar and lexicon, because of all its control characters, especially the initial control codes for internal commands, which are very repetitive even from file to file and from user to user. There must be no surprises when a parser decides a file is an MS Word file, or any other file format, since parsers are optimized for size and speed and this requires specialization.
These well-defined and fixed commands will be called "magic numbers" in the exposition, as usual, since their task is to allow immediate "magic" identification by themselves. The "magic numbers" may be present at the beginning of the corresponding section, as usual, but may also be at the end or anywhere else.
However, it is important to remark that one does not have to "know" it is an MS Word document; one just adds its distinct n-character "magic number" header to one's list of trusted n-grams, and then the MS Word file is simply "discovered". These "magic numbers" are thus encoding sniffers, and one usually needs just three or four bytes to disambiguate document type and even version numbers. And "magic numbers" are present everywhere: for example in PKZIP files, GIF files, UNIX compress files, in SSL messages with known-plaintext headers for all queries from a known-text HTML form, in digital certificates such as X.509 (with the OID numbers, the CA name and key hash, as well as the distinctive ASN.1 encoding of its various standardized fields), or in PGP, for example in the ASCII armor.
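As a concrete illustration, a minimal encoding sniffer can be written in a few lines. The byte signatures below are the well-known published values for these formats; the function and dictionary names are only illustrative:

```python
# Minimal sketch: identify a file's encoding by its "magic number",
# i.e., the fixed byte pattern at the start of the file. Signatures
# below are the published values for each format.

MAGIC_NUMBERS = {
    b"PK\x03\x04": "PKZIP archive",
    b"GIF87a": "GIF image (1987)",
    b"GIF89a": "GIF image (1989)",
    b"\x1f\x9d": "UNIX compress (.Z)",
    b"%PDF-": "PDF document",
}

def sniff_encoding(data: bytes) -> str:
    """Return a guessed format from the first few bytes, else 'unknown'."""
    for magic, name in MAGIC_NUMBERS.items():
        if data.startswith(magic):
            return name
    return "unknown"

print(sniff_encoding(b"GIF89a\x00..."))   # GIF image (1989)
print(sniff_encoding(b"PK\x03\x04rest"))  # PKZIP archive
```

Note that three to six bytes suffice here to disambiguate both the document type and, for GIF, the version.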
In general, "magic numbers" are patterns which will always be present in any type of encoding or language, since the very existence of patterns is what makes the encoding or language understandable to parties other than the sender; after all, this is the goal of standardization. This is explained in Trust Theory [Ger98b] by noting that trust is the vehicle of information, and patterns are information tags which need to be mutually trusted (i.e., known and relied upon by all dialogue parties for their decisions) if they are to have any bearing on understanding.
For special protocols, in high-security applications, the "magic numbers" can also be implicit or encoded with a random seed in each use. However, the encoded data itself may betray its presence by its very structure, size, range, randomness, redundancy, etc. This is a problem in identification and, as given in [Ger98a], "to identify is to look for connections", not just to look for an identity connection with a set of "magic numbers". For further discussion on this topic and a practical example please see [Ger99], Section 4.
3.A. Shannon's "Random Cipher" assumption breaks down when one deciphers more than one 8-byte DES block.
The reason is that ciphertext in two different blocks did not mutually interact when each was enciphered. One effectively has different statistical ensembles for each block, even though a block can influence the next block in cipher-feedback modes. DES enciphers/deciphers per block, and these are the primitives that must be considered.
Therefore, any formula for each basic probabilistic calculation must include the full 8 bytes of ciphertext and plaintext, not more and not less. And, when connecting results for different blocks, the basic ciphertext probabilities should be considered independently for each block, notwithstanding the possible plaintext correlation between blocks.
These conditions have been hitherto ignored in DES unicity calculations [cf. Sch98], which are perfunctorily extended over more than one block. However, the above arguments invalidate a priori the use of more than one DES block in Shannon's unicity calculations.
The same conclusion is valid regardless of the DES mode of operation used, such as ECB, CBC, CFB or OFB [MOV97]. For example, if we take DES CBC (Cipher Block Chaining) mode, would randomness in the first N blocks thwart analysis of subsequent blocks?
No, with three comments:
This is due to the fact that the probability of at least one coincidence between the plaintext that results from any ciphertext deciphered with any key and the set of all possible decipherments of a different ciphertext must be 1 for a "Random Cipher" but is 1/2^{8} for DES.
For a given ciphertext, even though plaintext bit reversal is equally probable (as empirically known [MOV97]) for any plaintext bit in one block as one reverses any bit of the key, this only applies to 2^{56} samples of all possible 2^{64} plaintexts, which is the amount spanned by the 2^{56} keys. Thus, DES decipherments do not produce a random distribution over 2^{64} samples, but at most over 2^{56} samples.
This condition has also been hitherto ignored in DES unicity calculations [cf. Sch98], but it leads to a value of 1/2^{8} (i.e., 2^{56}/2^{64}) for the probability of at least one coincidence between a plaintext that results from any ciphertext deciphered with any key and the set of all possible decipherments of a different ciphertext. Of course, this probability should have been 1 if the "Random Cipher" assumption were correct for a DES 8-byte block. The difference is noteworthy for DES, with a much reduced probability.
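The arithmetic behind the 1/2^{8} figure is a one-line exact computation:

```python
# Decipherments with all 2^56 keys span at most 2^56 of the 2^64
# possible plaintext blocks, so the chance that a block deciphered
# with some key falls inside another block's set of possible
# decipherments is at most 2^56 / 2^64 = 1/2^8.
from fractions import Fraction

KEY_BITS, BLOCK_BITS = 56, 64
p = Fraction(2**KEY_BITS, 2**BLOCK_BITS)
assert p == Fraction(1, 2**8)
print(p)  # 1/256
```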
3.B.ii DES 56-bit Randomness: For DES, it is empirically known that altering any single key bit should alter each ciphertext bit in a block with probability e > 0, except for a few known pathological cases. Experimentally, e = 1/2.
3.B.iv DES* 64-bit Randomness: For DES*, it is empirically known that altering any single ciphertext bit in an 8-byte block should alter each plaintext bit of that block with probability e > 0, except for a few known pathological cases. Experimentally, e = 1/2.
3.B.v DES* 56-bit Randomness: For DES*, it is empirically known that altering any single key bit should alter each plaintext bit in a block with probability e > 0, except for a few known pathological cases. Experimentally, e = 1/2.
3.C. Entropy of a generic English-looking message is 1.2 ~ 2 bits/byte, which means that 776 ~ 65,536 different English-looking messages are possible for each DES plaintext block, also with text compression.
For a message M with entropy H = H(M) bits/character, a DES plaintext block is expected to contain an average of 2^{8*H} different 8-character English-entropic strings.
For a DES block of 8 bytes, the message (note that "message" denotes the actual and unknown plaintext used to produce the ciphertext being attacked) corresponds to 8 characters of English text. According to the known estimate for the entropy of English text [Sch98], and within some considerations [Ger99] regarding its approximate use for short text, one would need 8*1.2 = 9.6 bits per plaintext block in order to represent English text. Thus, using the entropy approximation for sequence statistics, a DES plaintext block is expected to contain an average of 2^{8*1.2} ~ 776 different 8-character English-entropic strings.
Of course, the actual value should be higher for short sequences, as we are dealing only with 8-character strings [Ger99]. This is also the case when (as usual) text compression is used before encryption. To estimate, if the entropy for compressed 8-character English text is 2 bits/character, then 2^{8*2} = 65,536 English-entropic strings with 8 characters are to be expected, including compression.

To summarize, for English, H(M) is 1.2 bits/character, or up to 2 bits/character for compressed text in 8 bytes, which provides from 2^{8*1.2} ~ 776 to 2^{8*2} = 65,536 expected English-entropic strings, also including text compression.
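A quick numeric check of these counts, taking the entropy values of 1.2 and 2 bits/character from the text above:

```python
# Expected number of distinct "English-entropic" 8-character strings
# per DES block: 2^(8*H), for H bits/character of entropy.

def expected_strings(bits_per_char: float, chars: int = 8) -> int:
    return round(2 ** (bits_per_char * chars))

print(expected_strings(1.2))  # 776   (plain English, H = 1.2)
print(expected_strings(2.0))  # 65536 (compressed English, H = 2)
```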
3.D. Entropy of plaintext deciphered with a random wrong key is 7 bits/byte.
The claim is that, given an 8-byte message M with entropy H = H(M) bits/byte, the corresponding DES ciphertext for a 56-bit random key: (i) has an entropy of 7 bits/byte; (ii) when deciphered by the same key, produces the same message M; (iii) when deciphered by any other key, produces a plaintext M' <> M with an entropy of 7 bits/byte.
It is important to note what the claim implies: the entropy values given in (i) and (iii) do not depend on the message entropy H(M), which is an expected behavior, even though the value of 7 bits/byte will introduce non-random characteristics, as discussed below. The proof is as follows, per item:
3.D.ii - The correct-key deciphered plaintext entropy is H(M) bits/byte: As given by 3.B.iii, DES* decipherment of the ciphertext with the same key must produce the plaintext M as one and only one solution. This proves 3.D.ii.
3.D.iii - The wrong-key deciphered plaintext entropy is 7 bits/byte: As given by 3.B.v, DES* decipherment of any fixed 8-byte ciphertext by a random 56-bit key leads to random 8-byte plaintexts with 1/2 probability of bit reversal for any bit, over 2^{56} possible decipherments. This directly defines an entropy of lg(2^{56}) = 56 bits per 8 bytes for the plaintext and proves item 3.D.iii: the entropy is (56 bits)/(8 bytes) = 7 bits/byte.
3.D.a. The encipherment redundancy is too high when one varies the key.
3.E. For each DES ciphertext block and plaintext deciphered with a random wrong key, the probability of occurrence of English-looking messages is (3 ~ 300)/2^{56} per key, also including text compression.
This claim reflects the influence of the non-flat DES statistics that result from 3.D.b on "plaintext selection" upon a false decipherment. Consider the tentative decipherment of a given 8-byte ciphertext block with all possible wrong keys, i.e., 2^{56} - 1 keys: what is the expected total number of English-looking texts M with entropy H = H(M) to be produced over all wrong keys? For a fully random choice of 8 bits/byte, item 3.C argued that 776 ~ 65,536 different choices are expected for English-looking text, including compression. However, we now expect a reduction due to the entropy decrease for plaintext that is produced by decipherments with a wrong key, which is 7 bits/byte. There is less variety from the source, fewer "surprises", while still occupying 8 bytes of plaintext space. However, how large is the reduction, if any?
An estimate can be obtained in average terms, which is appropriate, since we are dealing with a large number of choices (e.g., 2^{64}). With 8 bits/byte of entropy and 8 bytes of text one has 2^{8*8} = 2^{64} possibilities, of which 776 ~ 65,536 are English-looking text (including compression). With 7 bits/byte of entropy and 8 bytes of text one has 2^{7*8} = 2^{56} possibilities, which can be seen as 2^{56} "random choice events with replacement" from a bin with 2^{64} tokens, of which (776 ~ 65,536) are "different" from all the others but have the same selection probability as the rest. Each random key corresponds to a choice event, and the chosen token is replaced before the next choice event (i.e., before the next random key is used). The probability of selecting at least one of the (776 ~ 65,536) "different" tokens per key (i.e., per choice event) is, directly:
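Carrying this average-term estimate through numerically, under the replacement model just described (each of the ~2^{56} keys is one draw from the 2^{64} tokens, so the per-key hit probability and the number of keys simply multiply):

```python
# Expected number of false English-looking plaintexts over all wrong
# keys: per-key hit probability times the ~2^56 wrong keys.

def expected_false_hits(english_like: int, key_bits: int = 56,
                        block_bits: int = 64) -> float:
    per_key = english_like / 2**block_bits   # hit probability per key
    return per_key * 2**key_bits             # expected total, all keys

print(expected_false_hits(776))    # 3.03125 -> ~3  (plain English)
print(expected_false_hits(65536))  # 256.0   -> ~300 (compressed)
```

This reproduces the "3 ~ 300" range quoted in the claim.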
Item 3.C dealt with English-looking messages, which were counted by their expected entropy. The same applies to item 3.E, which concludes with the assertion that "the total expected number of false English-looking messages deciphered by using all wrong keys is (3 ~ 300)". However, I note that this occurrence range must include 8-byte plaintext strings which are not all necessarily English-valid, even though they should all be English-looking. For example, the words "IN NO IS", "LAT WHEY", "CRATICT", "FROURE B", "PONDENOM", "REPTAGIN" given by Shannon [Sha48] do contain valid unigrams, bigrams, trigrams, etc. according to English statistics (and are thus reflected in the entropy value used), but they are not English. Thus, the expected number of English-entropic 8-character strings given in 3.E must be higher than the average number of English-valid strings.
In other words, DES decipherment under brute-force assumptions has been reduced to a simple English plaintext selection problem, even when text compression is used before encipherment. The result so far was that fewer than 3 ~ 300 wrong messages need to be screened over all possible keys in order to reveal the one correct message. But how many English-valid messages should we expect? During selection, how can we decide which message is probably not English-valid (though English-looking) and should be left for a second pass, and which message should one immediately try out?

As Shannon [Sha49] showed 50 years ago, this can be answered using standard English letter frequencies, digram and trigram frequencies, dictionaries and even known plaintext, in what is essentially a "yes/no" detection problem.
This introduces the concept of a detection threshold, which represents the minimum probability that an acceptable 8-byte string must have under the language model defined by the attacker in order to qualify for a "yes"; anything less than that is "no", but it is also possible to define a "maybe" class which can be used in a second pass if needed. The chosen detection threshold thus expectedly reduces the number of possible plaintext messages from the total value given in item 3.E to a lesser value.
Given the cost of a false detection, the cost of a miss, and their probabilities, which can be translated in detection theory into terms of accuracy and reliability [Ger98a], one can calculate the best threshold for the first pass (in terms of letter, digram, trigram and higher n-gram frequencies, dictionary entry confirmation, known-plaintext confirmation, etc.) that will minimize cost.
For example, a known result in [Sha49] is that one needs an average of 5 characters to disambiguate one English message; this is the unicity distance for plaintext English. Since we have 8 characters in a DES block, Shannon's result [Sha49] thus shows that we should expect no great difficulty in disambiguating the one correct English message from the set of all possible English-looking 3 ~ 300 samples.
For lower- and upper-case text represented in 8-bit characters, one may also directly reject unlikely mixed-case occurrences such as "PoNdEnOm" or even "hOWeVeR ", at least in a first pass. Of course, the expectedly few "maybe" results (fewer than 300) can be stored for a second pass, if needed.
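The first-pass screen sketched above can be made concrete. The following is a minimal illustration only: the trigram list is a small untuned subset of the common English trigrams cited elsewhere in this paper, and the mixed-case threshold is an arbitrary choice:

```python
# First-pass "yes/no/maybe" screen for a candidate 8-byte plaintext.
# Trigram list and thresholds are illustrative, not tuned.

COMMON_TRIGRAMS = {"THE", "ING", "AND", "HER", "ERE", "ENT", "THA",
                   "NTH", "WAS", "ETH", "FOR", "HAT", "SHE", "ION"}

def screen(candidate: str) -> str:
    if not candidate.isprintable():
        return "no"
    # Reject unlikely mixed-case runs such as "PoNdEnOm" outright.
    letters = [c for c in candidate if c.isalpha()]
    flips = sum(a.isupper() != b.isupper()
                for a, b in zip(letters, letters[1:]))
    if flips > 2:
        return "no"
    text = candidate.upper()
    hits = sum(text[i:i+3] in COMMON_TRIGRAMS
               for i in range(len(text) - 2))
    return "yes" if hits >= 1 else "maybe"

print(screen("PoNdEnOm"))  # no    (unlikely mixed case)
print(screen("FOR THEM"))  # yes   (contains a common trigram)
print(screen("QXZKVJWP"))  # maybe (printable, no common trigrams)
```

The "maybe" results would be stored for a second pass, exactly as described above.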
This item answers two questions. First, the practical case where more than one key results from item 3.F above. In this case, I know there are wrong keys; but how do I filter them? Would I need to decrypt all other blocks? Or could I just decrypt one more block in order to reduce the probability of false detection to zero? Second, even if just a single key results from 3.F, it would be useful to test it with low overhead before using it to decipher a long message; but how can I test a key?
Both questions can be solved by using the DES "hash-function" property described in 3.D above, but now with two DES blocks, i.e., using two different ciphertexts for the same key. Of course, even if all 3 ~ 300 messages are left after applying the method of item 3.F to a block, this number is still small and would easily allow the corresponding 3 ~ 300 keys to be directly used to decipher just one more block.
Now, if a wrong key was accepted in item 3.F because it fortuitously led to a plaintext which was "nearly correct" in one block, then in the next block the probability that this fortuitous case also occurs for the same key is the product of the probabilities for each block, which is ((3 ~ 300)/2^{56})^{2} ~ 1/2^{96}: vanishingly small, even if the probability decrease caused by text screening is not taken into account in both cases.
This means that a single key can be easily positively tested. Also, all wrong keys can be directly eliminated by one extra decryption per preselected key, if so needed.
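Numerically, the two-block false-acceptance probability works out as follows (taking the worst case of the 3 ~ 300 range):

```python
# A wrong key accepted on one block must also look "nearly correct"
# on a second, independent block; the per-block probabilities multiply.
import math

per_block = 300 / 2**56        # worst case of the 3 ~ 300 range
two_blocks = per_block ** 2
print(math.log2(two_blocks))   # about -95.5, i.e. roughly 1/2^96
```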
This is a trivial item, included here for presentation clarity. Indeed, item 3.B.iii shows that there is one and only one key which maps the ciphertext to the original plaintext that was enciphered: the correct key. This implies that we know there is one and only one correct message, even if we have to try out all the 2^{56} keys and even if, for some keys, we have to use one extra block as in item 3.G above. At the end, there must be only one key left and this key is the correct one.
Thus, to summarize all eight points from 3.A to 3.H: when one tries to decrypt the DES encipherment of a plaintext with defined statistics (such as English text or 2x compressed English text), the above arguments show that using a wrong key almost absolutely guarantees that one obtains "garbage" as a result, as with a "hash-function", except for very few messages compared to 2^{56}, perhaps as few as 3 ~ 300 for the first block. Afterwards, text frequency screening with an attacker-defined detection threshold can use the known English letter, digram, trigram, 4-gram, etc. frequency tables, also for compressed text and even for some expected plaintext and dictionaries, to automatically reduce the occurrences to a number close to one for 8-byte text. If one uses unsophisticated (less cost and less time) or incomplete (by necessity) text analysis, then it is possible that more than one key will be left after text screening, though expectedly few and fewer than 300. But in this case a second DES block of ciphertext is all one needs in order to disambiguate the few remaining keys and find just the correct one, with almost unity probability.
The full message can therefore be readily decrypted and verified under the attack described, based on impossible decipherment constraints, even if language screening is not very effective.
Further, due to its mathematical generality, I suppose that the "hash-function" argument advanced in item 3.D and applied also in item 3.G would be valid for a broad range of plaintext statistics, not just English and not just natural languages. And not just DES. On another topic, the "hash-function" property could be useful for other purposes, not only for attackers. This will be reported elsewhere.
Further applications of these results are reported in [Ger99], using Identification Theory [Ger98a] in level 2. They allow DES unicity to be reduced even more when the text is identified by its linguistic statistics and not by its actual words, i.e., when the text is identified in Obscure form, not necessarily in Distinguished form. As shown in [Ger99], the least amount of information needed to break DES is then given by the concept of Unicity5 (i.e., unicity for obscure identification), which is just 3 characters for English. For English text, even compressed text, one should then be able to break DES with the recognition of at most one-, two- and three-character frequencies. This means a 3-letter attack on DES.
Now, it is instructive to consider in somewhat more detail "how" such an attack may be conducted, with 3 or 5 letters, as it can be applied to any other cipher system, not only DES.
The attack could be called an "Identification Attack" and may use the new concept of "Unicity5" described in [Ger99], or even the original unicity, called Unicity1. With Unicity5, however, the intention is not to identify words or even the whole plaintext in a block. The intention is just to identify a "valid plaintext". In both cases, the attack unfolds in the usual way, where one uses a brute-force key search to find a probable key, defined as that key which provides the required level of identification for the plaintext. From this key, the full plaintext can be read in one easy step. In case of failure, the search continues.
The central observation is thus that in order to break a cipher system one does not need to identify the plaintext as Distinguished, just as Obscure. The reduced identification value in the DAOF scale pays off in terms of a strong reduction of the amount of plaintext needed in order to break the cipher. This was exemplified in the former Section, which showed why one can break DES with three letters.
The DES attack procedure is automatic and involves two possible loopbacks for the full 8-character text (i.e., not just for 3 characters; the 3 characters mentioned are just a matching pattern of possibly 1, 2, 3 or more characters, out of 8):
decipher --> trigram test? --Yes--> "English" --> 5-letter test? --Yes--> English
                  |                                     |
                  No                                    No
                  |                                     |
                  +----------> try new key <------------+
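This loop can be illustrated with a small runnable sketch. The cipher below is only a stand-in (XOR with a key-derived mask) with a 16-bit key instead of DES's 56 bits, so the full search runs in seconds; the trigram list and key size are my illustrative assumptions, not part of the attack as stated:

```python
# Toy demonstration of the Identification Attack loop:
# brute-force key search + "valid plaintext" identification.
import hashlib

def keystream(key: int, n: int) -> bytes:
    # Key-derived mask; stand-in for a real block cipher.
    return hashlib.sha256(key.to_bytes(2, "big")).digest()[:n]

def toy_encrypt(key: int, block: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(block, keystream(key, len(block))))

toy_decrypt = toy_encrypt  # XOR is its own inverse

TRIGRAMS = {b"THE", b"ING", b"AND", b"FOR", b"ENT", b"ION"}

def looks_english(plain: bytes) -> bool:
    # Obscure identification: printable ASCII plus one common trigram.
    if not all(32 <= b < 127 for b in plain):
        return False
    up = plain.upper()
    return any(up[i:i+3] in TRIGRAMS for i in range(len(up) - 2))

secret_key = 0xBEEF
ciphertext = toy_encrypt(secret_key, b"FOR YOU!")

# The attack loop: decipher, screen, keep "yes" keys, try next key.
candidates = [k for k in range(2**16)
              if looks_english(toy_decrypt(k, ciphertext))]
print(secret_key in candidates, len(candidates))  # True, and few survivors
```

As in the text, the surviving "yes" keys would then be disambiguated against a second ciphertext block.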
To develop a feeling for the (small) plaintext space one may need if only 3 characters are used, one can start with the maximum number of 2^{18} cases if one includes all upper- and lower-case letters, space, punctuation and numbers in a 64-character alphabet (to simplify calculations, since 64^3 = 2^{18}). Of course, one should now rate the entries according to letter, bigram and trigram linguistic frequencies for English, taken from general text or (better) from former texts of similar sources (e.g., electrical engineering, computer science, social, medical, etc.). Looking at usual trigram tables [Sha49], [Bek82], I would assume one can thus reduce the plaintext space to probably 12,000 cases for the most relevant English trigrams (the, ing, and, her, ere, ent, tha, nth, was, eth, for, dth, hat, she, ion, int, his, ...) [Bek82]. The probability of occurrence decreases rapidly along such a list. For example, "the" is approximately 8x more probable than "his". But it may get better for the attacker (i.e., worse for the defender). Since most cipher systems begin with a "magic number", or a message begins with the usual "Received", "To:", "From:", etc., it may be possible to further reduce the 3-character list for the message beginning, perhaps down to 100 cases, as I can estimate from the variety in my own messages. This is comparable with the maximum number of expected false decipherments given by 3 ~ 300, as calculated in the above Section for 8-byte strings with or without text compression before encryption, which is the value I use next.
Thus, one may need from 3 Kb to 30 Kb to store the entire DES-relevant information on the English plaintext space: relevant enough to allow English to be detected and the DES key to be found. Of course, the fewer entries one has, the fewer false detections one may encounter; but the fewer entries one has, the more misses one may suffer. Given the cost of a false detection, the cost of a miss, and their probabilities, which can be translated in detection theory into terms of accuracy and reliability [Ger98a], one can calculate the best threshold (in terms of letter, digram, trigram and higher n-gram frequencies, dictionary entry confirmation, known-plaintext confirmation, etc.) that will minimize cost. This can be further optimized in terms of specific dictionaries, so that one can define an overall best strategy for each language group, targeted in terms of n-gram length, number of entries and dictionaries.
Uniting this Section with the former and summarizing the "why" and "how" of the "3-letter DES attack", one recognizes first that only one DES block needs to be used in any case (i.e., with or without text compression before encipherment) and that there are only approximately 3 ~ 300 expected false messages that may resemble English, out of all possible decipherments of a given 8-byte ciphertext. According to this exposition, the 3 ~ 300 corresponding keys can then be graded as "yes" or "no" by using frequency tables and dictionaries to accept or reject the linguistic characteristics (e.g., for English) of the corresponding text; which allows the cipher to be broken when a second DES ciphertext is tested with the set of "yes" keys, which are expectedly few and yield just the correct key with effectively unity probability.
An interesting point from the discussion above is that, of course, one may use the full ciphertext for testing purposes at any stage, since one has already intercepted a part of it and since the message itself lies in the full ciphertext. But the main reason why one does not want to do that before one finds the single expected key is that the full ciphertext space is very large, possibly spanning thousands of DES 64-bit blocks, which would all have to be tentatively decrypted.
There are two questions which may now surface.
The first question concerns the use of purposefully inserted or otherwise identified "magic numbers", such as OIDs, or covert channels. A protocol usually defines a header which is always used and allows one to easily spot such a message in a tentative plaintext. That is how one may (for example) detect a PGP key by itself, or a PGP message. A covert channel ensues when the existence of this header (and its significance in aiding decryption) is ignored by at least one of the dialogue parties. This means that no one needs to approve "key escrow" for session keys: the message itself may provide enough unicity in a covert channel, as it already does. (Note: I am making no affirmation about its actual use as such in any protocol, just about its actual potential to be used as such, unwittingly to one dialogue party or to all dialogue parties.)
The second question concerns the impression that we only seem to find a "possible" plaintext, but not necessarily the one that was encrypted. As a matter of fact, in probability analysis, yes, we are concerned with "possible messages". But the message that was actually encrypted will expectedly be counted among them (as a function of the selection choice); see item 3.E.
On a different slant, since 56-bit DES can be attacked by brute force without much expense, as is well known [EFF98], and this seems to be even easier as given above, DES well exemplifies the waste of using a WA-controlled [WA98] random 56-bit key in order to protect at most 24 bits of English text (3 bytes x 8 = 24). Meanwhile, a simple XOR encoder (a Vernam cipher) with 56 bits of key would protect 56 bits of English text: a 133% increase over DES.
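The Vernam side of this comparison, for concreteness (a minimal sketch; the 24-bit figure for DES is this paper's claim and is not computed here):

```python
# A one-time XOR (Vernam) encoder: a 56-bit key, used once, protects
# exactly 56 bits of plaintext with information-theoretic secrecy.
import secrets

def vernam(data: bytes, key: bytes) -> bytes:
    assert len(key) == len(data)   # key used once, same length as data
    return bytes(a ^ b for a, b in zip(data, key))

key = secrets.token_bytes(7)       # 56 bits of key
msg = b"SEVEN B"                   # 56 bits of plaintext
ct = vernam(msg, key)
assert vernam(ct, key) == msg      # XOR is its own inverse
print(len(key) * 8, "key bits protect", len(msg) * 8, "plaintext bits")
```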
In general, a claim can be made for the Identification Attack  that to break a cipher system one does not need to identify the plaintext as Distinguished, just as Obscure:
Therefore, by avoiding a larger scope for such ghost-threats, designers can make a good protocol easy to attack. But, by recognizing the sandboxes we are forced to be in and the choices we are denied, we can recognize the ghost-threats we cannot use; this is called a "negative pregnant" logical clause. And to each denied choice we score a minus. On the other hand, for each choice we can insert, we score a plus. It does not matter if we do not use all choices; just being there is enough. This was exemplified here for DES, when I called attention to the fact that the (2^{64} - 2^{56}) ~ 2^{64} denied decryption choices were the deciding factor in simplifying an attack on DES. In fact, the paper showed that they turn DES into a non-random cipher over 64 bits: DES was shown to be a random cipher only over 56 bits, within 64 bits of plaintext.
The results thus show some non-random characteristics of DES, all entailed by a dimensional reduction in the effective decipherment message space from 2^{64} to 2^{56} coordinates. This was a modification introduced after the cipher was initially designed by IBM [Bek82] and is a collateral effect of the key-length reduction from 64 bits to 56 bits.
The exposition also describes an attack that exploits this vulnerability, which I call the "Identification Attack", making use of impossible decryptions under two different impossibility constraints and in three steps:
The present results and the "Identification Attack" are of course not only valid for DES and may be used to analyze the security of other cipher systems.
[Bih98] Biham, E. et al., "Cryptanalysis of Skipjack Reduced to 31 Rounds using Impossible Differentials", Technion Tech. Rep. CS0947, 1998. Published at http://www.cs.technion.ac.il/~biham/publications.html
[EFF98] Cracking DES, reference in http://www.eff.org/descracker/
[Ger98a] Gerck, E., "Identification Theory", in two parts: http://mcwg.org/mcgmirror/coherence.txt and http://mcwg.org/mcgmirror/coherence2.txt
[Ger98b] Gerck, E., "Towards a RealWorld Model of Trust", in http://mcwg.org/mcgmirror/trustdef.htm
[Ger99] Gerck, E., "Unicity, DES Unicity, OpenKeys; UnknownKeys", in http://mcwg.org/mcgmirror/unicity.htm
[MOV97] A. Menezes et al., Handbook of Applied Cryptography, CRC Press, New York, 1997.
[NBS77] NBS FIPS PUB 46, "Data Encryption Standard", National Bureau of Standards, U.S. Department of Commerce, Jan 1977.
[NIST99] NIST FIPS 46-3, Draft to specify the use of Triple DES and deprecate the use of DES, Jan 1999; see for example http://jya.com/nist011599.txt
[Sha48] Shannon, C., "A Mathematical Theory of Communication", Bell Syst. Tech. J., vol. 27, pp. 379-423, July 1948. See also http://cm.bell-labs.com/cm/ms/what/shannonday/paper.html for a WWW copy.
[Sha49] Shannon, C., "Communication Theory of Secrecy Systems", Bell Syst. Tech. J., vol. 28, pp. 656-715, 1949. See also http://www3.edgenet.net/dcowley/docs.html for readable scanned images of the complete original paper and Shannon's definition of "unicity distance" on page 693.
[Sch98] Schneier, B., CRYPTO-GRAM, December 15, 1998, email newsletter, http://www.counterpane.com. See also Schneier, B., Applied Cryptography, J. Wiley & Sons, 1996.
[WA98] http://www.wassenaar.org