Blockwise Chosen Boundary Attack – BEAST Attack


In September 2011 the world was shocked by the BEAST (Browser Exploit Against SSL / TLS) attack that attacks the SSL / TLS by Thai Duong and Juliano Rizzo. Ekoparty the attack demonstrated in 2011 and described in the paper entitled Here Comes the XOR Ninjas. This attack proved practical and effective steal session ID is stored in a cookie-protected website with SSL / TLS (SSL then I just call to SSL / TLS). In this paper I will discuss what the BEAST attack and how it works.
BEAST Attack

For those who have never heard the BEAST attack, please see the first youtube, vs HTTPS BEAST BEAST attacks that demonstrate how to use a Paypal account hijack victims. In the video seen how BEAST successfully decrypt SSL packets one byte by one byte until the entire cookie victim had been stolen. Frightening is not it?

Because of this attack BEAST-do site users to change SSL encryption of block cipher algorithms into stream cipher (RC4). You know why you had to change from a block cipher into a stream cipher? Apparently this is only attacking BEAST SSL attack using block cipher algorithm (eg AES/DES/3DES) in CBC mode (cipher block chaining). By switching to a stream cipher then the site will be immune from attack BEAST.

Let's discuss what's wrong with SSL block cipher and CBC modes so that could be exploited to such fatal.

Block-Cipher and SSL Record

Earlier as I will explain a little background about the block-cipher encryption in SSL.

SSL is essentially similar to the transport layer protocols such as TCP connection-oriented services that provide communication and ensure reliability for the layer above it, only difference is that the data in encrypted form via SSL.

If there is a name in the TCP 3-way handshake to establish a connection, the existing SSL negotiation. In the negotiation process will be agreed upon algorithm (eg encryption, key exchange, MAC) what to wear and also agreed symmetric key used to encrypt the data.

Keep in mind SSL uses symmetric algorithm (eg RC4, AES, DES) to encrypt the data as it is cheaper than the asymmetric algorithm computation (eg RSA). Asymmetric algorithms only be used during the negotiation process to secure the symmetric key exchange process, after the session / channel / SSL connection established, asymmetric algorithms are not used anymore, all the communication channel SSL using either symmetric encryption algorithm block-cipher and stream-cipher.

In the SSL channel data is sent in an SSL record that the maximum size of 16 kB. The data is the data that is sent to the layer above it as request / response in HTTP HTTPS (HTTP over SSL).




Data sent via an SSL channel will be broken down into one or more SSL record before it is sent to the destination and all records encrypted with the same symmetric key (the key to the client to the server, and the key for the server to the client).

As a reminder only, in opeasi block cipher in CBC mode, each block of plaintext is XORed with the previous ciphertext block to produce the ciphertext block. Especially for the first block, the plaintext block is XORed with the IV.



Chained IV

How do I encrypt SSL record? Plaintext data to be encrypted in an SSL record would consist of one or more blocks of plaintext, P1, P2, P3, ..., Pn to be encrypted into ciphertext C1, C2, C3, ..., Cn. Remember in CBC mode, the IV is needed to produce C1, now the question is where does it come IV or C0?

In RFC 2246 on TLS 1.0 is described like this:

With block ciphers in CBC mode (Cipher Block Chaining) the initialization vector (IV) for the first record is generated with the other keys and secrets when the security parameters are set. The IV for subsequent records is the last ciphertext block from the previous record.

SSL IV turned out to be determined at the time of the first record handshaking (negotiations), while the IV for the next record is the last ciphertext block of the previous SSL record. Using the last ciphertext block as the IV for the next record called chained IV.



In the picture above shows that the final ciphertext block of the first record (c4) into C0 or IV for the second record. So is the last ciphertext block of the second record will be the fourth for the third record. Later when the 4th record, last ciphertext block of 3rd record will serve as IV to record SSL 4th.

In the picture above c0 depicted as dotted box because it is not part of the record C0/IV SSL. In the SSL record, the first ciphertext block is not C0 or C1 IV in other words, the approach used is implicit IV.

IV is chained approach looked at all of the ciphertext block all SSL records as if they were sequential stream of ciphertext blocks, C1, C2, C3 .... Cn. In the picture above shows the first record is c1 | | c2 | | c3 | | c4, and 4 blocks of ciphertext on record to-2 can be considered a continuation of the previous record, c5 | | c6 | | C7 | | c8. Three blocks of ciphertext on record to-3 can also be regarded as a continuation of the previous ciphertext block, c9 | | c10 | | c11.



Chained IV proved to be a serious safety issue because an attacker already knows first IV to encrypt subsequent data. I will describe how this can be exploited chained IV.

For the record: Weakness chained IV is fixed in TLS 1.1 by using explicit IV, each record includes a record for the fourth (IV become part of the record as c0).

Chained Exploits IV

Now I will discuss how North chained IV can be exploited. We will assume an attacker were sniffing the network and gain (encrypted) SSL record contains ciphertext Ca = C1 | | C2 | | C3 | | C4 | | C5. BEAST attack in the chosen plaintext attacker has the privilege, meaning that he can determine what will be encrypted plaintext and get the encrypted (ciphertext).

The attacker wants to know whether the plaintext of a ciphertext block, suppose C2 is G (uess). How do I?

Attackers will make plaintext P6 C5 = C1 XOR XOR G then ask the system to encrypt the plaintext. Let's see how P6 encrypted into C6. Remember do XOR with the same value twice to negate the effect, as there XOR C5 twice, then two XOR C5 can be removed.

C6 = E (P6 XOR C5) = E (C1 C5 XOR XOR XOR C5 G) = E (C1 XOR G)

What does it mean from the equation C6 = E (C1 XOR G) above? Note that if G = P2 (plaintext from C2) then there is the C6 = E (P2 XOR C1) = C2.

Okey, so if G = P2, then C6 = C2, then so what? what's so special? For those not aware of the potential dangers, note that just by looking at whether the C6 = C2, the attacker can determine if G = P2. When the attackers saw that C6 = C2 means we can be sure that G = P2, or guessed correctly.

Now the attacker has no way to ascertain whether he was right or wrong

Already started to imagine is not the way to decrypt C2? The first attacker will choose guesses G 'and ask P6 C5 = C1 XOR XOR G' for encrypted (chosen plaintext). Then the attacker will see if the encrypted P6, C6 = C2 or not?



When viewed C6 is not equal to C2, then the attacker will choose the new guesses G ". Remember because of the chained IV, the C6 is a IV to encrypt P7 so on second guesses, the attacker chooses P7 C6 = C1 XOR XOR G ".



Attackers will also see if C7 = C2? When it is still wrong, the attacker will choose the new guesses, G "'. Once again because of the chined IV, IV untu C7 becomes so at a guess encrypt P8-3 P8 assailant chose C7 = C1 XOR XOR G "'.



When this time the attackers see that C8 = C2, then the attacker believes that the G "'is P2 (plaintext from C2). However, if still wrong, attackers will continue to make new guesses to obtain positive results.

In the example above, the chosen plaintext always involved because we want to decrypt C1 C2 (for P2). In general, if you want didekrip is Cn (for Pn), then the chosen plaintext must wear Cn-1.

Chosen Boundary

For the observant would notice there are less than this way. Recall that G is the attacker guesses of the size of a block (16-byte AES). How do I determine the G? Given the size of the block G 16 G byte that possibility very much, certainly not possible we choose G carelessly.

So, how do we create a "educated guess" or "smart guess" to select the most potential G right?

Choosing the right G P2 is very difficult to guess when the unknown is all (16 bytes). But if we believe that the first 15 bytes of P2 is the letter 'x', while the last byte P2 does not know its contents, then there are only 256 possible guesses to try. One among the 256 guesses below must be true if only one last character of unknown contents.

Ga = xxxxxxxxxxxxxxxa
Gb = xxxxxxxxxxxxxxxb
Gc = xxxxxxxxxxxxxxxc
Gd = xxxxxxxxxxxxxxxd
Ge = xxxxxxxxxxxxxxxe
And so on ...
What if plaintextnya is text "topsecret" and the attacker does not know the contents of any byte?

In the BEAST attack, in addition to the privileged chosen plaintext, the attacker has the privilege again, the text insert (prepend) at the beginning or in the middle of other text before the text is encrypted. So when the attacker sends the text "abcd", then the system will encrypt the combined "abcd" and "topsecret".

Privilege text insertion is very important in the success of BEAST attack by inserting text because it means we can shift the same plaintext block boundaries. How does that mean?

Remember that we can guess that a block with ease, we must make the first 15 bytes of the block to be something that we know, then just leaves one byte are not known.

What happens when the text "topsecret" inserted text "xxxxxxxxx" in the beginning? Once merged text combined into "xxxxxxxxxxxxxxxtopsecret". And so what? What is the point to add text at the beginning? Indeed, at first glance it does not look different, will look good when we look at the combined text in the form of blocks of plaintext.



've Seen the difference is not it? Having added 15 letter 'x' in the beginning, now the number of blocks of plaintext into 2 (P1 and P2), and the first block of plaintext is 'xxxxxxxxxxxxxxxt'. Aha! Now we can easily guess P1 because we are convinced that the first 15 characters P1 contains 'x' because we own that adds the letter 'x' is.

Technique to insert this text aims to shift the block boundary (chosen boundary) so that only leaves one character who is not known.

Once an attacker to guess 256 times, guaranteed he would know that the first letter is 't'. The next character to guess how the 2nd?

2nd guess the character is done by repeating the first step above, the boundary shifts to insert text at the beginning. This time it is inserted is a 14 letter 'x', is no longer a 15 letter 'x'. Let's see plainteksnya block.



Because the characters are inserted (prepend) only 14, then the remaining P1 'to'. We already know 14 is the first letter 'x' and the first letter is 't', so only the last character of P1 unknown contents. Once again, we are in a position that we want, we just have to guess 256 times guesses to get 2nd letter:

Ga = xxxxxxxxxxxxxxta
Gb = xxxxxxxxxxxxxxtb
Gc = xxxxxxxxxxxxxxtc
...
Go = xxxxxxxxxxxxxxto
Guess the 3rd character is also done in the same way. We shift the boundary by inserting a 13 character 'x' at the beginning so that the plaintext block that is formed is:



This time P1 is 13, the letter 'x', followed by 2 characters are already known 'to' and the character again is not known. Since only the last character is not known, it is only required up to 256 times the guesses to determine the contents of the character 3rd:

Ga = xxxxxxxxxxxxxtoa
Gb = xxxxxxxxxxxxxtob
Gc = xxxxxxxxxxxxxtoc
...
Gp = xxxxxxxxxxxxxtop
Phase Two Attacks

Earlier we discussed how to decrypt the ciphertext block. In general, the stages can be divided into 2 phases:

Phase shifting the boundary
Phase commit 256 guesses
The first phase of the attackers take advantage of privileged boundarynya chosen to shift the boundary. Attacker will insert a text to shift the plaintext block boundaries so that only leaves one unknown character. The figure below shows the process of the attack in the first phase. Attacker sends 15 character 'x' then receives the encrypted 15 character 'x' and 'topsecret' in the form C0 | | C1 | | C2.



The second phase is the phase for guessing the last character, in this phase the attacker exploit privileged chosen plaintextnya (throwing plaintext, ciphertext received). The first phase of the attacker already knows:

Ciphertext C = C0 | | C1 | | C2
P1 = xxxxxxxxx?
Attacker must guess the last character P1 unknown contents.

The first step the attacker chooses guess Ga = 'xxxxxxxxxxxxxxxa' then determine the plaintext P3 C2 = C0 XOR XOR Ga.

Just a reminder only. Because of the chained IV, we know that our chosen plaintext will be encrypted using the C2 as IV. Later the plaintext will be in-XOR again with IV (C2) before leaving C0 XOR encrypted so that will be encrypted Ga.

C3 = E (P3 XOR C2) = E (Co Ga XOR XOR XOR C2 C2) = E (Co XOR Ga)

Because we have P3 XOR with the first C2, C3 will be leaving = Encrypt (C0 XOR Ga). In P3 we also use C0 because we will compare with C3 to C1 and C1 = Encrypt (C0 XOR P1).

Plaintext P3 are chosen plaintext attacker (chosen plaintext) for encrypted into C3. Then the attacker will see if C3 = C1? If not the same, then the attacker will proceed with other guesses.



Because C3 is not the same as it means the C1 Ga guesses wrong. Create new attacker guesses Gb = 'xxxxxxxxxxxxxxxb' then determine P4 C3 = C0 XOR XOR Gb. Plaintext encrypted the attacker's choice will be C4. Attackers will see if C4 = C1? If not the same, the attackers will continue with other guesses.



Attackers will continue to try until the 20th guesses (in this case), the attacker makes guesses Gt = 'xxxxxxxxxxxxxxxt' and define P22 = C0 XOR XOR Gt C21. After P22 encrypted into C22, C22 attackers see that it turns out = C1, which means that the attacker is confident that the P1 'xxxxxxxxxxxxxxxt'.



Once the attacker knows the first character is 't' next attacker will restart from the first phase to the second phase of shifting boundaries and to guess the last character as much as 256 times the maximum guesses.

Chosen Boundary in HTTPS

So far we have discussed are still at the level of theoretical models or alone. Whether the model actually exist in the real world? The answer is no, BEAST attack is an attack that exploits chained IV using block boundary shift technique (chosen boundary) to decrypt an SSL record.

Here is an example of cookie-bearing requests sent by the browser and chopped into blocks of plaintext, P1 | | P2 | | P3 | | P4. By sniffing attacker managed to get the ciphertext C = C1 | | C2 | | C3 | | C4 and wanted to steal the victim's cookie PHPSESSID. How do I?



Attackers can steal cookie PHPSESSID the same way as we have discussed earlier.

The first phase we have to shift the boundary of tract so that only the last byte of unknown contents. In the HTTP request in the majority of the text content is already known, "POST", "HTTP/1.1" and "Cookie" is a text that commonly exists in the HTTP request, it is not a secret. The only secret in the request that the above is the contents of PHPSESSID "af25c ...", even the length of the content PHPSESSID not something secret.

Attackers could insert additional text in the URI path to shift the boundary. In the example above request we want to shift the "af25c ..." a total of 12 characters to the right. Attacker will send the cookie-bearing requests with URI PATH / xxxxxxxxxxxx so that plaintext block of the request form is:



Note in the request that has been shifted, the P3 is "kie: PHPSESSID = a". Of P3 is only just the last character of the unknown, "Cookie" and "PHPSESSID" is a common text in the HTTP request. Is looming right way? Once we shift that P3 be like that, then we go into phase two, which is the last character to guess as many as up to 256 times.

By repeating the first phase and second phase for all the characters in the cookie, the attacker will eventually be able to steal a cookie. This is what actually happens in the BEAST attack attack.

Attack Scenario

How exactly is it done BEAST attacks to steal cookie PHPSESSID such request example above? The figure below shows how the BEAST attack scenario an attacker to steal cookie PHPSESSID bank.com.



The BEAST attack requires the attacker to be in a position to do the sniffing (eg a LAN network, is in the proxy / router).

The second requirement is the attacker managed to run the same script in the browser (in different tabs) used by the victim to open bank.com. The script serves as an agent that is able to send the cookie-bearing requests and sends the data (over SSL) to the site bank.com. There are many ways an attacker could execute script in the victim's browser, such as by clicking the victims seduce evil.com site that contains the script agent.

Here are the ways in which an attacker to steal PHPSESSID:

In the first phase of the script in the victim's browser to force the victim's browser sends the cookie-bearing requests to bank.com with URI paths containing 'xxxxxxxxxxxxxxxx' PHPSESSID to shift the contents of as many as 12 characters.
Sniffer posted attacker records the POST request in the form of SSL records containing C = C1 | | C2 | | C3 | | C4
Because the first character in the last byte PHPSESSID is P3, then enter the next phase of the attacker to guess the character last two P3
Script agent will force the browser to send data to the target site as a continuation of the first phase of a POST request (as part of the POST body)
Attacker will ask the browser to encrypt P5 = C2 C4 XOR XOR Ga and send it (over SSL) to the target site
Sniffer C5 attacker would see passing in the network and check if C5 = C3? If the same, then the attacker guesses correctly
When the attacker guesses wrong then the attacker will make a new guess and go back to step 5.
At most 256 times the attacker guesses will manage to get the first character of the contents PHPSESSID.
Subsequently the attacker back to the first phase in step 1 until all characters PHPSESSID successfully stolen.
Attack Simulation

I made a small python script to simulate how the BEAST attack decryption process in this case. Here is a screen recording when BEAST attack simulation demo script is run.

Source code of script below can be downloaded here: beast2.py



I'll explain a little bit how the scripts work. AES cipher is used with long blocks of 16 bytes, the contents of text and secret key stored in the variable key and secret.

ChallengePhase1 function (text) is used to simulate an attack on the first phase. Text sent to the function will be added at the beginning of a secret text, "text + = plaintext secret". Furthermore plaintext is encrypted with AES combined CBC mode and this function returns ciphertextnya IV +.

Note that the variable initialization vector iv always turned into a block of ciphertext last. Only the first IV is generated randomly.



ChallengePhase2 function (text) simulate attacks on the second phase (phase guesses). Text sent to a function is selected plaintext (chosen plaintext) to be encrypted. The function returns the ciphertext encrypted result. At this function IV also always changed to last ciphertext block.



After we prepared the two functions for the first phase and the second phase, now we see how attackers perform the attack (of the simulation).

The first attacker will make r that contains many NULL character (0 × 00) as a text to be inserted to shift the plaintext block. At first r will contain the NULL character to guess the 15 first character secret. Next r contains 14 characters NULL character to guess both secret and so on.




Loop block below is the block that does guesses ranging from ASCII characters 32 to 127 ASCII characters (because we know the plaintext is printable ASCII). Guess block G is r + one ASCII character between 32-127 and chosen plaintext (challengeblock) which will be encrypted is Ci XOR XOR IV G.

Output of challengePhase2 will be compared with the ciphertext is sought, if the same, then the kind of character that have been found.



There are sites that make demo / BEAST attack simulation in javascript, please visit

0 komentar:

Posting Komentar