As ever, the opinions expressed in this website are personal to me and do not necessarily reflect the opinions of my employer.
As part of January’s Patch Tuesday, we released 7 patches targeting 8 individual vulnerabilities. Of these 8, I will talk about CVE-2012-0003, a memory corruption vulnerability in a Windows Media component that may lead to remote code execution. An attacker can use social engineering tricks to lure end users into playing a specially crafted MIDI file on a vulnerable system and so gain control of the host.
Windows 7 and later operating systems are not affected by this vulnerability. We have made patches for all 8 vulnerabilities available via Windows automatic updates as well as for manual installation.
MIDI File Format:
To completely understand the vulnerability, one needs to understand the MIDI file format. MIDI stands for Musical Instrument Digital Interface, which is an industry specification for encoding, storing, synchronizing and transmitting a musical performance and controlling data of electronic musical instruments and other electric equipment. You can read more about MIDI files on Wikipedia.
Like every other file type, a MIDI file has its own format. Let’s start with its header. A MIDI file always starts with the hex bytes 4D 54 68 64 (“MThd”), followed by 00 00 00 06 (the header length), which is constant for Standard MIDI Files (SMF).
Every MIDI file ends with the marker bytes 00 FF 2F 00. All of the music data lies between the header and these end marker bytes.
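The header and end-marker layout described above can be checked with a few lines of Python. This is only a minimal sketch: the function names are made up for illustration, and the three 16-bit fields read after the length (format, track count, division) come from the Standard MIDI File layout rather than from this post.

```python
import struct

def parse_smf_header(data):
    """Return (format, track_count, division) from an SMF header.

    Raises ValueError if the "MThd" magic or the constant length 6 is missing.
    """
    if len(data) < 14:
        raise ValueError("too short to hold an MThd chunk")
    # 4-byte magic followed by a big-endian 32-bit header length.
    magic, length = struct.unpack(">4sI", data[:8])
    if magic != b"MThd":
        raise ValueError("missing MThd magic (4D 54 68 64)")
    if length != 6:
        raise ValueError("unexpected header length %d" % length)
    # Three big-endian 16-bit fields fill the 6 header bytes.
    return struct.unpack(">HHH", data[8:14])

def ends_with_end_marker(data):
    """True if the data ends with FF 2F 00.

    The post lists the marker as 00 FF 2F 00; the leading 00 is the wait
    time of that final event, so only the last three bytes are checked.
    """
    return data.endswith(b"\xff\x2f\x00")
```

A file that fails either check is not necessarily malicious; it is simply not shaped like the files described here.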
Each music event is composed of only 4 bytes; each byte has its own significance, as follows:
Byte 1: The wait time for the musical event. It has a maximum value of 7F. The value in this byte is the amount of time to wait before playing the event.
Byte 2: The event type. There are multiple event types, such as Note On, Note Off, Aftertouch, Pitch Wheel and Channel Pressure.
Byte 3: The note pitch number. I am not much into music, so I can’t say much about the notes. However, 0x3C is middle ‘C’ and 0x47 is the ‘B’ above it.
Byte 4: The volume of the note. Both bytes 3 and 4 range from 00 to 7F.
For example: when a MIDI player sees the music track data 2C 90 47 60, it interprets the numbers as: wait for 2C time units, and then play the musical note B at volume 60.
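To make the four-byte layout concrete, here is a small Python sketch that splits one event into its fields. It follows the simplified model used in this post (exactly four bytes per event). The decode_event name is hypothetical, and the split of byte 2 into an event type (high nibble) and a channel (low nibble) is an added detail from the MIDI specification, not something stated above.

```python
# High nibble of the event byte, limited to the three types this post cares about.
EVENT_NAMES = {0x8: "Note Off", 0x9: "Note On", 0xA: "Aftertouch"}

def decode_event(raw):
    """Decode one 4-byte event: wait time, event type, pitch, volume."""
    if len(raw) != 4:
        raise ValueError("this simplified model expects exactly 4 bytes")
    wait, status, pitch, volume = raw
    return {
        "wait": wait,
        "event": EVENT_NAMES.get(status >> 4, "other"),
        "channel": status & 0x0F,  # assumption: low nibble is the channel
        "pitch": pitch,
        "volume": volume,
    }

event = decode_event(bytes.fromhex("2C904760"))
print(event["event"], hex(event["pitch"]), hex(event["volume"]))
```

Real files are less regular than this (wait times can span several bytes and some events carry fewer data bytes), so the sketch is for reading the example, not for parsing arbitrary files.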
Windows Media Player supports the MIDI file format and can play MIDI files. We have seen above that there is hardly any room to play with the MIDI header, as most of its contents are static. We also noticed that each piece of music data occupies only 4 bytes, with bytes 3 and 4 having a maximum value of 7F.
A memory corruption vulnerability exists when a Windows Media component (winmm.dll) tries to parse music data of type Note On, Note Off or Aftertouch with a pitch number greater than 7F. The oversized value leads to a large offset calculation, resulting in a heap overflow in the midiOutPlayNextPolyEvent() function.
A MIDI file can contain many audio controls; hence there is no straightforward detection technique. As far as I understand, unless a specially crafted MIDI file has only one audio control, detecting this vulnerability with an IDS is not possible. However, one can write an exploit-specific IDS signature to detect the attack. I will write a separate post on exploit-specific IDS signatures. Please let me know if anyone has an IDS detection mechanism for this vulnerability (not the exploit) on the network.
That said, simple checks can be performed on a MIDI file to determine whether it is a specially crafted exploit for CVE-2012-0003.
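One way such a check might look is sketched below in Python. It walks each track and reports Note On, Note Off and Aftertouch events whose pitch byte is greater than 7F, which is the condition described above. This is an illustrative sketch under stated assumptions, not a vetted detector and not necessarily the check anyone uses in production: the function names are made up, the parser handles only the basic Standard MIDI File event layout, and a hit means "worth a closer look", since a merely corrupt file could trigger it too.

```python
import struct

SUSPECT = {0x8: "Note Off", 0x9: "Note On", 0xA: "Aftertouch"}

def read_vlq(data, pos):
    """Read a MIDI variable-length quantity; return (value, next_pos)."""
    value = 0
    while True:
        byte = data[pos]
        pos += 1
        value = (value << 7) | (byte & 0x7F)
        if not byte & 0x80:
            return value, pos

def scan_track(data, pos, end):
    """Return (offset, event_name, pitch) hits found in one track."""
    hits = []
    status = 0
    try:
        while pos < end:
            _, pos = read_vlq(data, pos)          # wait time
            if data[pos] & 0x80:                  # explicit event byte
                status = data[pos]
                pos += 1
            elif status < 0x80:
                break                             # no previous event to reuse
            if status == 0xFF:                    # meta event: type, length, payload
                meta_type = data[pos]
                length, pos = read_vlq(data, pos + 1)
                pos += length
                status = 0
                if meta_type == 0x2F:             # end-of-track marker
                    break
            elif status in (0xF0, 0xF7):          # sysex: length, payload
                length, pos = read_vlq(data, pos)
                pos += length
                status = 0
            elif status >= 0xF0:
                break                             # not expected in a MIDI file
            else:
                kind = status >> 4
                if kind in SUSPECT and data[pos] > 0x7F:
                    hits.append((pos, SUSPECT[kind], data[pos]))
                pos += 1 if kind in (0xC, 0xD) else 2
    except IndexError:
        pass                                      # truncated file: keep what we found
    return hits

def find_suspicious_events(data):
    """Scan every MTrk chunk of a MIDI file for out-of-range pitch bytes."""
    if data[:4] != b"MThd" or len(data) < 8:
        raise ValueError("not a Standard MIDI file")
    hits = []
    pos = 8 + struct.unpack(">I", data[4:8])[0]   # skip the header chunk
    while pos + 8 <= len(data):
        chunk_id = data[pos:pos + 4]
        length = struct.unpack(">I", data[pos + 4:pos + 8])[0]
        start = pos + 8
        if chunk_id == b"MTrk":
            hits.extend(scan_track(data, start, min(start + length, len(data))))
        pos = start + length
    return hits
```

Calling find_suspicious_events(open(path, "rb").read()) on a file returns an empty list when no such event is found, and a list of (file offset, event type, pitch byte) tuples otherwise.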
This specific vulnerability is being used by malware in the wild and is highly likely to be integrated into popular exploit packs such as BlackHole and Phoenix.
To protect yourself from such attacks, make sure you use a current operating system that is up to date with all security patches. Do not click on hyperlinks received from an unknown source. Use the latest version of your internet browser, install an anti-virus engine and update it regularly.