MPQ File Links
The following MPQ file descriptions are currently available:
- \GameBalance\ItemTypes.gam
- \GameBalance\Items*.gam
- \GameBalance\AffixList.gam
- \GameBalance\SetItemBonuses.gam
- \GameBalance\Recipes*.gam
- \GameBalance\ItemEnhancements.gam
- \Recipe\*.rcp
- \Actor\*.acr
- \Textures\*.tex
- \StringList\*.stl
New file descriptions will be added every week or two, so be sure to check back for updates.
Also, please be sure to check out the recently improved guide on String Hashing. Diablo 3 makes extensive use of string hashes as a way to correlate data, which makes this a must-read section!
MPQ Introduction
The MPQ file format was developed by Blizzard and is used by all their games. It is essentially an archive format that supports various compression techniques. It is optimized for read operations and uses hashing for quick file indexing.
The MPQ files are also used for patching the client game files. When a patch is released, it will include one or more MPQ files which contain only the updated files. The patch MPQ can also contain references to files that are to be deleted. When a file is patched, it can either be replaced in its entirety (if the changes are significant), or, more commonly, the MPQ will contain a BSDIFF of the file. Each file patch entry contains the MD5 of the file before the patch. This allows the patching mechanism to ensure that it does not attempt to patch a file that is not at the correct version level.
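The MD5 guard described above can be sketched in a few lines of Python. This is only an illustration of the idea: the function name and the hex-string form of the stored MD5 are assumptions, not part of the MPQ patch format.

```python
import hashlib

def can_apply_patch(current_bytes, expected_md5_hex):
    """Return True if the unpatched file matches the MD5 stored in the patch entry.

    expected_md5_hex is a hypothetical hex-string form of the stored digest.
    """
    return hashlib.md5(current_bytes).hexdigest() == expected_md5_hex.lower()
```

A patcher would skip (or abort on) any file for which this check returns False, since the BSDIFF data only makes sense against the exact version it was generated from.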
When the game client is loaded, it loads the needed game files from the appropriate "core" MPQ, and then applies the patches one at a time in chronological order. The MPQs do not store a version of the patched file! It's possible that the patched files are stored in the cache directory.
The tool that I've been using for working with MPQ files is Ladik's MPQEditor. Ladik's site also contains a lot of detailed information about the MPQ format itself. I've had no trouble extracting and patching Diablo 3 game files with this excellent tool.
Diablo 3 MPQ Files
In the Diablo 3 beta client, the core MPQ files are located in the following directory (for Windows clients):
[D3_Install_Dir]\Data_D3\PC\MPQs\
This directory includes the following MPQ files:
- base-Win.mpq — This MPQ contains the core executable files. Unlike other MPQs, the contents of this MPQ are extracted to the [D3_Install_Dir]. This MPQ includes the "Diablo III.exe", and various DLLs including the "bnet\battle.net.dll".
- ClientData.mpq — This MPQ contains most of the game graphics related files. Most of the files in this MPQ appear to be platform specific, but some, like the "\Powers\*.pow" ones, don't seem to be.
- CoreData.mpq — This MPQ contains most of the interesting game client data. This includes all the affix information, item information, monster information, etc.
- enUS_Audio.mpq — As the name implies, this MPQ contains all the enUS localized sound files (*.snd and *.ogg). All sound files with English dialog are included here.
- enUS_Cutscene.mpq — This MPQ contains the enUS localized cut-scenes for the five character classes.
- enUS_Text.mpq — This MPQ contains all the enUS localized strings that appear as game text anywhere in the game. This includes everything from localized affix and item names to NPC conversations. The StringList Data section provides the localized (enUS) strings.
- HLSLShaders.mpq — This MPQ contains shaders.
- Sound.mpq — This MPQ contains non-localized music and sounds (i.e., any sound files that don't contain localized dialog).
- Texture.mpq — This MPQ contains all of the game textures. These textures are stored in a Blizzard specific archive format. Each *.tex file can contain multiple textures. Necrolis' D3TexConv program (source code included in the download) can be used to extract the images from the *.tex files.
Diablo 3 MPQ Patch Files
The MPQ patch files are located in one of three subdirectories under the MPQs directory:
- base — For all base files.
- enUS — For all enUS localized files.
- Win — For all Windows platform specific files.
An MPQ patch file has a name like "d3-update-base-7338.MPQ" where "7338" is the version number. As a reminder, to patch the core MPQ file, you must apply all of the patch files in sequence.
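Since the patches must be applied in version order, a small helper can sort the patch archives by the number embedded in the file name. This is a rough sketch: the regular expression is inferred from the single example name above, and the function name is made up for illustration.

```python
import re

# Assumed naming pattern, inferred from the example "d3-update-base-7338.MPQ".
PATCH_NAME = re.compile(r"^d3-update-(?P<group>[A-Za-z]+)-(?P<version>\d+)\.mpq$",
                        re.IGNORECASE)

def patch_order(filenames):
    """Return the patch file names sorted by ascending version number."""
    patches = []
    for name in filenames:
        match = PATCH_NAME.match(name)
        if match:  # anything that does not look like a patch archive is skipped
            patches.append((int(match.group("version")), name))
    return [name for _, name in sorted(patches)]
```

The version is compared as an integer rather than as text, so the order stays correct even if the number of digits changes.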
All files in an MPQ have a 0x10 byte file header. The first 4 bytes are always the "magic number": EF BE AD DE (0xDEADBEEF when read as a little-endian DWord). I am not sure whether the next 4 bytes identify a version number or the file type, but this DWord is identical for all files of the same type for a given version. The last two DWords in this header always seem to be 0x00000000.
MPQ Header
Every file stored in an MPQ starts with a 0x10 byte header. The structure of this header is described below.
| structure MPQHeader | // sizeof 0x10 | |
| { | ||
| 0x000 | DWord mpqMagicNumber; | // MPQ file magic number: 0xDEADBEEF |
| 0x004 | DWord fileTypeId; | // file type or version id (same for all *.gam files) |
| 0x008 | DWord unused[2]; | // always 0x00000000 |
| } |
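As a minimal sketch, the header above can be read with Python's struct module. The function and constant names are my own; the layout (four little-endian DWords) is taken from the table.

```python
import struct

FILE_MAGIC = 0xDEADBEEF   # bytes EF BE AD DE on disk
HEADER_SIZE = 0x10

def read_file_header(data):
    """Validate the 0x10 byte header and return the fileTypeId DWord."""
    if len(data) < HEADER_SIZE:
        raise ValueError("file is shorter than the 0x10 byte header")
    # Four little-endian DWords: magic, fileTypeId, unused[0], unused[1]
    magic, file_type_id, _unused0, _unused1 = struct.unpack_from("<4I", data, 0)
    if magic != FILE_MAGIC:
        raise ValueError(f"unexpected magic number: 0x{magic:08X}")
    return file_type_id
```

Passing the bytes of an extracted file returns the type/version DWord, which can then be compared against a known value for that file type.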
MPQ GameBalance *.gam Files
The "CoreData.mpq" file contains the "GameBalance" directory. This is where many interesting gam ("*.gam") files are located including "AffixList.gam" and "Items_*.gam". There's a lot of interesting game information located in these files.
The file type id / version number for *.gam files was 0x0000071C in the core MPQ, but is now 0x0000075F as of version 7318.
MPQ GameBalance *.gam Header
The gam files have a header size (excluding the 0x10 byte MPQ header) of 0x3A8 as of version 7318 (this was 0x398 in older versions).
I haven't figured out the complete structure of the gam header, but here is what I have so far (make sure you add 0x10 bytes for the MPQ header if you are looking at absolute file offsets):
| structure GamHeader | // sizeof 0x3A8 | |
| { | ||
| 0x000 | DWord fileId; | // unique file id |
| 0x004 | DWord unused[2]; | // always 0x00000000 |
| 0x00C | DWord recordIndex; | // record index for data offsets (see below) |
| 0x010 | Byte fixedString[256]; | // fixed value: "0000.gbi" 0x30303030 0x6962672E |
| 0x110 | Byte resourceName[256]; | // filename of the resource from which this file was generated |
| 0x210 | DWord unknown; | // the value of this DWord is different for each file |
| 0x214 | DWord unused2; | // always 0x00000000 |
| 0x218 | DataArrayEntry[25]; | // array of DataArrayEntry (see below) |
| } |
| structure DataArrayEntry | // sizeof 0x10 | |
| { | ||
| 0x000 | DWord dataOffset; | // data start offset (relative to the start of the GamHeader) |
| 0x004 | DWord dataNumBytes; | // data size (number of bytes); does not include variable data |
| 0x008 | DWord unused[2]; | // always 0x00000000 |
| } |
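The known header fields can be pulled out along these lines. This is a sketch rather than a complete parser: the names are hypothetical, the two 256 byte fields are assumed to hold NUL-padded ASCII text, and the offsets are relative to the end of the 0x10 byte file header.

```python
import struct

MPQ_HEADER_SIZE = 0x10  # per-file header that precedes the gam header

def read_gam_header(data):
    """Return the documented GamHeader fields as a dict."""
    base = MPQ_HEADER_SIZE

    def dword(offset):
        return struct.unpack_from("<I", data, base + offset)[0]

    def fixed_string(offset, size=256):
        raw = data[base + offset:base + offset + size]
        # Assumption: the text is NUL-padded ASCII.
        return raw.split(b"\x00", 1)[0].decode("ascii", errors="replace")

    return {
        "file_id": dword(0x000),
        "record_index": dword(0x00C),
        "fixed_string": fixed_string(0x010),
        "resource_name": fixed_string(0x110),
        "unknown": dword(0x210),
    }
```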
The array of "DataArrayEntry" contains all 0x00s with the exception of the two DWords that contain the "dataOffset" and "dataNumBytes". For a given version of gam files, the start offset is always the same. Since version 7338, this is 0x000003A8 (0x000003B8 absolute file offset). The "recordIndex" indicates which array entry contains the data information. Unfortunately, this mapping uses an intermediate table that is not part of the file structure. For a given value of the "recordIndex", the array entry containing the data information is always the same. Here are some examples from the core version; the first value is the "recordIndex" (the DWord at header offset 0x0C), and the second value is the absolute file offset where the "dataOffset" DWord is located:
| 0x01 --> 0x228 |
| 0x02 --> 0x238 |
| 0x03 --> 0x248 |
| 0x04 --> 0x348 |
| 0x05 --> 0x268 |
| 0x07 --> 0x288 |
| 0x08 --> 0x278 |
| 0x0A --> 0x298 |
While at first it may appear that the "recordIndex" directly corresponds to the offset location, there are several outlier values (e.g., 0x04 --> 0x348) that prevent this from being the case. This is how we know that an intermediate data structure is being used. The "recordIndex" behaves like a sequence number that is being looked up to determine the actual entry for the data information.
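To make the outliers easier to see, the observed offsets can be converted into array entry numbers. The sketch below simply encodes the table above; the entry number is computed as (absolute offset - 0x228) / 0x10, where 0x228 is the absolute position of the first array entry.

```python
FIRST_ENTRY_ABS = 0x228  # 0x10 file header + 0x218 array offset
ENTRY_SIZE = 0x10

# recordIndex -> absolute offset of the dataOffset DWord (core version, from the table)
OBSERVED = {0x01: 0x228, 0x02: 0x238, 0x03: 0x248, 0x04: 0x348,
            0x05: 0x268, 0x07: 0x288, 0x08: 0x278, 0x0A: 0x298}

def entry_index(record_index):
    """Return the zero-based DataArrayEntry number for an observed recordIndex."""
    return (OBSERVED[record_index] - FIRST_ENTRY_ABS) // ENTRY_SIZE

for record_index in sorted(OBSERVED):
    print(f"0x{record_index:02X} -> entry {entry_index(record_index)}")
```

Only the values listed in the table are covered; any other "recordIndex" raises a KeyError because its mapping has not been observed.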
The way I have been processing these files is to read all the bytes starting at 0x218 (relative to the start of the gam header, or 0x228 as an absolute file offset) and look for the first non-zero DWord; this is the location of the "dataOffset". This approach will work on all existing gam files.
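A rough Python version of that scan is shown below. It walks the array one entry at a time instead of one DWord at a time, which should be equivalent as long as "dataOffset" is the first non-zero DWord in its entry; the names are illustrative only.

```python
import struct

MPQ_HEADER_SIZE = 0x10
ENTRY_ARRAY_OFFSET = 0x218  # relative to the start of the gam header
ENTRY_SIZE = 0x10
ENTRY_COUNT = 25

def find_data_entry(data):
    """Return (entry number, dataOffset, dataNumBytes) for the first used entry, or None."""
    for i in range(ENTRY_COUNT):
        pos = MPQ_HEADER_SIZE + ENTRY_ARRAY_OFFSET + i * ENTRY_SIZE
        data_offset, data_num_bytes = struct.unpack_from("<2I", data, pos)
        if data_offset != 0:
            return i, data_offset, data_num_bytes
    return None
```

Returning the entry number as well makes it easy to collect more samples for the "recordIndex" mapping.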
The "unknown" value at 0x210 is completely different for each gam file. I cannot find any kind of pattern and I have no clue what this value is.
MPQ GameBalance *.gam Data
Following the gam header is the data portion. Each record in the data portion has a fixed record size. As far as I can tell, this record size is not specified in the header, nor is the total number of records. However, it's easy enough to figure out the record sizes when looking at the hex dumps of the file.
Some gam files also have a variable data section located at the end of the data records. To determine if a given gam file has a variable data section, simply add together the MPQ file header size (0x10 bytes), the gam file header size (0x3A8), and the "dataNumBytes". If this is the same as the gam file size, then there is no variable data.
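The size check amounts to simple arithmetic, sketched here with the 0x3A8 header size quoted above (older files would need 0x398 instead). The function name is hypothetical.

```python
MPQ_HEADER_SIZE = 0x10
GAM_HEADER_SIZE = 0x3A8  # 0x398 in older versions

def variable_data_size(file_size, data_num_bytes):
    """Return the number of bytes that follow the fixed-size records (0 means none)."""
    fixed_end = MPQ_HEADER_SIZE + GAM_HEADER_SIZE + data_num_bytes
    if file_size < fixed_end:
        raise ValueError("file is smaller than the headers plus dataNumBytes")
    return file_size - fixed_end
```

For example, a file of 0x4B8 bytes with a "dataNumBytes" of 0x100 gives 0x4B8 - (0x10 + 0x3A8 + 0x100) = 0, so it has no variable data.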
Two of the files that have variable data are the "AffixList.gam" and "Items_Armor.gam" (as do most of the other "Items_*.gam" files). Gam files that contain variable data will contain pairs of DWords that specify how to read the variable data: the offset of the variable data (from the start of the data portion) and the number of bytes of the variable data.
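Resolving one of these DWord pairs to actual bytes might look like the following. It assumes that "the start of the data portion" is the header's "dataOffset" (which is relative to the gam header, itself located after the 0x10 byte file header); the function name is made up.

```python
MPQ_HEADER_SIZE = 0x10

def variable_bytes(data, data_offset, var_offset, var_num_bytes):
    """Slice one piece of variable data out of the whole file.

    data_offset is the dataOffset from the gam header; var_offset and
    var_num_bytes are the DWord pair stored in the record.
    """
    start = MPQ_HEADER_SIZE + data_offset + var_offset
    end = start + var_num_bytes
    if end > len(data):
        raise ValueError("variable data runs past the end of the file")
    return data[start:end]
```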
For the "AffixList.gam" file, each record can have up to 4 references to variable data. Each piece of variable data corresponds to an affix modcode. See the AffixList.gam section for more details on this. The structure for the modcodes is as follows (0x18 in size):
| struct ModCode | // sizeof 0x18 | |
| { | ||
| 0x00 | DWord modCode; | // modCode |
| 0x04 | DWord modParam; | // param used for elemental dmg and resists |
| 0x08 | DWord unused[2]; | // always 0x00000000 |
| 0x10 | DWord varDataOffset; | // variable data offset from the start of the data section |
| 0x14 | DWord varDataNumBytes; | // number of variable data bytes |
| } |
In version 7318 of the "AffixList.gam" file, there are four ModCode structures located towards the end of each record at the following offsets:
| modcode1: 0x178 (offset at 0x188, size at 0x18C) |
| modcode2: 0x190 (offset at 0x1A0, size at 0x1A4) |
| modcode3: 0x1A8 (offset at 0x1B8, size at 0x1BC) |
| modcode4: 0x1C0 (offset at 0x1D0, size at 0x1D4) |
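Using the ModCode layout and the four offsets listed above, the modcodes of a single record could be read roughly as follows. The names are hypothetical, all fields are treated as unsigned DWords, and the offsets apply to version 7318 only.

```python
import struct
from collections import namedtuple

ModCode = namedtuple("ModCode", "mod_code mod_param var_data_offset var_data_num_bytes")

# Offsets of the four ModCode structures within one AffixList record (version 7318).
MODCODE_OFFSETS = (0x178, 0x190, 0x1A8, 0x1C0)

def read_modcodes(record):
    """Return the four ModCode entries of one AffixList record (a bytes object)."""
    result = []
    for offset in MODCODE_OFFSETS:
        # Six little-endian DWords: modCode, modParam, unused[2], offset, size
        mod_code, mod_param, _u0, _u1, var_offset, var_len = struct.unpack_from(
            "<6I", record, offset)
        result.append(ModCode(mod_code, mod_param, var_offset, var_len))
    return result
```

An entry whose "var_data_num_bytes" is 0 has no variable data attached, which matches the "no modcode" case.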
The size of the variable data varies based on the modcode. The possible values are 0 (in the case of no modcode), 0x0C, 0x1C, 0x28, 0x34, and 0x40. In the core version, here are some of the modcodes with their corresponding sizes:
| 0x0C: Block, GoldPickUpRadius, HitMana, HitLife |
| 0x1C: Experience, Resists, Attack, Prec, Def, Vit, Will, AllStats, Life, MinD, MaxD |
| 0x28: Damage, CriticalD, Defense, Life, Gold, MF, Haste, Cast, LifeS, Regen, Run |
| 0x34: Inferior, Superior |
| 0x40: ItemCost |
In the near future, I will dive into the details of the "AffixList.gam" file. After that, I will dive into the "Items_*.gam" files. I was also going to spend time on the "TreasureClass" (*.trs) files, but these have been removed as of Patch 7447. Looks like Blizzard finally decided to remove some of the server-side data from the game client MPQs.
