Microsoft's FAT16 and FAT32 File Systems, and how they affect file deletion and recovery |
March 2019:
Introduction: After spending months on the NTFS pages (and later on the SSD pages) I realised that FAT file system systems, although quite ancient, were still being used widely, and there's not a lot of technical information out there about them. This - the briefest of the NTFS/SSD/FAT trilogy - was born way back at the end of 2014. It's nowhere near as conprehensive as the other offerings, and I was getting rather tired of all this. There's no diagrams or witty comments, and it's not even finished. But it might be of some help to someone. Perhaps one day I will come back and polish it a little. I am as usual obliged to those I have borrowed from, and will also be obliged to those who point out any errors without any reward apart from that of contribution.
A Bit of Background: FAT is an acronym of File Allocation Table, a consecutive array of entries representing cluster addresses. FAT file systems have their origins in the late 1970s and early 1980s and it was the file system supported by the Microsoft MS-DOS operating system. It was originally developed as a simple file system suitable for floppy disk drives less than 500K in size. Over time it has been enhanced to support larger and larger media. Although it's owned by Microsoft, and presumably proprietary, this has never been pursued. FAT systems are used by many software and hardware purveyors, and that's why it loaded onto cameras and USB drives instead of NTFS, which you would definitely have to pay for. There are three common FAT file system types: FAT12, FAT16 and FAT32 (exFAT, which is not compatible with FAT and is proprietary, is not discussed here). The basic difference in these FAT types is the size, in bits, of the entries in the FAT structures on the disk. There are 12 bits in a FAT12 FAT entry, 16 bits in a FAT16 FAT entry and, yes, 32 bits in a FAT32 FAT entry. FAT12 won't be mentioned much here as it is little used these days - no device is small enough (FAT12 is primarily used for floppy disks and volumes under 4mb in size). Most of what's here applies to FAT32, with the occasional comparison with FAT16. Boot Sector: FAT file systems have three basic components: the boot sector, the FAT itself, and the root directory. FAT32 differs from FAT16 by having a larger boot sector containing a FSInfo component, and a complete copy of the boot sector for integrity. The first sector on a FAT volume, sector 0, is the Boot Sector. The boot sector starts with the BIOS Parameter Block (BPB). The BPBs for FAT16 and 32 are identical up to offset 0x20, where the BPB ends and the boot sector starts. The FAT32 boot sector is three sectors in size. The boot sector copy starts in sector 6. At offset 0x36 in the FAT16 boot sector is an eight-byte field BS_FilSysType which contains the character string “FAT16 “, or “FAT “. In the FAT32 boot sector the field BS_FilSysType is at offset 0x52 and contains the string “FAT32 “. However this field does not determine the FAT type. The FAT type on a volume is determined entirely by the count of clusters on the volume. This is the one and only way that FAT type is determined. There is no such thing as a FAT12 volume that has more than 4084 clusters, nor is there any such thing as a FAT16 volume that has fewer than 4085 clusters or more than 65,524 clusters. There is no such thing as a FAT32 volume that has fewer than 65,525 clusters. Microsoft operating systems will not handle correctly any FAT volume that violates these rules because they will consider that the volume has a different type of FAT than what has been specified. Note also that the Count of Clusters value is exactly that - the count of data clusters starting at cluster 2. The maximum valid cluster number for the volume is Count of Clusters + 1, and the “count of clusters including the two reserved clusters” is Count of Clusters + 2. documents mistakenly say that this 0xAA55 signature occupies the “last 2 bytes of the boot sector”. This statement is correct if — and only if — BPB_BytsPerSec is 512. If BPB_BytsPerSec is greater than 512, the offsets of these signature bytes do not change (although it is perfectly OK for the last two bytes at the end of the boot sector to also contain this signature). 0x0B - BPB_BytsPerSec (2 bytes) Bytes per sector, typically 512 0x0D - BPB_SecPerClus (1 byte) Sectors per cluster, typically 8 0x0E - BPB_RsvdSecCnt (2 bytes) Number of reserved sectors that precede the first FAT, typically 32 0x10 - BPB_NumFATs (1 byte) Number of FATs on the volume. Inevitably two on hdds, but flash devices may use one 0x20 - BPB_TotSec32 (4 bytes) Number of sectors on the volume (From this point onwards the BPB for FAT32 differs from FAT16) 0x24 - BPB_FATSz32 (4 bytes) number of sectors for one FAT 0x2C - BPB_RootClus (4 bytes) root directory cluster number 0x30 - BPB_FSInfo (2 bytes) FSInfo sectornumber FSInfo Sector structure: On a FAT32 volume the FAT can be a large data structure, unlike on FAT16 where it is limited to a maximum of 128K worth of sectors. The FSInfo area assists file allocation by storing the free cluster count and the last cluster allocated so that they does not have to be continuously computed. The FSInfo sector number is the value in the BPB_FSInfo field; for Microsoft operating systems it is always set to 1. 0x00 - FSI_LeadSig (4 bytes) FSInfo signature, 0x41615252 0x01E4 - FSI_ StrucSig (4 bytes) FSInfo signature, 0x61417272 0x01E8 - FSI_Free_Count (4 bytes) last free cluster count, if 0xFFFFFFFF then must be computed 0x01EC - FSI_Nxt_Free (4 bytes) last cluster allocated, if 0xFFFFFFFF then start at cluster 2 Boot sector backup: Another feature on FAT32 volumes that is not present on FAT16 is the BPB_BkBootSec field. FAT16 volumes can be totally lost if the contents of sector 0 of the volume are overwritten or sector 0 goes bad and cannot be read. This is a “single point of failure” for FAT16 volumes. The BPB_BkBootSec field reduces the severity of this problem for FAT32 volumes, because starting at that sector number on the volume – 6 - there is a backup copy of the boot sector information including the volume’s BPB. In the case where the sector 0 information has been accidentally overwritten, all a disk repair utility has to do is restore the boot sector(s) from the backup copy. In the case where sector 0 goes bad, this allows the volume to be mounted so that the user can access data before replacing the disk. This second case - sector 0 goes bad - is the reason why no value other than 6 should ever be placed in the BPB_BkBootSec field. If sector 0 is unreadable, various operating systems are “hard wired” to check for backup boot sector(s) starting at sector 6 of the FAT32 volume. Note that starting at the BPB_BkBootSec sector is a complete boot record. The Microsoft FAT32 “boot sector” is actually three 512-byte sectors long. There is a copy of all three of these sectors starting at the BPB_BkBootSec sector. A copy of the FSInfo sector is also there, even though the BPB_FSInfo field in this backup boot sector is set to the same value as is stored in the sector 0 BPB. The FSInfo fields are maintained in the primary copy only. Little-Endian: FAT file systems were originally developed for the IBM PC machine architecture, which means that numerical values are held in little-endian form. Little-endian is a method of ordering the bytes of a numeric value, and applies when a field is two or more bytes long. To convert little-endian to a recognisable hex form (big-endian), there are two rules to be followed: the value of each byte is not changed, and the bytes are reordered from right to left. For instance, the FAT32 boot sector has five bytes at offset 0x0B holding three fields, bytes per sector (2 bytes), sectors per cluster (1 byte) and reserved sector count (2 bytes). Typical values are 00 02 08 20 00. Reversing the bytes in the two-byte fields gives values of 0x0200, 0x08, and 0x0020 (512, 8, 32). The FAT: The FAT file system contains two mirrored FATs for security, although it is possible for one FAT only to be used. The FAT is a contiguous area following the reserved sectors, and in a FAT32 volume holds an array of 32-bit entries representing every cluster on the volume. FAT entries either contain zeroes if the cluster is not allocated, the number of the next cluster entry (for a file that has more than one cluster allocated) or an eof if only one cluster is allocated. All non-zero files in a FAT file system have at least one cluster allocated. The cluster address in the FAT entry is 28 bites. The four high-order bits of the entry are not altered on cluster allocation or deletion. Cluster numbers: Cluster numbers are relative to the start of the data area, following the reserved area and the FATs. Clusters in the data area are numbered from two, there being no cluster number zero or one, so two must be subtracted from the cluster number to find the correct offset. Assuming the BPB starts in sector zero, to find the root directory sector with a root directory cluster number of 0x0776 (1910): Root cluster number -2 * 8 = sectors: 1910 – 2 = 1908 * 8 = 15264 Reserved sectors 32 Sectors per FAT * 2: 15285 *2 30570 Root directory start sector: 45866 Directory: A FAT directory is nothing but a “file” composed of a linear list of 32-byte structures. The only special directory, which must always be present, is the root directory. For FAT16 media, the root directory is located in a fixed location on the disk immediately following the last FAT and is of a fixed size in sectors computed from the BPB_RootEntCnt value (see computations for RootDirSectors earlier in this document). For FAT16 media, the first sector of the root directory is sector number relative to the first sector of the FAT volume: FirstRootDirSecNum = BPB_ResvdSecCnt + (BPB_NumFATs * BPB_FATSz16); For FAT32, the root directory can be of variable size and is a cluster chain, just like any other directory is. The first cluster of the root directory on a FAT32 volume is stored in BPB_RootClus. Unlike other directories, the root directory itself on any FAT type does not have any date or time stamps, does not have a file name (other than the implied file name “\”), and does not contain “.” and “..” files as the first two directory entries in the directory. Dir entry is fixed at 32 bytes The directory entry for a file contains the file name, the file size, and the first cluster number of the file’s data. In this way we can go to the first cluster entry in the FAT and follow the cluster chain to retrieve all the data. The field holding the first cluster number is two bytes. This is fine in FAT16, but FAT32 has cluster numbers which won't fit into two bytes, so a second field is used. In each directory entry offset 0x1A holds the two low order bytes, and offset 0x14 holds the high order bytes. Both are in little-endian. Both fields require converting to big-endian before being treated as a four-byte field. To convert to big-endian reverse the two byte order at offset 0x14, reverse the two-byte order at offset 0x1A, join together in 0x14-0x1A order. e.g. Offset 0x14 = 02 00, offset 0x1A = 1E 76. Reverse and join as 00 02 76 1E. Directory entries are added in time seq, not sorted on name, and can be reused. A zero-byte file has a cluster number of zero in its directory entry. File Deletion: The principles of file deletion – and recovery – apply to all FAT versions. To indicate that a file has been deleted the first letter of the file name is altered to 0xE5. This is however a valid byte in the Kanji character set, so an 0x05 can be used to specify that the file has not been deleted and the first character of the live file name starts with 0xE5. 0xE5 is actually a lower case a with two dots above it. This is not an allowable file name character so when examining file names it is displayed as an underscore. On deletion the two bytes at offset 0x14 in the directory, the high-end part of the starting cluster address, are set to zero. Any cluster address which is greater than 65535, and had a value in the two bytes at 0x14, will be invalid. An attempt to follow this cluster address will result in data from a completely different address being returned, making recovery impossible. Deleted files with a cluster address that originally was held in two bytes only will retain the correct address and be recoverable, but that would be very few files, or a small flash drive with a large cluster size. The cluster entries in the FAT for a deleted file are set to zero, thus indicating that they are available for reuse. This also means that the cluster chains are no longer available for file recovery. Without the starting cluster number, and with the cluster chains destroyed, file recovery is best attempted by a scan of the unallocated clusters looking for a recognisable file signature. The following clusters can then be retrieved up to the size of the file. This method depends on: 1) The data clusters being allocated contiguously and consecutively 2) The data clusters being in one extent 3) No newer file being allocated in any part of the old file’s allocation 4) No newer file being allocated in any part of the old file’s allocation and being subsequently deleted If an entry in the FAT contains a non-zero value then this means that this cluster has been allocated to another file. The recovery process can skip over the allocated clusters and continue from the next zero cluster, but this method is not likely to be successful.
You can return to my home page here If you have any questions, comments or criticisms at all then I'd be pleased to hear them: please email me at kes at kcall dot co dot uk. © Webmaster. All rights reserved. |