FAT Filenames


Format

This document is for those who are already familiar with DOS files. It is intended to remind you of DOS conventions.

DOS uses the following pattern for forming file paths:

[disk]:\[directory]\[subdirectories]\[filename].[extension]

[disk] is one letter of the Latin alphabet. Valid letters are A through Z. How disk letters are assigned to physical devices was discussed earlier.

[directory] and [subdirectories] are strings. They specify the location of the file in the directory tree. Either or both may be omitted, depending on where the file is located.

[filename] is also a string; it identifies the name of the file. [extension] is a string that usually carries some information about the type of the file. Note that directories may also have extensions. Because file and directory names have the same format, everything below that applies to filenames also applies to directory names.
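The pattern above can be taken apart mechanically. Here is a minimal sketch in Python, assuming a full path that starts with a disk letter, a colon and a backslash. The names DosPath and split_dos_path are mine and are not part of DOS or of any library.

```python
from typing import List, NamedTuple


class DosPath(NamedTuple):
    disk: str               # single letter, A through Z
    directories: List[str]  # directory and subdirectories, possibly empty
    filename: str           # name without the extension
    extension: str          # empty string when there is no extension


def split_dos_path(path: str) -> DosPath:
    """Split a path such as C:\\DIR\\SUB\\NAME.EXT into its parts."""
    if len(path) < 3 or path[1:3] != ":\\":
        raise ValueError("expected a path of the form X:\\...")
    if not (path[0].isascii() and path[0].isalpha()):
        raise ValueError("disk must be a letter A through Z")
    disk = path[0].upper()
    # Everything after "X:\" is a backslash-separated list of names;
    # the last one is the file (or directory) the path refers to.
    *directories, last = path[3:].split("\\")
    # The extension is whatever follows the final dot, if there is one.
    name, dot, ext = last.rpartition(".")
    if not dot:
        name, ext = last, ""
    return DosPath(disk, directories, name, ext)
```

For example, split_dos_path("C:\\DOS\\UTILS\\EDIT.COM") yields disk "C", directories ["DOS", "UTILS"], filename "EDIT" and extension "COM". Relative paths and paths without a disk letter are deliberately not handled.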

Short and Long Names

There are two types of filenames: long names and short names, the latter also called aliases. Short filenames are subject to the infamous 8.3 limitation: [filename] is one to eight characters long, and [extension] is zero to three characters long. This limitation also applies to directory names.

Many sources say that paths are limited to 80 or 128 characters, but these limits are due to DOS peculiarities, not to the FAT format. With the FAT file system, one can have arbitrarily long paths. The maximum path length I recommend is 256 characters, including the terminating null character.

Short filenames use the ASCII character set, so each character takes up exactly one byte. According to DOS manuals, short filenames are case insensitive, and the following characters can be used:

Case insensitivity is achieved by converting the name to uppercase when the file is accessed or created.

The following characters have special meaning:

The following characters are called wild cards. They are used in search operations:

You are best advised not to allow any other characters in filenames, and to use special characters only according to their meaning. If an existing filename contains illegal characters, either the name should be ignored or the invalid characters should be replaced with valid ones.
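The length limits and the uppercase conversion are easy to express in code. Below is a minimal sketch in Python; normalize_short_name is a name I made up for illustration. It checks only that the name is ASCII and fits the 8.3 lengths. It does not check which characters are allowed, because that depends on the character lists referred to above.

```python
def normalize_short_name(name: str) -> str:
    """Return the uppercase form of an 8.3 name, or raise ValueError."""
    if not name.isascii():
        raise ValueError("short names use the ASCII character set")
    # Case insensitivity: the name is converted to uppercase.
    upper = name.upper()
    # Split at the first dot; a second dot is rejected below.
    base, _, ext = upper.partition(".")
    if not 1 <= len(base) <= 8:
        raise ValueError("filename must be one to eight characters long")
    if "." in ext or len(ext) > 3:
        raise ValueError("extension must be zero to three characters long")
    return upper
```

With this sketch, normalize_short_name("autoexec.bat") returns "AUTOEXEC.BAT", while "longfilename.txt" and "index.html" are rejected.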

Long names are up to 256 characters long, including the extension and the terminating null character. This limit is artificial; long names on the disk can actually be longer than 256 characters. Again, some limits were imposed by the software that serves FAT. The maximum length of a directory path, including the drive letter, colon, and leading backslash, but excluding the trailing backslash, null terminator, filename and extension, is 246 bytes. The maximum length of a full path is 260 characters, including the null terminator. This is four characters more than I recommend.
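The three limits in this paragraph can be checked together. The following Python sketch is only an illustration: the constant and function names are mine, and it counts characters rather than bytes, which is an assumption on my part.

```python
from typing import List

# Limits quoted in the text above; the names are hypothetical.
MAX_LONG_NAME = 256  # filename with extension, plus null terminator
MAX_DIR_PATH = 246   # directory path without the trailing backslash
MAX_FULL_PATH = 260  # full path, plus null terminator


def check_long_path(path: str) -> List[str]:
    """Return a list of limit violations; an empty list means none."""
    # Everything before the last backslash is the directory path,
    # everything after it is the filename with its extension.
    directory, _, name = path.rpartition("\\")
    problems = []
    if len(name) + 1 > MAX_LONG_NAME:
        problems.append("long name too long")
    if len(directory) > MAX_DIR_PATH:
        problems.append("directory path too long")
    if len(path) + 1 > MAX_FULL_PATH:
        problems.append("full path too long")
    return problems
```

The "+ 1" in two of the checks accounts for the terminating null character, which those limits include.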

Long filenames are stored in Unicode. Each character is two bytes long. There are two important things to remember about Unicode:

All characters that are valid for short filenames are also valid for long filenames. In addition, the following characters can be used:

Long filenames are what I call half case sensitive. The case of the characters is preserved when the file is created, but other than that long filenames are case insensitive. For example, "File" and "file" are treated as the same string by the file system. They cannot co-exist in the same folder, and either name can be used to reference the file.
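To make the "half case sensitive" behavior concrete, here is a small Python model of a folder. It is a sketch, not how FAT stores directories: the Directory class is hypothetical, and using str.upper() to compare names is my simplifying assumption.

```python
class Directory:
    """Toy model of a folder with case-preserving, case-insensitive names."""

    def __init__(self) -> None:
        # Maps the uppercased name to the name as it was originally typed.
        self._entries = {}

    def create(self, name: str) -> None:
        key = name.upper()
        if key in self._entries:
            # "File" and "file" cannot co-exist in the same folder.
            raise FileExistsError(name)
        self._entries[key] = name  # the original case is preserved

    def lookup(self, name: str) -> str:
        """Return the stored spelling; any case may be used to look it up."""
        return self._entries[name.upper()]
```

After create("File"), both lookup("file") and lookup("FILE") return "File", and create("file") fails because the name is already taken.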

Aliasing

Whenever a file with a long filename is created, an alias with a short filename is also created. The converse is usually not the case. If the long filename fits the standard 8.3 scheme and contains only characters that are valid for short filenames, the following rules are applied:

If the long name is either too long to fit in 8.3 or contains characters that are illegal for short filenames, the following rules are applied:

Needless to say, there is no way to tell the alias by just looking at the long name, and there is no way to retrieve the long name by looking at the alias. A special directory structure ensures that they are associated with each other. That is why the alias may change when the file is accessed via its long filename and edited (usually, deleted and re-created). It is especially likely to change if the file is copied to a different folder.

One can access the file using its alias.

Finally, only VFAT filesystems support long filenames and aliasing.



Author:  Alex Verstak  3/10/1998