After writing about how DOS filenames can have spaces and a follow-up article about where those spaces can be, I’ve received a few emails asking which characters are allowed in DOS filenames, and which are not. Wikipedia has an excellent discussion about the DOS directory table which includes the legal characters for DOS filenames, but this is an opportunity to learn a bit about programming.
A power multiplier in open source is that if you have a compiler and know a bit about programming, you can learn all kinds of things about how computers work. In this case, we can learn which characters are allowed in DOS filenames by writing a test program.
Test all characters
DOS filenames can actually contain so-called “extended ASCII” characters. Strictly speaking, the ASCII standard only defines characters from 0 to 127; anything beyond that is not ASCII. DOS assigns code points 128 to 255 through a code page, such as code page 437 (the typical “extended ASCII” set that includes line-drawing characters, Greek letters, and other symbols). However, US keyboards cannot easily type those extra characters, so let’s make a test program that creates filenames using characters up to but not including 127. Code point 127 is the “delete” character, and isn’t allowed in DOS filenames anyway.
Let’s write a short test program that iterates through the printable characters in a for loop:
#include <stdio.h>

int main()
{
    char x;

    for (x=' '+1; x<127; x++) {
        testfile(x);
    }

    return 0;
}
This starts the loop at the character after the space, which is the ! character (33). I wrote it as space + 1 because we already know that DOS filenames cannot have a trailing space, so there’s no point in testing a space in a filename. The loop continues while the character is less than 127, so the last character tested is ~ (126).
For each character in the loop, the program calls the testfile function. That’s the function where we’ll do the actual test.
Save a test file
The testfile function uses the character to create a filename, then opens a file, writes some data to it, and closes it. To make these test files easy to delete later, I’ll name them all with a .TST file extension.
int testfile(char a)
{
    char filename[] = "_.TST";
I’ve defined a string called filename with the initial value _.TST. The first character in the string is actually a placeholder; I’ll replace it with the a character before I try to open a new file with that name:
    filename[0] = a;
    p = fopen(filename, "w");
With that understanding of how the testfile function works, let’s write the full function:
int testfile(char a)
{
    char filename[] = "_.TST";
    FILE *p;

    fputc(a, stdout);
    fputs(" - ", stdout);

    filename[0] = a;
    p = fopen(filename, "w");

    if (p==NULL) {
        puts("fail");
        return 0; /* fail */
    }

    fputc(a, p);
    fclose(p);

    puts("Ok");
    return 1;
}
If the function cannot open a new file with that filename, it prints a “fail” message and immediately returns 0. Otherwise, it writes one character to the file, closes it, prints an “Ok” message, and returns 1.
Putting it all together
This program is not very long, at about 40 lines to test each character in a DOS filename. And that’s an important goal when you’re trying to learn something: write a simple program to test one thing. In this case, the program only tests characters in DOS filenames. Here’s the full program:
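#include <stdio.h>

/* Try to create a file named <a>.TST; return 1 on success, 0 on failure. */
int testfile(char a)
{
    char filename[] = "_.TST"; /* first character is a placeholder */
    FILE *p;

    /* print the character being tested */
    fputc(a, stdout);
    fputs(" - ", stdout);

    filename[0] = a;
    p = fopen(filename, "w");

    if (p==NULL) {
        puts("fail");
        return 0; /* fail */
    }

    fputc(a, p);
    fclose(p);

    puts("Ok");
    return 1;
}

int main()
{
    char x;

    /* test every character from ! (33) to ~ (126) */
    for (x=' '+1; x<127; x++) {
        testfile(x);
    }

    return 0;
}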
I’ve saved this as names.c and compiled it using a C compiler. You should be able to use any ANSI-compatible C compiler; for example, we include Open Watcom in the FreeDOS distribution:
D:\SRC\NAMES> wcl -q names.c
Testing the program
After we compile the program, we can use it to test which characters are allowed in a DOS filename. To make it easy to clean up after myself, I ran the program in a new TST directory; that way, I can just delete the directory later:
D:\SRC\NAMES>mkdir tst
D:\SRC\NAMES>cd tst
D:\SRC\NAMES\TST>..\names > ..\names.txt
The program loops through all characters from ! (33) to ~ (126). For each character, the program tries to write a new file named with that character plus the .TST extension, such as !.TST. I’ve saved the output to a file called names.txt, but you can just list the directory to see what files are there; these are the valid characters in a DOS filename:
D:\SRC\NAMES\TST>dir /w /l /b
[.] [..] !.tst #.tst $.tst
%.tst &.tst '.tst (.tst ).tst
-.tst 0.tst 1.tst 2.tst 3.tst
4.tst 5.tst 6.tst 7.tst 8.tst
9.tst @.tst a.tst b.tst c.tst
d.tst e.tst f.tst g.tst h.tst
i.tst j.tst k.tst l.tst m.tst
n.tst o.tst p.tst q.tst r.tst
s.tst t.tst u.tst v.tst w.tst
x.tst y.tst z.tst ^.tst _.tst
`.tst {.tst }.tst ~.tst
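Note that DOS filenames are case insensitive, so a.tst is the same as A.TST. That is why the listing shows only one file for each letter, even though the program tested both the uppercase and lowercase letters.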
With this directory list, we can see that the valid “plain ASCII” characters for DOS filenames are the letters A to Z, the digits 0 to 9, and these special characters: ! # $ % & ' ( ) - @ ^ _ ` { } ~ (and space, with the caveat that you cannot have trailing spaces).
Looking at the “fail” entries in the names.txt file, we can see the list of characters that are not allowed in regular DOS filenames:
D:\SRC\NAMES\TST>find "fail" ..\names.txt
---------------- NAMES.TXT
" - fail
* - fail
+ - fail
, - fail
. - fail
/ - fail
: - fail
; - fail
< - fail
= - fail
> - fail
? - fail
[ - fail
\ - fail
] - fail
| - fail
Learn by writing test programs
With a little knowledge of programming, you can learn all kinds of things about how computers work. With this simple test program, we were able to demonstrate what characters are (and are not) allowed in DOS filenames.