Using Wildcards on the Bash Command Line

The function of any shell, whether Bash, KSH, or CSH, is to take commands entered on the command line; expand any file globs (the wildcard characters * and ?, and sets) into complete file or directory names; convert the result into tokens for use by the kernel; and then pass the resulting command to the kernel for execution. The shell then sends any resulting output from execution of the command to STDOUT.
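
To see that it is the shell, and not the command, that expands a glob, here is a minimal sketch that counts the arguments a command would receive. The scratch directory created with mktemp and the three file names are assumptions made up for this illustration.

```shell
# Work in a throwaway directory so no real files are touched.
scratch=$(mktemp -d)
cd "$scratch" || exit 1
touch alpha.txt beta.txt gamma.log

# The shell replaces *.txt with the matching names before anything runs.
# "set --" stores the expanded words as the positional parameters.
set -- *.txt
expanded_count=$#
expanded_names="$*"
echo "$expanded_count arguments: $expanded_names"
```

The command never sees the asterisk; it only sees the two names the shell substituted for it.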

Special Pattern Characters

Although most SysAdmins talk about wildcards, or file globbing as it’s also called, we really mean special pattern characters that allow us significant flexibility in matching file names and other strings when performing various actions. These special pattern characters allow matching single, multiple, or specific characters in a string.

  1. ? Matches exactly one of any character in the specified location within the string.
  2. * Matches zero or more of any character in the specified location within the string.
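
The difference between the two is easiest to see side by side. This is a small sketch, assuming a scratch directory and three made-up file names that differ only in how many characters follow "file".

```shell
scratch=$(mktemp -d)
cd "$scratch" || exit 1
touch file.txt file1.txt file12.txt

# ? must match exactly one character, so only file1.txt qualifies.
one_char=( file?.txt )
# * matches zero or more characters, so all three names qualify.
any_chars=( file*.txt )

printf 'file?.txt matched %s name(s)\n' "${#one_char[@]}"
printf 'file*.txt matched %s name(s)\n' "${#any_chars[@]}"
```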

In all likelihood you have used these before. Let’s experiment with some ways we can use these more effectively.

Preparation

But first, we need to create a directory in which to perform these experiments, and then we’ll create a lot of files. We’ll do this as a non-root user so there’s no danger of borking the operating system. We’ll also use a directory specifically for this in order to prevent the possibility of damaging files in your home directory. That’s not likely, but we should be safe.

Start by creating a test directory in your home (~) directory, then make it the PWD. I’m going to be a little fancy here by using a variable for the name of the new directory.

$ NewDir="testdir" ; cd ; mkdir $NewDir ; cd $NewDir
dboth@essex:~/testdir$

Brace Expansion

Now we need to create a large number of files so we’ll take a bit of a detour to do that.

If you’re not familiar with brace expansion, it can be a powerful way to generate lists of arbitrary strings and insert them into a specific location within an enclosing static string, or at either end of a static string. First let’s just see what a basic brace expansion does. Note the use of curly braces {} to enclose the components.

dboth@essex:~/testdir$ echo {string1,string2,string3}
string1 string2 string3

Well, that is not very helpful, is it? But look what happens when we use it just a bit differently.

dboth@essex:~/testdir$ echo "Hello "{David,Jen,Rikki,Jason}.
Hello David. Hello Jen. Hello Rikki. Hello Jason.

That looks like something we might be able to use because it can save a good deal of typing when we create our files. Now try this.

dboth@essex:~/testdir$ echo b{ed,olt,ar}s
beds bolts bars

Here is one method for generating file names for testing.

dboth@essex:~/testdir$ echo testfile{0,1,2,3,4,5,6,7,8,9}.txt
testfile0.txt testfile1.txt testfile2.txt testfile3.txt testfile4.txt testfile5.txt testfile6.txt testfile7.txt testfile8.txt testfile9.txt

And here is an even better method for creating sequentially numbered files.

dboth@essex:~/testdir$ echo test{0..9}.file
test0.file test1.file test2.file test3.file test4.file test5.file test6.file test7.file test8.file test9.file

The {x..y} syntax, where x and y are integers, expands to all integers from x through y, inclusive.
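
The file names we are about to create rely on two variations of this syntax: letter ranges and zero-padded numbers. Here is a brief sketch of both, assuming Bash 4 or later, since older versions do not keep the leading zeros. The variable names are only for illustration.

```shell
# Letter ranges work the same way as integer ranges.
letters=$(echo {a..f})

# Leading zeros in an endpoint pad every generated number to the same width.
padded=$(echo {008..011})

echo "$letters"
echo "$padded"
```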

Creating the Files

Now we can create the files we need to explore wildcards.

We want to create files with several name components so that they have greater variety and make for more interesting examples. The files we create will all contain a small amount of data to make this a more realistic exercise.

dboth@essex:~/testdir$ for Name in {my,your,our}.test.file.{000..200}{a..f}.{txt,asc,file,text} ; do echo "This file is named $Name" > "$Name" ; done

This command creates 14,472 files, each of which contains a line of text that includes its own file name. They consume about 58M of total space in the directory.
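
One way to sanity-check that figure without creating anything is to let the shell expand the same brace expression and count the words it produces. This is only a sketch; the variable names are mine.

```shell
# Expand the brace expression into the positional parameters; no files are created.
set -- {my,your,our}.test.file.{000..200}{a..f}.{txt,asc,file,text}
generated=$#

# 3 prefixes x 201 numbers x 6 letters x 4 extensions
expected=$(( 3 * 201 * 6 * 4 ))

echo "generated=$generated expected=$expected"
```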

Working with Wildcards — Finally!

How did I know that the command above would create 14,472 files? Well, I didn’t. I just ran the little command line program to create them and then the following program to count the number of files in the test directory.

dboth@essex:~/testdir$ ll | wc -l
14473

That count is one more than the number of files because ll, presumably an alias for ls -l here, prints a “total” line before the file entries. A glob can select the same files directly, but to write it we must understand the structure of the file names we created. They all contain the string “test” so we can use that. The pattern *test* uses the shell’s built-in file globbing to match all files that contain the string “test” anywhere in their names, with any number of any characters both before and after that string. Let’s just see what that looks like without counting the number of lines in the output.

dboth@essex:~/testdir$ ls *test*
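
If all we want is the number of matches, the shell can count them itself with no listing to parse. Here is a minimal sketch, assuming a scratch directory that holds a handful of stand-in files rather than the full test set.

```shell
scratch=$(mktemp -d)
cd "$scratch" || exit 1
touch my.test.file.000a.txt your.test.file.000a.txt notes.md

# The shell expands *test* into an array, one element per matching name.
matches=( *test* )
match_count=${#matches[@]}

echo "$match_count names contain the string test"
```

Because nothing here goes through ll, there is no extra “total” line to subtract.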

I am sure that “you” don’t want any of “my” files in “your” home directory. First see how many files that begin with “my” there are, then delete them all and verify that none are left.

$ ls my* | wc -l
4824
$ rm -v my* ; ls my*

The -v option of the rm command lists every file as it deletes it. This information could be redirected to a log file to keep a record of what was done. The my* glob lets ls list, and rm delete, every file whose name starts with “my”.
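
Here is a sketch of that logging idea, assuming a scratch directory with three stand-in files and a log file named removed.log. Both the directory and the log name are hypothetical choices for this example.

```shell
scratch=$(mktemp -d)
cd "$scratch" || exit 1
touch my.test.file.000a.txt my.test.file.001a.txt your.test.file.000a.txt

# rm -v writes one line per deleted file to STDOUT; capture those lines in a log.
# The log name starts with "r", so the my* glob can never match it.
rm -v my* > removed.log

logged=$(wc -l < removed.log)
remaining=( * )
echo "logged $logged deletions; ${#remaining[@]} entries remain"
```

The two entries that remain are the log itself and the one file that did not start with “my”.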

Find all of “our” files that have txt as the ending extension.

$ ls our*txt | wc -l
1206

Let’s list all files that contain 6 in the tens position of the three digit number embedded in the file names, and that end with asc.

We must do this with a little extra work to specify the position of the “6” carefully, so that we do not list files that contain a 6 only in the hundreds or ones position of the three digit number. We know that none of the file names contains 6 in the hundreds position, but handling that case makes our glob a bit more general.

We don’t care whether the file name starts with our or your, but we use the final “e.” of “file.” – with the dot – to anchor the next three characters. After “e.” in the file name, all of the files have three digits. We do not care about the first and third digits, just the second one. So we use the ? to explicitly define that we have one and only one character before and after the 6. We then use the * to specify that we don’t care how many or which characters we have after that but that we do want to list files that end with “asc”.

$ ls *e.?6?*.asc

How many files match that specification?
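
The count is easy to reason about on a smaller set of files. The sketch below assumes Bash 4 or later and a scratch directory holding a reduced stand-in for our test files: only the our and your prefixes, only the letter a, and only two extensions.

```shell
scratch=$(mktemp -d)
cd "$scratch" || exit 1
# Reduced stand-in: 2 prefixes x 201 numbers x 1 letter x 2 extensions.
touch {our,your}.test.file.{000..200}a.{asc,txt}

# Same glob as above: a 6 in the tens position and a name ending in .asc.
tens_six=( *e.?6?*.asc )
echo "${#tens_six[@]} files match"
```

Twenty numbers (060 through 069 and 160 through 169) have a 6 in the tens position, so two prefixes give 40 matches here. By the same reasoning, the full set with six letters, and with the “my” files already deleted, should give 20 x 6 x 2 = 240.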

Let’s look at all files that have a 6 in the middle position of the three digit number and that also have an “a” after the number, as in x6xa. We want all files that match this pattern regardless of the trailing extension, asc, txt, text, or file.

The file pattern specification we have now is almost where we need it in order to do this.

$ ls *e.?6?a.*
our.test.file.060a.asc   our.test.file.160a.asc   your.test.file.060a.asc   your.test.file.160a.asc
our.test.file.060a.file  our.test.file.160a.file  your.test.file.060a.file  your.test.file.160a.file
our.test.file.060a.text  our.test.file.160a.text  your.test.file.060a.text  your.test.file.160a.text
our.test.file.060a.txt   our.test.file.160a.txt   your.test.file.060a.txt   your.test.file.160a.txt
our.test.file.061a.asc   our.test.file.161a.asc   your.test.file.061a.asc   your.test.file.161a.asc
our.test.file.061a.file  our.test.file.161a.file  your.test.file.061a.file  your.test.file.161a.file
our.test.file.061a.text  our.test.file.161a.text  your.test.file.061a.text  your.test.file.161a.text
<SNIP>
our.test.file.068a.txt   our.test.file.168a.txt   your.test.file.068a.txt   your.test.file.168a.txt
our.test.file.069a.asc   our.test.file.169a.asc   your.test.file.069a.asc   your.test.file.169a.asc
our.test.file.069a.file  our.test.file.169a.file  your.test.file.069a.file  your.test.file.169a.file
our.test.file.069a.text  our.test.file.169a.text  your.test.file.069a.text  your.test.file.169a.text
our.test.file.069a.txt   our.test.file.169a.txt   your.test.file.069a.txt   your.test.file.169a.txt

Using Sets

Sets are an interesting and powerful form of special pattern characters. They provide a means to specify that a particular one-character location in a string contains any character from the list inside the square brackets []. Sets can be used alone or in conjunction with other special pattern characters.

A set can consist of one or more characters that will be compared against the characters in a specific, single position in the string for a match. The following list shows some typical example sets and the string characters they match.

  • [0-9] Any numerical character.
  • [a-z] Lowercase alpha.
  • [A-Z] Uppercase alpha.
  • [a-zA-Z] Any uppercase or lowercase alpha.
  • [abc] The three lowercase alpha characters, a,b, and c.
  • [!a-z] Any character except lowercase alpha.
  • [!5-7] Any character except 5, 6, or 7.
  • [a-gxz] Lowercase a through g, x, and z.
  • [A-F0-9] Uppercase A through F, or any numeric.
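
A few of these sets can be tried on a handful of two-character file names. This is a sketch under two assumptions of mine: a scratch directory with made-up names, and the locale pinned to C so that ranges such as a-z follow plain ASCII order.

```shell
export LC_ALL=C   # assumption: pin the locale so ranges follow ASCII order
scratch=$(mktemp -d)
cd "$scratch" || exit 1
touch f1 f5 fa fb fx fz

digits=( f[0-9] )       # second character is a digit
abc=( f[abc] )          # second character is a, b, or c
not_lower=( f[!a-z] )   # second character is anything but a lowercase letter
mixed=( f[a-gxz] )      # second character is a through g, x, or z

echo "${digits[*]} | ${abc[*]} | ${not_lower[*]} | ${mixed[*]}"
```

Each set consumes exactly one character position, which is why every pattern here matches only two-character names.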

Once again, this will be easier to explain if we just go right to the experiment.

The PWD should still be ~/testdir. Start by finding the files that contain a 6 in the center of the three digit number in the file name.

$ ls *[0-9]6[0-9]*

We could use this alternate pattern because we know that the leftmost digit must be 0 or 1.

$ ls *[01]6[0-9]*

Count the number of file names returned for both cases to verify this.
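
Here is one way to make that comparison, sketched on a reduced stand-in for the test files (two prefixes, one letter, one extension) in a scratch directory. It assumes Bash 4 or later for the zero-padded numbers.

```shell
scratch=$(mktemp -d)
cd "$scratch" || exit 1
touch {our,your}.test.file.{000..200}a.txt

# Any digit on both sides of the 6.
general=( *[0-9]6[0-9]* )
# Only 0 or 1 allowed as the leftmost digit.
narrow=( *[01]6[0-9]* )

echo "general=${#general[@]} narrow=${#narrow[@]}"
```

Both counts come out the same because no number in the range 000 to 200 has a 6 in the tens position with anything other than 0 or 1 in front of it.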

Now let’s look for the file names that contain a 6 in only the center position, but not in either of the other two digits.

$ ls *\.[!6]6[!6]*

We need to anchor this expression to the period that precedes the three digits; otherwise the results will not be what we want. Think about what the result would be, then try it without that anchor to verify.
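
If you want to check your reasoning on something smaller first, here is a sketch that compares the two forms on six hand-picked numbers. The scratch directory and the choice of numbers are assumptions for illustration only.

```shell
scratch=$(mktemp -d)
cd "$scratch" || exit 1
touch our.test.file.{006,016,060,066,160,161}a.txt

# Anchored to the period: the 6 must be the second of the three digits.
anchored=( *\.[!6]6[!6]* )
# Unanchored: any 6 with a non-6 character on each side will do.
unanchored=( *[!6]6[!6]* )

echo "anchored:   ${anchored[*]}"
echo "unanchored: ${unanchored[*]}"
```

Without the anchor, 006 and 016 slip through because the 6 in the ones position sits between a digit and the letter a, both of which satisfy [!6].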

Find the files that match the pattern we have so far but which also end in t.

$ ls *\.[!6]6[!6]*t

Now find all of the files that match the pattern above but which also have “a” or “e” immediately following the number.

$ ls *\.[!6]6[!6][ae]*t

These are just a few examples of using sets. Continue to experiment with them to enhance your understanding.

Sets provide a powerful extension to pattern matching that gives us even more flexibility in searching for files. It is important to remember, however, that the primary use of these tools is not merely to “find” these files so we can look at their names. It is to locate files that match a pattern so that we can perform some operation on them, such as deleting, moving, adding text to them, searching their contents for specific character strings, and more.
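
As a sketch of what “performing some operation” can look like, the following searches the contents of the matching files and then moves them into a subdirectory. The scratch directory, the three stand-in files, and the archive directory name are all assumptions for this example.

```shell
scratch=$(mktemp -d)
cd "$scratch" || exit 1
for name in our.test.file.{060,061}a.txt your.test.file.060a.asc ; do
    echo "This file is named $name" > "$name"
done

# Search the contents of only those files whose names match the glob.
hits=$(grep -l "named our" *[0-9]6[0-9]a.txt | wc -l)

# Move the same set of files into a subdirectory.
mkdir archive
mv *[0-9]6[0-9]a.txt archive/
archived=( archive/* )

echo "$hits files matched grep; ${#archived[@]} files archived"
```

The .asc file is untouched by both commands because the glob requires the name to end in a.txt.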

When you’ve finished experimenting, you can delete the testdir directory and its contents.

Final Thoughts

Special pattern characters, AKA wildcards or fileglobs, allow us an amazing amount of flexibility in matching file names and other strings when performing various actions on our Linux computers. I use them daily and would be hard pressed to do the things I need to do without them.

A few file managers, Xfe among them, have a decent search that can use these same wildcard expressions. But it still takes longer to do the search, and using the resulting list of files is significantly restrictive compared to the capabilities we have on the command line.
