Finding words is nice, but you can do that with normal string operations. What about finding patterns of text? The simplest kind of pattern has the regex match any single character. You do this with the '.' (dot) operator, which says "match any one character here".
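Here's a minimal sketch of the dot in action using Python's re module. The pattern c.t and the word list are made up for illustration and aren't part of the exercise corpus.

```python
import re

# '.' stands for exactly one character, whatever it is.
pattern = re.compile(r"c.t")

# Made-up sample words, not the exercise corpus.
words = ["cat", "cot", "c t", "ct", "cart"]

# Keep only the words where the pattern is found somewhere.
matched = [w for w in words if pattern.search(w)]
print(matched)
```

Notice "ct" is left out because the dot needs one character to sit between the 'c' and the 't', and "cart" is left out because it has two.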

Continuing with the corpus we've been using, here's a new script for you to type:


You can see I'm searching for roughly the same things as before, but instead of spelling out the actual words, I'm swapping some letters for a '.' (dot) so that position matches any character.

What You Should See

When you run this against ex2.txt you should see this:

That file doesn't exist.
> laz.
Input file is empty. Use !load to load something.
> y..d
Input file is empty. Use !load to load something.
> y....
Input file is empty. Use !load to load something.
> T.e.l.z
Input file is empty. Use !load to load something.

That should be close to what you expected, except for y...., which matches both lines. It matches "yard." from the 2nd line as you expect, but it also matches "y dog" from the first line. See how it's a 'y' and 4 characters? The space counts as one of those characters. The regex doesn't care that those characters are chunks of two words; it will match them without any knowledge of the English language.
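To see the y.... behavior outside the tool, here's a rough sketch in Python. The two corpus lines are stand-ins I made up so that one contains "y dog" and the other "yard."; your actual ex2.txt may be worded differently.

```python
import re

# Hypothetical stand-in corpus; the real ex2.txt may differ.
corpus = [
    "The quick brown fox jumps over the lazy dog.",
    "The quick brown fox sat in the yard.",
]

# A 'y' followed by any four characters.
pattern = re.compile(r"y....")

# Collect the exact text matched on each line that matches.
hits = []
for line in corpus:
    found = pattern.search(line)
    if found:
        hits.append(found.group(0))

print(hits)
```

Printing the matched text instead of the whole line makes it obvious that the first hit spans the end of one word, a space, and the start of the next.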

Extra Credit

  • Use !match to switch from search to match mode and then see if you get the same results. Why?
  • Write a line of only '.' (dot) sequences that matches the 2nd line but not the first.
  • Using a '\' (backslash) lets you escape the '.' to tell the regex that you mean "no actually match this as a ." Use that to fix the 3rd regex so it only matches the 2nd line of the corpus.
  • Change the corpus such that you write two new lines but they still match the same as the other corpus.
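If the escaping idea is fuzzy, here's a small sketch of the difference between a bare dot and an escaped one in Python. The strings are made up and deliberately unrelated to the corpus, so the extra credit is still yours to solve.

```python
import re

# Made-up samples: one has a real period, the other has an 'x' in its place.
samples = ["3.14", "3x14"]

# Bare dot: any character is fine in the second position.
unescaped = [s for s in samples if re.search(r"3.14", s)]

# Escaped dot: only a literal '.' is accepted there.
escaped = [s for s in samples if re.search(r"3\.14", s)]

print(unescaped)
print(escaped)
```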

Portability Notes

Some regular expression engines mean different things when they say '.' matches "everything". In Python "everything" means, "Well, not newline chars," unless you pass the re.DOTALL flag. Others actually really mean everything. It all depends on the engine and what its authors did with it.
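A quick sketch of the Python behavior, assuming the standard re module: by default the dot skips newlines, and re.DOTALL turns that restriction off. The test string is made up for illustration.

```python
import re

# Made-up text with a newline between 'a' and 'b'.
text = "a\nb"

# Default: '.' refuses to match the newline, so the search fails.
default = re.search(r"a.b", text)

# With re.DOTALL, '.' matches the newline too.
dotall = re.search(r"a.b", text, re.DOTALL)

print(default)
print(dotall is not None)
```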