Teaching regular expressions can be difficult because they are typically embedded in a programming language or a tool like sed. This is compounded by the differences between most of these implementations. Rather than have you get stuck in the various flavors of regex hell, I've created a very small project called Regetron for experimenting with regular expressions. It uses Python's flavor of regex, which is pretty close to the 90% of regex you'll actually use.

To install Regetron simply do this:

Alternatively, you can go to the Regetron project page and get the source to install like this:

$ git clone git://gitorious.org/regetron/regetron.git
$ cd regetron
$ sudo python setup.py install

Once you have Regetron, try running it and doing a few things with it:

$ regetron
Regetron! The regex teaching shell.
Type your regex at the prompt and hit enter. It'll show you the
lines that match that regex, or nothing if nothing matches.
Hit CTRL-d to quit (CTRL-z on windows).
> !data "Hello World!"
> .*W.*
0000: Hello World!
> Cats
>

How To Use Regetron

Here's how it works: you give Regetron a file or a string of data, and then you type in regular expressions (regex). You can see this when I type !data "Hello World!" to set up a string to work with, then run the regex .*W.* against it. If a regex matches, Regetron prints the lines that matched. If nothing matches, it prints nothing. That's what happened when I typed Cats.
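If it helps to see the idea in plain Python, here is a minimal sketch of that match-and-print loop. This is not Regetron's actual source: the function name matching_lines is made up for illustration, and using re.search on each line is an assumption about how the matching is done.

```python
import re


def matching_lines(data, pattern):
    """Return the lines of data that pattern matches, numbered like the session above.

    Hypothetical helper for illustration only; not part of Regetron.
    """
    regex = re.compile(pattern)
    results = []
    for number, line in enumerate(data.splitlines()):
        # Assumption: search anywhere in the line, one line at a time.
        if regex.search(line):
            results.append(f"{number:04d}: {line}")
    return results


data = "Hello World!"

# A regex that matches prints the numbered line.
for line in matching_lines(data, ".*W.*"):
    print(line)

# A regex that doesn't match produces an empty list, so nothing prints.
for line in matching_lines(data, "Cats"):
    print(line)
```

The first loop prints the Hello World! line with its line number in front, and the second loop prints nothing, which mirrors the session above.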

Regetron has a few commands and options as well:

  • If you give it a file on the command line it will load that: regetron somefile.txt
  • If you hit ENTER to make a blank line it will go into "verbose mode" which we'll use heavily in the book. To finish in verbose mode and run the regex just enter a blank line.
  • The command !data EXP will run any Python expression (EXP) and set the result as the data. Try !data "LOTS OF ME" * 100 and then try the regex .*ME.* to see that it duplicated the line 100 times.
  • The command !help will print the available commands.
  • !load FILE will load the given file for your data.
  • !match toggles whether regex are run in match vs. search mode. We'll cover that later.
  • If you have readline installed, then Regetron will give you a readline scrollback and edit feature.
  • To exit, just use CTRL-d (or CTRL-z on Windows).
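Two of those commands lean directly on Python, so here is a rough sketch of what they correspond to in plain Python. It assumes that !data simply evaluates the expression as ordinary Python and that !match switches between Python's re.match and re.search; how Regetron wires these up internally is not shown here.

```python
import re

# The same expression you would hand to !data: ordinary Python
# string repetition, which repeats the text 100 times.
data = "LOTS OF ME" * 100
print(len(data))         # 10 characters repeated 100 times
print(data.count("ME"))  # one "ME" per repetition

# re.search looks for the pattern anywhere in the string.
found = re.search("ME", data)
print(found is not None)

# re.match only succeeds if the pattern matches at the very start.
print(re.match("ME", data) is None)        # the string starts with "LOTS", not "ME"
print(re.match("LOTS", data) is not None)  # this one does match at the start
```

The difference between the last three lines is the whole point of match vs. search mode: the same regex can succeed in one mode and fail in the other depending on where in the data it would match.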

Extra Credit

For this exercise just play with Regetron and make sure you're comfortable with it.