
Modification to my ‘Directory string validation’ module

July 27, 2012

I’ve spent some time revising my module to use Regular Expressions, which has made the code a bit cleaner to look at, minus the crazy flags I have to set.  I am also having some trouble figuring out what the output of the main functions should be.  As I mentioned in the previous post, I want to have two types of functions per file system: one that corrects problems in the strings, and one that just tells you what the problems are.  Unfortunately, I am not sure what the function that tells you what is wrong should return.  Since it checks so many different things in the string, the output could be very complex.  I may decide to just return the first error it finds, or return 0 if the string is correct.
I am still trying to learn the ins and outs of regular expressions, but so far they have proven to be pretty useful.  For the NTFS format, there are a few characters that will cause problems.  Those are: " * : < > ? \ / |  Also, you shouldn’t start or end a file or folder name with a space or period.  The Regular Expression I’ve rigged to accommodate those rules is:

NTFS_INVALID_CHAR = re.compile(r'[\"*:<>?\\/|]|^\.|\.$|^\ |\ $')

What we really need to look at is what’s between the r' and the closing '.  There are two main parts within that section: the left part, [\"*:<>?\\/|], and the right part, |^\.|\.$|^\ |\ $.  For the left part, the characters within the [ and ] are the invalid characters.  Note that the backslash shows up a lot.  It is the escape character, and it tells the regular expression engine to treat the character directly after it as a literal character rather than a special one.  That matters for the periods, since an unescaped period matches almost any character.  The right part deals only with checking whether a string starts or ends with a space or period.  The | character works like an ‘or’, so only one of the alternatives needs to be found.  The caret anchors a match to the beginning of the string, and the dollar sign anchors it to the end.  The dollar sign goes at the end of the alternative it applies to (as in \.$ and \ $), not necessarily at the end of the whole Regular Expression.  Again you see the escape characters before the periods and spaces.
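To see how the two parts of the pattern behave, here is a minimal sketch that runs it against a few sample names.  The helper name first_problem and the sample names are made up for this illustration and are not part of the module.

```python
import re

# Pattern copied from the post: a character class of invalid characters,
# followed by alternatives for a leading/trailing period or space.
NTFS_INVALID_CHAR = re.compile(r'[\"*:<>?\\/|]|^\.|\.$|^\ |\ $')


def first_problem(name):
    """Hypothetical helper: return (matched_text, index) for the first
    problem found in name, or None if the pattern finds nothing."""
    match = NTFS_INVALID_CHAR.search(name)
    if match is None:
        return None
    return match.group(), match.start()


# Sample names chosen to exercise each alternative in the pattern.
for name in ["notes.txt", "what?.txt", ".hidden", "report.", " padded", "a/b"]:
    print(repr(name), "->", first_problem(name))
```

A period in the middle of a name, as in notes.txt, is not reported, because the period alternatives only apply at the very start or very end of the string.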
Please note that I have not fully tested the regular expressions yet.  The tests I have run so far have all passed, but with such a complex line of code, there’s a lot to check.
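Going back to the two kinds of functions mentioned at the top, one possible shape for them is sketched below.  This is only an assumption about how they could look: the names check_ntfs_name and fix_ntfs_name, the wording of the messages, and the choice of an underscore as the replacement character are all hypothetical.  The checker follows the "first error or 0" idea.

```python
import re

NTFS_INVALID_CHAR = re.compile(r'[\"*:<>?\\/|]|^\.|\.$|^\ |\ $')


def check_ntfs_name(name):
    """Hypothetical checker: return 0 if the pattern finds no problem,
    otherwise a short description of the first problem found."""
    match = NTFS_INVALID_CHAR.search(name)
    if match is None:
        return 0
    ch = match.group()
    # A period or space can only match through the anchored alternatives,
    # so it must sit at the very start or very end of the name.
    if ch in ". ":
        where = "starts" if match.start() == 0 else "ends"
        what = "period" if ch == "." else "space"
        return f"name {where} with a {what}"
    return f"invalid character {ch!r} at index {match.start()}"


def fix_ntfs_name(name):
    """Hypothetical fixer: replace every match with an underscore.
    (Deleting matches instead could expose a new leading or trailing
    period or space, so a replacement character is used here.)"""
    return NTFS_INVALID_CHAR.sub("_", name)


print(check_ntfs_name("what?.txt"))
print(fix_ntfs_name("what?.txt"))
print(check_ntfs_name(fix_ntfs_name(" a<b>. ")))
```

Returning either 0 or a string keeps the checker simple, at the cost of mixing two return types; returning a list of every problem would be the more complex alternative.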
