Extracting filenames from the subject

Tips on writing regular expressions for searching the post list

Moderators: Quade, dexter

Post by GrindKore » Fri Dec 27, 2002 5:57 pm

Hello, hopefully some RegExp expert here can help me...

NOTE: This has nothing to do with NewsBin Pro functionality; it is a personal mod I made to let me further extend certain features.

I wrote a small VBScript that parses *.spool files and populates an Access database. The script runs every night and updates the database with the latest record sets like clockwork.

What I need is a hint as to how the "Show Filename" filter works.
I have constructed different RegExp patterns to isolate the filename string from the post subject, but I cannot come up with a consistent way of getting the filename out of the wide variety of subject formats. I understand that it's not going to be 100% perfect every time due to the lack of a standard, but hopefully I can get it to work like NBP4.

RE: Extracting filenames from the subject

Post by dexter » Fri Dec 27, 2002 6:42 pm

Off the top of my head, there are two basic ways, both start from the right and work backwards to make a best guess at the filename. If there is quoted text, it just uses all the quoted text as the filename. If there is no quoted text, it grabs all the characters until the first whitespace character. For example:

([a-zA-Z0-9,._-]+\([0-9]+/[0-9]+\)$) for non-quoted and
(".+"\([0-9]+/[0-9]+\)$) for quoted

These are close to the real ones in NewsBin, but there is a little more logic involved. Maybe when Quade gets back from vacation, he'll give more details. Let me know if you need an English version of how these expressions work.
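
For anyone who wants to try the idea outside of NewsBin, here is a rough Python sketch of the two-pass approach described above: try the quoted pattern first, then fall back to the non-quoted one. The patterns are loosened variants of the two expressions in this post, not NewsBin's real ones. Allowing extra text such as "yEnc" and whitespace before the (n/m) part counter is an assumption, and the function name extract_filename is made up for illustration.

```python
import re

# Assumed patterns, loosely based on the two expressions above.
# Both anchor on a trailing "(part/total)" counter and work back from it.

# Quoted: capture the text inside the last pair of double quotes
# that precedes the part counter.
QUOTED = re.compile(r'"([^"]+)"[^"]*\(\d+/\d+\)\s*$')

# Non-quoted: capture the last run of filename-like characters
# (no whitespace) immediately before the part counter.
UNQUOTED = re.compile(r'([A-Za-z0-9,._-]+)\s*\(\d+/\d+\)\s*$')


def extract_filename(subject):
    """Return a best-guess filename from a post subject, or None."""
    for pattern in (QUOTED, UNQUOTED):
        match = pattern.search(subject)
        if match:
            return match.group(1)
    return None


print(extract_filename('Some post - "my file.rar" yEnc (1/5)'))
print(extract_filename('holiday pics - image001.jpg (3/10)'))
print(extract_filename('no part counter here'))
```

The quoted pattern is tried first because a quoted name can contain spaces that the non-quoted pattern would cut off.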

RE: Extracting filenames from the subject

Post by GrindKore » Sat Dec 28, 2002 4:43 pm

Thanks dexter, your help is appreciated.

RE: Extracting filenames from the subject

Post by Quade » Mon Dec 30, 2002 2:46 pm

One problem I have with extracting filenames is that embedded spaces will always screw you up unless the post is yEnc encoded. yEnc-encoded posts use quoting around the filenames.
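
To make the embedded-space problem concrete, here is a small Python illustration. The two subject lines and both patterns are made up for the example (they are assumptions, not NewsBin's actual expressions); the point is only that a whitespace-delimited match keeps just the last token, while a quoted name survives intact.

```python
import re

# Hypothetical patterns for illustration only.
UNQUOTED = re.compile(r'([A-Za-z0-9,._-]+)\s*\(\d+/\d+\)\s*$')
QUOTED = re.compile(r'"([^"]+)"[^"]*\(\d+/\d+\)\s*$')

# Same file posted two ways: without quotes, and yEnc-style with quotes.
plain_subject = 'vacation photos - beach day 01.jpg (1/3)'
quoted_subject = 'vacation photos - "beach day 01.jpg" yEnc (1/3)'

# Without quotes, the match stops at the first whitespace going backwards,
# so only the last space-free token is captured.
truncated = UNQUOTED.search(plain_subject).group(1)

# With quotes, the whole name, spaces included, is captured.
full = QUOTED.search(quoted_subject).group(1)

print(truncated)  # 01.jpg
print(full)       # beach day 01.jpg
```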

