a quick thought on extending npb search

Tips on writing regular expressions for searching the post list

Moderators: Quade, dexter

Postby bobkoure » Thu Apr 27, 2006 9:17 am

There are some folks who'd like NBP searching to be more like Google et al - and there are some of us who think it's fine as-is.

What happens if you take a very simple syntax and add some kind of enclosing characters (parens, braces, square brackets, whatever) to mark off regular expressions?
For instance, -(RRR) would mean "exclude any posts that the regex RRR matches" (using whatever set of logical characters makes the most sense, and maybe even insisting that there be no whitespace between the logical character and the opening enclosing character). Oh, and no nesting. It should be simple to parse, it would satisfy the "like Google" request, and by default search would keep working as it does now.
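
To make the proposal concrete, here is a rough Python sketch of what such a syntax could look like. Everything in it is an assumption for illustration: the names `TOKEN`, `parse_query` and `keep` are hypothetical, parentheses are picked as the enclosing characters, and un-prefixed terms are assumed to be required matches, since the thread does not say how the existing search combines terms.

```python
import re

# Hypothetical token grammar: an optional '+' or '-' glued directly to either
# a parenthesised regex (no nesting, so no ')' inside) or a plain word.
TOKEN = re.compile(r'([+-]?)(?:\(([^)]*)\)|(\S+))')

def parse_query(query):
    """Return (operator, regex_pattern) pairs; plain words are escaped."""
    terms = []
    for m in TOKEN.finditer(query):
        op, regex, word = m.group(1), m.group(2), m.group(3)
        # A (...) group is used as a regex verbatim; a plain word is escaped
        # so it keeps matching literally, as a simple search would.
        pattern = regex if regex is not None else re.escape(word)
        terms.append((op, pattern))
    return terms

def keep(subject, terms):
    """Assumption: '-' terms exclude a post, every other term must match."""
    for op, pattern in terms:
        found = re.search(pattern, subject, re.IGNORECASE) is not None
        if found == (op == '-'):
            return False
    return True
```

With this sketch, a query such as `linux -(beta|rc\d) iso` keeps posts containing "linux" and "iso" and drops any whose subject matches `beta|rc\d`. A `-` separated from its `(` by whitespace is not treated as an operator, which mirrors the "no whitespace" rule suggested above.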

Thoughts? Could be a very stupid idea - I haven't taken the time to really think it through.
bobkoure
 

Postby Quade » Thu Apr 27, 2006 11:02 am

It's not that it's a bad idea, it's just that I'm not sure how to implement it. Maybe after 5.2 comes out I'll get a chance to think about it. There are major changes to search coming, though, and they're more fundamental than just the syntax of the search query. They won't make 5.2, but they're coming pretty quickly afterwards.
Quade
Eternal n00b
 
Posts: 44972
Joined: Sat May 19, 2001 12:41 am
Location: Virginia, US

Registered Newsbin User since: 10/24/97

Postby Smite » Thu Apr 27, 2006 1:45 pm

Not just search, but the find box as well.

Any delimiter would be safe (I suggest {} since they're so rare) as long as you search for the matching close delimiter and ignore any internal ones.

Something like:
Code:
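// Needs: using System; using System.Text.RegularExpressions;
// Sort each term into the AND (+), NOT (-) or OR (no prefix) list.
int pos = 0;
while (pos < searchstring.Length)
{
  if (searchstring[pos] == ' ') { pos++; continue; }   // skip separators

  char op = searchstring[pos];
  if (op == '+' || op == '-') pos++;                   // step past the operator

  string term = getSearchTermStartingAt(searchstring, ref pos);
  if (term.Length == 0) continue;                      // e.g. a dangling + or -

  switch (op)
  {
    case '+': AndList.Add(term); break;
    case '-': NotList.Add(term); break;
    default:  OrList.Add(term);  break;
  }
}

foreach (string term in OrList)
{
  AddToList(linesThatMatch(term));
}
// etc

.
.
.

// Returns the term starting at pos as a regex pattern and moves pos past it.
public string getSearchTermStartingAt(string s, ref int pos)
{
  if (pos >= s.Length) return "";

  if (s[pos] == '{')
  {
    // Raw regex: find the } that balances the opening {, skipping inner pairs.
    int depth = 0, i = pos;
    for (; i < s.Length; i++)
    {
      if (s[i] == '{') depth++;
      else if (s[i] == '}' && --depth == 0) break;
    }
    string regex = s.Substring(pos + 1, i - pos - 1);
    pos = Math.Min(i + 1, s.Length);
    return regex;
  }

  // A quoted phrase runs to the next "; a bare word runs to the next space.
  bool quoted = s[pos] == '"';
  int start = quoted ? pos + 1 : pos;
  int end = s.IndexOf(quoted ? '"' : ' ', start);
  if (end < 0) end = s.Length;
  pos = quoted ? Math.Min(end + 1, s.Length) : end;
  return Regex.Escape(s.Substring(start, end - start));
}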
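
For anyone who wants to try the idea outside Newsbin, the following is a minimal runnable Python sketch of the same approach, including the part left as "etc" above. It is an illustration under stated assumptions, not how Newsbin works: the names `read_term`, `parse` and `search` are hypothetical, and the way the three lists are combined (every + term must match, no - term may match, and at least one plain term must match if any are given) is a guess at the intended semantics.

```python
import re

def read_term(query, pos):
    """Return (pattern, next_pos) for the term starting at query[pos]."""
    if query[pos] == '{':
        # Raw regex: scan to the brace that balances the opening one, so
        # inner pairs such as the quantifier in \d{3} are skipped over.
        depth = 0
        end = pos
        while end < len(query):
            if query[end] == '{':
                depth += 1
            elif query[end] == '}':
                depth -= 1
                if depth == 0:
                    break
            end += 1
        return query[pos + 1:end], end + 1
    if query[pos] == '"':
        # Quoted phrase: literal text up to the next quote (or end of query).
        end = query.find('"', pos + 1)
        if end == -1:
            end = len(query)
        return re.escape(query[pos + 1:end]), end + 1
    # Bare word: literal text up to the next space (or end of query).
    end = query.find(' ', pos)
    if end == -1:
        end = len(query)
    return re.escape(query[pos:end]), end

def parse(query):
    """Split a query into (and_terms, not_terms, or_terms) regex lists."""
    lists = {'+': [], '-': [], '': []}
    pos = 0
    while pos < len(query):
        if query[pos] == ' ':
            pos += 1
            continue
        op = query[pos] if query[pos] in '+-' else ''
        pos += len(op)
        if pos >= len(query):
            break  # dangling operator at the end of the query
        pattern, pos = read_term(query, pos)
        if pattern:
            lists[op].append(pattern)
    return lists['+'], lists['-'], lists['']

def search(lines, query):
    """Assumed semantics: all AND terms, no NOT term, any OR term if given."""
    and_terms, not_terms, or_terms = parse(query)

    def hit(pattern, line):
        return re.search(pattern, line, re.IGNORECASE) is not None

    return [line for line in lines
            if all(hit(p, line) for p in and_terms)
            and not any(hit(p, line) for p in not_terms)
            and (not or_terms or any(hit(p, line) for p in or_terms))]
```

Counting brace depth is one way to read "ignore any internal ones": it copes with balanced braces inside the regex, but an escaped literal brace such as `\{` would still confuse it, so a real implementation would need to decide how to handle that case.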
Smite
Katamari Damacy Addict
 
Posts: 5318
Joined: Sat May 19, 2001 1:54 am
Location: Alberta, Canada

Registered Newsbin User since: 03/27/03

