Quade wrote: Going to try some things in B17. See if it improves things for you.
Thanks Quade. If you need to remotely debug, I'm game.
I'm tempted to write a Perl script that mimics what the import function does, just to see whether this is a threading issue or not. I'm still amazed that a modern computer can't import the downloaded headers on the fly, since I'd think the worst bottleneck on most people's systems is their internet connection, which should be much slower than write speeds on an HD. Think about it: right now this app downloads the headers, compresses them, saves them to disk, then later reads them back, decompresses them, and imports them into the db3 files. That's a lot of extra steps. I'd almost prefer to slow (or pause) the header downloads so they keep pace with the writes to the db3 files, and ditch the "import" folder concept completely. Either way the result is the same: we have to wait for the import to complete before we can use the data.
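To make the idea concrete, here's a minimal sketch (in Python rather than Perl, and with made-up table and function names, not Newsbin's actual code) of what "pause the download to keep pace with the db3 writes" could look like: a bounded queue between a downloader thread and an importer thread, so the downloader blocks whenever the writer falls behind, and nothing is spooled through an intermediate import folder.

```python
import queue
import sqlite3
import threading

def run_pipeline(headers, db_path=":memory:"):
    """Hypothetical on-the-fly import: a bounded queue applies
    backpressure so the downloader pauses when the DB writer
    falls behind, instead of spooling headers to an "import"
    folder on disk first."""
    q = queue.Queue(maxsize=1000)   # the backpressure point
    db = sqlite3.connect(db_path, check_same_thread=False)
    db.execute("CREATE TABLE IF NOT EXISTS headers (subject TEXT, poster TEXT)")

    def downloader():
        # Stands in for the NNTP header stream.  q.put() blocks
        # (effectively pausing the download) when the queue is full.
        for h in headers:
            q.put(h)
        q.put(None)                 # sentinel: stream finished

    def importer():
        # Writes straight to the db3 file in batches -- no
        # compress / save-to-disk / read / decompress round trip.
        batch = []
        while True:
            h = q.get()
            if h is None:
                break
            batch.append(h)
            if len(batch) >= 500:
                db.executemany("INSERT INTO headers VALUES (?, ?)", batch)
                db.commit()
                batch.clear()
        if batch:
            db.executemany("INSERT INTO headers VALUES (?, ?)", batch)
            db.commit()

    t_dl = threading.Thread(target=downloader)
    t_im = threading.Thread(target=importer)
    t_dl.start(); t_im.start()
    t_dl.join(); t_im.join()
    return db.execute("SELECT COUNT(*) FROM headers").fetchone()[0]
```

Batching the inserts matters because committing per row is what usually makes SQLite imports feel slow; with batches of a few hundred rows the disk should easily outrun the download.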
One other thing... I've learned the hard way that SQL S U C K S for efficient text searches (and that's with REAL SQL servers with full-text indexes). I don't know if this fits with what you want to do with Newsbin, but give Sphinx a look if you can:
http://sphinxsearch.com/about/sphinx/

I used this as an alternative to full-text searching in MySQL on some very large DBs and it BLEW THE DOORS off MySQL. The only potential downfall is that it only supports boolean searches, not regex. Then again, most users probably only do (or know of) boolean searches anyway, so maybe that's not a drawback. Either way, it's crazy fast and highly scalable (I think Craigslist uses it, among others). It takes SQL queries too, so it may be a simple drop-in replacement for some of your code. And even if you don't use it in the Newsbin executable, it might help with the "internet search" service you're also selling.
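For a flavor of what "boolean, not regex" means in practice, here's a small hypothetical helper (the index name and function are mine, not part of Sphinx or Newsbin) that builds a SphinxQL-style statement using Sphinx's boolean operators: `&` for AND, `|` for OR, `!` for NOT.

```python
def sphinxql_boolean(index, require, any_of=(), exclude=(), limit=20):
    """Build a SphinxQL-style SELECT whose MATCH() clause uses
    Sphinx's boolean query operators (& | !).  Purely illustrative:
    "headers_idx" and the column names are hypothetical."""
    parts = list(require)
    if any_of:
        parts.append("(" + " | ".join(any_of) + ")")
    parts += ["!" + term for term in exclude]
    match = " & ".join(parts)
    return (f"SELECT id, subject FROM {index} "
            f"WHERE MATCH('{match}') LIMIT {limit}")

print(sphinxql_boolean("headers_idx",
                       require=["linux"],
                       any_of=["iso", "dvd"],
                       exclude=["beta"]))
# SELECT id, subject FROM headers_idx WHERE MATCH('linux & (iso | dvd) & !beta') LIMIT 20
```

Since Sphinx's searchd speaks the MySQL wire protocol, a statement like this can be sent from any MySQL client library, which is what makes it plausible as a drop-in next to existing SQL code.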