can there be gaps in group content?

PostPosted: Fri Dec 28, 2012 2:29 pm
by beany
If so, how would I know?

I don't want to do a "download all headers" for every group. It suits me to stick to 7 days for the first pass, then add more when needed. However, will I always get the very next headers back and never miss any (say, if I did multiple 50K downloads)? I.e. if I have a header from a week back and a header from a year back, can I guarantee that I have all those in between?

cheers

Re: can there be gaps in group content?

PostPosted: Fri Dec 28, 2012 5:51 pm
by DThor
No, there are no guarantees. There are basically two possibilities: a DMCA takedown since the posting (which, as far as you're concerned, makes a header gap inasmuch as you can't get the post from that server), and technical issues. The latter happens extremely rarely.

DT

Re: can there be gaps in group content?

PostPosted: Fri Dec 28, 2012 6:37 pm
by beany
DThor wrote:No, there are no guarantees. There are basically two possibilities - since the posting there was a DMCA takedown(which as far as you're concerned makes a header gap inasmuch as you can't get the post from that server), and technical issues. The latter happens extremely rarely.

DT


Thanks DT. I understand about the DMCA issues; I was more concerned that I might inadvertently have big gaps in my headers because I hadn't followed a particular sequence of downloading, or had left long gaps before downloading more historical data. Since that didn't figure in your answer, I will assume it is not a concern. If it is a concern and it would be better to reset the groups and run an all-header update, I would be grateful if someone would let me know and I'll do just that. Otherwise I'll carry on as I am.

Cheers

Re: can there be gaps in group content?

PostPosted: Fri Dec 28, 2012 7:24 pm
by Quade
It's possible to get gaps but you'd have to really try. For example, if you:

1 - Do a "Download all Headers". Let it download some.

2 - Stop the download.

3 - Right click the group and select "Use Download Age" to reset the group.

4 - Then download headers.

That would give you a gap: some really old headers and some new ones.

The only real way to ensure you have no gaps is to

1 - "Download all Headers"

or

2 - Set the "Download Age" really long. Right click the group and "Post Storage/Use Download Age" then update the group.
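
To make the gap scenario concrete, here is a minimal sketch in Python. It is not how Newsbin stores or checks headers; it only assumes that each header carries the server's per-group article number, and `find_gaps` is a hypothetical helper. Small holes in the numbering are normal on any server (removed or cancelled posts), so only a large missing range like the one below points to the interrupted "Download all" followed by a reset.

```python
def find_gaps(article_numbers):
    """Return (first_missing, last_missing) ranges between stored article numbers."""
    nums = sorted(set(article_numbers))
    gaps = []
    for prev, cur in zip(nums, nums[1:]):
        if cur - prev > 1:
            gaps.append((prev + 1, cur - 1))
    return gaps


# Hypothetical numbers: a few old headers from an interrupted "Download all",
# then a few new ones fetched after the group was reset to use Download Age.
stored = list(range(1000, 1006)) + list(range(5000, 5004))
print(find_gaps(stored))  # [(1006, 4999)]
```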

Re: can there be gaps in group content?

PostPosted: Fri Dec 28, 2012 8:03 pm
by beany
Quade wrote:It's possible to get gaps but, you'd have to really try. For example, if you:

2 - Set the "Download Age" really long. Right click the group and "Post Storage/Use Download Age" then update the group.


If I set download age to say 10000 and download most headers for the groups I use the most, do I need to keep the download age 10000 or could I revert back to 5 days for new groups?

Re: can there be gaps in group content?

PostPosted: Fri Dec 28, 2012 9:03 pm
by beany
Maybe I am able to answer this myself. I realize there are two download age settings. One in the group and one in the options tab. I assume I set the group to be very long and keep it that way. I can then set the options to be 10 days or so for new groups. I assume then that when I setup a new group it will bring the first 10 days down. However once they are down I would then right click the group and set a long download age there and that will overwrite the options value. - is that right? Do I still need to right click the post store to use download age menu item? If I didn't would it continue to use the options-set download age until I did?

Re: can there be gaps in group content?

PostPosted: Fri Dec 28, 2012 9:08 pm
by Quade
No binary group has longer retention than say 1500 days so, 10,000 is sort of overkill. That's if you use AW or Giga. Many of the other servers don't even have that many days worth of headers.

"Download all" is functionally the same as 10,000 download age and download latest on a reset group. Might as well do a "Download all" then and get the headers your server has for the group.

"Storage Age" determines how many days of Headers Newsbin will store for you. You might want to set it to 2000. Normally you want it set longer than the retention of the server if you intend to keep all the headers.

Just keep in mind, longest retention server is 1500 days and their retention increases 1 day for each day that passes.

- On a group where you want them all (1500 days) do "Download all".
- On groups where you only want a few days, use the "Download Age".
- Storage Age is how long Newsbin will keep the headers on disk. If you have ANY group where you want to keep 1500 days' worth, Storage Age has to be set larger than the retention of the longest-retention group you want to keep.
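
The Storage Age arithmetic can be sketched as follows. This is only an illustration using the figures quoted above (2000 days of storage, 1500 days of server retention); `storage_headroom_days` is a hypothetical helper, not a Newsbin setting.

```python
def storage_headroom_days(storage_age_days, server_retention_days):
    """Days of margin before local storage would start dropping headers
    that the server still holds."""
    return storage_age_days - server_retention_days


# Figures from the post above: Storage Age 2000, longest retention 1500 days.
print(storage_headroom_days(2000, 1500))  # 500
```

If retention really does grow by one day per day, that margin shrinks by one day per day too, so the Storage Age would need raising again after roughly that many days.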

Re: can there be gaps in group content?

PostPosted: Fri Dec 28, 2012 9:17 pm
by beany
just thought I'd give myself some days before I had to increase the number Image link not allowed for unregistered users

I'm still struggling with this though. How long should it take to get a few days' worth of headers? I've set 10 days on all the age timers, then deleted and re-added a couple of groups that I know have recent headers in them. However, nothing is displayed. The connection screen shows that the group header stream is connected, and the speed display says it is downloading at 70Mbps. Still nothing is displayed though. Even when I click the red "download latest headers" I'm not getting anything. Any ideas what's up?

Re: can there be gaps in group content?

PostPosted: Fri Dec 28, 2012 9:22 pm
by beany
Okay, sorry!! I just needed to restart NB. It's working fine after I closed and reopened it. BTW, totally off topic: why can't I post a smiley? The blue text is saying I need to be registered for that?

Re: can there be gaps in group content?

PostPosted: Fri Dec 28, 2012 9:58 pm
by beany
No, it's still not behaving as I expect.

Firstly, when I delete and re-add a couple of groups, nothing happens until I close NB down. When I open it up again, I start to see a header count.

Now, when I delete a group and add it back again with 10 days set in the group properties ("use header download age") and the same value set in the options tab, after an application restart I see 10 days on one group but nothing on the other. However, when I click "reload from disk" I get a load of old headers on both groups.

I noticed that if I restart the app again, I can get the other group to fetch the last 10 days too. So a reload from disk shows a few headers from 10 days to 0, then a gap, then some headers from 1400 to over 1500 days.

I'm confused


Edit... Newsbin Pro 6.41rc1 build 2122 on Vista

Re: can there be gaps in group content?

PostPosted: Fri Dec 28, 2012 10:51 pm
by Quade
You haven't put your registration key into the forum profile yet, so you're seen as a second class citizen. Once you put the key in and the forum sees you as a registered user, you can post links. It's mainly a means to keep the spammers under control.

I think you're making it too complicated.

1 - Restart Newsbin.

2 - Pick one group.

3 - Right click "Post Storage/Use Download Age"

4 - Make sure download age is set to 10.

5 - "Download Latest" from the group right click menu.

6 - See what happens when it's done.

7 - Report how fast and how many headers it downloaded here.

If it's a small group, Newsbin might decide to pull the whole thing instead of using download age. How long it takes depends on your line speed. If you see a number in parentheses, like (XXX), in the cache line of the status bar, Newsbin is importing headers you've already downloaded. Until it's done, you won't have all headers available.

Normally, you don't have to do all 7 steps but, I want to start from a clean slate.

Re: can there be gaps in group content?

PostPosted: Fri Dec 28, 2012 11:42 pm
by beany
Did this with a small group and I could see all the headers downloading etc. However, nothing was displayed until I restarted NB. Then I could use "load older files" and "load all from disk".
If I don't restart NB I don't see any headers. Guess that's what has been confusing me.

Let's say I have a very large group and I see the headers counting up, say 109090/208999. Should I be able to see any of those headers during the download? I'm not, and I need to restart NB in order to see them, unless I'm missing a step, but I don't think so.


Edit....
I started to do the same with some other groups and then everything started to work fine. I've tried to replicate the fault over the last 20 mins but can't.

Re: can there be gaps in group content?

PostPosted: Sat Dec 29, 2012 12:47 am
by beany
Okay, I see what the problem has been. There is another large group which I had been testing, and that is what is causing the problem. It doesn't appear to be adhering to the download age parameters at all. Instead it is just downloading all of itself. While doing this, it stops anything else from updating; I see the other groups at 0/0 in the downloading files tab. Even when that group is deleted, it is still constantly downloading and blocking any other headers. That is why the small group always had a record count of 0 and nothing displayed. If I delete this group and delete the download, then I have no more problems.

Sorry ... it was difficult to understand what was happening because I didn't know how the app was supposed to work. I was just a bit befuddled. - (more than usual lol!)

Thanks for your patience - i'm gonna call it a night now.

Cheers


Edit.... it seems that any big group will act the same way, i.e. ignore the download age process. Those I had problems with were 269MB, 274MB and 237MB. Also, I noticed on another small group that even though I set it for 10 days (with not much activity in those 10 days), it actually pulled down 4900 records, and many of those articles were over 400 days old. So I'm not sure how robust the download age routine actually is.

Re: can there be gaps in group content?

PostPosted: Sat Dec 29, 2012 1:11 am
by Quade
If it's a small group, Newsbin might decide to pull the whole thing instead of using download age. How long it takes depends on your line speed. if you see (XXX) X = number in the cache line of the status bar, Newsbin is already importing headers you've already downloaded. Until it's done, you won't have all headers available.


That's what I'm talking about here. Newsbin downloads the headers at high speed and saves them to disk. Then another process packs them into the database, which is where they get loaded from. Sounds like you had records from a prior group being imported before the current group's import.
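
The two-stage flow described above can be sketched in Python. This is a toy model, not Newsbin's actual code: the names `spool`, `database`, `download` and `import_next` are hypothetical, and the first-in, first-out import order is an assumption that matches the "prior group being imported before the current group" explanation.

```python
from collections import deque

spool = deque()   # header files written to disk by the downloader, oldest first
database = {}     # group name -> headers that the display can actually load


def download(group, headers):
    """Stage 1: fast download, headers only land in the spool."""
    spool.append((group, headers))


def import_next():
    """Stage 2: pack the oldest spooled file into the database."""
    group, headers = spool.popleft()
    database.setdefault(group, []).extend(headers)


download("big.group", ["b1", "b2", "b3"])
download("small.group", ["s1"])

import_next()
print(database.get("small.group", []))  # [] because big.group was queued first
import_next()
print(database.get("small.group", []))  # ['s1']
```

In this model the small group shows nothing until the earlier group's import has finished, even though its own download completed long before.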

Re: can there be gaps in group content?

PostPosted: Sat Dec 29, 2012 5:03 pm
by beany
Aaaaaarrrggggh!!!! I really can't be doing with this! 12hrs or so of downloading headers for this big group (only halfway through) and then I had an internet failure. NB has started to pull it down from the beginning again.
So.... it's not unusual to have internet outages; surely NB has a way to recover from it. This is all getting to be too much trouble. Think I'll be back to internet searches. Please tell me I can recover the lost data.


Edit... it does look like it's much faster though. Maybe the headers are just being looked at rather than downloaded, until it finds where it left off, yes???

Re: can there be gaps in group content?

PostPosted: Sat Dec 29, 2012 5:10 pm
by Quade
NB has started to pull it down from the beginning again.


I'm skeptical. Normally it starts up where it left off UNLESS you tell it to "Download all" again. "Download all" ALWAYS downloads all.

After "download all" one time, even if it fails, if you just "download latest" it'll continue where it left off.
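
The resume behaviour described above amounts to remembering the highest article number already fetched. Here is a minimal Python sketch under that assumption. `next_range` and its parameters are hypothetical, and the real program also applies Download Age on a fresh group, which this sketch ignores.

```python
def next_range(server_low, server_high, last_fetched, download_all=False):
    """Return the (first, last) article numbers to request next.

    download_all=True always restarts from the server's oldest article.
    Otherwise the fetch continues just past the last article already stored.
    """
    if download_all or last_fetched is None:
        # Simplification: a group with no stored position starts from the oldest.
        return (server_low, server_high)
    return (max(server_low, last_fetched + 1), server_high)


# Interrupted halfway through, then "Download Latest": continues from 501.
print(next_range(1, 1000, last_fetched=500))                     # (501, 1000)
# Choosing "Download all" again: starts over from the beginning.
print(next_range(1, 1000, last_fetched=500, download_all=True))  # (1, 1000)
```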

Re: can there be gaps in group content?

PostPosted: Sat Dec 29, 2012 7:17 pm
by beany
Quade wrote:
NB has started to pull it down from the beginning again.


I'm skeptical. Normally it starts up where it left off UNLESS you tell it to "Download all" again. "Download all" ALWAYS downloads all.

After "download all" one time, even if it fails, if you just "download latest" it'll continue where it left off.



Well, it started from 0/0, then went quite fast to 93400000/1041533514 (1hr or so); at that point the green current progress bar was about 1/8th along. When it stopped originally it was over half, so something has definitely been lost. It's crawling now, so I'm just going to let it go. I don't know why it takes so long. When I check the size of the group's folder it doesn't seem that big, certainly not worth 12+hrs of constant download on a 100Mbps link. Another thing I noticed was that sometimes it would say it was getting more bandwidth than the link allows. Is this because it's measuring after the data is decompressed?

I can't afford to tie my PC up like this. What if I download headers on a cheap 'n' cheerful laptop that can just sit there downloading headers? Can I then move the specific group folder from the laptop's spool folder to the main PC where I will be doing the downloads? Is there anything else that would need to be copied over?

thnx

Re: can there be gaps in group content?

PostPosted: Sat Dec 29, 2012 8:32 pm
by Quade
I can't afford to tie my PC up like this


Why not just let it download in the background? Minimize Newsbin, then use the PC for something else. Most modern PCs can do more than one thing at a time. Hell, I game while headers download.

You can download on a laptop and later move them over. When you do, it'll overwrite what's already in the folder.