Technical Question: Bandwidth usage
A question for anybody who knows anything about hosting websites.
So far this month, according to CPanel, Chicken Yoghurt has used 4334.84 MB of bandwidth. However, the AWStats traffic stats say I’ve used a total of 3.28 GB – 2.01 in traffic viewed and 1.27 in traffic not viewed (search engines spidering the site and the like).
So why the discrepancy between the 4334.84 MB and the 3.28 GB (3355 MB)? There’s been no significant FTP or email activity to explain a stray thousand meg of bandwidth. Any ideas? Would polling of the site’s RSS feeds by the likes of Bloglines use that much?
Ta in advance to anyone with any help.
Posted on September 18th, 2006 at 2:17pm under A few administrative notices
15 Comments

I am guessing, but it’s an educated guess, that CPanel is counting both upstream and downstream bandwidth, whereas AWStats will only be counting downloads.
Cheers John, but what other upstream bandwidth usage could there be other than FTP uploads? I’ve hardly done any uploading – certainly not 1000 meg’s worth.
The Bloglines bot and stuff like that will be included in the “traffic not viewed” stats.
As John said, there’s the upstream bandwidth as well, which AWStats won’t be measuring. And there’s also all the non-HTTP traffic to the server. Do you use the server for email? There might be a lot of SMTP traffic – particularly if you’ve come under spam attack recently.
Dave, I don’t use the server for email any more. I went back to Gmail because its spam filtering was better than the SpamAssassin used by my webhost. I take it any incoming mail to the server still counts as SMTP traffic regardless of whether I download it via POP?
Yep. Any mail directed to chickyog.net will be “handled” (even if handled means “dropped on the floor”) by this server as it’s listed as the MX server for the domain. And that all counts as bandwidth.
Probably log files. They may clear eventually and the total will come down.
Part of the answer may be that bandwidth includes the protocol overhead (TCP headers, IP headers, etc.) that comes in addition to the actual data, whereas the lower figure is the data content only.
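To put a rough number on the header overhead, here is a minimal sketch. It assumes plain IPv4 and TCP headers of 20 bytes each with no options, and full-sized segments on a 1500-byte Ethernet MTU. Whether the host's counter actually measures at packet level is also an assumption; it may simply total the bytes in the server logs.

```python
# Assumed sizes: 20-byte IPv4 header, 20-byte TCP header (no options),
# and a 1500-byte Ethernet MTU. Real traffic will vary.
IP_HEADER = 20
TCP_HEADER = 20
MTU = 1500


def header_overhead_fraction(mtu=MTU, ip_header=IP_HEADER, tcp_header=TCP_HEADER):
    """Return header bytes as a fraction of payload bytes for full-sized segments."""
    payload = mtu - ip_header - tcp_header
    return (ip_header + tcp_header) / payload


overhead = header_overhead_fraction()
print(f"Header overhead on full-sized segments: {overhead * 100:.1f}%")
```

Under those assumptions the overhead comes to a few percent of the payload. Small responses and ACK-only packets would push it higher, but it is hard to see headers alone accounting for a gap of around a thousand meg.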
Another possibility is the computing world’s sloppy usage of the terms meg and gig. Sometimes a meg has the general-world meaning of a million, i.e. 10 to the power 6, and sometimes it has the peculiar computer-nerd meaning of 2 raised to the power 20, which is actually 1,048,576, approximately 4.86% higher than a million. Likewise, sometimes a gig refers to 10 to the power 9, i.e. 1,000,000,000, and sometimes to 2 to the power 30, i.e. 1,073,741,824, which differs by approx. 7.37%.
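The meg/gig point is easy to check against the two figures quoted in the post. This is only a sketch of the arithmetic; which convention CPanel and AWStats each use is not stated anywhere in the thread, so both are tried.

```python
# Figures quoted in the post.
CPANEL_MB = 4334.84
AWSTATS_GB = 3.28


def gb_to_mb(gb, binary):
    """Convert GB to MB using 1024 (binary) or 1000 (decimal) MB per GB."""
    return gb * (1024 if binary else 1000)


binary_mb = gb_to_mb(AWSTATS_GB, binary=True)
decimal_mb = gb_to_mb(AWSTATS_GB, binary=False)

print(f"AWStats total as binary MB:  {binary_mb:.2f}")
print(f"AWStats total as decimal MB: {decimal_mb:.2f}")
print(f"Gap to CPanel (binary):  {CPANEL_MB - binary_mb:.2f} MB")
print(f"Gap to CPanel (decimal): {CPANEL_MB - decimal_mb:.2f} MB")

# The size of the two conventions' disagreement, as percentages.
mega_diff_pct = (2**20 / 10**6 - 1) * 100
giga_diff_pct = (2**30 / 10**9 - 1) * 100
print(f"2^20 vs 10^6: {mega_diff_pct:.2f}% higher")
print(f"2^30 vs 10^9: {giga_diff_pct:.2f}% higher")
```

Whichever convention is picked, the gap stays in the region of a thousand meg, so units on their own can only explain a small slice of it.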
Download bandwidth is what you send.
Upload bandwidth is the requests for content from your users.
Looking at sportnetwork, upload is about 10% of download.
So adding that in with a probable miscalculation of the number of bytes, then you’re close... plus any POP/SMTP/FTP stuff that AWStats won’t pick up.
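Plugging the 10% figure into the numbers from the post gives a rough idea of how much is left over. This is a back-of-envelope sketch only: the 10% ratio is one commenter's observation from a different site, and treating the AWStats total as binary GB is an assumption.

```python
# Figures quoted in the post.
CPANEL_MB = 4334.84
AWSTATS_GB = 3.28

# Assumptions: AWStats reports binary GB, and upstream request traffic
# is about 10% of downstream, as observed on another site in the thread.
MB_PER_GB = 1024
UPLOAD_RATIO = 0.10

awstats_mb = AWSTATS_GB * MB_PER_GB
estimated_total_mb = awstats_mb * (1 + UPLOAD_RATIO)
unexplained_mb = CPANEL_MB - estimated_total_mb

print(f"AWStats downloads:      {awstats_mb:.2f} MB")
print(f"Plus 10% for requests:  {estimated_total_mb:.2f} MB")
print(f"Still unaccounted for:  {unexplained_mb:.2f} MB")
```

On these assumptions several hundred meg would still have to come from somewhere else, such as mail, FTP or hits the stats package does not count.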
Do you have fancy 404/403 error pages? My site used to, and when it got trackback-spammed to death after I deleted the old trackback script, the resulting 404 pages spent 0.6 GB of my bandwidth allowance before I realised. My stats package didn’t show it up, as they were not counted as successful downloads, but my hosting provider’s bandwidth counter did.
OK. You’re on Apache, so you should be able to use .htaccess files. So if you think you need to ban certain rogue users (mostly bots), you can.
If your host doesn’t offer enough analysis of your logs, download them (in a compressed format) and use Analog. Very heavy users of your site should show up.
As Chris A says, some 404 pages can cause trouble: unless ‘oohlala.php’ really is a legitimate alternative name for ‘index.php’, I’ll guess that 404s redirect to your home page. One mistyped URL (e.g. ‘htp://www.whatever.com’) will throw every browser and search engine. Robots can keep trying faulty links (to anything: images, pages, whatever) for ages. First, many do a weird thing where they try ten times in succession, just for luck I suppose, and then they store the non-working URL and keep trying for months. With a fairly large page, this can clearly mount up. I’ve got custom 404 and 410 (gone permanently) pages which record what hits them, where it came from, and what it was looking for.
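If a full log analyser feels like overkill, the same check can be roughed out by hand. The sketch below assumes the logs are in Apache's common or combined format and totals the bytes sent per status code and per client address. The sample lines, addresses and sizes are invented for illustration, and the pattern is deliberately simple (it will skip requests containing escaped quotes).

```python
import re
from collections import Counter

# Start of an Apache common/combined log line:
# host ident user [time] "request" status bytes
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "[^"]*" (\d{3}) (\d+|-)')


def tally(lines):
    """Total the bytes sent, grouped by status code and by client address."""
    by_status = Counter()
    by_client = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that do not look like log entries
        client, status, size = match.groups()
        sent = 0 if size == "-" else int(size)  # "-" means no body was sent
        by_status[status] += sent
        by_client[client] += sent
    return by_status, by_client


# Invented sample entries; in practice, iterate over the downloaded log file.
sample = [
    '192.0.2.1 - - [18/Sep/2006:14:17:00 +0100] "GET / HTTP/1.1" 200 50000',
    '192.0.2.2 - - [18/Sep/2006:14:17:05 +0100] "GET /oohlala.php HTTP/1.1" 404 48000',
    '192.0.2.2 - - [18/Sep/2006:14:17:06 +0100] "GET /oohlala.php HTTP/1.1" 404 48000',
    '192.0.2.3 - - [18/Sep/2006:14:17:09 +0100] "GET /feed/ HTTP/1.1" 304 -',
]

by_status, by_client = tally(sample)
for status, total in by_status.most_common():
    print(f"status {status}: {total} bytes")
for client, total in by_client.most_common(3):
    print(f"client {client}: {total} bytes")
```

If the 404 line dwarfs what you would expect, or a single address tops the client list by a wide margin, that points at the error-page or rogue-bot explanations rather than units or headers.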
Cheers Chaps, looks like we might be getting somewhere. Dave and Chris – a 404 does indeed redirect to the homepage. You reckon it’s worth disabling this and seeing what happens?
Next question is where to find WordPress’ 404 handling because I’m blowed if I can find it…
Like the new masthead.
As do I. The undeniable massiveness of it will strike fear into the hearts of your enemies.
All for want of inspiration.
Look at the masthead, the masthead, the masthead, not around the masthead, don’t look around the masthead, look at my masthead…