Mudcat Café message #1412908 The Mudcat Café TM
Thread #77685   Message #1412908
Posted By: GUEST,JohnInKansas
17-Feb-05 - 08:59 AM
Thread Name: What's up with Mudcat this time!!???
Subject: RE: What's up with Mudcat this time!!???
Jon

The link that Max posted re the FunWebProducts was a NetworkWorldFusion commentary by Mark Gibbs, marked as a Network World Applications Newsletter, 12/10/03. ( http://www.nwfusion.com/newsletters/web/2003/1208web2.html ). I'm sure you've seen that one already.

I agree that it seemed like rather old news, but it may just be the best short explanation that was available for Max to post. In the absence of further explanation from Max, it must be assumed that he found something on the 'cat servers to implicate FunWeb; but he's the only one who knows.

There are numerous "product makers" whose reputations are no better than FunWeb's. The typical difficulty with many of them arises from their practice of allowing virtually anyone to "bundle" or "embed" their bot/search/hijack components into arbitrary other "free stuff." The makers make no attempt to control who uses their "engines," so it's impossible to tell who's actually responsible. Of course the core engine that gets used is identifiable, but that doesn't necessarily mean its maker directly intended the use that appears. One other similar maker is 180Solutions, cited as the culprit in the recent COAST Collapse ( eWeek at http://www.eweek.com/article2/0,1759,1761466,00.asp ).

The "symptom" I get is that mudcat is up and running. I get a "web site contacted, waiting for reply." Usually the "progress bar" indicates intermittent clusters being sent, but I'm on a LAN connection so I don't get a meaningful byte count. Progress stalls, and the connection just sits there.

My amateur reading of this is that a request for connection has priority and is answered. The page requested starts sending, but can be interrupted if a higher priority request appears, i.e. a new request for connection. When the transmission to me is interrupted, my request goes to a queue until it's next in line for response. If there's too much traffic, the queue fills and my request probably gets pushed off the bottom, or the traffic just prevents my request from ever getting back to the top of the queue.
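A toy version of that guess can be sketched in a few lines of Python. This is only an illustration of the reasoning above, not a description of how the 'cat's server (or any real web server) works. The function name, the "work units per tick" budget, the three-chunk page size and the rule that a full queue simply drops the newcomer are all assumptions made up for the sketch.

```python
from collections import deque

def simulate(arrivals_per_tick, capacity, queue_limit, ticks, chunks_per_page=3):
    """Toy model (hypothetical): new connections always beat page transfers.

    Returns (pages finished, requests dropped, transfers still waiting).
    """
    waiting = deque()   # each entry is the number of chunks a page still needs
    finished = dropped = 0
    for _ in range(ticks):
        budget = capacity   # work units the server can spend this tick
        # New connection requests are answered first, one work unit each.
        for _ in range(arrivals_per_tick):
            if budget == 0:
                dropped += 1            # server too busy even to answer
                continue
            budget -= 1
            if len(waiting) < queue_limit:
                waiting.append(chunks_per_page)
            else:
                dropped += 1            # assumed rule: full queue drops the newcomer
        # Whatever work is left goes to the page at the front of the queue.
        while budget > 0 and waiting:
            waiting[0] -= 1
            budget -= 1
            if waiting[0] == 0:
                waiting.popleft()
                finished += 1
    return finished, dropped, len(waiting)

print(simulate(arrivals_per_tick=1, capacity=5, queue_limit=10, ticks=100))
print(simulate(arrivals_per_tick=5, capacity=5, queue_limit=10, ticks=100))
```

With light traffic the first call finishes every page, (100, 0, 0). When new connections alone eat the whole budget, the second call finishes nothing, (0, 490, 10): the queue fills and every transfer in it just sits there, which is at least consistent with the stalled progress bar.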

A possible mudcat-specific explanation might be that Max doesn't have a large enough queue to handle peak traffic, and/or doesn't have "queue full" detection and/or overflow handling set up optimally. I've seen some vague comments about the possibility of managing the queue so that a "senior request" gets moved up in priority, but that kind of management appears to require a few extra bells and whistles.
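The "senior request" idea is usually called priority aging. A minimal sketch, assuming a made-up scoring rule (base priority plus waiting time) and hypothetical names throughout, might look like this. It is not taken from any server's actual configuration.

```python
def pick_next(pending, now, aging_rate=1.0):
    """Remove and return the name of the request to serve next.

    pending is a list of (arrival_time, base_priority, name) tuples.
    The score is base priority plus aging_rate times the time spent
    waiting, so a request that has waited long enough overtakes a fresh
    one. The scoring rule is an assumption for illustration only.
    """
    def score(item):
        arrived, base, _name = item
        return base + aging_rate * (now - arrived)

    best = max(pending, key=score)
    pending.remove(best)
    return best[2]

# An old page transfer (low base priority) against a brand-new connection.
requests = [(0, 0, "old page transfer"), (9, 5, "new connection")]
print(pick_next(list(requests), now=10))                  # old page transfer
print(pick_next(list(requests), now=10, aging_rate=0.0))  # new connection
```

With aging switched off (aging_rate=0.0) the new connection always wins, which is the behavior guessed at above. With aging on, the stalled transfer eventually gets its turn.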

The regular appearance of the problem seems to be just about the time London goes to work, or perhaps at morning tea break, so it's possible that it's a simple "too many Brits" thing. While that's a politically popular opinion, the posted traffic doesn't really show that much of a bump.

If the 'cat is being hit with an indexing/search utility, I can speculate that something of this sort might have special effects on the 'cat. Sending results home probably must be "scheduled" for any non-trivial site, since otherwise the receiving site would have to have a continuously open connection to every site it's searching. The roughly 04:00 to 05:00 US Eastern time window would be a reasonable choice for typical US sites if the bot is looking for a time when US traffic is low. (That's 01:00 to 02:00 Western US time.)

The regular appearance might also be because one user, who comes in at a regular time, is dropping the same specific and unusually disruptive bot at the same time every, or nearly every, day.

Most search engines rely on following links to find what's on a site. Every page at the 'cat has, typically, a dozen or more links at the top for related info and a list of posts with a link to each post, and each post has a link to the person posting, which also executes a search. Each request to open a link is a "new request to connect" which has priority at the top of the queue. So how much traffic happens if "someone or something" goes to the 'cat at 04:00 every day and says "open all the links?" The search behavior of a crudely programmed bot might not even be visible at many sites, but at the 'cat I can see a very stuffed buffer.
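To put a rough number on "open all the links," here is a small sketch that counts the requests a naive link follower would make on one thread page. The page layout (12 links at the top, 50 posts, one post link and one poster link per post) and all the names are invented for the example. Real thread pages and real bots will differ.

```python
from collections import deque

def toy_thread(nav_links=12, posts=50):
    """Build a hypothetical one-page link map: page name -> list of link targets."""
    targets = [f"nav{i}" for i in range(nav_links)]
    for i in range(posts):
        targets += [f"post{i}", f"poster{i}"]   # link to the post, link to its poster
    return {"thread": targets}

def count_requests(links, start, max_depth):
    """Count page fetches made by a naive breadth-first link follower."""
    seen = {start}
    frontier = deque([(start, 0)])
    fetched = 0
    while frontier:
        page, depth = frontier.popleft()
        fetched += 1                      # one request per page opened
        if depth == max_depth:
            continue                      # do not follow links any deeper
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                frontier.append((target, depth + 1))
    return fetched

print(count_requests(toy_thread(), "thread", max_depth=1))  # 113
```

So under these made-up numbers a single thread costs 113 requests when followed just one level deep, and every one of those is a "new request to connect." Multiply by the number of threads and it is easy to imagine the queue filling.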

Plausible(?), but of course no proof. (Of course, I may not understand how this stuff works, and I'll go with the experts when they're ready to talk.)

John