Logs for jabber
[00:19:39] * brlancer joined the chat.
[00:36:55] * swmohsin joined the chat.
[00:37:03] * swmohsin left the chat.
[00:52:38] * aRyo left the chat.
[01:03:26] * swmohsin joined the chat.
[01:05:04] * swmohsin23205 joined the chat.
[01:05:04] * swmohsin23205 left the chat.
[01:05:04] * swmohsin67621 joined the chat.
[01:05:33] * darkrain joined the chat.
[01:12:09] * evilotto left the chat.
[01:13:26] * swmohsin left the chat.
[01:14:20] * swmohsin67621 left the chat.
[01:16:52] * darkrain left the chat.
[02:32:07] * jameschurchman left the chat.
[02:42:22] * dreamcast joined the chat.
[04:24:40] * treebilou joined the chat.
[04:25:57] * dreamcast left the chat.
[04:54:26] * darkrain joined the chat.
[05:15:15] * coolidge47506 joined the chat.
[05:15:27] * coolidge47506 left the chat.
[05:32:28] * darkrain left the chat.
[05:34:09] * darkrain joined the chat.
[05:49:03] * NEOhidra joined the chat.
[06:17:46] * z4rkus@jabber.org joined the chat.
[06:18:16] * marseille_ joined the chat.
[06:19:10] * z4rkus@jabber.org left the chat.
[06:20:07] * harlock joined the chat.
[06:59:10] * NEOhidra left the chat.
[06:59:37] * Lastwebpage joined the chat.
[07:09:09] * marseille_ left the chat.
[07:11:49] * the ♚ joined the chat.
[07:21:41] * the ♚ left the chat.
[07:22:13] * yuppinturic joined the chat.
[07:37:35] * Tobias joined the chat.
[07:53:22] * mpranj joined the chat.
[08:01:42] * harrykar left the chat.
[08:08:52] * mpranj left the chat.
[08:59:22] * the ♚ joined the chat.
[09:52:32] * pinchartl joined the chat.
[09:52:38] <pinchartl> hi
[09:55:33] <Kev> Morning.
[09:59:22] <pinchartl> hi Kevin. I've sent you another e-mail, with a server log this time :-)
[10:01:53] <Kev> Yep, I see, thanks.
[10:03:45] <Kev> I've eliminated just about all the possible causes for this on the jabber.org side, now, I'm seriously wondering if this is a network error.
[10:05:00] <pinchartl> that's not impossible
[10:05:26] <Kev> The logs don't look particularly unusual for the time you've sent.
[10:06:48] <pinchartl> wait_for_validation: retiisi.org.uk -> jabber.org (connect timeout)
[10:06:52] <pinchartl> that's what bothers me
[10:07:01] <Kev> e.g. the number of network events (connections, auths, disconnects, validations etc.) in that second is roughly the same as in the second just gone.
[10:07:01] <pinchartl> the S2S connection attempt failed
[10:07:19] <Kev> Right - there are very few things that could cause that in the server.
[10:07:39] <Kev> I'm aware of three - one of which doesn't apply to our configuration.
[10:08:42] <Kev> The other two are an effective DoS (such as happens after a server restart, when many connections hit at roughly the same moment) and the machine's hardware in some way freezing/lagging.
[10:09:13] <pinchartl> do you have any way to monitor the number of pending connections and detect DoS issues ?
[10:09:14] <Kev> We've fixed instances of the latter recently - both the backup hitting the disk very hard, and the system hitting swap. I don't see evidence of the former for that period.
[10:09:40] <Kev> Roughly speaking. We log the number of incoming connections by time.
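A minimal sketch of the kind of per-second comparison described above: count log events per second and flag seconds that are far busier than usual. The log format (each line starting with an HH:MM:SS timestamp), the function names and the threshold factor are all assumptions made for illustration; nothing here reflects how jabber.org actually processes its logs.

```python
import re
from collections import Counter
from statistics import median

# Assumption: each log line begins with an HH:MM:SS timestamp.
TIMESTAMP = re.compile(r"^(\d{2}:\d{2}:\d{2})\b")

def events_per_second(lines):
    """Count how many log lines fall into each one-second bucket."""
    counts = Counter()
    for line in lines:
        match = TIMESTAMP.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts

def unusual_seconds(counts, factor=3.0):
    """Return the seconds whose event count exceeds factor * median.

    Seconds with no events at all never appear in `counts`, so the
    median is taken over seconds that saw at least one event.
    """
    if not counts:
        return []
    typical = median(counts.values())
    return sorted(ts for ts, n in counts.items() if n > factor * typical)
```

A second that looks "roughly the same" as its neighbours will not be flagged; a burst of connections or disconnects concentrated in one second will be.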
[10:11:03] * brlancer left the chat.
[10:11:47] <pinchartl> how could this be investigated ?
[10:11:52] <pinchartl> it's not an isolated issue
[10:11:57] <pinchartl> and it's really annoying
[10:15:30] <Kev> I'm seeing what I can work out from the logs at the moment.
[10:15:40] <Kev> I'd probably like to see it solved at least as much as you. :)
[10:22:30] <pinchartl> that's definitely good :-)
[10:25:08] <Kev> Possibly.
[10:36:54] * harrykar joined the chat.
[10:39:24] * NEOhidra joined the chat.
[10:43:02] * marseille_ joined the chat.
[10:43:41] * marseille_ left the chat.
[10:43:51] * marseille_ joined the chat.
[10:54:14] * badlop joined the chat.
[11:06:04] * treebilou left the chat.
[11:17:44] * NEOhidra left the chat.
[11:19:58] * NEOhidra joined the chat.
[11:44:16] * the ♚ left the chat.
[11:51:31] * marseille_ left the chat.
[11:53:03] <Kev> pinchartl: Do you know what the timeout in question is, whether that server had any other issues at that time, and whether a log message of a timeout means a literal timeout, or whether it can mean other things (like TCP cut or something
[11:53:04] <Kev> )
[11:54:43] <pinchartl> Kev: the server had no other (known) issue at that time. communication with other servers was working correctly as far as I know
[11:55:17] <Kev> Thanks.
[11:55:53] <pinchartl> I suppose that timeout means a TCP connection timeout, but I'm not sure what ejabberd logs exactly
[12:00:19] * sailus joined the chat.
[12:02:27] <sailus> Kev: I've got connection issues with other servers, namely gmail.com, but others have had issues with that one as well.
[12:02:43] <sailus> There are servers, however, with which connections have been fine.
[12:02:51] <Kev> Ah, you're on the server in question :)
[12:04:32] <sailus> Yes, I am. :-)
[12:05:18] <sailus> It's running 2.1.5-3+squeeze1 (Debian squeeze).
[12:05:39] <Kev> Thanks - I'm afraid I know little enough about ejabberd that this gives me very little informationt.
[12:05:42] <Kev> -t
[12:06:30] <sailus> I don't think the ejabberd 2 has any issues related to this; it might be useful information nevertheless.
[12:07:01] <sailus> "connect timeout" very probably means that connect system call returned error code ETIMEDOUT.
[12:07:04] <Kev> I don't believe this to be a problem with your server.
[12:07:21] <sailus> This means that the tcp connection has failed to establish.
[12:07:27] <Kev> The problem being I don't believe it to be a problem with jabber.org either :)
[12:07:38] <Kev> Right.
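A minimal sketch of the distinction discussed above, assuming Python's standard socket module: a connect attempt that times out (no answer within the deadline, or the OS reporting ETIMEDOUT) is told apart from one that is actively refused. The function name and the outcome labels are made up for illustration, and this says nothing about what ejabberd itself logs.

```python
import errno
import socket

def classify_connect(host, port, timeout=5.0):
    """Attempt a TCP connection and return a short label for the outcome."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "connected"
    except (socket.timeout, TimeoutError):
        # No completed handshake within the deadline, or ETIMEDOUT from the OS.
        return "timeout"
    except ConnectionRefusedError:
        # The peer answered with a reset: nothing is listening on that port.
        return "refused"
    except OSError as exc:
        if exc.errno == errno.ETIMEDOUT:
            return "timeout"
        return "error: {}".format(exc.strerror or exc)
```

A "refused" result means the remote host was reachable and answered; a "timeout" only says that no connection was established in time, which is consistent with packet loss on the path as well as with a listener too busy to accept.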
[12:08:23] <pinchartl> Kev: what do you suspect ? a network problem in the hosting facility ?
[12:08:41] <pinchartl> s/in/at/
[12:09:19] <Kev> I'm really struggling to come up with a plausible explanation other than networking failures somewhere - I note that my lack of imagination does *not* mean that I'm laying the blame with our hosters; it could well be that I just don't see the problem.
[12:09:41] <pinchartl> :-)
[12:10:51] <Kev> I've been doing some postprocessing of the jabber.org logs, and what I see is quite peculiar, and I can't explain it (apart from glitches in the Matri^h^h^h^h^hnetwork).
[12:13:28] <sailus> Kev: Another cause might be that the server isn't able to accept connections for one reason or another --- i.e. it hasn't had time or otherwise been able to issue the accept system call for the incoming connection.
[12:13:46] <Kev> sailus: Indeed.
[12:14:07] <Kev> I even have a plausible explanation for why that would be.
[12:14:42] <Kev> As this is the behaviour we see after a server restart - thousands of clients and servers all hit us at the same moment and some will time out and retry while we work through them all.
[12:15:06] <Kev> So an event causing many connections to end at once (and therefore to reconnect instantly) would explain this.
[12:15:11] <Kev> I even think I may be seeing that in the logs.
[12:15:25] <Kev> What I *can't* explain is why there would be that sudden slew of disconnects.
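A rough back-of-the-envelope sketch of the reconnect burst described above: if many peers reconnect at the same instant and the server can only accept a fixed number of connections per second, everyone still waiting when the connect timeout expires times out and retries. The model (a constant accept rate, all retries arriving together) and every number in it are hypothetical simplifications, not measurements from jabber.org.

```python
def rounds_until_drained(clients, accepts_per_second, connect_timeout):
    """Return (rounds, timeouts) for a burst of simultaneous connects.

    Each round lasts one connect timeout. The server accepts at most
    accepts_per_second * connect_timeout peers per round; everyone
    still waiting at the end of the round times out and retries.
    """
    capacity = int(accepts_per_second * connect_timeout)
    if capacity <= 0:
        raise ValueError("server must be able to accept at least one peer per round")
    rounds = 0
    timeouts = 0
    while clients > 0:
        served = min(clients, capacity)
        clients -= served
        timeouts += clients  # the peers left over all hit the timeout
        rounds += 1
    return rounds, timeouts

# Hypothetical figures: 10000 peers, 200 accepts per second, 30 s timeout.
print(rounds_until_drained(10000, 200, 30))
```

With these made-up figures the server can take 6000 peers in the first 30 seconds, so 4000 connection attempts time out once before succeeding on the retry.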
[12:17:24] <pinchartl> Kev: how do S2S connections timeout ? after a fixed inactivity timeout ?
[12:17:41] <Kev> jabber.org doesn't timeout S2S connections at the stream level.
[12:17:51] <Kev> That's just wasteful ;)
[12:18:34] <pinchartl> sailus: does your server timeout the S2S connections then ?
[12:18:44] <sailus> Kev: The TCP connections are only kept alive as long as there is traffic between the servers. At least that's the default configuration for ejabberd 2.
[12:18:48] <Kev> Probably. A number of implementations do.
[12:18:50] <sailus> In Debian, that is. :-)
[12:19:21] <pinchartl> but I doubt that could explain why many connections would end and be restarted at the same time
[12:19:28] <Kev> pinchartl: No.
[12:20:10] * treebilou joined the chat.
[12:20:39] <sailus> The log I have shows that an s2s connection to jabber.org was closed 2011-07-13 11:25:09 (GMT + 3) and another one was attempted 11:44:46, and it timed out 11:49:41.
[12:21:03] <Kev> Oh.
[12:21:12] <Kev> Because that's not what the log pinchartl sent me seemed to say.
[12:21:19] <sailus> Wasn't it?
[12:21:33] <Kev> The log seemed to be saying it was timeout during wait_for_validation.
[12:21:38] <Kev> I'm *assuming* that means dialback.
[12:21:45] <sailus> Kev: I think you're right.
[12:21:58] <sailus> I'm mostly guessing here. :-)
[12:22:09] <Kev> Which would suggest that it was jabber.org trying to connect to your server (connection made) and what timed out was your server trying to connect to jabber.org for dialback.
[12:22:47] <sailus> I should learn some Erlang, I suppose. :-)
[12:36:10] * NEOhidra left the chat.
[12:36:10] * NEOhidra joined the chat.
[12:41:18] * the ♚ joined the chat.
[12:41:46] * the ♚ left the chat.
[12:42:03] * the ♚ joined the chat.
[12:52:54] * pinchartl left the chat.
[12:53:00] * pinchartl joined the chat.
[13:01:57] * swmohsin joined the chat.
[13:02:19] * swmohsin left the chat.
[13:13:58] * Neustradamus left the chat.
[13:14:35] * Neustradamus joined the chat.
[13:19:04] * harlock left the chat.
[13:33:37] * marseille left the chat.
[13:39:12] * stpeter joined the chat.
[13:43:10] * tsk joined the chat.
[13:58:19] * Tobias left the chat.
[14:05:02] * tsk left the chat.
[14:07:47] * mpranj joined the chat.
[14:14:24] * naw joined the chat.
[14:24:50] * mpranj left the chat.
[14:25:30] * mpranj joined the chat.
[14:42:12] * sailus left the chat.
[14:43:17] * the ♚ left the chat.
[14:52:57] * MattJ joined the chat.
[15:09:35] * Neustradamus left the chat.
[15:32:13] * the ♚ joined the chat.
[15:32:50] * the ♚ left the chat.
[15:33:05] * the ♚ joined the chat.
[15:36:58] * the ♚ left the chat.
[15:37:10] * the ♚ joined the chat.
[15:45:13] * mpranj left the chat.
[15:49:58] * paulmad joined the chat.
[15:50:06] * badlop left the chat.
[15:57:17] * yuppinturic left the chat.
[16:25:46] * wilson39320 joined the chat.
[16:27:54] * wilson39320 left the chat.
[16:29:49] * jameschurchman joined the chat.
[16:30:39] * jameschurchman left the chat.
[16:34:49] * whatever joined the chat.
[16:35:30] * mpranj joined the chat.
[16:36:21] * naw left the chat.
[16:36:27] * marseille_ joined the chat.
[16:41:07] * marseille joined the chat.
[17:14:06] * Tobias joined the chat.
[17:24:36] * Syedking joined the chat.
[17:24:36] * Syedking left the chat.
[17:33:35] * Lastwebpage left the chat.
[17:34:24] * yubeiluo\40jabber.org joined the chat.
[17:37:59] * PaulFertser joined the chat.
[17:38:35] <PaulFertser> Hi there :) i seem to have some gmail s2s issues again.
[17:40:55] <Kev> PaulFertser: Yes, we're aware, thanks. It's not clear what the issue is, we're investigating.
[17:40:58] * naw joined the chat.
[17:41:13] * naw left the chat.
[17:41:34] <PaulFertser> Kev: hey, how's it going, long time no see :)
[17:42:19] * marseille_ left the chat.
[17:42:20] <Kev> It'd be better if gmail s2s was working :)
[17:44:35] * evilotto joined the chat.
[17:45:09] <PaulFertser> Kev: how comes there's so much magic in simple xml-based protocol over ssl? ;)
[17:46:04] * marseille_ joined the chat.
[17:46:17] <Kev> The usual problem of maintaining a popular service on a hostile Internet.
[17:46:31] * Lastwebpage joined the chat.
[17:48:36] <PaulFertser> Kev: (aware of the s2s issues) btw, there's an identi.ca account you've got there ;)
[17:49:22] <Kev> Oh, that.
[17:55:30] <Kev> Oh, which it seems my saved password for is incorrect.
[17:57:06] * the ♚ left the chat.
[18:28:54] * Tobias left the chat.
[18:29:12] * Tobias joined the chat.
[18:29:38] * whatever left the chat.
[18:33:28] * Tobias left the chat.
[18:33:58] * Tobias joined the chat.
[18:35:35] * whatever joined the chat.
[18:43:49] * paulmad left the chat.
[19:02:23] * waqas joined the chat.
[19:03:18] * naw joined the chat.
[19:28:57] * waqas left the chat.
[19:31:26] * naw left the chat.
[19:41:42] * paulmad joined the chat.
[19:50:52] * clinton37476 joined the chat.
[19:54:25] * gorgias\40jabber.org joined the chat.
[19:54:56] <gorgias\40jabber.org> xxx
[19:55:49] * clinton37476 left the chat.
[19:57:37] * gorgias\40jabber.org left the chat.
[20:06:07] * treebilou left the chat.
[20:07:49] * marseille_ left the chat.
[20:08:31] * yuppinturic joined the chat.
[20:13:21] <stpeter> yay
[20:22:57] <louiz’> yay
[20:23:14] * mpranj left the chat.
[20:43:47] * lo0lo0 joined the chat.
[20:44:02] * lo0lo0 left the chat.
[20:47:43] * lo0lo0 joined the chat.
[20:47:43] * lo0lo0 left the chat.
[20:52:03] * tyler15047 joined the chat.
[20:52:03] * tyler15047 left the chat.
[20:52:14] * lo0lo0 joined the chat.
[20:57:00] * lo0lo0 left the chat.
[21:24:13] * mhammad.a.a joined the chat.
[21:24:14] * mhammad.a.a left the chat.
[21:40:15] * pinchartl left the chat.
[21:41:00] * pinchartl joined the chat.
[21:41:05] * pinchartl left the chat.
[21:41:12] * pinchartl joined the chat.
[21:44:38] <pinchartl> jabber.org <-> gmail.com S2S died again :-(
[22:11:37] * pinchartl left the chat.
[22:11:43] * pinchartl joined the chat.
[22:21:19] * NEOhidra left the chat.
[22:35:59] * marseille_ joined the chat.
[22:54:21] * Badja joined the chat.
[23:00:56] * Badja left the chat.
[23:01:23] * Lastwebpage left the chat.
[23:04:32] * Tobias left the chat.
[23:06:22] <stpeter> plus http://xmpp.org/ is offline, too
[23:06:24] <stpeter> lots of fun
[23:06:28] <stpeter> /me investigates
[23:47:38] * pinchartl left the chat.
[23:54:02] * roosevelt38931 joined the chat.
[23:57:38] * yubeiluo\40jabber.org left the chat.
[23:59:00] * roosevelt38931 left the chat.