Feature #259

Improve support for reconnections

Added by Frédéric Barthéléry almost 7 years ago. Updated over 4 years ago.

Status: Assigned
Priority: Normal
Category: XMPP
Target version:
Start date: 04/19/2010
Due date:
% Done: 0%

Description

Beem relies on the ReconnectionManager of aSmack. Currently the service is stopped when a disconnection occurs. We should keep the service running while the ReconnectionManager performs the reconnection, and disable some UI functionality until the reconnection is done.

The ReconnectionManager should be disabled when there is no connectivity, and the connection should be relaunched when connectivity comes back.

aSmack doesn't load the ReconnectionManager automatically; this must be done explicitly in the BeemService to enable the ReconnectionManager.
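
A minimal sketch of the behaviour requested above, written in Python purely for illustration (Beem itself is an Android/Java app). The class, state, and method names are hypothetical and do not come from Beem or aSmack. It only models which state the service would be in and when the UI would be enabled. Treating a lost network as an immediate reason to stop trusting the connection is an assumption of the sketch.

```python
from enum import Enum


class State(Enum):
    CONNECTED = "connected"
    RECONNECTING = "reconnecting"        # network is up, retries are running
    WAITING_FOR_NETWORK = "waiting"      # no connectivity, retries are paused


class ReconnectionState:
    """Hypothetical model of the service state; not Beem's actual code."""

    def __init__(self):
        self.state = State.CONNECTED

    @property
    def ui_enabled(self):
        # Actions such as sending messages only make sense while connected.
        return self.state is State.CONNECTED

    def on_connection_lost(self, network_available):
        # The service keeps running; it only changes state.
        if network_available:
            self.state = State.RECONNECTING
        else:
            self.state = State.WAITING_FOR_NETWORK

    def on_network_lost(self):
        # Assumption: pause retries as soon as connectivity disappears.
        self.state = State.WAITING_FOR_NETWORK

    def on_network_available(self):
        # Relaunch the connection only if we were waiting for the network.
        if self.state is State.WAITING_FOR_NETWORK:
            self.state = State.RECONNECTING

    def on_reconnected(self):
        self.state = State.CONNECTED
```

In this model the service is never stopped: a disconnection moves it to RECONNECTING or WAITING_FOR_NETWORK, and the UI checks `ui_enabled` before offering actions that need a live connection.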

reconnection-727.patch - Patch to enable ReconnectionManager (9.71 KB) Frédéric Barthéléry, 04/20/2010 12:08 AM

Beem-debug.apk - Beem r737 with reconnection patch (683 KB) Nikita Kozlov, 05/19/2010 07:02 PM

traces.txt - traces.txt from the crash at reconnection attempt (8.87 KB) Eugene Crosser, 06/02/2010 07:47 AM

log.txt - syslog excerpt for reconnection crash (2.17 KB) Eugene Crosser, 06/02/2010 11:39 AM

log.txt - verbose log or crash on reconnect (27.5 KB) Eugene Crosser, 06/02/2010 11:26 PM

log2.txt - another log of failing reconnect (10.6 KB) Eugene Crosser, 06/03/2010 12:23 AM

beem-reconnect.apk (728 KB) Jerome M., 06/10/2010 02:25 PM

beem.apk (711 KB) Jerome M., 06/11/2010 11:19 AM

reconnection-771.patch - Patch to apply on r771 (14.4 KB) Frédéric Barthéléry, 06/11/2010 10:51 PM

Beem-new-reconnection.apk - Beem r771 with reconnection2.patch (681 KB) Frédéric Barthéléry, 06/11/2010 10:51 PM

beem-log-1.txt - log for beem not trying to reconnect (53.5 KB) Eugene Crosser, 07/05/2010 10:53 PM

beem-log-2.txt (103 KB) Eugene Crosser, 07/18/2010 02:41 PM

beem-log-3.txt (78.2 KB) Eugene Crosser, 07/26/2010 10:31 PM

beem-log-4.txt.bz2 (80.4 KB) Eugene Crosser, 07/27/2010 01:27 PM


Related issues

Related to Bug #258: Close/Crash when screen is powered off Assigned 04/19/2010
Related to Bug #271: Beem doesn't display ignored messages sometimes. Closed 06/12/2010
Related to Feature #400: Implement XEP-0198 Stream Managment New 01/26/2012
Duplicated by Bug #270: Beem does not reconnect automatically Assigned 06/07/2010
Duplicated by Bug #344: Beem silently quits on change of data connection New 02/23/2011
Duplicated by Bug #518: Beem terminates when connection lost and network interface changes New 07/25/2013
Duplicated by Bug #525: Keeps Disconecting New 01/21/2014
Precedes Feature #265: Start on phone boot New 05/22/2010

History

#1 Updated by Frédéric Barthéléry almost 7 years ago

  • Status changed from New to Assigned
  • Assignee set to Frédéric Barthéléry

#2 Updated by Nikita Kozlov almost 7 years ago

If someone wants to try it, I have applied the patch to revision 737.

#3 Updated by Eugene Crosser over 6 years ago

With the patch, the app crashes when it attempts to reconnect, every time. It is very easy to reproduce: turn off the connectivity option currently in use, and if/when another connectivity option is available, in a few (10?) seconds I get a message "The application Beem (process com.beem.project.beem) has stopped unexpectedly. Please try again". I am attaching the relevant excerpt from traces.txt.

#4 Updated by Eugene Crosser over 6 years ago

I looked closer into the traces and realized that there is no information there that is present in the log, so I am attaching the log as well.

#5 Updated by Frédéric Barthéléry over 6 years ago

Hi Eugene,
It seems your crash is due to another bug. Could you disable the Auto-away options and try again ?

#6 Updated by Eugene Crosser over 6 years ago

Frédéric Barthéléry wrote:

Hi Eugene,
It seems your crash is due to another bug. Could you disable the Auto-away options and try again ?

It still crashes with auto-away turned off; I will produce a new log and trace when I am near the right workstation.

#7 Updated by Eugene Crosser over 6 years ago

More verbose log of the failing reconnect; auto-away off.

#8 Updated by Eugene Crosser over 6 years ago

I think that I have a case with slightly different symptoms: there is no message about "stopped unexpectedly", but the app does not work after changing connectivity, and when started it shows "connecting" animation indefinitely. There are "interesting" messages in the log attached.

#9 Updated by Jerome M. over 6 years ago

I applied the patch to rev.763 and I did not notice any crash. For the record, I trigger the disconnection by switching from 3G to 2G. Here is my build if someone wants to play with it; I will keep testing anyway.
Perhaps we could commit the patch in order to get more feedback?

#10 Updated by Eugene Crosser over 6 years ago

For some reason, the .apk from comment #9 starts installing, and then says "Application not installed".

My self-built version from a week ago crashed only once or twice. But it never reconnects, and I end up with an application that is running and "thinks" that it is connected but in fact it is not.

I will try to collect and upload more debug information these days.

#11 Updated by Jerome M. over 6 years ago

Eugene, I think you have to uninstall your version first (using adb uninstall, for example) and then install the new package, because of the signature mismatch.

#12 Updated by Eugene Crosser over 6 years ago

Installed .apk from comment #9.

Turn off wifi so the phone switches to mobile data: beem shows that it is online, and the peer sees the user online, but no messages pass in either direction. Stays this way 20 minutes.

Turn Wifi back on. A bunch of messages that were typed on the phone earlier arrive at the peer; no messages typed by the peer arrive at the phone.

In 10-20 seconds, a couple I/O error messages on the phone, and then the app crashes with "stopped unexpectedly" message.

#13 Updated by Jerome M. over 6 years ago

Please retest with this apk (uninstall the previous version first). Could you include the logcat and give details about your device and your Android version?
I managed to reproduce the bug once but now it runs flawlessly...

#14 Updated by Frédéric Barthéléry over 6 years ago

Hi,
I am working on a better implementation of the reconnection.
I will probably give you a patch and apk later today.

#15 Updated by Eugene Crosser over 6 years ago

jer mar wrote:

For the record, I trigger the disconnection by switching from 3G to 2G.

Just got an idea. It may be that my and your symptoms are different because when you switch from 3g to 2g, the network gives you the same IP address. In my case, address of the device changes. When the IP address stays the same, the same TCP session stays functional after reconnection.

BTW I tried to reproduce the problem under emulator but found that (1) it does not provide emulated WiFi, and (2) even when I turn off "mobile network" it does not break IP connectivity to the virtual environment, and beem stays connected and functional.

Maybe the latter is the root of the problem. Maybe beem should shut down the TCP connection when it receives an android disconnection notification, no matter what?

#16 Updated by Jerome M. over 6 years ago

I don't think so; I had it working correctly when switching from wifi to mobile network. I also ran into beem crashes coming from unhandled exceptions in the reconnection manager.
I had a look at the CM code and I think that relying on it is not a good solution, as it is not aware of the cause of the disconnection or of the availability of a means of connectivity (switching to airplane mode results in perpetual reconnection attempts); the reconnection process should be handled by the beem service itself.
Frédéric is refactoring the implementation, so it should result in a more robust reconnection process.
Wait and see :)

#17 Updated by Frédéric Barthéléry over 6 years ago

Jerome M. wrote:

I don't think so; I had it working correctly when switching from wifi to mobile network. I also ran into beem crashes coming from unhandled exceptions in the reconnection manager.
I had a look at the CM code and I think that relying on it is not a good solution, as it is not aware of the cause of the disconnection or of the availability of a means of connectivity (switching to airplane mode results in perpetual reconnection attempts); the reconnection process should be handled by the beem service itself.
Frédéric is refactoring the implementation, so it should result in a more robust reconnection process.
Wait and see :)

It is exactly my conclusion :)

#18 Updated by Frédéric Barthéléry over 6 years ago

As promised :)
This should be much better.

#19 Updated by Eugene Crosser over 6 years ago

Frédéric Barthéléry wrote:

This should be much better.

It is indeed. It does reconnect me now. I will be running it full-time and report if I see troubles.

BTW,
HW: HTC Desire
SW: 1.21.405.2 rooted by modaco (because I need WPA/LEAP and custom CA certs)

#20 Updated by Eugene Crosser over 6 years ago

After a few days' run of the apk from comment #18:

- It successfully reconnects more often than not.
- It still doesn't keep me connected even for one whole day.

I've seen "terminated unexpectedly" message, I've found the app not running after a while, I've found it presumably active but with contact list empty and seen "offline" from outside.

I think that at least once I saw "terminated unexpectedly" when I was in the part of the building where both wifi and mobile connectivity are intermittent.

I have no logs because it always happened when I was out of the workstation; I will try to reproduce it when connected to adb, and if I succeed I'll post the logs.

#21 Updated by Eugene Crosser over 6 years ago

I had two scenarios with the package from comment #18:

For the first, I have no log. It looked like this: at some moment, I got this message: "Sorry! Activity Beem-Chat (in application Beem) is not responding" with the choices "Force Close" or "Wait". I selected "Force Close"; after that the window with the contact list had no entries, but there was a "Disconnect" button in the menu.

The second scenario was very simple, and I have a log for it. I freshly installed the package from comment #18, created an account (""), and got online with wifi and mobile available. Things were all right; I exchanged messages with the contact on a desktop (""). Then I turned off wifi in the settings. From the desktop's point of view, the account did not go offline even for a second, and Beem on android did not show signs of reconnecting. The messages typed on either side did not reach the peer. This continued for no less than 10 minutes. I have attached the log for this case.

#22 Updated by Frédéric Barthéléry over 6 years ago

Eugene Crosser wrote:

I had two scenarios with the package from comment #18:

For the first, I have no log. It looked like this: at some moment, I got this message: "Sorry! Activity Beem-Chat (in application Beem) is not responding" with the choices "Force Close" or "Wait". I selected "Force Close"; after that the window with the contact list had no entries, but there was a "Disconnect" button in the menu.

I think it is an unrelated issue. Anyway, we have to find a good way to deal with ANR issues like this one.

Eugene Crosser wrote:

The second scenario was very simple, and I have a log for it. I freshly installed the package from comment #18, created an account (""), and got online with wifi and mobile available. Things were all right; I exchanged messages with the contact on a desktop (""). Then I turned off wifi in the settings. From the desktop's point of view, the account did not go offline even for a second, and Beem on android did not show signs of reconnecting. The messages typed on either side did not reach the peer. This continued for no less than 10 minutes. I have attached the log for this case.

I don't know why, but asmack did not detect the disconnection. There is no "connectionClosedOnError", which is what triggers the reconnection process.

#23 Updated by Eugene Crosser over 6 years ago

OK, I have a log for another failure scenario.
I started in mobile data mode, then turned on wifi and beem switched successfully; then I turned wifi off, and shortly got a "The application Beem ... has stopped unexpectedly. Please try again" window. Relevant log attached.

#24 Updated by Eugene Crosser over 6 years ago

From several days of running the app and looking at what is going on, it seems that the most frequent problem is that the app does not notice that the current connection is no longer active. The app still shows "Active" status but of course nothing is transferred, and the server eventually (after a long time, hours maybe) realizes that the user is not connected.

I thought that maybe it would make sense to implement some sort of heartbeat check, and if the app cannot get a response from the server for e.g. 10 minutes, initiate reconnection? Just a thought...

No logs because it was all on the go. Incidentally, do you know if it is possible to write the log onto the SD card for later retrieval? That would help debug such cases as this one.

#25 Updated by Nikita Kozlov over 6 years ago

Eugene Crosser wrote:

From several days of running the app and looking at what is going on, it seems that the most frequent problem is that the app does not notice that the current connection is no longer active. The app still shows "Active" status but of course nothing is transferred, and the server eventually (after a long time, hours maybe) realizes that the user is not connected.

Maybe a solution could be forcing the wifi policy to "AlwaysOn" when the phone goes into idle state? Maybe as an option in settings?

I thought that maybe it would make sense to implement some sort of heartbeat check, and if the app cannot get a response from the server for e.g. 10 minutes, initiate reconnection? Just a thought...

I think for TCP connections the best method is to use the standard TCP keep alive.

No logs because it was all on the go. Incidentally, do you know if it is possible to write the log onto the SD card for later retrieval? That would help debug such cases as this one.

If on your phone you have a terminal emulator, you can use : logcat > /sdcard/logfile.txt

cf : http://wiki.cyanogenmod.com/index.php?title=Logcat

#26 Updated by Eugene Crosser over 6 years ago

Nikita Kozlov wrote:

From several days of running the app and looking at what is going on, it seems that the most frequent problem is that the app does not notice that the current connection is no longer active.

Maybe a solution could be forcing the wifi policy to "AlwaysOn" when the phone goes into idle state? Maybe as an option in settings?

I have WiFi "always on" because I also run a SIP client, but it is obviously not a "solution" because it does not help when you move between areas with WiFi coverage, mobile coverage and no coverage. Which is exactly when the problems arise.

I thought that maybe it would make sense to implement some sort of heartbeat check, and if the app cannot get a response from the server for e.g. 10 minutes, initiate reconnection? Just a thought...

I think for TCP connections the best method is to use the standard TCP keep alive.

It does not sound like a good option to me (from RFC1122):

         4.2.3.6  TCP Keep-Alives

            Implementors MAY include "keep-alives" in their TCP
            implementations, although this practice is not universally
            accepted.  If keep-alives are included, the application MUST
            be able to turn them on or off for each TCP connection, and
            they MUST default to off.

            Keep-alive packets MUST only be sent when no data or
            acknowledgement packets have been received for the
            connection within an interval.  This interval MUST be
            configurable and MUST default to _no less than two hours_.

(my emphasis)
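
For reference, a small Python sketch of what the excerpt implies in practice: keep-alive stays off unless the application enables it on each socket, and the idle interval is a separate, platform-specific setting. `enable_keepalive` is a hypothetical helper, and `TCP_KEEPIDLE` is only applied where the platform exposes it (Linux does). Whether aSmack gives access to the underlying socket is not established in this thread.

```python
import socket


def enable_keepalive(sock, idle_seconds=None):
    """Turn on TCP keep-alive for one socket; optionally shorten the idle time."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # TCP_KEEPIDLE is platform specific; skip it where it does not exist.
    if idle_seconds is not None and hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_seconds)


sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
enable_keepalive(sock, idle_seconds=600)  # 600 s matches the 10 minutes mentioned above
keepalive_on = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
sock.close()
```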

If on your phone you have a terminal emulator, you can use : logcat > /sdcard/logfile.txt

Thanks, I'll try that and see if I can collect any useful information. (Although I suspect that it may be killed by the system when run in the background.)

While I am here, I have another typical failure scenario: there is a service icon in the top bar but when I launch the app it starts "Connecting". And never gets authenticated. But if I force stop the running app and then start it, it connects all right.

#27 Updated by Eugene Crosser over 6 years ago

I wrote:

While I am here, I have another typical failure scenario: there is a service icon in the top bar but when I launch the app it starts "Connecting". And never gets authenticated. But if I force stop the running app and then start it, it connects all right.

I was exactly in this situation now, so I connected the device, ran adb logcat, and did the following:
  1. Pulled down status display, select Beem Status, press "Open Contact list". The (only) contact was shown as "Disconnected".
  2. Launched Beem as an application. It started "Connecting", then "Authenticating", then reported authentication error. Returned via "Back" button.
  3. Pulled down status display, select Beem Status, press "Open Contact List", Menu -> Disconnect.
  4. In a short time (10 seconds?) Beem successfully autoconnected.

Killed adb.

#28 Updated by Nikita Kozlov over 6 years ago

Eugene Crosser wrote:

Nikita Kozlov wrote:

From several days of running the app and looking at what is going on, it seems that the most frequent problem is that the app does not notice that the current connection is no longer active.

Maybe a solution could be forcing the wifi policy to "AlwaysOn" when the phone goes into idle state? Maybe as an option in settings?

I have WiFi "always on" because I also run a SIP client, but it is obviously not a "solution" because it does not help when you move between areas with WiFi coverage, mobile coverage and no coverage. Which is exactly when the problems arise.

Yep, you are right, it won't help in that case. I was thinking about the case when the phone goes into idle state and shuts down the wifi connection, which happens to me more often.

I thought that maybe it would make sense to implement some sort of heartbeat check, and if the app cannot get a response from the server for e.g. 10 minutes, initiate reconnection? Just a thought...

I think for TCP connections the best method is to use the standard TCP keep alive.

It does not sound like a good option to me (from RFC1122):

[...]
(my emphasis)

I agree that implementing http://xmpp.org/extensions/xep-0199.html could be a better idea.
But just a comment about RFC 1122: there were something like 100,000 computers in 1989 (no NAT, no private networks, classful IPs ...). Let's agree that there are some better sources for today's protocol behavior ;-).

If on your phone you have a terminal emulator, you can use : logcat > /sdcard/logfile.txt

Thanks, I'll try that and see if I can collect any useful information. (Although I suspect that it may be killed by the system when run in the background.)

While I am here, I have another typical failure scenario: there is a service icon in the top bar but when I launch the app it starts "Connecting". And never gets authenticated. But if I force stop the running app and then start it, it connects all right.

In that case, had you switched from a wifi connection to a 3g connection (or vice versa)? Maybe it's because Smack is trying to use the wrong network interface or socket for its reconnection?

#29 Updated by Eugene Crosser over 6 years ago

I used the approach that Nikita suggested, ran logcat locally in the device, in the background. Attached log covers the full lifecycle of the failure scenario: I started logcat when Beem was working all right, then I moved in the city, with connectivity changing, disappearing and reappearing multiple times, and then I found the situation described in comment #27. Then I selected "Disconnect" from Beem menu, and it reconnected (successfully). Then I killed logcat, and copied the log file, which is attached.

I hope that it helps: it's apparently the most comprehensive coverage of the failure that one can get.

#30 Updated by Nolan Darilek over 6 years ago

Silly question perhaps, but can anything be learned from what yaxim does? When I ran it, it generally seemed to stay connected reliably. I thought that it too uses asmack, but I'm not sure.

Then again, yaxim has its own bugs that beem doesn't seem to--namely, it produces half a dozen notifications for each message. So maybe there are some differences of which I'm not aware.

#31 Updated by Eugene Crosser over 6 years ago

Nolan Darilek wrote:

Silly question perhaps, but can anything be learned from what yaxim does? When I ran it, it generally seemed to stay connected reliably. I thought that it too uses asmack, but I'm not sure.

Not for me. I tried yaxim, and in a while (after a few changes of connectivity) I find myself with "Connection closed" displayed in the app and no attempts to reconnect.

#32 Updated by Tony Shark over 6 years ago

I'm not sure how far you've gotten on this problem, but keeping Beem connected and having it automatically reconnect is actually not that difficult at all. I think the real problem is not losing messages when you're disconnected. Android was built with the mobile network model in mind. This being said it expects to have times where the connection is gone. When the connection is gone any messages sent are disregarded.

If smack raises a "connection closed on error" event AFTER you've already successfully connected once, then just reconnect. When reconnecting, make sure the phone has network connectivity; if it doesn't, reconnect when connectivity is reinstated. Keeping the connection alive is as simple as an XEP-0199 ping: monitor incoming data from the server and, after X time without any, send a ping. If the ping went through, the incoming data monitor will be updated; if it didn't, ping again, and after another X time conclude that the connection is dead. There is more, but you get the overall idea.

Does anyone have any idea how we can avoid losing messages during the time when you lose service? Even if it's just for 30 seconds and the connection comes back and you never get disconnected, that message is lost. You can't really verify that the message reached the server either, as XMPP doesn't have any kind of transactions, unlike IMPS.

-Tony
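
The ping scheme from this comment could be sketched roughly as follows, in Python for illustration only. `PingWatchdog` and its methods are hypothetical names. Times are plain seconds passed in by the caller, so the decision logic can be followed without a real XMPP connection. Allowing two unanswered pings before giving up is an assumption taken from the "ping again" step.

```python
class PingWatchdog:
    """Decides when to ping and when to declare the connection dead.

    Hypothetical helper, not part of Smack or Beem.
    """

    def __init__(self, idle_timeout, max_unanswered_pings=2):
        self.idle_timeout = idle_timeout              # the "X time" from the comment
        self.max_unanswered_pings = max_unanswered_pings
        self.last_incoming = 0.0
        self.last_ping = None
        self.unanswered = 0

    def on_incoming_data(self, now):
        # Any data from the server, including a ping reply, proves liveness.
        self.last_incoming = now
        self.last_ping = None
        self.unanswered = 0

    def tick(self, now):
        """Return 'ok', 'ping' (caller should send a ping) or 'dead'."""
        reference = self.last_incoming if self.last_ping is None else self.last_ping
        if now - reference < self.idle_timeout:
            return "ok"
        if self.unanswered >= self.max_unanswered_pings:
            return "dead"
        self.unanswered += 1
        self.last_ping = now
        return "ping"
```

A caller would invoke `tick` periodically, send an XEP-0199 ping when it returns 'ping', and start the reconnection process when it returns 'dead'.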

#33 Updated by Günther Starnberger about 6 years ago

Detecting lost messages should be possible if both parties support XEP-0198 (Stream Management) which includes stanza acknowledgements.
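
A rough illustration of the stanza acknowledgement idea, in Python and heavily simplified: the sender keeps every stanza until the peer reports, through a running count of handled stanzas, that it has processed it. `OutgoingAckTracker` is a hypothetical name, and the sketch ignores the counter wrap-around and stream resumption that XEP-0198 also defines.

```python
from collections import deque


class OutgoingAckTracker:
    """Keeps sent stanzas until the peer acknowledges having handled them."""

    def __init__(self):
        self.acked_count = 0     # last handled-count ('h') received from the peer
        self.unacked = deque()   # stanzas sent but not yet acknowledged

    def on_sent(self, stanza):
        self.unacked.append(stanza)

    def on_ack(self, h):
        """Process an ack whose 'h' is the total number of stanzas handled."""
        newly_acked = h - self.acked_count
        if newly_acked < 0 or newly_acked > len(self.unacked):
            raise ValueError("ack count does not match what was sent")
        for _ in range(newly_acked):
            self.unacked.popleft()
        self.acked_count = h

    def pending(self):
        """Stanzas that may have been lost and are candidates for resending."""
        return list(self.unacked)
```

After a connection drop, whatever `pending()` returns is what was never confirmed by the server, which is the set of messages the thread is worried about losing.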

#34 Updated by Olaf the Lost Viking over 4 years ago

I am using the up-to-date version from the android market (Google Play Store) [0.1.7] and am having the "beem goes offline after switching from wifi to 3g" problems. Since this feature request hasn't been updated in a long time, I wanted to ask whether the version I am using already has the reconnection features but is somehow buggy, or whether I need to get another version from somewhere to be able to stay online while moving. Thanks.
