[Solved] Infinite refresh loop in browser clients, with "Invalid or expired session" log error spam

@hmhealey: a different user also encountered this and managed to record similar log messages, but in addition:

  • document.cookie:
    "ajs_user_id=null; ajs_group_id=null; ajs_anonymous_id=%2200000000000000000000000000%22; MMUSERID=fkiabmcfeiyq3dgsug41h6zwdr"
  • the other four all printed null

Okay, so it’s not the login/logout code failing, but the user’s browser thinks it should be logged in (since the MMUSERID section exists in the cookie) while the server obviously doesn’t. The browser should detect the 401 and act as if it was logged out, but this code did change around 3.9, so maybe that broke. I’ll see if I can reproduce it locally

I haven’t identified an exact cause, but the problem appears to be that the browser isn’t clearing the user’s MMUSERID cookie when it detects that it should be logged out (due to the 401 Unauthorized error). Do the affected users have any unusual browser extensions or anything that could be causing that?

Normal Chrome browser AFAIK.

Could it be a mismatch of server URLs? We have aliases mattermost and mattermost.domain.something to the same server. Not sure if a response header and cookie might disagree on the server or URL or something.

i am also having that refresh-loop trouble mostly on mobile and browser, not the desktop apps.
was quiet when i first installed 4.0.0, but after i made a minor config.json change this “mess” started where no one was quite able to be on mobile or browser anymore :-/

fell back to a 3.10 backup and everything was fine, then went back to 4.0.1 and it suddenly worked.
now installed 4.0.2 and its broke again.

Preserved Browser Console Log:
https://privatebin.net/?96f716e80b259ca6#Cv9HYxvEM01j9edo5xBRX/nkCz77BZgM/CfeXfZouWA=

UPDATE, next Log:
1st Time Load of the Page in Browser was okay, saw normal channel content etc, then i gave in your 5 above mentioned commands, still no refreshes.
but then switched some channels and the refresh loop started again…
https://privatebin.net/?4c2f2ab2d10b824c#jr5k2eFF2tpgX0vV93yHqoVX9G57MK9mcg4zDhk6Y2U=

Also took a Server Debug log by ./platform:
https://privatebin.net/?e004313b1debc53e#SI6W7FmIe8Ad5efE9U/1KNAmgGILxlXE+HnKx/ZnSiI=

NEXT UPDATE:
When i clear the Browser Cache (Chromium) completely and login again no problems seem to appear.
Same with the Mobile App on Android, had to delete the cache completely.

But thats not convenient for most users, sometimes they dont even know how to do that.

:disappointed_relieved: ->>>>> Means my Chat is now half broken for most users :frowning:

@gubbins I don’t think multiple domains should cause a problem. We have a customer with a setup like that, and they haven’t reported this issue. Maybe keep an eye on what domain the users that have this problem are connected to?

I’m not sure what else there would be to check for. I’ve been attempting to reproduce this locally on 3.10 by logging myself out from another browser under all different circumstances, but it’s successfully detected that and logged me out on the client. The only other thing I can think is maybe update to 4.0.2 (or 4.0.3 once that’s out in a few days) since the part of the code responsible for handling this has changed slightly which may solve the problem.

@githubkoma Your issue appears to be different, but I’m seeing a few things in the client logs that you posted that may explain it.

  1. Was anyone changing any settings in the system console or config.json while this was happening? That triggers all users to reload to get the updated config, and you’ll see root.jsx:136 Detected version update refreshing the page logged by them when this happens

  2. I’m also seeing

    (unknown) WebSocket connection to 'wss://MYSERVER.XYZ/api/v4/websocket' failed: One or more reserved bits are on: reserved1 = 0, reserved2 = 1, reserved3 = 1
    

    which indicates the websocket is having issues. Are you running Nginx or another proxy? There’s a small change required to get that working in 4.0 (point 3 in the Important Notices section of this doc)

Dear hmhealey,
(1): not exactly in that moment, i had clients that were trying to log in hours later and had the problem, but yes i changed the config.json at some point in time, and (2): Apache! So perhaps something went wrong when i updated the config of apache,
I am giving it a try to change the apache config and cross my fingers it doesnt happen again when i do (1).

thanks so far!

OK @hmhealey, some more data points:

  • I upgraded our test server to 4.0.0 recently and coincidentally I hit the refresh issue on my own account! So, I know it can still happen on 4.0.0.
  • Auth is delegated to gitlab in case that’s important. In some cases users have a different gitlab username to mattermost username (they are matched on email address as I understand it), but AFAIK all users affected by this problem have matching usernames in mattermost and gitlab.
  • I redirect mattermost/ to mattermost/<teamname>/ (with a 301) for convenience since we only have one team. Not sure if the 301 may matter to the browser, although it looks fine from the console logs - it redirects once, then repeatedly refreshes on mattermost/<teamname>/.
  • Our nginx configuration is not the same as the latest recommended. Mine is based on what was provided in docs for an early version (I did not realise it had changed so much!). I think the substantive differences are:
    1. I am setting Connection to "upgrade" on all URLs, but the latest config in the docs does this only for websocket URLs (sets it to "" on all others)
    2. I don’t have the proxy_cache, proxy_buffer or proxy_read_timeout settings

One affected user here is still able to reproduce the problem, so if there is any more instrumentation/logging we can throw at it, let me know and they can try it out.

One other thought @hmhealey: I believe the sequence of events is:

  • user closes browser tab
  • session expires because of max session length
  • user opens browser tab and tries to navigate to mattermost URL

I am not sure about this because I haven’t checked back to see whether the affected session was due to expire at that time (but I could look in backups). This would fit the statistics of when we see it occurring I think. (Nobody ever logs out manually here, ever). This sequence may be different to the sequence you described from your testing where you logged out from a separate browser.

@gubbins Maybe try removing the redirect from mattermost to mattermost/<teamname> to see if that’s causing any problems? That shouldn’t be necessary since we changed Mattermost to do that automatically at some point

I’ve had the same issue happen on 3.10, It seems to possibly related to using Apache as the reverse proxy. Adding the following options in the Vhost seems to resolve the issue:

    #May be needed on chromium based browsers to prevent infinite page refresh
	RequestHeader unset If-Modified-Since
	RequestHeader unset If-None-Match

I remember reading about this in another thread here on the forum. I dont think its the most elegant solution, but it fixed the problem for my production instance with ~100 users.

@ionbasa Are you seeing any of the Invalid or expired session, please login again. [details: UserRequired] error messages logged on the server? I think your issue may be a different one related to etags as opposed to expired sessions

Edit: I believe your problem may be fixed in 4.1. The reason it’s refreshing is because it’s detecting that the version number changed because it’s receiving a cached result with the old version number. Removing those headers stops the caching, so that would fix that, but we’ve improved the version detection in 4.1 to account for that better

@hmhealey I’m not sure. My issue was very similar to what everyone else was describing. On 3.10 if I would logout, every so often once MM was at the login screen it would indefinitely refresh the page until I manually stopped it in my browser. I also noticed this behavior happening if the host happened to restart or the reverse proxy would restart on the host machine.

As to version 4.1: I’ve updated to 4.0 (4.0.2) this weekend when very little of my users were active. If 4.1 fixes this issue, I’ll take a look and hopefully update.

As to having “Invalid or Expired Sesions”, it seems to only show up in logs apon restarting Mattermost, this is what I see:

[2017/08/07 01:15:12 PDT] [EROR] /api/v4/users/me: code=401 rid=e65oasembbbjjp6nshwaqzysho uid= ip=66.74.176.217 Invalid or expired session, please login again. [details: UserRequired]
[2017/08/07 01:15:12 PDT] [EROR] /api/v4/users/me/preferences: code=401 rid=hrq31xkbcigq9ethq3r74tgcso uid= ip=66.74.176.217 Invalid or expired session, please login again. [details: UserRequired]
[2017/08/07 01:15:12 PDT] [EROR] /api/v4/users/me/teams: code=401 rid=yinyfj5q8bg65cpdf6qg9pdjfh uid= ip=66.74.176.217 Invalid or expired session, please login again. [details: UserRequired]
[2017/08/07 01:15:12 PDT] [EROR] /api/v4/users/me/teams/members: code=401 rid=hbwypxdqhjrbfef1juzyxyqjhr uid= ip=66.74.176.217 Invalid or expired session, please login again. [details: UserRequired]
[2017/08/07 01:15:12 PDT] [EROR] /api/v4/emoji: code=401 rid=phfntet5k3dn7c5zu9ghnncngc uid= ip=66.74.176.217 Invalid or expired session, please login again. [details: UserRequired]
[2017/08/07 01:15:12 PDT] [EROR] /api/v4/users/me/teams/unread: code=401 rid=7yatngahbpf9tx4i9zsjtuqkiw uid= ip=66.74.176.217 Invalid or expired session, please login again. [details: UserRequired]
[2017/08/07 13:02:29 PDT] [EROR] /api/v4/channels/posts: code=401 rid=kyp5mcge8ifpzjec8qkhiguafy uid= ip=70.213.20.241 Invalid or expired session, please login again. [details: UserRequired]

It seems that it only showed up once I do a restart of the Matermost service. This was also just after updating from 3.10 to 4.0

@ionbasa Oh that’s really interesting. I thought the two issues were unrelated. Hopefully 4.1 will solve the problem then when it comes out next week. If not, we have something else to go on then since I didn’t know about the problem being related to etags before

OK so we have some hope that 4.1 will fix the issue for me also? That sounds promising.

@hmhealey having upgraded to 4.1.0, my user with the stuck refresh loop reports that it is still stuck in infinite refresh, but now he does see a gitlab page pop up during the loop so it looks a bit different.

Hi @gubbins,

We’re checking with the devs on our team whether they have any further help for your issue,

Just checking, what are your session lengths set to in System Console > Security > Sessions?

One thing he could try is logging out of all his active sessions, clearing browser history/cache and then logging in again.

Let us know if this helps?

Session length is set to 90 days.

Yes, as I think I said above, clearing browser cache and logging in resolves the issue. Generally users (especially the less technical ones) don’t know about that workaround when they encounter this issue - so obviously we’d like to stop it happening in the first place.

Ah sorry, I missed that comment @gubbins,

Let’s see what suggestions the devs come back with :slight_smile:

I have found an even easier way to produce the issue:

  • open chrome Incognito tab
  • if it’s not already looping, then: log in, then manually log out

This always triggers the loop for me. (Doing this from a regular, non-incognito tab does not trigger it).

@hmhealey or any other devs, latest info:

  • please note the behaviour did change on upgrade to 4.1 as commented above: we now see the page with the big gitlab button as one of the pages it cycles through
  • I removed the redirect from mattermost/ to mattermost/<team-name>/ as suggested - no difference
  • I strongly suspect we would not have this issue if we were not relying on gitlab auth - so you probably need the gitlab auth setup to reproduce it