Mattermost stops working with 502 Bad Gateway error - restoring from backups does not help

Summary

Since Tuesday of this week, Mattermost has stopped working.
No changes were made to the server.
Restoring earlier versions of the server from backups taken before the fault appeared results in the same 502 Bad Gateway nginx/1.10.3 (Ubuntu) error.
System is less than one month old.
Using PostgreSQL and Let’s Encrypt.
No changes to DNS servers.

Steps to reproduce

Restoring the server from backup produces the same result.

Expected behavior

Mattermost to work

Observed behavior

502 Bad Gateway nginx/1.10.3 (Ubuntu) error.

Check the Mattermost log file and see what it says.
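If the log is long, or your pager truncates long lines, a small script can pull out the relevant entries in full. The following is only a rough Python sketch: it assumes the log lines look like `[YYYY/MM/DD HH:MM:SS UTC] [LEVEL] message`, as in the excerpts in this thread, and the names `parse_line` and `entries_since` are made up for illustration.

```python
import re
from typing import Iterable, Iterator, NamedTuple, Optional

# Assumed line shape, based on the excerpts quoted in this thread:
# [2017/07/31 10:41:34 UTC] [EROR] message...
LINE_RE = re.compile(
    r"^\[(?P<ts>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) UTC\] "
    r"\[(?P<level>[A-Z]+)\] (?P<msg>.*)$"
)


class LogEntry(NamedTuple):
    timestamp: str
    level: str
    message: str


def parse_line(line: str) -> Optional[LogEntry]:
    """Return a LogEntry, or None if the line does not match the assumed shape."""
    match = LINE_RE.match(line.rstrip("\n"))
    if match is None:
        return None
    return LogEntry(match["ts"], match["level"], match["msg"])


def entries_since(
    lines: Iterable[str], day: str, level: Optional[str] = None
) -> Iterator[LogEntry]:
    """Yield entries dated `day` (YYYY/MM/DD) or later, optionally one level only."""
    for line in lines:
        entry = parse_line(line)
        if entry is None:
            continue
        # YYYY/MM/DD strings sort chronologically, so plain comparison works.
        if entry.timestamp[:10] < day:
            continue
        if level is not None and entry.level != level:
            continue
        yield entry


# Usage (the path is an assumption; substitute your own log location):
#
# with open("/opt/mattermost/logs/mattermost.log", encoding="utf-8") as log:
#     for entry in entries_since(log, "2017/08/01", level="EROR"):
#         print(entry.timestamp, entry.message)
```

If this prints nothing for the day the 502 errors started, that suggests requests are not reaching Mattermost at all, which points at the proxy or at the server failing to start.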

Hi, sorry for the slow response. I have restored the system to its state from two weeks ago to try to get the live server back. Here is the log file. As you can see, today's requests are not even reaching the Mattermost server. This image was working perfectly and no longer does.

[2017/07/31 10:41:34 UTC] [EROR] /api/v4/users/email/verify:verifyUserEmail code=400 rid=zosq8xcxs3dffg8mjz7ies9zge uid=moenhrrfdbgdbn756p4bw85har ip=83.216.64.54 Bad verify email link. [details: GetVerifyEmail$
[2017/07/31 11:23:23 UTC] [EROR] /api/v4/users/login:SqlUserStore.GetForLogin code=400 rid=jz69iregy3bfmrmf1obqnbhy8o uid= ip=78.144.195.251 We couldn’t find an existing account matching your credentials. This $
[2017/07/31 11:43:45 UTC] [EROR] /api/v4/users/login:Login code=401 rid=mrd58ejxjpgrzb3m393kf7ofpy uid= ip=78.144.195.251 Login failed because email address has not been verified [details: user_id=qh9yazzcc3dyt$
[2017/07/31 11:57:12 UTC] [EROR] /api/v4/users/login:Login code=401 rid=yfxz7knjnpnsfpq49wiqtwfxke uid= ip=78.144.195.251 Login failed because email address has not been verified [details: user_id=9xw8sz3ea3f1p$
[2017/07/31 12:22:05 UTC] [EROR] /api/v3/general/log_client:ServeHTTP code=501 rid=mqw75czz5fbouqxiw3nxri9yze uid= ip=83.69.117.189 API version 3 has been disabled on this server. Please use API version 4. See $
[2017/07/31 15:15:01 UTC] [EROR] /api/v4/users/login:SqlUserStore.GetForLogin code=400 rid=iq1iz7roupba9qjom9i8hp8y9e uid= ip=212.69.47.11 We couldn’t find an existing account matching your credentials. This te$
[2017/07/31 17:21:48 UTC] [EROR] /api/v4/users/login:checkUserPassword code=401 rid=ozudt8mj1pdybdstjoq84s1arc uid= ip=31.51.247.10 Login failed because of invalid password [details: user_id=sirsfrpc3ffwmxr13id$
[2017/07/31 18:02:09 UTC] [EROR] /api/v3/general/ping:ServeHTTP code=501 rid=3jysm7shbpbw9bkfo14cksxy4h uid= ip=92.40.249.99 API version 3 has been disabled on this server. Please use API version 4. See https:/$
[2017/08/01 11:57:35 UTC] [EROR] /api/v3/general/log_client:ServeHTTP code=501 rid=9qzw599mubb83n43gzp4z6g69h uid= ip=83.69.117.189 API version 3 has been disabled on this server. Please use API version 4. See $
[2017/08/02 00:30:09 UTC] [EROR] /api/v4/users/login:SqlUserStore.GetForLogin code=400 rid=58bmmod4d3r1tnaah5tq3ze9yc uid= ip=110.20.186.57 We couldn’t find an existing account matching your credentials. This t$
[2017/08/03 18:21:04 UTC] [EROR] /api/v4/users/login:SqlUserStore.GetForLogin code=400 rid=b88xwt7t4in6t8oyeiauaenroc uid= ip=46.171.132.5 We couldn’t find an existing account matching your credentials. This te$
[2017/08/04 16:25:48 UTC] [EROR] /api/v4/channels/members/me/view: code=401 rid=cmn7f8qiff839g58scn1nixr1r uid= ip=212.69.47.11 Invalid or expired session, please login again. [details: UserRequired]
[2017/08/04 16:25:48 UTC] [EROR] /api/v4/users/me/teams/m3tosawm83y7bnre1atyxmswiy/channels/members: code=401 rid=b5g5h5uib7b8p8g9wptiepnz1o uid= ip=212.69.47.11 Invalid or expired session, please login again. $
[2017/08/04 16:27:07 UTC] [EROR] /api/v4/users/me/teams/m3tosawm83y7bnre1atyxmswiy/channels: code=401 rid=xochtprgpfyetnmw7eeakf68oa uid= ip=212.69.47.11 Invalid or expired session, please login again. [details$
[2017/08/04 16:27:07 UTC] [EROR] /api/v4/channels/5uwysrggnbro7jt8k49giaunyo/posts: code=401 rid=rthgnepz5iyjpkcnd6ihq1915o uid= ip=212.69.47.11 Invalid or expired session, please login again. [details: UserReq$

It looks like something might be missing from your nginx configuration. Could you post it?

Also, please provide the versions of Mattermost and Ubuntu you’re running, thanks.

Thanks, below is the nginx config file for Mattermost. It is the only website enabled.

Nginx is version 1.10.3
Ubuntu 16.04.2
Mattermost 4.0.0

upstream backend {
   server xxx.xxx.xxx.xxx:8065;
}

proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=mattermost_cache:10m max_size=3g inactive=120m use_temp_path=off;

server {
   listen 80;
   server_name    my_domain.net;

   location ~ /api/v[0-9]+/(users/)?websocket$ {
       proxy_set_header Upgrade $http_upgrade;
       proxy_set_header Connection "upgrade";
       client_max_body_size 50M;
       proxy_set_header Host $http_host;
       proxy_set_header X-Real-IP $remote_addr;
       proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
       proxy_set_header X-Forwarded-Proto $scheme;
       proxy_set_header X-Frame-Options SAMEORIGIN;
       proxy_buffers 256 16k;
       proxy_buffer_size 16k;
       proxy_read_timeout 600s;
       proxy_pass http://backend;
   }

   location / {
       client_max_body_size 50M;
       proxy_set_header Connection "";
       proxy_set_header Host $http_host;
       proxy_set_header X-Real-IP $remote_addr;
       proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
       proxy_set_header X-Forwarded-Proto $scheme;
       proxy_set_header X-Frame-Options SAMEORIGIN;
       proxy_buffers 256 16k;
       proxy_buffer_size 16k;
       proxy_read_timeout 600s;
       proxy_cache mattermost_cache;
       proxy_cache_revalidate on;
       proxy_cache_min_uses 2;
       proxy_cache_use_stale timeout;
       proxy_cache_lock on;
       proxy_pass http://backend;
   }

   listen 443 ssl; # managed by Certbot
   ssl_certificate /etc/letsencrypt/live/my_domain.net/fullchain.pem; # managed by Certbot
   ssl_certificate_key /etc/letsencrypt/live/my_domain.net/privkey.pem; # managed by Certbot
   include /etc/letsencrypt/options-ssl-nginx.conf; # managed by Certbot
}

In your Mattermost config file, do you have:

"EnableAPIv3": true,

Or can you set it to true and see if that resolves some of the issues you’re having?

API v3 is still being requested by clients (see the 501 errors in your log), so that option should be enabled.
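To see what the setting is currently set to without scrolling through the whole file, something like the following Python sketch could help. It searches the parsed JSON for a key wherever it sits, so it makes no assumption about which section holds `EnableAPIv3`. The helper names `find_key` and `load_config` are invented here, and the config path in the usage comment is the one reported in the server's startup log.

```python
import json
from typing import Any, Iterator, Tuple


def find_key(node: Any, key: str, path: str = "") -> Iterator[Tuple[str, Any]]:
    """Yield (dotted_path, value) for every occurrence of `key` in nested JSON data."""
    if isinstance(node, dict):
        for name, value in node.items():
            here = f"{path}.{name}" if path else name
            if name == key:
                yield here, value
            # Keep descending in case the key also appears deeper down.
            yield from find_key(value, key, here)
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from find_key(value, key, f"{path}[{index}]")


def load_config(path: str) -> dict:
    """Parse a JSON config file into a dict."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


# Usage:
#
# config = load_config("/opt/mattermost/config/config.json")
# for where, value in find_key(config, "EnableAPIv3"):
#     print(where, "=", value)
```

If nothing is printed, the key is absent from the file and the server falls back to whatever its default is for that version.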


Your nginx config looks OK.
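Since nginx answers 502 when it cannot get a valid response from the upstream, it is also worth confirming that something is actually accepting connections on the address in your `upstream backend` block. Below is a minimal Python sketch of such a check. `port_open` is a made-up helper name, and the host in the usage comment is a placeholder for your real upstream address.

```python
import socket


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Usage (replace the placeholder host with the address from your upstream block):
#
# print(port_open("127.0.0.1", 8065))
```

A result of False here would mean the Mattermost process is not listening (or is listening on a different interface), in which case the nginx config is not the problem.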


Further looking at your logs, for example:

We couldn’t find an existing account matching your credentials. This t$

The message is cut off, but have you checked whether that ID exists as a user? It looks like a legitimate wrong-credentials error to me, though you can’t tell without looking into it further.

Bad verify email link. [details: GetVerifyEmail$

Again the error message is cut off, but it could be that the link has already been used, was simply mistyped, or has expired.

Well, that was interesting. I got a server response rather than a gateway error, and was offered a file to download called qmNcAxxo.

The contents of the file were attached as an image (image001.png).

The Mattermost log shows the following:

[2017/07/27 15:48:03 UTC] [INFO] Current version is 4.0.0 (4.0.1/Wed Jul 19 00:20:37 UTC 2017/a350f$
[2017/07/27 15:48:03 UTC] [INFO] Enterprise Enabled: true
[2017/07/27 15:48:03 UTC] [INFO] Current working directory is /opt/mattermost/bin
[2017/07/27 15:48:03 UTC] [INFO] Loaded config file from /opt/mattermost/config/config.json
[2017/07/27 15:48:03 UTC] [INFO] Server is initializing…
[2017/07/27 15:48:03 UTC] [INFO] Pinging SQL master database
[2017/07/27 15:48:03 UTC] [EROR] Failed to ping DB retrying in 10 seconds err=pq: password authenti$
[2017/07/27 15:48:13 UTC] [INFO] Pinging SQL master database
[2017/07/27 15:48:13 UTC] [EROR] Failed to ping DB retrying in 10 seconds err=pq: password authenti$

So we are narrowing things down. However, I am concerned that I might have the wrong binary, since it says "Enterprise Enabled: true" and I thought I had the Team Edition. Would this make a difference?

Thanks

Richard

Perhaps a good opportunity to upgrade to 4.1 as well :wink: https://about.mattermost.com/download/

As for it not being able to ping your database: is your database installed on the same machine?

Have you tried doing something like:

sudo systemctl restart postgresql
sudo systemctl restart mattermost

To ensure both are running?

Also check your Mattermost config file to make sure you’re using PostgreSQL and not MySQL. (The steps here are for PostgreSQL; if you are on MySQL, we would have to adapt them.)


This might also be relevant to your issue, although it’s a bit old: https://github.com/mattermost/platform/issues/1541

It would help if you could post your config.json so we can check whether anything is wrong in your database settings. For safety, please replace your password with a different one first, but keep the same kinds of characters. For example, if the real password is a!1dBV*@, change it to something like Db!ai@p*@ that uses the same special characters.
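One way to disguise the password while keeping its character types is sketched below in Python. It assumes the database connection string is a URL of the form `postgres://user:password@host:port/dbname?...`; if yours has a different shape the function returns it unchanged, so check the output before posting. `mask_password` and `mask_datasource` are hypothetical helper names.

```python
import re


def mask_password(password: str) -> str:
    """Replace letters and digits with fixed stand-ins; leave special characters as they are."""
    masked = []
    for ch in password:
        if ch.isdigit():
            masked.append("0")
        elif ch.isalpha():
            masked.append("A" if ch.isupper() else "a")
        else:
            masked.append(ch)
    return "".join(masked)


# scheme://user:PASSWORD@rest, where the LAST '@' separates the password from
# the host, so that passwords which themselves contain '@' are handled.
DSN_RE = re.compile(
    r"^(?P<head>[A-Za-z][A-Za-z0-9+.-]*://[^:/@]+:)(?P<pw>.*)(?P<tail>@[^@]*)$"
)


def mask_datasource(dsn: str) -> str:
    """Mask the password part of a URL-style connection string."""
    match = DSN_RE.match(dsn)
    if match is None:
        # Not URL-shaped: returned untouched, so inspect it before sharing.
        return dsn
    return match["head"] + mask_password(match["pw"]) + match["tail"]


# Usage:
#
# print(mask_datasource("postgres://mmuser:a!1dBV*@@127.0.0.1:5432/mattermost"))
```

The masked string still shows where special characters such as `@` or `*` sit in the password, which is the part that matters when diagnosing a connection string that fails to parse.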

If you’re using 127.0.0.1, make sure PostgreSQL actually accepts connections on it by trying to connect locally yourself. Sometimes it is configured to listen only on the Unix socket, in which case localhost will not work and you would have to change that setting.