Mattermost, Inc.

[Solved] System Console: configuration times out while saving

Hey all,

I have set up a fresh install of Mattermost 3.0.1 using some slightly modified Docker containers in Kubernetes. Websockets and the general UI are working, but XHR requests for saving the configuration time out.

There’s a container running with Nginx as a proxy with the following configuration:

server {
   listen 443 ssl;
   server_name mychat.example.com;

   ssl_certificate /etc/nginx/certificates/mychat.example.com/fullchain.pem;
   ssl_certificate_key /etc/nginx/certificates/mychat.example.com/privatekey.pem;

   ssl_protocols TLSv1 TLSv1.1 TLSv1.2;
   ssl_ciphers 'ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA256:ECDHE-ECDSA-AES128-SHA:ECDHE-RSA-AES256-SHA384:ECDHE-RSA-AES128-SHA:ECDHE-ECDSA-AES256-SHA384:ECDHE-ECDSA-AES256-SHA:ECDHE-RSA-AES256-SHA:DHE-RSA-AES128-SHA256:DHE-RSA-AES128-SHA:DHE-RSA-AES256-SHA256:DHE-RSA-AES256-SHA:ECDHE-ECDSA-DES-CBC3-SHA:ECDHE-RSA-DES-CBC3-SHA:EDH-RSA-DES-CBC3-SHA:AES128-GCM-SHA256:AES256-GCM-SHA384:AES128-SHA256:AES256-SHA256:AES128-SHA:AES256-SHA:DES-CBC3-SHA:!DSS';
   ssl_prefer_server_ciphers on;
   ssl_ecdh_curve secp384r1;
   ssl_stapling on;
   ssl_stapling_verify on;

   ssl_dhparam /etc/nginx/certificates/dhparam.pem;

   location / {
       gzip off;
       proxy_set_header X-Forwarded-Ssl on;
       client_max_body_size 50M;
       proxy_set_header Upgrade $http_upgrade;
       proxy_set_header Connection "upgrade";
       proxy_set_header Host $http_host;
       proxy_set_header X-Real-IP $remote_addr;
       proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
       proxy_set_header X-Forwarded-Proto $scheme;
       proxy_set_header X-Frame-Options SAMEORIGIN;
       proxy_pass http://mattermost;
   }

}

There’s an upstream mattermost which points to the IP address and port where Mattermost is running.

There are no particular errors on the server side. For example, this is the log when I try to set the SMTP config:

[2016/05/17 20:16:17 UTC] [DEBG] /admin_console
[2016/05/17 20:16:19 UTC] [DEBG] /api/v3/users/initial_load
[2016/05/17 20:16:19 UTC] [DEBG] /api/v3/admin/config
[2016/05/17 20:16:20 UTC] [DEBG] /api/v3/users/websocket
[2016/05/17 20:16:20 UTC] [DEBG] /api/v3/teams/all
[2016/05/17 20:16:20 UTC] [DEBG] /api/v3/admin/analytics/standard
[2016/05/17 20:16:20 UTC] [DEBG] /api/v3/admin/analytics/user_counts_with_posts_day
[2016/05/17 20:16:20 UTC] [DEBG] /api/v3/admin/analytics/post_counts_day
[2016/05/17 20:17:16 UTC] [DEBG] /api/v3/admin/save_config
[2016/05/17 20:17:51 UTC] [DEBG] /admin_console
[2016/05/17 20:17:53 UTC] [DEBG] /api/v3/users/initial_load
[2016/05/17 20:17:54 UTC] [DEBG] /api/v3/admin/config
[2016/05/17 20:17:54 UTC] [DEBG] /api/v3/users/websocket
[2016/05/17 20:17:54 UTC] [DEBG] /api/v3/teams/all
[2016/05/17 20:17:54 UTC] [DEBG] /api/v3/admin/analytics/standard
[2016/05/17 20:17:54 UTC] [DEBG] /api/v3/admin/analytics/post_counts_day
[2016/05/17 20:17:54 UTC] [DEBG] /api/v3/admin/analytics/user_counts_with_posts_day
[2016/05/17 20:18:20 UTC] [DEBG] /api/v3/admin/save_config
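One way to put a number on the stall is to measure how long the server log stays quiet after each save_config request. The following is only a minimal sketch: it assumes the log format shown above, and the helper names (`parse`, `gaps_after`) are made up for illustration. These debug lines appear to record only the arrival of a request, so the gap is just an upper bound on how long the save may have been pending, not a measured response time.

```python
import re
from datetime import datetime

# Matches lines such as:
# [2016/05/17 20:17:16 UTC] [DEBG] /api/v3/admin/save_config
LINE_RE = re.compile(
    r"^\[(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) UTC\] \[(\w+)\] (.*)$"
)

def parse(lines):
    """Return (timestamp, level, message) tuples for lines in the format above."""
    entries = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            ts = datetime.strptime(match.group(1), "%Y/%m/%d %H:%M:%S")
            entries.append((ts, match.group(2), match.group(3)))
    return entries

def gaps_after(entries, path_suffix="/save_config"):
    """For each matching request, return (timestamp, seconds until the next log entry)."""
    result = []
    for i, (ts, _level, message) in enumerate(entries[:-1]):
        if message.endswith(path_suffix):
            result.append((ts, (entries[i + 1][0] - ts).total_seconds()))
    return result

# Three lines copied from the log above.
SAMPLE = [
    "[2016/05/17 20:16:20 UTC] [DEBG] /api/v3/admin/analytics/post_counts_day",
    "[2016/05/17 20:17:16 UTC] [DEBG] /api/v3/admin/save_config",
    "[2016/05/17 20:17:51 UTC] [DEBG] /admin_console",
]

if __name__ == "__main__":
    for ts, seconds in gaps_after(parse(SAMPLE)):
        print(ts, seconds)  # gap of 35.0 seconds for the sample lines
```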

The config.json is writable by everybody, and the actual error I see in the UI is a timeout (We received an unexpected status code from the server. (504)).

Any idea where to look for solving this issue?

Update: the config.json file is located on a mounted persistent volume in /data/mattermost/ and is generated by a script before the very first run of Mattermost. When I change the configuration in the Mattermost UI and hit save, the changes are applied eventually (it can take a while), but something’s odd there. When I restart Mattermost (for example, by stopping and starting the container), the app won’t come up completely:

[07:18:25 UTC 2016/05/18] [INFO] (github.com/mattermost/platform/utils.GetTranslationsBySystemLocale:56) Loaded system translations for 'en' from '/opt/mattermost/i18n/en.json'

After that line, nothing more happens and connections to :8065 are refused.

When I delete the config.json and start Mattermost again, the boot succeeds.
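Since startup only succeeds once the file is removed, a quick sanity check of the file itself may be worth running before the restart. The sketch below is a guess at a useful diagnostic, not something confirmed as the cause: it only tests whether the file exists, is writable by the current user, is non-empty, and parses as JSON. The function name `check_config` is hypothetical, and the path in the usage line assumes the file is named config.json inside the /data/mattermost/ volume mentioned above.

```python
import json
import os

def check_config(path):
    """Return a list of human-readable problems found with the config file."""
    if not os.path.isfile(path):
        return ["missing: " + path]
    problems = []
    if not os.access(path, os.W_OK):
        problems.append("not writable by current user")
    if os.path.getsize(path) == 0:
        problems.append("file is empty")
        return problems
    try:
        with open(path, encoding="utf-8") as handle:
            json.load(handle)
    except ValueError as exc:
        # json.JSONDecodeError and UnicodeDecodeError are both ValueError subclasses.
        problems.append("invalid JSON: %s" % exc)
    return problems

if __name__ == "__main__":
    # Assumed location, based on the volume path described above.
    for problem in check_config("/data/mattermost/config.json"):
        print(problem)
```

An empty list means the file passed these basic checks, which would point away from a truncated or half-written file and toward something else, such as the database connection.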

Hi robert,

Could you try updating to 3.0.2? There was a problem with saving from the System Console that was fixed in that version.

Thanks for the hint!

I just upgraded to 3.0.2, but saving the configuration still gives me timeouts.

In the meantime I have also seen timeouts for MySQL, not very often, but sometimes. What’s weird is that on the same server there are various other containers using MySQL (actually Google Cloud SQL, so a SaaS) without any problems or timeouts.

It could of course be that there’s something wrong with my hosting environment, but it’s hard to pin down from these symptoms.

Hi @robert, thanks for the report,

  1. Could you share more about your hosting environment?
  2. Could you share which version you were running immediately before the upgrade? (was it 3.0.1 or an earlier version?)

Hi Ian!

About the hosting environment: sure, anything you’d like to know.

I’m using Docker in Kubernetes, hosted on Google Cloud. Apart from that it’s a rather straightforward setup: an Nginx server (in a container / pod) accepts connections from the internet, does the SSL termination and forwards requests to the other container / pod, which runs Mattermost.

Kubernetes uses an internal network and DNS for routing any kind of traffic (TCP / UDP) to the respective containers.

Google Cloud does have a firewall, but outgoing traffic is generally not blocked; incoming traffic needs to be enabled per port / host.

I suspect that there is something generally wrong which results in timeouts for various things, because even though sending emails generally works, it sometimes doesn’t:

[2016/05/18 14:57:43 UTC] [EROR] Failed to send mention email successfully email=xy@foo.bar.com err=SendMail: utils.mail.connect_smtp.open_tls.app_error, dial tcp 104.130.177.23:465: getsockopt: connection timed out
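To tell an occasional hiccup from a systematic network problem, one option is to repeat a plain TCP connect from inside the Mattermost container and count the failures. This is a minimal sketch with hypothetical helper names (`probe`, `failure_rate`). It only tests whether a TCP connection can be opened, which is the step that failed in the log line above; it does not speak SMTP or TLS.

```python
import socket
import time

def probe(host, port, timeout=5.0):
    """Try one TCP connect; return (succeeded, seconds_elapsed)."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            ok = True
    except OSError:
        # Covers timeouts, refused connections and unreachable hosts.
        ok = False
    return ok, time.monotonic() - start

def failure_rate(host, port, attempts=20, timeout=5.0, pause=1.0):
    """Probe repeatedly and return the fraction of failed attempts."""
    failures = 0
    for _ in range(attempts):
        ok, _elapsed = probe(host, port, timeout)
        if not ok:
            failures += 1
        time.sleep(pause)
    return failures / attempts

if __name__ == "__main__":
    # Address and port taken from the error line above.
    print(failure_rate("104.130.177.23", 465))
```

Running the same probe from another pod on the same node, and against the database host, could show whether the timeouts are specific to one container or affect outbound connections in general.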

Before running the upgrade I was running 3.0.1 on a fresh database. If it helps I can generally nuke my database and try something else if you like.

Thanks @robert. Something unintended might be happening between 3.0.1 and 3.0.2, and I’m curious whether starting with a fresh database helps. Could you try it out and share what you find? We highly appreciate your help.

Thanks @it33, I think that did the trick!

I started with a fresh database, and so far I have always been able to save the configuration, send emails, send invites and so on. I’ll let it run, test it for a while, and report back if I stumble over any further problems.