Mattermost Peer-to-Peer Forum

[SOLVED] Mattermost suddenly crashed and after reboot Web Container doesn't respond


#1

Summary

Mattermost suddenly crashed, and after a reboot the web container doesn’t respond.

We are running Mattermost with Docker Compose. The Mattermost version is 3.6.2.

Steps to reproduce

Expected behavior

Mattermost should run normally without crashing.

Observed behavior

Mattermost crashed suddenly after two weeks of running perfectly normally. We restarted our server and Mattermost ran again until it suddenly stopped working once more. I checked whether the database had crashed again, but everything looked perfectly normal (docker-compose ps showed that every container was up). I tried to restart the web container but nothing happened; I could not restart the container. It didn’t react to anything until I stopped the Docker service and started it again. This happened again some time later: once more the web container didn’t react to anything.

When I looked into the database log to understand why the database had crashed, I saw the following error message: ERROR: relation "idx_teams_description" does not exist

What caused that error, and how can I prevent it from happening again?


#2

Hi @aschmid,

Could you try updating to the latest version of Mattermost released on March 16th and let us know if the issue still persists?


#3

Hi! Exactly the same happened to me. I had to reboot the server, and when it started again I got a “502 Bad Gateway” when trying to access the web interface. I have already tried updating to the latest version, but I still get the same error. Any ideas?


#4

I was able to solve this issue by removing the app folder inside volumes. I couldn’t figure out what had happened, but at least it is working again and I didn’t lose any data.


#5

Great news @iyanmv!

Thanks for posting your solution! I’m sure it will help others in the same situation.

I’ll close off this issue for now but please feel free to come back and post if you experience any problems.


#6

Hello,
I have the same problem. I use Mattermost with docker-compose: https://github.com/mattermost/mattermost-docker

Mattermost version: 4.7.1

Deleting the volumes/app/mattermost/config/config.json file resolves the error, except that I need to do it each time I stop the Docker containers.

So, IMO this is not solved.

PS: I was able to update config at runtime like this:

# First, back up the current config
cp volumes/app/mattermost/config/config.json{,.bak}
# Then stop the containers, remove the config and start them again
docker-compose stop
rm volumes/app/mattermost/config/config.json
docker-compose up -d
# Finally, copy the backup back into place
cp volumes/app/mattermost/config/config.json{.bak,}

#7

Hi @AllTheDey,

Thanks for your feedback,

I’m going to ask @pichouk to help troubleshoot your issue :)


#8

Hi @AllTheDey :)

Could you please provide the logs from your containers from when the Mattermost instance crashes?


#9

I’ll do it asap (probably tomorrow)


#10

Here are some logs.

... 
db_1   | LOG:  database system is shut down
db_1   | AWS_ACCESS_KEY_ID is required for Wal-E but not set. Skipping Wal-E setup.
db_1   | LOG:  database system was shut down at 2018-03-21 21:28:12 CET
db_1   | LOG:  MultiXact member wraparound protections are now enabled
db_1   | LOG:  database system is ready to accept connections
db_1   | LOG:  autovacuum launcher started
db_1   | LOG:  incomplete startup packet
db_1   | ERROR:  relation "idx_teams_description" does not exist
db_1   | STATEMENT:  SELECT $1::regclass
db_1   | LOG:  received smart shutdown request
db_1   | LOG:  autovacuum launcher shutting down
db_1   | LOG:  shutting down
db_1   | LOG:  database system is shut down
db_1   | AWS_ACCESS_KEY_ID is required for Wal-E but not set. Skipping Wal-E setup.
db_1   | LOG:  database system was shut down at 2018-03-21 22:50:41 CET
db_1   | LOG:  MultiXact member wraparound protections are now enabled


web_1  | linking plain config
web_1  | ln: /etc/nginx/conf.d/mattermost.conf: File exists
web_1  | linking plain config
web_1  | ln: /etc/nginx/conf.d/mattermost.conf: File exists
web_1  | ln: /etc/nginx/conf.d/mattermost.conf: File exists
web_1  | linking plain config

db_1   | LOG:  database system is ready to accept connections
db_1   | LOG:  autovacuum launcher started
db_1   | LOG:  incomplete startup packet
db_1   | ERROR:  relation "idx_teams_description" does not exist
db_1   | STATEMENT:  SELECT $1::regclass
... a lot of valid connections from the web container
web_1  | linking plain config
web_1  | ln: /etc/nginx/conf.d/mattermost.conf: File exists
web_1  | 2018/03/07 06:23:30 [error] 9#9: *1 connect() failed (111: Connection refused) while connecting to upstream, 
... a lot of refused connections from the web container

#11

Thanks.

This is weird. It seems that your database container is being asked to restart: received smart shutdown request. Do you have any idea what could cause this?

I don’t understand what is happening with your application container, since there is no log from it. This is also confusing because the date in your web container error log (2018/03/07) is earlier than the date I can see in the database log (2018-03-21). Is that the correct log order?
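
One way to answer both questions is to capture each service's logs in a separate file with timestamps, so the order of events can be compared across containers. This is only a minimal sketch. It assumes the compose services are named db, app and web as in the mattermost-docker repository (the logs above only show db_1 and web_1, so app is an assumption), and collect_logs and the mm-logs directory are hypothetical names chosen for illustration:

```shell
# Sketch only: collect_logs and the mm-logs directory are hypothetical names.
# Assumes the compose services are called db, app and web.
collect_logs() {
    out_dir="${1:-./mm-logs}"
    # COMPOSE can be overridden, e.g. COMPOSE="docker compose" on newer installs.
    compose="${COMPOSE:-docker-compose}"
    mkdir -p "$out_dir" || return 1
    for service in db app web; do
        # -t adds a timestamp to every line; --no-color keeps the file plain text.
        $compose logs -t --no-color "$service" > "$out_dir/$service.log" 2>&1
    done
}
```

Run it from the directory that contains docker-compose.yml, for example with collect_logs ./mm-logs, and then compare the timestamps in db.log, app.log and web.log around the time of the crash.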