[solved] Platform crashing after post in channel

Today the service has repeatedly quit unexpectedly. To diagnose the cause, we enabled debug-level logging. In each case, the final line in the log immediately before the service quits looks like this:

[2016/03/16 14:14:58 AEDT] [DEBG] /api/v1/channels/3waejx8retrodn8jgi1tkwzckw/create

The channel ID is for our Off Topic channel. The event coincides with a post in that channel. We have seen crashes after posts from different users, via a browser or the electron-mattermost client.

In journalctl we see a lot more log output, starting with:

panic: runtime error: invalid memory address or nil pointer dereference
[signal 0xb code=0x1 addr=0xf0 pc=0x56d92b]
goroutine 58006 [running]:
github.com/mattermost/platform/api.sendNotifications(0xc821584fc0, 0xc8202b7ea0, 0xc820836140, 0xc820b57d90, 0xc82089f9c0, 0xc821116800, 0x3f, 0x40)
/go/src/github.com/mattermost/platform/api/post.go:487 +0x5ebb
created by github.com/mattermost/platform/api.handlePostEventsAndForget.func1

This is followed by a lot more stack dump output and then:

mattermost.service: main process exited, code=exited, status=2/INVALIDARGUMENT

While I was typing this I noticed it crashed again after a post in a different channel :frowning:

Any ideas what’s happening? Memory usage on the host looks heavy but not desperate…

I think it’s now happening on every post in that channel.

Could this be caused by a user with a nickname containing non-ASCII characters? We do have one such user…

@gubbins, did you make any manual changes to the database?

Yes, I modified the AuthService / AuthData of some users.

The line of code apparently reporting the invalid address is this:

            if len(profile.NotifyProps["mention_keys"]) > 0 {

I guess profile is nil? I took a look in ChannelMembers in the database to see if all the entries for that channel seem valid, but I can’t see anything strange.
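
To illustrate the guess above: in Go, looking up a missing key in a map of pointers yields nil, and reading a field through that nil pointer panics with exactly this kind of "nil pointer dereference". The sketch below is not the actual Mattermost code; the `User` type, the `mentionKeys` helper and the sample IDs are hypothetical stand-ins, and it only shows how such a lookup could be guarded.

```go
package main

import "fmt"

// User is a hypothetical, trimmed-down stand-in for the real profile type.
type User struct {
	Id          string
	NotifyProps map[string]string
}

// mentionKeys returns the "mention_keys" value for userId. It reports false
// when the profile is missing (or has no mention keys) instead of
// dereferencing a nil pointer.
func mentionKeys(profiles map[string]*User, userId string) (string, bool) {
	profile, ok := profiles[userId]
	if !ok || profile == nil {
		// A channel member with no matching user row ends up here.
		return "", false
	}
	// Reading from a nil map is safe in Go and yields the empty string.
	keys := profile.NotifyProps["mention_keys"]
	return keys, len(keys) > 0
}

func main() {
	profiles := map[string]*User{
		"u1": {Id: "u1", NotifyProps: map[string]string{"mention_keys": "alice,@alice"}},
	}
	// "deleted-user" is a made-up ID with no profile behind it.
	for _, id := range []string{"u1", "deleted-user"} {
		if keys, ok := mentionKeys(profiles, id); ok {
			fmt.Printf("%s: %s\n", id, keys)
		} else {
			fmt.Printf("%s: no profile or no mention keys, skipping\n", id)
		}
	}
}
```

Without the nil check, the second lookup would panic as soon as `profile.NotifyProps` is read.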

You need to restore your database to the state before the manipulation. Mattermost is designed as a continuous archive and after you manipulate the database the system can no longer be supported.

:blush: OK, lesson learned. It turns out we had deleted a user (who had never posted) from the Users table at some point, and there were still entries in ChannelMembers referencing that user ID.
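
For anyone hitting the same thing, the check amounts to finding ChannelMembers rows whose user ID no longer exists in Users. Here is a minimal sketch of that comparison in Go, assuming the two tables have already been read into memory; the `ChannelMember` struct, the `orphanedMembers` helper and the sample IDs are hypothetical and not part of Mattermost.

```go
package main

import "fmt"

// ChannelMember is a hypothetical, simplified row from ChannelMembers.
type ChannelMember struct {
	ChannelId string
	UserId    string
}

// orphanedMembers returns the membership rows whose UserId does not appear
// in userIds, i.e. rows left behind after a user was deleted.
func orphanedMembers(userIds []string, members []ChannelMember) []ChannelMember {
	known := make(map[string]struct{}, len(userIds))
	for _, id := range userIds {
		known[id] = struct{}{}
	}
	var orphans []ChannelMember
	for _, m := range members {
		if _, ok := known[m.UserId]; !ok {
			orphans = append(orphans, m)
		}
	}
	return orphans
}

func main() {
	// Made-up sample data: "u3" was deleted from Users but is still a member.
	users := []string{"u1", "u2"}
	members := []ChannelMember{
		{ChannelId: "off-topic", UserId: "u1"},
		{ChannelId: "off-topic", UserId: "u3"},
	}
	for _, m := range orphanedMembers(users, members) {
		fmt.Printf("orphaned membership: channel=%s user=%s\n", m.ChannelId, m.UserId)
	}
}
```

Given the advice above, a check like this is only for locating the inconsistency; restoring the database from a backup is still the supported fix.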

Everything is now working again.