Mattermost Peer-to-Peer Forum

Full Text Search


#1

I am curious to know why you have chosen to use MySQL or Postgres full text search rather than a product or library built for full text search (like Elasticsearch, Solr, or Lucene).

I have no direct experience with MySQL or Postgres full text search and would be interested to know how well it works once you have millions of messages in the database.


#2

@aga_sumit, long term an option to use something like Lucene makes sense, and we have a team member who used it extensively in a previous role working with Stanford Research Institute. The trade-off is that it’s not as well understood by the IT professional community as MySQL or Postgres.


#3

I am thinking more from a performance perspective: how well does MySQL or Postgres work with hundreds of users actively creating messages for years?


#4

Hello all! I honestly believe that the current search implementation in Mattermost is useless in most real-life cases, e.g. searching in any language, searching for a bug number mentioned via a URL to a bug tracker, or searching for part of a word.
It is clear that mature software accumulating a lot of UGC must have a powerful tool to bring that data back to the user.

@it33 you are right, adding Lucene or Elasticsearch will make the installation process a bit of a pain, especially in test environments. By the way, it is crucial to provide an easy way to try Mattermost inside any infrastructure so that it can spread widely.

Would you like to collaborate on this, maybe? We like the way Mattermost works for us, but search is a real blocker right now.

Thank you and keep up the good work!


#5

Hi lig, thanks for the feedback and the offer to help as well.

Personally, I use search almost daily and it works well for me. Maybe I’m biased having been involved with the design though. E.g. if you want to find part of a word, like hair you can type hair* to find haircut and hairy.
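To make the `hair*` example concrete, here is a minimal sketch of what a trailing-wildcard search means and how it could be passed to the two databases. It assumes the search box text is translated per database: MySQL's boolean mode accepts a trailing `*` as its prefix operator as-is, while Postgres `to_tsquery` writes the same thing as `hair:*`. The function names are hypothetical and this is not necessarily how Mattermost does it; the sketch also does no escaping of special characters.

```python
def to_postgres_tsquery(user_query):
    """Turn 'hair* salon' into a to_tsquery string: 'hair:* & salon'.

    Hypothetical helper: a trailing * becomes the Postgres prefix
    marker ':*', and words are AND-ed together. No sanitizing is done.
    """
    parts = []
    for word in user_query.split():
        if word.endswith("*") and len(word) > 1:
            parts.append(word[:-1] + ":*")
        else:
            parts.append(word)
    return " & ".join(parts)


def prefix_matches(term, words):
    """Pure-Python picture of what a prefix search returns."""
    prefix = term.rstrip("*").lower()
    return [w for w in words if w.lower().startswith(prefix)]


if __name__ == "__main__":
    print(to_postgres_tsquery("hair*"))
    print(prefix_matches("hair*", ["haircut", "hairy", "chair"]))
```

Note that a prefix search only matches from the start of a word, so `chair` is not found by `hair*`.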

Some of what you’re describing might just be usability improvements we could add?

Our current plan is to focus on full text search in MySQL and Postgres for Team Edition, for two reasons:

  1. Install needs to remain simple, and get even more simple
  2. We’re making good progress on full text search, including support for CJK

We’d love to have more help here if you’re interested?

Lucene and Elasticsearch would be good for Enterprise Edition I think, since their setups are complicated and the benefit of that investment doesn’t really pay off until we’re into enterprise scenarios.

I don’t think we should give up on keeping deployment simple and improving on search in the database–even if that means reaching out to MySQL and Postgres with specific requests for what we need in their future versions…


#6

The case that blocks us now is related to bug numbers mentioned in messages.
It is common to search through history for conversations about a particular bug.

Say you posted a link “https://mattermost.atlassian.net/browse/PLT-1961”.

And then you mentioned the bug as PLT-1961, or even just by the number 1961.

You now have three messages you want to see in the search results.

Now to the reality:

  • `"*PLT*" -> PLT-1961`
    
  • `"*1961*" -> 1961`
    
  • `"1961" -> 1961`
    
  • `"PLT-1961" -> --`
    
  • `"*PLT-1961*" -> --`
    

Sad & unacceptable :frowning:
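One possible reading of these results, offered as an assumption rather than a confirmed diagnosis: full text engines typically split text into tokens at punctuation, and MySQL's boolean mode treats a leading `-` as an exclusion operator, so `PLT-1961` may never be looked up as a single term. The sketch below shows the behavior I would expect instead, where messages and queries go through the same tokenizer. The tokenizer, the `search` function, and the sample messages are all made up for illustration.

```python
import re

# Hypothetical tokenizer: runs of letters/digits are tokens, so
# "PLT-1961" and ".../browse/PLT-1961" both yield "plt" and "1961".
TOKEN_RE = re.compile(r"[A-Za-z0-9]+")


def tokenize(text):
    return [t.lower() for t in TOKEN_RE.findall(text)]


def search(query, messages):
    """Return messages containing every token of the query."""
    wanted = tokenize(query)
    if not wanted:
        return []
    hits = []
    for message in messages:
        tokens = set(tokenize(message))
        if all(w in tokens for w in wanted):
            hits.append(message)
    return hits


# Illustrative messages modeled on the three cases described above.
MESSAGES = [
    "https://mattermost.atlassian.net/browse/PLT-1961",
    "mentioned bug PLT-1961",
    "bug 1961",
]

if __name__ == "__main__":
    for q in ["1961", "PLT-1961", "PLT"]:
        print(q, "->", search(q, MESSAGES))
```

With this scheme a search for the bare number finds all three messages, and a search for `PLT-1961` finds the link and the explicit mention.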


#7

Maybe Apache Solr might be easier to install and maintain than Lucene. Solr uses Lucene and extends it. http://lucene.apache.org/solr/features.html

From the Solr features page: “Solr is a standalone enterprise search server with a REST-like API. You put documents in it (called “indexing”) via JSON, XML, CSV or binary over HTTP. You query it via HTTP GET and receive JSON, XML, CSV or binary results.”

I used Solr 1.4 in a project a few years ago. It was relatively easy to set up, indexing was fast, and searching was fast and effective.

Solr is a stand alone system, so you could install it independently of Mattermost, just to see how it works. In that case, you’d extract the content from the Mattermost database and send it to Solr for indexing. Solr has a built-in web server, so you could quickly set up a web page for testing the search results.
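A rough sketch of that experiment, using only the Python standard library. It builds (but does not send) an HTTP request that posts documents as JSON to Solr's `/update` handler and a GET URL for the `/select` handler. The core name `mattermost_posts`, the `message` field, and the helper names are assumptions for illustration; port 8983 is Solr's default. Extracting the posts from the Mattermost database is left out, and the query text is not escaped.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical core name on a local Solr instance.
SOLR_BASE = "http://localhost:8983/solr/mattermost_posts"


def build_index_request(posts):
    """Build a POST that sends a list of documents to Solr as JSON."""
    body = json.dumps(posts).encode("utf-8")
    return urllib.request.Request(
        SOLR_BASE + "/update?commit=true",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_search_url(text, field="message"):
    """Build a GET URL that searches one field for an exact phrase."""
    params = urllib.parse.urlencode({"q": f'{field}:"{text}"', "wt": "json"})
    return SOLR_BASE + "/select?" + params


if __name__ == "__main__":
    req = build_index_request([{"id": "1", "message": "see PLT-1961"}])
    print(req.get_method(), req.full_url)
    print(build_search_url("PLT-1961"))
```

Against a running Solr, the request would be sent with `urllib.request.urlopen(req)` and the search URL fetched the same way.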


#8

Is there any update on this, any plans for advanced search (even if only in Enterprise version)?


#9

Hi @jerry,

Thanks for your question!

Yes, we are working on making improvements to search functionality and this can be tracked with ticket


#10

Hi @lindy65,

I can’t seem to access that ticket (PLT-6402). Help? Thanks.


#11

Hi @fjarlq,

Sorry about that! You should be able to access it now. There may be a few closed tickets within the epic ticket that are not viewable due to confidential data.


#12

Thanks for providing the information on Apache Solr.