View previous topic :: View next topic |
Author |
Message |
rac Bodhisattva
Joined: 30 May 2002 Posts: 6553 Location: Japanifornia
|
Posted: Mon Nov 25, 2002 11:48 pm Post subject: phpBB hacker feedback wanted |
|
|
In studying some of the performance bottlenecks that the forums are experiencing, I have found one area that looks outrageously wasteful to me, and I have witnessed it for all intents and purposes making the forums unusable when it was running, as it grabbed a lock on a key search table, causing other clients to pile up behind it.
I speak of the code in remove_search_post (includes/functions_search.php approx. line 353) that removes from the search wordlist all words from a given post that is being deleted, if that word was unique. This looks like it would be a massive query, especially for large posts, as my understanding is it searches for matches all across all posts for every single word in the post being deleted.
First, can somebody familiar with the phpBB code verify that my understanding of the purpose and scope of this query is accurate? The search_wordmatch table for the current state of the forums is 5 million+ records.
Second, if we were to disable this work, would there be any adverse effects other than potential waste of space in the wordlist table by words that were only used in posts that have since been deleted? If so, this seems to me an excellent tradeoff. That table has <200K records, and I don't mind wasting some spots in there at all. Besides, a monthly cron job could cross-reference the wordmatch table with the posts table to find these holes and remove them, if desired. _________________ For every higher wall, there is a taller ladder |
|
Back to top |
|
|
carambola5 Apprentice
Joined: 10 Jul 2002 Posts: 214
|
Posted: Tue Nov 26, 2002 5:45 am Post subject: |
|
|
Judging from my quick view over at www.phpbb.com/bugs It doesn't look like you filled a bug report. Might wanna do that. I'm willing to bet that the phpBB folks are quite familiar with their own code |
|
Back to top |
|
|
rac Bodhisattva
Joined: 30 May 2002 Posts: 6553 Location: Japanifornia
|
Posted: Tue Nov 26, 2002 5:53 am Post subject: |
|
|
I thought it a bit presumptuous to call it a bug, because I'm not sure that I understand exactly what it's trying to do, and if I do understand it, then it's doing what it should be doing. It's just that it seems like an awfully costly way of doing it. _________________ For every higher wall, there is a taller ladder |
|
Back to top |
|
|
klieber Bodhisattva
Joined: 17 Apr 2002 Posts: 3657 Location: San Francisco, CA
|
Posted: Tue Nov 26, 2002 12:53 pm Post subject: |
|
|
I rummaged around over at phpBB and came up with this thread which seems to validate what you're suggesting.
It's a PostgreSQL thread, but I imagine the same basic concept would apply to MySQL as well.
--kurt _________________ The problem with political jokes is that they get elected |
|
Back to top |
|
|
rac Bodhisattva
Joined: 30 May 2002 Posts: 6553 Location: Japanifornia
|
Posted: Tue Nov 26, 2002 7:11 pm Post subject: |
|
|
Thanks klieber, that looks exactly like the issue I'm talking about. The situation in MySQL is slightly different (and actually slightly worse) because MySQL cannot support subselects. So for MySQL, one SELECT gets a list of all words in the post being deleted, a second SELECT gets all of them that are in that list across the entire search database (this is the real killer, I think) and then a DELETE wipes out the ones that were only present in the deleted post. I will look into patching our board to defang this hotspot.
EDIT: I have tested a patch that comments out the section in question on another server running phpBB, and it acts like I expected. Unused and unreachable words remain in the wordlist, taking up space. However, they will never hit anything. There should be no user-visible repercussions of applying this patch. Deleted threads will not begin being hit by search results, which was my one minor concern.
EDIT2: This patch has now been applied to forums.gentoo.org - if anybody notices any strange new behavior that they think might be related, please post to this thread. _________________ For every higher wall, there is a taller ladder |
|
Back to top |
|
|
lars_msh n00b
Joined: 25 Aug 2002 Posts: 50 Location: Airstrip One
|
Posted: Wed Nov 27, 2002 11:04 pm Post subject: |
|
|
Interesting stuff! So, what did you do? It looks to me like a case of removing/commenting the lines of code in functions_search.php so the case statement for 'mysql' and 'mysql4' does nothing.
Or is there more to it than that?
I have a MySQL database and could live with a few dead words in there... |
|
Back to top |
|
|
rac Bodhisattva
Joined: 30 May 2002 Posts: 6553 Location: Japanifornia
|
Posted: Wed Nov 27, 2002 11:08 pm Post subject: |
|
|
I commented out the entire switch( SQL_LAYER ) {} in remove_search_post. _________________ For every higher wall, there is a taller ladder |
|
Back to top |
|
|
lars_msh n00b
Joined: 25 Aug 2002 Posts: 50 Location: Airstrip One
|
Posted: Wed Nov 27, 2002 11:21 pm Post subject: |
|
|
Yeah that ought to do it, thanks.
Footnote:
In a bizarre coincidence, the very same POP mail poll brought in my reply notification for this topic... and my first ever phpBB bug update for a bug I reported months ago. I wondered what was going on! |
|
Back to top |
|
|
|