10
replies
497 views
Description
Device: PC
Browser: Firefox 146.0.1
OS: Windows 11 Pro 10.0.26100
Steps to reproduce:
- make a donation
- reload page while logged in
Anything else to add:
Not really a bug and not sure, if it really is worth investigating. Never had the error before under any other circumstances and might be specific to an UPDATE query issued when making a donation as a logged in user. Query probably ran at 12:55 CET. Might find it in the logs, if slow query log is enabled.
Solution might be to:
- optimize the specific query
- set indexed on relevant tables, if not already done
- increase innodb_lock_wait_timeout in my.cnf
No biggie, feel free to close the thread, if not relevant.
Browser: Firefox 146.0.1
OS: Windows 11 Pro 10.0.26100
Steps to reproduce:
- make a donation
- reload page while logged in
Anything else to add:
Not really a bug and not sure, if it really is worth investigating. Never had the error before under any other circumstances and might be specific to an UPDATE query issued when making a donation as a logged in user. Query probably ran at 12:55 CET. Might find it in the logs, if slow query log is enabled.
Solution might be to:
- optimize the specific query
- set indexed on relevant tables, if not already done
- increase innodb_lock_wait_timeout in my.cnf
No biggie, feel free to close the thread, if not relevant.
Can be used to make Runewords:
Device: PC
Browser: Firefox 146.0.1
OS: Windows 11 Pro 10.0.26100
Steps to reproduce:
- make a donation
- reload page while logged in
Anything else to add:
Not really a bug and not sure, if it really is worth investigating. Never had the error before under any other circumstances and might be specific to an UPDATE query issued when making a donation as a logged in user. Query probably ran at 12:55 CET. Might find it in the logs, if slow query log is enabled.
Solution might be to:
- optimize the specific query
- set indexed on relevant tables, if not already done
- increase innodb_lock_wait_timeout in my.cnf
No biggie, feel free to close the thread, if not relevant.
Browser: Firefox 146.0.1
OS: Windows 11 Pro 10.0.26100
Steps to reproduce:
- make a donation
- reload page while logged in
Anything else to add:
Not really a bug and not sure, if it really is worth investigating. Never had the error before under any other circumstances and might be specific to an UPDATE query issued when making a donation as a logged in user. Query probably ran at 12:55 CET. Might find it in the logs, if slow query log is enabled.
Solution might be to:
- optimize the specific query
- set indexed on relevant tables, if not already done
- increase innodb_lock_wait_timeout in my.cnf
No biggie, feel free to close the thread, if not relevant.
Fixedby Teebling • 3 days ago•Go to post
This is fixed now by the way, but to answer some of your questions, as this may be helpful to future readers as well:
So that would have been a good fix actually, preventing the table from locking and allowing the transaction to finish anyway. I ended up going with a different approach though.
First I had to grep through my typical file cache to find patterns of which queries were being cached so often and so many times. Then once I had found a number of culprit queries that looked similar to each other, I had to figure out which files were issuing these queries and asking them to be cached. Then I refactored a bunch of the old database code from 2023 that was responsible to make it 95% more performant than it was before, and therefore not requiring so much cacheing. This was also a classic N+1 exponential growth problem where the way in which I was cacheing the queries was leading to unique cache files per user per item per runes within item per this per that etc. etc. which was leading to the file cache filling up with an extraordinarily large number of files and very quickly.
After rinsing and repeating this process a few times, and ensuring cacheing remained in a non-proliferous way for the heaviest queries affecting page load speed, I was able to get the cache back to a normal looking state - where it stays around 500-1000 files in total at any time - which is what it's supposed to look like for phpBB - a number that phpBB's native cache purge behaviour has no issues dealing with - meaning no more timeout wait times being exceeded! This underlying issue fix in turn also fixes a bunch of other problems that were happening with timeouts elsewhere in the experience.
Thanks for your help and your report Elektro! Marking resolved.
Yep, just a pure file cache with opcache sitting on top to compile bytecode - there is no object cache in memory like redis.
Yep, phpBB will attempt to scan and then purge the file cache in its entirety for a variety of reasons - in your specific case it was because your user colour changed when you are added to the Donators group. The cache always starts rebuilding as soon as the old one is purged.Elektroputzi wrote: 6 days ago[*] if a user is assigned to a new group, something is done with the file cache. (my understanding was it first is purged, then rebuild)
The slow down was not the building of the new cache, it is the scanning of/deleting of the old one (which at the time had hundreds of thousands of files in it) that was locking the table for too long, leading to a timeout error. That's why you received a lock_wait_timeout_exceeded error.Elektroputzi wrote: 6 days ago[*] the cache operation is taking very long, so that the program stalls too long and the db query runs into the configured timeout -> the user is getting the error, when trying to load the page, while still logged in and the process is still running (prolly due to the update command locking the table, the select query cannot fetch required data)
phpBB attempts to purge the entire file cache - it does not balkanise the cache on a per-user or per-table basis.Elektroputzi wrote: 6 days ago[*] whatever is done to the cache, is it done to the whole cache, or per user?
Yeah something like this way you suggested would have worked way better than the way phpBB does it natively:Elektroputzi wrote: 6 days agoThe idea was to replace the php function with something that renames the old folder, creates a new folder and use that instead, trigger the deletion of the old folder, but not wait for it to be done.
Code: Select all
mv /cache/production cache/productiono
nice -19 rm -rf /cache/productionoFirst I had to grep through my typical file cache to find patterns of which queries were being cached so often and so many times. Then once I had found a number of culprit queries that looked similar to each other, I had to figure out which files were issuing these queries and asking them to be cached. Then I refactored a bunch of the old database code from 2023 that was responsible to make it 95% more performant than it was before, and therefore not requiring so much cacheing. This was also a classic N+1 exponential growth problem where the way in which I was cacheing the queries was leading to unique cache files per user per item per runes within item per this per that etc. etc. which was leading to the file cache filling up with an extraordinarily large number of files and very quickly.
After rinsing and repeating this process a few times, and ensuring cacheing remained in a non-proliferous way for the heaviest queries affecting page load speed, I was able to get the cache back to a normal looking state - where it stays around 500-1000 files in total at any time - which is what it's supposed to look like for phpBB - a number that phpBB's native cache purge behaviour has no issues dealing with - meaning no more timeout wait times being exceeded! This underlying issue fix in turn also fixes a bunch of other problems that were happening with timeouts elsewhere in the experience.
Thanks for your help and your report Elektro! Marking resolved.
Hey Elektro,
Thanks for your report (and your donation!).
What is actually happening is that the site has too many cache files, and as part of the transaction when you donate, it tries to purge that cache after adding you to the Donators user group, but it takes too long (because there are too many cache files) and therefore the transaction times out as per the lock_wait_timeout setting.
This is part of a larger bug that I am actively investigating (to find the root source of the cache proliferation issue). The tables are indexed and my.cnf is configured correctly. Any other ideas?
Thanks for your report (and your donation!).
What is actually happening is that the site has too many cache files, and as part of the transaction when you donate, it tries to purge that cache after adding you to the Donators user group, but it takes too long (because there are too many cache files) and therefore the transaction times out as per the lock_wait_timeout setting.
This is part of a larger bug that I am actively investigating (to find the root source of the cache proliferation issue). The tables are indexed and my.cnf is configured correctly. Any other ideas?
OP
Good question. What kind of cache are we talking about?
Maybe as a dirty workaround first rename the cache folder and create a new one, then delete the old cache folder. Renaming will be way faster than deleting recursively and rm can run on the old folder as long as it needs, since it is not in use anymore.
Something like:
Maybe as a dirty workaround first rename the cache folder and create a new one, then delete the old cache folder. Renaming will be way faster than deleting recursively and rm can run on the old folder as long as it needs, since it is not in use anymore.
Something like:
mv /path/to/cache /path/to/cache_old && mkdir /path/to/cacherm -rf /path/to/cache_old
Cache is a file cache, not redis or anything like that. Purged regularly already but thanks for the suggestion. The issue is not the ability to purge the cache, it is that the cache fills up too quickly
Anyway, appreciate the help - this is a known issue and I'm working on addressing it.
OP
Alright, I understand. So it is not the purge, but the cache build up taking so long.
Guess there are 2 ways to address this.
However, happy hunting! glhf!
Guess there are 2 ways to address this.
- change the program logic to not use cache files, as long as the build up is not finished
- increase cache build up speed, by either:
- using CPU with significantly higher frequency (assuming php is used)
- using a faster caching algorithm or compression algorithm, or less efficient compression, if compression is a thing. will ofc lead to more disk space and more bandwidth being used
However, happy hunting! glhf!
No, that's not it - it's that the cache is building up with tens of thousands of files too quickly!
nice -19) I'm locating the queries that do this now.
OP
Ok, clearly I do NOT understand
I am a little confused here.
So this is what I think I understood correctly:
[*] we are talking pure file cache, no object cache involved (redis, memecached)
[*] if a user is assigned to a new group, something is done with the file cache. (my understanding was it first is purged, then rebuild)
[*] the cache operation is taking very long, so that the program stalls too long and the db query runs into the configured timeout -> the user is getting the error, when trying to load the page, while still logged in and the process is still running (prolly due to the update command locking the table, the select query cannot fetch required data)
My questions now would be:
[*] whatever is done to the cache, is it done to the whole cache, or per user?
[*] do you know the exact php funtion triggering the cache operation?
[*] can the code be customized or is it obfuscated. or is the risk too high to break something or to overwrite the changes with the next patch? (I am not familiar with the forum software)
Just to be clear, with my first idea I did not mean to purge the whole cache once or on a regular basis per cronjob or something like this. The idea was to replace the php function with something that renames the old folder, creates a new folder and use that instead, trigger the deletion of the old folder, but not wait for it to be done.
So this is what I think I understood correctly:
[*] we are talking pure file cache, no object cache involved (redis, memecached)
[*] if a user is assigned to a new group, something is done with the file cache. (my understanding was it first is purged, then rebuild)
[*] the cache operation is taking very long, so that the program stalls too long and the db query runs into the configured timeout -> the user is getting the error, when trying to load the page, while still logged in and the process is still running (prolly due to the update command locking the table, the select query cannot fetch required data)
My questions now would be:
[*] whatever is done to the cache, is it done to the whole cache, or per user?
[*] do you know the exact php funtion triggering the cache operation?
[*] can the code be customized or is it obfuscated. or is the risk too high to break something or to overwrite the changes with the next patch? (I am not familiar with the forum software)
Just to be clear, with my first idea I did not mean to purge the whole cache once or on a regular basis per cronjob or something like this. The idea was to replace the php function with something that renames the old folder, creates a new folder and use that instead, trigger the deletion of the old folder, but not wait for it to be done.
This is fixed now by the way, but to answer some of your questions, as this may be helpful to future readers as well:
So that would have been a good fix actually, preventing the table from locking and allowing the transaction to finish anyway. I ended up going with a different approach though.
First I had to grep through my typical file cache to find patterns of which queries were being cached so often and so many times. Then once I had found a number of culprit queries that looked similar to each other, I had to figure out which files were issuing these queries and asking them to be cached. Then I refactored a bunch of the old database code from 2023 that was responsible to make it 95% more performant than it was before, and therefore not requiring so much cacheing. This was also a classic N+1 exponential growth problem where the way in which I was cacheing the queries was leading to unique cache files per user per item per runes within item per this per that etc. etc. which was leading to the file cache filling up with an extraordinarily large number of files and very quickly.
After rinsing and repeating this process a few times, and ensuring cacheing remained in a non-proliferous way for the heaviest queries affecting page load speed, I was able to get the cache back to a normal looking state - where it stays around 500-1000 files in total at any time - which is what it's supposed to look like for phpBB - a number that phpBB's native cache purge behaviour has no issues dealing with - meaning no more timeout wait times being exceeded! This underlying issue fix in turn also fixes a bunch of other problems that were happening with timeouts elsewhere in the experience.
Thanks for your help and your report Elektro! Marking resolved.
This post was marked as the fix.Yep, just a pure file cache with opcache sitting on top to compile bytecode - there is no object cache in Memory like redis.
Yep, phpBB will attempt to scan and then purge the file cache in its entirety for a variety of reasons - in your specific case it was because your user colour changed when you are added to the Donators group. The cache always starts rebuilding as soon as the old one is purged.Elektroputzi wrote: 6 days ago[*] if a user is assigned to a new group, something is done with the file cache. (my understanding was it first is purged, then rebuild)
The slow down was not the building of the new cache, it is the scanning of/deleting of the old one (which at the time had hundreds of thousands of files in it) that was locking the table for too long, leading to a timeout error. That's why you received a lock_wait_timeout_exceeded error.Elektroputzi wrote: 6 days ago[*] the cache operation is taking very long, so that the program stalls too long and the db query runs into the configured timeout -> the user is getting the error, when trying to load the page, while still logged in and the process is still running (prolly due to the update command locking the table, the select query cannot fetch required data)
phpBB attempts to purge the entire file cache - it does not balkanise the cache on a per-user or per-table basis.Elektroputzi wrote: 6 days ago[*] whatever is done to the cache, is it done to the whole cache, or per user?
Yeah something like this way you suggested would have worked way better than the way phpBB does it natively:Elektroputzi wrote: 6 days agoThe idea was to replace the php function with something that renames the old folder, creates a new folder and use that instead, trigger the deletion of the old folder, but not wait for it to be done.
Code: Select all
mv /cache/production cache/productiono
nice -19 rm -rf /cache/productionoFirst I had to grep through my typical file cache to find patterns of which queries were being cached so often and so many times. Then once I had found a number of culprit queries that looked similar to each other, I had to figure out which files were issuing these queries and asking them to be cached. Then I refactored a bunch of the old database code from 2023 that was responsible to make it 95% more performant than it was before, and therefore not requiring so much cacheing. This was also a classic N+1 exponential growth problem where the way in which I was cacheing the queries was leading to unique cache files per user per item per runes within item per this per that etc. etc. which was leading to the file cache filling up with an extraordinarily large number of files and very quickly.
After rinsing and repeating this process a few times, and ensuring cacheing remained in a non-proliferous way for the heaviest queries affecting page load speed, I was able to get the cache back to a normal looking state - where it stays around 500-1000 files in total at any time - which is what it's supposed to look like for phpBB - a number that phpBB's native cache purge behaviour has no issues dealing with - meaning no more timeout wait times being exceeded! This underlying issue fix in turn also fixes a bunch of other problems that were happening with timeouts elsewhere in the experience.
Thanks for your help and your report Elektro! Marking resolved.
OP
Very nice, as it resolves the root issue, bloated cache, instead of tempering with the codeTeebling wrote: 3 days ago...
After rinsing and repeating this process a few times, and ensuring cacheing remained in a non-proliferous way for the heaviest queries affecting page load speed, I was able to get the cache back to a normal looking state - where it stays around 500-1000 files in total at any time - which is what it's supposed to look like for phpBB - a number that phpBB's native cache purge behaviour has no issues dealing with - meaning no more timeout wait times being exceeded! This underlying issue fix in turn also fixes a bunch of other problems that were happening with timeouts elsewhere in the experience.
...
Thank you for sharing and fixing!
Great work, thanks guys.
Main: Necromancer / Second: Assassin / Third: Amazon / Check my stash tabs
Similar pages
Advertisment
Hide ads
Greetings stranger!
You don't appear to be logged in...No matches
Elektroputzi
24