View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0000743 | LDMud 3.5 | Runtime | public | 2010-04-04 02:39 | 2018-01-30 03:59 |
Reporter | _xtian_ | Assigned To | zesstra | ||
Priority | normal | Severity | major | Reproducibility | always |
Status | resolved | Resolution | fixed | ||
Target Version | 3.5.0 | Fixed in Version | 3.5.0 | ||
Summary | 0000743: 3.5.0-current skipping heart_beats | ||||
Description | A few days ago we switched from a several months old 3.5.0 to a current (march 2010) 3.5.0 and immediately players started complaining that timers (that were actually fired by their heart_beats) were noticeably delayed. Measuring several objects heart_beat showed that about every 10% HBs of an object were delayed. Instead of a 2s delay, they had a 4s delay. This occured all the time - like I said easily for every 10% of HBs. Switching back to an older build of 3.5.0 solved the problem. If you can tell me where I can look-up the build number at runtime, I'll tell you which builds were involved. | ||||
Tags | No tags attached. | ||||
|
Concerning the 'look up revision at run-time': AFAIK it is not possible. The information should be in the debug file, but obviously does not help right now for older reversions. Concerning the problem: interesting. ;-) I guess I will have to start with some testcase first (or do you have a portable one ready?). Do you have any further info about the circumstances or do you just need 100 objects with no code but a heart_heat() in them? |
|
First try: I just cloned 300 objects with heartbeats and monitored if the HBs are skipped/delayed and the number of HB objects and processed HB objects in each cycle (via debug_info, DID_ST_HBEAT_OBJS, DID_ST_HBEAT_PROCESSED, DID_ST_HBEAT_AVG_PROC). So far I did not notice any problems, no skipped/delayed HBs on my machine at home. |
|
I just have an object that notifies me on every HB. Hmm, what else can I tell you. I can reproduce it in our develmud (on the server) but not in my homemud. The Develmud has <10 object with HBs, but a lot more call-outs running (if that is relevant) than the homemud. Or the server load somehow factors into this. |
|
Mhmm, the server load can influence it, because the drivers signal to execute the heartbeats is alarm/ITIMER_REAL and that can be delayed (or even missed on heavy server loads). But usually that delays should not be in the range of seconds. (What average loads and peak loads do you have?) It may be interesting to find out if a the HBs in a limited number of objects are missed or the complete HB cycle is delayed. |
|
The CPU never goes above 20%. All my testobjects delay at the same time. EDIT: Yep, approx 38 from 40 objects with HBs enabled delay at the same time. Looks like the whole queue. |
|
after bisecting: the regression appears in rev 2874 |
|
r2874? Are you sure? I read it again and still can't think of a reason, why this should introduce skipped heartbeats. It just synchronizes the alarms of the driver with full realtime seconds, making the first timer interval after bootup a little bit shorter. (Out of curiosity: do you have any "Last alarm was x seconds ago - restarting it." messages in your debug log?) |
|
Ah, another thing: could you please measure the time between your heartbeats with utime() instead of time()? |
|
Ok, problem found, I guess. r2874 is not faulty in itself (the problem could occur without it as well), but increases the chance for a very unlucky timing: r2874 tries to synchronize the time of alarms with whole seconds. The driver works with a granularity of seconds. Now consider: Alarm1 at 2.000001s. Alarm2 at 3.999999s. First thing: time() returns 2 and 3 at the time of the two alarms. The difference is < __HEART_BEAT_INTERVAL__ and the heartbeat is not executed, although _nearly_ __HEART_BEAT_INTERVAL__ has passed. Result: a heartbeat delayed by __ALARM_TIME__. Second possibility: Alarm1 at 1.999999s, Alarm2 at 4.000001s. The heartbeat is executed properly, about 2s passed since the last one, but the LPC internal time() will return 1 and 4, which looks like a delayed heartbeat (by one second). |
|
Ok, I changed the way current_time is determined in r2902. Could you please test in your mud if it solves your problems? |
|
Xtian reported success, so I am closing. In case of problems, please tell me to re-open. |
|
Fix committed to master branch. |
|
Fix committed in revision 06ea2820f6c36c54f5766f51fde1c6f9bd0c53bf to master branch (see changeset 1640 for details). Thank you for reporting! |
|
Fix committed in revision 06ea2820f6c36c54f5766f51fde1c6f9bd0c53bf to master branch (see changeset 2972 for details). Thank you for reporting! |
|
Fix committed in revision 06ea2820f6c36c54f5766f51fde1c6f9bd0c53bf to master branch (see changeset 4053 for details). Thank you for reporting! |
Date Modified | Username | Field | Change |
---|---|---|---|
2010-04-04 02:39 | _xtian_ | New Issue | |
2010-04-21 15:34 | zesstra | Note Added: 0001820 | |
2010-04-21 16:00 | zesstra | Project | LDMud => LDMud 3.5 |
2010-04-21 16:03 | zesstra | Note Added: 0001821 | |
2010-04-21 16:05 | _xtian_ | Note Added: 0001822 | |
2010-04-21 16:22 | zesstra | Note Added: 0001823 | |
2010-04-21 16:39 | _xtian_ | Note Added: 0001825 | |
2010-04-21 16:48 | _xtian_ | Note Edited: 0001825 | |
2010-04-24 03:32 | _xtian_ | Note Added: 0001828 | |
2010-04-25 15:43 | zesstra | Note Added: 0001829 | |
2010-04-26 08:53 | zesstra | Note Added: 0001830 | |
2010-04-26 13:35 | zesstra | Note Added: 0001831 | |
2010-04-26 14:10 | zesstra | Status | new => assigned |
2010-04-26 14:10 | zesstra | Assigned To | => zesstra |
2010-04-26 14:15 | zesstra | Note Added: 0001832 | |
2010-04-26 14:15 | zesstra | Status | assigned => feedback |
2010-04-26 14:41 | zesstra | Note Added: 0001834 | |
2010-04-26 14:41 | zesstra | Status | feedback => resolved |
2010-04-26 14:41 | zesstra | Fixed in Version | => 3.5.0 |
2010-04-26 14:41 | zesstra | Resolution | open => fixed |
2010-11-16 09:42 | zesstra | Source_changeset_attached | => ldmud.git master 06ea2820 |
2010-11-16 09:42 | zesstra | Note Added: 0001910 | |
2011-02-13 21:38 | zesstra | Target Version | => 3.5.0 |
2018-01-29 18:59 | zesstra | Source_changeset_attached | => ldmud.git master 06ea2820 |
2018-01-29 18:59 | zesstra | Note Added: 0002340 | |
2018-01-29 21:57 | zesstra | Source_changeset_attached | => ldmud.git master 06ea2820 |
2018-01-29 21:57 | zesstra | Note Added: 0002391 | |
2018-01-30 03:59 | zesstra | Source_changeset_attached | => ldmud.git master 06ea2820 |
2018-01-30 03:59 | zesstra | Note Added: 0002442 |