Thread: CPU multi-threading planned
-
November 24th, 2020, 19:02 #51
- Join Date
- Mar 2020
- Posts
- 84
I was getting on to post this same thing. You already told us exactly what the problem was a few pages ago, but I'll expound so as not to take more dev time in this thread.
The render thread is always single-threaded in every UI framework, including Qt, C# WPF, and Xamarin, as well as in game engines. This is the case because UI is built with a layered approach and you can't just have things render in any old order or else you get all kinds of weird display artifacts. In the case of FGU, this is based on information coming from the LUA engine once the main playing area is displayed. The initial load of FGU from the launcher shows a progress bar because at that point in time the LUA state machine can run in its thread, and the UI render thread doesn't care what the LUA state machine is doing -- yet. In the launching mode it would only care to know when the LUA engine is ready.
Once the transition to "the game" happens, everything has to come via the LUA engine because that's where all the state is. When you start loading modules, now you're in a state where your render thread is directly dependent upon the LUA engine, which is busy loading your module state and blocking further requests until it's done. That's why it goes white screen at that point, and white screen is just the OS specific way of letting you know the program isn't rendering or accepting input.
Going to another scripting engine isn't a solution when part of the problem is maintaining backwards compatibility, and most of the problem is contention over locked resources (in this case the LUA state engine). There will still be contention regardless of which scripting engine is used because if the UI is rendering incomplete state you get weird artifacts or crashes. This is one reason that immutable (or Frozen) state is what you want to work with in the UI layer (and arguably in as many places as possible but that claim sparks a religious debate). The state of stuff will be changing, of course, but it's changing elsewhere.
The way many of the UI rendering engines get away with threading for processing on something OTHER than the render thread is that they dispatch state changes to the UI layer. So while it appears that the program is responsive, it might not be doing anything useful other than placating the user by letting them know it's busy and not in the process of crashing.
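To make the dispatch idea concrete, here is a rough Python sketch (Python only because it's short; FGU itself is Unity plus LUA, and none of this is its actual code). The names `ui_queue`, `load_modules`, and `run_ui_loop` are made up for illustration. The worker never touches the screen; it only posts messages, and the UI loop is the only thing that acts on them.

```python
import queue
import threading

ui_queue = queue.Queue()  # hypothetical mailbox from the worker to the UI layer


def load_modules(names):
    """Worker thread: does the slow work and only posts messages."""
    for index, name in enumerate(names, start=1):
        # ... the slow parsing of one module would happen here ...
        ui_queue.put(("progress", name, index, len(names)))
    ui_queue.put(("done", None, len(names), len(names)))


def run_ui_loop():
    """UI thread: the only place that decides what is shown."""
    shown = []
    while True:
        kind, name, index, total = ui_queue.get()
        if kind == "done":
            shown.append("ready")
            return shown
        shown.append(f"loading {name} ({index}/{total})")


worker = threading.Thread(target=load_modules, args=(["core", "bestiary"],))
worker.start()
frames = run_ui_loop()
worker.join()
print(frames)
```

Note that the UI stays responsive here only because it never needs anything from the worker except those messages, which is exactly the part that's hard to arrange when all the state lives on the other side.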
In the case of FGU, though, even if you could figure out a way to create such a dispatcher between LUA and the Unity code layers, you still have a major problem to overcome, namely "exactly what am I supposed to dispatch?" FGU's rulesets, modules, and extensions have a ton of expression-based stuff. Value1 = value2 * value3. So if value3 changes, it needs to dispatch notice to anything that subscribes to it. Most game-related expressions aren't that simple. Even calculating a character's main attributes is a series of expressions in an "all games supported" engine like FG, and everything else builds off of the results of those expressions.
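A toy version of the Value1 = value2 * value3 case, again as a hedged Python sketch and not anything taken from FG. `Cell` and `derived` are hypothetical helpers: a cell notifies its subscribers when it changes, and a derived cell recomputes itself whenever one of its inputs does.

```python
class Cell:
    """A value that notifies its subscribers when it changes."""

    def __init__(self, value=None):
        self._value = value
        self._subscribers = []

    @property
    def value(self):
        return self._value

    def set(self, value):
        # Only dispatch when the value really changed.
        if value != self._value:
            self._value = value
            for callback in list(self._subscribers):
                callback()

    def subscribe(self, callback):
        self._subscribers.append(callback)


def derived(formula, *inputs):
    """Create a cell that is recomputed whenever any input changes."""
    cell = Cell(formula(*(source.value for source in inputs)))

    def recompute():
        cell.set(formula(*(source.value for source in inputs)))

    for source in inputs:
        source.subscribe(recompute)
    return cell


value2 = Cell(3)
value3 = Cell(4)
value1 = derived(lambda a, b: a * b, value2, value3)

dispatched = []  # stands in for "notice sent to the UI layer"
value1.subscribe(lambda: dispatched.append(value1.value))

value3.set(5)  # value1 is recomputed and its own subscribers are notified
print(value1.value, dispatched)
```

With one multiplication this is trivial. The work is in wiring every expression in every ruleset, module, and extension through something like it.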
Creating a dynamic subscription dispatcher is a lot of work to do to avoid hitching, and a very difficult architectural change. It's also pretty special case, so probably wouldn't be available on the Unity store. And it's also completely understandable why it would not be included in a complete rebuild of FG.
So far the only times I've seen noticeable hitching have been when opening a list (which is so much better now than it was a few months ago), when loading all the modules I want to use when setting up my campaign for the first time, and when players are connecting. I can live with the loading of modules thing because I only need to do it the one time, and I never toggle it while players are connected. All the players will experience the same hitching on their end when they activate the modules for the first time, but again I only hear them complain about "the steps" whenever we start a new campaign. My group might be rare in that it's usually punctual, so we don't even have to deal with the player connection hitching except at the beginning of the session.
-
November 24th, 2020, 20:46 #52
One simple solution is to limit the number of layers in your maps, and if it gets really bad, turn off things like FX and LOS. You can just use the original version of fog of war and reveal the maps as players explore.
-
November 25th, 2020, 01:15 #53
- Join Date
- Aug 2019
- Posts
- 2,025
Thanks for the thorough explanation!
The render thread is always single-threaded in every UI framework, including Qt, C# WPF, and Xamarin, as well as in game engines.
In the case of FGU, this is based on information coming from the LUA engine once the main playing area is displayed.
The initial load of FGU from the launcher shows a progress bar because at that point in time the LUA state machine can run in its thread, and the UI render thread doesn't care what the LUA state machine is doing -- yet.
In the launching mode it would only care to know when the LUA engine is ready.
Once the transition to "the game" happens, everything has to come via the LUA engine because that's where all the state is.
When you start loading modules, now you're in a state where your render thread is directly dependent upon the LUA engine, which is busy loading your module state and blocking further requests until it's done.
So the rendering pipeline waits for the LUA engine to give feedback. Why does the LUA engine not do this? I know it is busy loading and processing scripts, but how expensive can it be to send a short ping to the rendering engine every once in a while?
That's why it goes white screen at that point, and white screen is just the OS specific way of letting you know the program isn't rendering or accepting input.
Going to another scripting engine isn't a solution when part of the problem is maintaining backwards compatibility, and most of the problem is contention over locked resources (in this case the LUA state engine).
This is one reason that immutable (or Frozen) state is what you want to work with in the UI layer (and arguably in as many places as possible but that claim sparks a religious debate).
The way many of the UI rendering engines get away with threading for processing on something OTHER than the render thread is that they dispatch state changes to the UI layer. So while it appears that the program is responsive, it might not be doing anything useful other than placating the user by letting them know it's busy and not in the process of crashing.
In the case of FGU, though, even if you could figure out a way to create such a dispatcher between LUA and the Unity code layers, you still have a major problem to overcome, namely "exactly what am I supposed to dispatch?"
FGU's rulesets, modules, and extensions have a ton of expression-based stuff. Value1 = value2 * value3. So if value3 changes, it needs to dispatch notice to anything that subscribes to it. Most game-related expressions aren't that simple. Even calculating a character's main attributes is a series of expressions in an "all games supported" engine like FG, and everything else builds off of the results of those expressions.
Creating a dynamic subscription dispatcher is a lot of work to do to avoid hitching, and a very difficult architectural change. It's also pretty special case, so probably wouldn't be available on the Unity store. And it's also completely understandable why it would not be included in a complete rebuild of FG.
So far the only times I've seen noticeable hitching have been when opening a list (which is so much better now than it was a few months ago), when loading all the modules I want to use when setting up my campaign for the first time, and when players are connecting.
And then there are the regular dropouts when new features like LoS are used. This is something where Moon Wizard stated that they are considering multi-threading for the calculations of LoS. I would welcome this.
Then again, he also stated that networking is already multi-threaded, but I still got delays of up to half a minute when starting the Launcher, because the whole UI was waiting for networking to answer. Instead, the UI should have started and told me in a meaningful way that networking is still busy doing its thing. These are design/concept problems that come before the code and UI framework hurdles. It sometimes seems as if the developers forget that there are people in front of their expensive computer hardware waiting to get information about what the heck is going on and whether they should rather get a hot drink meanwhile.
-
November 25th, 2020, 01:17 #54
- Join Date
- Aug 2019
- Posts
- 2,025
So effectively going back to FGC, because from a player's perspective these are the main incentives to even use FGU. Not to mention the GM who was happy to get rid of having to manually unmask the map.
Network transfer is far quicker in FGU than in FGC, but players usually don't notice much of that when processing the transferred data takes so much longer instead.
-
November 25th, 2020, 11:19 #55
I can live with a half-minute delay here and there (and longer, because a minute is only a long time when you stare at a clock) while FGU is optimised and has features added (which is what sets FGU apart from FGC: it's going to get more stuff, LOS being just the start). As I am not an expert in writing computer software, not having done so for over 25 years, I will leave it to the people who write and fix code all day to do their job and deal with bug reports.
Got a Bug - Click & FOLLOW the procedure here, it will save time
Ultimate Edition Fantasy Grounds - ONLY ON Linux
Twitch Channel
-
November 25th, 2020, 19:48 #56
- Join Date
- Mar 2020
- Posts
- 84
I agree that giving some kind of progress indicator would address the perception issue. The solution is a lot more difficult than you make it out to be, though. I'd love to see it addressed, but not at the cost of features that matter more to me. From what a few others have chimed in with, that seems to be the position they hold too.
I'm on vacation right now, which is why I can spend so much time conjecturing and thinking about this. Plus I feel bad about dog-piling early in this thread now that I've given it a lot more thought and realized the scope of necessary changes.
I'm very bad at forum quoting, so I'm going to give it a try.
They aren't in the same thread. LUA runs on its own thread, but based on what Moon Wizard said, the "how to draw" comes from LUA, and that will be because FGU maintained backwards compatibility with FG. This is a very complicated problem to work around.
I appreciate the complexity of the pickle they've gotten themselves into. 15 years ago this was still commonplace in applications, and it wasn't until mobile applications hit it big that people started to demand a better experience than being happy to just have something that works.
Here's the pickle of the solution you suggested. LUA engine is busy loading a module. If they were to add a ping in that process, then the ping would go out to the Unity engine, and then need to be dispatched to a thread that cares. That thread happens to be the render thread. The render thread is waiting on the LUA thread, and now the LUA thread is waiting on the render thread. That's a deadlock.
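Here's that circular wait as a small Python sketch. It's only my illustration of the scenario described above, not how FGU is actually wired, and the event names are invented. The timeouts exist purely so the demo ends; without them both threads would sit there forever.

```python
import threading

state_delivered = threading.Event()  # would be set by LUA once the state is ready
ping_handled = threading.Event()     # would be set by the render thread after a ping
outcome = {}


def render_thread():
    # Waits for LUA before doing anything else, including handling pings.
    outcome["render_got_state"] = state_delivered.wait(timeout=0.2)
    if outcome["render_got_state"]:
        ping_handled.set()


def lua_thread():
    # Sends a synchronous ping and waits for it to be handled before
    # it carries on loading and delivers the state.
    outcome["lua_got_reply"] = ping_handled.wait(timeout=0.2)
    if outcome["lua_got_reply"]:
        state_delivered.set()


threads = [
    threading.Thread(target=render_thread),
    threading.Thread(target=lua_thread),
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
print(outcome)
```

Neither side ever gets what it is waiting for, so both waits time out.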
Okay, so let's say they fire and forget the ping to avoid deadlocking the render and LUA threads. Now the ping goes out to the dispatcher. The dispatcher will wait until the render thread can pick it up. The render thread will pick that up as soon as the LUA thread gives it the information it's currently waiting on. As soon as it gets a chance to continue on it will process a bunch of those pings all in a row. If you can remember back far enough, you might even remember software that used to do this. The software would be frozen and then all of a sudden it would refresh, then refresh again multiple times. That's those pings finally coming through.
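The fire-and-forget variant looks roughly like this in Python (a sketch under my own assumptions, with made-up names, not FGU code). The loader never waits on its pings, but the renderer can't look at them until it gets the state it was blocked on, so they all arrive in one burst.

```python
import queue
import threading

pings = queue.Queue()
state_ready = threading.Event()
log = []


def lua_thread():
    for step in range(1, 4):
        pings.put_nowait(f"ping {step}")  # fire and forget, never waits
    state_ready.set()  # loading finished, the render thread may continue


def render_thread():
    log.append("frozen")
    state_ready.wait()  # blocked until the state it needs is available
    while True:
        try:
            log.append(pings.get_nowait())  # all queued pings, back to back
        except queue.Empty:
            break


renderer = threading.Thread(target=render_thread)
loader = threading.Thread(target=lua_thread)
renderer.start()
loader.start()
loader.join()
renderer.join()
print(log)
```

No deadlock this time, but also no visible progress while the load is running.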
LUA is a scripting engine and is UI framework independent. That developers hook a UI to it is incidental to LUA and none of its concern. That developers are using the LUA engine to dictate everything about the UI is the crux of the problem we're seeing.
The optimization is something that they can work on. And if optimization is sufficient then all the rest of the asynchronous stuff wouldn't need to be done. I'd bet they are working on the optimization, but a few guys can only do so much in a given timeframe. And judging by the patch notes that I read with each update, they are quite busy.
That's not the application that is animating the cursor. The applications that manage it send an operating system message to change the cursor prior to doing heavy work, and then when the work is done they send another message to revert it back. Thankfully that's mostly a thing of the past, and application developers that do it nowadays need to be taught a new thing or two. The operating system now does it when an application doesn't respond to message prompting from the OS.
I had segued into a possible solution and was musing over the difficulties and complexities associated with that solution. Adding another layer to hold current "frozen" state is one possible solution to the problem. By adding a separate layer between the render layer and the LUA layer, the LUA layer could pump "pings" (as you put it) up to that layer and the pings would be of the nature of the new value of a given variable. Most applications can get away with pumping the fact that state changed and that's all that needs to happen. But pumping only that means that the receiver of the message would still need to go back to the LUA engine to get the actual changes. So that pump needs to be the state, too, not just that it changed.
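As a rough picture of that in-between layer, here's a Python sketch. `SnapshotStore` is a name I made up, and this is one possible shape, not a claim about FG's code. The script side publishes the new values themselves, and the render side only ever reads a finished, read-only snapshot, so it never has to go back and ask the script engine for anything.

```python
import threading
from types import MappingProxyType


class SnapshotStore:
    """Holds the latest read-only snapshot of the state the UI needs."""

    def __init__(self):
        self._lock = threading.Lock()  # serialises writers only
        self._current = MappingProxyType({})

    def publish(self, **changes):
        """Script side: push the new values, not just 'something changed'."""
        with self._lock:
            merged = dict(self._current)
            merged.update(changes)
            # Swap in a fresh read-only view; old snapshots stay untouched.
            self._current = MappingProxyType(merged)

    def latest(self):
        """Render side: grab whatever snapshot is current, without waiting."""
        return self._current


store = SnapshotStore()
store.publish(hp=10, name="Aria")
frame = store.latest()   # the renderer starts drawing from this snapshot
store.publish(hp=7)      # the script side moves on in the meantime
print(frame["hp"], store.latest()["hp"])
```

The frame being drawn keeps seeing the values it started with, while the next frame picks up the newer snapshot.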
The inherent problem with an approach of that nature in this case is that every value in an expression engine like the one FG uses is dependent on many other things. Identifying what would need to be pumped up the ladder would take extra effort, because when one value (say value3) changes, the LUA layer would have to figure out all of the values dependent on value3, and any values dependent upon those values. It's a tree. So now it would be pumping a ton of state all the time, and LUA isn't particularly speedy, either. (And the end result would be that it would take a lot longer to do any useful new processing.)
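To show what "figure out all of the values dependent on value3" means, here's a small Python sketch. The table of expressions is invented for the example (`attack_bonus`, `damage`, and so on are not real FG fields), and the walk is a plain breadth-first search.

```python
from collections import deque

# Hypothetical expressions: each value lists the values it is computed from.
depends_on = {
    "value1": ["value2", "value3"],
    "attack_bonus": ["value1", "proficiency"],
    "damage": ["value1"],
    "initiative": ["value2"],
}


def affected_by(changed, depends_on):
    """Return every value that must be recomputed when `changed` changes."""
    # Invert the table: source value -> values computed directly from it.
    dependents = {}
    for target, sources in depends_on.items():
        for source in sources:
            dependents.setdefault(source, []).append(target)

    # Walk outward from the changed value, level by level.
    seen = []
    pending = deque([changed])
    while pending:
        current = pending.popleft()
        for target in dependents.get(current, []):
            if target not in seen:
                seen.append(target)
                pending.append(target)
    return seen


print(affected_by("value3", depends_on))
```

Every entry in that result is another piece of state that would have to be pushed up, and that is for a table with four expressions in it.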
It is a shame from the perspective of a consumer that is fixating on that, for sure.
I can only conjecture, especially because I'm a relative newcomer, but I think this would have added at least a couple of months of development work to their timeline as they experimented with different approaches. They wouldn't have done it early because they needed a proof of concept. Then they needed to get ready for a Kickstarter. Then they needed to get the bits that people use and notice all the time functional for release. This loading issue isn't one that constantly plagues every user. It's a startup issue, or an issue that happens when a user toggles modules on later.
Anyways, it's a super complicated problem and I don't see that there's a whole lot of bang for buck to solve it. I'll admit that I've never seen FG's code. I have, however, come into projects to solve similar issues, and it's a hairy issue. If optimization doesn't fix it, it's a hard sell to say "well, I guess we can rewrite".
Developers truly don't forget. People assume developers are nefarious and have all these plots about how to make people's lives miserable. It's actually quite the contrary. You might be surprised to learn that most developers want their users to have the best possible experience. They are frustrated in their desires by only having so much time to get stuff done. That means prioritizing tasks in an order that makes the most sense to the widest audience. Only open source projects can afford to stay in architecture mode for 4-5 years.
-
November 25th, 2020, 20:13 #57
Yep, PCGen has really only been re-written fully like twice in 20 years.
Paul Grosse
PCGen BoD
PR Silverback
Autobackup Batch file, Always ALWAYS backup your data. Remember to follow the 3-2-1 rule!
-
November 25th, 2020, 22:22 #58
- Join Date
- Aug 2019
- Posts
- 2,025
For our group FG is mainly a graphical user-interface for table-top heavy systems, we don't need it for something like Eclipse Phase, just for Pathfinder. All the automation is great and helps to lessen the GM's work, but we could just as much play with PDF files and roll on our table (or via Discord bot). FG's main task (for us) is to provide a good input/output experience, show a map + help visualize what everyone is doing. That doesn't mean that we don't immensely enjoy the automation, especially my players who did not have to pay a single dime for it.
They aren't in the same thread. LUA runs on its own thread, but based on what Moon Wizard said, the "how to draw" comes from LUA, and that will be because FGU maintained backwards compatibility with FG. This is a very complicated problem to work around.
Here's the pickle of the solution you suggested. LUA engine is busy loading a module. If they were to add a ping in that process, then the ping would go out to the Unity engine, and then need to be dispatched to a thread that cares. That thread happens to be the render thread. The render thread is waiting on the LUA thread, and now the LUA thread is waiting on the render thread. That's a deadlock.
Let's assume that LUA methods and Unity API calls are not capable of anything other than loading a full file in one go. Then we could still at least keep the UI reactive in between multiple files being loaded, and between loading and processing.
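Something along these lines is what I have in mind, sketched in Python because I can't see FGU's code. The function names and file names are made up. Each file is still loaded in one blocking call, but control goes back to the UI after every file so it can redraw.

```python
def load_files(paths, read_file):
    """Load one whole file per step, handing control back in between."""
    for index, path in enumerate(paths, start=1):
        data = read_file(path)  # still blocks for the duration of one file
        yield index, len(paths), path, data


def run_frames(loader, draw):
    """Stand-in for the UI loop: redraw once after each finished file."""
    for index, total, path, _data in loader:
        draw(f"loaded {path} ({index}/{total})")


def fake_read(path):
    # Placeholder for the real, slow file load.
    return f"<contents of {path}>"


drawn = []
run_frames(load_files(["a.xml", "b.xml"], fake_read), drawn.append)
print(drawn)
```

This does nothing for a single huge file, but it would turn one long freeze into several shorter ones with feedback in between.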
The optimization is something that they can work on. And if optimization is sufficient then all the rest of the asynchronous stuff wouldn't need to be done. I'd bet they are working on the optimization, but a few guys can only do so much in a given timeframe. And judging by the patch notes that I read with each update, they are quite busy.
Here is an example: I double-clicked on the "Campaign" image sub-folder and FGU began loading the images inside to display thumbnails. Not only does it not tell me what is going on, but it did not even make the jump to the sub-folder to show me that it accepted my double-click. Ignore the FPS being displayed at the top, the real fps is 0 (zero) and the screen goes white ("no response") if I click on FGU's window again.
Every other piece of thumbnail-displaying software out there would open the folder, display the thumbnails that were already processed and then one by one display the ones that finished later. Some software even presents a progress indicator to show how far it got. FGU displays zilch until all images are read and ready to be displayed as thumbnails.
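In Python terms, the behaviour I mean looks roughly like this. It's a sketch only: `make_thumbnail` is a placeholder for the real decoding and scaling, and the file names are invented. The folder opens right away and each thumbnail is shown as soon as it is done.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def make_thumbnail(name):
    # Placeholder for decoding and scaling one image.
    return f"thumb:{name}"


def show_folder(names, display):
    """Open the folder at once, then add thumbnails as they finish."""
    display([])  # acknowledge the double-click before any image is ready
    ready = []
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = [pool.submit(make_thumbnail, name) for name in names]
        for future in as_completed(futures):
            ready.append(future.result())
            display(list(ready))  # redraw with what we have so far
    return ready


screens = []
final = show_folder(["map.png", "token.png", "handout.jpg"], screens.append)
print(len(screens), sorted(final))
```

The order in which the thumbnails appear depends on which one finishes first, but the user sees the folder immediately and sees it filling up.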
Feature request: Please implement a thumbnail cache for assets and overhaul the horrible Assets UI to at least reach the standards of FGC's.
The operating system now does it when an application doesn't respond to message prompting from the OS.
I had segued into a possible solution and was musing over the difficulties and complexities associated with that solution...
Developers truly don't forget. People assume developers are nefarious and have all these plots about how to make people's lives miserable.
Last edited by Weissrolf; November 25th, 2020 at 22:28.
-
November 26th, 2020, 00:39 #59
Concise bug reports are cool.
Expecting the devs to explain their rationale is not cool.
-
November 26th, 2020, 00:43 #60
- Join Date
- Aug 2019
- Posts
- 2,025
What does that tell us about the multi-threading discussion?
Last edited by Weissrolf; November 26th, 2020 at 00:45.