The Phantasmal MUDlib for DGD: Thread-Local Storage

Threads and Thread-Local Storage in DGD

Date: Sat, 6 Nov 1999 00:25:51 +0100 (CET)
From: "Felix A. Croes"
To: DGD Mailing List
Subject: [DGD] Thread Local Storage

What is thread local storage?

There are variables which are unique within the mud, and which can be kept in a central object. This is how this_player() has been implemented in the 2.4.5 simulation, for example. However, once multi-processor support is enabled in DGD a variable in an object which is modified in every thread -- such as this_player(), which is kept as a variable in /dgd/sys/global -- would effectively undo most of the advantages of running threads in parallel (for the details on why, I refer to my previous postings on the subject of multi-processor support; they're in the mailing list archive).

The solution is thread local storage, which exists only for as long as the current thread -- usually started by receiving a message or a callout -- is active. Thread local storage exists only on the stack, not in an object.

By inheriting /kernel/lib/api/tls, objects can access values in thread local storage, or even change the size of the thread local store.

TLS is implemented using a trick: the value returned by call_trace() includes the arguments to all functions. The kernel lib ensures that the first argument to the second function in the trace is always an array, which can be retrieved from the trace and accessed directly. By inheriting /kernel/lib/api/tls, this can be done safely and efficiently.

Regards,
Dworkin

From DGD Mailing List Wed May 7 16:52:01 2003
From: DGD Mailing List (Felix A. Croes)
Date: Wed May 7 16:52:01 2003
Subject: [DGD] the kernel library & callout resources

As you may have noticed, I removed callouts as a resource from the kernel library. My reason for doing this is that I designed the kernel library before I had worked out how I was going to support MP.
As it turns out, keeping track of callouts in this way would almost completely undo the MP benefit, since that depends on not counting the addition of a new callout to an object as a modification of that object. Keeping track of callouts as a resource definitely involves data modifications, which means that two threads simultaneously adding a callout to the same object are now in competition, and only one can complete. In DGD/MP, breaking up a thread which modifies two objects into two smaller threads which each modify only one object is actually a significant optimization. There is no good solution for keeping track of callouts as an LPC resource in this environment.

Since keeping track of callouts can still be very useful, I made their inclusion dependent on the value of a configuration parameter. By default, the value of CALLOUTRSRC in /include/config.h is FALSE, indicating that the callouts resource is not used. This can be turned into TRUE to keep using it. For a mud started from a statedump created with an earlier kernel library version, the default is to continue using the resource regardless of the value of CALLOUTRSRC. Finally, remove_callouts_rsrc() can be called in /kernel/sys/rsrcd to remove the callouts resource from a running mud.

Regards,
Dworkin

From DGD Mailing List Sat May 10 17:34:01 2003
From: DGD Mailing List (Felix A. Croes)
Date: Sat May 10 17:34:01 2003
Subject: [DGD] Multi Threading

"Ben Chambers" wrote:

> When the multi-threaded version is released, how exactly will it deal with
> multithreading? Will it be possible to create a message queue and set a
> thread to iterate over that and do some processing? What about setting a
> thread to listen for connections to a webserver and run a webserver
> independently of the main thread. Will these types of things be
> handleable?

A "thread" in LPC (thread is not a very good description of what it is, actually) runs only for a <very> short time, and never concurrently with another thread.
There is no such thing as a main thread. Instead, the normal running of the server consists of many brief threads running in sequence.

DGD/MP will be multi-threaded internally, but on the LPC level it will appear to work the same as before. The difference is that DGD/MP will sneakily, and invisibly to the LPC level, run some LPC threads concurrently, while still giving the appearance of running them in sequence. It does this by letting each LPC thread run on a personal copy of all objects it affects, and only committing its changes to the actual objects when those objects have not been modified by other threads in the meanwhile. A thread which fails to commit is rescheduled.

This means that the n-processor version of DGD/MP will not run n times as fast, since some threads will be rescheduled. In the worst case, with every thread except one failing to commit at all times, it will run as fast as if it were only using a single processor. Actual speed will vary, between 1 and n times that of a single-processor version.

To avoid confusion, I do not call DGD/MP a multi-threaded server, even though it is multi-threaded internally. Instead, I call it multi-processor. There really is no point in using it, unless you run it on a machine with multiple processors.

Regards,
Dworkin

From DGD Mailing List Sat May 17 04:03:00 2003
From: DGD Mailing List (Felix A. Croes)
Date: Sat May 17 04:03:00 2003
Subject: [DGD] Multi Threading

Kris Van Hees wrote:

> Not to offer criticism (since I do believe that re-scheduling threads of
> execution is the right way to go - you convinced me of that in 1995 already),
> but isn't the worst case where one or more threads of execution may have to be
> rescheduled multiple times, due to getting undercut by other threads? E.g. if
> all threads need to access a central object during their execution, it could be
> possible that thread n will get rescheduled (n - 1) times because all previous
> threads modifying that single central object? It does of course depend on the
> scheduling algorithm, and it is very easy to guarantee at least completion of
> all threads, but it seems that the possible impact of a high degree of
> collisions in accesses (bad design, but nonetheless) could be quite a bit higher
> than sequential threads on a uni-processor driver.

The system guarantees that of all LPC threads running at any one time, at least one will complete successfully. Thus the worst case is equivalent to a single-threaded system which executes LPC threads in the same order of completion, but sequentially.

It is true that while the system as a whole still progresses, individual threads may be rescheduled a number of times, and thereby delayed. Making sure that eventually they do complete is not that hard a problem. The most simple solution would be to run a thread which has been rescheduled some given number of times in single-threaded mode. That would still not affect the worst case for the system as a whole.

Of course, there is some overhead for scheduling, making copies of objects to work on for each thread, and so on. But if that overhead were significant, I would be doing a bad job as a programmer.

Regards,
Dworkin

From DGD Mailing List Fri Feb 6 15:23:01 2004
From: DGD Mailing List (Felix A. Croes)
Date: Fri Feb 6 15:23:01 2004
Subject: [DGD] Melville under the Kernel Lib

Michael McKiel wrote:

> I feel I have somewhat of a grip (albeit perhaps a vague one in a few places)
> on the process and how to go about it, and what the KLib and Melville are
> each doing. All except for the Thread Local Storage bit.
> I grep for it, and look over the lib/api/tls.c ... and all its functions
> while again actually defined in the driver itself -- are never actually used.
> The only ones that appear to get called all thru the Kernel Lib is
> query_tls_size(), along with allocates. And Melville doesn't even use TLS at
> all. This is also one thing I couldn't find much resource for in the
> archives.
> I know a description was given here, but I don't quite "get it" :)
> So what I'm wondering is why are there so many tlvar functions that are never
> used...what instances might require them to be ?

Forget about how they're implemented, then. This is what you'd use them for:

    inherit "/kernel/lib/api/tls";

    void create()
    {
        ::create();
        set_tls_size(1);    /* one thread local variable, index 0 */
    }

    void set_this_player(object player)
    {
        set_tlsvar(0, player);
    }

    object this_player()    /* return the current player */
    {
        return get_tlsvar(0);
    }

The kernel library itself does not use this interface because it can access the internals more directly. It actually uses 3 TLS variables itself for various purposes.

> And what does the current Melville "lose" from not using TLS at all?

Only DGD/MP preparedness.

Regards,
Dworkin

From DGD Mailing List Sun Feb 8 10:43:01 2004
From: DGD Mailing List (Felix A. Croes)
Date: Sun Feb 8 10:43:01 2004
Subject: [DGD] Melville under the Kernel Lib

"Steve Foley" wrote:

> > - It must be possible to start a callout without making
> >   a change to data in any object (this is why callouts
> >   are no longer a resource in the kernel library, since
> >   resources are tracked by a central object).
>
> I don't understand the nature of this requirement. I wish I could more
> specifically articulate what I don't understand about this, but I'm afraid
> I can't. I have a vague feeling it has to do with the compare, commit or
> reschedule process that occurs in making a multi-threaded environment
> appear to be single-threaded, but I'm not really even all that sure of
> that. Would someone be so kind as to shine some light on this? Thanks in
> advance.

Let's start with what this requirement means. What you should not do is anything like the following, in the auto object:

    private int pending_callouts;

    int call_out(string function, mixed delay, mixed args...)
    {
        int handle;

        handle = ::call_out("call_out_gate", delay, function, args);
        pending_callouts++;     /* modifies this object's data */
        return handle;
    }

    void call_out_gate(string function, mixed *args)
    {
        --pending_callouts;     /* modifies this object's data again */
        call_other(this_object(), function, args...);
    }

This is no good because each started callout modifies the object's data (i.e. pending_callouts). Therefore the thread will compete with any other thread that also tries to modify this object. If the above method of starting callouts is the only way, then in fact <any> callout addition will also modify the object, and no two threads will be able to simultaneously add callouts to the same object.

Worse still, if you were to keep track of callouts in a central object somewhere, any callout started in any object will compete with any other callout started somewhere else, since both threads will be trying to modify the same central object.

With all references to pending_callouts removed from the above example code, adding a callout to the object will not count as an object modification. 1000 simultaneous threads could each add a callout to the object, and none would be considered to be in competition with the others. In theory, all would be able to complete.

Why is this important? Suppose that you want to broadcast a message to all players online. Actually calling a function in all player objects that modifies data in each of them is likely to conflict with anything else that also modifies player objects (and there is a lot of that going on). So, instead of sending the message directly, you call a relay function in the player object, which starts a zero-delay callout to process the message later on. No data is modified, and the broadcasting thread will not be in competition with any other.

Regards,
Dworkin

From DGD Mailing List Fri Feb 13 17:34:01 2004
From: DGD Mailing List (Felix A. Croes)
Date: Fri Feb 13 17:34:01 2004
Subject: [DGD] Another mudlib requirement

I wrote:

> - It must be possible to start a callout without making
>   a change to data in any object (this is why callouts
>   are no longer a resource in the kernel library, since
>   resources are tracked by a central object).

Suppose that several threads are starting callouts in the same object. They don't modify the object, so these callouts are added, which is fine.

However, some objects will have many callouts added to them all the time. That means that adding a new callout might not only happen simultaneously with another thread doing the same thing, but also with a thread started from a previous callout in that very object. The latter will probably modify the object, and if it commits before any of the callout-adding threads, they will be cancelled because they accessed a version of the object's state which is now out of date.

There is a way around this: add a callout without accessing any data in the object at all (neither read nor write access). This includes the object's variables as well as the callout table itself. So, you can add a callout to the object, but you can't access the variables, or use status(obj) to get a listing of the existing callouts.

Therefore I'm broadening the requirement to:

    It must be possible to start a callout in an object without
    accessing any data in that object.

At present, the kernel library itself does not yet follow this rule. This will be addressed in the next patch.

Regards,
Dworkin

From DGD Mailing List Sat Feb 14 06:04:01 2004
From: DGD Mailing List (Felix A. Croes)
Date: Sat Feb 14 06:04:01 2004
Subject: [DGD] Another mudlib requirement

"Steve Foley" wrote:

> "Felix A. Croes" wrote:
>
> > The latter will probably modify the object, and if it commits before
> > any of the callout-adding threads, they will be cancelled because
> > they accessed a version of the object's state which is now out of
> > date.
>
> How much discrimination will DGD be able to make in this regard? I can imagine
> reading one member of an array (arr[0]) in one thread (thread A) while another
> thread (thread B) modifies a different array member (arr[5]). If thread B
> commits before thread A, will thread A be rescheduled as a consequence of thread
> B? Assume thread A doesn't assign arr to a local or global, and doesn't pass it
> as a parameter in a function call.

In theory, the granularity could be brought down to individual array elements. The basic procedure is as follows: each thread works on a copy of the actual data. When the thread is ready to finish, all the 'data entities' it has accessed are checked, and if no other modifications have been committed to them since the thread started, the thread will commit its own modifications. Otherwise, the changed data copy is discarded and the thread is rescheduled.

In DGD/MP, the granularity is just below objects. An object is divided into its entry in the global object table, its LPC-level data, and its callout table. Additions (not deletions) to the callout table receive special treatment, so that they can be executed in parallel; among other things this means that callouts added from different threads will always have different callout handles.

Doing this for individual array elements would be possible, but then there would also be the administrative overhead of keeping track, for each array element, of which thread last modified it.

For the LPC programmer it all boils down to two simple rules:

 - try to break up threads that access many objects, using callouts
 - try not to access objects that you add callouts to in any other way

Regards,
Dworkin

From DGD Mailing List Sat Feb 14 06:10:01 2004
From: DGD Mailing List (Felix A. Croes)
Date: Sat Feb 14 06:10:01 2004
Subject: [DGD] Another mudlib requirement

Steve Wooster wrote:

> At 12:33 AM 2/14/2004 +0100, you wrote:
> >The latter will probably modify the object, and if it commits before
> >any of the callout-adding threads, they will be cancelled because
> >they accessed a version of the object's state which is now out of
> >date.
>
> This caused me to wonder something... if I wrote a thread that took an
> insane amount of CPU, and therefore a really long amount of time to
> complete, would it be possible that it might end up getting postponed
> indefinitely? For example: (assume this object has infinite rlimits)

Note that such a thread always accesses the object that it starts in, since starting effectively involves the deletion of a callout from the object's callout table.

This is a good question. I haven't made my decision yet, but I think that I will not let parallel threads start in the same object. So for this object, the thread that starts first would finish first, and the next one might be delayed.

Note that threads won't be cancelled indefinitely. Eventually, DGD/MP will run them in such a way that they cannot be cancelled by any other thread.

Regards,
Dworkin

From DGD Mailing List Sat Feb 14 16:39:01 2004
From: DGD Mailing List (Felix A. Croes)
Date: Sat Feb 14 16:39:01 2004
Subject: [DGD] Another mudlib requirement

"Steve Foley" wrote:

> > Doing this for individual array elements would be possible, but then
> > there would also be the administrative overhead of keeping track, for
> > each array element, of which thread last modified it.
>
> So, if these array elements were LWOs, and the changes were made to data in the
> LWO but not the array itself you could avoid a reschedule? I'm assuming when
> you say 'objects' that includes LWOs, but I just wanted to be sure.

Sorry, no, I mean persistent objects only. DGD manages light-weight objects internally as arrays, most of the time.

Regards,
Dworkin
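
The broadcast advice in the Feb 8 and Feb 13 2004 posts above (relay through a zero-delay callout, and add that callout without reading or writing the target object's data) might look roughly like the sketch below. It is only an illustration in plain DGD LPC, not code from the kernel library: relay_message(), receive_message(), broadcast() and the backlog variable are made-up names, and the sketch assumes that the call_out() reached from the player object does not itself touch that object's data.

```lpc
/*
 * Hypothetical player object fragment. The function and variable
 * names are illustrative only.
 */
private string *backlog;    /* messages received so far */

static void create()
{
    backlog = ({ });
}

/*
 * Started by the zero-delay callout, so it runs in a thread of its
 * own. Only here is the object's data read and modified.
 */
static void receive_message(string message)
{
    backlog += ({ message });
}

/*
 * Called from the broadcasting thread. It accesses no variables of
 * this object and only adds a callout.
 */
void relay_message(string message)
{
    call_out("receive_message", 0, message);
}

/*
 * The following function would live in a separate, equally
 * hypothetical broadcaster object.
 */
void broadcast(object *players, string message)
{
    int i, sz;

    for (i = 0, sz = sizeof(players); i < sz; i++) {
        players[i]->relay_message(message);
    }
}
```

With this split, the broadcasting thread only adds callouts, and each player's data is modified in a short thread that involves that one player object.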

From: DGD Mailing List (Noah Gibbs)
Date: Sat May 10 16:16:02 2003
Subject: [DGD] Multi Threading

Those things are quite handleable now. But if you mean, "will any thread appear to always be running and never exit?", the answer is "no". Threads will still spawn often and terminate quickly, it's just that more than one may be running at the same time.

Remember that DGD does a certain amount of heavy lifting when threads terminate (swapping objects to disk, recompiling objects, etc), and that still has to happen, and happen often.

--- Ben Chambers wrote:
> When the multi-threaded version is released, how
> exactly will it deal with
> multithreading? Will it be possible to create a
> message queue and set a
> thread to iterate over that and do some processing?
> What about setting a
> thread to listen for connections to a webserver and
> run a webserver
> independently of the main thread. Will these
> types of things be
> handleable?
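
As a rough illustration of the reply above, work on a queue of items can be written as a chain of short threads instead of one thread that never exits. The sketch below is plain DGD LPC under stated assumptions: BATCH, process_item(), run_slice() and start_job() are hypothetical names, and the pending items travel as a callout argument so that the object needs no variables of its own.

```lpc
# define BATCH  10  /* hypothetical number of items handled per thread */

/* placeholder for the real work done on one queued item */
static void process_item(mixed item)
{
}

/*
 * Handle at most BATCH items and hand the remainder to a new
 * zero-delay callout, so that no single thread runs for long.
 */
static void run_slice(mixed *items)
{
    int i, n;

    n = sizeof(items);
    if (n > BATCH) {
        call_out("run_slice", 0, items[BATCH ..]);
        n = BATCH;
    }
    for (i = 0; i < n; i++) {
        process_item(items[i]);
    }
}

/* queue a list of items for processing in later threads */
void start_job(mixed *items)
{
    call_out("run_slice", 0, items);
}
```

Each callout starts a new, brief thread, which matches the description of threads that spawn often and terminate quickly.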

From: DGD Mailing List (Erwin Harte)
Date: Sat Apr 24 19:19:01 2004
Subject: [DGD] Re: time() and suspending call_outs

On Sat, Apr 24, 2004 at 03:34:16PM -0700, Steve Wooster wrote:
> My first question is... are time() and millitime() static within a
> single thread? For example, would this be an infinite loop or a one-second
> loop?
>
> void hog_all_the_cpu()
> {
>     int time=time();
>     rlimits (-1; -1)
>     {
>         while(time==time());
>     }
> }

That should be a one second loop.

> What about if I replaced time() with millitime() in that function?

Did you try?

    > @code t = millitime(); while (t[0] == millitime()[0] && t[1] == millitime()[1]); return ({ t, millitime() });
    $30 = ({ ({ 1082847960, 0.883 }), ({ 1082847960, 0.884 }) })

:-)

[...]

> Is there any way to do the use_up_ticks_without_using_cpu() function?

Not that I'm aware of.

[...]

> I just thought of one more question... for an object daemon, since I don't
> want the source code of every single object stored in memory at once, am I
> forced to store the source code in real objects rather than LWOs referenced
> by the object daemon?

If you want to store the source code at all (which is quite a project, let me tell you), then you need persistent objects, yes; otherwise it'll still all be stored in the main object after all.

Cheers,

Erwin.
--
Erwin Harte

This website is released to the public under the terms of the GNU Free Documentation License, Version 1.3 or later