Phantasmal MUD Lib for DGD


Threads and Thread-Local Storage in DGD

Date: Sat, 6 Nov 1999 00:25:51 +0100 (CET)
From: "Felix A. Croes"
To: DGD Mailing List
Subject: [DGD]Thread Local Storage

What is thread local storage?
 
There are variables which are unique within the mud, and which can be
kept in a central object.  This is how this_player() has been
implemented in the 2.4.5 simulation, for example.  However, once multi-processor
support is enabled in DGD a variable in an object which is modified
in every thread -- such as this_player(), which is kept as a
variable in /dgd/sys/global -- would effectively undo most of the 
advantages of running threads in parallel (for the details on why,
I refer to my previous postings on the subject of multi-processor
support; they're in the mailing list archive).  
 
The solution is thread local storage, which exists only for as long
as the current thread -- usually started by receiving a message or a
callout -- is active.  Thread local storage exists only on the stack,
not in an object.  By inheriting /kernel/lib/api/tls, objects can
access values in thread local storage, or even change the size of the
thread local store.

TLS is implemented using a trick: the value returned by call_trace()
includes the arguments to all functions.  The kernel lib ensures
that the first argument to the second function in the trace is  
always an array, which can be retrieved from the trace and
accessed directly.  By inheriting /kernel/lib/api/tls, this can
be done safely and efficiently.
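As a rough sketch (this is not the kernel library's actual code), the
retrieval could look like the following; TRACE_FIRSTARG is assumed here
to be the index of the first function argument within a trace element,
as defined in the kernel library's trace.h:

    # include <trace.h>

    /*
     * Sketch only: fetch the TLS array that the kernel lib plants as
     * the first argument of the second function in the call trace.
     */
    static mixed *fetch_tls()
    {
	mixed **trace;

	trace = call_trace();
	return trace[1][TRACE_FIRSTARG];
    }

Inheriting /kernel/lib/api/tls hides this detail behind its accessor
functions, which is why user code should never need to do this directly.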

Regards,
Dworkin 



From DGD Mailing List  Wed May  7 16:52:01 2003
From: DGD Mailing List (Felix A. Croes)
Date: Wed May  7 16:52:01 2003
Subject: [DGD] the kernel library & callout resources

As you may have noticed, I removed callouts as a resource from the
kernel library.

My reason for doing this is that I designed the kernel library before
I had worked out how I was going to support MP.  As it turns out,
keeping track of callouts in this way would almost completely undo
the MP benefit, since that depends on not counting the addition of
a new callout to an object as a modification of that object.
Keeping track of callouts as a resource definitely involves data
modifications, which means that two threads simultaneously adding
a callout to the same object are now in competition, and only one
can complete.

In DGD/MP, breaking up a thread which modifies two objects into two
smaller threads which each modify only one object is actually a
significant optimization.  There is no good solution for keeping
track of callouts as an LPC resource in this environment.

Since keeping track of callouts can still be very useful, I made their
inclusion dependent on the value of a configuration parameter.  By
default, the value of CALLOUTRSRC in /include/config.h is FALSE,
indicating that the callouts resource is not used.  Set it to TRUE
to keep using the resource.  For a mud started from a statedump created
with an earlier kernel library version, the default is to continue using
the resource regardless of the value of CALLOUTRSRC.  Finally,
remove_callouts_rsrc() can be called in /kernel/sys/rsrcd to remove the
callouts resource from a running mud.
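The setting itself is a one-line definition in the kernel library's
config header (the surrounding context may differ between versions):

    /* in /include/config.h */
    # define CALLOUTRSRC	FALSE	/* track callouts as a resource? */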

Regards,
Dworkin



From DGD Mailing List  Sat May 10 17:34:01 2003
From: DGD Mailing List (Felix A. Croes)
Date: Sat May 10 17:34:01 2003
Subject: [DGD] Multi Threading

"Ben Chambers" wrote:

> When the multi-threaded version is released, how exactly will it deal with
> multithreading?  Will it be possible to create a message queue and set a
> thread to iterate over that and do some processing?  What about setting a
> thread to listen for connections to a webserver and run a webserver
> independently of the main thread.  Will these types of things be
> handleable?

A "thread" in LPC (thread is not a very good description of what it is,
actually) runs only for a <very> short time, and never concurrently
with another thread.  There is no such thing as a main thread.  Instead,
the normal running of the server consists of many brief threads running
in sequence.

DGD/MP will be multi-threaded internally, but on the LPC level it will
appear to work the same as before.  The difference is that DGD/MP will
sneakily, and invisibly to the LPC level, run some LPC threads
concurrently, while still giving the appearance of running them in
sequence.  It does this by letting each LPC thread run on a personal
copy of all objects it affects, and only committing its changes to the
actual objects when those objects have not been modified by other
threads in the meanwhile.  A thread which fails to commit is rescheduled.
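At the LPC level, long-running work such as draining a message queue is
therefore usually broken into a chain of brief callout-driven threads.
A minimal sketch (the function names enqueue(), process() and handle()
are made up for illustration):

    private mixed *queue;

    void enqueue(mixed message)
    {
	if (!queue) {
	    queue = ({ });
	    call_out("process", 0);	/* start a brief thread to drain it */
	}
	queue += ({ message });
    }

    static void process()
    {
	mixed message;

	message = queue[0];
	queue = queue[1 ..];
	handle(message);		/* hypothetical per-message handler */
	if (sizeof(queue) != 0) {
	    call_out("process", 0);	/* continue in a fresh, brief thread */
	} else {
	    queue = nil;
	}
    }

Each callout runs as its own short LPC thread, so no single thread ever
monopolizes the server.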

This means that the n-processor version of DGD/MP will not run n times
as fast, since some threads will be rescheduled.  In the worst case,
with every thread except one failing to commit at all times, it will
run as fast as if it were only using a single processor.  Actual
speed will vary, between 1 and n times that of a single-processor
version.

To avoid confusion, I do not call DGD/MP a multi-threaded server,
even though it is multi-threaded internally.  Instead, I call it
multi-processor.  There really is no point in using it, unless you
run it on a machine with multiple processors.

Regards,
Dworkin



From DGD Mailing List  Sat May 17 04:03:00 2003
From: DGD Mailing List (Felix A. Croes)
Date: Sat May 17 04:03:00 2003
Subject: [DGD] Multi Threading

Kris Van Hees wrote:

> Not to offer criticism (since I do believe that re-scheduling threads of
> execution is the right way to go - you convinced me of that in 1995 already),
> but isn't the worst case where one or more threads of execution may have to be
> rescheduled multiple times, due to getting undercut by other threads?  E.g. if
> all threads need to access a central object during their execution, it could be
> possible that thread n will get rescheduled (n - 1) times because all previous
> threads modifying that single central object?  It does of course depend on the
> scheduling algorithm, and it is very easy to guarantee at least completion of
> all threads, but it seems that the possible impact of a high degree of
> collisions in accesses (bad design, but nonetheless) could be quite a bit higher
> than sequential threads on a uni-processor driver.

The system guarantees that of all LPC threads running at any one time,
at least one will complete successfully.  Thus the worst case is
equivalent to a single-threaded system which executes LPC threads in
the same order of completion, but sequentially.

It is true that while the system as a whole still progresses, individual
threads may be rescheduled a number of times, and thereby delayed.
Making sure that eventually they do complete is not that hard a problem.
The simplest solution would be to run a thread which has been
rescheduled some given number of times in single-threaded mode.  That
would still not affect the worst case for the system as a whole.

Of course, there is some overhead for scheduling, making copies of
objects to work on for each thread, and so on.  But if that overhead
were significant, I would be doing a bad job as a programmer.

Regards,
Dworkin




From DGD Mailing List  Fri Feb  6 15:23:01 2004
From: DGD Mailing List (Felix A. Croes)
Date: Fri Feb  6 15:23:01 2004
Subject: [DGD] Melville under the Kernel Lib

Michael McKiel wrote:

> I feel I have somewhat of a grip (albeit perhaps a vague one in a few places)
> on the process and how to go about it, and what the KLib and Melville are
> each doing. All except for the Thread Local Storage bit.
> I grep for it, and look over the lib/api/tls.c ... and all its functions
> while again actually defined in the driver itself -- are never actually used.
> The only ones that appear to get called all thru the Kernel Lib is
> query_tls_size(), along with allocates. And Melville doesn't even use TLS at
> all. This is also one thing I couldn't find much resource for in the
> archives. I know a description was given here, but I don't quite "get it" :)
> So what I'm wondering is why are there so many tlvar functions that are never
> used...what instances might require them to be ? 

Forget about how they're implemented, then.  This is what you'd use them
for:

    inherit "/kernel/lib/api/tls";

    void create()
    {
	::create();
	set_tls_size(1);	/* one thread local variable, index 0 */
    }

    void set_this_player(object player)
    {
	set_tlsvar(0, player);
    }

    object this_player()	/* return the current player */
    {
	return get_tlsvar(0);
    }

The kernel library itself does not use this interface because it can
access the internals more directly.  It actually uses 3 TLS variables
itself for various purposes.


> And what does the current Melville "lose" from not using TLS at all?

Only DGD/MP preparedness.

Regards,
Dworkin



From DGD Mailing List  Sun Feb  8 10:43:01 2004
From: DGD Mailing List (Felix A. Croes)
Date: Sun Feb  8 10:43:01 2004
Subject: [DGD] Melville under the Kernel Lib

"Steve Foley" wrote:

> >     - It must be possible to start a callout without making
> >       a change to data in any object (this is why callouts
> >       are no longer a resource in the kernel library, since
> >       resources are tracked by a central object).
>
> I don't understand the nature of this requirement.  I wish I could more
> specifically articulate what I don't understand about this, but I'm afraid
> I can't.  I have a vague feeling it has to do with the compare, commit or
> reschedule process that occurs in making a multi-threaded environment
> appear to be single-threaded, but I'm not really even all that sure of
> that.  Would someone be so kind as to shine some light on this?  Thanks in
> advance.

Let's start with what this requirement means.  What you should not do
is anything like the following, in the auto object:

    private int pending_callouts;

    int call_out(string function, mixed delay, mixed args...)
    {
	int handle;

	handle = ::call_out("call_out_gate", delay, function, args);
	pending_callouts++;
	return handle;
    }

    void call_out_gate(string function, mixed *args)
    {
	--pending_callouts;
	call_other(this_object(), function, args...);
    }

This is no good because each started callout modifies the object's data
(i.e. pending_callouts).  Therefore the thread will compete with any
other thread that also tries to modify this object.  If the above
method of starting callouts is the only way, then in fact <any> callout
addition will also modify the object, and no two threads will be able
to simultaneously add callouts to the same object.

Worse still, if you were to keep track of callouts in a central object
somewhere, any callout started in any object will compete with any other
callout started somewhere else, since both threads will be trying to
modify the same central object.

If all references to pending_callouts are removed from the above example
code, adding a callout to the object will not count as an object modification.
1000 simultaneous threads could each add a callout to the object, and
none would be considered to be in competition with the others.  In
theory, all would be able to complete.

Why is this important?  Suppose that you want to broadcast a message
to all players online.  Actually calling a function in all player
objects that modifies data in each of them is likely to conflict with
anything else that also modifies player objects (and there is a lot of
that going on). So, instead of sending the message directly, you call
a relay function in the player object, which starts a zero-delay callout
to process the message later on.  No data is modified, and the
broadcasting thread will not be in competition with any other.
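In code, the relay pattern might look like this in the player object
(relay_message() and receive_message() are hypothetical names, and
message_user() stands in for whatever output function the lib provides):

    /* in the player object */
    void relay_message(string message)
    {
	/* only adds a callout; reads and writes no object data */
	call_out("receive_message", 0, message);
    }

    static void receive_message(string message)
    {
	/* runs later in its own brief thread, free to modify this object */
	message_user(message);
    }

The broadcasting thread simply calls relay_message() in each player
object; all actual state changes happen later, one brief thread per
player.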

Regards,
Dworkin



From DGD Mailing List  Fri Feb 13 17:34:01 2004
From: DGD Mailing List (Felix A. Croes)
Date: Fri Feb 13 17:34:01 2004
Subject: [DGD] Another mudlib requirement

I wrote:

>     - It must be possible to start a callout without making
>       a change to data in any object (this is why callouts
>       are no longer a resource in the kernel library, since
>       resources are tracked by a central object).

Suppose that several threads are starting callouts in the same object.
They don't modify the object, so these callouts are added, which is
fine.  However, some objects will have many callouts added to them
all the time.  That means that adding a new callout might not only
happen simultaneously with another thread doing the same thing, but
also with a thread started from a previous callout in that very
object.

The latter will probably modify the object, and if it commits before
any of the callout-adding threads, they will be cancelled because
they accessed a version of the object's state which is now out of
date.

There is a way around this: add a callout without accessing any data
in the object at all (neither read nor write access).  This includes
the object's variables as well as the callout table itself.  So, you
can add a callout to the object, but you can't access the variables,
or use status(obj) to get a listing of the existing callouts.

Therefore I'm broadening the requirement to:

    It must be possible to start a callout in an object without
    accessing any data in that object.

At present, the kernel library itself does not yet follow this rule.
This will be addressed in the next patch.

Regards,
Dworkin



From DGD Mailing List  Sat Feb 14 06:04:01 2004
From: DGD Mailing List (Felix A. Croes)
Date: Sat Feb 14 06:04:01 2004
Subject: [DGD] Another mudlib requirement

"Steve Foley" wrote:

> "Felix A. Croes" wrote:
>
> > The latter will probably modify the object, and if it commits before
> > any of the callout-adding threads, they will be cancelled because
> > they accessed a version of the object's state which is now out of
> > date.
>
> How much discrimination will DGD be able to make in this regard?  I can imagine
> reading one member of an array (arr[0]) in one thread (thread A) while another
> thread (thread B) modifies a different array member (arr[5]).  If thread B
> commits before thread A, will thread A be rescheduled as a consequence of thread
> B?  Assume thread A doesn't assign arr to a local or global, and doesn't pass it
> as a parameter in a function call.

In theory, the granularity could be brought down to individual array
elements.  The basic procedure is as follows: each thread works on a
copy of the actual data.  When the thread is ready to finish, all the
'data entities' it has accessed are checked, and if no other
modifications have been committed to them since the thread started,
the thread will commit its own modifications.  Otherwise, the changed 
data copy is discarded and the thread is rescheduled.

In DGD/MP, the granularity is just below objects.  An object is divided
into its entry in the global object table, its LPC-level data, and
its callout table.  Additions (not deletions) to the callout table
receive special treatment, so that they can be executed in parallel;
among other things this means that callouts added from different
threads will always have different callout handles.

Doing this for individual array elements would be possible, but then
there would also be the administrative overhead of keeping track, for
each array element, of which thread last modified it.

For the LPC programmer it all boils down to two simple rules:

 - try to break up threads that access many objects, using callouts
 - try not to access objects that you add callouts to in any other way

Regards,
Dworkin

From DGD Mailing List  Sat Feb 14 06:10:01 2004
From: DGD Mailing List (Felix A. Croes)
Date: Sat Feb 14 06:10:01 2004
Subject: [DGD] Another mudlib requirement

Steve Wooster wrote:

> At 12:33 AM 2/14/2004 +0100, you wrote:
> >The latter will probably modify the object, and if it commits before
> >any of the callout-adding threads, they will be cancelled because
> >they accessed a version of the object's state which is now out of
> >date.
>
>      This caused me to wonder something... if I wrote a thread that took an 
> insane amount of CPU, and therefore a really long amount of time to 
> complete, would it be possible that it might end up getting postponed 
> indefinitely? For example: (assume this object has infinite rlimits)

Note that such a thread always accesses the object that it starts in,
since starting effectively involves the deletion of a callout from the
object's callout table.

This is a good question.  I haven't made my decision yet, but I think
that I will not let parallel threads start in the same object.  So for
this object, the thread that starts first would finish first, and the
next one might be delayed.

Note that threads won't be cancelled indefinitely.  Eventually, DGD/MP
will run them in such a way that they cannot be cancelled by any other
thread.

Regards,
Dworkin



From DGD Mailing List  Sat Feb 14 16:39:01 2004
From: DGD Mailing List (Felix A. Croes)
Date: Sat Feb 14 16:39:01 2004
Subject: [DGD] Another mudlib requirement

"Steve Foley" wrote:

> > Doing this for individual array elements would be possible, but then
> > there would also be the administrative overhead of keeping track, for
> > each array element, of which thread last modified it.
>
> So, if these array elements were LWOs, and the changes were made to data in the
> LWO but not the array itself you could avoid a reschedule?  I'm assuming when
> you say 'objects' that includes LWOs, but I just wanted to be sure.

Sorry, no, I mean persistent objects only.  DGD manages light-weight
objects internally as arrays, most of the time.

Regards,
Dworkin

From DGD Mailing List  Sat May 10 16:16:02 2003
From: DGD Mailing List (Noah Gibbs)
Date: Sat May 10 16:16:02 2003
Subject: [DGD] Multi Threading

  Those things are quite handleable now.  But if you
mean, "will any thread appear to always be running and
never exit?", the answer is "no".  Threads will still
spawn often and terminate quickly, it's just that more
than one may be running at the same time.
  Remember that DGD does a certain amount of heavy
lifting when threads terminate (swapping objects to
disk, recompiling objects, etc), and that still has to
happen, and happen often.

--- Ben Chambers wrote:
> When the multi-threaded version is released, how exactly will it deal
> with multithreading?  Will it be possible to create a message queue
> and set a thread to iterate over that and do some processing?  What
> about setting a thread to listen for connections to a webserver and
> run a webserver independently of the main thread.  Will these types
> of things be handleable?

From DGD Mailing List  Sat Apr 24 19:19:01 2004
From: DGD Mailing List (Erwin Harte)
Date: Sat Apr 24 19:19:01 2004
Subject: [DGD] Re: time() and suspending call_outs

On Sat, Apr 24, 2004 at 03:34:16PM -0700, Steve Wooster wrote:
>     My first question is... are time() and militime() static within a 
> single thread? For example, would this be an infinite loop or a one-second 
> loop?
> 
> void hog_all_the_cpu()
> {
>     int time=time();
>     rlimits (-1, -1)
>     {
>         while(time==time());
>     }
> }

That should be a one second loop.

> What about if I replaced time() with militime() in that function?

Did you try?

    > @code t = millitime(); while (t[0] == millitime()[0] && t[1] == millitime()[1]); return ({ t, millitime() });
    $30 = ({ ({ 1082847960, 0.883 }), ({ 1082847960, 0.884 }) })

:-)

[...]
> Is there any way to do the use_up_ticks_without_using_cpu() function?

Not that I'm aware of.

[...]
> I just thought of one more question... for an object daemon, since I don't 
> want the source code of every single object stored in memory at once, am I 
> forced to store the source code in real objects rather than LWOs referenced 
> by the object daemon?

If you want to store the source code at all (which is quite a project,
let me tell you), then you need persistent objects, yes; otherwise it
will all still be stored in the main object after all.

Cheers,

Erwin.
-- 
Erwin Harte