Advanced Wireless Networks - 4G Technologies

648 ACTIVE NETWORKS

[Figure 16.18 plots loss percentage (y-axis, 0.14 to 0.28) against the external arrival rate (x-axis, 0.3 to 0.9) for three control schemes: no control, bang-bang control, and RNN with reinforcement learning.]

Figure 16.18 Comparison of average loss through the CPN with reinforcement-learning-based control using delay and loss as the goal.

Each player’s objective function, u_i, is a function of the particular action chosen by player i, a_i, and the particular actions chosen by all of the other players in the game, a_{-i}, and yields a real number. Other games may include additional components, such as the information available to each player and communication mechanisms. In a repeated game, players are allowed to observe the actions of the other players, remember past actions, and attempt to predict future actions of players. An action vector a is said to be a Nash equilibrium (NE) if

u_i(a) ≥ u_i(b_i, a_{-i})  ∀ i ∈ N, ∀ b_i ∈ A_i

(16.6)

Restated, an NE is an action vector from which no player can profitably unilaterally deviate. NE correspond to the steady states of the game and are therefore predicted to be its most probable outcomes.
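The NE condition of Equation (16.6) can be checked mechanically for any finite game by enumerating the action space and testing every unilateral deviation; a minimal sketch in Python with a hypothetical two-player payoff table:

```python
from itertools import product

def nash_equilibria(action_sets, utilities):
    """Enumerate pure-strategy Nash equilibria of a finite normal-form game.

    action_sets: list of action lists, one per player.
    utilities:   function (player_index, action_vector) -> payoff.
    """
    equilibria = []
    for a in product(*action_sets):
        a = list(a)
        stable = True
        for i, actions in enumerate(action_sets):
            for b in actions:                       # candidate deviation b_i
                deviation = a[:i] + [b] + a[i + 1:]
                if utilities(i, deviation) > utilities(i, a):
                    stable = False                  # player i profits by deviating
                    break
            if not stable:
                break
        if stable:
            equilibria.append(tuple(a))
    return equilibria

# Prisoner's dilemma as an illustrative stage game: action 0 = cooperate, 1 = defect.
payoff = {(0, 0): (3, 3), (0, 1): (0, 5), (1, 0): (5, 0), (1, 1): (1, 1)}
u = lambda i, a: payoff[tuple(a)][i]
print(nash_equilibria([[0, 1], [0, 1]], u))   # [(1, 1)] -- mutual defection
```

As expected, the only action vector from which no player can profitably deviate alone is mutual defection, even though it is Pareto-dominated.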

A repeated game is a sequence of stages where each stage is the same normal form game. When the game has an infinite number of stages, the game is said to be an infinite horizon game. Players choose strategies (actions at each stage) based on their knowledge of the game: past actions, future expectations and current observations. These strategies can be fixed, contingent on the actions of other players, or adaptive. These strategies can also be designed to punish players who deviate from agreed-upon behavior. When punishment occurs, players choose their actions to minimize the payoff of the offending player. However, player i is still able to achieve some payoff v_i. Thus there is a limit to how much a player can be punished.

As estimations of future values of ui are uncertain, many repeated games modify the original objective functions by discounting the expected payoffs in future stages by δ, where 0 < δ < 1 such that the anticipated value in stage k to player i is given by

u_{i,k}(a) = δ^k u_i(a)

(16.7)
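The discounted valuation of Equation (16.7) can be illustrated in a few lines; the stage payoffs and horizon below are illustrative, and the truncated sum approaches the familiar 1/(1 − δ) limit for a constant stage payoff:

```python
def discounted_payoff(stage_payoffs, delta):
    """Total discounted payoff: sum over stages k of delta**k * u_k,
    per Equation (16.7), with 0 < delta < 1."""
    return sum((delta ** k) * u for k, u in enumerate(stage_payoffs))

# A constant stage payoff of 1.0 over an infinite horizon sums to 1/(1 - delta);
# truncating at 200 stages with delta = 0.9 already comes very close to 10.
approx = discounted_payoff([1.0] * 200, 0.9)
print(round(approx, 3))
```

Because δ^k shrinks geometrically, distant stages contribute almost nothing, which is exactly why a player weighing punishment threats cares most about near-term payoffs.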

GAME THEORY MODELS IN COGNITIVE RADIO NETWORKS

It can be shown that, in a repeated game with an infinite horizon and discounting, for every feasible payoff vector v that gives each player i ∈ N more than its punishment payoff v_i, there exists a δ₀ < 1 such that for all δ ∈ (δ₀, 1) there is an NE with payoffs v (Folk theorem [69]). To generalize the Folk theorem: given a discounted infinite horizon repeated game, nearly any behavior can be made the steady state through the proper choice of punishment strategies and δ. Thus convergent behavior of a repeated game can be achieved nearly independently of the objective function.

A myopic game is defined as a repeated game in which there is no communication between the players, no memory of past events and no speculation about future events. Any adaptation by a player can still be based on knowledge of the current state of the game. As players give no consideration to future payoffs, the Folk theorem does not hold for myopic games and convergence to steady-state behavior must occur through other means. Two convergence dynamics possible in a myopic game are the best response dynamic and the better response dynamic. Both dynamics require additional structure in the stage game to ensure convergence. The best response dynamic [70] refers to the case where, at each stage, one player i ∈ N is permitted to deviate from a_i to some randomly selected action b_i ∈ A_i if

u_i(b_i, a_{-i}) ≥ u_i(c_i, a_{-i})  ∀ c_i ≠ b_i ∈ A_i   and   u_i(b_i, a_{-i}) > u_i(a)

The better response dynamic [70] refers to the case where, at each stage, one player i ∈ N is permitted to deviate from a_i to some randomly selected action b_i ∈ A_i if u_i(b_i, a_{-i}) > u_i(a_i, a_{-i}).
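The better response rule can be simulated directly; the sketch below (with a toy two-player coordination game as the stage game, an illustrative choice) lets one randomly chosen player deviate to a randomly selected action only when doing so strictly improves its payoff:

```python
import random

def better_response_dynamic(action_sets, utilities, steps=500, seed=1):
    """At each stage one randomly chosen player i deviates to a randomly
    selected action b_i only if u_i(b_i, a_-i) > u_i(a_i, a_-i)."""
    rng = random.Random(seed)
    # Deliberately start from a mismatched profile so some moves occur.
    a = [actions[k % len(actions)] for k, actions in enumerate(action_sets)]
    for _ in range(steps):
        i = rng.randrange(len(action_sets))
        b = rng.choice(action_sets[i])
        trial = a[:i] + [b] + a[i + 1:]
        if utilities(i, trial) > utilities(i, a):   # strict improvement only
            a = trial
    return tuple(a)

# Toy coordination stage game: payoff 1 when the two players match, else 0.
u = lambda i, a: 1 if a[0] == a[1] else 0
result = better_response_dynamic([[0, 1], [0, 1]], u)
print(result)                       # a matching (Nash equilibrium) profile
```

Once the players match, no strict improvement exists and the dynamic is absorbed at that NE, illustrating why additional stage-game structure (S-modularity or a potential, below) is what guarantees such absorption in general.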

An S-modular game restricts {u_j} such that, for all i ∈ N, either Equation (16.8) or Equation (16.9) is satisfied.

∂²u_i(a) / ∂a_i ∂a_j ≥ 0  ∀ j ≠ i ∈ N

(16.8)

∂²u_i(a) / ∂a_i ∂a_j ≤ 0  ∀ j ≠ i ∈ N

(16.9)

In the former case, the game is said to be supermodular; in the latter, submodular. Myopic games whose stages are S-modular games with a unique NE converge to that NE under a best response dynamic [71].

A potential game is a special type of game where {u_j} are such that the change in value seen by a unilaterally deviating player is reflected in a single function P: A → ℝ. All myopic games where the stages are the same potential game converge to a NE when decisions are updated according to a better response dynamic [70].

An exact potential game (EPG) is a game for which there exists some function (an exact potential function, EPF) P: A → ℝ such that, ∀ i ∈ N, ∀ a ∈ A,

u_i(a_i, a_{-i}) − u_i(b_i, a_{-i}) = P(a_i, a_{-i}) − P(b_i, a_{-i})

(16.10)

A necessary and sufficient condition for a game to be an exact potential game is [72]

∂²u_i(a) / ∂a_i ∂a_j = ∂²u_j(a) / ∂a_j ∂a_i  ∀ i, j ∈ N, ∀ a ∈ A

(16.11)

A coordination-dummy game [72] is a composite of a coordination game with identical interest function V and a dummy game with dummy functions D_i whose values are solely dependent on the actions of the other players, and can be expressed as

u_i(a) = V(a) + D_i(a_{-i})

(16.12)

An EPF for this game can be written as

P(a) = V(a)

(16.13)

In a self-motivated game, each player’s utility is a function solely of that player’s own action:

u_i(a) = h_i(a_i)

(16.14)

A self-motivated game can be shown to be an EPG by introducing the EPF

P(a) = Σ_{i∈N} h_i(a_i)

(16.15)

A bilateral symmetric interaction (BSI) game [73] refers to the case where every player’s objective function can be characterized by

u_i(a) = Σ_{j∈N\{i}} w_{ij}(a_i, a_j) − h_i(a_i)

(16.16)

where w_{ij}: A_i × A_j → ℝ and h_i: A_i → ℝ are such that, for every (a_i, a_j) ∈ A_i × A_j, w_{ij}(a_i, a_j) = w_{ji}(a_j, a_i). An EPF for a BSI game is given by

 

P(a) = Σ_{i∈N} Σ_{j=1}^{i−1} w_{ij}(a_i, a_j) − Σ_{i∈N} h_i(a_i)

(16.17)

Ordinal potential games (OPG) refer to the case where there exists some function (an ordinal potential function, OPF) P: A → ℝ such that, in place of the equality of Equation (16.10), only the ordering is preserved:

u_i(a_i, a_{-i}) > u_i(b_i, a_{-i}) ⇔ P(a_i, a_{-i}) > P(b_i, a_{-i})  ∀ i ∈ N, ∀ a ∈ A

(16.18)

All EPG are also OPG.
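These potential constructions can be verified numerically. The sketch below builds a toy BSI game from Equation (16.16), forms the EPF of Equation (16.17), and checks the exact-potential property of Equation (16.10) over the whole action space; the particular w and h are arbitrary illustrative choices:

```python
from itertools import product

# A toy bilateral symmetric interaction (BSI) game per Equation (16.16):
# u_i(a) = sum_{j != i} w_ij(a_i, a_j) - h_i(a_i), with w_ij(x, y) = w_ji(y, x).
N = range(3)
actions = [0, 1, 2]
w = lambda x, y: x * y                     # symmetric interaction term
h = lambda i, x: (i + 1) * x ** 2          # per-player self-cost

def u(i, a):
    return sum(w(a[i], a[j]) for j in N if j != i) - h(i, a[i])

def P(a):
    """EPF from Equation (16.17): each pair counted once, minus the h_i sum."""
    pairs = sum(w(a[i], a[j]) for i in N for j in range(i))
    return pairs - sum(h(i, a[i]) for i in N)

# Exact-potential property, Equation (16.10): every unilateral deviation
# changes u_i and P by exactly the same amount.
for a in product(actions, repeat=3):
    a = list(a)
    for i in N:
        for b in actions:
            d = a[:i] + [b] + a[i + 1:]
            assert u(i, a) - u(i, d) == P(a) - P(d)
print("Equation (16.17) is an exact potential for the BSI game")
```

The symmetry w_{ij}(x, y) = w_{ji}(y, x) is what lets each pairwise term be counted once in P while still absorbing either player’s deviation.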

16.6.1 Cognitive radio networks as a game

The cognitive radios in the network form the game’s set of decision makers. The set of physical layer parameters which a radio is permitted to alter forms the player’s action set. From these action sets, the action space is formed. Preference relations over the action space are formed by an exhaustive evaluation of the adaptation algorithms. Objective functions are then formed by mapping the preference relations to the real number line so that more preferable action vectors map to larger values than less preferable ones.

16.6.1.1 Distributed power control

As an example we examine distributed power control algorithms within the context of CRNs [71, 74, 77]. Below, the following notation is used:

N, the set of decision-making (cognitive) radios in the network;

i, j, two different cognitive radios, i, j ∈ N;

P_j, the set of power levels available to radio j; this is presumed to be a segment of the real number line ℝ;


p_j, a power level chosen by j from P_j;

P, the power space (⊆ ℝⁿ) formed from the Cartesian product of all P_j: P = P_1 × P_2 × ··· × P_n;

p, a power profile (vector) from P formed as p = (p_1, p_2, ..., p_n);

u_j(p), the utility that j receives from p; this is the specific objective function that j is looking to maximize.

 

Based on these conventions, a power control game G can be formulated as G = ⟨N, P, {u_j}⟩.

16.6.1.2 Repeated power games

MacKenzie and Wicker [74] consider a discounted repeated power control game implemented on a packet-based network wherein the original objective function for each radio j is the modified function of throughput given as

 

u_j(p) = R[1 − 2BER(p)]^L / p_j

(16.19)

where R is the transmission rate, L is the packet length and BER is the bit error rate which is a function of the SINR seen by j. As the modeled game has an infinite horizon, a CRN implemented in this manner will exhibit convergent behavior if:

(1) some mechanism exists for broadcasting the desired operating vector, the discount factor, and the punishment strategy;

(2) there is full knowledge of the environment so the radios can differentiate deviant behavior from fades and jammers external to the CRN;

(3) there is knowledge of the action chosen by each radio at each stage.

16.6.1.3 S-modular games

Altman and Altman [71] examine the application of S-modular games to distributed power control. The objective functions for this game are thus characterized by Equation (16.8) or (16.9). Herein each game follows a general updating algorithm (GUA), which is actually a best response dynamic. Thus, if these network games have a unique NE, behavior converges to the NE from any initial p. For a CRN which satisfies this characterization, the conditions for convergence are:

(1) The adaptation algorithms must incorporate perfect knowledge of the objective function.

(2) The network must have a unique steady state.

(3) Some method must exist for measuring current performance and for sensing relevant environmental parameters. Depending on the particular adaptation algorithm, it might be necessary to know the number and type of other radios in the network.

Altman and Altman [71] assert that the classes of power control algorithms considered by Yates [75] are S-modular games. Yates examines power control in the context of the uplink of a cellular system under five scenarios:

(1) fixed assignment, where each mobile is assigned to a particular base station;

(2) minimum power assignment, where each mobile is assigned to the base station where its SINR is maximized;

(3) macro diversity, where all base stations combine the signals of the mobiles;

(4) limited diversity, where a subset of the base stations combine the signals of the mobiles; and

(5) multiple connection reception, where the target SINR must be maintained at a number of base stations.

In each scenario, each mobile j tries to achieve a target SINR γ_j as measured through a function I(p). I(p) is the standard effective interference function, which has the following properties: (1) positivity, i.e. I(p) > 0; (2) monotonicity, i.e. if p ≥ p*, then I(p) ≥ I(p*); and (3) scalability, i.e. for all α > 1, αI(p) > I(αp), where the convention p > p* means that p_i > p_i* ∀ i ∈ N.
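These three properties guarantee that the fixed-point iteration p ← I(p) converges to the minimal power vector at which every user meets its target SINR. A two-user sketch under fixed assignment, with illustrative (hypothetical) gains, targets and noise values:

```python
# Yates' framework [75]: I(p) gives, for each user, the minimum power needed to
# hit its target SINR against the current interference; iterating p <- I(p)
# converges to the unique fixed point. All numbers below are illustrative.
gain = [[1.0, 0.2],
        [0.3, 1.0]]            # gain[j][i]: path gain from user i to receiver j
gamma = [2.0, 1.5]             # target SINRs
noise = [0.1, 0.1]

def I(p):
    """Standard interference function for two users under fixed assignment."""
    return [gamma[j] * (sum(gain[j][i] * p[i] for i in range(2) if i != j)
                        + noise[j]) / gain[j][j]
            for j in range(2)]

p = [1.0, 1.0]                  # any positive starting point works
for _ in range(100):
    p = I(p)

# At the fixed point each achieved SINR equals its target.
for j in range(2):
    interf = sum(gain[j][i] * p[i] for i in range(2) if i != j) + noise[j]
    sinr = gain[j][j] * p[j] / interf
    print(f"user {j}: SINR = {sinr:.4f} (target {gamma[j]})")
```

Monotonicity and scalability are what make the iteration a contraction toward the fixed point regardless of the starting power vector.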

Single-cell target SINR games are also ordinal potential games. To prove this, consider a modified game where the action sets are the received power levels at the base station. The received power level for mobile j, r_j, is the product of its path loss to base station k, h_{j,k}, and its transmitted power level. The objective function of a target SINR game can then be expressed as

u_j(p) = 1 − ( −γ_j + r_j − Σ_{i∈N\{j}} r_i − σ_k )²

(16.20)

where σ_k is the noise power at k. Expanding the above expression gives

u_j(p) = 1 − γ_j² − σ_k² − 2γ_jσ_k − ( Σ_{i∈N\{j}} r_i )² − 2σ_k Σ_{i∈N\{j}} r_i − 2γ_j Σ_{i∈N\{j}} r_i − r_j² + 2γ_j r_j + 2σ_k r_j + 2r_j Σ_{i∈N\{j}} r_i

(16.21)

The terms of Equation (16.21) that are independent of r_j constitute a dummy game; the three terms −r_j² + 2γ_j r_j + 2σ_k r_j form a self-motivated game, an EPG. The final term, 2r_j Σ_{i∈N\{j}} r_i, is also an EPG as it is a BSI game. Since a composite game formed from a linear combination of two EPGs is itself an EPG, the target SINR game is an EPG. As other forms of target SINR games are ordinal transformations (OT) of Equation (16.21), and an OT of an OPG is an OPG, all target SINR games are OPG. It can also be shown that all target throughput power control games are OPG (all throughput maximization power control games are OPG).
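This decomposition can be checked numerically. The sketch below forms the potential as the sum of the self-motivated parts plus the BSI pairs counted once (the dummy part cancels out of unilateral differences) and verifies Equation (16.10) on a grid of received-power profiles; targets and noise are illustrative:

```python
from itertools import product

# Numeric check that the single-cell target-SINR game of Equation (16.21),
# with received powers r as actions, is an exact potential game.
gamma = [2.0, 1.5, 1.0]        # illustrative target SINRs
sigma = 0.1                    # noise power at the base station

def u(j, r):
    others = sum(r[i] for i in range(3) if i != j)
    return 1 - (r[j] - gamma[j] - others - sigma) ** 2

def P(r):
    """Self-motivated parts plus BSI pairs counted once (dummy terms omitted)."""
    self_part = sum(-r[i] ** 2 + 2 * (gamma[i] + sigma) * r[i] for i in range(3))
    bsi_part = sum(2 * r[i] * r[j] for i in range(3) for j in range(i))
    return self_part + bsi_part

grid = [0.0, 0.5, 1.0]
for r in product(grid, repeat=3):
    r = list(r)
    for j in range(3):
        for b in grid:
            d = r[:j] + [b] + r[j + 1:]
            # Equation (16.10): unilateral utility change equals potential change.
            assert abs((u(j, r) - u(j, d)) - (P(r) - P(d))) < 1e-9
print("target-SINR game: unilateral utility changes match the potential")
```

Dropping the dummy terms from P is harmless precisely because they do not depend on the deviating player’s own received power.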

16.6.1.4 Linear pricing

Goodman and Mandayam [76] consider a scenario wherein mobile devices are trying to maximize a function of their throughput. Note that the only NE of games whose utility functions are pure throughput functions is the power vector where all mobiles transmit at maximum power. To avoid this problem, Goodman and Mandayam [76] introduce the modified throughput function

u_j(p) = B T(p) / p_j − cRp_j

(16.22)

where R is the transmission rate, c is a constant, B is the battery life and T is the throughput function. This can be shown to be neither an EPG nor an OPG. The left-most term is generally only an OPG, whereas the pricing term is an EPG as it is a self-motivated game.

Goodman and Mandayam [76] use a best response dynamic in their experiments, which converge to a NE. Although not stated in Goodman and Mandayam [76], Saraydar et al. [78] show that Equation (16.22) is indeed a supermodular game. Also note that, without the cost function, Equation (16.19) is a particular instance of Equation (16.22). So a CRN implementing the repeated games of References [74], [76] or [78] is a supermodular game and will exhibit convergent behavior if the properties specified above for S-modular games are satisfied. However, it should be noted that, when left as a repeated game, there is greater flexibility in dynamically selecting the steady-state.

16.6.1.5 Nonlinear group pricing

Sung and Wong [77] consider another single-cell distributed power control game with pricing, characterized by

u_j(p) = R_j T(p) − λ h_j p_j / ( Σ_{i∈N} h_i p_i )

(16.23)

where λ is a constant and the base station subscript has been dropped, as all mobiles are communicating with the same base station. The pricing function reflects the fact that the damage caused by p_j to the other devices in the network is a function of the relative difference in powers rather than of the absolute power level. Note that Equation (16.23) is just a composite game of throughput maximization and a cost function. As we have indicated that throughput maximization is an OPG, Equation (16.23) can be guaranteed to be an OPG only if (though not necessarily if) the pricing function has an EPF, which can only be true if Equation (16.11) is satisfied. Differentiating the price function twice yields:

 

∂²C_i / (∂p_i ∂p_j) = [ λ h_i h_j ( Σ_{k∈N} h_k p_k )² − 2h_j ( Σ_{k∈N} h_k p_k ) g_i(p) ] / ( Σ_{k∈N} h_k p_k )⁴

∂²C_j / (∂p_j ∂p_i) = [ λ h_i h_j ( Σ_{k∈N} h_k p_k )² − 2h_i ( Σ_{k∈N} h_k p_k ) g_j(p) ] / ( Σ_{k∈N} h_k p_k )⁴

where g_i(p) = λ h_i Σ_{k∈N} h_k p_k − λ h_i² p_i. Further evaluation leads to the result that this price function has an EPF if h_j p_j = h_i p_i. Note that this is only satisfied when the received powers are identical, which will generally not be true. Thus the cost function does not have an EPF and the nonlinear group priced game is not an OPG. Also note that Equation (16.23) cannot be guaranteed to be a supermodular game either, as properly chosen power vectors p and p* will yield different signs for the second partial derivatives of the cost functions.
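The asymmetry of the mixed partials is easy to exhibit numerically. The finite-difference sketch below (two users, illustrative gains) shows the cross partials of the two cost terms differing whenever h_i p_i ≠ h_j p_j and agreeing when the received powers match:

```python
# Finite-difference check of the mixed partials of the pricing term in
# Equation (16.23), C_j(p) = lambda * h_j * p_j / sum_k h_k p_k.
# Equation (16.11) fails whenever the received powers h_i p_i differ.
lam = 1.0
h = [1.0, 2.0]                  # illustrative path gains

def C(j, p):
    return lam * h[j] * p[j] / sum(h[k] * p[k] for k in range(2))

def mixed_partial(j, p, eps=1e-5):
    """d^2 C_j / dp_0 dp_1 by central differences."""
    def bump(i0, i1):
        q = [p[0] + i0 * eps, p[1] + i1 * eps]
        return C(j, q)
    return (bump(1, 1) - bump(1, -1) - bump(-1, 1) + bump(-1, -1)) / (4 * eps ** 2)

p_unequal = [1.0, 1.0]          # h0*p0 = 1.0, h1*p1 = 2.0: received powers differ
p_equal = [2.0, 1.0]            # h0*p0 = h1*p1 = 2.0: received powers identical
print(mixed_partial(0, p_unequal), mixed_partial(1, p_unequal))  # differ in sign
print(mixed_partial(0, p_equal), mixed_partial(1, p_equal))      # both (near) zero
```

At p_unequal the two cross partials even take opposite signs, which is also why supermodularity cannot be guaranteed for Equation (16.23).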


Therefore neither the better response dynamic nor the best response dynamic will assuredly converge, and a more complex convergence dynamic is required. As the repeated game dynamic is guaranteed to converge, it can still be used. Thus this modification of the pricing function significantly complicates the network.

Additional examples of game theory applications in resource allocation modeling can be found in References [116–127] and routing in [79–115].

16.7 BIOLOGICALLY INSPIRED NETWORKS

Current Internet protocols were never planned for the emerging pervasive environments, where the amount of information will be enormous. The communications requirements placed by these protocols on low-cost sensor and tag nodes are in direct contradiction to the fundamental goals of these nodes: being small, inexpensive and maintenance-free. Biological systems provide insights into principles which can be adopted to completely redefine the basic concepts of control, structure, interaction and function of the emerging pervasive environments. The study of the rules of genetics and evolution, combined with mobility, leads to the definition of service-oriented communication systems which are autonomous and autonomously self-adaptive. Based on References [128–131], in this section we discuss how this paradigm shift, which views a network only as a randomly self-organizing by-product of a collection of self-optimizing services, may become the enabler of the new world of omnipresent low-cost pervasive environments of the future.

16.7.1 Bio-analogies

In the scenario depicted in Carreras et al. [131], services are associated with living organisms. A service is defined by chromosomes; in this way a service evolves and adapts to the environment constantly and autonomously. By analogy with living organisms, chromosomes are collections of genes, which are the smallest service-related data units and inborn intelligence/instincts, and thus represent all the information needed for the organism to function and for the service to be executed. As in nature, this concept defines a complete life cycle of the organisms and therefore of the services. The life cycle starts with the birth of an organism, goes through reproduction and ends with death. Reproduction and evolution occur by applying evolution rules inherited from nature. Fitness measures the correspondence of the organism’s genetic information with the environment and determines the exchange of (genetic) information. Therefore no end-to-end communication concept exists in these systems: information is only exchanged as needed, locally, between mating organisms.

The environment determines natural selection based on the fitness of the organisms, leading to the best services possible as a function of the environment. In this concept the service is the organism. A scenario is envisioned where users will be more and more interested in a service able to provide reliable localized information. The role of the service will be, for instance, to provide answers to questions like ‘How is the weather around the train station?’ or ‘Where will I find a free parking space around there?’

Services will be hosted on users’ devices and will go around through the physical movement of the users. Each service is constituted by a program and its related data that is organized into chromosomes. Each chromosome consists of:

(1) data, i.e. the genetic information of the organism, organized in genes;

(2) a plugin that stores a syntax notation describing the actions dictated by the chromosome and the fitness (degree of attraction) operator yielding natural selection through preferred mating.

Genes are a tuple of information and consist of: (a) value; (b) timing information; (c) information source ID/location information; and (d) other data depending on the service.

Organisms are diploid, which means that there are always two homologous chromosomes associated with each service. The two homologous chromosomes have the same genes but in different forms. Alleles are different forms of the same gene and correspond to different values in the tuple of gene information. They may differ in timing information or in the source value information. Each allele may be dominant or recessive, depending on the service behavior. Having two chromosomes allows us to estimate the reliability of the data. We would probably always choose the youngest data value to be the actual one if it represents a parking lot, but we might also average the sensor data if it represents temperature.

The choice of the preferred value among the two reflects the concept of dominant and recessive genes. As in nature, recessive information enables the service to survive in different environments, providing the service with higher resilience against data corruption and fraud, and may even allow for additional features. An analogy with the life cycle is also possible. In this concept a service is born when the user downloads the chromosome onto his device. From that moment on, the user is able to interact with the other organisms (i.e. users carrying chromosomes) and with the environment where the users are physically moving. When gathering information from the environment the service grows. While growing, the service improves its functionality, meaning that it becomes able to increase performance. When a user meets another user while moving, services may reproduce and produce offspring. It is in this phase that evolution and natural selection occur. In order to be able to reproduce, the service must satisfy some fitness requirements, since it is not willing to spread useless information. We assume a service to be dead when it cannot reproduce anymore.

Analogies with birth, growth, reproduction and death are elaborated even further in this concept. As we have already said, the service is born when the user obtains (downloads) an empty chromosome of a certain service, consisting only of the plugin. From that instant on, the user can read data from sensors and use the syntax definition from the plugin. At this stage the organism is haploid, i.e. it has only one chromosome per service. When reading sensor data, the user fills both chromosomes at the same time. This information is in any case more reliable than the previous information on the chromosomes.

The concept of mating is performed in the following way: in meiosis, the diploid cell splits into two haploid reproductive cells (the eggs or sperm). In real life, four haploid cells would appear because of one step of copying the chromosomes, including cross-over; each two copies would be identical. This means that the chromosome pair is split and copied. Two packets are sent out, one after another, each containing one chromosome. In the best case, when all the sent packets reach the receiver, it has four different combinations of the chromosome pairs. This is the best case; the user may decide, for energy reasons, to send out only one chromosome, or receive only one because of packet loss. A selection has to take place to decide which of these combinations survives. This selection could be influenced by the quality of the plugin (by the version number). Having such a selection can help to spread new versions of plugins. It also may help to repair broken chromosomes (damaged during the wireless transmission). In a sense, we allow for truly spontaneous mutations and we may even find that mutations spread; it remains to be seen if this is any good. Selection occurs as a consequence of localization and age. The fitness of a chromosome is defined as the average of the timing information of the genes, weighted with the localization information. In this sense the environment participates in the selection, since the survival of a service also depends (but not only) on where the user is when he mates with another user.

If the sensor data in a chromosome is too old, it is very likely useless. Thus we forbid sending out the chromosome after a certain threshold (coded in the plugin). This way the chromosome can die, but it may be reborn through reading new sensor data or receiving a fresh chromosome. The service is considered alive as long as it is able to answer questions. Death is therefore a consequence of outdated chromosomal information (sensor-gathered information ages). It is in the interest of the user to exchange information and to gather sensor information, and this same interest drives the service’s instinct to survive. This defines a complete life cycle of the service.
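The gene/chromosome structures and the age-driven life cycle described above can be sketched as follows; the field names, the freshness-based fitness (a simplification of the timing-and-localization weighting described above) and the threshold value are illustrative assumptions, not the authors' implementation:

```python
import time
from dataclasses import dataclass, field

@dataclass
class Gene:
    value: object              # sensed value (e.g. parking spot free/occupied)
    timestamp: float           # timing information
    source_id: str             # information source ID / location

@dataclass
class Chromosome:
    genes: list = field(default_factory=list)
    max_age: float = 600.0     # freshness threshold (s), coded in the plugin

    def fitness(self, now):
        """Average freshness of the genes; stale data lowers fitness."""
        if not self.genes:
            return 0.0
        ages = [now - g.timestamp for g in self.genes]
        return sum(max(0.0, 1.0 - a / self.max_age) for a in ages) / len(ages)

    def alive(self, now):
        """A chromosome with no gene younger than the threshold is dead
        and must not be sent out; fresh data can make it reborn."""
        return any(now - g.timestamp < self.max_age for g in self.genes)

now = time.time()
c = Chromosome([Gene(True, now - 60, "lot-17"), Gene(False, now - 900, "lot-3")])
print(round(c.fitness(now), 2), c.alive(now))
```

Here one gene is fresh and one is past the threshold, so the chromosome is still alive but with reduced fitness, which in turn would reduce its chances in preferred mating.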

As an example of the environmental monitoring applications, Carreras et al. [132] describe a parking lot finding application as a possible application scenario of the above concept for the nomadic sensor network:

Each parking spot in a city is equipped with a sensor that is capable of notifying whether the parking lot is free or not, together with its geographical position (or ID). Each user subscribing to the service is equipped with a hand-held device that communicates both with the sensor nodes and with other users who subscribed to the service. The users, while moving around the city, collect information on the status (free/not free) of the parking lots they encounter on their way. This information is stored in their devices, together with a time stamp of when the information was collected. Whenever two users come into communication range, they exchange their knowledge of the parking lot status (and update their own if the information is ‘fresh’ enough). The basic service of the system to the user would be assistance in finding the nearest parking lot. The user queries his device (which might be some PDA), asking for the nearest free parking lot in a specific area of the city. The device will provide this information based on the data that it has stored so far. Of course, the information cannot be considered real-time.
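The exchange step of this scenario reduces to a timestamp-based merge of the two users' tables; a sketch with hypothetical spot IDs and timestamps:

```python
# When two users meet, each keeps, per parking spot, whichever report carries
# the fresher time stamp; unknown spots are simply adopted.
def merge_knowledge(mine, theirs):
    """mine/theirs: dict spot_id -> (is_free, timestamp). Returns the merge."""
    merged = dict(mine)
    for spot, (free, ts) in theirs.items():
        if spot not in merged or ts > merged[spot][1]:
            merged[spot] = (free, ts)      # their report is fresher (or new)
        # else: our own report is at least as fresh; keep it
    return merged

alice = {"A1": (True, 100), "A2": (False, 250)}
bob = {"A1": (False, 300), "A3": (True, 120)}
print(merge_knowledge(alice, bob))
# {'A1': (False, 300), 'A2': (False, 250), 'A3': (True, 120)}
```

Alice learns that spot A1 has since been taken and discovers A3, without any end-to-end communication: knowledge spreads purely through local encounters.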

16.7.2 Bionet architecture

The bio-networking architecture [130–134] applies the key concepts and mechanisms described above to design network applications. One of the key concepts in biological systems is emergence. In biological systems, beneficial system properties (e.g. adaptability) often emerge through the simple and autonomous interactions among diverse biological entities (see the above example). The bio-networking architecture applies the concept of emergence by implementing network applications as a group of distributed, autonomous and diverse objects called cyber-entities (CEs). In Itao et al. [134], an analogy to a bee colony (a network application) consisting of multiple bees (CEs) is used. Each CE implements a functional service related to the application and follows simple behaviors similar to biological entities, such as reproduction, death, migration and environment sensing, as discussed in the previous section.

In the bio-networking architecture, CEs are designed based on the three principles described below in order to interact and collectively provide network applications that are autonomous, scalable, adaptive, and simple.

(1) CEs are decentralized: there are no central entities to control and coordinate CEs (i.e. no directory servers and no resource managers). Decentralization allows network applications to be scalable and simple by avoiding a single point of performance bottleneck and failure, and by avoiding any central coordination in developing and deploying CEs.

(2) CEs are autonomous: CEs monitor their local network environments and, based on the monitored environmental conditions, they autonomously interact without any intervention from human users or from other controlling entities.

(3) CEs are adaptive to dynamically changing environmental conditions (e.g. user demands, user locations and resource availability) over the short and long term.

The bionet platform described in Suzuki et al. [130], and presented in Figure 16.19, provides an execution environment for CEs. It consists of two types of software components. The first type of components, supporting components, abstracts low-level operating and networking details (e.g. network I/O and concurrency control for executing CEs). The second type of components, runtime components, provides runtime services that CEs use to perform their services and behaviours. The bionet platform is implemented in Java, and each bionet platform runs on a Java virtual machine (JVM). Each CE is implemented as a Java object and runs on a bionet platform.

As shown in Figure 16.20, a CE consists of three main segments: attributes, body and behaviors. Attributes carry descriptive information regarding the CE (e.g. CE ID and a description of the service it provides). The body implements a service that the CE provides

[Figure 16.19 depicts the bionet platform architecture as a layered stack: CEs and their CE contexts run on top of the bionet services and bionet container, which sit above the bionet message transport and bionet class loader, all hosted on the Java VM; a platform representative fronts each bionet platform. CE, cyber-entity.]

Figure 16.19 Bionet platform architecture. (Reproduced by permission of IEEE [130].)