Search results
The utility of an event x occurring at future time t under utility function u, discounted back to the present (time 0) using discount factor β, is $\beta^{t} u(x_t)$. Since more distant events are less liked, 0 < β < 1.
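As a minimal sketch of the exponentially discounted utility described above (commonly written β^t · u(x_t)), the function below multiplies an already-evaluated utility by β raised to the delay. The function name `discounted_utility` and the sample numbers are illustrative assumptions, not taken from the source.

```python
def discounted_utility(utility_of_x: float, t: float, beta: float) -> float:
    """Present (time 0) utility of an event whose undiscounted utility is utility_of_x.

    Assumes exponential discounting with a constant discount factor 0 < beta < 1.
    """
    if not 0.0 < beta < 1.0:
        raise ValueError("beta must lie strictly between 0 and 1")
    return (beta ** t) * utility_of_x


# Hypothetical example: the same event is worth less the further away it is.
now = discounted_utility(100.0, t=0, beta=0.9)
later = discounted_utility(100.0, t=2, beta=0.9)
print(now, later)
```

With these sample values the event loses about a fifth of its utility when pushed two periods into the future.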
The discount factor, DF(T), is the factor by which a future cash flow must be multiplied in order to obtain the present value. For a zero-rate (also called spot rate) r , taken from a yield curve , and a time to cash flow T (in years), the discount factor is:
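The snippet breaks off before its formula, and the compounding convention it uses cannot be recovered from the text. As a hedged illustration only, the sketch below computes a discount factor under three common conventions (simple interest, annual compounding, continuous compounding); all function names are hypothetical.

```python
import math


def df_simple(r: float, T: float) -> float:
    """Discount factor assuming simple (non-compounded) interest: 1 / (1 + r*T)."""
    return 1.0 / (1.0 + r * T)


def df_annual(r: float, T: float) -> float:
    """Discount factor assuming annual compounding: 1 / (1 + r)**T."""
    return (1.0 + r) ** (-T)


def df_continuous(r: float, T: float) -> float:
    """Discount factor assuming continuous compounding: exp(-r*T)."""
    return math.exp(-r * T)


def present_value(cash_flow: float, discount_factor: float) -> float:
    """Present value is the future cash flow multiplied by the discount factor."""
    return cash_flow * discount_factor


# Hypothetical example: 1000 received in 2 years at a 5% zero rate.
for name, df in [("simple", df_simple), ("annual", df_annual), ("continuous", df_continuous)]:
    print(name, round(present_value(1000.0, df(0.05, 2.0)), 2))
```

The three conventions agree at T = 0 and drift apart as T grows, so the convention attached to a quoted rate matters for longer-dated cash flows.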
The discount factor determines the importance of future rewards. A factor of 0 will make the agent "myopic" (or short-sighted) by only considering current rewards, i.e. the immediate reward $r_t$ (in the update rule above), while a factor approaching 1 will make it strive for a long-term high reward. If the discount factor meets or exceeds 1, the action values ...
The discount factor determines the importance of future rewards. A discount factor of 0 makes the agent "opportunistic", or "myopic" [4], because it considers only current rewards, while a factor approaching 1 will make it strive for a long-term high reward. If the discount factor meets or exceeds 1, the values may diverge.
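The effect of the discount factor on an agent's objective can be sketched by computing the discounted sum of a reward sequence. This is an illustration of the discounted return only, not of any particular learning algorithm; the name `discounted_return` and the reward sequences are assumptions made for the example.

```python
from typing import Sequence


def discounted_return(rewards: Sequence[float], gamma: float) -> float:
    """Sum of gamma**i * rewards[i], accumulated from the last reward backwards."""
    total = 0.0
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


rewards = [1.0, 1.0, 1.0]
print(discounted_return(rewards, 0.0))  # only the current reward counts
print(discounted_return(rewards, 1.0))  # every reward counts equally

# With a constant reward of 1, the sum stays bounded when gamma < 1
# (it approaches 1 / (1 - gamma)) and keeps growing with the horizon when gamma >= 1.
for horizon in (10, 100, 1000):
    ones = [1.0] * horizon
    print(horizon, discounted_return(ones, 0.9), discounted_return(ones, 1.0))
```

The last loop shows why values may diverge once the factor reaches 1: the undiscounted total grows in step with the horizon, while the discounted total levels off near 10.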
Therefore, the preferences at t = 1 are preserved at t = 2; thus, the exponential discount function demonstrates dynamically consistent preferences over time. Because of its simplicity, the exponential discounting assumption is the most commonly used in economics. However, alternatives like hyperbolic discounting have more empirical support.
The result quantifies the advantage of being the first to propose (and thus potentially avoiding the discount). The generalized result quantifies the advantage of being less pressed for time, i.e. of having a discount factor closer to 1 than that of the other party.
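The snippet does not name the model or state the result. Assuming it refers to the standard alternating-offers bargaining model over a pie of size 1, where the first proposer's equilibrium share is (1 − δ₂) / (1 − δ₁δ₂), the advantage can be sketched as follows. The function name and sample discount factors are hypothetical.

```python
def first_proposer_share(delta_1: float, delta_2: float) -> float:
    """Equilibrium share of the first proposer in the standard alternating-offers model.

    Assumption: player 1 proposes first and both discount factors lie in (0, 1).
    """
    if not (0.0 < delta_1 < 1.0 and 0.0 < delta_2 < 1.0):
        raise ValueError("discount factors must lie strictly between 0 and 1")
    return (1.0 - delta_2) / (1.0 - delta_1 * delta_2)


# Equal patience: proposing first is worth more than half the pie.
print(first_proposer_share(0.9, 0.9))

# Player 1 more patient (discount factor closer to 1): the share rises further.
print(first_proposer_share(0.95, 0.9))
```

With equal discount factors δ the formula reduces to 1 / (1 + δ), which exceeds one half for any δ below 1.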
Hyperbolic discounting is mathematically described as $g(D) = \frac{1}{1 + kD}$, where g(D) is the discount factor that multiplies the value of the reward, D is the delay in the reward, and k is a parameter governing the degree of discounting (for example, the interest rate).
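To contrast the hyperbolic form 1 / (1 + kD) with exponential discounting, the sketch below offers a choice between a smaller, sooner reward and a larger, later one, then pushes both rewards further into the future by the same amount. The reward sizes, k = 1, and β = 0.6 are illustrative assumptions chosen to make the effect visible.

```python
from typing import Callable


def hyperbolic(delay: float, k: float = 1.0) -> float:
    """Hyperbolic discount factor g(D) = 1 / (1 + k*D)."""
    return 1.0 / (1.0 + k * delay)


def exponential(delay: float, beta: float = 0.6) -> float:
    """Exponential discount factor beta**D."""
    return beta ** delay


def prefers_later(discount: Callable[[float], float], front_delay: float,
                  small: float = 50.0, large: float = 100.0, gap: float = 2.0) -> bool:
    """True if the larger reward (arriving `gap` later) has the higher discounted value."""
    sooner_value = small * discount(front_delay)
    later_value = large * discount(front_delay + gap)
    return later_value > sooner_value


for name, discount in [("hyperbolic", hyperbolic), ("exponential", exponential)]:
    print(name, prefers_later(discount, front_delay=0.0), prefers_later(discount, front_delay=10.0))
```

Under these assumed parameters the hyperbolic chooser picks the smaller reward when it is immediate but switches to the larger one once both are delayed by 10, while the exponential chooser makes the same choice in both cases, because the ratio of its two discount factors depends only on the gap.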
The discount factor determines how much immediate rewards are favored over more distant rewards. When $\gamma = 0$ the agent only cares about which action will yield the largest expected immediate reward; when $\gamma \rightarrow 1$ the agent cares about maximizing the expected sum of future rewards.