MDP Modeling in GAMS


Hi there,

I'm developing an MDP model in GAMS and can't work out what is causing these errors:

1. First, I keep getting "Error 203: Too few arguments for function". The function is the standard Bellman value update V(s) = max_a (R(s,a) + gamma * sum_s' P(s'|s,a) * V(s')).

2. At the final display of the optimal policy, I keep getting "Error 767: Unexpected symbol will terminate the loop - symbol replaced by )" around the smax, and "Error 409: Unrecognisable item". The equation there is the Bellman action value Q(s,a) = R(s,a) + gamma * sum_s' P(s'|s,a) * V(s'). I have sketched how I read both equations in GAMS just below.
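
For context, here is how I read those two equations in GAMS terms. This is only a rough sketch with placeholder names (q, v, r, p, gamma, s, sp and a are not my actual identifiers; sp would just be an alias of s), so the syntax here may well be part of my problem:

* Bellman action value: Q(s,a) = R(s,a) + gamma * sum_s' P(s'|s,a) * V(s')
q(s,a) = r(s,a) + gamma*sum(sp, p(s,a,sp)*v(sp));
* Bellman state value: best action value, using smax over the set of actions rather than max over a list
v(s) = smax(a, q(s,a));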

sets
regions /r1*r11/
actions /A, B, C/
states /low, medium, high/
states_next /low, medium, high/;

parameters
reward(states,actions) /low.A -5, low.B 1, low.C 5, medium.A -5, medium.B 1, medium.C 5, high.A -5, high.B 1, high.C 5/
discount_rate /0.95/
value(states, regions)
value_new(states, regions)
epsilon /0.01/
max_iter /1000/
iter;

value(states, regions)=0;
value_new(states, regions)=0;
iter=0;

* Transition probability
* Probability of transitioning from one state to another when an action is taken
* Format: (region, current state, action, next state)
* Actions A=415V, B=33/11kV, C=330/132kV
Set transition_prob(regions, states, actions, states_next) /r1.high.A.low 0.54, r1.high.B.low 0.54, r1.high.C.low 0.54,
r2.medium.A.low 0.54, r2.medium.B.low 0.54, r2.medium.C.low 0.54,
r3.medium.A.low 0.54, r3.medium.B.low 0.54, r3.medium.C.low 0.54,
r4.medium.A.low 0.54, r4.medium.B.low 0.54, r4.medium.C.low 0.54,
r5.low.A.low 0.54, r5.low.B.low 0.54, r5.low.C.low 0.54,
r6.low.A.low 0.54, r6.low.B.low 0.54, r6.low.C.low 0.54,
r7.low.A.low 0.54, r7.low.B.low 0.54, r7.low.C.low 0.5,
r8.low.A.low 0.54, r8.low.B.low 0.54, r8.low.C.low 0.54,
r9.low.A.low 0.54, r9.low.B.low 0.54, r9.low.C.low 0.54,
r10.low.A.low 0.54, r10.low.B.low 0.54, r10.low.C.low 0.54,
r11.low.A.low 0.54, r11.low.B.low 0.54, r11.low.C.low 0.54/;

* Value iteration to convergence
while((iter = max_iter) or ((value_new - value) < epsilon),
*while(iter = max_iter,
    iter = iter + 1;
    value = value_new;

    loop(regions,
        loop(states,
            loop(actions,
                value_new(states, regions) = max(reward(states, actions) +
                    discount_rate * sum(transition_prob(regions, states, actions, states_next) * value_new(states, regions)),
                    value_new(states, regions))
            );
        );
    );
);


* Print the optimal policy
display "Optimal policy for each region:";

loop(regions,
    display regions;
    loop(states,
        display states, " action: ", actions(smax(reward(states, actions) +
            discount_rate *
            sum(transition_prob(regions, states, actions, states_next) * value(states_next, regions))
            , actions))

    );
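
For what it's worth, below is the overall structure I think I am aiming for. It is only a rough sketch and I am not confident it is valid GAMS: it assumes transition_prob is declared as a parameter rather than a set, that states_next is an alias of states (alias (states, states_next);) so value can be referenced with either index, and it introduces q_value, policy and diff, which are hypothetical symbols that are not in my code above:

parameters
    q_value(regions, states, actions)  'action values'
    policy(regions, states, actions)   'greedy policy indicator'
    diff                               'largest change per sweep';

diff = 1;
iter = 0;
value(states, regions) = 0;

while((iter < max_iter) and (diff > epsilon),
    iter = iter + 1;
*   Bellman action value for every (region, state, action)
    q_value(regions, states, actions) = reward(states, actions)
        + discount_rate*sum(states_next,
              transition_prob(regions, states, actions, states_next)
              *value(states_next, regions));
*   State value is the best action value
    value_new(states, regions) = smax(actions, q_value(regions, states, actions));
*   Convergence check on the largest absolute change
    diff = smax((states, regions), abs(value_new(states, regions) - value(states, regions)));
    value(states, regions) = value_new(states, regions);
);

*   Mark the action(s) attaining the maximum value in each region and state
policy(regions, states, actions)$(q_value(regions, states, actions) >= value(states, regions) - 1e-6) = 1;
display value, policy;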

Please kindly assist.
 