Project Part 1

The bulk of Programming Project Part 1 requires you to develop a “State Quality” function representing a set of heuristics that define how good or desirable an arbitrary country state is deemed to be. You are also required to implement the code necessary to calculate an “Expected Utility” value, as described in the subsequent sections. Additionally, you should be able to parse Transformation templates from an external file (in the format shown in the Transformations subsection), as well as a set of initial country states and resource weights from a CSV file with column headers as shown in Representing a Virtual World.
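As a rough illustration of the CSV parsing step, the following Python sketch reads one country per row into a nested dictionary. The column headers used here (Country, Population, MetallicElements, Timber) and the sample values are hypothetical placeholders; the actual headers are the ones given in Representing a Virtual World.

```python
import csv
import io


def load_country_states(csv_file):
    """Read one country per row into {country: {resource: amount}}."""
    world = {}
    for row in csv.DictReader(csv_file):
        name = row.pop("Country")  # hypothetical header for the country name
        # Assumption: every remaining column holds an integer resource amount.
        world[name] = {resource: int(amount) for resource, amount in row.items()}
    return world


# Hypothetical sample data, inlined so the sketch is self-contained.
sample = io.StringIO(
    "Country,Population,MetallicElements,Timber\n"
    "Atlantis,100,700,2000\n"
    "Carpania,50,300,1200\n"
)
world = load_country_states(sample)
```

To read from disk instead, pass a file opened with `open(path, newline="")` in place of the `io.StringIO` object.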

The State Quality of a Country

The State Quality of a country is determined individually by each student. You may share quality functions on the Discussion Forum and in class in a form that is comprehensible to others, but not in the form of code. Your State Quality function, whatever it is, must be substantively dependent on a country’s resources. For example, one State Quality function could be a weighted sum of resource factors, normalized by the Population resource, with terms such as $w_{R_i} \cdot c_{R_i} \cdot A_{R_i} / A_{Population}$, where $w_{R_i}$ is the weight of resource $R_i$, $A_{R_i}$ is the amount of that resource, and $c_{R_i}$ is a proportionality constant (e.g., 2 units food per person, 0.5 houses per person). Or the quality function could be an ecological footprint that is normalized by the AvailableLand resource. State Quality could take other forms as well.
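As one illustration of the weighted-sum form described above, the following Python sketch computes a population-normalized State Quality. The function name, the dictionary layout, and every weight and constant are assumptions chosen for illustration; they are not a prescribed or recommended design.

```python
def state_quality(resources, weights, constants):
    """Population-normalized weighted sum of resource amounts.

    resources: resource name -> amount held by the country
    weights:   resource name -> weight w (hypothetical values)
    constants: resource name -> proportionality constant c (hypothetical values)
    """
    population = resources.get("Population", 0)
    if population <= 0:
        return 0.0  # assumption: a country with no population has zero quality
    total = 0.0
    for name, amount in resources.items():
        if name == "Population":
            continue  # Population is used only as the normalizer here
        total += weights.get(name, 0.0) * constants.get(name, 1.0) * amount
    return total / population


# Hypothetical country state; waste is given a negative weight.
country = {"Population": 100, "Housing": 40, "HousingWaste": 10}
weights = {"Housing": 1.0, "HousingWaste": -0.5}
constants = {"Housing": 2.0, "HousingWaste": 1.0}
quality = state_quality(country, weights, constants)
```

Giving waste resources negative weights is only one possible way to make them lower the score; your own function may treat them differently.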

In any case, some kind of weighting of resources will likely be important. Your choice of State Quality function should be informed by one or more sources indicating relevant measures of country health in the real world (though avoid getting “into the weeds”), and you can share these sources in the Discussion Forum or live in class. Generally, I expect and hope that you will share your ideas freely on the Programming Project Discussion Forum.

Resources

The following exhaustive list of resources may be used. Note that resources marked with a * must be included when defining states, computing State Quality, and the like:

• AvailableLand
• Water
• Population*
• PopulationWaste
• MetallicElements*
• Timber*
• Farm (Farm is obtained from AvailableLand and reduces the AvailableLand amount)
• FarmWaste
• MetallicAlloys*
• MetallicAlloysWaste*
• Electronics*
• ElectronicsWaste*
• Housing*
• HousingWaste*
• Food (Food is enabled by the Farm resource, perhaps accompanied by Water)
• FoodWaste
• PotentialFossilEnergy (e.g., oil)
• PotentialFossilEnergyUsable (e.g., oil that is extracted from land, can be exported)
• PotentialFossilEnergyUsableWaste (e.g., waste as a result of the extraction process)
• PotentialRenewableEnergy
• PotentialRenewableEnergyUsable
• PotentialRenewableEnergyUsableWaste

For the required (*) created resources, you may either use the transformation templates given in the Project Background section, or you may use the following transformations which only utilize required resources. Note that you can alter these templates in your AI agent, so long as you cite sources to justify the alterations:

Housing Template

(TRANSFORM C
           (INPUTS (Population 5)
                   (MetallicElements 1)
                   (Timber 5)
                   (MetallicAlloys 3))
           (OUTPUTS (Housing 1)
                    (HousingWaste 1)
                    (Population 5)))

Alloys Template

(TRANSFORM C
           (INPUTS (Population 1)
                   (MetallicElements 2))
           (OUTPUTS (Population 1)
                    (MetallicAlloys 1)
                    (MetallicAlloysWaste 1)))

Electronics Template

(TRANSFORM C
           (INPUTS (Population 1)
                   (MetallicElements 3)
                   (MetallicAlloys 2))
           (OUTPUTS (Population 1)
                    (Electronics 2)
                    (ElectronicsWaste 1)))

Including one or more other resources can earn you additional points, particularly if you have sources that support, even loosely, your design of additional TRANSFORM operations. Every resource that has an associated Waste resource must be accompanied by that Waste resource in your state definition. FYI: Water is involved (in the real world) in virtually every transformation (e.g., https://www.patagonia.com/our-footprint/organic-cotton.html), and including Water would be an example of citing sources to support the alteration of your transformation definitions.
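Templates written in the parenthesized form shown above can be read with a small S-expression parser. The following Python sketch is one possible approach, not a required design; the function names are illustrative, and malformed input (for example, unbalanced parentheses) is not handled gracefully.

```python
import re


def tokenize(text):
    # Split the text into parentheses and bare atoms (names or numbers).
    return re.findall(r"[()]|[^\s()]+", text)


def parse(tokens):
    """Parse one S-expression from a token list into nested Python lists."""
    token = tokens.pop(0)
    if token == "(":
        items = []
        while tokens[0] != ")":
            items.append(parse(tokens))
        tokens.pop(0)  # discard the closing ")"
        return items
    return int(token) if token.isdigit() else token


def parse_transform(text):
    """Return (country, inputs, outputs) for a single TRANSFORM template."""
    op, country, inputs, outputs = parse(tokenize(text))
    if op != "TRANSFORM" or inputs[0] != "INPUTS" or outputs[0] != "OUTPUTS":
        raise ValueError("not a TRANSFORM template")
    # Each (Resource amount) pair becomes one dictionary entry.
    return country, dict(inputs[1:]), dict(outputs[1:])


alloys_template = """
(TRANSFORM C
           (INPUTS (Population 1)
                   (MetallicElements 2))
           (OUTPUTS (Population 1)
                    (MetallicAlloys 1)
                    (MetallicAlloysWaste 1)))
"""
country, inputs, outputs = parse_transform(alloys_template)
```

Here `country` is still the template variable `C`; grounding the template means substituting a concrete country name and scaling the amounts.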

The Format of a Schedule

A schedule is formatted as a sequence (list) of TRANSFORM and TRANSFER operations that are each fully ground (i.e., all variables are replaced by constants). For example, the following indicates a schedule containing three consecutive actions:

[ (TRANSFORM Atlantis
             (INPUTS (Population 25)
                     (MetallicElements 5)
                     (Timber 25)
                     (MetallicAlloys 15))
             (OUTPUTS (Housing 5)
                      (HousingWaste 5)
                      (Population 25)))
  (TRANSFER Atlantis Carpania ((Housing 3)))
  (TRANSFER Dinotopia Atlantis ((Timber 5)))
]

Each operator in the schedule will change the current world state into its successor state with certainty. Thus, there are no uncertainties associated with the effects of operators. The state that precedes an operator’s application must satisfy the preconditions of that operator (i.e., the country that is executing a transform or transfer must have sufficient resources). A schedule in which each operator’s preconditions are satisfied just prior to the operator’s application is called a legal schedule.
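The precondition check and deterministic state update described above can be sketched in Python as follows. The tuple representation of operators, the function names, and the initial resource amounts are all assumptions made for illustration. In particular, the sketch assumes that the first country named in a TRANSFER is the sender.

```python
import copy


def apply_operation(world, op):
    """Return the successor world state, or None if a precondition fails."""
    new_world = copy.deepcopy(world)  # leave the caller's state untouched
    if op[0] == "TRANSFORM":
        _, country, inputs, outputs = op
        debits, credits = [(country, inputs)], [(country, outputs)]
    elif op[0] == "TRANSFER":
        _, sender, receiver, goods = op  # assumption: first country is the sender
        debits, credits = [(sender, goods)], [(receiver, goods)]
    else:
        raise ValueError(f"unknown operator: {op[0]}")
    for country, amounts in debits:
        for resource, amount in amounts.items():
            held = new_world[country].get(resource, 0)
            if held < amount:
                return None  # insufficient resources: the operator is not applicable
            new_world[country][resource] = held - amount
    for country, amounts in credits:
        for resource, amount in amounts.items():
            new_world[country][resource] = new_world[country].get(resource, 0) + amount
    return new_world


def run_schedule(world, schedule):
    """Apply every operator in order; return the final state, or None if illegal."""
    for op in schedule:
        world = apply_operation(world, op)
        if world is None:
            return None
    return world


# Hypothetical initial holdings for the three countries in the example schedule.
world = {
    "Atlantis": {"Population": 25, "MetallicElements": 5, "Timber": 25, "MetallicAlloys": 15},
    "Carpania": {},
    "Dinotopia": {"Timber": 10},
}
schedule = [
    ("TRANSFORM", "Atlantis",
     {"Population": 25, "MetallicElements": 5, "Timber": 25, "MetallicAlloys": 15},
     {"Housing": 5, "HousingWaste": 5, "Population": 25}),
    ("TRANSFER", "Atlantis", "Carpania", {"Housing": 3}),
    ("TRANSFER", "Dinotopia", "Atlantis", {"Timber": 5}),
]
final = run_schedule(world, schedule)
```

A schedule is legal under this sketch exactly when `run_schedule` returns a state rather than `None`.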

Undiscounted Reward of a Schedule

The Undiscounted Reward of a schedule can be positive or negative, and is the difference between the State Quality of a country’s final state resulting from execution of an entire schedule and the State Quality of the initial state for the same country.

A schedule can benefit or degrade different countries to varying extents, so each country that participates in a schedule probably has a different Undiscounted Reward for any given schedule.

Remember, you design how to implement the State Quality for a country as discussed above, but you must use the difference between the initial and final State Qualities as the Undiscounted Reward:

$R(c_i, s_j) = Q_{end}(c_i, s_j) - Q_{start}(c_i, s_j)$, for country $c_i$ and schedule $s_j$.

Discounted Reward of a Schedule

The Discounted Reward comes from the final state of a schedule, just like the Undiscounted Reward; however, in this case, the greater the number of steps required to reach the final state, the less rewarding it is for a country. If there are N time steps in a schedule, then the Discounted Reward for a country is:

$DR(c_i, s_j) = \gamma^N \cdot (Q_{end}(c_i, s_j) - Q_{start}(c_i, s_j))$, where $\gamma$ (gamma) satisfies $0 \le \gamma < 1$.

For many sequential decision problems (Chapter 17, Russell and Norvig; Chapter 9, Poole and Mackworth), each individual state in the sequence comes with its own reward (positive or negative/penalty), and the utility of the entire sequence is the sum of the discounted state rewards, but in this project, we assume that the entire reward comes with the final state only. Nonetheless, your system can (and probably will) compute the Discounted Reward for every partial schedule on the search frontier, which could be used to organize the frontier as a priority queue (refer to the Expected Utility of a Schedule section). You will experiment with different values of gamma in the above formula and can report your results on the Discussion Forum.
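The two reward formulas translate directly into code. The Python sketch below assumes the start and end State Quality values have already been computed for the country; the function names and the sample numbers are illustrative only.

```python
def undiscounted_reward(q_start, q_end):
    """Difference between the final and initial State Quality."""
    return q_end - q_start


def discounted_reward(q_start, q_end, num_steps, gamma):
    """Undiscounted Reward scaled by gamma raised to the schedule length."""
    if not 0 <= gamma < 1:
        raise ValueError("gamma must satisfy 0 <= gamma < 1")
    return (gamma ** num_steps) * undiscounted_reward(q_start, q_end)


# Hypothetical State Quality values for one country.
short_schedule = discounted_reward(q_start=1.0, q_end=3.0, num_steps=2, gamma=0.5)
long_schedule = discounted_reward(q_start=1.0, q_end=3.0, num_steps=4, gamma=0.5)
```

With the same start and end quality, the four-step schedule scores lower than the two-step one, which is the intended effect of discounting.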

Probability that a Schedule will Succeed

Even though there is no uncertainty associated with an operator’s effects when the operator is applied and therefore no uncertainty associated with a legal schedule’s effects when the schedule is applied, there remain other sources of uncertainty in the schedule. Notably, other countries referenced in the schedule may not “go along” with the schedule if an attempt were made to “execute” it in the real world. You will use the State Quality values for other countries, as well as your own country, to judge the likelihood that a schedule will be accepted by all parties. In particular, if a schedule references some number of other countries, then the more the schedule benefits each of the referenced countries (i.e., in terms of their Discounted Rewards), the more likely they are to agree to the schedule.

The probability that a country, $c_i$, will agree to a schedule, $s_j$, is computed by the logistic function, $f(x) = L / (1 + e^{-k(x - x_0)})$, where $x$ corresponds to $DR(c_i, s_j)$ and $L = 1$. You can experiment with different values of $x_0$ and $k$ (but use $x_0 = 0$ and $k = 1$ as starting points). Think about and report on how different parameter settings might reflect biases in the real world (e.g., a reason for shifting $x_0$ might be to reflect opportunity costs: what other benefits might await a patient country?).
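A minimal Python sketch of this acceptance probability follows, using the standard logistic function with the suggested starting values as defaults. The function name and the overflow guard are additions made for illustration.

```python
import math


def acceptance_probability(discounted_reward, x0=0.0, k=1.0, L=1.0):
    """Logistic function of a country's Discounted Reward."""
    z = -k * (discounted_reward - x0)
    if z > 700:
        # math.exp would overflow here; the probability is effectively zero.
        return 0.0
    return L / (1.0 + math.exp(z))


neutral = acceptance_probability(0.0)            # reward exactly at the midpoint x0
shifted = acceptance_probability(0.0, x0=1.0)    # midpoint moved to the right
```

Moving `x0` to the right means a country needs a larger Discounted Reward before it is more likely than not to agree, while a larger `k` makes the transition around `x0` sharper.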

Given the individual probabilities of acceptance for each participating country, $P(c_i, s_j)$, the probability that a schedule will ultimately be accepted and succeed, $P(s_j)$, could be computed in a number of ways (e.g., the min of the probabilities, reflecting the weakest link), but for the purposes of this project, we will use the product of the individual $P(c_i, s_j)$ values.
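The product rule can be sketched in Python as below. The function name and the sample probabilities are hypothetical, and the weakest-link alternative is computed only for comparison.

```python
import math


def schedule_success_probability(acceptance_probabilities):
    """Product of the per-country acceptance probabilities."""
    return math.prod(acceptance_probabilities)


# Hypothetical acceptance probabilities for three participating countries.
per_country = [0.9, 0.5, 0.8]
p_product = schedule_success_probability(per_country)
p_weakest_link = min(per_country)  # the alternative mentioned above, for comparison
```

Because every factor is at most 1, the product is never larger than the weakest-link value, so schedules involving more countries tend to receive lower success probabilities.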

Note that the strategy for estimating the probability that a schedule is accepted (i.e., will be accepted by all parties to the schedule) and succeeds does not come from statistics accumulated over data from the “real world” (to include game play), as we might think is ideal. Instead, our method draws from an information-theoretic tradition of estimating probabilities of “events” from the “descriptions” of those events. An assumption in this description-driven methodology is a bias that more “complicated” descriptions represent lower-probability events. It is a quantification of Occam’s razor.

This is the one way (probably the only way given the time constraints of this class) that the AI agent for your country will take into account the State Qualities of other countries, as well as your own. It has the effect of countering ill-considered greed.

Expected Utility of a Schedule

The probability that a schedule will be accepted and ultimately succeed (which takes into account other countries) multiplied by the Discounted Reward for your country (self) is the central factor in computing the Expected Utility of a schedule (sj) for your country. But what is the cost to a country of producing a schedule that would ultimately fail? For simplicity, let the cost of failure be a negative constant, C. You may choose any value for C, with justification given, or if you are more ambitious, design a more general function to represent the failure cost.

The Expected Utility of a schedule, then, represents the overall “goodness” that is expected to be achieved were a given schedule to be executed in the real world. Calculation of the Expected Utility is given by the following formula:

$EU(c_i, s_j) = (P(s_j) \cdot DR(c_i, s_j)) + ((1 - P(s_j)) \cdot C)$, where $c_i$ = self
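The Expected Utility formula can be sketched in Python as follows. The function name and the sample inputs, including the failure cost of -1.0, are arbitrary illustrative choices rather than recommended values.

```python
def expected_utility(p_success, discounted_reward_self, failure_cost):
    """Weigh the Discounted Reward against the cost of a failed schedule."""
    if not 0.0 <= p_success <= 1.0:
        raise ValueError("p_success must be a probability")
    return p_success * discounted_reward_self + (1.0 - p_success) * failure_cost


# Hypothetical inputs: success probability, own Discounted Reward, and cost C.
eu = expected_utility(p_success=0.36, discounted_reward_self=0.5, failure_cost=-1.0)
```

In this example the Expected Utility is negative even though the Discounted Reward is positive, because the schedule is more likely to fail than to succeed and the failure cost outweighs the expected gain.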

Summary of the Interdependence of Measures

Figure 3: Interdependence Between Various Measures Comprising the Expected Utility

Part 1 Deliverables

The primary deliverables for Part 1 of the Programming Project will be PowerPoint slides and source code. You will submit a zip file containing the following files (where the names of the files should be exactly as outlined in bold):

• ExplanationSlides.pptx: PowerPoint slides explaining:
  - Your choice of State Quality function (description and mathematical formulas)
  - A justification for your State Quality function (which can be intuitive, but sources can be cited here too and would be desirable)
  - A summarization of the pipeline of scores being calculated, from State Quality, through the rewards and schedule success probabilities, to the Expected Utility
    ∗ You can reuse the image provided in these instructions or create your own
    ∗ Include your thoughts, possibly informed by experiments, on certain parameter settings (e.g., why you chose certain values for x0, k, gamma, etc.)
• SourceCode.zip: Well-documented and well-formatted code, including:
  - A rudimentary representation (object-oriented or otherwise) of the world state at any given time
    ∗ The world state should primarily indicate the amount of resources currently held by each individual country at a given time
  - An implementation of your State Quality function, which should take a world state representation as its input and produce a single-valued output
  - Code for parsing initial world states and resource weights from external CSV files
  - All pipeline score calculations from State Quality to Expected Utility
  - Code for parsing Transformation templates
