Old People are Rude

Posted by Sam on January 24th, 2010 under Mind of Me  •  1 Comment

Seems the old people were out in full force today as I went down to my local shopping block (they must be shopping after church). And from today, coupled with experience from other harrowing encounters, I have surmised that in general, old people are quite rude.

My first encounter with them was when I was driving about the Warehouse carpark. Generally in a carpark, pedestrians stay out of the middle of the driving lanes, where cars go, but old people seem to completely disregard this fact, ambling towards me as I cruised about for a park. The normal reaction for a pedestrian when a car is coming towards them is to get out of the way. But this old couple seemed to be completely oblivious to my car’s presence. Something had grabbed the old dude’s attention off to the left and he was fixated on it, ignoring the fact he was cutting off half of the lane with his aging body.

I figured he was just a stupid old person, the only one I should encounter today. Unfortunately, the carpark was fuller than usual, increasing the proportion of mindless old people. Cursing at him, I parked my car and went into the Warehouse for my supplies.

I should have noticed the number of wrinkled old people entering and exiting the store with me, but it wasn’t until I got into line that my frustration rose. Normally I make a point of avoiding lines in which people are trying to purchase clothing, as they tend to take a while. But I didn’t notice the short elderly lady with some nasty-looking shirts in front of me in what I thought was a short and quick line. Something in their withered minds must make them regard time as so much less precious. Perhaps it’s because they’ve seen so much, or perhaps it’s because everything took longer ‘in their day.’ But something about old people and buying clothes causes them to negotiate and haggle over fixed store prices because ‘something’s missing’, or ‘there’s a bright tag on that there tunic.’

This particular old lady was fussed because she had two identical cyan shirts, but one was missing the singlet so she wanted a price reduction. I swear these people purposely search for these sorts of things: missing singlets, broken labels, etc. During the whole ordeal, she briefly turned back to me and uttered “sorry.” Fuck you, you hag. You ain’t sorry. You’re happy you’ve managed to keep the poor checkout operator busy for as long as you did. Perhaps the crone craves the interaction, the feeling of power over the store workers, knowing that as the consumer, she’s always right. “I demand respect because I have lived longer!” Well that doesn’t mean you can disrespect everyone else.

Getting through my transaction in a fraction of the time, I zipped away to the other store, thankfully free of anyone. But the next one, a veggie shop, was unfortunately quite busy. I grabbed my goods in short order and proceeded to wait in line. But who should upset the sanctity of the line? None other than some granny with a trolley who happened to be finished with her shop and decided that she’d be next to be served. She did all this while walking past the line, glancing at the befuddled group of Asians at the head of the line who were unsure of what they should do. What’s worse is that at the end of her transaction, when she was presented with the total, she almost sounded like she was going to debate the cost. “$18.70?!” Don’t you fucking dare dispute that, or I’ll destroy you. After realising they had been wronged, the Asians moved closer to the counter, ensuring they were next.

As the transactions were processed, more people started to line up. Well, lines up, as there was more than one line forming, thanks to other old people starting their own line just behind the Asians. As I was originally behind them, I was right annoyed at these fuckers, but thankfully the checkout server was aware I was next and motioned me over. But the first old person in the impromptu ‘line’ was not happy with this and made a point of standing right next to me as I went through the transaction, fixing me with an evil glare as though she had been wronged.

At the end of all this, I figure this: sure young people can be insolent and rude, but old people are manipulatively rude. They do things which are flat out crimes against social law, but act as though they are above it. Perhaps they simply forget social laws, or perhaps they are envious and want to piss off those who still have full control of their bladder.

PhD Progress: Maximal covering complete

Posted by Sam on January 22nd, 2010 under PhD Project  •  No Comments

I have finished the code for covering a state when no rules fire. It appears to be quite stable and usually finds the most general rules. However, this can also be a problem, because the most general conditions for an action (i.e. the basic action preconditions) may not be the best rules for the job (for example, stack requires highest(X), which is more specific than the basic condition of clear(X)).

Not only is there a problem in generality, but also in goal constants (onAB). The onAB problem requires that the constant terms a and b are present in the rules to have an optimal policy and as of yet, the algorithm doesn’t have these. I feel like I’ve addressed this problem before but I can’t find where.

A possibility for heuristic specialisation is to note the state just before the goal was achieved and perform a direct specialisation, or perhaps a number of specialisations, on the rule (creating a bunch of mutations). The rule can then be removed, as at least one of the children should be on the right track towards an optimal rule.

Another issue is that of covering which does not find the most general rule. For instance, covering would sometimes create rules which included the onFloor(X) (tied to clear(X)) condition simply because the actions it used to cover all dealt with blocks on the floor. The only solution to this I can currently think of is finding the actions proposed by the rule and crossing them with the valid actions of the same type, and ensuring they match. If the valid actions have more actions than the rule predicts, then the rule is not general enough. But this could undo the progress made with the above paragraph’s specialisation mutation.
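The check described above could look something like the following rough sketch. Everything here is an assumption made for illustration: actions are plain tuples like ("move", "a", "b"), and is_general_enough is a made-up name, not something from the actual system.

```python
def is_general_enough(rule_actions, valid_actions, action_name):
    """Return True if the rule proposes every valid action of the given type.

    Hypothetical representation: each action is a tuple whose first element
    is the action name, e.g. ("move", "a", "b").
    """
    # Valid actions of the same type as the rule's action.
    valid_same_type = {a for a in valid_actions if a[0] == action_name}
    # Actions the rule actually proposed in this state.
    proposed = {a for a in rule_actions if a[0] == action_name}
    # If some valid action was not proposed, the rule is too specific.
    return valid_same_type <= proposed


valid = {("move", "a", "b"), ("move", "b", "a"), ("moveFloor", "a")}
too_specific = is_general_enough({("move", "a", "b")}, valid, "move")
general = is_general_enough({("move", "a", "b"), ("move", "b", "a")}, valid, "move")
```

Here too_specific comes out False and general comes out True. The check only looks at a single state, so a rule passing it once says nothing about other states.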

Perhaps during covering creation the unification needs to occur over ALL rules, not just until it senses no change. This still has a small chance of creating onFloor(X) rules. Maybe mark which rules have attained maximum generality and which haven’t and for every rule that hasn’t attained maximum generality, cover it until it does (or has seen enough states, as it may be impossible to attain maximum generality). Using this marking system, rules which are mutants of maximally general rules can avoid being re-generalised.

PhD Progress: Covering non-necessary terms

Posted by Sam on January 22nd, 2010 under PhD Project  •  No Comments

When covering an action from the state, it is likely that non-necessary terms (with regards to the action in question) will also be in the rule (such as on(X,Z), where Z isn’t part of the action). An example of such a rule is:
on(a,c) & cl(a) & on(b,d) & cl(b) -> move(a,b)

Note that ‘c’ and ‘d’ aren’t part of the action, so when the action is inversely substituted, these need to be put into variables of sorts. However, if they were given concrete named variables, they may cause problems when unifying later, as the variables would need to be checked to see whether they can subsume one another.

A possible fix for this (though perhaps too rough) is to swap these variables with special terms ‘_’ (don’t care, again seen in FOXCS). These don’t care terms can be anything (within type constraints) and do not necessarily have to be unequal to one another: ‘_(1)’ can equal ‘_(2)’. However, on creation of the final rule for the action, these ‘_’ terms need to be concretely put into variable terms, though again without the inequality predicate bounding them from each other.
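As a rough sketch of the idea (not the actual implementation), the example rule above can be generalised by mapping the action’s constants to variables and everything else to ‘_’. The tuple representation, the generalise name and the X, Y, Z... variable naming are all assumptions made for illustration.

```python
def generalise(conditions, action):
    """Replace action constants with variables and all other constants with '_'.

    Hypothetical representation: a condition or action is a
    (predicate_name, argument_tuple) pair.
    """
    action_name, action_args = action
    # Assumed naming scheme: first action argument -> X, second -> Y, and so on
    # (this sketch only supports up to six action arguments).
    var_names = "XYZWVU"
    mapping = {const: var_names[i] for i, const in enumerate(action_args)}
    # Constants that are not part of the action become don't-care terms.
    new_conditions = [
        (pred, tuple(mapping.get(arg, "_") for arg in args))
        for pred, args in conditions
    ]
    new_action = (action_name, tuple(mapping[arg] for arg in action_args))
    return new_conditions, new_action


conditions = [("on", ("a", "c")), ("cl", ("a",)), ("on", ("b", "d")), ("cl", ("b",))]
action = ("move", ("a", "b"))
new_conditions, new_action = generalise(conditions, action)
```

For the example rule this gives on(X,_) & cl(X) & on(Y,_) & cl(Y) -> move(X,Y). Because every ‘_’ is the same token, two don’t care terms are never forced to differ, which matches the behaviour described above.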

Currently this issue isn’t a big one, but this is mostly due to the small domain size and lack of background predicates being included in the observations.

The process of inversely substituting terms could be performed with Replacements, but I have a feeling it may proceed more smoothly by writing the actions in string form and using the existing rule parser to create the rules. Strings hold an advantage by allowing simple equality checks, omission of state terms and dealing with state terms, and special behaviour when parsing the rule (‘_’ term).

PhD Progress: Valid Actions Problem

Posted by Sam on January 14th, 2010 under PhD Project  •  No Comments

Like the FOXCS paper, this algorithm will be using an observation containing the valid actions for the state. This is all fine and good in Blocks World, but in larger worlds like StarCraft or PacMan, this may be problematic. The algorithm would be required to compute every possible action, which is near infinite in the StarCraft domain. Perhaps the action set needs to be bounded somehow.

I feel like I’ve touched on this problem before, but I cannot recall where or when. Ergh, it’s too hot for thinking today…

PhD Progress: Scrapping Bernoulli Distributions

Posted by Sam on January 11th, 2010 under PhD Project  •  No Comments

With the extensive changes to the system comes new ideas and methods of doing things. Because the policy will look radically different to the previous policies, it will need an alternative method of creation.

In the old CE system, the policy was of fixed size, with each slot containing a large number of random rules to optimise. In the new system (still utilising CE), the policy is of an adaptive size, initially sized at the number of actions in the state specification. Each slot in this adaptive policy contains initially no rules, but these will be filled with rules covered from the environment, though it is likely there won’t be a large number. Also, each slot is bound by an action, so each rule within leads to the same action.

A problem with an adaptive policy using action-bound slots is that it has no order of rules, so deterministic policies will fail if a bad rule is at the top. A possible ordering is to arrange the rules in order of specificity, such that the most specific rules are checked first. This still may cause problems, as a general rule may never be checked, even if it is right for the job. And this fault, combined with a Bernoulli distribution, may result in useful slots being turned off.

The CE distribution can still be utilised for slot ordering by weighting the usefulness of a slot and creating a policy by sampling from the slot distribution in the creation of an ordered policy. Note that a slot may only occur once in a policy, so sampling is done via removal. So initially every slot (aka every action) has an equal chance of being selected for the top of the policy, but as weights are changed (through updates of firing rules), more useful slots will be placed at the top, allowing useful actions to be quickly evaluated. This strategy will still result in every slot being used, but because there is likely to be a low number of overall slots, it shouldn’t be an issue. Perhaps slots with probability < epsilon are discarded, and slots with probability > 1 – epsilon are fixed.
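A minimal sketch of the sampling-via-removal step, assuming each slot simply carries a positive weight. The slot names, the weights and the sample_slot_order function are made up for illustration and aren’t taken from the real code.

```python
import random


def sample_slot_order(slot_weights, rng=None):
    """Build an ordered policy by sampling slots without replacement.

    slot_weights maps a slot (action) name to a positive weight. Each step
    picks one remaining slot in proportion to its weight, then removes it,
    so every slot appears exactly once in the result.
    """
    rng = rng or random.Random()
    remaining = dict(slot_weights)
    order = []
    while remaining:
        slots = list(remaining)
        weights = [remaining[s] for s in slots]
        chosen = rng.choices(slots, weights=weights, k=1)[0]
        order.append(chosen)
        del remaining[chosen]
    return order


# Hypothetical weights after some updates; equal weights would model the start.
weights = {"move": 0.6, "moveFloor": 0.3, "stack": 0.1}
policy_order = sample_slot_order(weights, random.Random(0))
```

Heavier slots tend to land near the top of policy_order, but a light slot still has some chance of being sampled first, so no slot is ever completely switched off.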

Something to note is that while every slot will be present in the policy, updates will still favour particular slots over one another. This is achieved by only looking at which slots (and rules) were used in experimentation. Otherwise nothing would be updated.

A future problem is how to deal with the dynamics of covering and updates. However, I will relegate this to later thought, when I’m at that stage in the code.