Balancing the prices of UQM's ships
Based on Shiver's forthcoming PvP guide, I think we could mathematically calculate new and perfectly balanced values for UQM's ships (using the data about how well each ship fares against each other ship).
I have uploaded a C++ version here (EDIT: Link deleted, new link — version 1.02 — is here), but it makes an ugly oversimplification and it's not exactly mathematically proven. The probability of each outcome of each match-up should be taken into account, I guess.
-With hope that you can sort things out, Valaggar 17:33, 15 January 2008 (CET)
Yes, I had this in mind as an application of the idea I put on that page, I guess the only thing at the moment is that my real life has been quite busy in recent times, resubmitting our paper (it's morning where I live and I only just saw this before heading out to face a busy day). I was actually thinking of running the simulations in matlab too, but we can do it every which way :)
There are all kinds of issues to consider here, but I may have some time coming up. I've been watching Shiver's guide with keen interest.
There are also other things we can look at with the approach I suggested. For example, I saw the talk in the dreadnought=banana boat thread on UQMF and actually I think one of the main reasons for the weakness of the dreadnought is that the limited number of directions is a disadvantage for any ship with long range weapons which do not track the enemy. This effect we could test by seeing what happens when we add more directions. There are many other interesting speculations like this too.
For instance, it was also mentioned that the dreadnought was not designed for retreat, it is for frontal assault. Now, I think that because one is never really concerned with holding a position in melee, ships against which this would be difficult are disadvantaged. Like the dreadnought, which is no banana boat, at least as I argue here, because melee doesn't do it justice.
I also think the dreadnought is better in numbers, and that there are some ships which would be terrible in numbers like the eluder.
All kinds of things we could test. But a lot of work. Anyway, thanks Val --Zeracles 21:46, 15 January 2008 (CET)
My apologies, for I was in a great hurry yesterday - I didn't have time to look at your program and misunderstood what you had in mind, Val. I see now that it's mainly an algorithm which deduces values assuming quantities which summarise the match-ups, and what those quantities are is a different but related question. I don't know when I'll have time, but this is really excellent, I will look more closely ASAP.
Oh and it was also mentioned in the dreadnought=banana boat thread that the marauder was better - just as an aside, I suspect this could change completely with numbers. Both of the marauder's weapons are more likely to cause friendly damage than the dreadnought's. --Zeracles 19:22, 16 January 2008 (CET)
- Thank you very much, Zeracles. But what do you mean with the Marauder causing friendly damage? The Marauder doesn't collide with its own shurikens or F.R.I.E.D. jets. Valaggar 19:36, 16 January 2008 (CET)
- If your algorithm is what I suspect it is (and if it isn't, this is because I don't know any of C's variants), I think it's clever. You'll correct me if I'm wrong, but here is what I understand it to be:
- The outlook for each match-up is summarised with a number between 0 and 1. It would seem natural to me to have this as probability of victory.
- The sum of these for each ship gives one a first guess at how effective it is, its x-value.
- The cleverness of the algorithm comes in here - a ship is not necessarily very useful if it is only really good against weak ships (which opponents may never choose). So, we reward it for being good against other ships which are strong, i.e., ships which have high x-values. This is achieved by weighting the sum of match-up values with the opposing ship's x-values.
- These weighted sums give us a new x-value for each ship, a ``second guess".
- These new x-values are used to again make weighted sums of the original match-up values for a third guess at the x-values.
- We continue iterating until these x-values converge.
- If this is right, it sounds good to me. Well done! A possible worry is whether we can guarantee convergence (and related to this could be how you normalise the values between iterations, which I'm not clear on), but seeing as how it seems to work now, we can probably already be almost certain that this is sound.
- If you go with interpreting the input as probabilities, then even if Shiver's values are not very precise for individual match-ups, I think this should work well when all of them are thrown together. It would be interesting to see what you get if you put the match-up values in assuming computer vs computer - I suspect this will deliver ship prices closer to the default ones than that guy's mod (I'd put a link but again I'm just posting quickly before another busy day!). Also, if we ever do get to doing melee simulations, the probabilities we get out of that would be refinement of Shiver's values.
- You speak of an ugly oversimplification . . . you put the problem very well, so I'll just quote you :)
- Also note that I used the rather gross oversimplification that ships merely have a certain chance of winning, not taking into account how damaged the victor's ship comes out of the combat (it's not necessarily directly proportional to the chance of winning, as, for instance: the Scout barely has a chance of winning, yet it still can deal considerable damage; the Drone barely has a chance of winning, yet it doesn't deal much damage generally; etc.). This is something that really needs improvement (and professional help ;)).
- Maybe we'll think about that if I haven't embarrassed myself and completely missed the point of your algorithm. It may then be that probabilities as input are not the way to go. Then again, for the precision required to pin down how much damage ships will do, on average, it seems to me that simulations would be most valuable. But we can always guess and see what happens. Great work! I stand ready (well, apart from my real life interfering) to assist in any way I can. This could be worthy of a page - I think in the main body of pages, but a user page at least.
- About the marauder, I was just assuming that while an individual marauder can't hurt itself, it would still be able to hurt its friends. I suppose because if melee was to be extended to third person, it would be natural to be consistent and allow friendly fire for all the ships (but obviously kzer-za fighters will have enough sense not to attack friends). Still, it could be argued that it would be equally consistent to retain all the settings of the first person melee, then again we know it wasn't designed with third person in mind.
- Well, how's this for a tactic dreadnoughts could employ in numbers (which marauders couldn't!) - the fighters don't necessarily return to the ship from which they came. They return some friendly ship which is low on crew, to make it last a bit longer (or even be sent out only for the purpose of jumping into a friendly ship rather than attacking the enemy - with enough support, the front line dreadnoughts could be virtually indestructible!). These are the sorts of interesting tactics not reproduced in third person games I have seen. But then again the only third person games I've played are Dune II and StarCraft (terminators are the zealots of Star Control, are they not?). --Zeracles 19:53, 17 January 2008 (CET)
- You've got it completely right. That's my algorithm summarized.
- Regarding the converging and normalizing of values: At the very least, empiric testing shows they converge (unless one ship has 1 against each other ship, in which case the limit of the uber ship's price is infinity, and the limit of all other prices is zero. Which reflects the idealized reality represented by the algorithm pretty well, as a ship that always defeats every other ship renders those other ships useless). By "converge" I mean that the proportions between them stabilize (as, without normalizing, the variables themselves will quickly overflow, even as doubles; were overflowing not a problem, the proportions between them would still be conserved by normalizing... I think. I'll look into it tomorrow).
- To normalize, I first search the initial array of match-up values for the cell with the highest number (if there are multiple such cells, the first of them gets chosen). Its x coordinate is assigned to the variable maxx, its y coordinate is assigned to the variable maxy. In every iteration, I divide match-up value and each x-value by (curmax / 100), where curmax is the cell with the x coordinate maxx and the y coordinate maxy in the current iteration.
- And "that guy" with the balance mod is Elvish Pillager, the world champion at Super Melee, so show some respect. ;) Actually, he provided the inspiration for Point 3 in your bulleted list. We dueled once (many months ago) on #uqm-arena, and he won, and I blamed my defeat on the imbalance of UQM's ships. He said that the ships are only slightly imbalanced. I replied that my algorithm shows they are badly imba. (yes I still remember that dialog!) He said that my values are off, and said that a ship good against a single good ship is worth more than a ship good against multiple weak ships. I had to concede on that point. He suggested an inaccurate approach, though (calculating the "first guess" x-value by adding 1 for each ship that the current ship has a probability higher than 50% of defeating; and the x-value stopped at this "first guess"). But still, the credit for the idea goes to him, just as the credit for the idea of *colours* being related to perception goes to you, although it was me who integrated it into my theory and refined it. Rant over. Valaggar 21:36, 17 January 2008 (CET)
- I'm starting to think that the input should be how much damage gets done as a proportion of the opposing ship's max crew. This introduces some complications, though. Like with the podship and penetrator, negative values come into it. And it's more difficult. But of course, we can try everything. Under this, whether one can get negative values in those cases depends on what the opposing ship's crew was at the beginning. But the probable outcome of a fight depends on the respective crews too. What this starts to question is the assumption of both ships initiating with a clean slate (though this isn't clear in Shiver's thinking, I think this is mostly assumed in the outlooks). Related is the question of how to handle the intruder's ability to warp close to the enemy.
- Possibly the only solution is to hit the problem with full melee simulation. So, we have virtual players show up for a nuke BBQ and choose their ships based on some initial ship prices. As the melee progresses they make choices based on experience of previous outcomes regarding crew levels and such. Er, which assumes this is part of informing the tactics of these virtual players. Anyway, after some fighting, see how useful the ships were and update values.
- Um, forget my stupid idea (although it probably just seems more work than it is because this would be a simulation, you wouldn't have to sit there watching and turning the handle). This is far far away and we should probably just go with probability/damage fraction as input for your algorithm.
- I haven't really thought the normalisation through but is there a reason why you divide by (curmax/100) and not simply curmax?
- About credit for ideas . . . hmm. --Zeracles 20:41, 18 January 2008 (CET)
- Regarding normalizing: Yes, it does indeed preserve proportions. If mtch[x][y][n] stands for y's match-up value against x as of iteration n (x and y range from 0 to 24, n is greater than 0), and curmax[i] stands for the curmax as of iteration i, then:
- mtch[x][y][n] = ((mtch[x][y]/curmax) * ((a[x][n-1]/curmax[n-1]) + (a[x][n-1]/curmax[n-1]) + ... + (a[x][n-1]/curmax[n-1]))) / curmax[n]
- mtch[x][y][n] = (mtch[x][y] * (mtch[x][n-1] + mtch[x][n-1] + ... + mtch[x][n-1])) / (curmax * curmax[n-1] * curmax[n])
- Without normalizing, mtch[x][y][n] would be equal to only the numerator of that last fraction. Since all members of the array (as of iteration n) are divided by the same denominator, the proportions between the members of the array are indeed preserved.
- Regarding dividing by curmax/100 instead of curmax: It was a matter of personal preference, I preferred having values between 0 and 100 rather than between 0 and 1. Nevertheless, I've changed the algorithm so that curmax = mtch[maxx][maxy][cur] / 100 and division is by curmax, not curmax / 100. And the final X values are now normalized too, to be more fit for using in-game (no need for a 20000pts maximum fleet). 
- I'll ramble on about your idea of using damage/max_crew instead of chance_of_victory tomorrow, it's past midnight right now. :) Valaggar 23:41, 18 January 2008 (CET)
- Let's analyze the problem at its very core, under the guidance of the Ultron. All readers who wish to enter trance-state may do so now.
- What do we actually mean by "balance"? Or, more importantly, what do we want to obtain by attaining this "balance"? The answer is that we want to make the frequency of usage of ships as evenly distributed as possible. Prices (in conjunction with the rule that a fleet's cost cannot be more than 200pts) are the way this is implemented. Therefore, prices have to be changed in such a way as to minimize the standard deviation of the victory probabilities for player A for each possible fleet choice of A (assuming optimal choice of fleet by player B). The set of prices for which this is the lowest, is the most balanced set of prices.
- In addition to fleet cost being at most 200pts, we will want to use the restriction that there cannot be two ships of the same type in the same fleet. This is the general rule at Melee tournaments, and a good way to assure variety.
- There's my two bani. Valaggar 21:20, 19 January 2008 (CET)
Probability of victory vs damage fraction
Good to hear about the convergence. I may check through your code, just because annoying bugs creep in for even the best of us, but I'm sure it's fine.
I recognise the point you make about what we are trying to achieve - we are looking for balance, to ensure variety, in the case of tournaments, and because players make choices throughout a battle, the probability input is probably preferred over the damage fraction. But undeniably, that's an ideal and there are always going to be instances where a player employs several ships to nag away at one because the player has spent that member of his fleet which would be the natural enemy. This case favours the damage fraction input.
Now, I suspect the ideal input mode would be somewhere in between, but heavily in favour of the probability of victory (interestingly, I think this is more the case at the beginning of a melee, and less so towards the end!). However, away from the tournament aspect of this and into the prospective third person arena (personally, I'm sure we will see it someday), I think it could be more the other way. That's an aside, and because I suspect (faithful) third person SC would be quite different from traditional RTS, it may not even be all that impoprtant.
For our purposes (presumably releasing this when Shiver's guide is finished, but it's up to you of course), I think you're right that it would be best to use the probability interpretation. But there is also the possibility of doing both - perhaps a weighted average of both probability of victory and damage fraction. Probably the easiest way to code this would be to have an additional input file which has the damage fraction values. Also I would have an input file which has two numbers - what to normalise the final x values to (some may find it pleasing/tractable to have this max at 30), and how to weight the probability of victory and damage fraction inputs, expressed as one of these weights probably. All this just so you don't have to compile the code every time you make a change. Just a suggestion.
If you do this, when you release the values you could provide some options.
Testing of the sort in my stupid idea a while back would tell us where the balance is, but that's for another time - it would tell us the victory probabilities assuming given fleet selections which you speak of. Where the balance is would be affected by, I think, the point limit and restriction on having two identical ships. So they could be different for different tournament conditions.
Good work! --Zeracles 18:14, 20 January 2008 (CET)
- I've made the change you proposed.
- By the way, I have to make a slight correction to a reply I posted above. I tested again the situation where one ship has 1 against each other ship, and its value does not actually tend to infinity — far from it. Just replace the uppermost row with 1's (let the first number as 0.5) and the leftmost column with 0's (let the uppermost number as 0.5) and you'll see. The ubership has 30 (when 30 is given in options.in, obviously), and others have values around 20, 18, 16, 13, 9.
- I fear that this shows a grave omission in my algorithm.
- Regarding the damage fraction values: Well, Shiver's guide doesn't exactly list the probabilities of victory, it lists how good each ship is against each ship. I think this does actually take into account both probability of victory and damage fraction, especially considering the text he backs his outlooks up with.
- Regarding simulating Melee matches: Such a simulation would be horrendously difficult to implement, unfortunately. An excellent AI would have to be developed, and probably using learning AI's would be necessary, as a purely iterative approach would take years (if not decades) to run. Melee is based on many high-level tactics (such as the movement pattern of an Y-form, or flanking, or gravity whipping, or other complex and delicate maneuvers), which have to be taken into account as indivisible objects in the process of computing the optimal next move.
- Also, if the standard deviation of the victory probabilities for player A for each possible fleet choice of A (assuming optimal choice of fleet by player B) is called as "imbalance coefficient" of a set of prices, we only need a way to calculate the imbalance coefficient for any given set of prices (and then we can set the prices via trial and error, starting either from the vanilla prices or from EP's balance mod prices). I think. Valaggar 19:38, 20 January 2008 (CET)
Reorganised this page a bit.
Well, if there is still a problem I can look it into it when/if I have time.
Not being able to quantify the extent to which Shiver's values take into account either measure is a problem, and it makes his outlooks difficult to interpret if we care about consistently weighting the measures (which we may not for a first application). Think about the skiff vs avatar case for example. The way to do this properly is with simulation.
You'll forgive me if I don't judiciously prove this overnight, but actually, no, using learning AI's would not be necessary. It seems you underestimate either my seriousness or competence, and while there isn't much I can do about these fallacies, perhaps I can do just a little about your overestimation of how difficult the task would be.
Now there's a reason why, on this page I drew your attention to the planet landing formula work. That problem has striking similarities. To try every strategy one needs to model the melee environment. How difficult is this really, when we have the game's source? It would probably take me longer to learn enough C than it would to do it. Then model each strategy - now, how many possible strategies are there? Not many, you know. In fact, they can all be defined completely by vectors - check out how I have defined the strategy space formalism. No matter how complex and delicate they may look in the end, if humans can do them, there's actually a simple underlying pattern - because a simple strategy often still maps to a complex series of moves.
So define them and code them in as part of the melee environment, which you then run iteratively for each strategy pair. As for taking years or decades, no. Have you never played melee in frenzy mode? Good old 486 computers can handle it, so a complete melee simulation on faster computers might only take a few seconds. And then to the issue of choosing a strategy - this is just like in the planet lander problem - instead of maximising RU gained, one is maximising damage done (not in every circumstance, but there's the starting point). The control structure of the code would even be similar to that of the planet lander code I've started (and as I've shown, a lot of the conceptual work is already outlined).
I estimate that melee simulation is only in the order of a factor of roughly 252 more difficult than the planet lander problem (and it's at least that many times more valuable). Which is not as bad as you say. In reality, the only thing which would stop me is time and lack of assistance. If it was to succeed, it would truly be a blow for the star control cause. There's my pitch. As I've said before, a learning AI learns by experience anyway, so you'll end up doing at least as much work in my opinion, and then your results will not be as easy to interpret. Nor will they be as sturdy.
About the imbalance coefficient, could you backtrack a bit and explain what ``optimal choice of fleet by player B" means? I haven't given it much thought but don't we just want to make the probability of victory the same for any fleet selection which nudges the point limit? I think you could be on to something interesting here with this trial and error approach. But I'm still optimistic about your algorithm! Have you looked into it any more? If so, is it more likely to be a theoretical or coding problem? --Zeracles 19:54, 21 January 2008 (CET)
- I don't know how to word this, but I'm really deeply sorry about underestimating you. I feel thoroughly ashamed of myself, and I must retire now to perform rituals of anguish. Before I go, though, I will reply to your last questions.
- Regarding "assuming optimal choice of fleet by player B": Since for each possible choice of fleet by player A, player B can also choose a lot of possible fleets (and in each of these cases, A's chances of winning are different), we disregard A's probability of winning against all but one of these fleet choices of B. The only fleet choice of B that is taken into account is that one which has the greatest probability of victory for the greatest number of possible fleet choices of A. We only take this choice into account because that's what B would choose with ~100% probability, assuming he's a good enough player — and good players are the type of players we're trying to tailor the prices for, I suppose. Other possible fleet choices of B are of negligible probability. ... umm, maybe we could actually also take into account those other possible fleet choices which are good enough, though not exactly the best? You know, because even good players aren't that good.
- And yes, I guess that, in the end, the probability of victory for any fleet selection which nudges the point limit will be (nearly) the same. Reducing the imbalance coefficient until it hits the lowest value (whether it is indeed zero or just close) is the way this end will be attained.
- Regarding my old algorithm and what type of problem that is: Theoretical. I've looked over the code and I located no problem. I've thought over the algorithm and realized that it wouldn't be normal for it to give 0 to the non-uber ships. I've tested the program when all ships except the uber-ship have 0 against every ship (excluding themselves), and I've tested the program with only three ships, and the results confirm the hypothesis that it's a theoretical rather than a coding problem.
- Now, my old algorithm was anyway doomed from its inception (from a balance standpoint), as balance means not limiting player choices, and just giving values to ships that are directly proportional to how good the ships are... this is not enough to balance it. Some ships will still be rather useless. An Umgah Drone which is rated for what is worth, is not worth much. But an Umgah Drone which has a price smaller than what it would deserve (say it has 2 instead of 7, for example) would be more on par with other ships (it would be as if all of its match-up values were multiplied by something larger than one).
- Anyway, if we would be interested in giving prices to ships proportional to how good they are... "how good a ship is" is actually quite an artificial concept, given the rock-paper-scissors nature of the game... still, it might be possible, I don't know... it's a little too late in the evening/night for this.
- And sorry again for my horrible behaviour, Zeracles. Valaggar 22:16, 21 January 2008 (CET)
Knowledge of the enemy
Just as an aside, if you saw my last post on the UQMF, the set of numbers which goes into the code I talk about is the vector which defines strategy. One would change this vector in real time to adapt to changing situations. If you are for example harrassing the enemy with eluders and they are straying too close to the enemy before turning and releasing BUTTs, simply change the relevant element of the strategy vector. If you then find that they are releasing too frequently and shorting out their energy, another real time adaptation may be in order. As commander, instead of telling units what to do, you are telling them what tactics to adopt. For what it's worth it's probably closer to what you call RTT, then again I'm not sure what the distinction between RTS and RTT is.
Okay, optimal choice of fleet by player B seems to assume player B already knows what player A's fleet is. Tournament conditions come into it. And another question is whether it should be right that the probability of victory should approach the same value for any fleet selection - there is value in choosing a fleet which covers a range of abilities to counter a range of possible enemies. What this possible range is depends on whether the composition of the enemy fleet is known. Depends on tournament conditions. About not assuming that B's choice is perfect . . . I don't know how one would quantify this imperfection.
For the case where B doesn't know the composition of A's fleet, well, B doesn't have an optimal choice. Rather, there is an optimal probability of selecting each particular fleet composition. I won't go any further with this just yet, but an alternative way of posing the balance problem might be to make this optimal probability the same for each particular combination. It may not be exactly the same as the imbalance coefficient approach. So the values which achieve balance may be different - depending on tournament conditions.
My description is probably a bit confused, but this touches on a game theoretic consideration. Anyway, the trial-and-error approach sounds right, but unless you have a way of calculating this imbalance coefficient, it appears beyond reach at this point.
Of course, in the third person arena, knowing the composition of the enemy fleet becomes knowledge of the disposition of the enemy, and some ships will have value added for their ability to come by this knowledge. While I've speculated that mobile ships like the eluder would be terrible in numbers, they have value as scouts and for spoiling formations like the dreadnought formation I've described. Then again, the mauler isn't mobile but could be good at spoiling formations, while being good in numbers itself.
I wouldn't throw out your algorithm just yet - there are no uber-ships . . . I'm looking into it.
Regarding the question of how good a ship is, as I think you know (and as I'm only noting) this is probably only a question which can be answered globally, by looking at what effect sets of values have on balance. Yes, it's an artificial concept, which disturbs an idealist like me, as artificial as looking for balance in the first place. It may not be a question we can answer yet. But the algorithm could still be a valuable first step. I wouldn't say it limits player choices - it's the point limit which does that - I would say it modifies the suitability of the choices.