Luckily, a site I frequent shares its pocket monster data with anyone. The data I'm looking for is kept as CSV files in the pokedex/data/csv folder. After getting the data, let's load it into R.
# load the data
types <- read.csv("types.csv")
type_efficacy <- read.csv("type_efficacy.csv")
Now let's clean up the data a bit.
# get rid of some columns we don't care about
types <- types[, 1:2]
# scale down damage factor
type_efficacy$damage_factor <- type_efficacy$damage_factor / 100
Ok, great. If we look at our type_efficacy data frame, we have ids for the damage and target types. I'd like to replace those ids with plain-text labels that I can understand a little more easily. This would be super easy in SQL, but for some reason it seems to be a real pain in R. Maybe there is a better answer.
# hacky way to get this to work. needs a better solution.
# join from plyr won't work because the column names are not the same?
te <- merge(type_efficacy, types, by.x = "damage_type_id", by.y="id", all=T)
names(te)[[4]] <- "damage_type"
te <- merge(te, types, by.x = "target_type_id", by.y="id", all=T)
names(te)[[5]] <- "target_type"
# get rid of the NA rows and id value columns
te <- te[complete.cases(te), c("damage_type", "target_type", "damage_factor")]
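One possibly less painful way to swap ids for labels is a plain lookup with base R's match(), which skips the merge-and-rename dance. This is only a sketch on a tiny made-up subset: the toy data frames below (types_toy, te_raw_toy) are hypothetical stand-ins that assume the same column names as the real CSVs (id, identifier, damage_type_id, target_type_id, damage_factor).

```r
# hypothetical toy stand-ins for types and type_efficacy (already scaled)
types_toy <- data.frame(id = c(1, 2, 3),
                        identifier = c("normal", "fighting", "flying"),
                        stringsAsFactors = FALSE)
te_raw_toy <- data.frame(damage_type_id = c(1, 2, 2, 3),
                         target_type_id = c(1, 1, 3, 2),
                         damage_factor = c(1, 2, 0.5, 2))

# match() returns the row position of each id in the lookup table,
# so indexing identifier by it translates ids to labels in one step
te_toy <- data.frame(
  damage_type = types_toy$identifier[match(te_raw_toy$damage_type_id, types_toy$id)],
  target_type = types_toy$identifier[match(te_raw_toy$target_type_id, types_toy$id)],
  damage_factor = te_raw_toy$damage_factor,
  stringsAsFactors = FALSE
)
print(te_toy)
```

Ids with no entry in the lookup table come back as NA, so a complete.cases() filter would still be needed on the real data.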
Next we need all pairings of the types. Turns out not all of these type pairings exist as real pokemon, but we can deal with that later.
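# create all pairings of types
# assumption: the post never shows how t was built; here it is the
# type labels plus NA (NA in t2 stands for a single-type pokemon)
t <- c(sort(unique(as.character(te$damage_type))), NA)
tm <- expand.grid(t, t)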
colnames(tm) <- c("t1", "t2")
# get rid of rows with same type
tm <- tm[which(tm$t1 != tm$t2 | is.na(tm$t2)) , ]
tm <- tm[!is.na(tm$t1), ]
Great! But, there is a small problem. We have both Steel / Grass and Grass / Steel. The order of the pairings does not matter in calculating the resistances, so let's get rid of those redundant combinations. This is another place where a better solution is probably available, but this quick function makes it fast and easy to identify these duplicated pairings. I'll source this function, run it against the type pairings data frame and then get rid of all duplicated pairings.
# quick function to find combinations that are already in the data frame
tagDups <- function(df = tm) {
  df$dup <- FALSE
  for (i in seq_len(nrow(df))) {
    v1 <- df[1:i, ]$t1 == df[i, ]$t2
    v2 <- df[1:i, ]$t2 == df[i, ]$t1
    if (length(which(v1 & v2)) > 0) {
      df[i, ]$dup <- TRUE
    }
  }
  return(df)
}
# and get rid of those rows
tm <- tagDups()
tm <- tm[!tm$dup, 1:2]
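A possible shortcut for the whole expand-then-deduplicate step: base R's combn() generates each unordered pair exactly once, so there is nothing to tag and remove afterwards. A minimal sketch, assuming a hypothetical four-type vector (type_names) instead of the real type list:

```r
# hypothetical short type list; the real one would come from the types data
type_names <- c("fire", "grass", "steel", "water")

# combn() yields every unordered pair once, one pair per column;
# transpose so that each pair becomes a row
pairs <- t(combn(type_names, 2))

# single types (t2 = NA) first, followed by the unique dual pairings
tm_alt <- data.frame(
  t1 = c(type_names, pairs[, 1]),
  t2 = c(rep(NA, length(type_names)), pairs[, 2]),
  stringsAsFactors = FALSE
)
print(tm_alt)
```

With n types this gives n + choose(n, 2) rows, which would be 17 + 136 = 153 for the 17 types used here.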
Back to the damage calculations. I would like a data frame where I can easily look up the damage factors for each type. I'll cast the labeled type efficacy data frame (te) from earlier into a more convenient form.
# create a damage / target matrix
library(reshape2)
type_dmg <- dcast(te, damage_type ~ target_type, value.var = "damage_factor")
head(type_dmg)
  damage_type bug dark dragon electric fighting fire flying ghost grass ground ice normal poison psychic rock steel water
1         bug 1.0  2.0    1.0      1.0      0.5  0.5    0.5   0.5   2.0      1   1      1    0.5     2.0  1.0   0.5   1.0
2        dark 1.0  0.5    1.0      1.0      0.5  1.0    1.0   2.0   1.0      1   1      1    1.0     2.0  1.0   0.5   1.0
3      dragon 1.0  1.0    2.0      1.0      1.0  1.0    1.0   1.0   1.0      1   1      1    1.0     1.0  1.0   0.5   1.0
4    electric 1.0  1.0    0.5      0.5      1.0  1.0    2.0   1.0   0.5      0   1      1    1.0     1.0  1.0   1.0   2.0
5    fighting 0.5  2.0    1.0      1.0      1.0  1.0    0.5   0.0   1.0      1   2      2    0.5     0.5  2.0   2.0   1.0
6        fire 2.0  1.0    0.5      1.0      1.0  0.5    1.0   1.0   2.0      1   2      1    1.0     1.0  0.5   2.0   0.5
Ok, ugly formatting, but whatever. You get the idea. Now we can get a type profile simply with the command:
# now we can get the damage taken profile for a type
type_dmg[, "ice"]
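For anyone who would rather avoid the reshape2 dependency, roughly the same lookup table can probably be built in base R with tapply(). This is a sketch on a hypothetical four-row slice (te_slice) with the same three columns as te; the damage factors are copied from the table above.

```r
# hypothetical slice of the labeled efficacy data
te_slice <- data.frame(
  damage_type = c("fire", "fire", "electric", "electric"),
  target_type = c("grass", "ground", "grass", "ground"),
  damage_factor = c(2, 1, 0.5, 0),
  stringsAsFactors = FALSE
)

# rows = attacking type, columns = defending type;
# each cell holds exactly one value, so sum() just passes it through
type_mat <- with(te_slice,
                 tapply(damage_factor, list(damage_type, target_type), sum))

# damage taken profile for grass, named by attacking type
type_mat[, "grass"]
```

The result is a numeric matrix rather than a data frame, so the profile comes back already labeled with the attacking types.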
At this point we have a data frame listing all type pairings in tm and a lookup table for a type damage profile in type_dmg. This should give us enough to do all the analysis now.
First, we need a quick function to create the pairing summaries for us.
# returns damage taken vector for type pairing
calcRes <- function(row, td = type_dmg) {
  t1 <- as.character(row$t1)
  t2 <- as.character(row$t2)
  if (is.na(row$t2)) {
    # single type: use its damage taken column as is
    res <- td[, t1]
  } else {
    # dual type: multiply the two damage taken columns element-wise
    res <- (td[, t1] * td[, t2])
  }
  immunities <- length(which(res == 0))
  resistances <- length(which(res > 0 & res < 1))
  standards <- length(which(res == 1))
  weakness <- length(which(res == 2))
  kryptonite <- length(which(res == 4))
  # note: c() coerces everything to character because t1 and t2 are strings
  return(c(t1, t2, immunities, resistances, standards, weakness, kryptonite, res))
}
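To make the multiplication concrete, here is a small worked example for a Grass / Steel pairing, using only the six attacking types shown in the head(type_dmg) output above. The vector names (attackers, grass, steel) are just for illustration, and the numbers are copied from that table.

```r
# damage factors copied from the head(type_dmg) output above
attackers <- c("bug", "dark", "dragon", "electric", "fighting", "fire")
grass <- c(2.0, 1.0, 1.0, 0.5, 1.0, 2.0)
steel <- c(0.5, 0.5, 0.5, 1.0, 2.0, 2.0)

# a dual type takes the product of its two single-type factors
res <- setNames(grass * steel, attackers)

# same buckets that calcRes() counts
counts <- c(
  immunities  = sum(res == 0),
  resistances = sum(res > 0 & res < 1),
  standards   = sum(res == 1),
  weaknesses  = sum(res == 2),
  kryptonite  = sum(res == 4)
)
print(res)
print(counts)
```

Fire is the kryptonite here: it doubles against both Grass and Steel, so the pairing takes 2 * 2 = 4 times damage.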
And an object to store our results.
type_summary <- data.frame(t1 = character(),
                           t2 = character(),
                           immunities = numeric(),
                           resistance = numeric(),
                           standards = numeric(),
                           weaknesses = numeric(),
                           kryptonite = numeric(),
                           bug = numeric(),
                           dark = numeric(),
                           dragon = numeric(),
                           electric = numeric(),
                           fighting = numeric(),
                           fire = numeric(),
                           flying = numeric(),
                           ghost = numeric(),
                           grass = numeric(),
                           ground = numeric(),
                           ice = numeric(),
                           normal = numeric(),
                           poison = numeric(),
                           psychic = numeric(),
                           rock = numeric(),
                           steel = numeric(),
                           water = numeric(),
                           stringsAsFactors = FALSE)
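Finally, for each type pairing, let's calculate its resistances. I probably should have done this with sapply or something, but for only a few hundred rows, I guess it's ok to do it the "easiest" way I know.
for (i in seq_len(nrow(tm))) {
  type_summary[i, ] <- calcRes(tm[i, ])
}
# calcRes() returns a character vector, so turn the counts and damage factors back into numbers
type_summary[, -(1:2)] <- lapply(type_summary[, -(1:2)], as.numeric)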
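For the record, an apply-style version might look something like the sketch below. It builds one small data frame per pairing and stacks them with rbind, which keeps the numeric columns numeric. Everything here is a toy stand-in: type_dmg_toy, tm_toy and summarise_pair are hypothetical names, and the damage factors are a three-by-three slice copied from the type_dmg output earlier in the post.

```r
# toy lookup table: rows = attacking type, columns = defending type
type_dmg_toy <- data.frame(
  damage_type = c("electric", "fighting", "fire"),
  grass  = c(0.5, 1.0, 2.0),
  ground = c(0.0, 1.0, 1.0),
  steel  = c(1.0, 2.0, 2.0),
  stringsAsFactors = FALSE
)

# toy pairings; NA in t2 means a single type
tm_toy <- data.frame(t1 = c("grass", "grass", "ground"),
                     t2 = c(NA, "steel", "steel"),
                     stringsAsFactors = FALSE)

# one-row summary for a pairing (a trimmed-down take on calcRes)
summarise_pair <- function(t1, t2, td) {
  res <- if (is.na(t2)) td[[t1]] else td[[t1]] * td[[t2]]
  data.frame(t1 = t1, t2 = t2,
             immunities = sum(res == 0),
             weaknesses = sum(res == 2),
             kryptonite = sum(res == 4),
             stringsAsFactors = FALSE)
}

# build one row per pairing, then stack the rows
rows <- lapply(seq_len(nrow(tm_toy)), function(i) {
  summarise_pair(tm_toy$t1[i], tm_toy$t2[i], type_dmg_toy)
})
summary_toy <- do.call(rbind, rows)
print(summary_toy)
```

Returning a data frame instead of a c() vector is the design choice that matters: mixing strings and numbers in c() turns everything into character.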
After it's all done, we have a nice data set that looks like this.
CSV
Rda
That was fun. Now I can look at various type pairings as I plan my team for X/Y in October. Surely this will make X/Y more fun for me. Or actually, maybe this was the fun part. No insight to share this time, maybe in the next post. Of course, thanks to veekun and his pokedex for the raw data.
Found this recently. Will use it instead.
http://pokemondb.net/type/dual