R - Comparing plot variables with average results
Given the following data frame, I have been able to create the following plot:
library(ggplot2)

df = read.csv("http://pastebin.com/raw.php?i=mltkev3z")

ggplot(df, aes(x = factor(identificación.con.el.barrio), fill = nombre.barrio)) +
  geom_histogram(position = "dodge") +
  ggtitle("¿te identificas con tu barrio?") +
  labs(x = "grado de identificación con el barrio", fill = "barrios")
This results in the following plot:
However, I would like to add a new column with the average results for each value of the "grado" variable (with no stratification per neighborhood, aka "barrio"), so that I can compare each neighborhood's result with the city's.
Could anyone help me with how to achieve that?
This is not elegant, but it works. It involves changing the original data around to make a frequency table, then adding the group averages. That necessitates using geom_bar() in ggplot instead of geom_histogram(). The result is identical.
# make a frequency table of the data
library(plyr)
df2 <- ddply(df, .(barrio, grado), summarise, freq = length(grado))

# make a table of averages: city-wide count per grado divided by 3
# (3 is taken to be the number of neighborhoods)
avg <- as.data.frame(table(df$grado) / 3, stringsAsFactors = FALSE)
names(avg) <- c("grado", "freq")
avg$barrio <- "average"

# combine the tables
df2 <- rbind(df2, avg)
df2$grado <- as.character(df2$grado)
df2[is.na(df2$grado), "grado"] <- "n/a"

# plot using a bar plot instead of a histogram
ggplot(df2, aes(x = grado, y = freq, fill = barrio)) +
  geom_bar(stat = "identity", position = position_dodge()) +
  scale_x_discrete("grado de identificación con el barrio") +
  scale_y_continuous("count")
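To check the averaging step on its own, here is a minimal base R sketch on made-up data. The toy data frame and the names `toy`, `counts`, `n_barrios` and `combined` are assumptions for illustration only; it uses table() and aggregate() rather than plyr, so unlike the code above it also keeps zero-count combinations.

```r
# Hypothetical toy data: three neighborhoods, scores 1 to 3
toy <- data.frame(
  barrio = c("A", "A", "A", "B", "B", "C", "C", "C", "C"),
  grado  = c(1, 2, 2, 1, 3, 2, 2, 3, 3)
)

# Frequency of each grado within each barrio, in long format
counts <- as.data.frame(
  table(barrio = toy$barrio, grado = toy$grado),
  responseName = "freq",
  stringsAsFactors = FALSE
)

# City-wide count per grado divided by the number of neighborhoods
n_barrios <- length(unique(toy$barrio))
avg <- aggregate(freq ~ grado, data = counts,
                 FUN = function(x) sum(x) / n_barrios)
avg$barrio <- "average"

# Put the columns in the same order, then stack the two tables
combined <- rbind(counts, avg[, names(counts)])
```

The `combined` table has one row per neighborhood and score plus one "average" row per score, which is the shape geom_bar(stat = "identity") expects.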
Note: I changed the variable names to something simpler, hence the explicit scale labels.
The result is this:
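The renaming itself is not shown in the answer. One possible way to do it, sketched on a small stand-in data frame (the data values are made up, and the long column names are copied from the question's code as they appear there), is:

```r
# Hypothetical stand-in for the data frame read from the CSV
df <- data.frame(
  "identificación.con.el.barrio" = c(1, 2, NA),
  "nombre.barrio" = c("A", "B", "A"),
  check.names = FALSE
)

# Rename the long survey column names to the short ones used in the answer
names(df)[names(df) == "identificación.con.el.barrio"] <- "grado"
names(df)[names(df) == "nombre.barrio"] <- "barrio"
```

Renaming by name rather than by position keeps the step correct even if the column order in the file differs.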