
Arabic Plots in R

Many R packages do not work well with right-to-left languages like Arabic. Here's how I've gotten around this problem.

I’ve read that this is not a problem for Linux users, but on a Mac, which is what I use, neither base R graphics nor ggplot2 displays Arabic script properly. This appears to be more than an encoding problem: switching to more Arabic-friendly encodings doesn’t fix it. Here’s a basic bar plot with Arabic script, using the same data I used in this previous post; the pre-summarized data from the Arab Barometer III survey may be downloaded from here. Arabic readers will note that the text runs in the wrong direction and that the letters are not properly connected, even though the script renders properly within the RStudio environment, i.e., in the editor where the strings are assigned to mytitle and myname.

## the data
library(reshape2)   # melt()
library(magrittr)   # %>% pipe, used with plotly further down
library(ggplot2)
data <- read.csv("http://rnotr.com/assets/files/trust_gov_army_ar.csv", header = TRUE, encoding = "UTF-8")
## order the country names by trust in the army (descending for English, ascending for Arabic)
data$name_en <- factor(data$name_en, levels = data$name_en[order(-data$trust_army)])
data$name_ar <- factor(data$name_ar, levels = data$name_ar[order(data$trust_army)])
## reshape to long format: one row per country and institution
data <- melt(data, id.vars = c("name_en", "name_ar"))
## the plot
mytitle <- "الثقة في الحكومات العربية مقابل الثقة في الجيوش العربية"
myname <- "في المئة الذين يثقون في المؤسسة"
p <- ggplot(data, aes(x = name_ar, y = value, fill = variable)) +
  geom_bar(stat = "identity", position = "dodge", color = "#ffffff") +
  theme(panel.background = element_rect(fill = "#ffffff")) +
  ggtitle(mytitle) +
  scale_fill_manual(values = c("#45B29D", "#E27A3F"), name = "") +
  xlab("") +
  scale_y_continuous(name = "", limits = c(0, 100), breaks = seq(0, 100, 20))
p
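
To check the claim that this is more than an encoding problem, it can help to inspect an Arabic string directly. The snippet below is only a sketch and was not part of my original workflow: it takes the first word of the title, written with \u escapes so that it does not depend on how the script file itself is saved, and looks at how R stores it. If the string is marked as UTF-8, is valid, and its code points come out in reading order, then the data is most likely fine and the trouble lies in how the graphics device shapes and orders the glyphs when drawing.

```r
# First word of the plot title ("الثقة"), written with Unicode escapes
s <- "\u0627\u0644\u062b\u0642\u0629"

Encoding(s)                # encoding R has marked the string with
validUTF8(s)               # are the bytes well-formed UTF-8?
nchar(s, type = "chars")   # number of characters
nchar(s, type = "bytes")   # number of bytes used to store them
utf8ToInt(s)               # code points, in logical (reading) order
```

The same calls can be run on mytitle and myname after they are assigned.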

Of course the plot renders with no problem in English, as below.

## in English
ggplot(data, aes(x=name_en, y=value, fill=variable)) + 
  geom_bar(stat="identity", position="dodge", color="#ffffff") +
  theme(panel.background = element_rect(fill = "#ffffff")) +
  ggtitle("\nConfidence in Arab Government vs. Arab Armies\n") +
  scale_fill_manual(values=c("#45B29D", "#E27A3F"), 
  name="Percent Trust\nin Institution", labels=c("Government","Army")) + 
  xlab("") + scale_y_continuous(name="",limits=c(0,100), breaks = seq(0,100, 20))


I think that fixing the rendering problem at the system level would be the best approach, but that is beyond what I can do. The same problems with rendering Arabic script persist in Microsoft Office for Mac, and I’d like to think that if there were an easy fix, Microsoft would have implemented it by now. After all, there are nearly 300 million native Arabic speakers, making it the fifth most-spoken language in the world.

Because nearly every modern web browser displays Arabic text without trouble, I passed the messy Arabic ggplot object through plotly, and it worked quite well. After converting it to a plotly object, I only had to make a few minor adjustments: I moved the y-axis to the right-hand side, increased the right margin slightly, and moved the legend to the top-left corner. The legend labels were also “lost,” reverting to the trace names, so I had to re-apply them manually. Not every plot needs to be interactive, so I appended .svg to the plotly URL to embed a static version of the plot in this post.

library(plotly)
## convert the ggplot object, then adjust the layout for right-to-left reading
gg <- ggplotly(p) %>%
  layout(titlefont = list(size = 24),
         yaxis = list(side = "right", gridcolor = toRGB("gray90"), gridwidth = 1, ticks = "",
                      title = "في المئة الذين يثقون في المؤسسة", titlefont = list(size = 20)),
         xaxis = list(ticks = "", tickfont = list(size = 16)),
         legend = list(x = 0, y = 1, font = list(size = 20)),
         margin = list(l = 30, r = 45, b = 30, t = 80))
## re-apply the legend labels lost in the conversion: "Government", then "Army"
gg$x$data[[1]]$name <- "الحكومة"
gg$x$data[[2]]$name <- "الجيش"
gg
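
Re-applying the legend labels by position, as above, depends on the order in which plotly happens to create the traces. A slightly more defensive variant, sketched here on toy data, looks up each trace by its current name instead. Everything in it is an assumption for illustration: the toy data frame, its values, and the column names trust_gov and trust_army are placeholders rather than the Arab Barometer figures, and I am assuming that the converted traces are named after the levels of the fill variable.

```r
library(ggplot2)
library(plotly)

# Toy long-format data; names and values are placeholders, not survey results
toy <- data.frame(
  country  = rep(c("A", "B"), each = 2),
  variable = rep(c("trust_gov", "trust_army"), times = 2),
  value    = c(40, 70, 55, 80)
)

p_toy <- ggplot(toy, aes(x = country, y = value, fill = variable)) +
  geom_bar(stat = "identity", position = "dodge")
gg_toy <- ggplotly(p_toy)

# Inspect the names plotly assigned to each trace (NA if a trace has none)
vapply(gg_toy$x$data, function(tr) as.character(tr$name)[1], character(1))

# Lookup table from current trace names to the Arabic legend labels
labels_ar <- c(trust_gov = "الحكومة", trust_army = "الجيش")

# Rename only the traces whose current name appears in the lookup table
for (i in seq_along(gg_toy$x$data)) {
  old <- gg_toy$x$data[[i]]$name
  if (is.character(old) && length(old) == 1 && old %in% names(labels_ar)) {
    gg_toy$x$data[[i]]$name <- labels_ar[[old]]
  }
}
```

Traces whose names are not in the lookup table are left untouched, so printing the names again afterwards shows which labels still need manual attention.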

