Skip to main content

A lunch roulette algorithm in R

The purpose of the algorithm described here is to randomly create pairs from a list of people to go out for lunch. As an input, this algorithm needs a list of people created as shown below and saved to a file called people.csv, in UTF-8 format.

jill.valentine@capcom.com,Jill Valentine,1
james.bond@mi6.gov.uk,007,1
tony.stark@marvel.com,Tony Stark,2
david@konami.com,Solid Snake,1
hannibal.lecter@jhu.edu,Hannibal Lecter,0
lara.croft@core-design.com,Lara Croft,2
daenerys.targaryen@hbo.com,Daenerys Targaryen,1
walter.white@caltech.edu,Heisenberg,1

Each line in this file should contain an email address, a name, and a code that should be set to 0, 1 or 2. Code 0 means ignored, so only people with codes 1 or 2 are considered when creating new pairs. In case of an odd number of people with codes 1 or 2, 1 randomly selected person with code 2 is discarded. In order to not have to worry about odd numbers, it is best to always have at least 1 person with code 2. Also important to keep in mind, names can be repeated if necessary but the email addresses should all be different. The output of the algorithm is a list of pairs of email addresses generated as shown below, which is automatically saved to a file.

daenerys.targaryen@hbo.com,jill.valentine@capcom.com
walter.white@caltech.edu,james.bond@mi6.gov.uk
david@konami.com,lara.croft@core-design.com

The R code that makes this happen is shown below.

rm(list = ls())

everyone <- read.table("people.csv", sep = ",", fileEncoding = "UTF-8")
names(everyone) <- c("E", "N", "EN")

sort.2 <- function(x) {
  n <- rep(c(TRUE, FALSE), length.out = length(x))
  for (p in 1:sum(n) * 2) {
    if (x[p] < x[p - 1]) {
      n[p] <- TRUE
      n[p - 1] <- FALSE
    }
  }
  return(paste(x[n], x[!n], sep = ","))
}

e <- rev(list.files(pattern = "^event-[0-9]{3}-.{10}\\.csv$"))

past.2 <- data.frame()

if (length(e) > 0) {
  for (j in 1:length(e)) {
    past.2 <- rbind(past.2, cbind(read.table(e[j], sep = ","), C = 2 / 2 ^ j))
  }
  names(past.2)[1:2] <- c("E.1", "E.2")
  a.1 <- as.character(rbind(past.2$E.1, past.2$E.2))
  a.2 <- sort.2(a.1)
}

w <- everyone$EN > 0
i <- -1
while (i != 0) {
  s <- w
  if (sum(s) %% 2 > 0) {
    s[-sample(-which(everyone$EN > 1), 1)] <- FALSE
  }
  b.1 <- sample(everyone$E[s])
  b.2 <- sort.2(b.1)
  u <- rep(0, length(b.2))
  if (length(e) > 0) {
    for (k in 1:length(u)) {
      u[k] <- sum(past.2$C[match(a.2, b.2[k], nomatch = 0) > 0])
    }
  }
  if (i < 0 || max(u) < max(new.2$C)) {
    i <- max(0, i) + 1e4
    new.2 <- data.frame(E.1 = b.1[c(TRUE, FALSE)], E.2 = b.1[c(FALSE, TRUE)],
                        C = u)
  }
  i <- i - 1
}

write.table(new.2[1:2], sprintf("event-%03d-%s.csv", length(e) + 1,
                                Sys.Date()),
            quote = FALSE, sep = ",", row.names = FALSE, col.names = FALSE)

rm(list = setdiff(ls(), c("everyone", "new.2", "past.2", "sort.2")))

In this code, the variable everyone contains the list of people extracted from the file people.csv. Any pairs previously created by the algorithm are sorted by most recent first, assigned a cost, and stored in the variable past.2. This cost starts at 1 and drops by 50% every time the age of the pairs increases. The last part of the algorithm is all about finding a solution that minimizes repeated pairs. The number of attempts is defined by the variable i in the sense that the algorithm stops when i reaches 0. Every time a better solution is found, the number of remaining attempts is increased by 10000. In each attempt: discard 1 person with code 2 if needed, create a new random list of pairs, calculate the cost of each pair in this new list by adding all the corresponding costs found in past.2, calculate the cost of the new solution, and update the best solution if the newly calculated cost is lower than the lowest cost previously found. In this algorithm, the cost of a solution is defined as the cost of the pair with the highest cost. Once everything is finished, the best solution found is stored in the variable new.2.