Speed up kAnon (localSuppression)

Maartje Boer suggested speeding up kAnon through parallelization.

The simple benchmark below shows that kAnon is slower when strata are used, which suggests that parallelizing over the strata would be beneficial.

```{r}
library(ggplot2)
library(sdcMicro)
data(testdata)

testdata$ageG <- cut(testdata$age, 5, labels=paste0("AG",1:5))
kv <- c("urbrur", "roof", "walls", "water", "electcon", "relat", "sex")

## data.frame method without stratification
system.time(res <- kAnon(testdata, keyVars = kv))
## data.frame method, stratified by age group
system.time(res2 <- kAnon(testdata, keyVars = kv, strataVars = "ageG"))

plot(res)
plot(res2)

## draw a bootstrap sample of n rows (with replacement) to scale the data up
bs <- function(df, n = nrow(df)){
  sample_indices <- sample(seq_len(nrow(df)), size = n, replace = TRUE)
  df[sample_indices, , drop = FALSE]
}


## elapsed time of kAnon on a bootstrap sample of the given size
f <- function(x = testdata, kv, size, svar = NULL){
  ctime <- system.time(res <- kAnon(bs(x, size), keyVars = kv, strataVars = svar))["elapsed"]
  return(ctime)
}

## sample sizes to benchmark
N <- seq(100000, 5000000, 500000)
mytime_strat <- mytime <- numeric(length(N))
for(i in seq_along(N)){
  mytime[i] <- f(testdata, kv, N[i])
  mytime_strat[i] <- f(testdata, kv, N[i], svar = "ageG")
}

mytimes <- data.frame("time" = c(mytime,mytime_strat),
                      "N" = rep(N, 2),
                      "method" = rep(c("no strat", "strat"), each = length(N)))

options(scipen = 999)
ggplot(mytimes, aes(x = N, y = time, colour = method)) + 
  geom_line() + 
  geom_point()
```

The strata could be processed on different cores, which might bring the computation time close to that of the non-stratified case.
See code line 471 of `localSuppression.R` to see where parallelization might come into play.
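
As a rough illustration of the idea (not the sdcMicro implementation), the strata could be split up and handed to separate workers with the base `parallel` package. The helper `run_by_strata` below is hypothetical, and calling `kAnon` separately on each stratum is an assumption: it may not reproduce the result of `strataVars` exactly, for example if the default importance of the key variables is derived from the full data.

```r
library(parallel)

# Hypothetical helper (not part of sdcMicro): apply `fun` to each stratum of
# `df` on its own worker and return the per-stratum results as a named list.
run_by_strata <- function(df, strata_var, fun, cores = 2L) {
  pieces <- split(df, df[[strata_var]], drop = TRUE)
  # Forking is not available on Windows, so fall back to a single core there.
  if (.Platform$OS.type == "windows") cores <- 1L
  mclapply(pieces, fun, mc.cores = cores)
}

# Toy check with a cheap stand-in for the per-stratum work.
toy <- data.frame(g = rep(c("a", "b", "c"), times = c(2, 3, 5)), x = 1:10)
toy_res <- run_by_strata(toy, "g", function(d) nrow(d))

# With sdcMicro installed, the same helper could wrap kAnon (assumption:
# running kAnon independently per stratum is acceptable for the comparison).
if (requireNamespace("sdcMicro", quietly = TRUE)) {
  data(testdata, package = "sdcMicro")
  testdata$ageG <- cut(testdata$age, 5, labels = paste0("AG", 1:5))
  kv <- c("urbrur", "roof", "walls", "water", "electcon", "relat", "sex")
  res_par <- run_by_strata(testdata, "ageG",
                           function(d) sdcMicro::kAnon(d, keyVars = kv))
}
```

Wrapping the `run_by_strata` call in `system.time()` would allow a direct comparison with the `strataVars = "ageG"` timing above.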

Note that further parameters could be varied, such as alpha and the number of key variables, and the benchmarking could be extended (e.g. with microbenchmark).
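
A minimal sketch of such an extended benchmark, using only `system.time()` over a parameter grid. The helper `time_grid` and the grid values are hypothetical choices for illustration; `alpha` is passed on to `kAnon` as in `localSuppression`.

```r
# Hypothetical helper: time `fun` once for every row of a parameter grid and
# return the grid with an added `elapsed` column (seconds).
time_grid <- function(grid, fun) {
  grid$elapsed <- vapply(seq_len(nrow(grid)), function(i) {
    args <- as.list(grid[i, , drop = FALSE])
    unname(system.time(do.call(fun, args))["elapsed"])
  }, numeric(1))
  grid
}

# Illustrative grid: two alpha values, three different numbers of key variables.
grid <- expand.grid(alpha = c(0.5, 1), n_keys = c(3, 5, 7))

if (requireNamespace("sdcMicro", quietly = TRUE)) {
  data(testdata, package = "sdcMicro")
  kv <- c("urbrur", "roof", "walls", "water", "electcon", "relat", "sex")
  timings <- time_grid(grid, function(alpha, n_keys) {
    sdcMicro::kAnon(testdata, keyVars = kv[seq_len(n_keys)], alpha = alpha)
  })
}
```

Single runs are noisy, so repeating each grid cell several times (which is what microbenchmark automates) would give more reliable numbers.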

Speed up kAnon (localSuppression) #349

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Speed up kAnon (localSuppression) #349

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions