Loop Perforation: streamcluster Perforated Loops

This benchmark implements an online approximation of the K-means clustering algorithm~\cite{hartigan1979k}. It partitions a set of points into clusters, with each cluster centered around one of the input points. The algorithm is designed to process sets of points from a space of features, with the clusters corresponding to entities with similar features.

The inner loop in pFL() estimates the cost of opening a new cluster center. Perforating this loop allows the application to make a less accurate estimate of this cost more quickly. The following comment appears above the definition of the loop bound (ITER) in the source code:

/* higher ITER --> more likely to get correct number of centers */
/* higher ITER also scales the running time almost linearly */

This comment reflects the fact that the original version of the loop computes an inexact approximation of the true cost, with the number of iterations controlling a performance vs. quality of service tradeoff. In effect, the perforation space exploration space algorithm rediscovers this tradeoff.

// This function is defined in streamcluster.cpp in the streamcluster source code
float pFL(Points *points, int *feasible, int numfeasible,
   float z, long *k, double cost, long iter, float e,
   int pid, pthread_barrier_t* barrier)
{
#ifdef ENABLE_THREADS
 pthread_barrier_wait(barrier);
#endif
 long i;
 long x;
 double change;
 long numberOfPoints;

 change = cost;
 /* continue until we run iter iterations without improvement */
 /* stop instead if improvement is less than e */
 while (change/cost > 1.0*e) {
   change = 0.0;
   numberOfPoints = points->num;
   /* randomize order in which centers are considered */

   if( pid == 0 ) {
     intshuffle(feasible, numfeasible);
   }
#ifdef ENABLE_THREADS
   pthread_barrier_wait(barrier);
#endif
    // This loop is perforated, changing i++ to i+=2
   for (i=0;i<iter;i++) {
     x = i%numfeasible;
     change += pgain(feasible[x], points, z, k, pid, barrier);
   }

   cost -= change;
#ifdef PRINTINFO
   if( pid == 0 ) {
     fprintf(stderr, "%d centers, cost %lf, total distance %lf\n",
       *k, cost, cost - z*(*k));
   }
#endif
#ifdef ENABLE_THREADS
   pthread_barrier_wait(barrier);
#endif
 }
 return(cost);
}

Our examination of the source code confirms that perforating this loop
(as the comment suggests) leaves the application able to process all inputs.

Loop Perforation

Wednesday, November 25, 2009

streamcluster Perforated Loops

Blog Archive

Contributors

Followers