Gentoo Forums
[SCRIPT] eprofile: record and predict build-times
Havin_it
Veteran


Joined: 17 Jul 2005
Posts: 1266
Location: Edinburgh, UK

Posted: Mon Apr 26, 2010 4:17 pm    Post subject: [SCRIPT] eprofile: record and predict build-times

MOTIVATION:
Do you ever emerge -uDva world, look at a whopping list of packages and wonder: "How the heck long will that take?" No? Well, I do, and that is why I cooked up this script to (try to) answer the question.

In a nutshell, eprofile first reads /var/log/emerge.log to determine how long each merge took. NOTE: If you haven't emptied or rotated this file in a few years, this will take a while! It stores all this information in an SQLite database.
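
To make the timing idea concrete, here is a minimal sketch (not the script's actual parser) of how a build time falls out of two log timestamps. The sample log lines, the package app-misc/foo and the loosened regex $rx are made up for illustration; the real RXP_MERGE in the script is much stricter about what sits between the two lines.

```php
<?php
// Simplified sample of emerge.log lines (hypothetical package and timestamps).
$log = <<<LOG
1272298000:  >>> emerge (1 of 1) app-misc/foo-1.0 to /
1272298005:  === (1 of 1) Cleaning (app-misc/foo-1.0::/usr/portage/app-misc/foo/foo-1.0.ebuild)
1272298200:  ::: completed emerge (1 of 1) app-misc/foo-1.0 to /
LOG;

// Much looser than RXP_MERGE: pair each ">>> emerge" line with the first
// later "::: completed" line that repeats the same text.
$rx = '/^(\d{10}):  >>> (emerge \(\d+ of \d+\) (\S+) to \/)\n(?:.*\n)*?(\d{10}):  ::: completed \2$/m';

$durations = array();
if (preg_match_all($rx, $log, $m, PREG_SET_ORDER)) {
    foreach ($m as $hit) {
        // Build time in seconds = completion timestamp - start timestamp.
        $durations[$hit[3]] = (int) $hit[4] - (int) $hit[1];
    }
}
print_r($durations);
```

The timestamps are plain Unix epoch seconds, so the subtraction needs no date handling at all.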

Once the database is full, you can use eprofile to find the time it took for any previous merges of a given package, along with an average and maximum build-time calculation, as long as it's a package you've merged before. Alternatively you can feed it an emerge command, and have it predict the average and max time for the whole operation. It also supports some "utility" commands like revdep-rebuild, perl-cleaner and python-updater.
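
As a rough sketch of how the prediction adds up, assuming a handful of made-up build times: each package contributes its own average and its own maximum, and the predicted totals are simply the sums of those. The $history array and the fmt() helper are hypothetical names used only here, not part of eprofile.

```php
<?php
// Hypothetical recorded build times in seconds, keyed by package.
$history = array(
    'app-misc/foo' => array(120, 180, 150),
    'dev-libs/bar' => array(600, 900),
);

$avgTotal = 0;
$maxTotal = 0;
foreach ($history as $pkg => $times) {
    $avgTotal += array_sum($times) / count($times); // per-package average
    $maxTotal += max($times);                       // per-package worst case
}

// Format seconds in the same "hh mm ss" layout the script prints.
function fmt($secs) {
    $secs = (int) round($secs);
    return sprintf('%02dh %02dm %02ds',
        floor($secs / 3600), floor(($secs % 3600) / 60), $secs % 60);
}

echo "Average: ".fmt($avgTotal)."\n";
echo "Maximum: ".fmt($maxTotal)."\n";
```

The maximum is deliberately pessimistic: it assumes every package in the list hits its slowest recorded build in the same run.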

DEPENDENCIES:
dev-lang/php [cli pcre pdo sqlite3]

Thanks to this forum's regular-expressions gurus who helped me with the horrendously-complicated regexes in this script. If anyone has comments or suggestions for this, please post 'em. Have fun!

IMPORTANT NOTE: The regular expressions are too long for the page and may appear broken into several lines below. After copying the code into an editor they should appear as I wrote them (works with KWrite). If they don't, re-join the pieces so that each statement starting with "const" near the top of the script sits on a single line. Delete only the newlines, never any spaces.
Code:
#!/usr/bin/php
<?php
class Profiler {
   
   const RXP_MERGE      = '/^(\d{10}):  >>> (emerge (\(\d+ of \d+\)) ([^\/]+)\/\S+ to \/)\n\d{10}:  === \3 Cleaning \(\4\/[^:]+::\/usr\/portage\/(\4\/([^\/]+))\/\6-(.+)\.ebuild\)\n(\d{10}:(?!  \*\*\* terminating\.).+\n){1,10}(\d{10}):  ::: completed \2/mU';
   const RXP_ATOM      = '/^([<>]?=?)((([^\/]+)\/)?(?U)(\S+))(-(\d+(\.\d+)*[a-z]?(_(alpha|beta|pre|rc|p)\d*)*(-r\d+)?))?$/';
   const STORAGEDIR   = '/var/eprofile';
   const DBFILE      = '/var/eprofile/profile-stats.sqlite';
   const EOUTFILE      = '/var/eprofile/profile-emerge-out.txt';
   const LOGFILE      = '/var/log/emerge.log';


   private static $db      = null;
   private static $maxTS   = null;
   private static $res      = array();

   public static function run() {
      global $argc, $argv;
      if($argc < 2) {
         echo "Insufficient arguments!\n";
         Profiler::usage();
         exit(1);
      }
      $arg = strtolower($argv[1]);
      switch($arg) {
         case 'load':
            self::addData();
            break;

         case 'query':
            for($i=2; $i<$argc; $i++) {
               self::query($argv[$i]);
            }
            self::output();
            break;

         case 'emerge':
            $params = '';
            for($i=2; $i<$argc; $i++) {
               $params .= ' '.$argv[$i];
            }
            self::emerge($params);
            self::output();
            break;

         case 'python-updater':
         case 'perl-cleaner':
         case 'revdep-rebuild':
            $cmd = $arg;
            for($i=2; $i<$argc; $i++) {
               if($argv[$i] == '--') {
                  break;
               }
               $cmd .= ' '.$argv[$i];
            }
            $cmd .= ' -- -pq --nospinner';
            self::shell_get_pkglist($cmd);
            self::output();
            break;

         case 'usage':
         case '--help':
         default:
            self::usage();
            break;
      }
   }

   private static function getDB() {
      if(self::$db == null) {
         $needDB = !file_exists(self::DBFILE);
         try {
            if(!is_dir(self::STORAGEDIR)) {
               mkdir(self::STORAGEDIR);
            }
            self::$db = new PDO('sqlite:'.self::DBFILE);
            self::$db->setAttribute(PDO::ATTR_ERRMODE,PDO::ERRMODE_EXCEPTION);
            if($needDB) {
               self::$db->exec("CREATE TABLE build (
                  ts1 INT(10) NOT NULL PRIMARY KEY,
                  ts2 INT(10) NOT NULL,
                  pkg VARCHAR(150) NOT NULL,
                  ver VARCHAR(30) NOT NULL
               );");
               self::$maxTS = '0000000000';
            }
         } catch(PDOException $e) {
            die($e->getMessage());
         }
      }
      return self::$db;
   }

   private static function getMaxTS() {
      if(self::$maxTS == null) {
         $tmp = self::getDB()->query("SELECT MAX(ts1) AS maxTS FROM build;",PDO::FETCH_ASSOC)->fetchAll();
         self::$maxTS = empty($tmp[0]['maxTS']) ? '0000000000' : $tmp[0]['maxTS'];
      }
      return self::$maxTS;
   }

   private static function addData() {
      self::getMaxTS();
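      $file = @fopen(self::LOGFILE,'r');
      if($file === false) {
         // Without this check, feof() below would loop forever on a failed open.
         die("Error: Cannot read ".self::LOGFILE."\n");
      }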
      $readon = false;
      $txt = '';
      while(!feof($file)) {
         $ln = fgets($file);
         if(!$readon) {
            if(substr($ln,0,10) <= self::$maxTS) {
               continue;
            } else {
               $readon = true;
            }
         }
         $txt .= $ln;
      }
      fclose($file);
      preg_match_all(self::RXP_MERGE,$txt,$matches);
      $cnt = count($matches[0]);
      if($cnt == 0) {
         echo "\nNo new data found.\n";
         exit(1);
      }
      $cntRep = count($matches);
      $res = array();
      try {
         $stmt = self::getDB()->prepare("INSERT INTO build (ts1,ts2,pkg,ver) VALUES (?,?,?,?);");
      } catch (PDOException $e) {
         die($e->getMessage());
      }
      // Use one transaction for the whole batch; otherwise SQLite commits after every row.
      self::getDB()->beginTransaction();
      for($i=0; $i<$cnt; $i++) {
         $stmt->execute(array(
            /*'ts1'   => */$matches[1][$i],
            /*'ts2'   => */$matches[$cntRep-1][$i],
            /*'pkg'   => */$matches[5][$i],
            /*'ver'   => */$matches[7][$i]
         ));
         echo "Added {$matches[5][$i]}-{$matches[7][$i]}\n";
      }
      self::getDB()->commit();
      $stmt->closeCursor();
      unset($stmt);
      echo "\n\n$cnt build times were added.\n";
      exit(0);
   }

   private static function query($str,$strict=false) {
      $rtn = array();
      $q = "SELECT * FROM build WHERE pkg ";
      $atom = preg_replace('/["\'`]/','',$str);
      if(!isset(self::$res[$atom])) {
         self::$res[$atom] = array();
      }
      preg_match(self::RXP_ATOM,$atom,$matches);
      if(empty($matches)) {
         die("$atom failed\n");
      }
      $vspec = $matches[1];
      $qname = $matches[2];
      $version = isset($matches[7]) ? $matches[7] : null;

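      // Bind the values as parameters instead of pasting them into the SQL.
      $q .= $strict ? "= ?" : "LIKE ?";
      $args = array($strict ? $qname : '%'.$qname);
      if(!empty($version)) {
         if(empty($vspec)) {
            $vspec = '=';
         }
         // NOTE: ver is compared as text, so '1.10' sorts before '1.9'.
         $q .= " AND ver $vspec ?";
         $args[] = $version;
      }
      $q .= ';';
      $stmt = self::getDB()->prepare($q);
      $stmt->execute($args);
      foreach($stmt->fetchAll(PDO::FETCH_ASSOC) as $row) {
         if(!isset(self::$res[$atom][$row['pkg']])) {
            self::$res[$atom][$row['pkg']] = array();
         }
         self::$res[$atom][$row['pkg']][] = $row;
      }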
   }

   private static function getTimeStr($b) {
      // Split a duration in seconds into hours, minutes and seconds.
      $b = (int) round($b);
      $h = (int) floor($b / 3600);
      $m = (int) floor(($b % 3600) / 60);
      $s = $b % 60;
      return sprintf('%1$02dh %2$02dm %3$02ds',$h,$m,$s);
   }

   private static function output() {
      $i = 0;
      $avg = 0;
      $max = 0;
      foreach(self::$res as $k => $v) {
         $sum_atom = 0;
         $cnt_atom = 0;
         echo sprintf("%4s",++$i).") $k\n".
            "Package matches:\n\n";
         if(empty($v)) {
            echo "None, sorry :(\n\n";
            continue;
         }
         foreach($v as $kk => $vv) {
            $sum_pkg = 0;
            $max_pkg = 0;
            $cnt_atom += $cnt_pkg = count($vv);
            echo "$kk :\n\n".
             "Date/Time             Version         Build Time\n";
            foreach($vv as $row) {
               $sum_pkg += $b = $row['ts2']-$row['ts1'];
               $max_pkg = max($b,$max_pkg);
               echo date("d M Y, H:i",$row['ts1'])."    ".
                sprintf("%-16s",$row['ver']).
                self::getTimeStr($b).
                "\n";
            }
            $avg += $avg_pkg = $sum_pkg / $cnt_pkg;
            $sum_atom += $sum_pkg;
            $max += $max_pkg;
            echo "\n".
             "                  Average Build Time: ".self::getTimeStr($avg_pkg)."\n\n";
         }
         echo "         Combined Average Build Time: ".self::getTimeStr($sum_atom/$cnt_atom)."\n\n\n";
      }
      echo     "Predicted Total Build Time:  Average: ".self::getTimeStr($avg)."\n".
             "                             Maximum: ".self::getTimeStr($max)."\n\n";
      self::$res = array();
   }

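   private static function emerge($params) {
      // Drop --ask/-a so the pretend run never waits for input, and quote each
      // argument so atoms such as >=foo-1.0 are not treated as shell redirections.
      $args = array();
      foreach(preg_split('/\s+/',trim($params)) as $arg) {
         if($arg == '' || $arg == '--ask') {
            continue;
         }
         if(preg_match('/^-[a-zA-Z1]+$/',$arg)) {
            $arg = str_replace('a','',$arg);
            if($arg == '-') {
               continue;
            }
         }
         $args[] = escapeshellarg($arg);
      }
      self::shell_get_pkglist('emerge -pq --nospinner '.implode(' ',$args));
   }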

   private static function shell_get_pkglist($cmd) {
      $res = array();
      echo "Processing command: $cmd\n\n".
         "A list of packages selected is at ".self::EOUTFILE.".\n".
         "To quickly recommence the merge of these packages, run:\n\n".
         "   emerge --nodeps `cat ".self::EOUTFILE."`\n\n";
      exec($cmd,$out,$rtval);
      if($rtval > 0) {
         die("Error: Shell command failed.\n\nCommand was:\n$cmd");
      }
      foreach($out as $ln) {
         if(preg_match('/^\[ebuild [\s\D]+\] (\S+)/',$ln,$matches) > 0) {
            self::query('<='.$matches[1],true);
            $res[] = '='.$matches[1];
         }
      }
      if(empty($res)) {
         die("No packages selected.\n");
      }
      file_put_contents(self::EOUTFILE,implode("\n",$res));

   }
   
   private static function usage() {
      echo 'Usage:
      
eprofile < load | query <pkg1 [ pkg2 ... ] > |
      emerge [ portage-options ] < portage-atom [ portage-atom ... ] > |
      revdep-rebuild [ opts ] |
      python-updater [ opts ] |
      perl-cleaner [ opts ] >

OPTIONS:

load            Parse '.self::LOGFILE.' and record build-times in database.

query           Display previous build-times for given portage package atoms.
                Sets (world, system) are NOT supported.

emerge          Run "emerge -pq" for the specified atom(s) and query each package
                selected for merging, displaying previous build-times for each
                package, plus a cumulative average and maximum build-time for the
                entire merge.
                world, system and other sets ARE supported.

revdep-rebuild,
python-updater,
perl-cleaner    Generate a package-list using the named command and profile the
                resulting merge operation.

usage           Print this usage text.
';
   }
}

Profiler::run();
?>
avx
Advocate


Joined: 21 Jun 2004
Posts: 2152

Posted: Tue Apr 27, 2010 8:55 pm

What does it do different from `genlop -t` (app-portage/genlop)?
Havin_it
Veteran


Joined: 17 Jul 2005
Posts: 1266
Location: Edinburgh, UK

Posted: Wed Apr 28, 2010 11:40 am

Haha, good question ;) I didn't know of this app, but I was fully expecting someone to reveal an existing app that does the same (or similar) thing. I had a look, and I guess the differences are:

* eprofile is slightly faster than genlop, as the data is stored optimised in a database
* "eprofile emerge something" saves some turnaround time over "emerge -p something|genlop -p": it writes out the package list, which you can then merge without recalculating dependencies using "emerge --nodeps `cat /var/eprofile/profile-emerge-out.txt`".
* eprofile outputs maximum as well as average build-time.
* It allows you to specify version constraints, e.g. "eprofile query '>=xulrunner-1.9.2'"
* Once the emerge.log has been parsed once, you can truncate it and save a bit of space :)
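
To illustrate the version-constraint point, here is a small sketch of how an atom splits into operator, name and version. The regex $rx is a simplified stand-in for the script's RXP_ATOM, written just for this example. One caveat worth knowing: the ver column is declared as VARCHAR, so as far as I can tell SQLite compares versions as text, and "1.10" then sorts before "1.9". PHP's built-in version_compare() is shown only as a contrast; the script does not use it.

```php
<?php
// Simplified stand-in for RXP_ATOM: optional operator, name, optional version.
$rx = '/^([<>]?=?)(\S+?)(?:-(\d+(?:\.\d+)*\S*))?$/';

preg_match($rx, '>=xulrunner-1.9.2', $m);
list(, $op, $name, $ver) = $m;
echo "$op | $name | $ver\n";

// Text ordering vs. version ordering for the same pair of versions.
$textSaysOlder    = strcmp('1.10', '1.9') < 0;              // compares character by character
$versionSaysNewer = version_compare('1.10', '1.9', '>');    // compares numeric components
var_dump($textSaysOlder, $versionSaysNewer);
```

So a constraint like '>=foo-1.9' can miss a recorded 1.10 build; for most day-to-day queries this won't matter, but it is worth keeping in mind when a package crosses a two-digit minor version.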

genlop is a great tool, and if I'd known of it I probably wouldn't have made eprofile. But eprofile is a more specialised tool for one particular need, and doesn't do half the things that genlop does. By way of a metaphor, think of genlop as a Swiss Army knife, and eprofile as the thing for getting the stones out of horses' hooves ;)
All times are GMT