Travis job analysis tools

Since we’re dealing with a fair amount of Travis job timeouts, I have prepped some tools for analyzing job logs. The tools are based on initial work by @mvo. The tools were pushed to this repo:

https://github.com/bboozzoo/snapd-tools

Since the initial version, log parser was cleaned up a bit. I’ve also added CSV output that records various interesting information about each job, like start & end times, action, whether it’s an error, name of test and so on. The CSV format is as follows:

id,travis_job,machine,duration,start,end,type,action,error,test,line,text
c8859ee,18849.1,ubuntu-14.04-64,534.808675,1513195116.931079,1513195651.739754,project,preparing,False,<none>,876,2017-12-13 19:58:36 Preparing project on linode:ubuntu-14.04-64...
df292090,18849.1,fedora-26-64,675.742901,1513195133.349733,1513195809.092634,project,preparing,False,<none>,915,2017-12-13 19:58:53 Preparing project on linode:fedora-26-64...
acf4508,18849.1,fedora-26-64,821.308763,1513195134.082041,1513195955.390805,project,preparing,False,<none>,927,2017-12-13 19:58:54 Preparing project on linode:fedora-26-64...
a92de548,18849.1,ubuntu-14.04-64,721.13953,1513195141.03401,1513195862.17354,project,preparing,False,<none>,955,2017-12-13 19:59:01 Preparing project on linode:ubuntu-14.04-64...
...
a03f343b,18849.1,debian-9-64,0.036897,1513195737.884604,1513195737.921501,test,restoring,False,tests/main/snap-connect,1836,2017-12-13 20:08:57 Restoring linode:debian-9-64:tests/main/snap-connect...
a27387f3,18849.1,debian-9-64,8.551624,1513195737.921587,1513195746.473211,test,preparing,False,tests/main/interfaces-avahi-observe,1839,2017-12-13 20:08:57 Preparing linode:debian-9-64:tests/main/interfaces-avahi-observe...
b735146e,18849.1,debian-sid-64,2.195881,1513195742.564783,1513195744.760664,test,executing,False,tests/main/systemd-service,1880,2017-12-13 20:09:02 Executing linode:debian-sid-64:tests/main/systemd-service (26/1583)...
3a253e0d,18849.1,debian-sid-64,0.056648,1513195744.761561,1513195744.818208,test,restoring,False,tests/main/systemd-service,1883,2017-12-13 20:09:04 Restoring linode:debian-sid-64:tests/main/systemd-service...

I’ve also pushed a small tool that uses Pandas to analyze the dump (hence the CSV :slight_smile: output). You can apply some basic filtering by job type (test, project), action (preparing, restoring, executing), machine, and so on. Use --help to see all the options. Example analysis output looks like this:

$ ./pd.py --top --type test --action 'executing' --machine debian-9-64 < out.csv
          machine        duration  type                         test  \
52    debian-9-64 00:06:08.280524  test                tests/unit/go
2585  debian-9-64 00:05:54.580586  test   tests/main/interfaces-many
245   debian-9-64 00:02:07.375277  test  tests/regression/lp-1721518
1016  debian-9-64 00:00:58.706139  test        tests/main/completion
1115  debian-9-64 00:00:51.159301  test    tests/completion/indirect
700   debian-9-64 00:00:50.992620  test    tests/completion/indirect
978   debian-9-64 00:00:50.979990  test    tests/completion/indirect
77    debian-9-64 00:00:50.978067  test    tests/completion/indirect
432   debian-9-64 00:00:50.977519  test    tests/completion/indirect
575   debian-9-64 00:00:50.976149  test    tests/completion/indirect

                                                   text
52    2017-12-13 20:07:59 Executing linode:debian-9-...
2585  2017-12-13 20:23:27 Executing linode:debian-9-...
245   2017-12-13 20:10:23 Executing linode:debian-9-...
1016  2017-12-13 20:15:37 Executing linode:debian-9-...
1115  2017-12-13 20:16:10 Executing linode:debian-9-...
700   2017-12-13 20:13:48 Executing linode:debian-9-...
978   2017-12-13 20:15:27 Executing linode:debian-9-...
77    2017-12-13 20:08:27 Executing linode:debian-9-...
432   2017-12-13 20:11:35 Executing linode:debian-9-...
575   2017-12-13 20:12:34 Executing linode:debian-9-...
$ ./pd.py --sum --type test --action 'executing' --machine debian-9-64 < out.csv
debian-9-64          0:34:56.320009
$ ./pd.py --sum --type test --action 'preparing' --machine debian-9-64 < out.csv
debian-9-64          0:35:22.697595
$ ./pd.py --sum --type test --action 'restoring' --machine debian-9-64 < out.csv
debian-9-64          0:01:40.905868

Note, that Travis job ID is embedded in CSV values, so it should be possible to do analysis across different jobs.

And it’s a CSV, so you fire up you IPython/Jupyter notebook and share the results.

Final note, vis.js has timeline plots. With little hacking it should be quite easy to visualize the build timeline.