I was offline for the past two weeks, and it was glorious. I highly recommend it!
Monday is meetings day.
Shrainik, Bill, and I discussed our efforts (with Dominik) to define and quantify the extent of variety in different datasets and understand how data management and integration systems help combat or even reduce this variety. The three of them wrote up our initial thoughts while I was gone, and now we are discussing next steps. As an aside, I am excited to repurpose their code to find all the crazy ways scientists use UDFs in SQLShare.
We talked with Brandon about next steps for the failed submission on his Grappa work. By and large, the reviewers missed the boat — but that is our fault: what it really means is that we need to present the work better through organization, writing, and figures. We also discussed plans for generals and the relative pros and cons of literature review-style generals vs thesis proposal-style. In my opinion, a lit review would be more useful because Brandon’s work is so cross-disciplinary. The main counter-argument is that a thesis proposal might be more directly on the fast-path to graduation, and negotiations continue.
In the Myria meeting, Jingjing Wang showed off her improvements to launching, shutting down, and restarting Myria. Soon the production Myria demo will be much easier to kick when something goes funky. Brendan Lee has been documenting the Myria REST API using apiary.io and will be showing that off soon as well.
In the UWDB paper seminar, Srini Iyer and Prasang Upadhyaya presented the MadLINQ paper by Qian, et al. It was fun to see the theorists and the systems folks in the room discuss the relative tradeoffs between elegance and performance of the design. Some of these techniques are relatively straightforward techniques to implement in Myria, i.e. tiled matrix multiply with replication looks an awful lot like HyperCube shuffle with range-partitioning.
There are comments.