Today I met with Jevin West to continue our discussions of Sandra Anderson’s work using Myria to study citation networks. This week, we are digging into one 37-hop citation path from neuroscience in 1988 to law in 1912. Really, really fascinating data. Soon, we hope to be able to ...
I also did not get much time to do real work today. There were three major activities:
UW Data Science Incubator applications are due Thursday! They have started rolling in, so I have begun reviewing them and have opened a few clarifying discussions with some of the authors. Getting ...
We had a great Myria meeting this afternoon. We discussed Andrew Whitaker’s user-defined aggregate (UDA) extensions to MyriaL, which provide a very nice way to get scalable, distributed partial aggregation to implement many complicated aggregations in a single scan rather ...
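To make the partial-aggregation idea concrete, here is a minimal sketch in plain Python. It is not MyriaL syntax and not the UDA extension itself: the names `init`, `update`, `merge`, and `finalize` are hypothetical, and the "workers" are just lists. Each partition is scanned once to build a small partial state, and only those states are combined at the end.

```python
from functools import reduce

# Hypothetical four-function shape for a user-defined aggregate (here: mean).
# These names are illustrative assumptions, not the MyriaL UDA API.

def init():
    """Empty partial state: (running total, row count)."""
    return (0.0, 0)

def update(state, value):
    """Fold one input row into a partial state."""
    total, count = state
    return (total + value, count + 1)

def merge(left, right):
    """Combine two partial states produced on different workers."""
    return (left[0] + right[0], left[1] + right[1])

def finalize(state):
    """Turn the merged state into the final answer."""
    total, count = state
    return total / count if count else None

def partial_aggregate(partition):
    """Single scan over one worker's local data."""
    return reduce(update, partition, init())

# Toy data: three "workers", each holding one partition of the input.
partitions = [[1.0, 2.0, 3.0], [4.0, 5.0], [6.0]]

partials = [partial_aggregate(p) for p in partitions]
mean = finalize(reduce(merge, partials, init()))
print(mean)
```

Because `merge` is associative, the partial states can be combined in any grouping, which is what lets the work spread across machines.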
Today I spent all day with Sandra Anderson’s citation graph lineage queries. Though I can compute “all-pairs reachability” for the first 10,000 papers in the dataset, I can currently compute “least-common ancestor” for only the first 500 papers. There are some severe algorithmic scalability challenges here that we are ...
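For readers unfamiliar with these two queries, here is a small Python sketch on a toy citation graph. It is only an illustration, not the Myria queries themselves. The paper IDs and helper names are made up, and it assumes the common DAG definition of a lowest common ancestor: a work cited (transitively) by both papers that is not itself cited, transitively, by any other such common work.

```python
from collections import deque

def cited_transitively(cites, start):
    """Return every paper reachable from `start` by following citations (BFS)."""
    seen = set()
    queue = deque(cites.get(start, ()))
    while queue:
        paper = queue.popleft()
        if paper in seen:
            continue
        seen.add(paper)
        queue.extend(cites.get(paper, ()))
    return seen

def lowest_common_ancestors(cites, a, b):
    """Common transitive citations of a and b that no other common one cites."""
    common = cited_transitively(cites, a) & cited_transitively(cites, b)
    return {
        c for c in common
        if not any(c in cited_transitively(cites, d) for d in common if d != c)
    }

# Toy graph (hypothetical IDs): each paper maps to the papers it cites.
cites = {
    "A": ["C", "D"],
    "B": ["D", "E"],
    "C": ["F"],
    "D": ["F"],
    "E": ["F"],
    "F": [],
}

print(cited_transitively(cites, "A"))
print(lowest_common_ancestors(cites, "A", "B"))
```

Even in this toy form, the ancestor query repeats a reachability search for every candidate, which hints at why it is so much more expensive than reachability alone.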
In between meetings, I spent most of today continuing yesterday’s work on the citation use case. Further query rewrites and testing exposed an interesting bug in the optimizer, caused by a mismatch between the logical algebra representation and the actual behavior of the system implementation: the optimizer assumed the system could perform ...
Today I picked up some of the work that Sandra Anderson did in her summer internship, namely trying to find common citations (transitively) between pairs of papers in Jevin West’s data sets.
Once again I identified a number of nice optimization opportunities:
Today I hacked more on the blog organization and layout; fighting with GitHub CNAMEs was harder than I expected. Eventually I settled on creating a sub-project for the blog, because hosting it in my personal dhalperi/dhalperi.github.io repository affected the URLs for other projects like ...