Other articles


  1. Myers General Exam

    Brandon Myers rocked his general exam today. Brandon (perhaps atypically, but appropriately in this case) gave a literature-review-style talk focused on work in many fields related to the performance of generated code and hand-coded solutions for distributed query processing.

    In-memory data management (IMDM) systems have achieved ...


  2. Myria updates

    Not much happening this week, what with the holidays. I took advantage of the break to handle some long-overdue code reviews and code improvements to Myria.

    • Jingjing Wang has added resource profiling to Myria. We can now measure the resource consumption of each operator during query execution. (Unfortunately, these data ...


  3. Incubator 7.2

    More work with Andy Becker on the KBMOD database and query design today. We have been running into issues loading all the pixels in the database — O(Millions) of images with O(Millions) of pixels each means 10^12 records, which would take years to load into our Postgres database ...


  4. Public cluster + private experiments

    The hot button issue today is what we do with our public Myria service.

    As part of the grant proposal, we promised that “the project develops and deploys a Web-based query-as-a-service interface to the new middleware. The service will be made available to domain scientists” (p.1). This service has ...


  5. Incubator 5.1

    In the incubator this morning, Sophie and I worked on the IPython Notebook-to-Myria pipeline. In particular, we pushed fixes for some SSL bugs (though they are not fixed all the way) and demonstrated the ability to issue queries in Datalog or MyriaL from Python that get executed on Myria!

    Sophie ...


  6. Data and databases

    Over the weekend, both Sophie Clayton and Andy Becker worked independently on their Data Science Incubator projects; I spent some time then and today answering emails :).

    Sophie has been loading underway data (GPS, temperature, salinity, etc. from ships in motion) into SQLShare for cleaning. Every research vessel is its own ...


  7. 2014-09-19 daily

    Today Sophie Clayton and I hacked on Myria for SeaFlow once again. We found a few more opportunities for language and usability improvements, but made little progress because of an issue introduced when fixing other bugs earlier this week.

    In the Myria research meeting, we had both Johannes Gehrke from Microsoft ...


  8. 2014-09-16 daily

    I also did not get much time to do real work today. There were three major activities:

    1. UW Data Science Incubator applications are due Thursday! They have started rolling in, so I have started looking at them and have started a few clarifying discussions with some of the authors. Getting ...


  9. 2014-09-15 daily

    Next week, I’ll see if the incrementalization actually helps us scale.

    I only had a tiny bit of time today; I worked more on the least common ancestor query. Here is the new work that contributed to better scaling:

    • Incrementalizing the code (duh) did in fact let me scale it farther ...


  10. 2014-09-11 daily

    Today I spent all day with Sandra Anderson’s citation graph lineage queries. Though I can compute “all-pairs reachability” for the first 10000 papers in the dataset, I can currently compute “least-common ancestor” only for the first 500 papers. There are some severe algorithmic scalability challenges here that we are ...


  11. 2014-09-10 daily

    In between meetings, I spent most of today continuing yesterday’s work on the citation use case. Further query rewrites and testing exposed an interesting bug in the optimizer due to a mismatch between logical algebra representation and the actual system implementation behavior — the optimizer assumed the system could perform ...


  12. 2014-09-09 daily

    Today I picked up some of the work that Sandra Anderson did in her summer internship, namely trying to find common citations (transitively) between pairs of papers in Jevin West’s data sets.

    Once again I identified a number of nice optimization opportunities:

    • some query rewrites that result in better ...

  13. 2014-09-08 daily

    Today we held the information session for the second installment of our Data Science Incubator projects, which we will hold in the Spring. It was fairly well attended; maybe 20 to 25 people came, and many of them indicated that they will be submitting proposals.

    Over the weekend and today I ...


  14. 2014-08-28 daily

    We had our monthly SeaFlow/eScience group meeting. For this grant, the oceanographers have been doing lots of new science using tools like SQLShare, Myria, and popcycle, our software for storing, indexing, and analyzing SeaFlow data. We discussed needed improvements to popcycle and to the seaflow-viz web dashboard (see ...


  15. 2014-08-25 daily

    Another fantastic hack session with Sophie today. We analyzed the quality and quantity of data in the existing files, including determining which of the 64K SeaFlow samples are within a reasonable amount (say, 1σ) of the “average” SeaFlow sample according to the calibration beads. Surprisingly/hearteningly, the vast majority of ...


  16. 2014-08-20 daily

    Today I hacked more on the blog organization and layout; fighting with GitHub CNAMEs was harder than I expected. Eventually I settled on creating a sub-project for the blog, as hosting it in my personal dhalperi/dhalperi.github.io repository affected the URLs for other projects like ...

