Fall 2016 – Week 8: The Halfway Point

Monday, October 10, 2016: 9:54am – 4:54pm (7 hours):

Kaushik and I looked through some code for a while and spotted a pretty huge bug that had flown under the radar for the last few months: makeTrainPredicates and makeTestPredicates only considered examples where the number at the end of the line (e.g. 500000) exactly matched the number in the sentence (e.g. 500,000) with the commas removed. This broke down whenever a number and a word were run together (e.g. 500,000shares), so Kaushik replaced the exact match with a check for whether the number was ‘in’ the word. We decided to table the rest of our work until Wednesday.

Team meeting and project updates, followed by hackathon. Professor let us know that Tushar planned on arriving on November 4th and wanted to work through the weekend. Nobody seemed opposed to the idea of a weekend-long hackathon (especially if food was involved).

Last week Kaushik worked on a tree-adding algorithm, and for hackathon we wanted to use the .dot files to visually represent this manipulation. I found a couple of Python packages that would do the job; one even made it look like we could output the changes as a gif.

Our progress was slow: we hit some hurdles getting the RDN-Boost code to work correctly on my Chromebook. Eclipse didn’t seem to want to cooperate.


Tuesday, October 11, 2016:

Professor’s class had midterms today. Good luck to everyone in our lab!


Wednesday, October 12, 2016: 9:48am – 4:36pm (6.8 hours):

I started my day by fixing an error in the interface that caused the checksum-override to malfunction, preventing anyone from running the pipeline if there were errors. The more I thought about it, the more ironic this seemed.

For the next few hours I manually ran the pipeline, hoping Kaushik’s fix from Monday improved the results since ~145 examples were no longer thrown out before training.

On average, training has been taking about 4 hours. During this time I worked on my statement of purpose (draft #6) and took thirty minutes to play cricket with the lab.

When training was complete, we noticed a really strange error: 500,000 and 5,000,000 were considered equal. ‘in,’ as it turned out, performs a plain substring test, so once the commas were removed, 500000 counted as being ‘in’ 5000000. The fix would not be as simple as we initially thought.
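To make the problem concrete, here is a minimal Python sketch of the pitfall and of one possible way around it. It is not the code from makeTrainPredicates or makeTestPredicates; the names NUMBER_RE, numbers_in, and contains_number are hypothetical, and the regular expression assumes numbers are plain digits with optional comma thousands separators.

```python
import re

# Assumption: a number is a run of digits, optionally followed by
# comma-separated groups of three digits (e.g. "500000" or "500,000").
NUMBER_RE = re.compile(r"\d+(?:,\d{3})*")


def numbers_in(text):
    """Return every number found in `text` as a digit string without commas."""
    return [match.group().replace(",", "") for match in NUMBER_RE.finditer(text)]


def contains_number(word, target):
    """True only if `target` equals a whole number inside `word`."""
    return target in numbers_in(word)


# The pitfall: on strings, `in` is a substring test.
substring_hit = "500000" in "5,000,000".replace(",", "")  # True, which is the bug

# Pulling the whole number out first and comparing exactly avoids it,
# while still handling a number that is run together with a word.
attached_ok = contains_number("500,000shares", "500000")
bigger_number_rejected = not contains_number("5,000,000", "500000")
```

The idea is simply to extract the full number before comparing, so a shorter number can no longer match the inside of a longer one.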


Thursday, October 13, 2016: 1:00pm – 2:06pm (1.1 hours):

Reading group and Nandini’s birthday! The cake was shaped like Cookie Monster, Devendra ate an eye, and I got chocolate cake all over my copy of “Bisimulation-based Approximate Lifted Inference.” I needed to leave a bit early since I had class.


Friday, October 14, 2016: 12:30pm – 5:48pm (5.3 hours):

My weekly Serve-IT meeting took place at 11:30, so I got into lab shortly after that wrapped up. It was eerily quiet: Phillip and Shuo were the only two here when I arrived, and Devendra and Mayukh arrived shortly after me.

I spent the day making updates to runthis.sh:

Menus:

Moved options from the main menu to the help menu, which reduced the main menu from 8 options to 6, and reorganized the ones that remained. Cleaned up the licenses (now completely moved into the help menu) and added the GPLv3 license that RDN-Boost is associated with. The licensing menu now matches the design of all of the others (including the *).

Training pipeline:

The ‘Training’ option used to prompt the user for primary or secondary shares. This was redone so that four options are available and the training pipeline responds to whichever is requested. A log file is now created during the process (‘view results’ now points to it), and the saving option will soon point to it as well so it can suggest options based on which model was trained.

I found a huge bug in extractSecondaryShares.py: it would crash if it tried to take the max of an empty list.
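For anyone who has not run into this before: in Python, calling max() on an empty list raises a ValueError. The sketch below shows one way to guard against it. The function name largest_share_count is hypothetical and this is not the actual code from extractSecondaryShares.py, just an illustration of the pattern.

```python
def largest_share_count(counts):
    """Return the largest value in `counts`, or None if the list is empty."""
    # max([]) raises ValueError, so check for emptiness before calling it.
    if not counts:
        return None
    return max(counts)
```

On Python 3.4 and later, max(counts, default=None) does the same thing in one call.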

Another huge bug flew under the radar for quite some time in makeOneLine.sh: the shell expands an unquoted * as a wildcard before echo ever sees it. The echo $(cat $1) trick therefore produced files that also had the names of the files in the working directory stored inside of them. I resolved this by removing asterisks from the files with sed.
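As an aside, the same "put everything on one line" step can be sketched in Python, where no wildcard expansion happens at all and a literal * is left alone. This is an alternative to the sed fix, not what makeOneLine.sh does; collapse_whitespace and make_one_line are hypothetical names, and the behavior only roughly mirrors the shell's word splitting.

```python
def collapse_whitespace(text):
    """Join all whitespace-separated words in `text` with single spaces."""
    # str.split() with no arguments splits on any run of whitespace,
    # including newlines, and never treats * as a wildcard.
    return " ".join(text.split())


def make_one_line(path):
    """Read the file at `path` and return its contents as a single line."""
    with open(path, encoding="utf-8") as handle:
        return collapse_whitespace(handle.read())
```

The upside of this approach is that asterisks in the text survive, rather than being stripped out.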