Blog Archives

10. Bug Fix, Build Tree and Sort Edges (03/23/2020 - 03/29/2020)

3/28/2020

Part 0: Connectivity Issue

Because BU closed all non-essential research labs, the lab computer was turned off, and I had difficulty accessing it via TeamViewer. Fortunately, Zack helped me restart the computer on Friday, so I could continue working remotely.

Part I: Bug Fix

There was a bug in the code: an index-out-of-bounds error appeared intermittently when running the code (see screenshot below).

After hours of investigation, it turned out that the publisher (node.py) was sometimes sending empty feature collections to the subscriber (quickmatch_node.py). Because the features are supposed to be published every second, I had only checked the time interval, not whether the feature collection was empty.
The bug has been fixed: the publisher (node.py) now publishes features only when the time interval has exceeded one second AND the feature collection is not empty. The error did not recur in 10 subsequent runs.
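
The publishing condition described above can be sketched as a small helper. This is a minimal sketch, not the actual code in node.py: the function name should_publish, its parameters, and the constant PUBLISH_INTERVAL are hypothetical, and plain lists stand in for the real feature collection.

```python
PUBLISH_INTERVAL = 1.0  # seconds; the one-second interval mentioned in the post


def should_publish(now, last_publish_time, features, interval=PUBLISH_INTERVAL):
    """Return True only if the interval has elapsed AND there are features to send."""
    interval_elapsed = (now - last_publish_time) > interval
    has_features = len(features) > 0
    return interval_elapsed and has_features


# Hypothetical usage with made-up timestamps and features.
print(should_publish(now=12.5, last_publish_time=11.0, features=[[0.1, 0.2]]))  # True
print(should_publish(now=12.5, last_publish_time=11.0, features=[]))            # False
print(should_publish(now=11.4, last_publish_time=11.0, features=[[0.1, 0.2]]))  # False
```

Checking both conditions in one place means the subscriber never receives an empty message, which is what triggered the index error.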

Part II: Build Tree and Sorted Edges

The build_kdtree and sort_edge_index functions have been integrated into the code. Both functions run inside the callback function of quickmatch_node. No runtime errors have occurred so far.

build_kdtree
- Input: density matrix, image membership matrix, number of features
- Output: parent matrix and parent_edge array
sort_edge_index
- Input: parent_edge array
- Output: sorted_index array
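
To illustrate the input and output of sort_edge_index, here is a minimal sketch, not the integrated implementation. It assumes that parent_edge holds one edge length per feature and that the indices are wanted from the longest edge to the shortest, which matches the "break edges from the longest to the shortest" step in the NetMatch summary; the real function may order the edges differently.

```python
def sort_edge_index(parent_edge):
    """Return indices of parent_edge ordered from the longest edge to the shortest."""
    return sorted(range(len(parent_edge)), key=lambda i: parent_edge[i], reverse=True)


parent_edge = [0.5, 2.0, 1.0]           # made-up edge lengths, one per feature
sorted_index = sort_edge_index(parent_edge)
print(sorted_index)                      # [1, 2, 0]
```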

Next Steps

There are two versions of the break_merge_tree algorithm in the QuickMatch and NetMatch code bases. The next step is to investigate those two versions and integrate one of them into quickmatch_node.py.

9. Modifying k-means transmission & calculating feature density (03/16/2020 - 03/22/2020)

3/22/2020

Part I: Modifying K-means communication

Previously, the k-means partitions were computed by one node for the first image and then written to the param.yaml file for every other node to read. Since the parameter file is loaded only once, when the nodes are created, I am modifying the nodes to communicate the k-means partitions via a topic.
Currently,

The first node computes the k-means partitions (centers) for the first image and then transmits them via the topic '/labels'.
All the nodes receive the partitions via the same topic and store them in a local variable.
No node processes or transmits any data until it has received the labels from k-means.

The dimensions of the label before and after the modification remain the same.
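
The "wait for the labels before doing anything" behavior can be sketched without ROS as follows. The class PartitionGate and its method names are hypothetical; in the actual nodes, on_labels would correspond to the callback of the '/labels' subscriber and on_features to the feature callback.

```python
class PartitionGate:
    """Holds back feature processing until the k-means partition has arrived."""

    def __init__(self):
        self.labels = None      # k-means centers, unknown until received
        self.processed = []     # features handled after the labels arrived

    def on_labels(self, labels):
        # Store the partition in a local variable, as described above.
        self.labels = labels

    def on_features(self, features):
        # Ignore incoming data until the partition is known.
        if self.labels is None:
            return False
        self.processed.extend(features)
        return True


node = PartitionGate()
print(node.on_features([[0.1, 0.2]]))    # False: no labels yet, nothing processed
node.on_labels([[0.0, 0.0], [1.0, 1.0]])
print(node.on_features([[0.1, 0.2]]))    # True: labels received, feature processed
```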

Part II: Calculating Feature Densities

Currently, the quickmatch node (distinct from the original node that initially publishes and receives features) is able to receive features, along with the agents they belong to, and calculate feature densities in its callback function. The callback function is triggered each time the subscriber receives a message from the original node. The original node publishes features and feature members every second.
It has been tested and verified that

dimensions of the feature densities = dimensions of bandwidth = number of features received = number of features published
total number of features processed (whose densities are calculated) = total number of features received = total number of features published
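
The dimension check above can be reproduced on a toy example. The density formula below is an assumption made for illustration only (a Gaussian kernel evaluated with each feature's own bandwidth); it is not necessarily the formula used in quickmatch_node.py, and the function name feature_densities is hypothetical.

```python
import math


def feature_densities(features, bandwidth):
    """Gaussian kernel density per feature, using that feature's own bandwidth."""
    densities = []
    for x, h in zip(features, bandwidth):
        total = 0.0
        for y in features:
            sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
            total += math.exp(-sq_dist / (2.0 * h * h))
        densities.append(total)
    return densities


features = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0]]   # made-up 2-D features
bandwidth = [1.0, 1.0, 1.0]                        # made-up bandwidths
densities = feature_densities(features, bandwidth)

# Same check as in the post: one density and one bandwidth per feature.
assert len(densities) == len(bandwidth) == len(features)
```

With this kernel, the isolated feature at (5, 5) gets a lower density than the two features that sit close together.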

Next Step:

The next step is to continue implementing the QuickMatch algorithm using the computed feature densities.

8. K-means on new data, Processing Node (03/09/2020-03/15/2020)

3/14/2020

Part I: K-means testing on new data

Before this week, I was using the existing 6 pictures to test k-means, and the number of collected features matched the number of extracted/published features. This week, I took 6 pictures of the same Clorox wipe box from different angles with my phone. For these new pictures, the total number of features collected by all nodes initially did not match the number of features extracted/published by all nodes. After hours of debugging, I found that the mismatch had nothing to do with my code; it was purely due to computation delay. If I put the print statements after letting the nodes sleep for 2 seconds, the numbers match again.

One remaining problem is a delay between writing the k-means labels to the param files and reading them back. Currently the nodes read the labels written the last time the script was run, so they are always one round behind.

Part II: Processing Node

A new processing node was created to process the features for QuickMatch.
The features are sent from the non-processing node every second. After sending the collected features, the non-processing node empties its collection.
The total number of features and feature_members received matches the total number of features extracted, so the data transmission was successful.
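
The collect, send, and empty cycle of the non-processing node can be sketched as below. The class FeatureBuffer and its methods are hypothetical names used for illustration; the timing logic and the actual ROS publishing are left out.

```python
class FeatureBuffer:
    """Collects features and their members, then hands them over in one batch."""

    def __init__(self):
        self.features = []
        self.members = []

    def add(self, feature, member):
        self.features.append(feature)
        self.members.append(member)

    def flush(self):
        # Return everything collected so far and empty the collection.
        batch = (self.features, self.members)
        self.features, self.members = [], []
        return batch


buf = FeatureBuffer()
buf.add([0.1, 0.2], member=0)
buf.add([0.3, 0.4], member=1)
sent_features, sent_members = buf.flush()
assert len(sent_features) == len(sent_members) == 2   # counts match what was added
assert buf.features == [] and buf.members == []        # collection is empty again
```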

Questions:

How should the delay between writing and reading the parameter file be handled? (show recorded video) Is the Python multiprocessing module necessary? Would multiprocessing cause problems with accessing global variables?
Is there a better way to transmit data on a timer? If not, where is the best place to put the timer check? (The current code doesn't work perfectly with two checks.)

7. K-means partitioning (03/02/2020 - 03/08/2020)

3/9/2020

This week, k-means partitioning has been implemented and tested for 3 nodes. It works as follows:

Only the first node (node0) will run the k-means algorithm for the first image.
The first node then writes the k-means centers as a label matrix in the params.yaml file.
When running the nearest neighbor algorithm, each node will use its label in the label matrix in the parameter file (matrix index corresponds to the node_id) to process features and send them to the corresponding nodes.
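
The routing step can be sketched as a nearest-center assignment. This is a minimal sketch under the assumption that row i of the label matrix is the k-means center owned by node i and that distance is Euclidean; the function names nearest_center and route_features are hypothetical.

```python
def nearest_center(feature, centers):
    """Return the index of the k-means center closest to the feature."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(feature, c))
    return min(range(len(centers)), key=lambda i: sq_dist(centers[i]))


def route_features(features, centers):
    """Group features by the node whose center is nearest (index == node_id)."""
    outbox = {node_id: [] for node_id in range(len(centers))}
    for f in features:
        outbox[nearest_center(f, centers)].append(f)
    return outbox


centers = [[0.0, 0.0], [10.0, 10.0], [0.0, 10.0]]   # made-up centers for 3 nodes
features = [[1.0, 1.0], [9.0, 9.5], [0.5, 8.0]]
print(route_features(features, centers))
# {0: [[1.0, 1.0]], 1: [[9.0, 9.5]], 2: [[0.5, 8.0]]}
```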

Feedback from meeting:

Test k-means for images from the Internet and see what the distribution is like.
Keep implementing QuickMatch after calculating densities.

6. Understanding NetMatch (02/24/2020 - 03/01/2020)

3/6/2020

This week's main task was to understand the current paper as well as the NetMatch algorithm. Since the basic code with random labels and nearest neighbors is working, it is time to start implementing QuickMatch in a distributed fashion across different nodes.

I spent a lot of time talking with Zack and reading the paper that is currently in publication. Here is a summary of the Distributed QuickMatch (NetMatch) algorithm:

One agent runs k-means partition and informs all other nodes about the partition.
After receiving all the features, each agent calculates the distinctiveness and the density of each feature.
Build the tree, sort the edges, and break edges from the longest to the shortest.
Check contested clusters by using the features from both agents that are closest to the border. If a point meets the criteria, send the entire cluster that the point belongs to to the other agent.

The next step would be to start implementing k-means partitioning and make sure everything still works the same.
