DIMACS
DIMACS REU 2023

General Information

me
Student: Kyrylo Karlov
Office: CoRE 434
School: Faculty of Mathematics and Physics, Charles University
Contact: kirill.karlov1 (at) gmail (dot) com
Project: Hardness of Approximation for Clustering Objectives
Partners: Ashwin Padaki, Styopa Zharkov
Mentors: Karthik C.S., Henry Fleischmann

I am a part of a group of students from Charles University that includes Ondrej Chwiedziuk, Barbora Dohnalova, Jiri Kalvoda, Gaurav Kucheriya, Volodymyr Kuznietsov, Josef Matejka, Jakub Petr, G.A. Gamboa Quintero and Ondrej Sladky.


Project Description

Clustering is one of the most prevalent problems in computer science, with applications ranging from machine learning to bioinformatics. For many objective functions, finding an optimal clustering is NP-hard, which motivates the study of the problem's hardness of approximation. There are many gaps between known hardness results and existing approximation algorithms, an example of which is minimum diameter clustering in the L2 metric. We aim to close gaps of this nature, advancing our understanding of clustering.

Research Log

Week 1 (5/31-6/4)

Henry introduced to me fundamental results in hardness of clustering, starting with basic reductions. I get familiar with the general framework of proving hardness and particular goals of the project. Among other results, I learned the construction for best known hardness result for L2 metric using planar vertex cover, which is not tight and equal to ~1.97. I spent the rest of the week reading literature and preparing for the initial presentation.

Week 2 (6/5-6/11)

On Monday we had the initial presentation of the project. Later Henry showed us the Frechet map, which is an isometric embedding from Lp metric space to L. This fact makes clustering in L strictly harder than for any other Lp metric space. Later in the week we focused on closing 1.97-2 gap for L2 metric. We have also discussed tight example of the Gonzalez's 2-approximation algorithm. I have read Henry's notes on Split diameter problem. On Wednesday we have spent half the day analyzing exact set cover reduction to k-diameter clustering. We devoted the rest of the week thinking about reduction to clustering in L1 from 3-coloring. We came up with a new reduction from 3-edge coloring to min diameter k-clustering which gives slightly better bound for Lp. We improved this reduction by generalizing to the coloring on hypergraphs.

On Tuesday Philip took us to a wonderful Jazz night organized by New Brunswick jazz project.

Week 3 (6/12-6/18)

On Monday we have started to think about and developed an approximation algorithm for constant number of cluster and constant dimension. We have spent few days analyzing details and correctness of the algorithm. On the next meeting with our mentor he showed us an embedding via the error correcting codes.

During the week I have attended some lectures at Workshop on Modern Techniques in Graph Algorithms.

Week 4 (6/19-6/25)

At the start of the week we met with our coordinator. We discussed few more hard instances for k=3 case. One of them used embedding to Johnson graphs. The instance was not NP-complete, but it led us to the idea of 3-coloring with forbidden subgraphs. Turns out that P6-free graphs have polynomial 3-coloring algorithm. Combining this with the fact that embedding of 3 coloring to L1 cube gave us the barrier for easy polynomial instances. We have spent the rest of the week thinking about this barrier and finding the hard instances for k=3.

Meeting on Monday happened to be my first trip to New York. On Sunday I returned to New York and spent a wonderful day full of walking and sightseeing.

Week 5 (6/26-7/2)

We started our week with a meeting. We tried to create hard instance to approach barrier for k=3 for L1. We also discussed possible approaches to remove intermediate distances but the approach so far works only for log(n) of them. Intermediate distances is the main obstacle in designing the algorithm for k=3.

On Tuesday I switched my attention to k=5 case, since barrier result does not extend to it. Later we discussed the possibility of removing the intermediate distances in general case for constant k. We wanted to use the fact that for graphs with certain forbidden induced subgraphs 3-coloring is polynomial. For example long paths such as P5, P6, P7. We also had an exiting and extremely funny talk on Experimental math.

On Wednesday we continued to investigate forbidden subgraphs for 3-coloring and removing the intermediate distances. On Thursday we started to write a paper, compiling our result into readable format and going through the literature.

Week 6 (7/3-7/9)

This week we started thinking about embeddings using gadgets for L1 hardness for k=3 case. I decided to revisit 1.97 case hardness for L2 plane and search for appropriate NP-complete problem. We also discussed different algorithms for k=3 case and various approaches to add intermediate distances to our embeddings.

On Sunday Josef and I went to the New York. We walked in the Central park and went to the MOMA museum. We have seen and discussed many paintings by Picasso, Dali, Gauguin, van Gogh and many more. My highlight of the exhibition is Marc Chagall's I and the Village.

Week 7 (7/10-7/16)

This week we extended our understanding of the embedding with gadget into L1 and various barrier for k=3 via obtaining multiple barrier result. In the process we are using many Graph theoretical problems like unique colorability and critically-colorable graphs. We have also been investigating k-colorable graph with high girth which were introduced by Erdos. We have started to write the outline of the paper and polish analysis of our algorithms.

On Tuesday we went to another Jazz night organized by New Brunswick jazz project. On Thursday we made a trip to IBM Thomas J. Watson Research Center. We had multiple AI lectures there. On Saturday we went to the Metropolitan Museum of Art in New York. During our short time at the museum we glanced at Japanese, Chinese expositions, also modern and contemporary art. I personally enjoyed many works by Picasso, Garden at Sainte-Adresse by Monet, A pair of shoes by Van Gogh, Still Life with a Guitar by Gris.

Week 8 (7/17-7/21)

As this is our final week, we focused on writing a paper and composing final presentation.


Presentations


Acknowledgements

This work was carried out while the author Kyrylo Karlov was a participant in the 2023 DIMACS REU program, supported by CoSP, a project funded by European Union's Horizon 2020 research and innovation programme, grant agreement No. 823748.