# Master thesis excerpts

The whole thesis is better read as a pdf file. This page is provided just as a convenience.

## Introduction

The main focus of computational complexity theory has been for at least 40 years on the study of the relationship between the running time of an algorithm solving some problem and the problem size, like the number of vertices and edges of a graph. This approach led us to the definitions of complexity classes such as P (polynomial time), NP (nondeterministic polynomial time) and EXP (exponential time), and especially the notion of an NP-complete problem, which are all truly foundational for the field of computer science as we know it now. Here we of course have in mind primarily the seminal work done by Cook [Cook71], Karp [Karp72] and Levin [Levin73].

But over time it started to be more and more apparent that a finer approach needs to be taken when dealing with hard problems. In practice the inputs we are feeding to our algorithms are often times nowhere close to random; there might be some underlying structure in our instances that actually allows us to create algorithms that work efficiently for these instances, even though the problem still stays hard in general. In other words, we find that by looking at just the problem size when analyzing the worst-case time complexity we are missing a significant part of the picture, and that it is beneficial to split the input into two parts which are examined separately.

The ideas above are formalized by the quickly growing field of Parameterized Complexity [DowneyF99] [FlumG06]. In parameterized complexity we get a parameter as a special part of the input; quite often it is just an integer representing the solution size, but generally it can an arbitrary piece of information that gives us more information about the input (its structure etc.). To be clear we are not saying that the problem becomes easier to solve, rather that we measure its complexity in two dimension instead of one, and we contain the hardness to just one of the dimension. In this sense parameterized complexity can be viewed as a tool to rigorously capture where exactly lies the complexity of a given problem. Fellows [Fellows01] mentions specific cases where parameterized complexity was used to explain the success of some heuristics -- these heuristics were implicitly relying on a certain structure of the input graph, which parameterized complexity explicitly captures and analyzes as the parameter.

A classical example of the first case (i.e., a parameter that is simply the solution size) is a parameterized approach to the (NP-complete) Vertex Cover problem -- if along with the input graph we are given the size of the solution $k$ , there is an algorithm which solves the problem in time $O(1.271^k + kn)$ [ChenKJ99]. This is still exponential in the solution size, but if that is fixed and we can treat it as a constant, the algorithm runs in time linear with the graph size. We should add that this approach is not attractive just in theory, but the algorithm mentioned above "has been implemented and is quite practical for $n$ of unlimited size and $k$ up to around 400" (quote from an overview by Fellows [Fellows01]).

A classical example of a more complicated parameter is treewidth, which can be roughly said to be a measure of how close a graph is to a tree. This parameter arose as a part of the Graph Minor Project undertaken by Robertson and Seymour [RobertsonS84] [RobertsonS86]. Just like many hard problems are efficiently solvable on trees by dynamic programming, it was found that the same technique can be used on graphs whose treewidth is bounded by a constant. Moreover, treewidth provides a characterization of some naturally occuring graph classes, such as series-parallel graphs.

Treewidth is also characteristic for describing the development of metaalgorithmic theorems, most famously Courcelle's theorem [Courcelle90]. The idea of metaalgorithmic theorems is that instead of describing many similar algorithms for similar problems, there is a way to capture a common essence of these, which is done by proving that some logical language is efficiently decidable when the parameter is fixed. This provides great intuition when studying said parameter.

The progress done on treewidth led to the exploration of many other structural parameters, which are usually studied from two perspectives. The first is trying to answer questions such as "How restrictive is this parameter? How many graphs display this structure? What graph classes are captured well by this parameter?" The second perspective is concerned with questions like "Which problems become tractable when we use this parameter? What logical languages become efficiently decidable when this parameter is fixed?"

Instead of asking the first few questions mentioned above (like "What graph classes are captured well by this parameter?") we can take a somewhat "dual" approach and ask "What are good parameters for this class of graphs?" A very vague way to distinguish between graphs is to look at how "dense" or "sparse" a graph is. It can be argued that treewidth is a successful parameter for handling sparse graphs (although we are by no means at the end of the road yet). On the other hand, with dense graphs the situation is still very much open. It is this question, "What is a good parameter for dense graphs?", where this thesis is trying to make some contribution.

## Overview of parameters

In this chapter we make a short survey of the state of the art in the world of parameters for sparse graphs and dense graphs. We realize that it is limited and there are parameters that we do not mention. The main message of this chapter is that the situation of parameters suitable for dense graphs is much more clouded and with less consensus than the situation of parameters suitable for sparse graphs, and even that is not at all finished. For the convenience of the reader, we provide the exact definitions of parameters discussed in this chapter in the Appendix.

We admit that our understanding of the words "sparse" and "dense" in the following sections is quite limited and specific to the scenario we are focusing on. A relevant recent paper dealing with the difficulty of the concept of sparsity (and related concepts) is due to Nešetřil and de Mendez [NesetrilM12]. Typically, graphs with $O(n)$ edges are defined as sparse and graphs with $\Omega(n^2)$ edges as dense. With these definitions there are many graph classes that do not fall into either category.

One of the problems with these definitions in our scenario is that no parameter subsumes either class completely. Already planar graphs (which have $O(n)$ edges) do not have bounded treewidth. The situation for cliquewidth is similar [KaminskiLM09]. Only the other direction holds: all graph classes of bounded treewidth have $O(n)$ edges, and some graph classes of bounded cliquewidth have $\Omega(n^2)$ edges. There are other problems that we will mention.

Thus, the way we use these words is mostly motivated by practice and how they are used in literature.

In the following sections we will be using the notion of equivalence of parameters. Two parameters $A$ and $B$ are said to be equivalent if for every graph $G$ the parameter $A$ is bounded by a constant if and only if the parameter $B$ is bounded by a constant.

### Sparse graphs

In this section we explain why we think that treewidth is a successful parameter for a significant part of sparse graphs. We use a survey from Bodlaender [Bodlaender12] extensively in the following paragraphs.

The notion of treewidth was introduced as a part of the Graph Minor Project of Roberson and Seymour [RobertsonS84] [RobertsonS86] in the 80s. In hindsight we see that there are other equivalent definitions of treewidth which were studied by other authors independently, some even before Roberson and Seymour. Graphs with bounded treewidth gathered interest because many problems that are intractable (e.g., NP-hard) become linear time (or polynomial time with polynomials of reasonable degree) solvable when restricted to graphs of bounded treewidth. Bodlaender [Bodlaender12] says:

Such algorithms have been found for many combinatorial problems and also have been employed for problems from computational biology, constraint satisfaction, probabilistic networks and other areas. [...] In other words: many graph problems become fixed-parameter tractable when parameterized by the treewidth of the input graph.

Also, since the end of the 80s and the beginning of the 90s treewidth gathered attention from the fields of automata theory, database theory and finite model theory, resulting in the famous metaalgorithmical theorem of Courcelle [Courcelle90]. This theorem states that the model checking problem of $\text{MSO}_2$ and CMSO logics is FPT with respect to treewidth. (CMSO, or counting MSO, is an extension of the MSO logic to also express the cardinality of sets modulo $p$ .) This was later extended to optimization variants (EMSO) by Arnborg et al. [ArnborgLS91] and recently to some other more specialized logics, for instance MSO with local cardinality constraints (Szeider [Szeider11]). The interest from database theory is demonstrated for example by papers from Gottlob et al. [GottlobPW10] who show how to translate MSO formulas into Datalog queries. As for finite model theory, it is where the methods Gottlob et al. and many others use come from; for more see Libkin's book [Libkin04].

For a parameter to be useful in practice there has to be a way to efficiently (in FPT time) determine the parameter and afterwards obtain the relevant decomposition. This is because we usually do not know the parameter exactly, only that it is somehow bounded on the input graph.

Both of these are possible for treewidth thanks to the extensive work done by Bodlaender et al. in the past approximately 15 years. The first important step was that Bodlaender [Bodlaender96] showed in 1996 that for fixed $k$ it is possible to decide if a graph $G$ has treewidth at most $k$ in linear time, and if the answer is positive, to return a tree decomposition of width $k$ . This algorithm was shown to be impractical by Röhrig [Roehrig98], but heuristics which work reasonably were found instead. In the following years much research was done in this area and is summed up in a series of papers. The first two papers due to Bodlaender and Koster show polynomial heuristics for computing upper bounds [BodlaenderK10] and lower bounds [BodlaenderK11] of the treewidth of a graph. The third paper is yet to come out but it will be dedicated to exact algorithms for computing treewidth; some progress is described in a paper by Bodlaender et al. [BodlaenderFKKT06] which describes exact exponential algorithms running in space polynomial in the parameter (unlike the first algorithm from 1996). Those algorithms combined with aformentioned heuristics provide the needed tools to obtain good (almost optimal) tree decompositions quickly.

A great example demonstrating how far the research done on treewidth got is the recent paper "Evaluation of a MSO solver" by Langer et al. [LangerRRS12]. In it the authors evaluate an implementation of a new kind of proof for Courcelle's theorem which they introduced the previous year [KneisLR11]. In spite of the fact that the parameterized complexity community still considers metaalgorithmical theorems like Courcelle's theorem to be merely of theoretical interest (see Niedermeier's book [Niedermeier04]), the MSO solver implemented by Langer et al. wildly outperforms the industry standard ILP solver CPLEX on a real-world instance of the Bus Stop Location problem.

We see the following three reasons as the main reason why treewidth is a successful parameter:

1. The class of graphs with bounded treewidth is wide enough to capture well many real-world instances for many practical problems which become FPT with respect to treewidth.
2. The model checking problem for a variety of logical languages becomes FPT on graphs of bounded treewidth.
3. It is possible to determine treewidth and find an optimal tree decomposition in FPT time with respect to treewidth. Moreover, practical heuristics exist.

The biggest theoretical problem of treewidth are the lower bounds on the complexity of model checking for MSO logics given by Frick and Grohe [FrickG04] in 2004. Because of the positive results due to Langer et al. [LangerRRS12] [KneisLR11] we described above we do not think those lower bounds play a big role in practice. Still, there was interest in finding some answer to this obstacle.

To that end attention turned to tree-depth, a parameter introduced in 2006 by Nešetřil and de Mendez [NesetrilM06], which creates a finer hierarchy between graphs of bounded vertex cover and bounded treewidth. Instead of explaining this relationship between treewidth and tree-depth we refer to a paper by Blumensath and Courcelle [BlumensathC10] which proves an equivalence between tree-depth and $l$ -depth treewidth, a parameter that is easier to describe.

A graph $G$ has $l$ -depth treewidth bounded by $k$ if a tree decomposition of width $k$ exists for $G$ and also it is possible to choose a root in this decomposition such that all leaves are in depth at most $l$ . The equivalence between those two parameters is such that tree-depth $\leq k$ implies $k$ -depth treewidth $k$ , and $l$ -depth treewidth $\leq k$ implies tree-depth $lk$ . Treewidth is often described as a measure of how close a graph is to a tree; in this sense tree-depth is a measure of how close a graph is to a star.

Why is tree-depth important? Gajarský and Hliněný [GajarskyH12b] have recently proven that $\text{MSO}_2$ is FPT with respect to tree-depth and the function $f$ has elementary dependence on the formula (i.e., the height of the exponential tower does not depend on the formula). Intuitively, this construction allows us to trade the nonelementary dependence on the formula for a bound on the depth of the tree decomposition. This seems like a good answer to the lower bounds of Frick and Grohe from a theoretical standpoint.

Even when we factor in the limitations of our use of the word "sparse", judging by the amount of interest treewidth generated both in theory and practice we conclude that it successfully answers many important questions.

### Dense graphs

In this section we explain why we think that the search for a good parameter for "dense" graphs is still very much an ongoing effort.

The notion which is probably the closest to treewidth with regards to "dense" (again, for a suitable definition thereof) graphs is the parameter cliquewidth. A good starting point for this parameter is the survey conducted by Kaminski et al. [KaminskiLM09]. Cliquewidth was introduced by Courcelle and Oum [CourcelleO00] in 2000 as a generalization of treewidth, as bounded treewidth implies bounded cliquewidth [GurskiW00]; on the other hand there are graphs with constant cliquewidth but unbounded treewidth. Many important hard problems are FPT on graphs of bounded cliquewidth (for more specifics see the introduction chapter of a recent paper by Ganian, Hliněný and Obdržálek [GanianHO13]). On the other hand, since more graphs have bounded cliquewidth than bounded treewidth it is not surprising that not all problems that are FPT with respect to treewidth are also FPT with respect to cliquewidth. For example the Chromatic number problem is W[1]-hard with respect to cliquewidth [FominGLS10].

As for metaalgorithmical results, already in 2000 Courcelle, Makowski and Rotics [CourcelleMR00] proved the equivalent of Courcelle's theorem for graphs of bounded cliquewidth with one significant difference: only $\text{MSO}_1$ (not $\text{MSO}_2$ ) model checking is FPT with respect to cliquewidth. Another important result were the nonelementary lower bounds on the complexity of the function $f(k)$ of $\text{MSO}_1$ model checking given by Frick and Grohe [FrickG04].

We are already starting to see some significant disadvantages of cliquewidth when compared to treewidth, although some might be attributed simply to the fact that cliquewidth is younger by about 15 years. We have not yet discussed the third "component of success" of treewidth, that is the FPT algorithm determining the treewidth of a graph and computing its decomposition.

Upon seeing that it is possible to solve some hard problems efficiently on graphs of bounded cliquewidth, the problem of determining (or at least approximating) cliquewidth gained importance. In 2006, Oum and Seymour [OumS06] proved that if a graph $G$ has cliquewidth at most $k$ , it is possible to find a $(2^{k+1}-1)$ -expression (the equivalent of a tree decomposition for cliquewidth) defining $G$ in FPT time. This discrepancy between orders of $k$ and $2^k$ was the main motivation for the introduction [Oum05] of a parameter called rank-width.

It holds [OumS06] that for any graph $G$ , $rw(G) \leq cw(G) \leq 2^{rw(G)+1} - 1$ , that is, the parameters are equivalent and cliquewidth is always at most roughly $2^{rw(G)}$ . It still remains open if there is an FPT algorithm that would, for a graph $G$ with $cw(G) = k$ , construct its $k$ -expression. For these reasons rank-width is often more popular for algorithm design. It is worth mentioning that the ideas used in the alternative proof of Courcelle's theorem for treewidth given by Kneis et al. [KneisLR11] were used in the proof of the metaalgorithmical theorem on rank-width by Langer et al. [LangerRS11].

Another parameter equivalent to cliquewidth (and thus rank-width) is boolean-width. It was introduced by Bui-Xuan et al. [Bui-XuanTV09] in 2009 and further investigated by Adler et al. [AdlerBRRTV10] and Belmonte and Vatshelle [BelmonteV11]. The reason why boolean-width is interesting is the combination of two of its properties. First, boolean-width can be exponentially smaller than rank-width. More interestingly, many graph classes (see Figure 1 in the paper by Belmonte and Vatshelle [BelmonteV11]) have boolean-width $O(\log n)$ . It is not typical in parameterized complexity to look at graph classes with logarithmic values of the parameter, but here it makes sense. The reason is that, second, a wide class of problems (see the paper by Adler et al. [AdlerBRRTV10]) can be solved on graphs of boolean-width $w$ in time $2^{O(w)}\text{poly}(n)$ (i.e., with $f(w)$ singly exponential). When those two properties are combined the result are FPT algorithms for those problems on the classes with boolean-width $O(\log n)$ . The authors also show that for the relevant graph classes boolean decompositions can be obtained in polynomial time.

Observe that in this sense boolean-width is not equivalent to cliquewidth or rank-width, because graphs with boolean-width $O(\log n)$ can have cliquewidth much bigger (even doubly exponentially), so even if we were able to use the same algorithms as for boolean-width, we would not get a runtime polynomial in $n$ . Note that this is the only case known to the aformentioned authors and to us too where this technique is used; typically only classes of graphs with a parameter bounded by a constant are studied.

Above we summed the progress motivated by the original question of determining cliquewidth efficiently -- the notions of rank-width and boolean-width both proved to carry significant advantages over clique-width even though they are all bounded by some constant for the same graph classes. Now we turn to the research motivated by the lower bounds for model checking due to Frick and Grohe [FrickG04].

The first parameter for which model checking of $\text{MSO}_1$ with elementary $f$ was proved was neighborhood diversity. This recent result is due to Lampis [Lampis12]. Gajarský [Gajarsky12] generalized neighborhood diversity to finite type and proved a similar result in his thesis. The most interesting result in this regard is one analogous to the one we mentioned in the previous section for tree-depth. Just like tree-depth fills the gap between vertex cover and treewidth, shrub-depth (introduced by Ganian et al. [GanianHNOMR12]) fills the gap between vertex cover and cliquewidth. Interestingly, graphs with shrub-depth 1 are precisely graphs of bounded neighborhood diversity [GajarskyH12]. Gajarský also noted in an email conversation that graphs of finite type have bounded shrub-depth, but not vice versa. The analogy between tree-depth and shrub-depth is that tree-depth allows $\text{MSO}_2$ model checking with elementary $f$ in the time complexity and shrub-depth allows $\text{MSO}_1$ model checking with elementary $f$ .

Still, there remain problems with shrub-depth and related parameters. Ganian et al. [GanianHNOMR12] state: "Primarily, we do not know yet how to efficiently (in FPT) construct decompositions related to our depth parameters (in this respect our situation is similar to that of clique-width)."

In this thesis we did the main part of our work on graphs of bounded neighborhood diversity. Even though this parameter is easy to work with, easy to determine (in time $poly(n)$ , unlike all other parameters we mentioned) and theoretically attractive (since it is incomparable with treewidth), we have to realistically admit that it is probably not very useful in practice. Recently, Lambert [Lambert12] investigated the values of $vc(G)$ and $nd(G)$ for some graphs which are encountered in practice. Even though we have our reservations towards the choice of these graphs (there are better samples available online) it is apparent that unlike with treewidth it is unusual to encounter graphs of bounded neighborhood diversity in practice.

Thus we conclude that there is no one parameter for "dense" graphs that would fit all three criteria from the previous section -- let many hard problems become tractable on many graph classes, allow for efficient model checking and allow for efficient finding of decompositions -- so the search is still very much open. We sum this chapter with a figure of those parameters we discussed.

Extended hierarchy of discussed parameters. Included are vertex cover, tree-depth, treewidth, neighborhood diversity, finite type, shrub-depth, cliquewidth, rankwidth and booleanwidth. An arrow implies generalization, for example finite type is a generalization of neighborhood diversity. A dashed arrow indicates that the generalization may increase the parameter exponentialy, for example treewidth $k$ implies cliquewidth at most $2^k$ .

 [Courcelle90] (1, 2) Courcelle 1990: The Monadic Second-Order Logic of Graphs. I. Recognizable Sets of Finite Graphs
 [Libkin04] Libkin 2004: Elements of Finite Model Theory
 [ArnborgLS91] Arnborg, Lagargen, Seese 1991: Easy Problems for Tree-Decomposable Graphs
 [FrickG04] (1, 2, 3) Frick, Grohe 2004: The complexity of first-order and monadic second-order logic revisited
 [DowneyF99] Downey, Fellows 1999: Parameterized Complexity
 [Fellows01] (1, 2) Fellows 2001: Parameterized Complexity: The Main Ideas and Some Research Frontiers
 [FlumG06] Flum, Grohe 2006: Parameterized Complexity Theory
 [ChenKJ99] Chen, Kanj, Jia 1999: Vertex Cover: Further Observations and Further Improvements.
 [Karp72] Karp 1972: Reducibility Among Combinatorial Problems.
 [Levin73] Levin 73: Universal Sequential Search Problems
 [Cook71] Cook 71: The Complexity of Theorem Proving Procedures
 [RobertsonS84] (1, 2) Robertson, Seymour 1984: Graph minors. III. Planar tree-width
 [RobertsonS86] (1, 2) Robertson, Seymour 1986: Graph Minors. II. Algorithmic Aspects of Tree-Width
 [KneisLR11] (1, 2, 3) Kneis, Langer, Rossmanith 2011: A Game-theoretic approach to Courcelle's theorem
 [LangerRRS12] (1, 2) Langer, Reidl, Rossmanith, Sikdar 2012: Evaluation of an MSO-Solver.
 [GottlobPW10] Gottlob, Pichler, Wei 2010: Monadic datalog over structures of bounded treewidth
 [Bodlaender96] Bodlaender 1996: A linear-time algorithm for finding tree-decompositions of small treewidth
 [Roehrig98] Röhrig 1998: Tree Decomposition: A Feasibility Study
 [BodlaenderK10] Bodlaender, Koster 2010: Treewidth computations I. Upper bounds
 [BodlaenderK11] Bodlaender, Koster 2011: Treewidth computations II. Lower bounds
 [BodlaenderFKKT06] Bodlaender, Fomin, Koster, Kratsch, Thilikos 2006: On Exact Algorithms for Treewidth.
 [Bodlaender12] (1, 2) Bodlaender 12: Fixed-Parameter Tractability of Treewidth and Pathwidth.
 [Niedermeier04] Niedermeier 2004: Ubiquitous Parameterization - Invitation to Fixed-Parameter Algorithms.
 [Lampis12] Lampis 2012: Algorithmic Meta-theorems for Restrictions of Treewidth.
 [GurskiW00] Gurski, Wanke 2000: The Tree-Width of Clique-Width Bounded Graphs Without Kn, n.
 [FominGLS10] Fomin, Golovach, Lokshtanov, Saurabh 2010: Intractability of Clique-Width Parameterizations.
 [Szeider11] Szeider 2011: Monadic second order logic on graphs with local cardinality constraints.
 [NesetrilM12] Nešetřil, de Mendez 2006: Sparsity - Graphs, Structures, and Algorithms.
 [NesetrilM06] Nešetřil, de Mendez 2006: Tree-depth, subgraph coloring and homomorphism bounds.
 [Lambert12] Lambert 2012: Srovnání Vertex cover, Twin-cover a Neighborhood diversity na grafech [online]
 [CourcelleMR00] Courcelle, Makowsky, Rotics 2000: Linear Time Solvable Optimization Problems on Graphs of Bounded Clique-Width
 [CourcelleO00] Courcelle, Olariu 2000: Upper bounds to the clique width of graphs
 [Oum05] Oum 2005: Approximating Rank-Width and Clique-Width Quickly
 [OumS06] (1, 2) Oum, Seymour 2006: Approximating clique-width and branch-width
 [KaminskiLM09] (1, 2) Kaminski, Lozin, Milanič 2009: Recent developments on graphs of bounded clique-width.
 [GanianHO13] Ganian, Hliněný, Obdržálek 2013: A unified approach to polynomial algorithms on graphs of bounded (bi-)rank-width.
 [LangerRS11] Langer, Rossmanith, Sikdar 2011: Linear-Time Algorithms for Graphs of Bounded Rankwidth: A Fresh Look Using Game Theory
 [GanianHNOMR12] (1, 2) Ganian, Hliněný, Nešetřil, de Mendez, Ramadurai 2012: When Trees Grow Low: Shrubs and Fast MSO1.
 [GajarskyH12] Gajarský, Hliněný 2012: Faster Deciding MSO Properties of Trees of Fixed Height, and Some Consequences.
 [GajarskyH12b] Gajarský, Hliněný 2012: Deciding Graph MSO Properties: Has it all been told already?
 [BlumensathC10] Blumensath, Courcelle 2010: On the Monadic Second-Order Transduction Hierarchy
 [Bui-XuanTV09] Bui-Xuan, Telle, Vatshelle 2009: Boolean-Width of Graphs
 [BelmonteV11] (1, 2) Belmonte, Vatshelle 2011: Graph Classes with Structured Neighborhoods and Algorithmic Applications.
 [AdlerBRRTV10] (1, 2) Adler, Bui-Xuan, Rabinovich, Renault, Telle, Vatshelle 2010: On the Boolean-Width of a Graph: Structure and Applications.
 [Gajarsky12] Gajarský 2012: Efficient solvability of graph MSO properties [online]