# Master thesis excerpts

The whole thesis is better read as a pdf file. This page is provided just as a convenience.

## Introduction

The main focus of computational complexity theory has been for at least 40 years on the study of the relationship between the running time of an algorithm solving some problem and the problem size, like the number of vertices and edges of a graph. This approach led us to the definitions of complexity classes such as P (polynomial time), NP (nondeterministic polynomial time) and EXP (exponential time), and especially the notion of an NP-complete problem, which are all truly foundational for the field of computer science as we know it now. Here we of course have in mind primarily the seminal work done by Cook [Cook71], Karp [Karp72] and Levin [Levin73].

But over time it started to be more and more apparent that a finer approach needs to be taken when dealing with hard problems. In practice the inputs we are feeding to our algorithms are often times nowhere close to random; there might be some underlying structure in our instances that actually allows us to create algorithms that work efficiently for these instances, even though the problem still stays hard in general. In other words, we find that by looking at just the problem size when analyzing the worst-case time complexity we are missing a significant part of the picture, and that it is beneficial to split the input into two parts which are examined separately.

The ideas above are formalized by the quickly growing field of
*Parameterized Complexity* [DowneyF99] [FlumG06]. In parameterized
complexity we get a *parameter* as a special part of the input; quite
often it is just an integer representing the solution size, but
generally it can an arbitrary piece of information that gives us more
information about the input (its structure etc.). To be clear we are
not saying that the problem becomes easier to solve, rather that we
measure its complexity in two dimension instead of one, and we contain
the hardness to just one of the dimension. In this sense parameterized
complexity can be viewed as a tool to rigorously capture *where
exactly* lies the complexity of a given problem. Fellows [Fellows01]
mentions specific cases where parameterized complexity was used to
explain the success of some heuristics -- these heuristics were
implicitly relying on a certain structure of the input graph, which
parameterized complexity explicitly captures and analyzes as the
parameter.

A classical example of the first case (i.e., a parameter that is
simply the solution size) is a parameterized approach to the
(NP-complete) *Vertex Cover* problem -- if along with the input
graph we are given the size of the solution \(k\)
, there is an algorithm
which solves the problem in time \(O(1.271^k + kn)\)
[ChenKJ99]. This
is still exponential in the solution size, but if that is fixed and we
can treat it as a constant, the algorithm runs in time linear with the
graph size. We should add that this approach is not attractive just in
theory, but the algorithm mentioned above "has been implemented and is
quite practical for \(n\)
of unlimited size and \(k\)
up to around 400"
(quote from an overview by Fellows [Fellows01]).

A classical example of a more complicated parameter is *treewidth*,
which can be roughly said to be a measure of how close a graph is to a
tree. This parameter arose as a part of the *Graph Minor Project*
undertaken by Robertson and Seymour [RobertsonS84]
[RobertsonS86]. Just like many hard problems are efficiently solvable
on trees by dynamic programming, it was found that the same technique
can be used on graphs whose treewidth is bounded by a
constant. Moreover, treewidth provides a characterization of some
naturally occuring graph classes, such as series-parallel
graphs.

Treewidth is also characteristic for describing the development of metaalgorithmic theorems, most famously Courcelle's theorem [Courcelle90]. The idea of metaalgorithmic theorems is that instead of describing many similar algorithms for similar problems, there is a way to capture a common essence of these, which is done by proving that some logical language is efficiently decidable when the parameter is fixed. This provides great intuition when studying said parameter.

The progress done on treewidth led to the exploration of many other structural parameters, which are usually studied from two perspectives. The first is trying to answer questions such as "How restrictive is this parameter? How many graphs display this structure? What graph classes are captured well by this parameter?" The second perspective is concerned with questions like "Which problems become tractable when we use this parameter? What logical languages become efficiently decidable when this parameter is fixed?"

Instead of asking the first few questions mentioned above (like "What graph classes are captured well by this parameter?") we can take a somewhat "dual" approach and ask "What are good parameters for this class of graphs?" A very vague way to distinguish between graphs is to look at how "dense" or "sparse" a graph is. It can be argued that treewidth is a successful parameter for handling sparse graphs (although we are by no means at the end of the road yet). On the other hand, with dense graphs the situation is still very much open. It is this question, "What is a good parameter for dense graphs?", where this thesis is trying to make some contribution.

## Overview of parameters

In this chapter we make a short survey of the state of the art in the world of parameters for sparse graphs and dense graphs. We realize that it is limited and there are parameters that we do not mention. The main message of this chapter is that the situation of parameters suitable for dense graphs is much more clouded and with less consensus than the situation of parameters suitable for sparse graphs, and even that is not at all finished. For the convenience of the reader, we provide the exact definitions of parameters discussed in this chapter in the Appendix.

We admit that our understanding of the words "sparse" and "dense" in the following sections is quite limited and specific to the scenario we are focusing on. A relevant recent paper dealing with the difficulty of the concept of sparsity (and related concepts) is due to Nešetřil and de Mendez [NesetrilM12]. Typically, graphs with \(O(n)\) edges are defined as sparse and graphs with \(\Omega(n^2)\) edges as dense. With these definitions there are many graph classes that do not fall into either category.

One of the problems with these definitions in our scenario is that no
parameter subsumes either class completely. Already planar graphs
(which have \(O(n)\)
edges) do not have bounded treewidth. The situation
for cliquewidth is similar [KaminskiLM09]. Only the other direction
holds: *all* graph classes of bounded treewidth have \(O(n)\)
edges, and
*some* graph classes of bounded cliquewidth have \(\Omega(n^2)\)
edges. There are other problems that we will mention.

Thus, the way we use these words is mostly motivated by practice and how they are used in literature.

In the following sections we will be using the notion of equivalence
of parameters. Two parameters \(A\)
and \(B\)
are said to be *equivalent*
if for every graph \(G\)
the parameter \(A\)
is bounded by a constant if
and only if the parameter \(B\)
is bounded by a constant.

### Sparse graphs

In this section we explain why we think that treewidth is a successful parameter for a significant part of sparse graphs. We use a survey from Bodlaender [Bodlaender12] extensively in the following paragraphs.

The notion of treewidth was introduced as a part of the *Graph Minor
Project* of Roberson and Seymour [RobertsonS84] [RobertsonS86] in
the 80s. In hindsight we see that there are other equivalent
definitions of treewidth which were studied by other authors
independently, some even before Roberson and Seymour. Graphs with
bounded treewidth gathered interest because many problems that are
intractable (e.g., NP-hard) become linear time (or polynomial time
with polynomials of reasonable degree) solvable when restricted to
graphs of bounded treewidth. Bodlaender [Bodlaender12] says:

Such algorithms have been found for many combinatorial problems and also have been employed for problems from computational biology, constraint satisfaction, probabilistic networks and other areas. [...] In other words: many graph problems become fixed-parameter tractable when parameterized by the treewidth of the input graph.

Also, since the end of the 80s and the beginning of the 90s treewidth gathered attention from the fields of automata theory, database theory and finite model theory, resulting in the famous metaalgorithmical theorem of Courcelle [Courcelle90]. This theorem states that the model checking problem of \(\text{MSO}_2\) and CMSO logics is FPT with respect to treewidth. (CMSO, or counting MSO, is an extension of the MSO logic to also express the cardinality of sets modulo \(p\) .) This was later extended to optimization variants (EMSO) by Arnborg et al. [ArnborgLS91] and recently to some other more specialized logics, for instance MSO with local cardinality constraints (Szeider [Szeider11]). The interest from database theory is demonstrated for example by papers from Gottlob et al. [GottlobPW10] who show how to translate MSO formulas into Datalog queries. As for finite model theory, it is where the methods Gottlob et al. and many others use come from; for more see Libkin's book [Libkin04].

For a parameter to be useful in practice there has to be a way to efficiently (in FPT time) determine the parameter and afterwards obtain the relevant decomposition. This is because we usually do not know the parameter exactly, only that it is somehow bounded on the input graph.

Both of these are possible for treewidth thanks to the extensive work done by Bodlaender et al. in the past approximately 15 years. The first important step was that Bodlaender [Bodlaender96] showed in 1996 that for fixed \(k\) it is possible to decide if a graph \(G\) has treewidth at most \(k\) in linear time, and if the answer is positive, to return a tree decomposition of width \(k\) . This algorithm was shown to be impractical by Röhrig [Roehrig98], but heuristics which work reasonably were found instead. In the following years much research was done in this area and is summed up in a series of papers. The first two papers due to Bodlaender and Koster show polynomial heuristics for computing upper bounds [BodlaenderK10] and lower bounds [BodlaenderK11] of the treewidth of a graph. The third paper is yet to come out but it will be dedicated to exact algorithms for computing treewidth; some progress is described in a paper by Bodlaender et al. [BodlaenderFKKT06] which describes exact exponential algorithms running in space polynomial in the parameter (unlike the first algorithm from 1996). Those algorithms combined with aformentioned heuristics provide the needed tools to obtain good (almost optimal) tree decompositions quickly.

A great example demonstrating how far the research done on treewidth
got is the recent paper "Evaluation of a MSO solver" by Langer et
al. [LangerRRS12]. In it the authors evaluate an implementation of a
new kind of proof for Courcelle's theorem which they introduced the
previous year [KneisLR11]. In spite of the fact that the
parameterized complexity community still considers metaalgorithmical
theorems like Courcelle's theorem to be merely of theoretical interest
(see Niedermeier's book [Niedermeier04]), the MSO solver implemented
by Langer et al. wildly outperforms the industry standard ILP solver
CPLEX on a real-world instance of the *Bus Stop Location* problem.

We see the following three reasons as the main reason why treewidth is a successful parameter:

- The class of graphs with bounded treewidth is wide enough to capture well many real-world instances for many practical problems which become FPT with respect to treewidth.
- The model checking problem for a variety of logical languages becomes FPT on graphs of bounded treewidth.
- It is possible to determine treewidth and find an optimal tree decomposition in FPT time with respect to treewidth. Moreover, practical heuristics exist.

The biggest theoretical problem of treewidth are the lower bounds on the complexity of model checking for MSO logics given by Frick and Grohe [FrickG04] in 2004. Because of the positive results due to Langer et al. [LangerRRS12] [KneisLR11] we described above we do not think those lower bounds play a big role in practice. Still, there was interest in finding some answer to this obstacle.

To that end attention turned to *tree-depth*, a parameter introduced
in 2006 by Nešetřil and de Mendez [NesetrilM06], which creates a
finer hierarchy between graphs of bounded vertex cover and bounded
treewidth. Instead of explaining this relationship between treewidth
and tree-depth we refer to a paper by Blumensath and Courcelle
[BlumensathC10] which proves an equivalence between tree-depth and
\(l\)
-*depth treewidth*, a parameter that is easier to describe.

A graph \(G\) has \(l\) -depth treewidth bounded by \(k\) if a tree decomposition of width \(k\) exists for \(G\) and also it is possible to choose a root in this decomposition such that all leaves are in depth at most \(l\) . The equivalence between those two parameters is such that tree-depth \(\leq k\) implies \(k\) -depth treewidth \(k\) , and \(l\) -depth treewidth \(\leq k\) implies tree-depth \(lk\) . Treewidth is often described as a measure of how close a graph is to a tree; in this sense tree-depth is a measure of how close a graph is to a star.

Why is tree-depth important? Gajarský and Hliněný [GajarskyH12b] have
recently proven that \(\text{MSO}_2\)
is FPT with respect to tree-depth
*and the function* \(f\)
*has elementary dependence on the formula*
(i.e., the height of the exponential tower does not depend on the
formula). Intuitively, this construction allows us to trade the
nonelementary dependence on the formula for a bound on the depth of
the tree decomposition. This seems like a good answer to the lower
bounds of Frick and Grohe from a theoretical standpoint.

Even when we factor in the limitations of our use of the word "sparse", judging by the amount of interest treewidth generated both in theory and practice we conclude that it successfully answers many important questions.

### Dense graphs

In this section we explain why we think that the search for a good parameter for "dense" graphs is still very much an ongoing effort.

The notion which is probably the closest to treewidth with regards to
"dense" (again, for a suitable definition thereof) graphs is the
parameter cliquewidth. A good
starting point for this parameter is the survey conducted by Kaminski
et al. [KaminskiLM09]. Cliquewidth was introduced by Courcelle and
Oum [CourcelleO00] in 2000 as a generalization of treewidth, as
bounded treewidth implies bounded cliquewidth [GurskiW00]; on the
other hand there are graphs with constant cliquewidth but unbounded
treewidth. Many important hard problems are FPT on graphs of bounded
cliquewidth (for more specifics see the introduction chapter of a
recent paper by Ganian, Hliněný and Obdržálek [GanianHO13]). On the
other hand, since more graphs have bounded cliquewidth than bounded
treewidth it is not surprising that not all problems that are FPT with
respect to treewidth are also FPT with respect to cliquewidth. For
example the *Chromatic number* problem is W[1]-hard with respect
to cliquewidth [FominGLS10].

As for metaalgorithmical results, already in 2000 Courcelle, Makowski and Rotics [CourcelleMR00] proved the equivalent of Courcelle's theorem for graphs of bounded cliquewidth with one significant difference: only \(\text{MSO}_1\) (not \(\text{MSO}_2\) ) model checking is FPT with respect to cliquewidth. Another important result were the nonelementary lower bounds on the complexity of the function \(f(k)\) of \(\text{MSO}_1\) model checking given by Frick and Grohe [FrickG04].

We are already starting to see some significant disadvantages of cliquewidth when compared to treewidth, although some might be attributed simply to the fact that cliquewidth is younger by about 15 years. We have not yet discussed the third "component of success" of treewidth, that is the FPT algorithm determining the treewidth of a graph and computing its decomposition.

Upon seeing that it is possible to solve some hard problems
efficiently on graphs of bounded cliquewidth, the problem of
determining (or at least approximating) cliquewidth gained
importance. In 2006, Oum and Seymour [OumS06] proved that if a graph
\(G\)
has cliquewidth at most \(k\)
, it is possible to find a
\((2^{k+1}-1)\)
-expression (the equivalent of a tree decomposition for
cliquewidth) defining \(G\)
in FPT time. This discrepancy between orders
of \(k\)
and \(2^k\)
was the main motivation for the introduction [Oum05]
of a parameter called *rank-width*.

It holds [OumS06] that for any graph \(G\) , \(rw(G) \leq cw(G) \leq 2^{rw(G)+1} - 1\) , that is, the parameters are equivalent and cliquewidth is always at most roughly \(2^{rw(G)}\) . It still remains open if there is an FPT algorithm that would, for a graph \(G\) with \(cw(G) = k\) , construct its \(k\) -expression. For these reasons rank-width is often more popular for algorithm design. It is worth mentioning that the ideas used in the alternative proof of Courcelle's theorem for treewidth given by Kneis et al. [KneisLR11] were used in the proof of the metaalgorithmical theorem on rank-width by Langer et al. [LangerRS11].

Another parameter equivalent to cliquewidth (and thus rank-width) is
*boolean-width*. It was introduced by Bui-Xuan et al. [Bui-XuanTV09]
in 2009 and further investigated by Adler et al. [AdlerBRRTV10] and
Belmonte and Vatshelle [BelmonteV11]. The reason why boolean-width is
interesting is the combination of two of its properties. First,
boolean-width can be exponentially smaller than rank-width. More
interestingly, many graph classes (see Figure 1 in the paper by
Belmonte and Vatshelle [BelmonteV11]) have boolean-width \(O(\log
n)\)
. It is not typical in parameterized complexity to look at graph
classes with *logarithmic* values of the parameter, but here it makes
sense. The reason is that, second, a wide class of problems (see the
paper by Adler et al. [AdlerBRRTV10]) can be solved on graphs of
boolean-width \(w\)
in time \(2^{O(w)}\text{poly}(n)\)
(i.e., with \(f(w)\)
singly exponential). When those two properties are combined the result
are FPT algorithms for those problems on the classes with
boolean-width \(O(\log n)\)
. The authors also show that for the relevant
graph classes boolean decompositions can be obtained in polynomial
time.

Observe that in this sense boolean-width is *not* equivalent to
cliquewidth or rank-width, because graphs with boolean-width \(O(\log
n)\)
can have cliquewidth much bigger (even doubly exponentially), so
even if we were able to use the same algorithms as for boolean-width,
we would not get a runtime polynomial in \(n\)
. Note that this is the
only case known to the aformentioned authors and to us too where this
technique is used; typically only classes of graphs with a parameter
bounded by a constant are studied.

Above we summed the progress motivated by the original question of determining cliquewidth efficiently -- the notions of rank-width and boolean-width both proved to carry significant advantages over clique-width even though they are all bounded by some constant for the same graph classes. Now we turn to the research motivated by the lower bounds for model checking due to Frick and Grohe [FrickG04].

The first parameter for which model checking of \(\text{MSO}_1\)
with
elementary \(f\)
was proved was neighborhood diversity. This recent
result is due to Lampis [Lampis12]. Gajarský [Gajarsky12]
generalized neighborhood diversity to finite type and proved a similar
result in his thesis. The most interesting result in this regard is
one analogous to the one we mentioned in the previous section for
tree-depth. Just like tree-depth fills the gap between vertex cover
and treewidth, *shrub-depth* (introduced by Ganian et
al. [GanianHNOMR12]) fills the gap between vertex cover and
cliquewidth. Interestingly, graphs with shrub-depth 1 are precisely
graphs of bounded neighborhood diversity [GajarskyH12]. Gajarský also
noted in an email conversation that graphs of finite type have bounded
shrub-depth, but not vice versa. The analogy between tree-depth and
shrub-depth is that tree-depth allows \(\text{MSO}_2\)
model checking
with elementary \(f\)
in the time complexity and shrub-depth allows
\(\text{MSO}_1\)
model checking with elementary \(f\)
.

Still, there remain problems with shrub-depth and related parameters. Ganian et al. [GanianHNOMR12] state: "Primarily, we do not know yet how to efficiently (in FPT) construct decompositions related to our depth parameters (in this respect our situation is similar to that of clique-width)."

In this thesis we did the main part of our work on graphs of bounded neighborhood diversity. Even though this parameter is easy to work with, easy to determine (in time \(poly(n)\) , unlike all other parameters we mentioned) and theoretically attractive (since it is incomparable with treewidth), we have to realistically admit that it is probably not very useful in practice. Recently, Lambert [Lambert12] investigated the values of \(vc(G)\) and \(nd(G)\) for some graphs which are encountered in practice. Even though we have our reservations towards the choice of these graphs (there are better samples available online) it is apparent that unlike with treewidth it is unusual to encounter graphs of bounded neighborhood diversity in practice.

Thus we conclude that there is no one parameter for "dense" graphs that would fit all three criteria from the previous section -- let many hard problems become tractable on many graph classes, allow for efficient model checking and allow for efficient finding of decompositions -- so the search is still very much open. We sum this chapter with a figure of those parameters we discussed.

[Courcelle90] | (1, 2) Courcelle 1990: The Monadic Second-Order Logic of Graphs. I. Recognizable Sets of Finite Graphs |

[Libkin04] | Libkin 2004: Elements of Finite Model Theory |

[ArnborgLS91] | Arnborg, Lagargen, Seese 1991: Easy Problems for Tree-Decomposable Graphs |

[FrickG04] | (1, 2, 3) Frick, Grohe 2004: The complexity of first-order and monadic second-order logic revisited |

[DowneyF99] | Downey, Fellows 1999: Parameterized Complexity |

[Fellows01] | (1, 2) Fellows 2001: Parameterized Complexity: The Main Ideas and Some Research Frontiers |

[FlumG06] | Flum, Grohe 2006: Parameterized Complexity Theory |

[ChenKJ99] | Chen, Kanj, Jia 1999: Vertex Cover: Further Observations and Further Improvements. |

[Karp72] | Karp 1972: Reducibility Among Combinatorial Problems. |

[Levin73] | Levin 73: Universal Sequential Search Problems |

[Cook71] | Cook 71: The Complexity of Theorem Proving Procedures |

[RobertsonS84] | (1, 2) Robertson, Seymour 1984: Graph minors. III. Planar tree-width |

[RobertsonS86] | (1, 2) Robertson, Seymour 1986: Graph Minors. II. Algorithmic Aspects of Tree-Width |

[KneisLR11] | (1, 2, 3) Kneis, Langer, Rossmanith 2011: A Game-theoretic approach to Courcelle's theorem |

[LangerRRS12] | (1, 2) Langer, Reidl, Rossmanith, Sikdar 2012: Evaluation of an MSO-Solver. |

[GottlobPW10] | Gottlob, Pichler, Wei 2010: Monadic datalog over structures of bounded treewidth |

[Bodlaender96] | Bodlaender 1996: A linear-time algorithm for finding tree-decompositions of small treewidth |

[Roehrig98] | Röhrig 1998: Tree Decomposition: A Feasibility Study |

[BodlaenderK10] | Bodlaender, Koster 2010: Treewidth computations I. Upper bounds |

[BodlaenderK11] | Bodlaender, Koster 2011: Treewidth computations II. Lower bounds |

[BodlaenderFKKT06] | Bodlaender, Fomin, Koster, Kratsch, Thilikos 2006: On Exact Algorithms for Treewidth. |

[Bodlaender12] | (1, 2) Bodlaender 12: Fixed-Parameter Tractability of
Treewidth and Pathwidth. |

[Niedermeier04] | Niedermeier 2004: Ubiquitous Parameterization - Invitation to Fixed-Parameter Algorithms. |

[Lampis12] | Lampis 2012: Algorithmic Meta-theorems for Restrictions of Treewidth. |

[GurskiW00] | Gurski, Wanke 2000: The Tree-Width of Clique-Width Bounded Graphs Without Kn, n. |

[FominGLS10] | Fomin, Golovach, Lokshtanov, Saurabh 2010: Intractability of Clique-Width Parameterizations. |

[Szeider11] | Szeider 2011: Monadic second order logic on graphs with local cardinality constraints. |

[NesetrilM12] | Nešetřil, de Mendez 2006: Sparsity - Graphs, Structures, and Algorithms. |

[NesetrilM06] | Nešetřil, de Mendez 2006: Tree-depth, subgraph coloring and homomorphism bounds. |

[Lambert12] | Lambert 2012: Srovnání Vertex cover, Twin-cover a Neighborhood diversity na grafech [online] |

[CourcelleMR00] | Courcelle, Makowsky, Rotics 2000: Linear Time Solvable Optimization Problems on Graphs of Bounded Clique-Width |

[CourcelleO00] | Courcelle, Olariu 2000: Upper bounds to the clique width of graphs |

[Oum05] | Oum 2005: Approximating Rank-Width and Clique-Width Quickly |

[OumS06] | (1, 2) Oum, Seymour 2006: Approximating clique-width and branch-width |

[KaminskiLM09] | (1, 2) Kaminski, Lozin, Milanič 2009: Recent developments
on graphs of bounded clique-width. |

[GanianHO13] | Ganian, Hliněný, Obdržálek 2013: A unified approach to polynomial algorithms on graphs of bounded (bi-)rank-width. |

[LangerRS11] | Langer, Rossmanith, Sikdar 2011: Linear-Time Algorithms for Graphs of Bounded Rankwidth: A Fresh Look Using Game Theory |

[GanianHNOMR12] | (1, 2) Ganian, Hliněný, Nešetřil, de Mendez, Ramadurai 2012: When Trees Grow Low: Shrubs and Fast MSO1. |

[GajarskyH12] | Gajarský, Hliněný 2012: Faster Deciding MSO Properties of Trees of Fixed Height, and Some Consequences. |

[GajarskyH12b] | Gajarský, Hliněný 2012: Deciding Graph MSO Properties: Has it all been told already? |

[BlumensathC10] | Blumensath, Courcelle 2010: On the Monadic Second-Order Transduction Hierarchy |

[Bui-XuanTV09] | Bui-Xuan, Telle, Vatshelle 2009: Boolean-Width of Graphs |

[BelmonteV11] | (1, 2) Belmonte, Vatshelle 2011: Graph Classes with
Structured Neighborhoods and Algorithmic
Applications. |

[AdlerBRRTV10] | (1, 2) Adler, Bui-Xuan, Rabinovich, Renault, Telle,
Vatshelle 2010: On the Boolean-Width of a Graph:
Structure and Applications. |

[Gajarsky12] | Gajarský 2012: Efficient solvability of graph MSO properties [online] |