Next Thought: March 2011

Thursday, March 10, 2011

Tale about binaries

Until recently STRUCTure handled binaries separetly using the implications graph. The problem with this scheme is that binaries need special attention which complicates the design and the implementation. I decided to give binaries a less special status and keep them both in the watch lists and implications graph. Needless to say, the code was greatly simplified.

The upside. The reasoning algorithms now work with binary clauses. Blocked clause elimination which was previously ignoring all variables that appeared in binaries can now eliminate even more clauses. Trivial necessary assignments should be detected by watch lists and the rest of them by hyper binary resolution.

The downside. Branching heuristic was greatly affected because the literals propagations were previously approximated using the topological sort of the implication graph. However I can rebuild the implications graph, so there will only be a small speed penalty.

Sunday, March 6, 2011

Instances solved during a run.

This morning I though of checking the number of variables in subinstances solved during a typical run. I created a histogram below (note log scale on both axis). Over 1.000.000 instances had 3 variables. That's a lot of overhead considering that an instance with very few variables is easy to solve by checking all possible assignments. In the picture below created represents the instances created and solved represents the instances created but not solved by the simplification procedure.

At some point I added a simple backtracking algorithm to deal away with small instances, however it didn't influence the results that much. Probably I should test it again. I could use, for example, sat4j to solve small instances, but since probably sat4j is more efficient than STRUCTure the results will be biased.

Nevertheless switching to another algorithm doesn't fix the root problem which I can't understand yet. My guess is that lookahead solvers are not that good on easy instances with many variables that appear in relatively few clauses. What I want to do next is to add another simplification procedure to eliminate these variables like variable elimination in SatElite, but simpler and at every branching node.

Wednesday, March 2, 2011

Dependent Variables

Let's talk a little about XOR clauses. If you have a variable that appears in a single XOR we can remove that clause because it always can be satisfied by setting the proper value of the chosen variable. Let's call this variable dependent. I was reading Mate Soos's paper on Gaussian Elimination and he suggested that we can apply a similar technique for variables that appear in two XOR clauses. These variables, called mutexes, are very common when splitting long XOR clauses before transforming them in CNF clauses.

For example: \[
\begin{align*}
a+b+c+d+e+f & = 1 \\
\text{can be split into} \\
a+b+c+z & = 0 \\
z+d+e+f & = 1
\end{align*}
\] with $z$ being a mutex.

I think that dependent variables can be generalized further for all variables that appears only in XOR clauses. Let $u$ be such a variable. Since a XOR clause is actually a equation in $G_2$ we use the shortest XOR clause containing $u$ to knock out $u$ from all other XOR clauses containing $u$ as in Gaussian Elimination. Now u appears in a single XOR clause and therefore it can be removed.together with the clause. I wonder if anybody else thought of this.

Update: Here is a graph showing the performance improvement of Dependent Variable Elimination. On one instance the speed improvement was almost 200 fold. That's impressive. Of course, the instances without XOR gates were not affected.

SAT Competition 2011

I've submitted the solver just now for SAT Competition 2011. The submission can be downloaded from here and the sources from here (note the git tag sc11). There is a huge gap between STRUCTure and CryptoMiniSat 2.6.0. My goal is not to win the competition (which could happen only if Java HotSpot developers proved that P = NP), but to get a complete objective evaluation of STRUCTure. IIRC look ahead solvers are somehow better than cdcl solvers on random instances and on these instances STRUCTure seems to perform better than CryptoMiniSat.

Update: As it turns out I had a bug that rendered extraction of XOR gates ineffective on my SAT Competition 2011 submission. No wonder why adjusting the coefficient for XOR gates in branching heuristic didn't influence the results.