The largest well predicted subset problem is formulated for comparison of two predicted 3D protein structures from the same sequence. Given two ordered point sets A = {a_1,…, a_n} and B = {b_1,b_2,…b_n} containing n points, and a threshold d, the largest well predicted subset problem is to find the rigid transformation T for a largest subset B_(opt) of B such that the distance between a_i and T(b_i) is at most d for every b_i in B_(opt). A meaningful prediction requires that the size of B_(opt) is at least αn for some constant α [8]. We use LWPS(A, B, d, α) to denote the largest well predicted subset problem with meaningful prediction. An (1 + δ_1, 1 - δ_2)-approximation for LWPS(A, B, d, α) is to find a transformation T to bring a subset B' (is contained in) B of size at least (1 - δ_2)|B_(opt)| such that for each b_i ∈ B', the Euclidean distance between the two points distance(a_i, T(b_i)) ≤ (1 + δ_1)d. We develop a constant time (1 + δ_1, 1 - δ_2)-approximation algorithm for LWPS(A, B, d, α) for arbitrary positive constants δ_1 and δ_2.
展开▼