Maximum subarray sum modulo M

25

13

Most of us are familiar with the maximum sum subarray problem. I came across a variant of this problem which asks the programmer to output the maximum of all subarray sums modulo some number M.

The naive approach to solve this variant would be to find all possible subarray sums (which would be of the order of N^2 where N is the size of the array). Of course, this is not good enough. The question is - how can we do better?

Example: Let us consider the following array:

6 6 11 15 12 1

Let M = 13. In this case, subarray 6 6 (or 12 or 6 6 11 15 or 11 15 12) will yield maximum sum ( = 12 ).

Bhoot

Posted 2015-06-29T11:01:39.007

Reputation: 1 792

Is there an upper limit on M? – dasblinkenlight – 2015-06-29T11:04:55.363

1let us assume that the upper limit on number M is equal to maximum number in the array. – Bhoot – 2015-06-29T11:05:46.170

O(n*M) is trivial, by finding existence subarrays that ends in i and sums (in modolus) to k, for each index i and for each k in [0,M) (done in DP) – amit – 2015-06-29T11:07:55.387

1@amit we would like our complexity to be independent of modulo M. – Bhoot – 2015-06-29T11:08:56.663

Answers

20

We can do this as follow:

Maintaining an array sum which at index ith, it contains the modulus sum from 0 to ith.

For each index ith, we need to find the maximum sub sum that end at this index:

For each subarray (start + 1 , i ), we know that the mod sum of this sub array is

int a = (sum[i] - sum[start] + M) % M

So, we can only achieve a sub-sum larger than sum[i] if sum[start] is larger than sum[i] and as close to sum[i] as possible.

This can be done easily if you using a binary search tree.

Pseudo code:

int[] sum;
sum[0] = A[0];
Tree tree;
tree.add(sum[0]);
int result = sum[0];
for(int i = 1; i < n; i++){
    sum[i] = sum[i - 1] + A[i];
    sum[i] %= M;
    int a = tree.getMinimumValueLargerThan(sum[i]);
    result = max((sum[i] - a + M) % M, result);
    tree.add(sum[i]);
}
print result;

Time complexity :O(n log n)

Pham Trung

Posted 2015-06-29T11:01:39.007

Reputation: 9 620

Nice. Also you can make it O(n log min(n, M)) by only inserting distinct sums into the tree. – j_random_hacker – 2015-06-30T13:05:39.187

4in line 5 result should be sum[0]%m, not sum[0] – Jackery Xu – 2015-11-12T03:49:42.387

looking at this, to me it doesn't seem to be possible that this is a solution since it doesn't even refer to any elements of A apart from A[0]. There's something missing – dajobe – 2016-01-18T00:42:47.757

@dajobe oh, there is a bug in my code, fixed it, thanks :) – Pham Trung – 2016-01-18T02:46:12.953

Why is getMinimumValueLargerThan guaranteed to return something? Suppose, M = 10000000, sequence = [1,1,...]. On the first step when sum[0]=1, tree=[1], sum[1]=2 -- there is no minimum larger than 2 in the tree. Am i mising something? – zw0rk – 2016-04-11T12:37:25.023

@zw0rk it is ok if it not return anything, that case is trivial, so I expect you can handle that. – Pham Trung – 2016-04-11T14:54:45.007

I don't understand why you need a full array for the sums if you are only using the last one – fons – 2016-05-29T11:59:25.387

Why we have +M in (sum[i] - sum[start] + M) % M. Can't figure out. – zim32 – 2016-08-29T14:17:35.233

5Because sum[i] - sum[start] can be negative, hence we add M and take modulo of M to get positive remainder. Also adding any multiples of M would not change the remainder value. 1%7 == (1 + 7)%7 == (1+2*7)%7 etc. – Ravindra M – 2016-10-22T04:53:26.857

7

Let A be our input array with zero-based indexing. We can reduce A modulo M without changing the result.

First of all, let's reduce the problem to a slightly easier one by computing an array P representing the prefix sums of A, modulo M:

A = 6 6 11 2 12 1
P = 6 12 10 12 11 12

Now let's process the possible left borders of our solution subarrays in decreasing order. This means that we will first determine the optimal solution that starts at index n - 1, then the one that starts at index n - 2 etc.

In our example, if we chose i = 3 as our left border, the possible subarray sums are represented by the suffix P[3..n-1] plus a constant a = A[i] - P[i]:

a = A[3] - P[3] = 2 - 12 = 3 (mod 13)
P + a = * * * 2 1 2

The global maximum will occur at one point too. Since we can insert the suffix values from right to left, we have now reduced the problem to the following:

Given a set of values S and integers x and M, find the maximum of S + x modulo M

This one is easy: Just use a balanced binary search tree to manage the elements of S. Given a query x, we want to find the largest value in S that is smaller than M - x (that is the case where no overflow occurs when adding x). If there is no such value, just use the largest value of S. Both can be done in O(log |S|) time.

Total runtime of this solution: O(n log n)

Here's some C++ code to compute the maximum sum. It would need some minor adaptions to also return the borders of the optimal subarray:

#include <bits/stdc++.h>
using namespace std;

int max_mod_sum(const vector<int>& A, int M) {
    vector<int> P(A.size());
    for (int i = 0; i < A.size(); ++i)
        P[i] = (A[i] + (i > 0 ? P[i-1] : 0)) % M;
    set<int> S;
    int res = 0;
    for (int i = A.size() - 1; i >= 0; --i) {
        S.insert(P[i]);
        int a = (A[i] - P[i] + M) % M;
        auto it = S.lower_bound(M - a);
        if (it != begin(S))
            res = max(res, *prev(it) + a);
        res = max(res, (*prev(end(S)) + a) % M);
    }
    return res;
}

int main() {
    // random testing to the rescue
    for (int i = 0; i < 1000; ++i) {
        int M = rand() % 1000 + 1, n = rand() % 1000 + 1;
        vector<int> A(n);
        for (int i = 0; i< n; ++i)
            A[i] = rand() % M;
        int should_be = 0;
        for (int i = 0; i < n; ++i) {
            int sum = 0;
            for (int j = i; j < n; ++j) {
                sum = (sum + A[j]) % M;
                should_be = max(should_be, sum);
            }
        }
        assert(should_be == max_mod_sum(A, M));
    }
}

Niklas B.

Posted 2015-06-29T11:01:39.007

Reputation: 68 979

I feel like there is a non-explicit assumption in your explanation regarding S + x mod M reaches its maximum at S = M - 1 - x. If S and x can be any value then S = M - 1 - x + y * M are also valid solutions. In your tree you only store one of them. I think this works out because the x and S are both in [0,M[. – ThomasD – 2016-01-02T18:06:57.500

Yes, we're only considering the canonical representatives mod M. Hence the sum of two representatives is in (0, 2M( – Niklas B. – 2016-01-03T18:31:55.000

1

Total java implementation with O(n*log(n))

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.util.TreeSet;
import java.util.stream.Stream;

public class MaximizeSumMod {

    public static void main(String[] args) throws Exception{

        BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
        Long times = Long.valueOf(in.readLine());

        while(times --> 0){
            long[] pair = Stream.of(in.readLine().split(" ")).mapToLong(Long::parseLong).toArray();
            long mod = pair[1];            
            long[] numbers = Stream.of(in.readLine().split(" ")).mapToLong(Long::parseLong).toArray();
            printMaxMod(numbers,mod);
        }
    }

    private static void printMaxMod(long[] numbers, Long mod) {

        Long maxSoFar = (numbers[numbers.length-1] + numbers[numbers.length-2])%mod;
        maxSoFar = (maxSoFar > (numbers[0]%mod)) ? maxSoFar : numbers[0]%mod;
        numbers[0] %=mod;
        for (Long i = 1L; i < numbers.length; i++) {
            long currentNumber = numbers[i.intValue()]%mod;            
            maxSoFar = maxSoFar > currentNumber ? maxSoFar : currentNumber;
            numbers[i.intValue()] = (currentNumber + numbers[i.intValue()-1])%mod;
            maxSoFar = maxSoFar > numbers[i.intValue()] ? maxSoFar : numbers[i.intValue()];
        }

        if(mod.equals(maxSoFar+1) || numbers.length == 2){
            System.out.println(maxSoFar);
            return;
        }

        long previousNumber = numbers[0];
        TreeSet<Long> set = new TreeSet<>();
        set.add(previousNumber);

        for (Long i = 2L; i < numbers.length; i++) {
            Long currentNumber = numbers[i.intValue()];
            Long ceiling = set.ceiling(currentNumber);
            if(ceiling == null){
                set.add(numbers[i.intValue()-1]);            
                continue;
            }

            if(ceiling.equals(currentNumber)){
                set.remove(ceiling);
                Long greaterCeiling = set.ceiling(currentNumber);
                if(greaterCeiling == null){
                    set.add(ceiling);
                    set.add(numbers[i.intValue()-1]);            
                    continue;
                }
                set.add(ceiling);                    
                ceiling = greaterCeiling;
            }
            Long newMax = (currentNumber - ceiling + mod);
            maxSoFar = maxSoFar > newMax ? maxSoFar :newMax;
            set.add(numbers[i.intValue()-1]);            
        }

        System.out.println(maxSoFar);

    }

}

fatih tekin

Posted 2015-06-29T11:01:39.007

Reputation: 655

1

Here is Java code for maximum sub array sum modulo. We handle the case we can not find least element in the tree strictly greater than s[i]

public static long maxModulo(long[] a, final long k) {
    long[] s = new long[a.length];
    TreeSet<Long> tree = new TreeSet<>();

    s[0] = a[0] % k;
    tree.add(s[0]);
    long result = s[0];

    for (int i = 1; i < a.length; i++) {

        s[i] = (s[i - 1] + a[i]) % k;

        // find least element in the tree strictly greater than s[i]
        Long v = tree.higher(s[i]);

        if (v == null) {
            // can't find v, then compare v and s[i]
            result = Math.max(s[i], result);
        } else {
            result = Math.max((s[i] - v + k) % k, result);
        }
        tree.add(s[i]);
    }
    return result;
 }

The Tran

Posted 2015-06-29T11:01:39.007

Reputation: 333

1

Adding STL C++11 code based on the solution suggested by @Pham Trung. Might be handy.

#include <iostream>
#include <set>

int main() {
    int N;
    std::cin>>N;
    for (int nn=0;nn<N;nn++){
        long long n,m;
        std::set<long long> mSet;
        long long maxVal = 0; //positive input values
        long long sumVal = 0;

        std::cin>>n>>m;
        mSet.insert(m);
        for (long long q=0;q<n;q++){
            long long tmp;

            std::cin>>tmp;
            sumVal = (sumVal + tmp)%m;
            auto itSub = mSet.upper_bound(sumVal);
            maxVal = std::max(maxVal,(m + sumVal - *itSub)%m);
            mSet.insert(sumVal);                
        }
        std::cout<<maxVal<<"\n";
    }
}

Nir

Posted 2015-06-29T11:01:39.007

Reputation: 139

1

For me, all explanations here were awful, since I didn't get the searching/sorting part. How do we search/sort, was unclear.

We all know that we need to build prefixSum, meaning sum of all elems from 0 to i with modulo m

I guess, what we are looking for is clear. Knowing that subarray[i][j] = (prefix[i] - prefix[j] + m) % m (indicating the modulo sum from index i to j), our maxima when given prefix[i] is always that prefix[j] which is as close as possible to prefix[i], but slightly bigger.

E.g. for m = 8, prefix[i] being 5, we are looking for the next value after 5, which is in our prefixArray.

For efficient search (binary search) we sort the prefixes.

What we can not do is, build the prefixSum first, then iterate again from 0 to n and look for index in the sorted prefix array, because we can find and endIndex which is smaller than our startIndex, which is no good.

Therefore, what we do is we iterate from 0 to n indicating the endIndex of our potential max subarray sum and then look in our sorted prefix array, (which is empty at the beginning) which contains sorted prefixes between 0 and endIndex.

def maximumSum(coll, m):
    n = len(coll)
    maxSum, prefixSum = 0, 0
    sortedPrefixes = []

    for endIndex in range(n):
        prefixSum = (prefixSum + coll[endIndex]) % m
        maxSum = max(maxSum, prefixSum)

        startIndex = bisect.bisect_right(sortedPrefixes, prefixSum)
        if startIndex < len(sortedPrefixes): 
            maxSum = max(maxSum, prefixSum - sortedPrefixes[startIndex] + m)

        bisect.insort(sortedPrefixes, prefixSum)

    return maxSum

denis631

Posted 2015-06-29T11:01:39.007

Reputation: 708

-1

Modify Kadane algorithm to keep track of #occurrence. Below is the code.

#python3
#source: https://github.com/harishvc/challenges/blob/master/dp-largest-sum-sublist-modulo.py  
#Time complexity: O(n)
#Space complexity: O(n)
def maxContiguousSum(a,K):
    sum_so_far =0
    max_sum = 0
    count = {} #keep track of occurrence
    for i in range(0,len(a)):
            sum_so_far += a[i]
            sum_so_far = sum_so_far%K
            if sum_so_far > 0:
                    max_sum = max(max_sum,sum_so_far)
                    if sum_so_far in count.keys():
                            count[sum_so_far] += 1
                    else:
                            count[sum_so_far] = 1
            else:
                    assert sum_so_far < 0 , "Logic error"
                    #IMPORTANT: reset sum_so_far
                    sum_so_far = 0
    return max_sum,count[max_sum]

  a = [6, 6, 11, 15, 12, 1]
  K = 13
  max_sum,count = maxContiguousSum(a,K)
  print("input >>> %s max sum=%d #occurrence=%d" % (a,max_sum,count))

harishvc

Posted 2015-06-29T11:01:39.007

Reputation: 363