Solution for Problem 2 of the Coinbase technical interview
Problem 2
We have some clickstream data that we gathered on our client's website. Using cookies, we collected snippets of users' anonymized URL histories while they browsed the site. The histories are in chronological order, and no URL was visited more than once per person.
Write a function that takes two users' browsing histories as input and returns the longest contiguous sequence of URLs that appears in both.
Sample Input:
user0 = ["/start", "/green", "/blue", "/pink", "/register", "/orange", "/one/two"]
user1 = ["/start", "/pink", "/register", "/orange", "/red", "a"]
user2 = ["a", "/one", "/two"]
user3 = ["/pink", "/orange", "/yellow", "/plum", "/blue", "/tan", "/red", "/amber", "/HotRodPink", "/CornflowerBlue", "/LightGoldenRodYellow", "/BritishRacingGreen"]
user4 = ["/pink", "/orange", "/amber", "/BritishRacingGreen", "/plum", "/blue", "/tan", "/red", "/lavender", "/HotRodPink", "/CornflowerBlue", "/LightGoldenRodYellow"]
user5 = ["a"]
user6 = ["/pink","/orange","/six","/plum","/seven","/tan","/red", "/amber"]
Sample Output:
findContiguousHistory(user0, user1) => ["/pink", "/register", "/orange"]
findContiguousHistory(user0, user2) => [] (empty)
findContiguousHistory(user0, user0) => ["/start", "/green", "/blue", "/pink", "/register", "/orange", "/one/two"]
findContiguousHistory(user2, user1) => ["a"]
findContiguousHistory(user5, user2) => ["a"]
findContiguousHistory(user3, user4) => ["/plum", "/blue", "/tan", "/red"]
findContiguousHistory(user4, user3) => ["/plum", "/blue", "/tan", "/red"]
findContiguousHistory(user3, user6) => ["/tan", "/red", "/amber"]
Where,
n
: Length of the first user's browsing historym
: Length of the second user's browsing history
Solution
This is a "Longest Common Substring/Subarray" problem and can be solved using dynamic programming.
Given array A and array B, we deduce that, if a common subarray of A and B exists, it must end at some A[i] and B[j].
We let dp[i][j]
be the longest common subarray of A and B for the subarray ending at i and j. For example, given the following arrays:
arr1 = [1, 2, 3, 4]
arr2 = [4, 2, 3, 1]
dp =
[0, 0, 0, 1]
[0, 1, 0, 0]
[0, 0, 2, 0]
[1, 0, 0, 0]
Whenever A[i] == B[j], we know dp[i][j] = dp[i-1][j-1] + 1
.
The maximum subarray is then max(dp[i][j])
for all i and j. The length of the longest common subarray and the index of the last element in the subarray are maintained while iterating.
The time complexity of this solution is O(nm). The space complexity is O(nm).
package main
// findContiguousHistory returns the longest contiguous sequence of URLs
// that appears in both s1 and s2.
func findContiguousHistory(s1 []string, s2 []string) []string {
// Allocate a zeroed 2D array of len(s1) x len(s2).
m, n := len(s1), len(s2)
dp := make([][]int, m)
for i := 0; i < m; i++ {
dp[i] = make([]int, n)
}
// Maintain the length of the longest contiguous sequence of URLs and the
// index of its last element.
longest, longestIdx := 0, 0
for i := 0; i < m; i++ {
for j := 0; j < n; j++ {
if s1[i] == s2[j] {
// NOTE: Since the solution to the current subproblem builds from the
// solution to the previous subproblem, it is common to initialize the
// 2D array to be m + 1 and n + 1, then iterate from dp[1...m][1...n].
// This, however, requires that the indices into `s1` and `s2` be
// different from the indices into dp for `i` and `j`. Instead, we can
// simply handle the edge case with a conditional and set the value
// accordingly.
if i == 0 || j == 0 {
dp[i][j] = 1
} else {
dp[i][j] = dp[i-1][j-1] + 1
}
if dp[i][j] > longest {
longest, longestIdx = dp[i][j], i
}
}
}
}
// If longest != 0, return a slice of `s1` using the index of the last
// element and the length of the longest contiguous sequence.
if longest == 0 {
return make([]string, 0)
} else {
return s1[longestIdx-longest+1 : longestIdx+1]
}
}