By Rahul Vashishtha in Algorithm — May 5, 2024

Efficient Techniques to Identify a Unique Element in an Array

Discover effective methods to detect the unique element in arrays with practical C implementations.

This is one of my favourite questions to ask during an interview. This type of question helps me understand the thought process of candidates that how they understand the problem and how they approach the solution. It also helps me understand if the candidate knows basic programming fundamentals. So, let's continue and understand how can we solve the problem.

So, the question states "Provided an unsorted array of size N. Each element inside the array has at least one duplicate, instead of a single element X which is unique. You have to find that unique element X."

Introduction

Identifying a unique element in an array is a common problem in programming, often used in interviews to assess a candidate's problem-solving skills and understanding of algorithms. The challenge is to find an element in an array that doesn't have duplicates while every other element does. This task can be approached using various methods, each with its advantages and limitations.

Sorting and Linear Scan

Algorithm Details and Pseudo Code:

Sort the array.
Traverse the sorted array and check if the current element is different from its neighbors.

Pseudo Code:

function findUnique(arr) {
    sort(arr);
    for i from 0 to length of arr - 1:
        if (i == 0 and arr[i] != arr[i+1]) or
           (i == arr.length - 1 and arr[i] != arr[i-1]) or
           (arr[i] != arr[i-1] and arr[i] != arr[i+1]):
            return arr[i];
    return null;
}

This method sorts the array first, then scans it to find an element that doesn't match its neighbors, assuming that duplicates are adjacent after sorting.

Implementation in C:

#include <stdio.h>
#include <stdlib.h>

int compare(const void *a, const void *b) {
    return (*(int*)a - *(int*)b);
}

int findUnique(int arr[], int size) {
    qsort(arr, size, sizeof(int), compare);
    for (int i = 0; i < size; i++) {
        if ((i == 0 && arr[i] != arr[i+1]) ||
            (i == size - 1 && arr[i] != arr[i-1]) ||
            (i > 0 && i < size - 1 && arr[i] != arr[i-1] && arr[i] != arr[i+1])) {
            return arr[i];
        }
    }
    return -1;
}

int main() {
    int arr[] = {1, 2, 9, 4, 4, 6, 8, 9, 3, 2, 1, 4, 8, 9, 3};
    int n = sizeof(arr)/sizeof(arr[0]);
    int unique = findUnique(arr, n);
    if (unique != -1)
        printf("Unique Element: %d\n", unique);
    else
        printf("No unique element found.\n");
    return 0;
}

C Code Explanation

The C implementation uses the qsort function for sorting and then scans the array. The compare function defines the sorting criteria (ascending in this case). The findUnique function scans the sorted array, checking each element against its neighbors to determine if it's unique.

Time and Space Complexity:

Time Complexity: 𝑂(𝑛log⁡𝑛)O(nlogn) due to the sorting step.
Space Complexity: 𝑂(1)O(1) as the sorting is done in-place.

Using Hashmaps

Algorithm Details:
This approach involves using a hashmap to store the frequency of each element. The element with a frequency of one is the unique one.

Pseudo Code:

function findUniqueWithHash(arr) {
    hashmap = {}
    for each element in arr:
        if element in hashmap:
            increment hashmap[element]
        else:
            hashmap[element] = 1

    for each element in hashmap:
        if hashmap[element] == 1:
            return element
    return null;
}

Time and Space Complexity:

Time Complexity: 𝑂(𝑛)O(n), as each element is processed once.
Space Complexity: 𝑂(𝑛)O(n), as potentially all elements might be stored in the hashmap.

Floyd's Tortoise and Hare Algorithm for Finding a Unique Element

While the classic application of Floyd's Tortoise and Hare algorithm is for detecting cycles in linked lists (i.e., finding duplicates), with a creative adaptation, it can also be employed to identify unique elements in a specially structured array. This approach assumes that each element in the array is within the range from 1 to N where N is the length of the array minus one, facilitating the use of elements as indices.

Algorithm Details:

Phase 1 - Detecting a Cycle: Use two pointers, one moving twice as fast as the other. The slow pointer (tortoise) moves one step at a time, while the fast pointer (hare) moves two steps at a time. The meeting point of these pointers will be inside the cycle.
Phase 2 - Finding the Entrance to the Cycle: Once a cycle is detected (the tortoise and hare meet), move one pointer to the start of the array while keeping the other at the meeting point. Move both pointers at the same pace; the point at which they meet again is the start of the cycle, which in this case would be the repeated number.

This adaptation works under the assumption that the array behaves like a linked list where each value points to the next index.

C Code Implementation:

#include <stdio.h>

int findDuplicate(int nums[], int n) {
    int tortoise = nums[0];
    int hare = nums[0];

    // Phase 1: Finding the intersection point in the cycle
    do {
        tortoise = nums[tortoise];
        hare = nums[nums[hare]];
    } while (tortoise != hare);

    // Phase 2: Find the entrance to the cycle
    int ptr1 = nums[0];
    int ptr2 = tortoise;
    while (ptr1 != ptr2) {
        ptr1 = nums[ptr1];
        ptr2 = nums[ptr2];
    }

    return ptr1;
}

int main() {
    int nums[] = {3, 1, 3, 4, 2};
    int n = sizeof(nums) / sizeof(nums[0]);
    printf("Duplicate Element: %d\n", findDuplicate(nums, n));
    return 0;
}

Explanation of the C Code:

The code follows the two-phase approach:

Phase 1: Both the tortoise and hare start at the first element. The hare moves twice as fast as the tortoise. This loop continues until they meet inside the cycle, indicating a duplicate.
Phase 2: One pointer starts over from the beginning of the array, while the other starts from the meeting point. Both move at the same speed. The meeting point of these pointers is where the cycle begins, which is the duplicate number in this context.

Time and Space Complexity:

Time Complexity: 𝑂(𝑛)O(n). The algorithm completes in linear time because it only passes through the array a couple of times.
Space Complexity: 𝑂(1)O(1). This approach requires only a few extra variables, maintaining a constant space complexity.

Usage and Conclusion:

Floyd's Tortoise and Hare algorithm is highly efficient for finding duplicates in an array where each element maps to a position in the array, forming a cycle. It is particularly useful in environments with limited memory where a linear time complexity solution is required. While initially designed for cycle detection, its adaptation for array problems showcases the versatility and power of algorithmic strategies in solving complex programming challenges efficiently.