Go Data Structures & Algorithms
Specialized Data Structures in Go
Introduction
Beyond the fundamental data structures like arrays, linked lists, trees, and hash tables, there exists a rich ecosystem of specialized data structures designed to solve specific problems efficiently. These specialized structures often provide significant performance improvements for particular use cases, making them invaluable tools in a programmer's toolkit.
In this guide, we'll explore several specialized data structures, their implementations in Go, and their practical applications. We'll cover tries, Bloom filters, skip lists, disjoint sets (union-find), segment trees, Fenwick trees, LRU caches, and concurrent data structures.
Tries (Prefix Trees)
A trie (pronounced "try" or "tree") is a tree-like data structure used to store a dynamic set of strings. Tries are particularly efficient for operations like prefix matching, which makes them ideal for applications such as autocomplete, spell checking, and IP routing.
Key Features
- Each node represents a single character of a string
- The path from the root to a node forms a prefix
- Children of a node are typically stored in a map or array
- Leaf nodes or nodes with special markers indicate complete words
Operations
- Insert: Add a string to the trie
- Search: Check if a string exists in the trie
- StartsWith: Check if any string with a given prefix exists
- Delete: Remove a string from the trie
Implementation
Here's a basic implementation of a trie in Go:
package main
import (
"fmt"
)
// TrieNode represents a node in the trie
type TrieNode struct {
children map[rune]*TrieNode
isEnd bool
}
// NewTrieNode creates a new trie node
func NewTrieNode() *TrieNode {
return &TrieNode{
children: make(map[rune]*TrieNode),
isEnd: false,
}
}
// Trie represents a trie data structure
type Trie struct {
root *TrieNode
}
// NewTrie creates a new trie
func NewTrie() *Trie {
return &Trie{
root: NewTrieNode(),
}
}
// Insert adds a word to the trie
func (t *Trie) Insert(word string) {
node := t.root
for _, ch := range word {
if _, exists := node.children[ch]; !exists {
node.children[ch] = NewTrieNode()
}
node = node.children[ch]
}
node.isEnd = true
}
// Search checks if a word exists in the trie
func (t *Trie) Search(word string) bool {
node := t.root
for _, ch := range word {
if _, exists := node.children[ch]; !exists {
return false
}
node = node.children[ch]
}
return node.isEnd
}
// StartsWith checks if any word with the given prefix exists
func (t *Trie) StartsWith(prefix string) bool {
node := t.root
for _, ch := range prefix {
if _, exists := node.children[ch]; !exists {
return false
}
node = node.children[ch]
}
return true
}
// Delete removes a word from the trie
func (t *Trie) Delete(word string) bool {
	return t.deleteHelper(t.root, []rune(word), 0)
}

// deleteHelper is a recursive helper function for Delete.
// The word is passed as a rune slice so that indexing matches the
// rune keys used by Insert and Search (indexing the string directly
// would compare raw bytes and break for multi-byte characters).
func (t *Trie) deleteHelper(node *TrieNode, word []rune, index int) bool {
	if index == len(word) {
		if !node.isEnd {
			return false // Word doesn't exist
		}
		node.isEnd = false
		return len(node.children) == 0
	}
	ch := word[index]
	child, exists := node.children[ch]
	if !exists {
		return false // Word doesn't exist
	}
	shouldDeleteChild := t.deleteHelper(child, word, index+1)
	if shouldDeleteChild {
		delete(node.children, ch)
		return len(node.children) == 0 && !node.isEnd
	}
	return false
}
// FindWordsWithPrefix finds all words with the given prefix
func (t *Trie) FindWordsWithPrefix(prefix string) []string {
node := t.root
words := []string{}
// Navigate to the node representing the prefix
for _, ch := range prefix {
if _, exists := node.children[ch]; !exists {
return words // Prefix doesn't exist
}
node = node.children[ch]
}
// Collect all words starting from the prefix node
t.collectWords(node, prefix, &words)
return words
}
// collectWords is a recursive helper function for FindWordsWithPrefix
func (t *Trie) collectWords(node *TrieNode, prefix string, words *[]string) {
if node.isEnd {
*words = append(*words, prefix)
}
for ch, child := range node.children {
t.collectWords(child, prefix+string(ch), words)
}
}
func main() {
trie := NewTrie()
// Insert words
words := []string{"apple", "app", "application", "banana", "band", "bandana"}
for _, word := range words {
trie.Insert(word)
}
// Search for words
fmt.Println("Search 'apple':", trie.Search("apple")) // true
fmt.Println("Search 'app':", trie.Search("app")) // true
fmt.Println("Search 'apricot':", trie.Search("apricot")) // false
// Check prefixes
fmt.Println("Prefix 'app':", trie.StartsWith("app")) // true
fmt.Println("Prefix 'ban':", trie.StartsWith("ban")) // true
fmt.Println("Prefix 'car':", trie.StartsWith("car")) // false
// Find words with prefix
fmt.Println("Words with prefix 'app':", trie.FindWordsWithPrefix("app"))
fmt.Println("Words with prefix 'ban':", trie.FindWordsWithPrefix("ban"))
// Delete a word
trie.Delete("app")
fmt.Println("After deleting 'app':")
fmt.Println("Search 'app':", trie.Search("app")) // false
fmt.Println("Search 'apple':", trie.Search("apple")) // true
fmt.Println("Words with prefix 'app':", trie.FindWordsWithPrefix("app"))
}
Applications
- Autocomplete Systems: Suggesting words as users type
- Spell Checkers: Verifying if a word exists in a dictionary
- IP Routing: Finding the longest matching prefix for an IP address
- Text Search: Implementing algorithms like Aho-Corasick for pattern matching
- Word Games: Validating words in games like Scrabble or Boggle
Performance Analysis
| Operation | Complexity |
|---|---|
| Insert | O(m) where m is the length of the key |
| Search | O(m) where m is the length of the key |
| StartsWith | O(p) where p is the length of the prefix |
| Delete | O(m) where m is the length of the key |
| Space Complexity | O(n*m) where n is the number of keys and m is the average key length |
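When the alphabet is small and fixed, the map-based children above are often replaced with a fixed-size array, which avoids per-node map overhead at the cost of 26 pointers per node. A minimal sketch of this variant, assuming input words contain only the lowercase ASCII letters 'a'–'z' (the type and function names here are illustrative, not part of the implementation above):

```go
package main

import "fmt"

// arrayTrieNode stores children in a fixed array indexed by letter - 'a'.
type arrayTrieNode struct {
	children [26]*arrayTrieNode
	isEnd    bool
}

// insertWord adds a lowercase ASCII word to the trie.
func insertWord(root *arrayTrieNode, word string) {
	node := root
	for i := 0; i < len(word); i++ {
		idx := word[i] - 'a'
		if node.children[idx] == nil {
			node.children[idx] = &arrayTrieNode{}
		}
		node = node.children[idx]
	}
	node.isEnd = true
}

// searchWord reports whether word was previously inserted.
func searchWord(root *arrayTrieNode, word string) bool {
	node := root
	for i := 0; i < len(word); i++ {
		idx := word[i] - 'a'
		if node.children[idx] == nil {
			return false
		}
		node = node.children[idx]
	}
	return node.isEnd
}

func main() {
	root := &arrayTrieNode{}
	insertWord(root, "go")
	insertWord(root, "gopher")
	fmt.Println(searchWord(root, "go"), searchWord(root, "gop")) // true false
}
```

The array lookup is branch-free and cache-friendly, but each node always occupies 26 pointer slots, so the map version remains preferable for sparse or large alphabets (e.g. full Unicode).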
Bloom Filters
A Bloom filter is a space-efficient probabilistic data structure used to test whether an element is a member of a set. It can tell you with certainty that an element is not in the set, but it might give false positives (incorrectly reporting that an element is in the set when it's not).
Key Features
- Space-efficient: Uses significantly less memory than storing the actual elements
- Probabilistic: May produce false positives but never false negatives
- Uses multiple hash functions to set and check bits in a bit array
- Cannot remove elements (without additional structures)
Operations
- Add: Add an element to the filter
- Contains: Check if an element might be in the filter
Implementation
Here's a basic implementation of a Bloom filter in Go:
package main
import (
"fmt"
"hash/fnv"
"math"
)
// BloomFilter represents a Bloom filter
type BloomFilter struct {
bitArray []bool
size int
hashFuncs int
}
// NewBloomFilter creates a new Bloom filter with the given size and number of hash functions
func NewBloomFilter(size int, hashFuncs int) *BloomFilter {
return &BloomFilter{
bitArray: make([]bool, size),
size: size,
hashFuncs: hashFuncs,
}
}
// OptimalBloomFilter creates a Bloom filter with optimal parameters for the expected number of elements
// and desired false positive probability
func OptimalBloomFilter(expectedElements int, falsePositiveProb float64) *BloomFilter {
// Calculate optimal size
size := int(-float64(expectedElements) * math.Log(falsePositiveProb) / math.Pow(math.Log(2), 2))
// Calculate optimal number of hash functions
hashFuncs := int(float64(size) / float64(expectedElements) * math.Log(2))
// Ensure at least one hash function
if hashFuncs < 1 {
hashFuncs = 1
}
return NewBloomFilter(size, hashFuncs)
}
// hash derives the bit index for the seed-th hash function by folding the seed into an FNV-1a hash
func (bf *BloomFilter) hash(data string, seed int) int {
h := fnv.New64a()
h.Write([]byte(data))
h.Write([]byte{byte(seed)})
return int(h.Sum64() % uint64(bf.size))
}
// Add adds an element to the Bloom filter
func (bf *BloomFilter) Add(element string) {
for i := 0; i < bf.hashFuncs; i++ {
index := bf.hash(element, i)
bf.bitArray[index] = true
}
}
// Contains checks if an element might be in the Bloom filter
func (bf *BloomFilter) Contains(element string) bool {
for i := 0; i < bf.hashFuncs; i++ {
index := bf.hash(element, i)
if !bf.bitArray[index] {
return false // Definitely not in the set
}
}
return true // Might be in the set
}
// EstimateFalsePositiveRate estimates the current false positive rate
func (bf *BloomFilter) EstimateFalsePositiveRate(elementsAdded int) float64 {
k := float64(bf.hashFuncs)
m := float64(bf.size)
n := float64(elementsAdded)
// Calculate the probability that a bit is still 0 after adding n elements
p := math.Pow(1 - math.Exp(-k*n/m), k)
return p
}
func main() {
// Create a Bloom filter with optimal parameters for 1000 elements and 1% false positive rate
bf := OptimalBloomFilter(1000, 0.01)
// Add elements
elements := []string{"apple", "banana", "cherry", "date", "elderberry"}
for _, element := range elements {
bf.Add(element)
}
// Check membership
tests := []string{"apple", "banana", "grape", "kiwi", "lemon"}
for _, test := range tests {
if bf.Contains(test) {
if contains(elements, test) {
fmt.Printf("%s: Correctly identified as present\n", test)
} else {
fmt.Printf("%s: False positive\n", test)
}
} else {
fmt.Printf("%s: Correctly identified as absent\n", test)
}
}
// Estimate false positive rate
fmt.Printf("Estimated false positive rate: %.4f%%\n", bf.EstimateFalsePositiveRate(len(elements))*100)
}
// Helper function to check if a slice contains a string
func contains(slice []string, item string) bool {
for _, s := range slice {
if s == item {
return true
}
}
return false
}
Applications
- Database Systems: Quickly checking if a key might exist before performing expensive disk operations
- Caching Systems: Avoiding cache lookups for items that don't exist
- Web Crawlers: Checking if a URL has already been visited
- Spell Checkers: Quickly filtering out non-dictionary words
- Network Routers: Packet filtering and forwarding
Performance Analysis
| Operation | Complexity |
|---|---|
| Add | O(k) where k is the number of hash functions |
| Contains | O(k) where k is the number of hash functions |
| Space Complexity | O(m) where m is the size of the bit array |
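As noted in the key features, a plain Bloom filter cannot support deletion. A common workaround is a counting Bloom filter, which replaces each bit with a small counter: Add increments, Remove decrements, and Contains checks for non-zero counters. Here is a hedged sketch; the seed-folding hash mirrors the FNV-1a approach used above and is an illustrative choice, not a standard scheme:

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// CountingBloomFilter replaces each bit with a counter so that
// elements can be removed as well as added.
type CountingBloomFilter struct {
	counters  []uint8
	size      int
	hashFuncs int
}

func NewCountingBloomFilter(size, hashFuncs int) *CountingBloomFilter {
	return &CountingBloomFilter{
		counters:  make([]uint8, size),
		size:      size,
		hashFuncs: hashFuncs,
	}
}

func (cbf *CountingBloomFilter) hash(data string, seed int) int {
	h := fnv.New64a()
	h.Write([]byte(data))
	h.Write([]byte{byte(seed)})
	return int(h.Sum64() % uint64(cbf.size))
}

// Add increments the counter at each hash position.
func (cbf *CountingBloomFilter) Add(element string) {
	for i := 0; i < cbf.hashFuncs; i++ {
		idx := cbf.hash(element, i)
		if cbf.counters[idx] < 255 { // saturate rather than overflow
			cbf.counters[idx]++
		}
	}
}

// Remove decrements the counters. Only call it for elements known to be
// present; removing an absent element can corrupt counters shared with
// other elements and introduce false negatives.
func (cbf *CountingBloomFilter) Remove(element string) {
	for i := 0; i < cbf.hashFuncs; i++ {
		idx := cbf.hash(element, i)
		if cbf.counters[idx] > 0 {
			cbf.counters[idx]--
		}
	}
}

// Contains reports whether the element might be in the set.
func (cbf *CountingBloomFilter) Contains(element string) bool {
	for i := 0; i < cbf.hashFuncs; i++ {
		if cbf.counters[cbf.hash(element, i)] == 0 {
			return false
		}
	}
	return true
}

func main() {
	cbf := NewCountingBloomFilter(1000, 4)
	cbf.Add("apple")
	fmt.Println(cbf.Contains("apple")) // true
	cbf.Remove("apple")
	fmt.Println(cbf.Contains("apple")) // false
}
```

The price is roughly 4–8x the memory of a plain filter (one small counter per bit), which is why counting filters are usually reserved for workloads that genuinely need deletion.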
Skip Lists
A skip list is a probabilistic data structure that allows for fast search, insertion, and deletion operations. It's essentially a linked list with multiple layers, where each layer skips over some elements of the layer below, creating a hierarchy of linked lists.
Key Features
- Maintains multiple layers of linked lists
- Each layer skips over some elements of the layer below
- Probabilistic: Uses random coin flips to determine the height of each element
- Provides expected O(log n) search, insert, and delete operations
Operations
- Insert: Add an element to the skip list
- Search: Find an element in the skip list
- Delete: Remove an element from the skip list
Implementation
Here's a basic implementation of a skip list in Go:
package main
import (
"fmt"
"math/rand"
"time"
)
const (
maxLevel = 16 // Maximum level for the skip list
p = 0.5 // Probability of promoting to the next level
)
// Node represents a node in the skip list
type Node struct {
key int
value interface{}
next []*Node // Array of pointers to next nodes at each level
}
// NewNode creates a new node with the given key, value, and level
func NewNode(key int, value interface{}, level int) *Node {
return &Node{
key: key,
value: value,
next: make([]*Node, level),
}
}
// SkipList represents a skip list
type SkipList struct {
head *Node
level int
length int
random *rand.Rand
}
// NewSkipList creates a new skip list
func NewSkipList() *SkipList {
return &SkipList{
head: NewNode(-1, nil, maxLevel),
level: 1,
length: 0,
random: rand.New(rand.NewSource(time.Now().UnixNano())),
}
}
// randomLevel generates a random level for a new node
func (sl *SkipList) randomLevel() int {
level := 1
for level < maxLevel && sl.random.Float64() < p {
level++
}
return level
}
// Insert adds a key-value pair to the skip list
func (sl *SkipList) Insert(key int, value interface{}) {
update := make([]*Node, maxLevel)
current := sl.head
// Find the position to insert
for i := sl.level - 1; i >= 0; i-- {
for current.next[i] != nil && current.next[i].key < key {
current = current.next[i]
}
update[i] = current
}
// Move to the next node
current = current.next[0]
// If the key already exists, update the value
if current != nil && current.key == key {
current.value = value
return
}
// Generate a random level for the new node
level := sl.randomLevel()
// Update the skip list level if necessary
if level > sl.level {
for i := sl.level; i < level; i++ {
update[i] = sl.head
}
sl.level = level
}
// Create a new node
newNode := NewNode(key, value, level)
// Insert the new node at each level
for i := 0; i < level; i++ {
newNode.next[i] = update[i].next[i]
update[i].next[i] = newNode
}
sl.length++
}
// Search finds a value by key in the skip list
func (sl *SkipList) Search(key int) (interface{}, bool) {
current := sl.head
// Start from the highest level and work down
for i := sl.level - 1; i >= 0; i-- {
for current.next[i] != nil && current.next[i].key < key {
current = current.next[i]
}
}
// Move to the next node
current = current.next[0]
// Check if the key exists
if current != nil && current.key == key {
return current.value, true
}
return nil, false
}
// Delete removes a key-value pair from the skip list
func (sl *SkipList) Delete(key int) bool {
update := make([]*Node, maxLevel)
current := sl.head
// Find the position to delete
for i := sl.level - 1; i >= 0; i-- {
for current.next[i] != nil && current.next[i].key < key {
current = current.next[i]
}
update[i] = current
}
// Move to the next node
current = current.next[0]
// If the key doesn't exist, return false
if current == nil || current.key != key {
return false
}
// Update pointers to skip the node being deleted
for i := 0; i < sl.level; i++ {
if update[i].next[i] != current {
break
}
update[i].next[i] = current.next[i]
}
// Update the skip list level if necessary
for sl.level > 1 && sl.head.next[sl.level-1] == nil {
sl.level--
}
sl.length--
return true
}
// Length returns the number of elements in the skip list
func (sl *SkipList) Length() int {
return sl.length
}
// String returns a string representation of the skip list
func (sl *SkipList) String() string {
result := "SkipList:\n"
for i := sl.level - 1; i >= 0; i-- {
result += fmt.Sprintf("Level %d: ", i)
node := sl.head.next[i]
for node != nil {
result += fmt.Sprintf("%d -> ", node.key)
node = node.next[i]
}
result += "nil\n"
}
return result
}
func main() {
sl := NewSkipList()
// Insert elements
sl.Insert(3, "three")
sl.Insert(6, "six")
sl.Insert(7, "seven")
sl.Insert(9, "nine")
sl.Insert(12, "twelve")
sl.Insert(19, "nineteen")
sl.Insert(17, "seventeen")
sl.Insert(26, "twenty-six")
sl.Insert(21, "twenty-one")
sl.Insert(25, "twenty-five")
fmt.Println(sl)
// Search for elements
value, found := sl.Search(9)
fmt.Printf("Search 9: %v, %v\n", value, found)
value, found = sl.Search(15)
fmt.Printf("Search 15: %v, %v\n", value, found)
// Delete elements
deleted := sl.Delete(19)
fmt.Printf("Delete 19: %v\n", deleted)
deleted = sl.Delete(15)
fmt.Printf("Delete 15: %v\n", deleted)
fmt.Println(sl)
fmt.Println("Skip list length:", sl.Length())
}
Applications
- Sorted Sets: Maintaining a sorted collection of elements
- Database Indexes: Implementing efficient range queries
- Priority Queues: Implementing priority queues with efficient operations
- In-Memory Databases: Storing sorted data for quick access
Performance Analysis
| Operation | Average Case | Worst Case |
|---|---|---|
| Search | O(log n) | O(n) |
| Insert | O(log n) | O(n) |
| Delete | O(log n) | O(n) |
| Space Complexity | O(n) | O(n log n) |
Disjoint Sets (Union-Find)
A disjoint-set data structure (also known as a union-find data structure) keeps track of a set of elements partitioned into a number of disjoint (non-overlapping) subsets. It provides near-constant-time operations to add new sets, merge existing sets, and determine whether elements are in the same set.
Key Features
- Maintains a collection of disjoint sets
- Each set is identified by a representative element
- Supports efficient union and find operations
- Uses path compression and union by rank for optimization
Operations
- MakeSet: Create a new set with a single element
- Find: Find the representative of the set containing an element
- Union: Merge two sets
Implementation
Here's a basic implementation of a disjoint-set data structure in Go:
package main
import (
"fmt"
)
// DisjointSet represents a disjoint-set data structure
type DisjointSet struct {
parent []int
rank []int
size []int
}
// NewDisjointSet creates a new disjoint-set with n elements
func NewDisjointSet(n int) *DisjointSet {
parent := make([]int, n)
rank := make([]int, n)
size := make([]int, n)
// Initialize each element as a separate set
for i := 0; i < n; i++ {
parent[i] = i
rank[i] = 0
size[i] = 1
}
return &DisjointSet{
parent: parent,
rank: rank,
size: size,
}
}
// Find returns the representative of the set containing x
// with path compression
func (ds *DisjointSet) Find(x int) int {
if ds.parent[x] != x {
ds.parent[x] = ds.Find(ds.parent[x]) // Path compression
}
return ds.parent[x]
}
// Union merges the sets containing x and y
// using union by rank
func (ds *DisjointSet) Union(x, y int) {
rootX := ds.Find(x)
rootY := ds.Find(y)
if rootX == rootY {
return // Already in the same set
}
// Union by rank
if ds.rank[rootX] < ds.rank[rootY] {
ds.parent[rootX] = rootY
ds.size[rootY] += ds.size[rootX]
} else if ds.rank[rootX] > ds.rank[rootY] {
ds.parent[rootY] = rootX
ds.size[rootX] += ds.size[rootY]
} else {
ds.parent[rootY] = rootX
ds.rank[rootX]++
ds.size[rootX] += ds.size[rootY]
}
}
// Connected checks if x and y are in the same set
func (ds *DisjointSet) Connected(x, y int) bool {
return ds.Find(x) == ds.Find(y)
}
// Size returns the size of the set containing x
func (ds *DisjointSet) Size(x int) int {
return ds.size[ds.Find(x)]
}
// Count returns the number of disjoint sets
func (ds *DisjointSet) Count() int {
count := 0
for i := 0; i < len(ds.parent); i++ {
if ds.parent[i] == i {
count++
}
}
return count
}
func main() {
// Create a disjoint-set with 10 elements
ds := NewDisjointSet(10)
// Perform some unions
ds.Union(0, 1)
ds.Union(2, 3)
ds.Union(4, 5)
ds.Union(6, 7)
ds.Union(8, 9)
ds.Union(0, 2)
ds.Union(4, 6)
ds.Union(0, 4)
// Check connectivity
fmt.Println("0 and 9 connected:", ds.Connected(0, 9))
fmt.Println("1 and 7 connected:", ds.Connected(1, 7))
// Get set sizes
fmt.Println("Size of set containing 0:", ds.Size(0))
fmt.Println("Size of set containing 8:", ds.Size(8))
// Count the number of disjoint sets
fmt.Println("Number of disjoint sets:", ds.Count())
}
Applications
- Kruskal's Algorithm: Finding the minimum spanning tree of a graph
- Network Connectivity: Determining if two nodes are connected
- Image Processing: Connected component labeling
- Social Networks: Finding communities or friend circles
- Grid Percolation: Determining if a path exists from top to bottom
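Kruskal's algorithm, the first item above, is the canonical use of union-find: sort the edges by weight, then accept each edge whose endpoints are not yet in the same set. The sketch below uses a stripped-down parent-slice version of the structure (path halving only, no ranks, for brevity); the edge type and function names are illustrative:

```go
package main

import (
	"fmt"
	"sort"
)

type edge struct{ u, v, w int }

// find returns the representative of x, compressing the path as it goes.
func find(parent []int, x int) int {
	for parent[x] != x {
		parent[x] = parent[parent[x]] // path halving
		x = parent[x]
	}
	return x
}

// kruskal returns the total weight of a minimum spanning tree of an
// n-vertex connected graph given as an edge list.
func kruskal(n int, edges []edge) int {
	sort.Slice(edges, func(i, j int) bool { return edges[i].w < edges[j].w })
	parent := make([]int, n)
	for i := range parent {
		parent[i] = i
	}
	total := 0
	for _, e := range edges {
		ru, rv := find(parent, e.u), find(parent, e.v)
		if ru != rv { // endpoints in different components: keep the edge
			parent[ru] = rv
			total += e.w
		}
	}
	return total
}

func main() {
	edges := []edge{{0, 1, 4}, {0, 2, 3}, {1, 2, 1}, {1, 3, 2}, {2, 3, 4}, {3, 4, 2}}
	fmt.Println("MST weight:", kruskal(5, edges)) // MST weight: 8
}
```

The union-find operations are what keep the loop cheap: without them, checking whether an edge would create a cycle would require a graph traversal per edge.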
Performance Analysis
| Operation | Complexity |
|---|---|
| MakeSet | O(1) |
| Find | O(α(n)) (amortized) where α is the inverse Ackermann function |
| Union | O(α(n)) (amortized) |
| Space Complexity | O(n) |
Segment Trees
A segment tree is a tree data structure used for storing information about intervals or segments. It allows for efficient querying of cumulative information over an array or range of values, such as finding the sum, minimum, maximum, or other properties of a range.
Key Features
- Binary tree structure where each node represents a segment of the array
- Leaf nodes represent individual elements
- Internal nodes represent the combination of their children
- Supports efficient range queries and updates
Operations
- Build: Construct the segment tree from an array
- Query: Query a range for some property (sum, min, max, etc.)
- Update: Update a single element and propagate the changes
Implementation
Here's a basic implementation of a segment tree for range sum queries in Go:
package main
import (
"fmt"
"math"
)
// SegmentTree represents a segment tree
type SegmentTree struct {
tree []int
n int
}
// NewSegmentTree creates a new segment tree from an array
func NewSegmentTree(arr []int) *SegmentTree {
n := len(arr)
// Calculate the size of the segment tree array
// Size is 2*2^(ceil(log2(n))) - 1
x := int(math.Ceil(math.Log2(float64(n))))
maxSize := 2*int(math.Pow(2, float64(x))) - 1
tree := make([]int, maxSize)
st := &SegmentTree{
tree: tree,
n: n,
}
// Build the segment tree
st.build(arr, 0, n-1, 0)
return st
}
// build constructs the segment tree
func (st *SegmentTree) build(arr []int, start, end, index int) int {
// Leaf node
if start == end {
st.tree[index] = arr[start]
return arr[start]
}
// Internal node
mid := (start + end) / 2
// Build left and right subtrees
leftSum := st.build(arr, start, mid, 2*index+1)
rightSum := st.build(arr, mid+1, end, 2*index+2)
// Store the sum of left and right subtrees
st.tree[index] = leftSum + rightSum
return st.tree[index]
}
// Query returns the sum of elements in the range [qStart, qEnd]
func (st *SegmentTree) Query(qStart, qEnd int) int {
// Check if the range is valid
if qStart < 0 || qEnd >= st.n || qStart > qEnd {
return 0
}
return st.queryRange(0, st.n-1, qStart, qEnd, 0)
}
// queryRange is a recursive helper function for Query
func (st *SegmentTree) queryRange(start, end, qStart, qEnd, index int) int {
// Complete overlap
if qStart <= start && qEnd >= end {
return st.tree[index]
}
// No overlap
if end < qStart || start > qEnd {
return 0
}
// Partial overlap
mid := (start + end) / 2
leftSum := st.queryRange(start, mid, qStart, qEnd, 2*index+1)
rightSum := st.queryRange(mid+1, end, qStart, qEnd, 2*index+2)
return leftSum + rightSum
}
// Update updates the value at index i to newValue
func (st *SegmentTree) Update(i, newValue int) {
// Check if the index is valid
if i < 0 || i >= st.n {
return
}
st.updateValue(0, st.n-1, i, newValue, 0)
}
// updateValue is a recursive helper function for Update
func (st *SegmentTree) updateValue(start, end, i, newValue, index int) {
// Index is out of range
if i < start || i > end {
return
}
// Leaf node
if start == end {
st.tree[index] = newValue
return
}
// Internal node
mid := (start + end) / 2
// Update left or right subtree
if i <= mid {
st.updateValue(start, mid, i, newValue, 2*index+1)
} else {
st.updateValue(mid+1, end, i, newValue, 2*index+2)
}
// Update the current node
st.tree[index] = st.tree[2*index+1] + st.tree[2*index+2]
}
func main() {
arr := []int{1, 3, 5, 7, 9, 11}
st := NewSegmentTree(arr)
fmt.Println("Segment Tree:", st.tree)
// Query ranges
fmt.Println("Sum of [1, 3]:", st.Query(1, 3))
fmt.Println("Sum of [0, 5]:", st.Query(0, 5))
fmt.Println("Sum of [2, 4]:", st.Query(2, 4))
// Update a value
st.Update(2, 10)
fmt.Println("After updating index 2 to 10:")
fmt.Println("Segment Tree:", st.tree)
// Query again
fmt.Println("Sum of [1, 3]:", st.Query(1, 3))
fmt.Println("Sum of [0, 5]:", st.Query(0, 5))
fmt.Println("Sum of [2, 4]:", st.Query(2, 4))
}
Applications
- Range Queries: Finding the sum, minimum, maximum, or other properties of a range
- Computational Geometry: Solving problems involving intervals or segments
- Database Systems: Implementing range queries in databases
- Competitive Programming: Solving problems involving range operations
Performance Analysis
| Operation | Complexity |
|---|---|
| Build | O(n) |
| Query | O(log n) |
| Update | O(log n) |
| Space Complexity | O(n) |
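The same structure supports any associative operation by swapping the combine step. As a sketch, here is a range-minimum variant written iteratively over a 2n-sized array, a compact bottom-up layout that differs from the recursive version above; it answers queries on half-open ranges [l, r):

```go
package main

import "fmt"

// MinSegTree is an iterative segment tree stored in a 2n array:
// leaves occupy tree[n..2n-1], and tree[i] = min(tree[2i], tree[2i+1]).
type MinSegTree struct {
	n    int
	tree []int
}

func NewMinSegTree(arr []int) *MinSegTree {
	n := len(arr)
	tree := make([]int, 2*n)
	copy(tree[n:], arr)
	for i := n - 1; i > 0; i-- {
		tree[i] = minInt(tree[2*i], tree[2*i+1])
	}
	return &MinSegTree{n: n, tree: tree}
}

// Update sets arr[i] = v and repairs all ancestors on the way up.
func (st *MinSegTree) Update(i, v int) {
	i += st.n
	st.tree[i] = v
	for i > 1 {
		i /= 2
		st.tree[i] = minInt(st.tree[2*i], st.tree[2*i+1])
	}
}

// Query returns the minimum over the half-open range [l, r).
func (st *MinSegTree) Query(l, r int) int {
	res := int(^uint(0) >> 1) // max int as the identity for min
	for l, r = l+st.n, r+st.n; l < r; l, r = l/2, r/2 {
		if l%2 == 1 { // l is a right child: take it and step past
			res = minInt(res, st.tree[l])
			l++
		}
		if r%2 == 1 { // r is a right child: step back and take it
			r--
			res = minInt(res, st.tree[r])
		}
	}
	return res
}

func minInt(a, b int) int {
	if a < b {
		return a
	}
	return b
}

func main() {
	st := NewMinSegTree([]int{5, 2, 8, 1, 9, 3})
	fmt.Println(st.Query(1, 4)) // min of {2, 8, 1} -> 1
	st.Update(3, 7)
	fmt.Println(st.Query(1, 4)) // min of {2, 8, 7} -> 2
}
```

The iterative layout uses exactly 2n slots instead of the ~4n worst case of the recursive version, at the cost of supporting only bottom-up (non-lazy) updates.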
Fenwick Trees (Binary Indexed Trees)
A Fenwick tree (also known as a binary indexed tree or BIT) is a data structure that can efficiently update elements and calculate prefix sums in a table of numbers. It's more space-efficient than a segment tree but supports a more limited set of operations.
Key Features
- Represented as an array where each element stores a cumulative sum
- Uses the binary representation of indices to determine which elements to sum
- More space-efficient than segment trees
- Supports efficient point updates and prefix sum queries
Operations
- Update: Add a value to an element
- PrefixSum: Calculate the sum of elements from index 0 to i
- RangeSum: Calculate the sum of elements from index i to j
Implementation
Here's a basic implementation of a Fenwick tree in Go:
package main
import (
"fmt"
)
// FenwickTree represents a Fenwick tree (binary indexed tree)
type FenwickTree struct {
tree []int
n int
}
// NewFenwickTree creates a new Fenwick tree from an array
func NewFenwickTree(arr []int) *FenwickTree {
n := len(arr)
tree := make([]int, n+1) // 1-indexed
ft := &FenwickTree{
tree: tree,
n: n,
}
// Build the Fenwick tree
for i := 0; i < n; i++ {
ft.Update(i, arr[i])
}
return ft
}
// Update adds val to the element at index i
func (ft *FenwickTree) Update(i, val int) {
i++ // Convert to 1-indexed
for i <= ft.n {
ft.tree[i] += val
i += i & -i // Add the least significant bit
}
}
// PrefixSum returns the sum of elements from index 0 to i
func (ft *FenwickTree) PrefixSum(i int) int {
i++ // Convert to 1-indexed
sum := 0
for i > 0 {
sum += ft.tree[i]
i -= i & -i // Remove the least significant bit
}
return sum
}
// RangeSum returns the sum of elements from index i to j
func (ft *FenwickTree) RangeSum(i, j int) int {
if i > j || i < 0 || j >= ft.n {
return 0
}
if i == 0 {
return ft.PrefixSum(j)
}
return ft.PrefixSum(j) - ft.PrefixSum(i-1)
}
// Get returns the value at index i
func (ft *FenwickTree) Get(i int) int {
return ft.RangeSum(i, i)
}
// Set sets the value at index i to val
func (ft *FenwickTree) Set(i, val int) {
oldVal := ft.Get(i)
ft.Update(i, val - oldVal)
}
func main() {
arr := []int{1, 3, 5, 7, 9, 11}
ft := NewFenwickTree(arr)
fmt.Println("Fenwick Tree:", ft.tree)
// Query prefix sums
fmt.Println("Prefix sum up to index 3:", ft.PrefixSum(3))
fmt.Println("Prefix sum up to index 5:", ft.PrefixSum(5))
// Query range sums
fmt.Println("Sum of [1, 3]:", ft.RangeSum(1, 3))
fmt.Println("Sum of [2, 4]:", ft.RangeSum(2, 4))
// Update a value
ft.Update(2, 5) // Add 5 to index 2
fmt.Println("After adding 5 to index 2:")
fmt.Println("Fenwick Tree:", ft.tree)
// Query again
fmt.Println("Sum of [1, 3]:", ft.RangeSum(1, 3))
fmt.Println("Sum of [2, 4]:", ft.RangeSum(2, 4))
// Set a value
ft.Set(2, 10) // Set index 2 to 10
fmt.Println("After setting index 2 to 10:")
fmt.Println("Fenwick Tree:", ft.tree)
// Query again
fmt.Println("Sum of [1, 3]:", ft.RangeSum(1, 3))
fmt.Println("Sum of [2, 4]:", ft.RangeSum(2, 4))
}
Applications
- Range Sum Queries: Efficiently calculating the sum of a range of elements
- Frequency Counting: Counting the frequency of elements in a range
- Inversion Counting: Counting the number of inversions in an array
- Cumulative Distribution Functions: Computing cumulative distribution functions
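Inversion counting from the list above is a classic Fenwick-tree exercise: scan the array left to right and, for each element, count how many previously seen elements are strictly larger. The sketch below assumes all values lie in a small known range 1..maxVal (real inputs are usually coordinate-compressed first); the function names are illustrative:

```go
package main

import "fmt"

// countInversions counts pairs (i, j) with i < j and arr[i] > arr[j],
// assuming every value lies in 1..maxVal.
func countInversions(arr []int, maxVal int) int {
	tree := make([]int, maxVal+1) // 1-indexed Fenwick tree of value frequencies
	update := func(i int) {
		for ; i <= maxVal; i += i & -i {
			tree[i]++
		}
	}
	prefix := func(i int) int {
		s := 0
		for ; i > 0; i -= i & -i {
			s += tree[i]
		}
		return s
	}
	inversions := 0
	for j, v := range arr {
		// prefix(v) counts previously seen elements <= v; the other
		// j - prefix(v) elements seen so far are > v, so each forms
		// an inversion with arr[j].
		inversions += j - prefix(v)
		update(v)
	}
	return inversions
}

func main() {
	fmt.Println(countInversions([]int{3, 1, 4, 1, 5, 9, 2, 6}, 9)) // 8
}
```

Each element costs two O(log maxVal) tree operations, for O(n log maxVal) overall, versus O(n^2) for the naive pair count.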
Performance Analysis
| Operation | Complexity |
|---|---|
| Build | O(n log n) |
| Update | O(log n) |
| PrefixSum | O(log n) |
| RangeSum | O(log n) |
| Space Complexity | O(n) |
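The O(n log n) build above comes from calling Update once per element. It can be reduced to O(n) by adding each node's running total directly to its immediate parent at index i + (i & -i) — a sketch (buildLinear is an illustrative name, not part of the implementation above):

```go
package main

import "fmt"

// buildLinear constructs a Fenwick tree in O(n): after tree[i] holds the
// full sum for its range, that sum is pushed once to the parent i + (i & -i).
func buildLinear(arr []int) []int {
	n := len(arr)
	tree := make([]int, n+1) // 1-indexed, as in the implementation above
	for i := 1; i <= n; i++ {
		tree[i] += arr[i-1]
		if parent := i + (i & -i); parent <= n {
			tree[parent] += tree[i]
		}
	}
	return tree
}

// prefixSum returns the sum of the first i elements (1-based i),
// using the usual least-significant-bit descent.
func prefixSum(tree []int, i int) int {
	sum := 0
	for ; i > 0; i -= i & -i {
		sum += tree[i]
	}
	return sum
}

func main() {
	tree := buildLinear([]int{1, 3, 5, 7, 9, 11})
	fmt.Println(prefixSum(tree, 4)) // 1+3+5+7 = 16
}
```

The resulting array is identical to what the Update-based build produces, so all later queries and updates work unchanged.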
LRU Cache
An LRU (Least Recently Used) cache is a data structure that maintains a fixed-size cache of items, discarding the least recently used items when the cache is full. It provides fast access to recently used items while automatically managing the cache size.
Key Features
- Maintains a fixed-size cache of key-value pairs
- Provides O(1) access to cached items
- Automatically evicts the least recently used item when the cache is full
- Typically implemented using a hash map and a doubly linked list
Operations
- Get: Retrieve a value by key and mark it as recently used
- Put: Add or update a key-value pair and mark it as recently used
Implementation
Here's a basic implementation of an LRU cache in Go:
package main
import (
"fmt"
)
// Node represents a node in the doubly linked list
type Node struct {
key int
value int
prev *Node
next *Node
}
// LRUCache represents an LRU cache
type LRUCache struct {
capacity int
cache map[int]*Node
head *Node // Most recently used
tail *Node // Least recently used
}
// NewLRUCache creates a new LRU cache with the given capacity
func NewLRUCache(capacity int) *LRUCache {
cache := &LRUCache{
capacity: capacity,
cache: make(map[int]*Node),
head: nil,
tail: nil,
}
return cache
}
// Get retrieves a value by key and marks it as recently used
func (c *LRUCache) Get(key int) int {
node, exists := c.cache[key]
if !exists {
return -1 // Key not found
}
// Move the node to the front (most recently used)
c.moveToFront(node)
return node.value
}
// Put adds or updates a key-value pair and marks it as recently used
func (c *LRUCache) Put(key, value int) {
// Check if the key already exists
if node, exists := c.cache[key]; exists {
// Update the value
node.value = value
// Move the node to the front
c.moveToFront(node)
return
}
// Create a new node
node := &Node{
key: key,
value: value,
prev: nil,
next: nil,
}
// Add the node to the cache
c.cache[key] = node
// Add the node to the front of the list
c.addToFront(node)
// If the cache is over capacity, remove the least recently used item
if len(c.cache) > c.capacity {
c.removeLRU()
}
}
// moveToFront moves a node to the front of the list (most recently used)
func (c *LRUCache) moveToFront(node *Node) {
// If the node is already at the front, do nothing
if node == c.head {
return
}
// Remove the node from its current position
if node.prev != nil {
node.prev.next = node.next
}
if node.next != nil {
node.next.prev = node.prev
}
if node == c.tail {
c.tail = node.prev
}
// Add the node to the front
node.prev = nil
node.next = c.head
if c.head != nil {
c.head.prev = node
}
c.head = node
// If the list was empty, update the tail
if c.tail == nil {
c.tail = node
}
}
// addToFront adds a new node to the front of the list
func (c *LRUCache) addToFront(node *Node) {
node.prev = nil
node.next = c.head
if c.head != nil {
c.head.prev = node
}
c.head = node
// If the list was empty, update the tail
if c.tail == nil {
c.tail = node
}
}
// removeLRU removes the least recently used item from the cache
func (c *LRUCache) removeLRU() {
if c.tail == nil {
return // Cache is empty
}
// Remove the tail node from the cache
delete(c.cache, c.tail.key)
// Update the tail
c.tail = c.tail.prev
if c.tail != nil {
c.tail.next = nil
} else {
// List is now empty
c.head = nil
}
}
// Len returns the number of items in the cache
func (c *LRUCache) Len() int {
return len(c.cache)
}
// String returns a string representation of the cache
func (c *LRUCache) String() string {
result := "LRUCache (most recent first): ["
node := c.head
for node != nil {
result += fmt.Sprintf("(%d:%d)", node.key, node.value)
if node.next != nil {
result += " -> "
}
node = node.next
}
result += "]"
return result
}
func main() {
cache := NewLRUCache(3)
// Add some items
cache.Put(1, 10)
cache.Put(2, 20)
cache.Put(3, 30)
fmt.Println(cache)
// Access an item (moves it to the front)
fmt.Println("Get 2:", cache.Get(2))
fmt.Println(cache)
// Add a new item, which will evict the least recently used item (1)
cache.Put(4, 40)
fmt.Println(cache)
// Try to access the evicted item
fmt.Println("Get 1:", cache.Get(1))
// Update an existing item
cache.Put(3, 33)
fmt.Println(cache)
// Add another item, which will evict the least recently used item (2)
cache.Put(5, 50)
fmt.Println(cache)
}
Applications
- Database Caching: Caching frequently accessed database records
- Web Caching: Caching web pages or API responses
- Memory Management: Implementing page replacement algorithms
- Image Caching: Caching images in mobile or web applications
Performance Analysis
| Operation | Complexity |
|---|---|
| Get | O(1) |
| Put | O(1) |
| Space Complexity | O(capacity) |
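The manual pointer bookkeeping above can be delegated to the standard library's container/list, which provides a doubly linked list with MoveToFront, PushFront, and Remove. A sketch of the same cache built on it (ListLRU and its helpers are illustrative names):

```go
package main

import (
	"container/list"
	"fmt"
)

type entry struct{ key, value int }

// ListLRU is an LRU cache backed by container/list: the front of the
// list is the most recently used entry, the back the least.
type ListLRU struct {
	capacity int
	ll       *list.List
	items    map[int]*list.Element
}

func NewListLRU(capacity int) *ListLRU {
	return &ListLRU{
		capacity: capacity,
		ll:       list.New(),
		items:    make(map[int]*list.Element),
	}
}

// Get returns the value for key (or -1) and marks it most recently used.
func (c *ListLRU) Get(key int) int {
	if el, ok := c.items[key]; ok {
		c.ll.MoveToFront(el)
		return el.Value.(*entry).value
	}
	return -1
}

// Put inserts or updates key, evicting the back element if over capacity.
func (c *ListLRU) Put(key, value int) {
	if el, ok := c.items[key]; ok {
		el.Value.(*entry).value = value
		c.ll.MoveToFront(el)
		return
	}
	c.items[key] = c.ll.PushFront(&entry{key, value})
	if c.ll.Len() > c.capacity {
		back := c.ll.Back()
		c.ll.Remove(back)
		delete(c.items, back.Value.(*entry).key)
	}
}

func main() {
	cache := NewListLRU(2)
	cache.Put(1, 10)
	cache.Put(2, 20)
	cache.Get(1)     // touch 1 so that 2 becomes least recently used
	cache.Put(3, 30) // evicts 2
	fmt.Println(cache.Get(2), cache.Get(1), cache.Get(3)) // -1 10 30
}
```

Same O(1) Get and Put, with the list invariants maintained by a well-tested stdlib type instead of hand-written prev/next updates.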
Concurrent Data Structures
Concurrent data structures are designed to be safely accessed and modified by multiple goroutines simultaneously. Go provides several built-in concurrent data structures and synchronization primitives to help build thread-safe applications.
sync.Map
Go's sync.Map is a concurrent map implementation that is safe for use by multiple goroutines without additional locking.
package main
import (
"fmt"
"sync"
)
func main() {
var m sync.Map
// Store key-value pairs
m.Store("key1", "value1")
m.Store("key2", "value2")
// Load a value
value, exists := m.Load("key1")
if exists {
fmt.Println("Value for key1:", value)
}
// LoadOrStore returns the existing value if the key exists,
// otherwise it stores and returns the given value
value, loaded := m.LoadOrStore("key3", "value3")
if loaded {
fmt.Println("Key3 already exists with value:", value)
} else {
fmt.Println("Stored new key3 with value:", value)
}
// Delete a key
m.Delete("key2")
// Iterate over all key-value pairs
m.Range(func(key, value interface{}) bool {
fmt.Printf("%v: %v\n", key, value)
return true // continue iteration
})
}
Channels
Channels are a built-in Go feature that provides a way for goroutines to communicate and synchronize their execution.
package main
import (
"fmt"
"time"
)
func main() {
// Create a buffered channel with capacity 2
ch := make(chan int, 2)
// Send values to the channel
ch <- 1
ch <- 2
// Receive values from the channel
fmt.Println(<-ch)
fmt.Println(<-ch)
// Using channels for communication between goroutines
done := make(chan bool)
go func() {
fmt.Println("Working...")
time.Sleep(time.Second)
fmt.Println("Done!")
done <- true
}()
// Wait for the goroutine to finish
<-done
// Using channels for iteration
nums := make(chan int, 5)
for i := 0; i < 5; i++ {
nums <- i
}
close(nums) // Close the channel to signal that no more values will be sent
// Range over the channel
for num := range nums {
fmt.Println("Received:", num)
}
}
Custom Thread-Safe Data Structures
You can create your own thread-safe data structures using mutexes or other synchronization primitives.
package main
import (
"fmt"
"sync"
)
// ConcurrentQueue represents a thread-safe queue
type ConcurrentQueue struct {
items []interface{}
mu sync.Mutex
}
// NewConcurrentQueue creates a new concurrent queue
func NewConcurrentQueue() *ConcurrentQueue {
return &ConcurrentQueue{
items: make([]interface{}, 0),
}
}
// Enqueue adds an item to the queue
func (q *ConcurrentQueue) Enqueue(item interface{}) {
q.mu.Lock()
defer q.mu.Unlock()
q.items = append(q.items, item)
}
// Dequeue removes and returns the first item from the queue
func (q *ConcurrentQueue) Dequeue() (interface{}, bool) {
q.mu.Lock()
defer q.mu.Unlock()
if len(q.items) == 0 {
return nil, false
}
item := q.items[0]
q.items[0] = nil // drop the reference so the backing array doesn't retain it
q.items = q.items[1:]
return item, true
}
// Peek returns the first item without removing it
func (q *ConcurrentQueue) Peek() (interface{}, bool) {
q.mu.Lock()
defer q.mu.Unlock()
if len(q.items) == 0 {
return nil, false
}
return q.items[0], true
}
// Size returns the number of items in the queue
func (q *ConcurrentQueue) Size() int {
q.mu.Lock()
defer q.mu.Unlock()
return len(q.items)
}
func main() {
queue := NewConcurrentQueue()
// Concurrent operations
var wg sync.WaitGroup
// Enqueue items concurrently
for i := 0; i < 10; i++ {
wg.Add(1)
go func(val int) {
defer wg.Done()
queue.Enqueue(val)
fmt.Printf("Enqueued %d\n", val)
}(i)
}
wg.Wait()
fmt.Println("Queue size:", queue.Size())
// Dequeue items concurrently
for i := 0; i < 5; i++ {
wg.Add(1)
go func() {
defer wg.Done()
if item, ok := queue.Dequeue(); ok {
fmt.Printf("Dequeued %v\n", item)
}
}()
}
wg.Wait()
fmt.Println("Queue size after dequeuing:", queue.Size())
}
Persistent Data Structures
Persistent data structures preserve previous versions of themselves when modified, allowing for efficient access to both the current and historical states. They are particularly useful in functional programming and scenarios where you need to track changes over time.
Immutable Linked List
An immutable linked list is a simple persistent data structure where modifications create new nodes instead of modifying existing ones.
package main
import (
"fmt"
)
// ImmutableList represents an immutable linked list
type ImmutableList struct {
head *ImmutableNode
count int
}
// ImmutableNode represents a node in an immutable linked list
type ImmutableNode struct {
value interface{}
next *ImmutableNode
}
// NewImmutableList creates a new empty immutable list
func NewImmutableList() *ImmutableList {
return &ImmutableList{
head: nil,
count: 0,
}
}
// Cons adds a new value to the front of the list and returns a new list
func (l *ImmutableList) Cons(value interface{}) *ImmutableList {
return &ImmutableList{
head: &ImmutableNode{
value: value,
next: l.head,
},
count: l.count + 1,
}
}
// Head returns the first value in the list
func (l *ImmutableList) Head() (interface{}, bool) {
if l.head == nil {
return nil, false
}
return l.head.value, true
}
// Tail returns a new list without the first element
func (l *ImmutableList) Tail() *ImmutableList {
if l.head == nil {
return l
}
return &ImmutableList{
head: l.head.next,
count: l.count - 1,
}
}
// IsEmpty checks if the list is empty
func (l *ImmutableList) IsEmpty() bool {
return l.head == nil
}
// Count returns the number of elements in the list
func (l *ImmutableList) Count() int {
return l.count
}
// ToSlice converts the list to a slice
func (l *ImmutableList) ToSlice() []interface{} {
result := make([]interface{}, 0, l.count)
current := l.head
for current != nil {
result = append(result, current.value)
current = current.next
}
return result
}
// String returns a string representation of the list
func (l *ImmutableList) String() string {
return fmt.Sprintf("%v", l.ToSlice())
}
func main() {
// Create an empty list
list1 := NewImmutableList()
// Add elements to create new lists
list2 := list1.Cons(3)
list3 := list2.Cons(2)
list4 := list3.Cons(1)
fmt.Println("List1:", list1)
fmt.Println("List2:", list2)
fmt.Println("List3:", list3)
fmt.Println("List4:", list4)
// Get the head of list4
head, _ := list4.Head()
fmt.Println("Head of list4:", head)
// Get the tail of list4 (should be equivalent to list3)
tail := list4.Tail()
fmt.Println("Tail of list4:", tail)
// Check if the lists are empty
fmt.Println("List1 is empty:", list1.IsEmpty())
fmt.Println("List4 is empty:", list4.IsEmpty())
// Get the count of elements
fmt.Println("Count of list4:", list4.Count())
}
Choosing the Right Data Structure
Selecting the appropriate data structure for a given problem is crucial for achieving optimal performance and maintainability. Here are some guidelines to help you choose the right specialized data structure:
Consider the Operations
Different data structures excel at different operations. Consider which operations are most frequent in your application:
- If you need fast lookups by key, consider hash tables or tries
- If you need to maintain sorted data, consider skip lists or balanced trees
- If you need to perform range queries, consider segment trees or Fenwick trees
- If you need to track set membership with limited memory, consider Bloom filters
Consider the Data Characteristics
The nature of your data can influence the choice of data structure:
- If your data consists of strings with common prefixes, tries can be very efficient
- If your data is numeric and within a limited range, bit sets or Fenwick trees might be appropriate
- If your data has a hierarchical structure, trees are often the best choice
Consider the Constraints
System constraints can also influence your choice:
- If memory is limited, consider space-efficient structures like Bloom filters or bit sets
- If concurrency is required, use thread-safe data structures or add appropriate synchronization
- If persistence is needed, consider immutable or persistent data structures
Consider the Trade-offs
Every data structure involves trade-offs between different performance characteristics:
- Time complexity vs. space complexity
- Average-case performance vs. worst-case performance
- Simplicity vs. optimization
- Generality vs. specialization
Conclusion
Specialized data structures are powerful tools that can significantly improve the performance and capabilities of your Go applications. By understanding their characteristics, implementations, and applications, you can make informed decisions about which data structure to use for a given problem.
Remember that the best data structure is often the simplest one that meets your requirements. Start with the standard library and common data structures, and only reach for specialized structures when you have a clear need for their specific advantages.
This guide has covered several important specialized data structures, but there are many more out there. As you encounter new problems, continue to explore and learn about data structures that might provide elegant and efficient solutions.