Go Data Structures & Algorithms

code

Graphs in Go

Graphs are versatile data structures used to represent relationships between objects. They consist of a set of vertices (or nodes) connected by edges. Graphs are fundamental in modeling networks, social connections, dependencies, and many other real-world scenarios. In this comprehensive guide, we will explore graphs in Go, including their representations, traversal algorithms, and common applications.

What are Graphs?

A graph G is defined as an ordered pair G = (V, E), where V is a set of vertices (nodes) and E is a set of edges connecting pairs of vertices. Unlike trees, graphs can have cycles, and vertices can have multiple parents or no parents.

Graphs are used to model:

  • Networks (e.g., computer networks, social networks, road networks)
  • Dependencies (e.g., task dependencies, software dependencies)
  • Relationships (e.g., family trees, organizational charts)
  • State transitions (e.g., finite state machines)

Graph Terminology

Understanding graph terminology is essential:

  • Vertex (Node): A fundamental unit of a graph.
  • Edge (Link, Arc): A connection between two vertices.
  • Adjacent Vertices: Two vertices connected by an edge.
  • Incident Edge: An edge connected to a vertex.
  • Degree of a Vertex: The number of edges incident to a vertex. In directed graphs, we distinguish between in-degree (incoming edges) and out-degree (outgoing edges).
  • Path: A sequence of vertices where each adjacent pair is connected by an edge.
  • Cycle: A path that starts and ends at the same vertex.
  • Connected Graph: A graph where there is a path between any two vertices.
  • Subgraph: A graph whose vertices and edges are subsets of another graph.
  • Weighted Graph: A graph where each edge has an associated weight or cost.
  • Directed Graph (Digraph): A graph where edges have a direction (from one vertex to another).
  • Undirected Graph: A graph where edges have no direction.

Types of Graphs

Graphs can be classified based on various properties:

  • Undirected Graph: Edges have no direction. If there is an edge between A and B, you can traverse from A to B and from B to A.
  • Directed Graph (Digraph): Edges have a direction. An edge from A to B allows traversal only from A to B.
  • Weighted Graph: Edges have associated weights or costs.
  • Unweighted Graph: Edges have no weights (or all weights are considered 1).
  • Simple Graph: A graph with no loops (edges connecting a vertex to itself) and no multiple edges between the same pair of vertices.
  • Multigraph: A graph that allows multiple edges between the same pair of vertices.
  • Cyclic Graph: A graph containing at least one cycle.
  • Acyclic Graph: A graph containing no cycles. A Directed Acyclic Graph (DAG) is particularly important.
  • Connected Graph: An undirected graph where there is a path between every pair of vertices.
  • Strongly Connected Graph: A directed graph where there is a path from any vertex to any other vertex.
  • Complete Graph: A simple undirected graph where every pair of distinct vertices is connected by a unique edge.
  • Bipartite Graph: A graph whose vertices can be divided into two disjoint sets such that every edge connects a vertex in one set to a vertex in the other set.

Graph Representations

There are several ways to represent graphs in memory. The choice of representation depends on the type of graph and the operations you need to perform efficiently.

Adjacency Matrix

An adjacency matrix is a 2D array (or slice of slices in Go) where the value at index [i][j] indicates whether there is an edge between vertex i and vertex j. For unweighted graphs, this value is typically 1 (or true) if an edge exists and 0 (or false) otherwise. For weighted graphs, it stores the weight of the edge, or infinity/zero if no edge exists.

// Adjacency Matrix for an unweighted, undirected graph with 4 vertices
// Vertices: 0, 1, 2, 3
// Edges: (0, 1), (0, 2), (1, 2), (2, 3)
adjMatrix := [][]int{
    {0, 1, 1, 0}, // Edges from vertex 0
    {1, 0, 1, 0}, // Edges from vertex 1
    {1, 1, 0, 1}, // Edges from vertex 2
    {0, 0, 1, 0}, // Edges from vertex 3
}

// Check if edge exists between 1 and 2
edgeExists := adjMatrix[1][2] == 1 // true

Pros:

  • O(1) time complexity to check if an edge exists between two vertices.
  • Simple to implement for dense graphs (graphs with many edges).

Cons:

  • O(V^2) space complexity, where V is the number of vertices. Inefficient for sparse graphs (graphs with few edges).
  • O(V) time complexity to find all neighbors of a vertex.
  • Adding or removing vertices is expensive (requires resizing the matrix).

Adjacency List

An adjacency list represents a graph as an array (or slice) of lists (or slices). The index i of the array corresponds to vertex i, and the list at that index contains all vertices adjacent to vertex i. For weighted graphs, the list can store pairs of (neighbor, weight).

// Adjacency List for the same unweighted, undirected graph
// Using a map where keys are vertices and values are slices of neighbors
adjList := map[int][]int{
    0: {1, 2},
    1: {0, 2},
    2: {0, 1, 3},
    3: {2},
}

// Get neighbors of vertex 2
neighbors := adjList[2] // {0, 1, 3}

Pros:

  • Space complexity is O(V + E), where V is the number of vertices and E is the number of edges. Efficient for sparse graphs.
  • O(degree(v)) time complexity to find all neighbors of a vertex v.
  • Adding vertices is relatively easy.

Cons:

  • O(degree(v)) time complexity to check if an edge exists between two vertices.
  • Slightly more complex to implement than adjacency matrix.

Edge List

An edge list is simply a list (or slice) of all the edges in the graph. Each edge can be represented as a pair (or tuple) of vertices, possibly with a weight.

// Edge List for the same unweighted, undirected graph
type Edge struct {
    U, V int
}

edgeList := []Edge{
    {0, 1},
    {0, 2},
    {1, 2},
    {2, 3},
}

// For weighted graphs:
type WeightedEdge struct {
    U, V   int
    Weight float64
}

Pros:

  • Simple representation.
  • Space complexity is O(E).
  • Easy to iterate over all edges.

Cons:

  • Inefficient for finding neighbors of a vertex (O(E)).
  • Inefficient for checking if an edge exists (O(E)).

The adjacency list is the most common representation for general-purpose graph algorithms due to its efficiency for sparse graphs, which are common in practice.

Implementing a Graph

Let's implement a basic graph structure in Go using an adjacency list representation. We will implement an unweighted, undirected graph.

package main

import (
    "fmt"
    "sync"
)

// Graph represents an unweighted, undirected graph using an adjacency list
type Graph struct {
    vertices map[int][]int
    mu       sync.RWMutex // For thread safety
}

// NewGraph creates a new empty graph
func NewGraph() *Graph {
    return &Graph{
        vertices: make(map[int][]int),
    }
}

// AddVertex adds a new vertex to the graph
func (g *Graph) AddVertex(vertex int) error {
    g.mu.Lock()
    defer g.mu.Unlock()
    
    if _, exists := g.vertices[vertex]; exists {
        return fmt.Errorf("vertex %d already exists", vertex)
    }
    
    g.vertices[vertex] = []int{}
    return nil
}

// AddEdge adds an undirected edge between two vertices
func (g *Graph) AddEdge(vertex1, vertex2 int) error {
    g.mu.Lock()
    defer g.mu.Unlock()
    
    // Check if vertices exist
    if _, exists := g.vertices[vertex1]; !exists {
        return fmt.Errorf("vertex %d does not exist", vertex1)
    }
    if _, exists := g.vertices[vertex2]; !exists {
        return fmt.Errorf("vertex %d does not exist", vertex2)
    }
    
    // Check if edge already exists (optional, prevents duplicates)
    if contains(g.vertices[vertex1], vertex2) {
        return fmt.Errorf("edge between %d and %d already exists", vertex1, vertex2)
    }
    
    // Add edge in both directions
    g.vertices[vertex1] = append(g.vertices[vertex1], vertex2)
    g.vertices[vertex2] = append(g.vertices[vertex2], vertex1)
    
    return nil
}

// GetNeighbors returns the neighbors of a given vertex
func (g *Graph) GetNeighbors(vertex int) ([]int, error) {
    g.mu.RLock()
    defer g.mu.RUnlock()
    
    if neighbors, exists := g.vertices[vertex]; exists {
        // Return a copy to prevent external modification
        result := make([]int, len(neighbors))
        copy(result, neighbors)
        return result, nil
    }
    
    return nil, fmt.Errorf("vertex %d does not exist", vertex)
}

// GetVertices returns all vertices in the graph
func (g *Graph) GetVertices() []int {
    g.mu.RLock()
    defer g.mu.RUnlock()
    
    vertices := make([]int, 0, len(g.vertices))
    for vertex := range g.vertices {
        vertices = append(vertices, vertex)
    }
    return vertices
}

// String returns a string representation of the graph
func (g *Graph) String() string {
    g.mu.RLock()
    defer g.mu.RUnlock()
    
    str := ""
    for vertex, neighbors := range g.vertices {
        str += fmt.Sprintf("%d -> %v\n", vertex, neighbors)
    }
    return str
}

// Helper function to check if a slice contains an element
func contains(slice []int, element int) bool {
    for _, item := range slice {
        if item == element {
            return true
        }
    }
    return false
}

func main() {
    graph := NewGraph()
    
    // Add vertices
    vertices := []int{0, 1, 2, 3, 4}
    for _, v := range vertices {
        graph.AddVertex(v)
    }
    
    // Add edges
    edges := [][2]int{{0, 1}, {0, 2}, {1, 2}, {1, 3}, {2, 4}, {3, 4}}
    for _, edge := range edges {
        graph.AddEdge(edge[0], edge[1])
    }
    
    fmt.Println("Graph representation:")
    fmt.Print(graph)
    
    fmt.Println("Vertices:", graph.GetVertices())
    
    neighbors, err := graph.GetNeighbors(1)
    if err == nil {
        fmt.Println("Neighbors of vertex 1:", neighbors)
    }
}

This implementation provides a basic thread-safe graph structure using an adjacency list. You can extend this to support directed graphs, weighted edges, and more complex operations.

Graph Traversal

Graph traversal algorithms visit all reachable vertices from a starting vertex. The two main traversal methods are Breadth-First Search (BFS) and Depth-First Search (DFS).

Breadth-First Search (BFS)

BFS explores the graph level by level. It starts at a source vertex and explores all its neighbors before moving to the next level of neighbors. BFS uses a queue to keep track of vertices to visit.

// BFS performs a Breadth-First Search starting from a given vertex
func (g *Graph) BFS(startVertex int) ([]int, error) {
    g.mu.RLock()
    defer g.mu.RUnlock()
    
    if _, exists := g.vertices[startVertex]; !exists {
        return nil, fmt.Errorf("start vertex %d does not exist", startVertex)
    }
    
    visited := make(map[int]bool)
    queue := []int{startVertex}
    result := []int{}
    
    visited[startVertex] = true
    
    for len(queue) > 0 {
        // Dequeue vertex
        currentVertex := queue[0]
        queue = queue[1:]
        result = append(result, currentVertex)
        
        // Enqueue unvisited neighbors
        neighbors := g.vertices[currentVertex]
        for _, neighbor := range neighbors {
            if !visited[neighbor] {
                visited[neighbor] = true
                queue = append(queue, neighbor)
            }
        }
    }
    
    return result, nil
}

// Example usage in main:
// bfsOrder, _ := graph.BFS(0)
// fmt.Println("BFS starting from 0:", bfsOrder)

BFS is often used to find the shortest path in unweighted graphs and for level-order traversal.

Depth-First Search (DFS)

DFS explores as far as possible along each branch before backtracking. It starts at a source vertex, explores one of its neighbors completely, then backtracks and explores other neighbors. DFS can be implemented recursively or iteratively using a stack.

// DFS performs a Depth-First Search starting from a given vertex (recursive)
func (g *Graph) DFS(startVertex int) ([]int, error) {
    g.mu.RLock()
    defer g.mu.RUnlock()
    
    if _, exists := g.vertices[startVertex]; !exists {
        return nil, fmt.Errorf("start vertex %d does not exist", startVertex)
    }
    
    visited := make(map[int]bool)
    result := []int{}
    
    var dfsRecursive func(vertex int)
    dfsRecursive = func(vertex int) {
        visited[vertex] = true
        result = append(result, vertex)
        
        neighbors := g.vertices[vertex]
        for _, neighbor := range neighbors {
            if !visited[neighbor] {
                dfsRecursive(neighbor)
            }
        }
    }
    
    dfsRecursive(startVertex)
    return result, nil
}

// Example usage in main:
// dfsOrder, _ := graph.DFS(0)
// fmt.Println("DFS starting from 0:", dfsOrder)

DFS is used for cycle detection, topological sorting (in DAGs), finding connected components, and solving puzzles like mazes.

Shortest Path Algorithms

Finding the shortest path between two vertices is a common graph problem.

Dijkstra's Algorithm

Dijkstra's algorithm finds the shortest path from a single source vertex to all other vertices in a weighted graph with non-negative edge weights. It uses a priority queue to efficiently select the next vertex to visit.

Bellman-Ford Algorithm

The Bellman-Ford algorithm finds the shortest path from a single source vertex to all other vertices in a weighted graph, even if edge weights are negative (but assumes no negative cycles reachable from the source). It is slower than Dijkstra's but more versatile.

Floyd-Warshall Algorithm

The Floyd-Warshall algorithm finds the shortest paths between all pairs of vertices in a weighted graph. It can handle negative edge weights but not negative cycles.

Implementing these algorithms requires a weighted graph representation and is more involved. Libraries like dijkstra or gonum/graph provide implementations.

Minimum Spanning Tree (MST)

A Minimum Spanning Tree (MST) of a connected, undirected, weighted graph is a subgraph that connects all vertices together, without any cycles, and with the minimum possible total edge weight.

Prim's Algorithm

Prim's algorithm builds the MST by starting with an arbitrary vertex and iteratively adding the cheapest edge that connects a vertex in the growing MST to a vertex outside the MST.

Kruskal's Algorithm

Kruskal's algorithm builds the MST by sorting all edges by weight and iteratively adding the cheapest edge that does not form a cycle with the edges already added. It often uses a Disjoint Set Union (DSU) data structure to efficiently detect cycles.

Implementations of MST algorithms are also available in graph libraries.

Performance Analysis

The performance of graph algorithms depends heavily on the graph representation and the number of vertices (V) and edges (E).

Operation/Algorithm Adjacency Matrix Adjacency List
Space Complexity O(V^2) O(V + E)
Add Vertex O(V^2) (resize) O(1)
Add Edge O(1) O(1)
Check Edge O(1) O(degree(v)) or O(V)
BFS O(V^2) O(V + E)
DFS O(V^2) O(V + E)
Dijkstra (with Priority Queue) O(V^2) O(E + V log V)
Prim (with Priority Queue) O(V^2) O(E + V log V)
Kruskal (with DSU) O(E log E) or O(E log V) O(E log E) or O(E log V)

For most applications, especially with sparse graphs, the adjacency list representation offers better overall performance.

Use Cases

Graphs are ubiquitous in computer science and real-world modeling:

  • Social Networks: Representing users and connections (e.g., Facebook, LinkedIn).
  • Mapping and Navigation: Representing locations and routes (e.g., Google Maps).
  • World Wide Web: Representing web pages and hyperlinks.
  • Recommendation Systems: Modeling relationships between users and items.
  • Computer Networks: Representing routers, switches, and connections.
  • Dependency Management: Representing dependencies between tasks, packages, or modules (often using DAGs).
  • Biology: Modeling protein-protein interactions or metabolic pathways.
  • Circuit Design: Representing components and connections.
  • State Machines: Representing states and transitions.

Best Practices

Here are some best practices for working with graphs in Go:

Choose the Right Representation

  • Use adjacency lists for sparse graphs (most common case).
  • Use adjacency matrices for dense graphs or when O(1) edge checking is critical.
  • Use edge lists when iterating over all edges is the primary operation.

Consider Graph Type

Explicitly decide if your graph is directed/undirected, weighted/unweighted, simple/multigraph, etc., and choose or implement the representation accordingly.

Use Libraries for Complex Algorithms

For complex algorithms like shortest paths or MST, consider using well-tested libraries like gonum/graph to avoid implementation errors.

Handle Disconnected Graphs

Traversal algorithms like BFS and DFS only visit vertices reachable from the start vertex. If your graph might be disconnected, you may need to iterate through all vertices and start a traversal from each unvisited vertex.

Implement Thread Safety if Needed

If the graph structure can be modified concurrently by multiple goroutines, ensure your implementation is thread-safe using mutexes or other synchronization primitives.

Abstract Graph Operations

Define clear interfaces for graph operations (adding vertices/edges, getting neighbors, traversal) to decouple algorithms from the specific graph representation.

Conclusion

Graphs are powerful and flexible data structures for modeling relationships and networks. Understanding different graph representations, traversal techniques, and fundamental algorithms like shortest path and MST is crucial for solving a wide variety of problems. Go provides the tools to implement graphs efficiently, and choosing the right representation and algorithms is key to building effective graph-based solutions.

In the next section, we will explore heaps, specialized tree-based structures often used for implementing priority queues.