2025-10-23
Simple introduction to category theory
Category theory in computers: Functional programming
Benefits of functional programming
Example use cases in data analysis
Example use cases in software development
Lots of math in category theory, but we’ll only cover the basics.
Graphs have nodes and edges.
Categories have objects and arrows (called “morphisms”).
Category = collection of objects and arrows, where arrows can be composed
An abstraction of anything: a number, a string, a function.
A vector: Contains sub-objects of numbers or strings.
A data frame: Contains sub-objects of vectors.
A list/collection: Contains sub-objects of any type.
Necessary for correct mathematical composability.
You can create any type you want.
A boolean type can only have two values:
An integer type can have infinitely many values:
object: type notation
A: Integer
B: String
D: Function
Can be empty (()):
y: ()
Product/sum type:
C: List(A, B)
Holds other types or values.
Two kinds: Sum types and product types.
Can compose together (e.g. sum in product).
Number of possible types/values is the sum of the types/values inside.
Colours: Red | Green | Blue
Bool: True | False
3 + 2 = 5 possible types/values
Number of possible types/values is the product of the types/values inside.
Colours: Red | Green | Blue
Bool: True | False
3 * 2 = 6 possible types/values
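A minimal sketch of the counting above in Python. The `Colour` enum and the names `sum_values` and `product_values` are illustrative assumptions, not part of the original slides.

```python
from enum import Enum
from itertools import product


class Colour(Enum):
    RED = "red"
    GREEN = "green"
    BLUE = "blue"


colours = list(Colour)
bools = [True, False]

# Sum type: a value is EITHER a Colour OR a Bool.
sum_values = colours + bools

# Product type: a value is a (Colour, Bool) pair.
product_values = list(product(colours, bools))

print(len(sum_values))      # 3 + 2
print(len(product_values))  # 3 * 2
```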
Actions take (input) a type and output a type.
Transform one object to another object.
An action always produces the same type.
Don’t need to know how, only the input and output types.
Actual math syntax is more complex.
f(A: Integer) -> B: String
g(B: String) -> C: Boolean
. = composition.
g(f(A: Integer)) -> C: Boolean
h = f . g
h(A: Integer) -> C: Boolean
“f composed into g” or “f followed by g”
Can “chain” or “pipe” functions via composition.
g(f(A)) = f(A) . g()
Piping helps with readability.
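One possible way to express this composition in Python, as a sketch. The bodies of `f` and `g` and the helper `compose` are assumptions made for illustration; only the input and output types come from the notation above. `compose(f, g)` follows the "f followed by g" reading.

```python
from typing import Callable, TypeVar

A = TypeVar("A")
B = TypeVar("B")
C = TypeVar("C")


def f(a: int) -> str:
    """Integer -> String."""
    return str(a)


def g(b: str) -> bool:
    """String -> Boolean: is the string longer than one character?"""
    return len(b) > 1


def compose(first: Callable[[A], B], second: Callable[[B], C]) -> Callable[[A], C]:
    """Return the action 'first followed by second'."""
    return lambda x: second(first(x))


h = compose(f, g)  # h: Integer -> Boolean
```

Calling `h(42)` gives the same result as `g(f(42))`, without the caller needing to know about the intermediate String.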
Type that allows an action to be applied to types inside.
But keeps the overall structure.
E.g. lists are usually functors.
A: Integer
B: Boolean
map f(List(A, A)) -> List(B, B)
From product type to product type, but different types inside.
f here is the type “function” that map uses as input.
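A small sketch of a list acting as a functor in Python. The function body (an even-number check) is an assumption chosen for illustration.

```python
def f(a: int) -> bool:
    """Integer -> Boolean: is the number even?"""
    return a % 2 == 0


numbers = [1, 2, 3, 4]          # a list of Integers
flags = list(map(f, numbers))   # a list of Booleans, same length and order
```

The list structure is kept: only the type of the items inside changes.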
Complexity is only a problem for our limited human minds. Solved with:
You don’t have to imperatively state exactly how to do something, every time.
Existing objects don’t change when doing an action.
Otherwise, the math won’t work.
E.g. changing values of an existing type.
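Python does not enforce immutability by default, but a frozen dataclass gets close. A minimal sketch, where `Point` and `move_right` are hypothetical names:

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class Point:
    x: int
    y: int


def move_right(p: Point, dx: int) -> Point:
    """Return a new Point; the input is left untouched."""
    return replace(p, x=p.x + dx)


origin = Point(0, 0)
moved = move_right(origin, 3)

try:
    origin.x = 10  # changing an existing object is rejected
    mutated = True
except FrozenInstanceError:
    mutated = False
```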
Can only output one type.
f(A: Integer) -> B: String
Predictable and testable.
Can use them in other functions.
Tools like map, filter, reduce.
Can construct mathematically provable programs.
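A short sketch of the three tools chained together in Python. The data and the lambdas are made up for illustration.

```python
from functools import reduce

values = [1, 2, 3, 4, 5, 6]

squares = map(lambda x: x * x, values)                 # 1, 4, 9, 16, 25, 36
even_squares = filter(lambda x: x % 2 == 0, squares)   # 4, 16, 36
total = reduce(lambda acc, x: acc + x, even_squares, 0)
```

Each step takes a function as input, and none of them changes `values`.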
Humans naturally think declaratively.
Computers operate imperatively.
“What do you want your program to produce?” vs “How do you want your program to produce it?”
So that we can work declaratively (e.g. we don’t write in assembly or machine code).
Closer match between the mental model of a solution’s design and its implementation.
Action on the object, e.g. “buy groceries” or “pick up the kids”.
Less code = easier to maintain and read.
Same output from same input.
Functions can be memoized (cached when same input is used)
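A minimal memoization sketch using the standard library's `functools.lru_cache`. The function `slow_square` is a hypothetical stand-in for an expensive pure function.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def slow_square(n: int) -> int:
    """Pure function: the result depends only on n, so it is safe to cache."""
    return n * n


first = slow_square(12)   # computed
second = slow_square(12)  # same input, served from the cache
info = slow_square.cache_info()
```

This only works safely because the same input always gives the same output.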
Abstract containers (types):
Future = A value that will be available later
Promise = How to get that value later
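A rough sketch of a Future in Python, using `concurrent.futures` from the standard library. Python has no separate Promise type here; in this sketch the submitted function `add` (a hypothetical example) plays the role of "how to get that value later".

```python
from concurrent.futures import Future, ThreadPoolExecutor


def add(a: int, b: int) -> int:
    return a + b


with ThreadPoolExecutor(max_workers=1) as pool:
    future: Future = pool.submit(add, 2, 3)  # container for a later value
    doubled = future.result() * 2            # blocks until the value exists
```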
Functional programming is very common in data analysis, though often not explicitly recognized.
Design of databases and the queries are functional.
Decompose into small pieces/steps:
The most common object types are data.frame, vector, and list.
Are algebraic data types (e.g. data.frame and list are product types).
Are functors (can apply function to internal types).
Are (mostly) immutable.
Object types depend on the package you use.
Objects (classes) are mutable.
Functional programming features are mostly an afterthought.
Python doesn’t have piping.
targets R package: Manage complex analysis pipelines.
Requires writing functionally.
File: _targets.R
Code from targets documentation.
Simple because of functional programming design:
See furrr documentation.
Need functional programming in Python to do this (multiprocessing).
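A sketch of the same idea in Python. To keep the example runnable anywhere, it swaps in `multiprocessing.pool.ThreadPool`, which has the same `map` interface as `multiprocessing.Pool` but uses threads rather than processes. The function `square` is a hypothetical example.

```python
from multiprocessing.pool import ThreadPool


def square(n: int) -> int:
    """Pure function: safe to run on many inputs at once."""
    return n * n


inputs = [1, 2, 3, 4]

sequential = list(map(square, inputs))

with ThreadPool(processes=2) as pool:
    parallel = pool.map(square, inputs)  # same call shape as the built-in map
```

Because `square` has no side effects, switching from the sequential map to the pooled map does not change the result.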
(Not objects in object-oriented programming.)
Then build up actions around those objects.
Helps build the foundation of the software.
Build containers for objects to act as functors.
Apply functions to all objects via functors.
Python type annotations are not enforced at runtime. R doesn’t have these type annotations.
Keeps it simple and predictable. So this isn’t functional in Python:
Direction can only have one value and there are four possible values.
Either score or lost, but not both.
Need to add a value to the enum in Python.
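A sketch of what such a `Direction` type might look like in Python. The member names and the helper `turn_right` are assumptions; the slides only state that there are four possible values.

```python
from enum import Enum


class Direction(Enum):
    NORTH = "north"  # Python enums need an explicit value per member
    EAST = "east"
    SOUTH = "south"
    WEST = "west"


def turn_right(d: Direction) -> Direction:
    """Direction -> Direction: next member in definition order, wrapping around."""
    order = list(Direction)
    return order[(order.index(d) + 1) % len(order)]


heading = Direction.NORTH  # exactly one of the four values at a time
```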
Essentially a sum or product type as an object.
Must implement a flat map function (i.e. one that “unwraps” a container).
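A minimal sketch of flat map, using a plain Python list as the container. The helpers `flat_map` and `neighbours` are hypothetical names, not from any library.

```python
from typing import Callable, Iterable, List, TypeVar

A = TypeVar("A")
B = TypeVar("B")


def flat_map(func: Callable[[A], List[B]], items: Iterable[A]) -> List[B]:
    """Apply func to every item, then flatten one level of nesting."""
    return [out for item in items for out in func(item)]


def neighbours(n: int) -> List[int]:
    return [n - 1, n + 1]


mapped = list(map(neighbours, [1, 10]))     # nested: [[0, 2], [9, 11]]
flattened = flat_map(neighbours, [1, 10])   # flat:   [0, 2, 9, 11]
```

A plain `map` leaves a container inside a container; flat map removes that extra layer.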
Option type in Rust
Optional
Maybe from returns package
At least two possible values: Success and its type or Error
Result
Result as “railway” intersection
Need to convert Result to String before using it.
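A hand-rolled sketch of a Result type in plain Python, to show the "railway" idea without depending on a library. All names here (`Success`, `Error`, `bind`, `parse`, `halve`, `to_string`) are assumptions for illustration and are not the API of the returns package.

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass(frozen=True)
class Success:
    value: int


@dataclass(frozen=True)
class Error:
    message: str


Result = Union[Success, Error]


def bind(result: Result, func: Callable[[int], Result]) -> Result:
    """Run func on the success track; pass an Error straight through."""
    if isinstance(result, Success):
        return func(result.value)
    return result


def parse(text: str) -> Result:
    return Success(int(text)) if text.isdigit() else Error(f"not a number: {text}")


def halve(n: int) -> Result:
    return Success(n // 2) if n % 2 == 0 else Error(f"odd number: {n}")


def to_string(result: Result) -> str:
    """Convert the Result to a String before using it."""
    if isinstance(result, Success):
        return f"ok: {result.value}"
    return f"error: {result.message}"


good = to_string(bind(parse("8"), halve))
bad = to_string(bind(parse("x"), halve))
```

Once a step returns an `Error`, later steps are skipped and the error travels along its own track until it is converted at the end.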
Result with match
Results from returns package in Python
(map)
(Option, Result)